Adding a New Node to the Cluster

Option 1 (Preferred)

This option uses BareMetalHost and BMC for provisioning. Three objects need to be created:

  • Secret (BMC authentication)

  • Secret (NMState config)

  • BareMetalHost

Note

All objects are created in the “openshift-machine-api” namespace.

Tip

This method supports remote worker nodes.

  1. Set the variables needed to complete the steps. These values all come from the new host.

    NODENAME=host35
    NODEMAC=52:54:00:f4:16:35
    NODEUUID=a3fce101-8d6c-4f74-9145-c8e79415cc84
    
  2. Create the BMC authentication secret.

    Important

    The username and password must be base64-encoded. The example below encodes “kni”, which is used for both values.

    echo -n 'kni' | base64 -w0
    
    cat << EOF > ./$NODENAME-secret.yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: bmc-secret-$NODENAME
      namespace: openshift-machine-api
    type: Opaque
    data:
      password: a25p
      username: a25p
    EOF
    
  3. Create the NMState config secret.

    Important

    Adjust nmstate interface config for the new node.

    cat << EOF > ./$NODENAME-nmstate.yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: bmc-secret-nmstate-$NODENAME
      namespace: openshift-machine-api
    type: Opaque
    stringData:
     nmstate: |
       interfaces:
         - name: enp1s0
           type: ethernet
           mtu: 1500
           state: up
         - name: enp1s0.132
           type: vlan
           state: up
           vlan:
             base-iface: enp1s0
             id: 132
           ipv4:
             enabled: true
             dhcp: false
             address:
               - ip: 192.168.132.35
                 prefix-length: 24
           ipv6:
             enabled: false
       dns-resolver:
         config:
           search:
             - lab.local
           server:
             - 192.168.1.68
       routes:
         config:
           - destination: 0.0.0.0/0
             next-hop-address: 192.168.132.1
             next-hop-interface: enp1s0.132
             table-id: 254
    EOF
    
  4. Create the BareMetalHost.

    Important

    The “credentialsName” and “preprovisioningNetworkDataName” need to match the names used in the previous two steps.

    cat << EOF > ./$NODENAME-baremetal.yaml
    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    metadata:
      name: $NODENAME
      namespace: openshift-machine-api
    spec:
      online: true
      bootMACAddress: $NODEMAC
      bmc:
        address: redfish-virtualmedia+http://192.168.1.72:8000/redfish/v1/Systems/$NODEUUID
        credentialsName: bmc-secret-$NODENAME
        disableCertificateVerification: true
      rootDeviceHints:
        deviceName: "/dev/vda"
      preprovisioningNetworkDataName: bmc-secret-nmstate-$NODENAME
    EOF
    
  5. Once the files are modified and ready, create the objects:

    oc create -f ./
    
  6. Follow the creation progress. The BareMetalHost should show “available” when ready.

    Note

    Your metal3-baremetal-operator pod will have a different hash.

    oc logs metal3-baremetal-operator-8749b7fd5-krgw6 -n openshift-machine-api --follow
    
    # and/or
    
    ssh core@$NODENAME journalctl -f
    
    oc get bmh -n openshift-machine-api
    
  7. From the OpenShift console, confirm the new BMH is “Available”:

    Go to Compute ‣ Bare Metal Hosts

    ../_images/bmh-available.png
  8. From the OpenShift console modify the MachineSet to add the “available” node to the cluster:

    Go to Compute ‣ MachineSets

    ../_images/machineset-worker.png ../_images/machineset-adjust-count.png

    Tip

    You can make this modification via the command line:

    oc scale --replicas=<worker_nodes> machineset <machineset> -n openshift-machine-api
    
    # Example: oc scale --replicas=1 machineset ocp3-d5zw7-worker-0 -n openshift-machine-api
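
The BMC Secret in step 2 stores its values under data:, so they have to be base64-encoded first. As a minimal sketch, assuming the same lab credentials as the example (“kni” for both fields) and the hypothetical names b64, BMC_USER and BMC_PASS, a small helper can produce the encoded strings without a trailing newline sneaking into the credential:

```shell
# b64 is a hypothetical helper name, not part of oc or OpenShift.
# printf '%s' emits the value without a trailing newline, and tr removes
# any line wrapping that base64 adds to long input.
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Assumption: lab credentials from the example; replace with your own.
BMC_USER=kni
BMC_PASS=kni

echo "username: $(b64 "$BMC_USER")"
echo "password: $(b64 "$BMC_PASS")"
```

The two printed lines can be pasted into the data: section of the Secret. Running oc create --dry-run=client -f ./ before the real create is one way to catch YAML mistakes early.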
    

Option 2 (Manual)

These steps are based on Red Hat documentation. For a deeper understanding of each step, see “Adding worker nodes to single-node OpenShift clusters manually”.

Note

I’ve tested this on 4.12 through 4.18.

Warning

Exactly three control plane nodes must be used for all production deployments prior to 4.18. With 4.18 you can have more than three.

Important

These steps allow for the addition of a new master or worker node depending on how you set the “NODE_TYPE” variable.

  1. Set the environment variables. Be sure to use the variables that match your running version and architecture. Specify “master” or “worker” depending on the desired node type.

    OCP_VERSION=4.14.1
    ARCH=x86_64
    NODE_TYPE=worker
    
  2. Extract the ignition file.

    oc extract -n openshift-machine-api secret/$NODE_TYPE-user-data-managed --keys=userData --to=- > $NODE_TYPE.ign
    

    Important

    Place this file on a web server reachable from the control-plane network.

  3. Create a new ignition file “new-$NODE_TYPE.ign” that includes a reference to the original “$NODE_TYPE.ign” and an additional instruction that the coreos-installer program uses to populate the /etc/hostname file on the new host.

    cat << EOF > ./new-$NODE_TYPE.ign
    {
      "ignition":{
        "version":"3.2.0",
        "config":{
          "merge":[
            {
              "source":"http://192.168.1.72/$NODE_TYPE.ign"
            }
          ]
        }
      },
      "storage":{
        "files":[
          {
            "path":"/etc/hostname",
            "contents":{
              "source":"data:,host44.lab.local"
            },
            "mode":420,
            "overwrite":true
          }
        ]
      }
    }
    EOF
    

    Important

    Place this file on a web server reachable from the control-plane network.

  4. If needed download the OCP installer.

    curl -L -k https://mirror.openshift.com/pub/openshift-v4/clients/ocp/$OCP_VERSION/openshift-install-linux.tar.gz \
    -o openshift-install-linux-$OCP_VERSION.tar.gz
    

    Extract the installer

    tar -xzvf openshift-install-linux-$OCP_VERSION.tar.gz
    
  5. Discover the RHCOS ISO URL

    ISO_URL=$(./openshift-install coreos print-stream-json | grep location | grep $ARCH | grep iso | cut -d\" -f4)
    
  6. Download the RHCOS ISO

    curl -L $ISO_URL -o rhcos-$OCP_VERSION-$ARCH-live.iso
    
  7. Boot the target host from the RHCOS ISO.

  8. If the host does not use DHCP or needs a custom network configuration, use the RHEL tools (for example, nmcli or nmtui) to configure the network.

  9. Check the block devices and “wipe” if needed.

    Note

    With bare-metal hardware it may be necessary to “wipe” the previous block device partitions and signatures.

    lsblk
    
    sudo wipefs -af /dev/vda
    

    Tip

    Be sure to check that all partitions are “wiped” with lsblk. I’ve seen LVM partitions not get removed.

  10. Once the network is configured and operational, run the following command:

    Attention

    Update the command for your ignition url and block device.

    sudo coreos-installer install --copy-network --insecure-ignition --ignition-url=http://192.168.1.72/new-$NODE_TYPE.ign /dev/vda
    
  11. When the install is complete, reboot the host.

    ../_images/coreos-install-complete.png

    Note

    The machine may reboot more than once.

  12. For the new host to join the cluster, several pending CSRs need to be approved.

    Attention

    The CSR approval command will need to be run more than once, because the node’s serving CSR is only created after its client CSR has been approved.

    oc get csr
    
    oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
    
  13. After all the CSRs are approved, confirm the node was added.

    oc get nodes
    
    oc get mcp
    

    In my example I added two new nodes, host44 and host45.

    ../_images/checknewnode.png
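
Because the approval command in step 12 has to be repeated, it can help to wrap it in a loop. The following is only a sketch: approve_pending_csrs is a hypothetical helper name, and the defaults of 5 rounds and 30 seconds between rounds are assumptions rather than recommended values. It reuses the go-template from step 12 and approves each pending CSR by name.

```shell
# approve_pending_csrs is a hypothetical helper name.
# $1: number of rounds (assumed default 5)
# $2: seconds to wait between rounds (assumed default 30)
approve_pending_csrs() {
  rounds=${1:-5}
  delay=${2:-30}
  i=1
  while [ "$i" -le "$rounds" ]; do
    # Same go-template as step 12: print CSR names that have no status yet.
    for csr in $(oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}'); do
      oc adm certificate approve "$csr"
    done
    # Wait before the next round so follow-up CSRs have time to appear.
    if [ "$i" -lt "$rounds" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
}
```

A call such as approve_pending_csrs 5 30 followed by oc get csr shows whether anything is still pending. Note that the loop approves every pending CSR it finds, so it is only appropriate when the new node is the only host expected to be requesting certificates.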

Associate Node with MachineSet

After adding the new node you’ll notice the new node is up and “Ready” for use but doesn’t match the initial nodes in the cluster. The original nodes are part of a MachineSet and associated with bare metal host objects.

Note

In older versions of OCP, the Node Overview in the console will show errors.

The following creates and associates the required objects for the new node and resolves any console errors.

  1. Set the variables needed to complete the steps. These values all come from the new host.

    NODENAME=host35
    NODEMAC=52:54:00:f4:16:35
    NODEUUID=a3fce101-8d6c-4f74-9145-c8e79415cc84
    
  2. From the CLI, increase the MachineSet replica count by one.

    Warning

    Check the current number of replicas first. This will ensure you set the replicas to a proper number. The following command will show “DESIRED” and “CURRENT”. Be sure to increase the replicas by +1.

    Setting a replica count lower than the current one scales the MachineSet down and deletes existing Machines.

    oc get machinesets -n openshift-machine-api
    
    # Example only; set --replicas to the CURRENT value plus one.
    oc scale --replicas=1 machineset ocp3-d5zw7-worker-0 -n openshift-machine-api
    
  3. Find the name of the newly created Machine. There should be a new name in the “Provisioning” phase. Set that name to the variable MACHINENAME.

    oc get machines -n openshift-machine-api
    
    MACHINENAME=$(oc get machines -n openshift-machine-api | grep Provisioning | awk '{print $1}')
    
  4. Add the new BareMetalHost by copying the following YAML and making the necessary changes for your node.

    Note

    Since this node was provisioned externally, we need to add the “externallyProvisioned: true” field.

    cat << EOF > ./$NODENAME-baremetal.yaml
    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    metadata:
      name: $NODENAME
      namespace: openshift-machine-api
    spec:
      architecture: x86_64
      automatedCleaningMode: metadata
      bmc:
        address: redfish-virtualmedia+http://192.168.1.72:8000/redfish/v1/Systems/$NODEUUID
        credentialsName: bmc-secret-$NODENAME
        disableCertificateVerification: true
      bootMACAddress: $NODEMAC
      consumerRef:
        apiVersion: machine.openshift.io/v1beta1
        kind: Machine
        name: $MACHINENAME
        namespace: openshift-machine-api
      customDeploy:
        method: install_coreos
      online: true
      externallyProvisioned: true
      userData:
        name: worker-user-data-managed
        namespace: openshift-machine-api
    EOF
    
  5. Add the Secret referenced by “credentialsName” in the BareMetalHost.

    Important

    The username and password must be base64-encoded. The example below encodes “kni”, which is used for both values.

    echo -n 'kni' | base64 -w0
    
    cat << EOF > ./$NODENAME-secret.yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: bmc-secret-$NODENAME
      namespace: openshift-machine-api
    type: Opaque
    data:
      password: a25p
      username: a25p
    EOF
    
  6. Create the new objects.

    oc create -f $NODENAME-secret.yaml
    
    oc create -f $NODENAME-baremetal.yaml
    
  7. Find the new BMH UID

    BMHUID=$(oc get bmh $NODENAME -n openshift-machine-api --template='{{.metadata.uid}}')
    

    Warning

    Do not attempt the next step until the new BMH object state is “provisioned” or “externally provisioned”.

    oc get bmh -n openshift-machine-api
    
  8. Modify the node to associate it with the BareMetalHost.

    oc patch node $NODENAME --patch '{"metadata":{"annotations":{"machine.openshift.io/machine": "openshift-machine-api/'$MACHINENAME'"}}}'
    
    oc patch node $NODENAME --patch '{"spec":{"providerID":"baremetalhost:///openshift-machine-api/'$NODENAME'/'$BMHUID'"}}'
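
The warning in step 2 is easy to get wrong when the replica count is typed by hand. As a sketch under stated assumptions, the hypothetical helper scale_up_by_one below reads the current value of .spec.replicas from the MachineSet and scales to exactly one more. The function name and the default namespace argument are illustrative only.

```shell
# scale_up_by_one is a hypothetical helper name.
# $1: MachineSet name
# $2: namespace (assumed default: openshift-machine-api)
scale_up_by_one() {
  ms=$1
  ns=${2:-openshift-machine-api}

  # Read the currently desired replica count from the MachineSet spec.
  current=$(oc get machineset "$ms" -n "$ns" -o jsonpath='{.spec.replicas}')

  # Refuse to continue unless the value is a non-negative integer.
  case "$current" in
    ''|*[!0-9]*)
      echo "could not read replicas for machineset $ms" >&2
      return 1
      ;;
  esac

  oc scale --replicas=$((current + 1)) machineset "$ms" -n "$ns"
}
```

With the example MachineSet from this section, the call would be scale_up_by_one ocp3-d5zw7-worker-0. Deriving the new count from the live object avoids accidentally scaling down, at the cost of trusting that nobody else changes the MachineSet between the read and the scale.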
    

ETCD

Back-Up

OpenShift comes with scripts that back up the etcd state. It’s best practice to back up etcd before removing and replacing a control plane node.

  1. Determine which master node is currently the leader.

    1. Change to the openshift-etcd project

      oc project openshift-etcd
      
    2. List the etcd pods

      oc get pods | grep etcd
      
      ../_images/getetcdpods.png
    3. RSH into any of the etcd-<node> pods

      oc rsh etcd-host41.lab.local
      
    4. From within that pod, run the following command to find the etcd leader. Exit the pod after noting the current leader. The backup script will be run from the leader node.

      etcdctl endpoint status -w table
      
      ../_images/etcdleader.png
  2. Connect to the etcd leader node via ssh

    ssh core@host41.lab.local
    
  3. Execute the etcd backup script

    sudo /usr/local/bin/cluster-backup.sh /home/core/etcd-backup
    
  4. Verify that both snapshot_<TIME_STAMP>.db and static_kuberesources_<TIME_STAMP>.tar.gz exist. Move the files to a safe location so they survive a node failure.

    ../_images/backupetcd.png
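
The check in step 4 can also be scripted. The sketch below assumes the backup directory from step 3 (/home/core/etcd-backup) and uses a hypothetical helper name, verify_backup. It only confirms that both expected files exist and are not empty; it does not validate the snapshot contents.

```shell
# verify_backup is a hypothetical helper name.
# $1: directory that cluster-backup.sh wrote to
verify_backup() {
  dir=$1

  # Pick the first file matching each expected name pattern, if any.
  snap=$(ls "$dir"/snapshot_*.db 2>/dev/null | head -n 1)
  res=$(ls "$dir"/static_kuberesources_*.tar.gz 2>/dev/null | head -n 1)

  # -s is true only for files that exist and have a size greater than zero.
  if [ -s "$snap" ] && [ -s "$res" ]; then
    echo "backup looks complete: $snap $res"
  else
    echo "backup incomplete in $dir" >&2
    return 1
  fi
}
```

On the leader node this would be run as verify_backup /home/core/etcd-backup before copying the files elsewhere.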

Clean-Up

In the event of a control plane node failure, the failed node must be removed from etcd. Before starting, be sure to back up etcd as described in the previous section.

  1. Remove failed node

    oc delete node host41.lab.local
    
  2. Confirm removal

    oc get nodes
    
  3. Change to the openshift-etcd project

    oc project openshift-etcd
    
  4. List the etcd pods

    oc get pods | grep etcd
    
    ../_images/getetcdpods.png
  5. RSH into any of the etcd-<node> pods

    oc rsh etcd-host42.lab.local
    
  6. From within that pod run the following command to list the etcd members. Note the ID associated with the failed master.

    etcdctl member list -w table
    
    ../_images/etcdmembers.png
  7. Remove the failed member from etcd using the ID noted in the previous step.

    etcdctl member remove <ID>
    
  8. Validate removal. The failed member should no longer appear in the member list. Exit the pod after validating.

    etcdctl member list -w table
    
  9. Get and delete the node’s etcd secrets. There should be three of them.

    oc get secrets | grep <NODE>
    

    Delete

    oc delete secret etcd-peer-<NODE>
    oc delete secret etcd-serving-<NODE>
    oc delete secret etcd-serving-metrics-<NODE>
    
  10. Add the replacement Node to the cluster using “Adding a New Node to the Cluster” above.
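
The three secrets in step 9 follow a fixed naming pattern, so the names can be derived from the node name instead of being typed out. This is a sketch only: etcd_secret_names and delete_etcd_secrets are hypothetical helper names, and host41.lab.local is the failed node from the example above.

```shell
# Both function names are hypothetical helpers.

# Print the three etcd secret names for the given node, one per line.
etcd_secret_names() {
  node=$1
  for prefix in etcd-peer etcd-serving etcd-serving-metrics; do
    printf '%s-%s\n' "$prefix" "$node"
  done
}

# Delete each of those secrets from the openshift-etcd namespace.
delete_etcd_secrets() {
  etcd_secret_names "$1" | while read -r name; do
    oc delete secret "$name" -n openshift-etcd
  done
}
```

Running etcd_secret_names host41.lab.local first and comparing its output with oc get secrets | grep host41.lab.local is a cheap way to confirm the names before calling delete_etcd_secrets.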

Verify ETCD

After adding the new node to the cluster, we need to ensure that the pods are running and force a redeployment of this etcd member using the etcd operator.

  1. Check that the etcd operator “AVAILABLE” status is “True”. If it is not, you may need to wait or troubleshoot the status.

    oc get co
    
  2. Change to the openshift-etcd project

    oc project openshift-etcd
    
  3. Check all etcd pods have been created

    oc get pods | grep etcd
    
    ../_images/getetcdpods.png
  4. RSH into any of the etcd-<node> pods

    oc rsh etcd-host42.lab.local
    
  5. From within that pod run the following command to list the etcd members.

    etcdctl member list -w table
    
  6. From within that pod run the following command to view the endpoint status.

    etcdctl endpoint status -w table
    
  7. (OPTIONAL) Force redeployment of etcd cluster.

    Attention

    This is from an older doc and is not necessary. I kept the command for reference. It may come in handy if etcd doesn’t automagically deploy and needs to be “forced”.

    oc patch etcd cluster --type merge \
      --patch '{"spec": {"forceRedeploymentReason": "single-master-recovery-'"$( date --rfc-3339=ns )"'"}}'
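
The check in step 1 can be made scriptable by reading the operator's “Available” condition directly rather than scanning the oc get co table. This is a minimal sketch, and etcd_operator_available is a hypothetical helper name.

```shell
# etcd_operator_available is a hypothetical helper name.
# Succeeds (exit status 0) only when the etcd ClusterOperator reports
# its Available condition as "True".
etcd_operator_available() {
  status=$(oc get clusteroperator etcd \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}')
  [ "$status" = "True" ]
}
```

It can be used in a condition, for example: if etcd_operator_available; then echo "etcd operator is available"; fi. The remaining checks with etcdctl still need to be run from inside an etcd pod as described in steps 4 through 6.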