Adding a New Node to the Cluster
Option 1 (Preferred)
This option uses BareMetalHost and BMC for provisioning. Three objects need to be created:
Secret (BMC authentication)
Secret (NMState config)
BareMetalHost
Note
All objects are created in the “openshift-machine-api” namespace.
Tip
This method supports Remote Worker Node.
Set the variables needed to complete the steps. These all come from the new host.
NODENAME=host35
NODEMAC=52:54:00:f4:16:35
NODEUUID=a3fce101-8d6c-4f74-9145-c8e79415cc84
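A typo in any of these values tends to surface only later, as a failed inspection or a BMC connection error, so a quick shape check before continuing can help. The following is a minimal sketch: the function name `check_node_vars` is hypothetical, and it assumes the Redfish system ID is a UUID as in the example above (real BMCs may use a different ID format).

```shell
# Hypothetical helper: checks that NODENAME, NODEMAC and NODEUUID are set
# and look like a name, a MAC address and a UUID. It only checks the shape
# of the values, not whether they match the real host.
check_node_vars() {
  rc=0
  if [ -z "${NODENAME:-}" ]; then
    echo "NODENAME is empty" >&2
    rc=1
  fi
  if ! printf '%s\n' "${NODEMAC:-}" | grep -Eqi '^([0-9a-f]{2}:){5}[0-9a-f]{2}$'; then
    echo "NODEMAC does not look like a MAC address: ${NODEMAC:-}" >&2
    rc=1
  fi
  if ! printf '%s\n' "${NODEUUID:-}" | grep -Eqi '^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$'; then
    echo "NODEUUID does not look like a UUID: ${NODEUUID:-}" >&2
    rc=1
  fi
  return "$rc"
}
```

Running `check_node_vars && echo ok` after setting the variables prints `ok` only when all three values pass.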
Create the BMC authentication secret.
Important
The username and password values in the Secret must be base64-encoded.
echo -n 'kni' | base64 -w0
cat << EOF > ./$NODENAME-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: bmc-secret-$NODENAME
  namespace: openshift-machine-api
type: Opaque
data:
  password: a25p
  username: a25p
EOF
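To avoid pasting encoded values by hand, the encoding and the manifest can be generated together. This is a sketch only: `bmc_secret_yaml` is a hypothetical helper name, and it prints the same Secret layout used above with the credentials encoded on the fly.

```shell
# Hypothetical helper: prints a BMC Secret manifest with base64-encoded
# credentials. $1 = node name, $2 = BMC username, $3 = BMC password.
bmc_secret_yaml() {
  node="$1"
  # printf avoids a trailing newline; tr strips any line wrapping from base64.
  user_b64=$(printf '%s' "$2" | base64 | tr -d '\n')
  pass_b64=$(printf '%s' "$3" | base64 | tr -d '\n')
  cat << EOF
apiVersion: v1
kind: Secret
metadata:
  name: bmc-secret-$node
  namespace: openshift-machine-api
type: Opaque
data:
  username: $user_b64
  password: $pass_b64
EOF
}
```

For example, `bmc_secret_yaml "$NODENAME" kni kni > ./$NODENAME-secret.yaml` writes a file equivalent to the one above.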
Create the NMState config secret.
Important
Adjust the nmstate interface config for the new node.
cat << EOF > ./$NODENAME-nmstate.yaml
apiVersion: v1
kind: Secret
metadata:
  name: bmc-secret-nmstate-$NODENAME
  namespace: openshift-machine-api
type: Opaque
stringData:
  nmstate: |
    interfaces:
    - name: enp1s0
      type: ethernet
      mtu: 1500
      state: up
    - name: enp1s0.132
      type: vlan
      state: up
      vlan:
        base-iface: enp1s0
        id: 132
      ipv4:
        enabled: true
        dhcp: false
        address:
        - ip: 192.168.132.35
          prefix-length: 24
      ipv6:
        enabled: false
    dns-resolver:
      config:
        search:
        - lab.local
        server:
        - 192.168.1.68
    routes:
      config:
      - destination: 0.0.0.0/0
        next-hop-address: 192.168.132.1
        next-hop-interface: enp1s0.132
        table-id: 254
EOF
Create the BareMetalHost.
Important
The “credentialsName” and “preprovisioningNetworkDataName” need to match the names used in the previous two steps.
cat << EOF > ./$NODENAME-baremetal.yaml
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: $NODENAME
  namespace: openshift-machine-api
spec:
  online: true
  bootMACAddress: $NODEMAC
  bmc:
    address: redfish-virtualmedia+http://192.168.1.72:8000/redfish/v1/Systems/$NODEUUID
    credentialsName: bmc-secret-$NODENAME
    disableCertificateVerification: true
  rootDeviceHints:
    deviceName: "/dev/vda"
  preprovisioningNetworkDataName: bmc-secret-nmstate-$NODENAME
EOF
Once the files are modified and ready, create the objects:
oc create -f ./
Follow the creation progress. The BareMetalHost should show “available” when ready.
Note
Your metal3-baremetal-operator pod will have a different hash.
oc logs metal3-baremetal-operator-8749b7fd5-krgw6 -n openshift-machine-api --follow
# and/or
ssh core@$NODENAME journalctl -f
oc get bmh -n openshift-machine-api
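Instead of re-running `oc get bmh` by hand, the check can be put in a small polling loop. The sketch below is an assumption-laden example rather than part of the original procedure: `wait_for_bmh_state` is a hypothetical name, and it reads the provisioning state from `.status.provisioning.state` on the BareMetalHost.

```shell
# Hypothetical helper: polls a BareMetalHost until it reaches the wanted
# provisioning state or the attempts run out.
# Usage: wait_for_bmh_state <host> <state> [tries] [delay-seconds]
wait_for_bmh_state() {
  host="$1"; want="$2"; tries="${3:-60}"; delay="${4:-10}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    state=$(oc get bmh "$host" -n openshift-machine-api \
      -o jsonpath='{.status.provisioning.state}' 2>/dev/null)
    if [ "$state" = "$want" ]; then
      echo "$host is $state"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out; last state of $host: ${state:-unknown}" >&2
  return 1
}
```

For example, `wait_for_bmh_state "$NODENAME" available` checks every 10 seconds for up to 10 minutes.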
From the OpenShift console, confirm the new BMH is “Available”:
Go to
From the OpenShift console modify the MachineSet to add the “available” node to the cluster:
Go to
Tip
You can make this modification via the command line:
oc scale --replicas=<worker_nodes> machineset <machineset> -n openshift-machine-api
# Example:
oc scale --replicas=1 machineset ocp3-d5zw7-worker-0 -n openshift-machine-api
Option 2 (Manual)
These steps are based on Red Hat documentation. For a deeper understanding of each step, see: Adding worker nodes to single-node OpenShift clusters manually
Note
I’ve tested this on 4.12 through 4.18.
Warning
Exactly three control plane nodes must be used for all production deployments prior to 4.18. With 4.18 you can have more than three.
Important
These steps allow for the addition of a new master or worker node depending on how you set the “NODE_TYPE” variable.
Set the environment variables. Be sure to use the variables that match your running version and architecture. Specify “master” or “worker” depending on the desired node type.
OCP_VERSION=4.14.1
ARCH=x86_64
NODE_TYPE=worker
Extract the ignition file.
oc extract -n openshift-machine-api secret/$NODE_TYPE-user-data-managed --keys=userData --to=- > $NODE_TYPE.ign
Important
Place this file on a web server reachable from the control-plane network.
Create a new ignition file “new-$NODE_TYPE.ign” that includes a reference to the original “$NODE_TYPE.ign” and an additional instruction that the coreos-installer program uses to populate the /etc/hostname file on the new host.
cat << EOF > ./new-$NODE_TYPE.ign
{
  "ignition": {
    "version": "3.2.0",
    "config": {
      "merge": [
        { "source": "http://192.168.1.72/$NODE_TYPE.ign" }
      ]
    }
  },
  "storage": {
    "files": [
      {
        "path": "/etc/hostname",
        "contents": { "source": "data:,host44.lab.local" },
        "mode": 420,
        "overwrite": true
      }
    ]
  }
}
EOF
Important
Place this file on a web server reachable from the control-plane network.
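Because the hostname is baked into the wrapper ignition file, each new host needs its own copy. A minimal sketch of a generator follows; `render_ignition` and the per-host output file name in the usage note are hypothetical, and the JSON mirrors the wrapper shown above.

```shell
# Hypothetical helper: prints a wrapper ignition config for one host.
# $1 = URL of the extracted <node_type>.ign, $2 = FQDN written to /etc/hostname
render_ignition() {
  cat << EOF
{
  "ignition": {
    "version": "3.2.0",
    "config": { "merge": [ { "source": "$1" } ] }
  },
  "storage": {
    "files": [
      {
        "path": "/etc/hostname",
        "contents": { "source": "data:,$2" },
        "mode": 420,
        "overwrite": true
      }
    ]
  }
}
EOF
}
```

For example, `render_ignition http://192.168.1.72/$NODE_TYPE.ign host45.lab.local > ./new-$NODE_TYPE-host45.ign` produces a second file for host45; the `--ignition-url` used at install time would then need to point at that file.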
If needed download the OCP installer.
curl -L -k https://mirror.openshift.com/pub/openshift-v4/clients/ocp/$OCP_VERSION/openshift-install-linux.tar.gz \
  -o openshift-install-linux-$OCP_VERSION.tar.gz
Extract the installer
tar -xzvf openshift-install-linux-$OCP_VERSION.tar.gz
Discover the RHCOS ISO URL
ISO_URL=$(./openshift-install coreos print-stream-json | grep location | grep $ARCH | grep iso | cut -d\" -f4)
Download the RHCOS ISO
curl -L $ISO_URL -o rhcos-$OCP_VERSION-$ARCH-live.iso
Boot the target host from the RHCOS ISO.
If you are not using DHCP, or you need a custom network configuration, use the RHEL tools (such as nmcli or nmtui) to configure the network.
Check the block devices and “wipe” if needed.
Note
With bare metal hardware it may be necessary to “wipe” the previous block device partitions and signatures.
lsblk
sudo wipefs -af /dev/vda
Tip
Be sure to check that all partitions are “wiped” with lsblk. I’ve seen LVM partitions not get removed.
Once the network is configured and operational, run the following command:
Attention
Update the command for your ignition url and block device.
sudo coreos-installer install --copy-network --insecure-ignition --ignition-url=http://192.168.1.72/new-$NODE_TYPE.ign /dev/vda
When the install is complete, reboot the host.
Note
The machine may reboot more than once.
For the new host to join the cluster, several pending CSRs need to be approved.
Attention
The CSR approval command will need to be run more than once.
oc get csr
oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
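Since the approval has to be repeated as the node requests its client and then its serving certificate, it can be wrapped in a loop. This is a sketch under assumptions: `approve_csrs_until_ready` is a hypothetical name, it reuses the go-template from the command above, and like that command it approves every pending CSR without inspecting it.

```shell
# Hypothetical helper: approves pending CSRs in rounds until the node is Ready.
# Usage: approve_csrs_until_ready <node> [rounds] [delay-seconds]
approve_csrs_until_ready() {
  node="$1"; rounds="${2:-10}"; delay="${3:-30}"
  i=0
  while [ "$i" -lt "$rounds" ]; do
    # Names of CSRs that have no status yet, i.e. still pending.
    for csr in $(oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}'); do
      oc adm certificate approve "$csr"
    done
    # Stop as soon as the node reports the Ready condition as True.
    ready=$(oc get node "$node" \
      -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' 2>/dev/null)
    if [ "$ready" = "True" ]; then
      echo "$node is Ready"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "$node is not Ready after $rounds rounds" >&2
  return 1
}
```

For example, `approve_csrs_until_ready host44.lab.local` retries every 30 seconds for up to 10 rounds.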
After all the CSRs are approved, confirm the node was added.
oc get nodes
oc get mcp
In my example I added two new nodes, host44 and host45.
Associate Node with MachineSet
After adding the new node you’ll notice the new node is up and “Ready” for use but doesn’t match the initial nodes in the cluster. The original nodes are part of a MachineSet and associated with bare metal host objects.
Note
In older versions of OCP, the Node Overview in the console will show errors.
The following creates and associates the required objects for the new node and resolves any console errors.
Set the variables needed to complete steps. These all come from the new host.
NODENAME=host35 NODEMAC=52:54:00:f4:16:35 NODEUUID=a3fce101-8d6c-4f74-9145-c8e79415cc84
From the CLI, increase the MachineSet replica count by one.
Warning
Check the current number of replicas first. This will ensure you set the replicas to a proper number. The following command will show “DESIRED” and “CURRENT”. Be sure to increase the replicas by +1.
Not adjusting this correctly will delete existing objects.
oc get machinesets -n openshift-machine-api
oc scale --replicas=1 machineset ocp3-d5zw7-worker-0 -n openshift-machine-api
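To reduce the risk of setting the wrong replica count, the new value can be derived from the current one instead of typed in. The following is a sketch only; `scale_machineset_up` is a hypothetical helper that reads `.spec.replicas` and scales to that value plus one.

```shell
# Hypothetical helper: reads the current replica count of a MachineSet and
# scales it up by exactly one. $1 = MachineSet name.
scale_machineset_up() {
  ms="$1"
  current=$(oc get machineset "$ms" -n openshift-machine-api \
    -o jsonpath='{.spec.replicas}')
  # Refuse to scale if the current count could not be read as a number.
  case "$current" in
    ''|*[!0-9]*) echo "could not read replicas for $ms" >&2; return 1 ;;
  esac
  echo "scaling $ms from $current to $((current + 1))"
  oc scale --replicas="$((current + 1))" machineset "$ms" -n openshift-machine-api
}
```

For example, `scale_machineset_up ocp3-d5zw7-worker-0` on a MachineSet with 2 replicas scales it to 3.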
Find the name of the newly created Machine. There should be a new name in the “Provisioning” phase. Set that name to the variable MACHINENAME.
oc get machines -n openshift-machine-api
MACHINENAME=$(oc get machines -n openshift-machine-api | grep Provisioning | awk '{print $1}')
Add the new BareMetalHost by copying the following YAML and making the necessary changes for your node.
Note
Since this node was provisioned externally, we need to add the “externallyProvisioned: true” switch.
cat << EOF > ./$NODENAME-baremetal.yaml
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: $NODENAME
  namespace: openshift-machine-api
spec:
  architecture: x86_64
  automatedCleaningMode: metadata
  bmc:
    address: redfish-virtualmedia+http://192.168.1.72:8000/redfish/v1/Systems/$NODEUUID
    credentialsName: bmc-secret-$NODENAME
    disableCertificateVerification: true
  bootMACAddress: $NODEMAC
  consumerRef:
    apiVersion: machine.openshift.io/v1beta1
    kind: Machine
    name: $MACHINENAME
    namespace: openshift-machine-api
  customDeploy:
    method: install_coreos
  online: true
  externallyProvisioned: true
  userData:
    name: worker-user-data-managed
    namespace: openshift-machine-api
EOF
Add the new credentialsName Secret for the BareMetalHost.
Important
The username and password values in the Secret must be base64-encoded.
echo -n 'kni' | base64 -w0
cat << EOF > ./$NODENAME-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: bmc-secret-$NODENAME
  namespace: openshift-machine-api
type: Opaque
data:
  password: a25p
  username: a25p
EOF
Create the new objects.
oc create -f $NODENAME-secret.yaml
oc create -f $NODENAME-baremetal.yaml
Find the new BMH UID.
BMHUID=$(oc get bmh $NODENAME -n openshift-machine-api --template='{{.metadata.uid}}')
Warning
Do not attempt the next step until the new BMH object state is “provisioned” or “externally provisioned”.
oc get bmh -n openshift-machine-api
Modify the node to associate it with the BareMetalHost.
oc patch node $NODENAME --patch '{"metadata":{"annotations":{"machine.openshift.io/machine": "openshift-machine-api/'$MACHINENAME'"}}}'
oc patch node $NODENAME --patch '{"spec":{"providerID":"baremetalhost:///openshift-machine-api/'$NODENAME'/'$BMHUID'"}}'
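The quote juggling in the two patches is easy to get wrong, so it can help to build the JSON separately and inspect it before patching. A minimal sketch, with hypothetical helper names, that prints the same two patch bodies:

```shell
# Hypothetical helpers: print the JSON bodies of the two node patches.
machine_annotation_patch() {
  # $1 = Machine name
  printf '{"metadata":{"annotations":{"machine.openshift.io/machine":"openshift-machine-api/%s"}}}\n' "$1"
}

provider_id_patch() {
  # $1 = BareMetalHost name, $2 = BareMetalHost UID
  printf '{"spec":{"providerID":"baremetalhost:///openshift-machine-api/%s/%s"}}\n' "$1" "$2"
}
```

The output can then be passed to the patch command, for example `oc patch node $NODENAME --patch "$(provider_id_patch "$NODENAME" "$BMHUID")"`.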
ETCD
Back-Up
OpenShift comes with scripts that back up the etcd state. It’s best practice to back up etcd before removing and replacing a control plane node.
Determine which master node is currently the leader.
Change to the openshift-etcd project
oc project openshift-etcd
List the etcd pods
oc get pods | grep etcd
RSH into any of the etcd-<node> pods
oc rsh etcd-host41.lab.local
From within that pod run the following command to find the etcd leader. Exit pod after noting the current leader. This is where the backup script will be run from.
etcdctl endpoint status -w table
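Reading the leader out of the table by eye works, but it can also be extracted with a small filter. The sketch below assumes the table header contains an “IS LEADER” column and that the endpoint is the first column; `etcd_leader_from_table` is a hypothetical name.

```shell
# Hypothetical helper: reads "etcdctl endpoint status -w table" output on
# stdin and prints the endpoint of the row whose IS LEADER column is true.
etcd_leader_from_table() {
  awk -F'|' '
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    # Until the header row is seen, look for the IS LEADER column index.
    col == 0 { for (i = 1; i <= NF; i++) if (trim($i) == "IS LEADER") col = i; next }
    # Data rows: field 1 is empty (text before the first "|"), field 2 is the endpoint.
    col > 0 && trim($col) == "true" { print trim($2) }
  '
}
```

One possible use from a workstation is `oc rsh etcd-host41.lab.local etcdctl endpoint status -w table | etcd_leader_from_table`, which prints the leader's endpoint URL; mapping that address back to a node name is still a manual step.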
Connect to the etcd leader node via ssh
ssh core@host41.lab.local
Execute the etcd backup script
sudo /usr/local/bin/cluster-backup.sh /home/core/etcd-backup
Verify both snapshot_<TIME_STAMP>.db and static_kuberesources_<TIME_STAMP>.tar.gz exist. Move files to a safe location in the event of failure.
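The file check can be scripted so that a partial backup is noticed before the node is touched. This is a sketch, with `verify_etcd_backup` as a hypothetical name; it only checks that both files exist in the backup directory, not that the snapshot is restorable.

```shell
# Hypothetical helper: confirms that both backup artifacts exist.
# $1 = backup directory, e.g. /home/core/etcd-backup
verify_etcd_backup() {
  dir="$1"
  for pattern in 'snapshot_*.db' 'static_kuberesources_*.tar.gz'; do
    # $pattern is left unquoted on purpose so the shell expands the glob;
    # ls fails if nothing matches.
    if ! ls "$dir"/$pattern >/dev/null 2>&1; then
      echo "missing $pattern in $dir" >&2
      return 1
    fi
  done
  echo "backup in $dir looks complete"
}
```

Run it on the node where the backup was taken, for example `verify_etcd_backup /home/core/etcd-backup` (add `sudo` if the files are not readable by the core user).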
Clean-Up
In the event of a control node failure the failed node must be removed from etcd. Before starting be sure to follow the previous section backing up etcd.
Remove failed node
oc delete node host41.lab.local
Confirm removal
oc get nodes
Change to the openshift-etcd project
oc project openshift-etcd
List the etcd pods
oc get pods | grep etcd
RSH into any of the etcd-<node> pods
oc rsh etcd-host42.lab.local
From within that pod run the following command to list the etcd members. Note the ID associated with the failed master.
etcdctl member list -w table
Remove the NODE from the etcd database using the ID noted in the previous step.
etcdctl member remove <ID>
Validate removal. The failed member should no longer appear in the member list. Exit the pod after validating.
etcdctl member list -w table
Get and delete the node’s etcd secrets. There should be three of them.
oc get secrets | grep <NODE>
Delete
oc delete secret etcd-peer-<NODE>
oc delete secret etcd-serving-<NODE>
oc delete secret etcd-serving-metrics-<NODE>
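The three deletions follow one naming pattern, so they can be expressed as a loop. A minimal sketch follows; `delete_etcd_secrets` is a hypothetical name, and the namespace is passed explicitly so it does not depend on the current project.

```shell
# Hypothetical helper: deletes the three etcd secrets of a removed member.
# $1 = node name as it appears in the secret names, e.g. host41.lab.local
delete_etcd_secrets() {
  node="$1"
  for prefix in etcd-peer etcd-serving etcd-serving-metrics; do
    oc delete secret "$prefix-$node" -n openshift-etcd
  done
}
```

For example, `delete_etcd_secrets host41.lab.local` removes the peer, serving, and serving-metrics secrets for that node.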
Add the replacement Node to the cluster using “Adding a New Node to the Cluster” above.
Verify ETCD
After adding the new node to the cluster, we need to ensure that the pods are running and force a redeployment of this etcd member using the etcd operator.
Check the etcd operator “AVAILABLE” status is “True”. If not you may need to wait or troubleshoot the status.
oc get co
Change to the openshift-etcd project
oc project openshift-etcd
Check all etcd pods have been created
oc get pods | grep etcd
RSH into any of the etcd-<node> pods
oc rsh etcd-host42.lab.local
From within that pod run the following command to list the etcd members.
etcdctl member list -w table
From within that pod run the following command to view the endpoint status.
etcdctl endpoint status -w table
(OPTIONAL) Force redeployment of etcd cluster.
Attention
This is from an older doc and is not necessary. I kept the command for reference. It may come in handy if etcd doesn’t automagically deploy and needs to be “forced”.
oc patch etcd cluster --type merge \
  --patch '{"spec": {"forceRedeploymentReason": "single-master-recovery-'"$( date --rfc-3339=ns )"'"}}'