How does the `oc debug node` command work
May 04, 2021
This week, I’m working on the Red Hat OpenShift Administration I (DO280) virtual instructor-led training (VILT). This training is a preparatory course for the OpenShift Administration certification exam (EX280).
During one of today's chapters, on troubleshooting OpenShift clusters, I came across the command `oc debug node/<node-name>`, which is used for debugging cluster nodes (CRI-O containers, control-plane components, etc.). This command provides a way to open a shell prompt on a cluster node.
But I noticed something interesting while using this command. As the debug shell prompt opens up (after running the command), a line like `Pod IP: x.x.x.x` appears in the terminal. I was a bit surprised to see a Pod IP in a *node* debug shell.
To confirm whether this (Pod) IP in the debug shell prompt matched the node's internal IP, I ran `oc describe node master01`. And to my surprise, it did match.
Once I was sure both IPs were the same, I asked my course instructor the following question: when I run `oc debug node/<node-name>`, the debug shell prompts up, printing `Pod IP: <some-ip>`. This Pod IP is exactly the same as the node's internal IP. So does that mean a cluster node is also essentially a pod (or a container)?
He pointed me to an article (from Red Hat) that fully explains this behaviour. Before I dig into what really happened (or you could simply read that article 😂), the short answer to my question is no: cluster nodes are not pods. But the question still made a lot of sense, because the `oc debug node/<node-name>` command imitates something very similar.
Let’s discuss what really happened!
Recent versions of OpenShift discourage SSHing directly into a cluster node. This is evident from the `sshd_config` on the cluster nodes, where `PermitRootLogin` is set to `no`.
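For illustration, the relevant directive in the node's `/etc/ssh/sshd_config` looks like this (a minimal excerpt; the real file contains many more directives):

```
# /etc/ssh/sshd_config (excerpt)
# Root logins over SSH are disabled on cluster nodes
PermitRootLogin no
```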
So what really happens when you run `oc debug node/<node_name>` is that it creates a pod for the debugging session and drops us into a shell (TTY) inside that pod. This pod is a special-purpose tools container that mounts the node's root file system at the `/host` folder, allowing us to inspect any file on the node through it. The pod stays alive only for as long as we are using the node debug shell prompt.
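As a sketch of what a typical session looks like (the node name `master01` and the IP are just examples, and the exact output may differ between OpenShift versions), you can even `chroot` into the mounted host file system to run commands as if you were on the node itself:

```console
$ oc debug node/master01
Starting pod/master01-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.50.10
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# grep PermitRootLogin /etc/ssh/sshd_config
PermitRootLogin no
```

The `chroot /host` step works because the pod mounts the node's root file system at `/host`, as described above.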
To check the pod's YAML manifest, run `oc debug node/<node-name>` in one terminal, and in a second terminal list the pods, looking for the one named `<node-name>-debug`. Then run `oc get pod <node-name>-debug -o yaml`.
The pod's YAML manifest will look like the following:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: master01-debug
  namespace: default
  ...
spec:
  containers:
  - command:
    - /bin/sh
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:57e87210a3f3a3ba4fc85dde180c76988a5f68445f705fd07855003986c75ab0
    name: container-00
    ...
    securityContext:
      privileged: true
      runAsUser: 0
    ...
    tty: true
    volumeMounts:
    - mountPath: /host
      name: host
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-p8rss
      readOnly: true
  ...
  hostNetwork: true
  hostPID: true
  ...
  volumes:
  - hostPath:
      path: /
      type: Directory
    name: host
  - name: default-token-p8rss
    secret:
      defaultMode: 420
      secretName: default-token-dnkrx
status:
  ...
  hostIP: 192.168.50.10
  phase: Running
  podIP: 192.168.50.10
  podIPs:
  - ip: 192.168.50.10
```
Notice `hostNetwork: true` in the manifest: the debug pod shares the node's network namespace, which is exactly why the `Pod IP` printed in the debug shell matches the node's internal IP. In OpenShift, pods are rarely allowed to run privileged with root access (user 0). But the following section of the pod manifest shows an entirely different picture.
```yaml
securityContext:
  privileged: true
  runAsUser: 0
```
The pod created by the `oc debug node/<node-name>` command is a privileged pod with root user access. This special kind of pod is what creates the equivalent of SSHing into the node as the root user.
(The article I linked above mentions running the command `setpriv -d` in the debug shell. It gives you an idea of just how unrestricted the access this privileged debug pod grants is.)
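For reference, `setpriv -d` (`--dump`) from util-linux prints the current process's privilege state. Inside the privileged debug pod, you would expect output roughly along these lines (a sketch only; the exact fields and capability list depend on the util-linux and kernel versions):

```console
sh-4.4# setpriv -d
uid: 0
euid: 0
gid: 0
egid: 0
...
Capability bounding set: cap_chown,cap_dac_override,cap_sys_admin,...
```

Running as uid 0 with an essentially full capability bounding set is what makes the debug pod comparable to a root SSH session on the node.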
So, yeah, that is all I had to share for today. For me, it was quite an interesting concept.