I speak here {^_^}

How does "oc debug node" command works

May 04, 2021

This week, I’m working on Red Hat OpenShift Administration I (DO280) virtual instructor led training (VILT). This training is a preparatory course for the OpenShift Administration Certification Exam (EX280).

During one of the chapters today, around troubleshooting OpenShift clusters, I came across this command oc debug node/<node-name>, which is used for debugging cluster nodes (kubelets, crio-d containers, control-plane stuff etc). This command provides a way to open up a shell prompt in a cluster node.

But I saw something interesting while using this command. As the debug shell prompt open up (after hitting the command), there is this Pod IP: x.x.x.x appears in the terminal. I was a bit surprised to see a Pod IP coming in a node debug shell.

debug_shell

In order to confirm, if this (Pod) IP in the debug shell prompt, matched the node internal IP, I ran oc describe node master01. And to my more surprise, it does match.

node_internal_ip

Once I was sure both these IPs were same, I asked my course instructor the following question, i.e.:

When I do oc debug node/<node-name>, the debug shell prompts up, printing Pod IP: <some-ip>. This Pod IP is exactly same as the node’s internal IP. So, does that mean, a cluster node is also essentially a pod (or a container)?

He pointed me to this article (from redhat) that totally explained this above behaviour. Before I dig into what really happened (or you could simply go that article 😂). The short answer to my question, is no. The cluster nodes are not equal to pods. But the question still made a lot of sense & because the actual behaviour of oc debug node/<node-name> command imitates similar thing only.

Let’s discuss what really happened!

The new recent versions of OpenShift suggests not to directly SSH into a cluster node. This is evident from the sshd_config on the cluster nodes, where they have PermitRootLogin set to no on them.

So what really happens is when you run oc debug node/<node_name>, it creates a pod for the debugging session and points us in the shell (TTY) of that pod. This pod is a special-purpose tools container that mounts the node root file system at the /host folder, and allows us to inspect any files from the node using it. This pod remains alive only until we are using the node debug shell prompt.

To check the pod’s yaml manifest, run oc debug node/<node-name> command in one terminal & in the second terminal, list down the pods looking for the one with the name <node-name>-debug. And then run oc get pod <node-name>-debug -o yaml

Screenshot from 2021-05-04 20-40-29

The pod yaml manifest will look like the following:

apiVersion: v1
kind: Pod
metadata:
  name: master01-debug
  namespace: default
...
...
...
spec:
  containers:
    - command:
      - /bin/sh
      image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:57e87210a3f3a3ba4fc85dde180c76988a5f68445f705fd07855003986c75ab0
      name: container-00
      ...
      securityContext: 
         privileged: true
         runAsUser: 0
      ...
      tty: true
      volumeMounts: 
      - mountPath: /host  
        name: host
      - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        name: default-token-p8rss
        readOnly: true
  ...
  hostNetwork: true
  hostPID: true
...

  volumes:
  - hostPath:
      path: / 
      type: Directory
    name: host
  - name: default-token-p8rss
    secret:
      defaultMode: 420
      secretName: default-token-dnkrx
status:

  hostIP: 192.168.50.10
  phase: Running
  podIP: 192.168.50.10
  podIPs:
  - ip: 192.168.50.10

In OpenShift, there is a least possibility to run priviliged pods, with root acces (user 0). But the following section in the pod manifest shows entirely a different picture.

      securityContext: 
         privileged: true
         runAsUser: 0

The pod created with oc debug node/<node-name> command is a privileged pod with root user access. This specical kind of pod is what creates an equivalent scenario of SSHing into the node as a root user.

(The article I linked above, mentions this command, setpriv -d to run in the debug shell. It will give you an idea of how highly unrestricted access this priviliged debug pod gives us.)

So, yea, that is all what I had to share for today. For me, it was quite an interesting concept.