Creating a sosreport on CoreOS

With OpenShift 4, Red Hat introduced Red Hat Enterprise Linux CoreOS. It is a very minimalistic operating system, focused on running container workloads.

This new minimalism comes with some challenges. You can no longer simply install RPM packages, and most of the tools we know and love are missing! Luckily, there is the Red Hat supplied toolbox container, which contains all the necessary tools and is nicely integrated.

To start the toolbox, first use oc debug node/<nodename>. This starts a privileged container on the node you specify, mounts the host filesystem on /host and drops you into a shell. From there, run chroot /host and then toolbox:

$ oc debug node/worker-0.lab.openshift.krenger.ch
Starting pod/worker-0labopenshiftkrengerch-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# toolbox
Container started successfully. To exit, type 'exit'.
sh-4.2#

Now we are running in the toolbox container on our CoreOS host with all the tools we know at our disposal, for example sosreport:

sh-4.2# sosreport

Running sosreport will generate a sosreport in /host/var/tmp/, which means it will be accessible in /var/tmp/ on the CoreOS host itself.
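
To pick up the archive afterwards, a small helper like the following can be handy. It is only a sketch: the function name latest_sosreport is made up here, and the sosreport-*.tar.xz file name pattern is an assumption that may differ between sos versions.

```shell
# Print the most recently modified sosreport archive in a directory.
# Assumption: archives are named sosreport-*.tar.xz (may differ per sos version).
latest_sosreport() {
  sos_dir="${1:-/var/tmp}"
  # ls -t sorts newest first; the error is silenced if nothing matches
  ls -t "$sos_dir"/sosreport-*.tar.xz 2>/dev/null | head -n 1
}
```

On the CoreOS host, latest_sosreport without arguments looks in /var/tmp; inside the toolbox you would pass /host/var/tmp instead.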

OpenShift 4 Upgrade Paths

For OpenShift 4, the upgrade paths are kept in the cincinnati-graph-data repository as YAML files and then exposed via an API.

There is a Red Hat Solution describing how this data can be queried via api.openshift.com and how you can use this data in your automation:

$ curl -sH 'Accept:application/json' 'https://api.openshift.com/api/upgrades_info/v1/graph?channel=fast-4.2&arch=amd64' | jq .

While this data is quite helpful for automation (the Solution also describes helpful queries), the raw data is not very nice to look at. If you are looking for a graphical presentation of that data, you should check out this wonderful website that is maintained by a Red Hat colleague with hourly generated data: www.ocp-upgrade.net
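
If you want to query the graph yourself, note that its edges are pairs of indexes into the nodes array, so they have to be mapped back to version numbers. A minimal sketch of that mapping follows. The function name upgrade_targets is hypothetical, and it assumes that jq is installed and that the response has the nodes/edges layout described here.

```shell
# Read a graph (JSON) on stdin and print every version that the given
# version can be upgraded to directly, one per line.
upgrade_targets() {
  jq -r --arg from "$1" '
    . as $g
    # position of the starting version in the nodes array (null if unknown)
    | ($g.nodes | map(.version) | index($from)) as $i
    | $g.edges[]
    # each edge is [from_index, to_index]
    | select(.[0] == $i)
    | .[1] as $to
    | $g.nodes[$to].version
  '
}
```

Piping the curl call from above into upgrade_targets 4.2.0 would then list the direct upgrade targets of 4.2.0 in that channel. An unknown version simply produces no output.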

Missing X-Forwarded-For header in Spring Boot application

So here is another one from the trenches.

More than once, one of our OpenShift Container Platform customers has approached us and said something along the lines of: “Help, I cannot see the X-Forwarded-For header in my application, our OpenShift Router is probably configured incorrectly!”

In such cases, it is often a good idea to check what is really being forwarded to the Pods in the cluster. For this, I typically use my simonkrenger/echoenv container to print the headers received by the application. In many cases, it turns out that the application affected is a Spring Boot application and the header is passed correctly to the Pod itself. But the Spring Boot application does not show the header anyway.

We have observed a behaviour of Spring Boot that leads to the X-Forwarded-For header not being passed to the application, as it is consumed by Spring Boot. In the application.properties of a Spring Boot application, the following setting controls this:

server.use-forward-headers: true

This configuration leads to the header being consumed by Spring Boot and the header not being available in the application. See also the relevant sections in Spring documentation. Good to know.
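
When triaging such a report, it can save time to check the application's configuration for this setting before looking at the router. A rough sketch: the function name uses_forward_headers is a placeholder, and it only inspects a properties file, not YAML configuration, environment variables or command-line arguments.

```shell
# Succeeds if the given properties file sets server.use-forward-headers to true.
# Both "key=value" and "key: value" are accepted, as in Java properties files.
uses_forward_headers() {
  grep -Eq '^[[:space:]]*server\.use-forward-headers[[:space:]]*[=:][[:space:]]*true[[:space:]]*$' "$1"
}
```

For example, uses_forward_headers src/main/resources/application.properties && echo "header is consumed by Spring Boot" (the path is just the conventional location and may differ in your project).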

Exploring the OpenShift etcd with etcdctl

Kubernetes uses etcd as the persistent store for API data. As etcd is a distributed key-value store, we can also use command line tools to query this store. The examples in this post are for OpenShift 3.x.

Apart from just using get, there is also the possibility to perform the following actions on certain keys:

  • put to write to a key – unless you know what you are doing, don’t touch the Kubernetes data in etcd, as this will manifest in very strange Kubernetes behaviour.
  • del to delete a key – also, this may break your Kubernetes cluster by introducing inconsistencies.
  • watch to keep a watch on an object. This is very helpful to track changes on a certain object.

The get action is probably the most helpful functionality for in-depth API debugging directly within etcd.
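
For reference, a wrapper along these lines keeps the TLS options out of the way. Treat it as a sketch: the certificate paths, the endpoint and the /kubernetes.io key prefix are assumptions based on a typical OpenShift 3.x master and have to be checked on your installation, and the function name etcd3 is made up.

```shell
# Assumed defaults for an OpenShift 3.x master; override via the environment.
ETCD_ENDPOINT="${ETCD_ENDPOINT:-https://127.0.0.1:2379}"
ETCD_CERT_DIR="${ETCD_CERT_DIR:-/etc/etcd}"

# Call etcdctl with the v3 API and the TLS options filled in.
etcd3() {
  ETCDCTL_API=3 etcdctl \
    --endpoints="$ETCD_ENDPOINT" \
    --cacert="$ETCD_CERT_DIR/ca.crt" \
    --cert="$ETCD_CERT_DIR/peer.crt" \
    --key="$ETCD_CERT_DIR/peer.key" \
    "$@"
}

# Read-only examples (key prefix is an assumption):
#   etcd3 get /kubernetes.io/namespaces --prefix --keys-only
#   etcd3 watch /kubernetes.io/namespaces/default
```

The wrapper deliberately adds nothing for put or del, so that writing to the store stays a conscious decision.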


vim settings for YAML files

For editing YAML, be it for OpenShift / Kubernetes or Ansible, having your editor set up right can help to avoid common mistakes. So here is the minimalistic config in my ~/.vimrc to make working with YAML files easier. I am sure there are even more plugins or settings available, but this minimal set of commands works fine for me:

set ts=2
set sts=2
set sw=2
set expandtab

syntax on
filetype indent plugin on

set ruler

Investigating slow DNS resolution in container

Some time ago, I had a curious case of very slow DNS resolution in a container on OpenShift. The symptoms were as follows:

  • In the PHP application in the container, DNS resolution was very slow, with a 5-second delay before the lookup was resolved
  • In the container itself, DNS resolution for curl was very slow, with a 5-second delay before the lookup was resolved
  • However, using dig in the container itself, DNS resolution was instant
  • Also, on the worker node, the DNS resolution was instant (using both dig and curl)

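TL;DR: Since version 2.9, glibc performs IPv4 and IPv6 lookups in parallel. When the IPv6 lookup fails, there is a 5-second timeout in many cases before the lookup is returned. dig was not affected because it does not use the glibc resolver. To work around this, make glibc perform the two lookups sequentially by setting “options single-request” in “resolv.conf” (available since glibc 2.10), or disable the IPv6 stack completely.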
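
A quick way to see whether a resolver configuration already carries the single-request option is to look at its options lines. A minimal sketch follows; the function name has_single_request is made up for illustration.

```shell
# Succeeds if the file (default: /etc/resolv.conf) has an "options" line
# that lists single-request as a whole word. single-request-reopen on its
# own does not count, as it is a different option.
has_single_request() {
  grep -Eq '^[[:space:]]*options[[:space:]]+(.*[[:space:]])?single-request([[:space:]]|$)' "${1:-/etc/resolv.conf}"
}
```

To reproduce the symptom itself, comparing time getent ahosts <hostname> with time dig <hostname> inside the container should show the difference, since getent goes through glibc while dig brings its own resolver.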


Podman: “desc:bad request: add_hostfwd: slirp_add_hostfwd failed”

In the past few months, on all my machines I have replaced Docker with Podman and mostly the transition has been quite smooth. There are still some rough edges here and there, but the overall experience of using Podman has been great!

However, when trying to start a very simple container, one often runs into the following issue:

$ podman run -p80:80 nginx:latest 
Error: error from slirp4netns while setting up port redirection: map[desc:bad request: add_hostfwd: slirp_add_hostfwd failed]

The error message looks very cryptic, but the issue is quite simple: As a regular user, one is typically not allowed to bind ports < 1024. So by trying to bind port 80, you will get the error above.

The fix is trivial. Just use a port number of 1024 or higher:

$ podman run -p8080:80 -d nginx:latest 
22d2be2966e9cb77246a8b698f9024de89f4e6d1a0edfe44209bbe4fd27aa8b5
$ curl localhost:8080
[..]
Welcome to nginx!
[..]

If you really need to use a port number lower than 1024, there are multiple ways to configure that:

  • Set net.ipv4.ip_unprivileged_port_start=80 or similar in your sysctl
  • Add the CAP_NET_BIND_SERVICE capability to your process or user
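
To check up front whether a given host port will work for a regular user, the current threshold can be read from the kernel. A small sketch under assumptions: the function name can_bind_unprivileged is hypothetical, and the fallback of 1024 is only used when the sysctl file is not readable.

```shell
# Succeeds if an unprivileged process may bind the given port.
# The threshold is read from the kernel unless passed as second argument.
can_bind_unprivileged() {
  port="$1"
  start="${2:-$(cat /proc/sys/net/ipv4/ip_unprivileged_port_start 2>/dev/null || echo 1024)}"
  [ "$port" -ge "$start" ]
}
```

For example, can_bind_unprivileged 80 || echo "pick a higher host port" before calling podman run.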

Kubernetes: BASH function to change namespace

When you work with a lot of different namespaces in Kubernetes and only know the “oc project” command from OpenShift, you start to miss an easy way to change namespaces.

The official documentation to switch namespaces proposes something like this:

$ kubectl config set-context $(kubectl config current-context) --namespace=<insert-namespace-name-here>

Not something that I want to type regularly. First I tried to create a BASH alias, which did not work. So I looked around for BASH functions and found that Jon Whitcraft proposed a nice one in a GitHub issue. I lightly modified it and placed it in my own .bashrc file:

function kubectlns() {
  local ctx ns
  ctx=$(kubectl config current-context)

  # verify that the namespace exists; ns stays empty if the lookup fails
  ns=$(kubectl get namespace "$1" --no-headers --output='go-template={{.metadata.name}}' 2>/dev/null)
  if [ -z "${ns}" ]; then
    echo "Namespace (${1}) not found, using default"
    ns="default"
  fi

  kubectl config set-context "${ctx}" --namespace="${ns}"
}

So to change your namespace, use something like this:

$ kubectlns simon
Context "kubernetes-admin@kubernetes" modified.

Nice and short.