Investigating slow DNS resolution in a container

Some time ago, I had a curious case of very slow DNS resolution in a container on OpenShift. The symptoms were as follows:

  • In the PHP application in the container, DNS resolution was very slow, with a 5-second delay before each lookup was resolved
  • In the container itself, DNS resolution with curl was just as slow, with the same 5-second delay before the lookup was resolved
  • However, using dig in the container itself, DNS resolution was instant
  • Also, on the worker node, DNS resolution was instant (using both dig and curl)
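
The gap between curl and dig is easier to see when the two kinds of lookup are timed side by side. The following is a minimal sketch, not the procedure used in the original investigation. It assumes that curl and PHP resolve names through the system's getaddrinfo() (which asks for both IPv4 and IPv6 addresses by default), while dig uses its own resolver code and queries only the A record unless told otherwise. The names time_lookup and compare are made up for this example.

```python
import socket
import time


def time_lookup(host, family=socket.AF_UNSPEC, resolver=socket.getaddrinfo):
    """Time one name lookup; return (elapsed_seconds, error_or_None).

    `resolver` defaults to the system resolver and is a parameter only so
    that the function can be exercised without network access.
    """
    start = time.monotonic()
    error = None
    try:
        resolver(host, None, family, socket.SOCK_STREAM)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        error = exc
    return time.monotonic() - start, error


def compare(host):
    """Print timings for a combined IPv4/IPv6 lookup and an IPv4-only lookup."""
    cases = (
        ("IPv4 + IPv6 (AF_UNSPEC)", socket.AF_UNSPEC),
        ("IPv4 only (AF_INET)", socket.AF_INET),
    )
    for label, family in cases:
        elapsed, error = time_lookup(host, family)
        status = "ok" if error is None else f"failed: {error}"
        print(f"{label}: {elapsed:.2f}s ({status})")
```

Calling compare("example.com") inside the affected container (the host name is a placeholder) should show whether only the combined lookup is slow. If the IPv4-only lookup returns immediately while the combined one takes about 5 seconds, the delay comes from the IPv6 half of the lookup rather than from the DNS server being slow in general.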

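TL;DR: Since version 2.9, glibc performs IPv4 (A) and IPv6 (AAAA) lookups in parallel. When the IPv6 lookup fails or goes unanswered, the resolver in many cases waits for a 5-second timeout before the result is returned. The fix is to set “options single-request” in “/etc/resolv.conf” (available since glibc 2.10), which makes glibc send the two queries one after the other instead of in parallel, or to disable the IPv6 stack completely.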
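
To check whether a container already has this option set, the options lines of its resolv.conf can be inspected. The helper below is a small sketch under stated assumptions: resolver_options is a hypothetical name, the sample file content is invented for illustration, and only full-line comments (starting with “#” or “;”) are skipped. glibc also knows a related option, “single-request-reopen”, which the check treats as a match too.

```python
def resolver_options(resolv_conf_text):
    """Collect every token that appears on an `options` line of resolv.conf."""
    options = set()
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and full-line comments.
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if fields[0] == "options":
            options.update(fields[1:])
    return options


def serializes_lookups(resolv_conf_text):
    """True if A and AAAA queries are configured to be sent sequentially."""
    wanted = {"single-request", "single-request-reopen"}
    return bool(resolver_options(resolv_conf_text) & wanted)


# Invented example content; real files will differ.
SAMPLE = """\
# generated file
search svc.cluster.local
nameserver 10.0.0.10
options ndots:5 single-request
"""
```

With the sample above, serializes_lookups(SAMPLE) is true. On a real system the text would come from reading /etc/resolv.conf inside the container.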
