Revolgy blog

Debugging kube-dns and using FQDNs

Written by Marek Bartík | May 3, 2019

Last week one of our clients started getting a lot of application errors after migrating their main service to Google Kubernetes Engine. They quickly found that kube-dns was logging a lot of errors and consuming a suspicious amount of CPU.

It was a relatively small GKE cluster (around 16 nodes at peak). On GKE, kube-dns is deployed automatically and its manifests are kept in sync from the master nodes, so you cannot simply change the kube-dns Deployment.

If you take a look at

kubectl edit cm -n kube-system kube-dns-autoscaler

you see

apiVersion: v1
data:
  linear: '{"coresPerReplica":256,"nodesPerReplica":16,"preventSinglePointFailure":true}'
kind: ConfigMap

this sets the number of kube-dns replicas to 2 on our cluster: the linear mode computes replicas as max(ceil(cores/coresPerReplica), ceil(nodes/nodesPerReplica)), so 16 nodes with nodesPerReplica 16 gives 1 replica, and preventSinglePointFailure bumps it to 2 for HA.

Changing "nodesPerReplica":16 to "nodesPerReplica":4 made kube-dns scale to 4 replicas (16 nodes / 4 nodes per replica) and we got rid of a lot of errors.
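
The same change can be applied non-interactively (a sketch, assuming the default ConfigMap name and namespace shown above):

kubectl patch cm kube-dns-autoscaler -n kube-system --type merge \
  -p '{"data":{"linear":"{\"coresPerReplica\":256,\"nodesPerReplica\":4,\"preventSinglePointFailure\":true}"}}'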

We weren’t monitoring kube-dns, and by default there are no metrics or verbose logs for kube-dns (I hope this changes in newer k8s versions on GKE with CoreDNS).

When we had time to do so, we tried the following

kubectl exec -it -n kube-system kube-dns-788979dc8f-9qmrz -- sh

# apk add --update tcpdump

# timeout -t 60 tcpdump -lvi any "udp port 53" | tee /tmp/tcpdumps

# grep -E 'A\?' /tmp/tcpdumps | sed -e 's/^.*A? //' -e 's/ .*//' | sort | uniq -c | sort -nr | awk '{printf "%s %s\n", $2, $1}'

this gives us a sorted list of the most requested DNS names seen by ONE kube-dns replica over the last minute.

app-redis-cache-01.c.example-project-name.internal. 1688

app-elk-01.c.example-project-name.internal. 1430

app-redis-cache-01.cluster.local. 1148

app-redis-cache-01.svc.cluster.local. 1140

app-redis-cache-01.prod-namespace.svc.cluster.local. 1118

app-elk-01.svc.cluster.local. 984

app-elk-01.cluster.local. 982

app-elk-01.prod-namespace.svc.cluster.local. 922

www.googleapis.com.google.internal. 68

oauth2.googleapis.com. 50

and others…

The most interesting entry is app-redis-cache-01. It is the app’s cache stored in Redis, which had recently, for some reason, been moved from GKE to a GCE instance group. The app’s configuration references the cache as “app-redis-cache-01”, a short, unqualified DNS name.

Given the configuration in /etc/resolv.conf (ndots:5; for more info see https://pracucci.com/kubernetes-dns-resolution-ndots-options-and-why-it-may-affect-application-performances.html), the resolver searched for app-redis-cache-01 as app-redis-cache-01.prod-namespace.svc.cluster.local., app-redis-cache-01.svc.cluster.local., app-redis-cache-01.cluster.local. and finally app-redis-cache-01.c.example-project-name.internal., which at last resolved successfully.
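
For context, a pod’s /etc/resolv.conf on such a cluster looks roughly like this (a sketch: the nameserver IP is a placeholder and the exact search domains depend on the namespace and project):

nameserver 10.0.0.10
search prod-namespace.svc.cluster.local svc.cluster.local cluster.local c.example-project-name.internal google.internal
options ndots:5

Because app-redis-cache-01 contains fewer than five dots, the resolver tries every search suffix before treating the name as absolute, which is exactly the query pattern visible in the tcpdump output above.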

The app was querying ‘hey, where’s my redis cache’ 350 times a second.

The simple fix was to change the app’s config to use the FQDN app-redis-cache-01.c.example-project-name.internal. (notice the trailing dot) instead of app-redis-cache-01, and to do the same for ELK.
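
A quick way to see the difference from inside an app pod (a sketch, assuming getent is available in the image):

# short name: expanded through the resolv.conf search list, so several queries hit kube-dns
getent hosts app-redis-cache-01

# FQDN with a trailing dot: treated as absolute and resolved with a single lookup
getent hosts app-redis-cache-01.c.example-project-name.internal.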

This decreased the load on kube-dns significantly and reduced DNS resolution latency, making requests to the cache faster. It is still just a hotfix, though; we need to think more about how we use DNS and how it affects our app’s performance.

We’ll probably end up playing a bit with the ndots setting, hostAliases and dnsPolicy, and look at caching DNS responses in the app; a rough sketch of those pod-level settings follows below.

https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
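
For illustration, those pod-level knobs could look roughly like this (a sketch only; the ndots value, IP address and image are made-up placeholders, not values from this incident):

apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  dnsPolicy: ClusterFirst          # keep cluster DNS for Service names
  dnsConfig:
    options:
      - name: ndots
        value: "1"                 # names containing a dot are tried as-is first
  hostAliases:                     # static /etc/hosts entries, bypassing DNS entirely
    - ip: "10.128.0.5"             # placeholder IP for the redis instance
      hostnames:
        - "app-redis-cache-01"
  containers:
    - name: app
      image: example/app:latest    # placeholder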

FAQs

Q1: What were the initial symptoms that indicated a problem with kube-dns on the GKE cluster?

After a service migration, the client began experiencing a high volume of application errors. An investigation quickly found that kube-dns was logging many errors and consuming a suspicious amount of CPU.

Q2: What was the immediate action taken to relieve the pressure on kube-dns?

The kube-dns-autoscaler ConfigMap was edited to change the nodesPerReplica setting from 16 to 4. This forced kube-dns to scale up from 2 to 4 replicas, which immediately got rid of many of the errors.

Q3: How was the source of the excessive DNS queries identified without having pre-existing monitoring?

The team used kubectl exec to get a shell inside one of the kube-dns pods and then ran tcpdump for 60 seconds to capture all DNS traffic. This captured data was then processed to create a sorted list of the most frequently requested DNS queries.

Q4: What was the root cause of the high number of DNS requests flooding the kube-dns service?

The root cause was an application repeatedly trying to resolve a local, short DNS name for a service (a Redis cache) that had been moved outside the GKE cluster. Due to the pod’s DNS search path configuration (ndots:5), a single query for the short name resulted in multiple failed lookups before the correct name was finally resolved.

Q5: How did the application’s configuration contribute to this problem?

The application’s configuration was referencing its Redis cache by the short name “app-redis-cache-01”. This triggered the Kubernetes DNS search mechanism to try multiple domain suffixes (e.g., .prod-namespace.svc.cluster.local, .svc.cluster.local) before finally finding the correct internal compute domain, generating numerous unnecessary queries.

Q6: What simple configuration change served as an effective hotfix?

The hotfix was to change the application’s configuration to use the service’s Fully Qualified Domain Name (FQDN) with a trailing dot (app-redis-cache-01.c.example-project-name.internal.) instead of the short name. This eliminated the unnecessary search queries and significantly reduced the load on kube-dns.

Q7: What are some potential long-term solutions mentioned for optimizing DNS performance in GKE?

Potential long-term solutions include adjusting the ndots settings in resolv.conf, using HostAliases to add entries directly to a pod’s /etc/hosts file, changing the pod’s dnsPolicy, and implementing DNS request caching within the application itself.