Debugging kube-dns and using FQDNs

by Marek Bartík

Last week one of our clients started getting a lot of application errors after they migrated their main service to Google Kubernetes Engine. They quickly found that kube-dns was logging a lot of errors and consuming a suspicious amount of CPU.

It was a relatively small GKE cluster (around 16 nodes at peak). On GKE, kube-dns is deployed automatically and its manifests are synchronized from the master nodes, so you cannot simply change the kube-dns deployment.

If you take a look at

kubectl edit cm -n kube-system kube-dns-autoscaler

you see

apiVersion: v1
data:
  linear: '{"coresPerReplica":256,"nodesPerReplica":16,"preventSinglePointFailure":true}'
kind: ConfigMap

This sets the number of kube-dns replicas to 2: 1 replica per 16 nodes, raised to a minimum of 2 for HA by preventSinglePointFailure.

Changing "nodesPerReplica":16 to "nodesPerReplica":4 made kube-dns scale to 4 replicas and we got rid of a lot of errors.
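
To make the replica counts above easier to check, here is a minimal Python sketch of how the autoscaler's linear mode is commonly described: take the larger of the core-based and node-based counts, then enforce a floor of 2 when preventSinglePointFailure is set and the cluster has more than one node. The function name and the 64-core figure are assumptions for illustration; the post only gives the node count.

```python
import math


def linear_replicas(nodes, cores, cores_per_replica, nodes_per_replica,
                    prevent_single_point_failure=True):
    """Approximate the replica count chosen by the 'linear' autoscaler mode."""
    by_cores = math.ceil(cores / cores_per_replica)
    by_nodes = math.ceil(nodes / nodes_per_replica)
    replicas = max(by_cores, by_nodes, 1)
    # With more than one node, keep at least two replicas for HA.
    if prevent_single_point_failure and nodes > 1:
        replicas = max(replicas, 2)
    return replicas


# Hypothetical cluster: 16 nodes with 4 cores each (core count is assumed).
print(linear_replicas(16, 64, 256, 16))  # default nodesPerReplica
print(linear_replicas(16, 64, 256, 4))   # after the change
```

Under these assumptions the first call yields 2 replicas and the second yields 4, which matches what we saw on the cluster.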

We weren’t monitoring kube-dns, and by default there are no metrics or verbose logs for kube-dns (I hope this changes in newer k8s versions on GKE with CoreDNS).

When we had time to do so, we tried the following

kubectl exec -it kube-dns-788979dc8f-9qmrz sh
# apk add --update tcpdump
# timeout -t 60 -- tcpdump -lvi any "udp port 53" | tee /tmp/tcpdumps
# grep -E 'A\?' /tmp/tcpdumps | sed -e 's/^.*A? //' -e 's/ .*//' | sort | uniq -c | sort -nr | awk '{printf "%s %s\n", $2, $1}'

This gives us a list of the most requested DNS names during the last minute on ONE kube-dns replica, sorted by query count.

app-redis-cache-01.c.example-project-name.internal. 1688
app-elk-01.c.example-project-name.internal. 1430
app-redis-cache-01.cluster.local. 1148
app-redis-cache-01.svc.cluster.local. 1140 1118
app-elk-01.svc.cluster.local. 984
app-elk-01.cluster.local. 982 922 68 50
and others…
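
If you would rather do the counting step outside the container, the grep/sed/sort pipeline above can be approximated in Python. This is a rough sketch, not what we ran: the function name is made up, and the sample lines only imitate the shape of tcpdump's DNS output. Like the grep pattern, it matches any record type ending in "A?", so AAAA queries are counted too.

```python
import re
from collections import Counter

# Captures the queried name that follows "A? " in a tcpdump DNS line.
QUERY_RE = re.compile(r"A\? (\S+)")


def count_queries(lines):
    """Return (name, count) pairs, most requested first."""
    counts = Counter()
    for line in lines:
        match = QUERY_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts.most_common()


# Hypothetical capture lines, shaped like tcpdump output.
sample = [
    "IP 10.0.0.1.40000 > 10.0.0.2.53: 1+ A? app-redis-cache-01.cluster.local. (50)",
    "IP 10.0.0.1.40001 > 10.0.0.2.53: 2+ A? app-redis-cache-01.cluster.local. (50)",
    "IP 10.0.0.1.40002 > 10.0.0.2.53: 3+ AAAA? app-redis-cache-01.cluster.local. (50)",
    "IP 10.0.0.1.40003 > 10.0.0.2.53: 4+ A? app-elk-01.cluster.local. (42)",
    "IP 10.0.0.2.53 > 10.0.0.1.40003: 4 NXDomain 0/1/0 (117)",
]

for name, count in count_queries(sample):
    print(name, count)
```

To use it on a real capture, pass the open file instead of the sample list, for example `count_queries(open("/tmp/tcpdumps"))`.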

The most interesting one is app-redis-cache-01. It’s the app cache stored in Redis that was recently, for some reason, moved from GKE to a GCE instance group. The app’s configuration references the Redis cache as "app-redis-cache-01", which is a short, unqualified DNS name.

Given the configuration in /etc/resolv.conf (ndots:5), the resolver appended each search domain in turn, trying names including app-redis-cache-01.svc.cluster.local., app-redis-cache-01.cluster.local. and then finally app-redis-cache-01.c.example-project-name.internal., which resolved successfully.
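
The search behaviour is easier to see in code. Below is a small sketch of the usual resolver rule: a name with a trailing dot is queried as-is, a name with fewer dots than ndots goes through the search list first, and anything else is tried as-is before the search list. The search list is an assumption modelled on a typical pod in the "default" namespace; the real one comes from the pod's /etc/resolv.conf.

```python
def candidate_names(name, search_domains, ndots=5):
    """List the names a resolver would try, in order, for one lookup."""
    if name.endswith("."):
        # Fully qualified: no search list expansion at all.
        return [name]
    searched = [f"{name}.{domain}." for domain in search_domains]
    as_is = name + "."
    if name.count(".") >= ndots:
        return [as_is] + searched
    return searched + [as_is]


# Hypothetical search list; the namespace "default" is an assumption.
SEARCH = [
    "default.svc.cluster.local",
    "svc.cluster.local",
    "cluster.local",
    "c.example-project-name.internal",
]

short = candidate_names("app-redis-cache-01", SEARCH)
fqdn = candidate_names("app-redis-cache-01.c.example-project-name.internal.", SEARCH)

print(short)  # several attempts before the .internal name is reached
print(fqdn)   # a single attempt
```

With this assumed search list, the short name costs three failed lookups before the fourth one succeeds, while the name with the trailing dot is resolved in a single query.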

The app was querying ‘hey, where’s my redis cache’ 350 times a second.

The simple fix was to change the app’s config to use the FQDN app-redis-cache-01.c.example-project-name.internal. (notice the trailing dot) instead of app-redis-cache-01, and to do the same for elk.

This decreased the load on kube-dns significantly and reduced the delay in resolving DNS queries, making requests to the cache faster. It is still a hotfix, though. We need to think more about how we use DNS and how it affects our app’s performance.

We’ll probably end up playing a bit with the ndots setting, hostAliases and dnsPolicy, and look at caching DNS requests in our app.
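
As a rough idea of what in-app caching could look like, here is a minimal sketch of a resolver wrapper that remembers answers for a fixed time. The class name and the 30-second lifetime are assumptions. It does not honour the TTL of the DNS record itself, so treat it as a starting point and not a finished design.

```python
import socket
import time


class CachingResolver:
    """Cache hostname lookups for a fixed number of seconds."""

    def __init__(self, ttl_seconds=30.0, lookup=socket.gethostbyname,
                 clock=time.monotonic):
        self._ttl = ttl_seconds
        self._lookup = lookup
        self._clock = clock
        self._cache = {}  # hostname -> (address, expiry time)

    def resolve(self, hostname):
        now = self._clock()
        cached = self._cache.get(hostname)
        if cached is not None and cached[1] > now:
            return cached[0]
        # Cache miss or expired entry: ask the real resolver once.
        address = self._lookup(hostname)
        self._cache[hostname] = (address, now + self._ttl)
        return address
```

The lookup and clock parameters exist only so the class can be exercised without real DNS traffic. In the app you would create one `CachingResolver()` and call `resolve()` with the FQDN, trailing dot included.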

Marek Bartík

Marek is a NoOps/NoCode enthusiast. He started as a C++ programmer while doing a master’s in Computer Systems and Networks and grew up in the SysAdmin era, where he quickly realized that communication and collaboration are key. Nowadays he focuses on Cloud Architecting, microservices and Continuous Everything to solve business problems, not technical ones. Marek is passionate about DevOps and Cloud Native.