Last week one of our clients started getting a lot of application errors after migrating their main service to Google Kubernetes Engine. They quickly found that kube-dns was logging a lot of errors and consuming a suspicious amount of CPU.
It was a relatively small GKE cluster (around 16 nodes at peak). On GKE, kube-dns is deployed automatically and its manifests are synchronized from the master nodes, so you cannot simply change the kube-dns Deployment.
If you take a look at
kubectl edit cm -n kube-system kube-dns-autoscaler
you see
apiVersion: v1
data:
  linear: '{"coresPerReplica":256,"nodesPerReplica":16,"preventSinglePointFailure":true}'
kind: ConfigMap
This sets the number of kube-dns replicas to 2 (1 per 16 nodes, plus 1 for HA).
Changing "nodesPerReplica":16 to "nodesPerReplica":4 made kube-dns scale to 4 replicas and got rid of a lot of the errors.
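For the record, the same change can be applied non-interactively with kubectl patch; this is just a sketch of the call we could have used, keeping the other autoscaler parameters at their defaults:
kubectl patch cm kube-dns-autoscaler -n kube-system --type merge \
  -p '{"data":{"linear":"{\"coresPerReplica\":256,\"nodesPerReplica\":4,\"preventSinglePointFailure\":true}"}}'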
We weren't monitoring kube-dns, and by default there are no metrics or verbose logs for it (this should change with CoreDNS in newer k8s versions on GKE, I hope).
When we had time to do so, we tried the following:
kubectl exec -it kube-dns-788979dc8f-9qmrz sh
# apk add --update tcpdump
# timeout -t 60 -- tcpdump -lvi any "udp port 53" | tee /tmp/tcpdumps
# grep -E 'A\?' /tmp/tcpdumps | sed -e 's/^.*A? //' -e 's/ .*//' | sort | uniq -c | sort -nr | awk '{printf "%s %s\n", $2, $1}'
This gives us a sorted list of the most requested DNS queries seen in the last minute by ONE kube-dns replica.
app-redis-cache-01.c.example-project-name.internal. 1688
app-elk-01.c.example-project-name.internal. 1430
app-redis-cache-01.cluster.local. 1148
app-redis-cache-01.svc.cluster.local. 1140
app-redis-cache-01.prod-namespace.svc.cluster.local. 1118
app-elk-01.svc.cluster.local. 984
app-elk-01.cluster.local. 982
app-elk-01.prod-namespace.svc.cluster.local. 922
www.googleapis.com.google.internal. 68
oauth2.googleapis.com. 50
and others…
The most interesting one is app-redis-cache-01. It's the app's cache, stored in Redis, which was recently, for some reason, moved from GKE to a GCE instance group. The app's configuration references the Redis cache as "app-redis-cache-01", which is a short local DNS name.
Given the configuration in /etc/resolv.conf (ndots:5; for more info see https://pracucci.com/kubernetes-dns-resolution-ndots-options-and-why-it-may-affect-application-performances.html), every lookup for app-redis-cache-01 walked the search path: app-redis-cache-01.prod-namespace.svc.cluster.local., app-redis-cache-01.svc.cluster.local., app-redis-cache-01.cluster.local., and finally app-redis-cache-01.c.example-project-name.internal., which at last resolved successfully.
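For context, a pod's /etc/resolv.conf on GKE looks roughly like this (the nameserver IP and the project/namespace names are placeholders for this example):
nameserver 10.3.240.10
search prod-namespace.svc.cluster.local svc.cluster.local cluster.local c.example-project-name.internal google.internal
options ndots:5
Because app-redis-cache-01 contains fewer than 5 dots, the resolver appends every search domain in order until one of them produces an answer.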
The app was querying ‘hey, where’s my redis cache’ 350 times a second.
The simple fix was to change the app's config to use the FQDN app-redis-cache-01.c.example-project-name.internal. (notice the trailing dot) instead of app-redis-cache-01, and do the same for ELK.
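One way to see the difference from inside an application pod (assuming dig is available there, e.g. from the bind-tools package):
dig +search app-redis-cache-01
dig app-redis-cache-01.c.example-project-name.internal.
The first lookup walks the search list from /etc/resolv.conf and triggers several NXDOMAIN round trips before it gets an answer; the second one, thanks to the trailing dot, is absolute and answered with a single query.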
This decreased the load on kube-dns significantly and reduced the DNS resolution delay, making requests to the cache faster. It is still just a hotfix; more tweaking and thinking about how we use DNS and how it affects our app's performance is needed.
We'll probably end up playing a bit with the ndots setting, HostAliases, dnsPolicy, and caching DNS responses in the app itself.
https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
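As a rough sketch of two of those options combined (the IP address, names and image below are made up for illustration; the right values depend on the workload):
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  dnsPolicy: ClusterFirst
  dnsConfig:
    options:
      - name: ndots
        value: "2"
  hostAliases:
    - ip: "10.128.0.42" # made-up GCE instance IP
      hostnames:
        - "app-redis-cache-01"
  containers:
    - name: app
      image: example/app:latest
With ndots:2, names containing two or more dots are tried as absolute first, and the hostAliases entry pins app-redis-cache-01 in the pod's /etc/hosts so it never hits kube-dns at all.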
Q1: What were the initial symptoms that indicated a problem with kube-dns on the GKE cluster?
After a service migration, the client began experiencing a high volume of application errors. An investigation quickly found that kube-dns was logging many errors and consuming a suspicious amount of CPU.
Q2: What was the immediate action taken to relieve the pressure on kube-dns?
The kube-dns-autoscaler ConfigMap was edited to change the nodesPerReplica setting from 16 to 4. This forced kube-dns to scale up from 2 to 4 replicas, which immediately got rid of many of the errors.
Q3: How was the source of the excessive DNS queries identified without having pre-existing monitoring?
The team used kubectl exec to get a shell inside one of the kube-dns pods and ran tcpdump for 60 seconds to capture DNS traffic on UDP port 53. The capture was then filtered and aggregated (grep, sed, sort, uniq, awk) into a sorted list of the most frequently requested DNS queries.
Q4: What was the root cause of the high number of DNS requests flooding the kube-dns service?
The root cause was an application repeatedly trying to resolve a local, short DNS name for a service (a Redis cache) that had been moved outside the GKE cluster. Due to the pod’s DNS search path configuration (ndots:5), a single query for the short name resulted in multiple failed lookups before the correct name was finally resolved.
Q5: How did the application’s configuration contribute to this problem?
The application’s configuration was referencing its Redis cache by the short name “app-redis-cache-01”. This triggered the Kubernetes DNS search mechanism to try multiple domain suffixes (e.g., .prod-namespace.svc.cluster.local, .svc.cluster.local) before finally finding the correct internal compute domain, generating numerous unnecessary queries.
Q6: What simple configuration change served as an effective hotfix?
The hotfix was to change the application’s configuration to use the service’s Fully Qualified Domain Name (FQDN) with a trailing dot (app-redis-cache-01.c.example-project-name.internal.) instead of the short name. This eliminated the unnecessary search queries and significantly reduced the load on kube-dns.
Q7: What are some potential long-term solutions mentioned for optimizing DNS performance in GKE?
Potential long-term solutions include adjusting the ndots settings in resolv.conf, using HostAliases to add entries directly to a pod’s /etc/hosts file, changing the pod’s dnsPolicy, and implementing DNS request caching within the application itself.