Professional Services

Debugging kube-dns and using FQDNs

May 3, 2019

Last week one of our clients started getting a lot of application errors after they migrated their main service to Google Kubernetes Engine. They quickly found that kube-dns was logging a lot of errors and consuming a suspicious amount of CPU.

It was a relatively small GKE cluster (averaging around 16 nodes at peak). On GKE, kube-dns is deployed automatically and its manifests are synchronized from the master nodes, so you cannot simply change the kube-dns deployment.

If you take a look at

kubectl edit cm -n kube-system kube-dns-autoscaler

you see

apiVersion: v1
data:
  linear: '{"coresPerReplica":256,"nodesPerReplica":16,"preventSinglePointFailure":true}'
kind: ConfigMap

This sets the number of kube-dns replicas to 2: one replica per 16 nodes, raised to a minimum of two by preventSinglePointFailure for HA.

Changing "nodesPerReplica":16 to "nodesPerReplica":4 made kube-dns scale to 4 replicas, and we got rid of a lot of errors.
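To see why 16 gives 2 replicas and 4 gives 4, here is a minimal sketch of the autoscaler's linear mode as we understand it: the replica count is the larger of the per-core and per-node ratios, and preventSinglePointFailure keeps it at two or more when the cluster has more than one node. The function name and the core count (16 nodes with 4 vCPUs each) are assumptions for illustration, not values from our cluster.

```python
import math


def linear_replicas(nodes, cores, nodes_per_replica, cores_per_replica,
                    prevent_single_point_failure=True):
    """Approximate the replica count chosen by the linear autoscaler mode."""
    replicas = max(
        math.ceil(cores / cores_per_replica),
        math.ceil(nodes / nodes_per_replica),
    )
    # With preventSinglePointFailure, never run a single replica
    # on a cluster that has more than one node.
    if prevent_single_point_failure and nodes > 1:
        replicas = max(replicas, 2)
    return replicas


# Hypothetical cluster: 16 nodes, 64 cores in total (the core count is assumed).
before = linear_replicas(nodes=16, cores=64, nodes_per_replica=16, cores_per_replica=256)
after = linear_replicas(nodes=16, cores=64, nodes_per_replica=4, cores_per_replica=256)
print(before, after)
```

With these inputs the node ratio dominates, which is why lowering nodesPerReplica was enough to scale kube-dns out.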

We weren’t monitoring kube-dns, and by default there are no metrics or verbose logs for kube-dns (this should change in newer k8s versions on GKE with CoreDNS, I hope).

When we had time to do so, we tried the following:

kubectl exec -it -n kube-system kube-dns-788979dc8f-9qmrz -- sh
# apk add --update tcpdump
# timeout -t 60 -- tcpdump -lvi any "udp port 53" | tee /tmp/tcpdumps
# grep -E 'A\?' /tmp/tcpdumps | sed -e 's/^.*A? //' -e 's/ .*//' | sort | uniq -c | sort -nr | awk '{printf "%s %s\n", $2, $1}'

This gives us a list of the most requested DNS names during that minute on ONE kube-dns replica, sorted by query count:

app-redis-cache-01.c.example-project-name.internal. 1688

app-elk-01.c.example-project-name.internal. 1430

app-redis-cache-01.cluster.local. 1148

app-redis-cache-01.svc.cluster.local. 1140

app-redis-cache-01.prod-namespace.svc.cluster.local. 1118

app-elk-01.svc.cluster.local. 984

app-elk-01.cluster.local. 982

app-elk-01.prod-namespace.svc.cluster.local. 922

www.googleapis.com.google.internal. 68

oauth2.googleapis.com. 50

and others…
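The grep/sed/sort pipeline above can also be sketched in Python, which is handy if you copy the capture off the pod. This is only an approximation of the same idea: take whatever follows the last "A? " (which also covers "AAAA? ") on each line and count it. The sample lines are made up in the general shape of tcpdump's DNS output; they are not from our capture.

```python
import re
from collections import Counter

# Greedy prefix so we pick the text after the LAST "A? " on the line,
# mirroring the sed expression 's/^.*A? //'.
QUERY_RE = re.compile(r".*A\? (\S+)")


def count_queries(lines):
    """Count queried names in tcpdump output lines."""
    counts = Counter()
    for line in lines:
        match = QUERY_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    "    10.4.1.7.41512 > 10.4.0.3.53: 2001+ A? app-redis-cache-01.svc.cluster.local. (54)",
    "    10.4.1.7.41512 > 10.4.0.3.53: 2002+ AAAA? app-redis-cache-01.svc.cluster.local. (54)",
    "    10.4.1.9.50211 > 10.4.0.3.53: 7+ A? app-elk-01.cluster.local. (42)",
    "    10.4.0.3.53 > 10.4.1.7.41512: 2001 NXDomain 0/1/0 (129)",
]

counts = count_queries(sample)
for name, hits in counts.most_common():
    print(name, hits)
```

To run it against a real capture, pass the open file instead of the sample list, for example count_queries(open("/tmp/tcpdumps")).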

The most interesting entry is app-redis-cache-01. It is the app’s Redis cache, which had recently, for some reason, been moved from GKE to a GCE instance group. The app’s configuration references the cache as “app-redis-cache-01”, which is a short, unqualified DNS name.

Given the configuration in /etc/resolv.conf (ndots:5; for more info see https://pracucci.com/kubernetes-dns-resolution-ndots-options-and-why-it-may-affect-application-performances.html), the resolver tried app-redis-cache-01 against each search domain in turn: app-redis-cache-01.prod-namespace.svc.cluster.local., app-redis-cache-01.svc.cluster.local., app-redis-cache-01.cluster.local. and then finally app-redis-cache-01.c.example-project-name.internal., which resolved successfully.
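The search behaviour can be sketched in a few lines of Python. This is a simplified model of what a typical resolver does with ndots and the search list, not the resolver's actual code; the search list below contains only the four domains seen in our capture, and the function name is made up for illustration.

```python
# Search domains in the order they appeared in the capture (simplified).
SEARCH = [
    "prod-namespace.svc.cluster.local",
    "svc.cluster.local",
    "cluster.local",
    "c.example-project-name.internal",
]


def candidate_queries(name, search=SEARCH, ndots=5):
    """Return the names a resolver would try, in order, for one lookup."""
    if name.endswith("."):
        # A trailing dot marks the name as fully qualified: no search list.
        return [name]
    absolute = name + "."
    searched = [f"{name}.{domain}." for domain in search]
    if name.count(".") >= ndots:
        # "Enough" dots: try the name as-is first, then the search list.
        return [absolute] + searched
    # Too few dots: walk the search list first, try the bare name last.
    return searched + [absolute]


short_name = candidate_queries("app-redis-cache-01")
fqdn = candidate_queries("app-redis-cache-01.c.example-project-name.internal.")
print(len(short_name), len(fqdn))
```

The resolver stops at the first answer, so for the short name the first three candidates are wasted lookups and the fourth one succeeds; with the trailing dot there is exactly one query.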

The app was querying ‘hey, where’s my redis cache’ 350 times a second.

The simple fix was to change the app’s config to use the FQDN app-redis-cache-01.c.example-project-name.internal. (notice the trailing dot) instead of app-redis-cache-01, and to do the same for elk.

This decreased the load on kube-dns significantly and reduced the delay in resolving DNS queries, making requests to the cache faster. It is still a hotfix, though: we need to do more tweaking and think harder about how we use DNS and how that affects our app’s performance.

We’ll probably end up playing a bit with ndots settings, HostAliases and dnsPolicy, and look at caching DNS requests in our app.
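As a rough idea of what caching DNS requests in the app could look like, here is a minimal sketch of a resolver wrapper with a fixed time-to-live. It is not what we deployed; the class name, the 30-second TTL and the injectable lookup and clock are all assumptions for illustration. Note that it ignores the real TTL of the DNS record, which socket.getaddrinfo does not expose.

```python
import socket
import time


class CachingResolver:
    """Cache hostname lookups for a fixed number of seconds."""

    def __init__(self, ttl_seconds=30.0, lookup=None, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._lookup = lookup or self._system_lookup
        self._clock = clock
        self._cache = {}  # host -> (expiry time, list of addresses)

    @staticmethod
    def _system_lookup(host):
        # Ask the system resolver; keep only the unique IP addresses.
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
        return sorted({info[4][0] for info in infos})

    def resolve(self, host):
        now = self._clock()
        cached = self._cache.get(host)
        if cached is not None and cached[0] > now:
            return cached[1]
        addresses = self._lookup(host)
        self._cache[host] = (now + self._ttl, addresses)
        return addresses


# Demonstration with a fake lookup and a fake clock, so no network is needed.
calls = []
fake_now = [0.0]


def fake_lookup(host):
    calls.append(host)
    return ["10.0.0.5"]


resolver = CachingResolver(ttl_seconds=30.0, lookup=fake_lookup, clock=lambda: fake_now[0])
name = "app-redis-cache-01.c.example-project-name.internal."
first = resolver.resolve(name)
second = resolver.resolve(name)   # served from the cache
fake_now[0] = 31.0                # pretend 31 seconds have passed
third = resolver.resolve(name)    # entry expired, looked up again
print(len(calls))
```

A short TTL keeps the app responsive to address changes while still removing most of the repeated lookups.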

Adding entries to Pod /etc/hosts with HostAliases (kubernetes.io)

https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy

FAQs

Q1: What were the initial symptoms that indicated a problem with kube-dns on the GKE cluster?

After a service migration, the client began experiencing a high volume of application errors. An investigation quickly found that kube-dns was logging many errors and consuming a suspicious amount of CPU.

Q2: What was the immediate action taken to relieve the pressure on kube-dns?

The kube-dns-autoscaler ConfigMap was edited to change the nodesPerReplica setting from 16 to 4. This forced kube-dns to scale up from 2 to 4 replicas, which immediately got rid of many of the errors.

Q3: How was the source of the excessive DNS queries identified without having pre-existing monitoring?

The team used kubectl exec to get a shell inside one of the kube-dns pods and then ran tcpdump for 60 seconds to capture all DNS traffic. This captured data was then processed to create a sorted list of the most frequently requested DNS queries.

Q4: What was the root cause of the high number of DNS requests flooding the kube-dns service?

The root cause was an application repeatedly trying to resolve a local, short DNS name for a service (a Redis cache) that had been moved outside the GKE cluster. Due to the pod’s DNS search path configuration (ndots:5), a single query for the short name resulted in multiple failed lookups before the correct name was finally resolved.

Q5: How did the application’s configuration contribute to this problem?

The application’s configuration was referencing its Redis cache by the short name “app-redis-cache-01”. This triggered the Kubernetes DNS search mechanism to try multiple domain suffixes (e.g., .prod-namespace.svc.cluster.local, .svc.cluster.local) before finally finding the correct internal compute domain, generating numerous unnecessary queries.

Q6: What simple configuration change served as an effective hotfix?

The hotfix was to change the application’s configuration to use the service’s Fully Qualified Domain Name (FQDN) with a trailing dot (app-redis-cache-01.c.example-project-name.internal.) instead of the short name. This eliminated the unnecessary search queries and significantly reduced the load on kube-dns.

Q7: What are some potential long-term solutions mentioned for optimizing DNS performance in GKE?

Potential long-term solutions include adjusting the ndots settings in resolv.conf, using HostAliases to add entries directly to a pod’s /etc/hosts file, changing the pod’s dnsPolicy, and implementing DNS request caching within the application itself.

Marek Bartík

Marek is a NoOps/NoCode enthusiast. He started as a C++ programmer while doing his master's in Computer Systems and Networks and, growing up in the SysAdmin era, quickly realized that communication and collaboration are the key. Nowadays he focuses on Cloud Architecting, microservices and Continuous Everything to solve business problems, not technical ones. Marek is passionate about DevOps and Cloud Native.
