Google Container Registry lifecycle policy for image retention

by Marek Bartík

Is your Google Container Registry filling up, taking up storage and becoming expensive? How do you handle image retention as a service?

Amazon’s Elastic Container Registry has a feature called Lifecycle Policies to handle image retention. Google doesn’t have this feature. A feature request has been open in their tracker since Aug 2018, and there is no ETA for it so far…

There is a popular bash script from Ahmet and a Go tool running in Cloud Run from Seth, but neither of them covers my requirements. What exactly do I need?

I wanna scan my whole GCR and delete the digests that are:

  • older than X days
  • not being used in my Kubernetes cluster
  • not the most recent Y digests (I wanna keep, say, 20 most recent tagged digests)

Once these requirements are checked, I want to apply the resulting lifecycle policy to all the images.

Say I have a few images in GCR under certain prefixes, for example:

eu.gcr.io/my-project/foo/bar/my-service:123

eu.gcr.io is the docker registry endpoint
my-project is the ID of my GCP project
foo/bar is the prefix (“repo”)
my-service is an image name
123 is a tag
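To make the breakdown above concrete, here is a minimal Python sketch that splits such a reference into its parts. The `ImageRef` type and the `parse_image_ref` function are hypothetical names of mine, and the sketch assumes the GCR layout described above (registry, project, optional prefix, image name), not arbitrary Docker registries.

```python
from typing import NamedTuple, Optional


class ImageRef(NamedTuple):
    registry: str           # e.g. "eu.gcr.io"
    project: str            # GCP project ID
    prefix: str             # repo prefix such as "foo/bar", may be empty
    name: str               # image name
    tag: Optional[str]      # None when the reference carries no tag
    digest: Optional[str]   # None when the reference carries no digest


def parse_image_ref(ref: str) -> ImageRef:
    """Split a reference like eu.gcr.io/my-project/foo/bar/my-service:123."""
    digest = None
    if "@" in ref:
        # Everything after "@" is the digest, e.g. "sha256:296e...".
        ref, digest = ref.split("@", 1)
    registry, project, *rest = ref.split("/")
    if not rest:
        raise ValueError(f"no image name in {ref!r}")
    *prefix_parts, last = rest
    # The tag, if any, follows the last ":" of the final path segment.
    name, sep, tag = last.partition(":")
    return ImageRef(registry, project, "/".join(prefix_parts), name,
                    tag if sep else None, digest)
```

Calling `parse_image_ref("eu.gcr.io/my-project/foo/bar/my-service:123")` gives the prefix `foo/bar`, the name `my-service` and the tag `123`.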

my-service:123 is an image with a tag, but wait, what is the digest?

Image vs Layers, taken from https://windsock.io/explaining-docker-image-ids/

A Docker image digest is an ID made of the hashing algorithm used and the hash it computed. The digest can look like this:

@sha256:296e2378f7a14695b2f53101a3bd443f656f823c46d13bf6406b91e9e9950ef0

You can tag a digest with several tags, or with none at all, which makes it an untagged image.

Let’s say I build an image my-service and push it to the Docker registry. When pushing, I tag it with :123. The newly produced digest has two tags, :123 and :latest. The digest that was tagged :latest before I pushed this image has lost its :latest tag.

If I remove a tag from an image in GCR, I simply remove the tag from the digest. I don’t delete the digest itself.
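The relationship between tags and digests is easier to see in a toy model. The sketch below is only an in-memory illustration with a hypothetical `ToyRegistry` class. It does not talk to GCR, and it assumes, as in the example above, that :latest is pushed together with the numbered tag.

```python
class ToyRegistry:
    """In-memory model: every tag points at exactly one digest."""

    def __init__(self):
        self.tag_to_digest = {}   # tag -> digest
        self.digests = set()      # every stored digest, tagged or not

    def push(self, digest, tags):
        self.digests.add(digest)
        for tag in tags:
            # Re-pointing an existing tag silently takes it away from
            # the digest that carried it before.
            self.tag_to_digest[tag] = digest

    def untag(self, tag):
        # Only the pointer goes away, the digest keeps using storage.
        self.tag_to_digest.pop(tag, None)

    def delete_digest(self, digest):
        # Deleting the digest is what actually frees space.
        self.digests.discard(digest)
        self.tag_to_digest = {
            t: d for t, d in self.tag_to_digest.items() if d != digest
        }

    def tags_of(self, digest):
        return sorted(t for t, d in self.tag_to_digest.items() if d == digest)


reg = ToyRegistry()
reg.push("sha256:aaa", ["122", "latest"])
reg.push("sha256:bbb", ["123", "latest"])   # :latest moves to the new digest
reg.untag("122")                            # sha256:aaa is now untagged
```

After these calls `sha256:aaa` has no tags left but is still stored, which is exactly the kind of digest that piles up in a registry.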

What I can delete, in order to save some space, is the digest, like this:

gcloud container images delete -q --force-delete-tags eu.gcr.io/my-project/foo/bar/my-service@sha256:296e2378f7a14695b2f53101a3bd443f656f823c46d13bf6406b91e9e9950ef0
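If you drive this from a script, it helps to build the command in one place and only print it while you are still testing. A small sketch in Python follows. The helper names `delete_command` and `delete_digest` and the `dry_run` switch are my own assumptions, not part of gcloud.

```python
import shlex
import subprocess


def delete_command(image, digest):
    """Build the argv for the gcloud delete call shown above."""
    return [
        "gcloud", "container", "images", "delete",
        "-q", "--force-delete-tags",
        f"{image}@{digest}",
    ]


def delete_digest(image, digest, dry_run=True):
    argv = delete_command(image, digest)
    if dry_run:
        # Only log what would be deleted.
        print("DRY RUN:", shlex.join(argv))
        return argv
    # Hypothetical live mode: needs an installed, authenticated gcloud.
    subprocess.run(argv, check=True)
    return argv
```

Keeping `dry_run=True` as the default means a forgotten argument prints the command instead of deleting anything.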

Then, what do I need to do?

  • Recursively scan gcr.io for all image prefixes (eu.gcr.io/my-project/foo/bar/my-service)
  • For each prefix, list all its digests and delete the ones that don’t match my rules

How to check if they match the rules:

  • sort them, preserve the most recent Y digests
  • fetch the image:tag pairs of pods and ReplicaSets from the k8s cluster (all of them, even the ones scaled to zero, since we’d need those images in case of a rollback), then go through the digests that belong to that image name and preserve every digest that carries ANY tag used in the cluster
  • check the rest of the digests, if they are older than X days, delete them
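The three checks above can be sketched as one function. This is a minimal Python illustration, not the published bash implementation. The `Digest` record and the parameter names `keep_recent` (Y) and `max_age_days` (X) are hypothetical, and I assume the Y most recent digests are counted per image, whether tagged or not.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class Digest:
    digest: str
    uploaded: datetime                      # timezone-aware upload time
    tags: set = field(default_factory=set)  # empty set = untagged


def digests_to_delete(digests, tags_in_use, keep_recent, max_age_days, now=None):
    """Return the digests of one image that no rule protects."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    in_use = set(tags_in_use)

    # Rule 1: sort newest first and skip the most recent Y digests.
    newest_first = sorted(digests, key=lambda d: d.uploaded, reverse=True)

    doomed = []
    for d in newest_first[keep_recent:]:
        # Rule 2: any tag still referenced in the cluster protects it.
        if d.tags & in_use:
            continue
        # Rule 3: what is left goes only if it is older than X days.
        if d.uploaded < cutoff:
            doomed.append(d.digest)
    return doomed
```

The function only returns candidates and deletes nothing itself, so the result can be logged in a dry run first.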

You could use standard kubectl to fetch the data:

kubectl get rs,po --all-namespaces -o jsonpath='{..image}' | tr ' ' '\n'
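The output of that command is a flat list of image references, so it still has to be grouped per image. Here is a minimal Python sketch, with a hypothetical `tags_in_use` helper. It assumes workloads reference images by tag and simply skips references pinned by digest, which a real implementation would have to protect separately.

```python
from collections import defaultdict


def tags_in_use(kubectl_output):
    """Map each image path to the set of tags referenced in the cluster."""
    used = defaultdict(set)
    for ref in kubectl_output.split():
        if "@" in ref:
            # Pinned by digest: out of scope for this sketch.
            continue
        path, sep, tag = ref.rpartition(":")
        if not sep or "/" in tag:
            # No tag at all, or the ":" belonged to a registry port.
            path, tag = ref, "latest"
        used[path].add(tag)
    return dict(used)
```

The resulting dictionary can be looked up by image path when walking the digests of that image.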

GCR exposes the Docker Registry v2 API, so you can use a standard Docker client or just curl with the gcloud token:

ACCESS_TOKEN=$(gcloud auth print-access-token)
curl --silent --show-error -u_token:"$ACCESS_TOKEN" -X GET "https://eu.gcr.io/v2/_catalog"
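The recursive scan can be sketched on top of the same API. The Python below is a sketch under assumptions: `walk_repos` and `fetch_tags_list` are hypothetical names, and I assume that GCR's response to `GET /v2/<repo>/tags/list` contains a `child` list of nested repositories and a `manifest` map keyed by digest with `tag` and `timeUploadedMs` fields. Check the actual response of your registry before relying on this.

```python
from datetime import datetime, timezone


def walk_repos(fetch_tags_list, repo):
    """Yield (repo, digest, tags, uploaded) for repo and all nested repos.

    fetch_tags_list(repo) is expected to return the decoded JSON body of
    GET https://<registry>/v2/<repo>/tags/list (assumed GCR format).
    """
    listing = fetch_tags_list(repo)
    for digest, info in listing.get("manifest", {}).items():
        # Assumed: upload time is a string of milliseconds since the epoch.
        uploaded = datetime.fromtimestamp(
            int(info["timeUploadedMs"]) / 1000, tz=timezone.utc
        )
        yield repo, digest, set(info.get("tag", [])), uploaded
    for child in listing.get("child", []):
        # Nested prefixes such as foo/bar are walked depth-first.
        yield from walk_repos(fetch_tags_list, f"{repo}/{child}")
```

Passing the fetch function in as an argument keeps the traversal testable with canned responses, without network access or a token.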

I implemented all of this using bash/jq (yep, that wasn’t a smart idea) and published it to GitHub:

Right now I’m running this in a GitLab CI pipeline on a cron schedule (once a day) to evaluate its dry-run logs for production GCP projects.

I’m planning on rewriting this in Python (pykube-ng and docker-py) if Google doesn’t come up with an ETA for this feature :(


Marek Bartík

Marek is a NoOps/NoCode enthusiast. He started as a C++ programmer while doing his master’s in Computer Systems and Networks and, growing up in the SysAdmin era, quickly realized that communication and collaboration are the key. Nowadays he focuses on Cloud Architecting, microservices and Continuous Everything to solve business problems, not technical ones. Marek is passionate about DevOps and Cloud Native.