I recently ran into a surprisingly tricky issue around deleting stale Docker image tags to keep our private Docker registry tidy. I ended up doing more research than expected, and I wanted to share some of my findings.
Introduction
We're big fans of Docker at FlightAware. The ability to isolate your services and all their dependencies on a host from one another is incredibly powerful. In fact, that isolation and the broad adoption of Docker containers as a unit of software deployment were significant factors driving FlightAware's transition from FreeBSD to Ubuntu as our primary production OS.
Of course, once you start building a bunch of Docker images, you've got to put them somewhere, preferably somewhere private. That "somewhere" is known as a Container Registry, and there's a variety of options to choose from. We opted to self-host our registry and selected the "official" registry service maintained (until recently) by Docker, Inc. (they donated the project to the open source community in late 2023); it's called Distribution.
For several years, we ran the registry with few issues. At times, we'd get a bit tight on storage, and someone would go manually clear out some of our larger images. Recently, though, we decided that we could do with some more consistent cleanup of our registry, especially as we've begun to more consistently build images in Continuous Integration environments. If we weren't careful, we'd have unbounded growth on our hands.
"How hard could it be?", I thought, as I outlined a little cleanup script. "Find tags that match a pattern and are older than X days, delete the tags, and then garbage-collect." I didn't even need to do the last part myself; we already ran registry garbage collection on a weekly basis! Turns out it was the second step that proved to be a bit of a rabbit hole. You see, deleting a Docker image tag is apparently not trivial business. This post covers my exploration of various Container Registry clients and their different approaches to accomplish what you might have thought was one of the most trivial registry operations you can perform.
Background
First, I'll share some simplified Container Registry concepts that should help with understanding some of the details below.
Every image in a registry can be referenced by a unique, immutable digest which is generated based on the image's contents. Images can also be referenced by tags, which are generally user-selected. A single image can be referenced by multiple tags (consider the official Python image where you have the 3.12.2-bullseye, 3.12-bullseye, and 3-bullseye tags all referencing the same image, and the user may select one based on how tightly they wish to pin their application to a given Python version). Docker tags are mutable and can be changed to point to some other image at will. If you're familiar with git, I've found a reasonable analogy to be that Docker digests are like git commits while Docker tags are like git branches (not git tags, which are meant to be immutable).
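To make "generated based on the image's contents" a bit more concrete: in registries that follow the Docker Registry v2 / OCI conventions, a manifest's digest is the SHA-256 hash of the manifest's raw bytes, prefixed with the algorithm name. Here is a minimal sketch, assuming sha256sum from GNU coreutils is installed. The helper name manifest_digest is made up for illustration, and the JSON strings in the comments are placeholders rather than real manifests.

```sh
#!/bin/sh
# Hypothetical helper: print the content digest of whatever arrives on stdin,
# in the "sha256:<hex>" form that registries use.
manifest_digest() {
    printf 'sha256:%s\n' "$(sha256sum | cut -d' ' -f1)"
}

# The same bytes always produce the same digest; changing one byte changes it.
#   printf '%s' '{"layers":["a"]}' | manifest_digest
#   printf '%s' '{"layers":["b"]}' | manifest_digest
```

This is why a digest can safely be treated as immutable while a tag cannot: the digest is derived from the content, whereas a tag is just a name that the registry maps to some digest.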
The Mission
So my goal was to write a hopefully simple shell script that could leverage an existing Container Registry client CLI to list out tags, determine their ages, delete old tags, and then let garbage collection reclaim that sweet sweet disk space. After some searching around, I found the following clients that all looked promising: Crane, Skopeo, Regctl, and Deckschrubber.
With the exception of Deckschrubber, all the tools listed are quite widely used, actively maintained, and implement all the essential commands you'd expect for interacting with a remote Container Registry. I found Deckschrubber appealing as well, though, as it actually functions at a higher layer, acting as an image cleanup tool out of the box rather than just providing primitives. It's also implemented on top of the registry client library present in the Distribution Docker Registry that we were using, leading me to hope that it would be more reliably compatible with any of the service's quirks.
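The "match a pattern and check the age" part of such a script does not depend on which client is used, so it can be sketched on its own. The sketch below assumes you have already produced one line per tag in the form "<tag> <creation time in Unix epoch seconds>"; how you get the creation time depends on the client. The function name select_stale_tags and its arguments are hypothetical.

```sh
#!/bin/sh
# select_stale_tags PATTERN MAX_AGE_DAYS NOW_EPOCH
# Reads "<tag> <created_epoch>" lines on stdin and prints the tags that
# match the extended regex PATTERN and are older than MAX_AGE_DAYS.
select_stale_tags() {
    awk -v pattern="$1" -v max_days="$2" -v now="$3" '
        $1 ~ pattern && (now - $2) > max_days * 86400 { print $1 }
    '
}

# Example (tags and timestamps are made up):
#   printf 'ci-123 1700000000\nrelease-1 1600000000\n' |
#       select_stale_tags '^ci-' 30 "$(date +%s)"
```

Passing the current time in as an argument, rather than calling date inside the function, keeps the filter easy to test with fixed inputs.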
The Problem
So how hard could deleting a little ol' Docker tag be? This is just the tag, mind you, not the underlying image itself. I would expect our periodic garbage collection to handle image deletion once there are no tags left referencing it. Well, it turns out that the Docker Registry API specification entirely lacked a tag deletion endpoint until 2021. And even 3 years later, our Container Registry of choice, Distribution, hasn't yet landed the new endpoint in a stable release. With that being the case, the clients I surveyed each had their own interpretation of what should be done if a user tried to delete a tag reference. Let's explore each of them.
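Before looking at the clients, it helps to see roughly what the older API does offer at the HTTP level: you can resolve a tag to a digest (the registry reports it in the Docker-Content-Digest response header), and you can delete a manifest by digest. The sketch below assumes an unauthenticated registry reachable over HTTPS with deletion enabled in its configuration, and it assumes the image was pushed as a Docker v2 schema 2 manifest (hence the Accept header). The function names are hypothetical.

```sh
#!/bin/sh
# Extract the digest from HTTP response headers supplied on stdin.
# Header names are case-insensitive, so compare in lowercase.
digest_from_headers() {
    tr -d '\r' | awk 'tolower($1) == "docker-content-digest:" { print $2 }'
}

# resolve_digest REGISTRY REPO TAG
# Sends a HEAD request for the manifest by tag and prints its digest.
resolve_digest() {
    curl -sSI -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
        "https://$1/v2/$2/manifests/$3" | digest_from_headers
}

# delete_by_digest REGISTRY REPO DIGEST
# Deletes the manifest itself, so every tag that pointed at it goes too.
delete_by_digest() {
    curl -sS -X DELETE "https://$1/v2/$2/manifests/$3"
}

# Example (this removes the image, not just one tag):
#   delete_by_digest my-docker-registry busybox \
#       "$(resolve_digest my-docker-registry busybox glibc1)"
```

Nothing in this pair of calls operates on the tag alone, which is the gap each client below has to work around.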
For each tool tested, we'll assume a repo has been set up in a private registry as follows:
docker pull busybox:glibc
docker pull busybox:musl
docker tag busybox:glibc my-docker-registry/busybox:glibc1
docker tag busybox:glibc my-docker-registry/busybox:glibc2
docker tag busybox:musl my-docker-registry/busybox:musl
docker push -a my-docker-registry/busybox
Crane
Crane has a delete subcommand with the following description:
> crane delete --help
Delete an image reference from its registry
-- snip --
I would generally consider a string like my-docker-registry/busybox:glibc1 to be a valid image reference, yet after running the command we see the following error:
> crane delete my-docker-registry/busybox:glibc1
Error: DELETE https://my-docker-registry/v2/busybox/manifests/glibc1: DIGEST_INVALID: provided digest did not match uploaded content
I'm not the first to encounter such an error, and there doesn't appear to be a satisfactory approach to deleting just a tag.
One possible alternative here is to fetch the digest of the tag in question and then delete that. If you do that, though, you'll find that any other tags that were pointing to the specific image you deleted are now also gone!
> crane ls my-docker-registry/busybox
glibc1
glibc2
musl
> crane delete my-docker-registry/busybox@$(crane digest my-docker-registry/busybox:glibc1)
> crane ls my-docker-registry/busybox
musl
This is the core of the trouble behind tag "deletion" as implemented before 2021. You couldn't delete just a tag; you also deleted the underlying image. Fortunately, crane forces you to be explicit about what you're deleting, requiring you to specify a digest rather than just a tag. With this approach, the user is less likely to be surprised by the deletion of the entire image and all its tags.
Skopeo
Skopeo is more flexible with what you can tell it to delete. Its delete subcommand has the following description:
> skopeo delete --help
Delete an "IMAGE_NAME" from a transport
-- snip --
I do prefer it being clear that you're deleting an "image" rather than an "image reference". So does IMAGE_NAME here include the <image>:<tag> format? Let's find out:
> skopeo list-tags docker://my-docker-registry/busybox
{
"Repository": "docker://my-docker-registry/busybox",
"Tags": [
"glibc1",
"glibc2",
"musl"
]
}
> skopeo delete docker://my-docker-registry/busybox:glibc1
> skopeo list-tags docker://my-docker-registry/busybox
{
"Repository": "docker://my-docker-registry/busybox",
"Tags": [
"musl"
]
}
And just like that, your tag is gone, but so is every other tag that pointed to the same image! Ultimately, skopeo is performing the same operation as crane under the covers, resolving the tag to a digest and then deleting said digest. This is both more and less surprising than Crane's behavior. Skopeo does indicate you're deleting an image by name rather than just an image reference, but to have multiple tags disappear when only one was specified is still an unpleasant discovery.
Regctl
Regctl is quite an interesting case. It's clear the author wasn't content to be limited by a restricted spec, as evidenced by the following feature listed in the project's readme:
Delete APIs have been provided for tags, manifests, and blobs (the tag deletion will only delete a single tag even if multiple tags point to the same digest).
How did he manage to pull this off? Fortunately, the author is quite clear on his approach in the command documentation:
> regctl tag rm --help
Delete a tag in a repository.
This avoids deleting the manifest when multiple tags reference the same image.
For registries that do not support the OCI tag delete API, this is implemented
by pushing a unique dummy manifest and deleting that by digest.
If the registry does not support the delete API, the dummy manifest will remain.
-- snip --
Well isn't that clever! Indeed, if we give it a try, we'll see that only the specified tag is deleted:
> regctl tag ls my-docker-registry/busybox
glibc1
glibc2
musl
> regctl tag rm my-docker-registry/busybox:glibc1
> regctl tag ls my-docker-registry/busybox
glibc2
musl
One notable side effect of this more involved approach to deletion is that the command is noticeably slower. Ultimately, this best fits expectations for what should happen when deleting a Docker image tag. It certainly would have been my choice for our cleanup script if I hadn't stumbled across the final choice in this list.
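To see why the dummy-manifest trick works, here is a toy model of it. This is not regctl's implementation and it talks to no registry: the "registry" is just a text file of "<tag> <digest>" lines, and all function names are invented for illustration. It only models two behaviors described above: pushing a manifest under an existing tag repoints that tag, and deleting by digest removes every tag referencing that digest.

```sh
#!/bin/sh
# Toy model: the "registry" is a text file of "<tag> <digest>" lines.

# push_to_tag FILE TAG DIGEST
# Pushing a manifest under an existing tag repoints that tag.
# ($1 "") forces a string comparison so numeric-looking tags compare exactly.
push_to_tag() {
    awk -v t="$2" -v d="$3" '($1 "") == t { $2 = d } { print }' "$1" > "$1.tmp" &&
        mv "$1.tmp" "$1"
}

# delete_digest FILE DIGEST
# Deleting a manifest by digest drops every tag that references it.
delete_digest() {
    awk -v d="$2" '$2 != d' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# tag_rm FILE TAG
# The workaround: repoint the tag at a unique dummy, then delete the dummy.
tag_rm() {
    dummy="sha256:dummy-$2-$$"
    push_to_tag "$1" "$2" "$dummy" && delete_digest "$1" "$dummy"
}
```

In this model, tag_rm on glibc1 leaves glibc2 pointing at the original digest, whereas delete_digest on that digest would remove both. The extra push before the delete is also a plausible reason for the slowdown mentioned above, though I have not measured where the time actually goes.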
Deckschrubber
And finally we have the (somewhat) black sheep, Deckschrubber. Instead of offering commands to list tags and delete images, Deckschrubber is just a single command which accepts various parameters that inform it of which tags you'd like to clean up or keep. For instance, the following command will delete all the tags on the image we've been working with:
> deckschrubber -latest 0 -tag ".*" -registry "https://my-docker-registry" -repo "^busybox$"
INFO[0000] Successfully fetched repositories. count=1 entries="[busybox]"
INFO[0000] Marking tag as outdated fields.time="2023-05-18 22:34:17 +0000 UTC" repo=busybox tag=glibc1
INFO[0000] Marking tag as outdated fields.time="2023-05-18 22:34:17 +0000 UTC" repo=busybox tag=glibc2
INFO[0000] Marking tag as outdated fields.time="2023-05-18 22:34:17 +0000 UTC" repo=busybox tag=musl
INFO[0000] All tags for this image digest marked for deletion repo=busybox tag=glibc1
INFO[0000] Deleting image (-dry=false) digest="sha256:db16cd196b8a37ba5f08414e6f6e71003d76665a5eac160cb75ad3759d8b3e29" fields.time="2023-05-18 22:34:17 +0000 UTC" repo=busybox tag=glibc1
INFO[0000] All tags for this image digest marked for deletion repo=busybox tag=musl
INFO[0000] Deleting image (-dry=false) digest="sha256:45561defaa53c6364b822f1782dae76b2a38c375a28b6a89b814c152eb6e2f6e" fields.time="2023-05-18 22:34:17 +0000 UTC" repo=busybox tag=musl
> regctl tag ls my-docker-registry/busybox
More relevant, though, is how Deckschrubber behaves if we tell it to only delete the glibc1 tag:
> deckschrubber -latest 0 -tag "^glibc1$" -registry "https://my-docker-registry" -repo "^busybox$"
INFO[0000] Successfully fetched repositories. count=1 entries="[busybox]"
INFO[0000] Marking tag as outdated fields.time="2023-05-18 22:34:17 +0000 UTC" repo=busybox tag=glibc1
INFO[0000] Ignore non matching tag (-tag=^glibc1$) repo=busybox tag=glibc2
INFO[0000] Ignore non matching tag (-tag=^glibc1$) repo=busybox tag=musl
INFO[0000] The underlying image is also used by non-deletable tags - skipping deletion alsoUsedByTags=glibc2 repo=busybox tag=glibc1
It doesn't delete any tags! Because Deckschrubber implicitly pulls down all image tags to compare against the specified regex, it can also inspect those tags to see what digests they reference. If a tag marked for deletion shares its digest with a tag that isn't, the image is skipped and none of its tags are deleted. I found this approach to be conservative and straightforward; it doesn't delete images unexpectedly, nor does it require any tricky tag manipulation. That, combined with the existing cleanup-oriented functionality of Deckschrubber, made it a natural pick for whipping up a quick cleanup script for our Container Registry.
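The rule visible in those logs can be sketched in a few lines. This is my own illustration of the behavior, not Deckschrubber's code: it assumes the tag-to-digest mapping is already available as "<tag> <digest>" lines, it ignores the age check, and the function name deletable_digests is hypothetical.

```sh
#!/bin/sh
# deletable_digests PATTERN
# Reads "<tag> <digest>" lines on stdin. Prints each digest whose tags ALL
# match the extended regex PATTERN. A digest that is also referenced by a
# non-matching tag is skipped entirely.
deletable_digests() {
    awk -v pattern="$1" '
        $1 ~ pattern { matched[$2] = 1; next }
        { kept[$2] = 1 }
        END { for (d in matched) if (!(d in kept)) print d }
    ' | sort
}

# Example, using the repo layout from this post (digests shortened):
#   printf 'glibc1 sha256:aaa\nglibc2 sha256:aaa\nmusl sha256:bbb\n' |
#       deletable_digests '^glibc1$'
```

With the pattern ^glibc1$ nothing is printed, because glibc2 still references the same digest. With the pattern .* both digests are printed, matching the two runs shown above.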
Conclusion
It turns out things aren't always as simple as they seem (and if you've actually seen the underlying Container Registry API you would probably have known from the beginning that this wouldn't be so simple). And heck, I didn't even discuss manifests, manifest lists, blobs, layers, or any of the other artifacts you can find buried within a Container Registry. I hope you learned something from the post, nonetheless. Maybe you found a new tool to try out, or maybe you've resolved to leave the Container Registry maintenance to the experts.