Now for a subjective diatribe about Docker's poor use of analogies

2016-02-08 by qntm

Just as a warm-up, here are some artifacts which Docker allows you to manipulate, and the commands you use to manipulate them.

Yeah. I mean, you tell me.

What happened here is actually quite simple to explain (and by "explain" I mean this is my theory which explains these increasingly stupid observations; I haven't actually dug into the history). Volumes and networks are relatively recent additions to the Docker command line interface; previously, you were only able to manipulate containers and images in this way. It was only at this later stage, when volumes and networks were added, that anybody realised that the CLI as it currently existed was rather unpredictable and organic, and that some kind of consistency amongst all the commands would be desirable... even though this should have been extremely obvious right from the get-go, or at least on the day docker rmi was invented.

Perhaps in the near future docker {container|image} {create|inspect|rm|ls} will be introduced, relegating the original, befuddlingly inconsistent commands to aliases. Haha, just kidding.
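For what it's worth, the mapping that paragraph imagines is small enough to write down. Here is a minimal sketch in Python, assuming the legacy command names as they stand at the time of writing. The table and function names are made up for illustration and are not part of Docker, and the one None entry marks a pair for which I'm not sure a single legacy equivalent exists.

```python
from itertools import product

NOUNS = ("container", "image")
VERBS = ("create", "inspect", "rm", "ls")

# Hypothetical alias table: proposed (noun, verb) -> existing command.
# None marks a pair with no single obvious legacy equivalent.
LEGACY = {
    ("container", "create"): "docker create",
    ("container", "inspect"): "docker inspect",
    ("container", "rm"): "docker rm",
    ("container", "ls"): "docker ps",
    ("image", "create"): None,
    ("image", "inspect"): "docker inspect",
    ("image", "rm"): "docker rmi",
    ("image", "ls"): "docker images",
}


def proposed_commands():
    """Expand docker {container|image} {create|inspect|rm|ls}."""
    return [f"docker {noun} {verb}" for noun, verb in product(NOUNS, VERBS)]


def legacy_alias(noun, verb):
    """Return the existing command a proposed noun/verb pair would replace."""
    return LEGACY[(noun, verb)]


if __name__ == "__main__":
    for noun, verb in product(NOUNS, VERBS):
        print(f"docker {noun} {verb:<8} <- {legacy_alias(noun, verb)}")
```

Two nouns and four verbs give eight predictable commands, and the right-hand column is what you have to memorise today instead.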

But what I actually wanted to talk about here is the poor use of analogy in these commands.

The command to list all of your containers is docker ps. This is a really interesting choice of command name because it provides a touchstone, a point of familiarity for newcomers to the Docker CLI. You already know about the Unix ps command — it lists processes. So when you see that there is a docker ps command, you see something you recognise and you make the logical conclusion: that Docker containers are somewhat analogous to processes.

That's great! It is actually a pretty good analogy! A Docker container actually encapsulates a whole tree of processes, but it certainly behaves similarly to a process. It has an identifier which is analogous to a process ID, and it has a source image which is analogous to an executable file. Introducing this analogy isn't inherently misleading to a new user of Docker; it smooths the way a little, it gets them closer to understanding a little faster.

The problem is that the analogy breaks down immediately once you start trying to put it into practice. Starting from this piece of information, you might guess that ps and docker ps have similar flags. Why else would docker ps be named that? But they don't, at all. The flags they do share (most prominently -f) do different things, and the functionality they share (filtering, formatting) is provided through different flags and syntaxes.

You might also guess that analogues of other common Unix commands for manipulating processes might exist, such as docker pkill or docker pgrep. They don't. docker kill and docker top do exist, but they do subtly different things from kill and top respectively: they manipulate processes inside containers, not containers themselves. This is inconsistent with what we already learned. Does docker ls list images? Heck no, it doesn't exist. And of course you might guess that, if a Docker container is analogous to a process, stopping a container would cause it to disappear forever. It doesn't. It has to be manually cleaned up afterwards.

In other words this analogy, thinking of Docker containers as processes, has given you, the new Docker user, nothing at all. This familiar touchstone of ps was useful to you for all of two characters, then it became active disinformation.

And then it turns out that the command for deleting a container is docker rm. So Docker containers are processes, but they're also files? This second analogy collides violently with the first analogy, and doesn't work nearly as well to describe Docker containers, and has all of the same problems: The flags on docker rm are nothing like the flags on rm; to rename a file you use mv, but to rename a container you use not docker mv (which doesn't exist) but docker rename; and docker cp does the last thing you would expect: it doesn't copy containers, it copies files into or out of a container.

So what was the point?

Nothing we already know about ps and rm has helped us understand docker ps and docker rm. Nothing we know about related Unix commands has helped us understand the related Docker commands. Nothing we know about processes or files has helped us understand Docker containers or images.
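To put a number on it, here is a rough tally of the guesses made above, sketched in Python. The three sets simply restate the outcomes reported in this article; they are not checked against any particular Docker release, and the function names are hypothetical.

```python
# Outcomes as reported in this article, not verified against a Docker release.
MISSING = {"pkill", "pgrep", "ls", "mv"}      # "docker <cmd>" does not exist
DIFFERENT_MEANING = {"kill", "top", "cp"}     # exists, but does something else
DIFFERENT_FLAGS = {"ps", "rm"}                # exists, but the flags differ

ALL_GUESSES = sorted(MISSING | DIFFERENT_MEANING | DIFFERENT_FLAGS)


def guess(unix_command):
    """The naive guess a newcomer makes: prefix the Unix command with docker."""
    return f"docker {unix_command}"


def outcome(unix_command):
    """Classify how well the naive guess holds up, according to the article."""
    if unix_command in MISSING:
        return "does not exist"
    if unix_command in DIFFERENT_MEANING:
        return "exists, different meaning"
    if unix_command in DIFFERENT_FLAGS:
        return "exists, different flags"
    raise KeyError(unix_command)


def helpful_guesses():
    """Guesses where Unix knowledge carries over unchanged."""
    return [c for c in ALL_GUESSES if outcome(c) == "works as expected"]


if __name__ == "__main__":
    for command in ALL_GUESSES:
        print(f"{guess(command):<14} {outcome(command)}")
    print("helpful guesses:", len(helpful_guesses()))
```

No entry ever lands in a "works as expected" bucket, which is the whole complaint.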

These are particularly frustrating choices because Docker itself isn't all that difficult to understand, and there are two incredibly simple (albeit mutually exclusive) ways to make this interface more coherent.

One is to fully embrace the analogy: Docker images are (executable) files, Docker containers are running processes. Make it so that Docker commands interacting with images closely resemble Unix commands which interact with files and Docker commands interacting with containers closely resemble Unix commands which interact with processes. Make it so that the same is true of flags on the commands. Where the analogy breaks down, as it eventually must, make it a clean, obvious break, rather than straining to continue to adhere to it. The command docker cp must either copy Docker images, or not exist. The command to copy files into or out of containers — an operation with no good analogy in the realm of processes — is a brand new command with no precedent in Unix, so it must have a new name. Again, the same applies to flags; invent new flags, do not reuse old ones.

The other approach is to fully abandon the analogy and embrace Docker as its own thing. Images are simply images, containers are simply containers. Refer to them by name. docker image create, docker container list. Points of familiarity, such as the resemblance to executables/processes (or to virtual machine images or application server applications), are for the user to discover, and maybe for tutorial documentation, but not for the interface itself. In this way you accept up front that every analogy eventually breaks down, and you aren't tempted to strain it.
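As a sketch of how little machinery this second approach needs, here is a toy command-line grammar built with Python's argparse. It is purely illustrative: the parser, the spelled-out verb names and the names argument are my own assumptions, not anything Docker ships.

```python
import argparse

# Hypothetical grammar: every noun accepts exactly the same verbs.
NOUNS = ("container", "image", "volume", "network")
VERBS = ("create", "inspect", "list", "remove")


def build_parser():
    """Build a 'docker <noun> <verb> [names...]' parser with no borrowed Unix names."""
    parser = argparse.ArgumentParser(prog="docker")
    nouns = parser.add_subparsers(dest="noun", required=True)
    for noun in NOUNS:
        noun_parser = nouns.add_parser(noun)
        verbs = noun_parser.add_subparsers(dest="verb", required=True)
        for verb in VERBS:
            verb_parser = verbs.add_parser(verb)
            verb_parser.add_argument("names", nargs="*")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["container", "list"])
    print(args.noun, args.verb)
```

Anything outside the grammar, such as a bare ps, is rejected outright by the parser, so there is nothing left over to be half-remembered from Unix.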

Here, Docker is acting as a convenient punching bag. But this principle can be applied universally.

A good strong analogy gives the new user a running start. It can accelerate or short-circuit volumes of tutorials. But choosing a good analogy takes skill and implementing it requires great care. You need to put yourself in the new user's shoes and try to see everything they might understand or misunderstand. This is a different skill set from software development!

You need to be consciously aware of everything that there is to be gained by using that analogy... or lost, because a bad analogy throws an annoying roadblock in front of all of your users, over and over again, forever. And if you have that awareness, you may discover that your software does not support any particular analogy far enough for it to be worth doing. This is regrettable, but it's not a tragedy.

In conclusion: these appear to be very minor, inconsequential decisions, but they warrant extended group discussion because of their collective significance. I am, in fact, calling for us to slow down and do a little bikeshedding.

And while I'm at it, shipping containers don't swarm!
