
I do not intend to play down the importance of using Docker carefully.

But the reproducible build aspect of the critique seems unnecessary to me: isn't that more a concern of the packaging system? (I'm not a Python scripter.)

If your packaging system supports version selection/locking, then use your packaging system correctly. If your packaging system cannot pin a version, how should Docker solve this?



Docker can't escape all the blame here - its layer caching mechanism is IMHO flawed. It's fine to say that a packaging system should offer reproducibility but Docker's layer caching design assumes that every RUN command produces reproducible results.

You could of course blame users for not making sure that all the commands they use in their Dockerfiles are actually reproducible, but many/most examples, even in the official documentation, are clearly not reproducible.

Therefore you end up with what is in my opinion a semi-broken system - building images seems to be reproducible (and fast) until you lose your layer cache or you spin up a new CI build agent or a new dev joins the team and tries to build the same image.

Not that I can think of a clean and performant solution to this problem.


We have been working on a simplified container build system which does away with layers altogether. [1]

The use of layers at the build stage adds a lot of needless complexity with very little benefit, and users really need to step back and question the value they are getting from layers. [2]

Words like 'immutability', 'declarative' and 'reproducibility' are often used in ways that can lead to user misunderstanding, and the goals behind them can be accomplished with simpler workflows. For instance, immutability, reuse and composition do not require layers. There needs to be a lot more technical scrutiny to avoid confusion.

[1] https://www.flockport.com/docs/containers#builds

[2] https://www.flockport.com/guides/say-yes-to-containers.html


This is a great point -- you could potentially solve this using --cache-from which makes the layer cache explicit, and not something that varies between dev / CI / new devs.


This issue isn't Docker; Docker itself obviously can't deal with this. The issue is that the examples and tutorials people provide on Docker packaging often don't talk about the requirements (pun intended) for reproducible builds at all.

Or they don't talk about the need to run as non-root.

Or they suggest base images that are often broken in subtle ways (Alpine Linux).

Or they talk about multi-stage builds for small images, and neglect to explain that you've just destroyed your caching (this is fixable, but you need to know to expect it and how to fix it.)

Etc.
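
As one illustration of what such tutorials tend to leave out, here is a minimal sketch of the non-root point. The base image tag, the user name "appuser" and main.py are assumptions made for the example, not anything from the article.

```dockerfile
# Assumed base image; Debian-based so that useradd is available.
FROM python:3.11-slim

# Create an unprivileged user (the name "appuser" is hypothetical).
RUN useradd --create-home appuser

WORKDIR /home/appuser/app

# Copy the project and hand ownership of the files to the unprivileged user.
COPY --chown=appuser:appuser . .

# Instructions after this line, and the container's main process, run as appuser.
USER appuser

# main.py is a placeholder entry point.
CMD ["python", "main.py"]
```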


The caching part is (mostly) fixable by making sure to always add your source code in two stages - dependency files first, then resolve dependencies, and the rest of the source code later.
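
A rough sketch of that two-stage ordering for a Python project might look like the following. The base image tag, requirements.txt and main.py are assumptions for illustration; other ecosystems would copy their own manifest or lock files instead.

```dockerfile
# Assumed base image for the example.
FROM python:3.11-slim

WORKDIR /app

# 1. Copy only the dependency manifest. The cache for this layer is
#    invalidated only when requirements.txt itself changes.
COPY requirements.txt .

# 2. Resolve dependencies. This layer is reused from cache for as long
#    as the layer above is unchanged.
RUN pip install --no-cache-dir -r requirements.txt

# 3. Copy the rest of the source. Editing application code invalidates
#    the cache from this step onward, not the dependency install above.
COPY . .

# main.py is a placeholder entry point.
CMD ["python", "main.py"]
```

Note that this only avoids needless reinstalls while the cache exists. Whether a cold build gives the same result still depends on the versions in requirements.txt being pinned.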

Rarely is this done though. And alpine+musl definitely doesn't always do what you might expect, and it's often language dependent whether or not you'll encounter something strange (not to mention that you forfeit bash).


>If your packaging systems supports version selection/locking, then use your packaging system right.

That's exactly what the article is recommending in point 2. The original Dockerfile author was using pip in a way that's only intended for development. Having a requirements.txt file is the correct way to use pip when distributing a project.


Most Python packaging systems don't include the Python VM itself, though they can specify which version is required. In the post, Docker is used to provide a specific version of the Python VM.


Docker is a packaging system.


Thank you for clarifying this.

Did you understand the point I tried to make nonetheless, or do I need to detail it?



