By far the most impactful product of the Apretus project is its people. To quote a memorable line from Dominique Paul (https://www.thisiscrispin.com/):
> What most people miss IMO is that this is not a team who is doing this for the fourth time like virtually any other LLM provider and who could learn from its own past experiences. I bet if the team would do another model training they could get way better results at one fourth of the costs.
Very fair pushback -- I did get carried away and will update the article to be more precise. Thanks for raising it!
> For less insane, non-bash shells there is always nc which is usually probably the wiser choice.
For completeness, neither `nc` nor any netcat equivalent I could think of was available in the image I was trying this with. It would certainly be a better option though.
This worries me. Some AI writing styles became mainstream; at first it was the em-dashes, now it’s “A, not B” patterns and excessive acknowledging. There will be more.
Was grandparent comment written by an LLM?
Or is this a human who copies a style they saw in a blog post, unaware that they’re copying an AI?
Or is this a human who spent too much time talking to an AI and now they just talk like this?
Or is this an organic human response and we’re all paranoid by now?
When learning a language, I've heard it's good to find a reference speaker, such as a prolific actor, and mimic them in order to absorb several aspects of what makes them sound authentic as a speaker, such as vocabulary, intonation, diction, pacing.
For many in the next generation of language learners, this reference will be Claude.
I think that the fact that AI has a very recognizable singular style is a problem. And this problem will be solved, sooner or later. It probably isn't a very important problem, because I feel that it should be relatively easy to solve (but maybe I'm wrong?).
But with smarter AI I do believe it'll become more fluent at choosing diverse idioms and phrasing, rather than repeating one thing over and over to the point of being comically similar. So people who learn from AI-generated text will not learn just one recurring style.
Insightful, and scary! Imitating an imitation machine... even if no one is trying to intentionally do so, McLuhan's "we become what we behold" is inescapable.
It's pretty rough to learn I sound like Claude. Will need to do something about it then.
(For what it's worth I did write the message above manually but I understand why no one would believe that now. At least I did not call netcat "load-bearing" [https://mareksuppa.com/til/load-bearing/] or something...)
I did not think you sounded like Claude. Then I looked again after the comment was made and saw some of the vibes, like acknowledging a mistake you made.
Before, that would just have made you one of the top 5% (or maybe top 1%) of the nicest people to talk to. Now people think you are Claude.
I’m torn. It’s a great thing to share knowledge and take feedback graciously. Maybe this kind of comment will encourage more of that. But you also need people to tell you what is up without unnecessary filters. It’s a challenge
FWIW, I didn't read this as AI-like. Even on a re-read, it's only the quasi-em-dash, and _maybe_ the polite acknowledgement of "Very fair pushback" (just good etiquette, IMO!) that would ring any alarm bells. You're fine.
I was really just trying to see if intra-container connectivity works, and this ended up being a very quick way of doing so. (The alternative being building and deploying a new image, which would likely take significantly longer.)
It seems pretty cool, but I am wondering if there's any drawback on just using images that support curl? I can't think of any and to me it's kinda a must have, even on production images
I always recommend to not have any dependencies outside of the code.
So we start at compiling the codebase (Rust) against MUSL. That way we can run it with FROM scratch images.
If we need more tooling available at runtime, then we look at alpine, but still using MUSL.
If MUSL itself is proving problematic, or if some of the libraries we use need glibc then we can look at using some locked down image.
The cool part about FROM scratch images is that you'll never have to update your base image to address CVEs. Only your software and its (compiled) dependencies.
> The cool part about FROM scratch images is that you'll never have to update your base image to address CVEs. Only your software and its (compiled) dependencies.
What's the benefit really, though? If you still need to be able to rapidly deploy a new image in response to a dependency CVE, what have you gained?
You want to ship every debug utility you will need in every image? Just seems wasteful. What about 3rd party images, you will respin images just to add your preferred toolset?
> every debug utility you will need in every image? Just seems wasteful
How wasteful though? I have to admit, I envy the person whose own codebase is so lean and space-conservative that the size of GNU coreutils, curl, nano, etc. would show up as anything but a rounding error in the image size.
I see it like putting a thermometer in a turkey before I stick it in the oven. Sure, the thermometer adds thermal mass itself, making the turkey take a few seconds longer to cook, but the value of it being there is greater than the cost imposed.
Nope, not my position at all. I want to have a minimal OS environment with rudimentary tools available with zero extra friction. FROM alpine:latest adds less than 10MB and covers 95% of use cases. Typically depending on the container I will often throw in curl and some other QoL tools too.
For the rare cases where you find yourself needing to attach a debugger to your pods running in staging/prod, a debug container is absolutely the right tool to reach for.
Both. Many attacks take the form of an exploit to get a shell, then using available utilities to exploit the kernel and escape to the host. If your image has neither a shell nor utilities, that approach won't get very far.
What percentage of CVEs can be used to obtain a shell, but can't otherwise be used to obtain some other form of code execution in a distro-less container?
I haven't run any stats and am certainly not an expert but I would expect quite a few. In the one scenario you merely need to pull off an exec with a valid path. In the other you need to either write a block of memory and mark it as executable or else write your payload out to disk and mark the file executable. So it's the difference between being able to pull off a single syscall versus most likely needing arbitrary code execution.
preface: I'm not asking things rhetorically, I genuinely want to learn here.
> to not have any dependencies outside of the code.
> ... FROM scratch images is that you'll never have to update your base image to address CVEs...
So a FROM scratch image basically doesn't have things like a package manager to install things, and maybe also lacks the libraries that things like curl would depend on? Sorry for my ignorance, I've heard of FROM scratch but never tried it.
If you want to run as another user, you need to manually add an /etc/group and /etc/passwd (or generate them in an earlier stage and copy them over).
If you need ca-certificates, you need to install ca-certificates in a stage before that and copy over /etc/ssl/certs/ca-certificates.crt from that stage to your current one.
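A rough sketch of the "generate them in an earlier stage" step, written as plain shell so it can run in any builder stage. The function names, the user name `app`, the IDs, and the staging directory are all made up for illustration; only the file formats and the `ca-certificates.crt` path come from the comments above.

```bash
#!/usr/bin/env bash
# Sketch: prepare the few files a FROM scratch image lacks, inside a staging
# directory that a later build stage would copy into the final image.
# All names and IDs below are illustrative assumptions.

# make_account_files ROOT USER UID GID
make_account_files() {
  local root=$1 user=$2 uid=$3 gid=$4
  mkdir -p "$root/etc"
  # passwd fields: name:password:UID:GID:comment:home:shell
  printf '%s:x:%s:%s::/nonexistent:/sbin/nologin\n' "$user" "$uid" "$gid" \
    > "$root/etc/passwd"
  # group fields: name:password:GID:members
  printf '%s:x:%s:\n' "$user" "$gid" > "$root/etc/group"
}

# copy_ca_bundle ROOT [SOURCE]
# Copies the CA bundle from the builder stage (where ca-certificates was
# installed) to the same path under ROOT.
copy_ca_bundle() {
  local root=$1 src=${2:-/etc/ssl/certs/ca-certificates.crt}
  mkdir -p "$root/etc/ssl/certs"
  cp "$src" "$root/etc/ssl/certs/ca-certificates.crt"
}
```

Something like `make_account_files ./staging app 10001 10001` followed by `copy_ca_bundle ./staging` would then leave a `./staging/etc` tree ready to be copied into the scratch stage.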
For what it's worth, this container used `python:3.12.2-slim-bookworm` and I really would not expect that sort of an image to bundle `curl` -- even if it is intended for production.
Ah I see so it was basically a minimal image that bundles just python? I can see why it wouldn't bundle curl! Thought it was a custom Image for some reason, hence my original comment
More than one ~500 employee company I've worked at has had security policies either encouraging or requiring the use of "distro-less" images - images with no OS components other than the absolute minimum required to run the application. For Go binaries this meant literally nothing in the container apart from the executable.
In theory it has a couple of benefits. You don't have to re-deploy your image to patch CVEs in OS components if you don't have any OS components. And it provides some measure of defence-in-depth - one could certainly theory-craft a scenario where an attacker gains some limited control over your application and then uses some OS component to escalate.
These days if a security engineer is proposing my team adopt distro-less containers to receive these benefits, I would point out that we need to weigh them against the very real drawbacks of not having standard debugging tools available where and when we need them. And also to consider the relative impact of other defence-in-depth measures they could be pursuing instead - such as any sort of network policy to limit network traffic.
> not having standard debugging tools available where and when we need them
Keeping in mind that containers are merely a bunch of namespaces, there's nothing stopping you from entering the same PID namespace with a different mount namespace in order to debug.
I am aware, thank you :). I responded to a sibling dupe-comment over here [1].
To summarize, in my experience there is immense value to having basic shell tools available in the environment where you need them with zero extra friction. Stripping those out provides a security benefit only in specific nebulous and niche scenarios.
> in my experience there is immense value to having basic shell tools available in the environment where you need them with zero extra friction
I agree, however assuming you maintain a chroot for debugging this can be accomplished with a shell command that takes a single argument to target a running container by name.
Your linked comment suggests being limited to Kubernetes, but nsenter and a chroot are entirely runtime agnostic.
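To make the nsenter-plus-chroot idea concrete, here is a minimal sketch. It assumes root on the host, a debug root filesystem you maintain yourself at a hypothetical `/opt/debug-root`, and (only for the PID lookup) Docker; other runtimes expose the container's init PID differently. The function names are invented for this example.

```bash
#!/usr/bin/env bash
# Sketch: get a shell that shares a running container's network and PID
# namespaces while using a separate debug root filesystem, so the image
# itself needs no shell or tools. Must be run as root on the host.

# Assumed location of a chroot containing /bin/sh and your debugging tools.
DEBUG_ROOT=${DEBUG_ROOT:-/opt/debug-root}

# debug_pid PID: join the network and PID namespaces of the given host PID,
# then chroot into DEBUG_ROOT. With DRY_RUN set, only print the command.
# Note: tools such as ps also need a /proc mounted inside the chroot.
debug_pid() {
  local pid=$1
  ${DRY_RUN:+echo} nsenter --target "$pid" --net --pid \
    chroot "$DEBUG_ROOT" /bin/sh
}

# debug_container NAME: resolve the container's init PID, then enter.
# The docker lookup is an assumption; swap in your runtime's equivalent.
debug_container() {
  local pid
  pid=$(docker inspect --format '{{.State.Pid}}' "$1") || return 1
  debug_pid "$pid"
}
```

Running `DRY_RUN=1 debug_pid 4242` prints the command without executing it, which is a cheap way to check what would be run before doing it against a production host.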
This of course only supports http, not https. It's great for health checks e.g. in a docker environment. To do https, you'd have to use something like socat, but of course that doesn't use bash only.
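For readers who have not seen the trick, a minimal sketch of a bash-only HTTP health check follows. It relies on bash's `/dev/tcp/HOST/PORT` redirection; the function names and the example host are invented here and are not taken from the article.

```bash
#!/usr/bin/env bash
# Sketch: plain-HTTP GET using only bash (no curl, wget, or nc).
# /dev/tcp/HOST/PORT is handled by bash itself, not the filesystem, so this
# will not work in sh/dash or in a bash built without network redirections.

# Build a minimal HTTP/1.0 request. With HTTP/1.0 and "Connection: close"
# the server closes the socket after responding, which ends the read loop.
build_request() {
  local host=$1 path=$2
  printf 'GET %s HTTP/1.0\r\nHost: %s\r\nConnection: close\r\n\r\n' \
    "$path" "$host"
}

# http_get HOST PORT [PATH]: print the raw response (status line, headers,
# body). Intended for text responses; trailing CRs are stripped per line.
http_get() {
  local host=$1 port=$2 path=${3:-/} line
  # Open fd 3 read/write on a TCP connection; fails if nothing is listening.
  exec 3<>"/dev/tcp/${host}/${port}" || return 1
  build_request "$host" "$path" >&3
  while IFS= read -r line || [[ -n $line ]]; do
    printf '%s\n' "${line%$'\r'}"
  done <&3
  exec 3<&-   # close the connection
}
```

A call such as `http_get other-service 8080 /healthz | head -n 1` (hypothetical service name) would show just the status line, which is usually all a connectivity check needs.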
Author here. I wrote this after setting up Claude Code with MiniMax and Z.AI and realizing their docs all tell you to paste API keys into settings.json in plaintext -- which is risky given that Claude Code has been known to read .env files and leak contents into session transcripts. I already use KeePassXC, so I wrote a shell wrapper that fetches the key at invocation time and passes it as an inline env var. Nothing is written to disk. The same pattern works with any password manager CLI -- op read for 1Password, bw get password for Bitwarden, pass show for pass. Happy to answer questions.
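The pattern described above can be sketched roughly like this. This is not the author's wrapper: the function name, the `pass` entry path, and the environment variable name are assumptions chosen for illustration, and you would substitute whatever your provider and password manager actually use.

```bash
#!/usr/bin/env bash
# Sketch: fetch an API key from a password manager at invocation time and
# hand it to a single command as an environment variable, instead of storing
# it in a settings file.

# KEY_CMD prints the secret on stdout. `pass show` is one of the CLIs named
# above; the entry name is hypothetical.
KEY_CMD=${KEY_CMD:-pass show llm/provider-api-key}
# Name of the variable the target program reads (an assumption).
KEY_VAR=${KEY_VAR:-ANTHROPIC_AUTH_TOKEN}

# with_key COMMAND [ARGS...]
with_key() {
  local key
  # KEY_CMD is left unquoted on purpose so it can carry its own arguments.
  key=$($KEY_CMD) || { echo "could not fetch key" >&2; return 1; }
  # `env VAR=value cmd` sets the variable for that one process only; the
  # calling shell's environment is left untouched.
  env "$KEY_VAR=$key" "$@"
}
```

Usage would look like `with_key claude`. The key still lives in the child process's environment while it runs, so this avoids plaintext config files rather than hiding the key from the process itself.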