> This nuance is something that only the nix model started to capture at all.
Unpopular opinion, loosely held: the whole attempt to share any dependencies at all is the source of evil.
If you imagine the absolute worst case scenario that every program shipped all of its dependencies and nothing was shared then the end result would be… a few gigabytes of duplicated data? Which could plausibly be deduped at the filesystem level rather than build or deployment layer?
Feels like a big waste of time. Maybe it mattered in the 70s. But that was a long, long time ago.
I think the storage optimization aspect is secondary; it is more about keeping control over your distribution. You need processes to replace all occurrences of xz with an uncompromised version when necessary. When all packages in the distribution link against one and the same copy, that's easy.
Nix and guix sort of move this into the source layer. Within their respective distributions you would update the package definition of xz and all packages depending on it would be rebuilt to use the new version.
Using shared dependencies is a mostly irrelevant detail that falls out of this in the end. Nix can dedupe at the filesystem layer too, e.g. to reduce duplication between different versions of the same packages.
You can of course ship all dependencies for all packages separately, but you have to have a solution for security updates.
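To make the "update xz, rebuild everything that depends on it" step concrete, here is a rough sketch. It is not how Nix or guix implement it (there the rebuild falls out of each package's build inputs changing). The graph and every package name other than xz are made up for illustration.

```python
from collections import deque

# Hypothetical dependency graph: package -> its direct dependencies.
# Only "xz" comes from the discussion; the other names are invented.
DEPS = {
    "xz": [],
    "libfoo": ["xz"],
    "tool-a": ["libfoo"],
    "tool-b": ["xz"],
    "tool-c": [],
}


def rebuild_set(deps: dict, changed: str) -> set:
    """Return `changed` plus everything depending on it, directly or transitively."""
    # Invert the graph so we can walk from a package to its users.
    dependents = {}
    for name, requires in deps.items():
        dependents.setdefault(name, [])
        for dep in requires:
            dependents.setdefault(dep, []).append(name)

    # Breadth-first walk over the reverse edges.
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for user in dependents.get(current, []):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen
```

Calling `rebuild_set(DEPS, "xz")` gives xz, libfoo, tool-a and tool-b, while tool-c is left alone. If every package bundled its own xz instead, the same walk would still be needed, just across every bundle rather than one package definition.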
Node.js basically tried this: every package gets its own copy of every dependency in node_modules. Worked great until you had 400 MB of duplicated lodash copies and the memes started.
pnpm fixed it exactly the way you describe though: content-addressable store with hardlinks. Every package version exists once on disk, projects just link to it.
So the "dedup at filesystem level" approach does work, it just took the ecosystem a decade of pain to get there.
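For anyone curious what "content-addressable store with hardlinks" boils down to, here is a minimal sketch of the idea. It is not pnpm's actual code or on-disk layout; the function names, the store directory and the fake lodash payload are all assumptions for the example.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def add_to_store(store: Path, data: bytes) -> Path:
    """Write data into the store once, keyed by its SHA-256 digest."""
    store.mkdir(parents=True, exist_ok=True)
    blob = store / hashlib.sha256(data).hexdigest()
    if not blob.exists():  # identical content is only ever written once
        blob.write_bytes(data)
    return blob


def link_into_project(store: Path, data: bytes, dest: Path) -> None:
    """Expose a file inside a project as a hardlink to the store blob."""
    blob = add_to_store(store, data)
    dest.parent.mkdir(parents=True, exist_ok=True)
    if dest.exists():
        dest.unlink()
    os.link(blob, dest)  # hardlink: same inode, no extra data blocks


def demo() -> tuple:
    """Link one file into two projects; return (link count, same inode?)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        store = root / "store"
        # Stand-in content, not the real lodash.
        payload = b"module.exports = 'pretend this is lodash';\n"
        a = root / "project-a" / "node_modules" / "lodash" / "index.js"
        b = root / "project-b" / "node_modules" / "lodash" / "index.js"
        link_into_project(store, payload, a)
        link_into_project(store, payload, b)
        return a.stat().st_nlink, a.stat().st_ino == b.stat().st_ino
```

On a filesystem that supports hardlinks, `demo()` shows both project paths pointing at the same inode as the store blob, so the bytes exist once on disk no matter how many projects use them. The catch is that hardlinks only work within a single filesystem, so the store and the projects have to live on the same volume.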
> If you imagine the absolute worst case scenario that every program shipped all of its dependencies and nothing was shared then the end result would be… a few gigabytes of duplicated data?
Honestly, I've seen projects that do this. In fact, a lot of projects that do this, at the compilation level.
It feels like a lot of the projects that I would want to use from git pull in their own dependencies via submodules when I compile them, even when I already have the development libraries needed to compile them. It's honestly kind of frustrating.
I mean, I get it - it makes it easier to compile for people who don't actually do things like that regularly. And yeah, I can see why that's a good thing. But at the very least, please give me an option to opt out and to use my own installed libraries.