To anyone interested in addressing dependency hell with as small a footprint as possible, I encourage you to check out Nix and NixOS. Flatpak, Docker, and Ubuntu Snappy each take a slightly different bite out of the problem but still rely on imperative provisioning of a stateful system, and this comes with costs in system size, system indeterminacy, or both. What Flatpak splits into Bundles and Runtimes, Nix treats uniformly: a package's dependencies are specified in a configuration file, and these files are composable, so shared dependencies are not replicated (as they are in the container model) and the dependency footprint is minimized. A package lives in a directory identified by a cryptographic hash of its build inputs, so if an input ever changes, the result is simply a new directory; other packages are unaffected, name conflicts are avoided, and multiple versions of an application can coexist without breaking other applications that depend on them. This means sandboxing comes for free without any of the overhead of a container. The Nix package repo is pretty impressive given how small the mindshare is compared to dpkg, RPM, etc. I am frequently surprised at how many small and lesser-used applications have been packaged.
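The hash-addressed store described above can be sketched in a few lines. This is a simplified illustration of the idea, not Nix's actual derivation hashing; the names and hash scheme here are hypothetical:

```python
import hashlib

def store_path(name, version, deps):
    """Derive a Nix-style store directory from a hash of the build inputs.
    Simplified sketch -- real Nix hashes the full derivation, not just names."""
    h = hashlib.sha256()
    h.update(f"{name}-{version}".encode())
    for dep in sorted(deps):  # dependencies are part of the package's identity
        h.update(dep.encode())
    return f"/nix/store/{h.hexdigest()[:32]}-{name}-{version}"

# Different inputs yield different directories, so two OpenSSL versions
# coexist, and nothing that depends on the old one is disturbed.
old = store_path("openssl", "1.0.2", ["glibc-2.23"])
new = store_path("openssl", "1.1.0", ["glibc-2.23"])
```

Because the path encodes the inputs, "upgrading" a dependency never overwrites anything in place; it just produces a new path that new builds refer to.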
Is Guix as rough around the edges as Nix? I have NixOS installed in QEMU, but so far I don't find the experience pleasant. The ideas are great, though; this is definitely the direction for all package managers to move in.
I've no experience with Nix, but I tried the Guix package manager and it was fine. I didn't hit any edge cases, but I didn't dive deep. I tried installing GuixSD (the OS) too, but failed, and because I was in a hurry, I couldn't investigate much. I did hit one critical issue: scarce docs. It's nowhere near FreeBSD, where the docs actually help you out without a Google search.
As someone that maintains a huge selection of packages, and an automated installer, for a very complex and interdependent set of OS vendor and third party packages, I think this is amazingly cool.
This solves nearly all of the problems with container-based distribution of software (like Docker), while providing all of the benefits. It doesn't give up the capabilities of traditional package managers (which are very good for their purpose and in the right place), nor does it impose downloading huge image files for every new thing you install (no matter how small they make the container OS, it's still bigger than sharing a few runtimes and having filesystem-level de-dup and snapshot capabilities).
I'm going to start working on a Flatpak version of our installation pretty much immediately (or, next week, once I get some other stuff rolled out). It is potentially a huge long-term time-saver for me, and a nice improvement for our customers, too, in that it would make every deployment look identical.
Flatpak is a much better name. Also, better name than dnf or apt-get or rpm or dpkg, too. (But, not a better name than yum, which I'm still bitter about the Fedora/RH folks changing.)
The problem with all of these app bundle systems is that you end up with 500 MB - 2 GB binaries for every app. Because dependencies in OSS are generally quite coarse-grained, you end up pulling almost the entire user space of an operating system into your app.
Disk space, memory space (process size and lack of page sharing), and the size of app updates are all an issue (you don't want gigabytes of app updates every night).
Moreover, there is no standard format for dependencies -- that is completely distro-dependent, or you have to grovel through the README of every project and hand-tune it (boiling the ocean).
Does Flatpak address this problem at all?
A couple years ago I was trying to write a system that addressed this. You basically use content addressed storage and chroots in the fashion of Plan 9 namespaces. And you can also use differential compression for slim network updates.
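The content-addressed storage scheme mentioned above can be sketched as a toy blob store, where a file's name is the hash of its contents so identical dependencies are stored exactly once (hypothetical sketch, not any real tool's on-disk format):

```python
import hashlib

class BlobStore:
    """Toy content-addressed store: duplicate contents are stored once,
    no matter how many app bundles reference them."""
    def __init__(self):
        self.blobs = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(key, data)  # storing duplicate content is a no-op
        return key

    def get(self, key: str) -> bytes:
        return self.blobs[key]

store = BlobStore()
k1 = store.put(b"libssl contents")
k2 = store.put(b"libssl contents")  # same bytes arriving via another bundle
```

Two bundles that ship the same library end up referencing the same blob, which is what makes sharing work without any coordination between the bundles' authors.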
But I ran into the problem of boiling the ocean... I wrote like 2000-4000 lines of shell scripts just to build one kind of app (involving R, which involves Fortran...). I had to duplicate all the work of groveling through the dependencies, which is more difficult than you'd think for any non-trivial application. Debian dependency metadata is full of surprises like virtual packages and all that. The algorithm to enumerate them is not straightforward, and it generates huge app bundles. Also, many projects have non-trivial build-time config options and downstream patches.
Is anyone else working on an app bundle system that solves the problems of disk space, memory size, and update size, as well as versioned dependency specification and resolution?
To me Docker seems to punt most of the problem on to the user, and then everybody ends up with OS-sized applications because they can't be bothered to trim dependencies. This isn't good for many reasons, but it's particularly bad for security. Then you have to pile hacks upon hacks and try to scan containers for vulnerabilities that you shouldn't have put there in the first place.
EDIT: I haven't looked in more detail, but it seems like Flatpak creates an artificial separation between "runtimes" and "bundles". Maybe that is a little hack that will work in practice -- my suspicion is that it will simply recapitulate the dependency versioning problem at a more coarse-grained level. The natural tendency is for runtimes to become bloated and thus require frequent updates.
Yes, it seems to. As I understand it, there's the concept of a runtime, which is kinda like an OS installation (but it's an immutable image type thing that gets distributed identically to everyone who wants that runtime), which packages can be built against, meaning the application package only needs to include its own specific dependencies that aren't part of the runtime. So, similar to how one might build for Fedora 24 now, one could build for a Fedora runtime, and any user who had that runtime wouldn't need that image to be downloaded...they would already have it. So, if you only installed packages provided by your OS vendor, the size of things wouldn't change much.
Where things begin to look different is that any Flatpak can be installed on any system that has Flatpak. So, don't have a runtime that an app you want is built against? No problem, this'll grab it and set it up, transparent to the user. And, next time you want another package built against that runtime, it won't need to add it. This is like being able to install anything from Debian repos on CentOS, or vice versa; it's also like being able to install from Debian stable, unstable, and testing all willy-nilly without fear of it breaking other packages.
At least, that's how I understand it...but, that'll be many years coming. This is such a fundamental shift in how we think about packages that it's unlikely to catch on quickly. But, I'm a package manager nerd, and I can't believe I didn't realize this was underway; I would have started following it earlier. I love this idea.
Yeah, I just noticed that and edited my comment. To me, that just makes the dependencies more coarse-grained -- now you have this artificial 2-level system of apps and runtimes. You've just recreated the same problem again with bigger blobs.
So I can't see right now how it's going to work. I feel like it will flame out. But this kind of thing involves a lot of social problems in addition to technical problems, so you never know.
EDIT: A more concrete example. Apache, nginx, Python, Ruby, Perl, node.js, Erlang web servers, etc. all depend on OpenSSL.
OpenSSL is 500K lines of code. When an important update inevitably comes out for it, what happens? Are you now updating 10+ runtimes that are 1+ GB each? Do they use differential compression at all?
I don't see how the two-level hierarchy really solves any problems. It could be the worst of both worlds (fine-grained sharing on one side, duplicating all dependencies on the other).
"OpenSSL is 500K lines of code. When an important update inevitably comes out for it, what happens? Are you now updating 10+ runtimes that are 1+ GB each? Do they use differential compression at all?"
Yes. The images are being updated via snapshot (transparent to the user). Just as RPMs can currently be updated with delta RPMs. The Poettering blog post about it that someone linked in another thread is really helpful: http://0pointer.net/blog/revisiting-how-we-put-together-linu...
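The delta idea can be made concrete with a naive block-level sketch: only chunks of the new image that are absent from the old one need to be transferred. This is illustrative only -- it is not the actual OSTree or deltarpm wire format, and real tools use content-defined rather than fixed-size chunking:

```python
import hashlib

CHUNK = 4096  # fixed-size blocks for simplicity

def chunks(data: bytes):
    return [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]

def delta(old: bytes, new: bytes):
    """Return only the blocks of `new` not already present in `old`."""
    have = {hashlib.sha256(c).digest() for c in chunks(old)}
    return [c for c in chunks(new) if hashlib.sha256(c).digest() not in have]

old_image = b"A" * 8192              # two unchanged blocks
new_image = old_image + b"B" * 4096  # one new block appended
```

So even if a runtime is 1+ GB, an OpenSSL fix that touches a handful of files costs a download proportional to the change, not to the runtime.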
It's not an amateur attempt at the problem. It may not be perfect (it probably is far from it for the next several years), but it does seem to have addressed the biggest issues in a reasonable way. We definitely need a way forward beyond the current generation of package managers; Nix has a really cool solution (and I would be happy with Nix getting more traction), and Guix does, too. Docker, on the other hand, which is the currently fashionable option for solving these problems, leaves a lot to be desired, and does fall prey to some of the concerns you raise.
Good question... Windows doesn't have this model, because it has "DLL hell". Windows was an ad hoc solution partway between "fine-grained curated shared dependencies" (Debian) and "duplicate all your dependencies" (app bundles/Docker).
The solution was up to each individual app author -- you could choose to bundle DLLs, but mostly you didn't bundle "system" DLLs, if you judged them to be low-level enough, widely deployed enough, etc. Installers on Windows also listed MANUAL prerequisites that users had to install, so they punted on a big part of the problem (give me the name of a thing to install and make it magically work; don't make me manually resolve dependencies).
There was no real consistent model on Windows, and if you were completely consistent with sharing everything, you would have an even more fragile mess than Windows apps were already.
iOS and Android are successful app bundle ecosystems, and something these Linux app bundles are explicitly trying to emulate. The apps end up around 50 MB or so, not 500 MB to 2 GB.
The reason for that is basically that the apps are less rich and featureful (think Photoshop or OpenOffice or a browser), and that the OS ships lots of high-level libraries. Stuff like video codecs is part of the OS/hardware on iOS/Android, whereas on Debian and Windows they are applications. Stuff like browser UI views is part of the OS as well.
Android is an entire OS, with apps, built from a single Makefile! So there is less versioning skew and fragmentation than in Debian/Windows (believe it or not).
iOS has some issues with app compatibility across OS versions that Windows definitely doesn't.
Anyway, iOS and Android are definitely the models to emulate in terms of usability and stability. But I don't think they have completely solved the problems because the ecosystems aren't as rich as Debian or Windows yet.
I think the Windows story is becoming more robust by using mostly isolated apps. Very little is shared these days, and DLL hell isn't really an issue anymore.
The only shared libraries are things like the C++ runtimes, DirectX, etc., which are basically optional OS components since they are also made by Microsoft. Even those are moving towards "isolated deploy" modes where you ship a .NET runtime as a deployed dependency instead of using a shared prerequisite. This is much more robust, but also wasteful and harder to patch, of course.
> The apps end up around 50 MB or so, not 500 MB to 2 GB.
They try to accomplish much less and of course have fewer enormous apps dragging the average up (try the latest Doom at 50 GB!). Average app sizes should be counted relative to average device storage. A desktop system has 200-2000 GB and a mobile device 16-128 GB, so around one order of magnitude is a reasonable difference.
These are all admirable goals and line up very well with my priorities, but I'm not sure why one wouldn't simply use static linkage for everything. The notion of a single, versioned "runtime" for each app is a good one; perhaps narrowing the scope of dynamic linkage down to that makes the concept useful again.
"Flatpak" is certainly a better name than "xdg-app".
Yes, which is why self-contained statically-linked packages should be placed in their own directory/namespace, away from the system libraries. We could call it "Program Files" or "Program Files (x86_64)".
Or, more simply, we could put each package in its own directory. Such application package directories could perhaps be marked with a distinctive type-extension, for example ".app".
One good effect of dynamic linking is that you don't have to recompile all your binaries when it turns out a widely-used library has a gaping security hole.
Dynamic linking is also more storage efficient. If the contents of my /bin were statically linked, it'd take up 76GB. With dynamic linking, the binaries and their libraries take up 4GB.
This also matters at run-time, if I'm not writing single-binary apps: a Unix pipeline of 8 50MB binaries, say, is taking up 400MB of cache just for the code. On a 1GB server, that's a big deal.
I can understand the security argument if you're running a server, but I'm not, and what I care about is predictability: I want the apps that have been working to keep on working in exactly the same way until such a day as I decide to take the risk of upgrading them.
I don't care how foolproof semantic versioning might be in theory: in practice, sometimes you upgrade one thing and that leaves another thing broken, and this is not something I am willing to accept in a personal workstation - especially not as the result of an automated self-update process, introducing churn without bothering to get my permission!
I don't care about storage efficiency. It's been years since I've had to pay any attention to drive capacity. 4GB is nothing. 76 GB is also nothing, when you have multiple terabytes of storage space sitting completely idle.
As far as memory efficiency goes, I'm not running pipelines, so I don't care. On a server with 1GB RAM, yes, I can see that becoming a big deal; but even my laptops all have 8GB RAM now. On a personal workstation, this doesn't matter, and that's where I'd like to see the "link everything dynamically" meme die. I'm not a system administrator and I don't have system-administrator concerns. I want to get work done and not futz with it. That means I want everything statically linked, independent from each other, churn-free, and predictable.
> This also matters at run-time, if I'm not writing single-binary apps: a Unix pipeline of 8 50MB binaries, say, is taking up 400MB of cache just for the code. On a 1GB server, that's a big deal.
Most likely not, all modern OSes do memory mapping for executable code.
Regarding security, when implementing plugins, dynamic loading opens an application to security exploits in third-party code.
Hence why Apple, Google, and Microsoft went back to the separate-process model as the container model.
Frankly this is not actually about libs, but about package manager dependencies.
DEBs and RPMs are overly rigid by design. Because of how they track packages, you can't have two versions of the same package name installed at the same time. If you want to have two versions of a lib installed, you have to rename one of them to avoid a name collision. Even though the files inside the packages would never overlap.
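The name-keyed rigidity described above can be modeled in a few lines. The classes here are a toy illustration of the two indexing schemes, not real dpkg/rpm internals:

```python
class NameKeyedDb:
    """dpkg/rpm-style: packages keyed by name alone, so a second version
    of the same name is a conflict unless it is renamed."""
    def __init__(self):
        self.installed = {}

    def install(self, name, version):
        if self.installed.get(name, version) != version:
            raise RuntimeError(f"{name} {self.installed[name]} already installed")
        self.installed[name] = version

class VersionKeyedDb:
    """Keyed by (name, version): both versions coexist with no renaming."""
    def __init__(self):
        self.installed = set()

    def install(self, name, version):
        self.installed.add((name, version))
```

With name-keyed tracking, "openssl 1.0" and "openssl 1.1" collide even though their files could live side by side; keying by (name, version) dissolves the conflict entirely, which is essentially what Nix-style stores do.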
Urgh. It's a lecture from a desktop Linux Dev on what I'm supposed to want.
As someone who runs stuff at stupid (not fb or google, but not tiny) scale, this just annoys me.
He's right, in some ways.. But.. How many more layers do we need? Who will sign up for this?
No one.
I'm getting close to the point where I just can't take it anymore.
I keep deleting mesos and kubernetes setups and replacing them with 200 line shell scripts at startups with no ops team who are sold lies... Maybe I'll become a pig farmer or something. Open a pub.
I'd really be less offended by this whole thing if every example wasn't about running gnome.
This is the guy driving the userland for most of us? Doesn't he realise that no one gives a shit about Linux desktop apps?
Sure there are some, but we should be more focused on the needs of people who are deploying 1000s of JVMs or node apps or ruby or whatever though... Shouldn't we?
There are already plenty of people focused on those needs. In fact your previous comment seemed to be complaining about there being too many solutions already. Why are you so offended that some people are trying to solve a real problem they have, instead of piling into the server use case that's already very crowded?
I was with him 100% right until "as part of the systemd project". This isn't even about politics anymore, what the hell does application distribution have to do with init daemons? Is systemd just a dumping ground for whatever idea they currently think is great? One gets the impression that if Poettering wrote Pulseaudio today, he'd roll it into systemd...
There were numerous previous attempts to do something like this, including Klik, its successor Klik2, AutoPackage, 0install, Glick, and its successor Glick2 (the latter two by the lead developer of FlatPak).
However, these all remained pretty obscure. AFAIK all of them only tried to solve the problem of how to install stuff on the end user's system, but ignored the problem of how developers can build binaries that actually run on multiple versions of multiple distributions, which XdgApp/FlatPak solves with its "runtimes" and corresponding SDKs.
I don't think that's the case here. Many of your dynamic libraries will exist in the runtime (which is a mini-OS image).
And, as dependencies shift, snapshots of that runtime will allow packages to keep talking to the dynamic libs they need, while others that may upgrade to require newer deps will get a new OS image. Snapshots allow them to be small (a fedora-24 and a fedora-24.1 image might only need 1.1 times the space of fedora-24 alone, and adding fedora-24.2 might only need another 5% of space).
I think folks are assuming this is like Docker. It isn't. It has surface-level similarities, but it seems to be trying to solve the kinds of problems that Docker does not attempt to solve, and attempts to keep many of the good ideas from past package managers that Docker (and the like) have thrown away.
This is just another case of having a hammer (cgroups, containers) and seeing every problem as nails.
The problem is not at the kernel or OS level; it is that package managers have historically been overly rigid with regards to dependencies.
And it will not surprise me at all if app devs start stuffing their "flatpaks" with all manner of libs because runtimes are not changing fast enough, or to ensure their special snowflakes get just the right environment.
That's only one facet of the problem Flatpak is trying to solve (and it's a feature Flatpak will use).
But, there are other considerations. Docker, despite its several negatives, has become popular because it begins to address the repeatable infrastructure needs of a service-based architecture.
e.g. say I have 10,000 servers. I can't possibly manage them all individually. The performance of using something like Kickstart to deploy them would be problematic, and each of them has a well-defined role, so having a pre-defined image (that can evolve over time) begins to make more sense. One of the things Docker got right was taking the concept of a package up a level of abstraction; so now you don't install Apache and a million other things to get your web server deployment... you install a specific bundle of things in a pre-defined image. Maybe even an immutable image that is identical on every one of your several thousand web servers (or containers or VMs, whatever). Being immutable provides some security and maintenance benefits. Being in a compressed/snapshot-able image provides deployment speed benefits and maintenance benefits. Providing dependency management at both the individual package (like RPM/deb) level and at the higher container level is also a benefit; Docker doesn't do anything along those lines (and can't, given its implementation). This would strap into a configuration management system nicely, as well.
All of it is possible the previous way. I love RPM/yum. I love Kickstart. I love Ansible (or Puppet or Chef or Salt). Those are great tools, and you can build a repeatable infrastructure using them. But, this is another layer of abstraction, and it will enable a lot of cool stuff going forward.
And, soname is still part of the equation. You've got to build these runtimes. You've got to have some way of building and versioning all the software that is in them. And, you're not going to put 30 versions of OpenSSL in a runtime...you just need the latest version with the correct ABI version compatibility for the stuff you're building against it and distributing for it.
Let's assume that a developer wants to distribute a binary build of their application that uses the OpenSSL library that runs on every GNU/Linux system released in the last 5 years. As you might know, OpenSSL releases a new ABI-incompatible version of their library every other year or so, and so there are at least 3 different OpenSSL ABIs / SONAMEs available in the distributions the binary build should run on.
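The SONAME problem can be made concrete with a toy model of the dynamic loader's matching: a binary records the exact SONAMEs it needs, and the loader either finds them or fails. The library names here are illustrative:

```python
def unresolved(needed: set, provided: set) -> set:
    """Toy model of ld.so SONAME matching: return the DT_NEEDED entries
    a given system cannot satisfy."""
    return needed - provided

app = {"libssl.so.1.1", "libc.so.6"}           # built against the 1.1 ABI
old_distro = {"libssl.so.1.0.0", "libc.so.6"}  # ships only the 1.0 ABI
new_distro = {"libssl.so.1.1", "libc.so.6"}
```

Since the SONAMEs are matched exactly, a binary built against one OpenSSL ABI simply fails to load on a distro that ships another, which is why a single build cannot cover 5 years of releases.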
So far the only choices available (edit: for desktop applications) are either static linking OpenSSL, or abandon the idea of distro independent binaries.
With FlatPak, OpenSSL can be bundled in the runtime, so multiple applications can depend on the same runtime and share one dynamic library, and benefit from the runtime getting security updates for the monthly OpenSSL CVEs instead of doing that themselves.
Now you are looking at distro policy, especially the likes of Debian Stable.
If the package manager allows multiple versions of OpenSSL to exist without naming shenanigans, there is nothing stopping a third-party package from going on top of Debian Stable and providing the most recent ABI version.
In essence you are looking for technical solutions to policy/politics...