Hacker News

As someone with admittedly no formal CS education, I've been using conda for all of my grad school and never managed to break it.

I create a virtual environment for every project. I install almost all packages with pip, except for any binaries or CUDA related things from conda. I always exported the conda yaml file and managed to reproduce the code/environment including the Python version. I've seen a lot of posts over time praising poetry and other tools and complaining about conda but I could never relate to any of them.

Am I doing something wrong? Or something right?



My experience with conda is that it's fine if you're the original author of whatever you're using it for and never share it with anyone else. But as a professional I usually have to pull in someone else's work and make it function on a completely different machine/environment. I've only had negative experiences with conda for that reason. IME the hard job of package management is not getting software to work in one location, but allowing that software to be moved somewhere else and used in the same way. Poetry solves that problem, conda doesn't.

Poetry isn't perfect, but it's working in an imperfect universe and at least gets the basics (lockfiles) correct to where packages can be semi-reproducible.

There's another rant to be had at the very existence of venvs as part of the solution, but that's neither poetry's nor anaconda's fault.


Poetry is pretty slow. I think `uv` will ultimately displace it on that basis alone.


For what it’s worth – A small technical fact:

It is entirely possible to use poetry to determine the precise set of packages to install and write a requirements.txt, and then shotgun-install those packages in parallel. I used a stupidly simple fish-shell for loop that ran every requirements line as a pip install with an “&” to background the job and a “wait” after the loop (IIRC). You could use xargs or parallel too.

This is possible at least. Maybe it breaks in some circumstances but I haven’t hit it.
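The same shotgun pattern can be sketched in Python. This is a hypothetical helper, not a real tool, and the demo below swaps in a stub runner so nothing actually hits the network:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def parallel_install(lines, max_workers=8, run=subprocess.run):
    """Run `pip install <line>` concurrently for each requirements line."""
    reqs = [ln.strip() for ln in lines]
    reqs = [r for r in reqs if r and not r.startswith("#")]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # One pip process per requirement, up to max_workers at a time.
        list(pool.map(lambda r: run(["pip", "install", r]), reqs))

# Demo with a recording stub instead of real pip:
calls = []
parallel_install(["requests==2.31.0", "# a comment", "", "numpy==1.26.4"],
                 run=lambda cmd: calls.append(cmd))
```

With a fully pinned export you'd probably also want `--no-deps` on each install, so the parallel pip processes don't race to install each other's transitive dependencies.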


That poor package server getting 39 simultaneous pulls from one user.


This is indeed something to consider!

Not as an excuse for bad behavior but rather to consider infrastructure and expectations:

The packages might be cached locally.

There might be many servers – a CDN and/or mirrors.

Each server might have connection limits.

(The machine downloading the packages miiiiiight be able to serve as a mirror for others.)

If these are true, then it's in everyone's interest, both altruistic and selfish, for the downloader to grab all the packages as quickly as possible and get stuff done.

I don’t know if they are true. I’d hope that local caching, CDNs and mirrors as well as reasonable connection limits were a self-evident and obviously minimal requirement for package distribution in something as arguably nation-sized as Python.

And… just… everywhere, really.


I actually can't believe how fast `uv` is.


Ditto. It’s wild.


Poetry is a pain. uv is much better IME/IMO.


Can you recommend any good article / comparison of uv vs poetry vs conda?

We've used different combinations of pipx + lockfiles or poetry, which has so far been OK-ish. But we recently discovered uv and are wondering about existing experience across the industry.


From my experience, uv is way better, and it's also PEP-compliant in terms of pyproject.toml. Which means in case uv isn't a big player in the future, migrating away isn't too difficult.

At the same time, poetry still uses a custom format and is pretty slow.


I wrote an overview, but didn't post benchmarks https://dublog.net/blog/so-many-python-package-managers/


How is uv so much faster? My understanding is Poetry is sometimes slow because PyPI doesn't have all the metadata required to solve things, so it needs to download packages and then figure it out.


If I recall correctly, uv is doing some ninja stuff like guessing the part of the relevant file that is likely to contain the metadata it needs and then doing a range request to avoid downloading the whole file.
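That lines up with how wheels are built: a wheel is just a zip archive, and zip keeps its table of contents (the central directory) at the end of the file. A toy illustration of why a small tail range request is enough to locate METADATA (an in-memory fake wheel, not uv's actual code):

```python
import io
import zipfile

# Build a toy "wheel" (a wheel is just a zip archive) in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("pkg-1.0.dist-info/METADATA", "Name: pkg\nVersion: 1.0\n")
    zf.writestr("pkg/__init__.py", "x" * 100_000)  # bulk the resolver never needs
data = buf.getvalue()

# The central directory sits at the END of the archive, so a client can ask
# the server for just the tail (an HTTP "Range: bytes=-2048" request), parse
# the directory there, learn the exact offset of the METADATA member, and
# then issue a second small range request for that slice alone.
tail = data[-2048:]
has_eocd = b"PK\x05\x06" in tail                        # end-of-central-directory marker
lists_metadata = b"pkg-1.0.dist-info/METADATA" in tail  # filename listed in the directory
```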


Thanks, that makes sense. I guess Poetry could add that if they liked.


+1. On top of that, even with the new resolver it still takes ages to resolve a dependency for me, so sometimes I end up just using pip directly. Not sure if I'm doing something wrong (maybe you have to manually tweak something in the configs?) but it's pretty common for me to experience this.


Like sibling comments, after using poetry for years (and pipx for tools), I tried uv a few months ago.

I was so amazed by the speed that I moved all my projects to uv and have not looked back.

uv replaces all of pip, pipx and poetry for me. It does not do more than these tools, but it does it right and fast.

If you're at liberty to try uv, you should try it someday, you might like it. (nothing wrong with staying with poetry or pyenv though, they get the job done)


I believe the problem is the lack of proper dependency indexing at PyPI. The SAT solvers used by poetry or pdm or uv often have to download multiple versions of the same dependencies to find a solution.
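A toy model of why that forces repeated downloads. This is a naive backtracking resolver over an invented index, not what pubgrub or a real SAT solver does, but the metadata problem is the same: you only learn a version's dependencies after fetching that exact version.

```python
# Invented packages and versions; candidate lists are ordered newest-first.
FAKE_INDEX = {
    ("app", "1.0"):  {"lib": ["2.0", "1.0"], "tool": ["1.0"]},
    ("lib", "2.0"):  {"tool": ["2.0"]},            # conflicts with app's tool pin
    ("lib", "1.0"):  {"tool": ["1.0", "2.0"]},
    ("tool", "2.0"): {},
    ("tool", "1.0"): {},
}
downloads = []

def fetch_deps(name, version):
    """Pretend to download a package just to read its dependency metadata."""
    downloads.append((name, version))
    return FAKE_INDEX[(name, version)]

def resolve(pinned, remaining):
    """Depth-first search with backtracking over candidate versions."""
    if not remaining:
        return pinned
    name, allowed = next(iter(remaining.items()))
    for version in allowed:
        deps = fetch_deps(name, version)           # the costly "download"
        rest = {k: v for k, v in remaining.items() if k != name}
        ok = True
        for dep, versions in deps.items():
            if dep in pinned:
                if pinned[dep] not in versions:
                    ok = False                     # conflict: try the next version
                    break
            else:
                merged = [v for v in rest.get(dep, versions) if v in versions]
                if not merged:
                    ok = False
                    break
                rest[dep] = merged
        if ok:
            solution = resolve({**pinned, name: version}, rest)
            if solution:
                return solution
    return None

solution = resolve({}, {"app": ["1.0"]})
# lib gets downloaded twice: 2.0 first, discarded on conflict, then 1.0.
```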


imagine being a beginner to programming and being told "use venvs"

or worse, imagine being a longtime user of shells but not python and then being presented a venv as a solution to the problem that for some reason python doesn't stash deps in a subdirectory of your project


You don't need to stash deps in a subdirectory; IMHO that's a node.js design flaw that leads to tons of duplication. I don't think there's any other package manager for a popular language that works like this by default (Bundler does allow you to vendor dependencies, which can be useful for deployment, but you still only ever get one version of any dependency, unlike node).

You just need to have some sort of wrapper/program that knows how to figure out which dependencies to use for a project. With bundler, you just wrap everything in "bundle exec" (or use binstubs).


What was unique to node.js was the decision to not only store the dependencies in a sub-folder, but also to apply that rule, recursively, for every one of the projects you add as a dependency.

There are many dependency managers that use a project-local flat storage, and a global storage was really frowned upon until immutable versions and reliable identifiers became popular some 10 years ago.


Wasn't node the only programming language that used a subdirectory for deps by default?

Ruby and Perl certainly didn't have it - although Ruby did subsequently add Bundler to gems and gems supported multiversioning.


It’s fairly common for Perl apps to use Carton (more or less a Perl clone of Bundler) to install vendored dependencies.


Oh that's nice. When I last looked (quite a long time ago), local::lib seemed to be the recommended way, and that seemed a bit more fiddly than python's virtualenv.


Carton uses local::lib under the covers. I found local::lib far less fiddly than virtualenv myself, but it just doesn't try to do as much as virtualenv. These days I do PHP for a living, and for all the awfulness in php, they did nail it with composer.


Rust, julia, elixir


Julia just stores the analogue of a requirements.txt (Project.toml) and the lock file (Manifest.toml). And it has its own package issues, including packages regularly breaking for every minor release (although I enjoy the language and will keep using it).


Yep, I was wrong about Julia.


All those came after Python/C/C++ etc., which date from the wild-west "what is package management?" dark ages. The designers of those newer languages almost certainly asked themselves exactly how they could do package management better than existing technology like pip.


Rust doesn't store dependencies under your project dir, but it does build them under your target.


I have imagined this, because I've worked on products where our first time user had never used a CLI tool or REPL before. It's a nightmare. That said, it's no less a nightmare than every other CLI tool, because even our most basic conventions are tribal knowledge that are not taught outside of our communities and it's always an uphill battle teaching ones that may be unfamiliar to someone from a different tribe.


It is true that every field (honestly every corner of most fields) has certain specific knowledge that is both incredibly necessary to get anything done, and completely arbitrary. These are usually historical reactions to problems solved in a very particular way, without a lot of thought, simply because it was an expedient option at the time.

I feel like venv is one such solution. A workaround that doesn’t solve the problem at its root, so much as make the symptoms manageable. But there is (at least for me) a big difference between things like that and the cool ideas that underlie shell tooling like Unix pipes. Things like jq or fzf are awesome examples of new tooling that fit beautifully in the existing paradigm but make it more powerful and useful.


Beginners in Python typically don't need venvs. They can just install a few libraries (or no libraries even) to get started. If you truly need venvs then you're either past the initial learning phase or you're learning how to run Python apps instead of learning Python itself.

For some libraries, it is not practical to stash the dependencies for every single toy app you use. I don't know how much space TensorFlow or PyQt take, but I'm guessing most people don't want to install those in many venvs.


Intelligent systems simply cache and re-use versions, and do stash deps for every toy project without consuming extra space.

Also, installing everything with pip is a great way to enjoy unexplainable breakage when A doesn't work with v1 and B doesn't work with v2.

It also leads to broken Linux systems, where a large part of the system is Python code. Especially when the user upgrades the system Python for no reason.


If you install a package in a fresh environment then it does actually get installed. It can be inherited from the global environment, but I don't think disparate venvs that separately install a package actually share the package files. If they did, then a command executed in one tree could destroy the files in another tree. I have not investigated this today, but I think I'm right about this.


In better-designed systems than Python, they do. To share them with Python you need a filesystem with dedup, e.g. Btrfs or ZFS.


Python's venv design is not obviously unintelligent. It must work on all sorts of filesystems, which limits how many copies can be stored and how they can be associated. More advanced filesystems can support saving space explicitly for software that exploits them, and implicitly for everyone, but there is a cost to everything.
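For what it's worth, some newer tools get most of that benefit without a filesystem that dedups: keep one global cache and hard-link files into each environment (I believe uv does this where the filesystem allows it). A toy sketch of the idea, with invented paths:

```python
import os
import pathlib
import tempfile

# One cached copy of a file, hard-linked into two fake "venvs"
# (all paths here are made up for the demo).
tmp = pathlib.Path(tempfile.mkdtemp())
cache = tmp / "cache" / "some_pkg" / "module.py"
cache.parent.mkdir(parents=True)
cache.write_text("VALUE = 42\n")

venv_a = tmp / "venv-a" / "site-packages" / "module.py"
venv_b = tmp / "venv-b" / "site-packages" / "module.py"
for dest in (venv_a, venv_b):
    dest.parent.mkdir(parents=True)
    os.link(cache, dest)  # hard link: a new name, not a new copy

# All three names share one inode, so the bytes exist on disk exactly once.
same = os.stat(venv_a).st_ino == os.stat(venv_b).st_ino == os.stat(cache).st_ino
```

The parent comment's worry still applies, though: an in-place write through one link would mutate every environment, so this only works if installed files are treated as immutable.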


I remember reading somewhere (on Twitter, IIRC) about an amateur sex-survey statistician who decided she needed to use Python to analyze her dataset, was guided toward setting up venvs pretty early on by her programmer friends, and got extremely frustrated.


Was it Aella? I don't know of any other sex-survey statisticians, so I'm assuming you mean Aella. She has a pretty funny thread here, but no mention of venvs: (non-musk link: https://xcancel.com/Aella_Girl/status/1522633160483385345)

  Every google for help I do is useless. Each page is full of terms I don't understand at *all*. They're like "Oh solving that error is simple, just take the library and shove it into the jenga package loader so you can execute the lab function with a pasta variation".
She probably would have been better off being pointed towards Jupyter, but that's neither here nor there.


Good grief there seems to be no getting away from that woman. One of my ex girlfriends was fascinated by her but to me she is quite boring. If she wasn't fairly attractive, nobody would care about her banal ramblings.


You are doing something right; the post's author does some pretty unusual things:

- Set up custom kernels in Jupyter Notebook

- Hardlink the environments, then install the same packages via pip in one and conda in the others

- Install conda inside conda (!!!) and enter the nested environment

- Use tox within conda

I believe as long as you treat the environments as "cattle" (if one goes bad, remove it and re-create it from the yaml file), you should not have any problems. That's clearly not the case for the post's author, though.


Yep, nuke the bad env and start over. Conda is great; the only problems are when a package is not available on conda-forge or you have to compile and install with setup.py. But then you can blow the env away and start over.


As someone with a formal computer science education: half of my friends who work in other sciences have asked me to help them fix their broken conda environments.


This is exactly the kind of thing that causes Python package nightmares. Pip is barely aware of packages it's installed itself, let alone packages from other package managers and especially other package repositories. Mixing conda and pip is 100% doing it wrong (not that there's an easy way to do it right, but stick to one or the other; I would generally recommend just using pip, since the reasons for conda's existence are mostly irrelevant now).


I still run into cases where a pip install that fails due to some compile issue works fine via conda. It's still very relevant. It's pip that should be switched out for something like poetry.


poetry vs pip does very little for compilation-related install failures. Most likely the difference is whether you are getting a binary package or not, and conda's repository may have a binary package that pypi does not (but also vice-versa: nowadays pypi has decent binary packages, previously conda gained a lot of popularity because it had them while pypi generally did not, especially on windows). But the main badness comes from mixing them in the same environment (or mixing pypi packages with linux distribution packages, i.e. pip installing outside of a virtual environment).

(I do agree pip is still pretty lackluster, but the proposed replacements don't really get to the heart of the problem and seem to lack staying power. I'm in 'wait and see' mode on most of them)


Oh, I meant that poetry could be a general replacement for pip (it actually uses pip in its backend), because it does a great job managing dependencies and projects in general.


It works absolutely fine to use conda to manage the environments and Python versions, and pip to install the packages.


I had the same experience. But you should try pixi, which is to conda what uv is to pip.


Isn't uv to conda what uv is to pip?


`uv` is not a drop-in replacement for `conda` in the sense that `conda` also handles non-python dependencies, has its own distinct api server for packages, and has its own packaging yaml standard.

`pixi` basically covers `conda` while using the same solver as `uv` and is written in Rust like `uv`.

Now, is it a good idea to have Python's package-management tool handle non-Python packages? I think that's debatable. I personally am in favor of a world where `uv` is simply the final Python package-management solution.

Wrote an article on it here: https://dublog.net/blog/so-many-python-package-managers/


I am not sure pixi uses the same solver as uv, at least in general. pixi uses resolvo (https://github.com/mamba-org/resolvo) for conda packages, while uv uses pubgrub (https://github.com/pubgrub-rs/pubgrub) for pip packages.


Pixi uses uv for resolving PyPI deps: https://prefix.dev/blog/uv_in_pixi If you look closely, pixi used `resolvo` to power `rip`, then switched from the `rip` solver to the `uv` solver.


Hmmm I'll update that point


I have been using pixi for half a year and it has been fantastic.

It’s fast, takes yml files as input (which is super convenient), and is super intuitive.

Quite surprised it isn’t more popular


Bookmarking. Thanks for sharing the link, looks like a great overview of that particular tragic landscape. :)

Also crossing fingers that uv ends up being the last one standing when the comprehensive amounts of dust here settle. But until then, I'll look into pixi, on the off chance it minimizes some of my workplace sorrows.


God forbid you should require conda-forge and more than three packages lest the dependency resolver take longer than the heat death of the planet to complete.


Install mamba first?


Mamba is indeed a lot better. I personally just don't bother with conda and stick to pip + venv.


Same, but I try to use conda to install everything first, and only use pip as a last resort. If pip only installs the package and no dependencies, it's fine.


I think you got lucky and fell into best practices on your first go.

> except for any binaries or CUDA related things from conda

doing the default thing with cuda related python packages used to often result in "fuck it, reinstall linux". Admittedly, I don't know how it is now. I have one machine that runs Python with a GPU, and it runs only one Python program.


> doing the default thing with cuda related python packages used to often result in "fuck it, reinstall linux"

From about 2014-17 you were correct, but it appears (on Ubuntu at least) that it mostly works now. Maybe I've just gotten better at dealing with the pain, though...


Two differences from the other tools you mentioned:

1. You need to run the export manually, while the other tools create the lock file automatically.

2. They distinguish between direct dependencies (packages you added yourself) and indirect dependencies (the packages of those packages).



