
Why does the model data need to be stored in the image? Download the model data on container startup using whatever method works best.
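One way to do that is a small startup hook that fetches the weights only if they aren't already on disk (e.g. on a mounted volume). A minimal sketch, assuming plain HTTP download; the path, URL, and function name are placeholders, not anyone's actual setup:

```python
import urllib.request
from pathlib import Path

def ensure_weights(path, url):
    """Fetch model weights at container startup unless already present.

    Skips the download when the file exists (e.g. persisted on a volume),
    and downloads to a temp name first so a crash never leaves a
    half-written file at the final path.
    """
    p = Path(path)
    if p.exists():
        return p                              # already on disk, skip download
    p.parent.mkdir(parents=True, exist_ok=True)
    tmp = p.with_suffix(".part")              # partial-download marker
    urllib.request.urlretrieve(url, tmp)      # fetch from object store / CDN
    tmp.rename(p)                             # publish only when complete
    return p
```

Call it once from the entrypoint before loading the model; repeated container restarts on the same volume then pay the download cost only once.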



You are correct! From our tests, storing model weights in the image isn't the preferred approach once the weights exceed ~1 GB. We run a distributed, multi-layer cache to work around this, and we can load roughly 6-7 GB of files with a p99 under 2.5s.
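The multi-layer idea is just a read-through cache: check memory, then local disk, and only fall back to the origin (e.g. blob storage) on a full miss. A toy sketch of that pattern, not their actual system; class and parameter names are made up:

```python
from collections import OrderedDict
from pathlib import Path

class TwoTierCache:
    """Read-through cache: memory (tier 1) over local disk (tier 2).

    Full misses fall back to `fetch`, which stands in for a download
    from the origin (e.g. object storage).
    """
    def __init__(self, disk_dir, fetch, mem_limit=4):
        self.mem = OrderedDict()              # key -> bytes, kept in LRU order
        self.disk = Path(disk_dir)
        self.disk.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch
        self.mem_limit = mem_limit

    def get(self, key):
        if key in self.mem:                   # tier 1 hit
            self.mem.move_to_end(key)
            return self.mem[key]
        path = self.disk / key
        if path.exists():                     # tier 2 hit: promote to memory
            data = path.read_bytes()
        else:                                 # miss: fetch from origin
            data = self.fetch(key)
            path.write_bytes(data)            # warm the disk tier
        self.mem[key] = data
        if len(self.mem) > self.mem_limit:
            self.mem.popitem(last=False)      # evict least-recently-used
        return data
```

Each tier is slower but bigger than the one above it, so repeated reads of the same weights get cheaper after the first container on a host pulls them.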

hey cosmotic, we're not really advocating for storing model weights in the container image.

even the smaller NVIDIA images (like nvidia/cuda:13.1.1-cudnn-runtime-ubuntu24.04) are about 2 GB before adding any Python deps, and that is a problem.

if you split the image into chunks and pull on-demand, your container will start much faster.
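The mechanics are simple: store the image (or weights) as fixed-size chunks, and when the container reads a byte range, fetch only the chunks that range overlaps instead of the whole blob. A minimal sketch, with an in-memory dict standing in for the remote chunk store:

```python
def read_range(remote_chunks, chunk_size, offset, length):
    """Serve a byte range by pulling only the chunks it overlaps.

    `remote_chunks` (chunk index -> bytes) stands in for a remote
    chunk store; a real implementation would do ranged HTTP GETs.
    """
    first = offset // chunk_size                     # first overlapping chunk
    last = (offset + length - 1) // chunk_size       # last overlapping chunk
    fetched = b"".join(remote_chunks[i] for i in range(first, last + 1))
    start = offset - first * chunk_size              # trim to requested range
    return fetched[start:start + length]
```

Since a cold start usually touches only a fraction of the image, the container can begin running after pulling a few chunks and stream the rest on demand.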


Just pre-install the NVIDIA layer on the filesystem instead of docker-pulling it for every single machine.


