Hacker News | oldcap's comments



How long does it take to download Llama 2 70B? On the 4x 25 Gbps NICs that AWS p4de instances have, it should take ~10s. Yet in production we've observed much higher times, which makes autoscaling less responsive + more expensive. This blog post shows how we've reduced download & init time from 4m25s to 20s, using techniques such as streaming S3->CPU->GPU and multiple TCP streams.
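For anyone who wants to check the ~10s figure, here is a rough back-of-envelope sketch. It assumes 2 bytes per parameter (fp16/bf16 weights, which the comment does not state) and perfect use of all four NICs. The second helper shows one way to split an object into byte ranges so that several TCP streams can each fetch a slice. Both function names are made up for illustration, and this is not the blog post's actual implementation.

```python
def ideal_download_seconds(num_params, bytes_per_param, nic_count, gbps_per_nic):
    """Lower bound on download time, assuming every NIC is fully saturated."""
    total_bytes = num_params * bytes_per_param
    # Gbps -> bytes per second: 1 Gbps = 1e9 bits/s, 8 bits per byte.
    bytes_per_second = nic_count * gbps_per_nic * 1e9 / 8
    return total_bytes / bytes_per_second


def split_byte_ranges(total_bytes, num_streams):
    """Split [0, total_bytes) into inclusive (start, end) ranges, one per stream.

    Inclusive ends match the convention of the HTTP Range header
    ("bytes=start-end"), which ranged object downloads rely on.
    """
    base, extra = divmod(total_bytes, num_streams)
    ranges = []
    start = 0
    for i in range(num_streams):
        # Spread the remainder over the first `extra` streams.
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # more streams than bytes: skip empty slices
        ranges.append((start, start + length - 1))
        start += length
    return ranges


if __name__ == "__main__":
    # Assumption: 70e9 parameters at 2 bytes each, i.e. about 140 GB of weights.
    seconds = ideal_download_seconds(70_000_000_000, 2, 4, 25)
    print(f"ideal download time: {seconds:.1f}s")
    print(split_byte_ranges(10, 3))
```

Under those assumptions the ideal time comes out a little over 11 seconds, in line with the ~10s estimate. Anything well above that points at per-stream throughput limits or serialized download-then-load steps rather than raw NIC capacity.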


AWS actually has a _higher_ unit cost than Alibaba Cloud


Couldn't the new record holder just use Alibaba Cloud to break the record further, or was S3 the reason?


Alibaba Cloud isn't as cost-effective as it was in 2016. Also not sure how fast Alibaba's equivalent of S3 is.


Thanks! Just curious!


"We believe that Ray will continue to play an increasingly important role in bringing much needed common infrastructure and standardization to the production machine learning ecosystem, both within Uber and the industry at large."

Poll: if you are writing an ML library, would you want to try Ray as the distributed runtime? If not, what else?

