How long does it take to download Llama2 70B? With the 4x 25 Gbps NICs on AWS p4de instances, it should take ~10s. Yet in production we've observed much longer times, which makes autoscaling less responsive and more expensive. This blog post shows how we reduced download and init time from 4m25s to 20s, using techniques such as streaming S3 -> CPU -> GPU and downloading over multiple TCP streams.
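The multiple-TCP-streams idea can be sketched with ranged GETs: split the object into contiguous byte ranges and fetch them concurrently so a single connection's throughput ceiling no longer bounds the download. This is a minimal stdlib-only illustration, not the post's actual implementation; the helper names (`split_ranges`, `parallel_download`) and the stream count are assumptions.

```python
# Sketch: download one large object over several parallel HTTP range requests.
# Hypothetical helpers -- not the code from the blog post.
import concurrent.futures
import urllib.request


def split_ranges(total_size: int, num_streams: int) -> list[tuple[int, int]]:
    """Split [0, total_size) into contiguous inclusive byte ranges, one per stream."""
    chunk = total_size // num_streams
    ranges, start = [], 0
    for i in range(num_streams):
        end = total_size - 1 if i == num_streams - 1 else start + chunk - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def fetch_range(url: str, start: int, end: int) -> bytes:
    """Fetch one byte range via an HTTP Range header (S3 supports ranged GETs)."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def parallel_download(url: str, total_size: int, num_streams: int = 8) -> bytes:
    """Fetch all ranges concurrently, one TCP stream per worker, then reassemble."""
    parts = split_ranges(total_size, num_streams)
    with concurrent.futures.ThreadPoolExecutor(max_workers=num_streams) as pool:
        chunks = pool.map(lambda r: fetch_range(url, *r), parts)
    return b"".join(chunks)
```

In a real pipeline the reassembly step would instead stream each chunk into pinned CPU memory and copy it to the GPU as it arrives, rather than buffering the whole object first.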
"We believe that Ray will continue to play an increasingly important role in bringing much needed common infrastructure and standardization to the production machine learning ecosystem, both within Uber and the industry at large."
Poll: if you are writing an ML library, would you try Ray as the distributed runtime? If not, what would you use instead?