Hacker News
kristopolous 10 months ago | on: The Llama 4 herd
I'm certainly not the brightest person in this thread, but has there been any effort to bucket the computational cost of the model, so that the more expensive parts run on the GPU and the less expensive parts run on the CPU?
phonon 10 months ago
Take a look at https://github.com/kvcache-ai/ktransformers/blob/main/doc/en...
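
For readers who want a concrete picture of the bucketing idea in the question, here is a minimal sketch. It is not taken from the linked project; the `Part` type, the `place_parts` helper, and every number are hypothetical and only illustrate one greedy way to split a model between GPU and CPU under a GPU memory budget.

```python
from dataclasses import dataclass


@dataclass
class Part:
    name: str
    cost_per_token: float  # hypothetical relative compute cost
    size_gb: float         # hypothetical memory footprint


def place_parts(parts, gpu_budget_gb):
    """Greedily assign the parts with the most compute per GB to the GPU.

    Anything that no longer fits in the remaining GPU budget goes to the CPU.
    """
    placement = {}
    remaining = gpu_budget_gb
    ranked = sorted(parts, key=lambda p: p.cost_per_token / p.size_gb, reverse=True)
    for part in ranked:
        if part.size_gb <= remaining:
            placement[part.name] = "gpu"
            remaining -= part.size_gb
        else:
            placement[part.name] = "cpu"
    return placement


# Made-up numbers, chosen only to show the split.
parts = [
    Part("attention", cost_per_token=8.0, size_gb=4.0),
    Part("shared_expert", cost_per_token=6.0, size_gb=6.0),
    Part("routed_experts", cost_per_token=4.0, size_gb=40.0),
]
placement = place_parts(parts, gpu_budget_gb=12.0)
print(placement)
```

With these made-up numbers the small, compute-dense parts land on the GPU and the large, sparsely used part stays on the CPU. A real system would need measured costs and would also have to account for transfer overhead between devices.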