Hacker News
EGreg 11 months ago | on: Claude's system prompt is over 24k tokens with too...
Can someone explain how to use Prompt Caching with LLAMA 4?
concats 11 months ago
It depends on what front end you use. In text-generation-webui, for example, prompt caching is simply a checkbox under the Model tab that you can select before you click "load model".
EGreg 11 months ago
I basically want to interface with llama.cpp via an API from Node.js.
What are some of the best coding models that run locally today? Do they have prompt caching support?
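One way to do this, as a minimal sketch: llama.cpp ships an HTTP server (`llama-server`), and its `/completion` endpoint accepts a `cache_prompt` flag that reuses the KV cache for a shared prompt prefix across requests, so a long system prompt isn't re-evaluated every call. The port, system-prompt text, and helper names below are illustrative assumptions, not part of any official client.

```javascript
// Hedged sketch: calling llama.cpp's built-in HTTP server from Node.js (18+,
// which has global fetch). Assumes a server is already running locally, e.g.:
//   llama-server -m model.gguf --port 8080
// The long shared prefix (here a placeholder system prompt) is what
// cache_prompt lets the server keep in its KV cache between requests.
const SYSTEM_PROMPT = "You are a helpful coding assistant.\n";

// Build the JSON body for the /completion endpoint.
function buildCompletionRequest(userText) {
  return {
    prompt: SYSTEM_PROMPT + userText,
    n_predict: 128,        // max tokens to generate
    cache_prompt: true,    // reuse the KV cache for the shared prefix
  };
}

// POST the request and return the generated text.
async function complete(userText, baseUrl = "http://127.0.0.1:8080") {
  const res = await fetch(`${baseUrl}/completion`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildCompletionRequest(userText)),
  });
  const data = await res.json();
  return data.content;
}
```

Because the cached prefix is matched token-by-token, the system prompt must be byte-identical across requests for the cache to be hit.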