Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Author here. Thanks for sharing—always great to see different approaches in the space. A quick note on QPS: throughput numbers alone can be misleading without context on recall, dataset size, distribution, hardware, distance metric, and other relevant factors. For example, if we relaxed recall constraints, our QPS would also jump significantly. In the VectorDBBench results we shared, we made sure to maintain (or exceed) the recall of the previous leader while running on comparable hardware—which is why doubling their throughput at 8K QPS is meaningful in that specific setting.

You're absolutely right that a basic HNSW implementation is relatively straightforward. But achieving this level of performance required going beyond the usual techniques.

 help



Yep you are right, also: quantization is a big issue here. For instance int8 quantization has minimal effects on recall, but makes dot-product much faster among vectors, and speedups things a lot. Also the number of components in the vectors make a huge difference. Another thing I didn't mention is that for instance Redis implementation (vector sets) is threaded, so the numbers I reported is not about a single core. Btw I agree with your comment, thank you. What I wanted to say is simply that the results you get, and the results I get, are not "out of this world", and are very credible. Have a nice day :)



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: