yakorevivan's comments

This is very well built. And beautiful to look at too. Congrats..


Thank you so much for the kind words!


Hey, can you share the inference code please? Thanks..



Cannot compile it locally on Fedora 40:

  nunchaku/third_party/spdlog/include/spdlog/common.h(144): error: namespace "std" has no member "function"
  using err_handler = std::function<void(const std::string &err_msg)>;
                                   ^


Yeah, it's a pain. I'm trying to make an API endpoint for a website I own, and I'm working on a Docker image. This is what I have for now that "just" works:

The conda "always yes" setting makes sure that you can just paste the script and it all works, instead of having to press "y" for each install. Also, if you don't feel like installing a wheel from a random person on the internet, replace that step with "pip install -e ." as the repo suggests. I compiled that wheel with CUDA 12.4 because that's the part that takes the most time and most often seems to break.

Also, I'm not sure if this will work on Fedora. I tried this on a RunPod machine with a 4090 (apparently it only works on a few cards: 3090, 4090, A100, etc.), with CUDA 12.4 on the host machine and "runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04" as the base image.

EDIT: using pastebin instead, as HN doesn't seem to jibe with code blocks: https://pastebin.com/zK1z0UdM


Almost working:

  [2024-11-09 19:33:55.214] [info] Initializing QuantizedFluxModel
  [2024-11-09 19:33:55.359] [info] Loading weights from ~/.cache/huggingface/hub/models--mit-han-lab--svdquant-models/snapshots/d2a46e82a378ec70e3329a2219ac4331a444a999/svdq-int4-flux.1-schnell.safetensors
  [2024-11-09 19:34:01.432] [warning] Unable to pin memory: invalid argument
  [2024-11-09 19:34:02.143] [info] Done.
  terminate called after throwing an instance of 'CUDAError'
    what():  CUDA error: pointer does not correspond to a registered memory region (at /nunchaku/src/Serialization.cpp:32)


Probably make sure your host machine's CUDA is also 12.4, and if not, update the CUDA versions I have in the pastebin to the one you have. I don't think it works with CUDA 11.8 though; I remember trying it once.

But yeah, I can't help you outside of RunPod; I haven't even tried this on my home PCs yet. For my use case of a serverless API, it seems to work.
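A rough sketch of that version check, in case it helps. The helper names are made up for illustration, and treating "also 12.4" as "same major.minor" is my assumption, not something the repo documents:

```python
import re


def major_minor(version: str) -> tuple[int, int]:
    """Pull (major, minor) out of a string like '12.4', '12.4.1' or 'CUDA Version: 12.4'."""
    match = re.search(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"no version number found in {version!r}")
    return int(match.group(1)), int(match.group(2))


def cuda_versions_match(host_version: str, build_version: str) -> bool:
    """Assumption: host and build toolkits count as matching when major.minor agree."""
    return major_minor(host_version) == major_minor(build_version)


# Hypothetical inputs: the host version as reported by the driver tools,
# and the toolkit version the wheel was compiled against.
print(cuda_versions_match("CUDA Version: 12.4", "12.4.1"))
print(cuda_versions_match("11.8", "12.4"))
```

If PyTorch is installed, `torch.version.cuda` gives the toolkit version PyTorch itself was built with, which is one possible source for the second argument.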


Why aren't more people talking about Cody from Sourcegraph? For just $10/month it offers unlimited completions using top models like Sonnet 3.5 and GPT-4o. Not to mention, the plugins for VS Code and IntelliJ products work perfectly well.


It has way less clout because, afaik, Sourcegraph has mostly been doing business with companies. So for the most part it seems like they are just breaking into the individual-developer market. It might also be related to this:

https://news.ycombinator.com/item?id=41296481


Cody is open source: https://github.com/sourcegraph/cody. And for the reasons explained there, it makes more sense for it to be open source.


I have been using ChatGPT/Sonnet 3.5 myself to clean data, extract data, heck, even generate sample SQL insert statements for a given table schema... small things, but done repeatedly they save sooo much time and frustration. I have been using these tools to generate sooo many small scripts to automate things that I otherwise either wouldn't have done, or that would have taken a significant amount of time. These tools are too good now not to use.

Also, as someone else has already pointed out, these things work correctly 99.99% of the time. But that remaining 0.01%... that's what becomes a major issue, since the error is so small that, unless you verify, you'll end up missing it.

When using LLMs... "Trust, but verify".
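For the generated SQL inserts, one cheap way to "verify" is to run them against a throwaway in-memory SQLite database before trusting them. This is only a minimal sketch: the `users` table, the sample statements, and the `verify_inserts` helper are all made up for illustration, and it only catches statements the schema rejects, not rows that are valid but wrong.

```python
import sqlite3


def verify_inserts(schema_sql: str, insert_statements: list[str]) -> list[tuple[str, str]]:
    """Run each statement against a scratch in-memory DB; return (statement, error) pairs."""
    failures = []
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)  # create the tables the inserts target
        for stmt in insert_statements:
            try:
                conn.execute(stmt)
            except sqlite3.Error as exc:  # constraint violations, bad column names, syntax errors
                failures.append((stmt, str(exc)))
    finally:
        conn.close()
    return failures


# Hypothetical schema and LLM-generated statements.
schema = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER);"
generated = [
    "INSERT INTO users (id, name, age) VALUES (1, 'Ada', 36);",
    "INSERT INTO users (id, name, age) VALUES (2, NULL, 41);",   # violates NOT NULL
    "INSERT INTO users (id, nmae, age) VALUES (3, 'Lin', 29);",  # misspelled column
]

for stmt, error in verify_inserts(schema, generated):
    print(f"REJECTED: {stmt}\n  reason: {error}")
```

The real target database may enforce rules SQLite does not (or the other way round), so this is a first filter rather than a guarantee.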


Instead of trying to show them in a negative light by calling them "doomsayers", let's call them one of the following instead, which is much better:

"Realistic", "Proactive", "Forward-thinking", "Prepared", "Cautious", "Thoughtful", "Analytical", "Mindful", "Insightful", "Security-conscious".


This is sooo good... how are you guys making money since you are offering it for free?


Yes! Same question. Also, how are you different from Phind and Blackbox? I’m a newbie dev and have been using mostly Phind for dev search.


They have released the finetuning code too. You can finetune it to remove the alignment finetuning. I believe it would take just a few hours at most and a couple of dollars.


Genes have been doing that for more than 4 billion years now... we just need to figure out a way of using genes to grow semiconductors.


China is now doing what the US used to do before turning utterly greedy.


Seriously man... I wonder the same... How can one person create so much brilliant software?


Being one person makes it easier, not harder. The meetings are quicker, for one thing.

