
https://sampler.meiji.industries/

I built a TUI sampler which cherry-picks my favourite features from modern & vintage hardware samplers, DAWs, plugins, outboard FX gear, and DJ equipment.

If you know what an AKAI MPC Live, MPC 3000, SP404, SP1200, BOSS RC-202, Alesis 3630, Serato Sample, S950 filters, and stem separation do, then you'll love seeing these "greatest hits" up in a terminal interface.

Last year while on vacation in Costa Rica, I started scratching my own itch for locating and organizing samples, which quickly evolved into adding more and more features while keeping it tactile and immediate. It was too fun to stop so I kept going. After a few days I was happily making beats in it, and since then it's only gotten better.

It's live and totally free to use, and works on macOS & Linux (Windows soon). I'm about to launch v1.0 now, just working with folks in the community to round out the Factory Kits a little more for users new to beatmaking.

Turns out, making beats with no mouse and a terminal interface strikes the perfect balance of hardware feel and software power, and I'm loving the result. Been sharing it with folks in my beatmaking sphere and have plans to continue expanding its reach through more collaborations, contests, and in-person events.

Hope it brings you as much joy as it brings me :)


we've been tracking the deepseek threads extensively in LS. related reads:

- i consider the deepseek v3 paper required preread https://github.com/deepseek-ai/DeepSeek-V3

- R1 + Sonnet > R1 or O1 or R1+R1 or O1+Sonnet or any other combo https://aider.chat/2025/01/24/r1-sonnet.html

- independent repros: 1) https://hkust-nlp.notion.site/simplerl-reason 2) https://buttondown.com/ainews/archive/ainews-tinyzero-reprod... 3) https://x.com/ClementDelangue/status/1883154611348910181

- R1 distillations are going to hit us every few days - because it's ridiculously easy (<$400, <48hrs) to improve any base model with these chains of thought eg with Sky-T1 recipe (writeup https://buttondown.com/ainews/archive/ainews-bespoke-stratos... , 23min interview w team https://www.youtube.com/watch?v=jrf76uNs77k)

i probably have more resources but don't want to spam - seek out the latent space discord if you want the full stream i pulled these notes from


It's also worth mentioning that the original implementation by Meta is only 300 lines of very readable code [1].

[1]: https://github.com/meta-llama/llama3/blob/main/llama/model.p...


I just wanted to point out that both you and the one you're replying to keep writing "semantic web", but I think what you really mean is "semantic HTML".

The semantic web is the umbrella term for the stack of technologies that include RDF, SPARQL, OWL, etc. — it doesn't really have anything to do with HTML, other than some special syntax for embedding RDF into HTML.


How is this different from the "Recorder" feature available in Chrome (Dev Tools > Recorder)?

It can record a user-journey and print out a puppeteer script.


I'll add txtai to the list: https://github.com/neuml/txtai

txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.

Embeddings databases are a union of vector indexes (sparse and dense), graph networks and relational databases. This enables vector search with SQL, topic modeling and retrieval augmented generation.
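
The dense-vector-search half of this can be sketched in a few lines of plain Python. This is a toy illustration only: the `embed` function below is a bag-of-words stand-in for the real transformer sentence embeddings txtai uses, and none of these names come from txtai's API.

```python
from collections import Counter
import math

# Toy corpus; in txtai this would be indexed text records.
docs = ["the cat sat on the mat",
        "dogs chase cats",
        "stock markets fell today"]

def embed(text):
    # Bag-of-words stand-in for a real sentence embedding.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query, k=1):
    # Rank every document by similarity to the query vector.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

print(search("cat on a mat"))  # → ['the cat sat on the mat']
```

The real system replaces the linear scan with an ANN index (Faiss etc.) so search stays fast at millions of vectors.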

txtai adopts a local-first approach. A production-ready instance can be run locally within a single Python instance. It can also scale out when needed.

txtai can use Faiss, Hnswlib or Annoy as its vector index backend. This is relevant in terms of the ANN-Benchmarks scores.

Disclaimer: I am the author of txtai


Since I was using it mostly for coding, I closed my account. I now use the API with Cursor. Works out to be much cheaper than $20.

Transformers, sure, but there's evidence that LLMs will continue to outperform models trained for specific tasks. In the past you needed:

1. a model for sentiment analysis

2. a model for summarization

3. ...other NLP tasks

other tasks:

4. a model for object detection in images

5. a model for face recognition in images

Whereas now the LLM does all of the above better than the previous state of the art. This will continue and eat more fields of machine learning. It will happen for images and video. I argue that it will even extend to things like time series analysis.


> On macOS you can use apps like BetterDisplay, Vivid or BetterTouchTool to enable that HDR mode for the whole display

I have BTT and can't figure this out. Can you say how? Thanks.


GPT3.5 has been undergoing constant improvements, this price decrease (and context length increase) is great news!

The main problem I see with people using GPT3.5 is they try and ask it to "write a short story about aliens" and then they get back a crap boring response that sounds like it was written by an AI that was asleep at the wheel.

Good creative prompts are long and detailed, and to get the best results you really need to be able to tune temperature / top_p. Even small changes to a 3 paragraph prompt can result in dramatic changes in the output, and unless people are willing to play around with prompting, they won't get good results.

None of the prompt guides I've seen really cover pushing GPT3.5 to its limit, I've published one of my more complicated prompts[1] but getting GPT3.5 to output good responses in just this limited sense has taken a lot of work.

As for the longer context: output length is different from following instructions, and for a lot of use cases pushing in more input tokens is of as much interest as getting more output tokens.

From what I have explored, even at 4k context length, with a detailed prompt earlier instructions in the prompt are "forgotten" (or maybe just ignored). The blog post calls out better understanding of input text, but again, I hope that isn't orthogonal to following instructions!

Finally in regards to function outputs, I wonder if it is a second layer they are running on top of the initial model output. I have always had a challenge getting the model to output parsable responses, there is a definite trade off between written creativity and well formatted responses, and to some extent having a creative AI extend out the format I specify has been really nice because it has allowed me to add features I did not think of myself!

[1] https://github.com/devlinb/arcadia/blob/main/backend/src/rou...


This is a great summary of why productionizing LLMs is hard. I'm working on a couple LLM products, including one that's in production for >10 million users.

The lack of formal tooling for prompt engineering drives me bonkers, and it compounds the problems outlined in the article around correctness and chaining.

Then there are the hot takes on Twitter from people claiming prompt engineering will soon be obsolete, or people selling blind prompts without any quality metrics. It's surprisingly hard to get LLMs to do _exactly_ what you want.

I'm building an open-source framework for systematically measuring prompt quality [0], inspired by best practices for traditional engineering systems.

0. https://github.com/typpo/promptfoo


The author starts out with an excellent observation:

  Lately, I've been playing around with LLMs to write code. I find
  that they're great at generating small self-contained snippets.
  Unfortunately, anything more than that requires a human...
I have been working on this problem quite a bit lately. I put together a writeup describing the solution that's been working well for me:

https://aider.chat/docs/ctags.html

The problem I am trying to solve is that it’s difficult to use GPT-4 to modify or extend a large, complex pre-existing codebase. To modify such code, GPT needs to understand the dependencies and APIs which interconnect its subsystems. Somehow we need to provide this “code context” to GPT when we ask it to accomplish a coding task. Specifically, we need to:

1. Help GPT understand the overall codebase, so that it can decipher the meaning of code with complex dependencies and generate new code that respects and utilizes existing abstractions.

2. Convey all of this “code context” to GPT in an efficient manner that fits within the 8k-token context window.

To address these issues, I send GPT a concise map of the whole codebase. The map includes all declared variables and functions with call signatures. This "repo map" is built automatically using ctags and enables GPT to better comprehend, navigate and edit code in larger repos.
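
As a rough illustration of the idea (not the author's implementation, which runs ctags over the whole repo), here is a toy version using Python's stdlib `ast` module to pull declared names and call signatures out of a source string:

```python
import ast

# Toy source string for illustration; the real tool indexes
# every file in the repository.
source = '''
GREETING = "hello"

def add(a, b):
    return a + b

class Greeter:
    def greet(self, name):
        return f"{GREETING} {name}"
'''

def repo_map(src):
    """Return a compact list of declared functions/classes with signatures."""
    entries = []
    for node in ast.walk(ast.parse(src)):
        if isinstance(node, ast.FunctionDef):
            args = ", ".join(a.arg for a in node.args.args)
            entries.append(f"def {node.name}({args})")
        elif isinstance(node, ast.ClassDef):
            entries.append(f"class {node.name}")
    return entries

print("\n".join(repo_map(source)))
```

The resulting map is tiny compared to the code itself, which is what makes it cheap enough to fit in the context window alongside the actual task.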

The writeup linked above goes into more detail, and provides some examples of the actual map that I send to GPT as well as examples of how well it can work.


I still don't have a really good answer to this question:

If you want to be able to do Q&A against an existing corpus of documentation, can fine-tuning an LLM on that documentation get good results, or is that a waste of time compared to the trick where you search for relevant content and paste that into a prompt along with your question?

I see many people get excited about fine-tuning because they want to solve this problem.

The best answer I've seen so far is in https://github.com/openai/openai-cookbook/blob/main/examples...

> Although fine-tuning can feel like the more natural option—training on data is how GPT learned all of its other knowledge, after all—we generally do not recommend it as a way to teach the model knowledge. Fine-tuning is better suited to teaching specialized tasks or styles, and is less reliable for factual recall. [...] In contrast, message inputs are like short-term memory. When you insert knowledge into a message, it’s like taking an exam with open notes. With notes in hand, the model is more likely to arrive at correct answers.
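
The "open notes" approach that quote describes can be sketched end to end in a few lines. Everything here is illustrative: the corpus is made up, and real systems rank passages by embedding similarity rather than word overlap.

```python
# Hypothetical documentation corpus.
corpus = [
    "To reset your password, open Settings and choose Security.",
    "Billing invoices are emailed on the first of each month.",
    "The API rate limit is 100 requests per minute.",
]

def retrieve(question, k=1):
    # Score each passage by how many question words it shares.
    qwords = set(question.lower().split())
    return sorted(corpus,
                  key=lambda d: len(qwords & set(d.lower().split())),
                  reverse=True)[:k]

def build_prompt(question):
    # Paste the retrieved "notes" into the prompt alongside the question.
    notes = "\n".join(retrieve(question))
    return ("Answer the question as truthfully as possible using the context, "
            "and if you're unsure of the answer, say \"Sorry, I don't know\".\n\n"
            f"Context:\n{notes}\n\nQuestion: {question}\nAnswer:")

print(build_prompt("What is the API rate limit?"))
```

The prompt (not the model weights) carries the facts, which is exactly why this tends to beat fine-tuning for factual recall.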


It’s hard to keep up with all developments around LLaMA. What’s the best RLHF alpaca like model you can download right now?

LLaMA is the large language model published by Facebook (https://ai.facebook.com/blog/large-language-model-llama-meta...). In theory the model is private, but the model weights were shared with researchers and quickly leaked to the wider Internet. This is one of the first large language models available to ordinary people, much like Stable Diffusion is an image generation model available to ordinary people in contrast to DALL-E or MidJourney.

With the model's weights open to people, people can do interesting generative stuff. However, it's still hard to train the model to do new things: training large language models is famously expensive because of both their raw size and their structure. Enter...

LoRA is a "low rank adaptation" technique for training large language models, fairly recently published by Microsoft (https://github.com/microsoft/LoRA). In brief, the technique assumes that fine-tuning a model really just involves tweaks to the model parameters that are "small" in some sense, and through math this algorithm confines the fine-tuning to just the small adjustment weights. Rather than asking an ordinary person to re-train 7 billion or 11 billion or 65 billion parameters, LoRA lets users fine-tune a model with about three orders of magnitude fewer adjustment parameters.
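
The parameter-count arithmetic is easy to see in a toy sketch (pure Python, illustrative dimensions only): instead of updating a d x d weight matrix W, LoRA freezes W and trains a rank-r update B @ A, where B is d x r and A is r x d.

```python
import random

def matmul(A, B):
    # Naive (n x k) @ (k x m) matrix multiply, to keep this dependency-free.
    n, k, m = len(A), len(B), len(B[0])
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

def madd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Illustrative dimensions only; real models have d in the thousands.
d, r = 8, 2
random.seed(0)

W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]      # frozen
B = [[0.0] * r for _ in range(d)]                                   # trainable, starts at zero
A = [[random.gauss(0, 0.01) for _ in range(d)] for _ in range(r)]   # trainable

full_params = d * d          # what a full fine-tune would update
lora_params = d * r + r * d  # what LoRA actually trains
print(full_params, lora_params)  # → 64 32; the gap grows as d >> r

# The adapted weight is W + B @ A; since B starts at zero,
# the model is initially unchanged.
W_adapted = madd(W, matmul(B, A))
```

With d = 4096 and r = 8, that's roughly 16.8M frozen parameters versus ~65K trainable ones per matrix, which is where the orders-of-magnitude savings come from.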

Combine these two – publicly-available language model weights and a way to fine tune it – and you get work like the story here, where the language model is turned into something a lot like ChatGPT that can run on a consumer-grade laptop.


"Model weights aren't part of the release for now, to respect OpenAI TOS and LLaMA license."

I feel like the whole Open Source ML scene is slowed down by a strong chilling effect. Everyone seems to be afraid to release models.

Meanwhile, other models are freely available up to alpaca 30b:

https://github.com/underlines/awesome-marketing-datascience/...


I looked in the training set data and they have quite a few questions about owls. Also it got "downward curved beak" from davinci and still got it wrong.

Like:

"instruction": "Describe the sound an owl makes.",

"instruction": "Summarize the differences between an owl and a hawk.",

"instruction": "Find a fact about the bird of the following species", "input": "Species: Great Horned Owl",

"instruction": "What is the binomial nomenclature of the barn owl?",

"instruction": "Generate a riddle about an owl.",


Friendly reminder that we (Pinecone) have a free tier that holds up to ~5M SBERT embeddings (x768 dimensions). For quick projects, going "all Pinecone on this" could turn out to be the easier and faster option.

I've been using ChatGPT pretty consistently during the workday and have found it useful for open ended programming questions, "cleaning up" rough bullet points into a coherent paragraph of text, etc. $20/month useful is questionable though, especially with all the filters.

My "in between" solution has been to configure BetterTouchTool (Mac App) with a hotkey for "Transform & Replace Selection with Javascript". This is intended for doing text transforms, but putting an API call instead seems to work fine. I highlight some text, usually just an open ended "prompt" I typed in the IDE, or Notes app, or an email body, hit the hotkey, and ~1s later it adds the answer underneath. This works...surprisingly well. It feels almost native to the OS. And it's cheaper than $20/month, assuming you aren't feeding it massive documents worth of text or expecting paragraphs in response. I've been averaging like 2-10c a day, depending on use.

Here is the javascript if anyone wants to do something similar. I don't know JS really, so I'm sure it could be improved. But it seems to work fine. You can add your own hard coded prompt if you want even.

    async (clipboardContentString) => {
      try {
        const response = await fetch("https://api.openai.com/v1/completions", {
          method: "POST",
          headers: {
            "Content-Type": "application/json",
            "Authorization": "Bearer YOUR API KEY HERE"
          },
          body: JSON.stringify({
            model: "text-davinci-003",
            prompt: `${clipboardContentString}.`,
            temperature: 0,
            max_tokens: 256
          })
        });
        const data = await response.json();
        // Surface API errors, which come back without a choices array
        if (!data.choices || data.choices.length === 0) {
          return data.error ? data.error.message : "Error";
        }
        const text = data.choices[0].text;
        return `${clipboardContentString} ${text}`;
      } catch (error) {
        return "Error";
      }
    }

  This demonstrates that in the lack of useful context GPT-3 will answer the question entirely by itself—which may or may not be what you want from this system.
You can instruct it not to do that. This is explained in OpenAI's post about the same technique[0]:

  Answer the question as truthfully as possible, and if you're unsure of the answer, say "Sorry, I don't know"
[0] https://github.com/openai/openai-cookbook/blob/main/examples... (which is now linked in OP)

Unfortunately they did not include a one-click uninstaller! Besides removing the .app, you also need to delete the 4GB+ model that it downloads and other files that it scatters about:

  $HOME/.diffusionbee
  $HOME/Library/Application\ Support/DiffusionBee

I'm having a lot of trouble with getting fonts to look clear and unfuzzy on my M1 Pro with a new LG 38WN95C-W 38" 21:9 (3840 x 1600) monitor I just bought. It's driving me nuts. I've tried different picture settings on the monitor, with and without HDR, tried different resolutions, tried every possible value of `defaults -currentHost write -g AppleFontSmoothing -int ${n}` but no dice.

Anyone have any ideas what I can do to resolve?

Machine details below:

   Software:

    System Software Overview:

      System Version: macOS 12.6 (21G115)
      Kernel Version: Darwin 21.6.0
      Boot Volume: Macintosh HD
      Boot Mode: Normal
      Secure Virtual Memory: Enabled
      System Integrity Protection: Enabled
      Time since boot: 4:14
Hardware:

    Hardware Overview:

      Model Name: MacBook Pro
      Model Identifier: MacBookPro18,1
      Chip: Apple M1 Pro
      Total Number of Cores: 10 (8 performance and 2 efficiency)
      Memory: 32 GB
      System Firmware Version: 7459.141.1
      OS Loader Version: 7459.141.1

In case it interests anyone, I built a document editor + GPT and updated it to use text-davinci-003: https://github.com/typpo/arkose/

If you want to try GPT-3 but don't have an OpenAI API key, I've set up a quick demo here until I hit my billing cap (normally users would supply their own API key): https://arkose.pages.dev/


I've been trying out various models for summarization and I have to say, on first glance, this davinci-003 doesn't impress compared to davinci-002 or facebook/bart-large-cnn. If you throw in "summarize the following text:" and copy in the contents of https://en.wikipedia.org/wiki/Hacker_News you get the following (temperature 0.0 max tokens 100)[0]:

davinci-003

Hacker News is a social news website focusing on computer science and entrepreneurship, run by Y Combinator. It was created by Paul Graham in 2007 and is moderated by Daniel Gackle. It is intended to recreate a community similar to the early days of Reddit, but with a karma system that requires users to accumulate 501 points before they can downvote content. It also has automated flame and spam detectors and active human moderation, as well as a "voting rings" detector. It

davinci-002

Hacker News is a social news website focusing on computer science and entrepreneurship. It is run by the investment fund and startup incubator Y Combinator. The site was created by Paul Graham in February 2007. Initially called Startup News or occasionally News.YC., it became known by its current name on August 14, 2007.

facebook/bart-large-cnn

Hacker News is a social news website focusing on computer science and entrepreneurship. It is run by the investment fund and startup incubator Y Combinator. In general, content that can be submitted is defined as anything that gratifies one's intellectual curiosity. The site has a proactive attitude in moderating content, including automated flame and spam detectors and active human moderation.

Curious if anyone knows any good alternatives or tricks to improve the summaries? For now I'm using the bart one as it has the advantage of being downloadable from huggingface so you can run it yourself [1].

[0]: https://beta.openai.com/playground?model=text-davinci-003

[1]: https://huggingface.co/facebook/bart-large-cnn


As a general rule, successful programmers have been doing this forever:

  - It works. You added 2 things. It doesn't work. Which thing broke it?
  - It doesn't work. You changed 2 things. It works. Which thing fixed it?
  - It works. Two people changed it. It doesn't work. Who broke it?
  - One donut left. 2 programmers went into break room. No donuts left. Who ate it?
Now change "2" to "1" in all of the above examples. See how much easier?
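
This one-variable-at-a-time principle is what `git bisect` automates: a binary search over the change history for the first change that breaks things. A minimal sketch of the idea (all names hypothetical):

```python
def first_bad(changes, is_broken):
    """Binary search for the first change that breaks the build,
    assuming everything before it is good and everything at or
    after it is bad (the same idea behind `git bisect`)."""
    lo, hi = 0, len(changes) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_broken(changes[:mid + 1]):
            hi = mid        # breakage is at mid or earlier
        else:
            lo = mid + 1    # breakage is after mid
    return changes[lo]

# Hypothetical example: change "C" introduced the bug.
changes = ["A", "B", "C", "D", "E"]
print(first_bad(changes, lambda applied: "C" in applied))  # → C
```

With n changes this takes O(log n) checks, which is why keeping each change small and testable pays off so quickly.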

There's also the shelve[0] module which allows storing any pickleable object in a persistent key-value store, not just string/bytes. I've found it's very handy for caching while developing scripts which query remote resources, and not have to worry about serialization.

[0] https://docs.python.org/3.10/library/shelve.html
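
For example, a minimal caching pattern (the path here is just a temp directory for illustration):

```python
import os
import shelve
import tempfile

# Hypothetical cache location, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "cache")

# Any pickleable object can be stored, not just strings/bytes.
with shelve.open(path) as db:
    db["results"] = {"status": 200, "items": [1, 2, 3]}

# Reopen later: the data persists across runs with no manual serialization.
with shelve.open(path) as db:
    print(db["results"]["items"])  # → [1, 2, 3]
```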

Obligatory pickle note: one should be aware of pickle security implications and should not open a "Shelf" provided by untrusted sources, or rather should treat opening a shelf (or any pickle deserialization operation for that matter) as running an arbitrary Python script (which cannot be read).


The Norwegian Meteorological Institute has an excellent free HTTP/JSON weather API that covers the globe. No signup required.

https://developer.yr.no


This is a laudable effort, but I'm not a fan of shipping the entire interpreter. I looked around a few weeks ago and found https://transcrypt.org, which compiles your Python script to JS, so size is minimal.

It's great for shipping small, internal tools/apps, I love how maintainable they are by all the Python devs, plus they're very fast to load and execute.


There are some alternative frontends for Spotify available. None of the ones I found work on M1 Macs, but Windows and Linux users have some options available.

https://github.com/jpochyla/psst

https://github.com/toothbrush/Spotiqueue

https://github.com/xou816/spot


Yes! You just have to bind the function `org-open-at-point-global` to a keystroke. This is what I have in my .emacs file:

    (global-set-key (kbd "C-c L") 'org-insert-link-global) 
    (global-set-key (kbd "C-c O") 'org-open-at-point-global)
