Hacker News | trott's comments

> The AlphaFold3 analysis (the AI contribution) literally accounts for a few panels in a supplementary figure - it didn't even help guide their choice of small molecule inhibitors since those were already known.

(Disclaimer: I'm the author of a competing approach)

Searching for new small-molecule inhibitors requires going through millions of novel compounds. But AlphaFold3 was evaluated on a dataset that tends to be repetitive: https://olegtrott.substack.com/p/are-alphafolds-new-results-...


What is the upshot of that?


Regarding point number 11 (AlphaFold3 vs Vina, Gnina, etc.), see my rebuttal here (I'm the author of Vina): https://olegtrott.substack.com/p/are-alphafolds-new-results-...

Gnina is Vina with its results re-scored by a neural network, so the exact same concerns apply.

I'm very optimistic about AI, for the record. It's just that in this particular case, the comparison was flawed. It's the old regurgitation vs generalization confusion: We need a method that generalizes to completely novel drug candidates, but the evaluation was done on a dataset that tends to be repetitive.


> According to Stack Overflow developer survey [0] Rust is at 12.5%, ... So definitely not niche.

The annual survey is very popular in the Rust community. Its results are often used for advocacy. Participation by Rust developers is very high. So what you have is a classic case of selection bias.


They started with N >= 120x3 tasks, and gave each task to 4-9 humans. Then they kept only those 120x3 tasks that at least 2 humans had solved.
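In pseudocode terms, the filtering step amounts to something like this (a toy sketch with made-up solve counts, not their actual data):

```python
# Toy sketch of the task-filtering step described above (made-up numbers):
# each task went to 4-9 humans; only tasks solved by at least 2 were kept.
solve_counts = {"task_a": 3, "task_b": 1, "task_c": 2, "task_d": 0}

kept = [task for task, solved_by in solve_counts.items() if solved_by >= 2]
print(kept)  # ['task_a', 'task_c']
```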


That's a very small sample size per task. I wonder what the result would be if they gave the whole data set to an average human. I tried some of the simple tasks and they are doable, but I couldn't figure out the hard ones.


> but Ada's real strengths lie elsewhere. Its strong typing,

Ada is not actually type-safe: https://www.enyo.de/fw/notes/ada-type-safety.html


As explained at your link, the example program that breaks type safety depends on a mistake in the 1983 Ada standard regarding the use of "aliased", which was removed by a later Technical Corrigendum. The corrigendum explicitly classifies the demonstrated program as erroneous, so any compliant Ada compiler should fail to compile it.

As also explained at your link, the same type-safety breaking technique works in unsafe Rust. Both "unchecked" Ada and "unsafe" Rust do not provide type safety, while the safe subsets of the languages provide it.
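For the Rust half of that claim, a minimal illustration (not the article's exact technique) is that `unsafe` code can reinterpret bits across unrelated types with `std::mem::transmute`, bypassing the type checker entirely:

```rust
fn main() {
    // Safe Rust has no way to view a u32 as an f32 without a conversion;
    // `unsafe` opts out of that guarantee and reinterprets the raw bits.
    let bits: u32 = 0x3F80_0000;
    let f: f32 = unsafe { std::mem::transmute::<u32, f32>(bits) };
    assert_eq!(f, 1.0); // 0x3F80_0000 is the IEEE-754 bit pattern of 1.0f32
}
```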


> a mistake of the 1983 Ada standard ... which has been removed

The article was written in 2011, and the trick still seems to work in a 2024 version of GNAT.

> Both "unchecked" Ada and "unsafe" Rust

But the `Conversion` function isn't using `Unchecked_*`. That's the point of the article. The type safety hole is in "safe" Ada.


While that was apparently true in 2015, it doesn't work on today's Rust: https://play.rust-lang.org/?version=stable&mode=debug&editio...

I'm not sure what the difference was, given that the representations haven't changed, and I'm doing this without invoking the optimizer.

EDIT: I had issues with Miri so I dug into the MIR myself:

  let uncopied : *const Uncopyable<*const B> =
    match magic {
      Magic::B(b) => &raw const b,

        StorageLive(_4);
        StorageLive(_5);
        _5 = move ((_3 as B).0: Uncopyable<*const B>);
        _4 = &raw const _5;
        StorageDead(_5);

_3 is magic, _4 is uncopied, and _5 is b. move here is like ptr::read, which means that uncopied points to a copy of magic, not aliasing magic, and is dangling. Because this is UB, it gets optimized straight into the panic.

After I figured that out, miri started working, I must have made a mistake earlier. It will tell us the same thing:

    test test ... error: Undefined Behavior: memory access failed: alloc113986 has been freed, so this pointer is dangling
      --> src/lib.rs:24:13
       |
    24 |     assert!((*uncopied).value != std::ptr::null());
       |             ^^^^^^^^^^^^^^^^^ memory access failed: alloc113986 has been freed, so this pointer is dangling
       |
       = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
       = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
    help: alloc113986 was allocated here:
      --> src/lib.rs:18:16
       |
    18 |       Magic::B(b) => &b,
       |                ^
    help: alloc113986 was deallocated here:
      --> src/lib.rs:18:23
       |
    18 |       Magic::B(b) => &b,
       |                       ^
       = note: BACKTRACE (of the first span) on thread `test`:
       = note: inside `magic::<&str, &u8>` at src/lib.rs:24:13: 24:30
    note: inside `test`
      --> src/lib.rs:36:5
       |
    36 |     magic::<&str, &u8>("magic string");
       |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    note: inside closure
      --> src/lib.rs:35:10
       |
    34 | #[test]
       | ------- in this procedural macro expansion
    35 | fn test() {
       |          ^
       = note: this error originates in the attribute macro `test` (in Nightly builds, run with -Z macro-backtrace for more info)

This code started failing in Rust 1.12, when MIR was introduced, so my guess is that's exactly what fixed it.

Funnily enough, if we take the critique in the forum as correct and try with a union: https://play.rust-lang.org/?version=stable&mode=debug&editio...

Rust will:

  1. force us to use ManuallyDrop, and
  2. make the move explicit in the API, so we get a compile-time error!
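Roughly, the shape of it (a toy union, not the code in the playground link):

```rust
use std::mem::ManuallyDrop;

// A union with a non-Copy field only compiles if that field is wrapped
// in ManuallyDrop, and every field access requires `unsafe`.
union Magic {
    _n: u64,
    s: ManuallyDrop<String>,
}

fn main() {
    let m = Magic { s: ManuallyDrop::new(String::from("magic")) };
    // Taking the String back out is an explicit, visible move.
    let s: String = unsafe { ManuallyDrop::into_inner(std::ptr::read(&m.s)) };
    assert_eq!(s, "magic");
}
```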


Yep, and even without dynamic memory management, Ada is not type-safe: https://www.enyo.de/fw/notes/ada-type-safety.html

Rust also has soundness holes, by the way. This one is almost 10 years old: https://github.com/rust-lang/rust/issues/25860


Tons of languages have soundness bugs. While that one is ten years old, there have been zero reports of it ever being found in the wild, and it will finally be fixed relatively soon.


> it will finally be fixed relatively soon

2015: "The work needed to close this has not yet landed. It's in the queue though, once we finish up rust-lang/rfcs#1214."


Yes. For various other reasons, the trait system needed to be rewritten, and the rewrite would also end up fixing this. Since it's essentially a theoretical issue, spending time on it separately wouldn't make sense. In the meantime, that rewrite is nearing completion: the most recent version of Rust uses it for coherence checking. The next few months are likely to remove several other blockers. It's almost time for a crater run, and then fixing whatever issues show up.


The Rust one seems to be a specific bug in the implementation, while the Ada one seems more like a fundamental flaw caused by allowing aliased mutation, though.


Another way to look at this is that there are 12,290 bits of information in choosing 817 samples from 10,000,000.
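(That's log2 of the binomial coefficient "10,000,000 choose 817" — easy to sanity-check in Python, give or take the counting convention:)

```python
import math

# Bits needed to specify one unordered 817-element subset of 10,000,000
# items: log2 of the binomial coefficient C(10_000_000, 817).
bits = math.log2(math.comb(10_000_000, 817))
print(round(bits))  # on the order of 12,000 bits
```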


And much more information when selecting just as many examples from quadrillions of randomly generated examples.

The information from the selection criteria isn't available to the model, just the chosen samples.


How do Maxima and SymPy compare in terms of capability, features and speed (native, not WASM)?


In terms of features, I imagine they are about the same. Here is why I think this: Maxima's function organization seems to follow, and refers to, the NIST Digital Library of Mathematical Functions: https://dlmf.nist.gov/

Mathematica and SymPy also seem to follow this organization.

My take: I attribute the compatibility to the NIST Digital Library of Mathematical Functions.


> This has me curious about ARC-AGI

In the o3 announcement video, the president of ARC Prize said they'd be partnering with OpenAI to develop the next benchmark.

> mechanical turking a training set, fine tuning their model

You don't need mechanical turking here. You can use an LLM to generate a lot more data that's similar to the official training data, and then you can train on that. It sounds like "pulling yourself up by your bootstraps", but isn't. An approach to do this has been published, and it seems to be scaling very well with the amount of such generated training data (they won the 1st paper award).


I know nothing about LLM training, but do you mean there is a solution to the issue of LLMs gaslighting each other? Sure, this is a proven way of getting training data, but you cannot get theorems and axioms right by generating different versions of them.


This is the paper: https://arxiv.org/abs/2411.02272

They won the 1st paper award: https://arcprize.org/2024-results

In their approach, the LLM generates inputs (images to be transformed) and solutions (Python programs that do the image transformations). The output images are created by applying the programs to the inputs.

So there's a constraint on the synthetic data here that keeps it honest -- the Python interpreter.
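A caricature of that loop, with a made-up transformation (nothing here is the paper's actual code):

```python
# Toy sketch of interpreter-verified synthetic data: in the real approach an
# LLM proposes both the transformation program and the example inputs; here
# both are hard-coded. Applying the program yields ground-truth outputs, so
# the resulting (input, output) pairs are honest by construction.

def proposed_program(grid):
    """A hypothetical generated transformation: transpose the grid."""
    return [list(row) for row in zip(*grid)]

proposed_inputs = [
    [[1, 2], [3, 4]],
    [[5, 6, 7], [8, 9, 0]],
]

# Each pair is verified by actually running the program in the interpreter.
training_pairs = [(grid, proposed_program(grid)) for grid in proposed_inputs]
print(training_pairs[0][1])  # [[1, 3], [2, 4]]
```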


I believe the paper being referenced is “Scaling Data-Constrained Language Models” (https://arxiv.org/abs/2305.16264).

For correctness, you can use a solver to verify generated data.


> Terra programs use the same LLVM backend that Apple uses for its C compilers.

Can it use anything else (as an option), e.g. Lua? That would be useful during development/debugging thanks to faster iteration and memory safety.


LLVM would not be easily replaced, https://chriscummins.cc/2019/llvm-cost/

> 1.2k developers have produced a 6.9M line code base with an estimated price tag of $530M.


Although it's true that replacing LLVM like-for-like would indeed be expensive, I feel like you missed GP's point. To make the language easier to debug, what you want is to replace LLVM with something simple, like a plain interpreter. And writing one (or adapting one) would be nothing like that cost.


Is there any precedent from an LLVM-hosted language? There are benefits to being on the same toolchain as popular languages.


Rust is experimenting with a codegen backend called Cranelift, which powers wasmtime. The plan is to use it for debug builds. There is also a backend implementation in the works using libgccjit, and a .NET backend as well.


Zig is actually trying to get rid of LLVM for default builds, though they intend to keep it as one of the "backends" for the foreseeable future.

The problem is that LLVM is very slow (as it applies lots of optimisations) and heavy, and that makes a compiler slow even in debug mode.

D has 3 backends: LLVM is used by the LDC compiler; then there's DMD, which is the official compiler (and the frontend for all three); and GDC, which uses the GCC backend. D code typically compiles quite a bit faster with DMD, but the code is more highly optimised with LLVM or GCC, of course. It's a great tradeoff, especially since all compilers use the DMD frontend, so it's almost guaranteed that there are no differences in behaviour between them.


That's a fallacy that was ignored when LLVM was started in the face of the already existing GCC.

