This is really cool, but why wouldn't you just use a more richly typed target language and skip this process? You could use Liquid Haskell (for refinement types) or Lean (for full dependent types) and be able to put these invariants directly in your program rather then in a sidecar.
But, for everyone else: even if you skipped the sidecar entirely, didn't use the codegen, you just had the AI interpret the spec'd application into a short Shen proof, iterate until it's internally consistent / compiles...now you have a spec that is internally consistent, straightforward for human to understand, and much stronger context for the LLM than English language spec alone.
Not the author, but I've been doing this kind of thing with Lean. I'm still trying to figure out how to make this workflow play nicely with other systems. I have a bunch of rust code that I want these kinds of guarantees on, and that code has third party dependencies that would be terrible to give up. It's totally unclear right now how to get the best of both worlds.
I remember in elementary school a friend had a book of these. The one I remember best was a dust mask that had been modified to receive something like 50 cigs so you could smoke them all at once.
From the podcast episode they talk about the idea of using an LLM for training by disallowing the model to write code. I've been experimenting with exactly that in conjunction with a proof checker (Agda) to help me learn some cubical type theory and category theory.
I find the LLM as interactive tutor reviewing my work in a proof checker to be a really killer combo.
I think Hindley Milner (for decidability) + Linear Types (for resource management) + Refinement Types (for lightly asserting invariants) + Delimited Continuation based Effects (for tracking effectful code) + Unison style Content Addressability (for corralling code changes, documentation, and tests) would make a really nice language for an LLM.
It doesn't have Hindley-Milner type inference, but it has very strong type inference.
We will get linearity soon thanks to and as part of the Capybara[1] effort.
Refinement types are already long a reality.
The whole new effect tracking thing is based on delimited continuations.
The Unison style content addressability comes up now and then, maybe it will become a reality at some point. It's though mostly not a language thing but more a build system thing.
Scala is already great for for LLMs also for other reasons:
AFAIK Scala's type system is not decidable. The point of Hindley Milner (and I really should have said System F without impredicative or higher rank types) was to get decidable polymorphism not type inference.
reply