Hacker News | upghost's comments

> Pre-training allows organizations to build domain-aware models by learning from large internal datasets.

> Post-training methods allow teams to refine model behavior for specific tasks and environments.

How do you suppose this works? They say "pretraining" but I'm certain that the amount of clean data available in proper dataset format is not nearly enough to make a "foundation model". Do you suppose what they are calling "pretraining" is actually SFT and then "post-training" is ... more SFT?

There's no way they mean "start from scratch". Maybe they do something like generate a heckin bunch of synthetic data seeded from company data using one of their SOA models -- which is basically equivalent to low resolution distillation, I would imagine. Hmm.


Pre-training means exposing an already-trained model to more raw text, like PDF extracts etc. (aka continued pre-training). You wouldn't be starting from scratch, but it's still pre-training because the objective is just next-token prediction of the text you expose it to.

Post-training means everything else: SFT, DPO, RL, etc. Anything that involves things like prompt/response pairs, reward models, or benefits from human feedback of any kind.
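Concretely, the two regimes differ mainly in how examples and loss targets are constructed, not in the optimizer. A toy sketch (pure illustration with token lists, no real model or tokenizer):

```python
# Toy illustration: continued pre-training makes every token a prediction
# target, while SFT masks the prompt and only puts loss on the response.

def pretraining_examples(text_tokens, context=4):
    """Continued pre-training: every token (after the first) is a target."""
    examples = []
    for i in range(1, len(text_tokens)):
        ctx = text_tokens[max(0, i - context):i]
        examples.append((ctx, text_tokens[i]))  # predict the next token
    return examples

def sft_examples(prompt_tokens, response_tokens):
    """SFT: the prompt is context only; loss lands on response tokens."""
    examples = []
    full = prompt_tokens + response_tokens
    for i in range(len(prompt_tokens), len(full)):
        examples.append((full[:i], full[i]))
    return examples

doc = ["the", "contract", "renews", "every", "march"]
print(len(pretraining_examples(doc)))             # 4 targets: all but first token
print(len(sft_examples(["q", ":"], ["a", "."])))  # 2 targets: response only
```

Same gradient descent either way; the difference is entirely in which positions contribute to the loss.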


Er, then what is the "already trained" model? I thought pre-training was the gradient descent through the internet part of building foundational models.

Probably marketing speak for full fine-tuning vs PEFT/LoRA.

I would guess:

Pre-training: refining the weights in an existing model using more training data.

Post-training: Adding some training data to the prompt (RAG, basically).


I think they are referring to “continued pretraining”.

I can imagine that, as usual, you start with a few examples and then instruct an LLM to synthesize more examples out of that, and train using that. Sounds horrible, but actually works fairly well in practice.
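A minimal sketch of that seed-and-expand loop, with `ask_llm` as a hypothetical stand-in for whatever completion API is actually used:

```python
# Hypothetical bootstrap loop: start from a few seed examples and repeatedly
# ask a model to synthesize variants. `ask_llm` is a stub, not a real API.
import random

def ask_llm(prompt):
    # stub: a real implementation would call a model endpoint here
    return prompt.split("Seed: ")[-1] + " (variant)"

def synthesize(seeds, rounds=2):
    pool = list(seeds)
    for _ in range(rounds):
        seed = random.choice(pool)  # later rounds can seed off synthetic data
        pool.append(ask_llm(f"Write a similar example. Seed: {seed}"))
    return pool

examples = synthesize(["Invoice NET-30 terms"], rounds=3)
print(len(examples))  # 1 seed + 3 synthesized = 4
```

In practice you would also filter or deduplicate the pool, since quality drifts as synthetic examples start seeding further generations.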

Probably just means SFT fine-tuning a base model, vs behavioural DPO and/or SFT fine-tuning an instruction model.

> We are doing this to self-fund further investment in AI and enterprise sales while strengthening our financial profile.

Some quotes from the video:

> ...at the same time, we're a people company.

> Your work will live on in our products.

> Doing the right thing for Atlassian while acting with humanity and doing the right thing for all those on all sides of this set of decisions.

Wow. There's a lot to unpack here.


> Mr Cannon-Brookes told investors he “couldn’t be more bullish” about the opportunities ahead, despite relentlessly selling his own shares in the company daily. The Nightly reports he kept selling 7665 shares on a daily basis even in the month prior to the results at prices ranging from $US161.11 (AU$227) a share on January 8 to $US105.14 on February 4.

> While ordinary Aussies are asked to make big changes, the 46-year-old decided to treat himself to a ritzy new private jet late last year, admitting to a “deep internal conflict” over the carbon-heavy method of travel.

> The Atlassian co-founder and CEO bought a Bombardier 7500 and will use it to travel across his vast business operations, which include a minority stake in the Utah Jazz NBA team and a sponsorship deal with Formula 1.

https://www.msn.com/en-au/money/other/aussie-sacks-1600-afte...


Very interesting stuff. Apparently this is the implementation: https://github.com/dicpeynado/prolog-in-forth

Thinking about the amount of thought and energy that went into this, back in 1987 -- mostly preinternet, pre-AI. Damn.

I feel really lucky that we get to build on things like this.


There's a great 1986 book "Designing and Programming Personal Expert Systems" by Feucht and Townsend that implements expert systems in Forth (and in the process, much of the capability of Prolog and Lisp).

Ha, you beat me to it! That book was my first thought when I saw this post. I have a copy sitting here on my bookshelf.

Just to expand on how bonkers this book is... they assume that everyone has easy access to a Forth implementation. So they teach you how to build a Lisp on top of it. Then they use the Lisp you just built to build a Prolog. Then, finally, they do what the topic of the book actually is: build a simple expert system on top of that Prolog.

I love it!


To be fair, in the 1980s thanks to the Forth Interest Group (FIG), free implementations of Forth existed for most platforms at a time when most programming languages were commercial products selling for $100 or more (in 1980s dollars). It's still pretty weird, but more understandable with that in mind.

I'm surprised how hard I had to dig for an actual example of syntax[1], so here you go.

[1]: https://www.lix.polytechnique.fr/~dale/lProlog/proghol/extra...


There is also an implementation of 99 Bottles of Beer on Rosetta Code: https://rosettacode.org/wiki/99_bottles_of_beer#Lambda_Prolo...


Constantly amused by the split in comments of any moderately innovative language post between ‘I don't care about all this explanation, just show me the syntax!’ and ‘I don't understand any of this syntax, what a useless language!’

If the language is ‘JavaScript but with square brackets instead of braces’ maybe the syntax is relevant. But in general concrete syntax is the least interesting (not least important, but easiest to change) thing in a programming language, and its similarity to other languages a particular reader knows less interesting still. JavaScript is not the ultimate in programming language syntax (I hope!) so it's still worth experimenting, even if the results aren't immediately comprehensible without learning.


In Prolog the syntax is incredibly important. It is designed to be metainterpreted with the same ease in which a for-loop might be written in another language.

https://www.metalevel.at/acomip/

  mi1(true).
  mi1((A,B)) :-
        mi1(A),
        mi1(B).
  mi1(Goal) :-
        Goal \= true,
        Goal \= (_,_),
        clause(Goal, Body),
        mi1(Body).
This can be arbitrarily extended in very interesting, beautiful, and powerful ways. This is extraordinarily hard to achieve and did not happen by accident.

As a challenge, see how easy it is to write a metainterpreter in another language of your choice. Alternately, see if you can think of any way the metainterpretation system in Prolog could be improved.

Finally, think of what would happen to this if we changed the syntax and introduced something like object.field notation.

So while logical programming can be achieved with other syntaxes, the metainterpretive aspect will be lost. I have yet to see a language that does this better.
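To make the challenge concrete, here is a deliberately minimal Prolog-style resolver in Python (a sketch only, assuming tuple terms and capitalized-string variables, with no occurs check, no cut, and no operator syntax). Even this stripped-down version needs unification, clause renaming, and backtracking written out by hand, which is the machinery the short Prolog snippet above gets for free:

```python
def is_var(t):
    """Variables are strings starting with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    """Follow variable bindings in substitution s."""
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    """Return an extended substitution, or None on failure (no occurs check)."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def rename(term, n):
    """Suffix rule variables so separate clause instances don't collide."""
    if is_var(term):
        return f"{term}_{n}"
    if isinstance(term, tuple):
        return tuple(rename(t, n) for t in term)
    return term

def solve(goals, rules, s, depth=0):
    """SLD resolution with backtracking via generators."""
    if not goals:
        yield s
        return
    goal, rest = goals[0], goals[1:]
    for head, body in rules:
        head = rename(head, depth)
        body = [rename(b, depth) for b in body]
        s2 = unify(goal, head, s)
        if s2 is not None:
            yield from solve(body + rest, rules, s2, depth + 1)

# parent facts plus a recursive ancestor rule, Prolog-style
rules = [
    (("parent", "tom", "bob"), []),
    (("parent", "bob", "ann"), []),
    (("ancestor", "X", "Y"), [("parent", "X", "Y")]),
    (("ancestor", "X", "Y"), [("parent", "X", "Z"), ("ancestor", "Z", "Y")]),
]
sols = list(solve([("ancestor", "tom", "Who")], rules, {}))
print([walk("Who", s) for s in sols])  # ['bob', 'ann']
```

And note what this sketch still lacks compared to real Prolog: you cannot metainterpret *it* from within the object language, which is exactly the point being made.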


Nice link, thank you! I'm not sure it's super related to my comment but it is closely related to some other things I'm thinking about. I'll give it a read :)

There are some examples in this tutorial PDF:

https://www.lix.polytechnique.fr/Labo/Dale.Miller/lProlog/fe...


I have written stuff in Prolog, but I find this lambda Prolog syntax very difficult to grok.


So brainfuck x lisp


Christ... it's incomprehensible... I guess that one's staying in academia :P


So I have been doing formal specification with TLA+ using AI assistance and it has been very helpful AFTER I REALIZED that quite often it was proving things that were either trivial or irrelevant to the problem at hand (and not the problem itself), which was difficult to detect at a high level.

I realize formal verification with lean is a slightly different game but if anyone here has any insight, I tend to be extremely nervous about a confidently presented AI "proof" because I am sure that the proof is proving whatever it is proving, but it's still very hard for me to be confident that it is proving what I need it to prove.

Before the dog piling starts, I'm talking specifically about distributed systems scenarios where it is just not possible for a human to think through all the combinatorics of the liveness and safety properties without proof assistance.

I'm open to being wrong on this, but I think the skill of writing a proof and understanding the proof is different than being sure it actually proves for all the guarantees you have in mind.

I feel like closing this gap is make-or-break for AI-augmented proof assistance.


In my experience, finding the "correct" specification for a problem is usually very difficult for realistic systems. Generally it's unlikely that you'll be able to specify ALL the relevant properties formally. I think there's probably some facet of Kolmogorov complexity there; some properties probably cannot be significantly "compressed" in a way where the specification is significantly shorter and clearer than the solution.

But it's still usually possible to distill a few crucial properties that can be specified in an "obviously correct" manner. It takes A LOT of work (sometimes I'd be stuck for a couple of weeks trying to formalize a property). But in my experience the trade off can be worth it. One obvious benefit is that bugs can be pricey, depending on the system. But another benefit is that, even without formal verification, having a few clear properties can make it much easier to write a correct system, but crucially also make it easier to maintain the system as time goes by.


I'm curious since I'm not a mathematician: What do you mean by "stuck for a couple of weeks"? I am trying to practice more advanced math and have stumbled over lean and such but I can't imagine you just sit around for weeks to ponder over a problem, right? What do you do all this time?


I'm not a mathematician either ;) Yeah, I won't sit around and ponder at a property definition for weeks. But I will maybe spend a day on it, not get anywhere, and then spend an hour or two a day thinking about ways to formulate it. Sometimes I try something, then an hour later figure out it won't work, but sometimes I really do just stare at the ceiling with no idea how to proceed. Helps if you have someone to talk to about it!


You experience counterexamples for why a specific definition is not going to work. Many times, and at various levels of "not going to work": usually hovering slightly above the syntactic level, but sometimes above the plain definition-semantics level, i.e. mostly concerned with some indirect interaction aspects.


Yeah, even for simple things, it's surprisingly hard to write a correct spec. Or more to the point, it's surprisingly easy to write an incorrect spec and think it's correct, even under scrutiny, and so it turns out that you've proved the wrong thing.

There was a post a few months ago demonstrating this for various "proved" implementations of leftpad: https://news.ycombinator.com/item?id=45492274

This isn't to say it's useless; sometimes it helps you think about the problem more concretely and document it using known standards. But I'm not super bullish on "proofs" being the thing that keeps AI in line. First, like I said, they're easy to specify incorrectly, and second, they become incredibly hard to prove beyond a certain level of complexity. But I'll be interested to watch the space evolve.

(Note I'm bullish on AI+Lean for math. It's just the "provably safe AI" or "provably correct PRs" that I'm more skeptical of).


>But I'm not super bullish on "proofs" being the thing that keeps AI in line.

But do we have anything that works better than some form of formal specification?

We have to tell the AI what to do and we have to check whether it has done that. The only way to achieve that is for a person who knows the full context of the business problem and feels a social/legal/moral obligation not to cheat to write a formal spec.


Code review, tests, a planning step to make sure it's approaching things the right way, enough experience to understand the right size problems to give it, metrics that can detect potential problems, etc. Same as with a junior engineer.

If you want something fully automated, then I think more investment in automating and improving these capabilities is the way to go. If you want something fully automated and 100% provably bug free, I just don't think that's ever going to be a reality.

Formal specs are cryptic beyond even a small level of complexity, so it's hard to tell if you're even proving the right thing. And proving that an implementation meets those specs blows up even faster, to the point that a lot of stuff ends up being formally unprovable. It's also extremely fragile: one line code change or a small refactor or optimization can completely invalidate hundreds of proofs. AI doesn't change any of that.

So that's why I'm not really bullish on that approach. Maybe there will be some very specific cases where it becomes useful, but for general business logic, I don't see it having useful impact.


As a heavy user of formal methods, I think refinement types, rather than theorem proving with Lean or Isabelle, are both easier and more amenable to automation, without running into these pitfalls.

It's less powerful, but easier to break down and align with code. Dafny and F* are two good showcases. Less power makes it also faster to verify and iterate on.


Completely agree. Refinement types are a much more practical tool for software developers focused on writing correct real-world code.

Using Lean or Coq requires you to basically convert your code to Lean/Coq before you can start proving anything, and to import some complicated Hoare logic library. Proving things correct in Dafny (for example) feels much more like programming.


You have identified the crux of the problem: just like in mathematics, writing down the "right" theorem is often half or more of the difficulty.

In the case of digital systems it can be much worse because we often have to include many assumptions to accommodate the complexity of our models. To use an example from your context, usually one is required to assume some kind of fairness to get anything to go through with systems operating concurrently but many kinds of fairness are not realistic (eg strong fairness).


Could you write a blog post about your experience to make it more concrete?


I was having the same intuition, but you verbalised it better: the notion of having a definitive yes/no answer is very attractive, but describing what you need in such terms using natural language, which is inherently ambiguous... that feels like a fool's errand. That's why I keep thinking that LLM usage for serious things will break down once we get to the truly complicated things: its non-deterministic nature will be an unbreakable barrier. I wish I'm wrong, though.


Anakin: I'm going to save the world with my AI vulnerability scanner, Padme.

Padme: You're scanning for vulnerabilities so you can fix them, Anakin?

Anakin: ...

Padme: You're scanning for vulnerabilities so you can FIX THEM, right, Annie?


I assume that's why this is gated behind a request for access from teams / enterprise users rather than being GA

but there are open versions available built on the Chinese OSS models:

https://github.com/lintsinghua/DeepAudit


The GA functionality is already here with a crafted prompt or jailbreak :)


it's gone a bit unnoticed that they've stopped support for response prefilling in the 4.6 models :/


Definitely will be a fight against bad actors pulling bulk open-source software projects, npm packages, etc. and running this to find their own 0-days.

I hope Anthropic can place alerts for their team to look for accounts with abnormal usage pre-emptively.


You want frontier models to actively prevent people from using them to do vulnerability research because you're worried bad people will do vulnerability research?


Not at all. I was suggesting that if an account is performing source-code-level request scanning of "numerous" codebases, it could be an account of interest. A sign of misuse.

This is different than someone's "npm audit" suggesting issues with packages in a build and updating to new revisions. Also different than iterating deeply on the source code for a project (e.g. the nginx web server).


What's incredibly ironic is that research labs are releasing the most advanced hacking toolkit ever known, and cybersecurity defence stocks are going down as a result somehow. There’s no logic in the stock markets.


I don't understand the joke here.


A vuln scanner is dual-use.


It's an Internet trope — we could link to knowyourmeme, or link to the HN Guidelines


tl;dr - All this AI stuff is just Universal Paperclips[1]

I see a lot of comments about folks being worried about going soft, getting brain rot, or losing the fun part of coding.

As far as I'm concerned this is a bigger (albeit kinda flakey) self-driving tractor. Yeah I'd be bored if I just stuck to my one little cabbage patch I'd been tilling by hand. But my new cabbage patch is now a megafarm. Subjectively, same level of effort.

[1]: https://en.wikipedia.org/wiki/Universal_Paperclips


Author: "Not my favorite language"

Prolog: "Mistakes were made"

As an avid Prolog fan, I would have to agree with a lot of Mr. Wayne's comments! There are some things about the language that are now part of the ISO standard that are a bit unergonomic.

On the other hand, you don't have to write Prolog like that! The only shame is that there are 10x more examples (at least) of bad Prolog on the internet than good Prolog.

If you want to see some really beautiful stuff, check out Power of Prolog[1] (which Mr. Wayne courteously links to in his article!)

If you are really wondering why Prolog, the thing about it that makes it special among all languages is metainterpretation. No, seriously, would strongly recommend you check it out[2]

This is all that it takes to write a metainterpreter in Prolog:

  mi1(true).
  mi1((A,B)) :-
          mi1(A),
          mi1(B).
  mi1(Goal) :-
          Goal \= true,
          Goal \= (_,_),
          clause(Goal, Body),
          mi1(Body).
Writing your own Prolog-like language in Prolog is nearly as fundamental as for-loops in other languages.

[1] https://www.youtube.com/@ThePowerOfProlog

https://www.metalevel.at/prolog

[2] https://www.youtube.com/watch?v=nmBkU-l1zyc

https://www.metalevel.at/acomip/


I also have a strange obsession with Prolog and Markus Triska's article on meta-interpreters heavily inspired me to write a Prolog-based agent framework with a meta-interpreter at its core [0].

I have to admit that writing Prolog sometimes makes me want to bash my head against the wall, but sometimes the resulting code has a particular kind of beauty that's hard to explain. Anyways, Opus 4.5 is really good at Prolog, so my head feels much better now :-)

[0] http://github.com/deepclause/deepclause-desktop


>>I have to admit that writing Prolog sometimes makes me want to bash my head against the wall

I think much of the frustration with older tech like this comes from the fact that these things were mostly written (and rewritten till perfection) on paper first, and only the near-end program was input into a computer with a keyboard.

The modern way of carving out a program by 'successive approximations' with a keyboard and monitor until you get something to work is a recent phenomenon. Most of us are used to working like this, which quite honestly is mostly trial and error. The frustration is understandable because you are basically throwing darts, most of the time in the dark.

I knew a programmer from the 1980s (who built medical electronics equipment) who would tell me how even writing C worked back then. It was mostly writing a lot, on paper. You had to prove things on paper first.


>> I think much of the frustration with older tech like this comes from the fact that these things were mostly written(and rewritten till perfection) on paper first and only the near-end program was input into a computer with a keyboard.

I very much agree with this, especially since Prolog's execution model doesn't seem to go that well with the "successive approximations" method.


Before the personal computer revolution, compute time, and even development/test time, on large computers was rationed.

One can imagine how development would work in an ecosystem like that. You have to understand both the problem and your solution, and you need to be sure it would work before you start typing it out at a terminal.

This is the classic Donald Knuth workflow. He is away, disconnected from a computer for long periods of time, focused on the problems and solutions, working them out with pen and paper, until he has arrived at solutions that just work, correctly, and well enough to be explained in a textbook.

When you take this away, you also take away the need to put in the hard work required to make things work correctly. Take a look at how many Java devs out there try to use the wrong data structure for the problem, and then try to shoehorn their solution to roughly fit the problem. Eventually the solution does work for some acceptable inputs, and the remainder is left to be discovered by an eventual production bug. Stack Overflow is full of such questions.

Languages like Prolog just don't offer that sort of freedom. And you have to be in some way serious about what you are doing in terms of truly understanding both the problem and solution well enough to make them work.


> Languages like Prolog just don't offer that sort of freedom.

Yes, they do -- that's why people have enjoyed using such languages.

It might help to think of them as being like very-high-level scripting-languages with more rigorous semantics (e.g. homoiconicity) and some nifty built-ins, like Prolog's relational-database. (Not to mention REPLs, tooling, etc.)

Read, for example, what Paul Graham wrote about using Lisp for Viaweb (which became Yahoo Store) [0] and understand that much of what he says applies to languages like Prolog and Smalltalk too.

[0] https://www.paulgraham.com/avg.html


> ...these things were mostly written(and rewritten till perfection) on paper first and only the near-end program was input into a computer with a keyboard.

Not if you were working in a high-level language with an interpreter, REPL, etc. where you could write small units of code that were easily testable and then integrated into the larger whole.

As with Lisp.

And Prolog.


Personal computers are a thing from the late 1980s.

Even then PC use in businesses was fairly limited.

Prolog appeared in 1972.

Either way, before the PC, programming was nothing like it is today. It was mostly a math discipline. Math is done on paper.


The following is from David H.D. Warren's manual for DEC-10 Prolog, from 1979 [0]. It describes how Prolog development is done interactively, by loading code dynamically into an interpreter and using the REPL -- note that the only mention of using paper is if the developer wants to print out a log of what they did during their session:

Interactive Environment

Performance is all very well. What the programmer really needs is a good interactive environment for developing his programs. To address this need, DEC-10 Prolog provides an interpreter in addition to the compiler.

The interpreter allows a program to be read in quickly, and to be modified on-line, by adding and deleting single clauses, or by updating whole procedures. Goals to be executed can be entered directly from the terminal. An execution can be traced, interrupted, or suspended while other actions are performed. At any time, the state of the system can be saved, and resumed later if required. The system maintains, on a disk file, a complete log of all interactions with the user's terminal. After a session, the user can examine this file, and print it out on hard copy if required.

[0] https://softwarepreservation.computerhistory.org/prolog/edin...


But I wonder if that characterization is actually flattering for Prolog? I can't think of any situation, skill, technology, paradigm, or production process for which "doing it right the first time" beats iterative refinement.


>>"doing it right the first time" beats iterative refinement.

It's not iterative refinement which is bad. It's just that when you use a keyboard as a thinking device, there is a tendency to assume the first trivially working solution to be completely true.

This doesn't happen with pen and paper, as it slows you down. You get the mental space to think through a lot of things, exceptions, etc. Then even with iterative refinement you are likely to build something that is correct, compared to just committing the first typed function to the repo.


Being that prolog is from the 70s, I would guess you're a bit more careful with punch cards.


ROFL.

Like Lisp and Smalltalk, Prolog was used primarily in the 1980s, so it was run on Unix workstations and also, to some extent, on PCs. (There were even efforts to create hardware designed to run Prolog a la Lisp machines.)

And, like Lisp and Smalltalk, Prolog can be very nice for iterative development/rapid prototyping (where the prototypes might be good enough to put into production).

The people who dealt with Prolog on punchcards were the academics who created and/or refined it in its early days. [0]

[0] https://softwarepreservation.computerhistory.org/prolog/


I mean there are nearly two full decades between the appearance of Prolog (1972) and the PC revolution of the late 1980s and early 1990s.

>>The people who dealt with Prolog on punchcards were the academics who created and/or refined it in its early days. [0]

That's like a decade of work. That's hardly 'early days'.

Also the programming culture in the PC days and before that is totally different. Heck, even the editors from that era (e.g. vi) are designed for an entirely different workflow: lots of planning and correctness before you decide to input the code into the computer.


By 1979 at the latest -- probably closer to 1975 -- the primary Prolog implementation of the day (Warren's DEC-10 version) had an interpreter, where you could load files of code in and modify the code and you had a REPL with the ability to do all kinds of things.

I posted an excerpt of the manual, with a link to a PDF of it, in a reply to another comment [0]

(And, since even the earliest versions of Prolog were interpreted, they may've had features like this too).

And, as far as editors are concerned, versions of vi (and, of course, emacs) are still used to this day by people who don't necessarily do lots of planning and correctness before deciding to input the code into the computer.

[0] https://news.ycombinator.com/item?id=46664671


And one other thing: just because early Prolog interpreters were implemented on punchcards doesn't mean that Prolog programs run by those interpreters needed to be. It's quite possible that basically nobody ever wrote Prolog programs using punchcards, given that Prolog has the ability to read in files of code and data.


I'm assuming they were written on paper because they were commonly punched into paper at some stage after that. We tend to be more careful with non erasable media.


> Opus 4.5 is really good at Prolog

Anything you'd like to share? I did some research within the realm of classic robotic-like planning ([1]) and the results were impressive with local LLMs already a year ago, to the point that obtaining textual descriptions for complex enough problems became the bottleneck, suggesting that prompting is of limited use when you could describe the problem in Prolog concisely and directly already, given Prolog's NLP roots and one-to-one mapping of simple English sentences. Hence that report isn't updated to GLM 4.7, Claude whatever, or other "frontier" models yet.

[1]: https://quantumprolog.sgml.net/llm-demo/part1.html


Opus 4.5 helped me implement a basic coding agent in a DSL built on top of Prolog: https://deepclause.substack.com/p/implementing-a-vibed-llm-c.... It worked surprisingly well. With a bit of context it was able to (almost) one-shot about 500 lines of code. With older models, I felt that they "never really got it".


This is the sort of comment I'm on HN for. Information, especially links to appropriate resources, that only a true practitioner can offer.


Indeed. Favorited it. My Prolog is too rusty to understand it all, but even just skimming the metainterpretation article was enlightening.


Same here. The metainterpretation stuff is fascinating but dense.


Ok what I would really love is something like this but for the damn terminal. No, I don't store credentials in plaintext, but when they get pulled into memory after being decrypted you really gotta watch $TERMINAL_AGENT or it WILL read your creds eventually and it's ever so much fun explaining why you need to rotate a key.

Sure, go ahead and roast me, but please include the foolproof method you use to make sure that never happens while still allowing you to use credentials for developing applications in the normal way.


If you store passwords encrypted at rest à la my SecureStore, this isn’t an issue.

https://github.com/neosmart/securestore-rs


A really simple one is traversing a linked list (or any naturally recursive data structure, such as a dictionary or tree). It is very natural to traverse a recursive data structure recursively.

