
“Seeking an improvement that makes a difference in the shorter term, researchers seek to leverage their human knowledge of the domain, but the only thing that matters in the long run is the leveraging of computation.”

http://www.incompleteideas.net/IncIdeas/BitterLesson.html



This is a great read. It matches what I felt the last time I trained a CNN -- it's not fun, and I don't get to feel clever. My brain isn't wired to give me a dopamine hit when the training does its job. It's just a, "wait, that's it?"

We will always want to do the discovery ourselves, and I can see why fighting that instinct is a challenge for those in the field.


Isn't the CNN a discovery in itself? Without it, we'd be following the bitter lesson and "leveraging computation" to throw more data / compute at an MLP.

Clearly someone felt that there'd be a better inductive bias and attempted something else, and now CNNs are what's used "in the long run".
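The inductive bias being argued about here is concrete: a convolution shares one small kernel across every spatial position, while an MLP learns a separate weight for every input–output pair. A minimal sketch of the difference in parameter count, assuming a hypothetical 32x32 RGB input and a 16-channel feature map (all sizes are made up for illustration):

```python
# Hypothetical input: 32x32 RGB image, 16 output channels.
H, W, C_in, C_out = 32, 32, 3, 16

# MLP: one fully connected layer mapping every input pixel to an
# equally sized output feature map (weights only, no biases).
dense_params = (H * W * C_in) * (H * W * C_out)

# CNN: one 3x3 convolution producing the same number of channels;
# the kernel is shared across all spatial positions.
k = 3
conv_params = (k * k * C_in) * C_out

print(dense_params)  # 50331648
print(conv_params)   # 432
```

That five-orders-of-magnitude gap is the "human knowledge" (translation invariance, locality) baked into the architecture — exactly the kind of domain insight the Bitter Lesson says gets washed out, yet also the thing that made image models trainable at all.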


I've seen many interpretations of this article and I'm curious as to the mainstream CS reading of it.

One could look at the move from linear models to non-linear models, or the adoption of ConvNets (yes, I know ViTs exist; to my knowledge their base layers are still convolutional), as 'leveraging human knowledge'. Only after those shifts were made did the leveraging of computation help. It would seem to me that the naive reading of that quote only rings true between breakthroughs.
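The linear-to-non-linear shift mentioned above is a good illustration of a breakthrough that no amount of computation could substitute for. XOR is the classic case: a sketch (using NumPy least squares purely for illustration) showing that the best possible linear fit fails, while one hand-chosen non-linear feature — the "human knowledge" step — makes the same fitting procedure solve it exactly:

```python
import numpy as np

# XOR truth table: not linearly separable.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])

# Best linear model (with bias term) in the least-squares sense.
A = np.hstack([X, np.ones((4, 1))])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(A @ w, 3))  # every prediction collapses to 0.5

# Add one hand-crafted non-linear feature (x1 * x2): now the same
# fitting procedure recovers XOR exactly.
A2 = np.hstack([A, (X[:, 0] * X[:, 1]).reshape(-1, 1)])
w2, *_ = np.linalg.lstsq(A2, y, rcond=None)
print(np.round(A2 @ w2, 3))  # [0. 1. 1. 0.]
```

More compute on the linear model gets you nothing; the feature change is the breakthrough. The Bitter Lesson's point is arguably that such features should be *learned* rather than hand-designed — but someone still had to build the architecture that learns them.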


See also the GPT-4 technical report, page 37.

https://images.app.goo.gl/vRP8368Z17zW2hvC9




