
“Seeking an improvement that makes a difference in the shorter term, researchers seek to leverage their human knowledge of the domain, but the only thing that matters in the long run is the leveraging of computation.”

http://www.incompleteideas.net/IncIdeas/BitterLesson.html



This is a great read. It matches what I felt the last time I trained a CNN -- it's not fun, and I don't get to feel clever. My brain isn't wired to give me a dopamine hit when the training does its job. It's just a, "wait, that's it?"

We will always want to do the discovery ourselves, and I can see why fighting that instinct is a challenge for those in the field.


Isn't the CNN a discovery in itself? Without it, we'd be following the bitter lesson and "leveraging computation" to throw more data / compute at an MLP.

Clearly someone felt that there'd be a better inductive bias and attempted something else, and now CNNs are what's used "in the long run".
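The inductive bias being argued about here is concrete: a convolution shares one small kernel across every spatial position, while an MLP learns a separate weight for every input–output pair. A minimal sketch of the difference in parameter count, assuming a hypothetical 32x32 RGB input and a 16-channel feature map (all sizes are made up for illustration):

```python
# Hypothetical input: 32x32 RGB image, 16 output channels.
H, W, C_in, C_out = 32, 32, 3, 16

# MLP: one fully connected layer mapping every input pixel to an
# equally sized output feature map (weights only, no biases).
dense_params = (H * W * C_in) * (H * W * C_out)

# CNN: one 3x3 convolution producing the same number of channels;
# the kernel is shared across all spatial positions.
k = 3
conv_params = (k * k * C_in) * C_out

print(dense_params)  # 50331648
print(conv_params)   # 432
```

That five-orders-of-magnitude gap is the "human knowledge" (translation invariance, locality) baked into the architecture — exactly the kind of domain insight the Bitter Lesson says gets washed out, yet also the thing that made image models trainable at all.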


I've seen many interpretations of this article and I'm curious as to the mainstream CS reading of it.

One could look at the move from linear models to non-linear models, or the adoption of ConvNets (yes, I know ViTs exist; to my knowledge their base layers are still convolutional), as 'leveraging human knowledge'. Only after those shifts were made did the leveraging of computation help. It would seem to me that the naive reading of that quote only rings true between breakthroughs.
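The linear-to-non-linear shift mentioned above is a good illustration of a breakthrough that no amount of computation could substitute for. XOR is the classic case: a sketch (using NumPy least squares purely for illustration) showing that the best possible linear fit fails, while one hand-chosen non-linear feature — the "human knowledge" step — makes the same fitting procedure solve it exactly:

```python
import numpy as np

# XOR truth table: not linearly separable.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])

# Best linear model (with bias term) in the least-squares sense.
A = np.hstack([X, np.ones((4, 1))])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(A @ w, 3))  # every prediction collapses to 0.5

# Add one hand-crafted non-linear feature (x1 * x2): now the same
# fitting procedure recovers XOR exactly.
A2 = np.hstack([A, (X[:, 0] * X[:, 1]).reshape(-1, 1)])
w2, *_ = np.linalg.lstsq(A2, y, rcond=None)
print(np.round(A2 @ w2, 3))  # [0. 1. 1. 0.]
```

More compute on the linear model gets you nothing; the feature change is the breakthrough. The Bitter Lesson's point is arguably that such features should be *learned* rather than hand-designed — but someone still had to build the architecture that learns them.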


See also the GPT-4 technical report, page 37.

https://images.app.goo.gl/vRP8368Z17zW2hvC9




