For those in the ML area, what are people’s opinions on PyTorch and its use in comparison to its competition? I don’t have any experience with PyTorch or ML tech besides having to package and provide PyTorch containers for our university’s HPC cluster and running the helloworld.py against it for validation.
So we have been a TensorFlow shop since well before 1.0. We are keeping tabs on PyTorch with an eye toward moving in that direction as certain gaps get filled, notably serving and something really comparable to tf.data.
The fact that our team in particular is looking at moving is probably a disaster for TF, since we didn't see a number of the criticisms typically leveled at it as issues. We have a couple of really strong people with Haskell backgrounds, so we didn't find static graphs and laziness problematic.
However, the embrace of Keras in 2.0 has left us dumbfounded. On one hand, having a consistent layer interface is nice. On the other hand, the base class for the loss function is not sufficiently general, every non-toy model we build seems to need model subclassing and a custom training loop with GradientTape, and the number of issues we ran into while porting a couple of models led me to conclude that the release was not ready. So while we like the tools around the model (tf.data, TensorBoard, serving, TFX, etc.), I think building actual models has gotten worse.
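For anyone who hasn't hit this, here is a minimal sketch of the pattern I mean — a subclassed model trained with a hand-written GradientTape loop. The toy model and synthetic regression data are invented for illustration, not from our codebase:

```python
import tensorflow as tf

tf.random.set_seed(0)

# Subclassed model: needed once you step outside what Sequential/functional covers.
class TinyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(1)

    def call(self, x):
        return self.dense(x)

model = TinyModel()
opt = tf.keras.optimizers.SGD(learning_rate=0.1)

# Synthetic problem: predict the sum of the input features.
x = tf.random.normal((32, 3))
y = tf.reduce_sum(x, axis=1, keepdims=True)

# The custom training loop: tape the forward pass, apply gradients by hand.
for step in range(100):
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(model(x) - y))
    grads = tape.gradient(loss, model.trainable_variables)
    opt.apply_gradients(zip(grads, model.trainable_variables))
```

None of this is hard in isolation; the complaint is that you end up writing it for nearly every serious model, rather than it being the exception.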
Now, my opinions on PyTorch are not from shipping production models but mostly from porting models to TF and keeping tabs on what they are doing. PyTorch also makes it easy to define reusable units. It does not try to expose a higher-level interface that requires a significant investment in learning before you can express complex or unusual models. It seems a bit less opinionated about what the user should do.
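As an illustration of the "reusable units" point, here is a hypothetical residual block written as a plain nn.Module. Because execution is eager, ordinary Python — prints, breakpoints, conditionals — works between layers:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """A small reusable unit: linear + ReLU with a skip connection."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)
        self.act = nn.ReLU()

    def forward(self, x):
        # Plain Python runs here: print(x.shape), breakpoints, if/else on values.
        return x + self.act(self.lin(x))

# Units compose like any other Python objects.
net = nn.Sequential(ResidualBlock(8), ResidualBlock(8), nn.Linear(8, 1))
out = net(torch.randn(4, 8))
```

There is no separate graph-construction vocabulary to learn; a module is just a class with a `forward` method.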
A couple of other notes: PyTorch is being used inside Google for research, I think. They have now written several papers (including one with Jeff Dean as fifth author) that have had their code released in PyTorch. I think PyTorch will end up with better governance (it might already have it), but I would be interested in others' opinions. They have at least one person listed among the project maintainers who does not work for Facebook. A reason for adopting PyTorch may be that no single company can just decide to radically change the project to fit their view of the world.
This last bit is purely conjecture. I think PyTorch has already won over TF, and it is going to take a couple of years for that to play out. If I had to bet today, I would bet that PyTorch becomes the dominant framework for both research and production. Of course something could happen to derail that, but if things continue on their current trajectories I think it's inevitable.
I echo this. While 2.0 initially made me happy building some models from scratch, as soon as I needed to write my own optimizers, losses, class weights, etc., it became a nightmare. Not to mention that the data pipeline for large image sets required you to serialize the data into TFRecords first. Please just let me stream AND shuffle images in a folder, TensorFlow.
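The pattern being asked for is simple enough to sketch in plain Python — a hypothetical loader that shuffles file paths once per epoch and lazily yields batches, with no serialization step (decoding is left to the caller):

```python
import random
import tempfile
from pathlib import Path

def stream_batches(folder, batch_size, seed=None, exts=(".jpg", ".jpeg", ".png")):
    """Lazily yield shuffled batches of image file paths from a folder.

    Nothing is serialized up front; decoding (e.g. with PIL) is the caller's job.
    """
    paths = sorted(p for p in Path(folder).iterdir() if p.suffix.lower() in exts)
    rng = random.Random(seed)
    rng.shuffle(paths)  # one shuffle per epoch
    for i in range(0, len(paths), batch_size):
        yield paths[i:i + batch_size]

# Tiny demo on a throwaway folder of empty "images".
demo = Path(tempfile.mkdtemp())
for i in range(10):
    (demo / f"img{i}.jpg").touch()
batches = list(stream_batches(demo, batch_size=3, seed=0))
```

(tf.data does have Dataset.list_files plus shuffle, but the documented path for large datasets still leans on TFRecords.)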
PyTorch is pretty dominant in research (outside Google), but its use in production is lagging behind, primarily due to TensorFlow's excellent support for deployment in all kinds of environments.
I would say it's top 3, probably top 2 with potential to get better.
PyTorch is the research favorite: it's easy to debug tensors and interleave a bunch of arbitrary code between layers if you need to. Keras is the easiest for getting a model running in production, but for some advanced things, like sophisticated custom losses, you might be forced to switch over to base TensorFlow. TensorFlow is the production king, but it's super complex unless you use tf.keras, and difficult to debug.
What kind of knowledge should you have before getting into this book? I've been meaning to try to learn ML and have been looking at some university courses that have all the material available.
Goodfellow’s book on deep learning[0] is a good starter - the first chapters give a solid overview of ML theory as well. Elements of Statistical Learning is another.
I don’t think it is a good book. From a didactic point of view, I actually found it one of the worst resources out there. The math intro at the beginning is too superficial: either you already know it and skip the chapter, or you need another resource to learn it. The rest of the first part is okay, but parts 2 and 3 are really not very helpful to someone who doesn’t already understand the material.
I strongly recommend fast.ai instead. Although often looked at as the resource for people who can’t deal with the math, I actually found it to be extremely good at explaining the math. Compare, for example, the deep learning book’s explanations of various gradient descent methods with Jeremy Howard’s: in the book they look very complex, whereas in the course they’re actually really intuitive. And Jeremy doesn’t gloss over things, he actually implements the various gradient descent methods in Excel (!).
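As a flavor of how small these methods are once demystified, here is SGD with classical momentum in a few lines of NumPy, applied to a toy quadratic (the function and hyperparameters are chosen purely for illustration):

```python
import numpy as np

def sgd_momentum(grad, w0, lr=0.1, beta=0.9, steps=100):
    """Minimise a function given its gradient, using classical momentum."""
    w = np.asarray(w0, dtype=float)
    v = np.zeros_like(w)
    for _ in range(steps):
        v = beta * v - lr * grad(w)  # velocity accumulates past gradients
        w = w + v                    # step along the velocity
    return w

# Toy problem: f(w) = ||w||^2, so grad f(w) = 2w; the minimum is at the origin.
w_star = sgd_momentum(lambda w: 2 * w, [3.0, -2.0])
```

Nesterov, RMSProp, Adam and friends are variations on the same few lines of state-update bookkeeping, which is exactly what the Excel spreadsheets make visible.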
I wasn't sure if I was just dumb or if everyone recommending this book just kept it on their shelves and talked about it, sort of like Knuth or CLRS. I don't necessarily think it is a bad book, but it is hard to think of a better one to give someone to discourage them from getting into deep learning.
I started with a top-down approach via the fast.ai courses and learning Keras, then spent time brushing up on some of the math concepts (as you said, it assumes a fair amount of previous knowledge), and then went back and started re-reading it, and I finally feel like I'm starting to get some value out of it.
Definitely wouldn't recommend it as a first book, though.
I finally started looking into fast.ai and the setup seems to be a lot of hassle. It doesn't help that the course dismissively just says "just buy server time even if you have a GPU for this."
I retired this year, but at my last job I managed a deep learning team. I know many people who own this book, but no one, including myself, who has read it. Personally, the value I got from it was the first section on math, plus picking and choosing limited material to use as a reference or overview.
I read it! But then, I love reading technical books.
I did find that it didn't provide much context around why the equations matter, and definitely wouldn't be useful for those starting out in the field. It did have some pretty good coverage of gradient descent and various optimisers, which I found useful.
tl;dr: not really worth it for its stated purpose, but not a bad second or third stats book.
The idea is that you should know basic maths (say what matrix multiplication does and what a vector space is) and a bit of Python.
We work very hard on keeping everything hands-on from there.
(Disclaimer: I'm not unbiased.)
I have used TensorFlow almost exclusively for my deep learning work over the last five years, and using TensorFlow is much more common, largely, I think, because Andrew Ng’s excellent five-part course used it. On the other hand, Jeremy Howard’s wonderful free classes mostly use PyTorch, so maybe my argument does not hold.
For research and for use in published papers, PyTorch is much more widely used.
Which estimate are you looking at? The main thing I see is:
“PyTorch and TensorFlow for Production
Although PyTorch is now dominant in research, a quick glance at industry shows that TensorFlow is still the dominant framework. For example, based on data from 2018 to 2019, TensorFlow had 1541 new job listings vs. 1437 job listings for PyTorch on public job boards, 3230 new TensorFlow Medium articles vs. 1200 PyTorch, 13.7k new GitHub stars for TensorFlow vs 7.2k for PyTorch, etc.”
That suggests roughly 1:1 for jobs, 2:1 for GitHub stars, and 3:1 for articles on Medium. It’s really hard to say whether any of those reflect production use in a meaningful way, but if so, I’d suggest that jobs and/or GitHub stars might be more useful signals than articles on Medium.
The job numbers suggest where the trend is going. The GitHub and article counts seem more useful as indicators of the current state. When looking at those numbers, I'd rather pick the more conservative one.