I don’t think it is a good book. From a didactic point of view, I actually found it one of the worst resources out there. The math intro at the beginning is too superficial - either you know it and skip the chapter, or you need another resource to learn it. The rest of the first part is okay, but parts 2 and 3 are really not very helpful to someone who doesn’t already understand the material.
I strongly recommend fast.ai instead. Although often looked at as the resource for people who can’t deal with the math, I actually found it to be extremely good at explaining the math. Compare, for example, the deep learning book’s explanations of various gradient descent methods with Jeremy Howard’s explanation - in the book it looks very complex, whereas in the course it’s actually really intuitive. And Jeremy doesn’t gloss over things, he actually implements the various gradient descent methods in Excel (!).
I wasn't sure if I was just dumb or if everyone recommending this book just kept it on their shelves and talked about it, sort of like Knuth or CLRS. I don't necessarily think it is a bad book, but it is hard to think of a better one to give someone to discourage them from getting into deep learning.
I started with a top-down approach via the fast.ai courses and learning Keras, then spent time brushing up on some of the math concepts (as you said, it assumes a fair amount of previous knowledge), and then went back and started re-reading it, and I finally feel like I'm starting to get some value out of it.
Definitely wouldn't recommend it as a first book, though.
I finally started looking into fast.ai and the setup seems to be a lot of hassle. It doesn't help that the course dismissively says "just buy server time even if you have a GPU for this."