As far as I can tell, you're only missing two things:

1. It's five "Unicode scalars"; that's the name for the top-level logical unit. The term "code points" technically refers to a lower-level concept, one that varies across encodings, just not as much as the number of bytes. I didn't know that, and it's the helpful thing I learned from this article. UPDATE: And it's also not true, sorry. "Code units" are the lower-level concept from the article; "code points" are a more expansive category at the same level as scalars (every scalar is a code point, but surrogate code points are not scalars): https://www.unicode.org/versions/Unicode10.0.0/ch03.pdf#G740...

2. The author takes it as an unstated assumption that top-level logical structure is useless, because any specific usage either ignores all structure or has a point at which low-level structure comes into play. (That assumption is false: top-level structure is useful for keeping track of what you are doing, and as a sort of "common currency" for translating between different low-level representations. For example, see the very first table in the article.)
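
To make both points concrete, here is a rough Python sketch. The sample string is my own assumption (five scalars written as escapes, not necessarily the article's exact example), and `code_unit_counts` and `is_scalar_value` are hypothetical helper names. It relies on Python's `len()` counting code points, which equals the scalar count here only because the string contains no lone surrogates.

```python
# Assumed sample: five Unicode scalar values, written as escapes.
SAMPLE = "\U0001F926\U0001F3FC\u200D\u2642\uFE0F"


def is_scalar_value(code_point: int) -> bool:
    """A scalar value is any code point except the surrogates U+D800..U+DFFF."""
    in_range = 0 <= code_point <= 0x10FFFF
    is_surrogate = 0xD800 <= code_point <= 0xDFFF
    return in_range and not is_surrogate


def code_unit_counts(text: str) -> dict:
    """Count scalars, plus code units in each of three encoding forms."""
    return {
        "scalars": len(text),                          # same in every encoding
        "utf8": len(text.encode("utf-8")),             # 1-byte code units
        "utf16": len(text.encode("utf-16-le")) // 2,   # 2-byte code units
        "utf32": len(text.encode("utf-32-le")) // 4,   # 4-byte code units
    }


counts = code_unit_counts(SAMPLE)
print(counts)

# "Common currency": decoding either byte form yields the same scalar sequence.
from_utf8 = SAMPLE.encode("utf-8").decode("utf-8")
from_utf16 = SAMPLE.encode("utf-16-le").decode("utf-16-le")
print([f"U+{ord(ch):04X}" for ch in from_utf8])
print(from_utf8 == from_utf16)
```

The code unit count changes with the encoding, while the scalar count stays at five, which is what lets the scalar sequence act as the shared layer between representations.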