If you did all of that, you'd still fall well short of what you have been able to do with XML for probably 10 years. XSD defines structure in terms of inheritance, composition, strong types, numerical ranges, enumerations, referential integrity, and namespaces.
Requiring that a markup language (one that must be sent across the wire between multiple buggy parser implementations) be hand-editable is a recipe for disaster, and hand-editing will never be as flexible as building or modifying an appropriate AST in a scripting language.
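To make the "build the AST in a scripting language" point concrete, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`. The `order`/`item` element names and the `id`, `sku`, and `qty` attributes are made up for illustration; nothing here comes from a real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element and attribute names are illustrative only.
order = ET.Element("order", {"id": "42"})
item = ET.SubElement(order, "item", {"sku": "A-1"})
item.text = "Widget & <gadget>"  # the serializer escapes & and < for us

# Modify the tree programmatically instead of editing markup by hand.
for node in order.iter("item"):
    node.set("qty", "3")

# Serialize to a string, then parse it back to confirm the round trip.
xml_text = ET.tostring(order, encoding="unicode")
parsed = ET.fromstring(xml_text)
```

Because the serializer handles escaping and nesting, the awkward text content survives the round trip without anyone having to type `&amp;` by hand.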
There's not even a good reason to read HTML anymore. You hit Ctrl+Shift+C in Firefox (the same shortcut works in Chrome) and you view the actual machine-parsed DOM structure. The only reason human-editable/readable formats were ever necessary is a lack of appropriate dev tooling.
Those things aren't a goal or desired feature of most JSON use. There's a reason people hated SOAP. XML is better suited for documents. For everything else, XML doesn't work very well.