It seems like the problem with Wayland is that things that used to be solved by merging some code into X are now a big social synchronization problem, i.e. getting everybody to agree on a protocol.
I believe it is a more fundamental “problem” due to the bazaar vs. cathedral model. The Linux ecosystem can’t simply declare that X or Y will be the way forward like Apple can, so some inefficiency is inherent in standardization.
Doing so requires writing a server and library that everyone can agree on, which probably will not happen and is likely one of the main reasons they did not do it in the first place.
Please see one of my other replies here. When implementing an X11 compositor, those parts were already done separately anyway, and this is actually the entire point of X11 compositing. So the trend with X11 was already moving away from the direction that you suggest.
There is a remote possibility that some person comes along and builds a server and library that will work for everybody. But such a thing would probably be very large, much larger than systemd. Actually, the closest thing to what you describe is probably a web browser and HTML5, but building a desktop on that does not seem to be popular outside of Chrome OS.
You still had the same problems in X. It was a bad idea to try to get protocol code merged into X unless you could get everyone to agree on it, because ultimately, in either case, it is still the same set of people who will be using any new protocols.