But why? Isn't most of the information you can extract from those tags pretty obvious stuff, like title and author (the examples the linked page uses)? How do you extract really useful information that way, enough to support searches that answer queries like "110 volt socket accepting grounding plugs"? Of course search engines can (and do) get such info, but afaik they don't require or use XSLT for it beyond extracting the plain text.