
I understand where the author is coming from and respect their contributions to Commonmark.

But...

There are tons of markup languages for prose that have well-defined specs.

So, why did Markdown win?

IMO, because it does not have a well-defined spec. It is highly tolerant of formatting errors, inconsistencies, etc. If an author makes a mistake when writing Markdown, you can always look at it in plain text.

Whereas a perfectly-spec'd markup language would probably evolve toward an unreadable-to-humans mess in the committee-driven pursuit of precision.

You see this theme in so many places in tech: "less is more", the Unix philosophy of everything-is-a-file, messy HTML5 over "XHTML", ML extraction vs. explicit semantic web, etc.



> IMO, because it does not have a well-defined spec.

Same reason that JSON won.

JSON and Markdown are base standards that emerged from a market need to simplify.

JSON won because it was not overly complex and there was some flexibility. If you need more, go to YAML, or use JSON as a platform to build more on top of.

Every attempt to change JSON has been, and should be, shot down. JSON really just has basic CS types: string, int/number, bool, object, list. From there, any data or type can be serialized. With JSON you can express types via overloads/additional keys, attach files by URL/URI or base64, and cover any additional needs using parts of basic JSON. Even large numbers can just be strings, with type definitions as additional keys/patterns. Financial data can just use strings, or integers with no decimal point, largely because that is the safest way to store financial data and avoid float issues.
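
For illustration, a rough TypeScript sketch of those conventions; the field names (amount_minor, uint128, etc.) are made up for the example, not part of any standard:

  const payload = {
    "price": { "currency": "USD", "amount_minor": 1099 },  // integer cents, no floats
    "order_id": { "type": "uint128",
                  "value": "340282366920938463463374607431768211455" },  // big number as a string
    "attachment": { "uri": "https://example.com/invoice.pdf" }  // or a base64-encoded "data" key
  };
  console.log(JSON.stringify(payload));  // plain JSON on the wire, nothing extra needed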

KISS is life and sometimes things are just done, no improvements needed. Now you can take JSON and add things on top of it if you want. Same with Markdown. The base doesn't need to change... ever.

Don't SOAP my JSON. Don't HTML my Markdown. Though you can add specs (JSONSchema/OpenAPI) and formatting tools on top in a processing step. For messaging and base content, they are perfect: simple, clear, concise, and with no need to change.


I think JSON and Markdown are very different, in fact.

JSON is very strict. It won't let you have a comma after the last element of a list, for instance (which is very annoying in many cases). It won't let you add comments in any way, shape or form. It won't let you use single quotes instead of double quotes. Or forget quotes in keys. Or mess with case in null / true / false. Or use NaN values.
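
For the record, a quick sketch of what the built-in JSON.parse rejects (assuming a standards-compliant parser; every entry below throws a SyntaxError):

  const rejected = [
    '[1, 2, 3,]',          // trailing comma after the last element
    '{"a": 1 /* hi */}',   // comments, in any form
    "{'a': 1}",            // single quotes instead of double quotes
    '{a: 1}',              // unquoted keys
    '{"a": True}',         // wrong case for true/false/null
    '{"a": NaN}'           // NaN (and Infinity) are not valid JSON values
  ];
  for (const text of rejected) {
    try { JSON.parse(text); } catch (e) { console.log("rejected:", text); }
  }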

Markdown is ill-defined, and will happily let you do whatever the hell you want.

JSON is made for programs, and is a PITA to write as a human (for the reasons mentioned above). But a pleasure to parse and (to some extent) generate automatically. It's not very good with text.

Markdown is made for humans, and I'd hate to have to parse a markdown file and do something with its content other than basic formatting. It's bad at anything but text.


JSON won because parsing it in the browser was just a call to “eval()”, and then you access the object using normal JS conventions/syntax (e.g. data.foo[0].bar). Whereas XML required creating a DOM parser and document fragment, and then using cumbersome DOM methods like “getElementsByTagName()” to get each value (or, worse, XPath). It totally sucked.

Native support for JSON parsing and stringifying helped when it came later. The Selectors API, which also came later, made XML parsing a little easier if you didn’t want to use XPath, but by then most things were JSON anyway.
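
A browser-side sketch of that contrast (the payload and element names are invented for the example):

  // JSON: one call (originally just eval(jsonText)), then plain property access.
  const data = JSON.parse('{"foo": [{"bar": 42}]}');
  console.log(data.foo[0].bar);  // 42

  // XML: build a document, then walk it with DOM methods.
  const doc = new DOMParser().parseFromString(
    "<foo><item><bar>42</bar></item></foo>", "application/xml");
  console.log(doc.getElementsByTagName("bar")[0].textContent);  // "42"

  // The later Selectors API shortened that a bit:
  console.log(doc.querySelector("item > bar")?.textContent);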


It should at least have comments. Then it can freeze.


You can make a separate file that has the comments, or even another JSON file that carries descriptions keyed by JSONPath or key.

Some libs also support comments and strip them before processing, but I prefer the external/metadata way. Comments add weight.
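
A small sketch of the external-metadata approach; the file names and the JSONPath-keyed layout are just one possible convention, not anything standardized:

  // config.json stays strictly standard JSON, no comments:
  const config = { "retries": 3, "timeout_ms": 5000 };

  // config.meta.json ships alongside it, with descriptions keyed by JSONPath:
  const configMeta = {
    "$.retries": "How many attempts before giving up.",
    "$.timeout_ms": "Per-request timeout in milliseconds."
  };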


> Every attempt to change JSON has been, and should be, shot down.

I really wish JSON allowed for final trailing commas in arrays/objects.

It would make for more readable diffs, simpler text templating, easier writing/parsing for us humans, etc. I'd happily trade all of TOML, YAML, XML, and every other similar format in existence for that one change.


It makes generating from templates in certain (many!) instances needlessly difficult. I say needlessly, because the rule is seemingly arbitrary. I can't see what purpose it serves.
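
For example, a generator can't simply emit an element followed by a comma every time; the usual workaround is joining (a sketch, with invented data):

  const users = [{ id: 1 }, { id: 2 }, { id: 3 }];
  // Emitting '{"id": ...},' per element would leave a trailing comma and
  // invalid JSON, so templates/generators have to join instead:
  const body = "[\n  " + users.map(u => JSON.stringify(u)).join(",\n  ") + "\n]";
  console.log(body);  // valid JSON, no trailing comma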



Nice, I didn't know there was a term for this.

I completely agree. My favourite software is not just functional, it also is opinionated and expresses a philosophy on how to do something. Simply adding flexibility forever in a quest to be useful for everyone ends up making it useful for no-one.


This is perhaps only correct in that a loosey-goosey proposal can spread farther because it is seemingly simple to implement (fewer MUSTs and whatnot), and by the time you notice inconsistencies between implementations, the thing has already reached a sort of critical mass, and the implementations aren’t that inconsistent, so you just shrug and say whatever.

But in the case of Markdown, the original implementation was just not that great. Which has nothing to do with being easier; MacFarlane’s Djot is an easier-to-implement and easier-to-describe language.

And of course your point about “committee-driven pursuit of precision” is just a made-up hypothetical, which is not worth responding to. (The only committee has been around CommonMark, which is a definition of “Markdown” (TM) that merely tries to deal with years of drift between different Markdown implementations. With its famously long-winded spec-by-prose-enumeration style.)


Asciidoctor has a spec, reads pretty similarly to markdown, and is infinitely better IMO. And it (well, AsciiDoc) predated markdown!

I think markdown won because it was specifically made with HTML output in mind, instead of arbitrary output (docbook, in the case of AsciiDoc, which is pretty much infinitely malleable).


The Asciidoctor flavour of AsciiDoc doesn't have a specification. There is only a working group. The parsers are a mess composed of regular expressions.

There are in effect two different versions of AsciiDoc, because Asciidoctor people have appropriated the name while making their own changes to it and marking what they dislike as deprecated.

AsciiDoc cannot express all of DocBook, for example figures with multiple images.

While I despise Markdown, there isn't all that much to be a fanboy of; it's just that the syntax is overall saner.


Ah, DocBook //imageobjectco with something like calspair as well. I've been wanting it badly, but there's zero movement in the Asciidoctor group to try and tackle that beast.

With all due respect, and speaking as an amateur programmer, when it comes to lightweight markup, is there a better way to write a parser besides regular expressions? I suppose it's how the semantics are abstracted.

Asciidoc does get you conditionals and transclusion in the core spec, without needing to resort to extensions. This is what brought me over. That and the XML interoperability.

The Eclipse WG spec isn't published yet, but, in my opinion, it's a more stable surface to build on than the "many worlds" of Markdown.

Every time someone shows me a cool markdown trick, it requires me to pull something down from GitHub and `npm install` (or equivalent). But, well, that's kind of the point, isn't it? Markdown's ease of implementation allows a degree of glorious hackery that's just not possible otherwise. While Asciidoctor's great albatross - and its great asset - is Ruby... which inevitably involves Opal at some point.


You are completely right. The underlying theme here is that the requirements matter.

The requirement for Markdown is to be simple and easy. It's intended for use by people who are going to ignore whatever specs and documentation there are. They'll write a little comment, a bug ticket, or a readme, and they might need things like links, bold, italic, etc. And the job is to turn that into some legible HTML. So most of its features are simple and easy to remember. Just add a blank line for a new paragraph, prefix your bullets with a -, and so on.

Markdown is undeniably simple and easy to learn, which is why it got so popular. It has edge cases, but they don't really matter. It has obscure features (e.g. tables) most people don't use, so those don't matter either. And there's a wide range of things it can't do that also don't matter. The job was never to be a drop-in replacement for more complex tools. It was to remove the need for those in the simple use cases and simply be good enough.

The alternatives each chase requirements that are important to their creators but not to most casual users, or indeed to the people who integrate markup tools. And of course, the more these alternatives differ from Markdown, the harder a sell they become. And the more there are, the less likely it is for any of them to become more popular than markdown. At this point, markdown is a common default in things like issue trackers and readmes on GitHub/GitLab. Any tool integrating some kind of markup language support in its content management is more likely to be using markdown than anything else.

The reason is simply that using something else breaks the principle of least surprise for the user. Markdown is the largest common denominator. It's good enough and easy enough to deal with. So most new things will favor it over anything else. It's a self-reinforcing thing.


> largest common denominator

Or the lowest.

This is how populist politics works. The thing that appeals to the most people isn't necessarily the thing we should be doing.

The internet and web appealed to a small percentage of people in the early 90s, and it was glorious. You had to put in effort to get anything out, which meant most people didn't bother, which meant it was a nice place. The music industry similarly had a high level of entry. Both are filled with crap now.

Elitist old man shouting at clouds? Maybe. Doesn't mean I'm wrong though.


These things don't win on engineering merits. Markdown wasn't better than others. It was like a bunch of others. It's just natural that one form of communication becomes a monopoly because people want to be able to talk to as many people as possible.

You only need to be good enough to enter this kind of competition... and win. The reasons you might win can be many arbitrary things, like someone deciding to adopt a practice in a large organization, or dedicating efforts to writing parsers in many languages etc.


> Whereas a perfectly-spec'd markup language would probably evolve toward an unreadable-to-humans mess in the committee-driven pursuit of precision.

Maybe, and I mean that sincerely... but are you just saying this must happen, or can you actually point to where MacFarlane's proposals would make a significantly less pleasant language?


Requiring a blank line before a sublist just looks wrong.


I couldn’t figure out what was meant by a sublist. Like any hierarchy? Or just list-in-paragraph-in-list, not list-in-list? That one could use some HTML disambiguation in the article.

EDIT: yeah it’s always required. That kills me.


> Whereas a perfectly-spec'd markup language would probably evolve toward an unreadable-to-humans mess in the committee-driven pursuit of precision.

This proposal shows us a clear step in that direction. To emphasize part of a word, we go from something simple and easy for humans to understand, but with a complex implementation:

  fan*tas*tic
to something with a simple implementation that's... weird for humans:

  fan~_tas_~tic


It seems like a minor concession, since most uses of intra-word emphasis are more cutesy than communicative[1] (it is of course sometimes very useful when there is a subtle syllable emphasis, or a subtle typo, that you want to point out).

[1] Maybe I’m being a hypocrite here? I definitely am in favor of a lot of “cutesy” ways to communicate (things that are more stylistic than necessary). But not intra-word emphasis, really.


I had a look at djot, which addresses all of the author's grievances, and I must say... I don't like it.

Sure, it probably is easier to parse, and maybe there are a few edge cases that it does better, but the goal of markdown is to have text that is:

A) human readable and looks good without parsing it

B) can be parsed and presented using different themes

In djot they sacrifice a lot of point A (e.g. we now have to insert empty lines in a nested list?!) for questionable gains at point B. Guess which one I, as a user, care more about?

Markdown accepting a wide range of inputs is not a mistake, it is a feature. If that makes parsing more complex that is an acceptable side effect not a mistake.


I agree that the empty line in front of a nested list is ugly. I very often make hierarchical descriptions of things like events, to-do items, or recipes, and that kind of thing would be annoying to have to deal with. I like my lists tight.

I would have tried harder to find some other way to make the grammar simple.

I haven’t seen anything else that makes it less “human readable”, though.


I'd argue that it won the adoption it did in spite of its parsing ambiguities and the lack of a spec, not because of them. There are plenty of examples of well-specified things that have gained mass adoption, so I think you are confusing cause and correlation here.


Well-specified formats that are primarily produced by humans writing them by hand? The main entries in this category are programming languages.


IDK, JSON? HTML and XML are markup languages too. There are obvious issues with markdown that were fixed/resolved in various markdown variants, and missing features as well, and I don't think anyone could argue those helped adoption. Case in point: the most commonly used markdown flavor is GFM, because we all adopted GitHub and that's what it supports.


It is true that Markdown won by putting simplicity for users ahead of simplicity for parsers. But since it became ubiquitous, there's a lot of value in codifying the standard to make sure that it doesn't diverge into different dialects.

Regarding the author's specific suggestions, he explicitly writes that he doesn't propose implementing them in the actual MD "standard", since backwards compatibility is more important. That said, there is value in making the markup less ambiguous while preserving its "writability", even if it's just a thought experiment.


It's not really a problem of being "perfectly specced" or not; it's just a matter of inertia.

If markdown had just used *bold* and _italics_ from the start, or had needed a tag for HTML instead of passing it through as-is... it would be entirely fine and just as popular now. Or any other generally-agreed-upon "good" fix.

But inertia makes changes like that near-impossible now. Only additions can sort of work, and even those are hard, as a critical mass of dialects needs to adopt them for it to work.


What you're saying is that the most stupid and broken "solutions" win…

Now one could speculate about the reasons.


> messy HTML5 over "XHTML"

Nothing messy about HTML, whatever version. It just uses SGML features from a more civilized age, such as inferring tags not explicitly present when unambiguously required by the content model grammar.

Btw, a large fragment of markdown can be implemented using SGML's SHORTREF feature, as can customizations such as GitHub-flavored markdown. John Gruber's markdown language is specified as a canonical rewriting into HTML, with the option of inline HTML as a fallback, which makes SGML SHORTREF a particularly fitting implementation model since it works just the same way. It's quite striking how a technique for custom syntax invented in the 70s (however imperfectly specified, though not in a worse-is-better way lol) could foresee Wiki syntaxes and also determine the most commonly used markup language (HTML) fifty years later.

Agree with the gist of your post, though. As fantastic as MacFarlane's pandoc is, the idea of re-assigning redundancies in markdown (e.g. interpreting the minute presence/omission of space characters to mean something) was bound to fail, and that was very clear to me after skimming only a few paragraphs of the CommonMark manifesto. When it was first discussed here back then, someone commented that this was bound to happen when a logician (MacFarlane) approached Wiki syntax.


SGML is a hot mess. It should have died decades ago.



