Probably because they're a great notation for the problem area, which for regex is...

LAC-Tech · on May 14, 2022

How else would you concisely state that?

Presumably people who hate array languages think all 3-character regexes should instead be big nested loops, so that they are "readable".

samatman · on May 15, 2022

Regexes in the Unix tradition are a user interface as much as a programming language. Not that there's a sharp distinction, but it's almost a trite observation that regexes per se shine for ad hoc string searching but show their weakness when they start becoming parts of programs.

When writing a program, I prefer to use a PEG, giving the less compact notation `'a'* 'b'` but also letting me say `'a'* b` and define b as its own rule, including recursion for the useful cases. It helps that it's more powerful, being little more than a formalization of the post-regular strategies used in Perl-style 'regular' expressions while embracing recursion.
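To make the `'a'* b` idea concrete, here is a rough sketch of PEG-style combinators in Python. Everything in it (`lit`, `star`, `seq`, `choice`, `ref`, the `RULES` table, and the choice of nested parentheses as the body of `b`) is made up for illustration and is not taken from any particular PEG library. It only shows how a named rule can refer to itself, which a classical regular expression cannot do.

```python
# Hypothetical mini-combinators, not a real PEG library.
# A parser is a function (text, pos) -> new position, or None on failure.

def lit(s):
    # Match the literal string s at pos.
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None

def star(p):
    # Greedy repetition, zero or more times (PEG '*').
    def parse(text, pos):
        while True:
            nxt = p(text, pos)
            if nxt is None or nxt == pos:
                return pos
            pos = nxt
    return parse

def seq(*ps):
    # Match each parser in order; fail if any one fails.
    def parse(text, pos):
        for p in ps:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*ps):
    # Ordered choice (PEG '/'): the first alternative that succeeds wins.
    def parse(text, pos):
        for p in ps:
            nxt = p(text, pos)
            if nxt is not None:
                return nxt
        return None
    return parse

RULES = {}

def ref(name):
    # Late-bound rule lookup, so a rule can mention itself.
    return lambda text, pos: RULES[name](text, pos)

# b     <- '(' b ')' / 'b'    (assumed example of a recursive rule)
RULES["b"] = choice(seq(lit("("), ref("b"), lit(")")), lit("b"))
# start <- 'a'* b
RULES["start"] = seq(star(lit("a")), ref("b"))

def matches(text):
    # True only if the whole input is consumed by the start rule.
    return RULES["start"](text, 0) == len(text)
```

With these definitions, `matches("aab")` and `matches("aa((b))")` are true, while `matches("aa((b)")` is false because the parentheses do not balance.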

For '/' in vim, grep, wherever? Yeah regex is fine, that's what it was designed for.

IshKebab · on May 15, 2022

I can't remember the names but I've seen at least two alternative syntaxes recently that are a lot more readable. At least one of them fixed the issue of regex mixing up control in-band with data. So your example would be something like

    "a"* "b"

Much more readable and less error-prone.

bear8642 · on May 15, 2022

> the issue of regex mixing up control in-band with data

Could you explain this? I don't quite understand what the problem is. Do you mean something like sed's regex substitute command?

woojoo666 · on May 15, 2022

I believe they mean the operators and operands are all mixed up, e.g. in `a*b`, which means you have to escape all sorts of characters. If you split it into `"a"* "b"`, the separation is clear.

IshKebab · on May 15, 2022

I mean it isn't clear whether a character is a control character (* + ? [ ] - etc.) or a literal character, because they're all mixed up. The rules about which is which are too complex, extensive and varying.

If you use syntax like "a"* "b" then it's really obvious - the stuff in quotes is literal text, everything else is control.

Lots of formats make the same mistake, e.g. YAML.
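A small illustration of the in-band problem, using Python's standard `re` module (the sample string is made up): the same three characters mean different things depending on whether `*` is read as control or as data, and the only way to say "data" is to escape it.

```python
import re

text = "aaab and a*b"  # assumed sample input

# Unescaped: '*' is an operator, so this means "zero or more 'a', then 'b'".
as_pattern = re.findall(r"a*b", text)

# Escaped: re.escape backslash-escapes the '*', so every character is literal.
as_literal = re.findall(re.escape("a*b"), text)

print(as_pattern)  # ['aaab', 'b']
print(as_literal)  # ['a*b']
```

In a quoted syntax like `"a"* "b"` versus `"a*b"`, that distinction would be visible in the pattern itself rather than depending on an escaping step.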