Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Oh yeah, they definitely have uses, but there's a real tendency for people to go a bit crazy with them. Complex regexen aren't exactly readable, there are all kinds of fun performance gotchas, and sometimes other tools or algorithms are more suitable for the task. And sometimes people try to use them to, e.g., parse HTML because they don't know that it is literally impossible to use regular expressions to parse languages that aren't regular.
It's entirely possible to parse HTML in PCRE. You shouldn't, but it is possible. PCRE's pattern language stopped being strictly regular a long time ago and is entirely capable of doing it.
Oh yeah, extensions that make them non-regular definitely can make it possible, but just because it's now somewhat possible with some regex engines doesn't mean it's a good idea.
I once wrote a JS decompiler (de-bundler?) using ~150 regexes for step-wise transformations. Worked surprisingly well!
Well... No new ones, at least? Though it was around that time that I started hearing whispers in the night... "You can use WASM to ship Client-Side PHP"
it is literally impossible to use regular expressions to parse languages that aren’t regular
It’s impossible to parse the whole syntax tree, but that doesn’t mean you can’t get the subset you’re interested in.
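As a rough sketch of what "the subset you're interested in" can look like, here is a Python snippet that pulls href values out of anchor tags without building any tree. The sample markup and the pattern are made up for illustration. It will miss unquoted attributes and will happily match links inside comments or scripts, so treat it as a quick scrape under those assumptions, not a parser.

```python
import re

# Hypothetical sample markup, invented for this example.
html = '''
<div class="nav">
  <a href="/home">Home</a>
  <a class="ext" href='https://example.com/docs'>Docs</a>
  <span>no link here</span>
</div>
'''

# Match an <a ...> opening tag and capture a quoted href value.
# Group 1 is the opening quote; the backreference \1 requires the same closing quote.
HREF_RE = re.compile(
    r'<a\b[^>]*?\bhref\s*=\s*(["\'])(.*?)\1',
    re.IGNORECASE | re.DOTALL,
)

# Collect only the attribute values; nesting and tag balance are ignored entirely.
links = [m.group(2) for m in HREF_RE.finditer(html)]
print(links)
```

This works only because the target (one attribute inside one opening tag) is itself a flat pattern. Anything that depends on matching open and close tags is where it falls apart.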
I learned regex once and now it just works. The only problem for me is that I'm on macOS, so the regex flavors aren't consistent. But once I sort that out, it's smooth sailing.
Regex feels distinctly eldritch to me. Like, a lot of computing knowledge feels like magic, but regex feels like the kind of magic you get by consorting with dark forces
regex feels like the kind of magic you get by consorting with dark forces
AKA reading the manual.
Named groups are nice, but can I please define a group more than once? Maybe I want to group my data and consolidate values in a logical way without you complaining that I've already used that group name. I know I did, I'm the one telling you. Now capture it twice!
You can use backreferences (\1, \2, etc.), but you can also give them names explicitly. It looks like this: (?<name>inner-regex)
Some flavors support it; Kotlin's doesn't, apparently.
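For anyone who wants to poke at this, here is a minimal sketch in Python, assuming the standard `re` module as the flavor. Note that Python spells named groups `(?P<name>...)` and named backreferences `(?P=name)`, which is its own small example of flavors disagreeing. The sample strings are made up.

```python
import re

# Numbered backreference: \1 must repeat exactly what group 1 captured.
doubled = re.compile(r'\b(\w+) \1\b')
m1 = doubled.search('it was the the best of times')
repeated_word = m1.group(1)
print(repeated_word)

# Named group plus named backreference: the closing quote must equal the opening one.
quoted = re.compile(r'(?P<quote>["\'])(?P<body>.*?)(?P=quote)')
m2 = quoted.search("say 'hello' now")
body = m2.group('body')
print(body)

# Defining the same group name twice in one pattern is rejected by Python's re.
try:
    re.compile(r'(?P<val>\d+)|(?P<val>[a-z]+)')
    duplicate_ok = True
except re.error:
    duplicate_ok = False
print(duplicate_ok)
```

So in this flavor, at least, a backreference lets you match the same captured text again, but reusing a group name to capture into it twice is not allowed.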