Quote:
I don't understand why it's difficult - DFA at least.
I am going to answer your question as thoroughly as possible. Regexes are simple in theory, not in practice, which is why people have problems with them.

=========================================

In theory, that's all you need to know. In practice, that's just the start - you missed '^' (which has two meanings), '\' (used for escaping those special characters) and '$'. In $PROGRAM, which of the special characters need to be escaped? How about the replacement expression in "%s" (different from the match expression "/s")? How does it match newlines? (Hint: in Vim, for example, '$', '\n' and '\r' all do different things.)

Write your expression to work in Vim, and it fails in your JavaScript program. Write your expression in sed and it fails using the regex library in C#. The expression that works in the default invocation of grep fails in the default invocation of Perl. Use `grep -E` and the expression fails on some tools but not on others.

Even passing a regex on to an engine is difficult: in an interactive bash shell you'd use sed "s/\\t//g". In a script that sets the results of that invocation to an environment variable you'd use sed "s/\\\\t//g". You run into similar problems within your programs when you pass around string variables containing regexes, which is why many of the programs that use match expressions in their configuration (like nginx) have quoting and escaping rules that differ from those of the command-line programs that use the same regex library.

When you use regex liberally in Python, Bash, grep, Vim, C#, Perl, JavaScript and everything else, you never remember how they all handle the special cases - you have to keep looking them up for that particular program.
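To make the point concrete, here is a rough sketch of the same kind of surprises inside a single tool, Python's `re` module. It is not a reproduction of the sed or Vim behaviour mentioned above, only an analogous illustration; the sample strings and variable names are made up for the example.

```python
import re

text = "one\ntwo\n"  # hypothetical sample input

# '.' does not match a newline unless re.DOTALL is set.
dot_default = re.search(r"one.two", text) is not None             # False
dot_dotall = re.search(r"one.two", text, re.DOTALL) is not None   # True

# '$' matches at the very end and just before a trailing newline.
# With re.MULTILINE it also matches before every other newline.
dollar_default = [m.start() for m in re.finditer(r"$", text)]                   # [7, 8]
dollar_multiline = [m.start() for m in re.finditer(r"$", text, re.MULTILINE)]   # [3, 7, 8]

# Escaping layers: the string literal is processed first, then the regex engine.
tabbed = "a\tb"     # contains a real tab character
literal = "a\\tb"   # contains a backslash followed by the letter t

# A real tab in the pattern and the two characters \t both match a tab.
tab_plain = re.sub("\t", "", tabbed)    # "ab"
tab_raw = re.sub(r"\t", "", tabbed)     # "ab"

# Matching a literal backslash needs two backslashes at the regex level,
# which is four in an ordinary string literal and two in a raw string.
lit_plain = re.sub("\\\\t", "", literal)   # "ab"
lit_raw = re.sub(r"\\t", "", literal)      # "ab"

# The tab pattern leaves the backslash-t text alone.
lit_untouched = re.sub(r"\t", "", literal)  # unchanged: a, backslash, t, b

print(dot_default, dot_dotall, dollar_default, dollar_multiline)
```

Each of these has a different answer in at least one other engine or quoting context, which is the part you end up looking up every time.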
I'm fairly comfortable with them, having spent the 90s as a Perl programmer and having used Vim as my default coding editor daily for almost 30 years, during which time I collected a couple of postgraduate CS (not IT) degrees (which means I know automata theory better than most). Yet even I have to look regex stuff up on a per-product basis. I am skeptical that you can look at an expression and go "This will work in $x, $y and $z, but not in $a, $b and $c", and if my skepticism is correct, then you have problems too; you just don't know it. And that is why people have problems with them - you never quite know which contexts allow '.' to