C# RegEx Match Groups

peterchen

I have a few helper functions that, ever since I wrote them, do everything I need. They are based on "remove and return". It may be 50 lines instead of 5, but I feel much safer. :cool: I've always been curious about RegExes. They are cool. But the "implementation differences" are plain scary, and you end up with a lot of write-only code.

Developers, Developers, Developers, Developers, Developers, Developers, Velopers, Develprs, Developers!
We are a big screwed up dysfunctional psychotic happy family - some more screwed up, others more happy, but everybody's psychotic joint venture definition of CP
Linkify!|Fold With Us!

Lost User

What? I hope you meant this ironically :cool: Ever tried to parse incoming messages like the IRC protocol? It works with string functions, but it's a royal pain in the a**, imho. Regexes are also great for validating certain strings. Regards

peterchen

No :cool: With the right helpers, it's quite readable / maintainable code. Not as compact as regexes, though. I haven't needed "find substring" lately, and performance wasn't my main concern, so YMMV. [edit]I find that if you have to validate the string, RegExes become terribly complex. That's the main reason I avoid them.[/edit]

Lost User

peterchen wrote:

No. With the right helpers, it's quite readable / maintainable code. Not as compact as regexes, though.

Depends on the Regex

^([0-9]( |-)?)?(\(?[0-9]{3}\)?|[0-9]{3})( |-)?([0-9]{3}( |-)?[0-9]{4}|[a-zA-Z0-9]{7})$

I don't like this one, either :wtf:

Todd Smith

I couldn't live without them. I'd hate to write parsers every time I wanted to extract some small piece of info from a file. As an example, in our build system we always extract the version number of the product or library from some version.h file. The format of that file has not been standardized, so we need a different regex for each one, and there are 10+ of them and growing. I think I ran into this regex problem the other day with groups while parsing version numbers. I couldn't figure out why my values were goofed up. I kept expecting group[0] to have the first item and it didn't. I think I ended up using named group items and worked around it that way.
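For anyone hitting the same surprise, here is a minimal sketch of how numbered groups are indexed in .NET: Groups[0] holds the entire match, and the first parenthesized capture starts at Groups[1]. The sample line, the pattern, and the names GroupIndexDemo, SampleLine and MatchVersion are made up for illustration; the real version.h formats are not shown in this thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupIndexDemo
{
    // Hypothetical version.h line (an assumption, not taken from the thread).
    public const string SampleLine = "#define PRODUCT_VERSION \"2.7.13\"";

    // Three numbered capture groups: major, minor, patch.
    public static Match MatchVersion(string line)
    {
        return Regex.Match(line, @"(\d+)\.(\d+)\.(\d+)");
    }

    public static void Main()
    {
        Match m = MatchVersion(SampleLine);

        // Index 0 is the whole match, not the first capture.
        Console.WriteLine("Groups[0] = " + m.Groups[0].Value);

        // The captures themselves start at index 1.
        for (int i = 1; i < m.Groups.Count; i++)
        {
            Console.WriteLine("Groups[" + i + "] = " + m.Groups[i].Value);
        }
    }
}
```

Note that Groups.Count is 4 here, not 3, because the whole match is counted as group 0.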

Todd Smith

peterchen

They can be quite handy, no doubt, esp. if you are "fluent" in them. (I'm not)

Maximilien

:laugh:

Maximilien Lincourt Your Head A Splode - Strong Bad

Andy Brummer

Everyone knows you should write loops like

for i = 1 to Groups.Length
next

I don't see what the problem is. :-D

Using the GridView is like trying to explain to someone else how to move a third person's hands in order to tie your shoelaces for you. -Chris Maunder

Andy Brummer

Todd Smith wrote:

I think I ended up using named group items and worked around it that way.

Ah, that's why I've never run into this. I've always used named groups. Probably because I ran into this and just forgot about it.
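A rough sketch of the named-group approach, which sidesteps the index question because each capture is looked up by name. The pattern, the group names, and the helper NamedGroupDemo.TryParseVersion are assumptions for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Hypothetical helper: pulls major/minor/patch out of a line using named groups.
    public static bool TryParseVersion(string line, out int major, out int minor, out int patch)
    {
        major = minor = patch = 0;
        Match m = Regex.Match(line, @"(?<major>\d+)\.(?<minor>\d+)\.(?<patch>\d+)");
        if (!m.Success)
        {
            return false;
        }

        // Lookup by name, so the position of a group in the pattern does not matter.
        major = int.Parse(m.Groups["major"].Value);
        minor = int.Parse(m.Groups["minor"].Value);
        patch = int.Parse(m.Groups["patch"].Value);
        return true;
    }

    public static void Main()
    {
        int major, minor, patch;
        // The input line is a made-up example.
        if (TryParseVersion("#define LIB_VERSION \"4.0.2\"", out major, out minor, out patch))
        {
            Console.WriteLine(major + "." + minor + "." + patch);
        }
    }
}
```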

Monkeyget2

That's where inline comments come in (see http://www.regular-expressions.info/comments.html ). A good trick described at http://www.codeproject.com/dotnet/RegexTutorial.asp is:

"Comments please

Another use of parentheses is to include comments using the "(?#comment)" syntax. A better method is to set the "Ignore Pattern Whitespace" option, which allows whitespace to be inserted in the expression and then ignored when the expression is used. With this option set, anything following a number sign "#" at the end of each line of text is ignored. For example, we can format the preceding example like this:

31. Text between HTML tags, with comments

(?<=       # Search for a prefix, but exclude it
  <(\w+)>  # Match a tag of alphanumerics within angle brackets
)          # End the prefix
.*         # Match any text
(?=        # Search for a suffix, but exclude it
  <\/\1>   # Match the previously captured tag preceded by "/"
)          # End the suffix"
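In C#, the "Ignore Pattern Whitespace" option corresponds to RegexOptions.IgnorePatternWhitespace. Below is a small sketch that feeds the commented pattern from the quoted tutorial to Regex.Match; the class name CommentedPatternDemo, the helper InnerText, and the sample input are assumptions added for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentedPatternDemo
{
    // The tutorial's pattern, kept in a verbatim string so it can span lines.
    // Whitespace and everything after '#' are ignored by the regex engine.
    public const string Pattern = @"
        (?<=        # Search for a prefix, but exclude it
          <(\w+)>   # Match a tag of alphanumerics within angle brackets
        )           # End the prefix
        .*          # Match any text
        (?=         # Search for a suffix, but exclude it
          <\/\1>    # Match the previously captured tag preceded by a slash
        )           # End the suffix";

    // Hypothetical helper: returns the text between a tag pair, or null if none is found.
    public static string InnerText(string html)
    {
        Match m = Regex.Match(html, Pattern, RegexOptions.IgnorePatternWhitespace);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(InnerText("<b>bold</b>"));
    }
}
```

One caveat with this option: a literal space or "#" in the pattern has to be escaped (or placed in a character class), otherwise it is treated as formatting or as the start of a comment.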