Regular Expressions...
-
I'm having trouble getting my head around regular expressions. How do I find anything between < and > ? Thank you.
-
The following regex will return the contents of anything between < and > in a named capture called "TagContents":
\<(?<TagContents>[^\>]*)\>
Here's how it is constructed:
An escaped < character: \<
Matches the literal "<" that begins the tag. The backslash forces the character to be read literally. (In .NET, < and > are only special inside group constructs such as (?<name>...), so the escape is optional here, but it is harmless.)
A named capture group: (?<name>matchexpr)
Matches the matchexpr expression and returns the matched text in a named capture called name. Here the name is TagContents.
A custom character class: [^\>]*
Custom character classes are defined by listing a set of characters within square brackets [ ]. A ^ at the beginning negates the class, so that it matches any character not listed within the brackets. So, this character class matches any character that is not ">". The asterisk at the end is a quantifier: it matches the class zero or more times, taking as many characters as it can, so the match runs up to (but not including) the next ">".
An escaped > character: \>
Matches the literal ">" that ends the tag.
--Justin Microsoft MVP, C#
C# / Web / VG.net / MyXaml expert currently looking for (telecommute) contract work!
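For anyone who wants to try the pattern above, here is a minimal sketch in Python. It rests on two assumptions that are not part of the thread: Python's re module spells a named group (?P<name>...) rather than (?<name>...), and the sample string is made up for illustration. The rest of the pattern is unchanged.

```python
import re

# Assumption: Python's `re` writes a named group as (?P<name>...);
# everything else in the pattern is taken from the thread as-is.
TAG_CONTENTS = re.compile(r"\<(?P<TagContents>[^\>]*)\>")

# Hypothetical input, not from the thread.
sample = 'Click <a href="x.html">here</a> or <br/>.'

# Collect the text between each < and >, without the angle brackets.
contents = [m.group("TagContents") for m in TAG_CONTENTS.finditer(sample)]
print(contents)  # ['a href="x.html"', '/a', 'br/']
```

Because the quantifier allows zero repetitions, an empty pair "<>" also matches and yields an empty capture.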
-
I'm sorry, I asked the wrong question. I actually need to find the entire tag, including the < and >.
-
Then just put the \< and \> inside the capture group:
(?<TagContents>\<[^\>]*\>)
...so that they will be captured along with what's between them.
--Justin Microsoft MVP, C#
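A rough sketch of the whole-tag version, again in Python and under the same assumptions: the named group is written (?P<name>...) as Python's re module requires, and the sample string is invented for illustration.

```python
import re

# Assumption: Python's `re` writes a named group as (?P<name>...).
# The angle brackets now sit inside the group, as suggested above.
WHOLE_TAG = re.compile(r"(?P<TagContents>\<[^\>]*\>)")

# Hypothetical input, not from the thread.
sample = 'Click <a href="x.html">here</a> or <br/>.'

# The named capture now includes the < and > themselves.
tags = [m.group("TagContents") for m in WHOLE_TAG.finditer(sample)]

# Group 0 is the entire match, which here covers the same text.
whole = [m.group(0) for m in WHOLE_TAG.finditer(sample)]

print(tags)  # ['<a href="x.html">', '</a>', '<br/>']
```

Since the group spans the entire pattern, the overall match and the named capture hold the same text, so reading the whole match would work just as well if the name is not needed.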