Can anybody explain how this Regex works?
-
Hi,
I'm using C# and I have the regex below to split a text into words without splitting a quoted string ".. .."; instead, the whole quoted string should be taken as one item in a List. Example: text="all "1 dl"", and after the split all[0]="all" and all[1]="1 dl". I found it on Google and it works, but I don't understand how it works.
string regexSpliter = @"(?<=^(?:[^""]*""[^""]*"")*[^""]*) ";
List<string> all = new List<string>(System.Text.RegularExpressions.Regex.Split(text, regexSpliter));
And if I remove the space before the last " in the string, it doesn't work as I want. It seems to split the characters of the string text into separate elements. Can anybody please explain the string regexSpliter and why there has to be a space last in the string?
Many thanks,
Fia
-
From what I can tell, that should be matching a SPACE that follows (any characters other than a quote) QUOTE (any characters other than a quote) QUOTE (any number of quotes), but you say it matches your test string, so I must be misreading it. At any rate, you can make something simpler. What exactly do you have and what do you want from it?
-
Hi,
Thanks for all replies. I'm trying to get the words that users have entered in a textbox, and if there are strings between quotes, get those too. But I still don't understand how the string regexSpliter works. And I still don't understand why there has to be a space last in that string, because when I remove it, it doesn't work as I want.
Thanks,
Fia
-
There has to be something. Have you tried other characters?
-
Hi,
What do you mean by something? I can write any characters I want in a word or a string. For example, the text can contain 'hello by "to much" 10'.
Thanks,
Fia
-
I mean the SPACE (or something else) needs to be there.
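To make that point concrete, here is a small sketch, assuming the sample text all "1 dl" from the question (the class and method names are made up for illustration). The (?<=...) part is a lookbehind and consumes no characters, so without the trailing space the pattern matches the empty string at every position whose prefix satisfies the lookbehind, and Regex.Split then cuts at each of those positions.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class MissingSpaceDemo
{
    // The pattern from the question, with and without the trailing space.
    public const string WithSpace = @"(?<=^(?:[^""]*""[^""]*"")*[^""]*) ";
    public const string WithoutSpace = @"(?<=^(?:[^""]*""[^""]*"")*[^""]*)";

    // Returns the start index of every match of pattern in text.
    public static int[] MatchPositions(string text, string pattern)
    {
        return Regex.Matches(text, pattern)
                    .Cast<Match>()
                    .Select(m => m.Index)
                    .ToArray();
    }

    public static void Main()
    {
        string text = "all \"1 dl\"";

        // With the space: only the space after "all" (index 3) matches.
        Console.WriteLine(string.Join(",", MatchPositions(text, WithSpace)));

        // Without the space: zero-width matches at the positions outside the quotes.
        Console.WriteLine(string.Join(",", MatchPositions(text, WithoutSpace)));

        // Splitting on the zero-width pattern therefore yields many small pieces.
        foreach (string part in Regex.Split(text, WithoutSpace))
            Console.WriteLine("[" + part + "]");
    }
}
```

So the space is the only thing the pattern actually consumes; the lookbehind merely decides which spaces qualify.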
-
See The 30 Minute Regex Tutorial and search for all occurrences of (?<= in that article. This explains the meaning of (?<=...). You always have to distinguish between the way you enter a pattern in C# and the pattern the Regex sees:
C# @"..." pattern:
@"(?<=^(?:[^""]*""[^""]*"")*[^""]*) "
effective Regex pattern (here delimited by /.../):
/(?<=^(?:[^"]*"[^"]*")*[^"]*) /
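A quick way to check the difference between the two rows is to print the C# string. This is only a sketch; the class name is made up for illustration.

```csharp
using System;

public static class VerbatimPatternDemo
{
    // The pattern exactly as it is typed in the C# source file.
    public const string Pattern = @"(?<=^(?:[^""]*""[^""]*"")*[^""]*) ";

    public static void Main()
    {
        // Inside a verbatim @"..." literal, "" stands for one " character,
        // so the printed text is the pattern the Regex engine receives
        // (including the single trailing space).
        Console.WriteLine("/" + Pattern + "/");
    }
}
```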
I'm now only talking in the Regex domain (the 2nd row), not about how it is entered in the C# string. Let's start with the innermost part and work outwards:
1. ..."[^"]*"... : "...", i.e. a quoted string
2. ...[^"]*"[^"]*"... : any number of non-" characters, followed by "..." from 1. above
3. ...(?:[^"]*"[^"]*")*... : any repetition of the group described in 2. above
4. ...^(?:...)*... : 3. above must match from the beginning of the text
5. ...^(?:...)*[^"]*... : 4. above, followed by any number of non-" characters
6. (?<=...) followed by a space : match a space that is preceded by the expression from 5. above; the (?<=...) is not part of the match
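The steps above can be tried out on the example text mentioned earlier in the thread. The following is a minimal sketch; the class and method names are made up for illustration, and only Regex.Split and the pattern come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuoteAwareSplitDemo
{
    // The pattern from the question: a space whose prefix contains
    // an even number of " characters (i.e. the space is outside quotes).
    public const string Splitter = @"(?<=^(?:[^""]*""[^""]*"")*[^""]*) ";

    public static string[] SplitOutsideQuotes(string text)
    {
        return Regex.Split(text, Splitter);
    }

    public static void Main()
    {
        // The space inside "to much" has an odd number of quotes before it,
        // so the lookbehind fails there and no split happens.
        foreach (string part in SplitOutsideQuotes("hello by \"to much\" 10"))
            Console.WriteLine("[" + part + "]");
        // prints:
        // [hello]
        // [by]
        // ["to much"]
        // [10]
    }
}
```

Note that the quoted item keeps its quote characters; they would have to be trimmed separately if they are not wanted.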
The Regex searches for the space character and checks whether the data before that space matches the prefix expression. If yes, the match is successful; otherwise, the Regex searches for the next space and checks again, etc. The given Regex and the given data match on only one space, the one after all. There, the final [^"]* of the prefix expression is the part that matches all: (?<=^(?:[^"]*"[^"]*")*[^"]*). I.e. the regex splits the given data by spaces, treating spaces within "..." strings as non-separators. Very complicated, though. I would do this differently, namely in positive terms (what you want to be part of the fields rather than what splits them):
string pattern = @"\s*(""[^""]*""|\S+)\s*"; // maybe a more sophisticated pattern is
                                            // needed since the above expression seems
                                            // to match more,
                                            // but this is maybe an undesired side
                                            // effect of the complicated expression
string[] split = Regex.Matches(input,
-
See the explanation in my other reply (I know, this is a very old topic, but I saw it was not solved in this thread, so I added my lengthy explanation). The Regex matches spaces where the prefix expression ((?<=...)) matches. That is far too complicated for cases where one wants to split a string into parts separated by spaces, ignoring spaces within "...". My preferred solution is to use a positive match criterion, i.e. to describe what the fields look like rather than what separates them:
string pattern = @"\s*(""[^""]*""|\S+)\s*";
var fields = Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Groups[1].Value);
Cheers
Andi
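For completeness, here is a runnable sketch of the positive-match approach with the usings it needs. The wrapper class and method names are made up for illustration; the pattern and the Matches/Cast/Select chain are the ones from the post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class FieldMatchDemo
{
    // A field is either a quoted string or a run of non-whitespace characters.
    public const string Pattern = @"\s*(""[^""]*""|\S+)\s*";

    public static List<string> GetFields(string input)
    {
        return Regex.Matches(input, Pattern)
                    .Cast<Match>()
                    .Select(m => m.Groups[1].Value) // group 1 excludes the surrounding whitespace
                    .ToList();
    }

    public static void Main()
    {
        foreach (string field in GetFields("hello by \"to much\" 10"))
            Console.WriteLine("[" + field + "]");
        // prints:
        // [hello]
        // [by]
        // ["to much"]
        // [10]
    }
}
```

As with the split version, the quotes stay part of the quoted field, so something like field.Trim('"') is still needed if they should be dropped.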