How to extract all words - using regular expression
-
EDIT: OK, using g (as in "global") should work, but it does not:
result = BTUL->EditLine_RegExp(result, "(\\w+ \g)");
Where is my error??
I am having a mental block: I forgot how to add "match all words". Here is my failing attempt to do so:
result = BTUL->EditLine_RegExp(result, "(\\w+*)" );
Can somebody help me modify my regular expression to match all words, PLEASE? No references to AI regular expression generators; they do not do well with multiple entries.
Try this: Match all words
-
The following will find all words:
/\w+/g
You can test it quite quickly at RegExr: Learn, Build, & Test RegEx. But it would help if you showed us the actual text you are working with, and the results you get. And please use <pre> tags around the code parts so your question is clear.
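A note on the trailing g: in /\w+/g the slashes and the g are JavaScript-style literal syntax, not part of the pattern itself. In C++ the pattern is just \w+, and "global" means asking the engine for every match in turn. Here is a minimal sketch using the standard library's std::regex (swapped in for the Qt classes so the example is self-contained); the function name allWords is made up for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Collect every non-overlapping match of \w+ in `text`.
// std::sregex_iterator plays the role of the /g flag: each increment
// resumes the search where the previous match ended.
std::vector<std::string> allWords(const std::string &text)
{
    static const std::regex wordPattern(R"(\w+)");
    std::vector<std::string> words;
    for (std::sregex_iterator it(text.begin(), text.end(), wordPattern), end;
         it != end; ++it) {
        words.push_back(it->str());
    }
    return words;
}
```

Calling allWords("Waiting to connect to bluetoothd...") should give the five words of that sentence, with the dots left out because they are not word characters.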
-
Salvatore Terress wrote:
Where is my error ??
Your question is not only about regular expressions but also about what is running the regular expression engine, and you did not provide that information. A 'g' is something external to regular expressions, so where you are using it is important, and the only clue you provided is 'EditLine_RegExp', which returned no results when I googled it. But certainly, since you didn't escape the backslash for the g, that would never work. Other than that, there is also the following:
- A space is not considered part of a word. Your capture group includes it.
- There could be more than one space.
- Are you matching on a single line? If not, there are other complications.
- If there are ONLY words in your line, then it is pointless to use regex at all. Just split it.
- If there are OTHER things besides words, then I don't believe what you are doing will work (but again, you didn't state what they are, so maybe it will).
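To illustrate the point about the unescaped backslash: inside an ordinary C++ string literal the backslash starts an escape sequence, and \w or \g are not standard escapes, so many compilers warn and drop the backslash. A small sketch of the two usual ways to get a real backslash into the pattern follows; the helper names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper names, for illustration only.

// Escaped form: the compiler turns "\\w+" into the three characters \ w +
std::string escapedPattern()
{
    return "\\w+";
}

// Raw string literal (C++11): nothing between R"( and )" is treated as an
// escape sequence, so the pattern can be written as it appears in regex docs.
std::string rawPattern()
{
    return R"(\w+)";
}
```

Both functions return the same three-character string, so either form can be passed to the regex constructor. The raw form tends to be easier to read once a pattern contains several backslashes.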
-
Many thanks for all the support. FYI, I did use "[ -~]+" to extract the first continuous words. Now I am working on extracting ALL words. I tried passing "[/\w+/g]+" to my otherwise working function and ended up with [/w+/g]+ (the backslash is missing). This is the result, part of my debug messages:
" instring text \t\n Waiting to connect to bluetoothd..."
" regular expression \t\n [/w+/g]+"
" Has all match g"
You asked for the string I am trying to extract stuff from. Here it is:
"Waiting to connect to bluetoothd...\r\u001B[0;94m[bluetooth]\u001B[0m# \r\r\u001B[0;94m[bluetooth]\u001B[0m# \r \rAgent registered\n\u001B[0;94m[bluetooth]\u001B[0m# "
As you can see, the extracted "g" is from "Agent". No good. I suspect Qt is messing with passing the backslash.
// result = BTUL->EditLine_RegExp_Ext(result, "[ -~]+",textEditPtr_DEBUG, textEditPtr_DEBUG);
result = BTUL->EditLine_RegExp_Ext(result, "[/\w+/g]+",textEditPtr_DEBUG, textEditPtr_DEBUG);
"Waiting to connect to bluetoothd..."
"Waiting to connect to bluetoothd..."
"START EditLine_RegExp...QString BT_Utility_Library::EditLine_RegExp_Ext(QString, QString, QTextEdit *, QTextEdit *)1321"
" instring text \t\n Waiting to connect to bluetoothd..."
" regular expression \t\n [/w+/g]+"
" Has all match g"
-
"Your question is not specific to regular expressions but also to what is running the regular expression engine. But you did not provide that information."
Since most of the "AI reg expression" generators work and the SAME expression does not work in Qt, you have a point. I guess I will ask in the Qt forum about that. Yes, there are other means to verify that the string contains the desired word (the QString "contains" method works peachy); however, I would sure like to learn more about using regular expressions, so I would like to stick with them for now.
ADDENDUM: My post is about using regular expressions; it is NOT about the function I am using to actually implement the regular expression. That function works as expected and there is no need to evaluate that function here. If it did not work as desired, I would say so.
-
Here is the actual snippet of the code. I have "hard coded" the RegExp:
// "[/\\w+/g]+"
RegExp = "[/\\w+/g]+";
text = " validate regular expression ";
text += RegExp;
qDebug() << text;
textDEBUG->append(text);
// RegExp = "[/\\w+/g]+";
text = " validate inString ";
text += inString;
qDebug() << text;
textDEBUG->append(text);
QRegularExpression re(RegExp);
// QRegularExpression re("/([A-Z])\\w+/g");
//QRegularExpression re("([A-Z])\w+");
QRegularExpressionMatch match = re.match(inString);
if (match.hasMatch())
{
    // matches all
    text = " Has all match ";
    QStringList result = match.capturedTexts();
    text += result.at(0); // test show only first
    qDebug() << text;
    textDEBUG->append(text);
    return result.at(0);
}
Here is the relevant debug output:
"START EditLine_RegExp...QString BT_Utility_Library::EditLine_RegExp_Ext(QString, QString, QTextEdit *, QTextEdit *)1321"
" instring text \t\n Waiting to connect to bluetoothd...\r\u001B[0;94m[bluetooth]\u001B[0m# \r\r\u001B[0;94m[bluetooth]\u001B[0m# \r \rAgent registered\n\u001B[0;94m[bluetooth]\u001B[0m# "
" regular expression \t\n (\\w+\\s:?)"
" validate regular expression [/\\w+/g]+"
" validate inString Waiting to connect to bluetoothd...\r\u001B[0;94m[bluetooth]\u001B[0m# \r\r\u001B[0;94m[bluetooth]\u001B[0m# \r \rAgent registered\n\u001B[0;94m[bluetooth]\u001B[0m# "
" Has all match Waiting"
10:12:17: /mnt/RAID_124/BT/BT_Oct23_BASE_/mdi/MDI exited with code 0
The expression matches ONLY the first word it finds. My goal is to match ALL the words in the inString. I am going to try one of the AI reg exp generators, but from experience using them this RegExp MAY work....
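For what it is worth, a single match() call is expected to report only the first match; in Qt, QRegularExpression::globalMatch() returns an iterator over all of them. The input above also contains terminal color sequences (ESC [ ... m), whose digits and letters \w+ would pick up as "words". Below is a rough sketch in standard C++ (std::regex standing in for the Qt classes) that strips those sequences first and then collects every word. The function names are hypothetical, and it assumes only the simple ESC [ digits ; digits m form occurs.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Remove ANSI color sequences of the form ESC [ digits;digits m.
// Assumption: no other kind of escape sequence appears in the input.
std::string stripAnsiColors(const std::string &text)
{
    // "\x1B" puts a real ESC character into the pattern; "\\[" is a literal '['.
    static const std::regex colorSequence("\x1B\\[[0-9;]*m");
    return std::regex_replace(text, colorSequence, "");
}

// Strip the color sequences, then collect every \w+ match in order.
std::vector<std::string> wordsWithoutColors(const std::string &text)
{
    static const std::regex wordPattern(R"(\w+)");
    const std::string clean = stripAnsiColors(text);
    std::vector<std::string> words;
    for (std::sregex_iterator it(clean.begin(), clean.end(), wordPattern), end;
         it != end; ++it) {
        words.push_back(it->str());
    }
    return words;
}
```

With the prompt text stripped of its color codes, "Agent" and "registered" come out as separate entries in the returned list, next to the words of the first line and the "bluetooth" prompt name.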
-
Maybe this SO page can help you? https://stackoverflow.com/questions/37003623/how-to-capture-multiple-repeated-groups
-
After some "RTFM" I came up with this code:
if (inString.contains("Agent") && inString.contains("registered"))
{
    text = "Match ";
}
else
{
    text = " No match ";
}
qDebug() << text;
textDEBUG->append(text);
The "contains" method accepts both a regular expression and a plain string, so I am not sure how to tell the difference. But it does what I want it to do.
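On telling the two apart: QString::contains() is overloaded, and the argument type selects the behavior. A string literal or QString gives a plain substring search, while a QRegularExpression object gives a regular expression match. The sketch below shows the same distinction in standard C++ (std::string::find versus std::regex_search, used here in place of the Qt calls); the function names are invented for the example.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Plain substring test: true if both literal strings occur anywhere in `text`,
// in any order and even inside longer words.
bool containsBothSubstrings(const std::string &text)
{
    return text.find("Agent") != std::string::npos
        && text.find("registered") != std::string::npos;
}

// Regular-expression test: true only if "Agent" and "registered" appear as
// whole words, in that order, separated by whitespace.
bool containsPhraseAsWords(const std::string &text)
{
    static const std::regex phrase(R"(\bAgent\s+registered\b)");
    return std::regex_search(text, phrase);
}
```

The substring version accepts input such as "Agent unregistered", because "registered" occurs inside the longer word. The regex version rejects it, which may or may not matter for the bluetoothctl output in question.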
-
Salvatore Terress wrote:
learn more about using regular expression....RegExp = "[/\\w+/g]+";
Keep in mind that this form of a regular expression is unlikely to work in any other regular expression interpreter. Perl, JavaScript, C# and Java (perhaps others) all use the same rules for most of the regex basics, and it will not do what you want in any of them. In those, the square brackets define a character class, so the expression means the following:
- Match A-Z, a-z, 0-9 and underscore (the \w).
- Match a forward slash (listed twice, which is redundant).
- Match a literal '+'.
- Match a 'g', which is redundant with the word class match.
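A quick way to see the character-class reading in practice is to run the expression against a string that contains slashes and plus signs. This is only a sketch with std::regex in its default ECMAScript mode (not Qt's engine), and the function name is made up:

```cpp
#include <cassert>
#include <regex>
#include <string>

// Return the first match of the bracket expression [/\w+/g]+ in `text`,
// or an empty string if nothing matches.
std::string firstClassMatch(const std::string &text)
{
    static const std::regex charClass(R"([/\w+/g]+)");
    std::smatch match;
    return std::regex_search(text, match, charClass) ? match.str()
                                                     : std::string();
}
```

For "Waiting to connect" the first match is "Waiting", the same as plain \w+ would give, which fits the debug output earlier in the thread. For "a/b+c d" it is "a/b+c", because '/' and '+' are simply extra members of the class rather than delimiters or a quantifier.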