Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. General Programming
  3. Regular Expressions
  4. MySQL Select query that chooses something dot something between certain underscores

MySQL Select query that chooses something dot something between certain underscores

Scheduled Pinned Locked Moved Regular Expressions
databasemysqlcomregexquestion
16 Posts 3 Posters 22 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • R RedDk

    One of these then: \b([a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,}\b ^([a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,}$ \b((xn--)?[a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,}\b \b((?=[a-z0-9-]{1,63}\.)(xn--)?[a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,63}\b \b((?=[a-z0-9-]{1,63}\.)[a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,63}\b Footnote: these make no sense of leading BLANK as your requirement states but I throw them out there anyway for your testing pleasure.

    L Offline
    L Offline
    Lost User
    wrote on last edited by
    #4

    Unfortunately none of them worked at all. REGEXP_SUBSTR(field_name, '^([a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,}$') immediately timed out (#3699 - Timeout exceeded in regular expression match), and the rest just returned NULL for everything. Any idea why?

    R 1 Reply Last reply
    0
    • L Lost User

      Unfortunately none of them worked at all. REGEXP_SUBSTR(field_name, '^([a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,}$') immediately timed out (#3699 - Timeout exceeded in regular expression match), and the rest just returned NULL for everything. Any idea why?

      R Offline
      R Offline
      RedDk
      wrote on last edited by
      #5

      Well, first of all those regex were std::regx not MySQL. What exactly have timing out and error #3699 to do with this "SELECT" query? I'm not sure why none of them work because I'm not sure how you're leveraging this database. REGEX_SUBSTR? Is that a function? Probably Oracle, he said. I know nothing about Oracle, sorry.

      L 1 Reply Last reply
      0
      • R RedDk

        Well, first of all those regex were std::regx not MySQL. What exactly have timing out and error #3699 to do with this "SELECT" query? I'm not sure why none of them work because I'm not sure how you're leveraging this database. REGEX_SUBSTR? Is that a function? Probably Oracle, he said. I know nothing about Oracle, sorry.

        L Offline
        L Offline
        Lost User
        wrote on last edited by
        #6

        There's nothing about Oracle here other than the company name. It's just a standard MySQL command, see https://dev.mysql.com/doc/refman/8.0/en/regexp.html#function_regexp-substr[^] and MySQL regexp_substr() function - javatpoint[^]. All I asked was which input to type there to get the domain name from a column that puts the domain between underscores (or just end of line) and after at least 2 underscores (examples in the OP).

        L 1 Reply Last reply
        0
        • L Lost User

          There's nothing about Oracle here other than the company name. It's just a standard MySQL command, see https://dev.mysql.com/doc/refman/8.0/en/regexp.html#function_regexp-substr[^] and MySQL regexp_substr() function - javatpoint[^]. All I asked was which input to type there to get the domain name from a column that puts the domain between underscores (or just end of line) and after at least 2 underscores (examples in the OP).

          L Offline
          L Offline
          Lost User
          wrote on last edited by
          #7

          I just tried a quick test at RegExr: Learn, Build, & Test RegEx[^] and the following regex finds all the web addresses in the set below (shown in bold). The only thing I cannot correct is that it includes the leading underscore. However it may be a starting point to help you (I have not actually tried it with MySQL).

          _[a-zA-Z]*\.[a-zA-Z]*|\.[a-zA-Z].

          Something_Else_Type_XY_Z_26.04.23**_website.com**
          Something_Else_Type_XY_Z**_website.co.uk**_26.04.23
          Something_Else_Type_XY_Z_26.04.23**_website.com**_Comment
          Something**_website.com**_Else_Type_XY_Z_26.04.23_Comment
          Something_Else_Type_XY_Z_26.04.23_Comment
          Something_Else_Type_XY_dfd6869_3_21.12.22**_website.com**_ZU

          L 1 Reply Last reply
          0
          • L Lost User

            I just tried a quick test at RegExr: Learn, Build, & Test RegEx[^] and the following regex finds all the web addresses in the set below (shown in bold). The only thing I cannot correct is that it includes the leading underscore. However it may be a starting point to help you (I have not actually tried it with MySQL).

            _[a-zA-Z]*\.[a-zA-Z]*|\.[a-zA-Z].

            Something_Else_Type_XY_Z_26.04.23**_website.com**
            Something_Else_Type_XY_Z**_website.co.uk**_26.04.23
            Something_Else_Type_XY_Z_26.04.23**_website.com**_Comment
            Something**_website.com**_Else_Type_XY_Z_26.04.23_Comment
            Something_Else_Type_XY_Z_26.04.23_Comment
            Something_Else_Type_XY_dfd6869_3_21.12.22**_website.com**_ZU

            L Offline
            L Offline
            Lost User
            wrote on last edited by
            #8

            Unfortunately REGEXP_SUBSTR(field_name, '_[a-zA-Z]*\.[a-zA-Z]*|\.[a-zA-Z]') just gives me the first 2 letters of the field_name ("So" in this case).

            L R 2 Replies Last reply
            0
            • L Lost User

              Unfortunately REGEXP_SUBSTR(field_name, '_[a-zA-Z]*\.[a-zA-Z]*|\.[a-zA-Z]') just gives me the first 2 letters of the field_name ("So" in this case).

              L Offline
              L Offline
              Lost User
              wrote on last edited by
              #9

              Yes, I got similar results, which does not make a lot of sense. However, on reading the documentation (https://dev.mysql.com/doc/refman/8.0/en/regexp.html#function_regexp-substr[^]) closely I discovered why. Hidden down near the bottom of the page is the following:

              Quote:

              To use a literal instance of a special character in a regular expression, precede it by two backslash (\) characters. The MySQL parser interprets one of the backslashes, and the regular expression library interprets the other. For example, to match the string 1+2 that contains the special + character, only the last of the following regular expressions is the correct one:

              So I tried with that, and a slight modification to my original pattern, and successfully extracted the fields as shown in the following results:

              mysql> SELECT REGEXP_SUBSTR(Title, '_[a-zA-Z]+(\\.[a-zA-Z]+)+') from Books;
              +---------------------------------------------------+
              | REGEXP_SUBSTR(Title, '_[a-zA-Z]+(\\.[a-zA-Z]+)+') |
              +---------------------------------------------------+
              | NULL |
              | NULL |
              | NULL |
              | NULL |
              | _website.com |
              | NULL |
              | _website.com |
              | _website.com |
              | _website.co.uk |
              | _website.com |
              | NULL |
              | NULL |
              +---------------------------------------------------+

              The NULL entries are records that do not contain a web address. [EDIT] I just tried without the leading underscore in the regular expression and added some numbers into the website names and got the following:

              mysql> SELECT REGEXP_SUBSTR(Title, '[a-zA-Z][[:alnum:]]+(\\.[a-zA-Z]+)+') from Books;
              +-------------------------------------------------------------+
              | REGEXP_SUBSTR(Title, '[a-zA-Z][[:alnum:]]+(\\.[a-zA-Z]+)+') |
              +-----------------------------------------------

              L R 2 Replies Last reply
              0
              • L Lost User

                Unfortunately REGEXP_SUBSTR(field_name, '_[a-zA-Z]*\.[a-zA-Z]*|\.[a-zA-Z]') just gives me the first 2 letters of the field_name ("So" in this case).

                R Offline
                R Offline
                RedDk
                wrote on last edited by
                #10

                Ok, (didn't read the doc link but thanks for that). Using your new regex:

                _[a-zA-Z]*\.[a-zA-Z]*|\.[a-zA-Z]

                on the sample of your OP:

                1. Something_Else_Type_XY_Z_26.04.23_website.com => _website.com
                2. Something_Else_Type_XY_Z_website.com_26.04.23 => _website.com
                3. Something_Else_Type_XY_Z_26.04.23_website.com_Comment => _website.com
                4. Something_website.com_Else_Type_XY_Z_26.04.23_Comment => NULL
                5. Something_Else_Type_XY_Z_26.04.23_Comment => NULL
                6. Something_Else_Type_XY_dfd6869_3_21.12.22_website.com_ZU = _website.com

                Which isn't quite what you wanted (because of that leading "_"). And as for RM's take on it, as-written (without any _[a-zA-Z]*\.) THAT string returns no matches here either (I use a tool called RegexBuddy and after yesterday's experience, with MySql selected as the input language not std::regex). So although I can confirm today's discovery is close it'll take me some more sleuthing to run down that leading character ascii 95 ... {EDIT] I'm seeing a warning when I try to tackle that underscore using "shorthand character classes" \w to the tune of "MySql doesn't support blah blah blah" so ... this could be a while. [END EDIT]

                L 1 Reply Last reply
                0
                • R RedDk

                  Ok, (didn't read the doc link but thanks for that). Using your new regex:

                  _[a-zA-Z]*\.[a-zA-Z]*|\.[a-zA-Z]

                  on the sample of your OP:

                  1. Something_Else_Type_XY_Z_26.04.23_website.com => _website.com
                  2. Something_Else_Type_XY_Z_website.com_26.04.23 => _website.com
                  3. Something_Else_Type_XY_Z_26.04.23_website.com_Comment => _website.com
                  4. Something_website.com_Else_Type_XY_Z_26.04.23_Comment => NULL
                  5. Something_Else_Type_XY_Z_26.04.23_Comment => NULL
                  6. Something_Else_Type_XY_dfd6869_3_21.12.22_website.com_ZU = _website.com

                  Which isn't quite what you wanted (because of that leading "_"). And as for RM's take on it, as-written (without any _[a-zA-Z]*\.) THAT string returns no matches here either (I use a tool called RegexBuddy and after yesterday's experience, with MySql selected as the input language not std::regex). So although I can confirm today's discovery is close it'll take me some more sleuthing to run down that leading character ascii 95 ... {EDIT] I'm seeing a warning when I try to tackle that underscore using "shorthand character classes" \w to the tune of "MySql doesn't support blah blah blah" so ... this could be a while. [END EDIT]

                  L Offline
                  L Offline
                  Lost User
                  wrote on last edited by
                  #11

                  See my update to yesterdays answer above.

                  1 Reply Last reply
                  0
                  • L Lost User

                    Yes, I got similar results, which does not make a lot of sense. However, on reading the documentation (https://dev.mysql.com/doc/refman/8.0/en/regexp.html#function_regexp-substr[^]) closely I discovered why. Hidden down near the bottom of the page is the following:

                    Quote:

                    To use a literal instance of a special character in a regular expression, precede it by two backslash (\) characters. The MySQL parser interprets one of the backslashes, and the regular expression library interprets the other. For example, to match the string 1+2 that contains the special + character, only the last of the following regular expressions is the correct one:

                    So I tried with that, and a slight modification to my original pattern, and successfully extracted the fields as shown in the following results:

                    mysql> SELECT REGEXP_SUBSTR(Title, '_[a-zA-Z]+(\\.[a-zA-Z]+)+') from Books;
                    +---------------------------------------------------+
                    | REGEXP_SUBSTR(Title, '_[a-zA-Z]+(\\.[a-zA-Z]+)+') |
                    +---------------------------------------------------+
                    | NULL |
                    | NULL |
                    | NULL |
                    | NULL |
                    | _website.com |
                    | NULL |
                    | _website.com |
                    | _website.com |
                    | _website.co.uk |
                    | _website.com |
                    | NULL |
                    | NULL |
                    +---------------------------------------------------+

                    The NULL entries are records that do not contain a web address. [EDIT] I just tried without the leading underscore in the regular expression and added some numbers into the website names and got the following:

                    mysql> SELECT REGEXP_SUBSTR(Title, '[a-zA-Z][[:alnum:]]+(\\.[a-zA-Z]+)+') from Books;
                    +-------------------------------------------------------------+
                    | REGEXP_SUBSTR(Title, '[a-zA-Z][[:alnum:]]+(\\.[a-zA-Z]+)+') |
                    +-----------------------------------------------

                    L Offline
                    L Offline
                    Lost User
                    wrote on last edited by
                    #12

                    Did you test according to my examples? What about:

                    Something_website.com_Else_Type_XY_Z_26.04.23_Comment => NULL (due to not having 2 preceding underscores first)

                    Your regex returns website.com there instead of NULL.

                    L 1 Reply Last reply
                    0
                    • L Lost User

                      Did you test according to my examples? What about:

                      Something_website.com_Else_Type_XY_Z_26.04.23_Comment => NULL (due to not having 2 preceding underscores first)

                      Your regex returns website.com there instead of NULL.

                      L Offline
                      L Offline
                      Lost User
                      wrote on last edited by
                      #13

                      No, I am afraid I did not work that one out; I leave that as an exercise for you.

                      1 Reply Last reply
                      0
                      • L Lost User

                        Yes, I got similar results, which does not make a lot of sense. However, on reading the documentation (https://dev.mysql.com/doc/refman/8.0/en/regexp.html#function_regexp-substr[^]) closely I discovered why. Hidden down near the bottom of the page is the following:

                        Quote:

                        To use a literal instance of a special character in a regular expression, precede it by two backslash (\) characters. The MySQL parser interprets one of the backslashes, and the regular expression library interprets the other. For example, to match the string 1+2 that contains the special + character, only the last of the following regular expressions is the correct one:

                        So I tried with that, and a slight modification to my original pattern, and successfully extracted the fields as shown in the following results:

                        mysql> SELECT REGEXP_SUBSTR(Title, '_[a-zA-Z]+(\\.[a-zA-Z]+)+') from Books;
                        +---------------------------------------------------+
                        | REGEXP_SUBSTR(Title, '_[a-zA-Z]+(\\.[a-zA-Z]+)+') |
                        +---------------------------------------------------+
                        | NULL |
                        | NULL |
                        | NULL |
                        | NULL |
                        | _website.com |
                        | NULL |
                        | _website.com |
                        | _website.com |
                        | _website.co.uk |
                        | _website.com |
                        | NULL |
                        | NULL |
                        +---------------------------------------------------+

                        The NULL entries are records that do not contain a web address. [EDIT] I just tried without the leading underscore in the regular expression and added some numbers into the website names and got the following:

                        mysql> SELECT REGEXP_SUBSTR(Title, '[a-zA-Z][[:alnum:]]+(\\.[a-zA-Z]+)+') from Books;
                        +-------------------------------------------------------------+
                        | REGEXP_SUBSTR(Title, '[a-zA-Z][[:alnum:]]+(\\.[a-zA-Z]+)+') |
                        +-----------------------------------------------

                        R Offline
                        R Offline
                        RedDk
                        wrote on last edited by
                        #14

                        Concur here, using RegexBuddy and this string: [a-zA-Z][[:alnum:]]*\.[a-zA-Z]*|\.[a-zA-Z] Nothing but net (colloquial expression eliding formal disdain without turning away in disgust from self-realized talentlessness). Thanks for this.

                        L 1 Reply Last reply
                        0
                        • R RedDk

                          Concur here, using RegexBuddy and this string: [a-zA-Z][[:alnum:]]*\.[a-zA-Z]*|\.[a-zA-Z] Nothing but net (colloquial expression eliding formal disdain without turning away in disgust from self-realized talentlessness). Thanks for this.

                          L Offline
                          L Offline
                          Lost User
                          wrote on last edited by
                          #15

                          MySQL's RE engine has one or two quirks of its own(after all, why make it easy?). The only part I could not figure out is how to ensure at least two underscores in the earlier part of the string. I bet @OriginalGriff or @RichardDeeming could do it with their eyes closed.

                          R 1 Reply Last reply
                          0
                          • L Lost User

                            MySQL's RE engine has one or two quirks of its own(after all, why make it easy?). The only part I could not figure out is how to ensure at least two underscores in the earlier part of the string. I bet @OriginalGriff or @RichardDeeming could do it with their eyes closed.

                            R Offline
                            R Offline
                            RedDk
                            wrote on last edited by
                            #16

                            Using RegexBuddy (yeah, I guess I am falling into a hole here) the POSIX notation is perfectly acceptable, hence your [:alnum:]. I can substitute [:alpha:] there as well and I suspect others in the list are good actors. My brain infarct occurs though when I use Herr Gevertz' table of translations and try to replace the POSIX with his ASCII ([a-zA-Z0-9]) or for that matter UNICODE ([\p{L}\p{Nl}\p{Nd}]), neither of which work in his interface. Grand tool, RegexBuddy ... it's one of those feature-rich interfaces which swim with example/sample in the help volume. My only complaint is that searching through documentation for words has no highlight of the found set so I've got to print out the page and use my pdf search to locate all instances!

                            1 Reply Last reply
                            0
                            Reply
                            • Reply as topic
                            Log in to reply
                            • Oldest to Newest
                            • Newest to Oldest
                            • Most Votes


                            • Login

                            • Don't have an account? Register

                            • Login or register to search.
                            • First post
                              Last post
                            0
                            • Categories
                            • Recent
                            • Tags
                            • Popular
                            • World
                            • Users
                            • Groups