I have begun the long slow journey to redemption.

The Lounge
Tags: regex, help, career
8 posts, 4 posters
    honey the codewitch wrote (#1):

    My Visual FA project was a bust, with broken benchmarks and code that was too slow. I know that in theory a DFA regex can beat one that backtracks, but Microsoft has done a damned good job of optimizing their engine in recent years. My issue is that they're using ReadOnlySpan over the input string, so if I want to beat it I have to use that as well. That won't stream, but I can generate separate code for that. Here's my result so far:

    Microsoft Regex compiled "Lexer": Found 170000 matches in 13ms
    FAStringRunner: Found 170000 matches in 10ms

    I can probably make it a little faster still. The FAStringRunner was created by hand, by me, not generated from a regular expression. It looks like this:

    q1: // whitespace run: 9 = tab, 10 = LF, 13 = CR, 32 = space
    if (((((ch >= 9)
        && (ch <= 10))
        || (ch == 13))
        || (ch == 32)))
    {
        ++len;
        // read the next character, using '\0' as the end-of-input sentinel
        if (position < span.Length - 1)
        {
            ch = span[unchecked((int)++position)];
        }
        else
        {
            ch = '\0';
        }
        goto q1;
    }
    // no further whitespace: report the match with symbol id 2
    return FAMatch.Create(2, span.Slice(unchecked((int)p), len).ToString(), p, l, c);
    q2: // 49-57 = '1' through '9'
    if (((ch >= 49)
        && (ch <= 57)))
    {
        ++len;
        if (position < span.Length - 1)
        {
            ch = span[unchecked((int)++position)];
        }
        else
        {
            ch = '\0';
        }
        goto q3;
    }
    goto errorout;
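
    For readers who want to run something, the fragment above can be fleshed out into a complete matcher along the following lines. This is only a sketch, not the FAStringRunner itself: the tuple return type stands in for the FAMatch type (which the post does not show), the `Integer` and `Error` ids are made up, and the behavior of `q3` (further digits 0-9) is an assumption, since the post cuts off before that state. Only the whitespace id 2 and the `'\0'` end-of-input sentinel come from the original.

```csharp
using System;
using System.Collections.Generic;

public static class HandRolledLexer
{
    public const int Whitespace = 2; // id used by FAMatch.Create(2, ...) in the post
    public const int Integer = 1;    // assumed id
    public const int Error = -1;     // assumed id for input no state accepts

    // Moves to the next character; '\0' stands in for end of input.
    private static char Advance(ReadOnlySpan<char> span, ref int position)
    {
        ++position;
        return position < span.Length ? span[position] : '\0';
    }

    // Matches one token starting at 'position' and leaves 'position' just past it.
    public static (int SymbolId, string Value, int Position) Next(
        ReadOnlySpan<char> span, ref int position)
    {
        int start = position;
        int len = 0;
        char ch = position < span.Length ? span[position] : '\0';

        // q0: start state, dispatch on the first character
        if (ch == '\t' || ch == '\n' || ch == '\r' || ch == ' ')
        {
            ++len;
            ch = Advance(span, ref position);
            goto q1;
        }
        if (ch >= '1' && ch <= '9')
        {
            ++len;
            ch = Advance(span, ref position);
            goto q3;
        }
        goto errorout;

    q1: // accepting: inside a whitespace run
        if (ch == '\t' || ch == '\n' || ch == '\r' || ch == ' ')
        {
            ++len;
            ch = Advance(span, ref position);
            goto q1;
        }
        return (Whitespace, span.Slice(start, len).ToString(), start);

    q3: // accepting: inside an integer (assumed to allow any further digit)
        if (ch >= '0' && ch <= '9')
        {
            ++len;
            ch = Advance(span, ref position);
            goto q3;
        }
        return (Integer, span.Slice(start, len).ToString(), start);

    errorout:
        // consume one character so the caller always makes progress
        if (position < span.Length)
        {
            ++position;
            ++len;
        }
        return (Error, span.Slice(start, len).ToString(), start);
    }

    public static List<(int SymbolId, string Value, int Position)> Tokenize(string input)
    {
        var result = new List<(int SymbolId, string Value, int Position)>();
        ReadOnlySpan<char> span = input.AsSpan();
        int position = 0;
        while (position < span.Length)
        {
            result.Add(Next(span, ref position));
        }
        return result;
    }

    public static void Main()
    {
        foreach (var token in Tokenize("12 345\t6"))
        {
            Console.WriteLine($"{token.SymbolId} at {token.Position}: \"{token.Value}\"");
        }
    }
}
```

    Each state is a label and each transition is a `goto`, so the matcher never backtracks and touches every input character once. Unlike the original fragment, this sketch always advances `position`, which keeps the calling loop simple at the cost of one extra bounds comparison per character.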

    I wrote this by hand because I needed to see whether I could match faster than Microsoft before writing any generator code to do so. I've got a lot of code to write.

    Edit: Okay, they're really challenging me here:

    Microsoft Regex compiled "Lexer": Found 1440000 matches in 91ms
    FAStringRunner: Found 1430000 matches in 81ms
    Microsoft Regex compiled "Lexer": Found 1440000 matches in 89ms
    FAStringRunner: Found 1430000 matches in 80ms
    Microsoft Regex compiled "Lexer": Found 1440000 matches in 80ms
    FAStringRunner: Found 1430000 matches in 80ms
    Microsoft Regex compiled "Lexer": Found 1440000 matches in 70ms
    FAStringRunner: Found 1430000 matches in 80ms
    Microsoft Regex compiled "Lexer": Found 1440000 matches in 71ms
    FAStringRunner: Found 1430000 matches in 80ms
    Microsoft Regex compiled "Lexer": Found 1440000 matches in 71ms
    FAStringRunner: Found 1430000 matches in 81ms
    Microsoft Regex compiled "Lexer": Found 1440000 matches in 71ms
    FAStringRunner: Found 1430000 matches in 80ms
    Microsoft Regex compiled "Lexer": Found 1440000 matches in 73ms
    FAStringRunner: Found 1430000 matches in 79ms
    Microsoft Regex compiled "Lexer": Found 1440000 matches in 72ms
    FAStringRunner: Found 1430000 matches in 80ms
    Microsoft Regex compiled "Lexer": Found 1440000 m

      englebart wrote (#2, in reply to honey the codewitch):

      Is the difference between 1440000 and 1430000 significant? Or is that part of the reason for early prototyping? Almost good enough? “Perfect is the enemy of good enough.”

        honey the codewitch wrote (#3, in reply to englebart):

        No. I noticed it after the fact. It's an off-by-one error in my benchmark code, iterated 10,000 times. :laugh:

        Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix

          Lost User wrote (#4, in reply to honey the codewitch):

          Doesn’t this piece of code have one set of brackets too many? Admittedly, C-type languages are not my forte, so I’m probably wrong:

          if (((ch >= 49) && (ch <= 57)))

            honey the codewitch wrote (#5, in reply to Lost User):

            The jump tables were copied from code I generated using an AST. It is valid code, even if there are spurious parentheses. They have no effect on the compilation or execution.

              BernardIE5317 wrote (#6, in reply to honey the codewitch):

              Greetings and Kind Regards. Am I mistaken, or is the first if logically identical to the following?

              if (ch == 9 || ch == 10 || ch == 13 || ch == 32)

              Thank You Kindly

                honey the codewitch wrote (#7, in reply to BernardIE5317):

                Yes. The if conditions were generated by a tool, and it doesn't bother breaking up the range 9-10 into two single equality comparisons.

                  BernardIE5317 wrote (#8, in reply to Lost User):

                  Greetings and Kind Regards. It can also be written as

                  if (ch >= 49 && ch <= 57)

                  as relational operators have higher precedence than logical operators.
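
                  Both equivalences raised in the replies are easy to confirm by brute force, since a C# `char` has only 65,536 possible values. The sketch below compares the tool-generated conditions quoted in the thread against the simplified forms. The class and method names are invented for illustration and are not part of Visual FA.

```csharp
using System;

public static class ConditionCheck
{
    // Whitespace test as emitted by the generator (extra parentheses kept).
    public static bool GeneratedWhitespace(char ch) =>
        ((((ch >= 9) && (ch <= 10)) || (ch == 13)) || (ch == 32));

    // Form suggested in post #6: four equality comparisons.
    public static bool ExpandedWhitespace(char ch) =>
        ch == 9 || ch == 10 || ch == 13 || ch == 32;

    // Digit test as emitted by the generator.
    public static bool GeneratedDigit(char ch) => (((ch >= 49) && (ch <= 57)));

    // Form suggested in post #8: no parentheses, relying on precedence.
    public static bool PlainDigit(char ch) => ch >= 49 && ch <= 57;

    // Counts the char values on which either pair of forms disagrees.
    public static int CountMismatches()
    {
        int mismatches = 0;
        // iterate with an int so the loop can terminate after char.MaxValue
        for (int i = char.MinValue; i <= char.MaxValue; i++)
        {
            char ch = (char)i;
            if (GeneratedWhitespace(ch) != ExpandedWhitespace(ch)) mismatches++;
            if (GeneratedDigit(ch) != PlainDigit(ch)) mismatches++;
        }
        return mismatches;
    }

    public static void Main()
    {
        Console.WriteLine(CountMismatches()); // prints 0
    }
}
```

                  A count of zero means the forms agree on every possible character, so the choice between them is purely about readability.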
