regular expressions

    xkrja
    wrote on last edited by
    #1

    I'm creating a code editor with syntax highlighting and need help with the regular expressions. I've read a couple of pages about them but can't figure out exactly how to do what I want. For now, I only need support for comments and keywords like 'if', 'else', and so on. For the comments I created the following pattern:

    string commentPattern = @"%.*";

    It means that it matches everything to the right of, and including, the '%' character. This should be colored green. This works fine. The next pattern is for the keywords and looks like this:

    string syntaxPattern = @"(\bif\b)|(\bwhile\b)|(\bclassdef\b)|(\bproperties\b)|(\bend\b)|(\bmethods\b)|(\bfunction\b)|(\belse\b)|(\bfor\b)";

    It means it matches any of these words. These keywords are colored blue. This works fine too. BUT, when I run these two patterns in parallel, the keywords are colored blue even if they are inside a comment. How can I create a pattern that doesn't color the keywords blue if there is a '%' character to the left of them? Thanks for your help!

      Ravadre
      wrote on last edited by
      #2

      It can probably be done with regular expressions alone, but if I were you, I'd consider making your highlighter a bit more intelligent. Scan your code chunk by chunk, word by word: once you find a comment, you just skip to the next line; when you find a keyword, you move on to the next word, and so on. This comes in handy for more complicated cases. The other way is, of course, fully parsing the text, which would probably be too complicated for your needs.
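For the regex-only route, here is a minimal sketch, not a definitive answer. It assumes the two patterns from the question can be merged into one alternation with named groups; the class name `RegexHighlighter`, the method `Classify`, and the group names `comment` and `keyword` are made up for illustration. Because the regex engine scans left to right, it reaches a '%' before any keyword that follows it on the same line, and the comment alternative then consumes the rest of that line, so those keywords are never matched on their own.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class RegexHighlighter
{
    // One pattern for both token kinds. The keyword list is the one from the question.
    private static readonly Regex TokenPattern = new Regex(
        @"(?<comment>%.*)|(?<keyword>\b(?:if|while|classdef|properties|end|methods|function|else|for)\b)");

    // Returns (kind, start index, length) for every comment and keyword found, in text order.
    public static List<(string Kind, int Start, int Length)> Classify(string text)
    {
        var spans = new List<(string Kind, int Start, int Length)>();
        foreach (Match m in TokenPattern.Matches(text))
        {
            // Exactly one of the two named groups succeeds for each match.
            string kind = m.Groups["comment"].Success ? "comment" : "keyword";
            spans.Add((kind, m.Index, m.Length));
        }
        return spans;
    }
}
```

Each returned span can then be colored green or blue depending on its kind. Note that `.` does not match `\n` by default in .NET, so a comment stops at the end of its line.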

        xkrja
        wrote on last edited by
        #3

        Thanks for your reply. Can you perhaps give a little more detail on this approach?

          Ravadre
          wrote on last edited by
          #4

          Let's say you have: int x; % foo % bar x = 5; %x = 5

          Your rules:
          keyword = 'int'
          ident = anything that starts with _ or a letter, followed by letters, digits, or _
          comment = starts with % and runs to the newline

          Now you write a simple lexer that scans letter by letter, trying to fit what you have so far to as many rules as possible. When no possibilities are left, you go back one letter and pick the first rule that fits. So, for our example, it would be:

          buffer:   what it can be:
          'i'       int or ident
          'in'      int or ident
          'int'     int or ident
          'int '    nothing. Go back 1 letter
          'int'     first rule that fits is keyword int, so it's int
          ' '       fits nothing from the beginning, so ignore it
          'x'       fits ident
          'x '      fits nothing, go back
          ...

          So generally, you will get a list of tokens with their start and end positions, and then you just color them :).
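The scanning idea above could be sketched roughly like this in C#. This is an illustration under assumptions rather than a finished lexer: the names `TokenKind`, `Token`, `SimpleLexer`, and `Tokenize` are invented here, the keyword set contains only 'int' as in the example, and instead of literally backing up one letter, the loop reads ahead until the next character no longer fits, which gives the same longest match.

```csharp
using System;
using System.Collections.Generic;

public enum TokenKind { Keyword, Identifier, Comment }

// Hypothetical token type: what was found and where.
public sealed class Token
{
    public Token(TokenKind kind, int start, int length)
    {
        Kind = kind; Start = start; Length = length;
    }
    public TokenKind Kind { get; }
    public int Start { get; }
    public int Length { get; }
}

public static class SimpleLexer
{
    // Only 'int' is a keyword in this example; extend as needed.
    private static readonly HashSet<string> Keywords = new HashSet<string> { "int" };

    public static List<Token> Tokenize(string text)
    {
        var tokens = new List<Token>();
        int i = 0;
        while (i < text.Length)
        {
            char c = text[i];
            if (c == '%')
            {
                // Comment: runs from '%' up to (not including) the newline.
                int start = i;
                while (i < text.Length && text[i] != '\n') i++;
                tokens.Add(new Token(TokenKind.Comment, start, i - start));
            }
            else if (char.IsLetter(c) || c == '_')
            {
                // Longest run of letters, digits, and '_' starting here.
                int start = i;
                while (i < text.Length && (char.IsLetterOrDigit(text[i]) || text[i] == '_')) i++;
                string word = text.Substring(start, i - start);
                // The keyword rule wins over ident when the whole word is a keyword.
                TokenKind kind = Keywords.Contains(word) ? TokenKind.Keyword : TokenKind.Identifier;
                tokens.Add(new Token(kind, start, i - start));
            }
            else
            {
                // Fits no rule (whitespace, digits, punctuation): ignore it.
                i++;
            }
        }
        return tokens;
    }
}
```

Since a comment swallows everything up to the newline, a keyword that appears after '%' never becomes a token of its own, which is the behavior the original question asked for.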

            xkrja
            wrote on last edited by
            #5

            Thanks for your help. I'll take a look at it!
