Parsing in C

    Software2007 wrote:

    I know how to do this in C++, but I'm a bit rusty on C. Given

    char line[1024] = "d(Text1)b(ID1)n(node1)d(Text2)b(ID2)n(node2)d(Text3)b(ID3)n(node3)...";

    I would like to parse the line into item0 = d(Text1)b(ID1)n(node1), item1 = d(Text2)b(ID2)n(node2), and so forth.

    char *py;
    char *px = line;
    while ((py=strstr(px,"d(")) != NULL) {
    // if ((px=strstr(py,")")) == NULL)
    // printf("error1");
    px++;
    }

    return 0;

    Thanks
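One way to finish the strstr() approach from the question, as a minimal sketch: each item starts at a "d(" and runs up to the next "d(" or the end of the line. The helper name split_items and the fixed MAX_ITEM_LEN are assumptions made for illustration, and the sketch assumes the sequence "d(" never appears inside a field.

```c
#include <string.h>

#define MAX_ITEM_LEN 256  /* assumed upper bound on one item's length */

/* Hypothetical helper: copies each "d(...)b(...)n(...)" group of `line`
   into items[0..max_items-1] and returns how many were stored.
   Assumes "d(" only ever marks the start of an item. */
size_t split_items(const char *line, char items[][MAX_ITEM_LEN], size_t max_items)
{
    size_t count = 0;
    const char *start = strstr(line, "d(");

    while (start != NULL && count < max_items) {
        /* Look for the next item start, beginning after this "d(". */
        const char *next = strstr(start + 2, "d(");
        size_t len = next ? (size_t)(next - start) : strlen(start);

        if (len >= MAX_ITEM_LEN)
            len = MAX_ITEM_LEN - 1;      /* truncate rather than overflow */

        memcpy(items[count], start, len);
        items[count][len] = '\0';        /* always terminate the copy */
        count++;
        start = next;
    }
    return count;
}
```

Called with an array such as `char items[16][MAX_ITEM_LEN]`, this should leave "d(Text1)b(ID1)n(node1)" in items[0], "d(Text2)b(ID2)n(node2)" in items[1], and so on.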

    Albert Holguin wrote (#2):

    Did you try sprintf()? http://www.cplusplus.com/reference/clibrary/cstdio/sprintf/
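Note that sprintf() formats output; for pulling fields out of a string, its input counterpart sscanf() is the function that applies. A minimal sketch, assuming every item has exactly the d(...)b(...)n(...) shape and that no field is empty or longer than 99 characters. The struct and function names are made up for illustration.

```c
#include <stdio.h>

struct fields {       /* hypothetical holder for one parsed item */
    char d[100];
    char b[100];
    char n[100];
};

/* Parses one item at the start of `p` into `out`.
   Returns a pointer just past the item, or NULL if the text does not match. */
const char *scan_item(const char *p, struct fields *out)
{
    int used = 0;

    /* %99[^)] reads up to 99 characters that are not ')';
       %n stores how many characters have been consumed so far. */
    if (sscanf(p, "d(%99[^)])b(%99[^)])n(%99[^)])%n",
               out->d, out->b, out->n, &used) != 3 || used == 0)
        return NULL;

    return p + used;
}
```

Calling scan_item() in a loop on the returned pointer walks the whole line one item at a time.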


      Chris Losinger wrote (#3):

      So, is it "n(nodename)" that signals the end of a 'node' and "d(whatever)" the start? I always find state machines the easiest way to do simple parsing. Off the top of my head...

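      #include <stdlib.h>

      typedef struct item_t
      {
          char d[100];
          char b[100];
          char n[100];
      } item;

      // stand-in for "some array thing": replace with your own container
      typedef struct { item *items[64]; size_t count; } itemArray;

      static void addToArray(itemArray *array, item *it)
      {
          if (array->count < 64) array->items[array->count++] = it;
          else free(it); // full: drop the item rather than overflow
      }

      // p points at '('. copies the text before the next ')' into out,
      // returns a pointer just past the ')' or NULL on error
      const char *consume_between_parens(const char *p, char *out, size_t outSize)
      {
          size_t i = 0;

          if (*p != '(') return NULL; // error

          p++;
          while (*p && *p != ')')
          {
              if (i + 1 < outSize) out[i++] = *p; // copy (truncates if too long)
              p++; // next
          }
          out[i] = '\0'; // terminate

          return (*p == ')') ? p + 1 : NULL; // NULL: no closing ')'
      }

      // returns 0 on success, -1 on malformed input
      int parse(const char *p, itemArray *array)
      {
          enum {wantD, wantB, wantN} state = wantD;

          item *curItem = NULL;

          while (p && *p)
          {
              switch (state)
              {
              case wantD: // we need a 'd'
                  if (*p != 'd') return -1;
                  curItem = malloc(sizeof(item));
                  if (!curItem) return -1;
                  p = consume_between_parens(p + 1, curItem->d, sizeof curItem->d);
                  state = wantB;
                  break;
              case wantB: // we need a 'b'
                  if (*p != 'b') { free(curItem); return -1; }
                  p = consume_between_parens(p + 1, curItem->b, sizeof curItem->b);
                  state = wantN;
                  break;
              case wantN: // we need an 'n'
                  if (*p != 'n') { free(curItem); return -1; }
                  p = consume_between_parens(p + 1, curItem->n, sizeof curItem->n);
                  if (p) { addToArray(array, curItem); curItem = NULL; }
                  state = wantD;
                  break;
              }
          }

          // p == NULL: a field was malformed; state != wantD: input ended mid-item
          if (p == NULL || state != wantD) { free(curItem); return -1; }
          return 0;
      }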


        David Crow wrote (#4):

        Have you looked at strtok()?
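A rough sketch of what the strtok() route could look like, assuming the fields never contain parentheses and are never empty (strtok() skips empty tokens, which would shift the tag/value pairing). strtok() modifies its input, so the line must be a writable array rather than a string literal. The function name print_pairs is illustrative only.

```c
#include <stdio.h>
#include <string.h>

/* Splits `line` in place on '(' and ')' and prints tag/value pairs.
   Returns the number of pairs seen. `line` must be writable. */
int print_pairs(char *line)
{
    int pairs = 0;
    char *tag = strtok(line, "()");

    while (tag != NULL) {
        char *value = strtok(NULL, "()");   /* text between the parentheses */
        if (value == NULL)
            break;                          /* tag without a value: stop */

        printf("%s = %s\n", tag, value);
        pairs++;
        tag = strtok(NULL, "()");           /* next tag: "d", "b" or "n" */
    }
    return pairs;
}
```

Grouping every three pairs (d, b, n) then gives one item. Because strtok() keeps hidden state between calls, it cannot be used for two strings at once.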

        "One man's wage rise is another man's price increase." - Harold Wilson

        "Fireproof doesn't mean the fire will never come. It means when the fire comes that you will be able to withstand it." - Michael Simmons

        "Show me a community that obeys the Ten Commandments and I'll show you a less crowded prison system." - Anonymous


          Software2007 wrote (#5):

          Fantastic! I did get it to work with very minor modifications.

          #include <stdlib.h>
          #include <tchar.h> // MSVC-specific: _tmain, _TCHAR

          typedef struct item_t
          {
              char d[8192];
              char n[8192];
              char i[8192];
          } item;

          char *consume_between_parens(char *p, char *out)
          {
              int paren_count = 1;

              p++; // advance to where we think there is '('

              if (*p != '(') return NULL; // error

              p++;
              while (*p)
              {
                  if (*p == '(')
                      paren_count++;

                  if (*p == ')')
                      paren_count--;

                  if (!paren_count) // done
                      break;

                  *out = *p; // copy
                  out++; p++; // next
              }
              *out = '\0'; // terminate the copied field

              return p; // points at the closing ')' (or at the end of the string)
          }

          void parse(char *p)
          {
              item *curItem = NULL;

              while (p && *p) // p is NULL if a field was malformed
              {
                  if (*p == 'd')
                  {
                      free(curItem); // previous item is not stored anywhere yet
                      curItem = (item*)malloc(sizeof(item));
                      if (!curItem) return;
                      p = consume_between_parens(p, curItem->d);
                  }
                  else if (*p == 'n' && curItem)
                  {
                      p = consume_between_parens(p, curItem->n);
                  }
                  else if (*p == 'i' && curItem)
                  {
                      p = consume_between_parens(p, curItem->i);
                      // addToArray(array, curItem); curItem = NULL; // hand ownership to the array
                  }

                  if (p && *p) p++;
              }

              free(curItem);
          }

          int _tmain(int argc, _TCHAR* argv[])
          {
              char *text ="1:7\22\\d(2011)n(0)i(711910)d(2010)n(1)i(711911)";

              parse(text);

              return 0;
          }


            Software2007 wrote (#6):

            I just did. Thanks for mentioning it, this function actually makes things a lot easier, since it does most of the pointer work for you. I will give it a try. Thanks again


              Software2007 wrote (#7):

              I have one more C-related question:

              char *getLine(char *p, char *out)
              {
                  int Dees = 0;
                  char *px = p;
                  p++; // advance to where we think there is '('

                  if (*p != '(') return NULL; // error

                  int paren_count = 1;

                  while (*px)
                  {
                      if (*px == 'd')
                      {
                          Dees++;
                          if (Dees > 1 && *(px+1) == '(') // Done at 2nd 'd'
                              break;
                      }

                      *out = *px; // copy
                      out++; px++; // next
                  }
                  return px;
              }

              // main program
              char *px = "d(text)n(0)l(1)d(text2)n(1)l(2)...";
              char line[8192];
              char *py = getLine(px, line);

              In the main program, after calling getLine(), I do get line correctly, as in "d(text)n(0)l(1)", but it is filled with garbage after that, as if it's trying to fill out all 8192 characters. How do I make it not fill the rest of the unused space? Note that I don't know the size of "line" in advance, as it varies, but I need it big enough. I know strcpy could be useful here somewhere; I just can't get it to work. I also tried the following in getLine():

              char field[4095];
              strcpy(field,out);

              But nothing gets copied, out shows null in the debugger. Thanks


                David Crow wrote (#8):

                After the while() loop, you need to terminate out with a '\0' character.
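A sketch of that fix applied to getLine(): write the terminator once the loop ends. Nothing initialises the unused part of `line`, so without the '\0' any string function keeps reading past the copied text. It likely also explains the strcpy() attempt: by the time the loop finishes, `out` has been advanced past the copied characters and no longer points at the start of the text. The outSize parameter and the const qualifiers are assumptions added here, not part of the original function.

```c
#include <stddef.h>

/* Copies the first item of `p` (everything up to the second "d(") into `out`
   and returns a pointer to where the next item starts.
   `outSize` is the capacity of `out`, including room for the terminator. */
const char *getLine(const char *p, char *out, size_t outSize)
{
    int dees = 0;
    size_t i = 0;

    if (outSize == 0 || p[0] != 'd' || p[1] != '(')
        return NULL;                      /* error: no room or no leading "d(" */

    while (*p) {
        if (*p == 'd' && *(p + 1) == '(') {
            dees++;
            if (dees > 1)                 /* done at the second "d(" */
                break;
        }
        if (i + 1 < outSize)
            out[i++] = *p;                /* copy, leaving room for '\0' */
        p++;
    }

    out[i] = '\0';                        /* the missing step: terminate `out` */
    return p;
}
```

With the terminator in place the buffer can stay 8192 bytes; only the characters before the '\0' count as the string.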

                  Software2007 wrote (#9):

                  Fabulous!

                  while (*px != '\0')

                  did the trick. I have to admit I don't quite get it, though. In the string "d(2011)n(0)i(711910)d(2010)n(1)i(711911)", I am telling it to break out when it sees the next 'd(', so why do I need the terminating character? Could you clarify? Thanks much.


                    David Crow wrote (#10):

                    Software2007 wrote:

                    while (*px != '\0')

                    did the trick.

                    Not sure how, since these three statements are identical:

                    while (*px)
                    while (*px != 0)
                    while (*px != '\0')

                    What you should have done instead is add a '\0' character to out after the while() loop. That way, it'll be properly (i.e., null) terminated.
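To see why changing the loop condition cannot have been the fix, here is a small check that the three spellings stop at the same place. The helper names are invented for the example.

```c
#include <stddef.h>

/* Each function counts characters up to the terminator using one of the
   three loop conditions; all three express the same test. */
size_t count_plain(const char *px)
{
    size_t n = 0;
    while (*px) { n++; px++; }
    return n;
}

size_t count_zero(const char *px)
{
    size_t n = 0;
    while (*px != 0) { n++; px++; }
    return n;
}

size_t count_nul(const char *px)
{
    size_t n = 0;
    while (*px != '\0') { n++; px++; }
    return n;
}
```

If the output looked right after that change, the likely explanation is that the unused part of the buffer happened to contain a zero byte, which is not something to rely on.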

                      Software2007 wrote (#11):

                      Great, makes sense.
