How to parse a file and store few strings in an array

  • F Faez Shingeri

    Hi,

    What would be the best way to extract an array of strings (only the first string of each ","-separated entry) enclosed within curly braces ( { } ) in a file?

    Eg: Test.txt

    ***************************
    This is also in the text file
    {
    abc def ghi ,
    jkl mno pqr ,
    stu vwl yza ,
    }
    This is also in the text file
    *****************************

    I only want "abc", "jkl" and "stu" in an array.

    Thanks,
    Faez

    SandipG
    wrote on last edited by
    #2

    What have you done so far??

      Faez Shingeri
      wrote on last edited by
      #3

      int main()
      {
          char *buf[1024], *tok;
          FILE *fp1, *fp2;
          fp1=fopen("test.txt","r+");
          fp2=fopen("newtest.txt","w+");

          while(fgets(buf, bufsize, fp1) != NULL)
          {
              for(tok = strtok(buf,"{");tok !="}";)
              {
                  fprintf(fp2, "%s",buf);
              }
          }
          fclose(fp2);
          fclose(fp1);
      }

        Lost User
        wrote on last edited by
        #4

        Please use <pre> tags round your code (you can edit the above message) as it makes it so much easier to read. Your call to strtok() should use a pattern of all the characters you wish to ignore, you can then use the first returned token on each line, something like:

        while(fgets(buf, bufsize, fp1) != NULL)
        {
            tok = strtok(buf,"{ ,}");
            if (tok != NULL)
                printf("Token: %s\n", tok);
        }
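For anyone who wants to try the suggestion above, here is a minimal sketch of it as two small functions. The names `first_token`, `print_first_tokens` and `DELIMS` are made up for illustration, and the delimiter set is the thread's "{ ,}" plus tab, carriage return and newline, which is an assumption added here so that blank lines and a lone "}" yield no token at all.

```c
#include <stdio.h>
#include <string.h>

/* Delimiters: the "{ ,}" set from the post, plus whitespace characters
   (an assumption added here so that empty lines produce no token). */
static const char DELIMS[] = "{ ,}\t\r\n";

/* Return the first token on a line, or NULL if the line has none.
   The line is modified in place, as strtok() always does. */
char *first_token(char *line)
{
    return strtok(line, DELIMS);
}

/* Write the first token of every line of `in` to `out`.
   Returns the number of tokens written. */
int print_first_tokens(FILE *in, FILE *out)
{
    char buf[1024];   /* a plain char array, sized for one line */
    int count = 0;

    while (fgets(buf, sizeof buf, in) != NULL) {
        char *tok = first_token(buf);
        if (tok != NULL) {
            fprintf(out, "Token: %s\n", tok);
            count++;
        }
    }
    return count;
}
```

Calling `print_first_tokens(fp1, fp2)` with the two files from the question would be one way to use it. Note that it still reports the first word of lines outside the braces, so further filtering is needed.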

        Unrequited desire is character building. OriginalGriff I'm sitting here giving you a standing ovation - Len Goodman

          David Crow
          wrote on last edited by
          #5

          Did you really intend for buf to be an array of pointers?
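To make the point concrete, here is a small sketch comparing the two declarations. The function names are hypothetical; the only fact relied on is that `sizeof` reports the storage of the whole array.

```c
#include <stddef.h>

/* What the question declared: 1024 pointers to char, none of them
   pointing at any storage that fgets() could fill. */
size_t pointer_array_bytes(void)
{
    char *buf[1024];
    return sizeof buf;   /* 1024 * sizeof(char *) */
}

/* What fgets() and strtok() expect: room for up to 1023 characters
   plus the terminating '\0'. */
size_t char_array_bytes(void)
{
    char buf[1024];
    return sizeof buf;   /* 1024 */
}
```

With `char buf[1024];` the call `fgets(buf, sizeof buf, fp1)` also removes the need for the undeclared `bufsize`.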

          "One man's wage rise is another man's price increase." - Harold Wilson

          "Fireproof doesn't mean the fire will never come. It means when the fire comes that you will be able to withstand it." - Michael Simmons

          "Show me a community that obeys the Ten Commandments and I'll show you a less crowded prison system." - Anonymous

            Lost User
            wrote on last edited by
            #6

            Well spotted!

              Faez Shingeri
              wrote on last edited by
              #7

              Lovely! This program works like a charm. :-D But now I am also getting some unwanted strings before the point where I want to start extracting. My new output file now looks like this:

              Token: DCLGEN
              Token: LIBRARY
              Token: ACTION
              Token: LANGUAGE
              Token: NAMES
              Token: QUOTE
              Token: COLSUFFIX
              Token: IS
              Token: EXEC
              Token: POLICY_NO
              Token: REG_NO
              Token: EFFECTIVE_DATE
              Token: EXPIRY_DATE
              Token: CAN_EFF_DATE
              Token: CAN_PRO_DATE
              Token: RETURN_PREMIUM
              Token: CAN_PROCESSED
              Token: END-EXEC

              But I want only the strings after EXEC and before END-EXEC, i.e. from POLICY_NO to CAN_PROCESSED. I tried using

              while(strcmp("EXEC",tok))

              but it isn't working. :-/

              Thanks in advance,
              Faez

                Lost User
                wrote on last edited by
                #8

                Try this:

                BOOL bCapture = FALSE; // flag to ignore unwanted tokens
                while(fgets(buf, bufsize, fp1) != NULL)
                {
                    tok = strtok(buf,"{ ,}");
                    if (tok != NULL)
                    {
                        if (bCapture == FALSE)
                        {
                            // not capturing yet, so check for start of list
                            if (strcmp(tok, "EXEC") == 0)
                                bCapture = TRUE;
                        }
                        else // bCapture == TRUE
                        {
                            // bCapture is true so check for end of
                            // list, or display token
                            if (strcmp(tok, "END-EXEC") == 0)
                                bCapture = FALSE;
                            else
                                printf("Token: %s\n", tok);
                        }
                    }
                }

                [edit]Fixed the "END-EXEC" string, thanks to Carlo for pointing it out.[/edit]
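A note on why the earlier `while(strcmp("EXEC",tok))` attempt misbehaves: `strcmp()` returns 0 when the strings are equal, so that condition is true for every token that is not "EXEC". The flag logic above can be sketched as one portable function, assuming a plain `int` in place of the Windows `BOOL` type; the name `capture_step` is hypothetical.

```c
#include <string.h>

/* Decide whether `tok` lies strictly between "EXEC" and "END-EXEC".
   `*capturing` carries the state between calls and must start at 0.
   Returns 1 if the token should be kept, 0 otherwise. */
int capture_step(int *capturing, const char *tok)
{
    if (!*capturing) {
        /* Not capturing yet: only watch for the start marker. */
        if (strcmp(tok, "EXEC") == 0)
            *capturing = 1;
        return 0;
    }
    /* Capturing: the end marker switches it off and is not kept. */
    if (strcmp(tok, "END-EXEC") == 0) {
        *capturing = 0;
        return 0;
    }
    return 1;
}
```

Inside the `fgets()` loop this would be called once per token, storing or printing the token whenever the function returns 1.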

                  CPallini
                  wrote on last edited by
                  #9

                  My 5, though your code, as it stands, wouldn't match the-requirements. :-D

                  Veni, vidi, vici.

                  In testa che avete, signor di Ceprano?

                    Lost User
                    wrote on last edited by
                    #10

                    I wrote it over breakfast; what's missing?

                      CPallini
                      wrote on last edited by
                      #11

                      Nothing is missing; however, an underscore is not a hyphen. --Carlo The Nitpick.

                        Lost User
                        wrote on last edited by
                        #12

                        CPallini wrote:

                        Carlo The Eagle-Eye (struck out: Nitpick)

                        FTFY :)

                          Faez Shingeri
                          wrote on last edited by
                          #13

                          That code was a beauty again .... :-D Actually... the final file after strtok() looks like the one below... :|

                          Token: DCLGEN
                          Token: LIBRARY
                          Token: ACTION
                          Token: LANGUAGE
                          Token: NAMES
                          Token: QUOTE
                          Token: COLSUFFIX
                          Token: IS
                          Token: EXEC
                          Token: POLICY_NO
                          Token: REG_NO
                          Token: EFFECTIVE_DATE
                          Token: EXPIRY_DATE
                          Token: CAN_EFF_DATE
                          Token: CAN_PRO_DATE
                          Token: RETURN_PREMIUM
                          Token: CAN_PROCESSED
                          Token: END-EXEC
                          Token: COBOL
                          Token: DCLCANCL
                          Token: POLICY_NO
                          Token: CN-POLICY-NO //I need this string too in diff array name tableRow[6][10]
                          Token: REG_NO
                          Token: CN-REG-NO //I need this string too in diff array name tableRow[6][10]
                          Token: EFFECTIVE_DATE
                          Token: CN-EFFECTIVE-DATE //I need this string too in diff array name tableRow[6][10]
                          Token: EXPIRY_DATE
                          Token: CN-EXPIRY-DATE //I need this string too in diff array name tableRow[6][10]
                          Token: CAN_EFF_DATE
                          Token: CN-CAN-EFF-DATE //I need this string too in diff array name tableRow[6][10]
                          Token: CAN_PRO_DATE
                          Token: CN-CAN-PRO-DATE //I need this string too in diff array name tableRow[6][10]
                          Token: RETURN_PREMIUM
                          Token: CN-RETURN-PREMIUM //I need this string too in diff array name tableRow[6][10]
                          Token: CAN_PROCESSED
                          Token: CN-CAN-PROCESSED //I need this string too in diff array name tableRow[6][10]
                          Token: THE

                          Your code cleverly extracted the strings within EXEC and END-EXEC which I stored in

                          char tableCol[6][10]

                          But now I am finding it really difficult to store the strings that immediately follow each tableCol string in the file :( i.e. CN-POLICY-NO, CN-REG-NO, CN-EFFECTIVE-DATE, CN-EXPIRY-DATE, CN-CAN-EFF-DATE, CN-CAN-PRO-DATE, CN-RETURN-PREMIUM and CN-CAN-PROCESSED, which are to be stored in

                          char tableRow[6][10]

                          Apologies ...:( Thanks a ton, Faez

                            Lost User
                            wrote on last edited by
                            #14

                            Since they all begin with "CN-", you could add a test:

                            if (strncmp(tok, "CN-", 3) == 0)
                            printf ...

                            Alternatively you have to test for each one individually.

                              David Crow
                              wrote on last edited by
                              #15

                              Faez Shingeri wrote:

                              ...which are to be stored in

                              char tableRow[6][10]

                              Which won't be large enough.
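To spell that out: the list in the question has eight names, and the longest ones (for example "CN-RETURN-PREMIUM") are 17 characters, so each row needs at least 18 bytes including the terminator. A minimal sketch of a bounded store follows; `MAX_NAMES`, `MAX_LEN` and `store_name` are hypothetical names, and the capacities are just generous guesses.

```c
#include <string.h>

#define MAX_NAMES 16   /* hypothetical capacity, at least the 8 names here */
#define MAX_LEN   32   /* hypothetical width, at least 17 chars + '\0' */

/* Copy `tok` into the next free row of `table`.
   Returns 1 on success, 0 if the table is full or the token is too long. */
int store_name(char table[][MAX_LEN], int *count, const char *tok)
{
    if (*count >= MAX_NAMES || strlen(tok) >= MAX_LEN)
        return 0;
    strcpy(table[*count], tok);   /* safe: the length was checked above */
    (*count)++;
    return 1;
}
```

Declaring `char tableRow[MAX_NAMES][MAX_LEN];` and an `int` counter set to 0, then calling `store_name(tableRow, &count, tok)`, would replace the unchecked `strcpy()` calls.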
                                Faez Shingeri
                                wrote on last edited by
                                #16

                                No, no. The strings are not always preceded by "CN-". The strings that need to be stored in the tabRow array are the ones that follow the second occurrence of each tableCol value. Initially I also thought of using strcat() to combine "CN-" with the tabCol values, but then I realized that would work for this file only; other files are completely different. Regards, Faez

                                  Lost User
                                  wrote on last edited by
                                  #17

                                  If you don't have a common character identifier to select your strings then there is little that can be done. And from what you say you will have different rules for different files so it's difficult to suggest any other options.

                                    Faez Shingeri
                                    wrote on last edited by
                                    #18

                                    Finally I have come up with some not-so-good code, but it works. :) I created another, similar condition to parse from END-EXEC to the end of the file, as below:

                                    if (k == 0)
                                    {
                                        if (strcmp(tok, "END-EXEC") == 0)
                                            k = 1;
                                    }
                                    else
                                    {
                                        if (strcmp(tok, "THE") == 0)
                                        {
                                            k = 0;
                                        }
                                        else
                                        {
                                            strcpy(tabRow[m],tok);m++;
                                            fprintf(fp3,"Token: %s\n", tok);
                                        }
                                    }
                                    }
                                    }
                                    // Out of all the loops and condition now....

                                    j=0,n=0;
                                    for(j=0;j<=m;j++)
                                    {
                                        if(strcmp(tabCol[n],tabRow[j])==0)
                                        {
                                            k=j;
                                            strcpy(tabRow[n],tabRow[k+1]);
                                            n++;
                                        }
                                        else ;
                                    }
                                    for(k=0;k
                                    

                                    It would be nice to get your comments on this. :-D

                                    Thanks,
                                    Faez

                                      Lost User
                                      wrote on last edited by
                                      #19

                                      Difficult to say really. Assuming it does what you want then that's fine, but I don't see any checks for the specific strings after the "END-EXEC" that you wish to save. Or perhaps that is what the line

                                      if(strcmp(tabCol[n],tabRow[j])==0)

                                      is doing.
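For comparison, here is one possible way to write that pairing step as a separate function, assuming the tokens after END-EXEC have already been collected into an array. The name `pair_following` and its parameters are hypothetical. It keeps the results in a separate array rather than overwriting the token list in place, and it never reads past the last stored token, which the `j<=m` bound in the posted loop appears to do.

```c
#include <string.h>

/* For each column name in `cols`, find its first match in `toks` and
   record the token that follows it in `rows` (NULL if there is none).
   The pointers in `rows` refer to entries of `toks`; nothing is copied.
   Returns the number of pairs found. */
int pair_following(const char *cols[], int ncols,
                   const char *toks[], int ntoks,
                   const char *rows[])
{
    int found = 0;

    for (int c = 0; c < ncols; c++) {
        rows[c] = NULL;
        /* Stop one short of the end so toks[t + 1] always exists. */
        for (int t = 0; t + 1 < ntoks; t++) {
            if (strcmp(cols[c], toks[t]) == 0) {
                rows[c] = toks[t + 1];
                found++;
                break;
            }
        }
    }
    return found;
}
```

Because it only relies on position (the token after each column name), it does not depend on the "CN-" prefix, which matches the requirement that other files use different names.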
