extracting data from .TXT file

    benjamin yap
    wrote on last edited by
    #1

    Hi guys, I have a .txt file that contains HTML code:

    <td class="ticker_name"><a href="http://finance.yahoo.com/q;_ylt=Al7he7j4xkWAM99.PPB7UVFO7sMF;_ylu=X3oDMTE5cnE2OWVuBHBvcwMzBHNlYwNtYXJrZXRTdW1tYXJ5SW5kaWNlcwRzbGsDbmFzZGFx?s=^IXIC" >Nasdaq</a></td><td><span class="streaming-datum" id="yfs_l10_^ixic">2,147.35</span></td><td class="ticker_down"><span class="streaming-datum" id="yfs_c10_^ixic">-31.65</span></td><td class="right_cell ticker_down"><span class="streaming-datum" id="yfs_pp0_^ixic">-1.45%</span>

    How do I extract the values, for example 2,147.35 and -31.65, for the Nasdaq?

      Lost User
      wrote on last edited by
      #2

      You could write your own parser or adapt the one from this article.
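
As a rough illustration of what a hand-written parser could look like for markup this simple, here is a minimal sketch that collects the text found between tags. The name text_nodes is hypothetical (it is not from the linked article), and the code assumes that no '<' or '>' appears inside attribute values, comments, or scripts, which holds for the snippet in the question but not for HTML in general.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: collects the text found between tags, in document order.
// Assumes no '<' or '>' appears inside attribute values, comments, or scripts.
std::vector<std::string> text_nodes(const std::string& html)
{
    std::vector<std::string> nodes;
    std::string current;
    bool inTag = false;
    for (char c : html)
    {
        if (c == '<')
        {
            // A tag starts: whatever text was collected so far is one node.
            if (!current.empty())
                nodes.push_back(current);
            current.clear();
            inTag = true;
        }
        else if (c == '>')
        {
            inTag = false;
        }
        else if (!inTag)
        {
            current += c;
        }
    }
    if (!current.empty())
        nodes.push_back(current); // trailing text after the last tag
    return nodes;
}
```

For the markup in the question, the nodes would be the ticker name followed by the three values, so the numbers can be picked out by position. Whitespace between tags, if present, would show up as extra nodes and would need filtering.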

      MVP 2010 - are they mad?

        enhzflep
        wrote on last edited by
        #3

        My immediate approach would be to:
        (a) Scan for and replace all commas with nothing, i.e. "," --> ""
        (b) Scan for the text ixic">
        (c) If the string is not found, exit the loop: jump to (g)
        (d) Advance the returned pointer by the length of the search string (6 bytes)
        (e) Do an sscanf, asking for a float
        (f) Return to (b)
        (g) ...

        Perhaps a little something like this?

        #include <cstdio>
        #include <cstring>
        #include <string>

        using namespace std;

        // Replace every occurrence of 'search' in 'subject' with 'replace' (in place).
        string& str_replace(const string &search, const string &replace, string &subject)
        {
            string buffer;

            const size_t sealeng = search.length();
            const size_t strleng = subject.length();

            if (sealeng == 0)
                return subject; // nothing to search for, no change

            for (size_t i = 0, j = 0; i < strleng; j = 0)
            {
                // Count how many characters of 'search' match at position i.
                while (i + j < strleng && j < sealeng && subject[i + j] == search[j])
                    j++;
                if (j == sealeng) // found 'search'
                {
                    buffer.append(replace);
                    i += sealeng;
                }
                else
                {
                    buffer += subject[i++];
                }
            }
            subject = buffer;
            return subject;
        }

        int main()
        {
            const char *filename = "infile.html";
            const string findMe = "ixic\">"; // tail of the id attribute that precedes each value

            FILE *fp = fopen(filename, "rb");
            if (fp == NULL)
            {
                perror(filename);
                return 1;
            }

            // Read the whole file into a NUL-terminated buffer.
            fseek(fp, 0, SEEK_END);
            long fileSize = ftell(fp);
            fseek(fp, 0, SEEK_SET);
            if (fileSize < 0)
            {
                fclose(fp);
                return 1;
            }
            char *htmlStr = new char[fileSize + 1];
            size_t bytesRead = fread(htmlStr, sizeof(char), fileSize, fp);
            htmlStr[bytesRead] = 0;
            fclose(fp);

            // Strip the thousands separators so "2,147.35" scans as 2147.35.
            string tmpS = htmlStr;
            delete[] htmlStr; // allocated with new[], so release with delete[]
            str_replace(",", "", tmpS);
            printf("%s\n\n", tmpS.c_str());

            // Each value directly follows findMe. Note that this also picks up
            // the percentage span, whose id ends in ixic"> as well.
            const char *pos1 = tmpS.c_str();
            float retrievedNum;
            while ((pos1 = strstr(pos1, findMe.c_str())) != NULL)
            {
                pos1 += findMe.length();
                if (sscanf(pos1, "%f", &retrievedNum) == 1)
                    printf("Retrieved: %f\n", retrievedNum);
            }

            return 0;
        }

        Yeah, the code's not winning any beauty pageants. :-O
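
The search for ixic"> in the code above matches every span whose id ends that way, so it reports the percentage change along with the two values asked for. One possible variation is to look up each value by its full id. The following is only a sketch: value_for_id is a hypothetical helper name, and it assumes the id strings from the sample markup (yfs_l10_^ixic for the last price, yfs_c10_^ixic for the change) appear exactly as shown in the question.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: finds id="<id>"> in 'html' and parses the number that
// follows it. Returns false if the id is missing or no number follows it.
bool value_for_id(const std::string& html, const std::string& id, double& out)
{
    assert(!id.empty());
    const std::string marker = "id=\"" + id + "\">";
    const std::string::size_type start = html.find(marker);
    if (start == std::string::npos)
        return false;

    // Copy the element text up to the next '<', skipping thousands separators.
    std::string digits;
    for (std::string::size_type i = start + marker.length();
         i < html.length() && html[i] != '<'; ++i)
    {
        if (html[i] != ',')
            digits += html[i];
    }

    char* end = nullptr;
    out = std::strtod(digits.c_str(), &end);
    return end != digits.c_str(); // true only if at least one character was parsed
}
```

Calling it once with "yfs_l10_^ixic" and once with "yfs_c10_^ixic" would yield the last price and the change without touching the rest of the file, and the commas are skipped only inside the value itself rather than across the whole document.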

          Moak
          wrote on last edited by
          #4

          benjamin yap wrote:

          how do i extract the value example 2,147.35 and -31.65 of NASDAQ

          Have a look at regular expressions (RE). There are various libraries for C++; see Boost or the CodeProject articles about them. You could scan through your HTML input line by line and use a regular expression to test for and extract the wanted information. Hope this helps, M
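
A minimal sketch of the regular-expression idea, using std::regex from the C++11 standard library instead of Boost (that substitution is an assumption, the post does not name a specific library). The function name extract_ixic_values is hypothetical, and the pattern assumes the id layout shown in the question, where l10 marks the last price and c10 the change.

```cpp
#include <algorithm>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the numbers found in spans whose id is
// yfs_l10_^ixic or yfs_c10_^ixic, in document order, with commas removed.
std::vector<double> extract_ixic_values(const std::string& html)
{
    // Group 1 captures an optionally signed number that may contain
    // thousands separators, up to the '<' of the closing tag.
    static const std::regex re(R"(id="yfs_(?:l10|c10)_\^ixic">([-+]?[0-9][0-9,.]*)<)");

    std::vector<double> values;
    for (std::sregex_iterator it(html.begin(), html.end(), re), end; it != end; ++it)
    {
        std::string text = (*it)[1].str();
        // Drop the thousands separators before converting.
        text.erase(std::remove(text.begin(), text.end(), ','), text.end());
        values.push_back(std::stod(text));
    }
    return values;
}
```

Because the pattern names the l10 and c10 fields explicitly, the percentage span (pp0) is not matched. Running the match over the whole input rather than line by line also works when the markup sits on a single line, as it does in the question.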

          Webchat in Europe :java: (only 4K)
