Extracting data from a .TXT file
-
Hi guys, I have a text file that contains HTML code:
<td class="ticker_name"><a href="http://finance.yahoo.com/q;_ylt=Al7he7j4xkWAM99.PPB7UVFO7sMF;_ylu=X3oDMTE5cnE2OWVuBHBvcwMzBHNlYwNtYXJrZXRTdW1tYXJ5SW5kaWNlcwRzbGsDbmFzZGFx?s=^IXIC" >Nasdaq</a></td><td><span class="streaming-datum" id="yfs_l10_^ixic">2,147.35</span></td><td class="ticker_down"><span class="streaming-datum" id="yfs_c10_^ixic">-31.65</span></td><td class="right_cell ticker_down"><span class="streaming-datum" id="yfs_pp0_^ixic">-1.45%</span>
How do I extract the values, for example 2,147.35 and -31.65, for the NASDAQ?
-
My immediate approach would be to:
(a) Scan for and replace all commas with nothing, i.e. "," --> ""
(b) Scan for the text ixic">
(c) If the string is not found, exit the loop and jump to (g)
(d) Advance the returned pointer by the length of the search string (6 bytes)
(e) Do an sscanf, asking for a float
(f) Return to (b)
(g) ...
Perhaps a little something like this?
#include <cstdio>
#include <cstring>
#include <string>
using namespace std;

// Replace every occurrence of 'search' in 'subject' with 'replace'.
string& str_replace(const string &search, const string &replace, string &subject)
{
    string buffer;
    int sealeng = search.length();
    int strleng = subject.length();
    if (sealeng == 0)
        return subject; // no change
    for (int i = 0, j = 0; i < strleng; j = 0)
    {
        while (i+j < strleng && j < sealeng && subject[i+j] == search[j])
            j++;
        if (j == sealeng) // found 'search'
        {
            buffer.append(replace);
            i += sealeng;
        }
        else
        {
            buffer.append(&subject[i++], 1);
        }
    }
    subject = buffer;
    return subject;
}

int main()
{
    FILE *fp;
    char *htmlStr;
    const char *filename = "infile.html";
    char *pos1;
    float retrievedNum;
    long fileSize;
    string findMe = "ixic\">";

    fp = fopen(filename, "rb");
    if (fp == NULL)
    {
        printf("Could not open %s\n", filename);
        return 1;
    }

    // Read the whole file into a NUL-terminated buffer
    fseek(fp, 0, SEEK_END);
    fileSize = ftell(fp);
    fseek(fp, 0, SEEK_SET);
    htmlStr = new char[fileSize+1];
    size_t bytesRead = fread(htmlStr, sizeof(char), fileSize, fp);
    htmlStr[bytesRead] = 0;
    fclose(fp);

    // (a) strip the thousands separators
    string tmpS = htmlStr;
    string find = ",";
    string replace = "";
    tmpS = str_replace(find, replace, tmpS);
    printf("%s\n\n", tmpS.c_str());

    // Safe: the stripped text is never longer than the original
    strcpy(htmlStr, tmpS.c_str());

    // (b)-(f) find each marker and read the float that follows it
    pos1 = htmlStr;
    while ((pos1 = strstr(pos1, findMe.c_str())) != NULL)
    {
        pos1 += findMe.length();
        if (sscanf(pos1, "%f", &retrievedNum) == 1)
            printf("Retrieved: %f\n", retrievedNum);
    }

    delete[] htmlStr;
    return 0;
}
Yeah, the code's not winning any beauty pageants. :-O
-
benjamin yap wrote:
how do i extract the value example 2,147.35 and -31.65 of NASDAQ
Have a look at regular expressions (RE). There are various libraries for C++; see Boost or the CodeProject articles about them. You could scan through your HTML input line by line and use a regular expression to test for and extract the wanted information. Hope this helps, M