Access web page text from C#
-
20 Feb 2007
I want to gather data from this page: http://moneycentral.msn.com/investor/StockRating/srstopstocksresults.aspx?Score=10 and process it in C#.
The problem is that the page content is loaded via scripts (I think), so the page's .html source does not contain the data. My initial idea of fetching and processing the .html source therefore won't work. Manually pressing Control-A, Control-C and pasting into a text document would work, but I'd prefer an automated solution.
A. Is there an easy way to actually run the page's scripts from within C# (or .NET in general)? OR
B. I've never controlled another program through C#, although I understand that .NET allows this. Can my browser be controlled to send me the text or save it to a file? OR
C. Perhaps a Firefox plug-in should be written? (I've never written a plug-in before, either.)
Suggestions would be appreciated.
Thanks, Mark
-
Try System.Net.WebRequest to create a request for the page. If you're not sure what the response looks like, call GetResponse() and read it through WebResponse.GetResponseStream() so you can inspect it. Patt
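For anyone following along, the suggestion above might look roughly like the sketch below. It is only a minimal illustration: the class and method names (FetchPage, DownloadHtml) are made up for the example, and it returns the raw markup the server sends, before any script has run.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper class; the names here are illustrative only.
static class FetchPage
{
    // Downloads the raw response body for a URL as a string.
    // No scripts are executed, so script-generated content will be absent.
    static string DownloadHtml(string url)
    {
        WebRequest request = WebRequest.Create(url);
        using (WebResponse response = request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        string url = "http://moneycentral.msn.com/investor/StockRating/srstopstocksresults.aspx?Score=10";
        string html = DownloadHtml(url);
        Console.WriteLine(html);
    }
}
```

Because the request never runs the page's scripts, any table that the scripts fill in afterwards would not appear in the returned string.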
Thanks for the try, but this just retrieves the general page setup and the scripts themselves... none of the data that is displayed on the page is in what is obtained. (I do really appreciate the pointer to these functions, however. I need them for another project I have in mind!) Still looking for a solution. Mark
-
Maybe you can be a little more specific about the "data" you are looking for. I'm assuming you need the items displayed in the six-column result table on the page? Patt
-
The information that I want is the large table of stock-related information. I tried following the links in the .html source, and the one that I suspect provides the content returned an "invalid" reply from a database. From perusing the scripts (which I do not really know how to read), it appears that the script and the database may exchange some password-like information before the query is allowed. I am still looking for a way to capture the text that the Web browser has already obtained, without doing a manual copy and paste to a text file. Mark
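One possible direction for option B, offered only as an untested sketch: the Windows Forms WebBrowser control hosts Internet Explorer's engine inside a .NET program, so the page's scripts run and the rendered text can be read from the document. The class name PageTextCapture is hypothetical. The sketch assumes a reference to System.Windows.Forms and assumes the table is already filled in when DocumentCompleted fires for the top-level page; if the scripts load the data later, a delay or a retry would be needed.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical example class; not taken from the thread.
static class PageTextCapture
{
    [STAThread] // the WebBrowser control requires a single-threaded apartment
    static void Main()
    {
        string url = "http://moneycentral.msn.com/investor/StockRating/srstopstocksresults.aspx?Score=10";

        using (WebBrowser browser = new WebBrowser())
        {
            browser.ScriptErrorsSuppressed = true; // avoid script error dialogs

            browser.DocumentCompleted += delegate(object sender, WebBrowserDocumentCompletedEventArgs e)
            {
                // The event fires once per frame; wait for the top-level document.
                if (e.Url != browser.Url)
                    return;

                if (browser.Document != null && browser.Document.Body != null)
                {
                    // InnerText is the rendered text, similar to Control-A, Control-C.
                    Console.WriteLine(browser.Document.Body.InnerText);
                }

                Application.ExitThread(); // stop the message loop started below
            };

            browser.Navigate(url);
            Application.Run(); // pump messages so the control can load the page
        }
    }
}
```

Instead of writing to the console, the text could be saved with System.IO.File.WriteAllText and then parsed into rows in a separate step.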