How to grab a web page using the .NET 2.0 WebBrowser control
-
Hi all,

I want to grab a page using the WebBrowser control (see the code below). I try to read the page content from the DocumentText property in the DocumentCompleted event handler, but the event never seems to fire. The Navigated event doesn't fire either; only the Navigating event does.

I used to use HttpWebRequest/HttpWebResponse to grab page content, but the drawback is that I can't get the fully loaded page that way. By "fully loaded" I mean the page as it looks after any JavaScript has run and modified its elements. I think the WebBrowser control can overcome this drawback: it behaves just like a browser, so it will execute that JavaScript after loading the page. But I can't get the control to work correctly.

Could anyone help me? I'd really appreciate it.

Thanks,
Jie

using System;
using System.Collections.Generic;
using System.Text;
using System.Windows.Forms;

namespace ConsoleApplication1
{
    class Program
    {
        static WebBrowser wb;

        [STAThread]
        static void Main(string[] args)
        {
            Go();
        }

        private static void Go()
        {
            wb = new WebBrowser();
            wb.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(wb_DocumentCompleted);
            wb.Navigating += new WebBrowserNavigatingEventHandler(wb_Navigating);
            wb.Navigated += new WebBrowserNavigatedEventHandler(wb_Navigated);
            wb.Navigate("http://www.google.com");
            Console.ReadLine();
        }

        static void wb_Navigating(object sender, WebBrowserNavigatingEventArgs e)
        {
            Console.WriteLine("Navigating");
        }

        static void wb_Navigated(object sender, WebBrowserNavigatedEventArgs e)
        {
            Console.WriteLine("Navigated");
        }

        static void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            Console.WriteLine("DocumentCompleted");
            Console.WriteLine(wb.DocumentText);
        }
    }
}
-
Hello,

Please don't take offence when I say that's a very bad piece of code you've got there. Really! You have to declare your WebBrowser as a member of your form's class. And why use the console if you've got a browser on a form? Not to mention the Main method! If you don't want to display it, set
Visible = false;
but I don't think it will ever load anything this way. In a console program nothing hosts the WebBrowser and no message loop is running (Go just blocks on Console.ReadLine), so the control never gets the chance to finish loading; that's why you don't see the events.

To make the WebBrowser really work:
1- Make a form and put the control on it (even if invisible) using the toolbox.
2- Handle the DocumentCompleted event; it does fire.
3- If you don't want to handle it, you can handle the ProgressChanged event instead.

There is a workaround to make the WebBrowser work in a console application, but I don't recommend it!

Regards
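For reference, the console workaround mentioned above usually comes down to running a message loop on the STA thread yourself. The following is only a minimal sketch under that assumption, not necessarily the workaround the poster had in mind: it reuses the URL and handler name from the question, waits for the ReadyState to report Complete, and has no timeout, so it will hang if the page never finishes loading. Script that keeps changing the page after the load completes (timers, AJAX) may still not be reflected in what is printed.

```csharp
using System;
using System.Windows.Forms;

namespace ConsoleApplication1
{
    class Program
    {
        [STAThread] // the WebBrowser control needs a single-threaded apartment
        static void Main(string[] args)
        {
            using (WebBrowser wb = new WebBrowser())
            {
                // Keep script-error dialogs from blocking an unattended run.
                wb.ScriptErrorsSuppressed = true;
                wb.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(wb_DocumentCompleted);
                wb.Navigate("http://www.google.com");

                // Pump Windows messages on this thread until ExitThread is called.
                // Without this loop, Navigated and DocumentCompleted are never raised.
                Application.Run();
            }
        }

        static void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            WebBrowser wb = (WebBrowser)sender;

            // The event can fire once per frame; wait until the whole document is complete.
            if (wb.ReadyState != WebBrowserReadyState.Complete)
                return;

            Console.WriteLine(wb.DocumentText);

            // Stop the message loop so that Application.Run returns and Main can exit.
            Application.ExitThread();
        }
    }
}
```

The project needs a reference to System.Windows.Forms. If DocumentText turns out not to include the script-generated markup you are after, reading wb.Document.Body.OuterHtml in the handler is worth trying instead.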
-
You don't need the web browser to do that. Just do the following:
// Requires: using System; using System.IO; using System.Net; using System.Text; using System.Windows.Forms;
public string GetWebPage(string url)
{
    HttpWebRequest webRequest = null;
    HttpWebResponse webResponse = null;
    Stream responseStream = null;
    Encoding encode = null;
    StreamReader webPageStream = null;
    string webPageText = "";

    try
    {
        // Create the web request
        webRequest = (HttpWebRequest)HttpWebRequest.Create(url);

        // Get the response
        webResponse = (HttpWebResponse)webRequest.GetResponse();

        // Get the response stream
        responseStream = webResponse.GetResponseStream();

        // Get the encoding (assumes the page is UTF-8)
        encode = Encoding.GetEncoding("utf-8");

        // Read the response using a StreamReader
        webPageStream = new StreamReader(responseStream, encode);
        webPageText = webPageStream.ReadToEnd();
    }
    catch (Exception e)
    {
        MessageBox.Show("Error : " + e.Message);
    }
    finally
    {
        // Free any resources
        if (webPageStream != null)
        {
            webPageStream.Close();
            webPageStream.Dispose();
        }
        if (responseStream != null)
        {
            responseStream.Close();
            responseStream.Dispose();
        }
        if (webResponse != null)
        {
            webResponse.Close();
        }
    }

    // Return the page contents
    return webPageText;
}
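A more compact variant of the same idea, offered only as a sketch: it uses using blocks for cleanup and takes the encoding from the response's CharacterSet instead of hard-coding UTF-8. The PageFetcher class and the ResolveEncoding helper are names made up for this example, and exceptions are left to the caller rather than shown in a MessageBox. As the original question points out, this approach returns the HTML exactly as the server sent it; it does not run any JavaScript on the page.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class PageFetcher
{
    // Hypothetical helper: map a charset name to an Encoding,
    // falling back to UTF-8 when the name is missing or not recognised.
    public static Encoding ResolveEncoding(string charset)
    {
        if (string.IsNullOrEmpty(charset))
            return Encoding.UTF8;

        try
        {
            return Encoding.GetEncoding(charset);
        }
        catch (ArgumentException)
        {
            return Encoding.UTF8;
        }
    }

    public static string GetWebPage(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

        // The using blocks dispose the response, stream and reader
        // even if reading throws.
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream, ResolveEncoding(response.CharacterSet)))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Usage would be along the lines of string html = PageFetcher.GetWebPage("http://www.google.com"); wrapped in a try/catch for WebException at the call site.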