How to grab a web page using .Net 2.0 WebBrowser Control

    rryyjw
    wrote on last edited by
    #1

    Hi all,

    I want to grab a page using the WebBrowser control (see the code below). I try to read the page content from the DocumentText property in the DocumentCompleted event handler, but the event never seems to fire. The Navigated event doesn't fire either, although the Navigating event does.

    I used to use HttpWebRequest/HttpWebResponse to grab web page content, but the drawback is that I can't get the fully loaded page that way. By "fully loaded page" I mean the page after any JavaScript has run and modified its elements. I think the WebBrowser control can overcome this drawback: it behaves just like a browser, so it will execute that JavaScript after loading the page. But I can't make the control work correctly. Could anyone help me? I'd really appreciate it.

    Thanks,
    Jie

    using System;
    using System.Collections.Generic;
    using System.Text;
    using System.Windows.Forms;

    namespace ConsoleApplication1
    {
        class Program
        {
            static WebBrowser wb;

            [STAThread]
            static void Main(string[] args)
            {
                Go();
            }

            private static void Go()
            {
                wb = new WebBrowser();
                wb.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(wb_DocumentCompleted);
                wb.Navigating += new WebBrowserNavigatingEventHandler(wb_Navigating);
                wb.Navigated += new WebBrowserNavigatedEventHandler(wb_Navigated);
                wb.Navigate("http://www.google.com");
                Console.ReadLine();
            }

            static void wb_Navigating(object sender, WebBrowserNavigatingEventArgs e)
            {
                Console.WriteLine("Navigating");
            }

            static void wb_Navigated(object sender, WebBrowserNavigatedEventArgs e)
            {
                Console.WriteLine("Navigated");
            }

            static void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
            {
                Console.WriteLine("DocumentCompleted");
                Console.WriteLine(wb.DocumentText);
            }
        }
    }

      Nader Elshehabi
      wrote on last edited by
      #2

      Hello,

      Please don't take offence when I say that's a very bad piece of code you've got there. Really! You need to declare your WebBrowser as a member of your form class. And why the console if you've got a browser on a form? Not to mention the Main method! If you don't want to display the control, set Visible = false, though I don't think it will ever load anything this way. The WebBrowser object dies the moment the Go method finishes; that's why it doesn't load anything.

      To make the WebBrowser really work:

      1. Make a form and put the control on it (even if invisible) using the toolbox.
      2. Handle the DocumentCompleted event; it does fire.
      3. If you don't want to handle it, you can handle the ProgressChanged event instead.

      There is a workaround to make the WebBrowser work in a console application, but I don't recommend it.

      Regards
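
The console workaround is not spelled out in this thread. One common approach, sketched here as an assumption rather than as what the poster had in mind, is to give the control its own STA thread with a message loop (Application.Run), since the WebBrowser control relies on Windows messages being pumped and Console.ReadLine does not pump them. The names ConsoleBrowserSketch and LoadPage are made up for illustration.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

// Hypothetical helper, not part of the original posts.
static class ConsoleBrowserSketch
{
    // Loads a page in an off-screen WebBrowser and returns its DocumentText
    // once DocumentCompleted reports a completed document.
    public static string LoadPage(string url)
    {
        string html = null;

        Thread worker = new Thread(new ThreadStart(delegate()
        {
            using (WebBrowser wb = new WebBrowser())
            {
                wb.ScriptErrorsSuppressed = true; // avoid script error dialogs

                wb.DocumentCompleted += delegate(object sender, WebBrowserDocumentCompletedEventArgs e)
                {
                    // The event can fire more than once (for example per frame),
                    // so wait until the control reports it is completely loaded.
                    if (wb.ReadyState != WebBrowserReadyState.Complete)
                    {
                        return;
                    }
                    html = wb.DocumentText;
                    Application.ExitThread(); // ends the Application.Run() below
                };

                wb.Navigate(url);

                // Message loop for this thread; without it the events never arrive.
                Application.Run();
            }
        }));

        // The WebBrowser control requires a single-threaded apartment.
        worker.SetApartmentState(ApartmentState.STA);
        worker.Start();
        worker.Join();

        return html;
    }
}
```

From a console Main this could be called as Console.WriteLine(ConsoleBrowserSketch.LoadPage("http://www.google.com")). The sketch has no timeout, so a page that never finishes loading would block the caller indefinitely.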

        scott_hackett
        wrote on last edited by
        #3

        You don't need the web browser to do that. Just do the following:

        using System;
        using System.IO;
        using System.Net;
        using System.Text;
        using System.Windows.Forms; // only needed for MessageBox

        // Place this method inside a class.
        public string GetWebPage(string url)
        {
            string webPageText = "";

            try
            {
                // Create the web request
                HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(url);

                // Get the response, its stream, and a reader; the using blocks
                // close and dispose each one even if an exception is thrown
                using (HttpWebResponse webResponse = (HttpWebResponse)webRequest.GetResponse())
                using (Stream responseStream = webResponse.GetResponseStream())
                using (StreamReader webPageStream = new StreamReader(responseStream, Encoding.UTF8))
                {
                    // Read the whole response (assumes the page is UTF-8 encoded)
                    webPageText = webPageStream.ReadToEnd();
                }
            }
            catch (Exception e)
            {
                MessageBox.Show("Error : " + e.Message);
            }

            // Return the page contents (empty string if the request failed)
            return webPageText;
        }
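
The method above always decodes the response as UTF-8, which can garble pages served in another charset. A minimal sketch of one way around that, assuming the charset name comes from HttpWebResponse.CharacterSet; the names CharsetSketch and ResolveEncoding are hypothetical and not part of the original post.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original post.
static class CharsetSketch
{
    // Maps a charset name (for example the value of HttpWebResponse.CharacterSet)
    // to an Encoding, falling back to UTF-8 when the name is missing or unknown.
    public static Encoding ResolveEncoding(string charset)
    {
        if (charset == null)
        {
            return Encoding.UTF8;
        }

        // Some servers send the charset wrapped in quotes.
        string name = charset.Trim().Trim('"', '\'');
        if (name.Length == 0)
        {
            return Encoding.UTF8;
        }

        try
        {
            return Encoding.GetEncoding(name);
        }
        catch (ArgumentException)
        {
            // Unknown or unsupported charset name
            return Encoding.UTF8;
        }
    }
}
```

In GetWebPage this would replace the fixed encoding, for example new StreamReader(responseStream, CharsetSketch.ResolveEncoding(webResponse.CharacterSet)). It does not look at a charset declared inside the HTML itself.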
