Need To Create a crawler/spider in vc++

19 Posts 7 Posters 0 Views 1 Watching
    Ash_VCPP
    wrote on last edited by
    #1

    Hi all, I have an urgent requirement to create a crawler that can fetch data from a URL. The IDE should be VC++.

    Thanks A Ton Ash_VCPP

      Chandrasekharan P
      wrote on last edited by
      #2

      Very good... now what is the problem?

        CPallini
        wrote on last edited by
        #3

        Wow, starting the working day with a smile is very good, my five. :)

        If the Lord God Almighty had consulted me before embarking upon the Creation, I would have recommended something simpler. -- Alfonso the Wise, 13th Century King of Castile.
        This is going on my arrogant assumptions. You may have a superb reason why I'm completely wrong. -- Iain Clarke
        [My articles]

        What have you got on your head, Signor di Ceprano?

          _AnsHUMAN_
          wrote on last edited by
          #4

          By definition, a crawler is a person who tries to please someone in order to gain a personal advantage. Do you need it to please someone for some personal advantage? Did you try to meet the requirements? Go get the IDE...

          You need to google first, if you have "It's urgent please" mentioned in your question. ;-) _AnShUmAn_

            Ash_VCPP
            wrote on last edited by
            #5

            I need to decide many things before starting the project, because I am the only one responsible for it. So please give me some initial guidelines to start with: what should I use (Win32 exe, Win32 DLL, COM, etc.), and which inter-process communication mechanism should I use?

            Thanks A Ton Ash_VCPP

              Ash_VCPP
              wrote on last edited by
              #6

              Do you have any idea about crawlers? If yes, then please show me how to get started; it's urgent... :-O

              Thanks A Ton Ash_VCPP

                CPallini
                wrote on last edited by
                #7

                Ash_VCPP wrote:

                Do you have any idea about crawler

                Yes.

                Ash_VCPP wrote:

                then please provide me the way to start working its urgent......

                Sorry, *urgent* questions automatically fall to the bottom of the stack (just a bit above *very urgent* questions). :)

                  Iain Clarke Warrior Programmer
                  wrote on last edited by
                  #8

                  As you may have seen from the responses, it's not a very good question.
                  1/ You haven't actually asked a question - you've just told us you have work to do. While we are, of course, very happy for you, there's not much to answer.
                  2/ You've got quite a big challenge, especially if you're starting from scratch.
                  3/ You can break it down into several challenges... Handling delays, timeouts, getting HTTP pages, parsing them into links, etc.
                  I've attached below some code I wrote years ago, grabbing a certain page from a specific URL every hour or so - an early RSS reader, essentially. It may help you with your search terms. There are other articles on CodeProject about grabbing information from web pages. John Simmons wrote one recently, scraping information from a CodeProject page. Good luck with your task! Iain.

                  DWORD WINAPI UpdatePageThread (LPVOID lpParameter)
                  {
                      HWND hWnd = (HWND)lpParameter;

                      DWORD dw, dwDelay = 100;
                      HINTERNET hInternet, hIConnect, hIRequest;
                      BOOL bSuccess;
                      DWORD dwStatus, dwSize, dwIndex;

                      PCHAR AcceptTypes [] = { "text/*", NULL };

                      // Set up the query.
                      hInternet = NULL;
                      hIConnect = NULL;
                      hIRequest = NULL;
                      hInternet = ::InternetOpen ("OC UK Notify", INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0);

                      if (hInternet)
                          hIConnect = ::InternetConnect (hInternet, "www.overclock-uk.net", INTERNET_DEFAULT_HTTP_PORT, "user", "pass", INTERNET_SERVICE_HTTP, 0, 1);
                      if (hIConnect)
                      {
                          hIRequest = ::HttpOpenRequest (hIConnect, NULL, "update.ocuk", NULL, NULL, (const char **)AcceptTypes,
                              INTERNET_FLAG_NO_CACHE_WRITE | INTERNET_FLAG_NO_COOKIES | INTERNET_FLAG_NO_UI | INTERNET_FLAG_RELOAD | INTERNET_FLAG_NO_AUTH,
                              1);
                      }

                      if (!hIRequest) // Raise an error?
                          return 1;

                      char buf [4096];
                      std::string Page;

                      while (1)
                      {
                          // Sleep for dwDelay ms; any result other than a timeout (e.g. the stop event being signalled) ends the loop.
                          dw = WaitForSingleObject (g_hEventStop, dwDelay);
                          if (dw != WAIT_TIMEOUT)
                              break;

                          // dwDelay = 30000; // Wait 30 seconds before we try again.
                          dwDelay = 90 * 60000; // 3/2 hours.

                          bSuccess = ::HttpSendRequest (hIRequest, NULL, 0, NULL, 0);
                          if (!bSuccess)
                              continue; // Try again in a while.

                          // Read the numeric HTTP status code of the response.
                          dwSize = sizeof (DWORD);
                          dwIndex = 0;
                          bSuccess = ::HttpQueryInfo (hIRequest, HTTP_QUERY_STATUS_CODE | HTTP_QUERY_FLAG_NUMBER, &dwStatus, &dwSize, &dwIndex);
                          if (!bSuccess)
                              continue;
                          dwStatus /= 100; // Just get the 2XX part.
                          if (dwStatus != 2)
                              continue;

                          Page.erase ();

                          while (1)
                          {
                              memset (buf, 0, sizeof (buf));
                              bSuccess = ::InternetReadFile (hIRequest, buf, sizeof (buf), &dw
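
                  The listing above is cut off in the thread. Separately from it, the "parsing them into links" challenge mentioned in point 3/ could be approached along these lines. This is only a minimal sketch, assuming a naive string scan for quoted href attributes rather than a real HTML parser; the name ExtractLinks is hypothetical and is not part of Iain's code.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): collects the values of
// quoted href attributes, e.g. href="..." or HREF = '...'.
// This is a naive scan, not an HTML parser: it does not skip comments or
// scripts, and it ignores unquoted attribute values and HTML entities.
std::vector<std::string> ExtractLinks(const std::string& html)
{
    std::vector<std::string> links;

    // Lower-cased copy so the attribute name matches case-insensitively.
    std::string lower(html);
    for (char& c : lower)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));

    std::string::size_type pos = 0;
    while ((pos = lower.find("href", pos)) != std::string::npos)
    {
        pos += 4; // step past "href"

        // Allow optional whitespace around the '=' sign.
        while (pos < lower.size() && std::isspace(static_cast<unsigned char>(lower[pos])))
            ++pos;
        if (pos >= lower.size() || lower[pos] != '=')
            continue; // the word "href" appeared, but not as an attribute
        ++pos;
        while (pos < lower.size() && std::isspace(static_cast<unsigned char>(lower[pos])))
            ++pos;
        if (pos >= lower.size())
            break;

        // Only quoted values are handled in this sketch.
        const char quote = html[pos];
        if (quote != '"' && quote != '\'')
            continue;

        const std::string::size_type end = html.find(quote, pos + 1);
        if (end == std::string::npos)
            break; // unterminated attribute value

        links.push_back(html.substr(pos + 1, end - pos - 1));
        pos = end + 1;
    }
    return links;
}
```

                  The returned strings are exactly what the page contains, so relative links such as b.html would still need to be resolved against the page's own URL before being fetched.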
                  
                    Ash_VCPP
                    wrote on last edited by
                    #9

                    Hi Iain, thanks for providing this important information and code. I will try it this way, and if I run into any difficulties I will let you know. Once again, thanks for the reply.

                    Thanks A Ton Ash_VCPP

                      Ash_VCPP
                      wrote on last edited by
                      #10

                      Then can you please provide me with any code, guidelines, or a URL where I can find something useful?

                      Thanks A Ton Ash_VCPP

                        Iain Clarke Warrior Programmer
                        wrote on last edited by
                        #11

                        The website / page this code pointed to has long since gone, by the way! And take the error checking with heavy skepticism... Iain.

                        Codeproject MVP for C++, I can't believe it's for my lounge posts...

                          Sandeep Saini SRE
                          wrote on last edited by
                          #12

                          Hi Ash, do you still need the code? If so, please let me know.

                            Ash_VCPP
                            wrote on last edited by
                            #13

                            Hi Sandeep, actually, along with the code I also need to do some planning, as I have to start the project from scratch. So please give me the idea as well as the code, and tell me which approach would be better.

                            Thanks A Ton Ash_VCPP

                              David Crow
                              wrote on last edited by
                              #14

                              Ash_VCPP wrote:

                              I have an urgent requirement to create a crawler...

                              Care to define this?

                              "Old age is like a bank account. You withdraw later in life what you have deposited along the way." - Unknown

                              "Fireproof doesn't mean the fire will never come. It means when the fire comes that you will be able to withstand it." - Michael Simmons

                                Ash_VCPP
                                wrote on last edited by
                                #15

                                I got your point to some extent, but I would be pleased if you could explain it more...

                                Thanks A Ton Ash_VCPP

                                  David Crow
                                  wrote on last edited by
                                  #16

                                  Ash_VCPP wrote:

                                  ...i would be pleased if you can explain it more...

                                  I believe that was the question I posed to you. The term "crawler" can take on several different meanings. What is yours?

                                    Ash_VCPP
                                    wrote on last edited by
                                    #17

                                    Basically, I need an exe which can fetch data from any URL and dump it into a database...

                                    Thanks A Ton Ash_VCPP
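
                                    The requirement described in this post (fetch pages, then store them) could be structured as a small breadth-first loop with pluggable steps. What follows is a minimal sketch under stated assumptions, not a definitive design: CrawlHooks, Crawl and the three callbacks are hypothetical names, and the actual download, link extraction and database code are deliberately left to whatever the project ends up using.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <set>
#include <string>
#include <vector>

// All names below are hypothetical, for illustration only.
struct CrawlHooks
{
    // Downloads 'url' into 'body'; returns false on failure.
    // In a real program this might wrap WinINet or URLDownloadToFile.
    std::function<bool(const std::string& url, std::string& body)> fetch;

    // Persists one downloaded page, e.g. by inserting it into a database.
    std::function<void(const std::string& url, const std::string& body)> store;

    // Returns the absolute URLs found in a downloaded page.
    std::function<std::vector<std::string>(const std::string& body)> links;
};

// Breadth-first crawl starting at 'seed'. At most 'maxPages' URLs are
// attempted, and each URL is queued only once. Returns the number of pages
// that were fetched successfully and passed to hooks.store.
std::size_t Crawl(const std::string& seed, std::size_t maxPages, const CrawlHooks& hooks)
{
    std::queue<std::string> frontier; // URLs waiting to be fetched
    std::set<std::string> seen;       // URLs already queued at some point

    frontier.push(seed);
    seen.insert(seed);

    std::size_t attempted = 0;
    std::size_t stored = 0;

    while (!frontier.empty() && attempted < maxPages)
    {
        const std::string url = frontier.front();
        frontier.pop();
        ++attempted;

        std::string body;
        if (!hooks.fetch(url, body))
            continue; // skip pages that fail to download

        hooks.store(url, body);
        ++stored;

        // Queue every link that has not been seen before.
        for (const std::string& next : hooks.links(body))
        {
            if (seen.insert(next).second)
                frontier.push(next);
        }
    }
    return stored;
}
```

                                    Keeping the fetch, store and link-extraction steps behind callbacks means the loop can be exercised with fake in-memory pages before any network or database code exists. The maxPages limit is there because a crawl of "any URL" otherwise has no natural stopping point.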

                                      David Crow
                                      wrote on last edited by
                                      #18

                                      Ash_VCPP wrote:

                                      ...fetch data from any url...

                                      Such as URLDownloadToFile()?

                                        Ash_VCPP
                                        wrote on last edited by
                                        #19

                                        I am not sure that it will work... because I remember that a few months ago I used it to download an XML file and icons from a server...

                                        Thanks A Ton Ash_VCPP
