Saving Web Site from automatic download

The Lounge
HinJinShah (#1) wrote:
Hi there! I was just wondering if there is some way of blocking offline browsing software from downloading my entire web site - programs like Web Site Ripper and similar tools, which can download a complete web site. Is there any HTML code for that, or VB/Java scripting, or something else? Thanks, regards.

led mike (#2) wrote:
Yeah, I think there is some stuff for that. I'll get back to you if I remember what it is.

Pete OHanlon (#3) wrote:
Hmmm. This is a tricky one. I'm going to give you the benefit of the doubt and assume that you aren't asking a programming question in the Lounge - that it's more about the general philosophy. I suspect that one way to do this would be to detect the user agent of the incoming request and then allow or disallow access as appropriate. If you were doing this in ASP.NET, you could do it in an HTTP handler.
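To make the idea concrete, here is a minimal sketch (not from the thread; the ripper signatures below are only illustrative) of an ASP.NET IHttpModule that inspects the User-Agent header and refuses requests from known offline-browser strings:

```csharp
// Minimal sketch: reject requests whose User-Agent matches a known site ripper.
// The signature list is illustrative, not exhaustive.
using System;
using System.Web;

public class RipperBlockModule : IHttpModule
{
    private static readonly string[] BlockedAgents =
        { "HTTrack", "WebZIP", "Teleport", "Offline Explorer", "WebCopier" };

    public void Init(HttpApplication application)
    {
        application.BeginRequest += OnBeginRequest;
    }

    private static void OnBeginRequest(object sender, EventArgs e)
    {
        var app = (HttpApplication)sender;
        string userAgent = app.Context.Request.UserAgent ?? string.Empty;

        foreach (string agent in BlockedAgents)
        {
            if (userAgent.IndexOf(agent, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                app.Context.Response.StatusCode = 403; // Forbidden
                app.CompleteRequest();                 // short-circuit the rest of the pipeline
                return;
            }
        }
    }

    public void Dispose() { }
}
```

The module would still need to be registered in web.config, and, as later replies point out, a ripper can simply report a normal browser's user agent, so this only stops the polite ones.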

        Deja View - the feeling that you've seen this post before.


Rocky Moore (#4) wrote:

Not really. Depending on the software used, it might identify itself in the user agent of the request, and you could block it that way. I think the most reliable method is to block requests that arrive from the same IP at too fast a rate. If requests are received one after the other from the same IP with no pause, it is either a search bot or download software. Such tools should be required to follow the same robots.txt file that search engines do, but I doubt they do.
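A minimal sketch of that rate-limiting idea (not from the thread; the 30-requests-per-10-seconds threshold and the in-memory counters are assumptions, and Mike Dimmick's NAT caveat further down applies):

```csharp
// Minimal sketch: refuse clients that fire requests from the same IP with no pause.
using System;
using System.Collections.Concurrent;
using System.Web;

public class RateLimitModule : IHttpModule
{
    private const int MaxRequests = 30;                                  // assumed limit
    private static readonly TimeSpan Window = TimeSpan.FromSeconds(10);  // assumed window

    // ip -> (window start, request count); in-memory only, resets when the app recycles
    private static readonly ConcurrentDictionary<string, Tuple<DateTime, int>> Counters =
        new ConcurrentDictionary<string, Tuple<DateTime, int>>();

    public void Init(HttpApplication application)
    {
        application.BeginRequest += OnBeginRequest;
    }

    private static void OnBeginRequest(object sender, EventArgs e)
    {
        var app = (HttpApplication)sender;
        string ip = app.Context.Request.UserHostAddress ?? "unknown";
        DateTime now = DateTime.UtcNow;

        var counter = Counters.AddOrUpdate(
            ip,
            _ => Tuple.Create(now, 1),
            (_, old) => now - old.Item1 > Window
                ? Tuple.Create(now, 1)                     // window expired: start over
                : Tuple.Create(old.Item1, old.Item2 + 1)); // same window: count up

        if (counter.Item2 > MaxRequests)
        {
            app.Context.Response.StatusCode = 429; // Too Many Requests
            app.CompleteRequest();
        }
    }

    public void Dispose() { }
}
```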

          Rocky <>< Blog Post: Silverlight goes Beta 2.0 Tech Blog Post: Cheap Biofuels and Synthetics coming soon?

code frog 0 (#5) wrote:
Once it's in my browser on my PC, there's not much you can do to stop me. I'll get it if I really want it, and if I really want it there are, oh... a hundred ways to get it. I won't need to make a single request from your site; I'll just tip-toe through my cache and pick up all the parts and pieces I want to make a nice little clone. I won't get your server-side code that way, but there are many ways around that as well. What is it you are afraid of? If it's that important... do not put it on the internet, that's all I can tell you. Look at what the music industry is going through because they were stupid enough not to heed that rule. There's no way to stop a determined internet, and the internet is very determined... Good luck though... :-D

Rama Krishna Vavilala (#6) wrote:
              Rocky Moore wrote:

I think the most reliable method is to block requests that arrive from the same IP at too fast a rate.

Yes, something similar to what Google does. It would make a great article too ;) But it is still not a perfect solution, because of IP spoofing.

realJSOP (#7) wrote:
What you need is a way to encrypt all of the files on the site and then decrypt them as they're served. The following cost money, but less than $100:
http://www.encrypt-html.com/
http://www.protware.com/default.htm
http://www.tagslock.com/
Google is your friend.

                "Why don't you tie a kerosene-soaked rag around your ankles so the ants won't climb up and eat your candy ass..." - Dale Earnhardt, 1997
                -----
                "...the staggering layers of obscenity in your statement make it a work of art on so many levels." - Jason Jystad, 10/26/2001

daniilzol (#8) wrote:
I believe the OP was asking about preventing a complete copy of the website, not just the single page you have in your cache. There are plenty of reasons to do that; one is to prevent traffic congestion. Website bandwidth can be expensive, and serving the entire site, which can be hundreds of megabytes if you include all the art and downloads (if any), to everybody who wants a copy will be expensive as well.

code frog 0 (#9) wrote:
Yeah, you can do it a hundred ways. I'm not sure you can stop someone who really wants it, though.

Mike Dimmick (#10) wrote:
                      Rocky Moore wrote:

to block requests that arrive from the same IP at too fast a rate

Your Chinese visitors won't like you very much. Outside the USA, Network Address Translation is alive and well, because the IPv4 numbering plan gives large US corporations and US and UK government departments 16 million public IP addresses each, while the whole of some Asian countries gets a few thousand between them. That said, IANA are holding on to some pretty damn big blocks. Regardless, a unique public IP address is not a good way to distinguish 'unique' users.

                      DoEvents: Generating unexpected recursion since 1991

Flynn Arrowstarr (#11) wrote:
                        Rocky Moore wrote:

Such tools should be required to follow the same robots.txt file that search engines do, but I doubt they do.

I use WinHTTrack when I want to nab an offline copy of a game guide. In its settings, it defaults to following robots.txt rules, but you can turn that off. Generally, I'll leave it on, as most game walkthroughs I download don't use robots.txt files anyway. If it's just a single page, I'll use IE's "Save As" to make an .mht file. Flynn
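For anyone who has not seen one, a robots.txt along the lines below (the user-agent names are only examples) is the file that well-behaved crawlers and rippers such as WinHTTrack consult when that option is left on:

```
# Illustrative robots.txt - honoured only by well-behaved clients.
User-agent: HTTrack
Disallow: /

User-agent: *
Disallow: /private/
```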


If we can't corrupt the youth of today, the adults of tomorrow will be no fun...
