Knowing the contents of a file..

    ptr2void wrote (#1):

    Hi, I am new to C#/.NET. I am building a C# application that searches for files on the hard disk. However, there may be many duplicate files with the same name and the same contents, and my objective is to delete the files that have the same contents.

    For example, if there are two files with the same name (mscorlib.dll for Framework 1.1 and another mscorlib.dll for Framework 2.0), the application should not delete them, since they are two different assemblies for different framework versions and should have different contents. Whereas if two files (say abc.exe and pqr.exe) have the same contents (say I copied one and renamed the copy), they should be deleted, since they have the same contents and differ in name only.

    How do I do this? How can I know the contents of a file? Do I have to apply some checksum algorithm for that? If so, how? If not, how do I find out whether the contents of two files are the same? Please keep in mind that I am doing this only for executables (.exe, .dll, .bat, .com, etc.).

    Please guide me. Any help would be greatly appreciated. Thanks.

      Christian Graus wrote (#2):

      This sounds like a nightmare. If you do it across your HDD, I will bet money you break something. File.ReadAllBytes will load the files; then you can compare them.

      Christian Graus Please read this if you don't understand the answer I've given you "also I don't think "TranslateOneToTwoBillion OneHundredAndFortySevenMillion FourHundredAndEightyThreeThousand SixHundredAndFortySeven()" is a very good choice for a function name" - SpacixOne ( offering help to someone who really needed it ) ( spaces added for the benefit of people running at < 1280x1024 )

        ptr2void wrote (#3):

        Christian Graus wrote:

        If you do it across your HDD, I will bet money you break something.

        I didn't quite get this line. What might break? :doh:

        Christian Graus wrote:

        File.ReadAllBytes will load the files; then you can compare them.

        How do I compare them? Is there a particular method in the BCL, or do I have to write an algorithm for that?

          J4amieC wrote (#4):

          ptr2void wrote:

          I didn't quite get this line. What might break?

          If you let a process run wild on your machine, randomly deciding to delete EXEs, DLLs, etc., there is a fairly good chance you will break your Windows install.

          ptr2void wrote:

          How do I compare them? Is there a particular method in the BCL, or do I have to write an algorithm for that?

          You compare them by comparing their bytes. Try some code: read any EXE on your machine using File.ReadAllBytes, have a look at what that method returns (a byte array), and have a think about how you might compare the bytes of two separate files. (Hint: a byte is essentially a number between 0 and 255.)
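
For anyone who wants to check their attempt, here is a minimal sketch of the byte-by-byte comparison described above. The class and method names (FileContentComparer, HaveSameContents) are made up for illustration and are not part of the BCL. It loads both files fully into memory, which is probably fine for typical executables but not for very large files.

```csharp
using System;
using System.IO;

// Hypothetical helper, named here only for illustration.
static class FileContentComparer
{
    // Returns true when both files contain exactly the same bytes.
    public static bool HaveSameContents(string pathA, string pathB)
    {
        // Files with different lengths can never be identical,
        // so this cheap check avoids reading them at all.
        if (new FileInfo(pathA).Length != new FileInfo(pathB).Length)
            return false;

        byte[] a = File.ReadAllBytes(pathA);
        byte[] b = File.ReadAllBytes(pathB);

        // Guard against a file changing between the two reads.
        if (a.Length != b.Length)
            return false;

        // Compare the two arrays position by position.
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i])
                return false;
        }
        return true;
    }
}
```

With the files from the original question, FileContentComparer.HaveSameContents("abc.exe", "pqr.exe") should return true when one is a renamed copy of the other, while the two mscorlib.dll files from different framework versions should compare as different.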

            Marek Grzenkowicz wrote (#5):

            ptr2void wrote:

            I didn't quite get this line. What might break?

            Imagine that you installed two different applications that use the same control (e.g. Excellent.Grid.dll). Your app will find the Excellent.Grid.dll file in two different directories (e.g. C:\Program Files\Excellent Calendar and C:\Program Files\Excellent Cookbook), will delete one copy since they are exactly the same and... one application will stop working.

              ptr2void wrote (#6):

              J4amieC wrote:

              If you let a process run wild on your machine, randomly deciding to delete EXEs, DLLs, etc., there is a fairly good chance you will break your Windows install.

              I think you didn't understand my question, or maybe I didn't put it in a comprehensible manner. I don't want to delete the files found on the HDD; that would be horrible. I just want to delete (omit) them from my search results! :)

                J4amieC wrote (#7):

                "Delete" implies removing the file. "Omit from results" has a very different meaning.

                  carbon_golem wrote (#8):

                  The bolding had a profound effect as well.

                  "Run for your life from any man who tells you that money is evil. That sentence is the leper's bell of an approaching looter." --Ayn Rand

                    Jordanwb wrote (#9):

                    You could get the MD5 hash of each file, though of course this may take a while. I strongly recommend against your course of action, because you may screw something up.
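
As a rough sketch of the hashing idea, assuming the goal is only to omit duplicates from a list of search results and never to touch the files on disk: the names DuplicateFilter, ComputeMd5 and OmitDuplicates below are invented for this example. MD5.Create, ComputeHash and BitConverter.ToString are the standard framework calls.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper, named here only for illustration.
static class DuplicateFilter
{
    // Computes the MD5 hash of a file's contents as a hex string.
    public static string ComputeMd5(string path)
    {
        using (MD5 md5 = MD5.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            // The stream is hashed in chunks, so the whole file
            // does not need to fit in memory at once.
            byte[] hash = md5.ComputeHash(stream);
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }

    // Returns the paths with content duplicates left out.
    // The first path seen for each distinct hash is kept.
    // Nothing is deleted from disk.
    public static List<string> OmitDuplicates(IEnumerable<string> paths)
    {
        Dictionary<string, string> firstPathByHash = new Dictionary<string, string>();
        List<string> unique = new List<string>();

        foreach (string path in paths)
        {
            string hash = ComputeMd5(path);
            if (!firstPathByHash.ContainsKey(hash))
            {
                firstPathByHash.Add(hash, path);
                unique.Add(path);
            }
        }
        return unique;
    }
}
```

Two different files can in principle share an MD5 hash, so if a false match would matter, a byte-by-byte comparison of the files whose hashes agree would confirm it. Comparing file sizes before hashing is another way to cut down the "may take a while" part, since files of different sizes cannot be duplicates.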
