hashing algorithms

  • J James Simpson

    Can someone please enlighten me as to how these are possible? Is a 'hash' a piece of data that can be used to validate data, e.g. Data -> Hash Algorithm -> Hash value, where the same data creates the same hash value but you cannot recreate the data from the hash using the same algorithm? In my ignorance I must ask: what is the point of the hash value? Surely the data represents itself, and the hash value could not uniquely identify the item of data? If this is not suitable for the Lounge, I apologise; I will remain confused! James Simpson Web Developer imebgo@hotmail.com P S - This is what part of the alphabet would look like if Q and R were eliminated
    Mitch Hedberg

    JWood
    wrote on last edited by
    #2

    I believe hash, as it applies to the Lounge, is a drug that is derived from the cannabis plant and is illegal in many countries. A hashing algorithm would therefore be the smoking of this substance. J.

      James Simpson
      wrote on last edited by
      #3

      Questions, questions! Message digest algorithm (RFC 1320): The message digest algorithm takes as input a message of arbitrary length and produces as output a "fingerprint" or "message digest" of the input. It is conjectured that it is computationally infeasible to produce two messages having the same message digest, or to produce any message having a given pre-specified target message digest. To me, the above paragraph indicates that the message digest is a fixed-length value created from a variable-length piece of data. How can a fixed-length piece of data represent ANY input piece of data? It makes no sense, or am I being totally stupid? Maybe if you knew the length of the input data along with the message digest, then it would work! I'm confused. James Simpson Web Developer imebgo@hotmail.com P S - This is what part of the alphabet would look like if Q and R were eliminated
      Mitch Hedberg

        James Simpson
        wrote on last edited by
        #4

        On reading the message again, I suppose the key phrase is "computationally infeasible". That doesn't mean the hash will be unique, just that it is very, very difficult to find two items of data which have the same hash. Fair enough. I suppose checking the length of the input data as well as the hash would nearly guarantee unique data. Sorry for wasting everyone's time! :-D James Simpson Web Developer imebgo@hotmail.com P S - This is what part of the alphabet would look like if Q and R were eliminated
        Mitch Hedberg

          James T Johnson
          wrote on last edited by
          #5

          James Simpson wrote: what is the point of the hash value? A hash is useful when you don't want to store or transmit sensitive data. APOP (a more secure mechanism for logging into POP3) and Windows authentication (from what I've read) both use a hash in a challenge/response mechanism. APOP works by hashing the password together with a value that is different for every connection, usually a timestamp that the server reports in its welcome message, and comparing the hash the client sends with the one the server computes. Because the password is never transmitted over the line, and the value combined with the password changes with every connection, you are relatively safe from packet sniffers and replay attacks. Another use is integrity checking: take a large piece of data that has been transmitted and compare the hash computed by the server with the one computed by the client. If they are the same, you can be reasonably sure that the data was unchanged during transmission. If you take that hash, sign it with the private key of a public/private key pair, and attach it to the document (à la PGP), the recipient has a basis for deciding whether the document is authentic. Some systems also store user passwords as hashes; Unix does this, as do many websites. The password may still be transmitted as plain text, but the idea is to protect the stored passwords should the system be broken into (or should some ill-minded employee decide to execute a SELECT username, password FROM users). In response to a later post of yours: the hash is created by applying a mathematical algorithm to the data you wish to hash. The point isn't to recreate your data from the hash but to use the hash, because of its small size, as a way of verifying some aspect of that data.
          James Simpson wrote: If this is not suitable for the lounge I'm not sure if it is or not... oh well, no harm done I guess ;) James "When you get frunk whats really fuinny is that you dont really realize you are cdtrunk till you are too drunk and by thewn you are too drunk to give a damn about being drubnk :-0" A drunk Nish over Sonork
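The challenge/response idea described in this post can be sketched roughly as follows. This is an illustration only, not the APOP wire protocol: the function names, the way the challenge is generated, and the order in which challenge and password are joined are all assumptions made for the example. MD5 appears only because APOP specifies it; it would not be a good choice for a new design.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> str:
    """Server side: a fresh, unpredictable value for each connection."""
    return secrets.token_hex(16)


def response_for(challenge: str, password: str) -> str:
    """Hash the per-connection challenge together with the shared password."""
    return hashlib.md5((challenge + password).encode("utf-8")).hexdigest()


def server_accepts(challenge: str, stored_password: str, response: str) -> bool:
    """Server recomputes the digest itself and compares in constant time."""
    expected = response_for(challenge, stored_password)
    return hmac.compare_digest(expected, response)


challenge = make_challenge()
reply = response_for(challenge, "s3cret")   # computed by the client
print(server_accepts(challenge, "s3cret", reply))

# A sniffed reply is useless on the next connection, because the
# server issues a different challenge each time.
print(server_accepts(make_challenge(), "s3cret", reply))
```

Only the challenge and the digest cross the wire; the password itself never does. Note that the server in this sketch must know the password in clear text in order to recompute the digest.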

            Daniel Turini
            wrote on last edited by
            #6

            There are lots of uses. One example is password checking: you don't need to store a password; you can store only the hash of the password and compare hashes when you need to check it. Another use is a hash table. Suppose you have 100,000 elements you want to search. If you can create (and often you can) a hash function which gives a unique number to each of these elements, a lookup becomes a single array access instead of a scan of up to 100,000 elements. Like this: string "test" -> hash value 5 -> a[5] = "test"; string "another" -> hash value 10 -> a[10] = "another"; When you need to search for "test", you only compute its hash value (5) and look at a[5], without needing to search through the whole array. Trying to make bits uncopyable is like trying to make water not wet. -- Bruce Schneier
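A minimal sketch of the lookup described above, assuming a made-up hash function (`toy_hash`, the sum of the character codes modulo the table size) and a table small enough to read at a glance. The slot numbers it produces differ from the 5 and 10 used in the post, and it deliberately ignores the case where two words land in the same slot.

```python
TABLE_SIZE = 16


def toy_hash(key: str) -> int:
    """Illustrative only: sum of character codes, folded into the table."""
    return sum(ord(ch) for ch in key) % TABLE_SIZE


table = [None] * TABLE_SIZE
for word in ("test", "another"):
    table[toy_hash(word)] = word   # assumes no two words share a slot


def contains(key: str) -> bool:
    """One hash computation and one array access, no scan of the table."""
    return table[toy_hash(key)] == key


print(toy_hash("test"), toy_hash("another"))
print(contains("test"), contains("missing"))
```

The cost of `contains` does not depend on how many entries the table holds, which is where the speed-up over a linear search comes from.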

              Nic Rowan
              wrote on last edited by
              #7

              Arg, I suck at explaining things, but here goes: I used a one-way hash when I was making a licence key generator. Basically, I could take a string of any length, run the hash algorithm on it, and it would return a 4-byte hex string. You couldn't convert that hex back into the original string, but you could perform the same hash on the original string and get the same hex out. This was a very specific hash algorithm, so I'm sure there are other types. Good luck :-D


              Despite the high cost of living, it remains popular. Build a man a fire, and he'll be warm for a day. Set a man on fire, and he'll be warm for the rest of his life.


                James Simpson
                wrote on last edited by
                #8

                I see the uses behind it, but I don't fully understand how the algorithms can take any piece of data and effectively shrink or grow it to a fixed size (say 128 bits) and still keep it unique! From what I have read, I think it cannot guarantee uniqueness, but it reduces the chance of two items having the same hash value to a very, very small value! James Simpson Web Developer imebgo@hotmail.com P S - This is what part of the alphabet would look like if Q and R were eliminated
                Mitch Hedberg

                  Daniel Turini
                  wrote on last edited by
                  #9

                  James Simpson wrote: I see the uses behind it, but I dont fully understand how the algorithms can take any peice of data and effectivly shrink or grow it to a fixed size (say 128 bit) and still keep it unique! You are right: one can do it only for finite and small sets of data. James Simpson wrote: I think from what I have read that it can not gaurentee its uniqueness, but reduce the chances of two items having the same hash value to a very very very small value! Yes, and when two items do get the same hash value we have what we call a "collision". So normally your hash table entries are not single strings, as in my sample, but linked lists (or arrays) of strings. That way you can still deal with collisions, and a good hash function will keep them to a minimum. If you want to look at a program that generates perfect hash functions for limited sets of data, take a look at gperf. Trying to make bits uncopyable is like trying to make water not wet. -- Bruce Schneier
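Here is a rough sketch of a table whose entries are lists rather than single strings, as described above. The class name `ChainedTable` and its toy hash are hypothetical; the hash is chosen so that anagrams collide on purpose, which makes the collision handling visible.

```python
class ChainedTable:
    """Hash table where each slot holds a list of (key, value) pairs."""

    def __init__(self, slots: int = 8) -> None:
        self._buckets = [[] for _ in range(slots)]

    def _index(self, key: str) -> int:
        # Hypothetical toy hash: anagrams always get the same index.
        return sum(ord(ch) for ch in key) % len(self._buckets)

    def put(self, key: str, value) -> None:
        bucket = self._buckets[self._index(key)]
        for i, (existing, _) in enumerate(bucket):
            if existing == key:          # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))      # new key, possibly a collision

    def get(self, key: str, default=None):
        # Only the one bucket is scanned, never the whole table.
        for existing, value in self._buckets[self._index(key)]:
            if existing == key:
                return value
        return default


table = ChainedTable()
table.put("listen", 1)
table.put("silent", 2)   # same letters, same toy hash, same bucket
print(table.get("listen"), table.get("silent"), table.get("enlist"))
```

Because the full key is stored next to each value, a collision costs a short scan of one bucket but never returns the wrong entry.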

                    markkuk
                    wrote on last edited by
                    #10

                    Data validation is the use for cryptographically secure, "one way" hashing algorithms. There are other uses for hash functions that have different requirements, e.g. quick searching of data.

                      l a u r e n
                      wrote on last edited by
                      #11

                      For example... a user signs up with a web site, giving a username and password. Every time they log in, you don't want to be sending the password over the wire, so you create a hash of the password and send that instead, and you store a hash of the same password on the server. No one can get the password from packet-sniffing the wire, because it isn't sent :)


                      "there is no spoon"
                      biz stuff   about me

                        ColinDavies
                        wrote on last edited by
                        #12

                        l a u r e n wrote: no-one can get the password from packet sniffing the wire cos it isnt sent True but the hash was sent. And the dealers will be happy. :-) Regardz Colin J Davies

                        *** WARNING *
                        This could be addictive
                        **The minion's version of "Catch :bob: "

                        It's a real shame that people as stupid as you can work out how to use a computer. said by Christian Graus in the Soapbox

                          Shog9 0
                          wrote on last edited by
                          #13

                          :laugh::laugh:

                          Your sincerity about keeping the soapbox organized and civilized is so obvious. I solute your effort. -- Anonymous, 10/18/03

                            Jorgen Sigvardsson
                            wrote on last edited by
                            #14

                            Ooooh! Tomorrow I'll be pushing SHA-1 and MD5 to kids on school yards. The first 32 bits are free... -- Your life as it has been is over. From this time forward you will service us.

                              Daniel Turini
                              wrote on last edited by
                              #15

                              That's why I love CP: today I've learned a new meaning for "hash" :) Trying to make bits uncopyable is like trying to make water not wet. -- Bruce Schneier

                                peterchen
                                wrote on last edited by
                                #16

                                One use that has not been mentioned is speeding up lookup, e.g. in an associative map. Imagine you have a dictionary [key, value] with a great many long words as keys. Instead of using string comparisons, you just compare the hashes. The hashes of the strings in the dictionary can be precalculated, and the dictionary indexed by the hash instead of the string. Of course, there's a slight chance of two keys having the same hash. This must be treated separately, so you end up with a dictionary [key-hash, [ vector([key,value]) ] ]. However, with a well-chosen hash this can be much faster.


                                "Vierteile den, der sie Hure schimpft mit einem türkischen Säbel."
                                sighist | Agile Programming | doxygen

                                  Paul Oss
                                  wrote on last edited by
                                  #17

                                  Daniel Turini wrote: When you need to search for "test", you only compute its hash value (5) and look at a[5], without need to search through all the array. That assumes your hash algorithm truly generates a unique number for each unique string. I.e., if your algorithm takes two completely different strings and, due to weaknesses in your math, returns the same hash value for both, you're screwed. ;) Paul

                                    David Crow
                                    wrote on last edited by
                                    #18

                                    In the context of hashing, the failure of this "uniqueness" you speak of is called a collision. It's not always a bad thing. In some cases, an even spread of the data across the available slots is all that's required (i.e., collisions are expected). If your implementation cannot tolerate collisions going unhandled, they must be dealt with using one of several methods (e.g., separate chaining or open addressing).


                                    Five birds are sitting on a fence. Three of them decide to fly off. How many are left?
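Open addressing, one of the two methods named in this post, might look like the following sketch. It uses linear probing over a fixed-size table; the class name `ProbingTable` and the toy hash are made up for the example, and deletion and resizing are left out to keep it short.

```python
class ProbingTable:
    """Fixed-size table using open addressing with linear probing."""

    def __init__(self, slots: int = 8) -> None:
        self._keys = [None] * slots
        self._values = [None] * slots

    def _probe(self, key: str):
        # Hypothetical toy hash picks the starting slot; on a collision
        # the search simply moves on to the next slot, wrapping around.
        start = sum(ord(ch) for ch in key) % len(self._keys)
        for step in range(len(self._keys)):
            yield (start + step) % len(self._keys)

    def put(self, key: str, value) -> None:
        for i in self._probe(key):
            if self._keys[i] is None or self._keys[i] == key:
                self._keys[i], self._values[i] = key, value
                return
        raise RuntimeError("table is full")

    def get(self, key: str, default=None):
        for i in self._probe(key):
            if self._keys[i] is None:    # empty slot: key was never stored
                return default
            if self._keys[i] == key:
                return self._values[i]
        return default


table = ProbingTable()
table.put("listen", 1)
table.put("silent", 2)   # collides with "listen", lands in the next slot
print(table.get("listen"), table.get("silent"), table.get("enlist"))
```

Unlike separate chaining, every entry lives in the table itself, so the table can fill up and lookups slow down as it approaches capacity.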

                                      BrainJar
                                      wrote on last edited by
                                      #19

                                      That's because hashing algorithms don't grow or shrink the input data, and they don't generate unique values. If a particular hash algorithm generates a 128-bit value, it generally means that it keeps an internal 16-byte array (or vector) initialised to fixed starting values. It takes the first block of the input data (64 bytes at a time for MD4 and MD5) and performs addition, bit shifting and/or other operations to combine that block with the internal 16 bytes. (The 16 bytes are generally processed as four 32-bit integers rather than as sixteen 8-bit integers, but you get the point.) This changes the data in the internal 16-byte vector. The program then repeats the operations with the next block of input, then the next, and so on. In the end, the program just outputs whatever result is left in the internal vector. In fact, most hashes must pad the input to get a whole multiple of the block length. It doesn't matter how large or small the input is: only one block is processed at a time and only 16 bytes are output. Increasing the input length just increases the number of times through the computation loop. The trick to a good hash, like MD4, MD5 or SHA-1, is to use computations that result in an avalanche effect: a small change in the input, like a single character in a 10 MB file, radically alters the output. It's possible for two different inputs to hash to the same 128-bit value; in fact it's inevitable, because there are more possible inputs than the 2^128 possible outputs. But your chances of stumbling on two such inputs are pretty slim.
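Two of the points in this post are easy to check with Python's standard `hashlib` module: the digest has the same size whatever the input length, and a one-character change in the input flips a large share of the output bits. The helper names `md5_bits` and `differing_bits` are made up for this sketch, and MD5 is used only because the post discusses 128-bit digests.

```python
import hashlib


def md5_bits(data: bytes) -> int:
    """Return the 128-bit MD5 digest of data as an integer."""
    return int.from_bytes(hashlib.md5(data).digest(), "big")


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the digest bits that differ between two inputs."""
    return bin(md5_bits(a) ^ md5_bits(b)).count("1")


# Same digest size for a 1-byte input and a roughly 1 MB input.
short = b"a"
long_ = b"a" * 1_000_000
print(len(hashlib.md5(short).digest()), len(hashlib.md5(long_).digest()))

# Avalanche effect: the inputs differ in a single character.
original = b"The quick brown fox jumps over the lazy dog"
changed = b"The quick brown fox jumps over the lazy cog"
print(differing_bits(original, changed), "of 128 bits differ")
```

For a well-mixed hash, the count printed on the last line is expected to be somewhere near half of the 128 bits.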

                                        Nitron
                                        wrote on last edited by
                                        #20

                                        :omg: You mean you never knew that people used this to identify data? ;P - Nitron


                                        "Those that say a task is impossible shouldn't interrupt the ones who are doing it." - Chinese Proverb

                                          Terry Denham
                                          wrote on last edited by
                                          #21

                                          This is exactly what I did with an ADO Recordset class that was generated using the #import directive. One of my peers needed to pull back a fairly large set of data, but due to some requirements we weren't able to build the data with a set of joins, so all of it had to be represented in one recordset. Some of the records were linked to other records in the same set. The process was taking about 14 hours to handle about 140,000 records because of the large number of loops it had. I had them remove one of the loops and wrote a CAdoRecordsetIndex class that acts as an associative map over the records in the CAdoRecordset class, based on whichever columns you tell it to build the index on. When you need to find a record, you pass in the array of values you want to search for; the index class turns this into a key and finds the bookmark of the record that has this key. I used a vector as the value in case there were multiple records with the same non-unique key. This change alone took the processing from 14 hours to 15 minutes, just by using the associative hash. It could have been improved further if I had changed the hash bucket count. I used the default of 17 (bucket counts are usually chosen to be prime). With a larger prime I would have reduced the amount of time spent looking through the vectors.
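The effect of the bucket count mentioned at the end of this post can be estimated with a small experiment. This is only a sketch under stated assumptions: the keys are synthetic strings, `zlib.crc32` stands in for whatever hash function the real map class used, and the two larger bucket counts are simply primes picked for the comparison.

```python
import zlib
from collections import Counter


def average_chain_length(keys, bucket_count: int) -> float:
    """Mean number of keys per non-empty bucket for a given bucket count."""
    # zlib.crc32 is a stand-in for the container's real hash function.
    used = Counter(zlib.crc32(k.encode("utf-8")) % bucket_count for k in keys)
    return len(keys) / len(used)


keys = [f"record-{n}" for n in range(140_000)]
for bucket_count in (17, 10_007, 100_003):
    print(bucket_count, round(average_chain_length(keys, bucket_count), 1))
```

With only 17 buckets, each lookup still has to walk a list of several thousand entries; a bucket count closer to the number of records brings that down to a handful.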
