Code Project
The Proof that a GUID is not unique

The Lounge
Tags: question, com, algorithms, performance
43 Posts 30 Posters 31 Views 1 Watching
  • Z ZurdoDev

    Quote:

    They are a lot like snowflakes:

    Funny you bring that up. Recently a scientist claimed to have found many duplicates after cataloging many, many snowflakes. Of course they aren't unique.

    There are only 10 types of people in the world, those who understand binary and those who don't.

    SortaCore wrote (#22):

    Does that include artificial snowflakes? :rolleyes: :~

    • L Lost User

      It's a random number from a limited domain. Ask for enough numbers, and you'll encounter the same number sooner or later - one doesn't need much math to explain the logic.

      Bastard Programmer from Hell :suss: If you can't read my code, try converting it here[^]

      SortaCore wrote (#23):

      That's why I always repetitively hash and bitmask things until I end up with one byte. And it has to be a human-readable byte, too, for those pesky tech support calls. :cool:

      • P peterchen

        Only it's not really 128 "random" bits. For example, time value will roll over at less than 1400 years.

        ORDER BY what user wants

        Jan Holst Jensen2 wrote (#24):

        Quote:

        For example, time value will roll over at less than 1400 years.

        Oh My! Prepare for Year 3400 panic - all the COBOL code will have to be updated all over again.

        • N Nicholas Marty

          I still try to discourage everyone from using GUIDs if it's not truly necessary.
          1. Waste of disk space: if a 32-bit integer is sufficient, why use a GUID?
          2. Waste of processing power: more expensive to create, more expensive to compare against, etc.
          3. GUID is way more difficult to read for humans.
          Sure, there are uses for it, but in most cases you're perfectly fine without them.

          Lost User wrote (#25):

          Guids are great for building object relationships client side without a need to hit a database to get the next available integer. Users should never see a Guid.
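A minimal sketch of that idea, assuming random (version 4) GUIDs from Python's standard `uuid` module. The `Order` and `OrderLine` classes are hypothetical and only illustrate wiring up a parent/child relationship before anything touches a database:

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Order:
    # The key is assigned on the client at construction time, no database call.
    id: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass
class OrderLine:
    order_id: uuid.UUID  # foreign key to Order.id, known before any insert
    product: str
    id: uuid.UUID = field(default_factory=uuid.uuid4)


order = Order()
lines = [OrderLine(order.id, "widget"), OrderLine(order.id, "gadget")]

# The whole object graph is already linked and could be sent to a server in one batch.
print(all(line.order_id == order.id for line in lines))
```

With auto-increment integers, the child rows could not be given their foreign key until the parent row had been inserted and its key read back.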

          • M Marco Bertschi

            Searching for the number of possible GUIDs (answering questions in the forum section), I stumbled over this amusing SO thread where they are seriously discussing how the non-unique nature of a GUID can be proved. Enjoy. Edit: here is my answer to the said question (can you think of a creative one yourself?):

            Quote:

            A GUID is 128 bits. Therefore you would have to generate 2^128 + 1 GUIDs to be guaranteed to encounter a single GUID twice. A thread on StackOverflow.com says that you would need about 10790283070806014188970 years to encounter a single GUID twice, assuming your program does nothing other than create GUIDs and runs at a processor speed of 1 GHz, without any interruption from CPU time eaten by other programs or the operating system itself. As you can probably see now, encountering the same GUID twice would be very bad luck and can safely be considered unrealistic.

            People becoming wiser in order to notice the stupid things they did back in the young days. This doesn't mean that they really stop doing those things. Wise people still do stupid things, only on purpose.
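The arithmetic in the quoted answer can be checked in a few lines. This is a rough sketch under stated assumptions: one GUID per clock cycle at 1 GHz and 365-day years (both chosen to match the quote), plus a birthday-style estimate that is not in the original post and assumes 122 truly random bits, as in version 4 UUIDs:

```python
import math

SPACE = 2 ** 128                        # number of distinct 128-bit values
RATE = 10 ** 9                          # assumed: one GUID per cycle at 1 GHz
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # assumed: 365-day years

# Pigeonhole bound: only after SPACE + 1 draws is a repeat guaranteed.
years_to_exhaust = SPACE // (RATE * SECONDS_PER_YEAR)

# Birthday estimate: draws needed for roughly even odds of at least one repeat,
# n ~ sqrt(2 * N * ln 2), here with N = 2**122 possible random values.
draws_for_even_odds = math.sqrt(2 * (2 ** 122) * math.log(2))

print(f"{years_to_exhaust:.3e} years to exhaust the space")
print(f"{draws_for_even_odds:.3e} random GUIDs for even odds of a collision")
```

The first figure is on the order of 10^22 years, in line with the number quoted above. The second is far smaller than 2^128 but still in the quintillions, which is why a collision is possible in principle and unrealistic in practice.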

            svella wrote (#26):

            Of course it is possible to generate non-unique GUIDs, but if you use a reliable implementation of a good generation algorithm, you can be reasonably confident that GUIDs will be unique within the context that matters. The only time I have seen problems was with an unreliable implementation: NetWare 5 used to generate duplicate GUIDs when timesync caused the clock to go backwards.

            • P PIEBALDconsult

              Nicholas Marty wrote:

              3. GUID is way more difficult to read for humans

              So what? IDs are meant to be meaningless; so that's a good thing. Even with integers you'll likely wind up just copying-and-pasting anyway. Or do this:
              00000000-0000-0000-0000-000000000001
              00000000-0000-0000-0000-000000000002
              00000000-0000-0000-0000-000000000003
              00000000-0000-0000-0000-000000000004
              00000000-0000-0000-0000-000000000005

              StatementTerminator wrote (#27):

              But...but...integers fit so much more nicely into a query string! *duck and cover* Seriously though, one of the few uses I've found for GUIDs is to allow the keys in your tables to be easily moved between different databases, for instance to move between a production and test DB. But otherwise integers are more efficient and easier to work with. As for the uniqueness debate, who cares? Raise your hand if you have ever seen a duplicate GUID pop up in a real-world situation. It's like arguing about the randomness of pseudo-random number generators; it's a moot point for almost all real-world implementations. If you're generating thousands of GUIDs per second in a system that you expect to be around for centuries, then maybe you should worry about it. Otherwise, it's like worrying about the server being taken out by a meteor hit. And even if you're unlucky enough to have a collision, you'd have to have a pretty fragile system for that to be a huge disaster; you'll probably have a dupe showing up in a join somewhere, which is not that hard to find and fix.

              • S StatementTerminator

                IndifferentDisdain wrote (#28):

                Yup; my previous employer offered both standalone installations and a SAAS model where we hosted, and some of the primary keys were ints. When a client decided it was better to go from standalone to SAAS, merging was a giant PITA.
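A toy illustration of why that merge hurts, using in-memory dictionaries as stand-ins for two tables. The data and names are made up; the point is only that independently numbered integer keys collide on merge while independently generated GUID keys do not:

```python
import uuid

# Two standalone installations, each numbering its rows from 1.
site_a_int = {1: "Alice", 2: "Bob"}
site_b_int = {1: "Carol", 2: "Dave"}

# Naive merge: rows with the same integer key overwrite each other.
merged_int = {**site_a_int, **site_b_int}
print(len(merged_int))   # 2 rows survive out of 4

# The same data keyed by GUIDs generated independently at each site.
site_a_guid = {uuid.uuid4(): name for name in ("Alice", "Bob")}
site_b_guid = {uuid.uuid4(): name for name in ("Carol", "Dave")}

merged_guid = {**site_a_guid, **site_b_guid}
print(len(merged_guid))  # 4 rows, no renumbering of keys or foreign keys
```

In a real database the integer case would fail on the primary-key constraint rather than silently overwrite, and every foreign key pointing at a renumbered row would have to be rewritten too.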

                • T ThatEffinIanHarrisBloke

                  Yep, I always work on the principle that a GUID may be duplicated, so I add ticks since the epoch or something like that to my GUIDs, because ticks should only ever increase. The chances of getting the exact same ticks in milliseconds AND a duplicate GUID are not gunna happen! You're welcome :)

                  Jadoti wrote (#29):

                  Adding it where? GUIDs aren't always "increasing", so adding an increasing value to a random value doesn't mean you won't get a duplicated outcome.
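The distinction can be shown with toy values. This sketch assumes two readings of "add": arithmetic addition of the tick count to the GUID's integer value, versus keeping the tick count alongside the GUID. Which one the earlier poster meant is not stated:

```python
import uuid

# Toy values chosen so the arithmetic is easy to follow.
guid_1, ticks_1 = uuid.UUID(int=1000), 5
guid_2, ticks_2 = uuid.UUID(int=995), 10

# Arithmetic addition: different inputs, identical result.
summed_1 = guid_1.int + ticks_1
summed_2 = guid_2.int + ticks_2
print(summed_1 == summed_2)   # True: 1005 == 1005

# Keeping both parts side by side: ids match only if both parts match.
paired_1 = (ticks_1, guid_1)
paired_2 = (ticks_2, guid_2)
print(paired_1 == paired_2)   # False
```

Only the paired form requires both the tick count and the GUID to repeat before two identifiers collide.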

                  • M Marco Bertschi

                    RafagaX wrote (#30):

                    Arguing about the uniqueness of GUIDs is pointless: they're integers in a finite space, which means that sooner or (most likely) later there's going to be a collision. For practical purposes, however, we can say that they're unique.

                    CEO at: - Rafaga Systems - Para Facturas - Modern Components for the moment...

                    • O OriginalGriff

                      You sure? They are a lot like snowflakes: if you look at the fine detail they aren't the same. :laugh:

                      The only instant messaging I do involves my middle finger. English doesn't borrow from other languages. English follows other languages down dark alleys, knocks them over and goes through their pockets for loose grammar.

                      Magnamus wrote (#31):

                      To be fair though, snowflakes are only 64 bit

                      • S StatementTerminator

                        Magnamus wrote (#32):

                        They are also good for situations where you want an unpredictable value, such as a reference in a link for verifying an email address.
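A hedged sketch of such a link. It assumes a random (version 4) GUID, which CPython draws from the operating system's random source; a time-and-MAC-based GUID would be guessable and unsuitable here. The URL is a hypothetical placeholder:

```python
import secrets
import uuid

# Hypothetical base URL, used only for illustration.
VERIFY_URL = "https://example.com/verify?token="


def make_verification_link():
    """Return (token, link); the token would be stored next to the pending address."""
    token = uuid.uuid4().hex   # 32 hex characters, 122 of the 128 bits are random
    return token, VERIFY_URL + token


token, link = make_verification_link()
print(link)

# Alternative from the standard library, designed for security tokens.
print(secrets.token_urlsafe(16))
```

Whether a given platform's GUID generator is random enough for this purpose depends on its implementation, so a dedicated token API such as `secrets` is the more conservative choice.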

                        • M Marco Bertschi

                          code_junkie wrote (#33):

                          I've actually been burned twice in the same year by GUID collisions within unrelated software products from other companies. They are a very poor architecture choice.

                          • M Marco Bertschi

                            dpminusa wrote (#34):

                            This may not be in the same spirit of fun that the article sets up, but I thought GUIDs were guaranteed unique because they are based on MAC addresses that are guaranteed as unique!? So proving MAC addresses are not unique would be a prerequisite?!

                            "Courtesy is the product of a mature, disciplined mind ... ridicule is lack of the same - DPM"

                            • D dpminusa

                              StatementTerminator wrote (#35):

                              But your server is generating GUIDs using the same MAC address, right? So it wouldn't be unique per GUID generated on that server. But you shouldn't have to worry about collisions with GUIDs generated on other machines, I guess.

                              • S StatementTerminator

                                PIEBALDconsult wrote (#36):

                                StatementTerminator wrote:

                                easily moved between different databases

                                Yes indeed. Reminds me of one place I worked where identities were used and the only way to view PROD data was to have a tool copy the data to a DEV database -- but the tool didn't allow the IDs to be copied :sigh: , the copied rows all had new IDs that didn't match PROD. My argument isn't entirely against integers, but auto-increment integers over which the developer has no control. X|

                                StatementTerminator wrote:

                                integers are more efficient and easier to work with

                                My experience has been the opposite.

                                StatementTerminator wrote:

                                As for the uniqueness debate, who cares?

                                'Xactly

                                • S StatementTerminator

                                  Plamen Dragiyski wrote (#37):

                                  let myguid;
                                  do {
                                      myguid = getGUID();    // generate a candidate
                                  } while (exists(myguid));  // retry only if it is already taken

                                  This would solve the uniqueness problem: the loop will almost always execute only once for the next several thousand years, and it will never produce a bug through "bad luck". P.S. Sorry for my pseudocode, I usually write in JavaScript.
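A runnable version of the same loop, as a sketch. It assumes the set of already-issued IDs fits in an in-memory `set`; the `new_unique_id` helper and its `generate` parameter are hypothetical names introduced here so a collision can be forced for demonstration:

```python
import uuid


def new_unique_id(existing, generate=uuid.uuid4):
    """Return an id not present in `existing`, regenerating on a collision."""
    candidate = generate()
    while candidate in existing:   # almost never true for random GUIDs
        candidate = generate()
    existing.add(candidate)
    return candidate


# Force a collision with a stub generator to show the retry path.
taken = {uuid.UUID(int=1)}
stub = iter([uuid.UUID(int=1), uuid.UUID(int=2)])
print(new_unique_id(taken, generate=lambda: next(stub)))
```

The check and the insert are not atomic here, so with several writers a database unique constraint plus a retry on failure would be the sturdier form of the same idea.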

                                  • S StatementTerminator

                                    dpminusa wrote (#38):

                                    I assume their algorithm is designed to use the MAC address as a seed and is constructed to guarantee unique GUIDs given this. You are right that there is a lot more to it than just the unique seed. I have never seen the algorithm and clearly never will. Sounds like a really interesting math problem though. I would love to know the rest of the approach they used.

                                    "Courtesy is the product of a mature, disciplined mind ... ridicule is lack of the same - DPM"

                                    • D dpminusa

                                      patbob wrote (#39):

                                      Systems without network cards can generate GUIDs. What do they use for their MAC address? Yup... zeros. A GUID also includes clock ticks of some sort or another that (they hope) tick faster than the system can request GUIDs. I seem to recall there are some bits in there for a sequence number within a clock tick, or maybe systems just keep track of the last one issued and ensure they don't generate duplicates. I think there might be some other sources of mostly unique bits thrown in, like CPU serial numbers or something. The idea is that within a given uptime, of a given OS load, on a given system, they are guaranteed to be unique, and between systems, they are as unique as reasonably possible.

                                      How are these bits packed into the GUID? It doesn't matter; no amount of deterministically massaging the bits will give you anything more unique. Massaging the source bits could obscure them and make backtracking to the original values for nefarious purposes more difficult, and I suspect it's done. Using the source bits as the seed for a pseudorandom number generator won't add uniqueness, but it is probably a pretty good, and inexpensive, way to deterministically massage the bits to obscure them.

                                      GUIDs were never absolutely guaranteed to be unique, and I'd be willing to bet most of those sources of unique bits are no longer unique once one starts running GUID generation code in virtual machines.

                                      We can program with only 1's, but if all you've got are zeros, you've got nothing.
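For readers who want to look at those fields, Python's standard `uuid` module exposes them for time-based (version 1) UUIDs as laid out in RFC 4122. This is a sketch of that one layout only; it says nothing about what any particular platform's GUID generator does internally:

```python
import uuid

u = uuid.uuid1()   # version 1: timestamp + clock sequence + node

print("version   :", u.version)
print("timestamp :", u.time)             # 60-bit count of 100 ns intervals since 1582-10-15
print("clock seq :", u.clock_seq)        # 14 bits, meant to change if the clock steps backwards
print("node      :", f"{u.node:012x}")   # 48 bits: a MAC address or a random stand-in

# A second ID from the same process differs from the first in its time fields.
v = uuid.uuid1()
print(u != v)
```

The clock sequence is the part intended to cover the "clock went backwards" case, and the node field is where the MAC address, when one is used, ends up in plain sight.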

                                      • P patbob

                                        dpminusa wrote (#40):

                                        Great insights. Thanks. This has always interested me. I guess I should try to research it better sometime.

                                        "Courtesy is the product of a mature, disciplined mind ... ridicule is lack of the same - DPM"

                                        • P patbob

                                          dpminusa wrote (#41):

                                          This has always interested me. I guess I should try to research it better sometime. Thanks for the great insights.

                                          "Courtesy is the product of a mature, disciplined mind ... ridicule is lack of the same - DPM"
