Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. General Programming
  3. Algorithms
  4. generate unrepeated serials

generate unrepeated serials

Scheduled Pinned Locked Moved Algorithms
16 Posts 5 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • N Nathan Addy

    Actually, if I understand your algorithm correctly, I don't think it's the best thing to do. (Please correct me if I'm wrong though) Basically, your set of numbers will be much less random than they should be (they will end up being much more evenly spread over the range than they would be if they were "really" random). If you try to generate a list of a bunch of random and non-equal 4 digit numbers, the way I understand your algorithm works, it would do something like the following (got the random umbers on the right hand side from python). 10 . 47 = 1047 11 . 67 = 1167 12 . 77 = 1277 13 . 23 = 1323 14 . 01 = 1411 And you're saying that because of the numbers on the left, from 10 to 14, guarantee the numbers will be unique (which is true), and the ones on the right "randomize" the numbers. The problem is that your numbers aren't very random at all (not uniformly distributed in the range 1000-9999). If you generate 40 numbers this way, none of them will be over 5000. For a really random set of 40 different numbers, the chance would be somewhere around (but not equal to, because the numbers are all different) about (1/2)^40, which is no chance at all. If you switch your algorithm around, by putting the random half first, you get something MUCH better. In this case, you would get the numbers 4710, 6711, 7712, 2313, 1114. Your numbers will be much, much, much more random. In this guys case, you'd generate 11 digit random numbers, and use the last 5 for the sequence. They still won't be as great as they could theoretically be. If, for example, you did statistical tests of these numbers, you would find that exactly 10% of the numbers end in 1, 10% in 2, and so on. Obviously that wouldn't be true of a "really" random set. Basically the best way to do it is just generate random numbers in that range. There is about a one in 50 million chance you'll have overlaps if your random number generator is good.

    C Offline
    C Offline
    cp9876
    wrote on last edited by
    #6

    The OP simply requested unrepeated, not random. Truly random numbers must be allowed to repeat (otherwise you are choosing from a smaller and smaller pool). For serial numbers Nathan's approach sounds fine, how you combine (or mangle) the two/three components depends on the user's requirements. Another approach is to take the numbers 1,2 .. 100,000 and encrypt or hash them, but this is not as simple as Nathan's approach. If you encrypt them it is possible to find the original serial number if this is important.

    Peter "Until the invention of the computer, the machine gun was the device that enabled humans to make the most mistakes in the smallest amount of time."

    N U 2 Replies Last reply
    0
    • C cp9876

      The OP simply requested unrepeated, not random. Truly random numbers must be allowed to repeat (otherwise you are choosing from a smaller and smaller pool). For serial numbers Nathan's approach sounds fine, how you combine (or mangle) the two/three components depends on the user's requirements. Another approach is to take the numbers 1,2 .. 100,000 and encrypt or hash them, but this is not as simple as Nathan's approach. If you encrypt them it is possible to find the original serial number if this is important.

      Peter "Until the invention of the computer, the machine gun was the device that enabled humans to make the most mistakes in the smallest amount of time."

      N Offline
      N Offline
      Nathan Addy
      wrote on last edited by
      #7

      Agreed. The original post asked for "unrepeated" and "nonconsecutive". By itself, this is really easy (10^6, 10^6 +2, 10^6 + 4, ...), and so I simply assumed he misasked his question. Specifically I assumed that he wanted a random sample from that range without replacement. Like I told him, without getting a better requirements list, it wasn't going to be possible to answer his question properly. Assuming he means "a random sample without replacement", I think my method (just generate the numbers) is best. The only problem here is that there aren't any guarantees they are unequal. If you could "fudge" the randomness a bit, a previous poster came up with a good method (although in his method, you should put the random portion of the numbers first, then the sequential part, to maximize randomness). That method would generate 100K numbers that will be close to uniformly distributed. You could also generate the numbers randomly and just keep the already generated ones sorted, so you can just check to see if they are in the list. This would produce the most accurate result, although also would take the most work (you'd have to do an extra insert and search into some data structure for each of your 100K iteration). Finally, if you are comfortable with the theoretical foundations of random number generators, you might have some interesting possibilities there. As you might know, many of the most common RNGs are "Linear congruential RNGS". Each RNG of this type is defined by 3 numbers, a, b, and m. You start with an initial seed, n_0, and the number that comes out is (a*n_0 + b) mod m. That is your random number. By reapplying the formula with the JUST generated number will get you your second random number and so on. As far as this class of generators goes, there is an art and a science to picking "good" numbers" I think if you worked at it for an hour, you could figure out how to pick a, b, and m such that you could get a "uniform random without replacement" sample of 100K. I would guess this could be "the best" way, but it would require a little mathematical knowledge (nothing you can't get in a couple hours on wikipedia), and if you didn't know what you were doing, it would be pretty easy to mess it up. (I haven't looked at this in a while, but I think if you pick a prime as close to (10^17 - 10^16) as possible, and then get REALLY REALLY big a and b, you'd be ok.) On the other hand, if unique and non-consecutive is REALLY what he wants, the following will do the trick: std::vector

      C U 2 Replies Last reply
      0
      • N Nathan Addy

        Agreed. The original post asked for "unrepeated" and "nonconsecutive". By itself, this is really easy (10^6, 10^6 +2, 10^6 + 4, ...), and so I simply assumed he misasked his question. Specifically I assumed that he wanted a random sample from that range without replacement. Like I told him, without getting a better requirements list, it wasn't going to be possible to answer his question properly. Assuming he means "a random sample without replacement", I think my method (just generate the numbers) is best. The only problem here is that there aren't any guarantees they are unequal. If you could "fudge" the randomness a bit, a previous poster came up with a good method (although in his method, you should put the random portion of the numbers first, then the sequential part, to maximize randomness). That method would generate 100K numbers that will be close to uniformly distributed. You could also generate the numbers randomly and just keep the already generated ones sorted, so you can just check to see if they are in the list. This would produce the most accurate result, although also would take the most work (you'd have to do an extra insert and search into some data structure for each of your 100K iteration). Finally, if you are comfortable with the theoretical foundations of random number generators, you might have some interesting possibilities there. As you might know, many of the most common RNGs are "Linear congruential RNGS". Each RNG of this type is defined by 3 numbers, a, b, and m. You start with an initial seed, n_0, and the number that comes out is (a*n_0 + b) mod m. That is your random number. By reapplying the formula with the JUST generated number will get you your second random number and so on. As far as this class of generators goes, there is an art and a science to picking "good" numbers" I think if you worked at it for an hour, you could figure out how to pick a, b, and m such that you could get a "uniform random without replacement" sample of 100K. I would guess this could be "the best" way, but it would require a little mathematical knowledge (nothing you can't get in a couple hours on wikipedia), and if you didn't know what you were doing, it would be pretty easy to mess it up. (I haven't looked at this in a while, but I think if you pick a prime as close to (10^17 - 10^16) as possible, and then get REALLY REALLY big a and b, you'd be ok.) On the other hand, if unique and non-consecutive is REALLY what he wants, the following will do the trick: std::vector

        C Offline
        C Offline
        cp9876
        wrote on last edited by
        #8

        Guess we had different takes on the original question, but it does raise some interesting ideas, as long as we agree that they may have nothing to do with the original post. I was looking at the serial number aspect, which I interpreted to be the type of alpha-numeric key type things you have to type in to install some software. Here the requirements MAY be: 1. hard to create valid serial numbers without the exact algorithm 2. easy to check if a serial number is valid For 100,000 serial numbers, with 16 digits, if all numeric then only 10^5 of 10^16 are valid, so if the numbers appear 'random', the chance of creating one by guessing is 1 in 10^11. If the digits can be alphanumeric then the chance is lower. Tim's approach of mangling bits from a sequence number, a random number and some hash value is a simple solution to this, likely to be adequate for many applications. However, if these numbers are to be tested for validity in the field, it is vulnerable as it places the algorithm needed to generate the serial numbers actually in the field software, so is susceptible to reverse engineering. As to whether this is a problem depends on waht you are trying to achieve. You can make it safer using public key cryptography. For example, if you were to encrypt the blocks{serial_000001, serial_000002, serial_000003.. etc} using an RSA private key, then you could decrypt using the RSA public key in your software. Only valid serial numbers would decrypt to something of the correct form. It would be very difficult to generate valid numbers without the private key. This can only be part of a security system, but is a relatively simple way to make keys that are hard to fake. You could generate random numbers using linear congruences, 'good' here usually refers to mathematical randomness; the uniform (or specifically shaped) distribution and lack of hidden correlation, characteristics essential for their use in monte-Carlo simulations. There is lots on the web on this, Numerical Recipes have a simple canned routine. For serial numbers though, Random numbers don't have a simple way of determining if a particular number is valid.

        Peter "Until the invention of the computer, the machine gun was the device that enabled humans to make the most mistakes in the smallest amount of time."

        U 1 Reply Last reply
        0
        • C cp9876

          Guess we had different takes on the original question, but it does raise some interesting ideas, as long as we agree that they may have nothing to do with the original post. I was looking at the serial number aspect, which I interpreted to be the type of alpha-numeric key type things you have to type in to install some software. Here the requirements MAY be: 1. hard to create valid serial numbers without the exact algorithm 2. easy to check if a serial number is valid For 100,000 serial numbers, with 16 digits, if all numeric then only 10^5 of 10^16 are valid, so if the numbers appear 'random', the chance of creating one by guessing is 1 in 10^11. If the digits can be alphanumeric then the chance is lower. Tim's approach of mangling bits from a sequence number, a random number and some hash value is a simple solution to this, likely to be adequate for many applications. However, if these numbers are to be tested for validity in the field, it is vulnerable as it places the algorithm needed to generate the serial numbers actually in the field software, so is susceptible to reverse engineering. As to whether this is a problem depends on waht you are trying to achieve. You can make it safer using public key cryptography. For example, if you were to encrypt the blocks{serial_000001, serial_000002, serial_000003.. etc} using an RSA private key, then you could decrypt using the RSA public key in your software. Only valid serial numbers would decrypt to something of the correct form. It would be very difficult to generate valid numbers without the private key. This can only be part of a security system, but is a relatively simple way to make keys that are hard to fake. You could generate random numbers using linear congruences, 'good' here usually refers to mathematical randomness; the uniform (or specifically shaped) distribution and lack of hidden correlation, characteristics essential for their use in monte-Carlo simulations. There is lots on the web on this, Numerical Recipes have a simple canned routine. For serial numbers though, Random numbers don't have a simple way of determining if a particular number is valid.

          Peter "Until the invention of the computer, the machine gun was the device that enabled humans to make the most mistakes in the smallest amount of time."

          U Offline
          U Offline
          User 12349108
          wrote on last edited by
          #9

          thanks: https://movied.org

          1 Reply Last reply
          0
          • N Nathan Addy

            Agreed. The original post asked for "unrepeated" and "nonconsecutive". By itself, this is really easy (10^6, 10^6 +2, 10^6 + 4, ...), and so I simply assumed he misasked his question. Specifically I assumed that he wanted a random sample from that range without replacement. Like I told him, without getting a better requirements list, it wasn't going to be possible to answer his question properly. Assuming he means "a random sample without replacement", I think my method (just generate the numbers) is best. The only problem here is that there aren't any guarantees they are unequal. If you could "fudge" the randomness a bit, a previous poster came up with a good method (although in his method, you should put the random portion of the numbers first, then the sequential part, to maximize randomness). That method would generate 100K numbers that will be close to uniformly distributed. You could also generate the numbers randomly and just keep the already generated ones sorted, so you can just check to see if they are in the list. This would produce the most accurate result, although also would take the most work (you'd have to do an extra insert and search into some data structure for each of your 100K iteration). Finally, if you are comfortable with the theoretical foundations of random number generators, you might have some interesting possibilities there. As you might know, many of the most common RNGs are "Linear congruential RNGS". Each RNG of this type is defined by 3 numbers, a, b, and m. You start with an initial seed, n_0, and the number that comes out is (a*n_0 + b) mod m. That is your random number. By reapplying the formula with the JUST generated number will get you your second random number and so on. As far as this class of generators goes, there is an art and a science to picking "good" numbers" I think if you worked at it for an hour, you could figure out how to pick a, b, and m such that you could get a "uniform random without replacement" sample of 100K. I would guess this could be "the best" way, but it would require a little mathematical knowledge (nothing you can't get in a couple hours on wikipedia), and if you didn't know what you were doing, it would be pretty easy to mess it up. (I haven't looked at this in a while, but I think if you pick a prime as close to (10^17 - 10^16) as possible, and then get REALLY REALLY big a and b, you'd be ok.) On the other hand, if unique and non-consecutive is REALLY what he wants, the following will do the trick: std::vector

            U Offline
            U Offline
            User 12349108
            wrote on last edited by
            #10

            thanks: https://movied.org

            1 Reply Last reply
            0
            • C cp9876

              The OP simply requested unrepeated, not random. Truly random numbers must be allowed to repeat (otherwise you are choosing from a smaller and smaller pool). For serial numbers Nathan's approach sounds fine, how you combine (or mangle) the two/three components depends on the user's requirements. Another approach is to take the numbers 1,2 .. 100,000 and encrypt or hash them, but this is not as simple as Nathan's approach. If you encrypt them it is possible to find the original serial number if this is important.

              Peter "Until the invention of the computer, the machine gun was the device that enabled humans to make the most mistakes in the smallest amount of time."

              U Offline
              U Offline
              User 12349108
              wrote on last edited by
              #11

              thanks: https://movied.org

              1 Reply Last reply
              0
              • T Tim Paaschen

                Nathan A. wrote:

                ... The problem is that your numbers aren't very random at all (not uniformly distributed in the range 1000-9999). ...

                Of cause your are right - this is why I called my "algorithm" a simple approach.

                Nathan A. wrote:

                ... If you switch your algorithm around, by putting the random half first, you get something MUCH better. ...

                When I suggested to use two (or three) "sections" I didn't meant to place them specifically in that sequence, just that the number can be compsed of two parts, each designed to fullfill one of the requirements.

                Nathan A. wrote:

                ... Basically the best way to do it is just generate random numbers in that range. ...

                IMO the "best way" depends on your requirements. If the numbers must be uniformly distributed, a well known and tested pseudo random nuber generator will be the best solution. The "used" numbers can be stored and each newly generated number can be compared with this list to guarantee it is unique. However, if you just want to generate some serial numbers for your application, which must be unique and should not be consecutive (for some reason), this approach seems to be overkill.

                Regards, Tim

                U Offline
                U Offline
                User 12349108
                wrote on last edited by
                #12

                thanks: https://movied.org

                1 Reply Last reply
                0
                • N Nathan Addy

                  Actually, if I understand your algorithm correctly, I don't think it's the best thing to do. (Please correct me if I'm wrong though) Basically, your set of numbers will be much less random than they should be (they will end up being much more evenly spread over the range than they would be if they were "really" random). If you try to generate a list of a bunch of random and non-equal 4 digit numbers, the way I understand your algorithm works, it would do something like the following (got the random umbers on the right hand side from python). 10 . 47 = 1047 11 . 67 = 1167 12 . 77 = 1277 13 . 23 = 1323 14 . 01 = 1411 And you're saying that because of the numbers on the left, from 10 to 14, guarantee the numbers will be unique (which is true), and the ones on the right "randomize" the numbers. The problem is that your numbers aren't very random at all (not uniformly distributed in the range 1000-9999). If you generate 40 numbers this way, none of them will be over 5000. For a really random set of 40 different numbers, the chance would be somewhere around (but not equal to, because the numbers are all different) about (1/2)^40, which is no chance at all. If you switch your algorithm around, by putting the random half first, you get something MUCH better. In this case, you would get the numbers 4710, 6711, 7712, 2313, 1114. Your numbers will be much, much, much more random. In this guys case, you'd generate 11 digit random numbers, and use the last 5 for the sequence. They still won't be as great as they could theoretically be. If, for example, you did statistical tests of these numbers, you would find that exactly 10% of the numbers end in 1, 10% in 2, and so on. Obviously that wouldn't be true of a "really" random set. Basically the best way to do it is just generate random numbers in that range. There is about a one in 50 million chance you'll have overlaps if your random number generator is good.

                  U Offline
                  U Offline
                  User 12349108
                  wrote on last edited by
                  #13

                  thanks: https://movied.org

                  1 Reply Last reply
                  0
                  • T Tim Paaschen

                    A simple approach would be to split your serial number into two sections. One section contains a sequential number the other is a random number. The first section will make your number unique while the second makes it non-consecutive. If you want to have some kind of validation for the numbers add a third section containing a hash of the first two. Hope this helps. Tim

                    U Offline
                    U Offline
                    User 12349108
                    wrote on last edited by
                    #14

                    thanks: https://movied.org

                    1 Reply Last reply
                    0
                    • N Nathan Addy

                      I've got a couple ideas, and I'd really like to help, but in order to do that, you need to explain what you mean by "not being consecutive". If that really is what you mean, the numbers (1e16, 1e16 + 2, 1e16 + 4,...) will fit your criterion (all unique, not consecutive), but I suspect you mean something else (random perhaps?). So what are these numbers, why do they all need to be distinct, and why do they need to not be consecutive? Answer that question and I promise I'll get back to you.

                      U Offline
                      U Offline
                      User 12349108
                      wrote on last edited by
                      #15

                      thanks: https://movied.org

                      1 Reply Last reply
                      0
                      • N Nathan Addy

                        I've got a couple ideas, and I'd really like to help, but in order to do that, you need to explain what you mean by "not being consecutive". If that really is what you mean, the numbers (1e16, 1e16 + 2, 1e16 + 4,...) will fit your criterion (all unique, not consecutive), but I suspect you mean something else (random perhaps?). So what are these numbers, why do they all need to be distinct, and why do they need to not be consecutive? Answer that question and I promise I'll get back to you.

                        U Offline
                        U Offline
                        User 12349108
                        wrote on last edited by
                        #16

                        thanks: https://movied.org

                        1 Reply Last reply
                        0
                        Reply
                        • Reply as topic
                        Log in to reply
                        • Oldest to Newest
                        • Newest to Oldest
                        • Most Votes


                        • Login

                        • Don't have an account? Register

                        • Login or register to search.
                        • First post
                          Last post
                        0
                        • Categories
                        • Recent
                        • Tags
                        • Popular
                        • World
                        • Users
                        • Groups