Code Project
MS SQL: how to reorganize identity column values?

PIEBALDconsult

Mycroft Holmes wrote:

especially in the development phase

And after a few months or years? And perhaps you could mock it up: have your own Guid class that returns g = new System.Guid("00000000-0000-0000-0000-000000000001"); then g = new System.Guid("00000000-0000-0000-0000-000000000002"); and so on during development. P.S. Maybe I should get right on that.

Mycroft Holmes (#10)

After a few more years it's the support team's problem :rolleyes: . I know that's not a reasonable answer!

PIEBALDconsult wrote:

And, perhaps, you could mock it up; have your own Guid class that returns:

The difference between

Select * from SomeTable where ID = 1

and

Select * from SomeTable where ID = '00000000-0000-0000-0000-000000000001'

is that in the second instance I HAVE to cut and paste, or type 29 keystrokes plus 4 shifts (oh yeah, and the quotes), whereas I can remember 1. Nope, GUIDs are a developer's nightmare. If I knew beforehand that a contract used GUIDs for IDs, I would refuse the contract (I have not been desperate for work for a looong time). IMHO the only valid application for GUIDs is a distributed application where the data is to be merged, and even then I would opt for a LocationID, or only put the GUIDs on the transaction table!

Never underestimate the power of human stupidity RAH

PIEBALDconsult (#11)

Well, I've used them for a few projects now (by choice) and I find them superior in ways that matter to me.
Mycroft Holmes (#12)

PIEBALDconsult wrote:

I find them superior in ways that matter to me

Would you care to elaborate? I'm very interested to hear how they are of benefit, other than the distributed requirement I am aware of.

Never underestimate the power of human stupidity RAH
Mycroft Holmes

So did Gerald's ideas help?

Never underestimate the power of human stupidity RAH
michal kreslik (#13)

Update: so I'm back at the beginning. The attempt to set the bigint column in the new table to Identity results in SQL Server growing the log until it fills the whole disk, at which point it stops doing anything. Why should this simple operation be so dramatically demanding? I'll now repeat the whole process, but first I'll create the new table with the Identity column already set up, and only then will I select into this table from the original table. I'll appreciate any suggestions. Thanks, Michal
Mycroft Holmes (#14)

Don't forget to turn identity insert on.

Never underestimate the power of human stupidity RAH
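For reference, turning identity insert on looks like the sketch below. The table and column names are hypothetical placeholders, not from the thread; IDENTITY_INSERT can be ON for only one table per session at a time, and an explicit column list is required.

```sql
-- Hypothetical names (dbo.NewTable, dbo.OldTable) for illustration only.
SET IDENTITY_INSERT dbo.NewTable ON;

INSERT INTO dbo.NewTable (ID, Price)   -- explicit column list is mandatory
SELECT ID, Price
FROM dbo.OldTable;

SET IDENTITY_INSERT dbo.NewTable OFF;
```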
michal kreslik (#15)

Yes, I know, thanks. I've done this already with the same DB before. Let's see; I'll update you. Thanks much, Michal

michal kreslik (#16)

I'm using GUIDs as keys in another database where multiple servers need to write to the same table, so that is the model of the distributed app you were talking about. But even there it's not ideal; just yesterday I was thinking about changing this structure. In another table of the same database I'm storing rows that contain this GUID as a FK; there are many rows with the same GUID, and I need to group these rows by the GUID. The problem is obviously performance. I have a clustered index on that table based on the GUID FK, and since new GUIDs don't arrive in sequential order (they're generated randomly), each insert with a new GUID into this GUID-clustered table lands in the middle of the index, causing page splits and fragmentation. This wouldn't happen with sequential autogenerated values like int. Michal
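One mitigation worth noting here: SQL Server provides NEWSEQUENTIALID(), which generates GUIDs that are roughly ascending on a given machine (between restarts), so inserts go to the end of the clustered index instead of splitting random pages. It is only allowed as a column DEFAULT, not in ad-hoc expressions. A minimal sketch, with hypothetical table and column names:

```sql
-- Hypothetical names for illustration. Sequential GUIDs avoid most of the
-- page splits that random NEWID() values cause in a clustered index.
CREATE TABLE dbo.Transactions
(
    TransactionID uniqueidentifier NOT NULL
        CONSTRAINT DF_Transactions_ID DEFAULT NEWSEQUENTIALID(),
    Amount money NOT NULL,
    CONSTRAINT PK_Transactions PRIMARY KEY CLUSTERED (TransactionID)
);
```

The trade-off is that such GUIDs are predictable and only sequential per server, but for a merge scenario they remain globally unique.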
Mycroft Holmes (#17)

If you opt for the identity-type column, the cost is a composite primary key on the merged table (ID & Server), and I'd think that may have a higher cost than the GUID. As PIEBALD suggested, a custom GUID based on an identity field and server may be a good idea!

Never underestimate the power of human stupidity RAH
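The composite-key alternative described above might be sketched like this; all names are hypothetical, and the point is only that (ServerID, RowID) stays unique across the merge while each server keeps its own compact identity sequence:

```sql
-- Hypothetical names. Each source server generates RowID from its own
-- IDENTITY; the pair (ServerID, RowID) is the key on the merged table.
CREATE TABLE dbo.MergedTable
(
    ServerID smallint      NOT NULL,
    RowID    bigint        NOT NULL,   -- IDENTITY on the originating server
    Payload  nvarchar(100) NULL,
    CONSTRAINT PK_MergedTable PRIMARY KEY CLUSTERED (ServerID, RowID)
);
```

Every foreign key referencing this table must then carry both columns, which is the extra cost Mycroft refers to.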
michal kreslik (#18)

Actually, I'm thinking it might be a good idea to also set up all indexes (most importantly, the clustered one) on the blank new table before I do the bulk INSERT INTO. That way SQL Server will build the indexes as it copies the data. I think this is potentially less costly in time and space than having SQL Server physically repartition the data into the clustered index later, after the INSERT INTO has completed. Let's see :)
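A sketch of that setup, assuming hypothetical column names (the thread doesn't give the real schema): the blank target table gets the widened bigint identity column, its clustered index (here via the primary key), and the surviving nonclustered indexes before any rows are loaded, so SQL Server maintains them as rows arrive.

```sql
-- Hypothetical names and columns for illustration only.
CREATE TABLE dbo.NewTable
(
    ID     bigint IDENTITY(1,1) NOT NULL,  -- widened from int (Int32 -> Int64)
    Symbol char(6)        NOT NULL,
    Price  decimal(18,5)  NOT NULL,
    CONSTRAINT PK_NewTable PRIMARY KEY CLUSTERED (ID)
);

-- Only the indexes that are actually still used:
CREATE NONCLUSTERED INDEX IX_NewTable_Symbol ON dbo.NewTable (Symbol);
```

Loading the rows in clustered-key order (ORDER BY ID) keeps the clustered index build close to an append rather than a sort.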
michal kreslik (#19)

Update: even this way, the log file grows until it fills the whole 1000 GB disk, which is ridiculous, as the DB itself is only about 110 GB without the indexes and 315 GB with all of them. In the new table I'm only using 2 of the 6 original indexes; as always happens when one makes a change, I found out I'm actually no longer using the remaining 4, so I dropped them in the new table :) I think the problem is that although the recovery model is set to simple, the log keeps track of the entire transaction until the statement has completed and all data has been fully written to disk. So I think the solution is to do this INSERT INTO in batches and make sure the log gets truncated after each batch completes. I wonder just how much disk space this operation would eventually need if done in one go. It would be useful if SQL Server provided a way to carry out these bulk operations in some kind of "unsafe" mode with no logging at all; unless I'm missing something, it's not possible to turn off all logging in MS SQL Server. I'll update you :)
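The batching idea can be sketched as below, with hypothetical table and column names. Under the SIMPLE recovery model the log space used by each committed batch becomes reusable at the next checkpoint, so the log never has to hold the whole copy at once; the batch size is a tunable placeholder.

```sql
-- Hypothetical names. Copy in key-range batches; each INSERT is its own
-- transaction, so its log records can be discarded once it commits.
DECLARE @BatchSize bigint = 1000000,
        @FromID    bigint = 0,
        @MaxID     bigint = (SELECT MAX(ID) FROM dbo.OldTable);

SET IDENTITY_INSERT dbo.NewTable ON;

WHILE @FromID <= @MaxID
BEGIN
    INSERT INTO dbo.NewTable (ID, Price)
    SELECT ID, Price
    FROM dbo.OldTable
    WHERE ID > @FromID AND ID <= @FromID + @BatchSize;

    SET @FromID = @FromID + @BatchSize;
    CHECKPOINT;  -- in SIMPLE recovery, lets the committed log space be reused
END

SET IDENTITY_INSERT dbo.NewTable OFF;
```

(On the "unsafe mode" wish: fully unlogged operations don't exist, but SELECT INTO and bulk loads into empty heaps or with TABLOCK can be minimally logged, which comes close for this kind of one-off copy.)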
michal kreslik (#20)

Final update: all done. The correct way to do this was:

1) create a new table with the same columns, changing the Int32 column to Int64
2) script out all keys and indexes from the old table and create them on the new table
3) make sure the DB is in the simple recovery model
4) INSERT the rows from the old table into the new table in batches, truncating the log after each batch

I was also inserting the identity column values, so I set IDENTITY_INSERT to ON before the batch loop as well. That's it :) Thanks for your help, guys. Michal