Code Project

SQL Advice needed [modified]

Tags: database, question, sysadmin, hardware, help
StevenWalsh

This isn't a programming question... so that's why I'm here :) I'm working on a system that works with some stock market data (this is a personal project at home). The system is collecting about 7 million records a week, and it's really starting to slow down. I've done all the SQL optimizations I can think of (indexing and whatnot), but I think I'm running into a hardware roadblock... during the day, when it's downloading new quotes, it's flatlined at 100% CPU usage all day, and I'm seeing an increase in errors due to timeouts (when attempting to submit more data to SQL). Currently the database engine is running on one Dell PowerEdge 2550; I have 4 other servers with similar hardware (and 5 more that aren't really servers), so I'm looking for a way to distribute the load among the other servers. I've started to think about clustering, but I believe that's only for fault tolerance (in case the server goes down). Anyone have any other suggestions?

    modified on Wednesday, July 23, 2008 6:11 PM

Graham Bradshaw #5

    StevenWalsh wrote:

    its flatlined at 100% CPU usage all day

What process is using the CPU? What is the physical disk subsystem? IDE drives can be very processor-intensive, especially when writing.

    StevenWalsh wrote:

    I've done all the SQL optimizations i can think of (indexing, and what not)

What indices do you have? Adding indices will only help query performance - it will slow down inserts, updates, and deletes. Are you in auto-commit mode, with an implicit commit after every insert, or do you perform a number of inserts in a transaction and then commit? Auto-commit hurts performance, since the engine commits (and flushes the log) after every single insert. As a suggestion, if you are not querying the data during the day, you could consider writing it to a formatted text file and using SQL Server's bulk copy to load it once per day.
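For illustration, a minimal T-SQL sketch of both ideas; the Quotes table, its columns, and the file path are all made up:

    -- Hypothetical quotes table; all names invented for illustration.
    -- Auto-commit: each INSERT commits (and flushes the log) on its own.
    INSERT INTO Quotes (Symbol, Price, QuoteTime)
    VALUES ('MSFT', 27.15, GETDATE());

    -- Batched: one explicit transaction per chunk of rows amortizes
    -- the per-commit cost.
    BEGIN TRANSACTION;
    INSERT INTO Quotes (Symbol, Price, QuoteTime) VALUES ('MSFT', 27.15, GETDATE());
    INSERT INTO Quotes (Symbol, Price, QuoteTime) VALUES ('IBM', 122.40, GETDATE());
    -- ...more rows...
    COMMIT TRANSACTION;

    -- Bulk load from a formatted text file once per day:
    BULK INSERT Quotes
    FROM 'C:\data\quotes.txt'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK);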


Daniel Turini #6

      Graham Bradshaw wrote:

IDE drives can be very processor-intensive, especially when writing.

I thought this was true only when PIO mode was active - but that was a thing of the '90s, and newer disks use DMA. Am I wrong?

      I see dead pixels Yes, even I am blogging now!


Graham Bradshaw #7

It depends on the drive itself, the controller hardware and BIOS, and the driver used by the OS. Put it this way: there's a good reason why high-performance servers use SCSI/SAS.


StevenWalsh #8

Actually, the servers are using a SCSI controller with 4 10k RPM drives. I'd have to check the auto-commit tip - never thought about that before :) I can't do the bulk copies, unfortunately, since I query the data during the day.


Pete OHanlon #9

            StevenWalsh wrote:

I can't do the bulk copies, unfortunately, since I query the data during the day.

You can if you load the data into a staging database and use a linked database with a cross-database view over it. Then, in the quiet periods, do the bulk insert.
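A rough sketch of what that might look like, assuming the staging database lives on the same server so three-part names work (every database, table, and column name here is hypothetical):

    -- Daytime inserts go to StagingDB.dbo.Quotes; queries hit a view
    -- that unions the main table with the staged rows:
    CREATE VIEW dbo.AllQuotes
    AS
    SELECT Symbol, Price, QuoteTime FROM MainDB.dbo.Quotes
    UNION ALL
    SELECT Symbol, Price, QuoteTime FROM StagingDB.dbo.Quotes;

    -- In the quiet period, fold the staged rows into the main table:
    INSERT INTO MainDB.dbo.Quotes (Symbol, Price, QuoteTime)
    SELECT Symbol, Price, QuoteTime FROM StagingDB.dbo.Quotes;
    TRUNCATE TABLE StagingDB.dbo.Quotes;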

            Deja View - the feeling that you've seen this post before.

            My blog | My articles

Ennis Ray Lynch Jr

              Distributed Partitioned views are ideal for this massive data scenario.
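Roughly, a distributed partitioned view splits one logical table across servers by key range: each member table enforces its range with a CHECK constraint, and a view glues them together over linked servers. A sketch, with every server, database, and column name invented:

    -- On each member server: one slice of the data, with the
    -- partitioning column in the primary key and a CHECK constraint
    -- the optimizer uses to prune servers at query time.
    CREATE TABLE dbo.Quotes_2008H2 (
        Symbol    varchar(10) NOT NULL,
        QuoteTime datetime    NOT NULL
            CHECK (QuoteTime >= '20080701' AND QuoteTime < '20090101'),
        Price     money       NOT NULL,
        PRIMARY KEY (Symbol, QuoteTime)
    );

    -- On the server the clients talk to: a view unions the member
    -- tables across linked servers.
    CREATE VIEW dbo.Quotes
    AS
    SELECT Symbol, QuoteTime, Price FROM Server1.StockDB.dbo.Quotes_2008H1
    UNION ALL
    SELECT Symbol, QuoteTime, Price FROM Server2.StockDB.dbo.Quotes_2008H2;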

              Need a C# Consultant? I'm available.
              Happiness in intelligent people is the rarest thing I know. -- Ernest Hemingway

StevenWalsh #10

I'm thinking this is what I'm going to do.


GuyThiebaut #11

Are any of the indexes on your large tables clustered? If so, you may have a lot of page splitting going on, which can really slow things down when you're inserting 7m rows a week. Do you have the transaction log on a separate drive? Is the database on a separate drive from the application?
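If page splits do turn out to be the culprit, two SQL Server 2000-era commands are worth knowing (the table and index names below are invented): DBCC SHOWCONTIG reports fragmentation, and DBCC DBREINDEX can rebuild the clustered index with a lower fill factor so each page keeps free space for incoming rows. Clustering on an ever-increasing key, such as the quote timestamp, sidesteps splits entirely, since new rows always land at the end.

    -- How fragmented is the table? (SQL Server 2000 syntax)
    DBCC SHOWCONTIG ('Quotes');

    -- Rebuild the clustered index, leaving 20% free space per page
    -- to absorb mid-page inserts without splitting:
    DBCC DBREINDEX ('Quotes', 'PK_Quotes', 80);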

                Continuous effort - not strength or intelligence - is the key to unlocking our potential.(Winston Churchill)

SomeGuyThatIsMe #12

Depending on the DB used and how you set it up, it may have to grow itself to be able to store that data. I.e., if you originally set up the DB for 3GB and it's taking up all 3 (or has gotten bigger), any time you insert you'll have to wait on the database to go find another contiguous section of disk to put the new data on. On SQL Server 2K you can set the initial size and the growth rate as a percentage or a fixed amount. We have a 1TB RAID array, so we let it grow 1GB at a time, which lets it work for a while without needing new space. We also created the DB several times larger than it was when we built the server, so it wouldn't need to grow for a long time. Contiguous disk space is your friend for data-heavy applications, because if you can put a table (or tables) all together, it will reduce seek time.

Also, disabling the log file, or setting the log file up to have a large amount of space, could help too; if it's logging the inserts, that file will get pretty big and have to go find more space for itself. I've greatly sped up our DB server and other databases by forcing it to clear the log file and giving it quite a bit of space, since we regularly upload (insert, delete, update) over 500 million rows in a handful of tables. We upload them to a temp database on the main server, then do compares to determine what needs to be added, removed, or changed, and then run those queries. So we aren't adding 500 million rows every few months, but we do have to mess with that much data; the inserts are usually only 20 or 30 million, but it runs fairly fast (20 to 30 minutes at most, including the initial upload done with bcp) on the new machine. Top-end Dell 5U, 15k RPM SATA2 drives... it's sweet. Hopefully this was somewhat helpful.
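For example, pre-sizing the files so the engine doesn't have to grow them during the day; a sketch in which the database and logical file names are hypothetical:

    -- Pre-size the data file and use a fixed growth increment rather
    -- than a percentage, so each growth event is predictable:
    ALTER DATABASE StockDB
    MODIFY FILE (NAME = StockDB_Data, SIZE = 20GB, FILEGROWTH = 1GB);

    -- Same idea for the transaction log:
    ALTER DATABASE StockDB
    MODIFY FILE (NAME = StockDB_Log, SIZE = 4GB, FILEGROWTH = 512MB);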

                  Please remember to rate helpful or unhelpful answers, it lets us and people reading the forums know if our answers are any good.


StevenWalsh #13

Actually, that helps quite a bit. Thanks for the advice.


SomeGuyThatIsMe #14

Not a problem. The disadvantage is that if you wanted to make it all contiguous, you'd have to drop the DB, defrag the HDD, then recreate the DB larger than it was before - and that would require a decent amount of downtime (a few hours, depending on your server) and enough storage someplace to back up your current data so it could be restored. Also, on mass inserts I've found it useful to drop the indices and recreate them after the insert, if you can do it all at once. You should see a noticeable performance improvement if you try that; doing a lot of inserts and updates to an index can (sometimes) corrupt it quite easily, thus slowing down performance for any action on that table, or anything related to that table.
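A sketch of the drop-and-recreate pattern around a bulk load (the index, table, and file names are invented):

    -- Drop the nonclustered index, load the data, rebuild once at the
    -- end; one index build beats millions of incremental updates.
    DROP INDEX Quotes.IX_Quotes_Symbol;

    BULK INSERT Quotes
    FROM 'C:\data\quotes.txt'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK);

    CREATE INDEX IX_Quotes_Symbol ON Quotes (Symbol, QuoteTime);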

                      Please remember to rate helpful or unhelpful answers, it lets us and people reading the forums know if our answers are any good.
