Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. The Lounge
  3. How NOT to optimize a database!

How NOT to optimize a database!

Scheduled Pinned Locked Moved The Lounge
c++performancephpdatabasecom
57 Posts 26 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • N Nish Nishant

    http://articles.economictimes.indiatimes.com/2011-10-23/news/30313183_1_diwali-rush-western-railway-berths[^] TLDR : The IT company that designed the online reservation system for India's western railways used a 3-character field storing numbers to hold the reservation token (001 to 999) on a per-train basis. This limit was never ever hit until last week when (due to Diwali, an Indian festival) enough people made online bookings that the database crashed and with it the reservation system. Funny side-bit : 3 characters (even non-Unicode) is still larger than a short int (memory wise). So whoever did this pseudo-optimization was terribly ignorant.

    Regards, Nish


    My technology blog: voidnish.wordpress.com Part 2 in my WinRT/C++ series : Visual C++ and WinRT/Metro - Databinding Basics

    L Offline
    L Offline
    Lost User
    wrote on last edited by
    #28

    Which means those morons store the number in a string actually. God, stupidity is limitless.

    1 Reply Last reply
    0
    • A Andy Brummer

      From the US, I think the response would be they can fit 999 people on a train? :-D On the other side of the situation. The guy in charge of the India unique ID project gave a talk at the foundation where I work. That is a mind boggling huge project, and seem to be doing good planning to prevent these kinds of issues.

      Curvature of the Mind now with 3D

      N Offline
      N Offline
      NedPat
      wrote on last edited by
      #29

      I travel a lot, 999 is more than enough, block the thousandth and solve the problem.

      1 Reply Last reply
      0
      • N Nish Nishant

        In India, online shopping is still limited to a minority of upper-middle and upper-class folks. Recently though the masses have started using credit cards online, so that is a factor although the whole thing is still rather stupid.

        Regards, Nish


        My technology blog: voidnish.wordpress.com Part 2 in my WinRT/C++ series : Visual C++ and WinRT/Metro - Databinding Basics

        B Offline
        B Offline
        BrainiacV
        wrote on last edited by
        #30

        Another case of Famous. Last Words. When I was programming databases (actually flat files) back in '76, I encountered a date field that needed to be stored. I asked my manager whether I should store a 2 digit or 4 digit year. I was concerned about the upcoming millennium. He said, "Don't worry, by that time, the software will be replaced." Famous. Last. Words. I have had a few other times where managers would tell me limits that would not be exceeded and in the end, I learned to ignore them and build something that had excess capacity.

        Psychosis at 10 Film at 11 Those who do not remember the past, are doomed to repeat it. Those who do not remember the past, cannot build upon it.

        Y 1 Reply Last reply
        0
        • P Pascal Ganaye

          The sad thing is that it is one column in a 300 pages specification. The guy who do the specifications often do not have the technical expertise to know that a 16 bit integer can store 32000 values in 2 bytes. The programmer usually does, but the programmer is part of the chain. Every programmer is confronted to this problem at times. Do I raise the issue? If you raise the issue it will waste a few hours and you might end up finishing your development late. Also If you raise issues a bit too often you're quickly categorized as annoying git who can't follow orders. This is not the best way to progress in a company. I am not sure if this is true in India but this is true in the countries I worked in. If you shut up and follow orders, you won't get blamed and if you're on time you'll perhaps be promoted. The blame goes on the guy who did the specs despite the fact that the programmer is the best person to notice and fix the issue. To me this is the waterfall drama.

          A Offline
          A Offline
          agolddog
          wrote on last edited by
          #31

          Pascal Ganaye wrote:

          Do I raise the issue?

          The answer is always yes. If the developer doesn't recognize the issue, that's one thing. (Of course, in this example, if they didn't recognize it, they should be fired. Many other times, it's not until you've done a little development that the "something doesn't feel right" notion crystallizes). The correct approach is for the developer to raise the issue, pointing out the flaws in the current approach. If he gets people to understand and accept those flaws (or not accept them and change development), then he's done his job. Anything short of that, he's failed. Should the developer fix it? That's a second question. It's o.k., from a business perspecitve, to weigh the development time versus the benefit of "doing it right" and decide that the bad scenario is so unlikely as to make the fix not worth it. We all do that. The analysis which needs to be done is (likelihood of bad thing happening) * (cost of bad thing happening) > (cost of fix), understanding that "cost" is not strictly monetary. (i.e., the cost of negative publicity on your company, getting a bad reputation, etc). In our example here, it's hard to believe that analysis was done. Certainly doesn't seem as if the cost of fix was too great, but I (of course) don't know their development.

          1 Reply Last reply
          0
          • Sander RosselS Sander Rossel

            There are cases where the opposite is also true... A table that will probably never get beyond a couple of 100's of records, but has a bigint as key :rolleyes:

            It's an OO world.

            public class Naerling : Lazy<Person>{}

            P Offline
            P Offline
            Peter R Fletcher
            wrote on last edited by
            #32

            That isn't as silly as it sounds - for a couple of reasons: 1) If the table is reasonably dynamic and the Key is defined as an AutoNumber, its value might get out of the int range in finite time, even if there were never more than a few hundred records in the table simultaneously. 2) If you maintain the conventions that your Primary Key is always a bigint, you always know how to declare linking fields in other tables without going back to check.

            Sander RosselS 1 Reply Last reply
            0
            • N Nish Nishant

              http://articles.economictimes.indiatimes.com/2011-10-23/news/30313183_1_diwali-rush-western-railway-berths[^] TLDR : The IT company that designed the online reservation system for India's western railways used a 3-character field storing numbers to hold the reservation token (001 to 999) on a per-train basis. This limit was never ever hit until last week when (due to Diwali, an Indian festival) enough people made online bookings that the database crashed and with it the reservation system. Funny side-bit : 3 characters (even non-Unicode) is still larger than a short int (memory wise). So whoever did this pseudo-optimization was terribly ignorant.

              Regards, Nish


              My technology blog: voidnish.wordpress.com Part 2 in my WinRT/C++ series : Visual C++ and WinRT/Metro - Databinding Basics

              I Offline
              I Offline
              I explore code
              wrote on last edited by
              #33

              Oh dear God!!! no way!!! just 999 e-tickets in a country of 1-billion + blinking people, are you serious??? even a non-programmer can tell that's downright stupid. The problem is most of the government IT projects are assigned to lowest bidding organisations whose portfolios are shady at best. IT cannot work with inefficient management, no matter how cheap you get your projects implemented for. End Of. BTW, how did this exactly happen? a regular 16/32-bit int can store more values than 999, does anyone know what they actually used? doesn't sound like an int... these days when memory comes for cheap, disasters like these just get me.

              N 1 Reply Last reply
              0
              • I I explore code

                Oh dear God!!! no way!!! just 999 e-tickets in a country of 1-billion + blinking people, are you serious??? even a non-programmer can tell that's downright stupid. The problem is most of the government IT projects are assigned to lowest bidding organisations whose portfolios are shady at best. IT cannot work with inefficient management, no matter how cheap you get your projects implemented for. End Of. BTW, how did this exactly happen? a regular 16/32-bit int can store more values than 999, does anyone know what they actually used? doesn't sound like an int... these days when memory comes for cheap, disasters like these just get me.

                N Offline
                N Offline
                NedPat
                wrote on last edited by
                #34

                The spec is " to hold the reservation token (001 to 999) on a per-train basis", reservations per train, not tickets per country if read correctly.

                I 2 Replies Last reply
                0
                • V Vivi Chellappa

                  Every bit counted when your memory was limited to 4K and disk sizes were 5 MB. That gave us the Y2K problem. With 4GB of main memory and terabyte disks aplenty, there is NO reason at all today for this kind of design.

                  J Offline
                  J Offline
                  Jasmine2501
                  wrote on last edited by
                  #35

                  What? There are plenty of reasons to design for efficiency today. It's my biggest peeve with vendor software right now. They designed it as if memory and storage are infinite, and so the performance sucks because everything is bigger than it needs to be. Network is still a bottleneck, and when you're transferring 1000 rows of data, it matters if the key is 2 bytes or 4. We still need to pay attention to how big we make things.

                  M 1 Reply Last reply
                  0
                  • I ii_noname_ii

                    ..And the guy went back to working for SAP? (I always wondered how they sell that piece of scheisse! Good marketting, good lobbying, hopeless software, unrealistic costs... Oh, don't get me started on SAP!! lol)

                    V Offline
                    V Offline
                    Vivi Chellappa
                    wrote on last edited by
                    #36

                    He came from another semiconductor manufacturing company and went back to them. Inflicted SAP on them too! :laugh:

                    1 Reply Last reply
                    0
                    • J Jasmine2501

                      What? There are plenty of reasons to design for efficiency today. It's my biggest peeve with vendor software right now. They designed it as if memory and storage are infinite, and so the performance sucks because everything is bigger than it needs to be. Network is still a bottleneck, and when you're transferring 1000 rows of data, it matters if the key is 2 bytes or 4. We still need to pay attention to how big we make things.

                      M Offline
                      M Offline
                      Marbry Hardin
                      wrote on last edited by
                      #37

                      Certainly, but there are usually a lot of other optimizations that can be made that will make much more difference. You're almost always better off trying NOT to design something that can be easily broken through usage patterns. The PITA you save may be your own.

                      J 1 Reply Last reply
                      0
                      • P Peter R Fletcher

                        That isn't as silly as it sounds - for a couple of reasons: 1) If the table is reasonably dynamic and the Key is defined as an AutoNumber, its value might get out of the int range in finite time, even if there were never more than a few hundred records in the table simultaneously. 2) If you maintain the conventions that your Primary Key is always a bigint, you always know how to declare linking fields in other tables without going back to check.

                        Sander RosselS Offline
                        Sander RosselS Offline
                        Sander Rossel
                        wrote on last edited by
                        #38

                        Sure, having a bigint as key is not the end of the world (certainly better than char(3) for storing numerics or (tiny)int where more is needed), but if none of your points apply it certainly deserves no beauty award ;) What I am talking about is a few hundred records ever (simultaneously or not) and every table having defined another key anyway (char, varchar, byte, tinyint, int, bigint, multiple columns etc.). Which might not deserve a beauty award at all ;p

                        It's an OO world.

                        public class Naerling : Lazy<Person>{}

                        1 Reply Last reply
                        0
                        • M Marbry Hardin

                          Certainly, but there are usually a lot of other optimizations that can be made that will make much more difference. You're almost always better off trying NOT to design something that can be easily broken through usage patterns. The PITA you save may be your own.

                          J Offline
                          J Offline
                          Jasmine2501
                          wrote on last edited by
                          #39

                          Other optimizations exist, but making things as small as possible is often overlooked. People believe what you said - that other optimizations are better. I think this is backwards thinking. In my 25 years of doing this stuff, optimizing the size of your objects has fallen out of favor, but it still has the same performance impact that it has always had, and we see it in the results. I'm willing to go out on a limb and say, probably, most of the performance issues we face with modern software are rooted in this problem. Sure, you can take something that's too big and make the performance even worse, but if you don't start with an efficient object, you've shot yourself in the foot before you even started the race.

                          M 1 Reply Last reply
                          0
                          • N NedPat

                            The spec is " to hold the reservation token (001 to 999) on a per-train basis", reservations per train, not tickets per country if read correctly.

                            I Offline
                            I Offline
                            I explore code
                            wrote on last edited by
                            #40

                            aha! its 999 e-reservations per train considering that not all people book tickets online which would mean in the worst case scenario if you have 999 e-bookings and lets say even 1 non-electronic booking the system would still fail. Coz as i understand electronic reservation system is the same as the one used by staff over the counters. Weird. Also one ticket means >= 1 person (not sure what relevance that has to my statement :) ).

                            1 Reply Last reply
                            0
                            • N NedPat

                              The spec is " to hold the reservation token (001 to 999) on a per-train basis", reservations per train, not tickets per country if read correctly.

                              I Offline
                              I Offline
                              I explore code
                              wrote on last edited by
                              #41

                              In retrospect though, this is the first time I have come across the Indian railway reservation system crashing which probably could have been avoided by a software check which could have just rejected 1000th reservation. Short-sightedness on the developers' part or over-confidence? who knows....

                              1 Reply Last reply
                              0
                              • J Jasmine2501

                                Other optimizations exist, but making things as small as possible is often overlooked. People believe what you said - that other optimizations are better. I think this is backwards thinking. In my 25 years of doing this stuff, optimizing the size of your objects has fallen out of favor, but it still has the same performance impact that it has always had, and we see it in the results. I'm willing to go out on a limb and say, probably, most of the performance issues we face with modern software are rooted in this problem. Sure, you can take something that's too big and make the performance even worse, but if you don't start with an efficient object, you've shot yourself in the foot before you even started the race.

                                M Offline
                                M Offline
                                Marbry Hardin
                                wrote on last edited by
                                #42

                                Perhaps, but I think there is a distinction to be made in WHY they are too big. One can minimize the size in ways that are more or less flexible. Obviously they chose one that was less rather then more.

                                1 Reply Last reply
                                0
                                • N Nish Nishant

                                  http://articles.economictimes.indiatimes.com/2011-10-23/news/30313183_1_diwali-rush-western-railway-berths[^] TLDR : The IT company that designed the online reservation system for India's western railways used a 3-character field storing numbers to hold the reservation token (001 to 999) on a per-train basis. This limit was never ever hit until last week when (due to Diwali, an Indian festival) enough people made online bookings that the database crashed and with it the reservation system. Funny side-bit : 3 characters (even non-Unicode) is still larger than a short int (memory wise). So whoever did this pseudo-optimization was terribly ignorant.

                                  Regards, Nish


                                  My technology blog: voidnish.wordpress.com Part 2 in my WinRT/C++ series : Visual C++ and WinRT/Metro - Databinding Basics

                                  R Offline
                                  R Offline
                                  rjemp
                                  wrote on last edited by
                                  #43

                                  The strange thing is that the 3 char field uses 3 bytes and only allows up to 999. Depending on the platform, a 2 byte int would allow at least 32767 entries.

                                  N 1 Reply Last reply
                                  0
                                  • R rjemp

                                    The strange thing is that the 3 char field uses 3 bytes and only allows up to 999. Depending on the platform, a 2 byte int would allow at least 32767 entries.

                                    N Offline
                                    N Offline
                                    Nish Nishant
                                    wrote on last edited by
                                    #44

                                    rjemp wrote:

                                    The strange thing is that the 3 char field uses 3 bytes and only allows up to 999. Depending on the platform, a 2 byte int would allow at least 32767 entries.

                                    You probably missed this but the original post mentions this very thing. :rolleyes:

                                    Regards, Nish


                                    My technology blog: voidnish.wordpress.com Part 2 in my WinRT/C++ series : Visual C++ and WinRT/Metro - Databinding Basics

                                    R N 2 Replies Last reply
                                    0
                                    • N Nish Nishant

                                      rjemp wrote:

                                      The strange thing is that the 3 char field uses 3 bytes and only allows up to 999. Depending on the platform, a 2 byte int would allow at least 32767 entries.

                                      You probably missed this but the original post mentions this very thing. :rolleyes:

                                      Regards, Nish


                                      My technology blog: voidnish.wordpress.com Part 2 in my WinRT/C++ series : Visual C++ and WinRT/Metro - Databinding Basics

                                      R Offline
                                      R Offline
                                      rjemp
                                      wrote on last edited by
                                      #45

                                      One day someone will jab their fingers in those rolling eyes.

                                      N 1 Reply Last reply
                                      0
                                      • B Bassam Abdul Baki

                                        I could never understand why people only allow for 50% or 100% expansion. Always allow for 10 times even though you may never reach it. Yes, I know every bit counts, but better to lose a few bits than gain a bitch.

                                        Web - BM - RSS - Math - LinkedIn

                                        K Offline
                                        K Offline
                                        KP Lee
                                        wrote on last edited by
                                        #46

                                        If every bit helps, they could have saved 8 for every record recorded by using smallint and this would have allowed more than 32 times the 1000 limit. They weren't saving bits, they were showing ignorance. (Per another poster, the ignorance of the manager who can't code.)

                                        V 1 Reply Last reply
                                        0
                                        • V Vivi Chellappa

                                          Every bit counted when your memory was limited to 4K and disk sizes were 5 MB. That gave us the Y2K problem. With 4GB of main memory and terabyte disks aplenty, there is NO reason at all today for this kind of design.

                                          K Offline
                                          K Offline
                                          KP Lee
                                          wrote on last edited by
                                          #47

                                          The Y2K problem came from mainframes which completely pre-date micro-computers and had nothing to do with memory or disk size. Actually, they were storing 2 digit numbers in 16 bits of storage. Don't tell me it had anything to do with memory or storage.

                                          V 2 Replies Last reply
                                          0
                                          Reply
                                          • Reply as topic
                                          Log in to reply
                                          • Oldest to Newest
                                          • Newest to Oldest
                                          • Most Votes


                                          • Login

                                          • Don't have an account? Register

                                          • Login or register to search.
                                          • First post
                                            Last post
                                          0
                                          • Categories
                                          • Recent
                                          • Tags
                                          • Popular
                                          • World
                                          • Users
                                          • Groups