Code Project
Stroustrup on NASA's loss of $654 million Mars Climate Orbiter

The Lounge
Tags: performance, tutorial, com, sysadmin, help
27 Posts, 18 Posters
  • D ddt_tdd

    Those NASA engineers are real engineers, sticking to the language they know: C/C++. Don't get me wrong, I do like C/C++ (especially C++11/14), but maybe it would have been better if they had created a DSL (Domain Specific Language) for the control units (at the base station and on the space probe). Instead of using run-time type checking, they could have done compile-time type checking, avoiding the overhead of the metadata that run-time type checking needs. These practices are all well described in the book "Compilers: Principles, Techniques, and Tools" (1st edition dates from 1986). The DSL would use the ISO standard units as types, and the translator part of the compiler can add conversion code when non-ISO units are used. It will not prevent human errors, but it will make it clearer what's going on when you use a Meter type instead of a Feet type while writing down the calculation in the source code.
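As a rough illustration of the compile-time checking described above (in plain C++ rather than a DSL), the sketch below gives each unit its own type. The names `Meters`, `Feet`, `to_meters`, and `altitude_km` are invented for this example and are not taken from any NASA code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical unit types: each wraps one double, but they are distinct types.
struct Meters { double value; };
struct Feet   { double value; };

// The international foot is defined as exactly 0.3048 m.
constexpr double kMetersPerFoot = 0.3048;

// The only route from Feet to Meters is this named conversion.
constexpr Meters to_meters(Feet f) { return Meters{f.value * kMetersPerFoot}; }

// Addition is declared for Meters only.
constexpr Meters operator+(Meters a, Meters b) { return Meters{a.value + b.value}; }

// Hypothetical consumer that accepts SI lengths only.
constexpr double altitude_km(Meters m) { return m.value / 1000.0; }

inline void example() {
    const Feet reading{1000.0};
    // altitude_km(reading);            // does not compile: Feet is not Meters
    // Meters{1.0} + reading;           // does not compile: no operator+ for (Meters, Feet)
    const Meters total = to_meters(reading) + Meters{100.0};
    assert(std::fabs(total.value - 404.8) < 1e-9);
}
```

The mistake still has to be made by a human, but the compiler rejects it instead of letting a bare `double` through.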

    to err is human; to forgive, divine

    raddevus wrote (#15), in reply to ddt_tdd:

    Great stuff. Thanks for commenting. :thumbsup: It's an interesting story. Software crashes like this just get a lot of attention since they are so obviously catastrophic. Other problems occur and people do not hear about them but they happen everywhere in software.

    • R raddevus

      Maybe you've heard about this before, but it is very interesting. It's a good example of the need for "code organization / code management" that you get from higher-level languages and OOP.

      Bjarne Stroustrup wrote:

      On 23 September 1999, NASA lost its US$654 million Mars Climate Orbiter due to a navigation error. “The root cause for the loss of the MCO spacecraft was the failure to use metric units in the coding of a ground software file, ‘Small Forces,’ used in trajectory models. Specifically, thruster performance data in English units instead of metric units was used.”5 The amount of work lost was roughly equivalent to the lifetime’s work of 200 good engineers. In reality, the cost is even higher because we’re deprived of the mission’s scientific results until (and if) it can be repeated. The really galling aspect is that we were all taught how to avoid such errors in high school: “Always make sure the units are correct in your computations.” Why didn’t the NASA engineers do that? They’re indisputably experts in their field, so there must be good reasons. No mainstream programming language supports units, but every general-purpose language allows a programmer to encode a value as a {quantity,unit} pair. We can encode enough of the ISO standard SI units (meters, kilograms, seconds, and so on) in an integer to deal with all of NASA’s needs, but we don’t because that would almost double the size of our data. Furthermore, checking the units in every computation would more than double the amount of computation needed. Space probes tend to be both memory and compute limited, so the engineers—just as essentially everyone else in their situation has done—decided to keep track of the units themselves (in their heads, in the comments, and in the documentation). In this case, they lost.

      From http://www.stroustrup.com/Software-for-infrastructure.pdf
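The {quantity,unit} pair mentioned in the quote can be sketched as follows, to make the claimed cost concrete. `TaggedValue`, `Unit`, and `add` are hypothetical names, and the two units were picked only as an example of a metric/English pair for impulse.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical unit tag; a full encoding of SI dimensions would be richer.
enum class Unit : std::uint8_t { NewtonSecond, PoundForceSecond };

// A value carried together with its unit and checked at run time.
struct TaggedValue {
    double value;
    Unit   unit;
};

// Every addition pays for one extra comparison.
inline TaggedValue add(TaggedValue a, TaggedValue b) {
    if (a.unit != b.unit) {
        throw std::invalid_argument("unit mismatch");
    }
    return TaggedValue{a.value + b.value, a.unit};
}

// The tag (plus any padding) makes the pair larger than a bare double.
static_assert(sizeof(TaggedValue) > sizeof(double), "the unit tag is not free");

// Returns true when mixing the two units is caught.
inline bool mismatch_is_detected() {
    try {
        add(TaggedValue{1.0, Unit::NewtonSecond},
            TaggedValue{1.0, Unit::PoundForceSecond});
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}

inline void example() {
    const TaggedValue sum = add(TaggedValue{1.0, Unit::NewtonSecond},
                                TaggedValue{2.0, Unit::NewtonSecond});
    assert(sum.value == 3.0);
    assert(mismatch_is_detected());
}
```

On common 64-bit platforms this struct typically takes 16 bytes against 8 for a bare `double`, which is in line with the quoted estimate; the exact figure depends on the compiler's padding.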

      Lost User wrote (#16), in reply to raddevus's original post:

      Perhaps the USA should just adopt metric like the rest of the world - problem solved. :laugh:

      • OriginalGriff

        Some form of quality control and testing regimen would probably have helped too. The only good thing about this is that QA didn't exist in 1999 so it wasn't as a result of a "how do I navigate my spacecraft to Mars? SND CODZ URGNTZZZZ!!!!!" question and answer... :laugh:

        Bad command or file name. Bad, bad command! Sit! Stay! Staaaay... AntiTwitter: @DalekDave is now a follower!

        BGArts wrote (#17), in reply to OriginalGriff:

        QA didn't exist in 1999? I hope that was a joke, because that was my exact job description. :)

          James Lonero wrote (#18), in reply to Lost User (#16):

          I second the motion. But, the US government has too many other problems to solve.

          • M molesworth

            raddevus wrote:

            Space probes tend to be both memory and compute limited, so the engineers—just as essentially everyone else in their situation has done—decided to keep track of the units themselves (in their heads, in the comments, and in the documentation).

            While this is true, and one of the reasons for extensive and exhaustive testing of spacecraft software, in the case of Mars Climate Orbiter (as also noted in the original post):

            raddevus wrote:

            ... root cause for the loss of the MCO spacecraft was the failure to use metric units in the coding of a ground software file...

            (My re-bolding) While not excusing the failure, which was a result of insufficient testing and other related issues, in this case it wasn't due to memory limitations of the on-board computers, or people "keeping track of units in their heads...". Anyway, source code isn't uploaded and compiled on a spacecraft, so type safety and better languages shouldn't have any impact on the size of the executables. While I'm sure Stroustrup means well, and intends the use of this failure as a lesson in the advantages of more strongly typed languages and methods, I think he's misrepresenting the actual situation. I'd recommend reading the actual reports on what happened for more details: [Mars Climate Orbiter Failure Board Releases Report](https://mars.jpl.nasa.gov/msp98/news/mco991110.html), which also contains a link to the full PDF report. m (in spacecraft test systems engineer mode :) )
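The point that type safety need not show up in the size of the data or the executable can be checked directly. A minimal sketch, assuming hypothetical wrapper types `NewtonSeconds` and `PoundForceSeconds` and a made-up consumer `accumulate_impulse`; the size check holds on mainstream compilers but is not something the C++ standard strictly promises.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// Hypothetical wrappers: one distinct type per unit around a single double.
struct NewtonSeconds     { double value; };
struct PoundForceSeconds { double value; };

// The unit lives only in the type: no extra bytes, copied like a plain double.
static_assert(sizeof(NewtonSeconds) == sizeof(double), "wrapper adds no bytes");
static_assert(std::is_trivially_copyable<NewtonSeconds>::value, "copies like a double");

// 1 lbf = 4.4482216152605 N, so the same factor converts impulse.
constexpr double kNewtonsPerPoundForce = 4.4482216152605;

constexpr NewtonSeconds to_si(PoundForceSeconds p) {
    return NewtonSeconds{p.value * kNewtonsPerPoundForce};
}

// Hypothetical trajectory input that accepts SI impulse only.
inline double accumulate_impulse(NewtonSeconds a, NewtonSeconds b) {
    return a.value + b.value;
}

inline void example() {
    const PoundForceSeconds from_file{10.0};
    // accumulate_impulse(from_file, from_file);  // does not compile: wrong unit type
    const double total = accumulate_impulse(to_si(from_file), NewtonSeconds{1.0});
    assert(std::fabs(total - 45.482216152605) < 1e-9);
}
```

All checking happens while compiling the ground software, so nothing about it has to be uploaded to or executed on the spacecraft.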

            Days spent at sea are not deducted from one's allotted span - Phoenician proverb

            James Lonero wrote (#19), in reply to molesworth:

            Reminds me of the Hubble telescope mirror problem. Perkin-Elmer was contracted to create the large reflecting mirror for the Hubble telescope, and NASA was not allowed to review the process, visit the factory, or check off the various milestones. NASA was only allowed to accept delivery when the mirror was complete. Then, on delivery, NASA didn't even do a thorough QA check of the mirror. Unfortunately, the mirror was so badly flawed that after it was installed and used, everything the telescope saw was blurred. Useless data was collected. NASA had to order another manned mission to fix the problem, at considerable cost to the American taxpayer. Thus, it pays to have constant checks and quality reviews by the customer.

              Lost User wrote (#20), in reply to raddevus's original post:

              Yes ... and now, picture hundreds of "self-driving" cars hurtling at each other ... (Wifi error)

              "(I) am amazed to see myself here rather than there ... now rather than then". ― Blaise Pascal

                firegryphon wrote (#21), in reply to raddevus's original post:

                The real issue was bad, or woefully understaffed, systems engineering on a program that was underfunded, with a process where some documents issued by the managing organization weren't part of the contractual requirements and were in direct conflict with the actual contractual documents. The fact that units were involved is secondary or tertiary. It could have been any other data type passed between the two completely independent software organizations, which didn't share any top-level organizations. Also, that cost is so stupidly and grossly overestimated that the person should be put on charges or at least sued for libel. Mars '98 Fact Sheet (Orbiter): according to this, the entire spacecraft development (including managing organization involvement) was $327 million for two spacecraft.

                  firegryphon wrote (#22), in reply to molesworth:

                  Exactly right. Everything in that report says bad or insufficient systems engineering process (which was encouraged under the Faster, Better, Cheaper mantra). As a result, missions with approximately the same capability are now back to costing three or more times as much. His quoted cost for the program is ridiculous as well. (not a spacecraft test system engineer, but have read a great deal on it)

                    kalberts wrote (#23), in reply to raddevus's original post:

                    Didn't Algol68 include unit arithmetic? I don't have any near-complete Algol68 reference at hand, but I have a slight memory that it did. Then: To claim that Algol68 ever was a widespread language would be a lie. You may question whether it was spread at all... :-) But then #2: In any modern OO language supporting generics, you could easily implement a "NumberAndUnit" class associating a unit with any number, implementing the appropriate arithmetic for the unit part. I can't claim to have seen any library supporting this, but I'd actually be surprised if none at all are available. Of course: That won't help the NASA guys as long as they don't use such a library. Sidetrack/historic: In my first job after completing my studies (long ago!), I worked with a text processing system where the basic unit was an AH, named after the initials of the developer who introduced it. One AH was 1/86400 of an inch. For the "modern" point definition, 1/72 of an inch, there were no roundoff errors. For the "traditional" pica points or didot points, the roundoff errors were significantly smaller than the precision of any of the peripherals we handled.
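A "NumberAndUnit" along these lines can be sketched with C++ templates, where the unit part is computed by the compiler rather than stored with the number. Everything below (the class layout, the three exponents, the aliases `Length`, `Time`, `Speed`) is an assumption made for illustration. Libraries in this spirit do exist, Boost.Units being one.

```cpp
#include <cassert>

// Hypothetical NumberAndUnit: the exponents of metre, kilogram, and second
// are template parameters, so the unit arithmetic happens in the type system.
template <int M, int Kg, int S>
struct NumberAndUnit {
    double value;
};

// Values with identical dimensions may be added.
template <int M, int Kg, int S>
constexpr NumberAndUnit<M, Kg, S> operator+(NumberAndUnit<M, Kg, S> a,
                                            NumberAndUnit<M, Kg, S> b) {
    return NumberAndUnit<M, Kg, S>{a.value + b.value};
}

// Multiplication adds the exponents.
template <int M1, int Kg1, int S1, int M2, int Kg2, int S2>
constexpr NumberAndUnit<M1 + M2, Kg1 + Kg2, S1 + S2>
operator*(NumberAndUnit<M1, Kg1, S1> a, NumberAndUnit<M2, Kg2, S2> b) {
    return NumberAndUnit<M1 + M2, Kg1 + Kg2, S1 + S2>{a.value * b.value};
}

// Division subtracts the exponents.
template <int M1, int Kg1, int S1, int M2, int Kg2, int S2>
constexpr NumberAndUnit<M1 - M2, Kg1 - Kg2, S1 - S2>
operator/(NumberAndUnit<M1, Kg1, S1> a, NumberAndUnit<M2, Kg2, S2> b) {
    return NumberAndUnit<M1 - M2, Kg1 - Kg2, S1 - S2>{a.value / b.value};
}

using Length = NumberAndUnit<1, 0, 0>;   // m
using Time   = NumberAndUnit<0, 0, 1>;   // s
using Speed  = NumberAndUnit<1, 0, -1>;  // m/s

inline void example() {
    const Length d{100.0};
    const Time   t{20.0};
    const Speed  v = d / t;      // type checks: m / s
    assert(v.value == 5.0);
    // const Speed bad = d * t;  // does not compile: metre-seconds is not a speed
    // d + t;                    // does not compile: dimensions differ
}
```

Because the exponents exist only at compile time, each value is still stored as a single `double`.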

                      raddevus wrote (#24), in reply to kalberts (#23):

                      Member 7989122 wrote:

                      Sidetrack/historic:

                      That's an interesting story. Thanks for sharing.

                        Abbas A Ali wrote (#25), in reply to raddevus (#15):

                        Like comments on this post? :^)

                          raddevus wrote (#26), in reply to Abbas A Ali (#25):

                          Hmm... the post was edited. There was an entire comment about what had happened. Interesting.

                            swampwiz wrote (#27), in reply to raddevus's original post:

                            I used to be an aerospace engineer on the Space Shuttle External Tank, and before that the Delta rocket. Our units were always the inch, the pound (force), and a mass unit defined as the mass that is accelerated at 1 in/sec² by a force of 1 pound (about 386 pounds (mass)). Since I was in structural dynamics, it was important to get that mass correct. The error here was that some paper-pusher whose job it was to change the parameters screwed up. Of course, it's ridiculous that the USA still uses non-metric units, but U-S-A, U-S-A!
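The "about 386 pounds (mass)" figure can be reproduced from two exact definitions, standard gravity and the inch. The constant and function names below are made up for this sketch.

```cpp
#include <cassert>
#include <cmath>

// Both constants are exact by definition.
constexpr double kStandardGravityMps2 = 9.80665;  // m/s^2
constexpr double kMetersPerInch       = 0.0254;   // m

// Standard gravity expressed in in/s^2.
constexpr double kStandardGravityInps2 = kStandardGravityMps2 / kMetersPerInch;

// 1 lbf accelerates 1 lbm at standard gravity, so the mass that 1 lbf
// accelerates at only 1 in/s^2 is this many pounds (mass).
constexpr double pounds_mass_per_inch_mass_unit() { return kStandardGravityInps2; }

// Converts pounds (mass) into the inch-based mass unit.
constexpr double to_inch_mass_units(double pounds_mass) {
    return pounds_mass / pounds_mass_per_inch_mass_unit();
}

inline void example() {
    // Roughly 386.09, matching the "about 386" in the post.
    assert(std::fabs(pounds_mass_per_inch_mass_unit() - 386.0886) < 1e-3);
}
```

Forgetting this factor in a dynamics model scales every mass by roughly 386, which is why getting it right mattered.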
