
SW Engineering, NASA & how things go wrong

Posted in The Lounge. 25 Posts, 13 Posters.
  • raddevus wrote:

    I really can't stop with this book, Modern Software Engineering, because so much of it resonates with me after working in IT/Dev for over 30 years. I came to Dev thru QA so I've always focused on "repeatable processes, errors & failing safely".

    Quote:

    One of the driving forces behind [Margaret] Hamilton’s[^] approach was the focus on how things fail—the ways in which we get things wrong. "There was a fascination on my part with errors, a never ending pass-time of mine was what made a particular error, or class of errors, happen and how to prevent it in the future." This focus was grounded in a scientifically rational approach to problem-solving. The assumption was not that you could plan and get it right the first time, rather that you treated all ideas, solutions, and designs with skepticism until you ran out of ideas about how things could go wrong. Occasionally, reality is still going to surprise you, but this is engineering empiricism at work. The other engineering principle that is embodied in Hamilton’s early work is the idea of “failing safely.” The assumption is that we can never code for every scenario, so how do we code in ways that allow our systems to cope with the unexpected and still make progress? Famously it was Hamilton’s unasked-for implementation of this idea that saved the Apollo 11 mission and allowed the Lunar Module Eagle to successfully land on the moon, despite the computer becoming overloaded during the descent. As Neil Armstrong and Buzz Aldrin descended in the Lunar Excursion Module (LEM) toward the moon, there was an exchange between the a

    agolddog wrote (#21):

    It's just shocking to me how many "professional" developers don't embrace defensive programming.


      rjmoses wrote (#22):

      One of the hardest, or at least most memorable, software problems I ever had to chase was a programming error in a seldom-used error recovery routine. It seems my programmer, who was highly experienced in other programming languages such as COBOL, coded a "=" instead of an "==" inside an if statement in a C program, kinda like "if ( A = B)....". This assigns the value of B to A and then tests the result, so the condition comes out true whenever B is nonzero, no matter what A held before. Unfortunately, this little error caused the system to crash. It took about six months to find the cause of the crash and understand what was happening. Looking at the code under the pressure of a "down system", we kept asking why the condition was true, because our minds read it as "if A equals B" and never considered that A had not been equal to B before the if statement. I cursed (and still do curse to this day) whoever decided that allowing an assignment inside of a conditional statement was a good idea. And I have to wonder how many systems, like autonomous cars, have a statement like that buried way down deep in an infrequently used piece of critical code.

      • Lost User wrote:

        I found that (user) logging reduces a lot of "errors". Another term is graceful degradation. But that requires understanding when a try-catch block should continue or not. And, yes, it may require asking the user if it should proceed (e.g. a file not available). It then comes down to transparency (of the software) ... what Boeing failed to do with their software changes: tell the user what they did and how it might impact them.

        "Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I
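As a rough sketch of the "log it and degrade gracefully" idea in the post above: C has no try-catch, so this version checks return values instead. The `Settings` struct, the `load_settings` function, and the default of 3 retries are all assumptions made up for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical settings with a safe default. */
typedef struct {
    int retry_limit;       /* defaults to 3 when nothing usable is found */
    int loaded_from_file;  /* 1 only if the value came from the file */
} Settings;

/* Try to read retry_limit from 'path'. On any failure, tell the user
 * what happened via 'log' and carry on with the defaults. */
static Settings load_settings(const char *path, FILE *log) {
    assert(path != NULL && log != NULL);
    Settings s = { 3, 0 };
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        fprintf(log, "settings: cannot open '%s', using defaults\n", path);
        return s;
    }
    int value;
    if (fscanf(f, "%d", &value) == 1 && value > 0) {
        s.retry_limit = value;
        s.loaded_from_file = 1;
    } else {
        fprintf(log, "settings: '%s' is malformed, using defaults\n", path);
    }
    fclose(f);
    return s;
}
```

Whether falling back silently is acceptable, or the user should be asked before proceeding, depends on what the missing file controls; the log line is the transparency part.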

        raddevus wrote (#23):

        Very good points and exactly right on the Boeing problem. Could’ve been solved properly. Such a sad and terrible thing.


          raddevus wrote (#24), in reply to agolddog:

          I believe it is because they don’t “have to” since they are not the ones who will be wakened in the middle of the night. You learn a lot from losing sleep. :rolleyes:


            raddevus wrote (#25), in reply to rjmoses:

            That is a fantastic story, thanks for sharing. I’ve seen that error (in my own code) and fortunately caught it before it went into production. You would def think the compiler would catch that.
