Code Project: The Lounge

I have vanquished the creeping horror

    Gary Wheeler
    wrote on last edited by
    #1

    We have a customer for whom we've done a [great steaming] pile of custom work. A couple of times a year they find a problem with the custom stuff. It always turns into a wretched slog through a Lovecraftian swamp of bubbling ichor (i.e., legacy code) that I didn't write but am now required to maintain. Three months ago they reported an issue with one of their features in the new generation of the product that didn't happen with the old one. I compared the code between the two and it was identical. I've spent considerable hours debugging through the code for the feature. It turns out the new code is looking in the wrong place in the registry to see if their custom features are enabled :doh:. The old code only worked accidentally. Cue the fireworks a day early, and let the naked happy dance :jig: commence!

    Software Zen: delete this;

      jeron1
      wrote on last edited by
      #2

      One of our embedded systems has been around for 15+ years, and in that time one or two customers had random device crashes. We tried setting things up using the customer's configuration, to no avail, for many months. Data from the clients revealed that the crashes occurred at many different addresses. Finally, in desperation, I decided to look at stack usage. Out of hundreds of routines (think 70-80K lines of assembler), two that were related to little-used features allocated 4 bytes more stack space than they deallocated. It took thousands of calls to these routines before the stack encroached on runtime variables. Twas a happy day when I found it :beer:, and just as important, a big learning experience.
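
For anyone wondering why a 4-byte imbalance takes so long to bite, here is a minimal sketch of the drift in Python. It is only an illustration, not the actual firmware: the addresses, the 16/12-byte frame sizes, and the function name `calls_until_overlap` are all made up, and it assumes a descending stack sitting above the runtime variables.

```python
from typing import Optional


def calls_until_overlap(stack_top: int, variables_end: int,
                        alloc: int, dealloc: int,
                        max_calls: int = 1_000_000) -> Optional[int]:
    """Return the call number on which the stack first dips below
    variables_end, or None if that never happens within max_calls.

    Models a descending stack: each call subtracts `alloc` bytes from the
    stack pointer on entry and adds `dealloc` bytes back on exit.
    """
    sp = stack_top
    for call in range(1, max_calls + 1):
        sp -= alloc                # routine reserves its frame
        if sp < variables_end:     # deepest point of the call hits the variables
            return call
        sp += dealloc              # routine releases (too little of) its frame
    return None


# Hypothetical layout: 4096 bytes of headroom between variables and stack top.
STACK_TOP = 0x2000
VARIABLES_END = 0x1000

# Leaky routine: reserves 16 bytes, releases only 12, so 4 bytes are lost per call.
leaky = calls_until_overlap(STACK_TOP, VARIABLES_END, alloc=16, dealloc=12)

# Balanced routine: never drifts, so it never reaches the variables.
balanced = calls_until_overlap(STACK_TOP, VARIABLES_END, alloc=16, dealloc=16,
                               max_calls=10_000)
```

With these made-up numbers the leaky routine needs roughly a thousand calls before anything is overwritten, which fits a little-used feature taking months to cause a crash, and at a different address each time depending on what happens to sit at the bottom of the stack.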

      "the debugger doesn't tell me anything because this code compiles just fine" - random QA comment "Facebook is where you tell lies to your friends. Twitter is where you tell the truth to strangers." - chriselst "I don't drink any more... then again, I don't drink any less." - Mike Mullikins uncle

        Gary Wheeler
        wrote on last edited by
        #3

        jeron1 wrote:

        Twas a happy day when I found it :beer:, and as important, a big learning experience

        Same here. I've had other memorable bugs that were excruciating to recreate and diagnose. One was a GDI handle leak that took over a week of run time to show up and crash the application. Another was a piece of embedded code where the TCP/IP code we bought back in 1995 did not re-initialize properly after a network hardware error. Both of these took weeks of debugging to find and reproduce and only a couple of hours to correct.

          jeron1
          wrote on last edited by
          #4

          Gary Wheeler wrote:

          Both of these took weeks of debugging to find and reproduce and only a couple of hours to correct.

          Funny how that is. :thumbsup:

            dandy72
            wrote on last edited by
            #5

            Now imagine weeks of debugging and seconds for the actual fix (after the problem was identified and well understood).

              honey the codewitch
              wrote on last edited by
              #6

              My takeaway is "don't write things in assembly" :laugh:

              Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix

                jeron1
                wrote on last edited by
                #7

                I wouldn't necessarily argue! :-D In this particular product there's a lot of time sensitive code, and over the years it's grown, A LOT.

                  jeron1
                  wrote on last edited by
                  #8

                  True, it's tough when you are going through it; for me, sleepless nights are the usual result!

                  dandy72 wrote:

                  after the problem was identified and well understood

                  To me there's a certain satisfaction to that, as rough as the road was to get to that point (and maybe learning a thing or two about better implementation, as I did).

                    jschell
                    wrote on last edited by
                    #9

                    Hopefully you bill by the hour.

                      Gary Wheeler
                      wrote on last edited by
                      #10

                      Unfortunately no :sigh:, at least in my day job.

                        Richard Andrew x64
                        wrote on last edited by
                        #11

                        Wait, you said the code was identical between the new and old versions of the product. So how is there a difference that caused it to look in the wrong place? UNLESS, it was always looking in the wrong place, and changes to the surrounding code put the information into a different place that it was unable to find by accident? OR, are you saying that the code that just turned the feature on didn't work, and the code of the feature itself was unchanged?

                        The difficult we do right away... ...the impossible takes slightly longer.

                          Ron Anders
                          wrote on last edited by
                          #12

                          Whew.

                            Gary Wheeler
                            wrote on last edited by
                            #13

                            The difference was in a class used throughout the product to access the registry. This component (a Windows service) used that class incorrectly in the old product but still managed to find the values at the appropriate key. When the service was migrated to the new product (which changes the registry key used), its incorrect usage of the registry class caused it to look at the wrong registry key and not find the required values. Both the service and the registry class looked correct on inspection. It wasn't until I stepped through the service that I discovered it was assuming certain things about the registry class that weren't true and never had been. FWIW, I didn't write either of them.
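
To make the "worked by accident" part concrete, here is one hypothetical way this kind of thing can happen, sketched in Python with a plain dict standing in for the registry. None of it comes from the real product: the class `RegistrySettings`, its methods, the key paths, and the `Enabled` value name are all invented for illustration, and the actual misuse may well have looked different.

```python
from typing import Dict, Optional, Tuple

# Stand-in for the registry: (key path, value name) -> value.
FakeRegistry = Dict[Tuple[str, str], int]


class RegistrySettings:
    """Hypothetical wrapper that resolves subkeys under a per-product base key."""

    def __init__(self, store: FakeRegistry, base_key: str) -> None:
        self._store = store
        self._base_key = base_key

    def read(self, subkey: str, name: str) -> Optional[int]:
        # Intended usage: the path is relative to the product's base key.
        return self._store.get((self._base_key + "\\" + subkey, name))

    def read_absolute(self, full_key: str, name: str) -> Optional[int]:
        # Escape hatch: the caller supplies the whole path itself.
        return self._store.get((full_key, name))


def feature_enabled_buggy(settings: RegistrySettings) -> bool:
    # Wrong assumption: the base key is, and always will be, the old product's.
    return settings.read_absolute(r"Software\Vendor\OldProduct\Custom", "Enabled") == 1


def feature_enabled_fixed(settings: RegistrySettings) -> bool:
    # Lets the wrapper decide where the product's keys live.
    return settings.read("Custom", "Enabled") == 1


old_store: FakeRegistry = {(r"Software\Vendor\OldProduct\Custom", "Enabled"): 1}
new_store: FakeRegistry = {(r"Software\Vendor\NewProduct\Custom", "Enabled"): 1}

old_settings = RegistrySettings(old_store, r"Software\Vendor\OldProduct")
new_settings = RegistrySettings(new_store, r"Software\Vendor\NewProduct")
```

In this sketch the buggy lookup succeeds against `old_settings` only because the hard-coded path happens to match the old base key, then quietly reports the feature as disabled against `new_settings`. Both functions look reasonable when read on their own, which is the same trap described above.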

                              Gary Wheeler
                              wrote on last edited by
                              #14

                              Yup :-D.

                                dandy72
                                wrote on last edited by
                                #15

                                I get what you say, but the last time such a thing happened to me it was a case of a missing quote in a string that was being built with multiple layers of escaping them. So of course the compiler didn't know any better and was of no help. There was nothing satisfying about solving that particular problem. Just annoyance at whoever last modified the string in the first place...

                                  jsrjsr
                                  wrote on last edited by
                                  #16

                                  My favorite was a 49-day crash bug caused by a 32-bit timer rollover. Management wanted to know why test did not find this bug.
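
For reference, the 49 days falls straight out of the arithmetic if the timer counts milliseconds, which is an assumption here since the post doesn't say what the tick unit was. Below is a small Python sketch of the rollover and of the usual wrap-safe way to measure elapsed time. The function names are made up, and Python ints are masked to 32 bits to mimic an unsigned counter.

```python
MASK32 = 0xFFFFFFFF                  # largest value of an unsigned 32-bit counter
MS_PER_DAY = 24 * 60 * 60 * 1000


def rollover_days() -> float:
    """Days until a 32-bit millisecond counter wraps back to zero."""
    return (MASK32 + 1) / MS_PER_DAY


def elapsed_ms(now: int, start: int) -> int:
    """Wrap-safe elapsed time: modular subtraction survives one rollover."""
    return (now - start) & MASK32


def naive_timed_out(now: int, deadline: int) -> bool:
    # Breaks when the deadline was computed past the wrap point.
    return now >= deadline


def safe_timed_out(now: int, start: int, timeout: int) -> bool:
    return elapsed_ms(now, start) >= timeout


# A 1000 ms timeout started 256 ms before the counter wraps.
start = 0xFFFFFF00
timeout = 1000
deadline = (start + timeout) & MASK32    # wraps around to a small number
now = start + 10                         # only 10 ms later

naive_result = naive_timed_out(now, deadline)      # fires 990 ms too early
safe_result = safe_timed_out(now, start, timeout)  # still waiting, as it should
```

The naive comparison only misbehaves in the short window around the wrap, so a test run has to stay up for about seven weeks, or start the counter near its maximum, to ever see it.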

                                    englebart
                                    wrote on last edited by
                                    #17

                                    You always find something in the last place you look for it.

                                      kholsinger
                                      wrote on last edited by
                                      #18

                                      I have a similar story about a once-per-49-days issue. Takes a long time to run those experiments, and to be patient enough to not interrupt them for some other test of the system.
