CodeProject outage

Posted in The Lounge. Tags: sysadmin, security, announcement
  • Chris Maunder

    At around 2AM this morning things started to go a little pear-shaped with our servers. Our tech guys were looking into it, but it became an all-hands exercise in trying to work out what broke what.

    We had updated CodeProject's code. So we redeployed, cleaned and deployed, rolled back and cleaned and deployed. It wasn't our code.

    Our requests per second were a little crazy. As in 1000x what we normally would get, but nothing that set off the DoS alarms. Things were adjusted to ease the load but the load remained uneased. Finally, with zero load we still had the site pinned. It turns out the firewall needed replacing. Firewall fixed, load reduced, site back up. Mostly.

    There was also a series of Windows patches that were installed as part of routine maintenance. These were to do with HTTP security, so they naturally got some attention. Uninstalling Windows patches can be painful, but once one of the patches was removed everything popped back up. However, that was only on one of the servers, so that patch, since it's a security update, will be reinstalled, and if it causes issues the entire server will be binned and a new one rolled in.

    So: fun and games at CodeProject central and I apologise for being down for so long. This was a bit of a trifecta, but we're now into the mopping-up and analysis stage so :beer: all round.

    cheers Chris Maunder

    Pete OHanlon wrote (#9), replying to Chris Maunder:

    Still running on AWS? This happens to us sometimes too.

    Advanced TypeScript Programming Projects

    • Gary R Wheeler

      Must... not... tell... war... story... It involved a customer replacing the 1Gb hub we have in our equipment with a 100Mb hub they had lying around. They had the audacity to (1) not tell us what they had done, (2) lie when we asked them point blank what they had changed, and (3) try to hide the evidence when a [very sleepy after a 10-hour drive] field service dude showed up. Irrelevant side note: the outage must be Jeremy's fault, since he raised the subject of uptime a few threads down.

      Software Zen: delete this;

      Sander Rossel wrote (#10), replying to Gary R Wheeler:

      Been there, done that :laugh: A customer called and told me things didn't work. I asked them if anything at all had happened that could've caused this. Nope, nothing had happened. Really, nothing you can think of? Nothing. This customer never called, so I wasn't too familiar with their software and setup. After a few hours of looking into it I found nothing. When I called them back the IT manager "suddenly remembered" they'd restarted the server, "but that can't be it, right?" We fixed the problem five minutes later :doh:

      Best, Sander Azure DevOps Succinctly (free eBook) Azure Serverless Succinctly (free eBook) Migrating Apps to the Cloud with Azure arrgh.js - Bringing LINQ to JavaScript

        Mark Starr wrote (#11), replying to Chris Maunder:

        Things go sideways sometimes. A bit of adrenaline to get the blood pumping. Sounds like you’ve got it figured out. Cheers, and thanks! :java:

        Time is the differentiation of eternity devised by man to measure the passage of human events. - Manly P. Hall Mark Just another cog in the wheel

          obeobe wrote (#12), replying to Chris Maunder:

          Maybe some day you will upgrade your servers to Linux...

            dandy72 wrote (#13), replying to obeobe:

            Trading known problems for brand-new unknown ones. Hmmmm, tough choice...

              abmv wrote (#14), replying to Chris Maunder:

              So there was some Windows patch and the firewall config raising the CPU. Weird...

              Caveat Emptor. "Progress doesn't come from early risers – progress is made by lazy men looking for easier ways to do things." Lazarus Long

              We are in the beginning of a mass extinction. - Greta Thunberg

                Nelek wrote (#15), replying to Gary R Wheeler:

                Gary R. Wheeler wrote:

                Irrelevant side note: the outage must be Jeremy's fault, since he raised the subject of uptime a few threads down.

                I thought the same :rolleyes: :laugh: :laugh:

                M.D.V. ;) If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about? Help me to understand what I'm saying, and I'll explain it better to you Rating helpful answers is nice, but saying thanks can be even nicer.

                  obeobe wrote (#16), replying to dandy72:

                  Obviously it was tongue in cheek, but seriously, Windows updates too often break stuff, and Murphy's law ensures that they break said stuff at the most inconvenient times.

                    dandy72 wrote (#17), replying to obeobe:

                    obeobe wrote:

                    Obviously it was tongue in cheek, but seriously,

                    The problem is that some people make that sort of suggestion while being absolutely serious.

                    obeobe wrote:

                    Windows updates too often break stuff

                    Can't argue there...

                    obeobe wrote:

                    they break said stuff at the most inconvenient times.

                    Is there ever a convenient time during which updates should be okay to break stuff...?

                      jschell wrote (#18), replying to dandy72:

                      dandy72 wrote:

                      Is there ever a convenient time during which updates should be okay to break stuff...?

                      When I am on vacation and unreachable?

                        dandy72 wrote (#19), replying to jschell:

                        ...so less qualified people rush their own fix that you then inherit when you come back...?

                          jschell wrote (#20), replying to dandy72:

                          Well, yes... but that of course leads to job security, since by then the higher-ups learn that only I can fix things!

                          • Jeremy Falcon

                            Chris Maunder wrote:

                            if it causes issues the entire server will be binned and a new one rolled in

                            So you know Chris, ever think about hosting CP on Debian? :-\

                            Jeremy Falcon

                            Chris Maunder wrote (#21), replying to Jeremy Falcon:

                            We have this awful albatross of a WebForms project that ruins the fun. Give me .NET Core, Linux, and PostgreSQL or MariaDB and I'd be so happy, and our costs would be dramatically lower. Sigh.

                            cheers Chris Maunder

                              Chris Maunder wrote (#22), replying to Gary R Wheeler:

                              10% of capacity? But what could go wrong? ;)

                              cheers Chris Maunder

                                Gary R Wheeler wrote (#23), replying to Chris Maunder:

                                Especially on a machine printing both sides of the paper at 17 feet per second, full color, front and back.

                                Software Zen: delete this;
