Need to vent...

The Weird and The Wonderful

Timothy Byrd
#1

    Our marketing department hired a new web developer, and they were given a task to write some code for our department. Here are a couple excerpts:

    // Names have been changed to protect the guilty...
    //
    namespace OMG
    {
        static class TranslationClass
        {
            static ArrayList nonTranslatable = new ArrayList();

            public static ArrayList NonTranslatable
            {
                get
                {
                    nonTranslatable.Add("1st String");
                    nonTranslatable.Add("2nd String");
                    // ...
                    nonTranslatable.Add("30th String");
                    return nonTranslatable;
                }
            }

            private SortedList translationDictionary = new SortedList();

            // more stuff here to fill the translationDictionary

            public void TranslateNode(XmlNode node)
            {
                foreach (string s in NonTranslatable)
                {
                    if (node.Attributes["Publish"].InnerText.Trim().Contains(s))
                    {
                        return;
                    }
                }

                foreach (DictionaryEntry de in translationDictionary)
                {
                    if (de.Key.ToString().Equals(node.Attributes["Publish"].InnerText.Trim()))
                        node.Attributes["Publish"].InnerText
                            = translationDictionary[node.Attributes["Publish"].InnerText.Trim()].ToString();
                }
            }
        }
    }

When I asked them to run their code on the sample dataset, I was told they'd need to run it overnight. Looking at their code, I could see why. So I wrote my own version of the utility and timed the two against each other. Their program took an hour and a quarter to process a 2MB text file; mine took 2 seconds. The biggest problem was that every time NonTranslatable was used, new, redundant elements were being added to it, so the loop took longer and longer to run. They were also looping through the dictionary instead of using something like ContainsKey(). Argh!
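
For illustration, a minimal sketch of the two fixes described above - populating the list only once, and looking the key up directly instead of scanning every entry. The names are carried over from the snippet; this is just a sketch, not the actual rewrite from the thread.

    using System.Collections;
    using System.Xml;

    namespace OMG
    {
        static class TranslationClassSketch
        {
            static readonly ArrayList nonTranslatable = new ArrayList();

            public static ArrayList NonTranslatable
            {
                get
                {
                    if (nonTranslatable.Count == 0)   // populate only on first use
                    {
                        nonTranslatable.Add("1st String");
                        // ...
                        nonTranslatable.Add("30th String");
                    }
                    return nonTranslatable;
                }
            }

            static readonly SortedList translationDictionary = new SortedList();   // filled elsewhere, as in the original

            public static void TranslateNode(XmlNode node)
            {
                string text = node.Attributes["Publish"].InnerText.Trim();

                foreach (string s in NonTranslatable)
                {
                    if (text.Contains(s))
                        return;   // contains a non-translatable marker, skip it
                }

                // Direct key lookup instead of iterating every DictionaryEntry.
                if (translationDictionary.ContainsKey(text))
                    node.Attributes["Publish"].InnerText =
                        translationDictionary[text].ToString();
            }
        }
    }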

PIEBALDconsult
#2 (in reply to Timothy Byrd, #1)

Ouch! If you fix those two problems (repetitive additions and looping), how much faster is it? And how about removing the .ToString() and .Equals()? If you are using .NET 3.5, how about having them use a HashSet?

Chris Meech
#3 (in reply to Timothy Byrd, #1)

        Well, I think marketing are getting exactly what they deserve. The slower they work, the better. :)

        Chris Meech I am Canadian. [heard in a local bar] In theory there is no difference between theory and practice. In practice there is. [Yogi Berra]

GibbleCH
#4 (in reply to PIEBALDconsult, #2)

Or better, if you're going to use 3.5, a List and some LINQ.
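
For illustration, a hypothetical sketch of that idea (assuming .NET 3.5 and the same placeholder strings; the class and method names are made up): Any() with a lambda replaces the hand-rolled loop.

    using System.Collections.Generic;
    using System.Linq;

    static class NonTranslatableLinq
    {
        // Placeholder strings, as in the original snippet.
        static readonly List<string> nonTranslatable = new List<string>
        {
            "1st String", "2nd String", /* ... */ "30th String"
        };

        // True if the trimmed text contains any non-translatable marker.
        public static bool IsNonTranslatable(string text)
        {
            return nonTranslatable.Any(s => text.Contains(s));
        }
    }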

Timothy Byrd
#5 (in reply to PIEBALDconsult, #2)

Simply putting an if (nonTranslatable.Count == 0) block around the code in the NonTranslatable property took the run time down from >75 minutes to ~30 seconds. (Fun fact: During the 75 minutes of processing, those repeated Add() calls caused the program to take up about 25MB of extra memory. That's a lot of strings to loop through and compare repeatedly. :omg: )

PIEBALDconsult wrote:

And how about removing the .ToString and .Equals ?

Not to mention all the calls to Trim() inside the loop. Or does the runtime optimize those out as invariant?

I ended up just dumping all their code and using my own. To add another wrinkle, they wrote it in VS 2008, and we are still on 2005 until we get this release out, so I couldn't even build their solution. Besides, there were other issues: not only was it still 10x slower than my rewrite, but the output wasn't quite correct. The problem is that there was also a non-XML text file that needed to be parsed - imagine taking a Windows .rc file, extracting all the strings that can be translated from it, and putting them into an Excel file. They weren't careful enough with their regular expressions, so some things were getting missed and some extra things were being found.

Using .NET 2.0 and a Dictionary<string, string> worked fine for me. By making the regular expressions more correct, I didn't need the NonTranslatable array at all. (The two loops I showed were actually in two separate routines, but I wanted to keep my post relatively compact.)

I've vented - I feel better now. -- T
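
For illustration, a rough, hypothetical sketch of the kind of extraction described above: pull quoted literals out of an .rc-style text file into a Dictionary. The file name, regex pattern, and class name are made up, and the pattern is deliberately simplistic - this is not the actual utility.

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Text.RegularExpressions;

    class RcStringExtractor
    {
        static void Main()
        {
            // source string -> translated string (translations filled in later)
            Dictionary<string, string> translations = new Dictionary<string, string>();

            // Very simplified: grab the contents of double-quoted literals.
            Regex quoted = new Regex("\"([^\"]*)\"");

            foreach (string line in File.ReadAllLines("sample.rc"))   // illustrative file name
            {
                foreach (Match m in quoted.Matches(line))
                {
                    string s = m.Groups[1].Value.Trim();
                    if (s.Length > 0 && !translations.ContainsKey(s))
                        translations[s] = s;   // placeholder until translated
                }
            }

            Console.WriteLine("Found {0} candidate strings.", translations.Count);
        }
    }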

PIEBALDconsult
#6 (in reply to GibbleCH, #4)

I think HashSet is better than List in this case:

0) No duplicates
1) Hash rather than iterate
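
For illustration, a hypothetical sketch of those two points (the names are made up): duplicate Adds are ignored, and Contains() hashes instead of iterating. Note this covers exact-match membership; the substring check in the original loop would still need a per-string Contains().

    using System;
    using System.Collections.Generic;

    static class NonTranslatableHashSet
    {
        static readonly HashSet<string> nonTranslatable = new HashSet<string>
        {
            "1st String", "2nd String", "30th String"
        };

        public static void Demo()
        {
            // Duplicate Add is a no-op and returns false, unlike ArrayList.Add.
            bool added = nonTranslatable.Add("1st String");
            Console.WriteLine("Duplicate added? {0}", added);

            // O(1) average-case membership test instead of walking a list.
            Console.WriteLine(nonTranslatable.Contains("30th String"));
        }
    }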

Deflinek
#7 (in reply to Chris Meech, #3)

                Chris Meech wrote:

                The slower they work, the better.

Because there are soooo maaaany ways to improve! Imagine if the customer got used to "the 2.5h" routine, and then after a few weeks, or even better months, you showed them your "2 sec" one? WOW! :laugh:

                -- "My software never has bugs. It just develops random features."

kenrentz
#8 (in reply to Deflinek, #7)

You're doing it wrong. First you give them the 2.5h routine. Then you give them a free update that trims it to an even 2 hours. After that you offer to sell them the latest version that drops the time to about 15 minutes. The rest of the speed improvement you save for a later release, when you can add it in to make up for the lack of genuine improvements. I guess I've spent way too much time around marketing executives for my own good.

supercat9
#9 (in reply to kenrentz, #8)

I've written some code that turned out to be dog slow when given sufficiently large data sets. Usually I've tried to fix it before the customer could try it and complain. It's not always easy, though, to know what is going to pose a problem until one actually codes it. Obviously O(n^2) algorithms are likely to cause problems unless n is bounded to a small value, but it may not be possible to recognize an algorithm that takes time c + k*n as an obvious bottleneck without testing it on large data sets. For example, if c == 1000k the program may seem to take the same amount of time with n=10 and n=100, but be unacceptably slow with n=1,000,000. It may well be that the n=100 case is much more common, but if the n=1,000,000 case can occur, the system needs to handle it without causing timeouts.
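
A quick numeric illustration of that point, with made-up units (c = 1000, k = 1):

    class LinearCostDemo
    {
        static void Main()
        {
            long c = 1000, k = 1;
            foreach (long n in new long[] { 10, 100, 1000000 })
                System.Console.WriteLine("n = {0,9}: time = c + k*n = {1}", n, c + k * n);
            // n =        10: 1010
            // n =       100: 1100      (barely distinguishable from n = 10)
            // n =   1000000: 1001000   (roughly 1000x slower)
        }
    }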

ToddHileHoffer
#10 (in reply to Timothy Byrd, #1)

Yeah, the code is awful, but maybe it was an afterthought. Everybody writes stupid code once in a while. Maybe the coder just didn't care at the moment.

                      I didn't get any requirements for the signature

Douglas Dean
#11 (in reply to Timothy Byrd, #1)

How about simply bagging the non-translatable list (which was probably only put there as a lame fix for the performance of his clunky lookup) and using the DataSet, DataTable, etc. classes to parse the XML?
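
For illustration, a hypothetical sketch of that approach (the file name is made up, and "Publish" is carried over from the snippet): let ADO.NET infer the tables from the XML and walk the rows.

    using System;
    using System.Data;

    class DataSetSketch
    {
        static void Main()
        {
            DataSet ds = new DataSet();
            ds.ReadXml("sample.xml");   // infers tables and columns from the XML shape

            foreach (DataTable table in ds.Tables)
            {
                if (!table.Columns.Contains("Publish"))
                    continue;

                foreach (DataRow row in table.Rows)
                    Console.WriteLine(row["Publish"]);
            }
        }
    }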

Megidolaon
#12 (in reply to Timothy Byrd, #1)

What is the point of

    if (node.Attributes["Publish"].InnerText.Trim().Contains(s))
    {
        return;
    }

A check for containing a single letter and then doing nothing? Also,

    public static ArrayList NonTranslatable
    {
        get
        {
            nonTranslatable.Add("1st String");
            nonTranslatable.Add("2nd String");
            // ...
            nonTranslatable.Add("30th String");
            return nonTranslatable;
        }
    }

Sorry, but WTF is this ****? Who the hell adds elements to a collection in the getter? And where do the strings come from? From a DB? When are they checked? When is it determined whether something is non-translatable?

    foreach (DictionaryEntry de in translationDictionary)
    {
        if (de.Key.ToString().Equals(node.Attributes["Publish"].InnerText.Trim()))
            node.Attributes["Publish"].InnerText
                = translationDictionary[node.Attributes["Publish"].InnerText.Trim()].ToString();
    }

I've never used a SortedList; you can access its elements with the DictionaryEntry class, so is there any difference from a Dictionary?

Timothy Byrd wrote:

Simply putting an if (nonTranslatable.Count == 0) block around the code in the NonTranslatable property took the run time down from >75 minutes to ~30 seconds. (Fun fact: During the 75 minutes of processing, those repeated Add() calls caused the program to take up about 25MB of extra memory. That's a lot of strings to loop through and compare repeatedly. OMG )

Wow.

Timothy Byrd wrote:

Not to mention all the calls to Trim() inside the loop. Or does the runtime optimize those out as invariant?

Maybe the XML contains trailing spaces? Unless you trim them, the XML nodes' InnerText wouldn't equal the DictionaryEntries with the same key. Anyway, I feel truly sorry for you... X|
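
For what it's worth, a small illustrative sketch of the difference being asked about: a non-generic SortedList keeps its keys sorted and finds them by binary search, while Dictionary<TKey, TValue> hashes them. Both support direct key lookup, which is what the original loop was missing.

    using System;
    using System.Collections;
    using System.Collections.Generic;

    class LookupDemo
    {
        static void Main()
        {
            SortedList sorted = new SortedList();   // keys kept sorted, binary-search lookup
            sorted["hello"] = "hola";

            Dictionary<string, string> hashed = new Dictionary<string, string>();   // hashed lookup
            hashed["hello"] = "hola";

            if (sorted.ContainsKey("hello"))
                Console.WriteLine(sorted["hello"]);

            if (hashed.ContainsKey("hello"))
                Console.WriteLine(hashed["hello"]);
        }
    }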

Timothy Byrd
#13 (in reply to Megidolaon, #12)

Megidolaon wrote:

What is the point of

    if (node.Attributes["Publish"].InnerText.Trim().Contains(s))
    {
        return;
    }

A check for containing a single letter and then doing nothing?

No, that loop checks whether the node text contains one of the non-translatable strings and, if so, bails.

Megidolaon wrote:

Maybe the XML contains trailing spaces? Unless you trim them, the XML Nodes' InnerText wouldn't equal the DictionaryEntries with the same key

You are correct; however, within a given call to TranslateNode(), the XML node stays the same - until it gets translated from the dictionary. Oh crap, there's yet another bug in there that I hadn't noticed: it can double-translate. Anyway, the phrase node.Attributes["Publish"].InnerText.Trim() should be invariant, and I assume it takes some processing. So a line like:

    string xmlText = node.Attributes["Publish"].InnerText.Trim();

and then referring to that string would have been more efficient, and easier for me to read, I think. -- T
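
For illustration, a minimal sketch of that suggestion, reusing the names from the snippet above: the trimmed attribute text is hoisted into a local and looked up once, which also sidesteps the double-translation problem, since nothing re-reads the attribute after it is rewritten. This is only a sketch, not the actual rewrite.

    using System.Collections;
    using System.Xml;

    class TranslateNodeHoisted
    {
        readonly SortedList translationDictionary = new SortedList();   // filled elsewhere

        public void TranslateNode(XmlNode node)
        {
            // Compute the invariant once instead of on every iteration.
            string xmlText = node.Attributes["Publish"].InnerText.Trim();

            // One direct lookup; no loop, so a just-translated value can't be matched again.
            if (translationDictionary.ContainsKey(xmlText))
                node.Attributes["Publish"].InnerText =
                    translationDictionary[xmlText].ToString();
        }
    }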
