Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. Other Discussions
  3. Clever Code
  4. Challenge: the fastest way to filter doubled items from a list.

Challenge: the fastest way to filter doubled items from a list.

Scheduled Pinned Locked Moved Clever Code
learningcsharpjavahelpquestion
31 Posts 24 Posters 4 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • 0 Offline
    0 Offline
    0bx
    wrote on last edited by
    #1

    To all the clever coders out here! I have a list full of strings (doubledList) and I want to create a new list containing every unique string from the first list exactly once(UniqueList). My solution is this (C#):

    private List LoadUniqueList(List doubledList)
    {
    List UniqueList = new List();
    foreach (string item in doubledList)
    {
    bool x = true;
    foreach (string compare in UniqueList)
    {
    if (item == compare)
    {
    x = false;
    }
    if (!x) break;
    }
    if (x) UniqueList.Add(item);
    }
    return UniqueList;
    }

    Can you make it lighter, faster, sharper? :java::java::java: X| Personally, I have no idea if it's even possible; otherwise it wouldn't be very interesting for me of course. Also, do you think it's a good idea to use this part of the forum similar challenges? I think that starting with a working solution for a generic problem and then discussing whether you can improve it would be a great way to create more learning opportunities.

    Giraffes are not real.

    R L J E P 14 Replies Last reply
    0
    • 0 0bx

      To all the clever coders out here! I have a list full of strings (doubledList) and I want to create a new list containing every unique string from the first list exactly once(UniqueList). My solution is this (C#):

      private List LoadUniqueList(List doubledList)
      {
      List UniqueList = new List();
      foreach (string item in doubledList)
      {
      bool x = true;
      foreach (string compare in UniqueList)
      {
      if (item == compare)
      {
      x = false;
      }
      if (!x) break;
      }
      if (x) UniqueList.Add(item);
      }
      return UniqueList;
      }

      Can you make it lighter, faster, sharper? :java::java::java: X| Personally, I have no idea if it's even possible; otherwise it wouldn't be very interesting for me of course. Also, do you think it's a good idea to use this part of the forum similar challenges? I think that starting with a working solution for a generic problem and then discussing whether you can improve it would be a great way to create more learning opportunities.

      Giraffes are not real.

      R Offline
      R Offline
      riced
      wrote on last edited by
      #2

      Don't know if it's faster but I'd use the Exists() method to see if it was already there rather than loop through UniqueList. :) I'm sure there's also a neat way to do this using LINQ. :-D

      Regards David R --------------------------------------------------------------- "Every program eventually becomes rococo, and then rubble." - Alan Perlis The only valid measurement of code quality: WTFs/minute.

      1 Reply Last reply
      0
      • 0 0bx

        To all the clever coders out here! I have a list full of strings (doubledList) and I want to create a new list containing every unique string from the first list exactly once(UniqueList). My solution is this (C#):

        private List LoadUniqueList(List doubledList)
        {
        List UniqueList = new List();
        foreach (string item in doubledList)
        {
        bool x = true;
        foreach (string compare in UniqueList)
        {
        if (item == compare)
        {
        x = false;
        }
        if (!x) break;
        }
        if (x) UniqueList.Add(item);
        }
        return UniqueList;
        }

        Can you make it lighter, faster, sharper? :java::java::java: X| Personally, I have no idea if it's even possible; otherwise it wouldn't be very interesting for me of course. Also, do you think it's a good idea to use this part of the forum similar challenges? I think that starting with a working solution for a generic problem and then discussing whether you can improve it would be a great way to create more learning opportunities.

        Giraffes are not real.

        L Offline
        L Offline
        Luc Pattyn
        wrote on last edited by
        #3

        You're in the wrong forum. This one is for showing clever code, not for asking questions. BTW: IMO a fast solution would be based on a Dictionary (since 2.0) or a HashSet (since 3.5). :)

        Luc Pattyn [My Articles] Nil Volentibus Arduum

        J P 0 3 Replies Last reply
        0
        • L Luc Pattyn

          You're in the wrong forum. This one is for showing clever code, not for asking questions. BTW: IMO a fast solution would be based on a Dictionary (since 2.0) or a HashSet (since 3.5). :)

          Luc Pattyn [My Articles] Nil Volentibus Arduum

          J Offline
          J Offline
          Jorgen Andersson
          wrote on last edited by
          #4

          A hashset might not be the best choice to find doubled items since:

          MSDN wrote:

          The HashSet(Of T) class provides high-performance set operations. A set is a collection that contains no duplicate elements, and whose elements are in no particular order.

          Light moves faster than sound. That is why some people appear bright, until you hear them speak. List of common misconceptions

          P L 2 Replies Last reply
          0
          • 0 0bx

            To all the clever coders out here! I have a list full of strings (doubledList) and I want to create a new list containing every unique string from the first list exactly once(UniqueList). My solution is this (C#):

            private List LoadUniqueList(List doubledList)
            {
            List UniqueList = new List();
            foreach (string item in doubledList)
            {
            bool x = true;
            foreach (string compare in UniqueList)
            {
            if (item == compare)
            {
            x = false;
            }
            if (!x) break;
            }
            if (x) UniqueList.Add(item);
            }
            return UniqueList;
            }

            Can you make it lighter, faster, sharper? :java::java::java: X| Personally, I have no idea if it's even possible; otherwise it wouldn't be very interesting for me of course. Also, do you think it's a good idea to use this part of the forum similar challenges? I think that starting with a working solution for a generic problem and then discussing whether you can improve it would be a great way to create more learning opportunities.

            Giraffes are not real.

            J Offline
            J Offline
            Jorgen Andersson
            wrote on last edited by
            #5

            UniqueList = doubledList.distinct

            Light moves faster than sound. That is why some people appear bright, until you hear them speak. List of common misconceptions

            1 Reply Last reply
            0
            • J Jorgen Andersson

              A hashset might not be the best choice to find doubled items since:

              MSDN wrote:

              The HashSet(Of T) class provides high-performance set operations. A set is a collection that contains no duplicate elements, and whose elements are in no particular order.

              Light moves faster than sound. That is why some people appear bright, until you hear them speak. List of common misconceptions

              P Offline
              P Offline
              PIEBALDconsult
              wrote on last edited by
              #6

              Exactly. Iterate the input List, add new items to the HashSet and the result List.

              Z J 2 Replies Last reply
              0
              • L Luc Pattyn

                You're in the wrong forum. This one is for showing clever code, not for asking questions. BTW: IMO a fast solution would be based on a Dictionary (since 2.0) or a HashSet (since 3.5). :)

                Luc Pattyn [My Articles] Nil Volentibus Arduum

                P Offline
                P Offline
                PIEBALDconsult
                wrote on last edited by
                #7

                Luc Pattyn wrote:

                You're in the wrong forum.

                Correct, this should be a Friday Programming Challenge.

                1 Reply Last reply
                0
                • J Jorgen Andersson

                  A hashset might not be the best choice to find doubled items since:

                  MSDN wrote:

                  The HashSet(Of T) class provides high-performance set operations. A set is a collection that contains no duplicate elements, and whose elements are in no particular order.

                  Light moves faster than sound. That is why some people appear bright, until you hear them speak. List of common misconceptions

                  L Offline
                  L Offline
                  Luc Pattyn
                  wrote on last edited by
                  #8

                  if (!hashSet.Contains(candidate)) hashSet.Add(candidate); takes full advantage of hashing and collecting, much faster than other collections' Contains method. [ADDED] And the test is redundant, Add() already does it, so hashSet.Add(candidate); suffices [/ADDED] :)

                  Luc Pattyn [My Articles] Nil Volentibus Arduum

                  J 1 Reply Last reply
                  0
                  • L Luc Pattyn

                    if (!hashSet.Contains(candidate)) hashSet.Add(candidate); takes full advantage of hashing and collecting, much faster than other collections' Contains method. [ADDED] And the test is redundant, Add() already does it, so hashSet.Add(candidate); suffices [/ADDED] :)

                    Luc Pattyn [My Articles] Nil Volentibus Arduum

                    J Offline
                    J Offline
                    Jorgen Andersson
                    wrote on last edited by
                    #9

                    :doh: :sigh:

                    Light moves faster than sound. That is why some people appear bright, until you hear them speak. List of common misconceptions

                    1 Reply Last reply
                    0
                    • L Luc Pattyn

                      You're in the wrong forum. This one is for showing clever code, not for asking questions. BTW: IMO a fast solution would be based on a Dictionary (since 2.0) or a HashSet (since 3.5). :)

                      Luc Pattyn [My Articles] Nil Volentibus Arduum

                      0 Offline
                      0 Offline
                      0bx
                      wrote on last edited by
                      #10

                      I know... but the question forums are primarily used for showing code that doesn't work, so I just wanted to throw a stone in the pond. BTW: I just realized that I know you personally. We both play in the same chess club. :-D

                      Giraffes are not real.

                      L 1 Reply Last reply
                      0
                      • 0 0bx

                        I know... but the question forums are primarily used for showing code that doesn't work, so I just wanted to throw a stone in the pond. BTW: I just realized that I know you personally. We both play in the same chess club. :-D

                        Giraffes are not real.

                        L Offline
                        L Offline
                        Luc Pattyn
                        wrote on last edited by
                        #11

                        You could have wrinkled the C# pond; the programming forums are for discussions, not just for "help, my code fails" kind of questions. CU.

                        Luc Pattyn [My Articles] Nil Volentibus Arduum

                        1 Reply Last reply
                        0
                        • 0 0bx

                          To all the clever coders out here! I have a list full of strings (doubledList) and I want to create a new list containing every unique string from the first list exactly once(UniqueList). My solution is this (C#):

                          private List LoadUniqueList(List doubledList)
                          {
                          List UniqueList = new List();
                          foreach (string item in doubledList)
                          {
                          bool x = true;
                          foreach (string compare in UniqueList)
                          {
                          if (item == compare)
                          {
                          x = false;
                          }
                          if (!x) break;
                          }
                          if (x) UniqueList.Add(item);
                          }
                          return UniqueList;
                          }

                          Can you make it lighter, faster, sharper? :java::java::java: X| Personally, I have no idea if it's even possible; otherwise it wouldn't be very interesting for me of course. Also, do you think it's a good idea to use this part of the forum similar challenges? I think that starting with a working solution for a generic problem and then discussing whether you can improve it would be a great way to create more learning opportunities.

                          Giraffes are not real.

                          E Offline
                          E Offline
                          ekolis
                          wrote on last edited by
                          #12

                          private List LoadUniqueList(List doubledList)
                          {
                          return doubledList.Distinct().ToList();
                          }

                          LINQ is fun :D And as for using this part of the forum for challenges, I think it's a great idea - hardly anyone ever posts here to begin with, since people don't usually look at their own code and say "Damn, that was clever - I need to share it with someone!"

                          1 Reply Last reply
                          0
                          • 0 0bx

                            To all the clever coders out here! I have a list full of strings (doubledList) and I want to create a new list containing every unique string from the first list exactly once(UniqueList). My solution is this (C#):

                            private List LoadUniqueList(List doubledList)
                            {
                            List UniqueList = new List();
                            foreach (string item in doubledList)
                            {
                            bool x = true;
                            foreach (string compare in UniqueList)
                            {
                            if (item == compare)
                            {
                            x = false;
                            }
                            if (!x) break;
                            }
                            if (x) UniqueList.Add(item);
                            }
                            return UniqueList;
                            }

                            Can you make it lighter, faster, sharper? :java::java::java: X| Personally, I have no idea if it's even possible; otherwise it wouldn't be very interesting for me of course. Also, do you think it's a good idea to use this part of the forum similar challenges? I think that starting with a working solution for a generic problem and then discussing whether you can improve it would be a great way to create more learning opportunities.

                            Giraffes are not real.

                            P Offline
                            P Offline
                            Philippe Mori
                            wrote on last edited by
                            #13

                            Such an algorithm is 0(n²) thus for large collection, it will be very slow. Each time, you double the collection size, it is 4 times slower. Using an HastSet or a SortedSet or a Dictionary would be much faster for large collections. The order would then be about O(n) with hashing or O(n log(n)) with Dictionary (binary search). Another alternative would be to sort the list if the order does not matters. It is then easier to skip skip or remove duplicates.

                            Philippe Mori

                            P C 2 Replies Last reply
                            0
                            • 0 0bx

                              To all the clever coders out here! I have a list full of strings (doubledList) and I want to create a new list containing every unique string from the first list exactly once(UniqueList). My solution is this (C#):

                              private List LoadUniqueList(List doubledList)
                              {
                              List UniqueList = new List();
                              foreach (string item in doubledList)
                              {
                              bool x = true;
                              foreach (string compare in UniqueList)
                              {
                              if (item == compare)
                              {
                              x = false;
                              }
                              if (!x) break;
                              }
                              if (x) UniqueList.Add(item);
                              }
                              return UniqueList;
                              }

                              Can you make it lighter, faster, sharper? :java::java::java: X| Personally, I have no idea if it's even possible; otherwise it wouldn't be very interesting for me of course. Also, do you think it's a good idea to use this part of the forum similar challenges? I think that starting with a working solution for a generic problem and then discussing whether you can improve it would be a great way to create more learning opportunities.

                              Giraffes are not real.

                              S Offline
                              S Offline
                              Stefan_Lang
                              wrote on last edited by
                              #14

                              I'd concatenate the lists ( O(1); or O(n) if you must copy them ), then sort it ( raising complexity to O(n*log(n) ) and then remove duplicates ( O(n) again ) That means your complexity will be ruled by your sorting algorithm: choose whichever algorithm fits your data best.

                              1 Reply Last reply
                              0
                              • P Philippe Mori

                                Such an algorithm is 0(n²) thus for large collection, it will be very slow. Each time, you double the collection size, it is 4 times slower. Using an HastSet or a SortedSet or a Dictionary would be much faster for large collections. The order would then be about O(n) with hashing or O(n log(n)) with Dictionary (binary search). Another alternative would be to sort the list if the order does not matters. It is then easier to skip skip or remove duplicates.

                                Philippe Mori

                                P Offline
                                P Offline
                                pchinery
                                wrote on last edited by
                                #15

                                I don't agree that hashing would make it O(n), as it requires to handle hash collisions. This if course is not a real problem for average use cases, but after all that's not what the Landau thing is about, right? ;-) Usually, this would end up taking O(n log(n)) time, even with hashing. The main difference would be the constant that is involved. Of course it is hard to tell, but I would think that the HashSet would provide highly optimized code to do just this and it would be recommended to just use that.

                                1 Reply Last reply
                                0
                                • 0 0bx

                                  To all the clever coders out here! I have a list full of strings (doubledList) and I want to create a new list containing every unique string from the first list exactly once(UniqueList). My solution is this (C#):

                                  private List LoadUniqueList(List doubledList)
                                  {
                                  List UniqueList = new List();
                                  foreach (string item in doubledList)
                                  {
                                  bool x = true;
                                  foreach (string compare in UniqueList)
                                  {
                                  if (item == compare)
                                  {
                                  x = false;
                                  }
                                  if (!x) break;
                                  }
                                  if (x) UniqueList.Add(item);
                                  }
                                  return UniqueList;
                                  }

                                  Can you make it lighter, faster, sharper? :java::java::java: X| Personally, I have no idea if it's even possible; otherwise it wouldn't be very interesting for me of course. Also, do you think it's a good idea to use this part of the forum similar challenges? I think that starting with a working solution for a generic problem and then discussing whether you can improve it would be a great way to create more learning opportunities.

                                  Giraffes are not real.

                                  M Offline
                                  M Offline
                                  M Hussain
                                  wrote on last edited by
                                  #16

                                  I bet it will be faster than everything.

                                  UniqueList.Add((from string itm in UniqueList
                                  from string compareItem in doubledList
                                  where itm == compareItem
                                  select compareItem).ToString());

                                  1 Reply Last reply
                                  0
                                  • 0 0bx

                                    To all the clever coders out here! I have a list full of strings (doubledList) and I want to create a new list containing every unique string from the first list exactly once(UniqueList). My solution is this (C#):

                                    private List LoadUniqueList(List doubledList)
                                    {
                                    List UniqueList = new List();
                                    foreach (string item in doubledList)
                                    {
                                    bool x = true;
                                    foreach (string compare in UniqueList)
                                    {
                                    if (item == compare)
                                    {
                                    x = false;
                                    }
                                    if (!x) break;
                                    }
                                    if (x) UniqueList.Add(item);
                                    }
                                    return UniqueList;
                                    }

                                    Can you make it lighter, faster, sharper? :java::java::java: X| Personally, I have no idea if it's even possible; otherwise it wouldn't be very interesting for me of course. Also, do you think it's a good idea to use this part of the forum similar challenges? I think that starting with a working solution for a generic problem and then discussing whether you can improve it would be a great way to create more learning opportunities.

                                    Giraffes are not real.

                                    L Offline
                                    L Offline
                                    Lilith C
                                    wrote on last edited by
                                    #17

                                    Usually I sort the list then iterate through the list, outputting each change. When a change is found its value is used to compare for the next change detection. This code outputs to a string.

                                    iList.Sort();
                                    txtOutput.Text = "";
                                    foreach (string str in iList)
                                    {
                                    if (str != lastEntry)
                                    {
                                    txtOutput.Text += (str + "\r\n");
                                    lastEntry = str;
                                    }
                                    }

                                    And, yes, I know that StringBuilder might be more appropriate for some lists.

                                    I'm not a programmer but I play one at the office

                                    1 Reply Last reply
                                    0
                                    • P Philippe Mori

                                      Such an algorithm is 0(n²) thus for large collection, it will be very slow. Each time, you double the collection size, it is 4 times slower. Using an HastSet or a SortedSet or a Dictionary would be much faster for large collections. The order would then be about O(n) with hashing or O(n log(n)) with Dictionary (binary search). Another alternative would be to sort the list if the order does not matters. It is then easier to skip skip or remove duplicates.

                                      Philippe Mori

                                      C Offline
                                      C Offline
                                      Climate Turnip
                                      wrote on last edited by
                                      #18

                                      Its always dangerous to assume that all performance issues can be assessed using big O notation as an evaluation method, particularly when not comparing apples with apples in terms of implementation specifics. Hashing resolves down to numeric comparison as opposed to String evaluation but involves a hashing function overhead, these are important factors, there are also allocation issues particular to any implementation. Trivialising performance evaluation in this way where growth rate is a factor is always a dangerous proposition.

                                      1 Reply Last reply
                                      0
                                      • 0 0bx

                                        To all the clever coders out here! I have a list full of strings (doubledList) and I want to create a new list containing every unique string from the first list exactly once(UniqueList). My solution is this (C#):

                                        private List LoadUniqueList(List doubledList)
                                        {
                                        List UniqueList = new List();
                                        foreach (string item in doubledList)
                                        {
                                        bool x = true;
                                        foreach (string compare in UniqueList)
                                        {
                                        if (item == compare)
                                        {
                                        x = false;
                                        }
                                        if (!x) break;
                                        }
                                        if (x) UniqueList.Add(item);
                                        }
                                        return UniqueList;
                                        }

                                        Can you make it lighter, faster, sharper? :java::java::java: X| Personally, I have no idea if it's even possible; otherwise it wouldn't be very interesting for me of course. Also, do you think it's a good idea to use this part of the forum similar challenges? I think that starting with a working solution for a generic problem and then discussing whether you can improve it would be a great way to create more learning opportunities.

                                        Giraffes are not real.

                                        B Offline
                                        B Offline
                                        Brian L Parker
                                        wrote on last edited by
                                        #19

                                        public List<String> RemoveDoubles(List<string> doublelist) { return doublelist.Distinct().ToList(); }

                                        1 Reply Last reply
                                        0
                                        • 0 0bx

                                          To all the clever coders out here! I have a list full of strings (doubledList) and I want to create a new list containing every unique string from the first list exactly once(UniqueList). My solution is this (C#):

                                          private List LoadUniqueList(List doubledList)
                                          {
                                          List UniqueList = new List();
                                          foreach (string item in doubledList)
                                          {
                                          bool x = true;
                                          foreach (string compare in UniqueList)
                                          {
                                          if (item == compare)
                                          {
                                          x = false;
                                          }
                                          if (!x) break;
                                          }
                                          if (x) UniqueList.Add(item);
                                          }
                                          return UniqueList;
                                          }

                                          Can you make it lighter, faster, sharper? :java::java::java: X| Personally, I have no idea if it's even possible; otherwise it wouldn't be very interesting for me of course. Also, do you think it's a good idea to use this part of the forum similar challenges? I think that starting with a working solution for a generic problem and then discussing whether you can improve it would be a great way to create more learning opportunities.

                                          Giraffes are not real.

                                          K Offline
                                          K Offline
                                          KP Lee
                                          wrote on last edited by
                                          #20

                                          You've been given several options to use pre-existing code. Makes it easier, doesn't necessarily make it faster. I can certainly see how to make your code slightly faster:

                                                  foreach (string item in doubledList)
                                                 {
                                                      bool x = true;
                                                      foreach (string compare in UniqueList)
                                                      {
                                                          if (item == compare)
                                                          {
                                                              x = false;
                                                              break;
                                                          }
                                                      }
                                                      if (x) UniqueList.Add(item);
                                                  }
                                          

                                          There is one less if statement needed. The break command knows it is breaking the inner foreach, not the if statement it is in. If you have 100 items with one duplicate, the first outer loop saves nothing, after that, you've removed an if check for every inner loop executed. If the first and last are duplicates you've saved 4800+ if checks. (First outer loop never executes an inner loop, every loop after that executes the inner 1 less than the outer ones taken except the last loop that takes just one inner loop. My math may be wrong, but I'm sure that saves over 4800 if tests. 98/2=49 The last full inner loop will be for 98 times, the prior loop would be 97 and the first inner loop is 1 added together is 98, then 96+2 is 98, etc for 49 times 98 inner loops were taken. 49*98=4802 plus 1 for the 100th outer loop.) Maybe a sorted unique list could be faster? I'll think about that. Is a sorted list OK?

                                          1 Reply Last reply
                                          0
                                          Reply
                                          • Reply as topic
                                          Log in to reply
                                          • Oldest to Newest
                                          • Newest to Oldest
                                          • Most Votes


                                          • Login

                                          • Don't have an account? Register

                                          • Login or register to search.
                                          • First post
                                            Last post
                                          0
                                          • Categories
                                          • Recent
                                          • Tags
                                          • Popular
                                          • World
                                          • Users
                                          • Groups