Find unique strings for a string array

  • G George_George

    Hello everyone, I have a string array that may contain duplicate strings. Is there a built-in or smart way to remove the duplicates and produce a string array that contains only the unique ones? For example, if the input array is {"abc", "bcd", "abc"}, the unique output array is {"abc", "bcd"}. Thanks in advance, George

    Mark Churchill
    wrote on last edited by
    #15

    Not sure why people are saying there isn't a built-in set class. Use HashSet<string>. Inserting a value and checking for an existing one are each roughly O(1), so de-duplicating n strings is roughly O(n) overall. It has extension methods on it for doing LINQ-y kinds of things. I also noticed a lot of people said "use LINQ!". LINQ does not make things run faster - it's not a magic replacement for Array.Find. It just makes your code look pretty, that's all :D

    Mark Churchill Director, Dunn & Churchill Pty Ltd Free Download: Diamond Binding: The simple, powerful, reliable, and effective data layer toolkit for Visual Studio.
    Alpha release: Entanglar: Transparant multiplayer framework for .Net games.
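For readers on .NET 3.5 or later, the HashSet<string> suggestion might look roughly like the sketch below. The names UniqueWithHashSet and Unique are made up for illustration. It relies on HashSet<T>.Add returning false for a value that is already in the set, and it collects into a List<string> so the first-seen order is kept.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; the names are illustrative only.
public static class UniqueWithHashSet
{
    // Returns the distinct strings of input in first-seen order.
    public static string[] Unique(string[] input)
    {
        HashSet<string> seen = new HashSet<string>();
        List<string> result = new List<string>();
        foreach (string s in input)
        {
            // Add returns false when s is already in the set.
            if (seen.Add(s))
            {
                result.Add(s);
            }
        }
        return result.ToArray();
    }
}
```

Called with {"abc", "bcd", "abc"}, this returns {"abc", "bcd"}.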

    • D Dragonfly_Lee

      You can use the Distinct method (an extension method) if you are using C# 3.0, and the code is quite simple: string[] strs = new string[] { "abc", "bcd", "abc" }; IEnumerable<string> newStrs = strs.Distinct(); Hope this helps. LuckyBoy
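As a self-contained sketch of the Distinct approach (assuming .NET 3.5 or later, where System.Linq is available), the call can be wrapped so that it returns an array again. The class name UniqueWithDistinct is hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical wrapper; requires a reference to System.Core (.NET 3.5+).
public static class UniqueWithDistinct
{
    // Distinct removes duplicates using the default string comparer;
    // ToArray turns the lazy sequence back into a string[].
    public static string[] Unique(string[] input)
    {
        return input.Distinct().ToArray();
    }
}
```

It is safer not to rely on the order of the result, so the caller should sort it if a particular order matters.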

      George_George
      wrote on last edited by
      #16

      LuckyBoy, Distinct belongs to .Net 3.5, and I have to use .Net 3.0. :-) Any ideas for .Net 3.0? regards, George

        PIEBALDconsult
        wrote on last edited by
        #17

        George_George wrote:

        I have to use .Net 3.0

        Then I can't suggest HashSet. :( But I can suggest my Set class. :-D

        • B Brij

          You have to iterate over it. You can use a generic collection for that.

          Cheers!! Brij

          Brij
          wrote on last edited by
          #18

          Write a custom function in which you create a generic list, as below.

          List<string> uniqueList = new List<string>();
          for (int i = 0; i < strarr.Length; i++)
          {
              // Contains takes the value itself; Exists would need a predicate.
              if (!uniqueList.Contains(strarr[i]))
              {
                  uniqueList.Add(strarr[i]);
              }
          }

          Now you'll have a list containing the unique elements. You can convert it to an array too, with uniqueList.ToArray();

          Cheers!! Brij

            George_George
            wrote on last edited by
            #19

            Thanks Mark, I think people mean there is no built-in single call for finding the unique strings. BTW: if LINQ is slow, why do people use LINQ? regards, George

              George_George
              wrote on last edited by
              #20

              What are the advantages of your Set class over the .Net Set class? regards, George

              • I Igor Velikorossov

                List<string> newArray = new List<string>();
                foreach (string token in yourArray)
                {
                    if (!newArray.Contains(token))
                    {
                        newArray.Add(token);
                    }
                }

                George_George
                wrote on last edited by
                #21

                Thanks Igor, I like your solution! :-) regards, George

                  Mark Churchill
                  wrote on last edited by
                  #22

                  *shrug* I think HashSet<T>.Add(T item) returning bool if it was unique is close enough. People use LINQ because it makes the code more readable. Generally CPU is cheap and good programmers aren't. It's OK to have a 10% overhead if your code is more reliable and easier to maintain as a result.

                  Mark Churchill Director, Dunn & Churchill Pty Ltd Free Download: Diamond Binding: The simple, powerful, reliable, and effective data layer toolkit for Visual Studio.
                  Alpha release: Entanglar: Transparant multiplayer framework for .Net games.

                    George_George
                    wrote on last edited by
                    #23

                    Thanks Brij! By "generic", do you mean List? regards, George

                      Igor Velikorossov
                      wrote on last edited by
                      #24

                      no worries ;)

                        N a v a n e e t h
                        wrote on last edited by
                        #25

                        Mark Churchill wrote:

                        Generally CPU is cheap and good programmers aren't

                        That's a good one :)

                        Navaneeth How to use google | Ask smart questions

                        • G George_George

                          Thanks Christian, what do you mean by "have a list inside"? I am talking about a string array, and I am not sure which list you are talking about. Could you show some pseudocode? regards, George

                          Christian Graus
                          wrote on last edited by
                          #26

                          public class Set<T>
                          {
                              // Initialised here so Add never hits a null list.
                              private List<T> theList = new List<T>();

                              public bool Add(T item)
                              {
                                  if (theList.Contains(item)) return false;
                                  theList.Add(item);
                                  return true;
                              }
                          }

                          This is the start of a set class, a container that holds only one of any object.

                          Christian Graus Driven to the arms of OSX by Vista.
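A variant of the same idea that avoids the linear List.Contains scan could keep the items as keys of a Dictionary<T, bool>, which has been available since .NET 2.0 and so should also work under the .NET 3.0 constraint mentioned earlier in the thread. This is only a sketch, not anyone's posted implementation: the name DictionarySet and its members are hypothetical, the stored bool values are ignored, and a null item would throw because dictionary keys cannot be null.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical set type backed by dictionary keys; the bool values are unused.
public class DictionarySet<T>
{
    private readonly Dictionary<T, bool> items = new Dictionary<T, bool>();

    // Returns true if the item was added, false if it was already present.
    public bool Add(T item)
    {
        if (items.ContainsKey(item)) return false;
        items.Add(item, true);
        return true;
    }

    public int Count
    {
        get { return items.Count; }
    }

    // Copies the unique items into a new array; the order is not specified.
    public T[] ToArray()
    {
        T[] result = new T[items.Count];
        items.Keys.CopyTo(result, 0);
        return result;
    }
}
```

For the original question, each string of the input array would be passed to Add, and ToArray would then give the unique strings.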

                            Brij
                            wrote on last edited by
                            #27

                            Generic doesn't mean only List. There are more, such as Dictionary, SortedList, Queue and Stack, but List suits your requirement best.

                            Cheers!! Brij

                              Dragonfly_Lee
                              wrote on last edited by
                              #28

                              George_George wrote:

                              BTW: if LINQ is slow, why people will use LINQ?

                              I think Mark is right. We can get the benefits of LINQ for integrating data and objects. But I am not sure about the performance of LINQ, though I do believe Microsoft will make great efforts to improve it.

                              LuckyBoy

                                Lost User
                                wrote on last edited by
                                #29

                                Sounds to me like you should be using a set data structure, if you never need the duplicate strings.

                                At university studying Software Engineering - if i say this line to girls i find they won't talk to me Dan

                                  George_George
                                  wrote on last edited by
                                  #30

                                  Thanks Mark, 1. "*shrug* I think HashSet<T>.Add(T item) returning bool if it was unique is close enough." -- I am still confused about why you think the built-in .Net HashSet is good enough. Any comments? 2. I know about LINQ but am not very experienced with it. I do not find it more readable; why do you think it is? 3. Is LINQ only 10% slower? In my experience it is much slower. Do you have any performance benchmark data? regards, George

                                    PIEBALDconsult
                                    wrote on last edited by
                                    #31

                                    Operators.

                                      George_George
                                      wrote on last edited by
                                      #32

                                      In my experience, and that of others I know, the performance of LINQ is bad. :-) What about yours? regards, George

                                        George_George
                                        wrote on last edited by
                                        #33

                                        Thanks Dan! But there is no such data structure in .Net, correct? regards, George

                                          George_George
                                          wrote on last edited by
                                          #34

                                          Thanks Christian, I like your idea. I am surprised that there is no built-in Set class in .Net. :-) regards, George
