no duplicates in array

  • duta wrote (#1):

    Hi there. I have a file with a lot of sentences, and I need to make a dictionary of the words in that file. So far I've separated the words and sorted them using the Split() and Sort() methods. My problem is making a list without duplicate words. How can I do that?

    static int n = 0;
    public static string[] NoDuplicate(string[] array)
    {
        int i;
        string[] res = (string[])array.Clone();
        for (i = 0; i < array.Length-1; i++)
            if (array[i + 1] != array[i])
                res[n++] = (string)array[i];
        return res;
    }

    1) How can I make this neater? 2) I don't like this method because the result is initialized using Clone(), so its length is too big. Many thanks.
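
    For reference, here is a minimal sketch of how the adjacent-comparison approach in the question could be tightened, assuming the input array is already sorted as the post describes. It allocates a fresh result array instead of cloning, uses a local counter rather than the static n (which would carry over between calls), keeps the last element that the original loop skips, and trims the result with Array.Resize. The class name SortedDedup is made up for illustration.

```csharp
using System;

public static class SortedDedup
{
    // Assumes 'sorted' is already sorted, so equal words sit next to each other.
    public static string[] NoDuplicate(string[] sorted)
    {
        string[] res = new string[sorted.Length];
        int n = 0; // local counter, so repeated calls start from zero
        for (int i = 0; i < sorted.Length; i++)
        {
            // keep a word only if it differs from the one just before it
            if (i == 0 || sorted[i] != sorted[i - 1])
                res[n++] = sorted[i];
        }
        Array.Resize(ref res, n); // trim the unused tail
        return res;
    }
}
```

    Array.Resize has been available since .NET 2.0, so this should also work in VS2005.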

  • Giorgi Dalakishvili wrote (#2, in reply to duta):

    Store the words in a dictionary. Before adding a new word, check whether the dictionary already contains it.
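
    A rough sketch of that check-before-add idea, assuming a Dictionary<string, bool> whose keys act as the set of words and whose values are ignored. The names WordIndex and UniqueWords are hypothetical. It uses only .NET 2.0 features.

```csharp
using System;
using System.Collections.Generic;

public static class WordIndex
{
    // Hypothetical helper: collects each distinct word once, using the
    // dictionary keys as the set and ignoring the bool values.
    public static string[] UniqueWords(string[] words)
    {
        Dictionary<string, bool> seen = new Dictionary<string, bool>();
        foreach (string word in words)
        {
            // ContainsKey is a hash lookup, not a scan of every entry
            if (!seen.ContainsKey(word))
                seen.Add(word, true);
        }
        string[] result = new string[seen.Count];
        seen.Keys.CopyTo(result, 0);
        Array.Sort(result, StringComparer.Ordinal); // sorted, as the question wants
        return result;
    }
}
```

    The final sort is there because a Dictionary does not promise any particular key order.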

  • realJSOP wrote (#3, in reply to duta):

    First, I'd use a List instead of an array because a List can grow dynamically and will only be as large as required to store your data. To avoid duplicates, you could do this:

    List<string> dictionary = new List<string>();

    // get the next sentence to process (you have to write
    // the GetNextSentence function)
    string sentence = GetNextSentence();
    // split the sentence into words
    string[] words = sentence.ToLower().Split(' ');
    // for each word
    for (int i = 0; i < words.Length; i++)
    {
        // if it's not already in the dictionary (checked with
        // the List.Contains method)
        if (!dictionary.Contains(words[i]))
        {
            // add it to the dictionary
            dictionary.Add(words[i]);
        }
    }

  • Guffa wrote (#4, in reply to realJSOP):

    Instead of using a List and calling it a dictionary, use a real Dictionary. The Dictionary.ContainsKey method is a lot faster than the List.Contains method.

  • DaveyM69 wrote (#5, in reply to duta):

    Ignore what I posted before. Simon's HashSet is perfect, and it has ToArray if you need it. Learned something new today :-D

  • J4amieC wrote (#6, in reply to Guffa):

    Why would you need a key and a value just to store a word, though? I would've gone with John's suggestion myself.

  • Simon P Stevens wrote (#7, in reply to J4amieC):

    The difference is the style of the storage object. A list is just that: an unsorted, sequential list of items. To do a Contains() operation on it, you have to iterate through the list and check every item. A dictionary, on the other hand, is a form of hash table, so a ContainsKey() operation only has to hash the key and check whether it already exists. In .NET 3.5 you could instead consider a HashSet<String>, which is specifically optimised for sets containing no duplicates.

  • Simon P Stevens wrote (#8, in reply to duta):

    Take a look at the HashSet<String> class (.NET 3.5 or later). It provides an optimised hash collection that doesn't allow duplicates (it just ignores attempts to add them), and you can call ToArray() when you are done with it if you really need a string array.
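
    A small sketch of that suggestion, assuming .NET 3.5 or later. The class and method names (UniqueWords, Distinct) are invented for the example. Note that ToArray() here is the LINQ extension method, so it needs using System.Linq.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UniqueWords
{
    // Hypothetical helper: returns the distinct words of 'words' as an array.
    public static string[] Distinct(string[] words)
    {
        HashSet<string> set = new HashSet<string>();
        foreach (string word in words)
        {
            // Add returns false (and changes nothing) if the word is already present
            set.Add(word);
        }
        // ToArray is the LINQ extension method from System.Linq
        return set.ToArray();
    }
}
```

    A HashSet does not guarantee any ordering, so if the word list has to be sorted, call Array.Sort on the returned array afterwards.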

  • DaveyM69 wrote (#9, in reply to Simon P Stevens):

    Nice find, Simon. I hadn't come across this one before... always good to learn something new :-D

  • J4amieC wrote (#10, in reply to Simon P Stevens):

    Thanks for the info.

  • duta wrote (#11, in reply to DaveyM69):

    HashSet is only in .NET Framework 3.5 :( and I'm using VS2005 :(( But the advice is great, thanks to all.
