Code Project

Using threads to process a collection [modified]

    LetMeFinclOut wrote (#1):

    I'm looking to add threading support to an application I'm working on, and am looking for advice on how to accomplish this. I have a collection of objects which are updated through a foreach loop. Instead, I'd like to hand this work to a set of threads, where each object is passed in for processing as a thread becomes available. I've come up with 2 approaches for doing this.

    Method 1: Using a WaitHandle
    Give each thread an AutoResetEvent and rely on the WaitHandle.WaitAny() method to identify when a thread is finished so a new one can be created. Link: MSDN[^]
    Pseudocode:

    void QueueThreads<T>(IEnumerable<T> myCollection, uint maxThreads)
    {
        uint usedThreads = 0;
        //Count() is the LINQ extension method; a plain IEnumerable has no Count property
        AutoResetEvent[] autoEvents = new AutoResetEvent[Math.Min(maxThreads, myCollection.Count())];

        foreach (T myObject in myCollection)
        {
            if (usedThreads < maxThreads)
            {
                autoEvents[usedThreads] = new AutoResetEvent(false);
                //Start a DoWork() thread, passing myObject and autoEvents[usedThreads]
                usedThreads++;
            }
            else
            {
                //WaitAny() returns the index of the signaled handle as an int
                int freeThreadID = WaitHandle.WaitAny(autoEvents);
                usedThreads--;
                //Start a DoWork() thread, passing myObject and autoEvents[freeThreadID]
                usedThreads++;
            }
        }
        WaitHandle.WaitAll(autoEvents);
    }

    void DoWork<T>(T myObject, AutoResetEvent are)
    {
        //Do stuff with myObject
        //Save results to file/database
        are.Set();
    }

    Method 2: Sharing the enumerator across threads
    Give each thread access to the enumerator used to iterate over the collection. Each thread stays alive until all items in the collection have been exhausted.
    Pseudocode:

    void QueueThreads<T>(IEnumerable<T> myCollection, uint maxThreads)
    {
        IEnumerator<T> myCollectionEnum = myCollection.GetEnumerator();

        //"<" rather than "<=" so that no extra thread is started
        for (uint i = 0; i < Math.Min(myCollection.Count(), maxThreads); i++)
        {
            //Start a DoWork() thread, passing myCollectionEnum
        }
        //Find some way to wait until all threads are done
        myCollectionEnum.Reset();
    }

    void DoWork<T>(IEnumerator<T> myColEn)
    {
        do
        {
            T myObject;
            lock (myColEn)
            {
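
The pseudocode for Method 2 is cut off above, so what follows is not the poster's code. It is only a minimal sketch of how a shared-enumerator worker loop could look, assuming the collection does not change while the threads run. The names `SharedEnumeratorDemo`, `ProcessAll`, `gate` and the `process` callback are hypothetical, introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class SharedEnumeratorDemo
{
    // Processes every item on up to 'maxThreads' threads that share one enumerator.
    // 'process' is a hypothetical callback standing in for the real per-item work.
    public static void ProcessAll<T>(IEnumerable<T> items, int maxThreads, Action<T> process)
    {
        if (maxThreads < 1) throw new ArgumentOutOfRangeException(nameof(maxThreads));

        using (IEnumerator<T> shared = items.GetEnumerator())
        {
            object gate = new object();
            var workers = new List<Thread>();

            for (int i = 0; i < maxThreads; i++)
            {
                var worker = new Thread(() =>
                {
                    while (true)
                    {
                        T item;
                        // MoveNext() and Current must be read as one step under the lock,
                        // otherwise two threads could pick up the same element.
                        lock (gate)
                        {
                            if (!shared.MoveNext()) return; // nothing left: the thread ends
                            item = shared.Current;
                        }
                        process(item); // the slow work runs outside the lock
                    }
                });
                workers.Add(worker);
                worker.Start();
            }

            // Join() is one way to "wait until all threads are done".
            foreach (Thread t in workers) t.Join();
        }
    }
}
```

A call such as `SharedEnumeratorDemo.ProcessAll(machines, 10, Collect)` (with `machines` and `Collect` standing in for the real collection and work method) would keep ten threads busy until the enumerator is exhausted. An exception thrown by `process` is not handled in this sketch.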

      Luc Pattyn wrote (#2):

      Hi, here are my thoughts on the matter:

      1. I looked at the number of state variables you were using. Method #1 tries to keep track of the number of threads, and needs a number of AutoResetEvents. Method #2 uses an enumerator and a lock; to me that is much simpler to write, to understand, and to debug, and hence the preferred way of the two. With the API you have chosen (taking an IEnumerable myCollection) it probably is the best one can do. Not tested: if myCollection were an IList, you wouldn't need the enumerator and could simply use indexing, which also implies you wouldn't need the lock; a simple Interlocked.Increment() would suffice. All of the above assumes your collection does not change while the jobs are being executed; if the workload is dynamic, I recommend using a real queue, and a lock of course.

      2. Choosing the kind of thread should be easy: BackgroundWorker is based on ThreadPool; if the progress and completion features are appealing, prefer BGW over TP, otherwise don't. There is however one big caveat: ThreadPool has its own mind about how many threads will actually be used. When you queue 20 jobs, you won't necessarily get 20 threads from the pool right away (other things may be going on in the ThreadPool); when the pool is too busy, it will get extended (within limits), and there is a complex algorithm in place that extends the pool with up to 2 threads per second if necessary; I don't know exactly what rule applies to retiring threads from the pool. This really boils down to your uint maxThreads parameter possibly having limited influence on what is really going to happen. Using instances of Thread puts you fully in charge. On top of that, whatever you do, determining the optimal number of threads isn't very easy, as they need to be mapped onto the number of cores in the system. I typically launch no more than N (and never more than 2*N) identical jobs on a system with N cores (see Environment.ProcessorCount).

      3. The value of multi-threading is greatest when the performance-limiting factor is computational; threads add processing power to your app, as long as you don't exceed the processor count (i.e. typically Task Manager shows less than 99% CPU load). Quite often adding threads to an app moves the performance bottleneck to some other part of the system: maybe the cache efficiency goes way down and the memory bus bandwidth becomes critical; maybe the disk or network bandwidth becomes the major factor, etc. I recommend making the number of threads a variable, and experimenti
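
The IList idea from point 1 was posted as "not tested", so here is a minimal sketch of how it might look, assuming the list is not modified while the workers run. The names `IndexedWorkers`, `ProcessAll` and the `process` callback are hypothetical; the real calls are `Interlocked.Increment`, `Thread` and `Thread.Join`.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class IndexedWorkers
{
    // Each worker claims the next free index atomically; no lock and no enumerator needed.
    // 'process' is a hypothetical callback standing in for the real per-item work.
    public static void ProcessAll<T>(IList<T> items, int threadCount, Action<T> process)
    {
        if (threadCount < 1) throw new ArgumentOutOfRangeException(nameof(threadCount));

        int next = -1; // index of the most recently claimed item
        var workers = new Thread[Math.Min(threadCount, items.Count)];

        for (int i = 0; i < workers.Length; i++)
        {
            workers[i] = new Thread(() =>
            {
                while (true)
                {
                    // Interlocked.Increment returns the incremented value,
                    // so every thread gets a distinct index.
                    int index = Interlocked.Increment(ref next);
                    if (index >= items.Count) return; // list exhausted
                    process(items[index]);
                }
            });
            workers[i].Start();
        }

        foreach (Thread t in workers) t.Join(); // wait for all workers to finish
    }
}
```

Following point 2, a call could pass `Environment.ProcessorCount` as `threadCount` for CPU-bound work; making that number a parameter keeps it easy to experiment with.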

        Not Active wrote (#3):

        The question to ask is why you are trying to use multiple threads: what are you expecting to gain from it? If it is increased performance, you may not get what you are looking for, since the bottleneck would most likely be at the file I/O or database level.


        I know the language. I've read a book. - _Madmatt

          LetMeFinclOut wrote (#4):

          Essentially, this process collects data from a list of networked machines. Most of the latency comes from locating each machine and then establishing a connection. The threading is so I can continue to process other machines while one is waiting to connect.

            Luc Pattyn wrote (#5):

            IIRC the number of concurrent TCP/IP connections you can establish is limited to a few tens, which wouldn't be a problem, unless you try and connect to a lot of machines that are absent or off, as those connection attempts would hit a time-out (possibly 1 minute). BTW: using lots of threads for actions that are mostly blocking (waiting on something) rather than computing, is quite wasteful. A better approach could well be to apply asynchronous operations (similar to serving all WinForms Controls from a single thread). :)

            Luc Pattyn [Forum Guidelines] [My Articles] Nil Volentibus Arduum

            Please use <PRE> tags for code snippets, they preserve indentation, improve readability, and make me actually look at the code.

              LetMeFinclOut wrote (#6):

              Yes, I do expect that I will get a timeout for many of the connections. I'm starting with 10 threads, but will probably adjust that number once I get a better idea how the threads perform.

              Luc Pattyn wrote:

              BTW: using lots of threads for actions that are mostly blocking (waiting on something) rather than computing, is quite wasteful. A better approach could well be to apply asynchronous operations (similar to serving all WinForms Controls from a single thread).

              There's a difference? :^) I've always understood "asynchronous programming" to be just a different term for threading.

                Luc Pattyn wrote (#7):

                Of course there is a difference, a rather big one.

                In multi-threading you write code in sequential mode: let's do A, then wait for B, then do C, then wait for D, etc. That is easy to write, as the program counter is your main state variable; however, each thread needs a stack (so it can remember where it is when nesting functions/methods), and the operating system has to switch between threads all the time.

                In asynchronous operation, you could have just one thread; the workload is split into small jobs, and the thread executes these jobs one at a time, to completion. That is how Windows does a WinForms GUI: you have a button handler, a paint handler, etc. Now if you need to split a lengthy operation over multiple handlers, you need some state variables; you basically build a state machine, which is event-driven rather than progressing based on the program counter.

                Consider a serial port receiver: with its own thread, you can write something like: now I expect this, then I wait, then I should get some acknowledgement, then wait, then get a length, then some amount of data, etc. With a DataReceived handler, your handler has to reflect on "what am I getting now" and "where was I in the protocol", and make those match up. :)
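
To make the event-driven state machine concrete, here is a minimal sketch of a receiver that tracks "where was I in the protocol" in a state variable. The framing (an ACK byte 0x06, one length byte, then the payload) is invented purely for illustration, and `FrameReceiver`, `OnDataReceived` and `FrameCompleted` are hypothetical names, not part of any serial port API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical framing, invented for this sketch: ACK (0x06), one length byte, then the payload.
public sealed class FrameReceiver
{
    private enum State { ExpectAck, ExpectLength, ExpectPayload }

    private State state = State.ExpectAck;   // "where was I in the protocol"
    private int remaining;                    // payload bytes still expected
    private readonly List<byte> payload = new List<byte>();

    // Raised once per complete frame, with a copy of its payload.
    public event Action<byte[]> FrameCompleted;

    // Call this from a data-received handler with whatever bytes have arrived;
    // a frame may be spread over any number of calls.
    public void OnDataReceived(byte[] chunk)
    {
        foreach (byte b in chunk)
        {
            switch (state)
            {
                case State.ExpectAck:
                    if (b == 0x06) state = State.ExpectLength; // other bytes are ignored
                    break;
                case State.ExpectLength:
                    remaining = b;
                    payload.Clear();
                    if (remaining == 0) Complete(); else state = State.ExpectPayload;
                    break;
                case State.ExpectPayload:
                    payload.Add(b);
                    if (--remaining == 0) Complete();
                    break;
            }
        }
    }

    private void Complete()
    {
        state = State.ExpectAck;
        FrameCompleted?.Invoke(payload.ToArray());
    }
}
```

The handler never waits: it consumes the bytes it is given, updates its state, and returns, which is why a single thread can serve many such receivers. The threaded version of the same protocol would instead be a straight-line sequence of blocking reads.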


                  BobJanova wrote (#8):

                  Yes, it seems that the asynchronous network access in the framework (Socket.BeginReceive etc.) is the way forward here.
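
As a rough illustration of that direction, the sketch below uses `Socket.BeginConnect`, the connect-side sibling of `Socket.BeginReceive`, to start every connection attempt from one thread and then collect the results under a shared deadline. It is only a sketch under stated assumptions: `AsyncProbe`, `FindReachable`, the endpoint list and the timeout are hypothetical, and a real data collector would go on to issue `BeginReceive` calls on the sockets that connected instead of closing them.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;

public static class AsyncProbe
{
    // Starts one asynchronous connect per endpoint without dedicating a thread to each,
    // then gives all of them 'timeout' in total to finish.
    // Returns the endpoints that accepted a connection in time.
    public static List<IPEndPoint> FindReachable(IList<IPEndPoint> endpoints, TimeSpan timeout)
    {
        var pending = new List<Tuple<Socket, IAsyncResult, IPEndPoint>>();
        foreach (IPEndPoint ep in endpoints)
        {
            var socket = new Socket(ep.AddressFamily, SocketType.Stream, ProtocolType.Tcp);
            // BeginConnect returns immediately; the attempt proceeds in the background.
            pending.Add(Tuple.Create(socket, socket.BeginConnect(ep, null, null), ep));
        }

        var reachable = new List<IPEndPoint>();
        DateTime deadline = DateTime.UtcNow + timeout;
        foreach (var attempt in pending)
        {
            TimeSpan left = deadline - DateTime.UtcNow;
            if (left < TimeSpan.Zero) left = TimeSpan.Zero;

            bool finished = attempt.Item2.AsyncWaitHandle.WaitOne(left);
            try
            {
                if (finished)
                {
                    attempt.Item1.EndConnect(attempt.Item2); // throws if the connect failed
                    reachable.Add(attempt.Item3);
                }
            }
            catch (SocketException)
            {
                // Refused or unreachable: treated as "not reachable" in this sketch.
            }
            finally
            {
                attempt.Item1.Close(); // also abandons attempts that timed out
            }
        }
        return reachable;
    }
}
```

Because all attempts run concurrently, machines that are absent or switched off cost one shared timeout rather than one timeout each.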
