
Looping through a CSV with threads

    Danpeking wrote (#1):

    Hi, I've recently started using Windows Forms and need to create many threads which loop through a given CSV file and pass each line to a method. I'm not sure of the best way of doing this. I can create a few threads, but every thread processes every line, rather than one line being processed by one thread and the next line by another. The first value on each line is a unique identifier, if that helps. I hope I've explained it OK; can anybody please advise? Thanks

      PIEBALDconsult wrote (#2, in reply to Danpeking):

      I would expect that one thread should read the file and place each line in a Queue. Then several other threads can read from the Queue and proceed from there.
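A rough sketch of that reader-plus-queue arrangement, for illustration only. It assumes .NET 4.0 or later (for `BlockingCollection<T>` and `File.ReadLines`); the class name `QueuedCsvProcessor`, the `processLine` callback and the queue capacity of 1000 are made up for the example and are not from the thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading;

static class QueuedCsvProcessor   // hypothetical name
{
    // The calling thread reads the file and queues each line;
    // 'workerCount' threads take lines off the queue and process them.
    // 'processLine' is a placeholder for the real per-line work.
    public static void Run(string path, int workerCount, Action<string> processLine)
    {
        // Bounded so the reader cannot run far ahead of the workers (capacity is arbitrary).
        using (var queue = new BlockingCollection<string>(1000))
        {
            var workers = new List<Thread>();
            for (int i = 0; i < workerCount; i++)
            {
                var worker = new Thread(() =>
                {
                    // Blocks while the queue is empty; the loop ends once
                    // CompleteAdding has been called and the queue has drained.
                    foreach (string line in queue.GetConsumingEnumerable())
                        processLine(line);
                });
                worker.Start();
                workers.Add(worker);
            }

            try
            {
                foreach (string line in File.ReadLines(path))
                    queue.Add(line);
            }
            finally
            {
                queue.CompleteAdding();                          // tell the workers no more lines are coming
                foreach (Thread worker in workers) worker.Join(); // wait before the queue is disposed
            }
        }
    }
}
```

Each line is taken from the queue by exactly one worker, so no two threads process the same line. An exception thrown by `processLine` on a worker thread is not handled here and would need proper error handling in real code.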

        Luc Pattyn wrote (#3, in reply to Danpeking):

        Hi, you could open a streamreader on the file, then launch a few threads all executing the same code, containing this pseudo-code:

        while(!allDone) {
            lock the streamreader (or some other object)
            read one line from the streamreader, and set allDone true if no more data available
            unlock again
            process the line read, if any
        }

        allDone is a bool, starting false, shared by all threads. :)

        Luc Pattyn [Forum Guidelines] [Why QA sucks] [My Articles]


        I only read code that is properly formatted, adding PRE tags is the easiest way to obtain that.
        [The QA section does it automatically now, I hope we soon get it on regular forums as well]


        modified on Sunday, January 31, 2010 10:44 PM

          PIEBALDconsult wrote (#4, in reply to Luc Pattyn):

          I'm just not convinced that that will yield a significant performance boost. And I suspect that decoupling the reading and processing will yield a more-maintainable system.

            Luc Pattyn wrote (#5, in reply to PIEBALDconsult):

            Neither am I, however the OP stated "need to create many threads", so I must assume those threads will be busy outside the lock most of the time, in which case I don't need code to read the entire file and pump it through a queue. :)

              Danpeking wrote (#6, in reply to Luc Pattyn):

              Thanks, but I'm new to this. How do I go about locking my streamreader so each thread does a different line, etc? Thank you

                realJSOP wrote (#7, in reply to Danpeking):

                Load the file contents into a list, and run the threads in a thread pool. When you queue a thread into the thread pool, assign it an index into the list. Then, when the thread runs, it can simply retrieve the item at that index and be done with it. I think it might be even better to queue just a handful of threads and assign each one a block of items (specifying the first and last index to process). This would save the time and resources spent on thread contexts and switching. The thread pool will report itself idle when processing is complete, so you don't have to track the status/progress unless you really want to.
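The block-of-indices variant might look roughly like the sketch below. It is only an illustration: `BlockedCsvProcessor`, `blockCount` and `processLine` are hypothetical names, and a `CountdownEvent` (.NET 4.0 or later) is used here as one way to wait for completion, which is an assumption of the sketch rather than something the post prescribes.

```csharp
using System;
using System.IO;
using System.Threading;

static class BlockedCsvProcessor   // hypothetical name
{
    // Loads the whole file, splits it into at most 'blockCount' contiguous
    // index ranges, and queues one thread-pool work item per range.
    // 'processLine' is a placeholder for the real per-line work.
    public static void Run(string path, int blockCount, Action<string> processLine)
    {
        string[] lines = File.ReadAllLines(path);
        if (lines.Length == 0) return;

        blockCount = Math.Max(1, Math.Min(blockCount, lines.Length));
        int blockSize = (lines.Length + blockCount - 1) / blockCount;   // rounded up
        int blocks = (lines.Length + blockSize - 1) / blockSize;        // ranges actually queued

        using (var done = new CountdownEvent(blocks))
        {
            for (int first = 0; first < lines.Length; first += blockSize)
            {
                int start = first;                                     // copy for the closure
                int end = Math.Min(first + blockSize, lines.Length);   // exclusive upper bound
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try
                    {
                        for (int i = start; i < end; i++) processLine(lines[i]);
                    }
                    finally
                    {
                        done.Signal();   // count this block as finished even if it threw
                    }
                });
            }
            done.Wait();   // returns once every block has signalled
        }
    }
}
```

Because each work item owns a distinct index range, no locking is needed around the list itself. Anything `processLine` touches that is shared between threads still needs its own synchronization.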

                .45 ACP - because shooting twice is just silly
                -----
                "Why don't you tie a kerosene-soaked rag around your ankles so the ants won't climb up and eat your candy ass..." - Dale Earnhardt, 1997
                -----
                "The staggering layers of obscenity in your statement make it a work of art on so many levels." - J. Jystad, 2001

                  Luc Pattyn wrote (#8, in reply to Danpeking):

                  Hi, this is not tested; you should read up on all the classes and methods used, then adapt whatever needs adapting, and add error handling:

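                  using System;
                  using System.Collections.Generic;
                  using System.IO;
                  using System.Threading;

                  // these are members of the form (or any other class)
                  StreamReader sr;
                  volatile bool allDone;   // shared by all threads

                  void handleFileWithManyThreads(string filename, bool waitTillDone) {
                      allDone = false;
                      sr = File.OpenText(filename);   // File.Open would return a FileStream, not a StreamReader
                      List<Thread> threads = new List<Thread>();
                      // launch one thread per processor
                      for (int i = 0; i < Environment.ProcessorCount; i++) {
                          Thread thread = new Thread(runner);   // the method to run goes in the constructor
                          thread.Start();
                          threads.Add(thread);
                      }
                      if (waitTillDone) {
                          // now wait for all these threads to finish
                          foreach (Thread thread in threads) thread.Join();
                          sr.Close();
                      }
                      // if waitTillDone is false, the caller must close sr once the threads are done
                  }

                  void runner() {
                      while (!allDone) {
                          string line = null;
                          lock (sr) {   // only one thread reads at a time
                              line = sr.ReadLine();
                              if (line == null) allDone = true;
                          }
                          if (line != null) {
                              // process line here (outside the lock, so threads work in parallel)
                          }
                      }
                  }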

                  :)

                    Dan Mos wrote (#9, in reply to Danpeking):

                    If you're new to this, I think the outlaw programmer's idea is the best. To make things even easier (not faster), you could use Parallel LINQ (PLINQ) or the Task Parallel Library to process the data from the list. This way you don't have to worry about threads, locks and other quite painful stuff. It's available as an extension/add-on for .NET 3.5 SP1 and out of the box in .NET 4.0 (coming soon).
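For a feel of what that looks like, here is a small sketch written against .NET 4.0 or later. `ParallelCsvProcessor`, `processLine` and `parseLine` are invented names for the example; only `Parallel.ForEach`, `AsParallel`, `AsOrdered` and `File.ReadLines` are actual framework calls.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

static class ParallelCsvProcessor   // hypothetical name
{
    // Task Parallel Library: the library decides how many threads to use
    // and hands each line to exactly one of them.
    public static void RunWithParallelForEach(string path, Action<string> processLine)
    {
        Parallel.ForEach(File.ReadLines(path), processLine);
    }

    // PLINQ: turns each line into a result in parallel.
    // AsOrdered keeps the results in file order.
    public static TResult[] RunWithPlinq<TResult>(string path, Func<string, TResult> parseLine)
    {
        return File.ReadLines(path)
                   .AsParallel()
                   .AsOrdered()
                   .Select(parseLine)
                   .ToArray();
    }
}
```

For example, `ParallelCsvProcessor.RunWithPlinq(path, line => line.Split(',')[0])` would collect the unique identifier from the first column of every line. The callbacks run on several threads at once, so they should not update shared state without synchronization.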

                      Danpeking wrote (#10):

                      Thank you very much for the replies. I created 10 threads and split my file into 10 files, so each thread loops through its own file. This is what I went with in the end:

                      private void btnRun_Click(object sender, EventArgs e)
                      {
                          for (int i = 0; i < threadCount; i++)
                          {
                              CreateThread(i);
                          }
                      }

                      private void CreateThread(int threadID)
                      {
                          Thread thread = new Thread(new ParameterizedThreadStart(RunProcess));
                          lstThread.Add(thread);   // register the thread before it starts
                          thread.Start(threadID);
                      }

                      private void RunProcess(object threadID)
                      {
                          int id = (int)threadID;
                          // i starts at id and steps by threadCount, so this loop body runs exactly once
                          for (int i = id; i < threadCount; i += threadCount)
                          {
                              ProcessByThread(id);
                          }
                          // no Abort() needed: the thread ends when this method returns
                      }

                      private void ProcessByThread(int threadNumber)
                      {
                          // each thread reads its own file: Batch0.csv, Batch1.csv, ...
                          GetDataFromCsv("Batch" + threadNumber + ".csv", threadNumber);
                      }
