Looping through a CSV with threads
-
Hi, I've recently started using Windows Forms and need to create many threads which loop through a given CSV and pass the contents to a method. I'm not sure of the best way of doing this. I can create a few threads, but each thread processes every line, rather than one line being processed by one thread and the next line by another thread. The first value is a unique identifier, if that helps. Hope I've explained it OK; can anybody please advise? Thanks
-
I would expect that one thread should read the file and place each line in a Queue. Then several other threads can read from the Queue and proceed from there.
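A minimal sketch of that arrangement, assuming .NET 4 or later for BlockingCollection (on older versions a Queue guarded by a lock would play the same role). The names QueueSketch, ProcessFile and processLine are made up for illustration, and the capacity of 1000 is an arbitrary choice:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading;

static class QueueSketch
{
    // Hypothetical helper: the calling thread reads the file into a bounded queue,
    // while workerCount threads take lines off the queue and call processLine.
    public static void ProcessFile(string path, int workerCount, Action<string> processLine)
    {
        using (var lines = new BlockingCollection<string>(1000))
        {
            var workers = new List<Thread>();
            for (int i = 0; i < workerCount; i++)
            {
                var worker = new Thread(() =>
                {
                    // Blocks while the queue is empty; the loop ends once
                    // CompleteAdding has been called and the queue is drained.
                    foreach (string line in lines.GetConsumingEnumerable())
                        processLine(line);
                });
                worker.Start();
                workers.Add(worker);
            }

            try
            {
                // Producer side: each line goes to exactly one worker.
                using (var reader = new StreamReader(path))
                {
                    string line;
                    while ((line = reader.ReadLine()) != null)
                        lines.Add(line);
                }
            }
            finally
            {
                // Always signal the end of input so the workers can exit.
                lines.CompleteAdding();
            }

            foreach (Thread worker in workers) worker.Join();
        }
    }
}
```

ProcessFile blocks until every line has been handled, so in a Windows Forms application it would probably be called from a background thread rather than from a button handler directly.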
-
Hi, you could open a StreamReader on the file, then launch a few threads all executing the same code, containing this pseudo-code:
while (!allDone) {
    lock the StreamReader (or some other object)
    read one line from the StreamReader, and set allDone to true if no more data is available
    unlock again
    process the line read, if any
}
allDone is a bool, starting false, shared by all threads. :)
Luc Pattyn [Forum Guidelines] [Why QA sucks] [My Articles]
I only read code that is properly formatted, adding PRE tags is the easiest way to obtain that.
[The QA section does it automatically now, I hope we soon get it on regular forums as well]
modified on Sunday, January 31, 2010 10:44 PM
-
I'm just not convinced that that will yield a significant performance boost. And I suspect that decoupling the reading and processing will yield a more-maintainable system.
-
Neither am I, however the OP stated "need to create many threads", so I must assume those threads will be busy outside the lock most of the time, in which case I don't need code to read the entire file and pump it through a queue. :)
-
Load the file contents into a list, and run the threads in a thread pool. When you queue a thread into the thread pool, assign it an index into the list. Then, when the thread runs, it can simply retrieve the item at that index and be done with it. I think it might be even better to queue just a handful of threads and assign each one a block of items (specifying the first and last index to process). This would save the time and resources spent on thread contexts and switching. The thread pool will report itself idle when processing is complete, so you don't have to track the status/progress unless you really want to.
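A rough sketch of the block-per-work-item variant, not tested against a real CSV. BlockSketch, ProcessInBlocks and processLine are hypothetical names, and rather than asking the pool whether it is idle, this version counts the outstanding blocks itself and waits on an event:

```csharp
using System;
using System.IO;
using System.Threading;

static class BlockSketch
{
    // Hypothetical helper: loads every line, splits the index range into at most
    // blockCount contiguous blocks, and queues one thread pool work item per block.
    public static void ProcessInBlocks(string path, int blockCount, Action<string> processLine)
    {
        string[] lines = File.ReadAllLines(path);
        if (lines.Length == 0) return;

        blockCount = Math.Max(1, Math.Min(blockCount, lines.Length));
        int blockSize = (lines.Length + blockCount - 1) / blockCount; // rounded up
        int blocks = (lines.Length + blockSize - 1) / blockSize;      // blocks actually needed

        int pending = blocks;
        using (var allDone = new ManualResetEvent(false))
        {
            for (int b = 0; b < blocks; b++)
            {
                // Locals declared inside the loop, so each work item captures its own range.
                int first = b * blockSize;
                int last = Math.Min(first + blockSize, lines.Length); // exclusive upper bound
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try
                    {
                        for (int i = first; i < last; i++)
                            processLine(lines[i]);
                    }
                    finally
                    {
                        // The last block to finish releases the waiting caller.
                        if (Interlocked.Decrement(ref pending) == 0)
                            allDone.Set();
                    }
                });
            }
            allDone.WaitOne();
        }
    }
}
```

Error handling is left out: an exception thrown by processLine on a pool thread would still bring the process down, so a real version would want to catch and report it inside the work item.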
-
Thanks, but I'm new to this. How do I go about locking my StreamReader so that each thread handles a different line? Thank you
-
Hi, not tested; you should read up on all the classes and methods used, then adapt whatever needs adapting, and add error handling:
// needs: using System; using System.Collections.Generic; using System.IO; using System.Threading;
StreamReader sr;
volatile bool allDone;   // shared by all threads

void handleFileWithManyThreads(string filename, bool waitTillDone) {
    allDone=false;
    sr=File.OpenText(filename);
    List<Thread> threads=new List<Thread>();
    // launch N threads, one per processor
    for(int i=0; i<Environment.ProcessorCount; i++) {
        Thread thread=new Thread(runner);
        thread.Start();
        threads.Add(thread);
    }
    if (waitTillDone) {
        // now wait for all these threads to finish
        foreach(Thread thread in threads) thread.Join();
        sr.Close();
    }
    // when not waiting, the reader stays open and must be closed elsewhere once the threads are done
}

void runner() {
    while(!allDone) {
        string line=null;
        // only one thread at a time may read from the shared reader
        lock(sr) {
            line=sr.ReadLine();
            if (line==null) allDone=true;
        }
        if (line!=null) {
            // process line here
        }
    }
}
:)
-
If you're new to this, I think that the outlaw programmer's idea is the best. To make things even easier (not faster), you could use Parallel LINQ or the Task Parallel Library to process the data from the list. This way you don't have to worry about threads, locks and other quite painful stuff. It's available as an extension/add-on to .NET 3.5 SP1 and out of the box in .NET 4.0 (soon to come).
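For illustration, here is roughly what that could look like with the .NET 4 flavour of the library. ParallelSketch, ProcessFile, processLine and maxThreads are hypothetical names, and File.ReadLines is assumed to be available (it also arrived with .NET 4):

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class ParallelSketch
{
    // Hypothetical helper: the library decides how many threads to use
    // and hands each line to exactly one of them.
    public static void ProcessFile(string path, Action<string> processLine)
    {
        // File.ReadLines reads lazily, so the whole file is not loaded up front.
        Parallel.ForEach(File.ReadLines(path), processLine);
    }

    // Same idea, with an upper limit on the number of concurrent workers.
    public static void ProcessFile(string path, int maxThreads, Action<string> processLine)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };
        Parallel.ForEach(File.ReadLines(path), options, processLine);
    }
}
```

Both overloads return only after every line has been processed. The order in which lines are handled is not guaranteed, so processLine has to be safe to call from several threads at once.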
-
Thank you very much for the replies. I created 10 threads and broke my file down into 10 files, one for each thread to loop through. This is what I went with in the end:
private void btnRun_Click(object sender, EventArgs e)
{
    for (int i = 0; i < threadCount; i++)
    {
        CreateThread(i);
    }
}

private void CreateThread(int threadID)
{
    Thread thread = new Thread(new ParameterizedThreadStart(RunProcess));
    thread.Start(threadID);
    lstThread.Add(thread);
}

private void RunProcess(object threadID)
{
    int id = (int)threadID;
    for (int i = id; i < threadCount; i += threadCount)
    {
        ProcessByThread(Convert.ToInt32(threadID));
    }
    lstThread[id].Abort();
}

private void ProcessByThread(int threadNumber)
{
    GetDataFromCsv("Batch" + threadNumber + ".csv", threadNumber);
}