MultiThreading
-
I am finishing up my program, and I built it the way a typical amateur does: only one process runs at a time. I need to start multi-threading, but because the work consists of a large number of repetitive, non-graphical calculations, I was wondering how I might proceed. I need to run 8000 sets of data through the same algorithms, which may be up to 1000 every few minutes. Furthermore, how does this fit in with using multiple processors with multiple cores? Will that speed it up, and if so, do I need to set the threading up accordingly? Right now I have an older dual-core machine that processes about 10 data sets a second. Will I be able to set up the program so that it can use dual processors with multiple cores and speed up this process (other than through processor speed)? Or do I need to use a blade server and split up the program to process more data, and if so, do I thread it differently as well? Any help would be greatly appreciated. Thanks in advance. Michael
The benefit you get depends on your datasets and how they need to be processed. You have a couple of options for dividing up the work:
1) Each thread handles its own dataset, processing it by itself from start to finish. You benefit whenever more than one dataset is being worked on at a time.
2) If a dataset can be partitioned into smaller chunks, each thread can work on one chunk of the data. Say your dataset has 100 items in it and each can be processed individually, with no dependencies on the other items in the set. Partitioning the dataset means breaking the 100 items into groups of, say, 25 each. Then 4 threads can work on the same dataset, each processing its 25 items separately from the others.
In both cases, you usually get the best benefit when the number of worker threads you create matches the number of cores in the CPU. So, if you have an Intel processor with 4 cores and Hyper-Threading, you'll see 8 logical cores, so you can get away with 8 threads.
You can also look into the Task Parallel Library (a separate library before .NET 4.0) to simplify this process greatly. If you are using .NET 4.0 or 4.5, the TPL is integrated into the .NET Framework, so you don't have to install a separate library to use it. See the System.Threading.Tasks namespace documentation.
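As a rough illustration of the two options, here is a minimal C# sketch using the TPL in .NET 4.0 or later. It is only a sketch under stated assumptions: the class name DataSetProcessing, the methods ProcessAllDataSets and ProcessOneDataSet, and the placeholder calculation in ProcessItem are hypothetical, and a dataset is assumed to be a plain array of doubles. Parallel.For, Parallel.ForEach, ParallelOptions, Partitioner.Create and Environment.ProcessorCount are the actual framework pieces.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class DataSetProcessing
{
    // Hypothetical stand-in for the real per-item calculation.
    public static double ProcessItem(double x)
    {
        return Math.Sqrt(x) * 2.0;
    }

    // Option 1: each dataset is handled from start to finish by one worker.
    public static double[] ProcessAllDataSets(double[][] dataSets)
    {
        var results = new double[dataSets.Length];
        var options = new ParallelOptions
        {
            // Cap the number of concurrent workers at the logical core count.
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        Parallel.For(0, dataSets.Length, options, i =>
        {
            double sum = 0;
            foreach (double x in dataSets[i])
            {
                sum += ProcessItem(x);
            }
            results[i] = sum; // each index is written by exactly one worker
        });
        return results;
    }

    // Option 2: one dataset is split into index ranges of chunkSize items.
    // Assumes items is non-empty and its items do not depend on each other.
    public static double[] ProcessOneDataSet(double[] items, int chunkSize)
    {
        var output = new double[items.Length];
        Parallel.ForEach(Partitioner.Create(0, items.Length, chunkSize), range =>
        {
            // range.Item1 is inclusive, range.Item2 is exclusive.
            for (int i = range.Item1; i < range.Item2; i++)
            {
                output[i] = ProcessItem(items[i]);
            }
        });
        return output;
    }

    public static void Main()
    {
        // Eight small made-up datasets of 100 items each.
        double[][] dataSets = Enumerable.Range(0, 8)
            .Select(d => Enumerable.Range(1, 100).Select(i => (double)(i + d)).ToArray())
            .ToArray();

        double[] totals = ProcessAllDataSets(dataSets);
        double[] items = ProcessOneDataSet(dataSets[0], 25);

        Console.WriteLine("Logical cores: " + Environment.ProcessorCount);
        Console.WriteLine("Datasets processed: " + totals.Length);
        Console.WriteLine("Items processed in first dataset: " + items.Length);
    }
}
```

With the TPL you do not create the threads yourself. The library schedules the work on the thread pool, and MaxDegreeOfParallelism only sets an upper bound on how many workers run at once.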
A guide to posting questions on CodeProject[^]
Dave Kreskowiak
-
I would definitely go with option 1, because not all data sets will be processed to completion. You said there would be a separate thread for every core. Does this mean that, whether I use a single server with 2 processors and multiple cores or a blade server with multiple computers, only the number of cores is important?
You cannot start a thread on a separate computer. If you want to use blade servers, each one will have to run its own copy of your app, and you'll have to come up with some way to spread the datasets over the servers and communicate with each of them.
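One simple way to spread the datasets, shown here purely as an assumption and not as the only approach, is to number the servers and have each copy of the app take every Nth dataset. The names ServerSharding and DataSetsForServer are hypothetical, and the sketch leaves out the communication between servers entirely.

```csharp
using System;
using System.Collections.Generic;

public static class ServerSharding
{
    // Returns the dataset indexes that the server numbered serverIndex
    // (0-based) should process when the work is dealt out round-robin.
    public static List<int> DataSetsForServer(int dataSetCount, int serverCount, int serverIndex)
    {
        if (serverCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(serverCount));
        if (serverIndex < 0 || serverIndex >= serverCount)
            throw new ArgumentOutOfRangeException(nameof(serverIndex));

        var mine = new List<int>();
        for (int i = serverIndex; i < dataSetCount; i += serverCount)
        {
            mine.Add(i);
        }
        return mine;
    }

    public static void Main()
    {
        // Assumed setup: 8000 datasets shared by 4 blades; this copy is blade 1.
        List<int> mine = DataSetsForServer(8000, 4, 1);
        Console.WriteLine("Datasets assigned to this server: " + mine.Count);
    }
}
```

Each copy of the app would be started with its own server index, for example from a command-line argument or a config file, and could then run its share of datasets on its own cores.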
Dave Kreskowiak