Threads, speed up calculations
-
Hi, i will be appreciated if someone could help me in this situation: my PC: Pentium 2Core 2.1GHz 4Gb(Ram) windows 7 x64 The problem that I can not solve by myself is how to speed up calculations. My method runs 850-970ms. method created for image processing (crops image ,recognize coped image and put calculated value into matrix) When i am using threads (in this example 2 threads) ThreadStart ts = new ThreadStart(ThreadOneP); ThreadStart ts2 = new ThreadStart(ThreadTwoP); Thread thread1 = new Thread(ts); Thread thread2 = new Thread(ts2); thread1.Start(); thread2.Start(); thread1.Join(); thread2.Join(); method takes 900-1070ms. How to speed up that method will take about 400ms? it is possible? P.S sorry for my English :)
-
Hi, i will be appreciated if someone could help me in this situation: my PC: Pentium 2Core 2.1GHz 4Gb(Ram) windows 7 x64 The problem that I can not solve by myself is how to speed up calculations. My method runs 850-970ms. method created for image processing (crops image ,recognize coped image and put calculated value into matrix) When i am using threads (in this example 2 threads) ThreadStart ts = new ThreadStart(ThreadOneP); ThreadStart ts2 = new ThreadStart(ThreadTwoP); Thread thread1 = new Thread(ts); Thread thread2 = new Thread(ts2); thread1.Start(); thread2.Start(); thread1.Join(); thread2.Join(); method takes 900-1070ms. How to speed up that method will take about 400ms? it is possible? P.S sorry for my English :)
Well, that's a bit vague without knowing what the threads are actually doing. The bit of code you posted isn't really incorrect, although the calling thread is blocked and you could do some work on that rather than create a new one then just block on its completion. To improve performance, you need to take a look at your algorithm and make sure it can't be improved. Splitting the task onto more threads might be an idea as well so that machines with more processors can make use of them. Perhaps try increases the thread priority of the workers?
Regards, Rob Philpott.
-
Well, that's a bit vague without knowing what the threads are actually doing. The bit of code you posted isn't really incorrect, although the calling thread is blocked and you could do some work on that rather than create a new one then just block on its completion. To improve performance, you need to take a look at your algorithm and make sure it can't be improved. Splitting the task onto more threads might be an idea as well so that machines with more processors can make use of them. Perhaps try increases the thread priority of the workers?
Regards, Rob Philpott.
Also I just now try to set 7 thread 1 thread for each image row. but time consumption is the same as Using 2 threads 900-1000ms. can you tell a little bit mote how to use thread priority, because i thing i i will set high priority for all threads there will be no effect.
-
Hi, i will be appreciated if someone could help me in this situation: my PC: Pentium 2Core 2.1GHz 4Gb(Ram) windows 7 x64 The problem that I can not solve by myself is how to speed up calculations. My method runs 850-970ms. method created for image processing (crops image ,recognize coped image and put calculated value into matrix) When i am using threads (in this example 2 threads) ThreadStart ts = new ThreadStart(ThreadOneP); ThreadStart ts2 = new ThreadStart(ThreadTwoP); Thread thread1 = new Thread(ts); Thread thread2 = new Thread(ts2); thread1.Start(); thread2.Start(); thread1.Join(); thread2.Join(); method takes 900-1070ms. How to speed up that method will take about 400ms? it is possible? P.S sorry for my English :)
Hi, adding threads can improve latency time by doing some things in parallel; it only is effective if: 1. (some of the) CPUs aren't fully loaded (watch Task Manager, performance) to start with. 2. synchronization cost is low with only two active threads (the launching thread is idle as it waits for the others to join) a dual-core should work well, independent of thread priorities. you are not allowed to directly touch WinForm Controls from inside your threads! I suggest you show us the actual code for more feedback. :)
Luc Pattyn [Forum Guidelines] [Why QA sucks] [My Articles]
Merry Christmas and a Happy New Year to all.
-
Hi, i will be appreciated if someone could help me in this situation: my PC: Pentium 2Core 2.1GHz 4Gb(Ram) windows 7 x64 The problem that I can not solve by myself is how to speed up calculations. My method runs 850-970ms. method created for image processing (crops image ,recognize coped image and put calculated value into matrix) When i am using threads (in this example 2 threads) ThreadStart ts = new ThreadStart(ThreadOneP); ThreadStart ts2 = new ThreadStart(ThreadTwoP); Thread thread1 = new Thread(ts); Thread thread2 = new Thread(ts2); thread1.Start(); thread2.Start(); thread1.Join(); thread2.Join(); method takes 900-1070ms. How to speed up that method will take about 400ms? it is possible? P.S sorry for my English :)
(how code looks after trying with 7 threads (because captured image is divided in 7 rows)) I have to warn you CODE is not optimized. as example I show you only 2 thread methods: ThreadOneP and ThreadTwoP public int TransferToMatrixThread() { temptest = 0; CabX = CabX - 18; CabY = CabY + 53; CabXX = CabX - 484; //POSSITION CabYY = CabY + 20; for (int i = 0; i < 11; i++) //FILLS MATRIX WITH 0 { for (int j = 0; j < 11; j++) { transfered[i, j] = 0; } } ThreadStart ts = new ThreadStart(ThreadOneP); ThreadStart ts2 = new ThreadStart(ThreadTwoP); ThreadStart ts3 = new ThreadStart(ThreadThreeP); ThreadStart ts4 = new ThreadStart(ThreadFourP); ThreadStart ts5 = new ThreadStart(ThreadFiveP); ThreadStart ts6 = new ThreadStart(ThreadSixP); ThreadStart ts7 = new ThreadStart(ThreadSevenP); Thread thread1 = new Thread(ts); Thread thread2 = new Thread(ts2); Thread thread3 = new Thread(ts3); Thread thread4 = new Thread(ts4); Thread thread5 = new Thread(ts5); Thread thread6 = new Thread(ts6); Thread thread7 = new Thread(ts7); DateTime startTime = DateTime.Now; //TIME thread1.Start(); thread2.Start(); thread3.Start(); thread4.Start(); thread5.Start(); thread6.Start(); thread7.Start(); thread1.Join(); thread2.Join(); thread3.Join(); thread4.Join(); thread5.Join(); thread6.Join(); thread7.Join(); DateTime stopTime = DateTime.Now; TimeSpan duration = stopTime - startTime; listBox1.Items.Add("7 THREAD = " + duration); TransfX = currX + CabXX + 9; // norima nustatyti i pav viduri todel pridedamos konstantos TransfY = currY + CabYY + 11; // norima nustatyti i pav viduri todel pridedamos konstantos return temptest; } public void ThreadOneP() { int o = CabYY; int zy = 8; int zx = 2; for (int ok = CabXX; ok < CabXX + 490; ok = ok + 72) { var color22 = RGBreturnLVL(ok, o, 30
-
Hi, i will be appreciated if someone could help me in this situation: my PC: Pentium 2Core 2.1GHz 4Gb(Ram) windows 7 x64 The problem that I can not solve by myself is how to speed up calculations. My method runs 850-970ms. method created for image processing (crops image ,recognize coped image and put calculated value into matrix) When i am using threads (in this example 2 threads) ThreadStart ts = new ThreadStart(ThreadOneP); ThreadStart ts2 = new ThreadStart(ThreadTwoP); Thread thread1 = new Thread(ts); Thread thread2 = new Thread(ts2); thread1.Start(); thread2.Start(); thread1.Join(); thread2.Join(); method takes 900-1070ms. How to speed up that method will take about 400ms? it is possible? P.S sorry for my English :)
Why don't you use a thread pool instead of all that start/join crap?
.45 ACP - because shooting twice is just silly
-----
"Why don't you tie a kerosene-soaked rag around your ankles so the ants won't climb up and eat your candy ass..." - Dale Earnhardt, 1997
-----
"The staggering layers of obscenity in your statement make it a work of art on so many levels." - J. Jystad, 2001 -
(how code looks after trying with 7 threads (because captured image is divided in 7 rows)) I have to warn you CODE is not optimized. as example I show you only 2 thread methods: ThreadOneP and ThreadTwoP public int TransferToMatrixThread() { temptest = 0; CabX = CabX - 18; CabY = CabY + 53; CabXX = CabX - 484; //POSSITION CabYY = CabY + 20; for (int i = 0; i < 11; i++) //FILLS MATRIX WITH 0 { for (int j = 0; j < 11; j++) { transfered[i, j] = 0; } } ThreadStart ts = new ThreadStart(ThreadOneP); ThreadStart ts2 = new ThreadStart(ThreadTwoP); ThreadStart ts3 = new ThreadStart(ThreadThreeP); ThreadStart ts4 = new ThreadStart(ThreadFourP); ThreadStart ts5 = new ThreadStart(ThreadFiveP); ThreadStart ts6 = new ThreadStart(ThreadSixP); ThreadStart ts7 = new ThreadStart(ThreadSevenP); Thread thread1 = new Thread(ts); Thread thread2 = new Thread(ts2); Thread thread3 = new Thread(ts3); Thread thread4 = new Thread(ts4); Thread thread5 = new Thread(ts5); Thread thread6 = new Thread(ts6); Thread thread7 = new Thread(ts7); DateTime startTime = DateTime.Now; //TIME thread1.Start(); thread2.Start(); thread3.Start(); thread4.Start(); thread5.Start(); thread6.Start(); thread7.Start(); thread1.Join(); thread2.Join(); thread3.Join(); thread4.Join(); thread5.Join(); thread6.Join(); thread7.Join(); DateTime stopTime = DateTime.Now; TimeSpan duration = stopTime - startTime; listBox1.Items.Add("7 THREAD = " + duration); TransfX = currX + CabXX + 9; // norima nustatyti i pav viduri todel pridedamos konstantos TransfY = currY + CabYY + 11; // norima nustatyti i pav viduri todel pridedamos konstantos return temptest; } public void ThreadOneP() { int o = CabYY; int zy = 8; int zx = 2; for (int ok = CabXX; ok < CabXX + 490; ok = ok + 72) { var color22 = RGBreturnLVL(ok, o, 30
Some comments: 1. please show code in PRE tags (e.g. use "code block" widget) for better readability 2. your threads share the transfered[] array, i.e. they each fill one row. For performance, I would avoid 2-D arrays, give them each a 1-D array. 3. you are abusing the var keyword; if color22 is a Color, then declare it a Color (for readability). 4. as there are only a few iterations in the for loop, most time is spent inside RGBreturnLVL(). So one may want to see that code too. 5. if threads do almost identical things, they don't need separate code; make a simple class that describes one threaded job, holding its code and data. 6. if is not running on the main thread (the name seems to indicate so), then
listBox1.Items.Add("7 THREAD = " + duration);
is NOT allowed. See this article.[^] 7. assuming there is no blocking system call (e.g. a file/network read) inside your threaded code, it does not make sense to have more calculation threads than there are CPUs. Threading overhead can make it slower. 8. seems like maze[] is pretty constant, and only a few elements are needed here; I would copy them in local variables once, and use those all the time. :)Luc Pattyn [Forum Guidelines] [Why QA sucks] [My Articles]
Merry Christmas and a Happy New Year to all.
-
Why don't you use a thread pool instead of all that start/join crap?
.45 ACP - because shooting twice is just silly
-----
"Why don't you tie a kerosene-soaked rag around your ankles so the ants won't climb up and eat your candy ass..." - Dale Earnhardt, 1997
-----
"The staggering layers of obscenity in your statement make it a work of art on so many levels." - J. Jystad, 2001I was thinking the same thing! Also, it looks like the methods that do the work are almost identical (at least the two that were posted), so I'd abstract that logic out into its own method and pass in just the data that changes between rows.
Hold on a second here... Don't you think you might be putting the horse ahead of the cart?
-
Some comments: 1. please show code in PRE tags (e.g. use "code block" widget) for better readability 2. your threads share the transfered[] array, i.e. they each fill one row. For performance, I would avoid 2-D arrays, give them each a 1-D array. 3. you are abusing the var keyword; if color22 is a Color, then declare it a Color (for readability). 4. as there are only a few iterations in the for loop, most time is spent inside RGBreturnLVL(). So one may want to see that code too. 5. if threads do almost identical things, they don't need separate code; make a simple class that describes one threaded job, holding its code and data. 6. if is not running on the main thread (the name seems to indicate so), then
listBox1.Items.Add("7 THREAD = " + duration);
is NOT allowed. See this article.[^] 7. assuming there is no blocking system call (e.g. a file/network read) inside your threaded code, it does not make sense to have more calculation threads than there are CPUs. Threading overhead can make it slower. 8. seems like maze[] is pretty constant, and only a few elements are needed here; I would copy them in local variables once, and use those all the time. :)Luc Pattyn [Forum Guidelines] [Why QA sucks] [My Articles]
Merry Christmas and a Happy New Year to all.
-
Some comments: 1. please show code in PRE tags (e.g. use "code block" widget) for better readability 2. your threads share the transfered[] array, i.e. they each fill one row. For performance, I would avoid 2-D arrays, give them each a 1-D array. 3. you are abusing the var keyword; if color22 is a Color, then declare it a Color (for readability). 4. as there are only a few iterations in the for loop, most time is spent inside RGBreturnLVL(). So one may want to see that code too. 5. if threads do almost identical things, they don't need separate code; make a simple class that describes one threaded job, holding its code and data. 6. if is not running on the main thread (the name seems to indicate so), then
listBox1.Items.Add("7 THREAD = " + duration);
is NOT allowed. See this article.[^] 7. assuming there is no blocking system call (e.g. a file/network read) inside your threaded code, it does not make sense to have more calculation threads than there are CPUs. Threading overhead can make it slower. 8. seems like maze[] is pretty constant, and only a few elements are needed here; I would copy them in local variables once, and use those all the time. :)Luc Pattyn [Forum Guidelines] [Why QA sucks] [My Articles]
Merry Christmas and a Happy New Year to all.
AND, YES most of time is spend in RGBreturnLVL(...) method. //Calculate average 30x30 image color private Color RGBreturnLVL(int PositionX, int PositionY, int width, int height, Bitmap Image) { int ra = 0; int ga = 0; int ba = 0; int total = 0; for (int x = 0; x < width; x++) { for (int y = 0; y < height; y++) { Color clr = CropBitmap(Image, PositionX, PositionY, width, height).GetPixel(x, y); ra += clr.R; ga += clr.G; ba += clr.B; total++; } } ra /= total; ga /= total; ba /= total; return Color.FromArgb(ra, ga, ba); }
-
AND, YES most of time is spend in RGBreturnLVL(...) method. //Calculate average 30x30 image color private Color RGBreturnLVL(int PositionX, int PositionY, int width, int height, Bitmap Image) { int ra = 0; int ga = 0; int ba = 0; int total = 0; for (int x = 0; x < width; x++) { for (int y = 0; y < height; y++) { Color clr = CropBitmap(Image, PositionX, PositionY, width, height).GetPixel(x, y); ra += clr.R; ga += clr.G; ba += clr.B; total++; } } ra /= total; ga /= total; ba /= total; return Color.FromArgb(ra, ga, ba); }
I asked you to use PRE tags. and that code refers to yet another method. and doesn't look very efficient. :|
Luc Pattyn [Forum Guidelines] [Why QA sucks] [My Articles]
Merry Christmas and a Happy New Year to all.
-
AND, YES most of time is spend in RGBreturnLVL(...) method. //Calculate average 30x30 image color private Color RGBreturnLVL(int PositionX, int PositionY, int width, int height, Bitmap Image) { int ra = 0; int ga = 0; int ba = 0; int total = 0; for (int x = 0; x < width; x++) { for (int y = 0; y < height; y++) { Color clr = CropBitmap(Image, PositionX, PositionY, width, height).GetPixel(x, y); ra += clr.R; ga += clr.G; ba += clr.B; total++; } } ra /= total; ga /= total; ba /= total; return Color.FromArgb(ra, ga, ba); }
GetPixel is a performance disaster, you should really use LockBits and unsafe code (that's what GetPixel does, too, but it does it every time you call it) Using LockBits once and then using unsafe code to read the pixels is several orders of magnitude faster than using GetPixel
-
AND, YES most of time is spend in RGBreturnLVL(...) method. //Calculate average 30x30 image color private Color RGBreturnLVL(int PositionX, int PositionY, int width, int height, Bitmap Image) { int ra = 0; int ga = 0; int ba = 0; int total = 0; for (int x = 0; x < width; x++) { for (int y = 0; y < height; y++) { Color clr = CropBitmap(Image, PositionX, PositionY, width, height).GetPixel(x, y); ra += clr.R; ga += clr.G; ba += clr.B; total++; } } ra /= total; ga /= total; ba /= total; return Color.FromArgb(ra, ga, ba); }
CropBitmap
creates a copy of some sub-range of the bitmap? If so, you are creating one copy for each pixel you read! That looks like an incredible waste of time. Move thatCropBitmap
call out of your loop. Or try to get rid of it entirely; useImage.GetPixel(x + PositionX, y + PositionY)
instead. -
CropBitmap
creates a copy of some sub-range of the bitmap? If so, you are creating one copy for each pixel you read! That looks like an incredible waste of time. Move thatCropBitmap
call out of your loop. Or try to get rid of it entirely; useImage.GetPixel(x + PositionX, y + PositionY)
instead. -
I was thinking the same thing! Also, it looks like the methods that do the work are almost identical (at least the two that were posted), so I'd abstract that logic out into its own method and pass in just the data that changes between rows.
Hold on a second here... Don't you think you might be putting the horse ahead of the cart?
-
Daniel Grunwald thanks a lot! results after changing : without threads : 50-70ms with threads: ~115ms significant change, i do not expected to save that much time. :)
as long as the threaded version is slower, you're doing things the wrong way. :)
Luc Pattyn [Forum Guidelines] [Why QA sucks] [My Articles]
Merry Christmas and a Happy New Year to all.
-
Daniel Grunwald thanks a lot! results after changing : without threads : 50-70ms with threads: ~115ms significant change, i do not expected to save that much time. :)
You should be able to speed it up by another factor of 5 using harold's advice (use a single LockBits call and unsafe code inside the loop). Once you've done that, you can still make it much faster by using a more intelligent algorithm for finding the averages: cumulative array sums. It's easiest to explain in one dimension: If you have one int-array and need to run multiple queries "give me the average of all elements from index a to b", then it's faster to preprocess the array by building the cumulative sum: Original array = { 10, 20, 30, 40 } Cumulative sums = { 10, 30, 60, 100 } Every entry in the new array is the sum of all previous entries in the original array. The new array can be calculated in a single O(N) loop. You can find the sum of any contiguous index range in the original array just by subtracting the end points in the cumulative sum array. The sum of 20+30+40 is 100-10. This scheme can be extended into more than one dimension: for two dimensional images, create an array where every entry is the sum of all original entries to the top left of it. Such a new array can be calculated in O(N*M) if you first calculate the one-dimensional cumulative sums for one dimension and then for the other dimension. Then you can calculate the sum of any rectangle easily: sum of all values in some rectangle = sums[bottom, right] - sums[bottom, left] - sums[top, right] + sums[top, left] So using cumulative sums, you'll be able to calculate the average color of an rectangle using just 4 memory accesses instead of 900 memory access (30*30 pixels). None of this is related to threading. I don't know why the multithreaded version is slower in your case (maybe due locking inside of the GetPixel calls?). But multithreading usually won't give you as much of an speedup as using a better algorithm, so keep your program single-threaded when possible. Multiple threads can introduce lots of complexity and subtle bugs for little benefit.
-
You should be able to speed it up by another factor of 5 using harold's advice (use a single LockBits call and unsafe code inside the loop). Once you've done that, you can still make it much faster by using a more intelligent algorithm for finding the averages: cumulative array sums. It's easiest to explain in one dimension: If you have one int-array and need to run multiple queries "give me the average of all elements from index a to b", then it's faster to preprocess the array by building the cumulative sum: Original array = { 10, 20, 30, 40 } Cumulative sums = { 10, 30, 60, 100 } Every entry in the new array is the sum of all previous entries in the original array. The new array can be calculated in a single O(N) loop. You can find the sum of any contiguous index range in the original array just by subtracting the end points in the cumulative sum array. The sum of 20+30+40 is 100-10. This scheme can be extended into more than one dimension: for two dimensional images, create an array where every entry is the sum of all original entries to the top left of it. Such a new array can be calculated in O(N*M) if you first calculate the one-dimensional cumulative sums for one dimension and then for the other dimension. Then you can calculate the sum of any rectangle easily: sum of all values in some rectangle = sums[bottom, right] - sums[bottom, left] - sums[top, right] + sums[top, left] So using cumulative sums, you'll be able to calculate the average color of an rectangle using just 4 memory accesses instead of 900 memory access (30*30 pixels). None of this is related to threading. I don't know why the multithreaded version is slower in your case (maybe due locking inside of the GetPixel calls?). But multithreading usually won't give you as much of an speedup as using a better algorithm, so keep your program single-threaded when possible. Multiple threads can introduce lots of complexity and subtle bugs for little benefit.
Daniel Grunwald wrote:
keep your program single-threaded when possible
My general advice is slightly different: first optimize using a single thread, then consider adding one or a few threads. :)
Luc Pattyn [Forum Guidelines] [Why QA sucks] [My Articles]
Merry Christmas and a Happy New Year to all.