Fast Image Processing - programming language to use
-
I have embarked on writing an image processing application, and am now concerned with simple operations like brightness/contrast and window/level. Now, the big question is about performance. For small images (up to 1000 x 1000), things happen in real time. But for images of size 3000 x 2000, the program is sluggish. My program is written in C#, with GDI+. Now, taking a step back, is this (C#, GDI+) a good choice for such a program, or should one revert to unmanaged C++ and MFC? I would like to use WPF, but again, have any of you great programmers out there seen any performance problems in WPF with fast image processing? Also, I would be grateful if you could share some performance-improving tips.
-
All the .NET languages are slow. Despite the dubious claims by some people that they can be as fast as unmanaged code, .NET programs are usually sluggish. Managed code pays for JIT compilation, boxing/unboxing, and run-time checks (e.g. bounds checking on array accesses). However, .NET programs are more reliable. Recently the same subsystem was implemented at my company in C# and in unmanaged C++; the C++ code was faster but would regularly crash mysteriously, while the C# code was solid.
The best approach is to write the bulk of your code in C#, but do the image processing in unmanaged C++. This gives you the reliability of managed code, and the speed of unmanaged code for the time-consuming repetitive work.
The fastest processing comes from small C++ loops that fit entirely into the cache, with no branches in the body; this lets out-of-order execution do its job. Since the loop itself is a branch, which makes out-of-order execution difficult, you can process two or more pixels per iteration (instead of one) to do more work between branches. This is called "loop unrolling", and it can speed up processing.
Processing the image from low addresses to high addresses minimizes memory accesses and makes good use of the cache: a single memory access (even for one pixel) fills a 32- or 64-byte buffer called a "cache line", so subsequent accesses at the addresses just above it don't touch memory at all, because the contents are already in the cache. This saves the time the processor would otherwise spend waiting on memory.
The Intel (and AMD) SSE and MMX extensions to the instruction set (http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions) provide 128-bit registers that may let you perform image-processing operations on several pixels in parallel for more speed.
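To make the hybrid approach concrete, here is a minimal sketch of the kind of inner loop I mean (the function names, the 8-bit grayscale format, and the window/level formula are assumptions for illustration, not anything from the original question). Brightness/contrast and window/level are per-pixel mappings, so you can rebuild a 256-entry lookup table once per slider change and then make a single unrolled pass over the pixels:

#include <cstdint>
#include <cstddef>
#include <algorithm>

// Hypothetical window/level LUT: map the 8-bit input range through a window
// of width `window` centered at `level`, clamping the output to [0, 255].
void BuildWindowLevelLut(uint8_t lut[256], int window, int level)
{
    const int lo = level - window / 2;
    for (int i = 0; i < 256; ++i)
    {
        int v = (window > 0) ? ((i - lo) * 255) / window : (i < level ? 0 : 255);
        lut[i] = (uint8_t)std::min(255, std::max(0, v));
    }
}

// Exported inner loop for P/Invoke from C#: one pass that applies the
// precomputed table, walking the buffer from low to high addresses,
// unrolled four ways so there is more work between branches.
extern "C" __declspec(dllexport)
void ApplyLut8(uint8_t* pixels, size_t count, const uint8_t lut[256])
{
    size_t i = 0;
    for (; i + 4 <= count; i += 4)       // unrolled: four pixels per branch
    {
        pixels[i]     = lut[pixels[i]];
        pixels[i + 1] = lut[pixels[i + 1]];
        pixels[i + 2] = lut[pixels[i + 2]];
        pixels[i + 3] = lut[pixels[i + 3]];
    }
    for (; i < count; ++i)               // remainder pixels
        pixels[i] = lut[pixels[i]];
}

From C#, you could pin the bitmap with Bitmap.LockBits and pass BitmapData.Scan0 to a [DllImport] declaration of ApplyLut8. Mind the row padding: BitmapData.Stride may exceed the row width, so either process row by row or let the LUT pass run over the padding bytes too, which is harmless here.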
-
Alan Balkany wrote:
The C++ code was faster but would regularly crash mysteriously. The C# code was solid.
I would suggest the C++ code was written by an incompetent or hadn't been debugged properly. I've written a ton of C++ code over the years and my code does not typically "just crash mysteriously". Or it was some Microsoft benchmark slanted to show how their proprietary garbage is superior.
You measure democracy by the freedom it gives its dissidents, not the freedom it gives its assimilated conformists.
-
I had implemented an image processing program in Delphi 7, and it showed real-time performance. However, the support for Delphi is not very strong. Let me hasten to add that I found absolutely no problems with the Delphi 7 executable - no crashing, etc., even for big images. Also, Delphi has a function to assign all the bits of a single scan line, so you don't need nested 'for' loops (an outer loop over the height and an inner loop over the width); I have not found an equivalent in C# (the sketch below shows the same idea in C++ terms). I am amazed at the way Delphi achieves high performance. Java is yet another option, which I have not yet explored.
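For illustration, the scan-line idea expressed in C++ (a sketch with hypothetical names; it assumes a plain byte buffer in which each row may be padded out to a stride):

#include <cstdint>
#include <cstring>

// Copy (or transform) an image one whole row at a time through stride-aware
// pointers, instead of addressing every pixel with nested x/y indexing.
// rowBytes is the number of meaningful bytes per row; the strides may be
// larger because rows are often padded for alignment.
void CopyRows(uint8_t* dst, const uint8_t* src,
              int height, int rowBytes, int dstStride, int srcStride)
{
    for (int y = 0; y < height; ++y)
        std::memcpy(dst + (size_t)y * dstStride,
                    src + (size_t)y * srcStride,
                    (size_t)rowBytes);
}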
-
"I would suggest the C++ code was written by an incompetent or hadn't been debugged properly." Probably. But my point is that C++ lets you get away with that. C# quickly detects that type of problem. In this case, a few months previously the C++ programmer had been deriding the need for C# memory protections, saying "My code doesn't have memory leaks!". People aren't perfect, and a language that protects you against some mistakes will produce more reliable programs.
-
I don't think Java is a good approach if you're looking for maximum performance. It's an interpreted language, with bytecodes being executed by a software virtual machine, so it's slower than native code.
-
"I would suggest the C++ code was written by an incompetent or hadn't been debugged properly." Probably. But my point is that C++ lets you get away with that. C# quickly detects that type of problem. In this case, a few months previously the C++ programmer had been deriding the need for C# memory protections, saying "My code doesn't have memory leaks!". People aren't perfect, and a language that protects you against some mistakes will produce more reliable programs.
-
I don't think Java is a good approach if you're looking for maximum performance. It's an interpreted language, with bytecodes being executed by a software virtual machine, so it's slower than native code.
Alan Balkany wrote:
It's an interpreted language, with bytecodes being executed by a software virtual machine
That of course is highly debatable. Both Java and C# compile to an intermediate language, which is then compiled to, stored, and executed as native code (at "run-time", which actually means just before it runs, so not really different from "at build-time" except that it adds to your app's start-up time). An interpreter would never generate native code. Whether the end result is worse, equal, or better performance-wise is mainly determined by the amount of effort they have chosen to spend on the compiler and virtual machine. After all, the intermediate code, containing a lot of meta information, is a perfect representation of the original source code. BTW: most/all regular compilers also have a front-end dealing with the source language and a back-end generating the final instructions, with the two parts communicating through a rather language-agnostic internal representation of the source; that is basically what bytecode and IL are too. :)
Luc Pattyn
Have a look at my entry for the lean-and-mean competition; please provide comments, feedback, discussion, and don’t forget to vote for it! Thank you.
Local announcement (Antwerp region): Lange Wapper? Neen!
-
"I would suggest the C++ code was written by an incompetent or hadn't been debugged properly." Probably. But my point is that C++ lets you get away with that. C# quickly detects that type of problem. In this case, a few months previously the C++ programmer had been deriding the need for C# memory protections, saying "My code doesn't have memory leaks!". People aren't perfect, and a language that protects you against some mistakes will produce more reliable programs.
The whole memory leak problem that .NET is supposed to cure is a big Microsoft FUD. Since at least the early 90s, Microsoft has provided a debug heap that's instrumented to detect memory leaks. All you have to do is enable its use and test your debug version. When you exit, it will not only tell you whether you have memory leaks but also where the unfreed memory was allocated. This works for both C and C++. The bigger problem is resource leaks, which Microsoft didn't do much to address. Forget to call Dispose and/or fail to implement it properly, and it's no better than forgetting to call free or mishandling your destructor.
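For anyone who hasn't tried it, turning the debug heap on takes only a couple of lines in a debug (MSVC) build; a minimal sketch:

#define _CRTDBG_MAP_ALLOC   // make the leak report include file/line of each allocation
#include <stdlib.h>
#include <crtdbg.h>

int main()
{
    // Track allocations and dump any unfreed blocks automatically at exit.
    _CrtSetDbgFlag(_CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF);

    int* leaked = (int*)malloc(100 * sizeof(int));  // deliberately never freed
    (void)leaked;
    return 0;  // the debug CRT now reports the leaked block in the debug output
}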
You measure democracy by the freedom it gives its dissidents, not the freedom it gives its assimilated conformists.
-
.NET detects memory problems that the unmanaged debug heap won't, such as array indexes out of bounds. .NET code is more reliable. Of course if you write perfect code, you can have a reliable unmanaged program, but who writes perfect code? At the current level of technology, we can't prove that ANY program is correct.
-
Luc Pattyn wrote:
That of course is highly debatable
However, experience shows that java programs are (usually) deadly slow. :)
If the Lord God Almighty had consulted me before embarking upon the Creation, I would have recommended something simpler. -- Alfonso the Wise, 13th Century King of Castile.
This is going on my arrogant assumptions. You may have a superb reason why I'm completely wrong. -- Iain Clarke
[My articles]
-
Experience also shows that VB code is sh!t; this is often due to the monkeys rather than the tree.
Panic, Chaos, Destruction. My work here is done.
-
That's true. Anyway, I wouldn't suggest anyone use Java for writing 'damned-fast' applications. While Java has many many many qualities, alas, speed is not one of them (of course this is going on my...). :)
If the Lord God Almighty had consulted me before embarking upon the Creation, I would have recommended something simpler. -- Alfonso the Wise, 13th Century King of Castile.
This is going on my arrogant assumptions. You may have a superb reason why I'm completely wrong. -- Iain Clarke
[My articles]
-
I have been using a very fast and proprietary Java system for many years, well before .NET came to be. The average Java implementation being slow is due to the fact that everyone can create a JVM (just like everyone can write a compiler); it takes a professional and performance-oriented approach to create a good one. "It works, let's ship it" isn't good enough. Not here, not anywhere, if performance matters. :)
Luc Pattyn
Local announcement (Antwerp region): Lange Wapper? Neen!
-
No: nothing can even approach plain C language performance (well, assembly can be better). I'm talking about well written java applications vs well written C ones. Believe me, there's a reason why light speed is c :rolleyes:. On the other hand java has many many many good features, but, you know, it is the "compile once, slow down everywhere" language (well, everywhere but the proprietary implementation you experienced...) :-D
If the Lord God Almighty had consulted me before embarking upon the Creation, I would have recommended something simpler. -- Alfonso the Wise, 13th Century King of Castile.
This is going on my arrogant assumptions. You may have a superb reason why I'm completely wrong. -- Iain Clarke
[My articles]
-
Unmanaged code kills managed code for this type of thing. We do all of our stuff in C++, with a bit of assembly for the MMX/SSE optimizations. Sometimes we use the MMX/SSE intrinsics (which are essentially macros), but that's just being lazy; hand-coded assembly can beat the intrinsics. Nothing will beat an unmanaged pointer zipping across the image data for sheer speed. Of course, a decent algorithm can do wonders, too.
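For illustration, the intrinsics flavor looks like this (a sketch under stated assumptions: SSE2, 8-bit grayscale, a 16-byte-aligned buffer, a pixel count that is a multiple of 16; the function name is made up):

#include <emmintrin.h>  // SSE2 intrinsics

// Add a constant brightness offset to 8-bit gray pixels, 16 at a time,
// with unsigned saturation (255 stays 255 instead of wrapping around).
void AddBrightnessSSE2(unsigned char* pixels, int count, unsigned char offset)
{
    __m128i delta = _mm_set1_epi8((char)offset);
    for (int i = 0; i < count; i += 16)
    {
        __m128i p = _mm_load_si128((__m128i*)(pixels + i));
        p = _mm_adds_epu8(p, delta);   // saturating add across all 16 bytes
        _mm_store_si128((__m128i*)(pixels + i), p);
    }
}

Hand-tuned assembly can still beat this, as noted above, but the intrinsics version stays readable and portable across compilers.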
-
Why isn't anyone talking about MATLAB?