Why is the .NET CLR so slow compared to JVM, Dart or V8? Call for help
-
Is that an XNA vector class? http://msdn.microsoft.com/en-us/library/microsoft.xna.framework.vector3.aspx
Bastard Programmer from Hell :suss: If you can't read my code, try converting it here[^]
no, it's just a user-defined class:
public class /*struct*/ Vector3f
{
    public number x, y, z;

    public Vector3f(number x = 0, number y = 0, number z = 0)
    {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    public static Vector3f operator -(Vector3f a, Vector3f b)
    {
        return new Vector3f(a.x - b.x, a.y - b.y, a.z - b.z);
    }
-
I'm struggling to understand the cause of the poor performance of the .NET virtual machine when compared to Java (JVM), DartVM or even JavaScript (V8) in this Raytracer demo. In brief: I've written a ray tracer program in various languages (C#, Dart, TypeScript and Java) that I compile both for native virtual machines (CLR, JVM, DartVM) and for browsers (thanks to JavaScript compilation). Now, when comparing them all, C# running natively is astonishingly slow. What Java calculates in 5 seconds takes .NET 40 seconds: that is 8 times slower, and 3x slower than Dart or JavaScript. On top of that, what is really embarrassing is that C# compiled to JavaScript and running in Chrome is 2.5x faster than native C#! How is that possible? I tried all possible compiler switches, but nothing seems to cut the execution times. Do you have a possible explanation? Is .NET really that slow? If you like, please have a look at the GitHub repo and run the tests independently. There must be a flaw somewhere that I can't see. Please help me find it.
Interesting. Did you get to the bottom of this? .NET is fast, so something is amiss. I don't know about web stuff like V8, but this looks like code that runs on the CPU rather than the GPU. Pete might be onto something with the number of object instantiations, but these are stupidly fast. It's the GC that comes later that slows things down. I notice you swapped the class for a struct - did that make any difference?
Regards, Rob Philpott.
-
I just downloaded your code and ran the C# version. Running the Debug version under Visual Studio took 54 seconds. Running the Release version under Visual Studio took 43 seconds. Running the Release version from the command line took 14 seconds. Part of the problem here is that you are running everything on the UI thread, so there is nothing at all being offloaded to the GPU.
-
return new Vector3f(a.x - b.x, a.y - b.y, a.z - b.z);
This fetches object a and then the value of its x field, fetches object b and the value of its x field, and subtracts one from the other. It does that three times, then creates a new vector and returns it. The calculations would benefit if the values were local variables. Furthermore, I'm fairly certain that the "number" datatype does not exist in C#; we call that a "double".
Bastard Programmer from Hell :suss: If you can't read my code, try converting it here[^]
-
Yes, there was a huge improvement (~50%), but it is still far from Java's performance. Also, it's not very fair to use a struct for the benchmark, as the other VMs do not have value types (OK, I see the argument for using the best of each language).
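For readers wondering what the class-to-struct swap looks like, here is a minimal sketch. It assumes `number` is an alias for `double`; the `readonly` modifiers and the `Dot` helper are illustrative additions, not taken from the original repository.

```csharp
using System;

// Hypothetical value-type version of the Vector3f class quoted in the thread.
// As a struct, each result of operator- is a plain value rather than a new
// heap object, so the garbage collector has nothing to clean up afterwards.
public readonly struct Vector3f
{
    public readonly double x, y, z;

    public Vector3f(double x, double y, double z)
    {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    // Component-wise subtraction; returns a new value, no heap allocation.
    public static Vector3f operator -(Vector3f a, Vector3f b)
    {
        return new Vector3f(a.x - b.x, a.y - b.y, a.z - b.z);
    }

    // Illustrative helper (assumption): dot product of two vectors.
    public static double Dot(Vector3f a, Vector3f b)
    {
        return a.x * b.x + a.y * b.y + a.z * b.z;
    }
}
```

Whether this helps in a given program still has to be measured: structs are copied when passed by value, so larger ones can cost more than they save.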
-
Eddy Vluggen wrote:
Furthermore I'm fairly certain that the "number" datatype does not exist in C#; we call that a "double".
There's a
using number = System.Double;
at the top of the file, to make it easy to switch between float and double.
-
I've found your big holdup. It's this line:
if(x==639) Document.canvas.Refresh();
Basically, you're forcing the user control to refresh its content here, which is slowing you down because you're refreshing on the same thread that you're performing your calculations on.
-
I tried calculating the bitmap rows on a separate thread, but it was ~2% slower (I guess because of the thread setup overhead). So the problem isn't the UI thread. I did it sequentially (one single separate thread, not multiple threads) to keep it comparable with the other versions (Java, Dart, etc.).
-
The problem IS the UI thread. If you don't believe me, which you plainly don't, comment that line out and run it again. You're doing a bare bones GDI implementation here, and this is notoriously inefficient in C#. As this is something that no sane person would ever do in C# using this pattern, you really aren't comparing like for like here. The normal way to do this would be to use unsafe code and standard techniques such as LockBits.
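A rough sketch of the idea behind the LockBits approach: render every pixel into one flat byte array, then hand the whole array to the bitmap in a single step. The `FrameBuffer` class and its method names are hypothetical, and the 32-bit BGRA layout is an assumption. The final copy into a `Bitmap` (via `Bitmap.LockBits`, `Marshal.Copy` and `Bitmap.UnlockBits`) is only described in a comment so that the sketch stays free of `System.Drawing`.

```csharp
using System;

// Hypothetical in-memory frame buffer: pixels are written with plain array
// stores, so no GDI call happens inside the rendering loop.
public static class FrameBuffer
{
    // Assumption: 32-bit pixels stored as B, G, R, A (the byte order used by
    // PixelFormat.Format32bppArgb on little-endian machines).
    public const int BytesPerPixel = 4;

    // Allocates one buffer for the whole image.
    public static byte[] Create(int width, int height)
    {
        return new byte[width * height * BytesPerPixel];
    }

    // Writes a single opaque pixel at (x, y).
    public static void SetPixel(byte[] buffer, int width, int x, int y,
                                byte r, byte g, byte b)
    {
        int i = (y * width + x) * BytesPerPixel;
        buffer[i] = b;
        buffer[i + 1] = g;
        buffer[i + 2] = r;
        buffer[i + 3] = 255; // alpha
    }

    // After rendering, the buffer would be copied into a Bitmap once:
    // lock the bitmap with LockBits, copy the bytes to BitmapData.Scan0
    // with Marshal.Copy, then call UnlockBits. For 32-bit formats the
    // stride equals width * 4, so a single copy is enough.
}
```

The point of the design is that the per-pixel cost becomes an array write, and the expensive bitmap and control update happens once per frame instead of once per pixel or per row.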
-
Pete O'Hanlon wrote:
If you don't believe me, which you plainly don't, comment that line out and run it again.
I did that, and went one better: I made a Console version of the program, where all the calculations are done "in memory". As expected, it didn't gain much: it went from 33 seconds down to only 31 seconds (on this machine). To me this clearly shows that it's not a graphics issue, and it's not a UI thread issue.
-
And yet when I took your existing code and commented out one line, the Refresh line, it ran in less than 10 seconds. Oh look, a graphics and UI issue. Could it possibly be that I know something of what I'm talking about, having dealt with the various vagaries of this stuff for nigh on 14 years? It's easy for you to replicate this: comment out the Refresh call on your canvas and run your app in Release mode from the command line (don't run it inside Visual Studio).
-
Sorry, I don't want to seem crazy or stubborn, but I get different numbers here! This one was taken with all the .Refresh() calls commented out, running outside Visual Studio:
[Picture 1: no refresh]
In this other one you can see the original (Refresh not commented out, run outside VS) and the Console version (run from the prompt):
[Picture 2: original + console side by side]
So to sum up:
- Original in UI thread: 18 secs
- Refresh commented out: 17 secs
- Console: 16 secs
These are my numbers.
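Since the numbers in this thread differ so much between machines and launch methods, a small timing helper may help keep measurements comparable. This is a sketch only: the `Timing` class and `Measure` method are hypothetical names, while `System.Diagnostics.Stopwatch` and `Debugger.IsAttached` are standard .NET APIs. The warning reflects the observation earlier in the thread that the same Release build ran much slower when started from Visual Studio.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for timing one run of a piece of work.
public static class Timing
{
    // Runs the work once and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action work)
    {
        // A run started under the debugger is not representative,
        // so flag it instead of silently reporting the number.
        if (Debugger.IsAttached)
        {
            Console.Error.WriteLine(
                "Warning: debugger attached; timings may be inflated.");
        }

        Stopwatch sw = Stopwatch.StartNew();
        work();
        sw.Stop();
        return sw.Elapsed;
    }
}
```

A call such as `Timing.Measure(() => RenderScene())`, where `RenderScene` stands in for the ray tracer's entry point, would then be run from a Release build started at the command prompt.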
-
I get about 10s on my machine with the .NET version. I have not tested the others, but 40 seconds seems way too long. I've actually written a ray tracer in C# before, and it could render images very quickly (it included advanced features like transparency as well). There is definitely something wrong. I see a bunch of odd things that could be making this much less efficient than it could be:
* Color should probably be a struct instead of a class (but measure and find out).
* The stopwatch class uses DateTime, which is really, really bad for measuring small amounts of time. It's slow and imprecise for this usage. Use System.Diagnostics.Stopwatch instead.
* Outputting to the console is slow. Even at once per row, it might be enough to skew the timing.
** Related: calling String.Format is expensive. It might not matter for this application.
* Profiling this shows that the most expensive method is Sphere.Intersect. I suspect it's just the math and the fact that you're calling it so much.
* You can optimize the Sphere.Intersect method a bit more. You can store a precalculated radiusSquared value. You can put off the call to Math.Sqrt by squaring both sides of the equation and still checking against 0. You only need to call Sqrt when distance > 0.
* There are a bunch of foreach statements. In many cases these can be transformed into for-loops by the JIT, but not always. Enumerator.MoveNext is showing up in the profile, so consider changing these.
Lastly, what is the CPU usage like during each version of the program? Could other versions be automatically parallelizing some of the loops? It seems like a long shot, but I don't know. And yeah, I know it's an obvious plug, but I've written an entire book on .NET performance. See my signature. It can teach you a bunch of stuff like the above but, more importantly, how to measure and diagnose these problems. In the end, .NET is really just x86/x64 code running on a processor, just like anything else. To see such a wide disparity in running times, especially compared to JavaScript, is a red flag that something major is wrong.
Ben Watson Author, Writing High-Performance .NET Code
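The Sphere.Intersect suggestions above could be sketched roughly as follows. This is not the code from the repository: the class layout, the parameter list and the convention of returning -1 for a miss are assumptions made for illustration. It uses the common geometric form of the ray/sphere test, which compares squared distances first and only calls `Math.Sqrt` once a hit is certain.

```csharp
using System;

// Hypothetical sphere with a precomputed squared radius.
public sealed class Sphere
{
    private readonly double cx, cy, cz;
    private readonly double radiusSquared; // computed once, not per ray

    public Sphere(double cx, double cy, double cz, double radius)
    {
        this.cx = cx;
        this.cy = cy;
        this.cz = cz;
        radiusSquared = radius * radius;
    }

    // Returns the distance along the ray to the nearest hit, or -1 for a miss.
    // Assumptions: (dx, dy, dz) is normalized and the ray origin lies outside
    // the sphere.
    public double Intersect(double ox, double oy, double oz,
                            double dx, double dy, double dz)
    {
        // Vector from the ray origin to the sphere centre.
        double lx = cx - ox, ly = cy - oy, lz = cz - oz;

        // Length of its projection onto the ray direction.
        double tca = lx * dx + ly * dy + lz * dz;
        if (tca < 0) return -1; // centre is behind the ray origin

        // Squared distance from the centre to the ray; compared against
        // radiusSquared so that no square root is needed to reject a miss.
        double d2 = lx * lx + ly * ly + lz * lz - tca * tca;
        if (d2 > radiusSquared) return -1;

        // Only confirmed hits pay for the square root.
        return tca - Math.Sqrt(radiusSquared - d2);
    }
}
```

For example, a unit sphere centred at (0, 0, 5) hit by a ray from the origin along +z gives a distance of 4, while a ray along +y misses and returns -1. Rays that miss, usually the majority in a scene, never reach the `Math.Sqrt` call.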
-
Thank you for your reply, it's really appreciated :-) I totally agree with your points: there is a lot to optimize in the code. But that still doesn't explain why the Java and Dart versions are much faster (and they don't even have structs). One could point out that it's like comparing apples and oranges, but Java and C# are very close, sharing so many traits that the two source codes can be compared line by line. So what exactly is the reason why .NET is so slow here? Do you have an idea?