Writing Faster Managed Code

MtnBiknGuy

Jan Gray of the Microsoft CLR Performance Team wrote an excellent article titled "Writing Faster Managed Code: Know What Things Cost" (http://msdn.microsoft.com/library/?url=/library/en-us/dndotnet/html/fastmanagedcode.asp). The article has prompted me to look for answers to the following questions. Any replies from the group are appreciated. The questions probably (hopefully) have very simple answers.

Between these two foreach loops, is there any difference in performance, safety or correctness? (All examples are in C#.)

1. Declare and initialize the new object in the body of the loop:

    foreach (T t in tList)
    {
        A a = new A(t.ID);
        // do more work
        myCollection.Add(a);
    }

2. Declare the variable outside the loop, and 'new' it inside the loop:

    A a = null;
    foreach (T t in tList)
    {
        a = new A(t.ID);
        // do more work
        myCollection.Add(a);
    }

I would think #2 is preferable, but I see that most C# examples use the style of #1. In fact, I often see the following in C# examples:

    T MyMethod()
    {
        T t = new T();
        return t;
    }

This would be a big problem in C++, but in C# the garbage collector makes it OK. But does that mean it's a good practice? Why is it so common in examples?

Finally, I'm curious whether there are any performance differences between the following two loops. The reason I ask is that I'm one of those people who (prior to reading the above-mentioned article) pulled the array.Length property out of the loop header in the interest of avoiding extra calls to get the length value. I now understand why that is not optimal. I'm curious whether any processor or compiler optimizations make choice 1 equal to or better than choice 2 below:

1. Repeat the same property calls inside the loop:

    for (int i = 0; i < 10; ++i)
    {
        DoAlgorithm1(i, T.x);
        DoAlgorithm2(i, T.x + y);
        DoAlgorithm3(i, T.x + y);
    }

2. Pull the property calls outside the loop:

    double x = T.x;
    double w = x + y;
    for (int i = 0; i < 10; ++i)
    {
        DoAlgorithm1(i, x);
        DoAlgorithm2(i, w);
        DoAlgorithm3(i, w);
    }

All input is appreciated!

Daniel Turini

Optimization is fun, but not easy... No single rule is always true. Combine this with multi-threading (think about SMP machines and the newer hyperthreading processors) and optimization becomes almost a trial-and-error game, although some basic techniques do help a bit.

MtnBiknGuy wrote: I would think #2 is preferable, but I see that most C# examples use the style of #1. In fact, I often see the following in C# examples:

#2 does one useless assignment; #1 doesn't. But to settle the question, remove the "= null" (BTW, initializing to null like this isn't a good practice, as it defeats some useful compiler warnings) and just look at the generated code with ILDASM. I bet they generate exactly the same IL, as the maximum stack space is allocated on the method's entry.

MtnBiknGuy wrote: Finally, I'm curious if there are any performance differences between the following two loops.

If the JIT is as good as MS claims, #1 could even be slightly faster than #2. MS says the JIT does common subexpression elimination and code motion of loop invariants, which is exactly what you did by hand, but sometimes the JIT can do it without needing another local variable (imagine that you have several loops). Notice that the JIT will sometimes do a better job than you.

If T.x is a virtual property, and not a field, you can't safely do what you showed: T.x could have a side effect, e.g. incrementing a counter to keep a usage count. Although the JIT sometimes doesn't have all the info or time it needs to make every possible optimization, it has all the info it needs to always act safely.

On a side note, you should take all of your timings on Release builds, because the optimizer is disabled in Debug builds.
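
To make the side-effect point concrete, here is a minimal sketch. The names Sample, ReadCount and DoAlgorithm are hypothetical stand-ins for the T and DoAlgorithmN of the question, and the counting getter is only an assumed example of a side effect. It shows that hoisting the property read by hand changes how many times the getter runs, which is why neither you nor the JIT can treat the two loops as equivalent in general.

```csharp
using System;

// Hypothetical type standing in for the "T" of the question.
// The getter counts its own calls: an assumed example of a side effect.
class Sample
{
    private readonly double x;
    private int readCount;

    public Sample(double x) { this.x = x; }

    public int ReadCount { get { return readCount; } }

    public virtual double X
    {
        get { readCount++; return x; }
    }
}

static class Program
{
    // Accumulates results so the calls have an observable effect.
    static double sink;

    static void DoAlgorithm(int i, double value) { sink += i * value; }

    static void Main()
    {
        const double y = 2.0;

        // Choice 1: read the property inside the loop.
        Sample a = new Sample(1.5);
        for (int i = 0; i < 10; ++i)
        {
            DoAlgorithm(i, a.X);
            DoAlgorithm(i, a.X + y);
            DoAlgorithm(i, a.X + y);
        }

        // Choice 2: hoist the property read out of the loop by hand.
        Sample b = new Sample(1.5);
        double x = b.X;
        double w = x + y;
        for (int i = 0; i < 10; ++i)
        {
            DoAlgorithm(i, x);
            DoAlgorithm(i, w);
            DoAlgorithm(i, w);
        }

        Console.WriteLine("in-loop reads: " + a.ReadCount); // prints 30
        Console.WriteLine("hoisted reads: " + b.ReadCount); // prints 1
    }
}
```

If the getter were a plain field read with no side effects, both versions would compute the same thing, and the only remaining question would be speed, which is best settled by timing a Release build.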