64bit JIT practices Defensive Programming
-
Fair enough. But maybe you're starting to answer your own question... "The condition codes used by the Jcc, CMOVcc, and SETcc instructions are based on the results of a CMP instruction." "JZ - Jump near if 0 (ZF=1)"
Sure, but then you might as well not compare and jmp without a condition - or better yet, just immediately put the code there. That's what the 32bit version does.
bob16972 wrote:
"The condition codes used by the Jcc, CMOVcc, and SETcc instructions are based on the results of a CMP instruction."
That's true, but also misleading. They all use RFlags, or whatever you want to call it. Most arithmetic instructions also change the flags (though inc/dec leave the carry flag untouched and can therefore introduce a partial flag stall on some architectures).
-
It might appear that the 32-bit listing you posted has been optimized and the 64-bit listing is a typical debug-build literal translation of the code, as if you were listing a release build for 32-bit and a debug build for 64-bit of the same thing. (I'm not saying this is what happened, but the instruction listings seem pretty reasonable given the number of factors involved. This is just what it looks like to me.) There is nothing disturbing from what I can see. I'm guessing you're potentially being misled by the 32-bit build optimizing out the alternate branch path, or by some other factors that contribute to the illusion of a problem. It is pretty typical for a debug build to leave dead code paths in there and such. I'm seeing it roughly like this, with some liberties taken for brevity...
cmp rax,rax ; if (typeof(int) == typeof(int))
jz throw_exception
xor eax,eax ; else and then later return eax = 0
add rsp,20h
pop rbx
rep ret
BTW, what exactly was the original problem or issue?
-
It wasn't a debug build, I'm 100% certain of that. And you're right, it does look somewhat like a debug build (but it isn't, the debug one is even more cluttered) and that's exactly what the problem is - a proper release build that is run without a debugger attached still manages to look like a debug build.
bob16972 wrote:
BTW, what exactly was the original problem or issue?
I wanted to know whether comparing types for equality like that was fast, or whether it would first build a full Type object and then compare that. Turns out it's "pretty fast, but still stupid". By the way, if you (or anyone else) don't believe it, I invite you all to go try it. I'm not making this up.
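For anyone who wants to try it, here is a minimal timing sketch of that comparison. It is not the code from the original post: the names TypeCheck, IsInt and the iteration count are made up for illustration, and the check returns a flag instead of throwing so it can sit in a loop. The numbers will depend on the runtime version, the bitness of the process, and whether a debugger is attached, so treat it as a rough probe rather than a benchmark.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: the same typeof(T) == typeof(int) test as in the
// original post, but returning a bool instead of throwing.
public static class TypeCheck<T>
{
    public static bool IsInt()
    {
        return typeof(T) == typeof(int);
    }
}

public static class Program
{
    public static void Main()
    {
        const int iterations = 100000000; // arbitrary count, chosen for illustration
        int hits = 0;

        // Call each instantiation once so JIT compilation is not measured.
        TypeCheck<int>.IsInt();
        TypeCheck<string>.IsInt();

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            if (TypeCheck<int>.IsInt()) hits++;    // true for T = int
            if (TypeCheck<string>.IsInt()) hits++; // false for T = string
        }
        sw.Stop();

        Console.WriteLine("hits = {0}, elapsed = {1} ms", hits, sw.ElapsedMilliseconds);
    }
}
```

Build it in release mode and run it without a debugger attached, once as a 32-bit and once as a 64-bit process, to compare the two JIT compilers.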
-
Daniel Scott wrote:
By the way, if you (or anyone else) don't believe it or such, I invite you all to go try it. I'm not making this up.
I hope I didn't come across as challenging your statements; I was just posting my thoughts as they evolved. I try to behave myself when it comes to criticizing what someone says about what they are observing. If an alternate explanation seems more plausible at first, it is forgivable to entertain it for a bit, and the tendency to gravitate toward it seems reasonable for the first 5 minutes. Regardless, I'm still hoping there's a reasonable explanation for why it would leave that code in there for an optimized build. Anyway, good luck and best regards!
-
Just a couple of hours ago I found a strange issue that affects the 64bit JIT compiler, but not the 32bit JIT compiler. If in C# you do this:
static void Main(string[] args)
{
    A<int> instance = new A<int>();
    instance.Test();
}

class A<T>
{
    public A() { }

    public int Test()
    {
        if (typeof(T) == typeof(int))
            throw new Exception();
        else
            return 0;
    }
}
The 32bit JIT compiler does this:
push eax
mov ecx,79330CB8h
call FF9B1F84
mov dword ptr [esp],eax
mov ecx,eax
call 77464C88
mov ecx,dword ptr [esp]
call 77CECE6F
It may not be very clear what's going on here if you're not used to reading JIT-ed code, but what it doesn't do is more important. The 64bit JIT compiler is crazy and does this:
push rbx
sub rsp,20h
mov rax,6427843D998h
cmp rax,rax ; WTF?!?!
jz throw_exception
xor eax,eax
add rsp,20h
pop rbx
rep ret
nop dword ptr [rax] ; this aligns throw_exception to 16
throw_exception:
mov rcx,642784369F0h
call FFFFFFFFFF4803F0
mov rbx,rax
mov rcx,rbx
call FFFFFFFFF871F310
mov rcx,rbx
call FFFFFFFFFF8C7E20
The bottom half looks familiar - it's the 64bit equivalent of what the 32bit JIT compiler produces. But the first part, that's the problem. Somehow the JIT compiler missed that comparing an integer to itself is not a very productive thing to do (and worse, the integer is a constant). So it is checking whether 0x6427843D998 still equals 0x6427843D998, and if so it throws an exception. Just in case you are wondering, typeof(int).GetHashCode() is 0x7843D998.
Daniel Scott wrote:
Somehow the JIT compiler missed that comparing an integer to itself is not a very productive thing to do (and worse, the integer is a constant)
Yes and no. In this case, to do the jz, you have to have a comparison. Where the optimizer is whacked is that there is no need for a comparison and it should just throw the exception. (A good optimizer would "see" that there is only one instance and would optimize for that one instance. The C++ compiler does this all the time.)
-
Yeah, seems unnecessary. What about if you tweak the optimization options?
-
Are they tweakable? How?
-
Even if it somehow had to use jz and needed to set the z flag, it could just have done cmp eax,eax (without the REX.W prefix) without loading anything in eax first.
-
I haven't tried it, but the C# compiler has optimization options. Looks less complicated than the C++ compiler options, so maybe it's just a single switch to optimize or not optimize.
-
My guess is that the optimizer originally set the code up to handle more than one instance of the object. When it "realized" there was only one instance, it partially optimized away some code. Like I said, the entire code block is bogus for both 32 and 64-bit since it will always throw an exception. (I'm just trying to explain, not defend. I find the C# optimizer generally completely sucks, which throws a big monkey wrench into the "JIT will make .NET better" crowd. I've never seen ANY JIT compiler do more than a half-assed job. Maybe they will some day, but not today.)
-
Oh that. Yes, it does make a difference: without the optimize switch the code is even more horrible. But sadly it was already on (it is the default for release builds), so I really have to blame the JIT compiler for this.
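If anyone wants to rule out the "debug build" explanation on their own machine, something like the following sketch could be compiled into the test program. The class and method names (BuildInfo, IsJitOptimizerDisabled) are invented for this example. It only reports what the assembly metadata and the process say, under the assumption that a build with optimizations off carries a DebuggableAttribute with the JIT optimizer disabled; it does not prove what the JIT actually emitted.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

// Hypothetical helper for checking how the running program was built and launched.
public static class BuildInfo
{
    // True if the assembly is marked as having JIT optimizations disabled
    // (typically the case for debug builds).
    public static bool IsJitOptimizerDisabled(Assembly assembly)
    {
        DebuggableAttribute attr = (DebuggableAttribute)
            Attribute.GetCustomAttribute(assembly, typeof(DebuggableAttribute));
        return attr != null && attr.IsJITOptimizerDisabled;
    }

    public static void Main()
    {
        Assembly current = Assembly.GetExecutingAssembly();

        Console.WriteLine("64-bit process:         {0}", IntPtr.Size == 8);
        Console.WriteLine("JIT optimizer disabled: {0}", IsJitOptimizerDisabled(current));
        Console.WriteLine("Debugger attached:      {0}", Debugger.IsAttached);
    }
}
```

For the case discussed here you would expect a 64-bit process, the optimizer not disabled, and no debugger attached. Starting the program from a debugger can make the JIT produce less optimized code even for a release build, so the disassembly is best inspected by attaching after the method has already been compiled.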
-
In general the optimizer isn't very good, I know. This case struck me as particularly silly though - comparing a constant with itself, really? And the 32bit JIT compiler does get it right, that one just throws the exception without checking whether integer equality is still a reflexive property :)