Does anyone know of a good guide to the MSIL JIT compiler?
-
I'm aware of that. I am generating MSIL instructions using Reflection Emit as part of my project. The other part generates source code. I would like to ensure that this source code generates IL that will then be optimized appropriately by the JITter. If not, I will generate the source code differently, but my interest is in the post-JIT native code, not the IL.
Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix
"
Quote:
Running the code through a debugger and dropping to assembly. The only way I can do that reliably is with debug info, which may change how the JITter drops native instructions. I can't rely on it.
Probably, the answer is here: Do PDB Files Affect Performance? Generally, the answer is: No. Debugging information is just an additional file, which helps the debugger match the native instructions to the source code. Of course, if implemented correctly. The article is written by John Robbins.
-
"
Quote:
Running the code through a debugger and dropping to assembly. The only way I can do that reliably is with debug info, which may change how the JITter drops native instructions. I can't rely on it.
Probably, the answer is here: Do PDB Files Affect Performance? Generally, the answer is: No. Debugging information is just an additional file, which helps the debugger match the native instructions to the source code. Of course, if implemented correctly. The article is written by John Robbins.
I think that's about unmanaged code, and not the JITter
Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix
-
Quote:
I think that's about unmanaged code, and not the JITter
Well, buzzwords like .NET, VB .NET, C#, JIT compiler, ILDASM are used in this article only by accident. You are right.
-
Quote:
Well, buzzwords like .NET, VB .NET, C#, JIT compiler, ILDASM are used in this article only by accident. You are right.
I am tired and I read the first bit of it. Sorry. It's 3am here and I shouldn't be awake.
Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix
-
Basically I am not sure about a number of things regarding how it works. For example:

if((this.current >= 'A' && this.current <= 'Z') ||
   (this.current >= 'a' && this.current <= 'z')) {
    // do something
}

In MSIL you'd have to pepper the IL you drop for that `if` construct with a bunch of extra `Ldarg_0` instructions to retrieve the `this` reference for *each* comparison. On x86 CPUs (and, well, most any CPU with registers, which IL doesn't really have unless you stretch the terminology to include its list of function arguments and locals) you'd load the `this` pointer into a register and work off that, rather than repeatedly loading it onto the stack every time you need to access it as you would in IL. On pretty much any supporting architecture this is much faster than hitting the stack. Maybe an order of magnitude. So my question is, for example: is the JIT compiler smart enough to resolve those repeated `Ldarg_0`s into register accesses?
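To make that concrete, here is roughly the kind of emit sequence I mean, as a Reflection.Emit sketch (the `Lexer` class, the `current` field, and the method name are made-up placeholders for illustration, not my actual generated code):

```csharp
// Rough sketch only: Lexer/current/IsAsciiLetter are illustrative names.
using System;
using System.Reflection;
using System.Reflection.Emit;

public class Lexer
{
    public char current;
}

public static class Demo
{
    public static void Main()
    {
        FieldInfo current = typeof(Lexer).GetField(nameof(Lexer.current));
        var method = new DynamicMethod("IsAsciiLetter", typeof(bool), new[] { typeof(Lexer) });
        ILGenerator il = method.GetILGenerator();
        Label tryLower = il.DefineLabel();
        Label isMatch = il.DefineLabel();
        Label noMatch = il.DefineLabel();
        Label done = il.DefineLabel();

        // (this.current >= 'A' && this.current <= 'Z')
        il.Emit(OpCodes.Ldarg_0);              // load 'this' (argument 0)
        il.Emit(OpCodes.Ldfld, current);
        il.Emit(OpCodes.Ldc_I4, (int)'A');
        il.Emit(OpCodes.Blt_S, tryLower);      // current < 'A' -> try the lowercase range
        il.Emit(OpCodes.Ldarg_0);              // load 'this' again...
        il.Emit(OpCodes.Ldfld, current);
        il.Emit(OpCodes.Ldc_I4, (int)'Z');
        il.Emit(OpCodes.Ble_S, isMatch);       // 'A' <= current <= 'Z' -> match

        // (this.current >= 'a' && this.current <= 'z')
        il.MarkLabel(tryLower);
        il.Emit(OpCodes.Ldarg_0);              // ...and again...
        il.Emit(OpCodes.Ldfld, current);
        il.Emit(OpCodes.Ldc_I4, (int)'a');
        il.Emit(OpCodes.Blt_S, noMatch);
        il.Emit(OpCodes.Ldarg_0);              // ...and a fourth time
        il.Emit(OpCodes.Ldfld, current);
        il.Emit(OpCodes.Ldc_I4, (int)'z');
        il.Emit(OpCodes.Bgt_S, noMatch);

        il.MarkLabel(isMatch);
        il.Emit(OpCodes.Ldc_I4_1);
        il.Emit(OpCodes.Br_S, done);
        il.MarkLabel(noMatch);
        il.Emit(OpCodes.Ldc_I4_0);
        il.MarkLabel(done);
        il.Emit(OpCodes.Ret);

        var isLetter = (Func<Lexer, bool>)method.CreateDelegate(typeof(Func<Lexer, bool>));
        Console.WriteLine(isLetter(new Lexer { current = 'Q' }));  // True
        Console.WriteLine(isLetter(new Lexer { current = '3' }));  // False
    }
}
```

Four separate `Ldarg_0`/`Ldfld` pairs for a single condition; the question is whether the JITter folds those into one register load.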
That's just one thing I want to know. Some avenues of research I considered to figure this out:

1. Running the code through a debugger and dropping to assembly. The only way I can do that reliably is with debug info, which may change how the JITter drops native instructions. I can't rely on it.
2. Using ngen and then disassembling the result, but again, that's not JITted but precompiled, so things like whole-program optimization are in play. I can't rely on it.

And I can't find any material that will help me figure that out short of the very dry and difficult specs they release, which I'm not even sure tell me that, since the JIT compiler's actual implementation details aren't part of the standard. What I'm hoping for is something some clever Microsoft employee or blogger wrote that describes the behavior of Microsoft's JITter in some detail. There are some real world implications for some C# code that my library generates. I need to make some decisions about it and I feel like I don't have all the information I need.
Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix
-
Quote:
Even if it did, I wouldn't assume that it always would, or that it would do so on all systems. I would code explicitly and not use behaviour that isn't part of the doco.
Well, I didn't ask you what you would do. And this isn't bizdev
Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix
-
Probably not, since at best it uses Emit facilities and has nothing to do with the final JITter output
Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix
-
Quote:
Probably not, since at best it uses Emit facilities and has nothing to do with the final JITter output
The JIT (as well as the rest of the runtime) is also [open source](https://github.com/dotnet/runtime/tree/main/src/coreclr/jit) - there's an `optimizer.cpp` in that directory, which might be of interest. Also in that directory is a file (`viewing-jit-dumps.md`) which talks about looking at disassembly, and also mentions [a Visual Studio plugin, Disasmo](https://marketplace.visualstudio.com/items?itemName=EgorBogatov.Disasmo), that simplifies this process. [Edit]Another option - use Godbolt - [it supports C#](https://godbolt.org/z/vnqvGdqfe)![/Edit]
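For a quick local look, something along these lines should work (a sketch assuming .NET 7 or later, where `DOTNET_JitDisasm` is honoured by the regular release runtime; check `viewing-jit-dumps.md` for the exact knobs, and note that the `Lexer`/`IsAsciiLetter` names are only for illustration):

```csharp
// Run with, e.g. (Windows; use export on Linux/macOS):
//   set DOTNET_JitDisasm=IsAsciiLetter
//   set DOTNET_TieredCompilation=0     (skip the unoptimized tier-0 code)
//   dotnet run -c Release
// The JIT prints the generated x64/arm64 assembly for the named method to the console.
using System;
using System.Runtime.CompilerServices;

class Lexer
{
    public char current;

    [MethodImpl(MethodImplOptions.NoInlining)]   // keep the method from being inlined away
    public bool IsAsciiLetter()
    {
        return (current >= 'A' && current <= 'Z') ||
               (current >= 'a' && current <= 'z');
    }
}

class Program
{
    static void Main()
    {
        var lexer = new Lexer { current = 'Q' };
        Console.WriteLine(lexer.IsAsciiLetter());   // force the method to be jitted
    }
}
```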
Java, Basic, who cares - it's all a bunch of tree-hugging hippy cr*p
-
Quote:
Basically I am not sure about a number of things regarding how it works [...]
Why not download ILSpy and have a nosey at the produced IL code? Just compile your application in release mode and take a look at the produced IL to see whether it's been optimised. I would hazard a guess that it probably doesn't optimise something like that, but I could be wrong!
-
Quote:
Why not download ILSpy and have a nosey at the produced IL code? Just compile your application in release mode and take a look at the produced IL to see whether it's been optimised. I would hazard a guess that it probably doesn't optimise something like that, but I could be wrong!
Because I'm not interested in the IL code, but in the post-JIT native code.
Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix
-
Quote:
The JIT (as well as the rest of the runtime) is also [open source](https://github.com/dotnet/runtime/tree/main/src/coreclr/jit) - there's an `optimizer.cpp` in that directory, which might be of interest. [...]
Oh wow. I learned two new things from your post. Thanks! Will check that out.
Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix
-
Quote:
Basically I am not sure about a number of things regarding how it works [...] There are some real world implications for some C# code that my library generates. I need to make some decisions about it and I feel like I don't have all the information I need.
"real world implications for some C# code that my library generates" I don't think this is a good line of research for that reason. They will change the JIT in the future. I wouldn't be surprised if there are minor version updates that change how it works. So how are you going to validate that optimizations that you put into place for one single version will continue to be valid for every version in the future and in the past?
-
"real world implications for some C# code that my library generates" I don't think this is a good line of research for that reason. They will change the JIT in the future. I wouldn't be surprised if there are minor version updates that change how it works. So how are you going to validate that optimizations that you put into place for one single version will continue to be valid for every version in the future and in the past?
If it's such a significant difference in the generated code then yes. Especially because in the case I outlined (turns out it does register access after all though) it would require relatively minor adjustments to my generated code to avoid that potential performance pitfall, and do so without significantly impacting readability. I don't like to wait around and hope that Microsoft will one day do the right thing. I've worked there.
Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix
-
Quote:
If it's such a significant difference in the generated code then yes. Especially because in the case I outlined (turns out it does register access after all though) it would require relatively minor adjustments to my generated code to avoid that potential performance pitfall, and do so without significantly impacting readability. I don't like to wait around and hope that Microsoft will one day do the right thing. I've worked there.
I don't know of any developer using the gcc compiler suite who studies one or more of the code generators (quite a bunch are available) to learn how it works, in order to modify their source code to make one specific code generator produce some specific binary code. Not even "minor code adjustments". The code generator part(s) of gcc is a close parallel to the dotNet jitter. The IL is analogous to the API between the gcc source code parsers (and overall optimizer) and the gcc code generator. When you switch to a newer version of a gcc compiler, you do not adapt your C, C++, Fortran, or whatever, code to make one specific code generator create the very best code. Well, maybe you would do it, but I never met or heard of anyone else who would even consider adapting HLL source code to one specific gcc code generator.

...With one possible exception: Way back in time, when you would go to NetNews (aka Usenet) for discussions, there was one developer who very intensely claimed that the C compiler for the DEC VAX was completely useless! There was this one machine instruction that he wanted the compiler to generate for his C code, and he had found no way to force the compiler to do that. So the compiler was complete garbage! The discussion involved some very experienced VAX programmers, who could certify that this machine instruction would not speed up execution or reduce the code size at all. It would have had no advantages whatsoever. Yet the insistent developer continued insisting that when he wants that instruction, it is the compiler's d**n responsibility to provide a way to generate it. I guess that this fellow would go along with you in modifying the source code to fit one specific code generator.

This happened in an age when offline digital storage was limited to (expensive) floppies, and URLs were not yet invented. I found the arguing from this fellow so funny that I preserved it in a printout, where I can also find the specific instruction in question (I have forgotten which one), but that printout is buried deep down in one of my historical IT scrapbooks in the basement. I am not digging that up tonight.
Religious freedom is the freedom to say that two plus two make five.
-
Quote:
I don't know of any developer using the gcc compiler suite who studies one or more of the code generators (quite a bunch are available) to learn how it works, in order to modify their source code to make one specific code generator produce some specific binary code. [...]
You're comparing something that involves a total rewrite with a change that makes Advance() take an additional parameter, which it uses instead of current. So really, you're blowing this out of proportion.
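Roughly the kind of adjustment I mean, as a hypothetical sketch (the `Tokenizer` class and the method bodies are stand-ins for illustration, not the real generated code):

```csharp
// Hypothetical sketch of the generator tweak, not the actual generated code.
class Tokenizer
{
    char current;

    // Before: every comparison goes back through this.current,
    // which is a separate ldarg.0 + ldfld per test in the IL.
    bool Advance()
    {
        if ((this.current >= 'A' && this.current <= 'Z') ||
            (this.current >= 'a' && this.current <= 'z'))
        {
            // handle a letter
            return true;
        }
        return false;
    }

    // After: the caller passes the character in as an additional parameter, so the
    // comparisons work off a parameter the JIT can trivially keep in a register.
    bool Advance(char current)
    {
        if ((current >= 'A' && current <= 'Z') ||
            (current >= 'a' && current <= 'z'))
        {
            // handle a letter
            return true;
        }
        return false;
    }
}
```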
Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix
-
Quote:
Well, I didn't ask you what you would do. And this isn't bizdev
If you ask a question about some super-fine peephole optimization, an answer that says "Trying to do anything like that is a waste of your time" is an appropriate answer.

Years ago, I could spend days timing and fine-tuning code, testing out various inline assembly variants. Gradually, I came to realize that the compiler would beat me almost every time. Instruction sequences that "looked like" they were inefficient actually ran faster when I timed them. Since those days, CPUs have gotten even bigger caches, more lookahead, hyperthreading and what have you, all confusing tight timing loops to the degree of making them useless. Writing (or generating) assembler code to suppress single instructions was meaningful in the days of true RISCs (including pre-1975 architectures, when all machines were RISCs...) running at one instruction per cycle with (almost) no exceptions. Today, we are in a different world.

I really should have spent the time to hand-code the example you bring up in assembler, with and without the repeated register load, and time the two for you. But I have a very strong gut feeling of what it would show. I am so certain that I will not spend the time to do that for you.
Religious freedom is the freedom to say that two plus two make five.
-
Quote:
If you ask a question about some super-fine peephole optimization, an answer that says "Trying to do anything like that is a waste of your time" is an appropriate answer. [...]
I guess I just don't see looking at a new (to me) code generation tech to check that it's doing what I expect in terms of performance as a waste of time. To be fair, I also look at the native output of my C++ code, and I'm glad I have, even (if not especially) the times when it ruined my day, like when I realized how craptastic the ESP32 floating point coprocessor was.
Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix
-
Quote:
I guess I just don't see looking at a new (to me) code generation tech to check that it's doing what I expect in terms of performance as a waste of time. [...]
If you are working on a jitter for one specific CPU, or a gcc code generator for one specific CPU, and your task is to improve the code generation, then you would study common methods for code generation and peephole optimization. If you are not developing or improving a code generator (whether gcc or a jitter), the only reason to study the architecture of one specific one is curiosity. Not to modify your source code, not even with "minor adjustments".

It can be both educational and interesting to study what resides a couple of layers below the layer you are working on. But you should remember that it is a couple of layers down. You are not working at that layer, and should not try to interfere with it.

(I should mention that I grew up in an OSI protocol world. Not the one where all you know is that some people have something they call 'layers', but one where layers were separated by solid hulls, and service and protocol were as separate as oil and water. An application entity should never fiddle with TCP protocol elements or IP routing; it shouldn't even know that they are there! 30+ years of OO programming, interface definitions, private and protected elements -- and still, developers have not learned to keep their fingers out of lower layers, neither in protocol stacks nor in general programming!)
Religious freedom is the freedom to say that two plus two make five.
-
Quote:
You're comparing something that involves a total rewrite with a change that makes Advance() take an additional parameter, which it uses instead of current. So really, you're blowing this out of proportion.
What I am saying is: leave peephole optimizing to the compiler/code generator, and trust it to do that. We have been doing that kind of optimization since the 1960s (starting in the late 1950s!). It is routine work; any reasonably well trained code generator developer will handle it with their left hand. If you think you can improve on it, you are most likely wrong. And even if you manage to dig up some special case, "percent" is likely to be much too large a unit for the reduction in the execution time of some user level operation.

Spend your optimizing efforts on considering algorithms, and not least: data structures. These are way more essential to user perceived execution speed than register allocation. Do timing at the user level, not at the instruction level.
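For what it's worth, a minimal sketch of what user-level timing could look like here, using BenchmarkDotNet (my own suggestion of a tool, not something mentioned above; the names and code shapes are made up for illustration):

```csharp
// Requires the BenchmarkDotNet NuGet package; run with: dotnet run -c Release
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class AdvanceBenchmarks
{
    private readonly string _input = new string('Q', 50_000) + new string('3', 50_000);
    private char _current;

    [Benchmark(Baseline = true)]
    public int ViaField()
    {
        int letters = 0;
        for (int i = 0; i < _input.Length; i++)
        {
            _current = _input[i];
            // comparisons go through the field, roughly mimicking the original code shape
            if ((_current >= 'A' && _current <= 'Z') ||
                (_current >= 'a' && _current <= 'z'))
                letters++;
        }
        return letters;
    }

    [Benchmark]
    public int ViaParameter()
    {
        int letters = 0;
        for (int i = 0; i < _input.Length; i++)
            letters += IsLetter(_input[i]) ? 1 : 0;   // comparisons work off a parameter
        return letters;
    }

    private static bool IsLetter(char current) =>
        (current >= 'A' && current <= 'Z') ||
        (current >= 'a' && current <= 'z');
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<AdvanceBenchmarks>();
}
```

If the two variants come out within noise of each other at this level, the register-allocation question answers itself.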
Religious freedom is the freedom to say that two plus two make five.