Does anyone know of a good guide to the MSIL JIT compiler?

  honey the codewitch wrote:

    Basically, I am not sure about a number of things regarding how it works. For example:

    if ((this.current >= 'A' && this.current <= 'Z') ||
        (this.current >= 'a' && this.current <= 'z')) {
        // do something
    }

    In MSIL you'd have to pepper the IL you emit for that if construct with a bunch of extra Ldarg_0 instructions to retrieve the this reference for *each* comparison. On x86 CPUs (and, well, almost any CPU with registers, which IL doesn't really have unless you stretch the terminology to include its list of function arguments and locals) you'd load the this pointer into a register and work off that, rather than repeatedly loading it onto the stack every time you need to access it, as you would in IL. On pretty much any architecture with registers this is much faster than hitting the stack, maybe by an order of magnitude.

    So my question is, for example: is the JIT compiler smart enough to resolve those repeated Ldarg_0s into register access? That's just one thing I want to know.

    Some avenues of research I considered to figure this out:

    1. Running the code through a debugger and dropping to assembly. The only way I can do that reliably is with debug info, which may change how the JITter emits native instructions. I can't rely on it.
    2. Using ngen and then disassembling the result. But again, that's not JITted but precompiled, so things like whole program optimization are in play. I can't rely on it.

    And I can't find any material that will help me figure that out, short of the very dry and difficult specs they release, which I'm not even sure would tell me, since the JIT compiler's actual implementation details aren't part of the standard. What I'm hoping for is something some clever Microsoft employee or blogger wrote that describes the behavior of Microsoft's JITter in some detail.

    There are some real-world implications for some C# code that my library generates. I need to make some decisions about it, and I feel like I don't have all the information I need.

    Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix

    trønderen wrote (#4):

    For that specific question, "is the JIT compiler smart enough to resolve those repeated Ldarg_0s into register access?", you'll find the answer with orders of magnitude less effort (compared to learning the inner workings of a JIT compiler) by compiling and linking the code in a tiny test program, loading it into VS and displaying the disassembly.

    Another remark: JITting is essentially code generation - including peephole optimization. Code generation is inherently CPU dependent. x86, x64 and ARM require significantly different code generators. If they are developed by the same team, you can expect them to have a similar overall structure, but the actual code generation may be quite different - because the CPUs are different. Significant parts may have been created by different people, each of them an expert on one specific CPU. Maybe you'll see one optimization on ARM that you do not see on x86, or even the other way around. Maybe one optimization that you expected to see was omitted because it didn't give a speed increase at all on that specific processor (remember that jitting is done for one specific CPU, e.g. utilizing instruction set extensions available on the specific chip where the jitter is running).

    If I could spare the time, it sure would be fascinating to dig into the entire jitter for ARM, say, to learn how many of all the tricks in the book they have implemented. I guess it would be more or less the entire book, but not necessarily for all the latest instruction set extensions. Learning and fully understanding the entire ARM JITter would be a major task, though, way beyond finding out if one specific peephole optimization is applied on one specific CPU chip in one specific context.

    Religious freedom is the freedom to say that two plus two make five.


      honey the codewitch wrote (#5):

      Quote:

      For that specific question, "is the JIT compiler smart enough to resolve those repeated Ldarg_0s into register access?", you'll find the answer with orders of magnitude less effort (compared to learning the inner workings of a JIT compiler) by compiling and linking the code in a tiny test program, loading it into VS and displaying the disassembly.

      I must be a dunce, because I can't get it to disassemble in release.

      Quote:

      Another remark: JITting is essentially code generation - including peephole optimization. Code generation is inherently CPU dependent. x86, x64 and ARM require significantly different code generators.

      I would still expect them all to use registers (assuming the architecture supports it) if one does. Or if not, they will eventually. Looking at the x86 code gives me baseline information I can use to determine the code it produces on most machines, and some insight into how their code generation works generally. Yes they are different, but the performance priorities Microsoft assigns to them won't be. If the x86 JITter uses registers, the ARM one does too, and if it doesn't, it will get there.


        trønderen wrote (#6):

        Your first step is to see the disassembly in VS in the debug version. If a peephole optimization is applied in the debug version, you can be sure that it is applied in the release version as well! Debugging does not by definition imply that the generated code is different from the release version. Sometimes you 'voluntarily' turn off some optimizations while debugging, because e.g. single stepping can be somewhat messy if code has been moved around. (Then we are talking about more advanced, not peephole, optimization.)

        The 'Debug' and 'Release' configuration names are arbitrary names. You can create configurations of any other name, and set configuration options for your project for any of your configurations: Project | Properties | Build | Advanced sets the amount of debug information generated. In principle you could set Debug information: None for the Debug configuration, and Debug information: Full for the Release configuration, but most developers would find that rather non-intuitive :-)

        In any case, debug information is external to the executable code. E.g. to insert a breakpoint into release code, the debugger finds (from the debug info) the address where you want the BP, stuffs away the original instruction at that address and inserts a BP instruction, which it catches at run time. When the user commands 'go on', it re-inserts the original, stuffed-away instruction, pulls the program counter one instruction back and restarts the target process. If the BP is meant to be persistent (not one of the volatile kind like 'run to cursor'), when continuing, the debugger will first execute a single target instruction (the one re-inserted), put the BP instruction back for the next round, and then set the target process running.

        Well, this is one way of doing it. It requires write access to code memory. Many CPUs, typically embedded ones, must handle debugging of read-only code (for the sake of this discussion, consider code in flash to be read-only). They may have a few registers where the debugger can load code addresses that are continuously compared to the instruction pointer. If equal, a debug interrupt is generated. The CPU may have e.g. 4 such registers, so you can only have 4 BPs active at any one time, but they can be set in any release code.

        For both (and other) alternatives: If you do not have debug information for the code, then you will have to know the binary address yourself. Nothing keeps you from generating debug information for release code. If you have debug info availab


          honey the codewitch wrote (#7):

          I know how debug information and symbol mapping and such work. What I don't know is if Microsoft does some magic to the JITter in debug to make it produce different code. So I can't rely on that method.
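One way to sidestep that worry is to let the method be jitted before any debugger is involved and to attach afterwards. The sketch below is a hypothetical harness, not something from the thread: the `Scanner` class, the `IsLetter` method and the loop count are made up for illustration. It assumes that Visual Studio's "Suppress JIT optimization on module load" debugging option is what changes the generated code when a process is launched under the debugger, so a Release build that is started on its own and attached to later should show what the JIT normally produces. On runtimes with tiered compilation the first compiled version of a method may still be a quick, unoptimized one, so the listing deserves some caution.

```csharp
using System;
using System.Runtime.CompilerServices;

public sealed class Scanner
{
    private int current;

    public Scanner(int first) { current = first; }

    // NoInlining keeps the test as a separate method body that is
    // easy to locate in the disassembly window.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public bool IsLetter()
    {
        return (this.current >= 'A' && this.current <= 'Z') ||
               (this.current >= 'a' && this.current <= 'z');
    }
}

public static class Program
{
    public static void Main()
    {
        var scanner = new Scanner('q');

        // Call the method repeatedly so it has been compiled
        // before a debugger gets a chance to influence the JIT.
        int hits = 0;
        for (int i = 0; i < 100000; i++)
        {
            if (scanner.IsLetter()) hits++;
        }
        Console.WriteLine(hits);

        // Attach the debugger to the running process at this point,
        // set a breakpoint in IsLetter, then press Enter.
        Console.WriteLine("Attach a debugger, then press Enter.");
        Console.ReadLine();
        Console.WriteLine(scanner.IsLetter());
    }
}
```

If the disassembly seen this way loads the field once and compares a register four times, the repeated Ldarg_0s are not costing anything in that build on that CPU.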


            trønderen wrote (#8):

            I put that code snippet of yours into a tiny test program. Here is the disassembly window in VS2022 for x86 code:

            class c {
            char current;
            public void setA(char c) {
            this.current = c;
            00EF0DF0 push ebp
            00EF0DF1 mov ebp,esp
            00EF0DF3 mov word ptr [ecx+4],dx
            if ((this.current >= 'A' && this.current <= 'Z') ||
            (this.current >= 'a' && this.current <= 'z')) {
            00EF0DF7 movzx eax,word ptr [ecx+4]
            00EF0DFB cmp eax,41h
            00EF0DFE jl Test.c.setA(Char)+015h (0EF0E05h)
            00EF0E00 cmp eax,5Ah
            00EF0E03 jle Test.c.setA(Char)+01Fh (0EF0E0Fh)
            00EF0E05 cmp eax,61h
            00EF0E08 jl Test.c.setA(Char)+02Ah (0EF0E1Ah)
            00EF0E0A cmp eax,7Ah
            00EF0E0D jg Test.c.setA(Char)+02Ah (0EF0E1Ah)
            Console.WriteLine("Argument is an alphabetic character");
            00EF0E0F mov ecx,dword ptr ds:[3BB24A0h]
            00EF0E15 call System.Console.WriteLine(System.String) (65A637B8h)
            00EF0E1A pop ebp
            00EF0E1B ret

            And for x64 code:

            class c {
            char current;
            public void setA(char c) {
            this.current = c;
            00007FFE45A90EE0 sub rsp,28h
            00007FFE45A90EE4 mov word ptr [rcx+8],dx
            if ((this.current >= 'A' && this.current <= 'Z') ||
            (this.current >= 'a' && this.current <= 'z')) {
            00007FFE45A90EE8 movzx ecx,word ptr [rcx+8]
            00007FFE45A90EEC cmp ecx,41h
            00007FFE45A90EEF jl Test.c.setA(Char)+016h (07FFE45A90EF6h)
            00007FFE45A90EF1 cmp ecx,5Ah
            00007FFE45A90EF4 jle Test.c.setA(Char)+020h (07FFE45A90F00h)
            00007FFE45A90EF6 cmp ecx,61h
            00007FFE45A90EF9 jl Test.c.setA(Char)+032h (07FFE45A90F12h)
            00007FFE45A90EFB cmp ecx,7Ah
            00007FFE45A90EFE jg Test.c.setA(Char)+032h (07FFE45A90F12h)
            Console.WriteLine("Argument is an alphabetic character");
            00007FFE45A90F00 mov rcx,1FE90003938h
            00007FFE45A90F0A mov rcx,qword ptr [rcx]
            00007FFE45A90F0D call System.Console.WriteLine(System.String) (07FFEA3F80DB0h)
            00007FFE45A90F12 nop
            00007FFE45A90F13 add rsp,28h
            00007FFE45A90F17 ret

            On both architectures, the code is identical in default settings for Debug and Release configurations. I'd be surprised if it wasn't, and I'd be surprised if - as you seemed to fear - the base register was reloaded for each of the four tests. I do not have a


              honey the codewitch wrote (#9):

              Yeah, the output is what I was hoping for, and sort of expecting. That answers one question, so thank you.


                trønderen wrote (#10):

                If I select code generation for 'Any CPU', the main compiler will generate an IL assembly, which is processed by the JITter when the assembly is run for the first time. At the moment, I am running the 32-bit CLR, and that jitter generates exactly the same binary code as the 'x86' CPU option. I'd be very surprised if they were different, and just as surprised if there were two different x86 code generators. The linkers do completely different jobs, but not the code generators. I do not understand where MS could do some magic that is not visible in the generated code.


                  Amarnath S wrote (#11):

                  Could this be a possible workaround to avoid those extra Ldarg_0 loads?

                  var cur = this.current;
                  if( cur >= 'A' && cur <= 'Z' || cur >= 'a' && cur <= 'z') {
                  // do something
                  }


                    honey the codewitch wrote (#12):

                    Not in the instance I'm using it in without a rework. I'd have to change the structure of the code, which is made more complicated by the fact that it's a CodeDOM tree instead of real code. Before I do that, I want to make sure I'm not (A) doing something for nothing, and more importantly (B) introducing clutter or extra overhead in an attempt to optimize. I've included a chunk of the state machine runner code which should illustrate the issue I hope.

                    int p;
                    int l;
                    int c;
                    ch = -1;
                    this.capture.Clear();
                    if ((this.current == -2)) {
                        this.Advance();
                    }
                    p = this.position;
                    l = this.line;
                    c = this.column;
                    // q0:
                    // [\t-\n\r ]
                    if (((((this.current >= 9)
                                && (this.current <= 10))
                                || (this.current == 13))
                                || (this.current == 32))) {
                        this.Advance();
                        goto q1;
                    }
                    // [A-Z_hj-kmqxz]
                    if ((((((((((this.current >= 65)
                                && (this.current <= 90))
                                || (this.current == 95))
                                || (this.current == 104))
                                || ((this.current >= 106)
                                && (this.current <= 107)))
                                || (this.current == 109))
                                || (this.current == 113))
                                || (this.current == 120))
                                || (this.current == 122))) {
                        this.Advance();
                        goto q2;
                    }
                    // [a]
                    if ((this.current == 97)) {
                        this.Advance();
                        goto q3;
                    }
                    // [b]
                    if ((this.current == 98)) {
                        this.Advance();
                        goto q22;
                    }
                    // ...snip...
                    q1:
                    // [\t-\n\r ]
                    if (((((this.current >= 9)
                                && (this.current <= 10))
                                || (this.current == 13))
                                || (this.current == 32))) {
                        this.Advance();
                        goto q1;
                    }
                    return FAMatch.Create(2, this.capture.ToString(), p, l, c);
                    q2:
                    // [0-9A-Z_a-z]
                    if ((((((this.current >= 48)
                                && (this.current <= 57))
                                || ((this.current >= 65)
                                && (this.current <= 90)))
                                || (this.current == 95))
                                || ((this.current >= 97)
                                && (this.current <= 122)))) {
                        this.Advance();
                        goto q2;
                    }
                    return FAMatch.Create(0, this.capture.ToString(), p, l, c);
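To make the trade-off concrete, here is a minimal sketch of what the rework might look like for a state like q2. It is not the generated code: `IdentScanner`, its constructor and its simplified `Advance` are hypothetical stand-ins for the real runner. What it illustrates is that a local copy of `this.current` has to be refreshed after every `Advance()`, which is the extra clutter in question.

```csharp
using System;
using System.Text;

// Hypothetical, stripped-down stand-in for the generated runner.
public sealed class IdentScanner
{
    private readonly string input;
    private readonly StringBuilder capture = new StringBuilder();
    private int position;
    private int current;   // -1 signals end of input

    public IdentScanner(string text)
    {
        input = text;
        position = 0;
        current = input.Length > 0 ? input[0] : -1;
    }

    // Captures the current character and moves to the next one.
    private void Advance()
    {
        capture.Append((char)current);
        position++;
        current = position < input.Length ? input[position] : -1;
    }

    // Field form: every comparison goes through this.current.
    public string MatchFieldForm()
    {
    q2:
        // [0-9A-Z_a-z]
        if ((this.current >= 48 && this.current <= 57)
            || (this.current >= 65 && this.current <= 90)
            || this.current == 95
            || (this.current >= 97 && this.current <= 122))
        {
            this.Advance();
            goto q2;
        }
        return this.capture.ToString();
    }

    // Local form: the field is read once per state visit.
    public string MatchLocalForm()
    {
        int cur = this.current;
    q2:
        // [0-9A-Z_a-z]
        if ((cur >= 48 && cur <= 57)
            || (cur >= 65 && cur <= 90)
            || cur == 95
            || (cur >= 97 && cur <= 122))
        {
            this.Advance();
            cur = this.current;   // the copy is stale after Advance()
            goto q2;
        }
        return this.capture.ToString();
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(new IdentScanner("foo_42 bar").MatchFieldForm()); // prints "foo_42"
        Console.WriteLine(new IdentScanner("foo_42 bar").MatchLocalForm()); // prints "foo_42"
    }
}
```

Both forms accept the same input, so the only open question is whether the JIT already produces the same native code for the field form.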


                      1 Offline
                      1 Offline
                      11917640 Member
                      wrote on last edited by
                      #13

                      Don't expect to see any optimizations in MSIL code, even in Release configuration. They are done by the JIT compiler, and may be more effective, since the exact CPU type is known at runtime. You may try to look at the optimized native assembly code, but this is a difficult task, since there is a huge distance from the source C# code and MSIL to machine language instructions.

                      • 1 11917640 Member


                        H Offline
                        H Offline
                        honey the codewitch
                        wrote on last edited by
                        #14

                        I'm aware of that. I am generating MSIL instructions using Reflection Emit as part of my project. The other part generates source code. I would like to ensure that this source code generates IL that will then be optimized appropriately by the JITter. If not, I will generate the source code differently, but my interest is in post-jitted code. Not the IL.

                        • T trønderen

                          If I select code generating for 'Any CPU' the main compiler will generate an IL assembly, which is processed by the JITter when the assembly is run for the first time. At the moment, I am running 32 bit CLR, and that jitter generates exactly the same binary code as the 'x86' CPU option. I'd be very surprised if they were different. I'd be very surprised if there were two different x86 code generators. The linkers do completely different jobs, but not the code generators. I do not understand where MS could do some magic that is not visible in the generated code.

                          Religious freedom is the freedom to say that two plus two make five.

                          H Offline
                          H Offline
                          honey the codewitch
                          wrote on last edited by
                          #15

                          That seems to be assuming more than I am usually comfortable with when it comes to MS. I've worked at Microsoft and with Microsoft code enough to expect the unexpected deep in the bowels of their frameworks. You should have seen me wrestle with some of the less oft-used typelib generation functions in oleaut32.dll. I was working there at the time, and nobody could answer me about what the heck they were doing. If they made the JITter produce different code for debug builds than release, it would be totally on brand for them, is what I'm saying, no matter if it's not intuitive. You can't put anything past these people. You really can't. And I know those are names, but Debug generates debug symbols and such. Does the jitter, for example, do something different if a pdb is present? Or some other magic signaled by the linker dropping some flag in the binary's metadata? Probably not. "Probably" is doing a lot of heavy lifting there.

                          • H honey the codewitch


                            1 Offline
                            1 Offline
                            11917640 Member
                            wrote on last edited by
                            #16


                            Quote:

                            Running the code through a debugger and dropping to assembly. The only way I can do that reliably is with debug info, which may change how the JITter drops native instructions. I can't rely on it.

                            Probably, the answer is here: "Do PDB Files Affect Performance?" Generally, the answer is: No. Debugging information is just an additional file, which helps the debugger match the native instructions to the source code. Of course, if implemented correctly. The article is written by John Robbins.

                            • 1 11917640 Member


                              H Offline
                              H Offline
                              honey the codewitch
                              wrote on last edited by
                              #17

                              I think that's about unmanaged code, and not the JITter

                              • H honey the codewitch


                                1 Offline
                                1 Offline
                                11917640 Member
                                wrote on last edited by
                                #18

                                Well, buzzwords like .NET, VB .NET, C#, JIT compiler, ILDASM are used in this article only by accident. You are right.

                                • 1 11917640 Member


                                  H Offline
                                  H Offline
                                  honey the codewitch
                                  wrote on last edited by
                                  #19

                                  I am tired and I read the first bit of it. Sorry. It's 3am here and I shouldn't be awake.

                                  • H honey the codewitch


                                    J Offline
                                    J Offline
                                    Jacquers
                                    wrote on last edited by
                                    #20

                                    Wouldn't the Roslyn compiler stuff be a good place to look? It's open source afaik.

                                    • H honey the codewitch


                                      S Offline
                                      S Offline
                                      Simbosan
                                      wrote on last edited by
                                      #21

                                      Even if it did, I wouldn't assume that it always would and would do so on all systems. I would code explicitly and not use behaviour that isn't part of the doco.

                                      • S Simbosan


                                        H Offline
                                        H Offline
                                        honey the codewitch
                                        wrote on last edited by
                                        #22

                                        Well, I didn't ask you what you would do. And this isn't bizdev

                                        • J Jacquers


                                          H Offline
                                          H Offline
                                          honey the codewitch
                                          wrote on last edited by
                                          #23

                                          Probably not, since at best it uses Emit facilities and has nothing to do with the final JITter output

