I thought I knew C++ *sob* It has been inserting extra code on me this whole time.

38 Posts 8 Posters 0 Views 1 Watching
  • honey the codewitch wrote:

    Tried on clang x86, gcc x86, gcc xtensa, gcc AVR.

    To err is human. Fortune favors the monsters.

    k5054 wrote (#17):

    What release of gcc/clang are you using? According to [Compiler Explorer](https://godbolt.org/), I get the following with clang 5.0 with -O1 -std=c++17:

        main: # @main
            push rax
            mov edi, .L.str
            mov esi, 65
            xor eax, eax
            call printf
            mov edi, .L.str
            mov esi, 65
            xor eax, eax
            call printf
            xor eax, eax
            pop rcx
            ret
        .L.str:
            .asciz "%c\n"

    And x86-64 gcc 5.1 with the same flags gives:

        .LC0:
            .string "%c\n"
        main:
            sub rsp, 8
            mov esi, 65
            mov edi, OFFSET FLAT:.LC0
            mov eax, 0
            call printf
            mov esi, 65
            mov edi, OFFSET FLAT:.LC0
            mov eax, 0
            call printf
            mov eax, 0
            add rsp, 8
            ret

    Those are both pretty old compilers - the first of their lines to support C++17 AFAICT. Both produce the same code for each call. So maybe something in the compiler flags you're passing?

    Keep Calm and Carry On

      honey the codewitch wrote (#18):

      It probably has to do with the fact that I can't convince godbolt.org to allow me to remove their default compiler options and replace them with my own. I'm stuck with -o -whole-program or whatever. I used to be able to change it there somehow. Maybe someone exploited it and they turned off the feature.

      To err is human. Fortune favors the monsters.

      • Lost User wrote:

        I'm not on a PC tonight, it takes longer to type on my TV onscreen keyboard. It's not easy! :sigh: Anyways, I found some better material for you to read. [Variadic arguments - cppreference.com](https://en.cppreference.com/w/cpp/language/variadic_arguments#Default_conversions)

        honey the codewitch wrote (#19):

        Well it's tomorrow. In case you're curious, I took the variadic arguments out of the code. I replaced printf with putchar. Same result.

            push    rbp
            mov     rbp, rsp
            sub     rsp, 16
            mov     DWORD PTR [rbp-4], edi
            mov     QWORD PTR [rbp-16], rsi
            mov     eax, 65 ; ***
            movsx   eax, al ; ***
            mov     edi, eax ;***
            call    putchar
            mov     edi, 65  ;***
            call    putchar
            mov     eax, 0
            leave
            ret
        

        To err is human. Fortune favors the monsters.
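
One possible reading of why swapping printf for putchar changes nothing: putchar is declared as int putchar(int), so a char argument is still widened to int at the call, this time by the ordinary conversion to the parameter type instead of default argument promotion. A minimal sketch of that, where emit is a hypothetical stand-in with the same signature (not a real library function) and an ASCII character set is assumed:

```cpp
#include <type_traits>

// Hypothetical stand-in with the same shape as putchar: int(int).
constexpr int emit(int ch) { return ch; }

// The parameter type is int, so a char argument is converted at the call.
static_assert(std::is_same<decltype(emit), int(int)>::value,
              "emit takes and returns int");

// Assumes an ASCII execution character set, where 'A' is 65.
static_assert(emit('A') == 65, "the char 'A' arrives as the int 65");
```

Whether that conversion shows up as separate instructions is then up to the optimization level, which is where the rest of the thread ends up.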

          k5054 wrote (#20):

          I get the same results using g++ 5.5.0 on my local Linux box. That would be a CentOS 7 system on which I compiled g++-5.5.0 from source. So, still wondering if it's maybe the flags you're using.

          Keep Calm and Carry On

            honey the codewitch wrote (#21):

            Looking at your output more carefully, your initial output is similar to mine. Your final output is less optimized, probably having to do with your compiler flags.

            To err is human. Fortune favors the monsters.

              Lost User wrote (#22):

              Ok, I tested this on my dev box, everything we talked about above in the C standard applies. And I get the same exact assembler output you get. Only with optimizations disabled. So I guess you have optimization disabled?

                honey the codewitch wrote (#23):

                I have -o on godbolt.org and as I said somewhere else on this thread (I don't remember where or to whom) it seems to not be letting me change that. It used to, so I either can't find it again, or they've removed the feature.

                To err is human. Fortune favors the monsters.

                  honey the codewitch wrote (#24):

                  I have -o on. I can't seem to find how to change that at godbolt.org. I just remembered there's a GCC pragma where I can change it, but I can't remember what it is, so I'm googling now to figure it out.

                  Edit: Now I feel like an idiot. I thought -o did at least minimal optimizations, but maybe the switch means something different unsuffixed.

                      #pragma GCC optimize("Os")

                  That reflects the default of my IoT build environment. It fixes it, so maybe I'm worrying over nothing.

                  I wish I could actually check my production code, but it relies on the Arduino framework, and I can't run that at godbolt. I've tried disassembler extensions in VSCode, but none work with platformIO because it makes its own CMake/ninja scripts for everything on the fly.
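
For what it's worth, in GCC the lowercase -o names the output file, while the capital -O family (-O1, -O2, -Os, ...) selects the optimization level. The pragma mentioned above can be sketched as follows, assuming GCC. The macro OPTIMIZE_SIZE, the simplified foo and the function first_char are made up for illustration and are not from the original code:

```cpp
// Assumption: GCC. The pragma and attribute below are GCC extensions,
// so they are guarded and compiled out elsewhere.
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC optimize("Os")  // affects functions defined after this line
#define OPTIMIZE_SIZE __attribute__((optimize("Os")))  // per-function form
#else
#define OPTIMIZE_SIZE
#endif

// Simplified stand-in for the template from the first post.
template<int Pin> struct foo {
    static constexpr char test() { return Pin == -1 ? 'A' : 'B'; }
};

// Hypothetical function compiled with size optimization requested.
OPTIMIZE_SIZE int first_char() { return foo<-1>::test(); }
```

GCC's documentation describes the optimize attribute as meant for debugging rather than production code, so for a real build it is probably safer to put -Os in the build flags and keep the pragma for quick experiments like this one.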

                  To err is human. Fortune favors the monsters.

                    Lost User wrote (#25):

                    honey the codewitch wrote:

                    #pragma GCC optimize("Os") It fixes it

                    Awesome, I'm glad it's sorted out! Congratulations. Don't rely on the optimization pass. The C++ standards are correct. It's just that the optimization pass can rearrange code, remove functions and/or use intrinsics instead. The unoptimized code would be more standards compliant. :-D

                      honey the codewitch wrote (#26):

                      In general you're right, but in this case, there are special considerations. For starters, the toolchain is fixed to GCC, and other compilers simply don't have the backends to target what I target. So I have the luxury of using GCC-specific things and expecting GCC-specific behavior, but I'm also saddled with GNU C++ vs standard C++ because the frameworks my code runs under require it, despite my code being (more) standard than GNU. That being said, I am counting on those optimizations because this is IoT, and these are critical code paths. That's why I'm looking at the asm output in the first place. :)

                      To err is human. Fortune favors the monsters.

                        Lost User wrote (#27):

                        honey the codewitch wrote:

                        I wish I could actually check my production code, but it relies on the Arduino framework, and I can't run that at godbolt. I've tried disassembler extensions in VSCode but none work with platformIO because it makes its own CMake/ninja scripts for everything on the fly.

                        If you are comfortable looking at assembler then you could analyze your Arduino code with [Ghidra](https://github.com/NationalSecurityAgency/ghidra).

                          honey the codewitch wrote (#28):

                          Ooooh, you just made my morning. I was just looking for something like that and gave up at the time. Thanks. Edit: NVM it wasn't what I was thinking. I might be able to use it on my firmware.bin but I'm not sure how I would match the symbols back up to the source without it being aware of my build environment so it could load the symbols for each library's C or C++ source translation unit.

                          To err is human. Fortune favors the monsters.

                          • Lost User wrote:

                            honey the codewitch wrote:

                            Yeah, that's not really the issue I'm having though.

                            :laugh: That's why the code there is being generated. It's promoting the char to 32 bits. The language spec calls it "default argument promotion". I have nothing more to add. Good luck.
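
The promotion described above can be checked at compile time. This is only a sketch: it relies on unary plus applying the integral promotions, which is the same conversion default argument promotion applies to a char passed through "...", and it assumes a target where int is wider than char (true for x86, Xtensa and AVR) and an ASCII character set. The variable name c is arbitrary:

```cpp
#include <type_traits>

constexpr char c = 'A';

// Unary plus performs the integral promotions, so +c has the type that a
// char argument takes on when passed to a variadic function like printf.
static_assert(std::is_same<decltype(+c), int>::value,
              "char promotes to int on this target");

// The promoted value is unchanged: 65 under ASCII.
static_assert(+c == 65, "assumes an ASCII execution character set");
```

Note that the type changes but the value does not, which is why the promotion by itself says nothing about how many instructions the compiler has to emit for it.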

                            Lost User wrote (#29):

                            So the char must be sign-extended. But that does not, and cannot (due to the as-if rule), mean that the compiler must make that happen at run time; it can trivially be done at compile time, after all.
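
To make the compile-time point concrete, here is a small sketch using a simplified version of the foo template from the first post (the always_inline attribute and the pin member are left out, and the name promoted is made up). It assumes an ASCII character set:

```cpp
// Simplified stand-in for the template from the first post.
template<int Pin> struct foo {
    static constexpr char test() { return Pin == -1 ? 'A' : 'B'; }
};

// The char result is widened to int inside a constant expression, so the
// sign extension has a known answer before any code runs.
constexpr int promoted = foo<-1>::test();

static_assert(promoted == 65, "assumes an ASCII execution character set");
```

Since the widened value is a constant expression, an optimizing compiler is free to load 65 directly. An unoptimized build may still emit the movsx, which matches what was observed earlier in the thread.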

                              Lost User wrote (#30):

                              Hmmm, I'm not really sure what you're saying here. You are obviously referring to the code optimization pass. But this sentence doesn't make sense.

                              harold aptroot wrote:

                              But that does not, and cannot, mean that the compiler must make that happen at run time

                              Nearly every compiler will perform the sign-extending at run time with optimization disabled. I just tested 4 MSVC versions a few hours ago with the code at the top of this thread. Sure, it can be trivially optimized away.

                                Lost User wrote (#31):

                                I decided against any further elaboration

                                  Lost User wrote (#32):

                                  harold aptroot wrote:

                                  I decided against any further elaboration

                                  Because there isn't anything to elaborate. :laugh: :laugh: It's OK, we all make mistakes. I was waiting to see what you had to say though.

                                    Lost User wrote (#33):

                                    My mistake was talking to you at all. Don't worry, that won't happen again.

                                      Lost User wrote (#34):

                                      I have no idea what's happening here, I apologize if I've offended you. It wasn't intentional. Are you OK?

                                        CPallini wrote (#35):

                                        I get

                                            main:
                                            .LFB31:
                                                .cfi_startproc
                                                endbr64
                                                subq $8, %rsp
                                                .cfi_def_cfa_offset 16
                                                movl $65, %edx
                                                leaq .LC0(%rip), %rsi
                                                movl $1, %edi
                                                movl $0, %eax
                                                call __printf_chk@PLT
                                                movl $65, %edx
                                                leaq .LC0(%rip), %rsi
                                                movl $1, %edi
                                                movl $0, %eax
                                                call __printf_chk@PLT
                                                movl $0, %eax
                                                addq $8, %rsp
                                                .cfi_def_cfa_offset 8
                                                ret

                                        with g++ -std=c++17 -O1 (g++ 9.4 on local linux box).

                                        "In testa che avete, Signor di Ceprano?" -- Rigoletto

                                        • honey the codewitch wrote:

                                              #include <stdio.h>
                                              template<int Pin> class foo {
                                                  constexpr const static int pin = Pin;
                                              public:
                                                  constexpr inline static char test() __attribute((always_inline)) {
                                                      if(Pin==-1) {
                                                          return 'A';
                                                      } else {
                                                          return 'B';
                                                      }
                                                  }
                                                  static_assert(test()!=0,"test");
                                              };
                                              int main(int argc, char** argv) {
                                                  // mov eax,65
                                                  // movsx eax, al
                                                  // mov esi, eax
                                                  printf("%c\n",foo<-1>::test());
                                                  // mov esi, 65
                                                  printf("%c\n",65);
                                                  return 0;
                                              }

                                          I'd like someone smarter than I am to explain to me why the first printf does not generate a mov esi, 65 or even movsx esi, 65, but rather 3 instructions that are seemingly redundant and yet don't get removed by the peephole optimizer, but I don't think that's going to happen.

                                          The worst part is, I have a dozen libraries using a bus framework I wrote that relies on my bad assumptions about the code that is generated. The upshot is the code is slow, and the only way to speed it up is to

                                          A) rewrite it to not use templates
                                          B) nix the ability to run multiple displays at once

                                          But IT SHOULD NOT BE THIS WAY. I feel misled by the C++ documentation. But it was my fault for not checking my assumptions. :~ :(

                                          To err is human. Fortune favors the monsters.

                                          CodeWomble wrote (#36):

                                          Disclaimer: I have never used templates and rarely use C++, so this is just an observation... It looks like there is something swish about the use of pin and Pin - it might be worth accessing Pin as pin in the test() method? or something like that. Good luck!
