I thought I knew C++ *sob* It has been inserting extra code on me this whole time.
-
I have -o on. I can't seem to find how to change that at godbolt.org. I just remembered there's a GCC pragma where I can change it but I can't remember what it is, and so I'm googling now to figure out what it is. Edit: Now I feel like an idiot. I thought -o did at least minimal optimizations but maybe the switch means something different unsuffixed.
#pragma GCC optimize("Os")
That reflects the default of my IoT build environment. It fixes it, so maybe I'm worrying over nothing. I wish I could actually check my production code, but it relies on the Arduino framework, and I can't run that at godbolt. I've tried disassembler extensions in VSCode but none work with platformIO because it makes its own CMake/ninja scripts for everything on the fly.
To err is human. Fortune favors the monsters.
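For anyone trying the same thing on godbolt, here is a minimal sketch of scoping that pragma to part of a file instead of the whole translation unit. It assumes GCC (other compilers may ignore these pragmas with a warning), and `tag_for` and `widened_tag` are hypothetical names made up for the illustration. As far as I know, GCC's documentation treats per-function optimization overrides as a debugging aid rather than a substitute for setting the flag in the build.

```cpp
// GCC-specific: save the current optimization options, then switch to -Os
// for the functions defined until the matching pop_options.
#pragma GCC push_options
#pragma GCC optimize("Os")

// Hypothetical stand-in for the templated test() from the original post.
constexpr char tag_for(int pin) {
    return pin == -1 ? 'A' : 'B';
}

// Hypothetical caller: the char result is converted to int on return,
// which is the same widening printf's variadic call requires.
int widened_tag() {
    return tag_for(-1);
}

// Restore whatever options were in effect before push_options.
#pragma GCC pop_options
```

Calling `widened_tag()` returns the value of `'A'`; whether the widening shows up as separate instructions depends on the optimization level in effect for that function.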
honey the codewitch wrote:
I wish I could actually check my production code, but it relies on the Arduino framework, and I can't run that at godbolt. I've tried disassembler extensions in VSCode but none work with platformIO because it makes its own CMake/ninja scripts for everything on the fly.
If you are comfortable looking at assembler then you could analyze your Arduino code with [Ghidra](https://github.com/NationalSecurityAgency/ghidra).
-
Ooooh, you just made my morning. I was just looking for something like that and gave up at the time. Thanks. Edit: NVM it wasn't what I was thinking. I might be able to use it on my firmware.bin but I'm not sure how I would match the symbols back up to the source without it being aware of my build environment so it could load the symbols for each library's C or C++ source translation unit.
-
honey the codewitch wrote:
Yeah, that's not really the issue I'm having though.
:laugh: That's why the code there is being generated. It's promoting the char to 32 bits. The language spec calls it "default argument promotion". I have nothing more to add. Good luck.
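A small sketch of the promotion being described, for reference. `first_vararg` is a hypothetical helper, not anything from the thread; it reads one variadic argument the same way `printf` has to for `%c`, namely as an `int`.

```cpp
#include <cstdarg>
#include <type_traits>

// Unary plus applies the integral promotions, so the result type shows
// what a char becomes when it is passed through "...".
static_assert(std::is_same<decltype(+char{}), int>::value,
              "char is promoted to int");

// Hypothetical helper: returns the first variadic argument.
// A char passed through "..." must be read back as int, never as char.
int first_vararg(int count, ...) {
    va_list ap;
    va_start(ap, count);
    int value = va_arg(ap, int);  // the promoted value
    va_end(ap);
    return value;
}
```

So `first_vararg(1, 'A')` hands back the value of `'A'` as an `int`, and a negative `signed char` comes back sign-extended. The promotion itself is required by the language; where the compiler performs it is a separate question.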
-
So the char must be sign-extended. But that does not, and cannot (due to the as-if rule), mean that the compiler must make that happen at run time; it can trivially be done at compile time, after all.
Hmmm, I'm not really sure what you're saying here. You are obviously referring to the code optimization pass. But this sentence doesn't make sense.
harold aptroot wrote:
But that does not, and cannot, mean that the compiler must make that happen at run time
Nearly every compiler will perform the sign-extending at run time with optimization disabled. I just tested 4 MSVC versions a few hours ago with the code at the top of this thread. Sure, it can be trivially optimized away.
-
harold aptroot wrote:
I decided against any further elaboration
Because there isn't anything to elaborate. :laugh: :laugh: It's OK, we all make mistakes. I was waiting to see what you had to say though.
-
Tried on clang x86, gcc x86, gcc xtensa, gcc AVR.
I get
main:
.LFB31:
.cfi_startproc
endbr64
subq $8, %rsp
.cfi_def_cfa_offset 16
movl $65, %edx
leaq .LC0(%rip), %rsi
movl $1, %edi
movl $0, %eax
call __printf_chk@PLT
movl $65, %edx
leaq .LC0(%rip), %rsi
movl $1, %edi
movl $0, %eax
call __printf_chk@PLT
movl $0, %eax
addq $8, %rsp
.cfi_def_cfa_offset 8
ret
with g++ -std=c++17 -O1 (g++ 9.4 on local linux box).
"In testa che avete, Signor di Ceprano?" -- Rigoletto
-
#include <stdio.h>
template<int Pin> class foo {
    constexpr const static int pin = Pin;
public:
    constexpr inline static char test() __attribute((always_inline)) {
        if(Pin==-1) {
            return 'A';
        } else {
            return 'B';
        }
    }
    static_assert(test()!=0,"test");
};
int main(int argc, char** argv) {
    // mov eax,65
    // movsx eax, al
    // mov esi, eax
    printf("%c\n",foo<-1>::test());
    // mov esi, 65
    printf("%c\n",65);
    return 0;
}
I'd like someone smarter than I am to explain to me why the first printf does not generate a `mov esi, 65`, or even a `movsx esi, 65`, but rather 3 instructions that are seemingly redundant and yet don't get removed by the peephole optimizer. I don't think that's going to happen, though.
The worst part is, I have a dozen libraries using a bus framework I wrote that relies on my bad assumptions about the code that is generated. The upshot is the code is slow, and the only way to speed it up is to A) rewrite it to not use templates, B) nix the ability to run multiple displays at once. But IT SHOULD NOT BE THIS WAY. I feel misled by the C++ documentation. But it was my fault for not checking my assumptions. :~ :(
Disclaimer: I have never used templates and rarely use C++, so this is just an observation... It looks like there is something swish about the use of pin and Pin - it might be worth accessing Pin as pin in the test() method? or something like that. Good luck!
-
Try `consteval`, rather than `constexpr`, for `test()`.
PS: Stack Overflow is a good place for questions like this.
Paul Sanders. If I had more time, I would have written a shorter letter - Blaise Pascal. Some of my best work is in the undo buffer.
-
I can't use `consteval` because I can't target C++20.
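One pre-C++20 way to get a similar guarantee, sketched under assumptions: route the call through a template argument, which has to be a constant expression. `test_as_int` is a hypothetical helper name, and `foo` here is a trimmed-down version of the class from the original post (no attribute, no `static_assert`). This only forces `test()` to be evaluated at compile time; it does not promise any particular instruction sequence, least of all with optimization off.

```cpp
#include <type_traits>

// Trimmed-down version of the class from the original post.
template<int Pin>
class foo {
public:
    constexpr static char test() {
        return Pin == -1 ? 'A' : 'B';
    }
};

// Hypothetical helper: a template argument must be a constant expression,
// so foo<Pin>::test() cannot be deferred to run time here. Using int as
// the integral_constant type also performs the char-to-int conversion
// during constant evaluation.
template<int Pin>
constexpr int test_as_int() {
    return std::integral_constant<int, foo<Pin>::test()>::value;
}

static_assert(test_as_int<-1>() == 'A', "evaluated at compile time");
```

The call site would then read `printf("%c\n", test_as_int<-1>());`, which keeps the templates and works back to C++11.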