Inline assembly - what good is it?
-
I'm just wondering: how much of a performance increase could I gain from using inline assembly? I know it's all relative to what I'm doing, and so on and so forth. But if every nanosecond counts in a processing loop that deals with a lot of numbers, would inline assembly make it any faster?
int i = 2, x = 2; int j = x + y; sprintf("%d",j); 4.7388937 ??? My articles www.stillwaterexpress.com BlackDice
-
I think this depends on how smart your compiler is. If you write it in inline assembly (or in assembly at all), you know exactly what you are doing, so you can apply the appropriate optimization (e.g. using registers and memory as needed); the compiler can only infer that from your code in the higher-level language. Also, compiler writers can optimize only for common scenarios. So in theory, with a supersmart compiler, your assembly and the compiler's output should be equal. In reality, it always depends on how smart your compiler is in that specific situation, and on how your high-level code is structured and interpreted by the compiler.
-
Cool, thanks! BlackDice
-
Optimized assembly code can make it faster, but nowadays modern compilers optimize so well that their output is difficult to improve upon. Writing your C/C++ code with optimization in mind is the simplest approach, because it makes it easier for the compiler to optimize the code for you.
-------------------------------------------------
I'll answer your 2nd question here as well.
-------------------------------------------------
You can look at the disassembled code at run-time: View->Debug Windows->Disassembly
If your program is compiled for debugging, that brings up a mixed assembly and C/C++ source code view. Place a breakpoint at the point in your code where you want to look at the assembly. This is mainly used for debugging, in cases where you think the compiler may have messed up (I have not seen this happen in years). You can also look at the disassembled code of a program compiled for release by running it from VC6 as if it were a debug version, but I am not sure whether you can see the whole program.
-------------------------------------------------
With modern compilers (and their multiple optimization options) it is not as important as it used to be, but having some idea of what assembly code your compiler produces for a given piece of C/C++ code helps you make better decisions about how to write that code (with optimization in mind).
FYI: Some airliners have 3 redundant computer systems. Each one runs the same program, and each copy of the program was compiled with a different compiler. Why? Because each compiler produces code in its own way. If one compiler introduces a flaw into the final program, that same flaw will not be present on either of the other systems (in theory).
-------------------------------------------------
Learning game programming
-------------------------------------------------
1) Tons of sites and newsgroups.
2) Graphics Programming by Michael Abrash.
3) http://sourceforge.net/
;) :-D Good luck and have fun.
INTP "The more help VB provides VB programmers, the more miserable your life as a C++ programmer becomes." Andrew W. Troelsen
-
Thanks a bunch. Looks like you put some time and effort into your answer, and it's appreciated. :) BlackDice
-
Maybe. What processor? In many cases something that makes one processor faster will make a different one slower. In many cases your compiler can optimise things better than you can, because it optimises a larger part of the code (unless you have years to write the whole thing in assembly using the highest-priced experts, and even then a new processor is likely to come out that needs different optimizations). Never even consider inline assembly until you are convinced (with several eyes looking) that you have the best algorithm for the job, and that you know something the compiler doesn't. When I was working with RC5 we could get major improvements from 2 lines of assembly, but that was a special case where the compiler did not know about an instruction (rotate left, ROTL) that is almost never useful but is critical to that algorithm. In the real world it is rare for such a situation to come up. In most cases the most optimized code can only save you nanoseconds over the less optimized code, and your loop won't be long enough for those nanoseconds to add up. However, there are exceptions. Don't forget that when you go to a different processor (AMD Athlon vs. P4, not just x86 vs. PowerPC) your best optimization changes.
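To make the rotate case concrete, here is a minimal sketch in C. It is not the actual RC5 code from that project; the names rotl32 and rotl32_asm are made up for illustration, and the inline-assembly variant assumes GCC or Clang extended asm on x86 (other compilers, such as VC6 with its __asm blocks, use a different syntax). Many current compilers recognize the shift-and-or pattern and emit a single rotate instruction for it, but check the generated code for yours before assuming so.

```c
#include <stdint.h>

/* Portable rotate-left by n bits. Masking keeps every shift count in 0..31,
   so there is no undefined behavior when n is 0 or a multiple of 32. */
uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
/* Assumption: GCC/Clang extended asm on x86. ROL takes its count in CL,
   and the CPU masks that count to 5 bits for a 32-bit operand. */
uint32_t rotl32_asm(uint32_t x, unsigned n)
{
    uint8_t count = (uint8_t)n;
    __asm__("roll %%cl, %0" : "+r"(x) : "c"(count) : "cc");
    return x;
}
#else
/* Fallback so the sketch still builds on other compilers and processors. */
uint32_t rotl32_asm(uint32_t x, unsigned n)
{
    return rotl32(x, n);
}
#endif
```

Both functions should return the same value for any x and n, which is an easy thing to check before trusting the assembly version. Note how the assembly version is tied to one compiler family and one processor family, which is exactly the portability cost described above.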
-
Henry miller wrote: "and your loop won't be long enough for those nanoseconds to add up"
Well, I am thinking about doing some game programming eventually, and that's one of the reasons I asked this question in the first place. Thanks for your response. BlackDice
-
This is something of a religious subject, in that people hold certain beliefs and will not change them whatever the argument. And those who do convert tend to froth at the mouth! :) However, IMHO, unless you need low-level access to hardware, or you have something in an extremely tight loop, assembly code should never be seen. Remember, the vast majority of a program's life cycle is the maintenance phase, and assembly is harder to read and understand than a higher-level language. Porting is also an issue with assembly, even on the same processor: different compilers handle assembly escapes differently. And modern compilers are pretty good. If you think about how the code is liable to be translated (or peek at the generated code), you can write C/C++ that compiles to very good machine code. For example, use the processor's natural integer size rather than something that needs conversion, e.g. use an int rather than a short for a number, so that a low-level conversion doesn't have to happen. Naturally, such code should be heavily commented. As an aside, certain kinds of optimizations that programmers often use because they think they are smarter than the compiler are not needed. For instance, writing x <<= 1; instead of x *= 2; is almost never needed. The compiler can figure this out just fine. And the first is harder to read...
-
You should be aware that this is a religious subject; people of one faith will ignore all arguments from heretics. And converts tend to be evangelistic! :) IMHO, unless you need to do low-level hardware access or are in an extremely tight loop, assembly code should never be seen. And if it is, it should be heavily commented. Remember, the vast majority of the life cycle of a piece of software is spent in the maintenance phase, not development (assuming it is a successful program!). And assembly is harder to read and hard to port. Even porting from one compiler to another for the same target processor can be awkward, as different compilers have different assembly escapes, stack conventions, internal symbol representations, etc. Further, modern compilers are smart. Even things like x <<= 1; instead of the more readable x *= 2; don't buy you anything, as the compiler understands what *= 2 means and will find the best way to accomplish it. If you really are concerned about optimization in a tight loop, then optimize your algorithm (always your best bang for the buck) and choose variable types so that a minimum of assembly-level conversions are needed. For instance, use the machine's natural-size integer (usually int) instead of a short. If the compiler cannot determine the absolute maximum value and the processor doesn't handle shorts as easily as ints, then a low-level conversion might be needed. Try peeking at the generated assembly. And heavily comment such tweaks!
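A small C sketch of the two tweaks mentioned above, meant for compiling with optimization turned on and comparing the generated assembly, not as a benchmark. All four function names are hypothetical. The doubling pair uses unsigned values because left-shifting a negative signed int is undefined behavior in C. Whether the short counter actually costs extra conversion instructions depends on your compiler and processor, so treat that as something to look for in the listing, not a given.

```c
#include <stddef.h>

/* Readable form: the compiler is free to pick a shift, an add,
   or whatever else is cheapest on the target. */
unsigned times_two_mul(unsigned x)
{
    return x * 2u;
}

/* Hand-tweaked form: same result, harder to read. */
unsigned times_two_shift(unsigned x)
{
    return x << 1;
}

/* Loop counter in the machine's natural integer size. */
long sum_int_counter(const int *a, int n)
{
    long total = 0;
    for (int i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Same loop with a short counter: i is promoted to int for each
   comparison and index, then narrowed back to short after i++. */
long sum_short_counter(const int *a, short n)
{
    long total = 0;
    for (short i = 0; i < n; i++)
        total += a[i];
    return total;
}
```

Each pair returns identical results, so any difference you see is purely in the instructions the compiler chose. Most compilers can write out an assembly listing for this kind of comparison; see your compiler's documentation for the option.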