Is there a better way to achieve that?

V 0

Do you have to go through the registers? I forget (it was a long time ago ;-))

"If I don't see you in this world, I'll see you in the next one... and don't be late." ~ Jimi Hendrix

Zdeslav Vojkovic

If you are only concerned about speed, there is no need for assembly in this case:

inline void UltraSwap( LONG* a, LONG* b )
{
    LONG x = *a;
    LONG y = *b;
    __asm mov eax, x
    __asm mov ebx, y
    __asm mov x, ebx
    __asm mov y, eax
    *a = x;
    *b = y;
}

inline void UltraSwap2( LONG* a, LONG* b )
{
    LONG t = *b;
    *b = *a;
    *a = t;
}

inline void UltraSwap3( LONG* a, LONG* b )
{
    *b ^= *a ^= *b ^= *a;
}

Of these three methods, UltraSwap2 is the simplest and the fastest (almost twice as fast as UltraSwap), and UltraSwap3 is approximately as fast as UltraSwap.
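For readers without an MSVC inline assembler at hand, here is a minimal portable sketch of the non-assembly variants. The function names are hypothetical and not from the thread. The XOR version is deliberately split into three statements: as far as I know, the chained one-line form modifies the same objects several times in a single expression, which C++ standards before C++17 did not give a defined order. It also guards against both pointers referring to the same variable, where XOR swapping would zero the value.

```cpp
#include <cassert>
#include <utility>

// Plain temporary-variable swap, the same idea as UltraSwap2 above.
inline void swap_with_temp(long* a, long* b) {
    assert(a && b);
    long t = *b;
    *b = *a;
    *a = t;
}

// XOR swap written as three separate statements so each step is sequenced.
// The early return matters: if a and b alias, XOR would leave zero behind.
inline void swap_with_xor(long* a, long* b) {
    assert(a && b);
    if (a == b) return;
    *a ^= *b;  // *a = A ^ B
    *b ^= *a;  // *b = B ^ (A ^ B) = A
    *a ^= *b;  // *a = (A ^ B) ^ A = B
}

// Standard-library swap, shown for comparison.
inline void swap_with_std(long* a, long* b) {
    assert(a && b);
    std::swap(*a, *b);
}
```

Calling any of these as `swap_with_temp(&x, &y)` exchanges the two values; which one is fastest depends on the compiler and its optimization settings, so it is worth measuring rather than assuming.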

Stephan Poirier

Thanks everyone! I think I will use the UltraSwap2 solution from Zdeslav, since it is faster than all the other ones. It was my mistake to think that it would be faster to do it in assembly code. But as far as my wish to learn some assembly goes, suhredayan's solution is what I was looking for. I used to program industrial programmable controllers a long time ago, and that was in assembly code, but both the syntax and the function names were different. Great help, guys!

Stef

Programming looks like taking drugs... I think I did an overdose. ;-P

224917

Zdeslav Vojkovic wrote: of these three methods UltraSwap2 is simplest and fastest

00413701 mov eax,dword ptr [x]
__asm mov ebx, dword ptr [y]
00413704 mov ebx,dword ptr [y]
__asm mov dword ptr [x], ebx
00413707 mov dword ptr [x],ebx
__asm mov dword ptr [y], eax
0041370A mov dword ptr [y],eax

suhredayan
There is no spoon.

Stephan Poirier

That's what I thought at first sight. I didn't test Zdeslav's code; I just relied on the comment he gave us, since it seemed to make sense. I didn't think to look at the generated assembly :^), my fault :) !

So I did a little test with performance counters, just to see what the real result is... :-D I tested the code on my old P3 450. I ran each solution 10 times and took the average. For Zdeslav's solution:

void UltraSwap( LONG* a, LONG* b )
{
    LONG t = *b;
    *b = *a;
    *a = t;
}

                     IN DEBUG          IN RELEASE
One single call  :   0.004190476 ms    0.004190476 ms
1000 calls loop  :   0.186895210 ms    0.035199995 ms
10000 calls loop :   1.822018770 ms    0.307580906 ms

and for the suhredayan solution:

_inline PVOID UltraSwap2( LONG* a, LONG* b )
{
    __asm mov eax, dword ptr [a]
    __asm mov ebx, dword ptr [b]
    __asm mov dword ptr [a], ebx
    __asm mov dword ptr [b], eax
}

                     IN DEBUG          IN RELEASE
One single call  :   0.004190476 ms    0.003352380 ms
1000 calls loop  :   0.170133307 ms    0.016761902 ms
10000 calls loop :   1.599923566 ms    0.158399976 ms

So now everyone can see the results. I don't think I have to explain further... :-D In debug mode there is not much difference, but after a 10000-call loop in release mode, I'm now sure that UltraSwap2 is the clear winner! It took just half the time of the other one. If you want to see my test code, let me know and I will try to post it.

Thanks, suhredayan, for your advice; you put me on the right track!!

Have a nice day,
Stef
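The test code itself was not posted, so the following is only a rough sketch of the kind of measurement described in this post: time a loop of N calls, repeat that several times, and average. It assumes `std::chrono::steady_clock` in place of the Windows performance counters mentioned above, and the names `swap_volatile` and `average_loop_ms` are hypothetical. The `volatile` qualifiers are there so that an optimizing compiler cannot drop the swaps from the loop.

```cpp
#include <cassert>
#include <chrono>

// Temporary-variable swap on volatile data (hypothetical helper name).
inline void swap_volatile(volatile long* a, volatile long* b) {
    long t = *b;
    *b = *a;
    *a = t;
}

// Times a loop of `calls` swaps, repeats it `runs` times,
// and returns the mean duration of one loop in milliseconds.
double average_loop_ms(long calls, int runs, volatile long* a, volatile long* b) {
    assert(runs > 0);
    using clock = std::chrono::steady_clock;
    double total_ms = 0.0;
    for (int r = 0; r < runs; ++r) {
        const auto start = clock::now();
        for (long i = 0; i < calls; ++i) {
            swap_volatile(a, b);
        }
        const auto stop = clock::now();
        total_ms += std::chrono::duration<double, std::milli>(stop - start).count();
    }
    return total_ms / runs;
}
```

A call such as `average_loop_ms(10000, 10, &x, &y)` would correspond to the "10000 calls loop" row averaged over 10 runs. Checking that `x` and `y` hold the expected values afterwards is a cheap way to confirm that the function under test really swaps.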

Stephan Poirier

Hey, I found something interesting while playing with my test application. I modified UltraSwap2 and came up with this one:

_inline PVOID UltraSwap3( LONG* a, LONG* b )
{
__asm mov eax, dword ptr [a]
*a = *b;
__asm mov dword ptr [b], eax
}

It's not very elegant for "supposed" assembly, but look at the results in Release mode:

UltraSwap2 after 1 loops        :   0.004190476 ms
UltraSwap2 after 1000 loops     :   0.016761902 ms
UltraSwap2 after 10000 loops    :   0.138285693 ms
UltraSwap2 after 1000000 loops  :  13.729674098 ms
UltraSwap2 after 10000000 loops : 218.611242878 ms
-------------------------------------
UltraSwap3 after 1 loops        :   0.003352380 ms
UltraSwap3 after 1000 loops     :   0.013409522 ms
UltraSwap3 after 10000 loops    :   0.092190462 ms
UltraSwap3 after 1000000 loops  :   8.895541502 ms
UltraSwap3 after 10000000 loops : 128.132170951 ms

It's a lot faster for long loops!!

Stef

Zdeslav Vojkovic

OK, we have a misunderstanding here: I compared the results with the original post, not with suhredayan's solution, which I didn't test because of the comments below it. I never said that my solution is the fastest one; I said that it is the fastest of the three I showed. It would be extremely stupid to say that something is the fastest, since everything can be optimized.

Another thing is that the disassembly for my function is larger, but not by as much as it seems, because suhredayan's listing is missing the prolog and epilog code, which is added automatically by the compiler. When you compare the code that is really generated, suhredayan's version is only 2 instructions shorter (16:14).

However, I did some tests with the following test app:

#include "stdafx.h"
#include <windows.h>  // header name lost in the original post; windows.h assumed (LONG, GetTickCount)
#define ITERATIONS 300000000

inline void UltraSwap2( LONG* a, LONG* b )
{
    LONG t = *b;
    *b = *a;
    *a = t;
}

inline void UltraSwap4( LONG* a, LONG* b )
{
    __asm mov eax, dword ptr [a]
    __asm mov ebx, dword ptr [b]
    __asm mov dword ptr [a], ebx
    __asm mov dword ptr [b], eax
}

int main(int argc, char* argv[])
{
    long t;
    long a = 111111111;
    long b = 222222222;
    char txt[1024];
    {
        t = GetTickCount();
        for(long i = 0; i < ITERATIONS; ++i)
        {
            UltraSwap2(&a, &b);
        }
        t = GetTickCount() - t;
        sprintf(txt, "UltraSwap2: %ld iterations done in %ld ms\n", ITERATIONS, t);
        printf(txt);
    }
    {
        t = GetTickCount();
        for(long i = 0; i < ITERATIONS; ++i)
        {
            UltraSwap4(&a, &b);
        }
        t = GetTickCount() - t;
        sprintf(txt, "UltraSwap4: %ld iterations done in %ld ms\n", ITERATIONS, t);
        printf(txt);
    }
    return 0;
}

Here are my results (3 runs for each version).

Release build, non-optimized:

D:\test\Release>test.exe
UltraSwap2: 300000000 iterations done in 3024 ms
UltraSwap4: 300000000 iterations done in 4376 ms
D:\test\Release>test.exe
UltraSwap2: 300000000 iterations done in 4387 ms
UltraSwap4: 300000000 iterations done in 4366 ms
D:\test\Release>test.exe
UltraSwap2: 300000000 iterations done in 4376 ms
UltraSwap4: 300000000 iterations done in 4537 ms

Release build, optimized for speed:

D:\test\Release>test.exe
UltraSwap2: 300000000 iterations done in 201 ms
UltraSwap4: 300000000 iterations done in 771 ms
D:\test\Release>test.exe
UltraSwap2: 300000000 iterations done in 210 ms
UltraSwap4: 300000000 iterations done in 761 ms
D:\test\Release>test.exe
UltraSwap2: 300000000 iterations done in 211 ms
UltraSwap4: 300000000 itera

Zdeslav Vojkovic

Yes, almost every solution can be made even better, but this one doesn't work correctly: it stores the address of a in eax, then sets a to the value of b, and then sets b to the value now stored in a, which results in a and b being equal.

If you have a chance, take a look at Michael Abrash's book "Zen of Code Optimization"; it will show you many neat tricks. Understanding assembly can only make you a better developer, so this is the right way to go.
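To make that failure easier to see without an inline assembler, here is a plain C++ model of the hybrid UltraSwap3. It is only a sketch and rests on one assumption: that inside the `__asm` lines the names `a` and `b` refer to the pointer parameters themselves, so the two `mov` instructions copy addresses rather than the pointed-to values. The function name `hybrid_swap_model` is hypothetical.

```cpp
#include <cassert>

// Plain C++ model of the hybrid UltraSwap3 (hypothetical name).
// Assumption: the __asm moves operate on the pointer parameters,
// not on the LONG values they point to.
inline void hybrid_swap_model(long* a, long* b) {
    assert(a && b);
    long* saved = a;  // models: mov eax, dword ptr [a]  (eax = address held in a)
    *a = *b;          // the C statement: the first value is overwritten by the second
    b = saved;        // models: mov dword ptr [b], eax  (only the local pointer changes)
    (void)b;          // silence "assigned but never used" warnings
}
```

Under this model the caller's first variable receives the second value and the second variable is never written, so both end up equal. That would also explain why the function looks fast: it does less work than a real swap.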

Stephan Poirier

Oops!! :-D I tested every version in debug to be sure that everything was swapped properly, but forgot this one :^)! Thanks, Zdeslav! I will certainly look for the book you mentioned.

I ran my test program with swap code exactly identical to what you tested yourself, and it always gives me the same result: the assembly code is still faster. I don't understand. Maybe it depends on how it is compiled and on which CPU it runs...

I use MS Visual C++ 6.0
Compiler version : MS 32-bit C/C++ Optimizing Compiler Version 12.00.8168 for 80x86
Linker version : MS Incremental Linker Version 6.00.8168
My CPU is : Intel Pentium III, 450MHz
SDK installed : MS SDK for WinXP SP2

My project settings are the default ones for an MFC dialog-based application. Also, every loop is called within a worker thread set to normal priority.

Stephan Poirier

Forget my last post, Zdeslav! I found what was going wrong: I forgot to put the "inline" keyword in my function header!! :(( Sorry. OK, now I'm going to sleep, I think I need it! Ha ha ha! Thanks for the help!