Is there a better way to achieve that?

    Stephan Poirier
    wrote on last edited by
    #1

    Hi everyone,

    I need to know if there is a better way to swap the values of two variables in assembly code. I have a thread that sorts some variables and I want to make it as fast as possible. I also wanted to do it in assembly just to learn a little about it. This is my first attempt at assembly code, so I apologize for any mistakes. If you know a better way to do it, let me know! The code below works fine, but I'm sure it can be done in a more "fancy" way.

    This is how I call the function:

        UltraSwap( &Var1, &Var2 );

    And this is the function, with inline assembly:

        #pragma warning (disable:4035) // disable warning 4035 (function must return something)
        _inline PVOID UltraSwap( LONG* a, LONG* b )
        {
            LONG x = *a;
            LONG y = *b;
            __asm mov eax, x
            __asm mov ebx, y
            __asm mov x, ebx
            __asm mov y, eax
            *a = x;
            *b = y;
        }
        #pragma warning (default:4035) // re-enable it

    The only thing I don't understand is why I can't move *a or *b directly into eax and ebx. I had to declare two local variables to do that, and I suspect that declaring them costs some time. I haven't measured the difference between swapping two variables in C++ and in assembly because I don't know how to: GetTickCount() is neither very reliable nor very fast.

    Let me know if you guys find something! Have a nice day!

    Stef
    Programming looks like taking drugs... I think I did an overdose. ;-P

      224917
      wrote on last edited by
      #2

      Stephan Poirier wrote:

          _inline PVOID UltraSwap( LONG* a, LONG* b )
          {
              LONG x = *a;
              LONG y = *b;
              __asm mov eax, x
              __asm mov ebx, y
              __asm mov x, ebx
              __asm mov y, eax
              *a = x;
              *b = y;
          }

      Can the local variable be avoided like this ?

          _inline PVOID UltraSwap( LONG* a, LONG* b )
          {
              __asm mov eax, dword ptr [a]
              __asm mov ebx, dword ptr [b]
              __asm mov dword ptr [a], ebx
              __asm mov dword ptr [b], eax
          }


      suhredayan
      There is no spoon.

        V 0
        wrote on last edited by
        #3

        How about this?

            __asm mov eax, x
            __asm mov x, y
            __asm mov y, eax

        Swapping needs only three variables/memory locations. Hope this helps.

        "If I don't see you in this world, I'll see you in the next one... and don't be late." ~ Jimi Hendrix

          224917
          wrote on last edited by
          #4

          V. wrote:

              __asm mov x, y

          That is not a valid assembly language statement. An x86 mov cannot take two memory operands; one of them has to be a register (or the source an immediate), AFAIK.


          suhredayan
          There is no spoon.

            V 0
            wrote on last edited by
            #5

            So you have to go through the registers? I forgot (it was a long time ago ;-))

            "If I don't see you in this world, I'll see you in the next one... and don't be late." ~ Jimi Hendrix

              Zdeslav Vojkovic
              wrote on last edited by
              #6

              If you are only concerned about speed, there is no need for assembly in this case:

                  inline void UltraSwap( LONG* a, LONG* b )
                  {
                      LONG x = *a;
                      LONG y = *b;
                      __asm mov eax, x
                      __asm mov ebx, y
                      __asm mov x, ebx
                      __asm mov y, eax
                      *a = x;
                      *b = y;
                  }

                  inline void UltraSwap2( LONG* a, LONG* b )
                  {
                      LONG t = *b;
                      *b = *a;
                      *a = t;
                  }

                  inline void UltraSwap3( LONG* a, LONG* b )
                  {
                      *b ^= *a ^= *b ^= *a;
                  }

              Of these three methods, UltraSwap2 is the simplest and the fastest (almost twice as fast as UltraSwap), and UltraSwap3 is approximately as fast as UltraSwap.
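A portable sketch of the temp-variable and XOR swaps from this post, using plain long instead of the Windows LONG typedef (an assumption, made only so it compiles anywhere). The function names SwapTemp, SwapXor and SwapStd are hypothetical. The XOR version is split into three statements and guards against a and b pointing at the same object, because XOR-ing a value with itself would zero it. As far as I know, the single-expression chain in UltraSwap3 also modifies *b twice in one expression, which is undefined behavior before C++17.

```cpp
#include <cassert>
#include <utility>  // std::swap

// Temp-variable swap, same idea as UltraSwap2 in the post.
inline void SwapTemp(long* a, long* b) {
    assert(a != nullptr && b != nullptr);
    long t = *b;
    *b = *a;
    *a = t;
}

// XOR swap written as three separate statements.
inline void SwapXor(long* a, long* b) {
    assert(a != nullptr && b != nullptr);
    if (a == b) return;  // same object: x ^ x would zero it
    *a ^= *b;            // *a = a0 ^ b0
    *b ^= *a;            // *b = b0 ^ (a0 ^ b0) = a0
    *a ^= *b;            // *a = (a0 ^ b0) ^ a0 = b0
}

// The standard library already provides a generic swap.
inline void SwapStd(long* a, long* b) {
    assert(a != nullptr && b != nullptr);
    std::swap(*a, *b);
}
```

All three can be called the same way as in the thread, for example SwapTemp(&Var1, &Var2). In everyday code std::swap is usually the simplest choice, and an optimizing compiler is free to keep the temporary in a register.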

                Stephan Poirier
                wrote on last edited by
                #7

                Thanks everyone! I think I will use the UltraSwap2 solution from Zdeslav, since it is faster than all the others. It was my mistake to think that doing it in assembly code would be faster. But as far as my wish to learn some assembly goes, suhredayan's solution is what I was looking for. A long time ago I wrote assembly code for some industrial programmable controllers, but both the syntax and the function names were different. Great help, guys!

                Stef
                Programming looks like taking drugs... I think I did an overdose. ;-P

                  224917
                  wrote on last edited by
                  #8

                  Zdeslav Vojkovic wrote: of these three methods UltraSwap2 is simplest and fastest

                  00413701  mov eax,dword ptr [x]
                      __asm mov ebx, dword ptr [y]
                  00413704  mov ebx,dword ptr [y]
                      __asm mov dword ptr [x], ebx
                  00413707  mov dword ptr [x],ebx
                      __asm mov dword ptr [y], eax
                  0041370A  mov dword ptr [y],eax


                  suhredayan
                  There is no spoon.

                    Stephan Poirier
                    wrote on last edited by
                    #9

                    That's what I thought at first sight. I didn't test Zdeslav's code; I just relied on the comment he gave us, which seemed to make sense. I didn't think to look at the generated assembly :^), my fault :) ! So I did a little test with performance counters, just to see what the real result is... :-D I tested the code on my old P3 450, running each solution 10 times and taking the average. For Zdeslav's solution:

                        void UltraSwap( LONG* a, LONG* b )
                        {
                            LONG t = *b;
                            *b = *a;
                            *a = t;
                        }

                                           IN DEBUG          IN RELEASE
                        One single call  : 0.004190476 ms    0.004190476 ms
                        1000 calls loop  : 0.186895210 ms    0.035199995 ms
                        10000 calls loop : 1.822018770 ms    0.307580906 ms

                    and for the suhredayan solution:

                        _inline PVOID UltraSwap2( LONG* a, LONG* b )
                        {
                            __asm mov eax, dword ptr [a]
                            __asm mov ebx, dword ptr [b]
                            __asm mov dword ptr [a], ebx
                            __asm mov dword ptr [b], eax
                        }

                                           IN DEBUG          IN RELEASE
                        One single call  : 0.004190476 ms    0.003352380 ms
                        1000 calls loop  : 0.170133307 ms    0.016761902 ms
                        10000 calls loop : 1.599923566 ms    0.158399976 ms

                    So now everyone can see the results. I don't think I have to explain further... :-D In debug mode there is not much difference, but after a 10000-call loop in release mode I'm sure that UltraSwap2 (suhredayan's version) is the big winner! It took just half the time of the other one. If you want to see my test code, let me know and I will try to post it. Thanks suhredayan for your advice, you put me on the right track!!

                    Have a nice day,
                    Stef
                    Programming looks like taking drugs... I think I did an overdose. ;-P
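The measurement described in this post (run each version several times and take the average) could be sketched portably with std::chrono::steady_clock in place of the Windows performance counters. This is not Stef's test code, which was never posted; AverageMillis, SwapTemp and their parameters are hypothetical names, and plain long stands in for LONG.

```cpp
#include <cassert>
#include <chrono>

// Temp-variable swap, used here as the function under test.
inline void SwapTemp(long* a, long* b) {
    long t = *b;
    *b = *a;
    *a = t;
}

// Runs `calls` swaps per run, repeats that `runs` times,
// and returns the mean wall-clock time of one run in milliseconds.
template <typename SwapFn>
double AverageMillis(SwapFn swap, long calls, int runs) {
    assert(calls > 0 && runs > 0);
    // volatile keeps the optimizer from deleting the loads and stores.
    volatile long a = 111111111;
    volatile long b = 222222222;
    double total_ms = 0.0;
    for (int r = 0; r < runs; ++r) {
        auto start = std::chrono::steady_clock::now();
        for (long i = 0; i < calls; ++i) {
            long x = a, y = b;  // read the volatile values
            swap(&x, &y);
            a = x;              // write the swapped values back
            b = y;
        }
        auto stop = std::chrono::steady_clock::now();
        total_ms += std::chrono::duration<double, std::milli>(stop - start).count();
    }
    return total_ms / runs;
}
```

A call such as AverageMillis(SwapTemp, 10000, 10) would correspond to the "10000 calls loop" row averaged over 10 runs. The volatile reads and writes are part of what gets timed, so the numbers are only useful for comparing versions measured the same way.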

                      Stephan Poirier
                      wrote on last edited by
                      #10

                      Hey, I found something interesting while playing with my test application. I modified UltraSwap2 and came up with this one:

                      _inline PVOID UltraSwap3( LONG* a, LONG* b )
                      {
                          __asm mov eax, dword ptr [a]
                          *a = *b;
                          __asm mov dword ptr [b], eax
                      }

                      It's not very elegant for "supposed" assembly, but look at the results in Release mode:

                      UltraSwap2 after 1 loops        :   0.004190476 ms
                      UltraSwap2 after 1000 loops     :   0.016761902 ms
                      UltraSwap2 after 10000 loops    :   0.138285693 ms
                      UltraSwap2 after 1000000 loops  :  13.729674098 ms
                      UltraSwap2 after 10000000 loops : 218.611242878 ms
                      -------------------------------------
                      UltraSwap3 after 1 loops        :   0.003352380 ms
                      UltraSwap3 after 1000 loops     :   0.013409522 ms
                      UltraSwap3 after 10000 loops    :   0.092190462 ms
                      UltraSwap3 after 1000000 loops  :   8.895541502 ms
                      UltraSwap3 after 10000000 loops : 128.132170951 ms

                      It's a lot faster for long loops!!

                      Stef
                      Programming looks like taking drugs... I think I did an overdose. ;-P

                        Zdeslav Vojkovic
                        wrote on last edited by
                        #11

                        OK, we have a misunderstanding here: I compared the results with the original post, not with suhredayan's solution, which I didn't test because of the comments below it. I never said that my solution is the fastest one; I said that it is the fastest of the three I showed. It would be extremely stupid to say that something is the fastest, since everything can be optimized.

                        Another thing: the disassembly for my function is larger, but not by as much as it seems, because suhredayan's listing is missing the prolog and epilog code that the compiler adds automatically. When you compare the code that is really generated, suhredayan's version is only 2 instructions shorter (16:14).

                        However, I did some tests with the following test app:

                            #include "stdafx.h"
                            #include <windows.h>  // GetTickCount, LONG
                            #include <stdio.h>    // sprintf, printf

                            #define ITERATIONS 300000000

                            inline void UltraSwap2( LONG* a, LONG* b )
                            {
                                LONG t = *b;
                                *b = *a;
                                *a = t;
                            }

                            inline void UltraSwap4( LONG* a, LONG* b )
                            {
                                __asm mov eax, dword ptr [a]
                                __asm mov ebx, dword ptr [b]
                                __asm mov dword ptr [a], ebx
                                __asm mov dword ptr [b], eax
                            }

                            int main(int argc, char* argv[])
                            {
                                long t;
                                long a = 111111111;
                                long b = 222222222;
                                char txt[1024];

                                {
                                    // time the temp-variable version
                                    t = GetTickCount();
                                    for(long i = 0; i < ITERATIONS; ++i)
                                    {
                                        UltraSwap2(&a, &b);
                                    }
                                    t = GetTickCount() - t;
                                    sprintf(txt, "UltraSwap2: %ld iterations done in %ld ms\n", ITERATIONS, t);
                                    printf(txt);
                                }

                                {
                                    // time the inline assembly version
                                    t = GetTickCount();
                                    for(long i = 0; i < ITERATIONS; ++i)
                                    {
                                        UltraSwap4(&a, &b);
                                    }
                                    t = GetTickCount() - t;
                                    sprintf(txt, "UltraSwap4: %ld iterations done in %ld ms\n", ITERATIONS, t);
                                    printf(txt);
                                }

                                return 0;
                            }

                        Here are my results (3 runs for each version).

                        Release build, not optimized:

                            D:\test\Release>test.exe
                            UltraSwap2: 300000000 iterations done in 3024 ms
                            UltraSwap4: 300000000 iterations done in 4376 ms

                            D:\test\Release>test.exe
                            UltraSwap2: 300000000 iterations done in 4387 ms
                            UltraSwap4: 300000000 iterations done in 4366 ms

                            D:\test\Release>test.exe
                            UltraSwap2: 300000000 iterations done in 4376 ms
                            UltraSwap4: 300000000 iterations done in 4537 ms

                        Release build, optimized for speed:

                            D:\test\Release>test.exe
                            UltraSwap2: 300000000 iterations done in 201 ms
                            UltraSwap4: 300000000 iterations done in 771 ms

                            D:\test\Release>test.exe
                            UltraSwap2: 300000000 iterations done in 210 ms
                            UltraSwap4: 300000000 iterations done in 761 ms

                            D:\test\Release>test.exe
                            UltraSwap2: 300000000 iterations done in 211 ms
                            UltraSwap4: 300000000 itera
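One thing a timing loop like the one in this post cannot show is whether a function really swaps. After an even number of iterations (such as 300000000), a correct swap and a function that does nothing both leave a and b at their starting values. A minimal sketch of a check that uses an odd count instead; SwapsCorrectly, SwapTemp and NoOpSwap are hypothetical names introduced for illustration, and plain long stands in for LONG.

```cpp
#include <cassert>

// Temp-variable swap, the known-good reference.
inline void SwapTemp(long* a, long* b) {
    long t = *b;
    *b = *a;
    *a = t;
}

// Deliberately wrong: leaves both values untouched.
inline void NoOpSwap(long*, long*) {}

// Applies `swap` an odd number of times and reports whether
// the two values ended up exchanged.
template <typename SwapFn>
bool SwapsCorrectly(SwapFn swap, long iterations) {
    assert(iterations % 2 == 1);  // an even count would hide a no-op
    long a = 111111111;
    long b = 222222222;
    for (long i = 0; i < iterations; ++i) {
        swap(&a, &b);
    }
    return a == 222222222 && b == 111111111;
}
```

Running such a check once per candidate before timing it keeps a fast but broken version from looking like a winner.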

                          Zdeslav Vojkovic
                          wrote on last edited by
                          #12

                          Yes, almost every solution can be made even better, but this one doesn't work correctly: it loads the address held in a (not the value it points to) into eax, then sets *a to the value of *b, and then stores that address into the local pointer b instead of writing through it, so *b is never updated and a and b end up equal. If you have a chance, take a look at Michael Abrash's book "Zen of Code Optimization"; it will show you many neat tricks. Understanding assembly can only make you a better developer, so this is the right way to go.
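The failure described in this post can be reproduced in plain C++ without any assembly. In this sketch, BrokenSwap (a hypothetical name) mimics the same kind of mistake: it saves the pointer rather than the value it points to, so the original value of *a is lost. FixedSwap, also a hypothetical name, saves the value first.

```cpp
#include <cassert>

// Mimics the mistake: keeps the address, not the value.
inline void BrokenSwap(long* a, long* b) {
    assert(a != nullptr && b != nullptr);
    long* saved = a;  // address of the first variable, like eax in the post
    *a = *b;          // first variable now holds the second value
    b = saved;        // only the local pointer changes; *b is never written
    (void)b;          // silence "set but unused" warnings
}

// Fixed: save the value before overwriting it.
inline void FixedSwap(long* a, long* b) {
    assert(a != nullptr && b != nullptr);
    long saved = *a;
    *a = *b;
    *b = saved;
}
```

After BrokenSwap(&x, &y) both x and y hold the old value of y, which is the "a and b being equal" outcome. Because the broken version performs fewer memory writes, it can also look faster in a benchmark.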

                            Stephan Poirier
                            wrote on last edited by
                            #13

                            Oops!! :-D I tested every version in debug to be sure that everything was swapped properly, but forgot this one :^)! Thanks Zdeslav! I will certainly look for the book you're talking about. I ran my test program with swap code exactly identical to what you tested yourself, and it always gives me the same result: the assembly code is still faster. I don't understand. Maybe it depends on how it is compiled and on which CPU it runs...

                            Compiler : MS Visual C++ 6.0 (MS 32-bit C/C++ Optimizing Compiler Version 12.00.8168 for 80x86)
                            Linker   : MS Incremental Linker Version 6.00.8168
                            CPU      : Intel Pentium III, 450MHz
                            SDK      : MS SDK for WinXP SP2

                            My project settings are the defaults for an MFC dialog-based application. Also, all the loops run within a worker thread set to normal priority.

                            Programming looks like taking drugs... I think I did an overdose. ;-P

                              Stephan Poirier
                              wrote on last edited by
                              #14

                              Forget my last post, Zdeslav! I found what was going wrong: I forgot to put the "inline" keyword in my function header!! :(( Sorry. OK, now I'm going to sleep, I think I need it! Ha ha ha! Thanks for the help!

                              Programming looks like taking drugs... I think I did an overdose. ;-P
