Interesting problems

    FlyingDancer
    wrote on last edited by
    #1

    Program code:

    #include <stdio.h>
    #include <time.h>
    #define N 1024
    int i,j,k;
    float slice[N][N];
    void main()
    {
        time_t start,end;
        float s;
        start=time(NULL);
        for(k=0;k<100;k++)
        {
            for(j=0;j<N;j++)
                for(i=0;i<N;i++)
                {
                    slice[i][j]=(float)(slice[i][j]+0.01);
                    slice[j][i]=(float)(slice[j][i]+0.01);
                }
            printf("%d\n",k);
        }
        end=time(NULL);
        s=difftime(end,start);
        printf(" The total time is %f:",s);
    }

    Questions or problems (compiled with Visual C++ 6.0):

    1. If N equals 1022, 1023, 1025 or 1026, the run time is about 13 seconds, but if N = 1024 it is about 56 seconds. That is, the speed is very different.

    2. "slice[j][i]=(float)(slice[j][i]+0.01);" executes more than twice as fast as "slice[i][j]=(float)(slice[i][j]+0.01);". You can try this by removing one of the two statements.

    3. If "int i,j,k; float slice[N][N];" is moved into function main(), the program fails with "test.exe has encountered a problem and needs to close. We are sorry for the inconvenience." In addition, my program file is named "test.cpp".

    Have you seen these problems before? Could you give me an explanation and tell me how to avoid these bad results? Any help is appreciated! Thanks!

      Maxwell Chen
      wrote on last edited by
      #2
      1. With VC++ 6 and with VC++ 7 (2002), I saw the same behaviour. 2. Regarding what FlyingDancer wrote about the crash when "float slice[N][N];" is moved into function main(): since you use the .cpp extension, please try the following, which allocates the array with new. It does not crash.

      #include <stdio.h>   // printf
      #include <time.h>    // time, difftime

      // #define N 1023
      // float slice[N][N];
      int main()
      {
          int i,j,k;
          time_t start,end;
          float s;
          const int N = 1024;
          // Allocate the 4 MB array on the heap instead of the stack;
          // the trailing () zero-initializes it, like the original global.
          float (*slice)[N] = new float[N][N]();

          start=time(NULL);
          for(k=0;k<100;k++)
          {
              for(j=0;j<N;j++)
                  for(i=0;i<N;i++)
                  {
                      slice[i][j]=(float)(slice[i][j]+0.01F);
                      slice[j][i]=(float)(slice[j][i]+0.01F);
                  }
              printf("%d\n",k);
          }
          end=time(NULL);
          s=(float)difftime(end,start);
          printf("   The total time is %f:",s);
          delete[] slice;
          return 0;
      }

      As for N = 1024 taking that long, I don't know. My guess is that it may have something to do with the x86 instructions... Maxwell Chen
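A note on the N = 1024 slowdown: nobody in this thread pins down the cause, but one well-known effect with power-of-two row lengths is that every element of a column can map to the same cache set, so the column walk keeps evicting its own data. The sketch below only illustrates that arithmetic. The cache geometry (64-byte lines, 64 sets) and the function name `distinct_sets_for_column` are assumptions made for illustration, not properties of any particular CPU, and the sketch does not prove that this is what happened in the original measurements.

```cpp
#include <cstddef>
#include <set>

// Count how many distinct cache sets are touched when walking down
// column 0 of a float[n][n] array stored row by row.
// Hypothetical cache geometry (an assumption): 64-byte lines, 64 sets.
std::size_t distinct_sets_for_column(std::size_t n)
{
    const std::size_t line_bytes = 64;
    const std::size_t num_sets = 64;
    std::set<std::size_t> sets;
    for (std::size_t row = 0; row < n; ++row) {
        // Byte offset of element [row][0] from the start of the array.
        const std::size_t byte_offset = row * n * sizeof(float);
        sets.insert((byte_offset / line_bytes) % num_sets);
    }
    return sets.size();
}
```

With 4-byte floats and this assumed geometry, calling `distinct_sets_for_column(1024)` reports a single set, while 1023 and 1025 spread the column over all 64 sets. Padding each row by a few elements is one way to break the power-of-two stride.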
        FlyingDancer
        wrote on last edited by
        #3

        Yes, it works well, leaving aside the run speed. Your approach uses a pointer to an array, which looks a little different, but what exactly does using an array pointer solve for a big array like slice[N][N]? This is a difficult problem, really. Is it related to the OS or to the way the program is compiled? Or maybe it is a memory allocation problem... What do you think?

          ohadp
          wrote on last edited by
          #4

          It crashes because, if it is not allocated dynamically, this array is allocated on the stack, which probably can't hold 1024*1024*4 bytes...
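The size arithmetic behind this answer can be checked directly. The sketch below is a minimal illustration, assuming a 4-byte `float`; the names `slice_bytes` and `make_slice` are made up for this example. It computes the storage the array needs and shows one heap-backed alternative using `std::vector`, which is a substitute for (not a copy of) the `new float[N][N]` approach posted earlier.

```cpp
#include <cstddef>
#include <vector>

const std::size_t N = 1024;

// Bytes needed by float slice[N][N].
std::size_t slice_bytes()
{
    return N * N * sizeof(float);
}

// Heap-backed N x N grid stored row by row in one contiguous buffer.
// Element (i, j) lives at index i * N + j.
std::vector<float> make_slice()
{
    return std::vector<float>(N * N, 0.0f);
}
```

With 4-byte floats, `slice_bytes()` comes to 4 MB, which is why a local array of this size is a poor fit for a stack of about 1 MB. The vector itself is a small object; its 4 MB buffer is allocated on the heap.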

            FlyingDancer
            wrote on last edited by
            #5

            Problem: "slice[j][i]=(float)(slice[j][i]+0.01F);" executes faster than "slice[i][j]=(float)(slice[i][j]+0.01F);". I think it should be answered from two aspects: 1. In VC, a two-dimensional array is stored row by row: all of the first row, then the second row, and so on. 2. Virtual memory technology: paging and swapping. In this problem no row of the array can get enough free space, so accessing data in another row triggers a swap; therefore one statement runs faster than the other. Am I right?
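The first point, row-by-row storage, is easy to confirm with a few lines of code. This is a minimal sketch with small, hypothetical dimensions (`ROWS`, `COLS`) and a helper name (`element_offset`) invented for the example; it measures how far element [i][j] sits from element [0][0].

```cpp
#include <cstddef>

const std::size_t ROWS = 4;   // small hypothetical sizes for illustration
const std::size_t COLS = 8;
float grid[ROWS][COLS];

// Distance, in elements, from grid[0][0] to grid[i][j].
// Computed from byte addresses so it works across row boundaries.
std::size_t element_offset(std::size_t i, std::size_t j)
{
    const char* base = reinterpret_cast<const char*>(&grid[0][0]);
    const char* elem = reinterpret_cast<const char*>(&grid[i][j]);
    return static_cast<std::size_t>(elem - base) / sizeof(float);
}
```

Stepping the second index moves one element forward, while stepping the first index jumps a whole row of `COLS` elements, so `element_offset(i, j)` equals `i * COLS + j`.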

              Maxwell Chen
              wrote on last edited by
              #6

              :-D Maxwell Chen

                Paul Ranson
                wrote on last edited by
                #7

                1. I don't know, other than it's likely to be a virtual memory pathology.

                2. This is your code,

                // v1
                for ( j = 0; j < N; j++ )
                {
                    for ( i = 0; i < N; i++ )
                    {
                        slice [i][j] = (float)(slice[i][j] + 0.01 ) ;
                    }
                }
                // which is equivalent to
                for ( j = 0; j < N; j++ )
                {
                    for ( i = 0; i < N; i++ )
                    {
                        float * pf = &slice[0][0] + ( i * N ) + j ;
                        *pf += 0.01 ;
                    }
                }

                // v2
                for ( j = 0; j < N; j++ )
                {
                    for ( i = 0; i < N; i++ )
                    {
                        slice [j][i] = (float)(slice[j][i] + 0.01 ) ;
                    }
                }
                // which is equivalent to
                for ( j = 0; j < N; j++ )
                {
                    float * pf = &slice[0][0] + (j * N) ;
                    for ( i = 0; i < N; i++ )
                    {
                        *pf += 0.01 ;
                        ++pf ;
                    }
                }

                IOW, in the first example you are asking the CPU to do an extra multiplication each time around the inner loop. The optimiser may be able to turn it into an addition (if that's faster...), but it's still extra work.

                More subtly, the second example accesses memory consecutively, so the data is much more likely to be in the CPU cache, whereas the first accesses every N * sizeof ( float ) bytes, which means the next value will almost never be in the cache. Accessing main memory means waiting about; accessing the cache puts that off. And since the cache is read from and written to main memory in relatively large chunks, you get an entire 'cache line' of modified values going to main memory in the same time as it takes to write one.

                Anyway, it would be worth examining the generated machine code for each example to see what the optimiser actually does, and perhaps playing with the options.

                3. The default stack size for Win32 is 1MB. You are asking to allocate 4MB (sizeof ( float ) == 4), so the only way is to exit with an exception. You can adjust this in the linker, or with EditBin, but for a data structure of this nature either declaring it statically as in your example or allocating it on the heap as in Maxwell's is appropriate.

                Paul
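To measure the traversal-order effect described in point 2 on a given machine, something along these lines could be used. It is a sketch rather than a rigorous benchmark: it assumes a C++11 compiler (so it will not build with Visual C++ 6.0), stores the grid in a flat `std::vector`, and the function name `bump_all` is made up for this example.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Add 0.01f to every element of an n x n grid stored row by row in `a`.
// row_major == true walks consecutive addresses (like slice[j][i] above);
// row_major == false jumps n elements per step (like slice[i][j] above).
// Returns the elapsed wall-clock time in seconds.
double bump_all(std::vector<float>& a, std::size_t n, bool row_major)
{
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t outer = 0; outer < n; ++outer) {
        for (std::size_t inner = 0; inner < n; ++inner) {
            const std::size_t idx = row_major ? outer * n + inner
                                              : inner * n + outer;
            a[idx] += 0.01f;
        }
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Calling `bump_all` twice on a `std::vector<float>` of `1024 * 1024` elements, once with `true` and once with `false`, gives two times to compare. Both orders leave identical values in the array; only the memory access pattern differs. Results depend heavily on the compiler's optimisation settings and the hardware, so several runs are advisable before drawing conclusions.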

                  FlyingDancer
                  wrote on last edited by
                  #8

                  Great! Full and clear!! Thank you very much!!! :laugh:
