MASM 9.0 bug.
-
Harold, your code is not doing the same thing: it is decrementing a register, whereas my code was propagating a borrow through multiple DWORDs. My timing showed that the aligned code took 3.57 seconds to loop 100,000,000 (one hundred million) times, and the unaligned test took 4.36 seconds for the same number of iterations. That is a ratio of 1.22, i.e. the unaligned run took about 22% longer. In addition, the loop had to re-initialize all of the DWORDs with the correct data on each pass. That initialization is included in both the base time and the test time, so the whole increase is due to the unaligned loop itself, meaning that the real difference is in the neighborhood of 44%. I will give you the timing differences and the code changes between the two tests. If you want to see the execution time differences (Visual Studio "Disassembly" tab), just ask and I will include that also.
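As a check on the arithmetic in the post, here is a minimal Python sketch. It is not part of the original test, and the function names are made up for illustration. It computes the gross slowdown from the two reported times and then, assuming the 44% figure means the slowdown once a fixed initialization cost is subtracted from both times, back-solves for that cost. The overhead it prints is implied by that assumption; it is not a measurement.

```python
def percent_slower(base_s, test_s):
    """Percent by which test_s exceeds base_s."""
    return (test_s / base_s - 1.0) * 100.0


def implied_overhead(base_s, test_s, net_percent):
    """Fixed time t that must be removed from both runs so that
    (test_s - t) / (base_s - t) == 1 + net_percent / 100."""
    r = 1.0 + net_percent / 100.0
    return (r * base_s - test_s) / (r - 1.0)


aligned_s = 3.57    # reported time, aligned test
unaligned_s = 4.36  # reported time, unaligned test

gross = percent_slower(aligned_s, unaligned_s)
overhead = implied_overhead(aligned_s, unaligned_s, 44.0)

print(f"gross slowdown: {gross:.1f}%")
print(f"overhead implied by a 44% net slowdown: {overhead:.2f} s")
```

Under that reading, roughly half of the aligned run would be initialization, so a gross difference of about 22% and a net difference near 44% are consistent with each other.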
RSA-Test TimeSbbLoop - 100M*TEST32: 3.57
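For readers who have not seen the pattern being timed, the following is a rough Python model of what a multi-DWORD subtract-with-borrow loop computes. It is only a sketch under assumptions: the function name, the little-endian word order, and the sample data are illustrative and are not taken from the listing, and it says nothing about timing or alignment.

```python
MASK32 = 0xFFFFFFFF


def sub_with_borrow(dst, src):
    """Subtract src from dst in place, one 32-bit word at a time.

    Both are lists of 32-bit words, least significant word first.
    The borrow out of each word feeds the next word, which is what
    a chain of sbb instructions does. Returns the final borrow.
    """
    borrow = 0
    for i in range(len(src)):
        diff = dst[i] - src[i] - borrow
        borrow = 1 if diff < 0 else 0
        dst[i] = diff & MASK32
    return borrow


# Illustrative data: 2**64 minus 1, held in three DWORDs.
value = [0x00000000, 0x00000000, 0x00000001]
final_borrow = sub_with_borrow(value, [1, 0, 0])
print([hex(w) for w in value], final_borrow)
```

In this example the borrow ripples through the two low words before it is absorbed by the third, so each step depends on the previous one.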
;
; Code from Masm 9.0 .lst file for aligned test.
;
C ;*******************************************************************************
C ;
C ; Timing test for alignment.
C ;
C ; esi has the source OFFSET of the data.
C ; edi has the already scaled OFFSET of the destination data.
C ; ecx has the DWORD count to subtract.
C ;
C ;*******************************************************************************
C ALIGN OWORD
00001AEE C .data
C
C ALIGN OWORD
00001AF0 00000020 [ C TestData DWORD 32 DUP (0) ; 128 BYTES, 8 xmm regs
00000000
]
C
00001B70 00000001 C TestDWORD DWORD 1
C
C ALIGN WORD
00001B74 0023 C WORD (LENGTHOF szTestCase - 1)
00001B76 52 53 41 2D 54 C szTestCase BYTE "RSA-Test TimeSbbLoop - 100M*TEST32:",0
65 73 74 20 54
69 6D 65 53 62
62 4C 6F 6F 70
20 2D 20 31 30
30 4D 2A 54 45
53 54 33 32 3A
00
C
000006A0 C .code
C
C ;
C ; Get start time.
C ;
000006A0 C Start:
000006A0 E8 000054BB C CALL GetStartTime
C ; jnz Exit
C ;
C ; Clear the clear regs.
C ;
000006A5 66| 0F EF C0 C pxor xmm0,xmm0
000006A9 66| 0F 6F C8 C movdqa xmm1,xmm0
000006AD 66| 0F 6F D0 C -
Of course it's not doing the same thing; it's just testing the effect of alignment by itself. There is way too much shit here to be sure of anything.
Harold, my feelings exactly. I will use what I have and beware of any ALIGN in the code section except for the entry at PROC. I will ensure that there is no entry to an aligned label except by a jump or conditional jump. Dave.