GetTickCount()
-
Tim Smith's 24 Day Bug post reminded me of a nasty bug I found in Windows 3.x back in the early 90s. I was working on software that controlled scientific instruments, particularly chromatographs. The control wasn't hard real-time, but we needed to time intervals reasonably accurately, mostly for sequencing commands to the instrument. We used the system call GetTickCount() for the timing: one call at the beginning and one at the end, with the difference giving the interval. We even took the possible wrap into account, as that was fairly well known. When the software was released, we started getting occasional reports of the sequencing not happening properly. I was the lucky one who got to figure it out. :sigh:
In Windows 3.x, GetTickCount() returned a 32-bit value in the register pair DX:AX. I wrote test programs that called GetTickCount() continually to see if I could duplicate the problem, and sure enough, after days of pounding, it would show up at random intervals. There would suddenly be a jump in the time of about 65.5 seconds. While the occurrence of the error was random, the time difference was consistent.
What was happening was that when Windows copied the tick count from its internal variables to DX:AX for return, it would transfer AX first and then DX. Whoever wrote GetTickCount() apparently was clueless about dealing with variables that could be hit by interrupt service routines, because they didn't disable the timer interrupt before doing the transfer. An interrupt could come in between the instruction loading AX and the one loading DX. If the low word was 0xFFFF when it was copied to AX, the tick would carry and the high word would be incremented. What was returned was HighWord + 1, but still 0xFFFF in the low word instead of 0, a value about 65.5 seconds ahead of the real count. There was an opportunity for this to happen every 65536 milliseconds. Call GetTickCount() often enough and it will happen.
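The torn read described above can be reproduced in a few lines of portable C. This is only a sketch of the mechanism, not the Windows 3.x code: the `tick_words` struct, `timer_tick()`, `torn_read()`, `true_value()`, and `elapsed_ms()` are all hypothetical names made up for illustration, and the "interrupt" is simulated by calling `timer_tick()` between the two 16-bit copies.

```c
#include <stdint.h>

/* Hypothetical model: a 32-bit tick count stored as two 16-bit words,
   the way a 16-bit CPU has to handle it. */
typedef struct {
    uint16_t low;
    uint16_t high;
} tick_words;

/* Simulated timer interrupt: add ms to the low word and carry into
   the high word when the low word wraps. */
static void timer_tick(tick_words *t, uint16_t ms)
{
    uint16_t old_low = t->low;
    t->low = (uint16_t)(t->low + ms);
    if (t->low < old_low) {
        t->high++;
    }
}

/* Torn read: copy the low word, let the "interrupt" fire, then copy
   the high word. This mirrors the AX-then-DX order in the post. */
static uint32_t torn_read(tick_words *t, uint16_t ms)
{
    uint16_t ax = t->low;      /* low word copied first */
    timer_tick(t, ms);         /* interrupt lands between the copies */
    uint16_t dx = t->high;     /* high word copied after the carry */
    return ((uint32_t)dx << 16) | ax;
}

/* The value a consistent read would see. */
static uint32_t true_value(const tick_words *t)
{
    return ((uint32_t)t->high << 16) | t->low;
}

/* Wrap-safe interval: unsigned subtraction is modulo 2^32, so the
   result is still right when the counter wraps between the calls. */
static uint32_t elapsed_ms(uint32_t start, uint32_t end)
{
    return end - start;
}
```

Starting from low = 0xFFFF and high = 0x0001, `torn_read()` with a 1 ms tick returns 0x0002FFFF while the counter really holds 0x00020000, so the reading is 65535 ms too far ahead. Note that the wrap-safe subtraction in `elapsed_ms()` cannot help here, because the bad value is a perfectly plausible tick count.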
The solution was to write our own DLL that exported our version of GetTickCount() and make sure it was linked before the Windows library. Our version just disabled interrupts, called Windows' GetTickCount(), and then re-enabled interrupts. This way none of the code that depended on GetTickCount() had to change. It seemed like a really rookie error for someone writing operating systems to make. But nothing much coming out of Microsoft surprises me anymore. Porting the code to 32 bits for Windows 95 and NT permanently fixed the problem, as GetTickCount() would copy the whole 32-bit tick count in one instruction.
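The shape of that wrapper can be sketched as a small simulation in portable C. It is an illustration under stated assumptions, not the original DLL: real 16-bit code would use the CPU's interrupt-disable and interrupt-enable instructions, whereas here interrupts are modeled with a flag, and every name (`buggy_tick_count()`, `guarded_tick_count()`, `sim_disable_interrupts()`, and so on) is hypothetical.

```c
#include <stdint.h>

/* Simulated machine state (all hypothetical). */
static uint16_t tick_low = 0xFFFF;
static uint16_t tick_high = 0x0001;
static int interrupts_enabled = 1;
static int tick_pending = 0;

/* One timer tick: bump the low word, carry into the high word on wrap. */
static void apply_tick(void)
{
    if (++tick_low == 0) {
        tick_high++;
    }
}

/* A point where the timer interrupt wants to fire. With interrupts
   off, the tick is held pending instead of being delivered. */
static void interrupt_window(void)
{
    if (interrupts_enabled) {
        apply_tick();
    } else {
        tick_pending = 1;
    }
}

static void sim_disable_interrupts(void)
{
    interrupts_enabled = 0;
}

/* Re-enable interrupts and deliver any tick that was held back. */
static void sim_enable_interrupts(void)
{
    interrupts_enabled = 1;
    if (tick_pending) {
        tick_pending = 0;
        apply_tick();
    }
}

/* Stand-in for the flawed routine: low word first, then high word,
   with an interrupt window in between. */
static uint32_t buggy_tick_count(void)
{
    uint16_t ax = tick_low;
    interrupt_window();
    uint16_t dx = tick_high;
    return ((uint32_t)dx << 16) | ax;
}

/* The wrapper pattern from the post: interrupts off, call the
   original routine, interrupts back on. */
static uint32_t guarded_tick_count(void)
{
    sim_disable_interrupts();
    uint32_t ticks = buggy_tick_count();
    sim_enable_interrupts();
    return ticks;
}
```

With the counter at 0x0001FFFF, `guarded_tick_count()` returns 0x0001FFFF and the held-back tick is applied afterwards, while calling `buggy_tick_count()` directly from the same state returns the torn value 0x0002FFFF.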
The evol
-
Tim Craig wrote:
Seemed like a really rookie error to make by someone writing operating systems
I agree. :)
If the Lord God Almighty had consulted me before embarking upon the Creation, I would have recommended something simpler. -- Alfonso the Wise, 13th Century King of Castile.
-
Good one! Thank you for sharing :) Mark
-
I am impressed, and very glad that it was not me who had to solve that problem. You can only beat your head against a wall so many times before you get a headache.
INTP "Program testing can be used to show the presence of bugs, but never to show their absence." Edsger Dijkstra