Couple of questions concerning bits
-
I was under the impression (according to my programming book) that a left shift on a byte causes the bits on the left to "fall off" and zeros to be inserted on the right, e.g. 0x1F << 2 = 0xC0, but it equals 0x7C0. Why is this? My reason for asking is that I am trying to find a four-byte sequence within an MPEG-1 file that doesn't seem to be byte aligned. I wrote a little function to do a "bit by bit" search, but since the above statement is not true it fails. Here is my code (I feed it 64 bits at a time):
int FindSequenceHeader(unsigned char *str)
{
    // Does a bit by bit search for the sequence header
    char ch[4];
    DWORD dwValue;
    for (int x = 0; x < 4; x++) {
        for (int i = 0; i < 8; i++) {
            ch[0] = ( (str[x + 0] << (0 + i)) | (str[x + 1] >> (8 - i)) );
            ch[1] = ( (str[x + 1] << (0 + i)) | (str[x + 2] >> (8 - i)) );
            ch[2] = ( (str[x + 2] << (0 + i)) | (str[x + 3] >> (8 - i)) );
            ch[3] = ( (str[x + 3] << (0 + i)) | (str[x + 4] >> (8 - i)) );
            dwValue = ( ch[0] | (ch[1] << 8) | (ch[2] << 16) | (ch[3] << 24) );
            switch (dwValue) {
            case 0x000001b3:
                cout << "Video Sequence Header Found" << endl;
                return 0;
                break;
            default:
                break;
            }
        }
    }
    // Found nothing...
    return 1;
}
which then leads me to ask why this standard routine I've seen in many code samples works:
DWORD MakeDword(unsigned char *str)
{
    return ( str[0] | (str[1] << 8) | (str[2] << 16) | (str[3] << 24) );
}
Any thoughts?
-
georgiek50 wrote: e.g. 0x1F << 2 = 0xC0 but it equals 0x7C0
georgiek50 wrote: Why is this?
Because it is the right answer (for a shift of 2, that is 0x7C):
0x1F = 0001 1111
Shift it left a couple of times and put zeros in on the right:
0111 1100 = 0x7C
0000 = 0   0001 = 1   0010 = 2   0011 = 3
0100 = 4   0101 = 5   0110 = 6   0111 = 7
1000 = 8   1001 = 9   1010 = A   1011 = B
1100 = C   1101 = D   1110 = E   1111 = F
Cheers!! Adam.
-
georgiek50 wrote: I was under the impression (according to my programming book) that a left shift on a byte caused the bits on the left to "fall off" and zeros inserted to the right: eg. 0x1F << 2 = 0xC0 but it equals 0x7C0
That's not the right answer. 0x1F << 2 = 0x7C. But in any case, to ensure that you keep only the low 8 bits, you can AND the result with 0xFF. For example:
unsigned char ch[4];
ch[0] = 0x1F;
DWORD dwVar = ch[0];
dwVar = (dwVar << 2) & 0xFF; // dwVar = 0x7C
georgiek50 wrote: dwValue = ( ch[0] | (ch[1] << 8) | (ch[2] << 16) | (ch[3] << 24) );
The compiler loads each character of ch into a register and does the shift there; in C and C++ an operand narrower than int is promoted to int before it is shifted. Written in assembly language, the above code would look roughly like this:
mov eax, ch[0]; // ch[0] = byte ptr address in memory
mov edx, ch[1]; // same thing for ch[1]
shl edx, 8; // ch[1] << 8
or eax, edx; // ch[0] | (ch[1] << 8)
mov edx, ch[2]; // edx = ch[2]
shl edx, 16; // ch[2] << 16
or eax, edx; // ch[0] | (ch[1] << 8) | (ch[2] << 16 );
and so on... The final result in eax is then moved to the dword ptr memory location that corresponds to dwValue. On 32-bit x86 the general-purpose registers are 4 bytes (32 bits) wide, and a DWORD is a 32-bit unsigned integer. That's why this works.
// After all, I realized that even my comment lines have bugs
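To make the widening step concrete, here is a minimal sketch in C++11. The names make_dword, widen_unsigned and widen_signed are made up for this example and are not from the original post. It assumes str points at 4 or more bytes and that str[0] is the low byte, as in MakeDword. Note that the original routine takes unsigned char, while the search function stores its bytes in a plain char array, which is signed on many compilers.

```cpp
#include <cstdint>

// Hypothetical rewrite of MakeDword: widen each byte to 32 bits before
// shifting, so "<< 24" never shifts into the sign bit of an int.
// str[0] ends up as the low byte of the result.
constexpr std::uint32_t make_dword(const unsigned char* str) {
    return  static_cast<std::uint32_t>(str[0])
         | (static_cast<std::uint32_t>(str[1]) << 8)
         | (static_cast<std::uint32_t>(str[2]) << 16)
         | (static_cast<std::uint32_t>(str[3]) << 24);
}

// An unsigned byte keeps its value when widened: 0xB3 stays 0x000000B3.
constexpr std::uint32_t widen_unsigned(unsigned char c) {
    return static_cast<std::uint32_t>(c);
}

// A signed byte is sign-extended: -77 (bit pattern 0xB3) becomes 0xFFFFFFB3.
constexpr std::uint32_t widen_signed(signed char c) {
    return static_cast<std::uint32_t>(c);
}

static_assert(widen_unsigned(0xB3) == 0x000000B3u, "zero-extended");
static_assert(widen_signed(-77) == 0xFFFFFFB3u, "sign-extended");
```

With this byte order, the bytes B3 01 00 00 produce 0x000001B3, while the same value stored most significant byte first (00 00 01 B3) would produce 0xB3010000.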
-
Sorry, what I meant to write was 0x1F << 6 (instead of 2) = 0x7C0, where I was expecting just 0xC0 according to this logic:
0x1F      = 00011111
0x1F << 6 = 11000000 or 0xC0
I think my understanding of shifting is very off!
-
The compiler promotes the value to an int before shifting, so it is wider than 8 bits. Looking at the low 16 bits:
0x1F      = 0000000000011111
0x1F << 6 = 0000011111000000 = 0x7C0
So, for an 8-bit operation you can write:
unsigned long a = 0x1F;
a = LOBYTE(a << 6);
This gives the desired result (0xC0), and you can fit it into your program.
Harsha
http://www.ece.arizona.edu/~hpg
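To illustrate the point above without the Windows headers, here is a minimal sketch of the promotion behaviour. The helper names shift_promoted and shift_as_byte are made up for this example; it assumes a C++11 compiler and a shift count between 0 and 7.

```cpp
#include <cstdint>

// The byte is promoted to int before the shift, so no bits are lost yet.
constexpr int shift_promoted(std::uint8_t value, int count) {
    return value << count;
}

// Same shift, but narrowed back to 8 bits so the high bits "fall off".
constexpr std::uint8_t shift_as_byte(std::uint8_t value, int count) {
    return static_cast<std::uint8_t>(value << count);
}

static_assert(shift_promoted(0x1F, 2) == 0x7C, "no bits lost for a shift of 2");
static_assert(shift_promoted(0x1F, 6) == 0x7C0, "promoted result keeps the high bits");
static_assert(shift_as_byte(0x1F, 6) == 0xC0, "narrowing drops the high bits");
```

The cast back to an 8-bit type does the same job as masking with 0xFF or using LOBYTE.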
-
Thanks... and you clarified for me what hi-byte and lo-byte mean; I was wondering about that for some time!
-
Be aware that:
a) Bytes get expanded to int when you shift them around.
b) int is signed, and plain char is signed on many compilers (including Visual C++ by default), which makes a right shift "arithmetic", i.e. the highest-order bit remains set.
So a) use unsigned char, and b) cast the shift result back to unsigned char.
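Putting the advice in this thread together, here is one possible way to do the bit-by-bit search. This is only a sketch: the function name find_pattern_bitwise and its parameters are hypothetical, and it replaces the four-char rebuild from the original post with a 32-bit sliding window. It assumes the pattern is stored most significant bit first, which is how MPEG start codes such as 0x000001B3 appear in the stream.

```cpp
#include <cstddef>
#include <cstdint>

// Scans the buffer one bit at a time and returns the bit offset at which
// the 32-bit pattern starts, or -1 if it does not occur.
long find_pattern_bitwise(const unsigned char* data, std::size_t size,
                          std::uint32_t pattern) {
    std::uint32_t window = 0;  // the most recent 32 bits seen
    for (std::size_t bit = 0; bit < size * 8; ++bit) {
        // Pick out the next bit, most significant bit of each byte first.
        const unsigned char byte = data[bit / 8];
        const std::uint32_t next = (byte >> (7 - bit % 8)) & 1u;
        // Unsigned arithmetic: the oldest bit falls off the left edge.
        window = (window << 1) | next;
        // Only compare once a full 32 bits have been shifted in.
        if (bit >= 31 && window == pattern) {
            return static_cast<long>(bit - 31);
        }
    }
    return -1;
}
```

For the bytes E0 00 00 36 60 this reports bit offset 3, because 0x000001B3 starts three bits into the buffer. Everything is unsigned, so no sign extension or arithmetic right shift can creep in.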