Couple of questions concerning bits

    georgiek50 wrote (#1):

    I was under the impression (according to my programming book) that a left shift on a byte causes the bits on the left to "fall off" and zeros to be inserted on the right, e.g. 0x1F << 2 = 0xC0, but it equals 0x7C0. Why is this?

    My reason for asking is that I am trying to find a four-byte sequence within a file (MPEG-1) that doesn't seem to be byte aligned. I wrote a little function to do a "bit by bit" search, but since the above assumption is not true it fails. Here is my code (I feed it 64 bits at a time):

        int FindSequenceHeader(unsigned char *str)
        {
            // Does a bit by bit search for the sequence header
            char ch[4];
            DWORD dwValue;

            for (int x = 0; x < 4; x++)
            {
                for (int i = 0; i < 8; i++)
                {
                    ch[0] = ( (str[x + 0] << (0 + i)) | (str[x + 1] >> (8 - i)) );
                    ch[1] = ( (str[x + 1] << (0 + i)) | (str[x + 2] >> (8 - i)) );
                    ch[2] = ( (str[x + 2] << (0 + i)) | (str[x + 3] >> (8 - i)) );
                    ch[3] = ( (str[x + 3] << (0 + i)) | (str[x + 4] >> (8 - i)) );

                    dwValue = ( ch[0] | (ch[1] << 8) | (ch[2] << 16) | (ch[3] << 24) );

                    switch (dwValue)
                    {
                    case 0x000001b3:
                        cout << "Video Sequence Header Found" << endl;
                        return 0;
                        break;
                    default:
                        break;
                    }
                }
            }

            // Found nothing...
            return 1;
        }

    This leads me to ask why this standard routine, which I've seen in many code samples, works:

        DWORD MakeDword(unsigned char *str)
        {
            return ( str[0] | (str[1] << 8) | (str[2] << 16) | (str[3] << 24) );
        }

    Any thoughts?

      adamUK wrote (#2):

      georgiek50 wrote: eg. 0x1F << 2 = 0xC0 but it equals 0x7C0

      georgiek50 wrote: Why is this?

      Because it is the right answer.

      0x1F = 0001 1111

      shift it left a coupla times and put zeros in on the right:

      0111 1100 = 0x7C

      0000 = 0   0100 = 4   1000 = 8   1100 = C
      0001 = 1   0101 = 5   1001 = 9   1101 = D
      0010 = 2   0110 = 6   1010 = A   1110 = E
      0011 = 3   0111 = 7   1011 = B   1111 = F

      Cheers!! Adam.

        Toni78 wrote (#3):

        georgiek50 wrote: I was under the impression (according to my programming book) that a left shift on a byte caused the bits on the left to "fall off" and zeros inserted to the right: eg. 0x1F << 2 = 0xC0 but it equals 0x7C0

        That's not the right answer: 0x1F << 2 = 0x7C. But in any case, to ensure that you keep only the lowest 8 bits, you can AND the shifted value with 0xFF. For example:

        unsigned char ch[1] = { 0x1F };
        DWORD dwVar = ch[0];
        dwVar = (dwVar << 2) & 0xFF; // dwVar = 0x7C
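The masking idea above can be wrapped in a small function. This is only a sketch: `shift_left_8bit` is a hypothetical name, not something from the thread, and `std::uint8_t` stands in for an 8-bit byte.

```cpp
#include <cstdint>

// Hypothetical helper (not from the thread): shift an 8-bit value left and
// keep only the low 8 bits, so the high bits really do "fall off".
// 'count' is assumed to be smaller than the bit width of unsigned int.
std::uint8_t shift_left_8bit(std::uint8_t value, unsigned count)
{
    // The value is widened before the shift, so bits moved past bit 7
    // are still present in the wider intermediate result.
    const unsigned wide = static_cast<unsigned>(value) << count;

    // Masking with 0xFF discards everything above bit 7.
    return static_cast<std::uint8_t>(wide & 0xFFu);
}
```

With this helper, `shift_left_8bit(0x1F, 2)` gives 0x7C and `shift_left_8bit(0x1F, 6)` gives 0xC0, which is the result the book's description of an 8-bit shift would lead you to expect.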

        georgiek50 wrote: dwValue = ( ch[0] | (ch[1] << 8) | (ch[2] << 16) | (ch[3] << 24) );

        The compiler will load each character in ch into a register before shifting it, because that is where the shift is carried out. For example, if you saw the above code written in assembly language, it would look something like this:

        mov eax, ch[0]; // ch[0] = byte ptr address in memory
        mov edx, ch[1]; // same thing for ch[1]
        shl edx, 8; // ch[1] << 8
        or eax, edx; // ch[0] | (ch[1] << 8)
        mov edx, ch[2]; // edx = ch[2]
        shl edx, 16; // ch[2] << 16
        or eax, edx; // ch[0] | (ch[1] << 8) | (ch[2] << 16)
        ; and so on....

        The final result in eax will be moved to the dword ptr memory location that corresponds to dwValue. The registers used here are 4 bytes long (32 bits) and DWORD is a 32-bit unsigned integer type, so a byte can be shifted left by up to 24 bits without any of its bits being lost. That's why this works.
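To make the widening visible in the source rather than leaving it to the compiler, the MakeDword routine from the question could be written roughly as follows. This is a sketch under assumptions: `make_dword` is a hypothetical name and `std::uint32_t` stands in for the Windows `DWORD` type.

```cpp
#include <cstdint>

// Sketch of the MakeDword routine from the question with the widening made
// explicit. std::uint32_t is used here in place of DWORD (an assumption).
std::uint32_t make_dword(const unsigned char *bytes)
{
    // Each byte is widened to 32 unsigned bits *before* it is shifted, so
    // no bits are lost and nothing lands in the sign bit of a plain int.
    return  static_cast<std::uint32_t>(bytes[0])
         | (static_cast<std::uint32_t>(bytes[1]) << 8)
         | (static_cast<std::uint32_t>(bytes[2]) << 16)
         | (static_cast<std::uint32_t>(bytes[3]) << 24);
}
```

Note that bytes[0] ends up in the lowest 8 bits of the result. A buffer holding B3 01 00 00 therefore yields 0x000001B3, while a buffer holding 00 00 01 B3 yields 0xB3010000.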

          georgiek50 wrote (#4):

          Sorry, what I meant to write was 0x1F << 6 (instead of 2) = 0x7C0, where I was expecting just 0xC0 according to this logic:

          0x1F      = 00011111
          0x1F << 6 = 11000000 or 0xC0

          I think my understanding of shifting is very off!

            Harsha Gopal wrote (#5):

            The compiler does not treat 0x1F as an 8-bit number; it works with an int, which is at least 16 bits wide. Shown as 16 bits:

            0x1F      = 0000000000011111
            0x1F << 6 = 0000011111000000 = 0x7C0

            So for an 8-bit operation, maybe you can write:

            unsigned long a = 0x1F;
            a = LOBYTE(a << 6);

            It gives the desired result (0xC0). You can fit it into your program...

            Harsha
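The widening described above can be checked at compile time. The following is a minimal sketch, assuming a platform where char is 8 bits and int is wider than char; `shift_as_int` and `shift_as_byte` are hypothetical names, and a plain cast is used instead of the Windows LOBYTE macro so that the example stays self-contained.

```cpp
#include <type_traits>

// The left operand of << is promoted to int, so the result type is int
// even though the operand is an unsigned char.
static_assert(
    std::is_same<decltype(static_cast<unsigned char>(0x1F) << 6), int>::value,
    "shifting an unsigned char yields an int");

// Hypothetical helper: returns the shift result at full int width.
int shift_as_int(unsigned char value, int count)
{
    return value << count;
}

// Hypothetical helper: returns only the low byte of the shift result,
// which is what an 8-bit shift would have produced.
unsigned char shift_as_byte(unsigned char value, int count)
{
    return static_cast<unsigned char>(value << count);
}
```

Here `shift_as_int(0x1F, 6)` is 0x7C0, the value georgiek50 observed, and `shift_as_byte(0x1F, 6)` is 0xC0, the value that was expected.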

              georgiek50 wrote (#6):

              Thanks... and you clarified for me what hi-byte and lo-byte mean. I was wondering about that for some time!

                peterchen wrote (#7):

                Be aware that:
                a) Bytes get int-expanded (promoted to int) when shifting things around.
                b) char (by default in Visual C++), int etc. are signed, which makes a right shift "arithmetic", i.e. the highest-order bit remains set.

                So a) use unsigned char, and b) cast the shift result back to unsigned char.
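Applied to the search function from the question, this advice might look like the sketch below. It is not the original poster's final code. The names `byte_at` and `find_sequence_header` are hypothetical, and the sketch assumes that bits are read most significant bit first and that the header appears in the stream as the byte sequence 00 00 01 B3, so the four bytes are combined in stream order rather than low byte first.

```cpp
#include <cstddef>
#include <cstdint>

// Reads the 8 bits that start 'shift' bits (0..7) into data[index].
// The caller must guarantee that data[index + 1] is readable.
unsigned char byte_at(const unsigned char *data, std::size_t index,
                      unsigned shift)
{
    // Both operands are promoted to int; the cast back to unsigned char
    // drops the bits that were shifted above bit 7.
    return static_cast<unsigned char>((data[index] << shift) |
                                      (data[index + 1] >> (8 - shift)));
}

// Returns the bit offset of the first 0x000001B3 pattern, or -1 if none.
// Each candidate position reads five bytes, so a header that sits in the
// last four bytes of the buffer is not detected by this sketch.
long find_sequence_header(const unsigned char *data, std::size_t size)
{
    const std::uint32_t kSequenceHeader = 0x000001B3u;

    for (std::size_t index = 0; index + 4 < size; ++index) {
        for (unsigned shift = 0; shift < 8; ++shift) {
            // Combine four realigned bytes, first byte most significant.
            std::uint32_t value = 0;
            for (std::size_t k = 0; k < 4; ++k) {
                value = (value << 8) | byte_at(data, index + k, shift);
            }
            if (value == kSequenceHeader) {
                return static_cast<long>(index * 8 + shift);
            }
        }
    }
    return -1;
}
```

For an 8-byte buffer starting with 00 00 01 B3 the function returns 0, and for one starting with E0 00 00 36 60 (the same pattern delayed by three bits) it returns 3. Using unsigned char throughout means the right shifts never drag a sign bit into the result.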
