Boolean 4 bytes...?

  • Todd C Wilson

    No, it would be a waste of both time and space, because the processor will not be able to do something like (pseudo-code):

    loadregister address

    but will need to do:

    loadregister #0x01
    barrelshift left bitposition
    andmask address
    barrelshift right bitposition

    You're trading off 3 extra bytes of data memory (a one-time cost in memory) for a minimum 4x increase in code for each access of that bit. Now, if your overriding concern is memory storage (such as a huge table of bit data), then it is your job, as the developer, to tell the compiler to pack them tighter, probably by using the aforementioned unsigned int xyz:1 statement. But you're trading off speed and code size to reduce data memory consumption.


    Visual Studio Favorites - www.nopcode.com/visualfav
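
A minimal C sketch of the trade-off described above, assuming a typical compiler where int is 4 bytes. The type and function names (wide_flags, packed_flags, bytes_saved_by_packing, packed_is_dirty) are hypothetical and chosen only for illustration; bit-field layout and the instructions generated for each access are implementation-defined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example types, not taken from the thread. */

/* One full int per flag: each access is a plain load or store. */
struct wide_flags {
    int visible;
    int enabled;
    int dirty;
    int locked;
};

/* One bit per flag, using the "unsigned int xyz:1" form.
   The layout is implementation-defined; the compiler emits whatever
   shift/mask code each access needs. */
struct packed_flags {
    unsigned int visible : 1;
    unsigned int enabled : 1;
    unsigned int dirty   : 1;
    unsigned int locked  : 1;
};

/* Bytes of data memory saved per record by packing the flags. */
size_t bytes_saved_by_packing(void)
{
    return sizeof(struct wide_flags) - sizeof(struct packed_flags);
}

/* Reading a packed flag looks the same in C as reading a wide one;
   the extra work is hidden in the generated code. */
int packed_is_dirty(const struct packed_flags *f)
{
    assert(f != NULL);
    return f->dirty;
}
```

On common 32-bit and 64-bit compilers the packed struct occupies one int, so the saving is data memory only; whether the extra access code is worth it depends on how many records exist and how often the flags are touched.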

    alex barylski wrote (#21):

    That's the response I was looking for. See, initially I wasn't even thinking about the clock cycles expended for shl and shr, never mind the size of the instruction(s). All I was thinking was that 4 bytes on a 32-bit processor fit like a hand in a glove, but those poor 31 remaining bits must get kind of lonely. After double-checking with the Intel docs it occurred to me: yes, it's a waste of space, but it would be a bigger waste with the added shl and shr instructions.

    Hell, I almost went to the lengths of exemplifying my own mistake. I was going to find the difference in clocks and code size and show you how right you were, but decided that showing someone how right they were because of how wrong I was isn't acceptable. Congrats... I'm never wrong. Not that I'm a genius, I just don't speak unless I know what I'm talking about, or at least not with as much authority. Not that I was trying to be authoritative; I just didn't find your comment about asking in the right forum appropriate. I've asked numerous conceptual, theory questions there. That's what I was looking for, not technical answers; I figure that's what books are for, but in this case it proved beneficial.

    Ah well, 'tis what I get for speaking before thinking. Ummm... I'm super tired. Damn computer keeps me awake for days. I slept yesterday though. Poor excuse for bad judgement, I agree, but it's an excuse. ;P

    shl eax, 2

    is only 2 cycles and 3 bytes, so I wasn't "that" wrong... but I was wrong. Cheers!

    "An expert is someone who has made all the mistakes in his or her field" - Niels Bohr

    • MarkyMark

      This is a C++ BOOL you're talking about, is it? A BOOL (int) is 4 bytes, but a bool is only one.

      alex barylski wrote (#22):

      In Visual C++ 5.0 and later, yup. Actually, the question stemmed from my old Turbo Pascal program, which also uses 4 bytes for boolean.
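
For reference, a small C sketch of the size difference discussed in this exchange. BOOL32 is a hypothetical stand-in for the Windows BOOL typedef (a typedef for int), used so the example builds without Windows headers; bool comes from C99's <stdbool.h>. The size of bool is implementation-defined, although it is 1 byte on common compilers, so the checks below only assume it is no larger than int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int BOOL32;  /* hypothetical stand-in for "typedef int BOOL;" */

size_t size_of_wide_bool(void)   { return sizeof(BOOL32); }
size_t size_of_native_bool(void) { return sizeof(bool); }

/* Any non-zero BOOL32 counts as true, so normalize it rather than
   comparing against 1. */
bool to_native_bool(BOOL32 value)
{
    return value != 0;
}

/* Sanity checks for the assumptions stated above. */
void check_bool_sizes(void)
{
    assert(sizeof(BOOL32) == sizeof(int));
    assert(sizeof(bool) <= sizeof(BOOL32));
}
```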

        alex barylski wrote (#23):

        However... it would be quicker to pull 32 bits into, say, register eax and then perform an AND/OR with a 32-bit mask, wouldn't it? Something like this:

        boolean = 123 //1111011, an array of bools
        mov eax, [boolean]
        and eax, 0x00000001 //Mask off unwanted bits, keeping the first bit

        The above has the same effect as

        mov eax, [boolean] //But this time it holds only one bit

        Assuming the mov eax, [boolean] consumes 6 bytes and 8 cycles or so, and the and eax, 0x00000001 uses 6 bytes and 6 cycles, there's no performance gain. But what if there's more than one bool?

        //Compiler way
        mov eax, [boolean1] //6 bytes, 8 clocks
        mov ebx, [boolean2] //6 bytes, 8 clocks
        mov edx, [boolean3] //6 bytes, 8 clocks
        mov ecx, [boolean4] //6 bytes, 8 clocks
        //24 bytes, 32 clocks + 16 bytes for four booleans

        //Optimized way
        mov eax, [boolean] //6 bytes, 8 clocks + 4 bytes for one boolean
        and eax, 0x00000001 //6 bytes, 6 clocks
        and ebx, 0x00000002 //6 bytes, 6 clocks
        and ecx, 0x00000004 //6 bytes, 6 clocks
        and edx, 0x00000008 //6 bytes, 6 clocks
        //30 bytes, 32 clocks, 4 bytes for boolean

        What am I missing...? Assuming the above data is somewhat accurate, the bit-masking method saves space, and if the and instruction uses fewer clocks than the mov, like I suspect it might (maybe), then it would also execute in less time, so there is both a time and a space saving.
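
The masking idea in this post can be sketched in C rather than assembly. This is only an illustration: flag_is_set and count_set_flags are hypothetical helpers, and the instructions and cycle counts a compiler produces from them depend on the compiler and CPU. The single-bit masks for bits 0 to 3 are 0x1, 0x2, 0x4 and 0x8; a mask of 0x3 would test two bits at once.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if bit number `bit` (0 = least significant) is set in `word`. */
int flag_is_set(uint32_t word, unsigned bit)
{
    assert(bit < 32);
    return (word & (UINT32_C(1) << bit)) != 0;
}

/* Take the packed word once, then test the low `count` flags against it. */
unsigned count_set_flags(uint32_t word, unsigned count)
{
    unsigned set = 0;
    assert(count <= 32);
    for (unsigned bit = 0; bit < count; ++bit) {
        set += (unsigned)flag_is_set(word, bit);
    }
    return set;
}
```

With the value from the post, 123 (binary 1111011), bits 0, 1 and 3 are set and bit 2 is clear, so count_set_flags(123, 4) gives 3.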

          Todd C Wilson wrote (#24):

          [chuckle] You're now trying to make your data fit your theory! [/chuckle]

          You're basically doing a hand-optimization of the code; the compiler will never produce code of this kind. Compiler optimization is pretty much a black art, and is very, very dependent upon the underlying CPU design. This is why certain Intel optimizations work poorly on AMD (or even why Pentium Pro optimizations gag on Pentium IIs). This is why my example used pseudo-code and not the real thing. This is also why top-notch first-person shooters use hand-tuned assembly rather than relying on the compiler alone. Again, some processors, and even some processor versions, do have specialized "bit flag" or "bit register" operations, so you don't even have to load squat: "test addr,bit" will set the flags.

          You're also making the assumption that your and eax, 0x04 result is treated the same way when checking for true and false. Does the compiler generate code like "if (reg&0x04)==1" or does it do "if (reg&0x04)!=0"? The former will always fail; the latter is correct. If the compiler generates the former, then you need to right-shift by the correct amount to make the result a strict 0 or 1. Again, your hand-tuned, hand-optimized x86 code can do an and 0x04 followed by a branch-not-equals, since the and updates the Z flag, but what about other processors?

          Now, one point you left out, and that I didn't mention before since I thought it would muddy the issue, is that you're only reading the values now, which is easy, as you point out: get the value, mask it, check for non-zero. However, storing is much more difficult: load, mask with the complement, OR in the correct bit mask value for the item, store back. There are ways to do this that borrow from graphics design (bit masking and blitting) and solve the temporary value problem, but they involve two writes to the same location. If you're doing a hardware interface to a location, this may cause a double-latch, which may or may not be a good thing.

          Another problem is that it is very rare that you're going to be using all the bit flags in a field at the same time. More likely you'll be checking for a condition and then doing a block of work, which will trash your registers.

          Basically, if you're worried about bool taking up a complete word (or if data heap space is an issue), then help the compiler along using #pragma pack or your development system's equivalent.
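
Two of the points in this post lend themselves to a short C sketch: the comparison pitfall with a masked value, and the load / mask with complement / OR / store sequence for writes. The names (FLAG_BIT2, naive_test, correct_test, normalized_test, store_flag) are hypothetical, and this is an illustration under those assumptions rather than what any particular compiler emits.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_BIT2 UINT32_C(0x04)  /* hypothetical flag living in bit 2 */

/* Wrong: the masked value is 0 or 4, never 1, so this is always 0. */
int naive_test(uint32_t reg)      { return (reg & FLAG_BIT2) == 1; }

/* Right: any non-zero masked value means the flag is set. */
int correct_test(uint32_t reg)    { return (reg & FLAG_BIT2) != 0; }

/* Alternative: shift the bit down so the result is a strict 0 or 1. */
int normalized_test(uint32_t reg) { return (int)((reg >> 2) & 1u); }

/* Store: load, clear the bit with the complemented mask, OR in the
   new value, store back. This is not atomic; a concurrent writer or
   a hardware register may need extra care. */
void store_flag(uint32_t *word, uint32_t mask, int value)
{
    uint32_t tmp;
    assert(word != 0);
    tmp = *word;        /* load */
    tmp &= ~mask;       /* mask with complement */
    if (value) {
        tmp |= mask;    /* OR in the bit */
    }
    *word = tmp;        /* store back */
}
```

The read path is a single mask and test, while the write path needs the full read-modify-write, which is the asymmetry the post points out.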


            alex barylski wrote (#25):

            No worries... I'm running an Intel at 266 MHz. Not greased lightning, but fast enough that I don't have to worry about 31 bits of wasted space. I was curious not so much about the compiler optimizing the code, but about whether it could be done by hand using inline _asm. I program only for myself at the moment, so portability isn't an issue. Intel's test instruction works on my 'puter, and that's all I'm worried about. I can't see myself ever actually implementing something like that, except maybe for bool values in the registry.

            An assembly pro I'm not. I simply know enough to optimize and to figure out things that require low-level knowledge. The amount of time it takes me to figure out a section of code (and recheck and recheck) in assembly would never allow me to finish the projects I start. Really, I just wanted to see if my approach was legit or if I was misunderstanding something. It had nothing to do with actual optimizations. I always let the compiler do inlining, loop unrolling, common sub-expression elimination, you name it... with the occasional pragma. Thanx for the time!! :)
