Boolean 4 bytes...?

Brigg Thorp

This is the kind of crap that makes applications enormous. How many values could have fit into a single-bit boolean, only to take up 32 bits instead? Imagine if I had an array of 100 booleans. That is 3100 bits (about 388 bytes) of wasted space. The same thing happens with hard drive cluster sizes. Whose stupid idea was this anyway? No wonder all new computers are coming with 100+ GB drives. Jeesh. End of rant... Brigg Thorp Software Engineer Timex Corporation
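
To put rough numbers on that complaint, here is a minimal C sketch comparing 100 flags stored one per 32-bit word with the same flags packed one per bit. The names (FLAG_COUNT, bytes_unpacked and so on) are made up for illustration, and the sketch assumes a 32-bit storage word as in the post.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical figures for illustration: 100 flags, 32-bit storage words. */
enum { FLAG_COUNT = 100, WORD_BITS = 32 };

/* One flag per 32-bit word, as when a boolean occupies 4 bytes. */
size_t bytes_unpacked(void) {
    return FLAG_COUNT * sizeof(uint32_t);
}

/* One flag per bit, rounded up to whole 32-bit words. */
size_t bytes_packed(void) {
    size_t words = (FLAG_COUNT + WORD_BITS - 1) / WORD_BITS;
    return words * sizeof(uint32_t);
}

/* Bits that carry no information when each flag gets a whole word. */
size_t unused_bits_unpacked(void) {
    return (size_t)FLAG_COUNT * (WORD_BITS - 1);
}
```

Under these assumptions the unpacked form takes 400 bytes and the packed form 16, and the 3100 unused bits come to roughly 388 bytes.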

Ray Hayes

HockeyDude wrote: So what's a guy with 8 PhDs called...Doctor's doctor...? Maybe! When I lived in Germany, I rented a flat from a local university professor; his contact name/address started Herrn Professor Doktor Doktor Stroubler (maybe a couple of spelling mistakes there though, it is FRIDAY afternoon and I went to the pub at lunch :) !) Regards, Ray

dandy72

A distant relative of mine had at least 6 a few years ago. He had a few more going simultaneously, so he's probably over 8 by now. Mind you, he was the prime example of "a genius is someone who can do anything except earn a living".

dandy72

Imagine how quickly your hard drive would fragment if each file was allocated space on a per-byte basis...

alex barylski

:-D "An expert is someone who has made all the mistakes in his or her field" - Niels Bohr

alex barylski

Whatever the case...someone with 8 PhDs would be something other than human. Cheers

Kastellanos Nikos

I can send you an old 64MB SIMM for free. That's 16.777.216 booleans. ;P - - - - - - - - - - - - - - - - - - Memory leaks is the price we pay \0 01234567890123456789012345678901234

alex barylski

That's the response I was looking for. See, initially I wasn't even thinking about the clock cycles expended for shl and shr, never mind the size of the instruction(s). All I was thinking was that 4 bytes on a 32-bit processor fits like hand in glove, but those poor 31 remaining bits must get some lonely. Yes, after double-checking with the Intel docs it occurred to me...it's a waste of space, yup, but it'd be more with the added shl and shr instructions. Hell, I almost went to the lengths of exemplifying my own mistake...I was gonna find the difference in clocks and code size and show you how right you were, but decided that showing someone how right they were 'cause of how wrong I was isn't acceptable. Congrats...I'm never wrong...not that I'm a genius, I just don't speak unless I know what I'm talking about, or perhaps with as much authority. Not that I was trying to be authoritative, I just didn't quite find your comment about asking in the right forum appropriate. I've asked numerous conceptual, theory questions there. That's what I was looking for, not technical...I figure that's what books are for, but in this case it proved beneficial. Ah well, 'tis what I get for speaking before thinking. Ummm...I'm super tired...damn computer keeps me awake for days...I slept yesterday though...poor excuse for bad judgement, I agree, but it's an excuse. ;P

shl eax, 2

Is only 2 cycles and 3 bytes, so I wasn't "that" wrong...but I was wrong... Cheers!

alex barylski

In Visual C++ 5.0 and later, yup. Actually, the question stemmed from my old Turbo Pascal program, which also uses 4 bytes for boolean.

alex barylski

However...it would be quicker to pull 32 bits into, say, register eax and then perform an AND/OR with your 32-bit mask, wouldn't it...? Something like this:

boolean = 123 //1111011, array of bools
mov eax, [boolean]
and eax, 0x00000001 //Mask off unwanted bits, keep the first bit

The above has the same effect as

mov eax, [boolean] //But this time it holds only one bit

Assuming the mov eax, [boolean] consumes 6 bytes and 8 cycles or so and the and eax, 0x00000001 uses 6 bytes and 6 cycles, there's no performance gain, but what if there's more than one bool?

//Compiler way
mov eax, [boolean1] //6 bytes, 8 clocks
mov ebx, [boolean2] //6 bytes, 8 clocks
mov edx, [boolean3] //6 bytes, 8 clocks
mov ecx, [boolean4] //6 bytes, 8 clocks
//24 bytes, 32 clocks + 16 bytes for four booleans

//Optimized way
mov eax, [boolean] //6 bytes, 8 clocks + 4 bytes for one boolean
//ebx, ecx and edx would each need a copy of eax first (mov ebx, eax etc.); not counted below
and eax, 0x00000001 //6 bytes, 6 clocks, bit 0
and ebx, 0x00000002 //6 bytes, 6 clocks, bit 1
and ecx, 0x00000004 //6 bytes, 6 clocks, bit 2
and edx, 0x00000008 //6 bytes, 6 clocks, bit 3
//30 bytes, 32 clocks, 4 bytes for boolean

What am I missing...? Assuming the above data is somewhat accurate, the bit-masking method saves space, and if and uses fewer clocks than mov, like I suspect it might (maybe), then it would also execute in less time, so there is a time and a space saving.
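
For comparison, the same read-side idea can be sketched in C rather than assembly. This is only an illustration under the post's assumptions: four booleans packed into the low bits of one 32-bit word. The mask names FLAG_A through FLAG_D and the helper flag_is_set are hypothetical. Each flag needs its own bit, so the masks run 1, 2, 4, 8.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names; each mask selects exactly one bit. */
#define FLAG_A 0x00000001u  /* bit 0 */
#define FLAG_B 0x00000002u  /* bit 1 */
#define FLAG_C 0x00000004u  /* bit 2 */
#define FLAG_D 0x00000008u  /* bit 3 */

/* Mask out the flag of interest from the packed word.
   Comparing against zero yields a strict true/false. */
bool flag_is_set(uint32_t packed, uint32_t mask) {
    return (packed & mask) != 0;
}
```

With the value 123 (binary 1111011) from the post, flags A, B and D read as set and flag C as clear. Whether this beats four separate loads depends on the compiler and the CPU, so treat the clock counts as something to measure rather than assume.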

Todd C Wilson

[chuckle] You're now trying to make your data fit your theory! [/chuckle] You're basically doing a hand-optimization of the code; the compiler will never produce code of this kind. Compiler optimization is pretty much a black art, and is very, very dependent upon the underlying CPU design. This is why certain Intel optimizations work poorly on AMD (or even Pentium Pro opts gag on Pentium IIs). This is why my example used pseudo-code and not the real thing. This is also why top-notch first-person shooters use hand-tuned assembly and don't just rely upon the compiler. Again, some processors and even processor versions do have specialized "bit flag" or "bit register" operations, so you don't even have to load squat - "test addr,bit" will set the flags.
You're also making the assumption that your and eax, 0x04 result is treated the same way when checking for true and false - does the compiler generate code like "if (reg&0x04)==1" or does it do "if (reg&0x04)!=0"? The former will always fail, since the masked result can only be 0 or 4; the latter is correct. If it is the former, then you need to right-shift by the correct amount to make the result a strict 0 or 1. Again, your hand-tuned, hand-optimized x86 code can do an and 0x04 and then a branch-not-equals, since the and updates the Z flag, but what about other processors?
Now, one point you left out, and that I didn't mention before since I thought it would muddy the issue, is that you're only reading the values now, which is easy as you point out - get the value, mask it, check for non-zero. However, storing is much more difficult - load, mask with the complement, OR in the correct bit mask value for the item, store back. There are ways to do this that borrow from graphics design (bit masking and blitting) that solve the temporary value problem but involve two writes to the same location. If you're doing a hardware interface to a location, this may cause a double-latch, which may or may not be a good/bad thing.
Another problem is that it is very rare that you're going to be using all the bit flags in a field at the same time - more likely you'll be checking for a condition and then doing a block of work, which will trash your registers. Basically, if you're worried about bool taking up a complete word (or if data heap space is an issue), then help the compiler along using #pragma pack or your development system's equivalent.

Visual Studio Favorites - www.nopcode.com/visualfa
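
The read/store asymmetry described in the post above can be sketched in C. This is a minimal illustration, not what any particular compiler emits, and the helper names (set_flag, clear_flag, flag_value, raw_mask_equals_one) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Set one bit: load, OR in the mask, store back. */
void set_flag(uint32_t *word, unsigned bit) {
    assert(bit < 32);
    *word |= (uint32_t)1 << bit;
}

/* Clear one bit: load, AND with the complement of the mask, store back. */
void clear_flag(uint32_t *word, unsigned bit) {
    assert(bit < 32);
    *word &= ~((uint32_t)1 << bit);
}

/* Read one bit as a strict 0 or 1 by shifting it down before masking. */
unsigned flag_value(uint32_t word, unsigned bit) {
    assert(bit < 32);
    return (unsigned)((word >> bit) & 1u);
}

/* The pitfall from the post: a raw mask result is 0 or the mask value,
   so comparing it with 1 only works for bit 0. */
int raw_mask_equals_one(uint32_t word, uint32_t mask) {
    return (word & mask) == 1;
}
```

Both set_flag and clear_flag are read-modify-write operations on the whole word, which is the extra cost on the store side. For a word with only bit 2 set, flag_value returns 1 while raw_mask_equals_one with mask 0x04 returns 0.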

alex barylski

No worries...I'm running an Intel at 266 MHz. Not greased lightning, but fast enough to not have to worry about 31 bits of wasted space. I was curious not so much about the compiler optimizing the code, but whether it could be done by hand using inline _asm. I program for me only at the moment, so portability isn't an issue. Intel's test instruction works on my 'puter...that's all I'm worried about. I can't see myself ever actually implementing something like this, except maybe for bool values in the registry. Assembly pro I'm not...I simply know enough to optimize and figure out things that require low-level knowledge. The amount of time it takes me to figure out a section of code (and recheck and recheck) in assembly would never allow me to finish the projects I start. Really, I just wanted to see if my approach was legit or if I was misunderstanding something. It had nothing to do with actual optimizations. I always let the compiler do inlining, loop unrolling, common sub-expression elimination, you name it...with the occasional pragma. Thanx for the time!! :)