Counting set bits

PaleyX

What would be the fastest way to count the number of set bits in a byte? I currently use a static array which has the number of bits set for each possible combination of set bits then use the byte value to index into that array. Are there faster ways to do it?

Philip Fitzsimons

your technique is generally the fastest... have a look at http://forums.devshed.com/t70218/s.html?highlight=lookup+values[^] for a discussion....

"When the only tool you have is a hammer, a sore thumb you will have."

PaleyX

Is it? Drat! It's where my code spends most of its time - although it is doing it billions of times so I don't suppose I can complain. :rolleyes: Thanks for the reply.

Philip Fitzsimons

I have a general rule when it comes to optimization: - the fastest way to do something is not to do it i.e. does you code really need to count the bits? is there a way to avoid it? what are you counting bits for btw?

"When the only tool you have is a hammer, a sore thumb you will have."

PaleyX

It's a database application which stores the presence or non presence of variable values in bit vectors stored in byte arrays. Counts and analyses can then be run very quickly - for example: All people who live in England, drive a ford, eat out at restaurants once a week, read the Times and don't have a pet - on a database of 100 million individuals takes half a second. Counting the bits gives the total number matching the criteria after the logic processing has done its thang. TBH, it's not too much of a worry that the bit counting is taking a lot of the time as I am happy with half a second on 100 million individuals - and I doubt I will ever put 100 million individuals into a real database, I was just stress testing then seeing where it was spending its time.

Philip Fitzsimons

lol strange coincidence... :laugh: i used to design direct marketing systems for exactly this kind of thing (CMT/NDL/Claritas) check you IO and Page faulting...

"When the only tool you have is a hammer, a sore thumb you will have."

PaleyX

Really? I used to write the software at Dudley Jenkins for their direct marketing "lifestyle" database. I have set up on my own now. I have got the IO and Page Faulting sorted, I believe. There is certainly very little page faulting when analysing the 100 million and the IO seems to have very little impact. It's been quite a ride getting this thing to work this fast in C# after writing similar software in C++.

Sebastian Schneider

Speed is always difficult do maintain with so many different layers inserted between processor and application. C# feels good, though. Most of the times ;) Cheers Sid

PaleyX

True, I rather like C# now, I wqasn't to keen at first and I still have to do stuff I would rather not just to get the performance but then that's often the case in most languages - at least C# remains reasonably readable.