Need help with reversing an algorithm

Anonymous

I have the following two functions: unsigned PullTerminalId(unsigned long id) { return ((unsigned)(id >> 20)); } unsigned long get_human_check_id(unsigned long id) { unsigned long terminalId = (unsigned long)PullTerminalId(id) * 10000; unsigned long checkNbr = (unsigned)(id & 0xFFFF); return terminalId + checkNbr; } As an example, if I call get_human_check_id with a value of 1048577, then the result is 10001 Could someone show me how to reverse this process?

Chris Losinger

i'm not sure you can. one major problem is that the algorithm doesn't preserve all of the information in the original id. it looks like at least 4 bits of the original data are thrown away completely: in effect, the ">> 20" throws away 20 bits, but the "& 0xffff" recovers 16 of them - the final number is still missing the information that was contained in bits 16..19. that's before you even look at the arithmetic... Cleek | Image Toolkits | Thumbnail maker

Priyank Bolia

Don't think so that it is possible! B'caz you are using and operation on the number. Actually you think as you have 2 numbers of 2 byte each and you are converting them to one number. But we can do the reverse, i.e. for x+y = 5, we can't say that x & y will be what? http://www.priyank.in/

Kuniva

You can't reverse the process, cause here:

unsigned PullTerminalId(unsigned long id)
{
return ((unsigned)(id >> 20));
}

You potentially loose information, u can't just automagically make it reappear. id >> 20 makes the 20 rightmost bits of the unsigned long integer go down a black hole. Kuniva --------------------------------------------

Anonymous

I agree with you both - this is Tom Archer by the way; CP keeps logging me out since I changed my email addr However, I wanted to verify that with others before telling the client that.

Anonymous

[Tom Archer] I think I got it. First, the 4 missing bits are not used in the system, which is why he's not accounting for them. Now that we know that the following function should work (I think) unsigned long get_human_check_id_rev(unsigned long id) { unsigned long rev1 = id / 10000; rev1 = rev1 << 20; unsigned long rev2 = (unsigned)(id & 0x01); return rev1 + rev2; }

Chris Losinger

i'm not sure that's gonna do it either. in the encoding, the information in the high 12 bits is combined with the low 16 bits through addition. seperating them is equivalent to finding which two numbers were added to produce "1234". just for a test: UINT in = 0xffffffff; TRACE("%in: s\n", BinDump((BYTE *)&in, 4)); UINT hid = get_human_check_id(in); UINT out = get_human_check_id_rev(hid); TRACE("out: %s\n", BinDump((BYTE *)&out, 4)); outputs: in: 11111111 11111111 11111111 11111111 out: 00000001 00000000 01010000 00000000 Cleek | Image Toolkits | Thumbnail maker

Tom Archer

Except that we know *how* they were added. IOW, the get_human function basically isolates everything above 20 bits, multiplies it by 1000 and states that its the terminalID. Ex: 1048577 = terminalId of 1000. It then takes bits 0-15 (hence the loss of 4 bits) and states that it's the checkNbr. Ex: 1048577 = checkNbr of 1 The two are added together. Ex: 1048577 == 1001 It seems like this should be reversible but I'm not seeing it. Cheers, Tom Archer - Archer Consulting Group
"Eat your brussel sprouts, Junior. There are starving Chinese children American programmers that would kill for that food!"

Tom Archer

I think I've got it now - proving that the get_human functoin can be reversed:

unsigned long get_human_check_id(unsigned long id)
{
unsigned long terminalId = (unsigned long)PullTerminalId(id) * 10000;
unsigned long checkNbr = (unsigned)(id & 0xFFFF);
return terminalId + checkNbr;
}

okay, lets break it down: the terminal id part is in bits 20 and up, then scaled up by 10000 the checknumber is in the lower 16 bits. So to reverse it you need to take : terminalid / 10000, then shift it left 20 bits plus the checknumber. My first attempt had roughly the right idea, except for that mask. Cheers, Tom Archer - Archer Consulting Group
"Eat your brussel sprouts, Junior. There are starving Chinese children American programmers that would kill for that food!"

Chris Losinger

So to reverse it you need to take : terminalid / 10000, then shift it left 20 bits plus the checknumber where do you get the checknumber ? Cleek | Image Toolkits | Thumbnail maker

Tom Archer

Check number is bits 0-15 (id & 0xFFFF) Therefore, * terminalId is bits 20 and above (multiplied by 1000) * checkNbr is bits 0-15 The only bits lost are bits 16-19, which are never used. Cheers, Tom Archer - Archer Consulting Group
"Eat your brussel sprouts, Junior. There are starving Chinese children American programmers that would kill for that food!"

Chris Losinger

Tom Archer wrote: terminalId is bits 20 and above (multiplied by 1000) but those bits are shift down 20 (which is equivalent to division by 1048576). they're now bits 0..12 of terminalId. * checkNbr is bits 0-15 and those two groups of bits are combined by the addition step. they don't occupy distinct bits in the result. Cleek | Image Toolkits | Thumbnail maker

Tom Archer

That's why the programmer multiplies terminalId by 1000; so that he can add the checkNbr without overlapping the terminalId digits. Cheers, Tom Archer - Archer Consulting Group
"Eat your brussel sprouts, Junior. There are starving Chinese children American programmers that would kill for that food!"

Anonymous

The original function is many-to-one, so there may be multiple possible answers when you invert it. For example, the original ids of 43920, 2000000, 2121072, 3159648, and 4198224 all yield a human_check_id of 43920 (with terminalIds of 0, 10000, 20000, 30000, and 40000, respectively).

Chris Losinger

1000 isn't enough to push those 12 bits 16 places. you'd need to multiply by at least 65336 to move something 16 bits. Cleek | Image Toolkits | Thumbnail maker

Alexander M

You can't. It's a one-way algorithm because data gets lost... First there is a shift right that will "kill" 20 bits, and then the upper 16 bits get lost by the and operation. Additinally there is a addition, which makes it almost impossible to find out what the original summands had been. Don't try it, just do it! ;-)