Math with string, but why?
-
I often see questions of the form: "I have strings such as '0011001010010' and I want to calculate the bitwise AND/OR/XOR between them, how do I do that?" One person even wanted to implement SHA1 this way. Where is this coming from*? Who is teaching people that this is a good idea? Why is this allowed to continue? Another variant of this anti-pattern works with decimal strings; often zeroes are appended to the string because "that's faster than multiplying by 10, right?".
* Some theories I have are that the people suffering from these ideas have a fundamental misunderstanding either of how math works on a computer or of types in general (possibly because debuggers show string representations of everything), or that their brains are too visually oriented (and therefore see integers merely as strings of glyphs).
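(For the record, a minimal C sketch of the sane route, with made-up input strings: parse once, do the math on integers, and format back to a string only for display.)

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const char *a = "0011001010010";   /* illustrative inputs */
    const char *b = "0101100110101";

    /* strtoul with base 2 parses a binary digit string */
    unsigned long x = strtoul(a, NULL, 2);
    unsigned long y = strtoul(b, NULL, 2);

    unsigned long r = x ^ y;           /* the actual work: one XOR */

    /* format back to a binary string only at the output boundary */
    char buf[sizeof r * 8 + 1];
    int n = sizeof r * 8;
    buf[n] = '\0';
    while (n--) {
        buf[n] = '0' + (r & 1);
        r >>= 1;
    }
    printf("%s\n", buf);
    return 0;
}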
This just seems to be a common pattern for 'noobs'. My guess is that they don't quite get datatypes, get lots of compiler errors and then automatically try to use strings for everything, including really awful string conversions.
At least artificial intelligence already is superior to natural stupidity
-
harold aptroot wrote:
Math with string, but why?
Because string theory[^] desperately needs math? ;P
Veni, vidi, vici.
-
Ok, I took a look at the section "The mathematics"... the stuff of nightmares. It desperately needs less math, I'd say.
-
That's all.
-
harold aptroot wrote:
Another variant of this anti-pattern works with decimal strings; often zeroes are appended to the string because "that's faster than multiplying by 10, right?".
I would expect appending to be faster myself.
-
harold aptroot wrote:
Another variant of this anti-pattern works with decimal strings; often zeroes are appended to the string because "that's faster than multiplying by 10, right?".
I would expect appending to be faster myself.
You haven't met .net yet, have you? :sigh:
-
Well, contrary to what other people will tell you, be a good person: teach them quickly and show them the way to understanding how it works. Something like:
----
Well, that number is a string representation of a number; you need to convert it to a real number with functions x, y, or z. After that, you can bitwise it to your liking and convert the number back to its string representation with function a, b, or c, to be able to display it on screen.
Here are a few internal/external links that can help you further:
http://link_to_good_explanation
http://link_to_another_explanation
----
IMO, this is what the forum should be. If you cannot or do not want to answer the question, just let it be and do something else that will make you happy. M.
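For what it's worth, here is one plausible filling-in of those placeholders in C. The names x, y, z and a, b, c above are the poster's deliberate stand-ins, and strtoul/snprintf are just one choice among several:

#include <stdio.h>
#include <stdlib.h>

/* "functions x, y, or z": decimal string to real number */
unsigned long from_string(const char *s)
{
    return strtoul(s, NULL, 10);
}

/* "function a, b or c": real number back to a decimal string */
void to_string(unsigned long v, char *buf, size_t n)
{
    snprintf(buf, n, "%lu", v);
}

int main(void)
{
    char buf[32];
    to_string(from_string("1234") * 10, buf, sizeof buf);
    printf("%s\n", buf);   /* 12340 */
    return 0;
}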
Watched code never compiles.
But the weather's OK, yeah? Can we bitch about the weather?
I wanna be a eunuchs developer! Pass me a bread knife!
-
That's all.
Best comment I've heard all day! :-D :-D
Full-fledged Java/.NET lover, full-fledged PHP hater. Full-fledged Google/Microsoft lover, full-fledged Apple hater. Full-fledged Skype lover, full-fledged YM hater.
-
I worked on a system where the DEA encryption program used strings of '0's and '1's instead of trying to work out the possibly more efficient bitwise operations. That code was initially written in DataBASIC on a Pick system. In Pick all numbers are just ASCII strings; they get converted implicitly when you do an arithmetic expression, but when assigned to a variable they are converted back to ASCII strings. It works fine, and it sure makes the files easy to edit, as all numbers are in their ASCII representation.
The Pick system was much too slow for encryption. Pick works well in an I/O-bound system, the DataBASIC actually running as interpreted P-code; fine for today's mega-fast systems doing Java, but it was slow on 20 MHz 68020 systems. So the encryption was re-implemented on a PC-XT in C. The binary-as-strings logic was kept, and a good thing too.
Several years later we upgraded to UniData running on a Motorola 88K under Unix. That C code for the encryption could now run on the same box as the 'system', no need for a clunky comms connection to a PC, and the C code ported just fine. I hate to think what that conversion would have been like with 16-bit ints becoming 32-bit, and big-endianness instead of the 8088's little-endianness. There was another upgrade to 64-bit Alpha; again, that would have needed the code adjusting for another increase in the size of integers. No doubt that system is now running on Xeons, a change in endianness again, irrelevant to the binary-as-string code.
-
PIEBALDconsult wrote:
You haven't met .net yet, have you?
Meaning? Just to be clear...converting a string to a numeric, multiplying by 10 and then converting back to a string is going to be slower than appending to a string. In all of the languages that I know, including .Net.
-
harold aptroot wrote:
Yes.. but the idea was not to use strings in the first place.
Ah...but of course that wasn't the scenario presented. And not presumable in these days of prevalent GUIs and XML.
-
The scenario is intended to be the same one as in the earlier paragraphs of my post: math with strings for no good reason.
-
PIEBALDconsult wrote:
You haven't met .net yet, have you?
Meaning? Just to be clear...converting a string to a numeric, multiplying by 10 and then converting back to a string is going to be slower than appending to a string. In all of the languages that I know, including .Net.
Yes, but the appendification might not be as quick as just the multiplying by ten.
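For the curious, a sketch of the two contestants in C (illustrative names; in .NET, where strings are immutable, the append additionally allocates a brand-new string every time, which is presumably the :sigh: above):

#include <stdio.h>
#include <string.h>

/* multiply by ten when the value lives in an integer: one imul */
unsigned long times_ten(unsigned long v)
{
    return v * 10;
}

/* "multiply by ten" when the value lives in a decimal string:
   an append, which still costs a length scan and a store */
void times_ten_str(char *buf)
{
    strcat(buf, "0");
}

int main(void)
{
    char s[32] = "1234";
    printf("%lu\n", times_ten(1234));   /* 12340 */
    times_ten_str(s);
    printf("%s\n", s);                  /* 12340 */
    return 0;
}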
-
No offense, but I wouldn't say "too lazy to port properly" is a valid excuse to keep that kind of code around. But it's not as bad as doing it out of a lack of understanding - at least there was some actual reason.
There's a lot of code in the DEA algorithm to set bits from seemingly random bits in a source 64-bit block. Thinking about it, the string binary might actually be more efficient than bitwise binary. Setting bit 53 to the value of bit 17 in string binary is a simple byte-to-byte copy. In bitwise operations it's simple enough to specify in C, but what might the assembly code look like? It would have to load the word containing the source bit, do a TST to see whether the bit is set or not, branch to separate setting or unsetting logic, load the destination word, then OR in the destination bit if setting or AND with the complement of the bit if unsetting. That has got to end up being more instructions than a simple byte-to-byte transfer. I think that was the whole point of the DEA algorithm: it was meant to be computationally inefficient and time-consuming.
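(Spelled out in C, the branchy sequence described above would be something like this sketch, with the bit positions from the example:)

#include <stdint.h>

/* test the source bit, then set or clear the destination bit */
uint64_t copy_bit_branchy(uint64_t dst, uint64_t src)
{
    if (src & (UINT64_C(1) << 17))
        dst |= UINT64_C(1) << 53;      /* source bit set: set bit 53 */
    else
        dst &= ~(UINT64_C(1) << 53);   /* source bit clear: clear it */
    return dst;
}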
-
Now that's a more valid reason. You wouldn't do it with a branch of course, but yes it still kind of sucks. This is how I might write it:
; rcx = destination, rdx = source, for no reason
btr rcx,53
and edx,0x00020000
shl rdx,36
or rcx,rdx
So 3 cycles, not so bad. Still, using strings would be faster for this step.
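For readers who'd rather stay in C, a sketch of the same branch-free idea (bit positions as in the posts above):

#include <stdint.h>

/* copy bit 17 of src into bit 53 of dst without branching:
   clear the destination bit, isolate the source bit, shift it
   into place and OR it in - the same steps as the assembly */
uint64_t copy_bit(uint64_t dst, uint64_t src)
{
    dst &= ~(UINT64_C(1) << 53);   /* btr rcx,53       */
    src &=   UINT64_C(1) << 17;    /* and edx,0x20000  */
    return dst | (src << 36);      /* shl rdx,36 ; or  */
}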