The Magician's String, what you see is not what you get.
-
static void Main(string[] args)
{
String str1 = "http://toto.com/";
String str2 = "http://toto.com/";
bool eq = str1 == str2;
Console.WriteLine(eq); //print false
str1 = "http://toto.com/";
str2 = "http://toto.com/";
eq = str1 == str2;
Console.WriteLine(eq); //print true
}
See for yourself, but copy the code, do not retype it. :) I lost hair on this one; it was a bug on an actual project for a customer. But it is a nice trick to play on your most hated coworker if his computer is unlocked... It also works in configuration files. ;) This is pure evil though.
[UPDATE] With some advice I found something even more evil than that.
"а" == "a" //false
-
So, for us mortals, care to explain what is going on here?
~RaGE();
I think words like 'destiny' are a way of trying to find order where none exists. - Christian Graus Entropy isn't what it used to.
Magic. :) str2 is a string with a hidden character. If you copy my code, you copy the hidden character, so this bug follows you into whatever programming language you use. You can run the code in debug mode and see that str2 is one character longer than str1. However, I have no idea how I ended up with this hidden character in my code. (The int value of this strange character is 0x200f.)
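For anyone who wants to hunt for this kind of character without a debugger, here is a minimal sketch. It assumes the culprit belongs to the Unicode "Format" or "Control" categories (U+200F, the right-to-left mark, is a Format character). The class name HiddenCharFinder and the position of the hidden character in the sample string are made up for illustration; the original post does not say where the character sits.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class HiddenCharFinder
{
    // Returns (index, char) pairs for characters that normally render as nothing:
    // Unicode "Format" characters (such as U+200F) and control characters.
    public static List<KeyValuePair<int, char>> Find(string s)
    {
        var hits = new List<KeyValuePair<int, char>>();
        for (int i = 0; i < s.Length; i++)
        {
            UnicodeCategory cat = char.GetUnicodeCategory(s[i]);
            if (cat == UnicodeCategory.Format || cat == UnicodeCategory.Control)
                hits.Add(new KeyValuePair<int, char>(i, s[i]));
        }
        return hits;
    }

    static void Main()
    {
        // Assumption: the hidden character is appended at the end. It is written
        // as an escape so that it stays visible in the source.
        string suspect = "http://toto.com/\u200F";
        foreach (var hit in Find(suspect))
            Console.WriteLine("index {0}: U+{1:X4}", hit.Key, (int)hit.Value); // index 16: U+200F
    }
}
```

This only catches invisible characters. It will not flag a visible lookalike letter from another alphabet.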
-
Compileonline.com[^] will show you the buggy char in String str2. Interesting though. :)
Wonde Tadesse
-
Did not know this tool! Good to know.
-
Another thing that bit me before was a UTF-8 preamble (BOM), the bytes 0xEF, 0xBB, 0xBF, that got copied from somewhere... :doh:
-
Already been bitten by that one: if you create a text file with Visual Studio, it bites you.
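A minimal sketch of how the BOM sneaks into a string and how to drop it, assuming the bytes are decoded with Encoding.UTF8.GetString, which keeps the preamble as the character U+FEFF. The helper name BomStripper is made up for illustration. The check compares the first char directly, because a culture-sensitive StartsWith may ignore zero-width characters and so is not a reliable test here.

```csharp
using System;
using System.Text;

static class BomStripper
{
    // Removes a single leading U+FEFF (the decoded form of the bytes EF BB BF).
    public static string StripBom(string s)
    {
        return (s.Length > 0 && s[0] == '\uFEFF') ? s.Substring(1) : s;
    }

    static void Main()
    {
        byte[] raw = { 0xEF, 0xBB, 0xBF, (byte)'o', (byte)'k' };
        string decoded = Encoding.UTF8.GetString(raw);

        Console.WriteLine(decoded.Length);           // 3: the BOM became one invisible char
        Console.WriteLine(StripBom(decoded).Length); // 2
    }
}
```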
-
I once read a rather ironic posting about what you could do to obscure your code (and this way make yourself irreplaceable). One of the topics was using similar letters from different alphabets in variable names. They used the example of the Cyrillic 'a', which looks just like the Latin 'a' but is seen as different by the compiler. I assume you could have reached a similar effect by using a Cyrillic 'r' instead of the Latin 'p' in the URL. :-D
The good thing about pessimism is that you are always either right or pleasantly surprised.
-
Did not know that; this is even more evil than the invisible character. I'm taking notes. You can spot the invisible char by checking str1.Length, but a Cyrillic 'a'... huhu. :)
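Since the length check does not help with lookalike letters, one blunt option is to list every character above the ASCII range together with its code point. This is only a sketch, and the class name NonAsciiScanner is made up for illustration. It flags all non-ASCII text, legitimate or not, so it only makes sense for things that are expected to be plain ASCII, such as URLs or identifiers.

```csharp
using System;
using System.Collections.Generic;

static class NonAsciiScanner
{
    // Lists every UTF-16 code unit above the ASCII range, with its index and code point.
    public static List<string> Describe(string s)
    {
        var found = new List<string>();
        for (int i = 0; i < s.Length; i++)
        {
            if (s[i] > 127)
                found.Add(string.Format("index {0}: U+{1:X4}", i, (int)s[i]));
        }
        return found;
    }

    static void Main()
    {
        // The Cyrillic letter is written as an escape so it is visible in the source.
        string lookalike = "\u0430rnold";
        foreach (string line in Describe(lookalike))
            Console.WriteLine(line); // index 0: U+0430
    }
}
```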
-
Yes, Unicode can be very handy. "the Greek letter Tau (Τ) (Unicode U+03A4) which looks enough like the Latin letter T" -- Sorting 'Total' after data values[^]
-
Would the Terminal font trick find the bug? :D
-
Well, some phishers used that in web addresses. Then, browsers were changed to show some encoded values in the address bar for such characters.
www.dеutsсhеbаnk.соm
looks so nice at first glance, but nowadays Firefox changes it into
www.xn--dutshbnk-66g8be6l.xn--m-0tbi
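The encoded form the browser shows is Punycode, and .NET can produce it with System.Globalization.IdnMapping. The sketch below converts a host name to its ASCII form so that lookalike letters show up as "xn--" labels. The wrapper name IdnCheck is made up for illustration, and the escaped host name is my reconstruction of the example above from its Punycode form, so treat it as an assumption.

```csharp
using System;
using System.Globalization;

static class IdnCheck
{
    // Converts an internationalized host name to its ASCII (Punycode) form.
    public static string ToAscii(string host)
    {
        return new IdnMapping().GetAscii(host);
    }

    static void Main()
    {
        // Cyrillic letters are written as escapes so they are visible in the source.
        string lookalike = "www.d\u0435uts\u0441h\u0435b\u0430nk.\u0441\u043Em";
        Console.WriteLine(ToAscii(lookalike)); // labels containing non-ASCII letters come out as xn--...
        Console.WriteLine(ToAscii("toto.com")); // a pure ASCII name is returned unchanged
    }
}
```

Comparing the result with the input is a cheap way to notice that a host name is not what it appears to be.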
-
It catches it and will tell you false. They are not the same Unicode character. See the code below.
private static void MystriesUniCode()
{
Console.WriteLine("{0} U+{1:x4} {2}", 'а', (int)'а', (int)'а');
Console.WriteLine("{0} U+{1:x4} {2}", 'a', (int)'a', (int)'a');
}
And the output is
? U+0430 1072
a U+0061 97
Seeing is believing. Not this time. Compiling is believing. :-D
Wonde Tadesse
-
I would never, ever, ever stoop so low as to unleash this on my coworkers... but I can think of some suppliers who may benefit from this (see my previous posts on Coding Horrors and The Weird and the Wonderful). bwa ha ha ha bwa ha ha ha ha bwa ha ha ha ha ha bwa ha ha ha ha ha ha
"If you don't fail at least 90 percent of the time, you're not aiming high enough." Alan Kay.
-
So evil... Heh... Of course, the VS theme[^] I'm using automatically underlines hyperlinks. And it doesn't underline that second one. Gee, I wonder why :-D
Proud to have finally moved to the A-Ark. Which one are you in?
Author of the Guardians Saga (Sci-Fi/Fantasy novels)
-
Quote:
Compileonline.com[^] will show you the buggy char in String str2. Interesting though. :)
I was talking about the above, BTW. :) Well, strange compilation. :D Same line twice and different results.
We should be building great things that don't exist.
- Larry Page
-
I refused to copy because I also wanted to find out where.
static void Main(string[] args)
{
    String str1 = "http://toto.com/";
    String str2 = "http://toto.com/";
    //             123456789 123456
    bool eq = str1 == str2;
    int j = str1.Length;
    Console.WriteLine(string.Format("Evaluates to {0}, Length = {1},{2}", eq, j, str2.Length)); //print false
    for (int i = 0; i < j; i++)
    {
        if (str1[i] != str2[i])
            Console.WriteLine(string.Format("Mismatch found index={0}, char(2),int(2) = {1}-{2},{3}-{4}",
                i, str1[i], str2[i], (int)str1[i], (int)str2[i]));
    }
    str1 = "http://toto.com/";
    str2 = "http://toto.com/";
    eq = str1 == str2;
    Console.WriteLine(eq); //print true
    Console.Read();
}
PS my "find-out" code has a bug in it that I only realized after it ran successfully through pure luck. (Do you see it?)
-
Have you tried
String str1 = "аrnold";
String str2 = "arnold";
This is not the same problem ;)
-
Looks like the same problem to me. One character is ASCII and the other is Unicode. (Well, they both are Unicode and one isn't ASCII. You could, of course, have both be non-ASCII.)
static void Main(string[] args)
{
    String str1 = "http://toto.com/";
    String str2 = "http://toto.com/";
    //             123456789 123456
    TestStrs(str1, str2);
    str1 = "аrnold";
    str2 = "arnold";
    TestStrs(str1, str2);
    str1 = "http://toto.com/";
    str2 = "http://toto.com/";
    TestStrs(str1, str2);
    Console.Read();
}

static bool TestStrs(string str1, string str2)
{
    bool eq = str1 == str2;
    if (eq)
    {
        Console.WriteLine(string.Format("Two Strings ({0}) are the same", str1));
        return eq;
    }
    Console.WriteLine(string.Format("Mismatch, two Strings ({0}) ({1})are not the same", str1, str2));
    int j = str1.Length, i = str2.Length;
    if (j > i) { j = i; }
    for (i = 0; i < j; i++)
    {
        if (str1[i] != str2[i])
            Console.WriteLine(string.Format("Mismatch found index={0}, char(2),int(2) = {1}-{2},{3}-{4}",
                i, str1[i], str2[i], (int)str1[i], (int)str2[i]));
    }
    return eq;
}
Of course this has a bug in it too. Strings with multiple true Unicode characters would blow up with an index-out-of-range error. Help says exactly what I thought it said, which is patently wrong:
String.Length Property (System), MSDN: "The Length property returns the number of Char objects in this instance, not the number of Unicode characters. The reason is that a Unicode character might be ..." http://msdn.microsoft.com/en-us/library/system.string.length[^]
If it did what it said it would do, I wouldn't have spotted the bug in the first place. No, I'm wrong. I was under the impression that a char could hold one or two bytes. I guess instead that when a true Unicode character is indexed in the string, the next index location points to the start of the next Unicode character; otherwise the string would show a bunch of mismatches in the loop. So true multiple UNICODE
-
The difference is that the http://toto.com/ example contains a hidden char, whereas "аrnold" == "arnold" uses two different "a" characters. This is why the two arnold strings have the same length, but the two http://toto.com/ strings do not. I was not aware of the StringInfo class, or that a Unicode character could take two chars. Very interesting stuff; I have no idea if a "Unicode char" exists.
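To make the Length discussion concrete, here is a small sketch comparing String.Length, which counts 16-bit Char values, with StringInfo.LengthInTextElements. The class name LengthDemo and the sample character are my own choices for illustration. A character outside the Basic Multilingual Plane needs two chars (a surrogate pair), while the Cyrillic 'а' fits in one, which is why the two arnold strings have the same length.

```csharp
using System;
using System.Globalization;

static class LengthDemo
{
    // Counts text elements (roughly, user-visible characters) rather than Char values.
    public static int TextElements(string s)
    {
        return new StringInfo(s).LengthInTextElements;
    }

    static void Main()
    {
        string clef = "\U0001D11E"; // MUSICAL SYMBOL G CLEF, outside the BMP

        Console.WriteLine(clef.Length);                    // 2: stored as a surrogate pair
        Console.WriteLine(TextElements(clef));             // 1
        Console.WriteLine(char.IsHighSurrogate(clef[0]));  // True
        Console.WriteLine("\u0430rnold".Length);           // 6: Cyrillic 'а' is a single char
    }
}
```

So indexing a string char by char stays in range as long as the loop is bounded by the shorter Length, but an index may land on half of a surrogate pair.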