Compare string to UTF-8 character set
-
Hi all, I want to compare each character of my string variable against all characters of the UTF-8 character set (I think there are about 65,000 characters) and remove any character that is not in the UTF-8 character set. Does anybody have an idea how I can perform this operation efficiently?
Rupesh Kumar Swami Software Developer, Integrated Solution, Bikaner (India) My Company Award: Best VB.NET article of June 2008: Create Column Charts Using OWC11
-
Your question doesn't make sense. UTF-8 is not a character set, it's an encoding. Read[^]
A guide to posting questions on CodeProject[^]
Dave Kreskowiak Microsoft MVP Visual Developer - Visual Basic
2006, 2007, 2008
-
Sir, can I check whether any character of a string variable is not in UTF-8 format?
Rupesh Kumar Swami Software Developer, Integrated Solution, Bikaner (India) My Company Award: Best VB.NET article of June 2008: Create Column Charts Using OWC11
-
OK, you're not getting it. You don't check a String to see if it's encoded using UTF-8. You can check a stream of bytes though. See GetDecoder[^] for an example.
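As a rough illustration of checking bytes rather than a String, here is a minimal sketch. It assumes the data is already in memory as a Byte array, and it uses `GetString` on a strict `UTF8Encoding` instead of the `GetDecoder` approach linked above (a decoder is the better fit when the bytes arrive in chunks). The module name, the function name `IsValidUtf8`, and the sample bytes are made up for the example.

```vbnet
Imports System
Imports System.Text

Module Utf8CheckSketch
    ' Hypothetical helper: returns True if the bytes decode as well-formed UTF-8.
    Function IsValidUtf8(ByVal data As Byte()) As Boolean
        ' Second argument True = throw on invalid bytes instead of
        ' silently substituting the replacement character.
        Dim strict As New UTF8Encoding(False, True)
        Try
            strict.GetString(data)
            Return True
        Catch ex As DecoderFallbackException
            Return False
        End Try
    End Function

    Sub Main()
        ' "hi" followed by U+00E9, which UTF-8 encodes as C3 A9.
        Dim good As Byte() = {&H68, &H69, &HC3, &HA9}
        ' C3 starts a two-byte sequence, but 28 is not a continuation byte.
        Dim bad As Byte() = {&H68, &HC3, &H28}
        Console.WriteLine(IsValidUtf8(good))   ' True
        Console.WriteLine(IsValidUtf8(bad))    ' False
    End Sub
End Module
```

The check only makes sense on bytes read from a file, socket, or similar source. Once the data is a String, the decoding has already happened.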
A guide to posting questions on CodeProject[^]
Dave Kreskowiak Microsoft MVP Visual Developer - Visual Basic
2006, 2007, 2008
-
Rupesh Kumar Swami wrote:
i want to compare each character of my string variable to all characters of UTF-8 character set
As has already been pointed out in the thread, UTF-8 is not a character set.
Rupesh Kumar Swami wrote:
UTF-8 character set(i think there are near about 65000 character)
The Unicode character set contains more than 100,000 characters.
Rupesh Kumar Swami wrote:
and remove any character of variable which it is not in list of UTF -8 Character set.
Here's a method that does that for you:
    Public Function RemoveNonUnicodeCharacters(ByVal s As String) As String
        ' Every character in a .NET String is already Unicode, so there is nothing to remove.
        Return s
    End Function

Strings in .NET are Unicode. UTF-8 is a way to represent Unicode characters as a binary stream. So, any character that you have in a string can be encoded into a UTF-8 stream.
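To make that concrete, here is a small sketch of a round trip through UTF-8: a String is encoded to bytes and decoded back unchanged. The sample text (two Devanagari letters built with `ChrW`) and the module name are assumptions chosen for illustration.

```vbnet
Imports System
Imports System.Text

Module RoundTripSketch
    Sub Main()
        ' Eight ASCII characters plus two Devanagari characters (U+092C, U+0940).
        Dim original As String = "Bikaner " & ChrW(&H92C) & ChrW(&H940)

        ' Encode the String into a UTF-8 byte sequence.
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(original)

        ' Decode the bytes back into a String.
        Dim decoded As String = Encoding.UTF8.GetString(utf8Bytes)

        ' 8 one-byte characters + 2 three-byte characters = 14 bytes.
        Console.WriteLine(utf8Bytes.Length)                    ' 14
        Console.WriteLine(String.Equals(original, decoded))    ' True
    End Sub
End Module
```

One edge case worth knowing about: a String that contains an unpaired surrogate is not well-formed UTF-16, and `Encoding.UTF8` substitutes the replacement character U+FFFD for it rather than failing.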
Despite everything, the person most likely to be fooling you next is yourself.
-
Thank you, sir, for the information. One more question: how can I find the ASCII equivalent of a string that uses UTF-8 encoding, since it is 8-bit?
Rupesh Kumar Swami Software Developer, Integrated Solution, Bikaner (India) My Company Award: Best VB.NET article of June 2008: Create Column Charts Using OWC11
-
Strings are not encoded at all. An array of bytes using a certain encoding represents a string of text. It's kind of like a .ZIP file. A ZIP is an array of bytes that encodes a file, resulting in a compressed version of the file. What's the point of all this?? Why are you insisting on doing this?
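If the goal is an ASCII byte representation of some text, one possible reading of the question is sketched below. It assumes that losing non-ASCII characters is acceptable, because ASCII only covers code points 0 to 127 and anything outside that range has no ASCII equivalent. The sample text and module name are made up for the example.

```vbnet
Imports System
Imports System.Text

Module AsciiSketch
    Sub Main()
        ' "caf" followed by U+00E9, which is outside the ASCII range.
        Dim source As String = "caf" & ChrW(&HE9)

        ' Encoding.ASCII replaces every character it cannot represent with "?".
        Dim asciiBytes As Byte() = Encoding.ASCII.GetBytes(source)

        ' Decode again to see what survived the conversion.
        Dim roundTripped As String = Encoding.ASCII.GetString(asciiBytes)

        Console.WriteLine(asciiBytes.Length)   ' 4
        Console.WriteLine(roundTripped)        ' caf?
    End Sub
End Module
```

Text that is pure ASCII produces the same bytes under `Encoding.ASCII` and `Encoding.UTF8`, so a conversion like this only changes anything when non-ASCII characters are present.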
A guide to posting questions on CodeProject[^]
Dave Kreskowiak Microsoft MVP Visual Developer - Visual Basic
2006, 2007, 2008