Compare string to UTF-8 character set

    Rupesh Kumar Swami
    wrote on last edited by
    #1

    Hi all, I want to compare each character of my string variable against all the characters of the UTF-8 character set (I think there are about 65,000 characters) and remove any character that is not in the UTF-8 character set. Does anybody have an idea how I can perform this operation efficiently?

    Rupesh Kumar Swami Software Developer, Integrated Solution, Bikaner (India) My Company Award: Best VB.NET article of June 2008: Create Column Charts Using OWC11

      Dave Kreskowiak
      wrote on last edited by
      #2

      Your question doesn't make sense. UTF-8 is not a character set, it's an encoding. Read[^]

      A guide to posting questions on CodeProject[^]
      Dave Kreskowiak Microsoft MVP Visual Developer - Visual Basic
           2006, 2007, 2008

        Rupesh Kumar Swami
        wrote on last edited by
        #3

        Sir, can I check whether a character in a string variable is not in UTF-8 format?

          Dave Kreskowiak
          wrote on last edited by
          #4

          OK, you're not getting it. You don't check a String to see if it's encoded using UTF-8. You can check a stream of bytes though. See GetDecoder[^] for an example.
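A minimal sketch of checking a stream of bytes, assuming the data is already in a Byte array. It uses a strict UTF8Encoding with GetString rather than the GetDecoder example mentioned above, and the names Utf8Check and IsValidUtf8 are made up for illustration:

```vbnet
Imports System
Imports System.Text

Module Utf8Check
    ' Hypothetical helper: True when data is a well-formed UTF-8 byte sequence.
    Function IsValidUtf8(ByVal data As Byte()) As Boolean
        ' The second argument (throwOnInvalidBytes) makes decoding throw
        ' instead of silently substituting U+FFFD for bad bytes.
        Dim strict As New UTF8Encoding(False, True)
        Try
            strict.GetString(data)
            Return True
        Catch ex As ArgumentException
            ' DecoderFallbackException derives from ArgumentException.
            ' A Nothing array also lands here and is reported as invalid.
            Return False
        End Try
    End Function

    Sub Main()
        Dim good As Byte() = {&H48, &H69}   ' "Hi"
        Dim bad As Byte() = {&HC3, &H28}    ' lead byte followed by a non-continuation byte
        Console.WriteLine(IsValidUtf8(good))   ' True
        Console.WriteLine(IsValidUtf8(bad))    ' False
    End Sub
End Module
```

The check only makes sense on bytes (a file, a network buffer); once the data is a String, the encoding question is already settled.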

            Guffa
            wrote on last edited by
            #5

            Rupesh Kumar Swami wrote:

            i want to compare each character of my string variable to all characters of UTF-8 character set

            As has already been pointed out in this thread, UTF-8 is not a character set.

            Rupesh Kumar Swami wrote:

            UTF-8 character set(i think there are near about 65000 character)

            The Unicode character set contains more than 100,000 characters.

            Rupesh Kumar Swami wrote:

            and remove any character of variable which it is not in list of UTF -8 Character set.

            Here's a method that does that for you:

            Public Function RemoveNonUnicodeCharacers(ByVal s As String) As String
                Return s
            End Function

            Strings in .NET are Unicode. UTF-8 is a way to represent Unicode characters as a binary stream. So, any character that you have in a string can be encoded into a UTF-8 stream.
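To illustrate the point, here is a small sketch that encodes a string to UTF-8 bytes and decodes it back. It assumes the string holds well-formed text; an unpaired surrogate code unit is the one thing Encoding.UTF8 cannot represent, and it substitutes U+FFFD for it. The names Utf8RoundTrip and RoundTrip are illustrative only:

```vbnet
Imports System
Imports System.Text

Module Utf8RoundTrip
    ' Hypothetical helper: encode to UTF-8 bytes, then decode again.
    Function RoundTrip(ByVal s As String) As String
        Dim bytes As Byte() = Encoding.UTF8.GetBytes(s)
        Return Encoding.UTF8.GetString(bytes)
    End Function

    Sub Main()
        ' "Café €" built with ChrW so the source file encoding does not matter.
        Dim original As String = "Caf" & ChrW(&HE9) & " " & ChrW(&H20AC)
        Dim bytes As Byte() = Encoding.UTF8.GetBytes(original)

        Console.WriteLine(original.Length)               ' 6 characters
        Console.WriteLine(bytes.Length)                  ' 9 bytes: é takes 2, € takes 3
        Console.WriteLine(RoundTrip(original) = original) ' True
    End Sub
End Module
```

Nothing is lost or removed on the way, which is why there is no filtering step to write.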

            Despite everything, the person most likely to be fooling you next is yourself.

              Rupesh Kumar Swami
              wrote on last edited by
              #6

              Thank you, sir, for the information. One more question: how can I find the ASCII equivalent of a string that is made with UTF-8 encoding, since this is 8-bit?

                Dave Kreskowiak
                wrote on last edited by
                #7

                Strings are not encoded at all. An array of bytes using a certain encoding represents a string of text. It's kind of like a .ZIP file: a ZIP is an array of bytes that encodes a file, resulting in a compressed version of the file. What's the point of all this? Why are you insisting on doing this?
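If the goal is simply ASCII bytes for some text, one possible reading of the question can be sketched as below. The same String is turned into bytes with two different encodings; UTF-8 uses one to four bytes per character (so it is not "8-bit" in the one-byte-per-character sense), and Encoding.ASCII replaces anything outside ASCII with a question mark. The names AsciiFromText and ToAsciiBytes are assumptions for the example:

```vbnet
Imports System
Imports System.Text

Module AsciiFromText
    ' Hypothetical helper: the ASCII byte representation of a string.
    ' Characters outside ASCII come out as &H3F ("?").
    Function ToAsciiBytes(ByVal s As String) As Byte()
        Return Encoding.ASCII.GetBytes(s)
    End Function

    Sub Main()
        Dim sample As String = "Caf" & ChrW(&HE9)   ' "Café"
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(sample)
        Dim asciiBytes As Byte() = ToAsciiBytes(sample)

        Console.WriteLine(BitConverter.ToString(utf8Bytes))      ' 43-61-66-C3-A9
        Console.WriteLine(BitConverter.ToString(asciiBytes))     ' 43-61-66-3F
        Console.WriteLine(Encoding.ASCII.GetString(asciiBytes))  ' Caf?
    End Sub
End Module
```

The conversion to ASCII is lossy for any character above U+007F, so it is worth being clear about why ASCII is needed before doing it.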
