Encoding and the Streamreader

    KaptinKrunch wrote (#1):

    Hi, I have run into a small problem reading text from a file. Consider the line below, which is part of the source file:

        (Code 1956, § 486.01)

    I'm using the good old StreamReader to read the file:

        string line = sourceStreamReader.ReadLine();

    When this executes on the source line above, the section character "§" is read as char 65533. I've declared the StreamReader as follows:

        StreamReader sourceStreamReader = new StreamReader(sourceFiles[i].ToString());
        System.Diagnostics.Debug.WriteLine(sourceStreamReader.CurrentEncoding.EncodingName);

    The debug output tells me that the encoding is UTF-8, but the section character is neither read nor written as its original value. Can someone please tell me where I went wrong?

    Just because we can; does not mean we should.

      KaptinKrunch wrote (#2):

      As it turns out, changing the code to the listing below resolved my issue.

          //read the source file
          FileStream fStream = File.OpenRead(sourceFiles[i].ToString());
          byte[] buffer = new byte[fStream.Length];
          int bytesRead;
          bytesRead = fStream.Read(buffer, 0, buffer.Length);
          fStream.Close();

          Decoder decoder = Encoding.Default.GetDecoder();
          char[] cBuffer = new char[buffer.Length];
          int bytesConverted, charsConverted;
          bool bCompleted;
          decoder.Convert(buffer, 0, buffer.Length, cBuffer, 0, buffer.Length, false, out bytesConverted, out charsConverted, out bCompleted);
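For comparison, the same effect can probably be had without giving up the StreamReader, by passing an encoding to its constructor. The sketch below is only an illustration under assumptions: the class name `AnsiFileReader`, the helper `ReadFirstLine`, and the temporary sample file are hypothetical, and ISO-8859-1 stands in for whatever single-byte code page the real file uses. Note that `Encoding.Default` is the system ANSI code page on .NET Framework, but on .NET Core and .NET 5+ it is UTF-8, so naming the encoding explicitly is the safer choice there.

```csharp
using System;
using System.IO;
using System.Text;

public static class AnsiFileReader
{
    // Reads the first line of a file, decoding it with the given encoding
    // instead of StreamReader's UTF-8 default.
    public static string ReadFirstLine(string path, Encoding encoding)
    {
        using (var reader = new StreamReader(path, encoding))
        {
            return reader.ReadLine();
        }
    }

    public static void Main()
    {
        // Hypothetical sample file, written as single-byte Latin-1 text (assumption).
        string path = Path.GetTempFileName();
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        File.WriteAllText(path, "(Code 1956, \u00A7 486.01)\r\n", latin1);

        string line = ReadFirstLine(path, latin1);
        // True when the section sign (U+00A7) survived the round trip.
        Console.WriteLine(line.IndexOf('\u00A7') >= 0);

        File.Delete(path);
    }
}
```

If the file really is in a Windows code page such as 1252 and the code runs on .NET Core or later, that code page may need to be registered through `CodePagesEncodingProvider` before `Encoding.GetEncoding` can return it.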



        Guffa wrote (#3):

        KaptinKrunch wrote:

        changing the code to the listing below resolved my issue.

        That means the file is not Unicode at all, but ANSI (the system's default 8-bit code page).
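To see why the character came back as 65533, here is a minimal sketch. It assumes the file stores "§" as the single byte 0xA7, as a Latin-1 or Windows-1252 editor would; the class name `ReplacementCharDemo` and the helper `FirstCharCode` are made up for the illustration. A lone 0xA7 is not a valid UTF-8 sequence, so the UTF-8 decoder substitutes U+FFFD, the replacement character, which is 65533 in decimal.

```csharp
using System;
using System.Text;

public static class ReplacementCharDemo
{
    // Decodes raw bytes with the given encoding and returns the code of the first char.
    public static int FirstCharCode(byte[] bytes, Encoding encoding)
    {
        return encoding.GetString(bytes)[0];
    }

    public static void Main()
    {
        // "§ 486" as a single-byte editor would store it: § is the lone byte 0xA7.
        byte[] ansiBytes = { 0xA7, 0x20, 0x34, 0x38, 0x36 };

        // Decoded as UTF-8, the stray 0xA7 becomes U+FFFD.
        Console.WriteLine(FirstCharCode(ansiBytes, Encoding.UTF8)); // 65533

        // Decoded as ISO-8859-1, the same byte maps to U+00A7, the section sign.
        Console.WriteLine(FirstCharCode(ansiBytes, Encoding.GetEncoding("iso-8859-1"))); // 167
    }
}
```

This also fits the debug output in the question: a StreamReader created without an encoding assumes UTF-8 unless it finds a byte order mark, so `CurrentEncoding` reports UTF-8 regardless of how the file was actually saved.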

        KaptinKrunch wrote:

        bytesRead = fStream.Read(buffer, 0, buffer.Length);

        Ouch. You read from the file, but you ignore the result. The Read method doesn't have to return as much data as you request, so you have to loop until all the data has been read or the method returns zero. The easiest fix is, of course, to replace all that code with:

            string text = File.ReadAllText(sourceFiles[i].ToString(), Encoding.Default);
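If the manual read is kept for some reason, the loop described above could look roughly like this. It is a sketch, not the only way to write it: the class name `FullRead` and the helper `ReadFully` are hypothetical, and a `MemoryStream` stands in for the file so the example runs on its own.

```csharp
using System;
using System.IO;
using System.Text;

public static class FullRead
{
    // Keeps calling Read until the buffer is full or the stream reports
    // end of data by returning 0. Returns the number of bytes actually read.
    public static int ReadFully(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int n = stream.Read(buffer, total, buffer.Length - total);
            if (n == 0)
            {
                break; // end of stream reached before the buffer was filled
            }
            total += n;
        }
        return total;
    }

    public static void Main()
    {
        // Stand-in for the file contents (assumption for the demo).
        byte[] source = Encoding.ASCII.GetBytes("Code 1956");
        byte[] buffer = new byte[source.Length];

        using (var stream = new MemoryStream(source))
        {
            int read = ReadFully(stream, buffer);
            Console.WriteLine(read == source.Length); // True
        }
    }
}
```

The returned count matters: if it is smaller than the buffer, only that many bytes should be handed to the decoder.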

        Despite everything, the person most likely to be fooling you next is yourself.
