Text Readers, Streams, *ugh*

    gUrM33T wrote (#1):

    Hi, while writing a .NET port of my HTML Reader Class Library in C#, I'm stuck on a stupid problem. As you may know, the library can read HTML text both from in-memory strings and from disk files. At first I declared a private field of type System.IO.TextReader in my LiteHTMLReader class (the main class of the library) and defined a Read function with one overload. Part of the class looked like this:

    public class LiteHTMLReader
    {
        private System.IO.TextReader oHtmlReader = null;

        // Parses HTML held in an in-memory string.
        public long Read(string htmlText)
        {
            oHtmlReader = new System.IO.StringReader(htmlText);
            return parseHTMLDocument(); // parseHTMLDocument is a private function
        }

        // Parses HTML from a file; the encoding is detected from the byte order mark if present.
        public long Read(string pathToFile, System.Text.Encoding encoding)
        {
            oHtmlReader = new System.IO.StreamReader(pathToFile, encoding, true);
            return parseHTMLDocument();
        }
    }

    But I soon learned that readers (classes derived from System.IO.TextReader) are forward-only, and while parsing an HTML document I sometimes need to move back as well. So I rejected the idea of using readers, and my next option was streams. There is a System.IO.FileStream class for files, but what about strings?

    - How can I open a stream on a string?
    - Is there a class for this in the framework?
    - Should I write my own implementation?
    - Is there a better option for the scenario described above?

    Even if I use streams, there is one more thing I need to know. Streams deal only with bytes (GetBytes), while the String class uses Unicode by default. How do I deal with this? Sorry for asking so much, but as you can guess I'm a newbie in C#, and I get mail daily from people asking me to release a .NET port of the library. Any suggestions are welcome. Regards, Gurmeet


    BTW, can Google help me search my lost pajamas?

    My Articles: HTML Reader C++ Class

      Paul Watson wrote (#2):

      The question "xsl transform to stream adds extra characters" indicates that you can use a MemoryStream for string streams. Hope that helps. Regards, Paul Watson

      Bluegrass South Africa
      Chris Maunder wrote: "I'd rather cover myself in honey and lie on an ant's nest than commit myself to it publicly." Jon Sagara replied: "I think we've all been in that situation before."
      Crikey! ain't life grand?
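As a minimal sketch of that suggestion: a string can be encoded to bytes and wrapped in a MemoryStream, which is seekable. The class and method names below (StringStreamSketch, ToStream, ReadAll) are hypothetical and are not part of the library discussed in this thread; the encoding is passed in explicitly because the stream only holds bytes.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only.
public static class StringStreamSketch
{
    // Wraps the text in a seekable, read-only MemoryStream using the given encoding.
    public static MemoryStream ToStream(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text); // GetBytes does not emit a byte order mark
        return new MemoryStream(bytes, false);  // false = not writable
    }

    // Reads the whole stream back as text, decoding with the same encoding.
    // Disposing the reader also closes the stream.
    public static string ReadAll(Stream stream, Encoding encoding)
    {
        stream.Position = 0;
        using (StreamReader reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The same encoding has to be used on both sides; decoding UTF-16 bytes as UTF-8, for example, would not give the original text back.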


        gUrM33T wrote (#3):

        ... but I'm not sure if it's the right one. Now my class looks like this:

        public class LiteHTMLReader
        {
            private System.IO.Stream oHtmlStream;

            // Parses HTML held in an in-memory string by exposing it as a UTF-16 byte stream.
            public long Read(string htmlText)
            {
                using (this.oHtmlStream = new System.IO.MemoryStream(System.Text.Encoding.Unicode.GetBytes(htmlText)))
                {
                    long lCharCount = this.parseHTMLDocument();
                    return lCharCount;
                }
            }

            // Loads the whole file into a string, then parses it via Read(string).
            public long ReadFile(string pathToFile)
            {
                using (System.IO.StreamReader sr = new System.IO.StreamReader(pathToFile, true))
                {
                    string strFileData = sr.ReadToEnd();
                    return this.Read(strFileData);
                }
            }
        }

        I don't know why, but I don't think this is the proper way to do it. Can someone clear up my doubts? Heath Stewart, Mike Dimmick, any other C# guru, where are you guys? Please help. Thanks, Gurmeet
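One detail of this approach may be worth spelling out: Encoding.Unicode is UTF-16 little-endian, so every char of the string occupies two bytes in the MemoryStream, and moving back by n characters means seeking back by 2n bytes. The sketch below illustrates that mapping; Utf16StreamSketch and CharAt are hypothetical names, and the code assumes the stream was built with Encoding.Unicode.GetBytes, which writes no byte order mark.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only.
public static class Utf16StreamSketch
{
    // Reads the UTF-16 code unit at charIndex from a stream created with
    // Encoding.Unicode.GetBytes (little-endian, no byte order mark).
    public static char CharAt(Stream utf16Stream, long charIndex)
    {
        utf16Stream.Seek(charIndex * 2, SeekOrigin.Begin); // 2 bytes per code unit
        int low = utf16Stream.ReadByte();
        int high = utf16Stream.ReadByte();
        if (low < 0 || high < 0)
        {
            throw new EndOfStreamException();
        }
        return (char)(low | (high << 8)); // little-endian: low byte comes first
    }
}
```

Because the position is computed from the index, the parser can jump backwards as easily as forwards.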



          Heath Stewart wrote (#4):

          You can use the Encoding class to convert the bytes to strings using the appropriate encoding. Also, even with a reader, you should be able to call Seek on the BaseStream property if you use a StreamReader (which derives from TextReader).
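To illustrate the first point, here is a small sketch of converting between strings and bytes with the Encoding class. EncodingSketch, RoundTrip and ByteLength are hypothetical names chosen for this example; GetBytes, GetString and GetByteCount are the framework methods doing the work.

```csharp
using System.Text;

// Hypothetical helper for illustration only.
public static class EncodingSketch
{
    // Encodes text to bytes and decodes it again with the same encoding.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    // Number of bytes the text occupies in the given encoding.
    public static int ByteLength(string text, Encoding encoding)
    {
        return encoding.GetByteCount(text);
    }
}
```

The byte length depends on the encoding: the three characters of "<p>" take 3 bytes in UTF-8 but 6 bytes in UTF-16 (Encoding.Unicode), which matters as soon as stream positions are used to move around in the text.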

          Microsoft MVP, Visual C# My Articles


            Mike Dimmick wrote (#5):

            If you do Seek on the BaseStream, remember to call DiscardBufferedData on the StreamReader. Otherwise, you'll get the rest of the buffered data before you get the data from the new position in the stream. It doesn't look like you can turn this buffering off.

            Stability. What an interesting concept. -- Chris Maunder
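The Seek plus DiscardBufferedData combination can be sketched as follows. RewindSketch and Rewind are hypothetical names; the example only rewinds to the start of the stream, since seeking to an arbitrary byte offset additionally requires that the offset falls on a character boundary in the stream's encoding.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper for illustration only.
public static class RewindSketch
{
    // Moves a StreamReader back to the start of its underlying stream.
    public static void Rewind(StreamReader reader)
    {
        reader.BaseStream.Seek(0, SeekOrigin.Begin);
        reader.DiscardBufferedData(); // drop characters buffered from the old position
    }
}
```

Without the DiscardBufferedData call, the next Read would continue from the reader's internal buffer rather than from the new stream position.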


              gUrM33T wrote (#6):

              Your solution relies on BaseStream, a property of the StreamReader class. But I need to deal with strings as well as files, and while StreamReader has a BaseStream property, StringReader does not. What do you suggest in this case? I would also like to know whether, in your opinion, the alternative approach I posted above is right for this situation. Thanks, Gurmeet
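One possible way around the missing BaseStream, offered only as a sketch and not something proposed by the posters in this thread: since the ReadFile method above already loads the whole file into a string, the parser could work on the string directly and keep its own position, which makes moving back trivial for both sources. TextCursor and its members are hypothetical names.

```csharp
using System;

// Hypothetical helper, not part of the .NET Framework or the original library.
public sealed class TextCursor
{
    private readonly string text;
    private int position;

    public TextCursor(string text)
    {
        if (text == null) throw new ArgumentNullException("text");
        this.text = text;
    }

    // Index of the next character to be read.
    public int Position { get { return position; } }

    // Returns the next character and advances, or -1 at the end (like TextReader.Read).
    public int Read()
    {
        return position < text.Length ? text[position++] : -1;
    }

    // Returns the next character without advancing, or -1 at the end.
    public int Peek()
    {
        return position < text.Length ? text[position] : -1;
    }

    // Moves back by count characters, stopping at the start of the text.
    public void Back(int count)
    {
        if (count < 0) throw new ArgumentOutOfRangeException("count");
        position = Math.Max(0, position - count);
    }
}
```

The trade-off is that the whole document is held in memory as a string, which is the same cost the MemoryStream version already pays.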



                Heath Stewart wrote (#7):

                There's almost never a "right" way, just good and bad ways. Your alternative - if it works - isn't bad and seems to be pretty efficient. That's what counts.


                  gUrM33T wrote (#8):

                  What, in your opinion, can be done to make it more efficient? Gurmeet



                    Heath Stewart wrote (#9):

                    Sorry, that was supposed to be "efficient", not "inefficient" (I use the latter far more often here in this forum :rolleyes:). Context clues should've told you that, but thanks for the low vote anyway.
