Text Readers, Streams, *ugh*
-
Hi, while writing a .NET port of my HTML Reader Class Library[^] in C#, I'm kind of stuck on a stupid problem. As you may know, the library can read HTML text from both in-memory strings and disk files. At first I declared a private field of type System.IO.TextReader in my LiteHTMLReader class (the main class of the library) and defined a Read function with one overload. Part of the class looked like this:

    public class LiteHTMLReader
    {
        private System.IO.TextReader oHtmlReader = null;

        public long Read(string htmlText)
        {
            oHtmlReader = new System.IO.StringReader(htmlText);
            return parseHTMLDocument(); // parseHTMLDocument is a private function
        }

        public long Read(string pathToFile, System.Text.Encoding encoding)
        {
            oHtmlReader = new System.IO.StreamReader(pathToFile, encoding, true);
            return parseHTMLDocument();
        }
    }

But soon enough I learned that readers (derived from System.IO.TextReader) are forward-only, and sometimes while parsing an HTML document I need to move back as well. So I rejected the idea of using readers, and my next option was streams. OK, we have the System.IO.FileStream class to deal with files, but what about strings?

- How can I open a stream on a string?
- Is there any class available in the framework?
- Shall I go for my own implementation?
- Is there any better option available for the scenario described above?

Please suggest. Even if I use streams, there is one more thing I need to know. Streams deal with bytes only (GetBytes), while the String class uses Unicode by default. How do I deal with this situation? Guys, I'm so sorry for asking soooooo much, but you can obviously guess that I'm a newbie in C#. And I'm getting mails daily from different people requesting me to release a .NET port of the library. Please help. Any suggestions are welcome.

Regards, Gurmeet
BTW, can Google help me search my lost pajamas?
My Articles: HTML Reader C++ Class
-
"xsl transform to stream adds extra characters": that question indicates you can use MemoryStream for string streams. Hope that helps.
regards, Paul Watson
Bluegrass South Africa
Chris Maunder wrote: "I'd rather cover myself in honey and lie on an ant's nest than commit myself to it publicly." Jon Sagara replied: "I think we've all been in that situation before." Crikey! ain't life grand?
-
... but I'm not sure if it's the right one. Now my class looks like this:

    public class LiteHTMLReader
    {
        private System.IO.Stream oHtmlStream;

        // Parses HTML held in memory by exposing its UTF-16 bytes as a stream.
        public long Read(string htmlText)
        {
            using (this.oHtmlStream = new System.IO.MemoryStream(System.Text.Encoding.Unicode.GetBytes(htmlText)))
            {
                long lCharCount = this.parseHTMLDocument();
                return lCharCount;
            }
        }

        // Loads the whole file as text (detecting the encoding from a byte order mark), then parses it.
        public long ReadFile(string pathToFile)
        {
            using (System.IO.StreamReader sr = new System.IO.StreamReader(pathToFile, true))
            {
                string strFileData = sr.ReadToEnd();
                return this.Read(strFileData);
            }
        }
    }

I dunno why, but I don't think this is the actual way to do it. Can someone clear my doubts? Heath Stewart, Mike Dimmick, any other C# guru, where are you guyz? Please help. Thanks, Gurmeet
BTW, can Google help me search my lost pajamas?
My Articles: HTML Reader C++ Class Library, Numeric Edit Control
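
For readers following along, here is a minimal sketch of what moving backward over a MemoryStream of Encoding.Unicode bytes could look like. It assumes the stream holds little-endian UTF-16 without a byte order mark, which is what Encoding.Unicode.GetBytes produces. The class and method names (StringStreamSketch, Open, ReadChar, MoveBack) are hypothetical and not part of the library discussed in this thread.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, for illustration only.
public static class StringStreamSketch
{
    // Wraps a string in a seekable, read-only MemoryStream of UTF-16 bytes.
    public static MemoryStream Open(string text)
    {
        byte[] bytes = Encoding.Unicode.GetBytes(text);
        return new MemoryStream(bytes, false);
    }

    // Reads one UTF-16 code unit at the current position, or -1 at the end.
    // A character outside the Basic Multilingual Plane takes two code units.
    public static int ReadChar(Stream stream)
    {
        int low = stream.ReadByte();
        int high = stream.ReadByte();
        if (low < 0 || high < 0)
        {
            return -1;
        }
        return low | (high << 8); // Encoding.Unicode is little-endian
    }

    // Steps back by the given number of UTF-16 code units (2 bytes each).
    public static void MoveBack(Stream stream, int codeUnits)
    {
        stream.Seek(-2L * codeUnits, SeekOrigin.Current);
    }
}
```

The point of the sketch is that every Seek offset is counted in bytes, so with UTF-16 the parser has to multiply character counts by two. That bookkeeping is the main cost of going through a byte stream for text that is already a string.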
-
You can use the Encoding class to convert the bytes to strings using the appropriate encoding. Also, even with a reader, you should be able to call Seek on the BaseStream property if you use a StreamReader (which derives from TextReader).
Microsoft MVP, Visual C# My Articles
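
A minimal sketch of that suggestion, assuming the underlying stream is seekable: a StreamReader decodes the bytes with the given Encoding, and the reader is rewound through BaseStream. Because StreamReader buffers decoded data, the sketch also calls DiscardBufferedData after seeking. The names SeekableReaderSketch and ReadFirstLineTwice are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, for illustration only.
public static class SeekableReaderSketch
{
    // Reads the first line, rewinds the underlying stream, and reads it again.
    public static string[] ReadFirstLineTwice(Stream stream, Encoding encoding)
    {
        using (StreamReader reader = new StreamReader(stream, encoding))
        {
            string first = reader.ReadLine();

            // Seek moves the byte stream, not the reader. The reader still
            // holds buffered data, so discard it after seeking.
            reader.BaseStream.Seek(0, SeekOrigin.Begin);
            reader.DiscardBufferedData();

            string again = reader.ReadLine();
            return new[] { first, again };
        }
    }
}
```

Note that Seek offsets are byte positions, not character positions, so jumping to an arbitrary character in a variable-width encoding such as UTF-8 needs extra care. Rewinding to the start, as here, avoids that problem.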
-
If you do Seek on the BaseStream, remember to call DiscardBufferedData on the StreamReader. Otherwise, you'll get the rest of the buffered data before you get the data from the new position in the stream. It doesn't look like you can turn this buffering off.
Stability. What an interesting concept. -- Chris Maunder
-
Your solution suggests using the BaseStream property of the StreamReader class. But I don't only need to deal with files; I need to deal with strings too. The StreamReader class has a BaseStream property, but the StringReader class does not. What do you suggest in this case? Moreover, I would like to ask whether, in your opinion, the "alternate way" that I've posted above is right for this situation. Thanks, Gurmeet
BTW, can Google help me search my lost pajamas?
My Articles: HTML Reader C++ Class Library, Numeric Edit Control
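
One possible option for the string case, sketched here as an assumption and not something proposed in the thread: since both entry points end up with the whole document in memory anyway, the parser could index into a string and track its own position instead of relying on a reader or stream. The class RewindableText and its members are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: holds the whole document in memory so a parser
// can move forward and backward by character index.
public sealed class RewindableText
{
    private readonly string text;
    private int position;

    public RewindableText(string text)
    {
        if (text == null)
        {
            throw new ArgumentNullException("text");
        }
        this.text = text;
    }

    // Loads a file through StreamReader so decoding happens once, up front.
    public static RewindableText FromFile(string path, Encoding encoding)
    {
        using (StreamReader reader = new StreamReader(path, encoding, true))
        {
            return new RewindableText(reader.ReadToEnd());
        }
    }

    public int Position
    {
        get { return position; }
    }

    // Returns the next character and advances, or -1 at the end (like TextReader.Read).
    public int Read()
    {
        return position < text.Length ? text[position++] : -1;
    }

    // Returns the next character without advancing, or -1 at the end.
    public int Peek()
    {
        return position < text.Length ? text[position] : -1;
    }

    // Moves back by count characters, stopping at the start of the text.
    public void Unread(int count)
    {
        position = Math.Max(0, position - count);
    }
}
```

The trade-off is memory: the whole document is held as a string, which the ReadFile method posted earlier in the thread already does via ReadToEnd.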
-
There's almost never a "right" way, just good and bad ways. Your alternative - if it works - isn't bad and seems to be pretty efficient. That's what counts.
Microsoft MVP, Visual C# My Articles
-
What, according to you, can be done to make it more efficient? Gurmeet
BTW, can Google help me search my lost pajamas?
My Articles: HTML Reader C++ Class Library, Numeric Edit Control
-
Sorry, that was supposed to be "efficient", not "inefficient" (I use the latter far more often here in this forum :rolleyes:). Context clues should've told you that, but thanks for the low vote anyway.
Microsoft MVP, Visual C# My Articles