Converting a stream of HTML to XML and reading the XML into an XmlDocument in C#
-
I have a stream object containing HTML that I can read as a List. What I need to do is convert the stream of HTML into XML and store the XML in an XmlDocument in C#. Can anyone show me some code that will help me do this? Thanks, Steve Holdorf
-
Please explain what you mean by "convert HTML to XML". Both are just strings with a set format, but there are many cases where valid HTML is not valid XML (for example, <br> and <p> tags may not be terminated). You can try to parse the HTML and "fix" it, but it may not be worth the effort. On the other hand, if you have control over the HTML and can ensure correct structure yourself, nothing prevents you from creating an XmlDocument out of it using the Load() or LoadXml() methods.
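As a rough illustration of the point above, here is a minimal sketch that only uses System.Xml. The class and method names (WellFormedHtmlLoader, LoadOrNull) are made up for this example. It assumes the HTML is already well-formed XML; anything else, such as an unterminated <br>, makes XmlDocument.Load throw an XmlException.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper, not part of any library.
public static class WellFormedHtmlLoader
{
    // Returns the loaded document, or null when the markup is not well-formed XML.
    public static XmlDocument LoadOrNull(Stream html)
    {
        var document = new XmlDocument();
        try
        {
            // Load(Stream) parses the stream strictly as XML.
            document.Load(html);
            return document;
        }
        catch (XmlException)
        {
            // Typical causes: unclosed tags such as <br>, or unquoted attributes.
            return null;
        }
    }

    // Convenience for trying the helper with an in-memory string.
    public static Stream ToStream(string text)
    {
        return new MemoryStream(Encoding.UTF8.GetBytes(text));
    }
}
```

Calling LoadOrNull on a stream holding <table><tr><td>1</td></tr></table> should give back a document, while <p>one<br>two</p> should give null because the <br> is never closed.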
-- "My software never has bugs. It just develops random features."
-
You'll need to use something like the HTML Agility Pack.
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
-
What I mean is that I am reading Excel spreadsheet cells pasted into an editor and pulling the content of the editor into a stream of HTML. Next, what I want to do is take the HTML stream and make it into an XML stream that I can load into an XmlDocument (i.e. into ......). Then I will parse the XML data and store it in the database. Thanks, Steve Holdorf
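For the "parse the XML data" step, something along these lines might be a starting point. It is only a sketch under assumptions the post does not confirm: that the pasted cells end up as ordinary tr/td elements, and that the elements carry no XML namespace (with a namespace, the XPath below would need an XmlNamespaceManager). TableReader and ReadRows are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper, not part of any library.
public static class TableReader
{
    // Collects the text of every td cell, grouped by its tr row.
    // Assumes un-namespaced tr/td elements anywhere in the document.
    public static List<List<string>> ReadRows(XmlDocument document)
    {
        var rows = new List<List<string>>();
        foreach (XmlNode row in document.SelectNodes("//tr"))
        {
            var cells = new List<string>();
            foreach (XmlNode cell in row.SelectNodes("td"))
            {
                // InnerText concatenates the text of all descendants of the cell.
                cells.Add(cell.InnerText.Trim());
            }
            rows.Add(cells);
        }
        return rows;
    }
}
```

Each inner list then holds one row's cell values, ready to be mapped onto whatever database columns apply.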
-
Richard, the CodePlex library is great. One question, however: its class works by writing the HTML to a file on the server, and then the app code must read the file back into an XmlDocument. For security reasons we can't allow the application to save a file from the browser to the server and then read it back from the server. Is there any way to have the HTML-to-XML converter class write the converted HTML to a stream or something else besides a file on the server? Thanks a lot! Steve Holdorf
-
As well as the overloads of the Load method which take a file path, there are overloads which take a Stream or a TextReader as the source:
- File path:
void Load(string)
void Load(string, bool)
void Load(string, Encoding)
void Load(string, Encoding, bool)
void Load(string, Encoding, bool, int)
- Stream:
void Load(Stream)
void Load(Stream, bool)
void Load(Stream, Encoding)
void Load(Stream, Encoding, bool)
void Load(Stream, Encoding, bool, int)
- TextReader:
void Load(TextReader)
There's also a LoadHtml method which accepts a string containing the HTML to load.
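Putting the stream overload to use, the whole conversion can probably stay in memory. The sketch below assumes the HtmlAgilityPack package is referenced. HtmlToXml and Convert are names invented for this example. It uses the library's OptionOutputAsXml flag and the Save(TextWriter) overload, so nothing is written to disk. The library's XML output is not guaranteed to parse for every input, in which case LoadXml throws an XmlException.

```csharp
using System.IO;
using System.Xml;
using HtmlAgilityPack;

// Hypothetical helper, not part of the HTML Agility Pack.
public static class HtmlToXml
{
    // Reads HTML from a stream and returns it as an XmlDocument, all in memory.
    public static XmlDocument Convert(Stream htmlStream)
    {
        var html = new HtmlDocument();
        // Ask the library to serialize as XML (closes tags such as <br>).
        html.OptionOutputAsXml = true;
        html.Load(htmlStream);

        using (var writer = new StringWriter())
        {
            // Save to a TextWriter instead of a file path.
            html.Save(writer);

            var xml = new XmlDocument();
            xml.LoadXml(writer.ToString());
            return xml;
        }
    }
}
```

If the HTML is already held in a string, html.LoadHtml(text) can replace the html.Load(htmlStream) call, and the rest stays the same.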
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer