Parsing Html The Cthulhu Way
-
I'm currently trying to parse some HTML to get some specific information out of it, and I found this very amusing blog entry on Coding Horror: Parsing Html The Cthulhu Way. I must admit, I'm "unusually seduced by parsing HTML the Cthulhu way", as Jeff Atwood calls it ;) (and yes, I could be called a "novice programmer"). But as I'm a "sane person", I will use a library! (But only after I've had enough fun parsing HTML myself :) ) Read the blog entry, it's fun! Phil
I won’t not use no double negatives.
-
Philip F. wrote:
I will use a library
What if the library parses html the "Cthulhu" way? :cool:
-
The 5th comment got me.
If I have accidentally said something witty, smart, or correct, it is purely by mistake and I apologize for it.
-
But are you really parsing HTML or are you merely extracting some data from a string that looks a lot like HTML? :cool:
-
http://htmlagilitypack.codeplex.com/ "This is an agile HTML parser that builds a read/write DOM and supports plain XPATH or XSLT (you actually don't HAVE to understand XPATH nor XSLT to use it, don't worry...). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant with "real world" malformed HTML. The object model is very similar to what proposes System.Xml, but for HTML documents (or streams)."
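For anyone who wants to see what "tolerant with malformed HTML" buys you, here is a rough sketch of the same idea. It is not the HTML Agility Pack API (that is a .NET library); it uses Python's standard-library `html.parser` as a stand-in, and the names `LinkCollector`, `extract_links` and `SAMPLE` are made up for the illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper, not a library class)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html):
    """Feed a (possibly malformed) HTML string to the parser and return the hrefs."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Assumed sample: unclosed <p>, upper-case tag, unquoted attribute, missing </a> and </div>.
SAMPLE = (
    '<p>Unclosed paragraph <a href="/one">one</a>'
    "<A HREF=two.html>two</a>"
    '<div><a href="/three">three'
)

if __name__ == "__main__":
    print(extract_links(SAMPLE))
```

The point is only that the parser copes with the sloppy markup without any special handling on your side; a real DOM library such as the one linked above gives you a tree and XPath queries on top of that.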
-
Not too long ago I tried, not parsing HTML, but merely searching it, and found it too challenging for the .NET RegEx engine.
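To illustrate how even "merely searching" can go wrong, here is a small sketch in Python rather than .NET (so it says nothing about the .NET RegEx engine specifically). The pattern `NAIVE_HREF`, the `HrefParser` class and the `TRICKY` input are all assumptions chosen for the example.

```python
import re
from html.parser import HTMLParser

# A deliberately naive pattern: only double-quoted hrefs, no notion of comments.
NAIVE_HREF = re.compile(r'<a\s[^>]*href="([^"]*)"', re.IGNORECASE)


class HrefParser(HTMLParser):
    """Collects href values from real <a> start tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def hrefs_by_regex(html):
    """Search the raw string; markup inside comments is matched like any other text."""
    return NAIVE_HREF.findall(html)


def hrefs_by_parser(html):
    """Tokenize first, so commented-out markup never reaches handle_starttag."""
    parser = HrefParser()
    parser.feed(html)
    parser.close()
    return parser.hrefs


# Assumed input: one commented-out link, one live link with single quotes.
TRICKY = "<!-- <a href=\"/old\">old</a> --><a href='/new'>new</a>"

if __name__ == "__main__":
    print("regex: ", hrefs_by_regex(TRICKY))
    print("parser:", hrefs_by_parser(TRICKY))
```

On this input the naive regex reports the commented-out link and misses the single-quoted one, while the parser does the opposite. You can keep patching the pattern, of course, and that is roughly where Cthulhu comes in.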