Parsing Html The Cthulhu Way

Philip F wrote (#1):

I'm currently trying to parse some HTML in order to get some specific information out of it, and I found this very amusing blog entry on Coding Horror: Parsing Html The Cthulhu Way. I must admit, I'm "unusually seduced by parsing HTML the Cthulhu way", as Jeff Atwood calls it ;) (and yes, I could be called a "novice programmer"). But as I'm a "sane person", I will use a library! (But only after I've had enough fun parsing HTML myself :) ) Read the blog entry, it's fun!

Phil

I won’t not use no double negatives.

Abhinav S wrote (#2):

Philip F. wrote:

I will use a library

What if the library parses HTML the "Cthulhu" way? :cool:

Philip F wrote (#3):

I will cross-check this with my own parsing ;)

I won’t not use no double negatives.

ragnaroknrol wrote (#4):

The 5th comment got me.

If I have accidentally said something witty, smart, or correct, it is purely by mistake and I apologize for it.

PIEBALDconsult wrote (#5):

But are you really parsing HTML or are you merely extracting some data from a string that looks a lot like HTML? :cool:

lepipele wrote (#6):

http://htmlagilitypack.codeplex.com/

"This is an agile HTML parser that builds a read/write DOM and supports plain XPATH or XSLT (you actually don't HAVE to understand XPATH nor XSLT to use it, don't worry...). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant with "real world" malformed HTML. The object model is very similar to what proposes System.Xml, but for HTML documents (or streams)."
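
For anyone who wants to see the "tolerant parser" idea without installing anything, here is a minimal sketch in Python using the standard library's `html.parser`. It is not the Html Agility Pack API and it builds no DOM or XPath support; the class name `LinkCollector`, the helper `extract_links`, and the sample markup are all made up for illustration. It only shows that a real tokenizer copes with upper-case tags, unquoted attributes, and unclosed elements.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> start tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag and attribute names before calling us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return all href values found in html_text, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Assumed sample of sloppy markup: unclosed <p>, upper-case tag,
# unquoted attribute value.
sample = '<p>See <A HREF=docs.html>docs<p>and <a href="http://example.com/page">this</a>'
print(extract_links(sample))
```

The design choice is the same one the library description makes: let a parser deal with the mess, then query the result, instead of pattern-matching the raw string.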

Brady Kelly wrote (#7):

Not too long ago I tried, not parsing HTML, but merely searching it, and found it too challenging for the .NET RegEx engine.
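
A small sketch of why even "merely searching" gets awkward. This uses Python's `re` module as a stand-in rather than the .NET engine, and the pattern `NAIVE_HREF`, the helper `parser_hrefs`, and the sample string are assumptions made up for the example. The naive pattern trips over a `>` inside a quoted attribute and happily matches a link that is commented out, while the standard-library parser handles both cases.

```python
import re
from html.parser import HTMLParser

# Naive search: an <a ...> tag with a double-quoted href somewhere inside it.
NAIVE_HREF = re.compile(r'<a[^>]*href="([^"]*)"', re.IGNORECASE)


class HrefParser(HTMLParser):
    """Records href values of real <a> start tags only."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v is not None)


def parser_hrefs(html_text):
    """Return href values as seen by html.parser (comments are skipped)."""
    parser = HrefParser()
    parser.feed(html_text)
    parser.close()
    return parser.hrefs


# Assumed sample: a '>' inside an attribute value, plus a commented-out link.
tricky = '<a title="1 > 0" href="real.html">link</a><!-- <a href="commented-out.html"> -->'

print("regex :", NAIVE_HREF.findall(tricky))
print("parser:", parser_hrefs(tricky))
```

The regex could be patched for these two cases, but each patch tends to uncover the next edge case, which is roughly the point of the blog entry.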
