It makes me wonder why I've never seen it before

  • Rage wrote:

    Why? Because it is probably more complicated than it seems.

    Do not escape reality : improve reality !

    honey the codewitch wrote (#6):

    It's not really. I'm almost done with it.

    Real programmers use butterflies

      Rage wrote (#7):

      Yes, I was merely talking about normal mortals like us, not wizards :)

      Do not escape reality : improve reality !

        honey the codewitch wrote (#8):

        That'd be a witch. Wizards are a different thing altogether. :-D

        Real programmers use butterflies

        • honey the codewitch wrote:

          On IoT you don't have a lot of luxuries. You simply learn to do without. One area where I have to do without is HTML and XML well-formedness checking and validation. That might be an issue where data interchange is concerned, but not so much where rendering HTML or XHTML content is concerned.

          What do you do on an error? You fail. You can either stop, or continue to render, possibly displaying some bad content as a result, but that is still better than failing outright halfway through the parse because the document forgot a </b>. In fact, continuing to render is what commercial browsers do.

          Here's the thing. If this is what you're doing, you don't need a DTD. You don't need an XSD schema. You don't even need a heckin stack! The result is much faster and lighter, with a smaller binary footprint.

          So why haven't I seen a pull reader with minimal validation/well-formedness checking in the open source pool? You'd think such a beast would be incredibly useful for building web browsers - even tiny ones - especially tiny ones! *cracks knuckles* I shouldn't have to be writing this. It's one of those things that leaves me wondering why it doesn't exist already.

          Real programmers use butterflies

          Lost User wrote (#9):

          I preprocess / strip out all the HTML and insert my own markup directives. I don't need a "paragraph" keyword to tell me where a paragraph should start or end, and so on. But as you say, it's assumed to be valid HTML in the first place (or made to be).

          It was only in wine that he laid down no limit for himself, but he did not allow himself to be confused by it. ― Confucian Analects: Rules of Confucius about his food

            honey the codewitch wrote (#10):

            As far as preprocessing goes, I don't have the memory or NVS storage to do that on my device, and it wouldn't really buy me much even if I did. Much better, in my scenario at least, to just read with a pull parser in a loop, get tag names back, and set values on a context structure I use while rendering. The context structure has the current position and flags like "bold" or "italic", styles and font faces, that sort of thing. I mean, if I knew at compile time what my HTML was going to be, I'd just generate C++ code from it that renders it, and that sort of "preprocessing" would be a huge win, but in my scenario I have to read arbitrary HTML.
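
A minimal sketch of the "set values on a context structure" idea, for readers who want to see the shape of it. This is not the actual code from the project: `render_context`, its fields, and `apply_tag` are hypothetical names, and the sketch assumes the reader hands back lower-case tag names plus a flag saying whether the tag is a closing one.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical rendering state; names and fields are illustrative only.
struct render_context {
    int16_t x = 0;        // current drawing position
    int16_t y = 0;
    bool bold = false;    // style flags toggled by tags
    bool italic = false;
};

// Update the context from one tag reported by the pull reader.
// There is no stack: an opening tag sets a flag and a closing tag clears it,
// so a missing or stray closing tag only affects styling, never the parse.
inline void apply_tag(render_context& ctx, const char* name, bool closing) {
    assert(name != nullptr);
    if (std::strcmp(name, "b") == 0 || std::strcmp(name, "strong") == 0) {
        ctx.bold = !closing;
    } else if (std::strcmp(name, "i") == 0 || std::strcmp(name, "em") == 0) {
        ctx.italic = !closing;
    }
    // Unknown tags are deliberately ignored rather than treated as errors.
}
```

The render loop would call `apply_tag` for each element start or end it reads, and draw text using whatever flags are set at that moment. A document that forgets its </b> simply stays bold until something else clears the flag.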

            Real programmers use butterflies

              enhzflep wrote (#11):

              honey the codewitch wrote:

              *cracks knuckles*

              Ye-aaaaah! Rock on Witchy Poo! Looking forward to the article(s) :thumbsup:

                honey the codewitch wrote (#12):

                Thanks for the vote of confidence. I'm getting there, but it's a bit of a bear.

                For starters, everything has to stream, because you don't have a ton of RAM. 1kB is a big deal, so I let you specify as little as 128 bytes for a buffer. I can't stream attribute and element names, but I can stream attribute values and element content, N bytes at a time (depending on what you set N to). So if you have a long attribute value while you're in a while(reader.read()) loop, you'll get multiple reader.node_type()==ml_node_type::attribute_content results back before getting reader.node_type()==ml_node_type::attribute_end.

                Not only are there a zillion HTML entities like &copy; (©), but I had to make a state machine to decode all of them efficiently off a Unicode stream. Also this:

                <span class="foo">this is valid
                <span class='foo'>and this
                <input disabled>
                <div class=you_thought_this_would_be easy id=but_no_because_html_hates_you_this_is_also_valid>

                So this is kinda rough sometimes, but I'm making progress. Fortunately I don't have to care about custom entity references, namespace declarations, or even well-formedness (balanced tags, etc.), which makes some of it pretty easy.
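
To make the four tag forms above concrete, here is a rough sketch of an attribute scanner that accepts double-quoted, single-quoted, unquoted, and valueless attributes. It is not the reader's real implementation: `attr` and `next_attr` are made-up names, and for brevity it scans a complete in-memory tag body with `std::string_view` instead of streaming through a small fixed buffer.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical result type: views into the input, no allocation.
struct attr {
    std::string_view name;
    std::string_view value;
    bool has_value = false;   // false for bare attributes such as "disabled"
};

// Scan one attribute starting at s[pos] and advance pos past it.
// Returns false when no attribute name is found (for example at '>').
inline bool next_attr(std::string_view s, std::size_t& pos, attr& out) {
    assert(pos <= s.size());
    auto is_space = [](char c) {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    };
    while (pos < s.size() && is_space(s[pos])) ++pos;
    std::size_t start = pos;
    while (pos < s.size() && !is_space(s[pos]) &&
           s[pos] != '=' && s[pos] != '>' && s[pos] != '/') ++pos;
    if (pos == start) return false;
    out.name = s.substr(start, pos - start);
    out.value = {};
    out.has_value = false;
    while (pos < s.size() && is_space(s[pos])) ++pos;
    if (pos >= s.size() || s[pos] != '=') return true;  // valueless attribute
    ++pos;                                              // skip '='
    while (pos < s.size() && is_space(s[pos])) ++pos;
    out.has_value = true;
    if (pos < s.size() && (s[pos] == '"' || s[pos] == '\'')) {
        const char quote = s[pos++];                    // quoted value
        start = pos;
        while (pos < s.size() && s[pos] != quote) ++pos;
        out.value = s.substr(start, pos - start);
        if (pos < s.size()) ++pos;                      // skip closing quote
    } else {
        start = pos;                                    // unquoted value
        while (pos < s.size() && !is_space(s[pos]) && s[pos] != '>') ++pos;
        out.value = s.substr(start, pos - start);
    }
    return true;
}
```

Run over the body of the last tag in the snippet, a scanner like this yields three attributes: class with an unquoted value, then easy as a valueless attribute of its own, then id. That split is part of why the unquoted form is such a trap.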

                Real programmers use butterflies

                  enhzflep wrote (#13):

                  You bugger. The bit of my brain I need to use to make an intelligent response is entirely consumed with the act of killing myself laughing at your code-snippet. Fan-friggen-tastic :laugh: :laugh: :thumbsup:

                    honey the codewitch wrote (#14):

                    I've got it running. I even wrote most of the article, but I should probably test it more.

                    Real programmers use butterflies

                      enhzflep wrote (#15):

                      Nice work.. :-D
