Code Project

XML and performance

    pedefetoll wrote (#1):

    I need to scan a very large XML file (several million records), and I have found that reading the data is very slow. On a Pentium 4 at 3 GHz, the get_text call below takes 9 milliseconds:

        BSTR BSTR_Result;
        IXMLDOMNodePtr pNode;
        // scan tree of nodes
        ...
        pNode = pNode->nextSibling;       // very quick (0 ms)
        // get the text of the node
        pNode->get_text(&BSTR_Result);    // takes 9 milliseconds

    When scanning the whole file, the call to "nextSibling" is very quick, but reading the value with "get_text" is very slow. How can I improve performance?

    Best regards.

      led mike wrote (#2):

      marcelcerdanjunior wrote:

      How can I improve performance?

      You can try switching to a SAX parser, but I doubt that will satisfy your 9 ms requirement. It is far more likely that you are abusing XML. XML is NOT a replacement for a database. You will probably have to use some form of optimized database to satisfy your 9 ms requirement.


      Last modified: after originally posted -- clicked wrong button

      led mike

        pedefetoll wrote (#3):

        It is strange that .NET gives acceptable performance (less than 1 second to find a record in a list of a million), but the msxml4.dll interface does not.

          led mike wrote (#4):

          marcelcerdanjunior wrote:

          It is strange that .NET gives acceptable performance (less than 1 second to find a record in a list of a million), but the msxml4.dll interface does not.

          Why is that strange? It would be strange if they were the same thing, but since they are not, why is it strange? There is a web page out there somewhere that lists the multitude of XML parsers with performance information. But again, a million records just screams "database".

          led mike

            Gerald Schwab wrote (#5):

            If you are trying to find 1 record in a list of a million, why not use selectSingleNode() instead of looping through each node yourself?

              pedefetoll wrote (#6):

              Thanks, I have tried your suggestion. Here are the results with the selectSingleNode() call (ms = milliseconds):

              /*
              nb records   table load duration   last record identifier to access   access time
              ---------------------------------------------------------------------------------
              1000000      75000 ms              "999999"                           ~ 43000 ms
              100000       2119 ms               "99999"                            52 ms
              10000        160 ms                "9999"                             5 ms
              */

              My table looks like:

              Bigtest
                  0000000 Guest Guest 0 06.00.00.00.00 true
                  0000001 Guest Guest 1 06.00.00.00.00 true
                  0000002 Guest Guest 2 06.00.00.00.00 true
                  ...
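
The access times above grow at least linearly with the record count, which is consistent with each selectSingleNode() call scanning records until it finds a match (an assumption; the thread does not confirm how MSXML evaluates the query). A minimal sketch of the alternative that the "database" advice points to, in standard C++: pay for one pass to build an in-memory index, then look records up by identifier. `Record`, `build_index` and `find_by_id` are hypothetical names, and the two fields are a guess at the layout; none of this is MSXML API.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record layout; the thread does not show the real element names.
struct Record {
    std::string id;
    std::string name;
};

// Build the index once: a single O(n) pass mapping each id to its position.
std::unordered_map<std::string, std::size_t>
build_index(const std::vector<Record>& records) {
    std::unordered_map<std::string, std::size_t> index;
    index.reserve(records.size());
    for (std::size_t i = 0; i < records.size(); ++i) {
        index.emplace(records[i].id, i);  // on duplicate ids the first occurrence wins
    }
    return index;
}

// Look a record up by id in expected constant time; returns nullptr if absent.
const Record* find_by_id(
        const std::vector<Record>& records,
        const std::unordered_map<std::string, std::size_t>& index,
        const std::string& id) {
    const auto it = index.find(id);
    return it == index.end() ? nullptr : &records[it->second];
}
```

The trade-off is the same one a database makes: extra memory and a one-time build cost in exchange for lookups that no longer depend on where the record sits in the file.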
