OutOfMemory loading massive XML file

    Martin23
    wrote on last edited by
    #1

    Hi, I am writing a small program that opens a very large (almost 4 GB) XML file, then looks through the data for various text matches etc. (I don't need to display or write to the XML file.) The problem is the following code, which I use to open the XML file:

        StreamReader sr = new StreamReader(Filename);
        XmlTextReader xr = new XmlTextReader(sr);
        XmlDocument XMLdoc = new XmlDocument();
        XMLdoc.Load(xr);

    This loads the entire file, which causes an OutOfMemoryException when it reaches the 2 GB limit. It works OK for the small XML files I have been testing on, but I don't know how to deal with this massive file. I assume I am meant to open the file in chunks or something, but I haven't been able to find any info on how to achieve this (other than upgrading to 64-bit, but that isn't really practical). Can anybody point me in the right direction? Thanks!

      Guffa
      wrote on last edited by
      #2

      You could parse the nodes yourself if the file has a basically simple structure, e.g. something similar to:

          <root>
            <node ... >...</node>
            <node ... >...</node>
            <node ... >...</node>
            <node ... >...</node>
            <node ... >...</node>
            ... lots'a nodes
            <node ... >...</node>
          </root>

      By reading the file in small parts, you could extract the complete nodes you find in that part of the file, put them in a separate XML document in a string, and load it into an XmlDocument object.

      Pseudo code:

          buffer = ""
          loop {
            buffer += stream.Read(lotsabytes)
            find first "<node>" in buffer
            find last "</node>" in buffer
            nodes = get what's between
            buffer = what's after
            xmldoc.LoadXml("<root>" + nodes + "</root>")
            ... do whatever you want with the nodes
          }
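
A rough C# sketch of the pseudocode above, for illustration only. It assumes the repeated element is literally named `node`, that these elements are never nested, and that no other element name starts with `node`. The names `ChunkedNodeReader`, `ReadChunks`, and `chunkChars` are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

public static class ChunkedNodeReader
{
    // Yields one small XmlDocument per batch of complete <node> elements.
    // Assumption: flat, non-nested <node> elements, and no comments or
    // CDATA sections that contain the tag text.
    public static IEnumerable<XmlDocument> ReadChunks(TextReader input, int chunkChars = 64 * 1024)
    {
        const string openTag = "<node";
        const string closeTag = "</node>";
        var buffer = new StringBuilder();
        var chunk = new char[chunkChars];
        int read;

        while ((read = input.Read(chunk, 0, chunk.Length)) > 0)
        {
            buffer.Append(chunk, 0, read);
            string text = buffer.ToString();

            int first = text.IndexOf(openTag, StringComparison.Ordinal);
            int last = text.LastIndexOf(closeTag, StringComparison.Ordinal);
            if (first < 0 || last < first)
            {
                continue; // no complete node in the buffer yet, read more
            }

            int end = last + closeTag.Length;
            var doc = new XmlDocument();
            doc.LoadXml("<root>" + text.Substring(first, end - first) + "</root>");

            // Keep only the unfinished tail for the next round.
            buffer.Clear();
            buffer.Append(text, end, text.Length - end);

            yield return doc;
        }
    }
}
```

Each yielded document can be searched with the usual `XmlDocument` calls (for example `doc.DocumentElement.ChildNodes`) and then discarded, so only one chunk is held in memory at a time.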

        Martin23
        wrote on last edited by
        #3

        I see what you're getting at, thanks!

          Christian Graus
          wrote on last edited by
          #4

          I believe the XMLDataReader provides a solution that reads the file as it parses it. If not, there must be some control that works as SAX instead of DOM, and doesn't hold the whole file in memory.

          Christian Graus - Microsoft MVP - C++

            Matt Gerrans
            wrote on last edited by
            #5

            Yes, take the SAX approach -- open it as a Stream and use XmlTextReader. Not only will it help in not running out of memory, but it would be faster than the DOM approach even if you did have enough memory to load the whole thing.

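            // Open the file read-only and walk it one node at a time;
            // the reader is forward-only and never builds a full tree.
            Stream stream = new FileStream(fileName, FileMode.Open, FileAccess.Read);
            XmlTextReader reader = new XmlTextReader(stream);
            reader.WhitespaceHandling = WhitespaceHandling.None;
            while (reader.Read())
            {
                ...
            }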

            This is also a much better idea than parsing the file with regular expressions, or worse, junk like text.IndexOf(blah), especially when you consider that there is no guarantee that a whole XML file isn't a single line of text (then you are back to loading the whole thing into memory, probably).

            Matt Gerrans
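
One way the body of that read loop might look for the original question (searching for text matches) is sketched below. This is only an illustration: `XmlTextSearch` and `CountMatches` are hypothetical names, and it assumes the text of interest sits in ordinary text nodes rather than in attributes or CDATA sections.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlTextSearch
{
    // Counts the text nodes that contain searchText, reading forward one
    // node at a time so the document is never loaded as a whole.
    public static int CountMatches(Stream stream, string searchText)
    {
        int matches = 0;
        using (var reader = new XmlTextReader(stream))
        {
            reader.WhitespaceHandling = WhitespaceHandling.None;
            while (reader.Read())
            {
                // Only plain text nodes are checked here; attributes and
                // CDATA sections would need their own handling.
                if (reader.NodeType == XmlNodeType.Text &&
                    reader.Value.IndexOf(searchText, StringComparison.Ordinal) >= 0)
                {
                    matches++;
                }
            }
        }
        return matches;
    }
}
```

Memory use then depends on the largest single node rather than on the file size, although one enormous text node would still be read into a string in full.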

              Martin23
              wrote on last edited by
              #6

              Thanks guys, I should mange from here!.

                Matt Gerrans
                wrote on last edited by
                #7

                > Thanks guys, I should mange from here!

                Hmm... I think soap and water will help with that. If not, you should probably see a doctor. ;P

                Matt Gerrans
