Code Project
Did you know...

  • OriginalGriff wrote:

    It's probably to encourage smaller source files: 10MB of code in one file is probably a little too big ... :laugh: Why on earth do you want to load a 1GB XML anyway? That's far too big for me to want to read!

    Sent from my Amstrad PC 1640 Never throw anything away, Griff Bad command or file name. Bad, bad command! Sit! Stay! Staaaay... AntiTwitter: @DalekDave is now a follower!

    realJSOP wrote (#6):

    I just wanted to see what was in it. :) They range from 1 MB up to about 3.5 GB. I have no control over how large the files to be processed are (they're generated by Nessus security scans). The idiots that generate the files are completely unwilling to accommodate us, so it's essentially an "it is what it is" situation. I have to parse these files and store the results in our database. Using just XDocument, I was running out of memory (the server in question only has 8 GB, most of which is already used by other processes), so I have to resort to a combination of XmlReader and LINQ to XML. Notepad, IE, Firefox, WordPad, and MS Word all load the file, but it takes them more than five MINUTES, and WordPad/Word become completely unusable. <rant> I wish people here (not you but some others) would stop f*ckin assuming I'm a rookie programmer. I have more years in the industry than most people on CP have even been alive. </rant>

    ".45 ACP - because shooting twice is just silly" - JSOP, 2010
    -----
    You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010
    -----
    When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013

    • PIEBALDconsult wrote:

      That's a problem with XML; it has to be read in its entirety before you can do anything with it. At work I receive a 6 GB XML file every stinking day and I have to use SSIS to get it into a database. I'm beginning to prefer JSON, which I can read one object at a time (provided the outermost value is an array of objects). However, I have written a fairly simple XML file splitter so I can make smaller files from one big one when I need to find out where a problem (e.g. non-well-formed XML) exists.

      realJSOP wrote (#7):

      That's not entirely true. You can use XmlReader, which reads the document sequentially, one node at a time. It's slower than XDocument and it's forward-only (you can't read backwards), but it solves my issue.
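
For readers who want to see the forward-only idea in action, here is a minimal sketch in Python rather than the poster's C#. It uses the standard library's `xml.etree.ElementTree.iterparse` as a rough analog of XmlReader. The sample document and the `ReportHost` / `ReportItem` element names are assumptions made for illustration, not taken from the poster's files.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a large scan report; the structure is assumed.
SAMPLE = b"""<Report>
  <ReportHost name="host-a"><ReportItem port="22"/><ReportItem port="80"/></ReportHost>
  <ReportHost name="host-b"><ReportItem port="443"/></ReportHost>
</Report>"""


def stream_hosts(source):
    """Yield (host name, item count) one host at a time, never holding the whole tree."""
    context = iter(ET.iterparse(source, events=("start", "end")))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == "ReportHost":
            yield elem.get("name"), len(elem.findall("ReportItem"))
            root.clear()  # discard finished children so memory use stays flat


hosts = list(stream_hosts(io.BytesIO(SAMPLE)))
print(hosts)
```

Each host is handled as a small in-memory subtree and then thrown away, which is roughly the same trade-off as pairing a streaming reader with a per-element tree API.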

        dandy72 wrote (#8):

        John Simmons / outlaw programmer wrote:

        <rant> I wish people here (not you but some others) would stop f*ckin assuming I'm a rookie programmer. I have more years in the industry than most people on CP have even been alive. </rant>

        It's been a while since I've seen you assert yourself on CP. I kinda miss the ol' smackdowns.

          OriginalGriff wrote (#9):

          Ouch! That's a stupid amount of data, particularly for a text-based transfer mechanism. Have these people never heard of databases? On the bright side, at least it's not XLSX? :laugh:

          "I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
          "Common sense is so rare these days, it should be classified as a super power" - Random T-shirt

            Lost User wrote (#10):

            OriginalGriff wrote:

            On the bright side, at least it's not XLSX?

            dunno, xlsx isn't so bad, and it's easier (ok, lazier) to debug: if there are bad data elements, just load it into excel and scroll down to the line with the issue. if you're suggesting interop (i.e. slower than molasses), that's a completely different issue, and there are way, way faster [read & write] alternatives. if worst comes to worst, you can unpack the xlsx and voilà, it's xml (pretty much exactly the same). (not criticizing, just unsure why you think it's any worse.)
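
The "unpack the xlsx" point can be sketched with Python's standard `zipfile` module, since an .xlsx file is a zip archive of XML parts. The archive below is a tiny in-memory stand-in built only for this example. A real workbook contains more parts than this, and the helper names are hypothetical.

```python
import io
import zipfile

# Build a tiny stand-in archive in memory; a real .xlsx has several more parts.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("xl/worksheets/sheet1.xml", "<worksheet><sheetData/></worksheet>")


def list_xml_parts(archive):
    """Return the names of the XML members inside a zip-packaged file."""
    with zipfile.ZipFile(archive) as zf:
        return [name for name in zf.namelist() if name.endswith(".xml")]


def read_part(archive, name):
    """Return one member of the archive decoded as UTF-8 text."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(name).decode("utf-8")


buf.seek(0)
parts = list_xml_parts(buf)
buf.seek(0)
sheet_xml = read_part(buf, parts[0])
print(parts, sheet_xml)
```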

              Jorgen Andersson wrote (#11):

              If you just want to take a peek in the file you can use the Lister that comes with Total Commander. It's still immediate on a 20GB file. No specific support for XML though, it's treated the same as any file.

              Wrong is evil and must be defeated. - Jeff Ello

                • realJSOP wrote:

                VS (2017) has a maximum supported file size of 10 MB? I just found that out myself while trying to load a 925 MB XML file.

                Dr Walt Fair PE wrote (#12):

                Methinks you need to rethink your XML design

                CQ de W5ALT

                Walt Fair, Jr., P. E. Comport Computing Specializing in Technical Engineering Software

                  Lost User wrote (#13):

                  XML is not an export-format.

                  Bastard Programmer from Hell :suss: If you can't read my code, try converting it here[^] "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.

                    OriginalGriff wrote (#14):

                    Have you ever tried to load 1GB into Excel? :omg: (And bear in mind that XLSX is packaged, zipped, XML - and thus slower and more memory hungry than "naked" XML)

                      Jan R Hansen wrote (#15):

                      I'd recommend UltraEdit. You can disable the "make automatic backups when opening files" option, and then you can open and work with very large files. Fast. That feature, plus the built-in hex editing that lets me see everything, including BOM bytes in files, makes it worth the license fee. Just in case you didn't know it and needed something better than Notepad and Notepad++ for large files :)

                      Do you know why it's important to make fast decisions? Because you give yourself more time to correct your mistakes, when you find out that you made the wrong one. Chris Meech on deciding whether to go to his daughters graduation or a Neil Young concert

                        Sander Rossel wrote (#16):

                        John Simmons / outlaw programmer wrote:

                        I wish people here (not you but some others) would stop f*ckin assuming I'm a rookie programmer.

                        Just like at work, other people mess up, but you get the blame!

                        John Simmons / outlaw programmer wrote:

                        I have more years in the industry than most people on CP have even been alive.

                        That's no guarantee of actually being a good programmer. For example, the programmer who gives you 3.5 GB of XML in a single file probably says the same :rolleyes:

                        Best, Sander sanderrossel.com Continuous Integration, Delivery, and Deployment arrgh.js - Bringing LINQ to JavaScript Object-Oriented Programming in C# Succinctly

                          realJSOP wrote (#17):

                          Combining XmlReader and LINQ to XML, the memory consumption never goes above 350 MB, and it takes about 45 minutes to run through the sample files (this includes adding the data to the database one record at a time, 426,000 records in all). When I add a dash of TPL, it only takes about 9 minutes to process the same three files. I think I could get it even faster if I inserted multiple records per query, but I'm tired of dickin' with it.
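
The "multiple records per query" idea can be sketched as below. This is a hedged illustration only: the thread does not say which database is involved, so SQLite (via Python's standard `sqlite3` module) stands in for it, and the `finding` table, its columns, and the batch size are all assumptions.

```python
import sqlite3


def chunks(rows, size):
    """Group an iterable of rows into lists of at most `size` rows."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def load(conn, rows, batch_size=1000):
    """Insert rows in batches, one transaction per batch, not one per record."""
    conn.execute("CREATE TABLE IF NOT EXISTS finding (host TEXT, port INTEGER)")
    for batch in chunks(rows, batch_size):
        with conn:  # commits when the block exits without an error
            conn.executemany("INSERT INTO finding (host, port) VALUES (?, ?)", batch)


conn = sqlite3.connect(":memory:")  # stand-in database, not the poster's server
rows = ((f"host-{i % 10}", 1000 + i) for i in range(2500))
load(conn, rows, batch_size=1000)
count = conn.execute("SELECT COUNT(*) FROM finding").fetchone()[0]
print(count)
```

Because `rows` is a generator, this pairs naturally with a streaming parser: records are produced and written in batches without ever holding the whole file in memory. How much time batching saves depends on the database and the network, so measure it on the real setup.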

                            realJSOP wrote (#18):

                            OriginalGriff wrote:

                            Have these people never heard of databases?

                            That's our job. :) Got memory consumption down to no more than 350 MB, and it only takes 9 minutes to process my three sample files, for a total of 426,000 records. I'm going to look awesome on Tuesday. Upside: this app replaces a large Perl script that was doing the same job, and everyone in the shop can maintain it because - well - it's not Perl. :)

                              realJSOP wrote (#19):

                              Sander Rossel wrote:

                              For example, the programmer who gives you 3.5 GB of XML in a single file probably says the same

                              We don't get the files from programmers - we get them from security nazis.

                                realJSOP wrote (#20):

                                It ain't my design, and it won't be changing to anything better.

                                  Mycroft Holmes wrote (#21):

                                  John Simmons / outlaw programmer wrote:

                                  we get them from security nazis.

                                  And on the other side of that barrier there is some poor sod producing the xml. Or it was designed in the 90s and they refuse to even consider changing something that works - sort of.

                                  Never underestimate the power of human stupidity - RAH I'm old. I know stuff - JSOP

                                    realJSOP wrote (#22):

                                    A scan tool called Nessus generates the file. I know nothing about it, or its configurability where file generation is concerned.

                                      OriginalGriff wrote (#23):

                                      More likely it was designed in the 90's when the log data was small and (the then new and cutting edge) XML made some sense. But ... the developer who wrote that moved on, and file formats are boring, so the new guy just tested it worked in small scale and worked on the sexier stuff. And now ... intrusion / vulnerability data has grown like everything else and it's just a silly decision with hindsight.

                                        PIEBALDconsult wrote (#24):

                                        The 6GB XML file I have to read is the backup of a third-party system. It's not only XML, but it's all name/value pairs.

                                          PIEBALDconsult wrote (#25):

                                          Well, for the most part I'm limited to built-in SSIS components. Potentially I could write something custom, as I have for JSON and CSV files (ones which aren't stable enough for the flat-file components).
