Did you know...

  • L Lost User

    OriginalGriff wrote:

    On the bright side, at least it's not XLSX?

    Dunno, XLSX isn't so bad, and it's easier (OK, lazier) to debug: if there are bad data elements, just load it into Excel and scroll down to the line with the issue. If you're suggesting interop (i.e. slower than molasses), that's a completely different issue, and there are way, way faster [read & write] alternatives. Worst comes to worst, you can unpack the XLSX and voilà, it's XML (pretty much exactly the same). (Not criticizing, just unsure why you think it's any worse.)


    OriginalGriff
    wrote on last edited by
    #14

    Have you ever tried to load 1GB into Excel? :omg: (And bear in mind that XLSX is packaged, zipped, XML - and thus slower and more memory hungry than "naked" XML)
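For readers who want to check the "zipped XML" point themselves, here is a minimal Python sketch. It is not anyone's code from this thread: `xml_parts` and `make_demo_package` are hypothetical helper names, and the demo archive only mimics the layout of an .xlsx package, so it is not a valid workbook.

```python
import io
import zipfile


def xml_parts(package_bytes):
    """Return the names of the XML members inside a zip-based package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        return sorted(name for name in zf.namelist() if name.endswith(".xml"))


def make_demo_package():
    """Build a tiny in-memory zip that mimics the layout of an .xlsx file.

    Assumption: this is a stand-in for illustration, not a valid workbook.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("xl/workbook.xml", "<workbook/>")
        zf.writestr("xl/worksheets/sheet1.xml", "<worksheet><sheetData/></worksheet>")
    return buf.getvalue()


if __name__ == "__main__":
    for name in xml_parts(make_demo_package()):
        print(name)
```

Pointing `xml_parts` at the bytes of a real .xlsx file should list its XML parts in the same way, which is also why reading one costs a decompression step on top of the XML parsing.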

    Sent from my Amstrad PC 1640 Never throw anything away, Griff Bad command or file name. Bad, bad command! Sit! Stay! Staaaay... AntiTwitter: @DalekDave is now a follower!

    "I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
    "Common sense is so rare these days, it should be classified as a super power" - Random T-shirt

    • R realJSOP

      I just wanted to see what was in it. :) They range from 1 MB up to about 3.5 GB. I have no control over how large the files to be processed are (they're generated by Nessus security scans). The idiots that generate the files are completely unwilling to accommodate us, so it's essentially an "it is what it is" situation. I have to parse these files and store the results in our database. Using just XDocument, I was running out of memory (the server in question only has 8 GB, most of which is already used by other processes), so I had to resort to a combination of XmlReader and LinqToXml. Notepad, IE, Firefox, WordPad, and MS Word all load the file, but it takes more than five MINUTES for them, and WordPad/Word become completely unusable. <rant> I wish people here (not you but some others) would stop f*ckin assuming I'm a rookie programmer. I have more years in the industry than most people on CP have even been alive. </rant>

      ".45 ACP - because shooting twice is just silly" - JSOP, 2010
      -----
      You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010
      -----
      When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013

      Jan R Hansen
      wrote on last edited by
      #15

      I'd recommend UltraEdit. You can disable the "make automatic backups when opening files" option, and then you can open and work with very large files. Fast. That feature, plus a built-in hex editor that lets me see everything, including BOM bytes in files, makes it worth the license fee. Just in case you didn't know it and needed something better than Notepad and Notepad++ for large files :)

      Do you know why it's important to make fast decisions? Because you give yourself more time to correct your mistakes, when you find out that you made the wrong one. Chris Meech on deciding whether to go to his daughters graduation or a Neil Young concert

        Sander Rossel
        wrote on last edited by
        #16

        John Simmons / outlaw programmer wrote:

        I wish people here (not you but some others) would stop f*ckin assuming I'm a rookie programmer.

        Just like at work, other people mess up, but you get the blame!

        John Simmons / outlaw programmer wrote:

        I have more years in the industry than most people on CP have even been alive.

        That's no guarantee for actually being a good programmer. For example, the programmer who gives you 3.5 GB of XML in a single file probably says the same :rolleyes:

        Best, Sander sanderrossel.com Continuous Integration, Delivery, and Deployment arrgh.js - Bringing LINQ to JavaScript Object-Oriented Programming in C# Succinctly

        • P PIEBALDconsult

          That's a problem with XML; it has to be read in its entirety before you can do anything with it. At work I receive a 6GB XML file every stinking day and I have to use SSIS to get it into a database. I'm beginning to prefer JSON, which I can read one object at a time (provided the outermost value is an array of objects). However, I have written a fairly simple XML file splitter so I can make smaller files from one big one when I need to find out where a problem (e.g. non-well-formed XML) exists.
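The splitter itself is not shown in the thread, so the following is only a rough Python sketch of the general idea, under loud assumptions: the records are direct children of the root element, the tag name `record` and the `<chunk>` wrapper are made up for illustration, and `split_records` is a hypothetical name. On input that is not well formed, `iterparse` raises a `ParseError` at the offending position, so this covers only the splitting half of the job.

```python
import io
import xml.etree.ElementTree as ET


def split_records(xml_source, tag="record", per_chunk=2):
    """Yield XML strings, each wrapping up to per_chunk <tag> elements.

    Assumes every <tag> element is a direct child of the document root.
    """
    chunk = []
    context = ET.iterparse(xml_source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            elem.tail = None  # drop inter-element whitespace before serializing
            chunk.append(ET.tostring(elem, encoding="unicode"))
            root.clear()  # release finished records so memory stays flat
            if len(chunk) == per_chunk:
                yield "<chunk>" + "".join(chunk) + "</chunk>"
                chunk = []
    if chunk:
        yield "<chunk>" + "".join(chunk) + "</chunk>"


if __name__ == "__main__":
    demo = io.BytesIO(b"<root><record id='1'/><record id='2'/><record id='3'/></root>")
    for number, text in enumerate(split_records(demo), start=1):
        print(number, text)
```

Each yielded string could be written to its own file; a real splitter would likely take the chunk size in records or bytes as a parameter.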

          realJSOP
          wrote on last edited by
          #17

          Combining XmlReader and LinqToXML, the memory consumption never goes above 350mb, and it takes about 45 minutes to run through the sample files (this includes adding the data to the database one record at a time, 426,000 records in all). When I add a dash of TPL, it only takes about 9 minutes to process the same three files. I think I could get it even faster if I inserted multiple records per query, but I'm tired of dickin' with it.
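The poster's C# code is not in the thread, so as an illustration only, here is a Python analogue of the same pattern: stream the file one element at a time instead of loading the whole document, and insert rows in batches rather than one per query. Everything specific here is an assumption: the `record` tag, its `id` attribute and `value` child, the `records` table, the use of SQLite, and the function name `load_records`. It uses `xml.etree.ElementTree.iterparse` rather than XmlReader and LINQ to XML, and it does not attempt the TPL parallelism.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def load_records(xml_source, conn, tag="record", batch_size=1000):
    """Stream <tag> elements into SQLite in batches; return the row count.

    Assumes every <tag> element is a direct child of the document root.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS records (id TEXT, value TEXT)")
    batch, total = [], 0
    context = ET.iterparse(xml_source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            batch.append((elem.get("id"), elem.findtext("value")))
            root.clear()  # release finished records so memory stays flat
            if len(batch) >= batch_size:
                # One executemany call per batch instead of one INSERT per record.
                conn.executemany("INSERT INTO records VALUES (?, ?)", batch)
                total += len(batch)
                batch.clear()
    if batch:  # flush whatever is left after the last full batch
        conn.executemany("INSERT INTO records VALUES (?, ?)", batch)
        total += len(batch)
    conn.commit()
    return total


if __name__ == "__main__":
    sample = b"<root><record id='1'><value>a</value></record></root>"
    connection = sqlite3.connect(":memory:")
    print(load_records(io.BytesIO(sample), connection))
```

The batch size is a tuning knob: larger batches mean fewer round trips to the database at the cost of holding more rows in memory at once.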

          • OriginalGriffO OriginalGriff

            Ouch! That's a stupid amount of data, particularly for a text-based transfer mechanism. Have these people never heard of databases? On the bright side, at least it's not XLSX? :laugh:

            Sent from my Amstrad PC 1640 Never throw anything away, Griff Bad command or file name. Bad, bad command! Sit! Stay! Staaaay... AntiTwitter: @DalekDave is now a follower!

            realJSOP
            wrote on last edited by
            #18

            OriginalGriff wrote:

            Have these people never heard of databases?

            That's our job. :) Got memory consumption down to no more than 350mb and it only takes 9 minutes to process my three sample files, for a total of 426,000 records. I'm going to look awesome on Tuesday. Upside, this app replaces a large perl script that was doing the same job, and everyone in the shop can maintain it because - well - it's not perl. :)

              realJSOP
              wrote on last edited by
              #19

              Sander Rossel wrote:

              For example, the programmer who gives you 3.5 GB of XML in a single file probably says the same

              We don't get the files from programmers - we get them from security nazis.

              • D Dr Walt Fair PE

                Methinks you need to rethink your XML design

                CQ de W5ALT

                Walt Fair, Jr., P. E. Comport Computing Specializing in Technical Engineering Software

                realJSOP
                wrote on last edited by
                #20

                It ain't my design, and it won't be changing to anything better.

                  Mycroft Holmes
                  wrote on last edited by
                  #21

                  John Simmons / outlaw programmer wrote:

                  we get them from security nazis.

                  And on the other side of that barrier there is some poor sod producing the xml. Or it was designed in the 90s and they refuse to even consider changing something that works - sort of.

                  Never underestimate the power of human stupidity - RAH I'm old. I know stuff - JSOP

                    realJSOP
                    wrote on last edited by
                    #22

                    A scan tool called Nessus generates the file. I know nothing about it, or its configurability where file generation is concerned.

                      OriginalGriff
                      wrote on last edited by
                      #23

                      More likely it was designed in the 90's when the log data was small and (the then new and cutting edge) XML made some sense. But ... the developer who wrote that moved on, and file formats are boring, so the new guy just tested it worked in small scale and worked on the sexier stuff. And now ... intrusion / vulnerability data has grown like everything else and it's just a silly decision with hindsight.

                      • L Lost User

                        XML is not an export-format.

                        Bastard Programmer from Hell :suss: If you can't read my code, try converting it here[^] "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.

                        PIEBALDconsult
                        wrote on last edited by
                        #24

                        The 6GB XML file I have to read is the backup of a third-party system. It's not only XML, but it's all name/value pairs.

                          PIEBALDconsult
                          wrote on last edited by
                          #25

                          Well, for the most part I'm limited to built-in SSIS components. Potentially I could write something custom, as I have for JSON and CSV files (ones which aren't stable enough for the flat-file components).

                            Lost User
                            wrote on last edited by
                            #26

                            XML is an exchange-format. For backups it saves too much redundant information.

                              PIEBALDconsult
                              wrote on last edited by
                              #27

                              Don't tell me. I mostly use it for configuration.

                                Nelek
                                wrote on last edited by
                                #28

                                John Simmons / outlaw programmer wrote:

                                I'm going to look awesome on Tuesday.

                                That's only if the other morons people appreciate your work. It's not the first time that awesome tools that are real improvements get dumped because a couple of idiots co-workers say:
                                - We have always done it this way
                                - That is not going to work (without even giving it a try)
                                - Or similar crap arguments...
                                and don't even give a damned "Thank you".

                                M.D.V. ;) If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about? Help me to understand what I'm saying, and I'll explain it better to you Rating helpful answers is nice, but saying thanks can be even nicer.

                                  realJSOP
                                  wrote on last edited by
                                  #29

                                  I've already gotten thank-yous for this. They're grateful that they don't have to maintain that monster perl script anymore.

                                    realJSOP
                                    wrote on last edited by
                                    #30

                                    I'm sure they're suffering from the same thing we all have to deal with - management that doesn't (want to) see a reason to re-architect the app that generates the files.

                                    • R realJSOP

                                      VS(2017) has a maximum supported file size of 10mb? I just found out myself while trying to load a 925mb xml file.

                                      Richard Deeming
                                      wrote on last edited by
                                      #31

                                      Did you know you can increase that maximum, if you're feeling brave? :) See the Stack Overflow question "registry - Large XML Files in VS 2017 15.1". No idea whether you can increase it enough for such a ludicrously huge file, though.


                                      "These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer

                                        realJSOP
                                        wrote on last edited by
                                        #32

                                        Yeah, I knew that, and I imagine that if it's an integer value, int.MaxValue would be reasonable.
