I'd like to ask a question about JSON to get a feel for priorities of coders here

  • K Kent Sharkey

    This one? What's next for System.Text.Json? | .NET Blog[^]

    TTFN - Kent

    PIEBALDconsult
    wrote on last edited by
    #27

    I think so, but until I see it, I can't tell.

    • H honey the codewitch

      Let's say you wanted to write a fast JSON parser. You could do a pull parser that does well-formedness checking. Or you could do one that's significantly faster but skips well-formedness checking during search/skip operations, which can lead to later error reporting or missed errors. You can't make an option to choose one or the other, but you can avoid using the skip/search functions that do this in the latter case. Which do you do? Are you a stomp-the-pedal type or a defensive driver? (Seriously, this is more about getting a read of the room than anything - I want a feel for priorities.)

      Real programmers use butterflies

      Sander Rossel
      wrote on last edited by
      #28

      I'm pretty trusting. When someone says they're going to give me JSON, I assume they'll give me JSON. So I'd go for it and worry about validation when the party that should be giving me JSON isn't giving me JSON. So far that has worked pretty well. In practice, these kinds of things rarely break. You either get JSON or no JSON at all, but rarely (or even never) badly formed JSON.

      Best, Sander Azure DevOps Succinctly (free eBook) Azure Serverless Succinctly (free eBook) Migrating Apps to the Cloud with Azure arrgh.js - Bringing LINQ to JavaScript

        honey the codewitch
        wrote on last edited by
        #29

        I agree! :-D

        Real programmers use butterflies

        • J Jorgen Andersson

          Tell me when you make a parser for XML. I'm loading 80 GB into a database every week, and XML (or rather the built in tools) seriously isn't made for that.

          Wrong is evil and must be defeated. - Jeff Ello Never stop dreaming - Freddie Kruger

          PIEBALDconsult
          wrote on last edited by
          #30

          I load 51GB of XML with what SSIS has built-in. It takes about twelve minutes. I load 5GB of JSON with my own parser. It takes about eight minutes. I load 80GB of JSON with my own parser -- this dataset has tripled in size over the last month. It's now taking about five hours. These datasets are in no way comparable, I'm just comparing the size-on-disk of the files. I will, of course, accept that my JSON loader is a likely bottleneck, but I have nothing else to compare it against. It seemed "good enough" two years ago when I had a year-end deadline to meet. I may also be able to configure my JSON Loader to use BulkCopy, as I do for the 5GB dataset, but I seem to recall that the data wasn't suited to it. At any rate, I'm in need of an alternative, but it can't be third-party. Next year will be different.

          • H honey the codewitch

            So if it wasn't, you'd like to error as soon as you catch it, even if it meant a slower parse is what I'm hearing.

            Real programmers use butterflies

            Marc Clifton
            wrote on last edited by
            #31

            yes

            Latest Articles:
            Thread Safe Quantized Temporal Frame Ring Buffer

            • S Slacker007

              why are you not using Newtonsoft? Not sure why you are re-inventing the wheel here. :confused: NuGet Gallery| Newtonsoft.Json 12.0.3[^]

              John Stewien
              wrote on last edited by
              #32

              Some people have to work on air-gapped networks, where you cannot copy anything onto the network. It comes configured with a couple of approved things like the operating system, and whatever comes bundled with, say, Visual Studio 2015, and that's it. Nothing else gets in. With good reason too, e.g. see supply chain poisoning like the recent SolarWinds incident.

                Alexander Munro
                wrote on last edited by
                #33

                Since JSON is such a well-defined construct, simple parsers are very easy to write. I have a few. The nub is of course in 'a few'. It really comes down to the use case. If you know the data, a quick regex parser will do. Regex parsers are fundamentally flawed though, and tend to fail on large data sets containing mixed characters (locale is a pain). So, well-formedness is largely there already. Two-dimensional arrays only require a few lines of code. Multidimensional arrays just a few more. Large unknown datasets across languages? Use someone else's library and save yourself time.

                  User 13269747
                  wrote on last edited by
                  #34

                  Quote:

                  Or you could do one that's significantly faster but skips well formedness checking during search/skip operations, which can lead to later error reporting or missed errors

                  As with all input to your program, you validate on reception. All the other code that uses that input after that can then assume valid input and you can choose whatever shortcuts you want to on the assumption of valid input. Doesn't matter if the input is JSON, XML, key/value pairs from .ini files or tokens, you only validate it once on reception.
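
A minimal Python sketch of the "validate once on reception" idea, not taken from the thread. The function names `receive` and `total_quantity` and the `items`/`qty` document shape are hypothetical, chosen only for illustration. The boundary function does all the checking; the interior function takes shortcuts on the assumption that it only ever sees validated input.

```python
import json


def receive(raw: str) -> dict:
    """Boundary check: parse and validate once, or raise ValueError."""
    # json.JSONDecodeError is a subclass of ValueError, so malformed
    # JSON and a wrong document shape surface as the same error type.
    doc = json.loads(raw)
    items = doc.get("items") if isinstance(doc, dict) else None
    if not isinstance(items, list):
        raise ValueError("expected an object with an 'items' list")
    for item in items:
        if not isinstance(item, dict) or not isinstance(item.get("qty"), int):
            raise ValueError("every item needs an integer 'qty'")
    return doc


def total_quantity(doc: dict) -> int:
    # Interior code: trusts that receive() already validated the shape,
    # so it does no checking of its own.
    return sum(item["qty"] for item in doc["items"])
```

Everything downstream of `receive` can then skip defensive checks, which is the shortcut the post describes.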

                  • P PIEBALDconsult

                    Fortunately I'm not allowed to use third-party add-ins. I am awaiting access to the JSON support built into .net 4.7 and newer to see whether or not it can do what I require.

                    Reelix
                    wrote on last edited by
                    #35

                    If you're allowed to upgrade to .NET 5, they effectively implemented Newtonsoft's one natively with pretty much the identical syntax. Works really well, and you're not using third-party add-ins.

                    -= Reelix =-

                      Mehdi Gholam
                      wrote on last edited by
                      #36

                      The spec is pretty clear, so correctness and errors are clear. To be fast is another matter, see fastJSON - Smallest, Fastest Polymorphic JSON Serializer[^] and GitHub - simdjson/simdjson: Parsing gigabytes of JSON per second[^]

                      Exception up = new Exception("Something is really wrong."); throw up;

                        obeobe
                        wrote on last edited by
                        #37

                        A key question is what this parser will be used for. Is it for a hobby project or a production system? What would be the benefits of the higher performance? Will it be perceivable for human users? Will it save money by requiring less hardware? How much money? Is there an impact on the development effort? What is the impact on the resulting code in terms of maintainability? What would be the cost of choosing one option now and updating to the other option later? (Is it a full rewrite? Would it be simpler to go from A to B, or from B to A? etc.) What would be the cost of implementing both options and letting the user (well, caller) decide which one to use? There are many things to factor into this decision. Maybe different developers will give different weights to these considerations, and inexperienced developers will overlook some or all of them, but I believe that for most developers the answer would (and should) be "it depends on the details of the situation".

                        • H honey the codewitch

                          First of all this is a hypothetical. Second, hosting the .NET CLI in C++ just to use a .NET package from C++ to parse a little JSON seems heavy handed and horribly inefficient. Plus C# won't run on arduinos.

                          Real programmers use butterflies

                          Stuart Dootson
                          wrote on last edited by
                          #38

                          honey the codewitch wrote:

                          hosting the .NET CLI in C++ just to use a .NET package from C++ to parse a little JSON seems heavy handed and horribly inefficient.

                          If you're using C++, why not use a C++ JSON library such as [Modern JSON](https://github.com/nlohmann/json), [RapidJSON](https://rapidjson.org/) or [simdjson](https://simdjson.org/)? Or if you do develop your own library, you might be interested to look at [simdjson's 'On Demand' parsing approach...](https://github.com/simdjson/simdjson/blob/master/doc/ondemand.md)

                          Java, Basic, who cares - it's all a bunch of tree-hugging hippy cr*p

                            honey the codewitch
                            wrote on last edited by
                            #39

                            They use too much memory and can't target IoT. Of them, simdjson shows the most potential, but it still can't do an episodes query off of a tmdb.com show data dump in about 71 bytes.

                            Real programmers use butterflies

                              User 14060113
                              wrote on last edited by
                              #40

                              Stability over performance!

                                MGuerrieri
                                wrote on last edited by
                                #41

                                I take a function-first approach. You won't be able to parse the JSON if it's not well formed, so I would do that check first. If performance is poor, then I'd do a trace to find the bottlenecks and address them if possible. I wouldn't want to spend my time unnecessarily tracking down import errors.

                                  honey the codewitch
                                  wrote on last edited by
                                  #42

                                  I look at it this way - and keep in mind this is purely hypothetical: Let's say you're bulk uploading parts of some JSON out of a huge dataset. Almost always that JSON is machine generated, because who writes huge JSON by hand? Scanning through it quickly is important. If at some point you get a bad data dump, might it be better to roll back that update and then run a validator over the bad document that one time out of 1000 when it fails, rather than paying for that validation the other 999 times?
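
A rough Python sketch of that trade-off, under stated assumptions. The function names `count_top_level_items` and `load` are hypothetical, and counting the elements of a top-level array stands in for whatever the real search/skip operation would be. The fast scan tracks only string boundaries and nesting depth; the strict parser (`json.loads` from the standard library) runs only when the fast scan gives up.

```python
import json


def count_top_level_items(text: str) -> int:
    """Fast scan: count the elements of a top-level JSON array.

    Tracks only string boundaries and nesting depth. It does not check
    well-formedness, so input like '[1 2]' is accepted without complaint.
    """
    depth = 0
    commas = 0
    has_content = False
    in_string = False
    escaped = False
    for c in text:
        if in_string:
            if escaped:
                escaped = False
            elif c == "\\":
                escaped = True
            elif c == '"':
                in_string = False
            continue
        if c == '"':
            in_string = True
            has_content = True
        elif c in "[{":
            if depth >= 1:
                has_content = True
            depth += 1
        elif c in "]}":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced brackets")
        elif c == "," and depth == 1:
            commas += 1
        elif not c.isspace():
            has_content = True
    if depth != 0 or in_string:
        raise ValueError("unbalanced brackets or unterminated string")
    return commas + 1 if has_content else 0


def load(text: str) -> int:
    """Optimistic load: fast scan first, strict validation only on failure."""
    try:
        return count_top_level_items(text)
    except ValueError:
        # Rare path: run the strict parser once to get a precise error
        # (json.JSONDecodeError carries line and column information).
        json.loads(text)
        raise
```

The cost shows up in the docstring: `'[1 2]'` is malformed, yet the fast scan returns a count for it. That is the "missed errors" half of the original question, and it is why the caller needs a way to roll back when a later stage trips over the bad data.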

                                  Real programmers use butterflies

                                    Mark Meuer
                                    wrote on last edited by
                                    #43

                                    As a general rule, I try to follow these steps in order: 1. Make the program run right. 2. Make the program run right. 3. Make the program run right. 4. If I really need to, make it faster.

                                      honey the codewitch
                                      wrote on last edited by
                                      #44

                                      That works to a point, but certain design decisions for performance must be made up front. For example, deciding to use a pull parser as the primary way of navigation rather than an in-memory tree.
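
To make the distinction concrete, here is a small Python sketch, not from the thread. The names `tokens` and `first_value_for_key` are hypothetical, and the regex tokenizer is a toy that assumes reasonably clean input. The pull style hands the caller one token at a time and can stop as soon as it has what it needs; the tree style (`json.loads`) builds the whole document in memory before any lookup.

```python
import json
import re
from typing import Any, Iterator, Tuple

# One token: either a punctuation character or a scalar
# (string, number, true, false, null).
TOKEN = re.compile(
    r'\s*(?:(?P<punct>[\[\]{}:,])'
    r'|(?P<scalar>"(?:[^"\\]|\\.)*"|-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?'
    r'|true|false|null))'
)


def tokens(text: str) -> Iterator[Tuple[str, Any]]:
    """Pull-style reader: yields one (kind, value) pair per request."""
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            if text[pos:].strip() == "":
                return  # only trailing whitespace left
            raise ValueError(f"unexpected character at offset {pos}")
        pos = m.end()
        if m.group("punct"):
            yield ("punct", m.group("punct"))
        else:
            # Reuse the standard decoder for a single scalar token.
            yield ("scalar", json.loads(m.group("scalar")))


def first_value_for_key(text: str, key: str) -> Any:
    """Return the first scalar value stored under `key`, or None.

    Stops reading as soon as the value is found; nothing after that
    point in the document is tokenized. Scalar values only.
    """
    it = tokens(text)
    for kind, value in it:
        if kind == "scalar" and value == key:
            if next(it, None) == ("punct", ":"):
                return next(it, (None, None))[1]
    return None
```

With `text = '{"id": 7, "name": "pilot", "tags": ["a", "b"]}'`, both `first_value_for_key(text, "name")` and `json.loads(text)["name"]` give the same answer, but only the second materializes the `tags` array. Retrofitting the first style onto code written against the second is the kind of change that is hard to make late.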

                                      Real programmers use butterflies

                                      • R Reelix

                                        If you're allowed to upgrade to .NET 5, they effectively implemented Newtonsoft's one natively with pretty much the identical syntax. Works really well, and you're not using third-party add-ins.

                                        -= Reelix =-

                                        PIEBALDconsult
                                        wrote on last edited by
                                        #45

                                        Yup, looking forward to it. Not holding my breath. It doesn't help that my boss read a blog that said that Microsoft is abandoning .net ( :sigh: ). Middle-managers will believe anything if it's in a blog. I countered with a link to Microsoft's road map for the future of .net, but the damage was already done.

                                        • M Mark Meuer

                                          As a general rule, I try to follow these steps in order: 1. Make the program run right. 2. Make the program run right. 3. Make the program run right. 4. If I really need to, make it faster.

                                          PIEBALDconsult
                                          wrote on last edited by
                                          #46

                                          4a. You always need to make it faster.
