Code Project

I'm feeling "diverse" today

Forum: The Lounge
Tags: regex, json, csharp, c++, iot
28 Posts, 7 Posters
  • M megaadam

    I dunno but without knowing all your requirements I would consider the most "simple-dumb" way:

    Search for the json-key including quotes
    Search for ":" is an optional bonus, not strictly needed
    Extract the string between the subsequent pair of double-quotes.
    Wrap in a function

    Of course it fails for corrupt json, but the regex-based state machine would also fail.
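The steps above can be sketched roughly as follows, assuming the whole response already sits in a `std::string`. The function name `extract_string_value` is hypothetical, and the sketch ignores escaped quotes inside the value, so treat it as an illustration of the "simple-dumb" idea rather than a JSON parser.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: finds "key" (with its quotes), then the next ':',
// then copies whatever sits between the following pair of double quotes.
// Assumption: the value is a JSON string with no escaped quotes in it.
bool extract_string_value(const std::string& json,
                          const std::string& key,
                          std::string& out) {
    const std::string quoted_key = "\"" + key + "\"";
    std::size_t pos = json.find(quoted_key);
    if (pos == std::string::npos) return false;
    pos += quoted_key.size();

    // The "optional bonus" step: insist on a ':' after the key.
    pos = json.find(':', pos);
    if (pos == std::string::npos) return false;

    const std::size_t open = json.find('"', pos + 1);
    if (open == std::string::npos) return false;
    const std::size_t close = json.find('"', open + 1);
    if (close == std::string::npos) return false;

    out = json.substr(open + 1, close - open - 1);
    return true;
}
```

This only works once the data is in memory, which is the limitation raised later in the thread.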

    "If we don't change direction, we'll end up where we're going"

    honey the codewitch
    wrote on last edited by
    #9

    "search for the json key" That's where you hid your complexity behind few words. That's what DFA state machine takes care of. Short of that, I'd need to loop, and then within that loop, I need to fetch each character of the key i'm hunting until i fail, at which point i continue the outer loop. That's what the DFA code does. That's exactly what it does. ETA: All of this was for naught because I found out the Arduino Stream implementation has a find() method. *headdesk*

    To err is human. Fortune favors the monsters.

    • K klinkenbecker

      Yes, that is way too big :) Specifically, I'm not sure of the exact size of the parser, but we routinely order map entries by size and JSON never gets anywhere near the top. Top flash hogs are radio (3k), print engine (2k), events (2.5k), class engine (2.6k), object engine (3.8k) (8051 numbers). Together they are ~60% of the ~20k flash for the OS. My 'guesstimate' size delta was based on looking at your regex code which, at first blush, looked much more complex than our JSON parser (excluding the binary JSON part). We parse JSON 'in place' in the buffer it came in on, we don't use a heap (anywhere) and we don't create a 'document', generally jumping straight to methods. Since the radios we use typically have a 128-byte max frame size, JSON is typically very constrained. Everything else is managed on the (2k) stack. Having seen your other work, I know you have thought about the problem very carefully. Just saying our mileage is different, mostly because we have bounded the problem in very specific ways that are generic to our (IoT) domain. It is often not possible to do that when attempting to solve the 'unbounded' problem for 'everyone'. Effectively managing embedded constraints is one of the reasons why embedded resists unbounded solutions and their inevitable inclusion of unnecessary code (for any given specific instance). :)

      honey the codewitch
      wrote on last edited by
      #10

      To be clear, my flash space is all being used by other libraries - not this state machine. I was just posting it to give you an idea of where I'm at in terms of what I've used so far.

      To err is human. Fortune favors the monsters.

        megaadam
        wrote on last edited by
        #11

        I assumed you have access to std::string::find(). Not so much complexity IMO...

        "If we don't change direction, we'll end up where we're going"

        • H honey the codewitch

          I wanted to load some JSON from a couple of web-based APIs to get the local time and weather. The trouble is I didn't actually want to use JSON, because that's a nasty dependency. What about regex? There's no regex engine readily available to my IoT widget. Well, traversing a DFA state table that represents a regular expression in C++ is almost trivial. Generating that table is not. But I have a regex engine in C#, and it's capable of generating that table. So I whip up a little C# program to generate a C++ array representing the "DFA table" for a regular expression - basically the opcodes it needs to match the expression. And then some C++ code to traverse it. 3 languages: Regex, C++, C#. To "parse" a fourth, a JSON subset. In a really compact way. And I'm not even doing front-end web development - just a REST/JSON client.

          To err is human. Fortune favors the monsters.

          Daniel Pfeffer
          wrote on last edited by
          #12

          honey the codewitch wrote:

          3 languages: Regex, C++, C#. To "parse" a fourth, a JSON subset

          This brings to mind the children's song about the old lady who swallowed a fly.

          The last two verses are:

          I know an old lady
          Who swallowed a cow.
          I don't know how
          She swallowed a cow
          She swallowed the cow to catch the dog
          What a hog to swallow a dog!
          She swallowed the dog to catch the cat
          Fancy that! To swallow a cat!
          She swallowed the cat to catch the bird
          How absurd, to swallow a bird!
          She swallowed the bird to catch the spider
          That wriggled and tickled inside her.
          She swallowed the spider to catch the fly.
          I don't know why she swallowed a fly.
          Perhaps she'll die...

          I know an old lady who swallowed a horse!
          She's dead, of course.

          Freedom is the freedom to say that two plus two make four. If that is granted, all else follows. -- 6079 Smith W.

            megaadam
            wrote on last edited by
            #13

            And for standard C there is of course always strstr(), which I assume you know.

            "If we don't change direction, we'll end up where we're going"

              honey the codewitch
              wrote on last edited by
              #14

              strstr() only works for in-memory strings, not streams.

              To err is human. Fortune favors the monsters.

                honey the codewitch
                wrote on last edited by
                #15

                That only works for in-memory strings.

                To err is human. Fortune favors the monsters.

                  klinkenbecker
                  wrote on last edited by
                  #16

                  It is an interesting approach and, as always, I will be very keen to see how it compares when it's finished.

                    klinkenbecker
                    wrote on last edited by
                    #17

                    One of the nice things about embedded is that generally, if you have adequate implementation, unit testing and system testing, you can safely assume your input will not be corrupt. I.e. embedded implementations can be fully and specifically bounded and 'gated' in such a way as to avoid input errors - frames can be error checked, etc, etc. Moving data errors into places they can be easily managed is a key piece of making 'engines' more efficient. The exact same paradigm is the way a car is built - or better example - a boat. You would never build an engine for a boat to be able to take water in the fuel. You 'move' the error handling (water in the fuel) to an input qualifying filter. Fuel filters for boats and cars are very different beasts, the engines are (fundamentally) the same.
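As a small illustration of the "input qualifying filter" idea, here is a sketch of a gate that only hands a frame to the parser after a checksum passes. Everything specific in it is an assumption made for the example: the frame layout (payload followed by one XOR checksum byte), the names `xor_checksum` and `accept_frame`, and the choice of XOR rather than a real CRC. The 128-byte limit echoes the max frame size mentioned earlier in the thread.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed limit, taken from the 128-byte max frame size mentioned in the thread.
constexpr std::size_t kMaxFrame = 128;

// XOR of all bytes; a stand-in for whatever error check the link layer really uses.
std::uint8_t xor_checksum(const std::uint8_t* data, std::size_t len) {
    std::uint8_t sum = 0;
    for (std::size_t i = 0; i < len; ++i) sum ^= data[i];
    return sum;
}

// Hypothetical gate: the last byte of the frame is the checksum of the rest.
// Returns true and reports the payload length only for frames that pass,
// so the parser behind it never has to deal with corrupt input.
bool accept_frame(const std::uint8_t* frame, std::size_t len,
                  std::size_t& payload_len) {
    if (len < 2 || len > kMaxFrame) return false;
    if (xor_checksum(frame, len - 1) != frame[len - 1]) return false;
    payload_len = len - 1;
    return true;
}
```

The parser then runs only on frames for which `accept_frame` returned true, which is the fuel filter in front of the engine.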

                      honey the codewitch
                      wrote on last edited by
                      #18

                      I ditched it altogether! I found out the Arduino Stream class has a find() method which will allow you to find a string within a stream (without having to load it into a string first and use strstr()). So much for all this effort, although I will need to use something like the DFA machine to grab JSON from a weather service. The issue with that is fields can be in any order, so I either load the fields into memory or I use a DFA lexer. I'd rather use the lexer.
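For readers without an Arduino at hand, the kind of search a stream-level find has to perform can be sketched portably. This is not the Arduino implementation: `read_fn`, `stream_find` and `read_cstr` are hypothetical names, the reader callback returning -1 at end of input is an assumption, and the Knuth-Morris-Pratt failure table is a choice made for this sketch so that no character ever has to be re-read from the stream.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumed reader shape: returns the next byte, or -1 when the stream ends.
using read_fn = int (*)(void*);

// Consumes the stream until `key` has been seen. Returns true on a match,
// leaving the stream positioned just after the key.
bool stream_find(read_fn read, void* state, const std::string& key) {
    if (key.empty()) return true;

    // fail[i] = length of the longest proper prefix of key[0..i]
    // that is also a suffix of it (the KMP failure table).
    std::vector<std::size_t> fail(key.size(), 0);
    for (std::size_t i = 1, k = 0; i < key.size(); ++i) {
        while (k > 0 && key[i] != key[k]) k = fail[k - 1];
        if (key[i] == key[k]) ++k;
        fail[i] = k;
    }

    std::size_t matched = 0;  // how many characters of key match so far
    for (int ch = read(state); ch != -1; ch = read(state)) {
        const char c = static_cast<char>(ch);
        while (matched > 0 && c != key[matched]) matched = fail[matched - 1];
        if (c == key[matched]) ++matched;
        if (matched == key.size()) return true;
    }
    return false;
}

// Desktop stand-in for a stream: reads from a NUL-terminated string.
// `state` points at a `const char*` cursor that is advanced on each read.
int read_cstr(void* state) {
    const char** cursor = static_cast<const char**>(state);
    if (**cursor == '\0') return -1;
    return static_cast<unsigned char>(*(*cursor)++);
}
```

After a successful `stream_find`, the next reads return whatever follows the key, so a caller can go straight on to reading the value.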

                      To err is human. Fortune favors the monsters.

                        klinkenbecker
                        wrote on last edited by
                        #19

                        I find that coding effort is very rarely, if ever, wasted; it goes into a black hole and comes out as ultra-energetic gamma radiation at some later date. 15 years ago, I wrote a (micro) JS server to run applications via browser on any platform, with plug-ins to pull data from misc devices/websites. I got sidetracked and just had cause to go back to it. I wrote that in C and it will be much simpler in C# now (and much more x-platform), but it is still a good architectural reference point. Never wasted, the neurons are just better configured for next time...

                          Member 9167057
                          wrote on last edited by
                          #20

                          Why not parse JSON in C# directly? That's a dependency on .NET's standard runtime library, which ain't too bad.

                            honey the codewitch
                            wrote on last edited by
                            #21

                            Because first it would mean upgrading the SRAM on my device to something more than 512kB. Then it would involve upgrading the processor to something in the GHz range. And heck, it would involve adding a PC in there somewhere to actually run .NET. This is not a .NET device.

                            To err is human. Fortune favors the monsters.

                              Member 9167057
                              wrote on last edited by
                              #22

                              Things are getting interesting! What do you compile C# down to, and how? Can I imagine what you're doing to be similar to what the Unity developers are doing (compiling C# to C++ which then gets compiled to native code)?

                                honey the codewitch
                                wrote on last edited by
                                #23

                                I'm just basically using C# to generate an array for my C++ code to traverse. The C# code is a console application. I feed it a regular expression on the command line and it produces a small amount of C++ code to declare an array as its output - for example:

                                int16_t dfa_table[] = {
                                -1, 1, 6, 1, 34, 34, -1, 1, 12, 1, 117, 117, -1, 1, 18,
                                1, 110, 110, -1, 1, 24, 1, 105, 105, -1, 1, 30, 1, 120, 120,
                                -1, 1, 36, 1, 116, 116, -1, 1, 42, 1, 105, 105, -1, 1, 48,
                                1, 109, 109, -1, 1, 54, 1, 101, 101, -1, 1, 60, 1, 34, 34,
                                -1, 1, 66, 1, 58, 58, 0, 0
                                };

                                That's a DFA table. What it is is a state machine encoded into an array. I have C++ code that can walk it in order to run the regular expression. The walking code is easy and efficient. Generating the array is not easy. That C# console application uses a regular expression engine I wrote (in C#) in order to generate that C++ array. The code to run the regular expression is simple and is in C++:

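                                #include <stdint.h>

                                // Runs the DFA over characters pulled from read_cb until a match is found
                                // or the input ends. read_cb returns the next character, or -1 at end of input.
                                bool match(const int16_t* dfa, int16_t(read_cb)(void*), void* cb_state = nullptr) {
                                    int tlen;   // number of transitions out of the current state
                                    int tto;    // table index of the transition's target state
                                    int prlen;  // number of (min, max) ranges on the transition
                                    int pmin;
                                    int pmax;
                                    int i;
                                    int j;
                                    int ch;
                                    int state = 0;  // index into dfa of the current state
                                    bool done;
                                    bool found = false;
                                    int acc = -1;   // accept id of the current state, -1 if not accepting
                                    ch = read_cb(cb_state);
                                    while (ch != -1) {
                                        acc = -1;
                                        done = false;
                                        while (!done) {
                                        start_dfa:
                                            done = true;
                                            acc = dfa[state++];
                                            tlen = dfa[state++];
                                            for (i = 0; i < tlen; ++i) {
                                                tto = dfa[state++];
                                                prlen = dfa[state++];
                                                for (j = 0; j < prlen; ++j) {
                                                    pmin = dfa[state++];
                                                    pmax = dfa[state++];
                                                    if (ch < pmin) {
                                                        // The early exit assumes sorted ranges, so none of the
                                                        // remaining ones can match. Skip them so that state
                                                        // lands on the next transition rather than mid-range.
                                                        state += 2 * (prlen - j - 1);
                                                        break;
                                                    }
                                                    if (ch <= pmax) {
                                                        found = true;
                                                        ch = read_cb(cb_state);
                                                        state = tto;
                                                        done = false;
                                                        goto start_dfa;
                                                    }
                                                }
                                            }
                                        }
                                        if (acc != -1) {
                                            return found;
                                        }
                                        // No transition matched: restart from the first state on the next character.
                                        ch = read_cb(cb_state);
                                        state = 0;
                                    }
                                    return false;
                                }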
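To make the array layout in this post easier to follow, here is a small desktop-side sketch that walks the same table and prints one line per state. The layout it assumes is the one the `match` function reads: an accept id (-1 for non-accepting), a transition count, then for each transition a target index, a range count, and that many (min, max) pairs. The function name `describe_dfa` and the output format are made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// The table from the post, copied verbatim.
const std::int16_t dfa_table[] = {
    -1, 1, 6, 1, 34, 34, -1, 1, 12, 1, 117, 117, -1, 1, 18,
    1, 110, 110, -1, 1, 24, 1, 105, 105, -1, 1, 30, 1, 120, 120,
    -1, 1, 36, 1, 116, 116, -1, 1, 42, 1, 105, 105, -1, 1, 48,
    1, 109, 109, -1, 1, 54, 1, 101, 101, -1, 1, 60, 1, 34, 34,
    -1, 1, 66, 1, 58, 58, 0, 0
};

// Hypothetical helper: renders each state as
//   state@<index> accept=<id> [<min>-<max>]-><target> ...
// Assumes a well-formed table whose range bounds are printable characters.
std::string describe_dfa(const std::int16_t* dfa, std::size_t len) {
    std::string out;
    std::size_t i = 0;
    while (i + 1 < len) {
        out += "state@" + std::to_string(i) + " accept=" + std::to_string(dfa[i]);
        const int tlen = dfa[i + 1];
        i += 2;
        for (int t = 0; t < tlen; ++t) {
            const int target = dfa[i];
            const int prlen = dfa[i + 1];
            i += 2;
            for (int r = 0; r < prlen; ++r) {
                out += " [";
                out += static_cast<char>(dfa[i]);
                out += '-';
                out += static_cast<char>(dfa[i + 1]);
                out += "]";
                i += 2;
            }
            out += "->" + std::to_string(target);
        }
        out += '\n';
    }
    return out;
}
```

Read this way, the table is a chain of eleven single-character states (character codes 34, 117, 110, 105, 120, 116, 105, 109, 101, 34, 58) followed by an accepting state at index 66, so it matches the literal text `"unixtime":`.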

                                To err is human. Fortune favors the monsters.

                                  Member 9167057
                                  wrote on last edited by
                                  #24

                                  Ah, I get it. Thank you for the thorough explanation :)

                                    honey the codewitch
                                    wrote on last edited by
                                    #25

                                    No problem. I don't target .NET these days, as I spend most of my time with tiny little gadgets. To me .NET is a means to an end. If I can use it to offload some of the heavy lifting my code would otherwise have to do I'll do that, of course. Otherwise I haven't really used it recently. I do have a lot of code I've written in C#, and used to use it professionally - I still will every once in a while, but it's not my bread and butter anymore. The regular expression thing is a great example of being able to use it to offload work though - where an actual regular expression engine on the device would take up precious RAM and flash space, I was able to "outsource it" to an external C# app I only need to run once.

                                    To err is human. Fortune favors the monsters.

                                      Member 9167057
                                      wrote on last edited by
                                      #26

                                      I'd go as far as to claim that everything is a means to an end. Programming something embedded in C++ is still a means to an end :p

                                        honey the codewitch
                                        wrote on last edited by
                                        #27

                                        Ha! Okay that's fair. But in this case, I mean I'll use it to facilitate my IoT development, rather than use it to make applications that are meant to be used standalone or even libraries from C#.

                                        To err is human. Fortune favors the monsters.

                                          maze3
                                          wrote on last edited by
                                          #28

                                          Sounds awesome. When reading the comments, once people start talking about tens of kilobytes, :(( :sigh: I know I'm way out of my depth. I tried to figure out the library and runtime differences between Regex and JSON and didn't get anywhere, so I'll just fall back on Regex being many times older, with much more work put into optimising and minimising it. The JSON format has been great for balancing human readability against the redundancy of XML, but where computers matter, bits are all they care about.
