Performance of Switch case vs dictionary with delegates

Jorgen Andersson
#1

Imagine you have a switch with anywhere between 3 and a silly number of comparisons; at what size does a lookup dictionary with delegates become faster? I'm fully aware there is no exact answer to this, I just want some elaboration on what affects the performance.

    Wrong is evil and must be defeated. - Jeff Ello
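
Since there is no exact answer, the honest way to find the crossover for a given workload is to measure it. Below is a minimal BenchmarkDotNet sketch comparing the two dispatch strategies; the keys, the handlers, and the Lookup dictionary are hypothetical placeholders invented for illustration, not anything from this thread.

// Minimal BenchmarkDotNet sketch (hypothetical keys and handlers, purely for illustration).
using System;
using System.Collections.Generic;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class SwitchVsDictionary
{
    // Made-up keys sharing a long common prefix, as described later in the thread.
    private static readonly string[] Keys =
    {
        "CommonPrefixForEachAlpha",
        "CommonPrefixForEachBravo",
        "CommonPrefixForEachCharlie",
    };

    private static readonly Dictionary<string, Func<int>> Lookup = new()
    {
        ["CommonPrefixForEachAlpha"] = () => 1,
        ["CommonPrefixForEachBravo"] = () => 2,
        ["CommonPrefixForEachCharlie"] = () => 3,
    };

    [Benchmark(Baseline = true)]
    public int Switch()
    {
        int sum = 0;
        foreach (string key in Keys)
        {
            sum += key switch
            {
                "CommonPrefixForEachAlpha" => 1,
                "CommonPrefixForEachBravo" => 2,
                "CommonPrefixForEachCharlie" => 3,
                _ => 0,
            };
        }
        return sum;
    }

    [Benchmark]
    public int DictionaryLookup()
    {
        int sum = 0;
        foreach (string key in Keys)
        {
            sum += Lookup.TryGetValue(key, out var handler) ? handler() : 0;
        }
        return sum;
    }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<SwitchVsDictionary>();
}

In practice the crossover depends heavily on the key type, the number of cases, and how the keys differ, which is what the rest of the thread digs into.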

trønderen
#2

Let me add to that question, not using a lookup dictionary but a 2D array. I regularly see people claim that they program in a 'state machine' fashion (breaking numerous rules for state machine programming, but that's not the question here), essentially as a switch or sequence of if-else on the event, with each switch/else alternative being another switch/if-else on the current state. I find that coding style terrible and impossible to maintain.

My coding style for state machines is to create a 2D array of Action and Output delegate references and a NextState value, possibly headed by a predicate delegate reference (so the array becomes a 2.5D one). In the worst case, three delegates must be called, plus one per failing predicate. In the simplest case, a single delegate is called (no predicate, no output). Obviously, there is also the initial indexing of the state table on event and state, and, if the entry has a chain of predicated alternatives, the code to iterate over them. This is part of the basic transition mechanism, unrelated to the specific table/transition.

This way of coding state machines has so many advantages that I will be very reluctant to change it. Yet I wonder: is this indexing and delegate calling a CPU-costly way of doing it, compared to nesting switch / if-else in 2-3 levels? Are there performance pitfalls I should be aware of when indexing / calling delegates?

      Religious freedom is the freedom to say that two plus two make five.
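
For readers who haven't seen the style, a minimal sketch of such a table-driven machine follows; the states, events, and transitions are invented for illustration and are not the poster's actual table.

// Minimal sketch of a table-driven state machine with delegate references.
// States, events, and transitions are hypothetical placeholders.
using System;

enum State { Idle, Running, Done }
enum Event { Start, Tick, Stop }

readonly struct Transition
{
    public readonly Func<bool>? Predicate; // optional guard; null means "always fires"
    public readonly Action? Action;        // optional side effect
    public readonly State NextState;

    public Transition(State nextState, Action? action = null, Func<bool>? predicate = null)
    {
        NextState = nextState;
        Action = action;
        Predicate = predicate;
    }
}

sealed class Machine
{
    private State _state = State.Idle;

    // 2D table indexed by [current state, incoming event].
    // Cells left unset simply keep the current state in Dispatch.
    private readonly Transition?[,] _table = new Transition?[3, 3];

    public Machine()
    {
        _table[(int)State.Idle, (int)Event.Start] = new Transition(State.Running, () => Console.WriteLine("started"));
        _table[(int)State.Running, (int)Event.Tick] = new Transition(State.Running);
        _table[(int)State.Running, (int)Event.Stop] = new Transition(State.Done, () => Console.WriteLine("stopped"));
    }

    public void Dispatch(Event e)
    {
        Transition? t = _table[(int)_state, (int)e];
        if (t is { } transition && (transition.Predicate is null || transition.Predicate()))
        {
            transition.Action?.Invoke();
            _state = transition.NextState;
        }
    }
}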

jschell
#3

Jörgen Andersson wrote:

between 3 and a silly amount of comparisons

Presumably you mean cases.

Jörgen Andersson wrote:

at which size does a lookup dictionary with delegates get faster?

At the point where I have profiled the application (not the code) with realistic production data and determined that the specific code is in fact a performance bottleneck. At that point I would look at the design, not the code, to determine whether there is some completely different way to do it.

Rob Philpott
#4

Interesting question, to which I wouldn't like to guess the answer, but here's what I'd consider.

Firstly, what is it that is being switched? If it's something primitive like an integer, I'd expect the switch code to boil down to a collection of CMP and BEQ instructions (compare and branch). These would be stupidly fast, and because they are consecutive in memory they are likely to benefit from CPU caching, so in that instance an awful lot of switch cases could be compared in the time of one dictionary lookup.

If you are switching on strings though, things get more complicated. To do the dictionary lookup, first the string needs to be hashed to give a bucket index, then an equality check is needed to make sure it matches. The switch statement doesn't need to do the hash, but has multiple equality checks to do, so I suspect the answer here boils down to the ratio of time taken to hash vs. time taken to do an equality check. So then you get into the realms of how similar the strings are: to check for equality, if the first character is different you can bail out and fail the test quickly, but if it's the last you have to go through every character before you can pass or fail the test.

It'd be interesting to profile this, but somehow the idea of creating a switch statement with hundreds or thousands of cases sounds unpleasant; I have no idea whether a compiler would accept it, and it would be completely impossible to work with.

          Regards, Rob Philpott.

Jorgen Andersson
#5

            Rob Philpott wrote:

            Firstly, what is it that is being switched?

            It's strings.

            Rob Philpott wrote:

            So then you get into the realms of how similar are the strings?

They are quite similar I'm afraid, as the words I'm parsing start with the category. So one of the larger switches will be 60+ words where the first difference is at the 19th position. And as the files that will be parsed range from gigabytes to terabytes in size, it will probably be worth some optimization.

            Wrong is evil and must be defeated. - Jeff Ello

Richard Deeming
#6

              Jörgen Andersson wrote:

              So one of the larger switches will be 60+ words where the first difference is at the 19th position.

              Perhaps you could break that down a bit? Validate that the input contains at least 19 characters, then switch on the 19th character to decide which path to take.

if (input.Length >= 19)
{
    return input[18] switch
    {
        'A' => ProcessA(input.AsSpan(19)),
        'B' => ProcessB(input.AsSpan(19)),
        ...
    };
}

              You could even do that with a list pattern[^], although the repeated discards would look quite messy. :)

return input switch
{
    [_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 'A', ..] => ProcessA(input.AsSpan(19)),
    ...
};


              "These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer

              "These people looked deep within my soul and assigned me a number based on the order in which I joined" - Homer

Rob Philpott
#7

Ah OK, so is it that you've got these large files to process and you're trying to optimise the switching (state changes) for speed? Which approach are you using at the moment (switch vs. array lookup/not dictionary, sorry, just read your update)?

I suppose another difference is that switch statements are compile-time things, turned into code, whereas dictionaries are created and used at runtime. Does this mean the switch/state-change logic is fixed in advance? I think the only way to tell really is to try both methods and see which is quicker (if noticeably so).

What I would say is that I'd expect them both to be fast, so is this really the bottleneck to performance gains, or could something else be optimised? Multithreading/pipelining etc. TPL Dataflow (if you're in .NET) is good for this.

                Regards, Rob Philpott.
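
On the TPL Dataflow point, a minimal pipeline sketch is shown below, assuming the System.Threading.Tasks.Dataflow package; ParsedRecord, ParseLine, and StoreRecord are hypothetical placeholders rather than anything from the actual parser.

// Minimal TPL Dataflow pipeline sketch: read -> parse -> store.
// ParsedRecord, ParseLine, and StoreRecord are hypothetical placeholders.
using System;
using System.IO;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

record ParsedRecord(string Category, string Payload);

static class Pipeline
{
    static ParsedRecord ParseLine(string line)
    {
        // Placeholder parse: split on the first ';'.
        int i = line.IndexOf(';');
        return i < 0 ? new ParsedRecord(line, "") : new ParsedRecord(line[..i], line[(i + 1)..]);
    }

    static void StoreRecord(ParsedRecord record)
    {
        // Placeholder sink.
    }

    public static async Task RunAsync(string path)
    {
        var parse = new TransformBlock<string, ParsedRecord>(
            ParseLine,
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = Environment.ProcessorCount,
                BoundedCapacity = 10_000 // back-pressure so a huge file doesn't fill memory
            });

        var store = new ActionBlock<ParsedRecord>(
            StoreRecord,
            new ExecutionDataflowBlockOptions { BoundedCapacity = 10_000 });

        parse.LinkTo(store, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (string line in File.ReadLines(path))
        {
            await parse.SendAsync(line); // waits when the pipeline is full
        }

        parse.Complete();
        await store.Completion;
    }
}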

Rob Philpott
#8

C# has gone a bit bonkers, hasn't it? I still have to look up how to do this 'new' stuff, although I do like being able to create empty arrays with [] etc. Oh, and I think you're out by one: return input[18] switch

                  Regards, Rob Philpott.

Jorgen Andersson
#9

The list pattern looks interesting.

                    Wrong is evil and must be defeated. - Jeff Ello

Jorgen Andersson
#10

                      Rob Philpott wrote:

                      so is it that you've got these large files to process and you're trying to optimise the switching (state changes) for speed

                      Indeed.

                      Rob Philpott wrote:

                      Which approach are you using at the moment

                      I've set up the parsing using switches just to make sure it works, but it's painfully slow so I'm looking at refactoring it at the moment.

                      Rob Philpott wrote:

                      Does this mean the switch/state change logic is fixed in advance?

This is where it gets funny. In theory, yes. But changes might happen every now and then. These files are supplied by a government entity, and while we're allowed to get the data (which is actually only a subset), we're not allowed access to the documentation (no, really). And they can't be bothered to make separate documentation for our subset (without us paying an extortion fee, that is). So I've added logic that tells me when they've added or removed attributes. Oddly enough, I'm having fun tinkering with these files, mostly. Multithreading is the next logical step, but I want to get as far as possible without using brute force before that.

                      Wrong is evil and must be defeated. - Jeff Ello

Jorgen Andersson
#11

                        I believe I might have a generic answer to my question.

                        Quote: ListDictionary Class[^]

                        This is a simple implementation of IDictionary using a singly linked list. It is smaller and faster than a Hashtable if the number of elements is 10 or less. This should not be used if performance is important for large numbers of elements.

                        Wrong is evil and must be defeated. - Jeff Ello

Rob Philpott
#12

Maybe, but that thing is old, predating generics, so there might be some boxing overhead depending on what you stick in it, unless they've done a generic version of it. It's hard to comment from this distance, but if the state machine _might_ change, surely it's better to model it at runtime so you just need to adjust some static data rather than go back to source... Profiling is always a good option, to see where the bottlenecks lie. Anyway, best of luck! :)

                          Regards, Rob Philpott.

jschell
#13

                            Jörgen Andersson wrote:

                            It's strings.

                            For a dictionary then you are going to need to compute the hash.

                            Jörgen Andersson wrote:

                            And as the files

And then you must compute the hash for each of those. I suspect this really depends on the size, and probably the standard deviation of the sizes, of each string.

I haven't thought this through and certainly have not profiled it, but a tree might be better. The tree is built with each fork having one character; the next level has 26 (or whatever size your set is) characters. Keep in mind that a hash requires sequencing through each character. A tree is somewhat similar to that, EXCEPT that when you reach the end (a leaf of the tree) you have already reached your delegate, so there are no further operations to look it up. If each level has the entire character set you can use an array and do a direct lookup to the next level (the character is the index into the array). Carefully calculate the memory space. You could use a sparse tree, but that will slow it down.

And maybe you should look at unmanaged code, specifically C++. One advantage to C++ (and C) is that you can force a string to be treated as a numeric, so a value like "ABCD" can be cast directly to a 32-bit unsigned integer. And of course you could use 64-bit also. The problem with that, of course, is that you must then deal with 4- or 8-character blocks only.

                            Jörgen Andersson wrote:

                            be worth some optimization.

                            You should actually profile the application. Not specific code but the entire application. If you want it to be fast then you should find the exact places where it is slow.
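
A minimal sketch of that kind of tree (one node per character, a delegate stored where a key ends) is below. The 128-slot ASCII child array, the keys, and the handler signature are assumptions made for illustration; as noted above, the per-level array trades memory for a direct index, while a sparse (dictionary-based) node would be smaller but slower.

// Minimal trie sketch: one array-indexed node per character, a delegate at the node
// where a key ends. The 128-entry ASCII alphabet is an assumption for illustration.
using System;

sealed class TrieNode
{
    // The character is the index into the array, so each step is a direct lookup
    // with no hashing and no final equality check.
    public readonly TrieNode?[] Children = new TrieNode?[128];
    public Action<string>? Handler; // non-null when a key ends at this node
}

sealed class DelegateTrie
{
    private readonly TrieNode _root = new();

    public void Add(string key, Action<string> handler)
    {
        TrieNode node = _root;
        foreach (char c in key)
        {
            if (c >= 128)
                throw new ArgumentException("This sketch only handles ASCII keys.");
            TrieNode? child = node.Children[c];
            if (child is null)
            {
                child = new TrieNode();
                node.Children[c] = child;
            }
            node = child;
        }
        node.Handler = handler;
    }

    // Walks one array lookup per character; the delegate is reached as soon as
    // the key is consumed.
    public bool TryDispatch(ReadOnlySpan<char> key, string payload)
    {
        TrieNode? node = _root;
        foreach (char c in key)
        {
            if (c >= 128 || (node = node.Children[c]) is null)
                return false;
        }
        if (node.Handler is null)
            return false;
        node.Handler(payload);
        return true;
    }
}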

Jorgen Andersson
#14

                              A tree is a really good suggestion! Thanks!

                              Wrong is evil and must be defeated. - Jeff Ello

Jorgen Andersson
#15

                                jschell wrote:

                                One advantage to C++ (and C) is that you can force a string to be treated as a numeric. So a value like "ABCD" can be cast directly to a 32 bit unsigned integer. And of course you could use 64 bit also.

Ah yes, to do that in C# I would need to use pointer indirection operators. I hate pointers.

                                Wrong is evil and must be defeated. - Jeff Ello

Richard Deeming
#16

                                  Not if you're using a recent version of .NET, or have a reference to the System.Memory NuGet package[^]:

ReadOnlySpan<char> input = "ABCD"; // For .NET Framework / Standard 2.0, you'll need to add ".AsSpan()" here.
ReadOnlySpan<byte> bytes = System.Runtime.InteropServices.MemoryMarshal.AsBytes(input);
int value = System.Runtime.InteropServices.MemoryMarshal.Read<int>(bytes);
// value == 4325441

                                  Of course, you may still need to take the "endianness" of the system into account. :)


                                  "These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer

                                  "These people looked deep within my soul and assigned me a number based on the order in which I joined" - Homer

trønderen
#17

Richard Deeming wrote:

ReadOnlySpan<char> input = "ABCD";
ReadOnlySpan<byte> bytes = System.Runtime.InteropServices.MemoryMarshal.AsBytes(input);
int value = System.Runtime.InteropServices.MemoryMarshal.Read<int>(bytes);

Will that compile to a single instruction, as you might see when using C/C++ casting? If you want to treat 4 chars at a time by treating them as ints, this doesn't look like something that would save CPU cycles. I admit that I haven't tried to compile the code and studied the instructions generated. Endianness isn't your only concern: don't forget that UTF-16, the internal character format of C#, can also contain surrogates and other funny elements.

                                    jschell wrote:

                                    So a value like "ABCD" can be cast directly to a 32 bit unsigned integer.

The C++ programmer obviously expects a result of 1 094 861 636 (hex 41424344) on big-endian machines, or 1 145 258 561 (hex 44434241) on little-endian machines. With UTF-16 representation, the value 4 325 441 (hex 00420041) encodes only two characters, "AB", not four as the C++ programmer expected. (C# never used an 8-bit char representation, so the C# programmer should not expect four chars to be packed into a 32-bit int.)

                                    Religious freedom is the freedom to say that two plus two make five.
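
To make the width issue concrete, here is a minimal sketch (assuming System.Buffers.Binary.BinaryPrimitives and a little-endian machine for the in-memory char layout): a 32-bit read over a C# string covers only two UTF-16 chars, and covering "ABCD" takes a 64-bit read.

// Minimal sketch: 32 bits cover only two UTF-16 chars; four chars need a 64-bit read.
using System;
using System.Buffers.Binary;
using System.Runtime.InteropServices;

ReadOnlySpan<char> input = "ABCD";
ReadOnlySpan<byte> bytes = MemoryMarshal.AsBytes(input); // 8 bytes on a little-endian machine: 41 00 42 00 43 00 44 00

uint twoChars = BinaryPrimitives.ReadUInt32LittleEndian(bytes);   // 0x00420041 -> only "AB"
ulong fourChars = BinaryPrimitives.ReadUInt64LittleEndian(bytes); // 0x0044004300420041 -> "ABCD"

Console.WriteLine($"{twoChars:X8}");   // 00420041
Console.WriteLine($"{fourChars:X16}"); // 0044004300420041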

Richard Deeming
#18

For literals, the other alternative would be to use a UTF-8 string literal:

ReadOnlySpan<byte> bytes = "ABCD"u8;
int value = System.Runtime.InteropServices.MemoryMarshal.Read<int>(bytes);
// value == 0x44434241

                                      It's not going to compile to a single instruction, but it should be fairly well optimized.


                                      "These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer

                                      "These people looked deep within my soul and assigned me a number based on the order in which I joined" - Homer

Rob Philpott
#19

ReadOnlySpan<byte> bytes = "ABCD"u8;

                                        You see, I had no idea you could do that. u8 - when did that arrive?! Can't keep up with it all.

                                        Regards, Rob Philpott.

Richard Deeming
#20

                                          That was added in C# 11, back in November 2022. :) UTF-8 string literals - C# feature specifications | Microsoft Learn[^]


                                          "These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer

                                          "These people looked deep within my soul and assigned me a number based on the order in which I joined" - Homer
