Most efficient switch statement variable to use?

Bruce Coward (#1):

When I am using switch statements in C#, what is the most efficient type of switch variable to use? Normally I have fewer than 10 cases, which makes an Int32 seem rather overkill; but am I correct in assuming that, since the machine is running 32 bits, an int may actually be more efficient than trimming the switch variable down to 16 or even 8 bits, which may cause more code steps? Cheers, Bruce :cool:

PIEBALDconsult (#2), in reply to Bruce Coward:

Probably, test it and let us know.
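
A minimal sketch of such a test (the method names, case values, and iteration count here are invented for illustration; the JIT may inline or otherwise optimize this, so treat any numbers as rough):

    using System;
    using System.Diagnostics;

    class SwitchBench
    {
        // Identical 10-case switches, keyed on int and on byte.
        static int SwitchInt(int v)
        {
            switch (v)
            {
                case 0: return 10; case 1: return 11; case 2: return 12;
                case 3: return 13; case 4: return 14; case 5: return 15;
                case 6: return 16; case 7: return 17; case 8: return 18;
                case 9: return 19; default: return -1;
            }
        }

        static int SwitchByte(byte v)
        {
            switch (v)
            {
                case 0: return 10; case 1: return 11; case 2: return 12;
                case 3: return 13; case 4: return 14; case 5: return 15;
                case 6: return 16; case 7: return 17; case 8: return 18;
                case 9: return 19; default: return -1;
            }
        }

        static void Main()
        {
            const int N = 100000000;
            int sink = 0;

            Stopwatch sw = Stopwatch.StartNew();
            for (int i = 0; i < N; i++) sink += SwitchInt(i % 10);
            Console.WriteLine("int:  " + sw.ElapsedMilliseconds + " ms");

            sw = Stopwatch.StartNew();
            for (int i = 0; i < N; i++) sink += SwitchByte((byte)(i % 10));
            Console.WriteLine("byte: " + sw.ElapsedMilliseconds + " ms (sink=" + sink + ")");
        }
    }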

Luc Pattyn (#3), in reply to Bruce Coward:

Hi Bruce, in general integer operations are fastest at the native word size, meaning 8-bit or 16-bit operations are not faster than 32-bit operations on modern CPUs. This tells us byte and short mainly exist to support compatibility with existing data structures, files, etc., and of course to economize on memory when using large numbers of them, as in arrays. BTW: this may not be very easy to test, since (1) programming languages use the native size for literal values anyway, and (2) the compiler will often use int operations where byte or short were coded, when those int operations are equivalent to what you actually coded. :)
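
The language itself reflects this: C# defines arithmetic on byte and short operands to be performed as int, which is why the result must be cast back down explicitly.

    byte a = 1, b = 2;
    // byte c = a + b;       // compile error: a + b is an int
    byte c = (byte)(a + b);  // the addition itself is a 32-bit operation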


PIEBALDconsult (#4), in reply to Luc Pattyn:

Luc Pattyn wrote:

the compiler will often use int operations where byte or short were coded

That's what I would expect; a bunch of up-casting to int.

Ennis Ray Lynch Jr (#5), in reply to Bruce Coward:

I don't know about faster, but constants would improve maintainability. And while there is no speed penalty as such, it is poor practice to use strings as a switch variable.
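
A quick sketch of that suggestion (the enum and its members are invented for illustration): give the cases names instead of switching on magic numbers or strings.

    using System;

    static class Messages
    {
        // A byte-backed enum is plenty for fewer than 10 cases.
        public enum MessageKind : byte { Connect, Data, Disconnect }

        public static string Describe(MessageKind kind)
        {
            switch (kind)
            {
                case MessageKind.Connect:    return "client connected";
                case MessageKind.Data:       return "payload received";
                case MessageKind.Disconnect: return "client disconnected";
                default: throw new ArgumentOutOfRangeException("kind");
            }
        }
    }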


DaveyM69 (#6), in reply to Luc Pattyn:

Luc Pattyn wrote:

economize on memory

Interesting point - I've never delved that deep into the CPU architecture. Are things always word aligned or byte aligned, or is it variable? If it's word aligned then there would be no advantage at all to using anything less than the native size.


Deresen (#7), in reply to Luc Pattyn:

Totally correct! But in these times, with an average CPU power of 2 GHz, it is not really necessary to 'bitfuck'. The problem comes when you go programming on external devices, though today most PDAs and cell phones lack memory rather than CPU power. It gets tough when you program on microcontrollers, but then you shouldn't be using C#, but assembler!!


PIEBALDconsult (#8), in reply to DaveyM69:

Structs can be laid out exactly as required. Each element of an array of bytes comes immediately after the previous. Etc.

Lost User (#9), in reply to Bruce Coward:

Downcasting is often a 0-step operation (no operation actually; you just start using fewer bits of the register). It won't become faster though; even operations on 128 bits at a time have the same latency and throughput, on Core2 at least.

Note: the following part is based on speculation. What code would the JIT compiler generate for a switch instruction in MSIL? With a bit of bad luck it will generate something like this (assume the switch variable is in eax):

    mov edx,SwitchTableBase ; or any other register that can be used as base
    jmp [edx+4*eax]

Bad luck, because that would mean it needs an extra operation if it sees a downcast to a byte first; it can't just assume that the cast is a nop, since without some expensive analysis it can't know that the value won't exceed 255 anyway. So it would have to do:

    movzx eax,al
    mov edx,SwitchTableBase
    jmp [edx+4*eax]

Or possibly:

    and eax,0xFF
    mov edx,SwitchTableBase
    jmp [edx+4*eax]

Or something else? Who knows? How can you disassemble the code after it's JIT-compiled anyway? However! (Speculation ends here.)

A switch instruction (in MSIL) is only generated when the resulting table would be dense; when it isn't generated, a 'tree' of ifs is generated instead (or a chain of ifs as a special case, which .NET Reflector doesn't understand). It isn't really a tree though; it's a mess of ifs and labels laid out linearly, but the data flow is as though it were a tree of ifs. It does the expected thing: split the range of values in two every time. Obviously this is an O(log n) algorithm, so beware: switch doesn't always perform in O(1).

Edit: the whole point of this was to note that the size of the operand doesn't matter for speed, unless it's bigger than 64 bits (32 bits if running in 32-bit mode), because these are comparisons, and a 64-bit comparison can be done as fast as any smaller comparison. The 128-bit thing only works with SSE, which the .NET JIT compiler doesn't use, except for FISTTP, which is technically an SSE instruction but works with the regular floating-point stack.

For switches on strings it's a whole different story: if a real switch is used, all possible values are put into a dictionary, every time again; the dictionary is not saved. Otherwise it generates a chain of ifs, using op_Equality (aka ==) between the value and every case. Both algorithms are O(n) in the number of cases, but the chain of ifs at least has an early out.
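
To make the dense-versus-sparse point concrete, here is a sketch (case values invented). Compiling it and inspecting the output with ildasm shows which strategy the compiler picked, though the exact density threshold is a compiler internal:

    static class SwitchShapes
    {
        // Dense case values: the compiler can emit the MSIL 'switch' opcode,
        // which the JIT can turn into a jump table -- roughly O(1).
        public static int Dense(int v)
        {
            switch (v)
            {
                case 0: return 1;
                case 1: return 2;
                case 2: return 3;
                case 3: return 4;
                default: return 0;
            }
        }

        // Sparse case values: a jump table would be nearly all holes, so the
        // compiler falls back to compare-and-branch code instead -- the
        // 'tree of ifs' described above.
        public static int Sparse(int v)
        {
            switch (v)
            {
                case 5: return 1;
                case 1000: return 2;
                case 777777: return 3;
                default: return 0;
            }
        }
    }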


Lost User (#10), in reply to Deresen:

Some forms of "bitfucking" are becoming more important though: with the widening CPU-RAM speed gap, it becomes increasingly important not to touch more memory in inner loops than the size of the L2 cache (needing less is of course even better), and if "bitfucks" are needed to accomplish that, then so be it. And since the conditions for store-forwarding are very restrictive, extracting a smaller type from within a larger type at a non-aligned offset should always be done with a "bitfuck"; performance will suck if you write the large value to memory and read a smaller, unaligned part of it back. The reverse, inserting a small type into a large type, is even worse: it is never store-forwarded, so "bitfucking" is always needed there, unless the code is not in a performance-critical section.


Lost User (#11), in reply to DaveyM69:

The CPU (assuming x86) doesn't require much; SSE requires 16-byte alignment unless you don't care about an approximately 100% overhead. But the Windows (and Linux, too) ABI requires alignment of all things to at least their size, and of the stack to twice the pointer size (depending on bitness, obviously). The individual elements in an array of bytes can be byte aligned, but the starting address will usually be dword (or more) aligned. And Luc is correct of course, in general. Some instructions, such as div and idiv, have a latency and throughput that depend on the value of the result, and small types can lead to smaller values (thus faster computation); obviously that is not guaranteed, since it depends on the actual values. For floats, lower precision makes operations such as fdiv faster.


Deresen (#12), in reply to Lost User:

You are right, and I agree that you should bitfuck wherever you can. Maybe it's because I like it, maybe because I started with microcontrollers. But if we do the sums, we will see that it doesn't really matter which types you use. An average to small cache size is 1 GB; let's say your program will use half of it, which leaves you 500 MB to work with. If you're an efficient programmer you can never make your program use the whole 500 MB. Even if you store numbers that would fit in a byte in an Int128, it would take an array of (500/16 =~) 30 million of them to fill it. The only way to fill your RAM is when you're busy with graphics, and even then it's not necessary to bitfuck, because Microsoft made some good libraries which will take care of the memory problem. In conclusion, I think we can say that bitfucking is fun, but not really necessary if you're just writing efficient code.


Lost User (#13), in reply to Deresen:

Deresen wrote:

an average to small cache size is 1 GB

Where did you get that information? The biggest cache size I've seen so far is 12 MB (as 2x6 MB) of L2. That is not much, and every so often another program will come along and trash it (if you don't do it yourself).


Deresen (#14), in reply to Lost User:

My mistake, I was thinking of RAM. :omg:


Lost User (#15), in reply to Deresen:

Well then 1 GB is indeed small :) But RAM is slow (compared to the CPU); a cache miss can easily cost 100 cycles, long enough to justify doing complex calculations just to avoid the cache miss, and plenty of time for 150 to 250 instructions.

That makes me wonder what the theoretical maximum number of instructions in 100 cycles is (on Core2). 500, if looking only at the predecoder specs: macro-fusion can fuse 2 instructions but can only be done once per cycle, so 5 instructions (3 of which must be 1 µop), and the size of such a "block" should satisfy N*size = 16 (to never cross a boundary), and no 66H or 67H prefixes should occur anywhere.

But then looking at the rest: the sequence must not have any dependency chains, not all instructions are perfectly pipelined, there are only 3 "normal" ports (0, 1 and 5), and even register reading can be a bottleneck. Only 6 µops per cycle are allowed, and that includes memory reads/writes (bringing us down to 400, except for µop fusion). The predecoder throughput would be less important (only the first iteration) if we were executing a small loop (less than 4 times 16 bytes). And I'm not even going to mention the rest.

The best throughput of any one instruction is 3/cycle (a stream of NOPs, for example), so it should be possible to do (slightly) better than that, right? This is too complex; I'll leave it to the pros.


Mark Churchill (#16), in reply to Bruce Coward:

Minor optimisation tricks like that are something your compiler should be (and probably is) handling for you. Write clear and readable code.


Luc Pattyn (#17), in reply to DaveyM69:

DaveyM69 wrote:

Are things always word aligned

There is the notion of "natural alignment", which states that each item should be aligned to its size: 2-byte shorts get even addresses, 4-byte ints get addresses that are multiples of 4, etc. (items larger than the int size, such as long and double, don't need to be aligned, but SIMD data does). A struct by default uses padding to achieve that when necessary, i.e. it inserts dummy bytes when required. To reduce the size bloat, the suggestion is to order members from largest to smallest. The linker and the run-time will allocate objects at a multiple of 8 or even 16 bytes, so a struct that only needs 6 bytes will effectively be laid out 8 bytes apart. Warning: some Win32 APIs expect an array of structs with odd sizes, such as 6. If you don't want any padding, use Marshal attributes with explicit offsets. :)
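
A short sketch of that last point (the struct and field names are invented): with the default sequential layout this struct would marshal as 8 bytes, but Pack = 1 (or LayoutKind.Explicit with FieldOffset attributes) gives the exact 6-byte layout such APIs expect.

    using System;
    using System.Runtime.InteropServices;

    [StructLayout(LayoutKind.Sequential, Pack = 1)]
    struct PackedRecord
    {
        public int Id;      // bytes 0..3
        public short Flags; // bytes 4..5 -- no trailing padding with Pack = 1
    }

    class LayoutDemo
    {
        static void Main()
        {
            // Prints 6; without Pack = 1 the marshaled size would be 8.
            Console.WriteLine(Marshal.SizeOf(typeof(PackedRecord)));
        }
    }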


PIEBALDconsult (#18), in reply to Mark Churchill:

Hear hear!


Lost User (#19), in reply to Mark Churchill:

Alas, it isn't (but it should!).


Lost User (#20), in reply to #9:

Aw, no comments? It took me quite a while to do the required research...
