String compare oddity [modified]

    PIEBALDconsult wrote (#1):

    I was looking at sorting some strings today and noticed that the default comparer orders them wrong: it sorts "a" before "A" rather than after it, for instance. So I looked into it, read up a bit, and tried the InvariantCulture and Ordinal comparers. Ordinal works correctly, and I'll likely use it, but I'm not happy about it.

    According to http://msdn.microsoft.com/en-us/library/dd465121.aspx: "Comparisons that use StringComparison.InvariantCulture and StringComparison.Ordinal work identically on ASCII strings." That is not strictly true; they order these strings differently.

    Does anyone here know how to make a Culture (based on en-US) that does a case-sensitive sort the right way?

    Edit: This is interesting... I was experimenting with System.StringComparer.CurrentCultureIgnoreCase and discovered that it seems to do what I want, at least in the tests I've made. Which is good. :thumbsup: But that means that it doesn't actually ignore the case! :sigh:

    Scratch that -- it's just an artifact of how I was testing it.

    modified on Friday, August 19, 2011 2:13 PM
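
To make the difference concrete, here is a minimal sketch that sorts the same four strings with the two comparers discussed above. The names `ComparerDemo` and `SortWith` are made up for illustration; only `StringComparer.Ordinal` and `StringComparer.InvariantCulture` come from the framework. The culture-sensitive result assumes a runtime with normal globalization data available.

```csharp
using System;
using System.Collections.Generic;

public static class ComparerDemo
{
    // Hypothetical helper: returns a sorted copy using the given comparer.
    public static string[] SortWith(IEnumerable<string> items, StringComparer comparer)
    {
        var list = new List<string>(items);
        list.Sort(comparer);
        return list.ToArray();
    }

    public static void Main()
    {
        string[] data = { "b", "a", "B", "A" };

        // Ordinal compares UTF-16 code unit values, so uppercase ASCII comes first.
        Console.WriteLine(string.Join(" ", SortWith(data, StringComparer.Ordinal))); // prints "A B a b"

        // InvariantCulture compares linguistically: the letter decides first and
        // case only breaks ties, which is the "a" before "A" order described above.
        Console.WriteLine(string.Join(" ", SortWith(data, StringComparer.InvariantCulture)));
    }
}
```

The two lines of output differ even though every input character is ASCII, which is the discrepancy with the quoted documentation sentence.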

      Navin Pandit wrote (#2, in reply to #1):

      One way is to put each character of the string into an array and then sort the array. A second option is to convert each character to its ASCII value, sort the values, convert them back to characters, and finally join the characters. Either way you get the sorted string.
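
For reference, the first suggestion (sorting the characters of a single string) can be sketched as below. `CharSortDemo` and `SortChars` are hypothetical names. `Array.Sort` on a `char[]` orders by numeric character value, so this only ever produces a code-value ordering.

```csharp
using System;

public static class CharSortDemo
{
    // Hypothetical helper: returns the characters of s sorted by code unit value.
    public static string SortChars(string s)
    {
        char[] chars = s.ToCharArray();
        Array.Sort(chars);            // default char comparison is by numeric value
        return new string(chars);
    }

    public static void Main()
    {
        Console.WriteLine(SortChars("bAcaB")); // prints "ABabc"
    }
}
```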

        PIEBALDconsult wrote (#3, in reply to #2):

        That's equivalent to what the Ordinal comparer does. And I'm not sorting a string, I'm sorting a collection of strings.

          Lost User wrote (#4, in reply to #1):

          Ordinal does an ASCII compare.

          PIEBALDconsult wrote:

          I'll likely use it, but I'm not happy about it

          Not happy about what?

          "Don't confuse experts with facts" - Eric_V

            PIEBALDconsult wrote (#5, in reply to #4):

            About having to use the Ordinal comparer to get the correct (desired) case-sensitive sort order. I should be able to use a "linguistic" (I think that's the term the documentation used) Culture that produces the same sort order of ASCII data. And if I can create a Culture that does that and set it as my CurrentCulture, so much the better.

              Lost User wrote (#6, in reply to #5):

              PIEBALDconsult wrote:

              I should be able to use a "linguistic" (I think that's the term the documentation used) Culture that produces the same sort order of ASCII data.

              Different languages treat upper and lower case differently. While English considers a lowercase 'a' to be semantically "less than" the uppercase 'A', it may not be the same with other languages. I think that's why the designers of .NET chose to give us the language-neutral Ordinal option.

              "Don't confuse experts with facts" - Eric_V

                PIEBALDconsult wrote (#7, in reply to #6):

                Shameel wrote:

                English considers a lowercase 'a' to be semantically "less than" the uppercase 'A',

                Got a reference for that? I think the opposite is true.

                  Lost User wrote (#8, in reply to #7):

                  PIEBALDconsult wrote:

                  Got a reference for that?

                  I don't have a reference, but when I work with customers, they like to see 'A' on top of lists and 'a' below that.

                  PIEBALDconsult wrote:

                  I think the opposite is true.

                  The opposite is true in case of ASCII.

                  "Don't confuse experts with facts" - Eric_V

                    PIEBALDconsult wrote (#9, in reply to #8):

                    Shameel wrote:

                    they like to see 'A' on top of lists and 'a' below that.

                    Exactly. That's how I want it, and the Ordinal comparer does it, but the InvariantCulture (and en-US) does it the other way. Edit: Well, not exactly, come to think of it, because the Ordinal comparer also says "Z" < "a", which I don't want.

                    modified on Friday, August 19, 2011 12:15 PM
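
The `"Z" < "a"` behavior mentioned in the edit is easy to check. A minimal sketch follows; `OrdinalCheck` and `OrdinalLess` are invented names, and only the sign of each comparison result is meaningful.

```csharp
using System;

public static class OrdinalCheck
{
    // Hypothetical helper: true when x sorts before y under ordinal rules.
    public static bool OrdinalLess(string x, string y)
    {
        return string.CompareOrdinal(x, y) < 0;
    }

    public static void Main()
    {
        // 'Z' is code 90 and 'a' is code 97, so "Z" sorts first under Ordinal.
        Console.WriteLine(OrdinalLess("Z", "a")); // prints "True"

        // Ordinal also puts "A" (65) before "a" (97), which is the wanted part.
        Console.WriteLine(OrdinalLess("A", "a")); // prints "True"

        // A culture-aware comparison looks at the letter before the case, so with
        // normal globalization data "a" is expected to sort before "Z" here.
        Console.WriteLine(StringComparer.InvariantCulture.Compare("a", "Z") < 0);
    }
}
```

So Ordinal gives "A" before "a" only as a side effect of putting every uppercase letter before every lowercase one.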

                      Lost User wrote (#10, in reply to #9):

                      PIEBALDconsult wrote:

                      Exactly. That's how I want it, and the Ordinal comparer does it

                      The Ordinal comparer uses the ASCII order of the characters.

                      Char  Dec Oct  Hex  | Char  Dec Oct  Hex  | Char  Dec Oct  Hex  | Char  Dec Oct  Hex

                      (nul)   0 0000 0x00 | (sp)   32 0040 0x20 | @      64 0100 0x40 | `      96 0140 0x60
                      (soh)   1 0001 0x01 | !      33 0041 0x21 | A      65 0101 0x41 | a      97 0141 0x61
                      (stx)   2 0002 0x02 | "      34 0042 0x22 | B      66 0102 0x42 | b      98 0142 0x62
                      (etx)   3 0003 0x03 | #      35 0043 0x23 | C      67 0103 0x43 | c      99 0143 0x63
                      (eot)   4 0004 0x04 | $      36 0044 0x24 | D      68 0104 0x44 | d     100 0144 0x64
                      (enq)   5 0005 0x05 | %      37 0045 0x25 | E      69 0105 0x45 | e     101 0145 0x65
                      (ack)   6 0006 0x06 | &      38 0046 0x26 | F      70 0106 0x46 | f     102 0146 0x66
                      (bel)   7 0007 0x07 | '      39 0047 0x27 | G      71 0107 0x47 | g     103 0147 0x67
                      (bs)    8 0010 0x08 | (      40 0050 0x28 | H      72 0110 0x48 | h     104 0150 0x68
                      (ht)    9 0011 0x09 | )      41 0051 0x29 | I      73 0111 0x49 | i     105 0151 0x69
                      (nl)   10 0012 0x0a | *      42 0052 0x2a | J      74 0112 0x4a | j     106 0152 0x6a
                      (vt)   11 0013 0x0b | +      43 0053 0x2b | K      75 0113 0x4b | k     107 0153 0x6b
                      (np)   12 0014 0x0c | ,      44 0054 0x2c | L      76 0114 0x4c | l     108 0154 0x6c
                      (cr)   13 0015 0x0d | -      45 0055 0x2d | M      77 0115 0x4d | m     109 0155 0x6d
                      (so)   14 0016 0x0e | .      46 0056 0x2e | N      78 0116 0x4e | n     110 0156 0x6e
                      (si)   15 0017 0x0f | /      47 0057 0x2f | O      79 0117 0x4f | o     111 0157 0x6f
                      (dle)  16 0020 0x10 | 0      48 0060 0x30 | P      80 0120 0x50 | p     112 0160 0x70
                      (dc1)  17 0021 0x11 | 1      49 0061 0x31 | Q      81 0121 0x51 | q     113 0161 0x71
                      (dc2)  18 0022 0x12 | 2      50 0062 0x32 | R      82 0122 0x52 | r     114 0162 0x72
                      (dc3)  19 0023 0x13 | 3      51 0063 0x33 | S      83 0123 0x53 | s     115 0163 0x73
                      (dc4)  20 0024 0x14 | 4      52 0064 0x34 | T      84 0124 0x54 | t     116 0164 0x74
                      (nak)  21 0025 0x15 | 5      53 0065 0x35 | U      85 0125 0x55 | u     117 0165 0x75
                      (syn)  22 0026 0x16 | 6      54 0066 0x36 | V      86 0126 0x56 | v     118 0166 0x76
                      (etb)  23 0027 0x17 | 7      55 0067 0x37 | W      87 0127 0x57 | w     119 0167 0x77
                      (can)  24 0030 0x18 | 8      56 0070 0x38 | X      88 0130 0x58 | x
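
Rows like the ones in the table can also be generated instead of copied. The sketch below uses a hypothetical `AsciiRow` helper (not from the thread) to format one character as decimal, octal, and hex, and it checks that the Ordinal result for two single ASCII characters has the same sign as the difference of their codes.

```csharp
using System;

public static class AsciiTableDemo
{
    // Hypothetical helper: one row in the table's "Char Dec Oct Hex" form.
    public static string AsciiRow(char c)
    {
        int code = c;
        string oct = Convert.ToString(code, 8).PadLeft(4, '0');
        return $"{c} {code} {oct} 0x{code:x2}";
    }

    public static void Main()
    {
        Console.WriteLine(AsciiRow('A')); // prints "A 65 0101 0x41"
        Console.WriteLine(AsciiRow('a')); // prints "a 97 0141 0x61"

        // For single ASCII characters, the ordinal result follows the code values.
        bool sameSign = Math.Sign(string.CompareOrdinal("A", "a")) == Math.Sign('A' - 'a');
        Console.WriteLine(sameSign);      // prints "True"
    }
}
```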

                        PIEBALDconsult wrote (#11, in reply to #10):

                        Yes, I know that. I don't know who gave you the 1; take a 5 for your efforts.

                          PIEBALDconsult wrote (#12, in reply to #1):

                          What I came up with as a simple interim solution is this:

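                          // Declared 'private' because it is nested inside another class.
                          private sealed class MyComparer : System.Collections.Generic.IComparer<string>
                          {
                              public int Compare ( string Op0 , string Op1 )
                              {
                                  // First pass: order by the letters alone, ignoring case.
                                  int result = System.StringComparer.InvariantCultureIgnoreCase.Compare ( Op0 , Op1 ) ;

                                  // Tie-break: reverse the case-sensitive culture order so "A" precedes "a".
                                  if ( result == 0 )
                                  {
                                      result = System.StringComparer.InvariantCulture.Compare ( Op0 , Op1 ) * -1 ;
                                  }

                                  return ( result ) ;
                              }
                          }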
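
A possible usage sketch for this kind of comparer. It restates the same two-step idea in a self-contained class so it can be run on its own; the name `UpperFirstComparer` and the sample data are invented for illustration, and the tie-break swaps the operands instead of multiplying by -1. The printed order assumes the culture-sensitive behavior reported in this thread (InvariantCulture placing "a" before "A").

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-alone variant of the comparer posted above.
public sealed class UpperFirstComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        // Step 1: order by the letters alone, ignoring case.
        int result = StringComparer.InvariantCultureIgnoreCase.Compare(x, y);

        // Step 2: on a tie, apply the case-sensitive order in reverse
        // by swapping the operands.
        if (result == 0)
        {
            result = StringComparer.InvariantCulture.Compare(y, x);
        }

        return result;
    }
}

public static class UpperFirstDemo
{
    public static void Main()
    {
        var items = new List<string> { "b", "a", "Z", "B", "A" };
        items.Sort(new UpperFirstComparer());

        // With InvariantCulture ordering "a" before "A", this prints: A a B b Z
        Console.WriteLine(string.Join(" ", items));
    }
}
```

Unlike Ordinal, this keeps "Z" after "a" while still listing the uppercase form of each letter first.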

                            Lost User wrote (#13, in reply to #11):

                            PIEBALDconsult wrote:

                            I don't know who gave you the 1

                            I get downvoted all the time and the people who do it do not have the courage to own up and explain it.

                            PIEBALDconsult wrote:

                            take a 5 for your efforts

                            Thanks :-)

                            "Don't confuse experts with facts" - Eric_V
