String compare oddity [modified]

  • PIEBALDconsult

    I was looking at sorting some strings today and noticed that the default comparer orders them wrong -- it sorts "a" before "A", rather than after it, for instance. So I looked into it, read up a bit, and tried the InvariantCulture and the Ordinal comparers -- Ordinal works correctly, and I'll likely use it, but I'm not happy about it.

    According to http://msdn.microsoft.com/en-us/library/dd465121.aspx: "Comparisons that use StringComparison.InvariantCulture and StringComparison.Ordinal work identically on ASCII strings." Which is not strictly true -- they order these differently.

    Does anyone here know how to make a Culture (based on en-US) that does a case-sensitive sort the right way?

    Edit: This is interesting... I was experimenting with System.StringComparer.CurrentCultureIgnoreCase and discovered that it seems to do what I want -- at least in the tests I've made. Which is good. :thumbsup: But that means that it doesn't actually ignore the case! :sigh:

    Scratch that -- it's just an artifact of how I was testing it.

    modified on Friday, August 19, 2011 2:13 PM
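For anyone who wants to reproduce the difference described above, here is a minimal sketch. The class name `CompareDemo` and the helper `SortWith` are made up for illustration and are not from the post. Only the Ordinal result is stated as expected output; the InvariantCulture result depends on the globalization data of the runtime, so it is printed without a claim.

```csharp
using System;
using System.Collections.Generic;

public static class CompareDemo
{
    // Hypothetical helper: returns a sorted copy of the input using the given comparer.
    public static string[] SortWith(IComparer<string> comparer, params string[] items)
    {
        string[] copy = (string[])items.Clone();
        Array.Sort(copy, comparer);
        return copy;
    }

    public static void Main()
    {
        string[] sample = { "b", "a", "B", "A" };

        // Ordinal compares UTF-16 code unit values, so uppercase ASCII letters come first.
        Console.WriteLine(string.Join(" ", SortWith(StringComparer.Ordinal, sample)));
        // prints "A B a b"

        // InvariantCulture does a linguistic comparison; the output depends on the
        // runtime's culture data, so no expected output is claimed here.
        Console.WriteLine(string.Join(" ", SortWith(StringComparer.InvariantCulture, sample)));
    }
}
```

Running both lines side by side should make it easy to check whether the two comparers really disagree on plain ASCII input in your environment.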

    Lost User
    wrote on last edited by
    #4

    Ordinal compares the strings by the numeric values of their characters, which for ASCII text amounts to an ASCII compare.

    PIEBALDconsult wrote:

    I'll likely use it, but I'm not happy about it

    Not happy about what?

    "Don't confuse experts with facts" - Eric_V

      PIEBALDconsult
      wrote on last edited by
      #5

      About having to use the Ordinal comparer to get the correct (desired) case-sensitive sort order. I should be able to use a "linguistic" (I think that's the term the documentation used) Culture that produces the same sort order of ASCII data. And if I can create a Culture that does that and set it as my CurrentCulture, so much the better.

        Lost User
        wrote on last edited by
        #6

        PIEBALDconsult wrote:

        I should be able to use a "linguistic" (I think that's the term the documentation used) Culture that produces the same sort order of ASCII data.

        Different languages treat upper and lower case differently. While English considers a lowercase 'a' to be semantically "less than" the uppercase 'A', it may not be the same with other languages. I think that's why the designers of .NET chose to give us the language-neutral Ordinal option.

        "Don't confuse experts with facts" - Eric_V

          PIEBALDconsult
          wrote on last edited by
          #7

          Shameel wrote:

          English considers a lowercase 'a' to be semantically "less than" the uppercase 'A',

          Got a reference for that? I think the opposite is true.

            Lost User
            wrote on last edited by
            #8

            PIEBALDconsult wrote:

            Got a reference for that?

            I don't have a reference, but working with customers, they like to see 'A' on top of lists and 'a' below that.

            PIEBALDconsult wrote:

            I think the opposite is true.

            The opposite is true in case of ASCII.

            "Don't confuse experts with facts" - Eric_V

              PIEBALDconsult
              wrote on last edited by
              #9

              Shameel wrote:

              they like to see 'A' on top of lists and 'a' below that.

              Exactly. That's how I want it, and the Ordinal comparer does it, but the InvariantCulture (and en-US) does it the other way.

              Edit: Well, not exactly, come to think of it, because the Ordinal comparer also says "Z" < "a", which I don't want.

              modified on Friday, August 19, 2011 12:15 PM

                Lost User
                wrote on last edited by
                #10

                PIEBALDconsult wrote:

                Exactly. That's how I want it, and the Ordinal comparer does it

                The Ordinal comparer uses the ASCII order of the characters.

                Char Dec Oct Hex | Char Dec Oct Hex | Char Dec Oct Hex | Char Dec Oct Hex

                (nul) 0 0000 0x00 | (sp) 32 0040 0x20 | @ 64 0100 0x40 | ` 96 0140 0x60
                (soh) 1 0001 0x01 | ! 33 0041 0x21 | A 65 0101 0x41 | a 97 0141 0x61
                (stx) 2 0002 0x02 | " 34 0042 0x22 | B 66 0102 0x42 | b 98 0142 0x62
                (etx) 3 0003 0x03 | # 35 0043 0x23 | C 67 0103 0x43 | c 99 0143 0x63
                (eot) 4 0004 0x04 | $ 36 0044 0x24 | D 68 0104 0x44 | d 100 0144 0x64
                (enq) 5 0005 0x05 | % 37 0045 0x25 | E 69 0105 0x45 | e 101 0145 0x65
                (ack) 6 0006 0x06 | & 38 0046 0x26 | F 70 0106 0x46 | f 102 0146 0x66
                (bel) 7 0007 0x07 | ' 39 0047 0x27 | G 71 0107 0x47 | g 103 0147 0x67
                (bs) 8 0010 0x08 | ( 40 0050 0x28 | H 72 0110 0x48 | h 104 0150 0x68
                (ht) 9 0011 0x09 | ) 41 0051 0x29 | I 73 0111 0x49 | i 105 0151 0x69
                (nl) 10 0012 0x0a | * 42 0052 0x2a | J 74 0112 0x4a | j 106 0152 0x6a
                (vt) 11 0013 0x0b | + 43 0053 0x2b | K 75 0113 0x4b | k 107 0153 0x6b
                (np) 12 0014 0x0c | , 44 0054 0x2c | L 76 0114 0x4c | l 108 0154 0x6c
                (cr) 13 0015 0x0d | - 45 0055 0x2d | M 77 0115 0x4d | m 109 0155 0x6d
                (so) 14 0016 0x0e | . 46 0056 0x2e | N 78 0116 0x4e | n 110 0156 0x6e
                (si) 15 0017 0x0f | / 47 0057 0x2f | O 79 0117 0x4f | o 111 0157 0x6f
                (dle) 16 0020 0x10 | 0 48 0060 0x30 | P 80 0120 0x50 | p 112 0160 0x70
                (dc1) 17 0021 0x11 | 1 49 0061 0x31 | Q 81 0121 0x51 | q 113 0161 0x71
                (dc2) 18 0022 0x12 | 2 50 0062 0x32 | R 82 0122 0x52 | r 114 0162 0x72
                (dc3) 19 0023 0x13 | 3 51 0063 0x33 | S 83 0123 0x53 | s 115 0163 0x73
                (dc4) 20 0024 0x14 | 4 52 0064 0x34 | T 84 0124 0x54 | t 116 0164 0x74
                (nak) 21 0025 0x15 | 5 53 0065 0x35 | U 85 0125 0x55 | u 117 0165 0x75
                (syn) 22 0026 0x16 | 6 54 0066 0x36 | V 86 0126 0x56 | v 118 0166 0x76
                (etb) 23 0027 0x17 | 7 55 0067 0x37 | W 87 0127 0x57 | w 119 0167 0x77
                (can) 24 0030 0x18 | 8 56 0070 0x38 | X 88 0130 0x58 | x
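A short sketch that ties the table to what the Ordinal comparison does for ASCII letters. The class name `OrdinalDemo` and the helper `OrdinalLess` are made up for illustration; `string.CompareOrdinal` is the framework call doing the work.

```csharp
using System;

public static class OrdinalDemo
{
    // Hypothetical helper: true when a sorts before b under an ordinal comparison.
    public static bool OrdinalLess(string a, string b)
    {
        return string.CompareOrdinal(a, b) < 0;
    }

    public static void Main()
    {
        // The numeric values match the Dec column of the table.
        Console.WriteLine((int)'A');               // prints 65
        Console.WriteLine((int)'Z');               // prints 90
        Console.WriteLine((int)'a');               // prints 97

        // Every uppercase ASCII letter has a smaller value than every lowercase one.
        Console.WriteLine(OrdinalLess("A", "a"));  // prints True
        Console.WriteLine(OrdinalLess("Z", "a"));  // prints True
    }
}
```

This is why Ordinal puts "A" before "a" but also puts "Z" before "a".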

                  PIEBALDconsult
                  wrote on last edited by
                  #11

                  Yes, I know that, but I don't know who gave you the 1, take a 5 for your efforts.

                    PIEBALDconsult
                    wrote on last edited by
                    #12

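                    What I came up with as a simple interim solution is this:

                    private sealed class MyComparer : System.Collections.Generic.IComparer<string>
                    {
                        // Compare without regard to case first, so "a" and "A" both sort before "b" and "B".
                        // Strings that differ only by case are then ordered by reversing the
                        // InvariantCulture result, which puts "A" before "a".
                        public int Compare ( string Op0 , string Op1 )
                        {
                            int result = System.StringComparer.InvariantCultureIgnoreCase.Compare ( Op0 , Op1 ) ;

                            if ( result == 0 )
                            {
                                result = System.StringComparer.InvariantCulture.Compare ( Op0 , Op1 ) * -1 ;
                            }

                            return ( result ) ;
                        }
                    }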
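Here is a sketch of how a comparer along these lines might be used to sort a list. The names `UpperFirstComparer`, `UpperFirstDemo` and `Sorted` are made up for illustration. One deliberate difference from the posted code: it swaps the arguments instead of multiplying by -1, which reverses the order without doing arithmetic on the result. It assumes the invariant culture places lowercase before uppercase for otherwise equal strings; if the runtime runs without culture data, the tie-break may come out the other way.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical name; same idea as the comparer posted above.
public sealed class UpperFirstComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        // Group strings without regard to case first.
        int result = StringComparer.InvariantCultureIgnoreCase.Compare(x, y);

        if (result == 0)
        {
            // Tie-break: swapping the arguments reverses the case-sensitive order.
            result = StringComparer.InvariantCulture.Compare(y, x);
        }

        return result;
    }
}

public static class UpperFirstDemo
{
    // Hypothetical helper: returns a sorted copy of the input.
    public static string[] Sorted(params string[] items)
    {
        string[] copy = (string[])items.Clone();
        Array.Sort(copy, new UpperFirstComparer());
        return copy;
    }

    public static void Main()
    {
        // Intended order is A a B b Z, under the assumption stated above;
        // the letters are grouped regardless, so "Z" never lands before "a".
        Console.WriteLine(string.Join(" ", Sorted("b", "Z", "a", "B", "A")));
    }
}
```

The case-insensitive pass decides everything except strings that differ only by case, so the tie-break only ever affects pairs like "a" and "A".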

                      Lost User
                      wrote on last edited by
                      #13

                      PIEBALDconsult wrote:

                      I don't know who gave you the 1

                      I get downvoted all the time and the people who do it do not have the courage to own up and explain it.

                      PIEBALDconsult wrote:

                      take a 5 for your efforts

                      Thanks :-)

                      "Don't confuse experts with facts" - Eric_V
