String compare oddity [modified]

PIEBALDconsult

I was looking at sorting some strings today and noticed that the default comparer orders them wrong -- it sorts "a" before "A", rather than after it, for instance. So I looked into it, read up a bit, and tried the InvariantCulture and Ordinal comparers. Ordinal works correctly, and I'll likely use it, but I'm not happy about it.

According to http://msdn.microsoft.com/en-us/library/dd465121.aspx: "Comparisons that use StringComparison.InvariantCulture and StringComparison.Ordinal work identically on ASCII strings." That is not strictly true -- the two order "a" and "A" differently.

Does anyone here know how to make a Culture (based on en-US) that does a case-sensitive sort the right way?

Edit: This is interesting... I was experimenting with System.StringComparer.CurrentCultureIgnoreCase and discovered that it seems to do what I want -- at least in the tests I've made. Which is good. :thumbsup: But that means that it doesn't actually ignore the case! :sigh:

Scratch that -- it's just an artifact of how I was testing it.
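For anyone who wants to reproduce the difference described above, here is a minimal sketch. The class and method names (`CompareDemo`, `SignOf`, `Print`) are made up for illustration; only `StringComparer.Ordinal` and `StringComparer.InvariantCulture` come from the post. The linguistic result depends on the runtime's globalization data, so it is left as something to observe rather than a promised value.

```csharp
using System;

public static class CompareDemo
{
    // Sign (-1, 0, or 1) of comparing "a" with "A" under the given comparer.
    public static int SignOf(StringComparer comparer)
    {
        return Math.Sign(comparer.Compare("a", "A"));
    }

    public static void Print()
    {
        // Ordinal compares the numeric character values: 'a' (97) > 'A' (65),
        // so "a" sorts after "A" and the sign is 1.
        Console.WriteLine("Ordinal:          " + SignOf(StringComparer.Ordinal));

        // Linguistic comparison. The thread reports that this puts "a" before
        // "A"; the exact result depends on the globalization settings in use.
        Console.WriteLine("InvariantCulture: " + SignOf(StringComparer.InvariantCulture));
    }
}
```

Calling `CompareDemo.Print()` from a console program shows the two signs side by side.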

modified on Friday, August 19, 2011 2:13 PM

Navin Pandit

One way is to put each character into an array and then sort the array. A second option is to convert each character of the string to its ASCII value, sort the values, and then convert them back to characters. Finally, merge the characters and you will get the sorted string.

PIEBALDconsult

That's equivalent to what the Ordinal comparer does. And I'm not sorting a string, I'm sorting a collection of strings.

Lost User · modified on Friday, August 19, 2011 2:13 PM

Ordinal compares the numeric values of the characters, which for ASCII text amounts to an ASCII compare.

PIEBALDconsult wrote:

I'll likely use it, but I'm not happy about it

Not happy about what?

"Don't confuse experts with facts" - Eric_V

PIEBALDconsult

About having to use the Ordinal comparer to get the correct (desired) case-sensitive sort order. I should be able to use a "linguistic" (I think that's the term the documentation used) Culture that produces the same sort order of ASCII data. And if I can create a Culture that does that and set it as my CurrentCulture, so much the better.

Lost User

PIEBALDconsult wrote:

I should be able to use a "linguistic" (I think that's the term the documentation used) Culture that produces the same sort order of ASCII data.

Different languages treat upper and lower case differently. While English considers a lowercase 'a' to be semantically "less than" the uppercase 'A', it may not be the same with other languages. I think that's why the designers of .NET chose to give us the language-neutral Ordinal option.

"Don't confuse experts with facts" - Eric_V

PIEBALDconsult

Shameel wrote:

English considers a lowercase 'a' to be semantically "less than" the uppercase 'A',

Got a reference for that? I think the opposite is true.

Lost User

PIEBALDconsult wrote:

Got a reference for that?

I don't have a reference, but working with customers, they like to see 'A' on top of lists and 'a' below that.

PIEBALDconsult wrote:

I think the opposite is true.

The opposite is true in case of ASCII.

"Don't confuse experts with facts" - Eric_V

PIEBALDconsult

Shameel wrote:

they like to see 'A' on top of lists and 'a' below that.

Exactly. That's how I want it, and the Ordinal comparer does it, but the InvariantCulture (and en-US) does it the other way. Edit: Well, not exactly, come to think of it, because the Ordinal comparer also says "Z" < "a", which I don't want.
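The "Z" < "a" behaviour is easy to see by sorting a small list with the Ordinal comparer. This is only an illustrative sketch; `OrdinalSortDemo` and `SortOrdinal` are hypothetical names, not anything from the framework.

```csharp
using System;
using System.Collections.Generic;

public static class OrdinalSortDemo
{
    // Returns a sorted copy of the input, ordered by numeric character value.
    public static List<string> SortOrdinal(IEnumerable<string> items)
    {
        var sorted = new List<string>(items);
        sorted.Sort(StringComparer.Ordinal);
        return sorted;
    }
}
```

`OrdinalSortDemo.SortOrdinal(new[] { "a", "Z", "A", "z" })` yields A, Z, a, z: every uppercase ASCII letter comes before every lowercase one, whereas the order wanted here is A, a, Z, z.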

modified on Friday, August 19, 2011 12:15 PM

Lost User

PIEBALDconsult wrote:

Exactly. That's how I want it, and the Ordinal comparer does it

The Ordinal comparer uses the ASCII order of the characters.

Char Dec Oct Hex | Char Dec Oct Hex | Char Dec Oct Hex | Char Dec Oct Hex
(nul) 0 0000 0x00 | (sp) 32 0040 0x20 | @ 64 0100 0x40 | ` 96 0140 0x60
(soh) 1 0001 0x01 | ! 33 0041 0x21 | A 65 0101 0x41 | a 97 0141 0x61
(stx) 2 0002 0x02 | " 34 0042 0x22 | B 66 0102 0x42 | b 98 0142 0x62
(etx) 3 0003 0x03 | # 35 0043 0x23 | C 67 0103 0x43 | c 99 0143 0x63
(eot) 4 0004 0x04 | $ 36 0044 0x24 | D 68 0104 0x44 | d 100 0144 0x64
(enq) 5 0005 0x05 | % 37 0045 0x25 | E 69 0105 0x45 | e 101 0145 0x65
(ack) 6 0006 0x06 | & 38 0046 0x26 | F 70 0106 0x46 | f 102 0146 0x66
(bel) 7 0007 0x07 | ' 39 0047 0x27 | G 71 0107 0x47 | g 103 0147 0x67
(bs) 8 0010 0x08 | ( 40 0050 0x28 | H 72 0110 0x48 | h 104 0150 0x68
(ht) 9 0011 0x09 | ) 41 0051 0x29 | I 73 0111 0x49 | i 105 0151 0x69
(nl) 10 0012 0x0a | * 42 0052 0x2a | J 74 0112 0x4a | j 106 0152 0x6a
(vt) 11 0013 0x0b | + 43 0053 0x2b | K 75 0113 0x4b | k 107 0153 0x6b
(np) 12 0014 0x0c | , 44 0054 0x2c | L 76 0114 0x4c | l 108 0154 0x6c
(cr) 13 0015 0x0d | - 45 0055 0x2d | M 77 0115 0x4d | m 109 0155 0x6d
(so) 14 0016 0x0e | . 46 0056 0x2e | N 78 0116 0x4e | n 110 0156 0x6e
(si) 15 0017 0x0f | / 47 0057 0x2f | O 79 0117 0x4f | o 111 0157 0x6f
(dle) 16 0020 0x10 | 0 48 0060 0x30 | P 80 0120 0x50 | p 112 0160 0x70
(dc1) 17 0021 0x11 | 1 49 0061 0x31 | Q 81 0121 0x51 | q 113 0161 0x71
(dc2) 18 0022 0x12 | 2 50 0062 0x32 | R 82 0122 0x52 | r 114 0162 0x72
(dc3) 19 0023 0x13 | 3 51 0063 0x33 | S 83 0123 0x53 | s 115 0163 0x73
(dc4) 20 0024 0x14 | 4 52 0064 0x34 | T 84 0124 0x54 | t 116 0164 0x74
(nak) 21 0025 0x15 | 5 53 0065 0x35 | U 85 0125 0x55 | u 117 0165 0x75
(syn) 22 0026 0x16 | 6 54 0066 0x36 | V 86 0126 0x56 | v 118 0166 0x76
(etb) 23 0027 0x17 | 7 55 0067 0x37 | W 87 0127 0x57 | w 119 0167 0x77
(can) 24 0030 0x18 | 8 56 0070 0x38 | X 88 0130 0x58 | x

PIEBALDconsult

Yes, I know that, but I don't know who gave you the 1, take a 5 for your efforts.

PIEBALDconsult · modified on Friday, August 19, 2011 2:13 PM

What I came up with as a simple interim solution is this:

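    // Compares case-insensitively first. Strings that tie (for example, that
    // differ only by case) are then ordered by reversing the InvariantCulture
    // result, which flips its "a" before "A" ordering.
    private sealed class MyComparer : System.Collections.Generic.IComparer<string>
    {
        public int Compare ( string Op0 , string Op1 )
        {
            int result = System.StringComparer.InvariantCultureIgnoreCase.Compare ( Op0 , Op1 ) ;

            if ( result == 0 )
            {
                result = System.StringComparer.InvariantCulture.Compare ( Op0 , Op1 ) * -1 ;
            }

            return ( result ) ;
        }
    }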
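A possible variation on the same two-stage idea, offered only as a sketch and not as the code from this post: keep the case-insensitive comparison as the primary key, but break ties with the Ordinal comparer instead of negating the InvariantCulture result. For ASCII letters that puts the uppercase form first without relying on which direction the linguistic comparer happens to order case. The names `UpperFirstComparer` and `UpperFirstDemo` are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public sealed class UpperFirstComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        // Primary key: linguistic, case-insensitive comparison.
        int result = StringComparer.InvariantCultureIgnoreCase.Compare(x, y);

        // Tie-breaker: numeric character values, so 'A' (65) precedes 'a' (97).
        if (result == 0)
        {
            result = StringComparer.Ordinal.Compare(x, y);
        }

        return result;
    }
}

public static class UpperFirstDemo
{
    // Returns a sorted copy so the caller's array is left untouched.
    public static string[] Sort(string[] items)
    {
        var copy = (string[])items.Clone();
        Array.Sort(copy, new UpperFirstComparer());
        return copy;
    }
}
```

With this variant, `UpperFirstDemo.Sort(new[] { "a", "Z", "A", "z" })` gives A, a, Z, z. It can differ from the comparer above for non-ASCII text, where the Ordinal tie-break no longer corresponds to "uppercase first".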

Lost User

PIEBALDconsult wrote:

I don't know who gave you the 1

I get downvoted all the time and the people who do it do not have the courage to own up and explain it.

PIEBALDconsult wrote:

take a 5 for your efforts

Thanks :-)

"Don't confuse experts with facts" - Eric_V