case insensitive string comparison - relative speed

    Blake Miller wrote (#1):

    Suppose two strings,

    String S1, S2;

    Which is faster:

    if( S1.ToLower() == S2.ToLower() )

    Or:

    if( String.Compare(S1, S2, true) == 0 )

    Under what conditions might the relative speed vary?

    I need a 32 bit unsigned value just to hold the number of coding WTF I see in a day …

      Lost User wrote (#2, in reply to #1):

      Blake Miller wrote:

      Under what conditions might the relative speed vary?

      • strings are reference types, and immutable in memory
      • Converting them both before the comparison is slower than comparing them directly (there can be only one!)
      • Microsoft "advises" to use uppercase constants. Has something to do with efficiency in comparing.
      • Does it matter?

      The last point is the most important one; readability is important, as it influences maintainability. If you're doing a lot of string operations, consider a RegEx for the job.

      Edit:

      Many string operations, most important the Compare and Equals methods, now provide an overload that accepts a StringComparison enumeration value as a parameter. When you specify either StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase, the string comparison will be non-linguistic. That is, the features that are specific to the natural language are ignored when making comparison decisions. This means the decisions are based on simple byte comparisons and ignore casing or equivalence tables that are parameterized by culture. As a result, by explicitly setting the parameter to either the StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase, your code often gains speed, increases correctness, and becomes more reliable.

      It's one of FxCop's warnings :)
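A minimal sketch of the overload described above, for anyone who wants to try it. The class and method names (OrdinalCompareSketch, SameIgnoringCase) are made up for illustration; only string.Equals and StringComparison.OrdinalIgnoreCase come from the framework.

```csharp
using System;

public static class OrdinalCompareSketch
{
    // Hypothetical helper: case-insensitive, culture-independent equality.
    // string.Equals(a, b, comparison) also copes with null arguments.
    public static bool SameIgnoringCase(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(SameIgnoringCase("Hello World", "hello world")); // True
        Console.WriteLine(SameIgnoringCase("Hello", "World"));             // False
    }
}
```

Unlike ToLower(), this allocates no new strings, and unlike String.Compare(S1, S2, true) it does not consult the current culture.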

      Bastard Programmer from Hell :suss: if you can't read my code, try converting it here[^]

        Richard Deeming wrote (#3, in reply to #1):

        The first one creates two new strings, converting each character to lower-case, and then compares the results. The second one performs a case-insensitive comparison of each character, without allocating any new strings. Instinct would say that the second will always out-perform the first. Here's some code to test that:

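        using System;
        using System.Diagnostics;

        // Note: Debug.Assert calls are compiled out of Release builds,
        // so run this in a Debug build or the loops do no work.
        int ITERATIONS = 1000000;

        string s1 = "Hello World";
        string s2 = "hello world";

        // Sanity check: both approaches must treat the strings as equal.
        Debug.Assert(s1.ToLower() == s2.ToLower());
        Debug.Assert(string.Compare(s1, s2, true) == 0);

        // ToLower approach: allocates two new strings on every iteration.
        var sw1 = new Stopwatch();
        sw1.Start();
        for (int i = 0; i < ITERATIONS; i++)
        {
            Debug.Assert(s1.ToLower() == s2.ToLower());
        }
        sw1.Stop();

        // Compare approach: case-insensitive comparison, no allocations.
        var sw2 = new Stopwatch();
        sw2.Start();
        for (int i = 0; i < ITERATIONS; i++)
        {
            Debug.Assert(string.Compare(s1, s2, true) == 0);
        }
        sw2.Stop();

        Console.WriteLine("ToLower: {0}", sw1.Elapsed);
        Console.WriteLine("Compare: {0}", sw2.Elapsed);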

        On my computer, the output is:

        ToLower: 00:00:00.4507542
        Compare: 00:00:00.1856049

        The ToLower approach takes more than twice as long as the Compare approach.
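A variation on the same measurement, offered as a rough sketch and not as a rigorous benchmark. Debug.Assert is marked conditional on DEBUG, so in a Release build those calls disappear along with the work inside them; this version keeps each result in a counter instead. It also times the StringComparison.OrdinalIgnoreCase overload mentioned earlier in the thread. All names here (CompareTimingSketch, Time, ViaToLower, ViaCompare, ViaOrdinal, LastMatches) are hypothetical, and the timings will differ by machine, runtime and build configuration.

```csharp
using System;
using System.Diagnostics;

public static class CompareTimingSketch
{
    // Number of comparisons that returned true in the most recent Time call.
    // Storing it keeps the comparison results observable.
    public static int LastMatches;

    public static bool ViaToLower(string a, string b) { return a.ToLower() == b.ToLower(); }
    public static bool ViaCompare(string a, string b) { return string.Compare(a, b, true) == 0; }
    public static bool ViaOrdinal(string a, string b) { return string.Equals(a, b, StringComparison.OrdinalIgnoreCase); }

    // Runs `compare` on (a, b) `iterations` times and returns the elapsed time.
    public static TimeSpan Time(Func<string, string, bool> compare, string a, string b, int iterations)
    {
        int matches = 0;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            if (compare(a, b)) matches++;
        }
        sw.Stop();
        LastMatches = matches;
        return sw.Elapsed;
    }

    public static void Main()
    {
        const int iterations = 1000000;
        string s1 = "Hello World";
        string s2 = "hello world";

        Console.WriteLine("ToLower: {0}", Time(ViaToLower, s1, s2, iterations));
        Console.WriteLine("Compare: {0}", Time(ViaCompare, s1, s2, iterations));
        Console.WriteLine("Ordinal: {0}", Time(ViaOrdinal, s1, s2, iterations));
    }
}
```

Trying longer strings, strings that differ in the first character, and non-ASCII text is one way to explore the conditions under which the relative speed varies.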


        "These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer

          Clifford Nelson wrote (#4, in reply to #1):

          Look at http://msdn.microsoft.com/en-us/library/cc165449.aspx

            Blake Miller wrote (#5, in reply to #2):

            Thank you for your answers. Would you have a link to a tech note, language guide, or MSDN article about this part: "Microsoft 'advises' to use uppercase constants"? It does matter. I am looking into this because customers are complaining about CPU load.

              Blake Miller wrote (#6, in reply to #4):

              Awesome! Thanks for the link. :thumbsup:

                Blake Miller wrote (#7, in reply to #3):

                Confirms my 'gut feeling' and we all know how much we like to depend upon those :)

                  Lost User wrote (#8, in reply to #5):

                  Modified the post to include the FxCop rule :)

                    SledgeHammer01 wrote (#9, in reply to #5):

                    Blake Miller wrote:

                    It does matter. I am looking into this because customers are complaining about CPU load.

                    I think you are barking up the wrong tree if that's the case. As another poster responded, with 1 *MILLION* string comparisons, the difference in performance is non-existent. Your performance issues are likely elsewhere.

                      jschell wrote (#10, in reply to #1):

                      In terms of application performance (rather than just statement performance): if you have not measured the application using appropriate data, then your first step should be to do that. If you have measured and found that the specific method containing this statement is the problem, then finding a different algorithmic approach would have much more impact. The goal in that case is not to find a faster way to do the comparison, but to find a way so that no comparison at all is needed.
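One possible reading of "a different algorithmic approach", sketched under an assumption the thread does not confirm: that the hot code checks an input string against a collection of known strings. In that hypothetical case, a HashSet built with StringComparer.OrdinalIgnoreCase replaces a loop of pairwise comparisons with a single hashed lookup. The names LookupSketch and BuildIndex, and the sample data, are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class LookupSketch
{
    // Builds the set once; every later lookup ignores case
    // without calling ToLower() or comparing against each entry in turn.
    public static HashSet<string> BuildIndex(IEnumerable<string> names)
    {
        return new HashSet<string>(names, StringComparer.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        HashSet<string> known = BuildIndex(new[] { "Alpha", "Bravo", "Charlie" });

        Console.WriteLine(known.Contains("bravo")); // True
        Console.WriteLine(known.Contains("delta")); // False
    }
}
```

Whether this helps depends entirely on what the profiler shows; it is only worth trying once measurement points at the comparison loop.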
