String manipulation performance issue

    sjembek wrote (#1):

    Hello,

    In order to show users a string representation of a 120 KB file in hex format within a Windows Forms textbox, I wrote a simple routine that converts all bytes in a buffer into a hexadecimal string, taking formatting into account. The routine looks like this:

        int x = 0;
        string content = String.Format("\n{0:X5}: ", x);
        for (; x < Data.senderdata.Length; x++)
        {
            content += String.Format("{0:X2}", Data.senderdata[x]);
        #if DEBUG
            flp.setProgress(0, Data.senderdata.Length, x);
        #endif
            if ((x + 1) % 2 == 0) content += " ";
            if ((x + 1) % 16 == 0) content += String.Format("\n{0:X5}: ", (x+1));
        }

    When done, the content string is assigned to a textbox control. Perhaps I should mention that "senderdata" is a byte array.

    The routine seems to take an insane amount of time to complete. I realize that it might take a few seconds, since senderdata.Length typically has a value of about 125K, but this is taking over a minute to complete, in a single-threaded program running on a Pentium 4.

    Is this due to the String.Format hex-conversion routine, or is it just that it's too hard for Windows to handle such long strings? Does anyone have a tip to improve the performance of this?

    Thanks in advance for any help,
    Benny


      sathish s wrote (#2):

      One suggestion: change the type of the content variable from string to a StringBuilder instance. Appending to a StringBuilder is faster than concatenating strings. -- modified at 9:48 Tuesday 6th June, 2006


        Stefan Troschuetz wrote (#3):

        I think the main issue is the sheer number of string concatenations. As a string is immutable, a new string object is created every time you append something to your content variable. Try using StringBuilder instead.
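For illustration, here is a minimal sketch of how the routine from the first post might look with a StringBuilder. The class and method names (HexDump, ToHexDump) are hypothetical and not from the thread, the debug progress call is left out, and the layout differs slightly from the original: the offset prefix is written at the start of each line, so no empty trailing prefix line is produced when the length is a multiple of 16.

```csharp
using System;
using System.Text;

public static class HexDump
{
    // Hypothetical helper (name not from the thread): formats a byte array as
    // hex text, 16 bytes per line, bytes grouped in pairs, each line prefixed
    // with a 5-digit hex offset.
    public static string ToHexDump(byte[] data)
    {
        // Rough capacity guess (up to 3 characters per byte plus an
        // 8-character prefix per line) so the builder rarely has to grow.
        var sb = new StringBuilder(data.Length * 3 + (data.Length / 16 + 1) * 8);

        for (int x = 0; x < data.Length; x++)
        {
            // Start a new line every 16 bytes, labelled with the current offset.
            if (x % 16 == 0)
                sb.Append('\n').Append(x.ToString("X5")).Append(": ");

            // Two hex digits per byte.
            sb.Append(data[x].ToString("X2"));

            // A space after every second byte.
            if ((x + 1) % 2 == 0)
                sb.Append(' ');
        }

        // The only point where a full-length string is created.
        return sb.ToString();
    }
}
```

Each Append writes into the builder's internal buffer instead of copying the whole text so far, which is why the total work grows roughly in proportion to the input size.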


        "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." - Rick Cook

        www.troschuetz.de


          sjembek wrote (#4, in reply to sathish s):

          Thanks very much (Stefan as well), this helps a lot :-)


            Guffa wrote (#5):

            When the += operator is used on a string, it might appear that the new text is appended to the end of the original string. That is not what happens, as strings are immutable in .NET. The statement:

                content += " ";

            is actually performed as:

                content = string.Concat(content, " ");

            With that in mind, let's do some math to find out why the routine is so slow. Each iteration does one, two or three concatenations: the first is done every iteration, the second every other iteration, and the third every 16th iteration. This gives:

            :: Each iteration does on average 1.5625 string concatenations.
            :: Each iteration adds on average 3.0625 characters to the string.

            With an array containing 125000 elements, that produces a string of about 383000 characters. As each character is two bytes, the string uses about 766 kbyte of data.

            As the string grows linearly, we can calculate the average work done by each concatenation by taking the average size of the string during the operation, which is half the size of the finished string. So a concatenation moves on average 383 kbyte of data.

            With 125000 iterations we have around 195000 string concatenations (125000 times 1.5625). 195000 times 383 kbyte makes 74685000 kbyte. When the routine has finished, it has moved somewhere around 75 gigabyte of data. (As that is far more than the amount of available RAM, this has also caused hundreds of garbage collections to take place.) That is the reason why the routine is so slow.

            Improving the routine is easy: use a StringBuilder. That would make the routine run around 100000 times faster.

            As an interesting observation in optimization, one can speed up the routine somewhat by using a temporary string:

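            string content = string.Empty;
            string line = string.Empty;
            for (int x = 0; x < Data.senderdata.Length; x++) {
                if (x % 16 == 0) {
                    // Flush the finished line: the only concatenation on the long string.
                    content += line;
                    // Start a new line, labelled with the offset of its first byte.
                    line = String.Format("\n{0:X5}: ", x);
                }
                line += String.Format("{0:X2}", Data.senderdata[x]);
                if ((x + 1) % 2 == 0) line += " ";
            }
            // Append the last (possibly partial) line.
            content += line;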

            This would reduce the number of lengthy concatenations from 1.5625 per iteration to 0.0625, reducing the execution time by 96%. Not nearly as effective as using a StringBuilder, but somewhat impressive even so... :) -- modified at 11:27 Tuesday 6th June, 2006
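The back-of-the-envelope estimate above can be reproduced with a few lines of code. This is only a sketch: the CopyEstimate class and BytesMoved method are hypothetical names, and the per-iteration averages (1.5625 concatenations, 3.0625 characters) are taken from the post as given rather than measured.

```csharp
using System;

public static class CopyEstimate
{
    // Estimates how many bytes repeated string concatenation copies in total,
    // assuming the string grows linearly and each concatenation copies the
    // whole string built so far.
    public static double BytesMoved(int iterations,
                                    double concatsPerIteration,
                                    double charsPerIteration)
    {
        double concatenations = iterations * concatsPerIteration;

        // A .NET char is 2 bytes (UTF-16).
        double finalBytes = iterations * charsPerIteration * 2;

        // With linear growth, the average string size is half the final size.
        double averageBytes = finalBytes / 2;

        return concatenations * averageBytes;
    }
}
```

Calling CopyEstimate.BytesMoved(125000, 1.5625, 3.0625) gives roughly 7.5e10 bytes, in line with the figure of about 75 gigabyte. Passing 0.0625 as the second argument models the temporary-string variant and yields 4% of that amount, which matches the 96% reduction.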
