Building strings
-
I was a bit sceptical about the StringBuilder class of .NET and always used + and += to concatenate strings. Now I have seen the difference between the two with my own eyes... :omg: For creating about 2 MB of text and writing it to a file, it was a difference of MINUTES! (It was a complex method with recursive calls etc.) I was doing this with my old method and was surprised that it was so slow, so for the 'fun' of it I tried StringBuilder. (Well, better to find out late than never...) So if your app is slow and you're using the + and += way... this is your solution.
V. Stop smoking so you can: enjoy longer the money you save.
-
The reason that it's so slow is that adding items to a string actually results in the creation of a new string each time, rather than the existing string being extended in place. It is important to be aware of the differences between the ways of concatenating strings. If you create a string like this:
string a = "My " + "new " + "string";
then behind the scenes the IL produces a call to string.Concat(string, string, string), which is efficient. This is because string.Concat starts out by working out how big the memory area needs to be for the newly created string, and then calls a routine to allocate memory big enough to store it (I believe that it's FastAllocateString). It then copies the strings in to create the concatenated string.
Now, it is common wisdom that you should only do this with 2 to 4 items in the string; anything more and you should use StringBuilder. Well, this is not always the case. In the previous example, if you had:
string a = "My" + " " + "new" + " " + "string" + " " + "is" + " " + "very" + " long" + " and " + "is" + " " + "annoyingly" + " " + "full" + " " + "of" + " " + "concatenations";
behind the scenes C# converts this into an array of strings and then passes it to string.Concat (cunningly, an overload). At this point the same behaviour applies, and the resulting concatenation is faster than using StringBuilder.Append.
StringBuilder is much better suited to adding items over many lines of code, or in loops. If you find you are doing string += or string = string + repeatedly, then StringBuilder is much better.
Arthur Dent - "That would explain it. All my life I've had this strange feeling that there's something big and sinister going on in the world." Slartibartfast - "No. That's perfectly normal paranoia. Everybody in the universe gets that." Deja View - the feeling that you've seen this post before.
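For what it's worth, the loop case described above can be sketched roughly like this. The class and method names (ConcatSketch, WithPlusEquals, WithStringBuilder) and the item count are made up for illustration and are not from the original post. Both methods return the same text; only the first creates a new string on every pass.

```csharp
using System;
using System.Text;

static class ConcatSketch
{
    // Builds "0,1,2,..." with +=. Every iteration allocates a new string
    // and copies everything accumulated so far into it.
    public static string WithPlusEquals(int count)
    {
        string result = "";
        for (int i = 0; i < count; i++)
        {
            result += i.ToString() + ",";
        }
        return result;
    }

    // Builds the same text with StringBuilder, which appends into a
    // growable internal buffer instead of creating a new string each time.
    public static string WithStringBuilder(int count)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            sb.Append(i).Append(',');
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(WithPlusEquals(5));     // 0,1,2,3,4,
        Console.WriteLine(WithStringBuilder(5));  // 0,1,2,3,4,
    }
}
```

With a handful of items the difference is not measurable; it only starts to matter once the loop runs many thousands of times.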
-
Exactly what I found a while ago. It's even on MSDN (but who reads that stuff anyway ;-))
V.
-
You are correct, but your examples are wrong.
string a = "My " + "new " + "string";
is compiled into:
string a = "My new string";
You can see this with Reflector. To correct your example, use variables in the string:
string a = "new "; string b = "My " + a + "string";
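A small sketch of one way to observe this without a decompiler, assuming the usual behaviour that equal string literals in one assembly share a single instance. The names FoldingSketch, LiteralsAreFolded and VariableIsFolded are hypothetical. The first method should return true, because the compiler has already folded the three literals into one; the second builds its string at run time, so it is not expected to be the same object as the literal.

```csharp
using System;

static class FoldingSketch
{
    // Literals only: the compiler folds this into the single literal
    // "My new string", so both operands refer to the same string instance.
    public static bool LiteralsAreFolded()
    {
        string folded = "My " + "new " + "string";
        return object.ReferenceEquals(folded, "My new string");
    }

    // With a variable in the middle, the concatenation is done at run time
    // and yields a separate string object with the same contents.
    public static bool VariableIsFolded()
    {
        string a = "new ";
        string b = "My " + a + "string";
        return object.ReferenceEquals(b, "My new string");
    }

    static void Main()
    {
        Console.WriteLine(LiteralsAreFolded());
        Console.WriteLine(VariableIsFolded());
    }
}
```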
Ami
-
Ami Bar wrote:
You are correct, but your examples are wrong. string a = "My " + "new " + "string"; is compiled into: string a = "My new string";
That's exactly what I was saying. This example was to demonstrate the use of string.Concat and not temporary strings. If you take a look at the IL behind the scenes, you will see that the concatenation works like this.
-
What I am saying is that in your example string.Concat is not used. Maybe it is used at compile time (when translating the code to IL) by the compiler; however, in the IL string.Concat is not used. At runtime there is no string.Concat either.
Ami
-
You could substitute .NET for Java in your comment and it would also hold true. Java also has a class called StringBuilder to speed up dealing with immutable Strings.
Kicking squealing Gucci little piggy.
-
Actually, in Java it is a StringBuffer.
Matt Gerrans
-
StringBuilder was added in Java 1.5. :) http://java.sun.com/j2se/1.5.0/docs/api/java/lang/StringBuilder.html
-
Ahh - I see the confusion, and I should have stated that this applied to .NET 1.1. In version 2, as you have seen, the compiler is clever enough to optimize this to:
.method private hidebysig static void Main(string[] args) cil managed
{
  .entrypoint
  // Code size 15 (0xf)
  .maxstack 1
  .locals init ([0] string a)
  IL_0000: nop
  IL_0001: ldstr "My new string is very long and is annoyingly full "
           + "of concatenations"
  IL_0006: stloc.0
  IL_0007: ldloc.0
  IL_0008: call void [mscorlib]System.Console::WriteLine(string)
  IL_000d: nop
  IL_000e: ret
} // end of method Program::Main
The point that I was trying to make (badly, it seems) is that conventional string wisdom shouldn't always be taken for granted. ;)
-
That's funny. I guess it's been a while since I've used Java. ;) I guess Microsoft and Sun borrow class names from each other now.
Matt Gerrans
-
Assume you're building a string of 100,000 characters using +=, with one += per character. On the first concatenation you copy 2 characters (first + second character) = 4 bytes. On the second concatenation you copy 3 characters = 6 bytes. On the third you copy 4 characters = 8 bytes. You see how this continues: you're effectively copying the first characters a great many times. In total you'll copy about 10 GB of data for just 100,000 characters, resulting in a tiny 200 KB string. In comparison, StringBuilder will internally have to copy at most about 400 KB to build that string.
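The arithmetic above can be checked with a few lines of code. This is only a sketch under the same assumptions as the post (2 bytes per character, one += per character, each += copying the whole new string); the names CopyCost and BytesCopiedWithPlusEquals are invented for the example.

```csharp
using System;

static class CopyCost
{
    // Total bytes copied when a string of `length` characters is built one
    // character at a time with +=, assuming 2 bytes per character and that
    // each += copies the entire new string (2 chars, then 3, then 4, ...).
    public static long BytesCopiedWithPlusEquals(int length)
    {
        long total = 0;
        for (long chars = 2; chars <= length; chars++)
        {
            total += 2 * chars;
        }
        return total;
    }

    static void Main()
    {
        // Prints 10000099998, i.e. roughly 10 GB for 100,000 characters.
        Console.WriteLine(BytesCopiedWithPlusEquals(100000));
    }
}
```

The total works out to n(n+1) - 2 bytes for n characters, so doubling the length roughly quadruples the amount of copying.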
-
Using += works fine as long as you have a small (and limited) number of strings, but it scales really badly. The execution time increases roughly quadratically with the number of strings that you add. Here's a guy with the same problem a while back: String manipulation performance issue. Creating a string of 383000 characters by using += required moving 75 GB of data, so it is no wonder that it took over a minute to create it. Using a StringBuilder reduces the execution time by something like 99.999%. As an experiment in optimisation I also attempted to reduce the execution time without using a StringBuilder, and managed to reduce it by about 96%. :)
-
Guffa wrote:
As an experiment in optimisation I also attempted to reduce the execution time without using a StringBuilder, and managed to reduce it by about 96%.
You've said 'A', I wait for 'B'... ;)
-- The Blog: Bits and Pieces
-
I already said both A and B a long time ago. Look in the thread that I linked to.
-
Ah! Now I see it - you didn't mention in the post that your 96% fix was in that thread... ;P
-
My favorite issue with StringBuilder: 99% of the time I have seen it used, it looks like this:
StringBuilder sb = new StringBuilder();
...
do
    sb.Append("Some Text" + i.ToString() + "some more text" + abc.ToString());
while (...);
return sb.ToString();
On two occasions I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - Charles Babbage
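A sketch of the alternative the post above is hinting at: pass each piece to its own Append call, so no temporary concatenated string is built just to be handed to the StringBuilder. The loop shape, the count parameter and the names AppendSketch and BuildLines are assumptions made for this example; i and abc stand in for whatever values the original loop used.

```csharp
using System;
using System.Text;

static class AppendSketch
{
    // Hypothetical helper: appends each piece separately instead of
    // concatenating them with + before calling Append.
    public static string BuildLines(int count, object abc)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            sb.Append("Some Text")
              .Append(i)
              .Append("some more text")
              .Append(abc);
        }
        return sb.ToString();
    }

    static void Main()
    {
        // Prints: Some Text0some more textXSome Text1some more textX
        Console.WriteLine(BuildLines(2, "X"));
    }
}
```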
-
If you have to alter very big strings, StringBuilder is the way to go. But for small strings, the border where StringBuilder becomes faster than the String.xxx functions is a moving target: http://www.codeproject.com/useritems/StringBuilder_vs_String.asp
Yours, Alois Kraus
-
The reason why this is the case is that strings in .NET (just like in Java) are immutable objects. That means that the underlying memory buffer is locked and will not allow modifications.
String s = "ab"; s += "c";
is equivalent to
String s = "ab"; s = s + "c";
which in turn is equivalent to
String s = "ab"; String temp = s + "c"; s = temp;
Each append generates a new object. StringBuilder, however, is a mutable object, meaning that the underlying character array is modified when you issue an append command - no temporary objects are created. The reason why strings are immutable is that you can do a fair amount of optimization on strings, which will improve performance - given that you do not misuse them, of course. :)
-- Presented in doublevision (where drunk)
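The point about a new object per append can be made visible with a second reference to the original string. A minimal sketch; ImmutabilitySketch and its method names are invented for the example.

```csharp
using System;
using System.Text;

static class ImmutabilitySketch
{
    // After +=, the variable refers to a different object; the original
    // "ab" string is untouched and still reachable through `before`.
    public static string OriginalAfterPlusEquals()
    {
        string s = "ab";
        string before = s;
        s += "c";
        return before;
    }

    // With StringBuilder the same object is modified in place, so a second
    // reference to it sees the appended text.
    public static string OriginalAfterAppend()
    {
        StringBuilder sb = new StringBuilder("ab");
        StringBuilder before = sb;
        sb.Append("c");
        return before.ToString();
    }

    static void Main()
    {
        Console.WriteLine(OriginalAfterPlusEquals()); // ab
        Console.WriteLine(OriginalAfterAppend());     // abc
    }
}
```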
-
I think I read somewhere that StringBuilder is not a class that you could write yourself in managed C# or VB. It is managed code, but it uses string internals which are not available to us. This results in much better memory management. I had a look at the source, and I believe it is awfully long-winded. Anyway, it is MUCH better than using +. That said, I have seen people using StringBuilder to concatenate 2 strings and show the result in a message box. I think that is a bit over the top; StringBuilder beats the + operator when Append is used repeatedly on the same StringBuilder. What is the point of saving 0.1 milliseconds to display a message box? Overall, in the long run code clarity is as important as speed. I would recommend using StringBuilder when you concatenate 3 items or more.