Fast string to integer conversion
-
I am searching for a fast string to integer conversion algorithm.
Int32.Parse
is too slow for my purposes. I implemented an algorithm that was faster by orders of magnitude, but I was hoping someone could point me to an existing library that includes such functions, or at least to a paper to make sure I'm working correctly. Thank you. -
I am searching for a fast string to integer conversion algorithm.
Int32.Parse
is too slow for my purposes. I implemented an algorithm that was faster by orders of magnitude, but I was hoping someone could point me to an existing library that includes such functions, or at least to a paper to make sure I'm working correctly. Thank you.HosamAly wrote:
Int32.Parse is too slow for my purposes
Interesting! Can you give more information about your application type where minor things like this is affected?
Navaneeth How to use google | Ask smart questions
-
I am searching for a fast string to integer conversion algorithm.
Int32.Parse
is too slow for my purposes. I implemented an algorithm that was faster by orders of magnitude, but I was hoping someone could point me to an existing library that includes such functions, or at least to a paper to make sure I'm working correctly. Thank you.For my experience, sometimes you do not need to Parse the string object to int. we can detect the type of the data, and convert it constrainedly. For example: object o; int i = 10; o = i; if(o is int) { int j = (int)o; Console.WriteLine(j); } Otherwise, we can use TryParse to avoid throw some exception. Afterall, Parse is already a time-consuming method. Tan Li I Love KongFu~
-
For my experience, sometimes you do not need to Parse the string object to int. we can detect the type of the data, and convert it constrainedly. For example: object o; int i = 10; o = i; if(o is int) { int j = (int)o; Console.WriteLine(j); } Otherwise, we can use TryParse to avoid throw some exception. Afterall, Parse is already a time-consuming method. Tan Li I Love KongFu~
I'm not trying to cast a boxed object. I'm trying to parse a string. There is no noticable difference between
Int32.Parse
andInt32.TryParse
when the input is known to be correct. Actually, if you open both using Reflector, they use the same implementation, except that one of them throws exceptions for error conditions, while the other just returnsfalse
. -
HosamAly wrote:
Int32.Parse is too slow for my purposes
Interesting! Can you give more information about your application type where minor things like this is affected?
Navaneeth How to use google | Ask smart questions
I am parsing a live feed that contains many numbers. The feed comes from a trusted source (i.e. format is guaranteed to be correct), and the numbers are simple ASCII integers (no localization, no thousands separator, etc).
Int32.Parse
is simply too slow for such "simple" operations. In my benchmarks, it's taking 5-16 times more than some of my implementations. -
I'm not trying to cast a boxed object. I'm trying to parse a string. There is no noticable difference between
Int32.Parse
andInt32.TryParse
when the input is known to be correct. Actually, if you open both using Reflector, they use the same implementation, except that one of them throws exceptions for error conditions, while the other just returnsfalse
.Yes, I did not say the TryParse is more effective. But we can check the parameter before convertion. Surely, if you can ensure the string is correct format, no need to do this.
Tan Li I Love KongFu~
-
I am parsing a live feed that contains many numbers. The feed comes from a trusted source (i.e. format is guaranteed to be correct), and the numbers are simple ASCII integers (no localization, no thousands separator, etc).
Int32.Parse
is simply too slow for such "simple" operations. In my benchmarks, it's taking 5-16 times more than some of my implementations. -
So you say you've written a fast implementation of int parsing, that performs better than int.Parse - so what exactly are you asking? If you've done it right? You've not posted your code if so!
I am asking whether there exists a library that includes such functions, in hope that it would be better tested and optimized than my own implementation. If there isn't, I was hoping to get some links to papers or algorithms for how to convert a string to a number efficiently.
-
I am searching for a fast string to integer conversion algorithm.
Int32.Parse
is too slow for my purposes. I implemented an algorithm that was faster by orders of magnitude, but I was hoping someone could point me to an existing library that includes such functions, or at least to a paper to make sure I'm working correctly. Thank you.Parsing a string is not very hard. I put together this code, it should cover the basics:
public static int ParseInt32(string text) {
long value = 0;
long sign = 1;
bool first = true;
foreach (char c in text) {
if (c >= '0' && c <= '9') {
value = value * 10 + c - '0';
} else if (c == '-' && first) {
sign = -1;
} else {
throw new FormatException();
}
first = false;
}
value *= sign;
if (value < int.MinValue || value > int.MaxValue) throw new OverflowException();
return (int)value;
}Despite everything, the person most likely to be fooling you next is yourself.
-
I am asking whether there exists a library that includes such functions, in hope that it would be better tested and optimized than my own implementation. If there isn't, I was hoping to get some links to papers or algorithms for how to convert a string to a number efficiently.
Ok, I see. The performance issue with Int32.Parse is because, like alot of things in the framework, it tries to cater for every man's needs and therefore includes the ability, and logic (therefore overhead) to parse lots of different types of string (eg, currency, hex etc). If you are satisfied that you are only receiving decimal-formatted numbers, without any other markup (such as separators) you could look at Int32.Parse in reflector and follow the trail to the unmanaged code that does the actual parsing - this will lead you into an internal class called
Number
which has an unsafe methodParseNumber
that does the bulk of the work. Your job would be to identify the bit that parses a decimal-formatted number and just extract that portion into your own library. -
Parsing a string is not very hard. I put together this code, it should cover the basics:
public static int ParseInt32(string text) {
long value = 0;
long sign = 1;
bool first = true;
foreach (char c in text) {
if (c >= '0' && c <= '9') {
value = value * 10 + c - '0';
} else if (c == '-' && first) {
sign = -1;
} else {
throw new FormatException();
}
first = false;
}
value *= sign;
if (value < int.MinValue || value > int.MaxValue) throw new OverflowException();
return (int)value;
}Despite everything, the person most likely to be fooling you next is yourself.
That was brilliant and simple. 5ed
Navaneeth How to use google | Ask smart questions
-
That was brilliant and simple. 5ed
Navaneeth How to use google | Ask smart questions
You could also disable array bound checking, which could be even faster. Check the documentation for "unsafe" and "fixed" C# keywords.
-
That was brilliant and simple. 5ed
Navaneeth How to use google | Ask smart questions
-
Parsing a string is not very hard. I put together this code, it should cover the basics:
public static int ParseInt32(string text) {
long value = 0;
long sign = 1;
bool first = true;
foreach (char c in text) {
if (c >= '0' && c <= '9') {
value = value * 10 + c - '0';
} else if (c == '-' && first) {
sign = -1;
} else {
throw new FormatException();
}
first = false;
}
value *= sign;
if (value < int.MinValue || value > int.MaxValue) throw new OverflowException();
return (int)value;
}Despite everything, the person most likely to be fooling you next is yourself.
-
You could also disable array bound checking, which could be even faster. Check the documentation for "unsafe" and "fixed" C# keywords.
Sometimes yes, sometimes no. According to my measurements, sometimes the cost of using
fixed
is almost the same as bounds checking (especially for small arrays), because it calls a function to pin the referred object and return a pointer to it. (You can check it with IL DASM.) -
Sometimes yes, sometimes no. According to my measurements, sometimes the cost of using
fixed
is almost the same as bounds checking (especially for small arrays), because it calls a function to pin the referred object and return a pointer to it. (You can check it with IL DASM.)Absolutely. One must choose the solution that fits best, that is why testing more than one way to achieve this is the best way to find out.