Fast square root algorithm and using SSE Intrinsics to boost floating point calculation performance
-
Hey Guys, I'm working on a simulation project that involves writing a highly accurate vehicle dynamics simulator. I'm writing the project in MFC/C++ with Visual Studio 2008. I've hit a bit of a block, so I figured I would turn to the wisdom contained in this community.
1) I need to perform a large number of square root calculations. Would anyone happen to know a really fast square root algorithm for C/C++ in Visual Studio 2008? In my case I would be feeding it doubles.
2) I've read quite a bit about SSE intrinsics, but apart from limited examples I haven't found a good source that says "do it this way" or "this is how you implement it". I've written the numerical engine using double variables, so does anyone have tips on how to go about converting code that uses doubles to SSE intrinsics?
Many Thanks, Danny
-
The best answer to this is to use very strange and dangerous magic, as described here: the "fast inverse square root" bit trick, which approximates 1/sqrt(x). (This is for floats, but it should be extendable to doubles easily enough.) It's usually attributed to John Carmack, although I don't know whether he came up with it or got it from somewhere else. PS - there's even a Wikipedia article on it!! :)
There are three kinds of people in the world - those who can count and those who can't...
-
One "trick" I have resorted to in the past is to minimize, if not remove, the use of square roots. There are some calculations where a square root is unavoidable, but in many there are alternatives. For example, when computing distances to find the closest object, don't compare the actual distances; compare the squared distances instead, because the actual distance requires a square root.
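As a rough sketch of the squared-distance idea, the search below never calls a square root. The `Point` struct and the function names are assumptions made up for this example, not code from the original project.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2D point type for illustration.
struct Point {
    double x;
    double y;
};

// Squared Euclidean distance: no square root needed.
inline double dist_sq(const Point& a, const Point& b) {
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Returns the index of the point closest to target,
// or points.size() if the vector is empty.
inline std::size_t closest_index(const std::vector<Point>& points,
                                 const Point& target) {
    std::size_t best = points.size();
    double best_sq = 0.0;
    for (std::size_t i = 0; i < points.size(); ++i) {
        const double d = dist_sq(points[i], target);
        if (best == points.size() || d < best_sq) {
            best = i;
            best_sq = d;
        }
    }
    return best;
}
```

This works because the square root is monotonic for non-negative values, so ordering by squared distance gives the same winner as ordering by distance. If the actual distance is needed afterwards, a single square root of the winning squared distance is enough.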
-
Hi, is your application already using MMX/SSE code? Is it vectorized at all? If not, it won't make sense to use SSE just to get a square root; you would have to move the data into and out of the vector unit, which would forfeit any speed gain you might achieve. So just apply regular optimization on doubles. Some of these have already been mentioned:
- avoid SQRT; don't use it when it isn't necessary.
- use an approximation, if that is acceptable.
- for the range of numbers you're interested in, find a fast way (often a polynomial, or something smart, sometimes called "magic").
- if you need SQRT on a series of related numbers, try using each result as the initial estimate for the next.
:)
Luc Pattyn
Local announcement (Antwerp region): Lange Wapper? Neen!
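For the case where the data is already laid out as arrays of doubles, the reply's point about vectorization can be illustrated with a small SSE2 sketch. `_mm_loadu_pd`, `_mm_sqrt_pd` and `_mm_storeu_pd` are real SSE2 intrinsics from `<emmintrin.h>`; the function name `sqrt_array` and the `SKETCH_HAVE_SSE2` macro are assumptions for this example. It falls back to `std::sqrt` when SSE2 is not available.

```cpp
#include <cmath>
#include <cstddef>

// Enable the SSE2 path only where the compiler targets it.
// SKETCH_HAVE_SSE2 is a made-up macro for this example.
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>  // SSE2 intrinsics (__m128d, _mm_sqrt_pd, ...)
#define SKETCH_HAVE_SSE2 1
#endif

// Writes sqrt(in[i]) to out[i] for i in [0, n).
inline void sqrt_array(const double* in, double* out, std::size_t n) {
    std::size_t i = 0;
#ifdef SKETCH_HAVE_SSE2
    // Process two doubles per iteration; unaligned loads/stores are used
    // so the arrays need no special alignment.
    for (; i + 2 <= n; i += 2) {
        const __m128d v = _mm_loadu_pd(in + i);
        _mm_storeu_pd(out + i, _mm_sqrt_pd(v));
    }
#endif
    // Scalar tail (or the whole array when SSE2 is unavailable).
    for (; i < n; ++i) {
        out[i] = std::sqrt(in[i]);
    }
}
```

The gain comes from handling two values per instruction over a whole array; converting a single scalar double to a vector and back for one square root is unlikely to help, which matches the advice above. Any speedup should be confirmed by profiling the real workload.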
-
Hey Guys, many thanks for all your help. It has proven very useful. All the Best, Danny