fill memory with a pattern
-
Is there an efficient algorithm to fill memory with an arbitrary pattern?
void memoryfill( void* target, char* pattern, size_t patlen, size_t numRepeats )
{
if( 1 == patlen ) memset(target, *pattern, numRepeats);
else ???
}I came up with something that makes 2*log(numRepeats) calls to memcpy(), but figured this must be a standard thing to do, and there is probably a good algorithm (better than mine?) already out there. I couldn't find anything on CodeProject, or on Google. David
-
Is there an efficient algorithm to fill memory with an arbitrary pattern?
void memoryfill( void* target, char* pattern, size_t patlen, size_t numRepeats )
{
if( 1 == patlen ) memset(target, *pattern, numRepeats);
else ???
}I came up with something that makes 2*log(numRepeats) calls to memcpy(), but figured this must be a standard thing to do, and there is probably a good algorithm (better than mine?) already out there. I couldn't find anything on CodeProject, or on Google. David
I wasn't getting any responses, so I went ahead an posted my O(log(n)) solution at http://www.codeproject.com/script/comments/forums.asp?forumid=1647&XtraIDs=1647&sd=13+Jul+2007&ed=11+Oct+2007&author=jay_p_patel&stype=1&select=2267440&mpp=50&fr=167&df=100#xx2267440xx[^]. Perhaps there is not a textbook algorithm for this, and if not, then maybe my algorithm will help someone. David
-
Is there an efficient algorithm to fill memory with an arbitrary pattern?
void memoryfill( void* target, char* pattern, size_t patlen, size_t numRepeats )
{
if( 1 == patlen ) memset(target, *pattern, numRepeats);
else ???
}I came up with something that makes 2*log(numRepeats) calls to memcpy(), but figured this must be a standard thing to do, and there is probably a good algorithm (better than mine?) already out there. I couldn't find anything on CodeProject, or on Google. David
You can use destructive overlap to write your fill pattern as fast as your CPU can go. Here's the algorithm: 1. Copy one instance of the pattern to target. 2. Copy memory from target to (target + patlen) with length (numRepeats - 1) * patlen. (This is one function call.) I believe this is the most efficient algorithm possible. The implementation of memcpy will probably be more efficient than anything you could write in a higher-level language. Also you're copying from addresses you just wrote to, so they'll already be in the cache. This lets the CPU avoid getting these bytes from (relatively) slow external memory chips. This algorithm should fly like the wind!
-
You can use destructive overlap to write your fill pattern as fast as your CPU can go. Here's the algorithm: 1. Copy one instance of the pattern to target. 2. Copy memory from target to (target + patlen) with length (numRepeats - 1) * patlen. (This is one function call.) I believe this is the most efficient algorithm possible. The implementation of memcpy will probably be more efficient than anything you could write in a higher-level language. Also you're copying from addresses you just wrote to, so they'll already be in the cache. This lets the CPU avoid getting these bytes from (relatively) slow external memory chips. This algorithm should fly like the wind!
IF (a few critical things...) then I think you're right. I think your algorithm will work really really good. I'm working on other things, and don't really want to write a timing test, but I believe you're probably right about it being the fastest possible algorithm. Now for the critical things: IF memcpy copies from the end of the array instead of from the base, this algorithm won't work at all. Depending on how it computes the target (if copying from the end), it'll either copy garbage until it finally hits the first pattern, and then make one good copy of it, or it'll make one good copy first, then cause a GPF by attempting to overwrite memory to the left of the src. Here's a quote from "STANDARD C" by P.J. Plauger and Jim Brodie regarding memcpy: "The elements of the array can be accessed and stored in any order." It could even start in the middle of the array if that favors the target environment. I suppose one could bracket the code with #if's, assuming one could absolutely determine how memcpy worked; and guarantee that it never changes it's mind depending on any number of things. The other alternative is to simply write the algorithm myself in C. This way I could guarantee copying from the base of the array instead of from the end. But, as you point out, memcpy is highly tuned for the machine architecture, and my "roll your own" isn't going to work as well. Thanks a lot anyway. I really appreciate the thoughts. Maybe your idea will come in handy in some of my future projects. It was a really good idea. David
-
IF (a few critical things...) then I think you're right. I think your algorithm will work really really good. I'm working on other things, and don't really want to write a timing test, but I believe you're probably right about it being the fastest possible algorithm. Now for the critical things: IF memcpy copies from the end of the array instead of from the base, this algorithm won't work at all. Depending on how it computes the target (if copying from the end), it'll either copy garbage until it finally hits the first pattern, and then make one good copy of it, or it'll make one good copy first, then cause a GPF by attempting to overwrite memory to the left of the src. Here's a quote from "STANDARD C" by P.J. Plauger and Jim Brodie regarding memcpy: "The elements of the array can be accessed and stored in any order." It could even start in the middle of the array if that favors the target environment. I suppose one could bracket the code with #if's, assuming one could absolutely determine how memcpy worked; and guarantee that it never changes it's mind depending on any number of things. The other alternative is to simply write the algorithm myself in C. This way I could guarantee copying from the base of the array instead of from the end. But, as you point out, memcpy is highly tuned for the machine architecture, and my "roll your own" isn't going to work as well. Thanks a lot anyway. I really appreciate the thoughts. Maybe your idea will come in handy in some of my future projects. It was a really good idea. David
After thinking a bit more about it, I realized the stuff about the GPF is probably not correct. It would just copy garbage. As I think thru things, I try to document them so I (or someone else) won't come back later on and try to improve on the code without throughly thinking things thru. Based on your suggestion, I've added the following comments to my memoryFill() function.
void memoryFill( void* target, char* pattern, size_t ptrnlen, size_t numRepeats )
{
char* trgt = (char*)target; //for ease in pointer manipulation.
...
//
// If we KNEW that memcpy copied from the base of the src array to the base
// of the target array, we could write an EXTREMELY fast algorithm:
// memcpy( trgt, pattern, ptrnlen ); //put one copy at the base.
// memcpy( trgt+ptrnlen, trgt, ptrnlen*(numRepeats-1) ); //destructive copy the rest
// Conversely, if we KNEW that memcpy copied from the end of the src array
// to the end of the target array, we could write:
// memcpy( trgt+ptrnlen*(numRepeats-1) , pattern, ptrnlen ); //put one copy at the end.
// memcpy( trgt, trgt+ptrnlen, ptrnlen*(numRepeats-1) ); //destructive copy the rest
// However, the C standard distinctly says that memcpy can do it's copying
// in whatever order it wants. We CANNOT rely on any specific order.
//
...
}David