MFC vs STL performance test
-
What I asked was for someone to verify that my code is correct and perhaps suggest how to make it more valid. I still remember the claims about how "highly portable", "highly standard" and "very fast" the STL was back in 1999; that is really not what I was asking about.
I have never liked this kind of container comparison; it is not very useful, partly for the reasons I mentioned in my previous post. Different real-world scenarios can put stress on a container in a thousand different ways.
-
Hi guys, I did some performance comparison between MFC and STL containers, and I think it could maybe evolve into an article here at CodeProject. Please have a look at my code and suggest some improvements to make the comparison more valid. So far I have found that STL still sucks big time, even after 10 or 15 years of polishing (and neglect of MFC), in VS2012. The complete source code follows; just drop it into a console project with MFC support, compile, and run in Release mode.
```cpp
// MFCvsSTL.cpp : Performance comparison between MFC and STL containers
// (c) Alex Fedorov http://alexf.name 2013

#include "stdafx.h"
#include "MFCvsSTL.h" // nothing special here
#include <unordered_map>
#include <map>
#include <vector>
#include <list>

#ifdef _DEBUG
#define new DEBUG_NEW
#endif

// The one and only application object
CWinApp theApp;

using namespace std;

// typedef map<DWORD, void*> stlmap;
typedef unordered_map<DWORD, void*> stlmap;

int stlMap(int nCount)
{
    int nSize = (int)(1.2 * (float)nCount);
    stlmap a(nSize); // unordered_map<DWORD, void*> a(nSize);
    for (int i = 0; i < nCount; ++i)
    {
        a[i] = (void*)i;
    }
    stlmap::iterator iter;
    for (int j = 0; j < nCount; ++j)
    {
        iter = a.find(abs(rand() * j) % nCount);
    }
    return 1;
}

int mfcMap(int nCount)
{
    int nSize = (int)(1.2 * (float)nCount);
    // CMapWordToPtr a;
    CMap<DWORD, DWORD, void*, void*> a;
    a.InitHashTable(nSize);
    for (int i = 0; i < nCount; ++i)
    {
        a[i] = (void*)i;
    }
    void* val;
    for (int j = 0; j < nCount; ++j)
    {
        a.Lookup(abs(rand() * j) % nCount, val);
    }
    return 0;
}

int stlArray(int nCount)
{
    vector<int> bigarray;
    int nMs = (int)bigarray.max_size();
    try
    {
        bigarray.reserve(nCount);
    }
    catch (...)
    {
        CString str;
        str.Format(_T("Memory allocation error trying to reserve %d elements. vector.max_size=%d\r\n"), nCount, nMs);
        _tprintf(str);
        return 0;
    }
    for (int k = 0; k < nCount; ++k)
    {
        bigarray.push_back(k); // bigarray[k] = k;
    }
    int ret = (int)bigarray.size();
    return ret;
}

int mfcArray(int nCount)
{
    // CArray<int, int> arr;
    // OCArray<int> arr;
    CUIntArray arr;
    arr.SetSize(0, nCount);
    for (int k = 0; k < nCount; ++k)
    {
        arr.Add(k); // arr[k] = k;
    }
    int ret = (int)arr.GetCount();
    return ret;
}

int mfcList(int nCount)
{
    CList<int, int> a;
    for (int k = 0; k < nCount; ++k)
    {
        a.AddHead(k);
    }
    return (int)a.GetCount();
}
```
I can't comment on map/list, but if the MFC array type were performing faster than an STL vector or a built-in array, I'd be very suspicious of what I'd written. In your vector code, the push_back and the exception handling (which you haven't got for the MFC case) would be my first targets for a good hard look.
-
Good point. The exception handling actually does not add anything to the timing, because it's just one large function that's wrapped in it. I was just wondering why I cannot allocate a 500M-element array while vector's max_size() says it can do over 1G, so basically it's a leftover from some debugging. MFC's CArray is actually the only container that I found to be 4-5 times slower than the STL vector, with or without the try-catch block. Maybe if I add random array access to the test, MFC will win?
-
I'm now surprised that the MFC array is so slow; I'd have thought a built-in array, std::vector and CArray would all be about the same speed. I doubt adding random access will change much: provided you don't change the size, a std::vector should be as fast as a built-in array for random access.
-
It's only noticeably slower on really large arrays, like hundreds of millions of elements, probably because CArray does unnecessary work such as zeroing memory before using it.
-
I tried using std::map in an application that handles telemetry data. When it was not working, I tried stepping into a simple call to fetch data from the map. After 100+ calls without reaching the data, I concluded the problem had been found. When the map was replaced with a simple array, the application ran with a very low CPU load. The std::map is easy to use and a great way to get the program fundamentals right while trying to ignore the data storage problem. But if speed is needed, a general and generic solution that can handle any type of data will be inherently much slower. My conclusion: the template libraries are really cool and easy to use, but very inefficient.
Thanks for your time. If you work with telemetry, please check this bulletin board: http://www.bkelly.ws/irig_106/
-
bkelly13 wrote:
My conclusion: the template libraries are really cool and easy to use, but very inefficient.
Well, this is not really an accurate statement, because it depends on the template library that you're using, its implementation, and whether your use fits the standard use it was designed for. std::map containers are never going to be fast, so in your case just about any other container would have been faster. All in all, you can likely make something that is just as fast as or faster than any container because you can make it application-specific (you know how much data you need and how it will be accessed), but that doesn't discount the use of template libraries.
-
When you're choosing a container, choose a vector. Except in one case, it's as fast as an equivalent automatic array. Then, if you find the interface or performance characteristics of something else work better, you can change it later.
Well... a dynamically growing vector will still be slower than a pre-allocated array. You can presize a vector as well, but that requires knowing about the speed cost of dynamic allocation and how vectors allocate their arrays internally. I guess what I'm getting at is that you have to know at least a little about your containers and how you're using them. In summary, no free ride. ;P :)