Probably lots of time is lost in finding a chunk of the wanted size. This used to be a known problem in C run time libraries, that's why many people wrote their own memory management module. A good idea is to prepare 'pools' of predefined sizes (e.g. multiples of 8) and manage lists of vectors in each of the pools. If your application then requests e.g. 15 bytes, you round it up to the next multiple of 8 (16 in this case) and then go to the pool of 16 bytes. If it's empty, you allocate space for e.g. 10 chunks of 16 bytes. The next time you need 16 bytes (>9, <=16), it will be much faster. Of course you need to foresee additional list-pointers in your vector of 10 chunks of 16 bytes to store information about which chunk is already handed out to the application, which one is free, what the size is, ... Enjoy life, this is not a rehearsal !!!