A big step forward was calling CoInitialize on each thread. I tried it both with and without a message pump (using MsgWaitForMultipleObjects); the message pump version produced fewer exceptions, but the actual conversion result is the same. I'm really close to giving up on this, the combination of an under-documented API and multi-threading is a bit much :-)

Currently, I have a pool of 4 threads for the different converters (not only the Office ones, but also HTML and various other formats that are unproblematic). For the Office DLL converters, each thread in the pool kicks off another separate thread that does the actual conversion (the only way I could figure out to put a message pump there, and this is how it's done in the MFC/ole/Wordpad sample of MSVC).

Everything works fine (no exceptions at all and correct conversion results) if I block reentry at the call to ForeignToRtf32 with a CRITICAL_SECTION used as a mutex. So if the application only has to handle, let's say, *.docx files, the performance boost from multi-threading is ... exactly zero (!) Functional, but what a waste of effort...

Thanks anyhow for the interesting reading on marshalling!
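
For anyone wanting to see the "block reentry" workaround in isolation, here is a minimal portable sketch. It is not my real code: std::mutex stands in for the CRITICAL_SECTION, and FakeForeignToRtf, ConvertSerialized and RunPool are made-up names (the fake converter just sleeps and counts how many callers are inside it at once, instead of calling the Office DLL).

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the non-reentrant converter entry point
// (ForeignToRtf32 in the real DLL). It tracks how many callers are
// inside it at the same time so the effect of the lock can be observed.
std::atomic<int> g_inside{0};
std::atomic<int> g_maxInside{0};

std::string FakeForeignToRtf(const std::string& path) {
    int now = ++g_inside;
    int prev = g_maxInside.load();
    // Record the highest concurrency level seen so far.
    while (now > prev && !g_maxInside.compare_exchange_weak(prev, now)) {}
    std::this_thread::sleep_for(std::chrono::milliseconds(5));  // pretend to work
    --g_inside;
    return "{\\rtf1 " + path + "}";
}

// Plays the role of the CRITICAL_SECTION guarding the converter call.
std::mutex g_converterLock;

std::string ConvertSerialized(const std::string& path) {
    std::lock_guard<std::mutex> guard(g_converterLock);  // only one caller at a time
    return FakeForeignToRtf(path);
}

// Runs conversions on a pool of 4 threads and returns the maximum number
// of threads that were inside the converter simultaneously.
int RunPool(int filesPerThread) {
    g_inside = 0;
    g_maxInside = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < 4; ++t) {
        pool.emplace_back([filesPerThread, t] {
            for (int i = 0; i < filesPerThread; ++i) {
                ConvertSerialized("file" + std::to_string(t) + "_" +
                                  std::to_string(i) + ".docx");
            }
        });
    }
    for (auto& th : pool) th.join();
    return g_maxInside.load();
}
```

With the lock in place, RunPool never sees more than one thread inside the converter, which is exactly why the four pool threads buy nothing when every job goes through the same converter.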