jonegerton wrote:
Unfortunately this has to work on thousands of documents (corporate rebrand!)
So I guess you have been tasked with replacing all occurrences of "Tiger Woods" with "Tom Watson" or something along those lines?

I tried to make something like that work years ago, i.e. something that tried to re-create the Word "save" logic at a low level. My recollection is that somewhere in the file there is a field holding the length of the data or (less likely) a checksum. I thought I was adjusting that properly, but I never did manage to create "valid" Word documents. Apparently there was some other length field or checksum somewhere that I did not know about.

Eventually I ended up doing the job with automation. I managed to work through the message-box-related issues... I think there are ways to detect the error condition and kill Winword.exe. In the worst case, you could just assume an error occurred if Word has not finished after a certain length of time. None of this is beautiful, but in the end it proved more workable than manually messing around with the file. And I did try mightily to make that work... I was just out of college, had been immersed in a thesis that used Intel assembly, and the low-level approach was definitely the one I preferred.

One more thought: the "DOCX" format of Office 2007 is much more regular and better documented than the old melange of DOC formats. A DOCX is basically a zipped-up collection of XML documents and embedded image files. Have you considered converting to DOCX as the first step of the process? It might make your life easier.
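To make the DOCX idea concrete, here is a rough Python sketch of a find-and-replace that treats the file as a ZIP archive. The function name replace_in_docx is made up for illustration, and the sketch assumes the main body text lives in word/document.xml and is UTF-8 encoded. It also assumes each phrase sits inside a single text run; Word often splits a phrase across several runs, and a plain string replace will miss those cases. Headers, footers and footnotes live in other parts of the archive and are not touched here.

```python
import zipfile
from xml.sax.saxutils import escape


def replace_in_docx(src_path, dst_path, old, new):
    """Copy a DOCX to dst_path, replacing `old` with `new` in the main body.

    Returns the number of replacements made. Hypothetical helper: only
    word/document.xml is edited, and phrases split across runs are missed.
    """
    # The search and replacement strings must be XML-escaped, because the
    # document stores "&" as "&amp;" and so on.
    old_xml, new_xml = escape(old), escape(new)
    count = 0
    with zipfile.ZipFile(src_path) as zin, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "word/document.xml":
                text = data.decode("utf-8")  # assumption: UTF-8 encoded part
                count = text.count(old_xml)
                data = text.replace(old_xml, new_xml).encode("utf-8")
            # Every other part (styles, images, ...) is copied unchanged.
            zout.writestr(item, data)
    return count
```

You would call it once per file, e.g. replace_in_docx("in.docx", "out.docx", "Tiger Woods", "Tom Watson"), and flag any document where the returned count is zero for a manual check.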