Removing metadata from MS Office documents
-
Hi folks! Does anybody know how to programmatically remove the metadata from MS Office documents? By metadata I mean things like the author, company or document title (viewable from File > Properties). Other types of metadata are created during the editing process, e.g. tracked changes, hidden text, deleted text or previous versions (viewable from View > Markup). A tool named ezClean does all of this: http://www.kklsoftware.com/index.asp You'll find another tool here: http://www.payneconsulting.com/public/products/ProductDetail.asp?nProductID=29 Even Microsoft has a tool for it: http://www.microsoft.com/downloads/details.aspx?familyid=144e54ed-d43e-42ca-bc7b-5446d34e5360&displaylang=en So does anybody know how these tools do it? Or does anybody have ready-made source code for it? I have researched for quite a while but haven't found anything really useful yet. Thanks a lot for your help! Regards, mYkel
-
Office documents are OLE compound documents. You'll need to use the IPropertyStorage interface. If you search for IPropertyStorage on Google and MSDN, you should find some examples. Michael
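As a rough illustration of the point above, here is a minimal sketch in Python (not the Win32 API itself) that checks whether a file starts with the 8-byte OLE compound file signature and lists the standard property-set stream names and a few property IDs that IPropertyStorage would address. The function and dictionary names are hypothetical and chosen only for this example; actually reading or deleting the properties still requires IPropertySetStorage/IPropertyStorage or an equivalent library.

```python
# Sketch only: identifies OLE compound files and names the standard
# summary property sets. It does not read or modify any properties.

# First 8 bytes of every OLE compound file (Compound File Binary format).
OLE_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")

# Names of the two standard property-set streams in the root storage.
SUMMARY_STREAM = "\x05SummaryInformation"
DOC_SUMMARY_STREAM = "\x05DocumentSummaryInformation"

# A few standard property IDs (PIDSI_* for SummaryInformation,
# PIDDSI_* for DocumentSummaryInformation). Dictionary names are hypothetical.
PIDSI = {"title": 2, "subject": 3, "author": 4, "keywords": 5,
         "comments": 6, "last_author": 8}
PIDDSI = {"manager": 14, "company": 15}


def is_ole_compound_file(data: bytes) -> bool:
    """Return True if the given bytes begin with the OLE signature."""
    return data[:8] == OLE_SIGNATURE


def file_is_ole_compound(path: str) -> bool:
    """Read the first 8 bytes of a file and check the signature."""
    with open(path, "rb") as f:
        return is_ole_compound_file(f.read(8))
```

A file that fails this check is not an OLE compound document, so the IPropertyStorage approach would not apply to it.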
-
Hi Michael, thanks for your reply! With the following links I figured out how to get and set Office document properties such as author, title, comment or company: http://support.microsoft.com/kb/186898/en-us http://msdn.microsoft.com/library/default.asp?url=/library/en-us/stg/stg/the_documentsummaryinformation_and_userdefined_property_sets.asp So the first (and minor) point on my wish list is done. But as far as I can see, the IPropertyStorage interface does not let me edit the major metadata of Office documents, such as tracked changes, hidden text, deleted text or previous versions. See for example the ezClean tool, which handles all of this. Does anybody know how to do this? Regards, mYkel