To get rid of the document format, you may consider generating the XML document yourself. To do that, you can use Automation or a third-party component (the better option, IMO) to access the Word document, then use XSL to transform the data into a lightweight XML file.
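As a rough illustration of what "generating the XML yourself" could look like, here is a minimal Python sketch. It rests on assumptions that are not part of the answer above: that the file is a .docx (a zip archive whose body lives in `word/document.xml`), and that a flat list of paragraphs is "light" enough. It reads the archive directly with the standard library instead of going through Automation or XSL, and the `docx_to_light_xml` function and the `<document>`/`<p>` element names are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used inside word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_light_xml(source):
    """Return a minimal XML string with one <p> per non-empty paragraph.

    `source` is a path or a binary file-like object holding a .docx file
    (assumption: the older binary .doc format is not handled here).
    """
    with zipfile.ZipFile(source) as archive:
        body_xml = archive.read("word/document.xml")

    root = ET.fromstring(body_xml)
    light = ET.Element("document")  # hypothetical output element name

    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join all <w:t> pieces.
        text = "".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t"))
        if text.strip():
            ET.SubElement(light, "p").text = text

    return ET.tostring(light, encoding="unicode")
```

Calling `docx_to_light_xml("report.docx")` (a placeholder file name) would return the paragraphs as a single XML string, with all Word formatting dropped. If you need styles, tables, or the binary .doc format, a third-party component is still likely the more practical route.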