How to convert from HTML to XML
-
Hey all, I have been writing a tool that lets you convert XML files into .docx files (Word 2007). For this, one needs the XML file and a related XSL file. Now I also need to convert the Word document back to the basic XML file that was used earlier. I couldn't figure out a way to do that, so I converted the Word document to a basic HTML file. Now I am in a fix as to how to convert that file into the original XML file. Is the technique I am following (Word > HTML > XML) correct? Or is there an easier or better way to do this? Please help me out. Thanks in advance, Teja
-
I guess the question is whether your transformation from XML to DOCX is lossy or lossless. If the transformation loses information about the content or structure of the original XML file, then you cannot reconstitute the exact original XML file, regardless of intermediary transformations such as to HTML. Otherwise, if your transformation is lossless, then you should have enough information to write an XSLT document that provides the inverse of the original XSLT. In that case, your intermediary HTML step is not needed.
"we must lose precision to make significant statements about complex systems." -deKorvin on uncertainty
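To make the lossy/lossless distinction concrete, here is a minimal sketch in Python (not XSLT, and not the poster's actual stylesheet). It assumes a toy source format in which each child element becomes one Word paragraph; the function name `to_wordml` and the sample documents are hypothetical. The namespace shown is the WordprocessingML one used by .docx files. If the mapping drops the element names, two different source files produce identical output, so no inverse can tell them apart.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside .docx packages.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def to_wordml(source_xml, keep_names):
    """Map each child of the source root to one w:p paragraph.

    With keep_names=True the source element name is stored as the
    paragraph style; with keep_names=False it is thrown away.
    """
    src = ET.fromstring(source_xml)
    body = ET.Element(f"{{{W}}}body")
    for child in src:
        p = ET.SubElement(body, f"{{{W}}}p")
        if keep_names:
            ppr = ET.SubElement(p, f"{{{W}}}pPr")
            ET.SubElement(ppr, f"{{{W}}}pStyle", {f"{{{W}}}val": child.tag})
        r = ET.SubElement(p, f"{{{W}}}r")
        t = ET.SubElement(r, f"{{{W}}}t")
        t.text = child.text
    return ET.tostring(body, encoding="unicode")


# Two different (hypothetical) source documents with the same text.
doc_a = "<note><title>Hello</title></note>"
doc_b = "<note><summary>Hello</summary></note>"

# Dropping the names makes the two outputs identical: information is lost.
lossy_collides = to_wordml(doc_a, False) == to_wordml(doc_b, False)
# Keeping the names as styles keeps these two outputs distinguishable.
lossless_collides = to_wordml(doc_a, True) == to_wordml(doc_b, True)
```

Checking your own forward transformation this way (do two different inputs ever give the same output?) is a quick test of whether an inverse can exist at all.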
-
Hey Curtis, thanks for the reply. Mine is a lossless transformation. I need a clarification: what do you mean by a reverse XSLT? Once I have my reverse XSLT, do I need to run it on the WordML document in order to transform it into the original XML? Please help. Thanks in advance, Teja
-
Hi, Teja. You need to craft another XSLT document that is the inverse of your original XSLT document. You have XML -> WordML; now you need a WordML -> XML transformation as well. Then you just run the new XSLT on the WordML document to turn it back into XML. Do you want to do this because you don't retain the original XML? Or do you transform from XML so that users can edit the content in Word and then, after their edits, you want to replace it in your document management/repository system? You're welcome, in hindsight, Curtis.
"we must lose precision to make significant statements about complex systems." -deKorvin on uncertainty
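One practical detail when running a transformation on a Word 2007 file: a .docx is a ZIP package, and Word normally stores the main document body in the part `word/document.xml`, so that part has to be pulled out before any WordML -> XML step can read it. The sketch below shows the extraction in Python; the helper name `read_main_part` is hypothetical, and the in-memory package is a toy stand-in (it lacks the other parts a real Word file has) used only so the example runs on its own.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside .docx packages.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_main_part(docx_bytes):
    """Return the raw bytes of word/document.xml from a .docx package."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        return package.read("word/document.xml")


# Build a toy package in memory (NOT a complete, openable Word file).
document_xml = (
    f'<w:document xmlns:w="{W}"><w:body>'
    "<w:p><w:r><w:t>Hello</w:t></w:r></w:p>"
    "</w:body></w:document>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("word/document.xml", document_xml)

# Extract the main part and parse it; this is the XML an inverse
# transformation would take as its input.
root = ET.fromstring(read_main_part(buffer.getvalue()))
texts = [t.text for t in root.iter(f"{{{W}}}t")]
```

For a file on disk, the same function can be fed the result of reading the .docx in binary mode.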
-
Hey Curtis, wow, I didn't know that an inverse XSLT could convert the WordML back to the original XML. I guess I now need to figure out how to write the inverse XSLT. Yes, your interpretation of what I am trying to do is correct: I am trying to let a user open an XML file as a Word document, and when they edit and save it, I want to save it back in the original XML format. Could you point me to any sites where I could look at an XSLT that goes from Word to XML? If I get an example, I could start writing my own XSLT for the document. Thanks for the great help, and thanks in advance, Teja
-
Teja, I regret to say that I know of no Web sites or books that explain how to create an inverse XSLT. However, since you have the original one, I would recommend going through the transformation command by command, figuring out the XML -> WordML mappings, and then trying to capture the reverse of each mapping in another XSLT. I don't use it myself (because it costs money and I'm cheap :) ), but Altova XMLSpy can help you understand the transformations with its XSLT editor and debugger. I think Altova used to offer a free edition for download, or you could ask your company to purchase a license for the Standard, Professional, or Enterprise edition. (Note: I do not work for or with Altova.) Hope that helps and happy hacking! You're welcome in hindsight, Curtis.
"we must lose precision to make significant statements about complex systems." -deKorvin on uncertainty
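The "go through the mappings one by one and reverse them" idea can be sketched as follows. This is Python with ElementTree rather than XSLT, and the mapping table, the element names (`note`, `title`, `para`), and the style names are all hypothetical; a real stylesheet will have its own. The point is only the shape of the work: write down each forward rule, check that the table is one-to-one, swap it, and apply the swapped table to the WordML.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside .docx packages.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical forward rules read off the original stylesheet:
# source element name -> Word paragraph style.
FORWARD = {"title": "Heading1", "para": "Normal"}

# Swap the table; if two elements shared a style, the swap would drop
# an entry, which means that rule cannot be inverted.
INVERSE = {style: name for name, style in FORWARD.items()}
assert len(INVERSE) == len(FORWARD), "forward mapping is not one-to-one"


def wordml_to_xml(wordml, root_name="note"):
    """Rebuild the source XML from styled WordML paragraphs.

    Assumes every paragraph carries a w:pStyle listed in INVERSE.
    """
    body = ET.fromstring(wordml)
    root = ET.Element(root_name)
    for p in body.iter(f"{{{W}}}p"):
        style = p.find("w:pPr/w:pStyle", NS)
        name = INVERSE[style.get(f"{{{W}}}val")]
        child = ET.SubElement(root, name)
        # Word may split edited text across several runs, so join them.
        child.text = "".join(t.text or "" for t in p.iter(f"{{{W}}}t"))
    return ET.tostring(root, encoding="unicode")


sample = (
    f'<w:body xmlns:w="{W}">'
    '<w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr>'
    "<w:r><w:t>Hello</w:t></w:r></w:p>"
    '<w:p><w:pPr><w:pStyle w:val="Normal"/></w:pPr>'
    "<w:r><w:t>Wor</w:t></w:r><w:r><w:t>ld</w:t></w:r></w:p>"
    "</w:body>"
)
restored = wordml_to_xml(sample)
```

Each entry in `INVERSE` corresponds to one template you would write in the inverse XSLT, matching paragraphs by style and emitting the original element.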
-
Curtis, thanks a million for the reply. I guess this is a path with fewer flowers and more thorns. Let me check out your suggestion and get back to you. You should know that I am not the best of persons at writing XSLTs. All I have done is take the XSLT of another XML document and modify it so that it shows my XML in Word. You should also understand that I am trying to generate an XSLT and convert my XML to Word and back programmatically in .NET, so I will have to figure out a way to do that. Thanks loads, Teja