Thanks, Jens! I've solved the problem. My solution was to write an HTML analyzer that parses the HTML document into a DOM tree; altering the DOM tree is then much easier.
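For anyone who finds this later, here is a rough sketch of the "parse into a tree, then alter the tree" idea. It is not the analyzer described in the post (that code was never shown); it is a Python illustration built on the standard `html.parser` module, and the names `Node`, `TreeBuilder`, `walk`, `serialize` and `REPLACEMENTS` are made up for the example. Comments and doctype declarations are ignored to keep it short.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never have children or a closing tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}
RAW_TEXT = {"script", "style"}  # their text must not be HTML-escaped


class Node:
    """Hypothetical tree node: tag is None for text nodes."""
    def __init__(self, tag=None, attrs=None, text=None):
        self.tag = tag
        self.attrs = attrs or []
        self.text = text
        self.children = []


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = Node(tag="#root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag=tag, attrs=list(attrs))
        self.stack[-1].children.append(node)
        if tag not in VOID:
            self.stack.append(node)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <img ... />: never becomes a parent.
        self.stack[-1].children.append(Node(tag=tag, attrs=list(attrs)))

    def handle_endtag(self, tag):
        # Close the nearest matching open element; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].children.append(Node(text=data))


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def walk(node):
    """Yield every node in document order."""
    yield node
    for child in node.children:
        yield from walk(child)


def serialize(node, raw=False):
    if node.tag is None:
        return node.text if raw else escape(node.text, quote=False)
    inner = "".join(serialize(c, raw=node.tag in RAW_TEXT) for c in node.children)
    if node.tag == "#root":
        return inner
    attrs = "".join(f" {k}" if v is None else f' {k}="{escape(v)}"'
                    for k, v in node.attrs)
    if node.tag in VOID:
        return f"<{node.tag}{attrs} />"
    return f"<{node.tag}{attrs}>{inner}</{node.tag}>"


# Example: change the text nodes "Home" and "good" from the question.
REPLACEMENTS = {"Home": "Default", "good": "better"}
tree = parse('<p><a href="./default.aspx">Home</a> good</p>')
for n in walk(tree):
    if n.tag is None and n.text.strip() in REPLACEMENTS:
        key = n.text.strip()
        n.text = n.text.replace(key, REPLACEMENTS[key])
result = serialize(tree)
print(result)
```

Once the page is a tree, changing text or attributes is plain object manipulation, which is why this route tends to be easier than editing the HTML string directly.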
wnfk
Posts
-
How could I analyze and do something to a web page's content?
How could I analyze and do something to a web page's content?
Jens Meyer wrote:
Maybe you should take a look at HttpModules. You can find many articles on that topic here on CodeProject. In such a module you can alter the result stream (your HTML code, for example) just the way you want, as you would with the proxy.
Thanks, Jens. The problem I have now is how to alter the result stream (string), specifically how to turn the website's relative paths into absolute paths. For example, when the code is
<img src="./a.gif" />, <a href="./default.aspx">Home</a>
good
how do I change it to
<img src="http://www.example.com/a.gif" />, <a href="http://www.example.com/default.aspx">Default</a>
better
And how do I get the words "Home" and "good" and change them to "Default" and "better"? Is there any example code you know of? Thanks again, Jens!
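On the relative-to-absolute part: most platforms have a URL-resolution helper, so the path arithmetic does not need to be done by hand. A minimal sketch in Python, using the standard `urllib.parse.urljoin`; the `BASE` value and the `to_absolute` helper are assumptions for this example, with the base taken from the URLs in the question.

```python
from urllib.parse import urljoin

# Assumed address of the page whose links are being rewritten.
BASE = "http://www.example.com/"


def to_absolute(link, base=BASE):
    """Resolve a possibly relative link against the page's base URL."""
    return urljoin(base, link)


print(to_absolute("./a.gif"))         # http://www.example.com/a.gif
print(to_absolute("./default.aspx"))  # http://www.example.com/default.aspx
```

Links that are already absolute come back unchanged, so the helper can be applied to every `src` and `href` value without checking first.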
-
How could I analyze and do something to a web page's content?
Hi guys, thanks in advance. I'd like to build something like Google Translate, but simpler, and somewhat like an online web proxy. It should do the following:
1. Get all text nodes from a web page.
2. Make some changes to the text nodes obtained in the first step.
3. Put the changes back into the original web page.
4. Display the changed web page.
For example, to make changes to www.bing.com, I'd like to do the following:
1. Get the web content of www.bing.com with WebRequest. Let's assume the result is as below:
<html>
...
<body>
<span>WebPage</span>
<span>Pictures</span>
.....
<img src="/aa.gif" />
<input type="text" />
<input type="submit" value="Search" />
</body>
</html>
2. The changed web page content is as below:
<html>
...
<body>
<span>MyTranslatedWebPage</span>
<span>MyTranslatedPictures</span>
.....
<img src="http://www.bing.com/aa.gif" />
<input type="text" />
<input type="submit" value="Search" />
</body>
</html>
Could anybody give me some input based on the description above? Thanks in advance!
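The four steps above can be sketched as a single pass over the page: re-emit the markup as it is parsed, run every text node through a translation function, and resolve `src`/`href` values against the page address. This is only an illustration in Python using the standard `html.parser` module, not a finished proxy; `PageRewriter`, `rewrite`, `GLOSSARY` and the set of URL attributes are assumptions made up for the example. Fetching the page (WebRequest in the question) is left out; the sample markup is passed in as a string.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin

URL_ATTRS = {"src", "href", "action"}  # assumed set of attributes holding URLs
RAW_TEXT = {"script", "style"}         # text inside these is left alone


class PageRewriter(HTMLParser):
    def __init__(self, base_url, translate):
        super().__init__(convert_charrefs=True)
        self.base_url = base_url
        self.translate = translate  # callable: str -> str
        self.out = []
        self.raw_depth = 0

    def _tag(self, tag, attrs, closing):
        parts = [tag]
        for name, value in attrs:
            if value is None:          # attribute without a value
                parts.append(name)
                continue
            if name in URL_ATTRS:      # relative -> absolute
                value = urljoin(self.base_url, value)
            parts.append(f'{name}="{escape(value)}"')
        return "<" + " ".join(parts) + closing

    def handle_starttag(self, tag, attrs):
        if tag in RAW_TEXT:
            self.raw_depth += 1
        self.out.append(self._tag(tag, attrs, ">"))

    def handle_startendtag(self, tag, attrs):
        self.out.append(self._tag(tag, attrs, " />"))

    def handle_endtag(self, tag):
        if tag in RAW_TEXT and self.raw_depth:
            self.raw_depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        core = data.strip()
        if self.raw_depth or not core:
            self.out.append(data)      # script/style or whitespace only
            return
        # Steps 1-3: take the text node, change it, put it back,
        # keeping the original leading and trailing whitespace.
        lead = data[:len(data) - len(data.lstrip())]
        trail = data[len(data.rstrip()):]
        self.out.append(lead + escape(self.translate(core), quote=False) + trail)

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def rewrite(html, base_url, translate):
    parser = PageRewriter(base_url, translate)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# Stand-in for a real translation step: a fixed lookup table.
GLOSSARY = {"WebPage": "MyTranslatedWebPage", "Pictures": "MyTranslatedPictures"}
page = ('<body><span>WebPage</span><img src="/aa.gif" />'
        '<input type="submit" value="Search" /></body>')
result = rewrite(page, "http://www.bing.com/", lambda t: GLOSSARY.get(t, t))
print(result)
```

Attribute values such as `value="Search"` are not text nodes, so they pass through untouched, which matches the expected output in the question. Step 4 is then a matter of sending `result` back to the browser.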