Problem with CHttpFile and HTML
-
Hello,

For one of my company's projects I'm using CHttpConnection and CHttpFile to download HTML files from different web sites. I then extract some data as well as certain links from these files.

The problem is that sometimes the HTML files contain character codes for certain characters instead of the actual characters (%3F instead of ? for example), which is problematic when extracting the data. What's strange is that sometimes the *same* HTML file will contain character codes, and sometimes it will contain the actual characters, which is even more problematic.

I would like the downloaded HTML files to be in a normal "readable" format. Do I have to parse the files and convert the characters myself? Maybe I'm just missing something; unfortunately, I don't know much about the details of HTTP.

Here are my relevant HTTP headers in case this is the source of the problem:

Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
Accept-Language: fr

Hope someone can help!!

Sylv33
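
For reference, if it does come down to converting the codes by hand, here is a minimal sketch of what that could look like. It assumes the codes are standard percent-escapes (a % followed by two hex digits, as in %3F) and that they are decoded in the extracted strings such as links, not in the whole page. The helper names percentDecode and hexValue are made up for illustration and are not part of MFC. The sketch uses std::string, so a CString would need converting first.

```cpp
#include <cctype>
#include <string>

// Convert one hexadecimal digit to its numeric value (0-15).
// The caller is expected to have checked the digit with isxdigit first.
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return c - 'A' + 10;
}

// Hypothetical helper (not an MFC function): replace each %XX escape
// with the single byte it encodes. Anything that is not a well-formed
// escape (for example a lone '%') is copied through unchanged.
std::string percentDecode(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        const bool isEscape =
            in[i] == '%' && i + 2 < in.size() &&
            std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
            std::isxdigit(static_cast<unsigned char>(in[i + 2]));
        if (isEscape) {
            out += static_cast<char>(hexValue(in[i + 1]) * 16 + hexValue(in[i + 2]));
            i += 2;  // skip the two hex digits just consumed
        } else {
            out += in[i];
        }
    }
    return out;
}
```

Running every extracted link through such a function before comparing or storing it would give the same result whether or not the server sent the escaped form, which would also cover the case where the same file arrives in both variants. Note that this only decodes single bytes; if the escapes encode multi-byte text, the decoded bytes would still need to be interpreted in the right character set.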