How to parse special chars
-
I have to parse HTML that is served with charset=iso-8859-2. This is how I currently get the HTML string:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.koroska-on.net/index.php?option=com_glossary&func=display&letter=All&Itemid=0&catid=66&page=1");
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Stream resStream = response.GetResponseStream();
int count = 0;
byte[] buffer = new byte[8192];
do
{
    count = resStream.Read(buffer, 0, buffer.Length);
    if (count != 0)
    {
        // Decoding as ASCII is what turns the special characters into '?'
        tempString = Encoding.ASCII.GetString(buffer, 0, count);
        html += tempString;
    }
} while (count > 0);
The special characters now show up as ? or something similar, because they are outside the ASCII range. How can I read the page so that the special characters display correctly? Thanks, bye.
-
ISO 8859-2 is the Central European (Latin-2) character set, so you need an 8-bit encoding. Most likely
Encoding.GetEncoding("iso-8859-2")
is all it takes; the same encoding is also available by number as code page 28592. Use it in place of Encoding.ASCII when you turn the bytes into a string. It certainly won't work properly with ASCII, as that only covers character codes [0, 127].
Luc Pattyn
Happy New Year to all.
We hope 2010 soon brings us automatic PRE tags!
Until then, please insert them manually.
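For anyone landing here later, a minimal sketch of the suggestion above, assuming the page really is ISO 8859-2 as its charset declares. The class and helper names (Latin2Demo, GetLatin2, DownloadLatin2) are made up for illustration and are not from the original posts. It wraps the response stream in a StreamReader with the right encoding instead of decoding each buffer by hand, and Main shows the difference on three sample bytes without touching the network.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class Latin2Demo
{
    // Hypothetical helper: returns the ISO 8859-2 (Latin-2) encoding.
    public static Encoding GetLatin2()
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // must be registered first. On the classic .NET Framework the encoding
        // is built in, so drop this line there.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("iso-8859-2");
    }

    // Hypothetical helper: downloads a page and decodes it as ISO 8859-2.
    public static string DownloadLatin2(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream resStream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(resStream, GetLatin2()))
        {
            // The reader decodes the whole stream, so no manual buffer loop is needed.
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // The three bytes that encode "čšž" in ISO 8859-2.
        byte[] sample = { 0xE8, 0xB9, 0xBE };

        // ASCII cannot represent bytes above 127, so each one becomes '?'.
        Console.WriteLine(Encoding.ASCII.GetString(sample));

        // Decoding the same bytes as ISO 8859-2 yields the intended letters.
        Console.WriteLine(GetLatin2().GetString(sample));
    }
}
```

To use it on the page from the question, call Latin2Demo.DownloadLatin2 with that URL. If the console shows odd glyphs for the second line, that is usually the console's own output encoding rather than the decoding step.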