Converting semi-structured htm data to xml
-
Hi everyone, posting for the first time here - this looks to be a very active forum. I have some data in an htm file, enclosed in "pre" tags (well, at least the opening tag is there). It looks like this:
----- --------- ----- --------- ---- ------------------------------------- ---------------- ---------- -------- ---------------- ----------------------
Official Pace Chip Gender Category Split Split Split
Place Gun Time km Time # Name City Plce/Tot Plce/Tot Category @10km @15km @20km
1 1:37:58.2 3:16 1:37:58.2 3 BETCHIM, NOURDDINE Montreal 1/2297 1/277 Men 30 - 34 32:37 49:24 1:05:25
2 1:37:59.9 3:16 1:37:59.2 2 NSENGIYUMVA, JOSEPH Ottawa 2/2297 1/140 Men 25 - 29 32:37 49:23 1:05:26
3 1:38:50.0 3:18 1:38:50.0 5 DEHBI, AMOR Montreal 3/2297 2/277 Men 30 - 34 32:37 49:23 1:05:46
4 1:40:15.4 3:21 1:40:15.4 7 KASSAP, DANNY Toronto 4/2297 1/52 Men 20 - 24 32:38 49:23 1:05:35
5 1:42:04.6 3:25 1:42:04.6 19 PAULING, RYAN Rochester 5/2297 2/140 Men 25 - 29 33:34 50:50 1:07:50
6 1:42:25.0 3:25 1:42:25.0 1817 VOLLMER, MARK Guelph 6/2297 3/140 Men 25 - 29 33:33 50:49 1:07:49
7 1:43:10.6 3:27 1:43:10.6 12 GRAF, ERIC Meadville 7/2297 4/140 Men 25 - 29 33:48 51:22 1:08:27
8 1:43:14.8 3:27 1:43:13.8 16 PRINCIC, DANIEL Meadville 8/2297 5/140 Men 25 - 29 34:46 52:03 1:08:42
9 1:43:19.5 3:27 1:43:18.8 4462 KEMP, PAUL Toronto 9/2297 1/354 Men 35 - 39 34:12 51:33 1:08:25
10 1:43:44.5 3:28 1:43:42.9 1256 MACDONALD, JAY Hamilton 10/2297 6/140 Men 25 - 29 33:33 50:50 1:07:49
I'd like to convert this to XML, and I think it's best to do this programmatically. I'm considering using C# or Perl to do this, but I'm open to other ideas. I ha
-
RichardInToronto wrote:
I don't know what class I would use to produce the XML and serialize it to disk.
Look in the System.Xml namespace. Also look for tutorial-style articles on CodeProject and msdn.microsoft.com. You can use a DOM object, System.Xml.XmlDocument, or perhaps an XmlWriter would work better for your solution.
led mike
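For anyone taking the "other ideas" route, here is a minimal sketch of the DOM-style approach in Python rather than C#. The standard-library xml.etree.ElementTree module plays roughly the role that XmlDocument plays in System.Xml: build a tree in memory, then serialize it to disk. The element names (results, runner, name, city, gun_time), the write_results helper, and the output file name are assumptions made for illustration, not anything from the original data.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_results(records, path):
    """Build an in-memory tree from a list of dicts and write it to `path`."""
    root = ET.Element("results")
    for record in records:
        # The place number becomes an attribute; other fields become child elements.
        runner = ET.SubElement(root, "runner", place=record["place"])
        ET.SubElement(runner, "name").text = record["name"]
        ET.SubElement(runner, "city").text = record["city"]
        ET.SubElement(runner, "gun_time").text = record["gun_time"]
    # Serialize the whole tree, including the <?xml ...?> declaration.
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


# Two records typed in by hand from the sample data in the question.
records = [
    {"place": "1", "name": "BETCHIM, NOURDDINE", "city": "Montreal", "gun_time": "1:37:58.2"},
    {"place": "2", "name": "NSENGIYUMVA, JOSEPH", "city": "Ottawa", "gun_time": "1:37:59.9"},
]

out_path = os.path.join(tempfile.gettempdir(), "results.xml")
write_results(records, out_path)

# Read the file back to confirm it is well-formed XML.
reloaded = ET.parse(out_path).getroot()
```

Building the whole tree first is fine for a few thousand rows. For much larger files, a streaming writer (the XmlWriter idea) avoids holding everything in memory.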
-
Use the regular expression object in C#:
1. Use Regex.
2. Do grouping in the Regex.
3. Use MatchEvaluator if needed.
4. Add the grouping result values to a DataSet.
5. Save the DataSet to XML.
You are done!
Sreenath
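The steps above can be sketched roughly as follows, here in Python instead of C#, with named regex groups standing in for Regex grouping and a list of dicts standing in for the DataSet. The pattern is an assumption based only on the ten sample rows: it expects single-space-separated fields as pasted above, a name of the form "SURNAME, GIVEN" with a one-word given name, and it would need adjusting (or a fixed-width slice driven by the dashed ruler line) for names or cities that break that shape. The group names and the results/runner element names are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# One named group per column; header and ruler lines simply fail to match.
ROW = re.compile(r"""
    ^\s*(?P<place>\d+)\s+
    (?P<gun_time>[\d:.]+)\s+
    (?P<pace>[\d:]+)\s+
    (?P<chip_time>[\d:.]+)\s+
    (?P<bib>\d+)\s+
    (?P<name>[^,]+,\s*\S+)\s+
    (?P<city>.+?)\s+
    (?P<gender_place>\d+/\d+)\s+
    (?P<category_place>\d+/\d+)\s+
    (?P<category>.+?)\s+
    (?P<split_10km>[\d:]+)\s+
    (?P<split_15km>[\d:]+)\s+
    (?P<split_20km>[\d:]+)\s*$
""", re.VERBOSE)


def parse_rows(text):
    """Return one dict per result row; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append(match.groupdict())
    return rows


def rows_to_xml(rows):
    """Turn each dict into a <runner> element with one child per field."""
    root = ET.Element("results")
    for row in rows:
        runner = ET.SubElement(root, "runner")
        for field, value in row.items():
            ET.SubElement(runner, field).text = value
    return root


SAMPLE = """\
Place Gun Time km Time # Name City Plce/Tot Plce/Tot Category @10km @15km @20km
1 1:37:58.2 3:16 1:37:58.2 3 BETCHIM, NOURDDINE Montreal 1/2297 1/277 Men 30 - 34 32:37 49:24 1:05:25
10 1:43:44.5 3:28 1:43:42.9 1256 MACDONALD, JAY Hamilton 10/2297 6/140 Men 25 - 29 33:33 50:50 1:07:49
"""

rows = parse_rows(SAMPLE)
xml_text = ET.tostring(rows_to_xml(rows), encoding="unicode")
```

Counting the matched rows against the total in the Plce/Tot column (2297 here) is a cheap way to spot lines the pattern silently skipped.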
-
Thanks for the help Mike.
-
Thanks for your help Sreenath.