Converting semi-structured htm data to xml

    RichardInToronto wrote (#1):

    Hi everyone. Posting here for the first time; this looks to be a very active forum. I have some data in an htm file, enclosed in "pre" tags (well, at least the opening tag is there). It looks like this:

    ----- --------- ----- --------- ---- ------------------------------------- ---------------- ---------- -------- ---------------- ----------------------
          Official   Pace Chip                                                                   Gender    Category                    Split   Split   Split
    Place Gun Time     km Time         # Name                                  City              Plce/Tot  Plce/Tot Category           @10km   @15km   @20km

        1 1:37:58.2  3:16 1:37:58.2    3 BETCHIM, NOURDDINE                    Montreal            1/2297    1/277  Men 30 - 34        32:37   49:24 1:05:25
        2 1:37:59.9  3:16 1:37:59.2    2 NSENGIYUMVA, JOSEPH                   Ottawa              2/2297    1/140  Men 25 - 29        32:37   49:23 1:05:26
        3 1:38:50.0  3:18 1:38:50.0    5 DEHBI, AMOR                           Montreal            3/2297    2/277  Men 30 - 34        32:37   49:23 1:05:46
        4 1:40:15.4  3:21 1:40:15.4    7 KASSAP, DANNY                         Toronto             4/2297    1/52   Men 20 - 24        32:38   49:23 1:05:35
        5 1:42:04.6  3:25 1:42:04.6   19 PAULING, RYAN                         Rochester           5/2297    2/140  Men 25 - 29        33:34   50:50 1:07:50
        6 1:42:25.0  3:25 1:42:25.0 1817 VOLLMER, MARK                         Guelph              6/2297    3/140  Men 25 - 29        33:33   50:49 1:07:49
        7 1:43:10.6  3:27 1:43:10.6   12 GRAF, ERIC                            Meadville           7/2297    4/140  Men 25 - 29        33:48   51:22 1:08:27
        8 1:43:14.8  3:27 1:43:13.8   16 PRINCIC, DANIEL                       Meadville           8/2297    5/140  Men 25 - 29        34:46   52:03 1:08:42
        9 1:43:19.5  3:27 1:43:18.8 4462 KEMP, PAUL                            Toronto             9/2297    1/354  Men 35 - 39        34:12   51:33 1:08:25
       10 1:43:44.5  3:28 1:43:42.9 1256 MACDONALD, JAY                        Hamilton           10/2297    6/140  Men 25 - 29        33:33   50:50 1:07:49

    I'd like to convert this to XML, and I think it's best to do this programmatically. I'm considering C# or Perl, but I'm open to other ideas. I ha


      led mike wrote (#2):

      RichardInToronto wrote:

      I don't know what class I would use to produce the XML and serialize it to disk.

      Look in the System.Xml namespace. Also look for tutorial-style articles on CodeProject and msdn.microsoft.com. You can use a DOM object, System.Xml.XmlDocument, or perhaps an XmlWriter would work better for your solution.
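
As a rough illustration of the XmlWriter route, here is a minimal sketch. The element and attribute names (`results`, `runner`, `place`, `name`, `city`, `chipTime`) and the `BuildXml` helper are assumptions made up for this example; nothing in the thread specifies the target XML layout. It also assumes each row has already been split into fields.

```csharp
using System;
using System.Text;
using System.Xml;

public static class XmlWriterSketch
{
    // Writes one <runner> element per row.
    // Assumed (hypothetical) row layout: place, name, city, chip time.
    public static string BuildXml(string[][] rows)
    {
        var settings = new XmlWriterSettings { Indent = true, OmitXmlDeclaration = true };
        var sb = new StringBuilder();

        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("results");
            foreach (string[] row in rows)
            {
                writer.WriteStartElement("runner");
                writer.WriteAttributeString("place", row[0]);
                writer.WriteElementString("name", row[1]);
                writer.WriteElementString("city", row[2]);
                writer.WriteElementString("chipTime", row[3]);
                writer.WriteEndElement(); // closes <runner>
            }
            writer.WriteEndElement(); // closes <results>
        }
        return sb.ToString();
    }

    public static void Main()
    {
        var rows = new[]
        {
            new[] { "1", "BETCHIM, NOURDDINE", "Montreal", "1:37:58.2" },
            new[] { "2", "NSENGIYUMVA, JOSEPH", "Ottawa", "1:37:59.2" },
        };
        Console.WriteLine(BuildXml(rows));
    }
}
```

To write straight to disk instead of a string, `XmlWriter.Create` also accepts a file name in place of the `StringBuilder`. XmlWriter streams the output forward-only, which tends to suit a one-pass conversion like this; XmlDocument is the better fit if the tree needs to be edited after it is built.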

      led mike


        Sreenath Madyastha wrote (#3):

        Use the regular expression object in C#:

        1. Use Regex.
        2. Do grouping in the Regex.
        3. Use MatchEvaluator if needed.
        4. Add the grouping result values to the DataSet.
        5. Save the DataSet to XML.

        You are done!

        Sreenath
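
The steps above could be sketched roughly as follows. The pattern, the column names, and the `Parse` helper are assumptions for illustration only. In particular, the pattern assumes that name and city are separated by at least two spaces, as in the aligned rows of the sample; lines that do not match (headers, rulers, blanks) are skipped. MatchEvaluator is left out because it applies to `Regex.Replace`, which this sketch does not need.

```csharp
using System;
using System.Data;
using System.IO;
using System.Text.RegularExpressions;

public static class RegexToDataSetSketch
{
    // Hypothetical column names; one named group per column.
    private static readonly string[] Columns =
    {
        "Place", "GunTime", "Pace", "ChipTime", "Bib", "Name", "City",
        "GenderPlace", "CategoryPlace", "Category", "Split10k", "Split15k", "Split20k"
    };

    // Assumes at least two spaces between name and city.
    private static readonly Regex Row = new Regex(
        @"^\s*(?<Place>\d+)\s+(?<GunTime>[\d:.]+)\s+(?<Pace>\d+:\d+)\s+" +
        @"(?<ChipTime>[\d:.]+)\s+(?<Bib>\d+)\s+(?<Name>.+?)\s{2,}(?<City>.+?)\s+" +
        @"(?<GenderPlace>\d+/\d+)\s+(?<CategoryPlace>\d+/\d+)\s+(?<Category>.+?)\s+" +
        @"(?<Split10k>\d+:\d+(?::\d+)?)\s+(?<Split15k>\d+:\d+(?::\d+)?)\s+" +
        @"(?<Split20k>\d+:\d+(?::\d+)?)\s*$");

    public static DataSet Parse(string text)
    {
        var table = new DataTable("Runner");
        foreach (string column in Columns)
            table.Columns.Add(column, typeof(string));

        foreach (string line in text.Split('\n'))
        {
            Match m = Row.Match(line);
            if (!m.Success) continue; // not a data row

            DataRow row = table.NewRow();
            foreach (string column in Columns)
                row[column] = m.Groups[column].Value;
            table.Rows.Add(row);
        }

        var data = new DataSet("Results");
        data.Tables.Add(table);
        return data;
    }

    public static void Main()
    {
        string sample =
            "Place Gun Time     km Time         # Name      City\n" +
            "    1 1:37:58.2  3:16 1:37:58.2    3 BETCHIM, NOURDDINE    Montreal    1/2297    1/277  Men 30 - 34    32:37   49:24 1:05:25\n" +
            "    4 1:40:15.4  3:21 1:40:15.4    7 KASSAP, DANNY         Toronto     4/2297    1/52   Men 20 - 24    32:38   49:23 1:05:35\n";

        DataSet data = Parse(sample);
        var output = new StringWriter();
        data.WriteXml(output); // data.WriteXml("results.xml") would write to disk instead
        Console.WriteLine(output);
    }
}
```

With this approach the DataSet and DataTable names become the XML element names, so renaming `Results` and `Runner` changes the output layout without touching the parsing code.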


          RichardInToronto wrote (#4):

          Thanks for the help Mike.


            RichardInToronto wrote (#5):

            Thanks for your help Sreenath.
