regex help [modified]

    uglyeyes wrote (#1):

    Hi, I need to extract text from a big file. Given input like this:

    <pre>
    <div class="a">apple</div>
    <p>&nbsp;&nbsp;</p>
    <p>red delicious</p>
    <div class="b">banana</div>
    <p>&nbsp;&nbsp;</p>
    <p>riped banana</p>
    <div class="c">chives</div>
    <p>&nbsp;&nbsp;</p>
    <p>fresh green chives</p>
    </pre>

    I need to produce this:

    'apple', 'red delicious'
    'banana', 'riped banana'
    'chives', 'fresh green chives'

    so that I can insert each pair into a database. I would really appreciate it if you could provide a regex that does this. Thanks for your help!

    modified on Tuesday, January 26, 2010 11:43 PM

      Luc Pattyn wrote (#2):

      You can edit (i.e. modify) an earlier message if you need to fix an error, improve formatting, or add information. Now please go and delete your other messages before someone attempts to read and answer them. :)

      Luc Pattyn [Forum Guidelines] [Why QA sucks] [My Articles]


      I only read code that is properly formatted, adding PRE tags is the easiest way to obtain that.
      [The QA section does it automatically now, I hope we soon get it on regular forums as well]


        realJSOP wrote (#3):

        At the very least, you could use LINQ to XML to do this. Don't use regex to parse HTML. The class you're going to want to look at is XElement.

        .45 ACP - because shooting twice is just silly
        -----
        "Why don't you tie a kerosene-soaked rag around your ankles so the ants won't climb up and eat your candy ass..." - Dale Earnhardt, 1997
        -----
        "The staggering layers of obscenity in your statement make it a work of art on so many levels." - J. Jystad, 2001

          uglyeyes wrote (#4):

          Hello, this is a one-off process and the HTML is stored in a CSV file. I crawl a site and store each web page in the CSV. The URL is something like www.mysite.com.au/product/product.asp?id={0}. So all the HTML for each product page is now stored in one CSV, and I want to delete all the text except the parts I want. Could you please help me achieve this with regex?

            uglyeyes wrote (#5):

            Can anyone please help?

              uglyeyes wrote (#6):

              But it's not a valid HTML or XML file. It's a CSV file, and it would take a lot of work to get it to validate as HTML or XML. Any other ideas or suggestions?

                realJSOP wrote (#7):

                It doesn't have to be an XML/HTML file. It just needs to be a properly formatted XML string. Trust me - regex is not the answer.
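
As a rough illustration of the XElement suggestion, here is a minimal sketch that parses a fragment held in a string. It assumes the fragment looks like the sample in post #1 and is otherwise well-formed. The class name FragmentParser and the wrapping <root> element are made up for this example and do not come from the thread. One catch: &nbsp; is not a predefined XML entity, so the sketch swaps it for a numeric character reference before parsing. Markup with unclosed tags or other HTML-only entities would still make XElement.Parse throw.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration only; not code from the thread.
public static class FragmentParser
{
    // Returns (name, description) pairs: each <div> paired with the first
    // following <p> sibling that contains more than whitespace.
    public static List<KeyValuePair<string, string>> Parse(string fragment)
    {
        // Assumption: &nbsp; is the only HTML-only entity in the input.
        // Wrap in one root element so the string is a single XML document.
        string xml = "<root>" + fragment.Replace("&nbsp;", "&#160;") + "</root>";
        XElement root = XElement.Parse(xml);

        var pairs = new List<KeyValuePair<string, string>>();
        foreach (XElement div in root.Descendants("div"))
        {
            // Look only at siblings up to the next <div>, skipping filler
            // paragraphs (Trim also strips the non-breaking spaces).
            XElement text = div.ElementsAfterSelf()
                               .TakeWhile(e => e.Name != "div")
                               .FirstOrDefault(e => e.Name == "p" && e.Value.Trim().Length > 0);

            pairs.Add(new KeyValuePair<string, string>(
                div.Value.Trim(),
                text == null ? string.Empty : text.Value.Trim()));
        }
        return pairs;
    }
}
```

Calling FragmentParser.Parse on the sample from post #1 should give three pairs, with the <div> text as Key and the paragraph text as Value, ready to be passed to a parameterized INSERT.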

                  uglyeyes wrote (#8):

                  Formatting takes ages. My element doesn't have an id. Why would you conclude that regex is not an answer, when I know all the text is going to be in the same format?
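
For comparison, if the format really is as fixed as the sample in post #1, a regex along these lines might do for a one-off run. This is only a sketch under that assumption: the class name FragmentRegex and the pattern are invented for this example, and the pattern expects a <div> with only a class attribute, optional filler paragraphs made of &nbsp; or whitespace, and then one plain-text <p>. Nested tags, extra attributes, or a missing description paragraph would make it skip the item or capture the wrong text, which is the fragility the earlier replies warn about.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not code from the thread.
public static class FragmentRegex
{
    // <div class="...">name</div>, then any number of filler paragraphs,
    // then the first paragraph holding the description text.
    private static readonly Regex PairPattern = new Regex(
        @"<div\s+class=""[^""]*"">(?<name>[^<]*)</div>" +
        @"\s*(?:<p>(?:&nbsp;|\s)*</p>\s*)*" +
        @"<p>(?<desc>[^<]*)</p>",
        RegexOptions.IgnoreCase);

    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (Match m in PairPattern.Matches(html))
        {
            pairs.Add(new KeyValuePair<string, string>(
                m.Groups["name"].Value.Trim(),
                m.Groups["desc"].Value.Trim()));
        }
        return pairs;
    }
}
```

It is worth comparing the number of matches against the number of product pages crawled, since a page that deviates from the expected layout is silently skipped rather than reported.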
