Regex search to find and remove consecutive lines which end with same characters

    Member_15829879 wrote (#1):

    Hello, I need to write a regular expression search that finds a line ending with the same text as the preceding line, even though its first 10 characters are different. So in this example:

        [11:12:21] Hello this is Tom. How are you?
        [11:14:08] Hello this is Tom. How are you?

    I would need to search for consecutive lines whose text is the same after the time in brackets.

    I know that this search:

        FIND: ^.{11}(.*)$
        REPLACE: $1

    will locate the first 11 characters of each line and remove them. This search:

        FIND: ^((.{10}).*)(?:\r?\n\2.*)+
        REPLACE: $1

    will locate consecutive lines whose first 10 characters are the same and keep only the first of them.

    But I can't figure out how to structure the search so that it takes the text from position 11 to the end of the line, and then checks whether the text on the next line, from the 11th character to the end of the line, is the same.
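For anyone following along, here is a minimal sketch of what the two searches quoted above do. It assumes Python's `re` module as a stand-in for the editor's search-and-replace (the original post does not say which tool is used), so the `$1` replacement is written as `\1`, and the sample text is made up for illustration.

```python
import re

# Hypothetical sample text, not taken from the original post.
sample = (
    "[11:12:21] Hello this is Tom.\n"
    "[11:12:21] A second line.\n"
    "[11:14:08] Hello this is Tom."
)

# First search: remove the first 11 characters of every line.
stripped = re.sub(r"^.{11}(.*)$", r"\1", sample, flags=re.MULTILINE)

# Second search: collapse a run of consecutive lines that share the same
# first 10 characters, keeping only the first line of the run.
collapsed = re.sub(
    r"^((.{10}).*)(?:\r?\n\2.*)+", r"\1", sample, flags=re.MULTILINE
)

print(stripped)
print(collapsed)
```

Neither search compares the text after the timestamp, which is why the two "Hello this is Tom." lines both survive the second one.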

      jschell wrote (#2):

      You cannot reasonably do what you are asking with a single regular expression. (There is in fact a very wrong way to attempt this, which is ridiculous and would lead to nothing but a maintenance nightmare.) However, in a programming language that uses regexes, the algorithm that you would create would look like the following:

      1. Read a line.
      2. Parse the line to remove the timestamp.
      3. Does it match the previous one? (Do whatever you want.)
      4. Otherwise, save it for the next time.
      5. Go back to step 1 until there are no more lines to read.
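The steps above can be sketched in Python roughly as follows. This is only an illustration under a few assumptions: every line starts with a bracketed `[hh:mm:ss]` time as in the question, "do whatever you want" is taken to mean dropping the repeated line, and the names `TIMESTAMP` and `drop_repeated_messages` are made up for this example.

```python
import re

# Assumption: each line begins with a bracketed time such as "[11:12:21] ".
TIMESTAMP = re.compile(r"^\[\d{2}:\d{2}:\d{2}\]\s*")


def drop_repeated_messages(lines):
    """Keep a line only if its text after the timestamp differs from
    the text of the line immediately before it."""
    kept = []
    previous_text = None
    for line in lines:                                   # step 1: read a line
        text = TIMESTAMP.sub("", line.rstrip("\r\n"))    # step 2: strip the timestamp
        if text == previous_text:                        # step 3: same as previous?
            continue                                     # here: drop the repeat
        kept.append(line)
        previous_text = text                             # step 4: save for next time
    return kept                                          # step 5: loop ends with the input


log = [
    "[11:12:21] Hello this is Tom. How are you?",
    "[11:14:08] Hello this is Tom. How are you?",
    "[11:15:00] Fine, thanks.",
    "[11:16:00] Hello this is Tom. How are you?",
]
result = drop_repeated_messages(log)
print("\n".join(result))
```

Only consecutive repeats are dropped, so the last line in `log` is kept even though the same message appeared earlier.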
