Regex search to find and remove consecutive lines that end with the same characters
-
Hello, I need to write a regular expression search which will locate when a line ends with the same text as the preceding line, but does not have the same first 10 characters. So in this example:

    [11:12:21] Hello this is Tom. How are you?
    [11:14:08] Hello this is Tom. How are you?

I would need to search for consecutive lines for which the text was the same after the time entered in brackets.

I know that this search:

    FIND: ^.{11}(.*)$
    REPLACE: $1

will locate the first 11 characters and remove them. This search:

    FIND: ^((.{10}).*)(?:\r?\n\2.*)+
    REPLACE: $1

will locate lines where the first 10 characters are the same and remove them.

But I can't figure out how to structure the search so it checks the text from position 11 to the end of the line, and then checks if the text on the next line from the 11th character to the end of the line is the same.
-
A regular expression on its own is the wrong tool for what you are asking. (There is in fact a way to attempt it, but it is ridiculous and would lead to nothing but a maintenance nightmare.) However, in a programming language that uses regexes, the algorithm you would write looks like this:

1. Read a line.
2. Parse the line to remove the timestamp.
3. Does the remaining text match that of the previous line? If so, do whatever you want with it.
4. Otherwise, save it for the next comparison.
5. Go back to step 1 until there are no more lines to read.
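The steps above can be sketched in Python roughly as follows. This is only an illustration under some assumptions: the timestamp is taken to be a bracketed HH:MM:SS at the start of each line, "do whatever you want" is taken to mean dropping the repeated line (as the title suggests), and the names TIMESTAMP and drop_repeated_messages are made up for this example.

```python
import re

# Assumption: every line starts with a bracketed time such as "[11:12:21] ".
TIMESTAMP = re.compile(r"^\[\d{2}:\d{2}:\d{2}\]\s*")


def drop_repeated_messages(lines):
    """Keep only the first line of each run of consecutive lines whose
    text after the timestamp is identical (hypothetical helper)."""
    kept = []
    previous = None  # message text of the last line we kept
    for line in lines:                                       # step 1: read a line
        message = TIMESTAMP.sub("", line.rstrip("\r\n"))     # step 2: strip the timestamp
        if message == previous:                              # step 3: same as the previous one?
            continue                                         # here we simply skip it
        kept.append(line)
        previous = message                                   # step 4: save it for next time
    return kept                                              # step 5: the loop ends with the input


sample = [
    "[11:12:21] Hello this is Tom. How are you?",
    "[11:14:08] Hello this is Tom. How are you?",
    "[11:15:30] Fine, thanks.",
]
result = drop_repeated_messages(sample)
for line in result:
    print(line)
```

To process a file instead of a list, the same function can be fed an open file object, since iterating over it yields one line at a time. Lines that carry no timestamp are compared as whole lines, which may or may not be what you want.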