Read specific lines from txt file and write them to another file
-
Hi, I've got a txt file I'm parsing, and I need to extract specific lines from it and copy them to another file. First I scanned the file for the lines of interest using ReadLine. Each time I found a line I was interested in, I stored its line number in an array. In the end I have an array holding the line numbers of the lines I'm interested in copying: the start line of a block in one cell and its end line in the next, with several such pairs in the array. How do I copy a block of lines (between two cells listed in the array) from one file to a new file? Do I need to sync the StreamReader of one file with the StreamWriter of the other, or are there better ways to do it? :^) Thanks in advance, Inbal
-
Rather than only holding the line numbers in the array, make it a string array* and store the whole line in it. Then, once you've finished processing the initial file, close it and open a stream to your new file. Now you can write each string value from the array into the new file. (* Even better, use a generic List<String> instead of an array. It is strongly typed and automatically resizes to hold however many lines you need.)
Simon
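A minimal sketch of the suggestion above, under stated assumptions: the class name LineCollector, the method CollectAndWrite, and the isWanted predicate are all hypothetical stand-ins for the poster's own "is this a line of interest" test. Note that every kept line stays in memory until the input has been fully read.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class LineCollector
{
    // Reads inFile once, keeps only the lines accepted by isWanted
    // (a hypothetical predicate, not part of the original post),
    // then writes them to outFile after the input has been closed.
    // Returns the number of lines kept.
    public static int CollectAndWrite(string inFile, string outFile, Func<string, bool> isWanted)
    {
        // List<string> grows as needed, unlike a fixed-size array.
        List<string> kept = new List<string>();

        using (StreamReader reader = new StreamReader(inFile))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (isWanted(line))
                {
                    kept.Add(line);
                }
            }
        } // the input file is closed here

        // Writes one list entry per line to the new file.
        File.WriteAllLines(outFile, kept);
        return kept.Count;
    }
}
```

This is only reasonable when the kept lines comfortably fit in memory; for very large inputs a streaming copy avoids holding them at all.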
-
Simon Stevens wrote:
Rather than only holding the line numbers in the array. Make it a string array*, and store the whole line in the array.
What happens when the file is very big (for the sake of argument, 1 GB)? I already thought about your solution, but it seems to me that it would consume a lot of memory if I need to save the whole file into an array. If I only want to copy the lines of interest, and I know where they begin and end, how do I do it? Is there any way to convert line numbers to file offsets and use a seek function to get there?
-
In that case, have both the read and the write streams open at the same time. Every time you find a line you are interested in from the reader, write it directly to the write stream. This prevents you from having more than one line in memory at any one time.
Simon
-
Hi, why not write the line to the new file at the moment you decide to keep it? The array is not a very good solution because you need to parse the stream again in order to go back and forth (you would need to keep the position in the stream, not the line number). I think a good solution to your problem would be:
1. Open the file.
2. Read a line.
3. If the line is to be saved, process it (if you need to).
4. Save the line in the output text file.
5. Read the next line.
-
Is there no way I can use line numbers to get to a specific line and start copying from there onward? Inbal
There is no method you can call to jump straight to a given line number in the file. Files do not have line numbers. You're going to have to read the file, line by line, counting the lines you read, until you get to the point you want. You don't have to hold onto the lines you don't want in an array. From that point, keep reading the file line by line and write those lines out to wherever you want. Even then, you don't have to hold these lines in an array either.
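A minimal sketch of the counting approach described above. The names LineRangeCopier and CopyRange are hypothetical, and the sketch assumes the start and end line numbers are 1-based and inclusive, which the original posts do not specify.

```csharp
using System;
using System.IO;

static class LineRangeCopier
{
    // Copies lines firstLine..lastLine (assumed 1-based, inclusive)
    // from inFile to outFile. Only the current line is held in memory.
    // Returns the number of lines written.
    public static int CopyRange(string inFile, string outFile, int firstLine, int lastLine)
    {
        int written = 0;

        using (StreamReader reader = new StreamReader(inFile))
        using (StreamWriter writer = new StreamWriter(outFile))
        {
            string line;
            int lineNumber = 0;

            while ((line = reader.ReadLine()) != null)
            {
                lineNumber++;

                if (lineNumber < firstLine)
                {
                    continue; // before the block: read and discard
                }
                if (lineNumber > lastLine)
                {
                    break; // past the block: no need to read further
                }

                writer.WriteLine(line);
                written++;
            }
        }

        return written;
    }
}
```

With several start/end pairs in an array, calling this once per pair rereads the file from the top each time; handling all pairs in a single pass is possible but is left out of this sketch.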
Dave Kreskowiak, Microsoft MVP, Visual Developer - Visual Basic 2006, 2007, 2008
-
It could have been a good solution if I didn't care about the size of the output file. I have to know in advance whether the block of lines I'm about to copy exceeds 65000 lines per file. If it does, then I need to split it into 2 consecutive files. That's why I scanned the file in advance and saved the beginning and the end of each block in an array. Inbal
-
While you are writing lines to the output file, keep a counter of the lines written. When the counter reaches the limit for one output file, close the stream and open a new one for a new file.
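A rough sketch of the counter idea, with several assumptions: the names SplittingCopier, CopyWithSplit, and isWanted are hypothetical, the output files are named by appending a running index and ".txt" to a prefix (a naming scheme invented here for illustration), and maxLinesPerFile is assumed to be greater than zero.

```csharp
using System;
using System.IO;

static class SplittingCopier
{
    // Copies the lines accepted by isWanted (a hypothetical predicate)
    // from inFile into a series of output files, starting a new file
    // whenever the current one already holds maxLinesPerFile lines.
    // Returns the number of output files created.
    public static int CopyWithSplit(string inFile, string outPrefix, int maxLinesPerFile, Func<string, bool> isWanted)
    {
        int fileIndex = 0;       // how many output files have been opened
        int linesInCurrent = 0;  // lines written to the current output file
        StreamWriter writer = null;

        try
        {
            using (StreamReader reader = new StreamReader(inFile))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    if (!isWanted(line))
                    {
                        continue;
                    }

                    // Open the first file lazily, or roll over at the limit.
                    if (writer == null || linesInCurrent == maxLinesPerFile)
                    {
                        if (writer != null)
                        {
                            writer.Dispose();
                        }
                        fileIndex++;
                        writer = new StreamWriter(outPrefix + fileIndex + ".txt");
                        linesInCurrent = 0;
                    }

                    writer.WriteLine(line);
                    linesInCurrent++;
                }
            }
        }
        finally
        {
            // Make sure the last output file is closed even on errors.
            if (writer != null)
            {
                writer.Dispose();
            }
        }

        return fileIndex;
    }
}
```

Because the rollover happens while writing, no advance scan is needed just to learn whether a block exceeds the limit (65000 lines in the question above); the split simply occurs wherever the count is reached.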
-
I recommend doing it on the fly, as other people are saying. The only thing you need to take care of is flushing the output stream to prevent memory problems. I mean something like this:
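void CopyLines(string inFile, string outFile)
{
    // The using blocks close both streams even if an exception is thrown.
    using (StreamReader sr = new StreamReader(File.Open(inFile, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)))
    using (StreamWriter sw = new StreamWriter(outFile))
    {
        string line;
        // ReadLine returns null at the end of the file.
        while ((line = sr.ReadLine()) != null)
        {
            // Match is a placeholder for your own "is this a line of interest" test.
            if (Match(line))
            {
                sw.WriteLine(line);
            }
            sw.Flush();
        }
    }
}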
Maybe Flush should not be called on every line. For example, keep a line counter i and call sw.Flush() only when i % 100 == 0, or use the AutoFlush property.
Regards!! Juan