Parsing Text Files and Splitting on '>'

    meixiang6 wrote (#1):

    Hi, can some kind person help with this Java question on parsing text files?

    INPUT TEXT (FASTA) FILE:

    >AB485992 some text
    ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT
    ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT
    ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT
    >AB485993 some text
    ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT
    >AB485994 some text
    ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT
    ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT
    >AB485922 some text
    ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT
    >AB485912 some text
    ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT
    >AB485942 some text
    ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT
    ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT

    I need Java code that parses the input file and generates 6 files. Each file is named after the text between the '>' and the first space, and contains everything from that '>' up to the following '>'. So file 1 will be named AB485992 and will contain:

    >AB485992 some text
    ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT
    ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT
    ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT

    File 2 will be named AB485993 and will contain:

    >AB485993 some text
    ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT

    File 3 will be named AB485994 and will contain:

    >AB485994 some text
    ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT
    ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT

    and so on for all six files.

    So far I have code that parses out the word after '>' (i.e. the file label) and prints those to a file called args, shown below. Can some kind person PLEASE show me how to parse the input file and generate the 6 separate files as described above, each named after its '>' label (file 1 named AB485992, and so on for the 6 files)?

    THANKS SO MUCH!

    import java.io.*;
    import java.util.Scanner;

    public final class ReadWithScanner {

      public static void main(String... aArgs) throws FileNotFoundException {
        File f = new File("args");
        f.delete();
        ReadWithScanner parser = new ReadWithScanner("file.text");
        parser.processLineByLine();
        log("Done.");
      }

      public ReadWithScanner(String aFileName){
        fFile = new File(aFileName);
      }

      public final void processLineByLine() throws FileNotFoundException {
        Scanner scanner = new Scanner(fFile);
        try {
          while ( scanner.hasNextLine() ){
            processLine( scanner.nextLine() );
          }
        }
        finally {
          scanner.clo


      4277480 wrote (#2):

      I took it one step further.

      import java.io.*;
      import java.util.*;

      public class SplitFiles {
          public static void main(String[] args) throws Exception {
              File mainfile = new File("MainFile.txt");

              ArrayList<String> List = new ArrayList<String>();

              // Read the whole input into memory, skipping blank lines so
              // that charAt(0) below is always safe.
              Scanner scan = new Scanner(mainfile);
              while (scan.hasNextLine()) {
                  String line = scan.nextLine();
                  if (!line.isEmpty()) {
                      List.add(line);
                  }
              }
              scan.close();

              String getFileName = "";

              for (int i = 0; i < List.size(); i++) {
                  // A line starting with '>' begins a new record.
                  if (List.get(i).charAt(0) == '>') {

                      getFileName = List.get(i).replace(">", "");

                      // The file name is the header text up to the first
                      // space (or the whole header if there is no space).
                      String result = getFileName;

                      for (int j = 0; j < getFileName.length(); j++) {
                          if (getFileName.charAt(j) == ' ') {
                              result = getFileName.substring(0, j);
                              break;
                          }
                      }

                      File output;

                      if (new File(result + ".txt").exists()) {
                          // The file already exists: append this record.
                          BufferedWriter out = new BufferedWriter(new FileWriter(
                                  result + ".txt", true));

                          out.write(List.get(i) + "\n");

                          int k;

                          // Copy lines until the next header line.
                          for (k = i + 1; k < List.size(); k++) {
                              if (List.get(k).charAt(0) == '>')
                                  break;
                              else {
                                  out.write(List.get(k) + "\n");
                              }
                          }

                          out.close();
                      } else {
                          // First record with this name: create the file.
                          output = new File(result + ".txt");

                          PrintWriter out = new PrintWriter(output);

                          out.println(List.get(i));

                          int k;

                          // Copy lines until the next header line.
                          for (k = i + 1; k < List.size(); k++) {
                              if (List.get(k).charAt(0) == '>')
                                  break;
                              else {
                                  out.println(List.get(k));
                              }
                          }

                          out.close();
                      }
                  }
              }
          }
      }

      Good Luck
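
      For comparison, here is a more compact sketch of the same idea using java.nio. It is not the code from the post above: the names FastaSplitter, recordId and groupRecords are made up for illustration, and it assumes the input is small enough to read into memory and that every header line has something after the '>'. The default input name MainFile.txt is taken from the post above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of the original thread.
class FastaSplitter {

    // Record ID: the text after '>' up to the first space
    // (or the whole header if it contains no space).
    static String recordId(String headerLine) {
        String header = headerLine.substring(1).trim();
        int space = header.indexOf(' ');
        return space < 0 ? header : header.substring(0, space);
    }

    // Groups the lines into records keyed by ID, in input order.
    // Each record keeps its '>' header as its first line.
    static Map<String, List<String>> groupRecords(List<String> lines) {
        Map<String, List<String>> records = new LinkedHashMap<>();
        List<String> current = null;
        for (String line : lines) {
            if (line.startsWith(">")) {
                // Start a new record, or continue one with the same ID.
                current = records.computeIfAbsent(recordId(line), id -> new ArrayList<>());
            }
            // Lines before the first header and blank lines are skipped.
            if (current != null && !line.isEmpty()) {
                current.add(line);
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        Path input = Paths.get(args.length > 0 ? args[0] : "MainFile.txt");
        List<String> lines = Files.readAllLines(input, StandardCharsets.UTF_8);
        for (Map.Entry<String, List<String>> e : groupRecords(lines).entrySet()) {
            // One output file per record ID, e.g. AB485992.txt
            Files.write(Paths.get(e.getKey() + ".txt"), e.getValue(), StandardCharsets.UTF_8);
        }
    }
}
```

      One difference worth knowing: this sketch overwrites an existing output file on each run, whereas the code above appends to a file that already exists. Records that share an ID within one input file still end up together in the same output file.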


        meixiang6 wrote (#3):

        Thanks so much for your help! Just one small point: the code you wrote here, "for (k = i+1; k ", is missing the end part. Any chance you know what the ending should be? Thanks again!


          4277480 wrote (#4):

          Yes, that happens with < and > in code. I corrected it.


            meixiang6 wrote (#5):

            Thanks so much! I tried compiling and I am getting the complaints below. Any ideas? Sorry, I am new to Java and can't figure out how to fix it. Thanks again!

            SplitFiles.java:24: cannot find symbol
            symbol  : method charAt(int)
            location: class java.lang.Object
            if (List.get(i).charAt(0) == '>')
            ^
            SplitFiles.java:27: cannot find symbol
            symbol  : method replace(java.lang.String,java.lang.String)
            location: class java.lang.Object
            getFileName = List.get(i).replace(">", "");
            ^
            SplitFiles.java:52: cannot find symbol
            symbol  : method charAt(int)
            location: class java.lang.Object
            if (List.get(k).charAt(0) == '>')
            ^
            SplitFiles.java:74: cannot find symbol
            symbol  : method charAt(int)
            location: class java.lang.Object
            if (List.get(k).charAt(0) == '>')


              4277480 wrote (#6):

              Steps:
              1. Copy the code into your Java project.
              2. Create a text file called MainFile.txt containing the input from your first post.
              It should work, since I corrected the < and > (HTML problem) in my answer post.
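
              Some background on why the < and > matter: the "cannot find symbol ... location: class java.lang.Object" errors quoted earlier are what javac reports when a list is declared without its type argument. Most likely the forum swallowed the <String> part of ArrayList<String>, leaving a raw ArrayList whose get() returns Object, and Object has no charAt or replace. The sketch below illustrates this; the class name GenericsDemo and the method isHeader are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original thread.
class GenericsDemo {

    // With the <String> type argument, get() returns a String,
    // so String methods such as charAt compile.
    static boolean isHeader(List<String> lines, int i) {
        return lines.get(i).charAt(0) == '>';
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<String>();
        lines.add(">AB485992 some text");
        lines.add("ATTGGGAATGGAGGGAAATAAATGACTGGATGGTCGCTGCT");

        System.out.println(isHeader(lines, 0)); // prints true
        System.out.println(isHeader(lines, 1)); // prints false

        // Without the type argument, get() returns Object and the
        // same call does not compile:
        //   ArrayList raw = new ArrayList();
        //   raw.get(0).charAt(0);   // cannot find symbol: method charAt(int)
    }
}
```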


                meixiang6 wrote (#7):

                It works, thank you so much!
