7 Replies Latest reply: Sep 15, 2007 2:40 PM by 807605

    Process text file

    807605
      Hello all. For my project I am processing text files to remove unwanted data. A folder may contain a group of text files, and I want to remove the unwanted data from all of them, for example years, email IDs, and bulletins. The following code removes the keyword "References:" and everything that follows it. For the examples above, I want to remove only the single line containing those words. Lines with discontinuous text should also be removed. Can anyone tell me how to do this?

      Thank you.
      import java.io.BufferedInputStream;
      import java.io.BufferedOutputStream;
      import java.io.FileInputStream;
      import java.io.FileOutputStream;
      import java.io.IOException;
      import java.io.FilenameFilter;
      import java.io.File;
      
      public class Test {
      
           // the main function
           public void deleteReferencesAndData(String filename) {
      
                try {
                     // make 32k buffer for output
                     StringBuffer strOutput = new StringBuffer(32768);
      
                     // read input file into a byte array
                     byte[] pInput = ReadFile(filename);
      
                     String strInput = new String(pInput);
      
                     // find the first "References:" marker; it and everything after it is dropped
                     int nIndex = strInput.indexOf("References:");
                     if (nIndex < 0) {
                          // marker not found: keep the whole input unchanged
                          strOutput.append(strInput);
                     } else {
                          // marker found: keep only the text before it
                          strOutput.append(strInput.substring(0, nIndex));
                     }
      
                     strInput = strOutput.toString();
      
                     // write the output string to file
                     WriteFile(filename, strInput.getBytes());
                } catch (Exception e) {
                     System.out.println(e.getMessage());
                }
           }
      
           // helper function to read a file into a byte array
           static public final byte[] ReadFile(String strFile) throws IOException {
                int nSize = 32768;
                // open the input file stream
                BufferedInputStream inStream = new BufferedInputStream(
                          new FileInputStream(strFile), nSize);
                byte[] pBuffer = new byte[nSize];
                int nPos = 0;
                // read until end of stream (read() returns -1), doubling the
                // buffer whenever it fills up
                int nRead;
                while ((nRead = inStream.read(pBuffer, nPos, nSize - nPos)) != -1) {
                     nPos += nRead;
                     if (nPos == nSize) {
                          byte[] pTemp = pBuffer;
                          nSize *= 2;
                          pBuffer = new byte[nSize];
                          System.arraycopy(pTemp, 0, pBuffer, 0, nPos);
                     }
                }
                // close the input stream
                inStream.close();
                if (nPos == 0) {
                     return "".getBytes();
                }
                // return data read into the buffer as a byte array
                byte[] pData = new byte[nPos];
                System.arraycopy(pBuffer, 0, pData, 0, nPos);
                return pData;
           }
      
           // helper function to write a byte array into a file
           static public final void WriteFile(String strFile, byte[] pData)
                     throws IOException {
                BufferedOutputStream outStream = new BufferedOutputStream(
                          new FileOutputStream(strFile), 32768);
                if (pData.length > 0)
                     outStream.write(pData, 0, pData.length);
                outStream.close();
           }
      
           public static void main(String args[]) {
                Test test = new Test();
      
                File f1 = new File("E:\\txtsamplefiles");
      
                FilenameFilter only = new Onlyext("txt");
                String s[] = f1.list(only);
                if(s!=null)
                {
                for (int i = 0; i < s.length; i++) {
                     test.deleteReferencesAndData("E:\\txtsamplefiles\\"+s[i]);

                }
                }

           }

      }

      class Onlyext implements FilenameFilter {
           String ext;

           public Onlyext(String ext) {
                this.ext = "." + ext;
           }

           public boolean accept(File dir, String name) {
                return name.endsWith(ext);
           }
      }
        • 1. Re: Process text file
          807605
          You should:

          1) follow Java naming conventions (method names start with a lower-case letter, etc.)

          2) use a standard indentation style

          3) Use Readers and Writers to read/write textual files, not Input/OutputStreams

          4) Stop trying to write 1970s C in Java.

          Generally, to remove a line from a file, you read in the lines one at a time and write out every line except the one you want to remove. Similarly, to remove part of a line, you read the line in and write out everything except the part you want to remove.
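
The read-then-write-everything-else approach described above could look roughly like the sketch below. It uses a Reader and a Writer as suggested in point 3. The class name `LineFilter`, the method name `copyWithoutKeyword`, and the plain substring match are assumptions made for illustration; they are not from the original post.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class LineFilter {

    // Copies every line of inputFile to outputFile, skipping lines that
    // contain keyword. Returns the number of lines that were skipped.
    public static int copyWithoutKeyword(String inputFile, String outputFile, String keyword)
            throws IOException {
        int removed = 0;
        try (BufferedReader reader = new BufferedReader(new FileReader(inputFile));
             BufferedWriter writer = new BufferedWriter(new FileWriter(outputFile))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(keyword)) {
                    removed++;      // drop this line
                    continue;
                }
                writer.write(line);
                writer.newLine();   // readLine() strips the line terminator, so add one back
            }
        }
        return removed;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.out.println("usage: java LineFilter <input> <output> <keyword>");
            return;
        }
        int removed = copyWithoutKeyword(args[0], args[1], args[2]);
        System.out.println("Removed " + removed + " line(s)");
    }
}
```

Writing to a separate output file keeps the input intact if something goes wrong halfway through.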
          • 2. Re: Process text file
            807605
            So you are correcting my program. Anyway, thanks for your suggestions. Can you give me some sample code? I want to process all the files at once.
            • 3. Re: Process text file
              807605
              You already have a program that looks in a directory and finds files to process.
              Clean it up and there you go.
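
As a rough sketch of what a cleaned-up directory scan might look like: the class name `TextFileLister` and the method `listTextFiles` are made up for this example, and matching the extension case-insensitively is an assumption, not something the original program does.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TextFileLister {

    // Returns the regular files directly inside dir whose names end in
    // ".txt" (any letter case), sorted by path.
    public static List<File> listTextFiles(File dir) {
        FilenameFilter txtOnly = new FilenameFilter() {
            @Override
            public boolean accept(File parent, String name) {
                return name.toLowerCase().endsWith(".txt")
                        && new File(parent, name).isFile();
            }
        };
        File[] found = dir.listFiles(txtOnly);   // null if dir is not a readable directory
        List<File> result = new ArrayList<File>();
        if (found != null) {
            Arrays.sort(found);
            result.addAll(Arrays.asList(found));
        }
        return result;
    }

    public static void main(String[] args) {
        // Directory comes from the command line; defaults to the current directory
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (File file : listTextFiles(dir)) {
            // Replace this line with the per-file processing step
            System.out.println("Would process: " + file.getPath());
        }
    }
}
```

Returning `File` objects rather than bare names avoids having to glue the directory path back onto each name later.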
              • 4. Re: Process text file
                807605
                Is pattern matching with a regular expression the only way to do it? Or can anyone send sample code for finding a keyword and removing the line that contains it?
                Thank you.
                • 5. Re: Process text file
                  807605
                  import java.io.BufferedReader;
                  import java.io.FileReader;
                  import java.io.File;
                  import java.io.FileWriter;
                  import java.io.FileNotFoundException;
                  import java.io.IOException;
                  import java.io.PrintWriter;
                  import java.util.regex.*;
                  import java.lang.*;
                  import java.io.*;
                  import java.util.*;

                  class Removeline
                  {
                       public static void main (String [] args)throws IOException
                       {
                            BufferedReader br = new BufferedReader(new FileReader("file1.txt"));
                            String s;
                            /*Boolean found;*/

                            while((s=br.readLine())!=null)
                            {
                                 Pattern p = Pattern.compile ("fig");
                                 StringBuffer sb=new StringBuffer(s);
                                 Matcher m = p.matcher (s);
                                 /*found=m.matches();*/
                                 if(m.find()==true)
                                 {
                                      sb.delete(0,100);
                                 }
                                 FileWriter f2=new FileWriter("file2.txt",true);
                                 String str1=sb.toString();
                                 f2.write(str1);
                                 f2.close();
                            }
                       }
                  }

                  In this program I am deleting the lines containing the word "figure" and writing the result to a new file. Instead of doing that, I want to delete them in the same file. Also, the output comes out on a single line; it should keep the original line breaks and order.
                  • 6. Re: Process text file
                    807605
                    If you want it to go to the original file, then write to a new file and rename the files when you're finished. You can use java.io.File.renameTo.
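
A minimal sketch of the write-then-rename idea, assuming the lines to drop are identified by a plain keyword. The class name `InPlaceLineRemover`, the method `removeLines`, and the `.tmp` suffix are hypothetical choices for this example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class InPlaceLineRemover {

    // Rewrites file without the lines containing keyword, by writing a
    // temporary file next to it and renaming that file over the original.
    public static void removeLines(File file, String keyword) throws IOException {
        File temp = new File(file.getPath() + ".tmp");   // hypothetical temp-file name
        try (BufferedReader reader = new BufferedReader(new FileReader(file));
             BufferedWriter writer = new BufferedWriter(new FileWriter(temp))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.contains(keyword)) {
                    writer.write(line);
                    writer.newLine();   // restore the terminator that readLine() strips
                }
            }
        }
        // renameTo can fail when the target already exists (for example on
        // Windows), so the original is deleted first. If the rename then
        // fails, the filtered data is still available in the temp file.
        if (!file.delete() || !temp.renameTo(file)) {
            throw new IOException("Could not replace " + file + " with " + temp);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: java InPlaceLineRemover <file> <keyword>");
            return;
        }
        removeLines(new File(args[0]), args[1]);
    }
}
```

Both streams are closed before the rename, since an open file often cannot be deleted or renamed on Windows.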

                    I don't understand the rest of your question.
                    • 7. Re: Process text file
                      807605
                      This is working properly. Thank you for your reply.
                      import java.io.BufferedReader;
                      import java.io.File;
                      import java.io.FileReader;
                      import java.io.FileWriter;
                      import java.io.FilenameFilter;
                      import java.io.IOException;
                      import java.util.regex.Matcher;
                      import java.util.regex.Pattern;
                      
                      
                      public class Main {
                      
                           public void removeLine(String filename)throws IOException{
                                BufferedReader br = new BufferedReader(new FileReader("D:\\TextFiles\\" + filename));
                              FileWriter f2=new FileWriter("D:\\TextFiles\\Output\\" + filename,true);
                              String s;
                              //StringBuffer sb;
                              /*Boolean found;*/
                      
                              while((s=br.readLine())!= null ){
                                  try{
                                      System.out.println(s);
                                      Pattern p = Pattern.compile("(.*)[Ff]ig.?\\s*\\d*\\.\\d*(.*)");
                                      Matcher m = p.matcher (s);
                                      /*found=m.matches ();*/
                                      if(m.find()==true){
                                          continue;
                                      }
                                      s = s + "\r\n";
                                      f2.write(s);
                                  }catch (Exception e){
                                      e.printStackTrace();
                                  }
                              }
                              br.close();
                              f2.close();
                           }
                      
                           public static void main(String args[])throws IOException {
                                Main main = new Main();
                                File file = new File("D:\\TextFiles");
                      
                                FilenameFilter filter = new Onlyext("TXT");
                                String files[] = file.list(filter);
                                if(files!=null){
                                     for (int i = 0; i < files.length; i++) {
                                          //System.out.println(files);
                                          main.removeLine(files[i].toString());
                                     }
                                }
                           }
                      }

                      class Onlyext implements FilenameFilter {
                           String ext;

                           public Onlyext(String ext) {
                                this.ext = "." + ext;
                           }

                           public boolean accept(File dir, String name) {
                                return name.endsWith(ext);
                           }
                      }