9 Replies. Latest reply: Oct 15, 2007 6:18 AM by 807592

Reading a file in reverse....

807587 Newbie
Hi,

Does anyone know if there is a free "ReverseFileInputStream" or "ReverseFileReader" available anywhere?

There was a previous thread (about 3 years ago) where someone had a similar issue.

Basically I would like to read a file from the "bottom up", a line at a time, so having the same contract as BufferedReader would be good, but of course it reads the last line first and then works backwards a line at a time.

It would just save me a lot of time if someone knows of one already available (then I don't have to mess about with RandomAccessFile)...

Many thanks in advance,

Gary

P.S. Please bear in mind that the file I want to read is 130MB+ (probably towards 500MB) so I can't just buffer it and read it in reverse... I probably want the last 30MB or so.
  • 1. Re: Reading a file in reverse....
    807587 Newbie
    What's a "line"?
  • 2. Re: Reading a file in reverse....
    807587 Newbie
    http://www.emerle.net/comments/view.cfm/p/185
  • 3. Re: Reading a file in reverse....
    800382 Newbie
    You could use RandomAccessFile: seek to the end and walk backwards, though only one byte at a time. Or you could write a class that uses RandomAccessFile to seek to (end - some chars), read that chunk, reverse the array, and then back up another chunk.
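
    A rough sketch of the chunked idea, not code from this thread. The class name BackwardLines, the method lastLines and its parameters are made up for illustration. It assumes the file is UTF-8 (or plain ASCII) and that lines end in "\n" or "\r\n". Instead of reversing each chunk, it scans the chunk backwards for newline bytes and keeps the bytes of a line that spills into the previous chunk until the rest of that line has been read.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class BackwardLines {

    /** Returns up to maxLines lines, last line first, reading the file in chunks from the end. */
    static List<String> lastLines(String path, int maxLines, int chunkSize) throws IOException {
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long length = raf.length();
            long end = length;            // exclusive end of the region not yet read
            byte[] tail = new byte[0];    // later part of a line whose start is in an earlier chunk
            while (end > 0 && lines.size() < maxLines) {
                int n = (int) Math.min(chunkSize, end);
                long start = end - n;
                byte[] chunk = new byte[n];
                raf.seek(start);
                raf.readFully(chunk);
                int lineEnd = n;          // exclusive end of the current line inside this chunk
                for (int i = n - 1; i >= 0 && lines.size() < maxLines; i--) {
                    if (chunk[i] != '\n') {
                        continue;
                    }
                    // A newline that is the very last byte of the file ends the last
                    // line; it does not start an extra empty one.
                    boolean finalNewline = (start + i == length - 1);
                    if (!finalNewline) {
                        lines.add(decode(concat(chunk, i + 1, lineEnd, tail)));
                    }
                    tail = new byte[0];
                    lineEnd = i;
                }
                // Whatever precedes the first newline continues in the previous chunk.
                tail = concat(chunk, 0, lineEnd, tail);
                end = start;
            }
            // Start of file reached with room left: the remaining bytes are the first line.
            if (length > 0 && lines.size() < maxLines) {
                lines.add(decode(tail));
            }
        }
        return lines;
    }

    /** Returns chunk[from, to) followed by tail. */
    private static byte[] concat(byte[] chunk, int from, int to, byte[] tail) {
        byte[] out = new byte[(to - from) + tail.length];
        System.arraycopy(chunk, from, out, 0, to - from);
        System.arraycopy(tail, 0, out, to - from, tail.length);
        return out;
    }

    /** Decodes one complete line as UTF-8, dropping a trailing '\r' if present. */
    private static String decode(byte[] line) {
        int len = line.length;
        if (len > 0 && line[len - 1] == '\r') {
            len--;
        }
        return new String(line, 0, len, StandardCharsets.UTF_8);
    }
}
```

    Splitting on the byte 0x0A before decoding is what keeps this safe for UTF-8: that byte never occurs inside a multi-byte sequence, so a character cut in half at a chunk boundary is reassembled before it is turned into a String. Calling BackwardLines.lastLines(path, 100, 8192) would return at most the last 100 lines while holding only one chunk plus one partial line in memory.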
  • 4. Re: Reading a file in reverse....
    DrClap Expert
    http://www.emerle.net/comments/view.cfm/p/185
    A little warning: the code posted there encodes bytes to chars by simply casting them. This should work correctly if the encoding of the file is ISO-8859-1. If it's something else, the non-ASCII chars will be mangled. As the author says, works fine for him.
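
    To make the warning concrete, here is a small illustration (the class ByteCastDemo and method castDecode are invented for this example, they are not part of the linked code). It maps each byte to one char by casting, then compares the result for ISO-8859-1 and UTF-8 input. Note the mask with 0xFF: Java bytes are signed, so a plain cast of a byte above 0x7F sign-extends and produces a char in the range 0xFF80 to 0xFFFF rather than the ISO-8859-1 character.

```java
import java.nio.charset.StandardCharsets;

final class ByteCastDemo {

    /** One char per byte via a cast; the mask keeps the value in 0..255. */
    static String castDecode(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e
        byte[] latin1 = original.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // ISO-8859-1 maps bytes 0..255 straight onto the first 256 code points.
        System.out.println(castDecode(latin1).equals(original)); // true

        // In UTF-8 the accented letter is two bytes, so casting yields two wrong chars.
        System.out.println(castDecode(utf8).equals(original));   // false

        // Decoding with the right charset recovers the text.
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(original)); // true
    }
}
```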
  • 5. Re: Reading a file in reverse....
    807587 Newbie
    Thanks guys... that's exactly what I'm looking for, sorry for the loonnnggg delay in responding, been busy...

    G.
  • 6. Re: Reading a file in reverse....
    807587 Newbie
    To give everybody the code, here is the reverse file reader:
    /*******************************************************************
     * Author:          Ryan D. Emerle
     * Date:               10.12.2004
     * Desc:               Reverse file reader.  Reads a file from the end to the
     *                              beginning
     *
     * Known Issues:
     *                              Does not support unicode!
     *******************************************************************/
     
    package org.emerle.fileIO;
    import java.io.*;
    import java.util.*;
    
    public class ReverseFileReader {     
              private String filename;     
              private RandomAccessFile randomfile;     
              private long position;
              
              public ReverseFileReader (String filename) throws Exception {          
                   // Open up a random access file
                   this.randomfile=new RandomAccessFile(filename,"r");
                   // Set our seek position to the end of the file
                   this.position=this.randomfile.length();
                        
                   // Seek to the end of the file
                   this.randomfile.seek(this.position);
                   //Move our pointer to the first valid position at the end of the file.
                   String thisLine=this.randomfile.readLine();
                   while(thisLine == null ) {
                        this.position--;
                        this.randomfile.seek(this.position);
                        thisLine=this.randomfile.readLine();
                        this.randomfile.seek(this.position);
                   }
              }     
              
              // Read one line from the current position towards the beginning
              public String readLine() throws Exception {          
                   int thisCode;
                   char thisChar;
                   String finalLine="";
                   
                   // If our position is less than zero already, we are at the beginning
                   // with nothing to return.
                   if ( this.position < 0 ) {
                             return null;
                   }
                   
                   for(;;) {
                        // we've reached the beginning of the file
                        if ( this.position < 0 ) {
                             break;
                        }
                        // Seek to the current position
                        this.randomfile.seek(this.position);
                        
                        // Read the data at this position (masked to 0..255 so that
                        // bytes above 0x7F do not sign-extend when cast to char)
                        thisCode=this.randomfile.readByte() & 0xFF;
                        thisChar=(char)thisCode;
                        
                        // If this is a line break or carriage return, stop looking
                        if (thisCode == 13 || thisCode == 10 ) {
                             // See if the previous character is also a line break character.
                             // this accounts for crlf combinations
                             // (skipped at position 0, where there is no previous byte)
                             if ( this.position > 0 ) {
                                  this.randomfile.seek(this.position-1);
                                  int nextCode=this.randomfile.readByte();
                                  if ( (thisCode == 10 && nextCode == 13) || (thisCode == 13 && nextCode == 10) ) {
                                       // If we found another linebreak character, ignore it
                                       this.position=this.position-1;
                                  }
                             }
                             // Move the pointer for the next readline
                             this.position--;
                             break;
                        } else {
                             // This is a valid character append to the string
                             finalLine=thisChar + finalLine;
                        }
                        // Move to the next char
                        this.position--;
                   }
                   // return the line
                   return finalLine;
              }     
    }
  • 7. Re: Reading a file in reverse....
    807587 Newbie
    What's a "line"?
    A line is a group of consecutive characters that ends with an "end of line" character...

    By extension, I suppose one could consider a file with no "end of line" at all as one line, so we could rewrite our definition:

    A line is a group of consecutive characters that ends with an "end of line" character or at the end of the file...
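
    For comparison, BufferedReader.readLine() follows essentially this definition: a line ends at "\n", "\r", "\r\n" or at the end of the input. A small check (LineDefinitionDemo and lines are names made up for this illustration):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

final class LineDefinitionDemo {

    /** Splits text into lines exactly as BufferedReader.readLine() sees them. */
    static List<String> lines(String text) throws IOException {
        List<String> out = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(lines("a\nb"));      // [a, b]  (last line ended by end of input)
        System.out.println(lines("a\r\nb\n"));  // [a, b]  (a final line break adds no extra line)
        System.out.println(lines("a\n\nb"));    // [a, , b]  (an empty line still counts)
    }
}
```

    A reverse reader that wants "the same contract as BufferedReader" would need to return these same lines, only in the opposite order.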
  • 8. Re: Reading a file in reverse....
    807592 Newbie
    Hi

    Problem:
    Suppose a file contains thousands of lines and we want to read only the last N lines, similar to the "tail" command.

    Constraints:
    1 Minimize the number of file reads.
    Avoid reading the complete file to get the last few lines.
    2 Minimize JVM memory usage.
    Avoid holding the complete file in memory.

    Approach:
    Read a chunk of characters from the end of the file. One chunk should contain multiple lines. Reverse this chunk and extract the lines. Repeat until the required N lines have been collected. This way we read and store only the required part of the file.

    Below is a utility program:
    -------------------------------------------------------------------------------
     
    import java.util.*;
    import java.io.RandomAccessFile;
    
    public class FileHelper {
    
         /**
          * @param args
          */
         public static void main(String[] args) 
         {
              String dir = "C:\\";
              String fileName = args[0];
              int lineCount = Integer.parseInt(args[1]);
              FileHelper.tail(dir + fileName, lineCount);          
         }
    
         public static Vector tail(String fileName, int lineCount)
         {
              return tail(fileName, lineCount, 2000);
         }
    
         /**
          * Given a byte array this method:
          * a. creates a String out of it
          * b. reverses the string
          * c. extracts the lines
          * d. characters in extracted line will be in reverse order, 
          *    so it reverses the line just before storing in Vector.
          *     
          *  On extracting required number of lines, this method returns TRUE, 
          *  Else it returns FALSE.
          *   
          * @param bytearray
          * @param lineCount
          * @param lastNlines
          * @return
          */
         private static boolean parseLinesFromLast(byte[] bytearray, int lineCount, Vector lastNlines)
         {
              String lastNChars = new String (bytearray);
              StringBuffer sb = new StringBuffer(lastNChars);
              lastNChars = sb.reverse().toString();
              StringTokenizer tokens= new StringTokenizer(lastNChars,"\n");
              while(tokens.hasMoreTokens())
              {
                   StringBuffer sbLine = new StringBuffer((String)tokens.nextToken());               
                   lastNlines.add(sbLine.reverse().toString());
                   if(lastNlines.size() == lineCount)
                   {
                        return true;//indicates we got 'lineCount' lines
                   }
              }
              return false; //indicates didn't read 'lineCount' lines
         }
         
         /**
          * Reads last N lines from the given file. File reading is done in chunks.
          * 
          * Constraints:
          * 1 Minimize the number of file reads -- Avoid reading the complete file
          * to get last few lines.
          * 2 Minimize the JVM in-memory usage -- Avoid storing the complete file 
          * info in in-memory.
          *
          * Approach: Read a chunk of characters from end of file. One chunk should
          * contain multiple lines. Reverse this chunk and extract the lines. 
          * Repeat this until you get required number of last N lines. In this way 
          * we read and store only the required part of the file.
          * 
          * 1 Create a RandomAccessFile.
          * 2 Get the position of last character using (i.e length-1). Let this be curPos.
          * 3 Move the cursor to fromPos = (curPos - chunkSize). Use seek().
          * 4 If fromPos is less than or equal to ZERO then go to step-5. Else go to step-6
          * 5 Read characters from beginning of file to curPos. Go to step-9.
          * 6 Read 'chunksize' characters from fromPos.
          * 7 Extract the lines. On reading required N lines go to step-9.
          * 8 Repeat step 3 to 7 until 
          *               a. N lines are read.
          *          OR
          *               b. All lines are read when num of lines in file is less than N. 
          * The earliest line in a chunk may be incomplete, so discard it. Modify curPos appropriately.
          * 9 Exit. Got N lines or less than that.
          *
          * @param fileName
          * @param lineCount
          * @param chunkSize
          * @return
          */
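         public static Vector tail(String fileName, int lineCount, int chunkSize)
         {
              if(chunkSize <= 0)
              {
                   throw new IllegalArgumentException("chunkSize must be positive");
              }
              RandomAccessFile raf = null;
              try
              {
                   raf = new RandomAccessFile(fileName,"r");
                   Vector lastNlines = new Vector();
                   if(lineCount <= 0)
                   {
                        return lastNlines;
                   }
                   // curPos is the exclusive end of the unread region, so it starts at
                   // length (not length-1); otherwise the final byte is never read.
                   long curPos = raf.length();
                   long fromPos;
                   byte[] bytearray;
                   while(true)
                   {
                        fromPos = curPos - chunkSize;
                        if(fromPos <= 0)
                        {
                             // At most chunkSize bytes remain: read from the start of the file.
                             raf.seek(0);
                             bytearray = new byte[(int)curPos];
                             raf.readFully(bytearray);
                             parseLinesFromLast(bytearray, lineCount, lastNlines);
                             break;
                        }
                        raf.seek(fromPos);
                        bytearray = new byte[chunkSize];
                        raf.readFully(bytearray);
                        // Bytes before the first '\n' belong to a line that starts in an
                        // earlier chunk; leave them unread and parse only what follows.
                        int firstNewline = 0;
                        while(firstNewline < chunkSize && bytearray[firstNewline] != '\n')
                        {
                             firstNewline++;
                        }
                        if(firstNewline == chunkSize)
                        {
                             // No line break in this chunk: retry with a larger chunk.
                             chunkSize = chunkSize * 2;
                             continue;
                        }
                        byte[] complete = new byte[chunkSize - firstNewline - 1];
                        System.arraycopy(bytearray, firstNewline + 1, complete, 0, complete.length);
                        if(parseLinesFromLast(complete, lineCount, lastNlines))
                        {
                             break;
                        }
                        // The next chunk ends just before that first '\n'.
                        curPos = fromPos + firstNewline;
                   }
                   Enumeration e = lastNlines.elements();
                   while(e.hasMoreElements())
                   {
                        System.out.println(e.nextElement());
                   }
                   return lastNlines;
              }
              catch(Exception e)
              {
                   e.printStackTrace();
                   return null;
              }
              finally
              {
                   if(raf != null)
                   {
                        try { raf.close(); } catch(Exception ignored) { }
                   }
              }
         }     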
    }
    -------------------------------------------------------------------------------

    Best Regards,
    Chary
  • 9. Re: Reading a file in reverse....
    807592 Newbie
    I have concocted the following:

    //herman vierendeels belgium
    //read file backward
    //java be/dekamer/dbtools/reverseFileReader /tmp/xx
    //
    // reverseFileReader /tmp/x1 > /tmp/x2
    // reverseFileReader /tmp/x2 > /tmp/x3
    // diff x1 x3
    // try and compare also with tac
    package be.dekamer.dbtools;

    import java.io.*;
    import java.util.logging.*;

    import hvr4.source.java.allerlei.H4Object;

    public class reverseFileReader
    {
    private static Logger logger=Logger.getLogger(reverseFileReader.class.getName());

    private RandomAccessFile randomfile;
    private long position;
    String charset="UTF-8";

    public reverseFileReader(String filename)
    throws Exception
    {          
    this.randomfile=new RandomAccessFile(filename,"r");
    this.position=this.randomfile.length()-1;
    //logger.info("position="+position);

    char c=readValidChar();
    if(c=='\n' || c=='\r')
    {
    long oldpos=position;     
    c=readValidChar();
    if(c!='\n' && c!='\r') position=oldpos;//13 10
    }
    else
    {
    throw(new RuntimeException("invalid char"));
    }     
    //logger.info("position="+position);
    }//constructor
    private char readValidChar()
    throws java.io.IOException      
    {
    char rv;     
    byte[] vs=new byte[2];
    String vss=null;

    randomfile.seek(position);
    vs[1]=randomfile.readByte();
    position--;

    if(Character.isValidCodePoint(vs[1]))
    {
    vss=new String(vs,1,1,charset);
    }
    else
    {
    randomfile.seek(position);
    vs[0]=randomfile.readByte();
    position--;

    vss=new String(vs,charset);

    }
    if(vss.length()!=1)
    {
    throw(new RuntimeException(position+" vss.length()="+vss.length()+" vss="+H4Object.string2hexstring(vss)+" vs="+H4Object.bytes2hexstring(vs)));
    }
    //byte[] utf8=vss.getBytes(charset);
    rv=vss.charAt(0);
    //logger.info("rv="+rv);
    return(rv);
    }//private char readValidChar()
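    public String readLine()
    throws Exception
    {          
    char c;
    String finalLine="";

    if(position<0){return null;}

    while(position>=0)
    {
    c=readValidChar();
    if(c=='\n' || c=='\r')
    {
    // Look at the preceding char, if there is one, and consume it only
    // when it completes a CR LF pair; otherwise restore the position so
    // that blank lines and multi-byte chars are not skipped.
    if(position>=0)
    {
    long oldpos=position;
    char prev=readValidChar();
    if(!(c=='\n' && prev=='\r')) position=oldpos;
    }
    break;
    }
    else
    {
    finalLine=c+finalLine;
    }
    }//while
    return finalLine;
    }//readLine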
    public void close()
    throws java.io.IOException     
    {
    randomfile.close();
    }     
         
              
    public static void main(String args[])
    throws Exception
    {
    String line=null;

    reverseFileReader rfr=new reverseFileReader(args[0]);

    line=rfr.readLine();
    while(line!=null)
    {      
    System.out.println(line);
    line=rfr.readLine();
    }
    //
    rfr.close();
    }//main
    }//public class reverseFileReader