4 Replies Latest reply: Dec 15, 2010 7:23 AM by 800928

    the concurrent io problem when using RandomAccessFile

    800928
      Hi:
      In my application I have to export the Tomcat access log (a file such as "localhost_access_log.2010-10-13") to a database and then do some analysis on it.

      My way:

      Start the export at 00:05:00 every day; at that moment, only read logs dated yesterday or earlier.
      For example, at 2010-12-12 00:05:00, read the logs of 2010-12-11 ... 2010-12-01 ... 2010-11-12 (only the nearest 30 days are checked).
      All of this data goes into one table named "log".
      If the log of one day is exported successfully, insert one record into another table named "logrecord".
      ----------------------------------
      //main code fragment:
           public void start() {
                // run the export once as soon as the server starts up
                run();
                // then schedule the work to repeat once a day
                new ScheduledThreadPoolExecutor(5).scheduleAtFixedRate(this, getStartTime(), 24 * 3600,
                          TimeUnit.SECONDS);
           }

           // return the delay in seconds from now until 00:05:00 of tomorrow
           private long getStartTime() {
                Date d = new Date();
                long t = (DateUtil.getNextDayAtMiddleTime(d).getTime() - d.getTime()) / 1000 + 300;
                return t;
           }

           @Override
           public void run() {
                Date days[] = DateUtil.getSeveralDayRangeByTime(30); // only the nearest 30 days
                for (Date d : days) {
                     if (exist(d)) {
                          continue; // this day was already exported
                     }
                     exportLogByDate(d);
                }
           }
      ------------------------------------------

      It works for now, except that we cannot analyze the data of today.

      However, we need that now.

      As far as I can see, I want to create a new table with the same structure as the former table "log", used to hold the log of "today" only.
      At 00:05:00 every day, after the normal log export (which exports the nearest 30 days' logs except today), export the log of today.

      It sounds easy: read the content, parse, filter, insert, just like what I did before.

      But the Tomcat log file is rotated by day. So in my normal log export, the log files (nearest 30 days) are no longer being written by Tomcat, and I can open/close them as I like.

      However, if I try to read the log of today, the log file may still be in use by Tomcat, which is appending log entries to it.

      I prefer to use RandomAccessFile to read the log of today.

      But I am confused by the concurrent I/O problem: what happens if I am reading a line while Tomcat is still writing that line?

      Here is my test example:
      -------------------------------
      package com.test;
      
      import java.io.BufferedWriter;
      import java.io.File;
      import java.io.FileNotFoundException;
      import java.io.FileWriter;
      import java.io.IOException;
      import java.io.RandomAccessFile;
      
      import org.junit.BeforeClass;
      import org.junit.Test;
      
      public class TestMain {
           private static File          file;
           private static long          pos; //record the position of last time
           private static Thread     writterThread;
      
           @BeforeClass
           public static void init() {
                file = new File("D:/random.txt");
                // build the thread for simulating the tomcat write log
                writterThread = new Thread(new Runnable() {
                     @Override
                     public void run() {
                          FileWriter fw;
                          try {
                               fw = new FileWriter(file);
                               BufferedWriter bw = new BufferedWriter(fw);
                               int i = 0;
                               while (true) {
                                    i++;
                                    bw.append(i + " added to line...");
                                    bw.append("\r\n");
                                    bw.flush();
                                    Thread.sleep(5000);
                               }
                          } catch (IOException e) {
                               e.printStackTrace();
                          } catch (InterruptedException e) {
                               e.printStackTrace();
                          }
                     }
                });
           }
      
           @Test
           public void testRandomRead() throws IOException, InterruptedException {
                writterThread.start();
                try {
                     RandomAccessFile raf = new RandomAccessFile(file, "r");
                     String line;
                     while ((line = raf.readLine()) != null) {
                          System.out.println(line);
                     }
                     pos = raf.getFilePointer();
                     raf.close();
      
                     // read the file every 30 seconds for 2 minutes (just for a test)
                     for (long m = 0; m < 1000 * 60 * 2; m += 30000) {
                          raf = new RandomAccessFile(file, "r");
                          raf.seek(pos);
                          while ((line = raf.readLine()) != null) {
                               System.out.println(line);
                          }
                          pos = raf.getFilePointer();
                          raf.close();
                          Thread.sleep(30000);
                     }
      
                } catch (FileNotFoundException e) {
                     e.printStackTrace();
                }
           }
      }
      ----------------------------

      The normal output is something like:
      1 added to line...
      2 added to line...
      3 added to line...
      4 added to line...
      5 added to line...
      ......

      However I always get the following output:
      1
      added to line...
      2 added to line...
      3 added to line...
      4 added to line...
      5
      added to line...

      That is to say, the RandomAccessFile sometimes reads a line that has not yet been completely written by the writer thread.

      So, I have two questions now:

      1) How about my normal log export? Is there anything that can be improved?

      2) How to solve the concurrent I/O problem when exporting the log of today?
        • 1. Re: the concurrent io problem when using RandomAccessFile
          802316
          Once a file reader finds the end of the file, it cannot be reused to keep reading.

          The only approach is to repeatedly try to read data from the point where you last read successfully, by polling the file. BufferedReader may not be suitable because you don't know whether you have a newline-terminated line, i.e. whether the line is complete. You can poll File.length() to see whether it has changed. Note: if the length shrinks, you should assume the file has been truncated.

          Basically, files are a very poor medium for communicating data between processes. You just have to make the best of it.
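          The polling idea described above can be sketched roughly like this (the class name `LogPoller` and method `pollOnce` are illustrative, not from the thread; only the length check and truncation handling are shown, not the incomplete-line handling):

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Minimal sketch of the polling approach: remember the byte offset of the
// last successful read, compare it to File.length() on each poll, and start
// over from offset 0 if the file has shrunk (truncation/rotation).
public class LogPoller {
    private long pos = 0; // byte offset reached by the previous poll

    public String pollOnce(File file) throws IOException {
        long len = file.length();
        if (len < pos) {
            pos = 0; // file shrank: assume truncation and restart
        }
        if (len == pos) {
            return ""; // nothing new since the last poll
        }
        RandomAccessFile raf = new RandomAccessFile(file, "r");
        try {
            raf.seek(pos);
            byte[] buf = new byte[(int) (len - pos)];
            raf.readFully(buf); // read exactly the bytes appended since last poll
            pos = len;          // advance only past data we actually read
            return new String(buf);
        } finally {
            raf.close();
        }
    }
}
```

          In a real exporter this `pollOnce` would be called from a loop with a sleep between iterations, and `pos` would be persisted so a restart does not re-import old lines.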
          • 2. Re: the concurrent io problem when using RandomAccessFile
            800928
            Thanks for your reply. :)
            Peter Lawrey wrote:
            Once a file reader finds the end, it cannot be reused to keep reading the file.

             The only approach is to repeatedly try to read data from the point where you last read successfully, by polling the file. BufferedReader may not be suitable because you don't know whether you have a newline-terminated line, i.e. whether the line is complete. You can poll File.length() to see whether it has changed. Note: if the length shrinks, you should assume the file has been truncated.
             I am not exactly sure what you mean; can you show me a live example?

             > Basically, files are a very poor medium for communicating data between processes. You just have to make the best of it.
             Is there any other way?
            • 3. Re: the concurrent io problem when using RandomAccessFile
              802316
              You can:
              - check the length to see if it has grown since the last time you checked. If it has shrunk, start again from the beginning of the file.
              - if it is longer, open the file and seek to the last point read.
              - read the text up to the last newline in the file (there might be no newline yet).
              - close the file and remember where you were up to (the start of the last incomplete line).
              - wait a bit and repeat.
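              The steps above can be sketched as follows (the class name `CompleteLineReader` is mine; the key point is that only text up to the last '\n' is consumed, so a half-written line is left for the next poll):

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Sketch of tailing a file while only ever consuming complete lines:
// remember the start of the first not-yet-complete line, and on each poll
// read from there up to the last newline currently in the file.
public class CompleteLineReader {
    private long pos = 0; // start of the first line not yet fully read

    public String readCompleteLines(File file) throws IOException {
        long len = file.length();
        if (len < pos) pos = 0;    // file shrank: assume truncation, restart
        if (len == pos) return ""; // nothing new
        RandomAccessFile raf = new RandomAccessFile(file, "r");
        try {
            raf.seek(pos);
            byte[] buf = new byte[(int) (len - pos)];
            raf.readFully(buf);
            // find the last newline; bytes after it belong to an unfinished line
            int lastNl = -1;
            for (int i = buf.length - 1; i >= 0; i--) {
                if (buf[i] == '\n') { lastNl = i; break; }
            }
            if (lastNl < 0) return ""; // no complete line yet; keep pos as-is
            pos += lastNl + 1;         // advance only past complete lines
            return new String(buf, 0, lastNl + 1);
        } finally {
            raf.close();
        }
    }
}
```

              This is exactly why the "1" / "added to line..." split appeared in the test output earlier in the thread: the reader consumed bytes that were written but not yet newline-terminated. Deferring everything after the last newline avoids that.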

              You can also write the data to a socket connection as well as to the file. Reading updates as they happen on a socket is much simpler. A better solution might be to use a JMS queue or topic to distribute the information in the logs. (More complex, but the most flexible solution.)
              • 4. Re: the concurrent io problem when using RandomAccessFile
                800928
                Peter Lawrey wrote:
                 You can:
                 - check the length to see if it has grown since the last time you checked. If it has shrunk, start again from the beginning of the file.
                 - if it is longer, open the file and seek to the last point read.
                 - read the text up to the last newline in the file (there might be no newline yet).
                 - close the file and remember where you were up to (the start of the last incomplete line).
                 - wait a bit and repeat.
                 But how do I decide whether a line is complete?

                 Also, what if the RandomAccessFile task cannot finish before the next one starts?
                 For example: the work starts at 02:00, reading the Tomcat log file line by line and exporting the lines to the database. During this time Tomcat keeps writing to the same file (users hit the server all the time), so the RandomAccessFile keeps reading accordingly. When it is 03:00 the last task is still not complete, but a new task should start. How do I control this?
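                 (The overlap question was left unanswered in the thread. One relevant fact: `scheduleAtFixedRate` on a `ScheduledThreadPoolExecutor` never runs the same task concurrently with itself; if a run overruns its period, the next run starts late rather than in parallel. An explicit guard can still make the skip-if-busy policy visible; a sketch, with an illustrative class name and counter:)

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch: skip a scheduled run if the previous export is still in progress.
// ScheduledThreadPoolExecutor itself never runs the *same* task concurrently
// (a late run is delayed, not overlapped), but this guard documents the
// policy and also protects against additional manual run() calls.
public class GuardedExport implements Runnable {
    private final AtomicBoolean running = new AtomicBoolean(false);
    private int completedRuns = 0; // for illustration only

    @Override
    public void run() {
        if (!running.compareAndSet(false, true)) {
            return; // previous export still in progress: skip this cycle
        }
        try {
            completedRuns++; // a real exporter would read and insert log lines here
        } finally {
            running.set(false); // always release the guard, even on failure
        }
    }

    public int getCompletedRuns() {
        return completedRuns;
    }
}
```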