5 Replies Latest reply: Jun 16, 2008 5:20 PM by 807591

    Problem Reading in File (Extra Chars?)

    807591
      I'm having a small (but blocking) problem with a client-server application. Here's the scoop. The client side runs a Windows batch file which simply calls a few commands and redirects their output to a text file. Here's an example:
      wmic path win32_localtime get day^,hour^,minute^,month^,second^,year > proclist.txt
      wmic process list full >> proclist.txt
      Okay, great, the file looks as it should. Date/time stamped at the top and full process information below. Now to send it to the server. Here's the client-side 'sending' code:
        public static void transferTo(String file, String host, int port) throws IOException 
        {
          File toXfer = new File(file);
          InputStream in = new FileInputStream(toXfer);
          Socket socket = new Socket(host, port);
          OutputStream output = socket.getOutputStream();
          byte[] buff = new byte[socket.getSendBufferSize()];
          int bytesRead = 0;
          while((bytesRead = in.read(buff))>0)
          {
            output.write(buff,0,bytesRead);
          }
          in.close();
          output.close();
          socket.close();
          //TODO: Verify transfer with MD5, for now, just delete the file
          toXfer.delete();
        }
      Cool, now here's the server-side 'receive code':
            InputStream input = socket.getInputStream();
            String filename = socket.getRemoteSocketAddress().toString();
            StringTokenizer st = new StringTokenizer(filename,"/:");
            filename = st.nextToken();
            filename = System.getProperty("user.dir") + "\\wipbin\\" + filename + "-" + System.currentTimeMillis() + ".txt";
            FileOutputStream wr = new FileOutputStream(new File(filename));
            byte[] outBuffer = new byte[socket.getReceiveBufferSize()];
            int bytesReceived = 0;
            double xferSize = 0;
            long start = System.currentTimeMillis();
            while((bytesReceived = input.read(outBuffer))>0)
            {
              wr.write(outBuffer,0,bytesReceived);
              xferSize+=bytesReceived;
            }
            long end = System.currentTimeMillis();
            double xferTime = ((double) (end - start)) / 1000;
            this.log.write("File transfer of " + xferSize/1024 + " KB from " + socket.getRemoteSocketAddress().toString().substring(1) + " completed in " + xferTime + " seconds", Log.NORMAL);
            wr.close();
            input.close();
            this.log.write("Closing connection with " + socket.getRemoteSocketAddress().toString().substring(1) + " on Thread#" + this.getId() + ".", Log.NORMAL);
            socket.close();
            //TODO: Verify transfer with MD5
      Now, eventually, the server will comb through this file to analyze it and pull out the appropriate data. Here's what I'm trying to do just to read in the first line of the file:
            FileReader in = new FileReader(f);
            BufferedReader input = new BufferedReader(in);
            String firstLine = input.readLine();
            System.out.println(firstLine);
            if (firstLine.contains("System")) //MS Info Report
            {
              System.out.println(f.toString() + " is an info report.");
              type = Analyser.MSINFOREPORT;
            }
            else if (firstLine.contains("Day")) //Process List
            {
              System.out.println(f.toString() + " is a process list.");
              type = Analyser.MSPROCESSLIST;
            }
            else //error
            {
              System.out.println(f.toString() + " is an unknown file type.");
              type = Analyser.ERROR;
            }
      Here's what the first line of each file type should look like:

      MSINFOREPORT: System Information report written at: 06/15/08 15:02:54
      MSPROCESSLIST: Day Hour Minute Month Second Year

      And every time that last code snippet looks at two files (one of each kind), it spits this out:

      [Link to Image of Output|http://i25.tinypic.com/15d1fv5.jpg]

      Why is the file all jacked-up (for lack of a better term) when I read it back in? Is the file encoding getting changed/manipulated somewhere that I'm not catching? Am I having a special moment and reading/writing to the socket incorrectly? Any help will be greatly appreciated, and thank you in advance!

      ~Kelly
        • 1. Re: Problem Reading in File (Extra Chars?)
          807591
          I also forgot to add, I've just manually copied the same files from the client to the server instead of using the program's socket to copy it and the same problem resulted. Thanks again in advance for any assistance.

          ~Kelly
          • 2. Re: Problem Reading in File (Extra Chars?)
            807591
            The file appears to be encoded as UTF-16. Try this:
            InputStreamReader in = new InputStreamReader(new FileInputStream(f), "UTF-16");
            BufferedReader input = new BufferedReader(in);
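
The suggestion above can be tried in isolation. What follows is only a minimal sketch, under the assumption that the file is UTF-16 with a byte-order mark (BOM) at the start; the thread does not show the raw bytes, so that part is a guess. The class `FirstLineDemo` and its helpers `writeSample` and `readFirstLine` are made-up names for illustration. ISO-8859-1 stands in for whatever single-byte default encoding the platform uses.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class FirstLineDemo {

    // Reads the first line of a file, decoding its bytes with the given charset.
    static String readFirstLine(Path file, Charset charset) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), charset))) {
            return reader.readLine();
        }
    }

    // Writes a sample file: the BOM bytes FF FE followed by UTF-16LE text.
    // (Assumed layout; stands in for the real proclist.txt.)
    static Path writeSample() throws IOException {
        Path file = Files.createTempFile("proclist", ".txt");
        byte[] text = "Day  Hour  Minute\r\n".getBytes(StandardCharsets.UTF_16LE);
        byte[] bytes = new byte[text.length + 2];
        bytes[0] = (byte) 0xFF;
        bytes[1] = (byte) 0xFE;
        System.arraycopy(text, 0, bytes, 2, text.length);
        Files.write(file, bytes);
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path file = writeSample();

        // Single-byte decoding turns every UTF-16 code unit into two chars,
        // so NUL characters end up between the letters and contains() fails.
        String wrong = readFirstLine(file, StandardCharsets.ISO_8859_1);

        // The "UTF-16" decoder uses the BOM to pick the byte order and drops it.
        String right = readFirstLine(file, StandardCharsets.UTF_16);

        System.out.println("single-byte read contains \"Day\": " + wrong.contains("Day"));
        System.out.println("UTF-16 read contains \"Day\": " + right.contains("Day"));
        Files.delete(file);
    }
}
```

If a file has no BOM, Java's "UTF-16" decoder assumes big-endian byte order, so a little-endian file without a BOM would need "UTF-16LE" named explicitly.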
            • 3. Re: Problem Reading in File (Extra Chars?)
              807591
              Ahh!!! That did it! Thanks so much!

              Just another question, how do you think it got encoded as UTF-16? This is all running on a Vista Business workstation, is UTF-16 the default encoding of Vista? I've just never had a problem reading/writing files this way until now.
              • 4. Re: Problem Reading in File (Extra Chars?)
                807591
                The phrase "default encoding" is a little ambiguous. I believe Windows has used UTF-16 internally since NT. Applications running on Windows usually use one of the windows-125x code pages by default (the last digit depends on the locale; Java also knows them as Cp125x), and that's what Java treats as the default encoding. But if you export a part of the Registry, the resulting .reg file will be in UTF-16, so it doesn't surprise me to see wmic creating UTF-16 files. By the way, when you view the files in Notepad, it auto-detects the encoding, so you wouldn't know it was UTF-16 unless you actively checked.
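
A program can do a rough version of the "actively checking" mentioned above by looking for a byte-order mark (BOM) at the start of the file. This is only a sketch: `BomSniffer` and `detect` are hypothetical names, the check covers just the UTF-8 and UTF-16 BOMs, and a file without a BOM simply gets the fallback charset. It is not a description of how Notepad itself detects encodings.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class BomSniffer {

    // Guesses a charset from the first bytes of a file.
    // Returns the fallback when no known BOM is present.
    static Charset detect(byte[] head, int len, Charset fallback) {
        if (len >= 3 && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            // Note: Java's UTF-8 decoder keeps the BOM as the character U+FEFF.
            return StandardCharsets.UTF_8;
        }
        if (len >= 2) {
            int b0 = head[0] & 0xFF;
            int b1 = head[1] & 0xFF;
            // FF FE = little-endian, FE FF = big-endian. Java's "UTF-16"
            // decoder reads the BOM itself, so one charset covers both.
            if ((b0 == 0xFF && b1 == 0xFE) || (b0 == 0xFE && b1 == 0xFF)) {
                return StandardCharsets.UTF_16;
            }
        }
        return fallback;
    }

    // Convenience overload: reads up to three bytes from the file and sniffs them.
    static Charset detect(Path file, Charset fallback) throws IOException {
        byte[] head = new byte[3];
        int len = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while (len < head.length
                    && (n = in.read(head, len, head.length - len)) > 0) {
                len += n;
            }
        }
        return detect(head, len, fallback);
    }

    public static void main(String[] args) throws IOException {
        // What Java treats as the default encoding on this machine.
        System.out.println("Default charset: " + Charset.defaultCharset());
        if (args.length > 0) {
            Path file = Path.of(args[0]);
            System.out.println(file + " looks like: "
                    + detect(file, Charset.defaultCharset()));
        }
    }
}
```

The charset returned by `detect` can be passed straight to an `InputStreamReader`, so the same reading code handles both the UTF-16 files and ordinary files in the default encoding.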
                • 5. Re: Problem Reading in File (Extra Chars?)
                  807591
                  Thanks for the clarification. That makes a lot more sense. Glad that you mentioned the Notepad thing because I did check that and it perplexed me even more. Thanks so much again. :)