1 Reply. Latest reply: May 17, 2010 1:53 PM by 800459

    Having problems creating a zip file with Japanese language character names

    843810
      I have a bunch of files with names in Japanese characters (also Chinese, Korean, Spanish etc, but this will do for an example). The encoding is Unicode.

      I wish to be able to put them in a zip file, then extract them again using any old zip tool (WinZip,PKZip,7-Zip etc)

      Trouble is, every time I do it, the files inside the zipfile end up with garbage character names. The contents of the files are fine, though.

      I'm aware that there used to be a bug in the java.util.zip.* classes regarding character encodings (http://bugs.sun.com/bugdatabase/view_bug.do;jsessionid=5bd4fe01ad8a7b4ec89afef5005da?bug_id=4244499), but as far as I can see it's supposed to have been fixed. I have also tried the ZipOutputStream class from Apache, which allows you to set the encoding manually. No luck there either (just many different varieties of garbage characters).

      Test code below. What am I missing here?

      _______________________________________________________________________________________________________________________________________________

      import java.io.File;
      import java.io.FileInputStream;
      import java.io.FileOutputStream;
      import java.io.IOException;
      import java.io.InputStream;
      import java.io.OutputStream;
      import java.nio.charset.Charset;

      public class ZipTest {

          static String uniqueFileName = "西 純.txt";

          public static void main(String[] args) {
              try {
                  standardZip();
                  apacheZip();
              } catch (IOException fe) {
                  System.out.println("problem in file - exception " + fe.getMessage());
              }
          }

          // Copies raw bytes. Reading through a Reader and writing each char
          // as a byte would corrupt any content that is not plain ASCII.
          private static void copy(File inputFile, OutputStream out) throws IOException {
              try (InputStream in = new FileInputStream(inputFile)) {
                  byte[] buffer = new byte[8192];
                  int n;
                  while ((n = in.read(buffer)) >= 0) {
                      out.write(buffer, 0, n);
                  }
              }
          }

          public static void apacheZip() throws IOException {
              File inputFile = new File(uniqueFileName);
              // Platform default charset (what FileReader.getEncoding() reported).
              // It is not necessarily the encoding a zip tool expects for entry names.
              String encoding = Charset.defaultCharset().name();
              System.out.println("Using Apache - Encoding = " + encoding);
              File zipFile = new File("apacheZip.zip");
              // close() finishes the archive, so no explicit finish() call is needed.
              try (org.apache.tools.zip.ZipOutputStream zipOutputStream =
                      new org.apache.tools.zip.ZipOutputStream(zipFile)) {
                  zipOutputStream.setEncoding(encoding);
                  org.apache.tools.zip.ZipEntry zipEntry =
                          new org.apache.tools.zip.ZipEntry(uniqueFileName);
                  zipEntry.setSize(inputFile.length());
                  zipEntry.setTime(inputFile.lastModified());
                  zipOutputStream.putNextEntry(zipEntry);
                  copy(inputFile, zipOutputStream);
                  zipOutputStream.closeEntry();
              }
          }

          public static void standardZip() throws IOException {
              File inputFile = new File(uniqueFileName);
              System.out.println("Using Java IO - Encoding = " + Charset.defaultCharset().name());
              File zipFile = new File("standardZip.zip");
              // close() finishes the archive and closes the underlying file stream.
              try (java.util.zip.ZipOutputStream zipOutputStream =
                      new java.util.zip.ZipOutputStream(new FileOutputStream(zipFile))) {
                  java.util.zip.ZipEntry zipEntry = new java.util.zip.ZipEntry(uniqueFileName);
                  zipEntry.setSize(inputFile.length());
                  zipEntry.setTime(inputFile.lastModified());
                  zipOutputStream.putNextEntry(zipEntry);
                  copy(inputFile, zipOutputStream);
                  zipOutputStream.closeEntry();
              }
          }
      }

        • 1. Re: Having problems creating a zip file with Japanese language character names
          800459
          Emma_Baillie wrote:
          I have a bunch of files with names in Japanese characters (also Chinese, Korean, Spanish etc, but this will do for an example). The encoding is Unicode.

          I wish to be able to put them in a zip file, then extract them again using any old zip tool (WinZip,PKZip,7-Zip etc)

          Trouble is, every time I do it, the files inside the zipfile end up with garbage character names. The contents of the files are fine, though.
          This is because the zip tool you are using doesn't support Unicode in zip entry names. For example, WinZip prior to 11.2 does not support Unicode characters in filenames. You need to look for similar information for the other tools; you can usually find it on their websites.
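
          To tell a problem in the archive apart from a problem in the extracting tool, it may help to check that the name survives a round trip within Java itself. The following is only a minimal sketch, assuming Java 8 or later (it uses the ZipOutputStream and ZipInputStream constructors that take a Charset, which did not exist in Java 6). The class name ZipNameRoundTrip and its helper methods are made up for illustration. The last helper imitates what a tool would display if it decoded the UTF-8 name bytes with a single-byte legacy charset; ISO-8859-1 is just a stand-in for whatever code page a given tool assumes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, not part of the original test code.
public class ZipNameRoundTrip {

    // Builds a one-entry zip in memory, encoding the entry name as UTF-8.
    public static byte[] zipWithUtf8Name(String entryName, byte[] content) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes, StandardCharsets.UTF_8)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(content);
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the archive is complete.
        return bytes.toByteArray();
    }

    // Reads the archive back and returns the name of its first entry.
    public static String firstEntryName(byte[] zipBytes) {
        try (ZipInputStream zip = new ZipInputStream(
                new ByteArrayInputStream(zipBytes), StandardCharsets.UTF_8)) {
            ZipEntry entry = zip.getNextEntry();
            return entry == null ? null : entry.getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Imitates a tool that decodes UTF-8 name bytes with a legacy charset.
    // ISO-8859-1 is an assumption chosen because every JVM provides it.
    public static String misdecoded(String entryName) {
        byte[] utf8 = entryName.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Unicode escapes avoid depending on the encoding of the source file.
        String name = "\u897F \u7D14.txt";
        byte[] zip = zipWithUtf8Name(name, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("round trip ok: " + name.equals(firstEntryName(zip)));
        System.out.println("misdecoded differs: " + !name.equals(misdecoded(name)));
    }
}
```

          If the round trip succeeds but an external tool still shows garbage, the archive itself is probably fine and the tool is decoding the names with a different charset, which would be consistent with the WinZip limitation mentioned above.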