3 Replies Latest reply: Sep 1, 2009 1:08 AM by 796440

    Identifying whether a byte array consists of compressed data

    807580
      Hi,

      I'm new to java.util.zip.

      I'm trying to inflate data from a byte array that was originally retrieved from a database. However, I need to check at runtime whether the data in the byte[] was compressed (using java.util.zip.Deflater) or is simply the uncompressed byte[] representation of a String.

      Essentially I want to have something like "isInflatable()" so that I can check it before attempting inflation using java.util.zip.Inflater.

      Is there a way to do this without actually attempting inflation which may or may not succeed depending on the data in the byte array?

      Thanks,
      Rahul
        • 1. Re: Identifying whether a byte array consists of compressed data
          796440
          You can try to inflate it. If it works, it was compressed. If it fails, it was not compressed, or it got corrupted.
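
A minimal sketch of the "just try it" approach, assuming the data was written by a default java.util.zip.Deflater (zlib wrapper, no preset dictionary). The class and method names (InflateProbe, tryInflate, deflate) are made up for illustration and do not come from the thread.

```java
import java.io.ByteArrayOutputStream;
import java.util.Optional;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper; the names here are illustrative only.
class InflateProbe {

    // Returns the inflated bytes, or empty if the input is not a complete zlib stream.
    static Optional<byte[]> tryInflate(byte[] data) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(data);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                out.write(buffer, 0, n);
                if (n == 0 && !inflater.finished()) {
                    // No progress: input is truncated or requires a preset dictionary.
                    return Optional.empty();
                }
            }
            return Optional.of(out.toByteArray());
        } catch (DataFormatException e) {
            // Not valid zlib data (or corrupted).
            return Optional.empty();
        } finally {
            inflater.end(); // release native resources
        }
    }

    // Companion helper so the round trip can be exercised; uses default Deflater settings.
    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(data);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }
}
```

With this, a caller could write something like `tryInflate(bytes).orElse(bytes)` and then build the String from the result. For ordinary text the attempt will usually fail at the header check, before any real decompression work is done, so the cost of a failed attempt tends to be small.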

          Or you could look up the spec for the compression format and see if it has any particular "signature" bytes in the header that might tell you whether the data is compressed in that format. (By default, java.util.zip.Deflater writes the zlib format, RFC 1950, which starts with a two-byte header.) Of course, that's not foolproof. Some other data might happen to start with those bytes even though it's not compressed.
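
A rough sketch of such a header check, assuming the zlib wrapper (RFC 1950) that a default Deflater writes. It does not apply to raw deflate data produced with nowrap=true, which has no header at all. The names ZlibHeader and looksLikeZlib are hypothetical.

```java
// Hypothetical helper; checks only the two-byte zlib header, not the data behind it.
class ZlibHeader {

    static boolean looksLikeZlib(byte[] data) {
        if (data == null || data.length < 2) {
            return false;
        }
        int cmf = data[0] & 0xFF; // compression method and window size
        int flg = data[1] & 0xFF; // flags, including the header check bits

        boolean deflateMethod = (cmf & 0x0F) == 8;       // CM = 8 means "deflate"
        boolean windowOk = (cmf >> 4) <= 7;              // CINFO above 7 is not allowed
        boolean checkOk = ((cmf << 8) | flg) % 31 == 0;  // CMF*256 + FLG must be a multiple of 31
        return deflateMethod && windowOk && checkOk;
    }
}
```

This is a hint, not proof. For example, the bytes 0x78 0x9C (a common header from a default Deflater) pass, but so does any text that starts with "x " (0x78 0x20), because 0x7820 is also a multiple of 31. A positive result still has to be confirmed by actually inflating.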

          Why do you not want to attempt inflation?

          A better approach might be to just add another column that tells you what format the column in question is in.
          • 2. Re: Identifying whether a byte array consists of compressed data
            807580
            Thanks.

            In my specific application, I want to retrieve data from multiple columns, some containing compressed data and others uncompressed data. I just wanted to see whether I could do the retrieval and the conversion to the final String generically, so that I don't need special handling per column. Attempting inflation may be expensive. And what if uncompressed data happens to inflate successfully and I end up with garbage? Isn't that possible?

            Rahul
            • 3. Re: Identifying whether a byte array consists of compressed data
              796440
              It's possible but unlikely that you'll have uncompressed data that will happen to correspond to valid compressed data.

              Really the right way to do it is to have a separate indicator of whether the given data is compressed.

              And if you're saying that some columns will always be compressed and some will never be compressed, and you're just trying to detect by the data even though knowing which column it is gives you that information already, then you're definitely going about it the wrong way.