JDK-8028804 : Deflater.deflateBytes() may produce corrupted output on Deflater level/strategy change
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.jar
  • Affected Version: 7,8
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • OS: generic
  • CPU: generic
  • Submitted: 2013-11-21
  • Updated: 2018-09-11
Other: tbd (Unresolved)
Description
Deflater.deflateBytes() may produce corrupted output when it is called after the Deflater level or strategy has been changed.

In deflateBytes() in Deflater.c, zlib's deflateParams() is called after the level or strategy has changed. deflateParams() attempts to flush all outstanding output by calling deflate(..., Z_BLOCK) exactly once. A single call may not be enough if the output buffer is too small to hold the whole output; in that case, the output will not have been completely flushed before the compression parameters are changed. As a result, the compressed data may be corrupted.

The following test shows the problem:

/* Copyright (c) 2001-2007 by SAP AG, Walldorf, Germany.
 */

import java.io.ByteArrayOutputStream;
import java.util.Random;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/**
 * This test stresses the Deflater's ability to change compression
 * parameters while compressing; the resulting z stream should be
 * valid and not corrupted.
 *
 * @author Thomas Stuefe
 */
public class DeflaterFlushTest {
    
    public static void main(String[] args) {
      new DeflaterFlushTest().testDeflateParams();
    }

    public DeflaterFlushTest() {
    }
    
    private static int durationSeconds = 120;
    
    private boolean compareByteArrays(byte[] a, byte[] b) {
        if (a.length != b.length) {
            System.out.println("Lengths differ: " + a.length + " vs " + b.length + ".");
            return false;
        }
        for (int i = 0; i < a.length; i ++) {
            if (a[i] != b[i]) {
                System.out.println("Different bytes at pos " + i + ".");
                return false;
            }
        }
        return true;
    }
    

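    private byte[] uncompress(byte[] compressed) throws DataFormatException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] tmp = new byte[512];
        Inflater inf = new Inflater();
        inf.setInput(compressed);
        while (!inf.finished()) {
            int len = inf.inflate(tmp);
            // A truncated or damaged stream can leave the Inflater waiting for
            // more input forever; report it instead of looping endlessly.
            if (len == 0 && (inf.needsInput() || inf.needsDictionary())) {
                inf.end();
                throw new DataFormatException("Unexpected end of compressed data");
            }
            bos.write(tmp, 0, len);
        }
        inf.end();
        return bos.toByteArray();
    }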
    
    public void testDeflateParams() {
        
        Random r = new Random();
        
        final long t1 = System.currentTimeMillis();
        final long tend = t1 + (durationSeconds * 1000); // 2 minutes max
        int run = 0;
        int errors = 0;
        
        // Given a large input block of text-like random data, compress this data
        // using Deflater in steps of 0 to 4095 bytes.
        // At each step there is a high chance of switching the compression level
        // and thus forcing a flush in the Deflater.
        // Deflation itself uses a very small (128 bytes) output array which mimics
        // the 512 byte temp array in the DeflaterOutputStream.
        // This means that if the compression level is changed and a flush is forced,
        // chances are good that this flush needs several attempts because the
        // output buffer will be too small to hold the whole flush output.
        //
        // This in turn triggers the bug in the Deflater where deflateParams()
        // cannot fully flush the current block and nevertheless changes the
        // compression parameters.
        while (run < 10 || System.currentTimeMillis() < tend) {
            
            run ++;
           
            byte original[] = new byte[r.nextInt(0x100000) + 100];
            for (int i = 0; i < original.length; i ++) {
                original[i] = (byte)(r.nextInt(95) + 32);
         //       original[i] = (byte)(r.nextInt());
            }
            int ipos = 0;
            byte tmp[] = new byte[0x80];
            
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            
            Deflater def = new Deflater();
            
            while (ipos < original.length) {
                
                final int[] compressionLevels = {0, 2, 9};
                int nextCompressionLevel = compressionLevels[r.nextInt(3)];
                
                def.setLevel(nextCompressionLevel);
                
                int nextlen = r.nextInt(4096);
                if (ipos + nextlen > original.length) {
                    nextlen = original.length - ipos;
                }
                def.setInput(original, ipos, nextlen);
                ipos += nextlen;
                
                while (!def.needsInput()) {
                    int len = def.deflate(tmp);
                    bos.write(tmp, 0, len);
                }
                
            }

            def.finish();
            while (!def.finished()) {
                int len = def.deflate(tmp);
                bos.write(tmp, 0, len);
            }

            byte[] compressed = bos.toByteArray();
            
            byte[] inflated = null;
            boolean wasok = true;
            
            try {
                inflated = uncompress(compressed);
            } catch (DataFormatException e) {
                System.out.println("DataFormatException " + e.getMessage());
                wasok = false;
            }
            
            if (inflated != null) {
                wasok = compareByteArrays(inflated, original);
            }
            
            if (!wasok) {
                errors ++;
                System.out.println(errors + " Errors in " + run + " runs.");
            }
            
        }
        
        if (errors > 0) {
            System.out.println("One or more errors.");
        }
        
    }
    
    
}

Comments
Does this bug stay open, then? As per the zlib bug comments, here is what Mark states:

===
I have updated the documentation to note the behavior. I will leave the responsibility with the user to flush the stream before calling deflateParams() if the parameter change is required to take effect at an exact location in the uncompressed data. Note that deflateParams() already does signal an unfinished block with Z_BUF_ERROR. In this case, the parameter change will take effect at an undefined location in the data provided so far, but the deflated data will not be corrupted.
===

JDK code only calls deflateParams() in one location: Java_java_util_zip_Deflater_deflateBytes. Would the JDK be able to detect such a parameter change and take appropriate action in that JNI function?
14-10-2015

Update: A bug was opened with the zlib author in 2013: https://github.com/madler/zlib/issues/57

As a result, Mark Adler did not change the behaviour of deflateParams(), but added a comment to the deflateParams() documentation:

"If a deflate(strm, Z_BLOCK) is performed by deflateParams(), and it does not have enough output space to complete, then the parameter change will take effect at an undetermined location in the uncompressed data provided so far. In order to assure a change in the parameters at a specific location in the uncompressed data, the deflate stream should first be flushed with Z_BLOCK or another flush parameter, and deflate() called until strm.avail_out is not zero, before the call of deflateParams(). Then no more input data should be provided before the deflateParams() call. If this is done, the old level and strategy will be applied to the data compressed before deflateParams(), and the new level and strategy will be applied to the data compressed after deflateParams()."

My first solution was a small fix inside zlib itself, which was rejected on the JDK mailing lists in 2013 because the feeling was that we did not want to carry a patched zlib around and that the long-term strategy was to rely on the system's zlib anyway. I can fully understand this. Unfortunately, fixing the issue in the JDK without breaking existing user code is extremely complicated, so for now there is no solution for this issue.
17-08-2015
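
For callers who control their own Deflater usage, the advice in the zlib documentation quoted above might be applied on the Java side roughly as follows. This is only a sketch under assumptions, not a fix for the JDK itself: the class FlushBeforeLevelChange and its helpers drain(), changeLevel(), compress() and inflate() are hypothetical names, and it assumes that draining with Deflater.SYNC_FLUSH and then calling deflate() once with no new input is enough to let the pending parameter change take effect on a fully flushed stream.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical caller-side workaround sketch; not part of the JDK. */
class FlushBeforeLevelChange {

    /** Flushes pending output until a deflate() call leaves room in buf. */
    static void drain(Deflater def, ByteArrayOutputStream out, byte[] buf) {
        int n;
        do {
            n = def.deflate(buf, 0, buf.length, Deflater.SYNC_FLUSH);
            out.write(buf, 0, n);
        } while (n == buf.length); // full buffer: the flush may be incomplete
    }

    /** Switches the level only after the stream has been fully flushed. */
    static void changeLevel(Deflater def, int level,
                            ByteArrayOutputStream out, byte[] buf) {
        drain(def, out, buf);
        def.setLevel(level);
        // No new input has been set yet, so this call is only meant to apply
        // the pending parameters (an assumption about how Deflater defers them).
        int n = def.deflate(buf);
        out.write(buf, 0, n);
    }

    /** Compresses data in random-sized steps, switching levels at each step. */
    static byte[] compress(byte[] data, int[] levels, long seed) {
        Random r = new Random(seed);
        byte[] buf = new byte[128]; // deliberately small output buffer
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Deflater def = new Deflater();
        int pos = 0;
        while (pos < data.length) {
            changeLevel(def, levels[r.nextInt(levels.length)], out, buf);
            int len = Math.min(1 + r.nextInt(4096), data.length - pos);
            def.setInput(data, pos, len);
            pos += len;
            while (!def.needsInput()) {
                int n = def.deflate(buf);
                out.write(buf, 0, n);
            }
        }
        def.finish();
        while (!def.finished()) {
            int n = def.deflate(buf);
            out.write(buf, 0, n);
        }
        def.end();
        return out.toByteArray();
    }

    /** Inflates a complete zlib stream; fails on corrupt or truncated data. */
    static byte[] inflate(byte[] packed) {
        Inflater inf = new Inflater();
        inf.setInput(packed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[512];
        try {
            while (!inf.finished()) {
                int n = inf.inflate(buf);
                if (n == 0 && (inf.needsInput() || inf.needsDictionary())) {
                    throw new IllegalStateException("truncated stream");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupted stream", e);
        } finally {
            inf.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        Random r = new Random(42);
        byte[] data = new byte[200_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (r.nextInt(95) + 32); // text-like bytes, as in the test
        }
        byte[] packed = compress(data, new int[] {0, 2, 9}, 7L);
        System.out.println("round trip ok: " + Arrays.equals(data, inflate(packed)));
    }
}
```

Note that every level switch in this sketch adds a flush marker to the stream, so it costs a few extra bytes per switch and somewhat reduces the compression ratio. It also changes nothing for code that calls setLevel() or setStrategy() without flushing first.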

There was a mail thread in November 2013 (http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-November/thread.html#23557) where Thomas Stuefe proposed a fix which required some changes in the native zlib implementation. The webrev is still available here: http://cr.openjdk.java.net/~simonis/webrevs/8028804/ Thomas has also opened a bug against the original zlib implementation: https://github.com/madler/zlib/issues/57
06-10-2014

This may also be related to JDK-8028216, but unfortunately JIRA doesn't allow me to add another link.
25-11-2013