JDK-4774421 : A zip file modified programmatically cannot be read again
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.net
  • Affected Version: 1.3.1
  • Priority: P4
  • Status: Closed
  • Resolution: Not an Issue
  • OS: linux
  • CPU: x86
  • Submitted: 2002-11-06
  • Updated: 2017-08-16
  • Resolved: 2017-08-16
Description

Name: gm110360			Date: 11/05/2002


FULL PRODUCT VERSION :
java version "1.3.1"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1-b24)
Java HotSpot(TM) Client VM (build 1.3.1-b24, mixed mode)


FULL OPERATING SYSTEM VERSION :
Linux Red Hat 7.3

ADDITIONAL OPERATING SYSTEMS :

Windows 98


A DESCRIPTION OF THE PROBLEM :
If a zip file is opened for reading once, its entries seem
to be cached by the JVM until it is shut down. If the file is
modified after that, any later read from it returns wrong
results or throws an exception.

There does not seem to be any way to clear this cache.

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
1. Run the attached test program.

EXPECTED VERSUS ACTUAL BEHAVIOR :
The program creates a zip file with 5 dummy files within it.
It then reads one of the files.

The zip file is then deleted and recreated, this time with 6
entries and with data in the text files that differs from the
first test.

When we try to read this file, the following errors occur.

Trying to read file 4 seems to read the old version of the
file (on Linux) and throws a ZipException under Windows 98.

Trying to read file 6 causes a FileNotFoundException to occur.

The file is created correctly - as can be verified by
opening it with another unzip program. The problem is only
while reading the same file in the same session - and seems
to be caused by the Zip entries being cached by the JVM.



REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
package test;

import java.io.*;
import java.net.*;
import java.util.*;
import java.util.zip.*;

/**
 * Test for bug in java.util.zip package
 */
public class ZipTest
{
	private static final String ZIPNAME = "testfile.zip";
	private int number = 0;
	private Random random = new Random();
	
	/**
	 * Creates a sample ZIP file with the specified number of entries
	 */
	private boolean createSampleZip(int numberOfEntries)
	{
		System.out.println("Creating "+ZIPNAME+" with "+numberOfEntries+" entries");
		try
		{
			ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(ZIPNAME));
			for (int i=1; i <= numberOfEntries; i++)
			{
				ZipEntry entry = new ZipEntry("file" + i + ".txt");
				zos.putNextEntry(entry);
		      	PrintWriter pw=new PrintWriter(zos);
		      	int lines = random.nextInt(500)+100; // 100 to 599 lines
		      	int start = number+1;
		      	for (int j=0; j < lines; j++)
		      	{
		      		number++;
		      		pw.print("Line "+ number + "\r\n");
		        }
		      	pw.flush();
		      	zos.closeEntry();
		      	System.out.println("Successfully created "+entry.getName()+ " with lines "+start+" to "+number);
			}
			zos.close();
			System.out.println("Zipfile successfully created");
			return true;
		} catch (IOException ioe)
		{
			System.out.println("Error in creating Zip file");
			ioe.printStackTrace();
			return false;
		}
	}
	
	/**
	 * Parses the line number from a line "Line xxxx" where xxxx is the line number
	 *
	 * @param line The string to be parsed (of the form "Line xxxx")
	 * @return The line number, -1 if the number cannot be parsed, or -2 if the line has fewer than two tokens
	 */
	private int parseLineNumber(String line)
	{
		StringTokenizer st = new StringTokenizer(line);
		try
		{
			st.nextToken();
			String num = st.nextToken();
			try
			{
				return Integer.parseInt(num.trim());
			} catch (NumberFormatException nfe)
			{
				return -1;
			}
		} catch (NoSuchElementException nsee)
		{
			return -2;
		}
	}
	
	public void readZipFile(int entry)
	{
		System.out.println("Trying to read file No : "+entry);
		try
		{
			URL url = new URL("jar:file:"+ZIPNAME+"!/file"+entry+".txt");
			try
			{
				InputStream is = url.openStream();
				BufferedReader rdr = new BufferedReader(new InputStreamReader(is));
				String line;
				int start = -1;
				int last = -1;
				while ((line=rdr.readLine()) != null)
				{
					int n = parseLineNumber(line);
					if (n < 0)
					{
						System.out.println("Invalid line / Error = "+n+" Line ="+line);
					} else
					{
						if (start < 0) start = n;
						last = n;
					}
				}
				System.out.println("File read successfully - Start line ="+start+" end line ="+last);
				rdr.close();
				is.close();
			} catch (IOException ioe)
			{
				ioe.printStackTrace();
			}
		}
		catch (MalformedURLException mue)
		{
			mue.printStackTrace();
		}
	}
	
	public static void main(String[] args)
	{
		ZipTest test = new ZipTest();
		// Create a sample zip file with 5 files in it
		if (test.createSampleZip(5))
		{
			// Test by reading back file no. 4
			test.readZipFile(4);
			
			// Now delete this zip file
			File file = new File(ZIPNAME);
			file.delete();
			System.out.println("Deleted the zip file");
			
			// Recreate the zip file, but with 6 entries this time
			if (test.createSampleZip(6))
			{
				// Try reading file 4 - results in wrong data being read (on Linux) / ZipException (on Win98)
				test.readZipFile(4);
				
				// Try reading file 6 - Throws exception
				test.readZipFile(6);
			}
		}
		
	}
}

---------- END SOURCE ----------
(Review ID: 163903) 
======================================================================

Comments
As of Java 9, the URLConnection API provides a method of configuring caching per-protocol scheme. The default is still to cache, but cache control is now at a finer level of granularity.
16-08-2017
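
As an illustration of the per-protocol setting mentioned in the comment above, the following is a minimal sketch that assumes Java 9 or later and uses URLConnection.setDefaultUseCaches(String, boolean) to turn caching off for all jar: connections. The class name JarCacheDefaultDemo, the helper methods, and the temporary file are hypothetical and not part of the original test case. Note that the setting is JVM-wide, so it affects every jar: URL opened afterwards.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class JarCacheDefaultDemo {

    /** Writes a zip holding a single entry, file1.txt, with the given text. */
    static void writeZip(Path zip, String text) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zip))) {
            zos.putNextEntry(new ZipEntry("file1.txt"));
            zos.write(text.getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
    }

    /** Reads file1.txt through a jar: URL with openStream(), as the test case does. */
    static String readEntry(Path zip) throws IOException {
        URL url = URI.create("jar:" + zip.toUri() + "!/file1.txt").toURL();
        try (InputStream is = url.openStream()) {
            return new String(is.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /** Creates, reads, deletes, recreates and rereads a zip file. */
    static String run() throws IOException {
        // Java 9+: new jar: connections no longer use the cache by default.
        URLConnection.setDefaultUseCaches("jar", false);

        Path zip = Files.createTempFile("ziptest", ".zip"); // hypothetical scratch file
        try {
            writeZip(zip, "first");
            String before = readEntry(zip);
            Files.delete(zip);
            writeZip(zip, "second");
            String after = readEntry(zip);
            return before + "," + after;
        } finally {
            Files.deleteIfExists(zip);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(run());
    }
}
```

With caching disabled for the jar protocol, the second read is expected to see the recreated file rather than the stale one. Code that relies on cached jar files for performance may prefer the per-connection setUseCaches(false) call instead of changing the default.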

EVALUATION

The test case uses URL.openStream() to open the jar entry (instead of using JarFile directly). JarURLConnection caches the jar file by default, so even after the input stream obtained from openStream() is closed, the jar file is still held by the cache. The reference count on the jar file therefore never drops to 0 and the file is not really deleted.

Opening the connection explicitly and disabling caching,

    URL url = new URL("jar:file:"+ZIPNAME+"!/file"+entry+".txt");
    URLConnection urlc = url.openConnection();
    urlc.setUseCaches(false);
    InputStream is = urlc.getInputStream();

does make the problem disappear.

It does appear to be an issue that a one-time use of URL.openStream() always leaves the target jar file open, and there is nothing the developer can do to release the resource. Transfer to the JarURLConnection owner for further evaluation.
11-06-2009
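
The workaround from the evaluation can be sketched as a self-contained program along the following lines. This is only an illustration: the class name UncachedJarRead, the helper methods, and the entry contents are hypothetical. It mirrors the original test case (read entry 4, delete and recreate the zip with an extra entry, read entries 4 and 6) but opens each connection with setUseCaches(false).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class UncachedJarRead {

    /** Writes a zip from {name, content} pairs. */
    static void writeZip(Path zip, String[][] entries) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zip))) {
            for (String[] e : entries) {
                zos.putNextEntry(new ZipEntry(e[0]));
                zos.write(e[1].getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        }
    }

    /** Reads one entry through a jar: URL with caching disabled on this connection only. */
    static String readEntryUncached(Path zip, String entryName) throws IOException {
        URLConnection conn =
                URI.create("jar:" + zip.toUri() + "!/" + entryName).toURL().openConnection();
        conn.setUseCaches(false); // must be called before the connection is opened
        try (InputStream is = conn.getInputStream();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = is.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Repeats the scenario from the report using uncached connections. */
    static String run() throws IOException {
        Path zip = Files.createTempFile("ziptest", ".zip"); // hypothetical scratch file
        try {
            writeZip(zip, new String[][] {{"file4.txt", "old"}});
            String first = readEntryUncached(zip, "file4.txt");

            Files.delete(zip);
            writeZip(zip, new String[][] {{"file4.txt", "new"}, {"file6.txt", "extra"}});

            return first + "|" + readEntryUncached(zip, "file4.txt")
                    + "|" + readEntryUncached(zip, "file6.txt");
        } finally {
            Files.deleteIfExists(zip);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(run());
    }
}
```

Because the setting applies to a single connection, other jar: URLs in the same JVM keep the default caching behaviour.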