JDK-4285834 : File.list(FilenameFilter) is not effective for huge directories
  • Type: Enhancement
  • Component: core-libs
  • Sub-Component: java.io
  • Affected Version: 1.2.2,5.0
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • OS: linux,windows_nt
  • CPU: x86
  • Submitted: 1999-10-28
  • Updated: 2009-02-16
  • Resolved: 2009-02-16
Name: krT82822			Date: 10/28/99

all versions to date

The File.list() method returns String[].  File.list(FilenameFilter) also returns
String[].  The problem is that when a directory has tens of thousands of files
in it, the use of File.list() can cause a huge VM size explosion.  In most
applications processing files, it is not necessary to see all of the files at
once.  One file at a time is fine.  The POSIX readdir functionality in
particular is handy in these cases so that one can process all the entries
in a directory, one at a time.

I'd like to suggest that a new method be provided in File that will facilitate
the convenient processing of large directories.  Here's my suggestion.

public void scan( FileEntryAction act ) throws IOException;

public interface FileEntryAction {
    public void processFile( File dir, String name );
}

This would allow the activities to be implemented in another class if needed
and eliminate the need for such a huge array for large directories.
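
To make the suggestion concrete, here is a minimal sketch of how such a callback could be used. It is only an illustration: DirectoryScanner and its static scan method are hypothetical names standing in for the proposed File.scan, and the sketch is built on java.nio.file.DirectoryStream, which was added to the JDK long after this report was filed.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Callback from the proposal above; hypothetical, not part of the JDK. */
interface FileEntryAction {
    void processFile(File dir, String name);
}

/** Hypothetical helper standing in for the proposed File.scan method. */
final class DirectoryScanner {
    private DirectoryScanner() {}

    /**
     * Hands each entry name to the action as it is read from the directory,
     * so no String[] holding every name is ever built.
     */
    static void scan(File dir, FileEntryAction act) throws IOException {
        // try-with-resources closes the underlying directory handle.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir.toPath())) {
            for (Path entry : entries) {
                act.processFile(dir, entry.getFileName().toString());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = new File(args.length > 0 ? args[0] : ".");
        long[] count = {0};
        // The action sees one entry at a time; here it only counts them.
        scan(dir, (d, name) -> count[0]++);
        System.out.println(count[0] + " entries in " + dir);
    }
}
```

The action sees one name per call, so memory use does not grow with the number of entries unless the action itself accumulates them.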

The reason I am proposing this, instead of altering the behavior of
File.list( FilenameFilter ) to filter as files are found, is that I think the
above proposal is a more general solution and avoids the array creation.
(Review ID: 97165) 

EVALUATION This issue has been addressed by the file system API defined for NIO2. A DirectoryStream is returned when the directory is opened, and it can be used to iterate over the entries in a huge directory without first reading the entire list of file names, as the File.list method does.
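
As a rough illustration of the DirectoryStream approach described in this evaluation, the sketch below counts the entries that match a glob pattern while visiting them one at a time. The class name LargeDirectoryListing, the method countMatching, and the "*.log" pattern are illustrative assumptions; Files.newDirectoryStream(Path, String) is the standard java.nio.file call.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Illustrative helper; the class and method names are not from the JDK. */
final class LargeDirectoryListing {
    private LargeDirectoryListing() {}

    /** Counts entries whose names match the glob, one entry at a time. */
    static long countMatching(Path dir, String glob) throws IOException {
        long count = 0;
        // The glob is applied as the stream is iterated, and
        // try-with-resources closes the directory handle afterwards.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : entries) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(countMatching(dir, "*.log") + " matching entries in " + dir);
    }
}
```

Unlike File.list(FilenameFilter), nothing here holds the full set of names at once; only the entry currently being visited is referenced by the loop.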

EVALUATION Contribution forum : https://jdk-collaboration.dev.java.net/servlets/ProjectForumMessageView?forumID=1463&messageID=23037

WORK AROUND Name: krT82822 Date: 10/28/99 There is no workaround. For large directories, Java can spend minutes, if not hours, scanning a directory to collect all the entries, and in the process the VM size swells excessively.

EVALUATION Incremental reads of large directories will be addressed in the new I/O framework. -- mr@eng 1999/10/28