JDK-8332530 : ENTRY_CREATE event not always emitted when copying file into the folder
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio
  • Affected Version: 17.0.11
  • Priority: P3
  • Status: Resolved
  • Resolution: Not an Issue
  • OS: os_x
  • CPU: generic
  • Submitted: 2024-05-17
  • Updated: 2024-08-16
  • Resolved: 2024-08-16
Related Reports
Relates :  
Relates :  
Description
ADDITIONAL SYSTEM INFORMATION :
OS: Mac OS Monterey 12.5.1 / Mac OS Ventura 13.6.6
IMPLEMENTOR="Eclipse Adoptium"
IMPLEMENTOR_VERSION="Temurin-17.0.9+9"
IMPLEMENTOR_VERSION="Temurin-17.0.11+9"


A DESCRIPTION OF THE PROBLEM :
The ENTRY_CREATE event (java.nio.file.WatchService) is not always emitted when creating subfolder in the watched directory, renaming it and copying file there.

This behaviour was observed in the project which is using Spring Integration library
and initially the bug has been reported for that Spring library.

However, it turned out this is related to the JDK, not to that library.

This is only reproducible on Mac (reproduced for Temurin-17.0.9+9 and Temurin-17.0.11+9).
I've tried to reproduce it also on MAC using the different version of JDK:
IMPLEMENTOR="Oracle Corporation"
JAVA_VERSION="17.0.4.1"
JAVA_VERSION_DATE="2022-08-18"
but it is working fine.... !!!

REGRESSION : Last worked in version 17.0.11

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :

1. Configure FileReadingMessageSource in the following way:
FileReadingMessageSource messageSource = new FileReadingMessageSource();
messageSource.setDirectory(new File(inputDirectory));
messageSource.setUseWatchService(true);
messageSource.setWatchEvents(CREATE, MODIFY, DELETE);

2. Configure a simple flow that simply deletes the file

3. Prepare a file with minimum size of 17 KB (test.zip)

4. Configure logger for org.srpingframework.integration.file package (DEBUG level)

5. Execute the following command in the shell inside the watched directory:
mkdir b1; sleep 2; mv b1 b2; sleep 3; cp /Users/<USERNAME>/sandbox/test/test.zip b2/; sleep 2; mkdir b3; sleep 2; mv b3 b4; sleep 3; cp /Users/<USERNAME>/sandbox/test/test.zip b4/; sleep 2; mkdir b5; sleep 2; mv b5 b6; sleep 3; cp /Users/<USERNAME>/sandbox/test/test.zip b6/; sleep 2; mkdir b7; sleep 2; mv b7 b8; sleep 3; cp /Users/<USERNAME>/sandbox/test/test.zip b8/; sleep 2;

6. The ENTRY_CREATE is not emitted for all files (in the example below, it is only emitted for b4 folder and missing for b2, b6 and b8):
2024-05-17 10:44:08.150 [main] DEBUG o.s.i.file.FileReadingMessageSource - registering: /Users/<USERNAME>/sandbox/input for file events 2024-05-17 10:44:11.178 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_CREATE] for file [/Users<USERNAME>/sandbox/input/b1] 2024-05-17 10:44:11.179 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - registering: /Users/<USERNAME>/sandbox/input/b1 for file events 2024-05-17 10:44:13.167 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_CREATE] for file [/Users/<USERNAME>/sandbox/input/b2] 2024-05-17 10:44:13.168 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - registering: /Users/<USERNAME>/sandbox/input/b2 for file events 2024-05-17 10:44:13.169 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_DELETE] for file [/Users/<USERNAME>/sandbox/input/b1] 2024-05-17 10:44:17.171 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_MODIFY] for file [/Users/<USERNAME>/sandbox/input/b2] 2024-05-17 10:44:19.167 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_CREATE] for file [/Users/<USERNAME>/sandbox/input/b3] 2024-05-17 10:44:19.167 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - registering: /Users/<USERNAME>/sandbox/input/b3 for file events 2024-05-17 10:44:21.164 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_CREATE] for file [/Users/<USERNAME>/sandbox/input/b4] 2024-05-17 10:44:21.164 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - registering: /Users/<USERNAME>/sandbox/input/b4 for file events 2024-05-17 10:44:21.165 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_DELETE] for file [/Users/<USERNAME>/sandbox/input/b3] 2024-05-17 10:44:23.170 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_MODIFY] for file [/Users/<USERNAME>/sandbox/input/b4] 2024-05-17 10:44:24.170 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_CREATE] for file [/Users/<USERNAME>/sandbox/input/b4/test.zip] 2024-05-17 10:44:24.171 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Added to queue: [/Users/<USERNAME>/sandbox/input/b4/test.zip] 2024-05-17 10:44:24.177 [Thread_5] INFO  o.s.i.handler.LoggingHandler - GenericMessage [payload=/Users/<USERNAME>/sandbox/input/b4/test.zip, headers={sourceCountry=global, feedName=reuters, file_name=test.zip, targetPath=data, file_originalFile=/Users/<USERNAME>/sandbox/input/b4/test.zip, source=Global_Relay, priority=5, file_relativePath=b4/test.zip, uncompress=true, sieve=true, id=ed14f9ab-335f-b547-58db-0aa4b735dac5, decrypt=false, sourcePath=/Users/<USERNAME>/sandbox/input, remoteFeedName=reuters, timestamp=1715935464177}] 2024-05-17 10:44:25.172 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_CREATE] for file [/Users/<USERNAME>/sandbox/input/b5] 2024-05-17 10:44:25.172 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - registering: /Users/<USERNAME>/sandbox/input/b5 for file events 2024-05-17 10:44:25.172 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_MODIFY] for file [/Users/<USERNAME>/sandbox/input/b4] 2024-05-17 10:44:26.172 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_DELETE] for file [/Users/<USERNAME>/sandbox/input/b4/test.zip] 2024-05-17 10:44:27.168 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_CREATE] for file [/Users/<USERNAME>/sandbox/input/b6] 2024-05-17 10:44:27.168 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - registering: /Users/<USERNAME>/sandbox/input/b6 for file events 2024-05-17 10:44:27.169 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_DELETE] for file [/Users/<USERNAME>/sandbox/input/b5] 2024-05-17 10:44:31.166 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_MODIFY] for file [/Users/<USERNAME>/sandbox/input/b6] 2024-05-17 10:44:33.166 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_CREATE] for file [/Users/<USERNAME>/sandbox/input/b7] 2024-05-17 10:44:33.166 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - registering: /Users/<USERNAME>/sandbox/input/b7 for file events 2024-05-17 10:44:35.169 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_CREATE] for file [/Users/<USERNAME>/sandbox/input/b8] 2024-05-17 10:44:35.169 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - registering: /Users/<USERNAME>/sandbox/input/b8 for file events 2024-05-17 10:44:35.169 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_DELETE] for file [/Users/<USERNAME>/sandbox/input/b7] 2024-05-17 10:44:39.168 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_MODIFY] for file [/Users/<USERNAME>/sandbox/input/b8]

7. It seems the subfolders b2, b6 and b8 are not registered properly as when executing the following command (copying different file to these subfolders):
cp /Users/<USERNAME>/sandbox/test/test2.zip b2/ cp /Users/<USERNAME>/sandbox/test/test2.zip b4/ cp /Users/<USERNAME>/sandbox/test/test2.zip b6/ cp /Users/<USERNAME>/sandbox/test/test2.zip b8/

8. The ENTRY_CREATE and ENTRY_DELETE are only emitted for the b4 subfolder:
2024-05-17 11:22:51.107 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_MODIFY] for file [/Users/<USERNAME>/sandbox/input/b2] 2024-05-17 11:23:06.107 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_CREATE] for file [/Users/<USERNAME>/sandbox/input/b4/test2.zip] 2024-05-17 11:23:06.108 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Added to queue: [/Users/<USERNAME>/sandbox/input/b4/test2.zip] 2024-05-17 11:23:06.111 [Thread_25] INFO  o.s.i.handler.LoggingHandler - GenericMessage [payload=/Users/<USERNAME>/sandbox/input/b4/test2.zip, headers={sourceCountry=global, feedName=reuters, file_name=test2.zip, targetPath=data, file_originalFile=/Users/<USERNAME>/sandbox/input/b4/test2.zip, source=Global_Relay, priority=5, file_relativePath=b4/test2.zip, uncompress=true, sieve=true, id=5ecc527e-1188-a57e-6afd-571d2ba75431, decrypt=false, sourcePath=/Users/<USERNAME>/sandbox/input, remoteFeedName=reuters, timestamp=1715937786111}] 2024-05-17 11:23:07.108 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_MODIFY] for file [/Users/<USERNAME>/sandbox/input/b4] 2024-05-17 11:23:08.104 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_DELETE] for file [/Users/<USERNAME>/sandbox/input/b4/test2.zip] 2024-05-17 11:23:13.106 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_MODIFY] for file [/Users/<USERNAME>/sandbox/input/b6] 2024-05-17 11:23:21.109 [scheduling-1] DEBUG o.s.i.file.FileReadingMessageSource - Watch event [ENTRY_MODIFY] for file [/Users/<USERNAME>/sandbox/input/b8]

9. Everything works fine if the sleep is removed from command above (point 5) !!!



EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Expected behaviour:
The ENTRY_CREATE should be always emitted when renaming the subfolder of the watched directory and copying file there.
ACTUAL -
The ENTRY_CREATE event is not always emitted which is making WatchService not reliable mechanism in OpenJDK (it seems it is working fine against Oracle JDK)

---------- BEGIN SOURCE ----------

The bug can be reproduced by launching Spring Boot application and executing commands in the shell.
I haven't tried to make the same operations in the Java Test as the WathService mechanism is supposed to collaborate with file system notifications.
---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Do not use sleep commands between renaming the folder and copying the file or use Oracle JDK which seems to work just fine....
I haven't managed to reproduce this issue on Windows but I suspect it may occur for Linux as well....

FREQUENCY : often



Comments
The observed behavior is as expected for a polling implementation such as in use here. From the class specification of java.nio.file.WatchService: "Short-lived files (meaning files that are deleted very quickly after they are created) may not be detected by primitive implementations that periodically poll the file system to detect changes. " Resolving as a non-issue.
16-08-2024

[~tongwan] The reporter should perhaps be apprised of the addition of the jdk.nio.file.WatchService.maxEventsPerPoll system property in JDK 23 by the fix for JDK-8330077. They might be able to work around the problem by setting this property to a value larger than the default 512.
26-07-2024

If the attached CreateEvents class is run on my system as $ java CreateEvents X where X > 0 is the number of milliseconds between file creations in the watched directory, then if X = 3 not as many events are detected as files created, but if X > 3, then the number of events equals the number of file creations. Unless there is something wrong with the CreateEvents class, it seems odd that any file creation events would be missed if the files were not deleted before a subsequent poll. Update: When X = 3, an OVERFLOW event is observed, so the conclusion about CreateEvents is inaccurate. When the system property jdk.nio.file.WatchService.maxEventsPerPoll is set to a larger value, no events are missed.
26-07-2024

Ask the submitter go to nio-dev, https://mail.openjdk.org/mailman/listinfo/nio-dev, to discuss further.
02-06-2024

[~tongwan] Can you ask the submitter to come to nio-dev to discuss further? The API docs, specifically the Platform Dependencies section, has always allowed for polling implementations. I think we need a clear, and short summary, on what they claim they mean by "missing events". None of the recent comments from the submitter are helpful.
02-06-2024

Additional information from the submitter: I’ve just read the comment from [~alanb]. Unfortunately, I cannot agree with him, especially with this sentence: „ The only reliable implementation on macOS right now is the polling implementation. ” In my opinion „reliable” WatchService means that there is no possibility to miss any file event (and the reproducer I’ve provided proves that the event which should notify about a new file is omitted). Additionally, this comment: WatchService on MacOS is still implemented by the generic PollingWatchService, which continuously re-scans the directory consuming CPU, introduces large latencies to the generated WatchEvent's, and misses changes that occur faster than the re-scan interval confirms that some changes could be missed by WatchService on Mac (it means that WatchService on Mac is not reliable). In the application which I am developing for HSBC bank we cannot accept the fact that the event about a new file is missed as this would result in serious business consequences. Fortunately we are not using Mac on production. Thus, if the specification allows for polling implementations You can close the bug but You should warn developers in the documentation that the WatchService on Mac OS not reliable - more specifically You should warn the developers that some file events can be missed on Mac.
01-06-2024

> As for the comment from [~alanb], I think simple polling implementation for WatchService in MacOS is not acceptable because it is not reliable. I would expect from JDK to handle file system notifications instead. If I would like to use polling mechanism here I would prefer to do it in my own code (for example by using Spring Integration poller) as I would have more control on that. This comment from the submitter is a bit unfair. The specification allows for polling implementations. There has been several attempts at native implementations over the years but all have run into bugs with the file events mechanisms on macOS. The only reliable implementation on macOS right now is the polling implementation. If the submitter wants to debate this further then they should come to the OpenJDK nio-dev mailing list where this has been discussed several times.
25-05-2024

Additional information from the submitter: I think I could prepare a standalone reproducer using WatchService directly. However, it would take some time for me (as I need to investigate a bit spring class I am using) and I am not sure when I will be able to do that. As for the comment from [~alanb], I think simple polling implementation for WatchService in MacOS is not acceptable because it is not reliable. I would expect from JDK to handle file system notifications instead. If I would like to use polling mechanism here I would prefer to do it in my own code (for example by using Spring Integration poller) as I would have more control on that.
25-05-2024

Additional information from the submitter: I’ve just found this issue in Your tracking system: https://bugs.openjdk.org/browse/JDK-8293067 WatchService on MacOS is still implemented by the generic PollingWatchService, which continuously re-scans the directory consuming CPU, introduces large latencies to the generated WatchEvent's, and misses changes that occur faster than the re-scan interval It seems that explains everything…
23-05-2024

Additional information from the submitter: I've just tried to reproduce that issue on Linux Red Hat Enterprise Linux Server 7.9 (Maipo) and it is working fine (all events are emitted correctly). It is not reproducible on Windows as well. Thus, it seems that the issue is related to the Mac-OS only.... You will find the sources of the project in the attached file.
20-05-2024

The WatchService implementation used on macOS is the simple polling implementation. Short lived files that only appear in a directory for a brief period of time may not be observed by this implementation. Note that this is allowed by the spec, see "Platform dependencies" section of the WatchService class description for more on this. Also note that the granularity of the time stamp used by the polling implementation, and the polling interval, have increased in recent releases, which might explain why they see a behavior difference between JDK releases.
20-05-2024

Message sent to reporter: We are unable to use the reproducer as provided. Code must be submitted with the bug report directly. Content stored in external locations (such as, but not limited to, iCloud Drive, Dropbox, Google Drive, Microsoft OneDrive, or GitHub) can not be accepted. Please supply a reproducer (raw source code of the test) on the understanding that the code: - is owned by you, or you are in possession of all associated rights; - will enter the public domain; - can be used, without any conditions or obligations, by any user for any purpose. By the way, can you kindly try the latest version of JDK 17 from https://www.oracle.com/java/technologies/downloads/#java17 and let us know the results?
20-05-2024