JDK-8202835 : jfr/event/os/TestSystemProcess.java fails on missing events
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 11,12
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • Submitted: 2018-05-09
  • Updated: 2020-04-27
  • Resolved: 2018-07-17
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 11 JDK 12 Other
11.0.4Fixed 12 b03Fixed openjdk8u262Fixed
Related Reports
Relates :  
Relates :  
Relates :  
Relates :  
Description
Nightly failure in tier5 with the following exception:
java.lang.RuntimeException: No events: expected false, was true 
at jdk.testlibrary.Asserts.fail(Asserts.java:540) 
at jdk.testlibrary.Asserts.assertFalse(Asserts.java:465) 
at com.oracle.jfr.common.Events.hasEvents(Events.java:134) 
at com.oracle.jfr.event.os.TestSystemProcess.main(TestSystemProcess.java:32) 
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.base/java.lang.reflect.Method.invoke(Method.java:569) 
at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115) 
at java.base/java.lang.Thread.run(Thread.java:832) 

This was previously associated with a different bug, but was not really related. This issue is linked.
Comments
Replacing jdk8u-fix-request with link to JDK-8239140
17-02-2020

RFC: https://mail.openjdk.java.net/pipermail/jdk8u-dev/2020-January/011063.html
30-01-2020

Suggested fix in 'ProcessIterator': http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-May/032445.html
19-05-2018

[~kbarrett] I globally agree with your analysis and I think your solution is pertinent. However, I would like to clarify how 'DIR' and 'dirent' are commonly used. Since POSIX spec doesn't provide precise implementation information, I'll use glibc manual instead: https://www.gnu.org/software/libc/manual/html_node/Reading_002fClosing-Directory.html#Reading_002fClosing-Directory Fist of all, we immediately see that 'readdir()' is 'MT_Safe' and share the same security concepts as 'readdir_r()' which is a good thing for the planned replacement. Then, quoting from the manual: "It normally returns a pointer to a structure containing information about the file. This structure is associated with the dirstream handle and can be rewritten by a subsequent call." [...] "Caution: The pointer returned by readdir points to a buffer within the DIR object. The data in that buffer will be overwritten by the next call to readdir. You must take care, for instance, to copy the d_name string if you need it later." Which is what I suggested with the fix above, and finally: "Because of this, it is not safe to share a DIR object among multiple threads, unless you use your own locking to ensure that no thread calls readdir while another thread is still using the data from the previous call. In the GNU C Library, it is safe to call readdir from multiple threads as long as each thread uses its own DIR object. POSIX.1-2008 does not require this to be safe, but we are not aware of any operating systems where it does not work." Given this, I guess the patch I provided for JDK-8202794 is safe as far as I know. But in the case of the OS-independent "os::readdir()", this seems a bit more complicated... we have to make sure that: 1) 'DIR' objects are locked if used in multiple threads 2) 'dirent' is copied if necessary So, either 'os::readdir()' spec should be clear about that and the callers fixed as you suggested or 'dirent' should be copied and 'DIR' eventually locked if required.
12-05-2018

[~bsrbnd] > The problem is that 'readdir()' returns a 'dirent' that may be statically allocated and reused, As discussed in JDK-8202353, no platform that we care about implements readdir that way. They are all thread-safe in the sense we are interested in here. And that is the direction POSIX specification seems to be headed, though glacially. > But, until now, 'os::readdir()' filled the buffer provided by the caller. The Windows implementation does not. > ... I guess this would be even easier to directly use 'readdir()' instead. os::readdir is a cross-platform compatibility function that is expected to support Windows, which doesn't have readdir. Part of the problem here is that os::readdir's current interface is a confusing merging of readdir and readdir_r, both returning a dirent* and having a dirent* argument that may or may not be used for anything. Some jfr code expected the argument to be filled in, even though that's not the cross-platform API that was provided (since Windows os::readdir ignores that argument). The jfr code got away with that because it's only calling os::readdir on non-Windows platforms; for Windows the corresponding jfr code does something else. When we changed the Linux os::readdir to be similar to Windows (ignoring the dirent* argument), that broke the arguably flawed jfr code. It doesn't look too difficult to fix the jfr code, but that's inconvenient in the middle of the review of all of jfr.
12-05-2018

The problem is that 'readdir()' returns a 'dirent' that may be statically allocated and reused, as specified: http://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html But, until now, 'os::readdir()' filled the buffer provided by the caller. If we change this behavior, we'll have to cover all the places where it is used and I guess this would be even easier to directly use 'readdir()' instead. So, I think 'os::readdir()' should maybe keep its advantage of filling the caller's buffer as suggested in the fix above.
11-05-2018

The failing test is in jfr (Java Flight Recorder), which has been an Oracle commercial product which is presently in the process of being open sourced: JDK-8199712. A proper fix involves some changes to jfr. Rather than either (1) forcing that to happen in mid-review, or (2) doing something kludgy in os::readdir (as being suggested here), I think a better solution is (3) backout JDK-8179887 (reverting to using readdir_r but with the deprecation warning suppressed), let jfr go in, change jfr's usage of os::readdir, and then deal with os::readdir via JDK-8202353.
10-05-2018

Maybe some 'memcpy(dbuf ...)' is missing in os::readdir(). I've attached a fix you might try (I didn't find the failing test in the JDK repository but tier1 is OK with this fix).
10-05-2018

I've uploaded a safer variant using 'strlen()' to compute 'd_name' length.
10-05-2018

I see what's going on there. Sorry about that. It seems I missed or didn't examine carefully enough the currently closed (in jfr) uses of os::readdir.
09-05-2018

8179887: Build failure with glibc >= 2.24: error: 'int readdir_r(DIR*, dirent*, dirent**)' is deprecated Summary: Use readir rather than readdir_r on Linux. Reviewed-by: kbarrett, stuefe, andrew Contributed-by: Michal Vala <mvala@redhat.com> Has destroyed the SystemProcesses iterator that relies on *dbuf being populated during iteration: struct dirent* os::readdir(DIR* dirp, dirent *dbuf)
09-05-2018