JDK-8223481 : gtest/GTestWrapper.java failed due to "assert(ret == 0) failed: sem_post failed; error='Invalid argument' (errno=EINVAL)"
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 13
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: solaris_11
  • CPU: sparc_64
  • Submitted: 2019-05-07
  • Updated: 2019-08-15
  • Resolved: 2019-05-07
JDK 13: 13 b20 (Fixed)
Description
The following test failed in the JDK13 CI:

gtest/GTestWrapper.java

Here's a snippet from the log file:

[----------] 1 test from markOopDesc
[ RUN      ] markOopDesc.printing_test_vm
# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc:  SuppressErrorAt=/semaphore_posix.cpp:56
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/scratch/opt/mach5/mesos/work_dir/797e873e-e550-4bd3-abe4-140aca3eb239/workspace/open/src/hotspot/os/posix/semaphore_posix.cpp:56), pid=22645, tid=84
#  assert(ret == 0) failed: sem_post failed; error='Invalid argument' (errno=EINVAL)
#
# JRE version: Java(TM) SE Runtime Environment (13.0) (fastdebug build 13-internal+0-jdk-13-961)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 13-internal+0-jdk-13-961, mixed mode, tiered, compressed oops, g1 gc, solaris-sparc)
# Core dump will be written. Default location: /opt/mach5/mesos/work_dir/09936a6a-dfa4-4963-b63d-cd1813e707c6/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_common/scratch/0/core or core.22645
#
# An error report file with more information is saved as:
# /opt/mach5/mesos/work_dir/09936a6a-dfa4-4963-b63d-cd1813e707c6/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_common/scratch/0/hs_err_pid22645.log
[       OK ] markOopDesc.printing_test_vm (272 ms)
[----------] 1 test from markOopDesc (272 ms total)
Comments
Also fails in tier3 on Windows:

jib > [----------] 1 test from markOopDesc
jib > [ RUN ] markOopDesc.printing_test_vm
jib > # To suppress the following error report, specify this argument
jib > # after -XX: or in .hotspotrc: SuppressErrorAt=t:/workspace/open/src/hotspot/os/windows/semaphore_windows.cpp:46
jib > #
jib > # A fatal error has been detected by the Java Runtime Environment:
jib > #
jib > # Internal Error (t:/workspace/open/src/hotspot/os/windows/semaphore_windows.cpp:46), pid=17568, tid=42408
jib > # assert(ret != 0) failed: ReleaseSemaphore failed with error code: 6
jib > #
jib > # JRE version: Java(TM) SE Runtime Environment (13.0) (fastdebug build 13-internal+0-jdk-13-970)
jib > # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 13-internal+0-jdk-13-970, mixed mode, tiered, compressed oops, g1 gc, windows-amd64)
jib > # Core dump will be written. Default location: T:\testoutput\test-support\jtreg_open_test_hotspot_jtreg_tier1_common\scratch\1\hs_err_pid17568.mdmp
jib > #
jib > # An error report file with more information is saved as:
jib > # T:\testoutput\test-support\jtreg_open_test_hotspot_jtreg_tier1_common\scratch\1\hs_err_pid17568.log
jib > [ OK ] markOopDesc.printing_test_vm (1261 ms)
08-05-2019

I added a sleep to the LockerThread before exit, like this:

  for (int i = 0; i < 1000; i++)
    os::naked_short_sleep(10);
}

and a done.wait() (a wait on the semaphore) to the main thread. This caused a deadlock because the LockerThread exit runs post_run:

void post_run() {
  Threads::remove(this, false);
  _post->signal();
  this->smr_delete();
}

Because the sleep gives the GuaranteedSafepointInterval time to fire, Threads::remove() will wait for a safepoint on the Threads_lock. The main thread waits for the semaphore and does not check for a safepoint. As far as safepoint checking goes, it is still running, so the test deadlocks waiting for a safepoint.

I thought these were in the wrong order, and that we should post the semaphore first, but posting the semaphore releases the main thread, which is waiting. In this test, the next thing the main thread does is a full GC. Because the gtest LockerThread is not a real thread, the GC crashes trying to make the TLAB parseable.

The real fix is to make it call Semaphore::wait_with_safepoint_check. I suppose I'll file a new RFE to add back the test (even though now it seems cursed).
07-05-2019
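
The deadlock described in the comment above can be illustrated with a toy model in standard C++. This is only a sketch: ToyThread, BlockedScope, and safepoint_can_begin are hypothetical names, not HotSpot classes, and the assumption that a safepoint-aware wait works by marking the waiting thread as blocked is a simplification made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Toy model only: hypothetical names, not HotSpot's API.
enum class ThreadState { in_vm, blocked };

struct ToyThread {
  std::atomic<ThreadState> state{ThreadState::in_vm};
};

// In this model a safepoint can begin only when no thread is still
// counted as running in the VM.
bool safepoint_can_begin(const std::vector<const ToyThread*>& threads) {
  for (const ToyThread* t : threads) {
    if (t->state.load() == ThreadState::in_vm) return false;
  }
  return true;
}

// RAII guard (assumption): marks the thread as blocked for the duration
// of a blocking wait and restores the running state afterwards.
class BlockedScope {
 public:
  explicit BlockedScope(ToyThread& t) : _t(t) {
    _t.state.store(ThreadState::blocked);
  }
  ~BlockedScope() { _t.state.store(ThreadState::in_vm); }
 private:
  ToyThread& _t;
};

// Reports whether a safepoint could begin while the main thread is
// "waiting" on the semaphore, with or without the safepoint-aware wait.
bool safepoint_possible_during_wait(bool with_safepoint_check) {
  ToyThread main_thread;
  std::vector<const ToyThread*> threads{&main_thread};
  if (with_safepoint_check) {
    BlockedScope scope(main_thread);      // waiter no longer counts as running
    return safepoint_can_begin(threads);  // evaluated while the guard is live
  }
  return safepoint_can_begin(threads);    // plain wait: still counts as running
}
```

In this model the plain wait keeps the main thread in the running state, so the safepoint that the exiting thread needs can never begin. The guarded wait lets it proceed.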

Also fails in tier1 on OS X:

[----------] 1 test from markOopDesc
[ RUN ] markOopDesc.printing_test_vm
# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc: SuppressErrorAt=/semaphore_bsd.cpp:57
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (/scratch/mesos/slaves/07fc96ef-bf4d-487f-b22f-a84e49f5f44a-S134440/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/662be188-58e7-43db-a13f-1d0e9cc5be32/runs/5845a121-33a1-4bff-b1d0-deca430ec43d/workspace/open/src/hotspot/os/bsd/semaphore_bsd.cpp:57), pid=89749, tid=40511
# assert(ret == 0) failed: Failed to signal semaphore
#
07-05-2019

Yes, the problem is the semaphore is destroyed before post_run() is executed (which executes the crashing _post->signal()). That done.wait(); was actually needed.
07-05-2019
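
The lifetime problem described in the comment above can be sketched in standard C++. This is a minimal illustration, not the HotSpot code: SimpleSemaphore and run_test_body are hypothetical stand-ins for the Semaphore and the gtest body, and std::thread replaces JavaTestThread.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a counting semaphore; not HotSpot's Semaphore.
class SimpleSemaphore {
 public:
  void signal() {
    std::lock_guard<std::mutex> lock(_mutex);
    ++_count;
    _cv.notify_one();
  }
  void wait() {
    std::unique_lock<std::mutex> lock(_mutex);
    _cv.wait(lock, [this] { return _count > 0; });
    --_count;
  }
 private:
  std::mutex _mutex;
  std::condition_variable _cv;
  int _count = 0;
};

// Returns true once the worker's final signal has been observed.
bool run_test_body() {
  SimpleSemaphore done;    // lives on the test's stack, like `done` in the gtest
  bool worker_ran = false;
  std::thread worker([&] {
    worker_ran = true;     // stands in for the test thread's work
    done.signal();         // plays the role of _post->signal() in post_run()
  });
  done.wait();             // without this wait, the real test can return and
                           // destroy `done` before the worker signals it
  worker.join();           // only to keep this standalone sketch tidy; the
                           // gtest has no join and relies on the wait alone
  return worker_ran;
}
```

Calling run_test_body() returns true only after the worker has signalled. Dropping done.wait() in the real test lets the semaphore go out of scope first, which matches the failed sem_post and ReleaseSemaphore assertions in the logs.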

Wonder if this is what's wrong with the test (below). The test that I copied the thread code from had this but we thought the wait wasn't needed.

diff --git a/test/hotspot/gtest/oops/test_markOop.cpp b/test/hotspot/gtest/oops/test_markOop.cpp
--- a/test/hotspot/gtest/oops/test_markOop.cpp
+++ b/test/hotspot/gtest/oops/test_markOop.cpp
@@ -125,6 +125,7 @@
 ol.wait(THREAD);
 assert_test_pattern(h_obj, "monitor");
+ done.wait(); // wait till the thread is done.
 }
 // Make the object older. Not all GCs use this field.
07-05-2019

[~coleenp] - This failure might interest you....
07-05-2019

Attached the test-support_jtreg_open_test_hotspot_jtreg_tier1_common_gtest_GTestWrapper_hs_err_pid22645.log to the bug report.

Here's a snippet from the hs_err_pid file:

--------------- T H R E A D ---------------

Current thread (0x00000001009d7800): JavaThread "JavaTestThread" [_thread_in_vm, id=84, stack(0xffffffff50900000,0xffffffff50a00000)]

Stack: [0xffffffff50900000,0xffffffff50a00000], sp=0xffffffff509ff430, free space=1021k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x23c2100] void VMError::report_and_die(int,const char*,const char*,void*,Thread*,unsigned char*,void*,void*,const char*,int,unsigned long)+0xac0
V [libjvm.so+0x23c15d8] void VMError::report_and_die(Thread*,void*,const char*,int,const char*,const char*,void*)+0x38
V [libjvm.so+0x170ef00] void report_vm_error(const char*,int,const char*,const char*,...)+0xf0
V [libjvm.so+0x2122d70] void PosixSemaphore::signal(unsigned)+0x90
V [libjvm.so+0x1053990] void JavaTestThread::post_run()+0x20
V [libjvm.so+0x22e3fdc] void Thread::call_run()+0x1dc
V [libjvm.so+0x1fb5b8c] thread_native_entry+0x3ac

Based on the stack trace, it appears that the VM crashed after finishing a test with a JavaTestThread. Based on the log file, it appears that the crash happened during "1 test from markOopDesc". I don't know gtest that well; it could be that the crash is due to the previous test. However, test/hotspot/gtest/oops/test_markOop.cpp was pushed on 2019.05.06 by JDK-8222893.
07-05-2019