JDK-8260427 : ZGC: misc vmTestbase/gc/gctests timed out
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 16,17
  • Priority: P4
  • Status: Resolved
  • Resolution: Cannot Reproduce
  • OS: windows
  • CPU: x86_64
  • Submitted: 2021-01-26
  • Updated: 2021-05-20
  • Resolved: 2021-05-03
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 17
17Resolved
Related Reports
Duplicate :  
Relates :  
Relates :  
Relates :  
Description
The following two tests timed out in the JDK16 CI:

vmTestbase/gc/gctests/gctest03/gctest03.java
vmTestbase/gc/gctests/gctest02/gctest02.java

Here's a snippet from the gctest03.java log file:

#section:main
----------messages:(6/256)----------
command: main gc.gctests.gctest03.gctest03 10000
reason: User specified action: run main/othervm gc.gctests.gctest03.gctest03 10000 
Mode: othervm [/othervm specified]
Timeout information:
--- Timeout information end.
elapsed time (seconds): 539.322
----------configuration:(0/0)----------
----------System.out:(3/197)----------
For random generator using seed: -4705380899977468619
To re-run test with same seed value please add "-Djdk.test.lib.random.seed=-4705380899977468619" to command line.
Timeout refired 480 times
----------System.err:(1/16)----------
STATUS:Passed.
----------rerun:(41/6189)*----------

<snip>

result: Error. Program `c:\ade\mesos\work_dir\jib-master\install\jdk-16+34-2213\windows-x64.jdk\jdk-16\bin\java' timed out (timeout set to 480000ms, elapsed time including timeout handling was 539307ms).


Here's a snippet from the gctest02.java log file:

#section:main
----------messages:(6/252)----------
command: main gc.gctests.gctest02.gctest02 100
reason: User specified action: run main/othervm gc.gctests.gctest02.gctest02 100 
Mode: othervm [/othervm specified]
Timeout information:
--- Timeout information end.
elapsed time (seconds): 561.028
----------configuration:(0/0)----------
----------System.out:(4/211)----------
For random generator using seed: -4705380899977468619
To re-run test with same seed value please add "-Djdk.test.lib.random.seed=-4705380899977468619" to command line.
Timeout refired 480 times
Test passed.
----------System.err:(1/16)----------
STATUS:Passed.
----------rerun:(41/6187)*----------

<snip>

result: Error. Program `c:\ade\mesos\work_dir\jib-master\install\jdk-16+34-2213\windows-x64.jdk\jdk-16\bin\java' timed out (timeout set to 480000ms, elapsed time including timeout handling was 561013ms).

The test task has these JVM args:

-XX:+CreateCoredumpOnCrash -XX:+UseZGC

Based on this line in the log files:

    STATUS:Passed.

both tests passed while the timeout handler was running so it looks like these tests
executed just past the default timeout value. Since these failures happened with
ZGC, perhaps the tests need a smaller heap specified?
Comments
I'm going to close this bug as CNR, since the original timeouts have not happened since January. The last reports will be handled by JDK-8266432. Any new timeouts in this area is best handled with a new blank-slate bug report.
03-05-2021

The timeouts that have been happening since the April 26th, are caused by JDK-8266432.
03-05-2021

This bug happens because of a deadlock because a relocation happens within a relocation. So, the thread first takes the _ref_lock, and then later tries to take the lock again. #1 0x00007ff375c1fc90 in os::PlatformMonitor::wait #2 0x00007ff3760cbf92 in ZForwarding::wait_page_released #3 0x00007ff376118065 in ZRelocate::relocate_object #4 0x00007ff374f7097b in AccessInternal::PostRuntimeDispatch&lt;ZBarrierSet::AccessBarrier&lt;286790ul, ZBarrierSet&gt;, #5 0x00007ff375176134 in oopDesc::obj_field #6 0x00007ff37551c5cb in java_lang_Thread::name #7 0x00007ff375f2e7ee in JavaThread::get_thread_name_string #8 0x00007ff376126323 in ZStatPhase::log_end #9 0x00007ff3761272e8 in ZStatCriticalPhase::register_end #10 0x00007ff3760cc0b0 in ZForwarding::wait_page_released #11 0x00007ff376118065 in ZRelocate::relocate_object #12 0x00007ff3760989c5 in ZLoadBarrierOopClosure::do_oop #13 0x00007ff375408ca8 in HandleArea::oops_do #14 0x00007ff375f2e0e9 in JavaThread::oops_do_no_frames
03-05-2021

Note that running gctest02 on my local machine takes ~1 second. When I ran it locally together with all other tests in the gctests directory it took 14 s. It's interesting that many of the tests in this directory is marked to be run "exclusively". I'm not sure this prevent non-exclusive tests from running. Maybe that's one cause of these kind of failures? I need to investigate that possibility.
27-01-2021

Could be related to the timeouts in JDK-8259789.
27-01-2021

Two competing tests perhaps causing both to time out.
27-01-2021