JDK-8293829 : Windows native last error value cleared with -Xcomp or -Xint (but not neither)
  • Type: Bug
  • Component: core-svc
  • Sub-Component: debugger
  • Affected Version: 20
  • Priority: P3
  • Status: Open
  • Resolution: Unresolved
  • OS: windows
  • Submitted: 2022-09-14
  • Updated: 2022-09-19
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other
tbdUnresolved
Related Reports
Relates :  
Relates :  
Sub Tasks
JDK-8293813 :  
Description
The native Windows last error value can be set and fetched using Panama native access, to call SetLastError and GetLastError.  The value set should be available to be fetched again.  GetLastError should usually be called "immediately" after e.g. an api call that may set it, so calling immediately after calling SetLastError seems reasonable.

JDK-8292302 showed a problem under jdb, now resolved, where the last error value can be cleared/corrupted.  This was a regression from previous behaviour.  

Testing for that fix shows that -Xcomp also clears the value, and this is reproducible before the change that caused 8292302.

Running with -Xint  also reproduces the problem of GetLastError not returning the value set with SetLastError, but takes longer.

Running with neither -Xcomp or -Xint does not appear to reproduce the problem.

Example test app attached:

javac --release 20 --enable-preview GetLastErrorTest.java

Normal success:
java --enable-preview GetLastErrorTest
..truncated...
.........ran for 5230230 ms


Failure:

java -Xcomp --enable-preview GetLastErrorTest
(warnings removed)
0
java.lang.RuntimeException: GetLastError mismatch: 0 != wanted 42: failed.
ran for 4530 ms


java -Xint --enable-preview GetLastErrorTest
...truncated...
.........................................................................................0
java.lang.RuntimeException: GetLastError mismatch: 0 != wanted 42: failed.
ran for 1542702 ms

(that's failing in about 25 minutes, -Xcomp fails immediately)
Comments
I agree with [~jvernee] earlier comment that we should not expect the VM to be able to preserve the "last error" in this context. As discussed previously the way this is done in Panama via two separate java calls is fundamentally flawed and should not be expected to work.
19-09-2022

[~kevinw] Nice finds! (I had been trying to trace calls to SetLastError using wt in windbg (as the suggested alternative to strace) without much luck. Setting a breakpoint is a much better idea :))
17-09-2022

The ProblemListing of com/sun/jdi/JdbLastErrorTest.java was integrated in jdk-20+16-1038 so these sightings should quiet down soon...
16-09-2022

OK, two little save and restore last error calls fix this. One for -Xint and one for -Xcomp. Should not be a surprise now I have it: For -Xint it's an allocation which must see a full tlab and needs an allocation, so thread is suspended and loses its last error value. I placed a save/restore in VMThread::execute and that works fine for this test, and should also fix other cases where the thread may have suspended for a vm op. For -Xcomp, it seems obvious NOW that it was CompileBroker::compile_method which eventually blocked on a monitor. A save/restore in there fixes the problem. Notes on the run with -Xint in WinDbg: ~11 bp ntdll!RtlSetLastWin32Error "~.;r;k;g" ~11 bp ntdll!RtlGetLastWin32Error "~.;r;k;g" This is the Java app calling SetLastError with rcx = 0x2a = 42 . 13 Id: 4308.6660 Suspend: 1 Teb: 000000d1`cdc42000 Unfrozen rax=0000025e7b08a2f8 rbx=0000025e7b08a2f8 rcx=000000000000002a ... ntdll!RtlSetLastWin32Error: 00007fff`61180770 894c2408 mov dword ptr [rsp+8],ecx ss:000000d1`ce2fedd0=7b07c498 # Child-SP RetAddr Call Site 00 000000d1`ce2fedc8 0000025e`6a09203e ntdll!RtlSetLastWin32Error 01 000000d1`ce2fedd0 0000025e`7b07c498 0x0000025e`6a09203e 02 000000d1`ce2fedd8 000000d1`ce2fee20 0x0000025e`7b07c498 03 000000d1`ce2fede0 0000025e`7b07c620 0x000000d1`ce2fee20 04 000000d1`ce2fede8 00000000`00000000 0x0000025e`7b07c620 Then the next set last error in the same thread: this calls it with rcx = 0 It's an allocation! Filled tlab and needs a GC!? . 13 Id: 4308.6660 Suspend: 1 Teb: 000000d1`cdc42000 Unfrozen rax=0000000000000000 rbx=0000000000000000 rcx=0000000000000000 ntdll!RtlSetLastWin32Error: 00007fff`61180770 894c2408 mov dword ptr [rsp+8],ecx ss:000000d1`ce2fea00=00000000 # Child-SP RetAddr Call Site 00 000000d1`ce2fe9f8 00007fff`5ee9b38d ntdll!RtlSetLastWin32Error 01 000000d1`ce2fea00 00007fff`5eedcfb7 KERNELBASE!BaseSetLastNTError+0x1d 02 000000d1`ce2fea30 00007ffe`d3036449 KERNELBASE!SleepConditionVariableCS+0x37 03 000000d1`ce2fea60 00007ffe`d3008284 jvm!PlatformMonitor::wait+0x29 [s\open\src\hotspot\os\windows\os_windows.cpp @ 5379] 04 000000d1`ce2fea90 00007ffe`d31ae125 jvm!Monitor::wait+0x84 [s\open\src\hotspot\share\runtime\mutex.cpp @ 257] 05 (Inline Function) --------`-------- jvm!MonitorLocker::wait+0xe [s\open\src\hotspot\share\runtime\mutexLocker.hpp @ 256] 06 (Inline Function) --------`-------- jvm!VMThread::wait_until_executed+0x10e [s\open\src\hotspot\share\runtime\vmThread.cpp @ 378] 07 000000d1`ce2fead0 00007ffe`d2ca6c06 jvm!VMThread::execute+0x1a5 [s\open\src\hotspot\share\runtime\vmThread.cpp @ 549] 08 (Inline Function) --------`-------- jvm!G1CollectedHeap::do_collection_pause+0x46 [s\open\src\hotspot\share\gc\g1\g1CollectedHeap.cpp @ 2656] 09 000000d1`ce2feb40 00007ffe`d2ca635f jvm!G1CollectedHeap::attempt_allocation_slow+0x2b6 [s\open\src\hotspot\share\gc\g1\g1CollectedHeap.cpp @ 449] 0a 000000d1`ce2fec30 00007ffe`d2fcaa85 jvm!G1CollectedHeap::attempt_allocation+0x3f [s\open\src\hotspot\share\gc\g1\g1CollectedHeap.cpp @ 721] 0b 000000d1`ce2fec70 00007ffe`d2fca7e0 jvm!MemAllocator::allocate_inside_tlab_slow+0x185 [s\open\src\hotspot\share\gc\shared\memAllocator.cpp @ 324] 0c (Inline Function) --------`-------- jvm!MemAllocator::allocate_inside_tlab+0x3f [s\open\src\hotspot\share\gc\shared\memAllocator.cpp @ 280] 0d (Inline Function) --------`-------- jvm!MemAllocator::mem_allocate+0x6f [s\open\src\hotspot\share\gc\shared\memAllocator.cpp @ 356] 0e 000000d1`ce2fecb0 00007ffe`d2d40861 jvm!MemAllocator::allocate+0xb0 [s\open\src\hotspot\share\gc\shared\memAllocator.cpp @ 369] 0f (Inline Function) --------`-------- jvm!CollectedHeap::obj_allocate+0x22 [s\open\src\hotspot\share\gc\shared\collectedHeap.inline.hpp @ 36] 10 000000d1`ce2fed10 00007ffe`d2d55b1c jvm!InstanceKlass::allocate_instance+0x61 [s\open\src\hotspot\share\oops\instanceKlass.cpp @ 1376] 11 000000d1`ce2fee10 0000025e`6a00911f jvm!InterpreterRuntime::_new+0x9c [s\open\src\hotspot\share\interpreter\interpreterRuntime.cpp @ 242] 12 000000d1`ce2fee50 0000025e`7b016870 0x0000025e`6a00911f 13 000000d1`ce2fee58 00000006`203ffff8 0x0000025e`7b016870 14 000000d1`ce2fee60 000000d1`ce2fee80 0x00000006`203ffff8 15 000000d1`ce2fee68 000000d1`ce2feed0 0x000000d1`ce2fee80 16 000000d1`ce2fee70 00007fff`60b75bf0 0x000000d1`ce2feed0 17 000000d1`ce2fee78 0000025e`6a0090df KERNEL32!GetLastErrorStub 18 000000d1`ce2fee80 000000d1`ce2fee80 0x0000025e`6a0090df 19 000000d1`ce2fee88 0000025e`7b0171cc 0x000000d1`ce2fee80 1a 000000d1`ce2fee90 000000d1`ce2feee0 0x0000025e`7b0171cc 1b 000000d1`ce2fee98 0000025e`7b036090 0x000000d1`ce2feee0 1c 000000d1`ce2feea0 00000000`00000000 0x0000025e`7b036090 TLAB size does actually affect the timing of the problem, but not solve it. But saving last error around a VM Operation is good. The key stack from the -Xcomp case was: . 11 Id: 198c.6b30 Suspend: 1 Teb: 00000040`3fdad000 Unfrozen rax=0000000000000000 rbx=0000000000000000 rcx=0000000000000000 ntdll!RtlSetLastWin32Error: 00007fff`61180770 894c2408 mov dword ptr [rsp+8],ecx ss:00000040`401f3c50=00000000 # Child-SP RetAddr Call Site 00 00000040`401f3c48 00007fff`5ee9b38d ntdll!RtlSetLastWin32Error 01 00000040`401f3c50 00007fff`5eedcfb7 KERNELBASE!BaseSetLastNTError+0x1d 02 00000040`401f3c80 00007ffe`c7726449 KERNELBASE!SleepConditionVariableCS+0x37 03 00000040`401f3cb0 00007ffe`c76f8284 jvm!PlatformMonitor::wait+0x29 [s\open\src\hotspot\os\windows\os_windows.cpp @ 5379] 04 00000040`401f3ce0 00007ffe`c72c88b5 jvm!Monitor::wait+0x84 [s\open\src\hotspot\share\runtime\mutex.cpp @ 257] 05 (Inline Function) --------`-------- jvm!MonitorLocker::wait+0xa [s\open\src\hotspot\share\runtime\mutexLocker.hpp @ 256] 06 00000040`401f3d20 00007ffe`c72c443a jvm!CompileBroker::wait_for_completion+0x285 [s\open\src\hotspot\share\compiler\compileBroker.cpp @ 1684] 07 00000040`401f3d90 00007ffe`c72c4070 jvm!CompileBroker::compile_method_base+0x2ba [s\open\src\hotspot\share\compiler\compileBroker.cpp @ 1282] 08 00000040`401f5140 00007ffe`c72c414d jvm!CompileBroker::compile_method+0x460 [s\open\src\hotspot\share\compiler\compileBroker.cpp @ 1438] 09 00000040`401f51f0 00007ffe`c72b2cd7 jvm!CompileBroker::compile_method+0x8d [s\open\src\hotspot\share\compiler\compileBroker.cpp @ 1302] 0a 00000040`401f5240 00007ffe`c72b3300 jvm!CompilationPolicy::compile+0x1d7 [s\open\src\hotspot\share\compiler\compilationPolicy.cpp @ 814] 0b (Inline Function) --------`-------- jvm!CompilationPolicy::method_invocation_event+0x63 [s\open\src\hotspot\share\compiler\compilationPolicy.cpp @ 1150] 0c 00000040`401f65c0 00007ffe`c7219521 jvm!CompilationPolicy::event+0x100 [s\open\src\hotspot\share\compiler\compilationPolicy.cpp @ 755] 0d (Inline Function) --------`-------- jvm!counter_overflow_helper+0x139 [s\open\src\hotspot\share\c1\c1_Runtime1.cpp @ 481] 0e 00000040`401f6610 00000215`876510f8 jvm!Runtime1::counter_overflow+0x181 [s\open\src\hotspot\share\c1\c1_Runtime1.cpp @ 488] 0f 00000040`401f8ca0 00000000`00000010 0x00000215`876510f8 10 00000040`401f8ca8 00000000`00000002 0x10 11 00000040`401f8cb0 00000215`992fa440 0x2 12 00000040`401f8cb8 00000000`00000002 0x00000215`992fa440 13 00000040`401f8cc0 00000215`f0a90000 0x2 14 00000040`401f8cc8 00000000`00000000 0x00000215`f0a90000
16-09-2022

Thanks Jorn for the note above -- Right, we are a VM and the get is hardly guaranteed to be "immediately" after the set. 8-) It might have been a mistake to document the -Xcomp and -Xint behaviour in the same place. They might be for unrelated reasons. The -Xcomp I thought was worth investigating, as it happens immediately, and we run the test very well without it. With -Xint it lasts for a while, it can be 25 minutes of continual setting/getting to see a problem, which is a lot of iterations (an additional -Xint test just now showed 2 errors in nearly 2 million iterations, so it is a 1 in a million problem). With -Xcomp the errors are more frequent, and quite predictable. If I let the test continue rather than fail, there are reliably 5 failures with -Xcomp, always at the loop iteration numbers here: 0 GetLastError mismatch at iteration 1: lastError = 0 != wanted 42: failed. 0 GetLastError mismatch at iteration 2: lastError = 0 != wanted 42: failed. 0 GetLastError mismatch at iteration 128: lastError = 0 != wanted 42: failed. 0 GetLastError mismatch at iteration 256: lastError = 0 != wanted 42: failed. 0 GetLastError mismatch at iteration 257: lastError = 0 != wanted 42: failed. ..................... ...and continuing, it always succeeds, for the time I've observed at least. (They seem like interesting numbers for loop counts but that may be a distraction!) A 5-second sleep before the loop does not change this (it's not other things warming up). A 5-second delay inside the loop: well that's taking a while to test but... it's exactly the same. Keeping -Xcomp, disable tiered compilation changes the behaviour a little: java -XX:-TieredCompilation -Xcomp --enable-preview GetLastErrorTest ... 0 GetLastError mismatch at iteration 1: lastError = 0 != wanted 42: failed. 0 GetLastError mismatch at iteration 128: lastError = 0 != wanted 42: failed. ......0 GetLastError mismatch at iteration 6784: lastError = 0 != wanted 42: failed. ........................................................................ Again that's a reliable pattern. In jdk-20+16-1032 and an older 20-ea+11-652 Compilation excludes do affect the failures: Exclude DowncallStub*.invoke from compilation, fewer failures: 3 rather than 5 java -XX:CompileCommand=exclude,jdk/internal/foreign/abi/DowncallStub*.invoke -Xcomp --enable-preview GetLastErrorTest CompileCommand: exclude jdk/internal/foreign/abi/DowncallStub*.invoke bool exclude = true 0 GetLastError mismatch at iteration 1: lastError = 0 != wanted 42: failed. 0 GetLastError mismatch at iteration 2: lastError = 0 != wanted 42: failed. 0 GetLastError mismatch at iteration 128: lastError = 0 != wanted 42: failed. ................. Excluding just the GetLastErrorTest class itself from compilation does not avoid/change the problem. But excluding probably everything with: java -XX:CompileCommand=exclude,j*.* -Xcomp --enable-preview GetLastErrorTest ...shows no errors. So I don't pin this to a particular method being compiled, but compilation behaviour relates to seeing the problem. Calling Thread::current() in the context of the app thread would be a problem, but as this is so infrequent it's not that every compile shows the issue, and it's too predictable to relate much to chance. It seems hopeful that there is something related to the compilation activity here which triggers the issue. If so, maybe the other efforts to preserve last error are not needed...
16-09-2022

I'm setting the sub-cat to 'debugger' since most of the report generators/tracker need a subcat.
15-09-2022

Hi Kevin. I've discussed the issue of GetLastError being overwritten some more with some of the other Panama team members recently as well (after getting back from vacation), and in general it seems that there are many things that can happen in the VM to overwrite the last error value, and it seems unreasonable to expect the VM to preserve the error value everywhere. (what do you think?) It's maybe not evident from the Panama example code, but there is still Java code that runs in between the native calls, including code that does (Java) allocations. So, if the heap needs to be resized, the last error value could be overwritten as well for instance. As a workaround, we are currently discussion an addition to the Panama API that will allow saving values such as GetLastError to the side in the tail end of another native call, so the VM can't get in between to overwrite the value, and the user can retrieve the saved value later. With all that said, I think that, at least at the moment, we shouldn't expect the VM to preserve the last error value.
15-09-2022

As this is also failing with -Xint, it does not seem to be a compiler specific issue. Moving to core-svc for further analysis.
15-09-2022

ILW = unexpected behavior; with -Xcomp or -Xint; no workaround = MMH = P3
14-09-2022