JDK-8250782 : UseOSErrorReporting causes stack space exception (Win)
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 14
  • Priority: P4
  • Status: In Progress
  • Resolution: Unresolved
  • OS: windows_10
  • Submitted: 2020-07-29
  • Updated: 2023-01-07
JDK 21
21 : Unresolved
Description
Consider deprecating this in JDK 21

Turning on UseOSErrorReporting hangs the VM process during crash reporting on macOS and Linux; it eventually times out with the message "[ timer expired, abort... ]". It doesn't do that on Windows, but does it actually do anything useful?

I tried it on Windows 10 and did not spot any difference between the flag being on and off from the point of view of the user/developer. In particular, I see no system crash reports.

On Windows, as on macOS and Linux, we seem to be stuck in an infinite loop, and the only reason the process does not time out is that after about 250 iterations it hits a stack space exception.

Might need a fix similar to the one for POSIX (i.e. JDK-8250637).
Comments
Thanks [~dholmes], we will take a look.
07-01-2023

Reading: https://learn.microsoft.com/en-us/windows/win32/debug/exception-dispatching
When an exception occurs in user-mode code, the system uses the following search order to find an exception handler:
1. If the process is being debugged, the system notifies the debugger. For more information, see Debugger Exception Handling.
2. If the process is not being debugged, or if the associated debugger does not handle the exception, the system attempts to locate a frame-based exception handler by searching the stack frames of the thread in which the exception occurred. The system searches the current stack frame first, then searches through preceding stack frames in reverse order.
3. If no frame-based handler can be found, or no frame-based handler handles the exception, but the process is being debugged, the system notifies the debugger a second time.
4. If the process is not being debugged, or if the associated debugger does not handle the exception, the system provides default handling based on the exception type. For most exceptions, the default action is to call the ExitProcess function.
---
The UseOSErrorReporting flag is, I believe, intended to allow us to get to step 3 to notify the debugger after we have done any internal processing of the fault (like generating the hs_err file).
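The search order quoted above can be modelled as a small decision function. This is only an illustrative sketch: the class, enum, and parameter names are hypothetical, it calls no Windows or HotSpot API, and it assumes each step either handles the exception or passes it on.

```java
/** Illustrative model of the quoted user-mode exception search order (not a real API). */
final class ExceptionDispatchSketch {

    /** Which of the four documented steps ends up dealing with the exception. */
    enum Outcome {
        DEBUGGER_FIRST_CHANCE,   // step 1
        FRAME_BASED_HANDLER,     // step 2
        DEBUGGER_SECOND_CHANCE,  // step 3
        DEFAULT_HANDLING         // step 4
    }

    static Outcome dispatch(boolean debuggerAttached,
                            boolean debuggerHandlesFirstChance,
                            boolean frameHandlerHandles,
                            boolean debuggerHandlesSecondChance) {
        // Step 1: an attached debugger is notified first.
        if (debuggerAttached && debuggerHandlesFirstChance) {
            return Outcome.DEBUGGER_FIRST_CHANCE;
        }
        // Step 2: otherwise the stack frames are searched for a frame-based handler.
        if (frameHandlerHandles) {
            return Outcome.FRAME_BASED_HANDLER;
        }
        // Step 3: no frame-based handler took it, so an attached debugger is notified again.
        if (debuggerAttached && debuggerHandlesSecondChance) {
            return Outcome.DEBUGGER_SECOND_CHANCE;
        }
        // Step 4: default handling, which for most exceptions ends the process.
        return Outcome.DEFAULT_HANDLING;
    }
}
```

Under this model, a handler that does its own processing and then declines the exception corresponds to frameHandlerHandles being false, which is the only way step 3 or step 4 is reached.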
05-01-2023

[~mbeckwit] is anyone from the Microsoft team able to provide guidance here on whether UseOSErrorReporting still has some value? Thanks.
05-01-2023

I had a debug version of JDK 17 at hand so ran a simple test:
$ ./apps/Java/jdk-17/fastdebug/bin/java -XX:ErrorHandlerTest=14 -XX:-CreateCoredumpOnCrash
#
# A fatal error has been detected by the Java Runtime Environment:
#
# EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00007ff81a0a7cbd, pid=18180, tid=21668
#
# JRE version: Java(TM) SE Runtime Environment (17.0) (fastdebug build 17-internal+0-LTS-2021-06-01-2133181.david.holmes.jdk-dev2.git)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 17-internal+0-LTS-2021-06-01-2133181.david.holmes.jdk-dev2.git, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, windows-amd64)
# Problematic frame:
# V [jvm.dll+0xce7cbd]
#
# CreateCoredumpOnCrash turned off, no core file dumped
#
# An error report file with more information is saved as:
# D:\ade\hs_err_pid18180.log
#
# If you would like to submit a bug report, please visit:
# https://bugreport.java.com/bugreport/crash.jsp
#
$ ./apps/Java/jdk-17/fastdebug/bin/java -XX:ErrorHandlerTest=14 -XX:-CreateCoredumpOnCrash -XX:+UseOSErrorReporting
#
# A fatal error has been detected by the Java Runtime Environment:
#
# EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00007ff81a0a7cbd, pid=7416, tid=13588
#
# JRE version: Java(TM) SE Runtime Environment (17.0) (fastdebug build 17-internal+0-LTS-2021-06-01-2133181.david.holmes.jdk-dev2.git)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 17-internal+0-LTS-2021-06-01-2133181.david.holmes.jdk-dev2.git, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, windows-amd64)
# Problematic frame:
# V [jvm.dll+0xce7cbd]
#
# CreateCoredumpOnCrash turned off, no core file dumped
#
# An error report file with more information is saved as:
# D:\ade\hs_err_pid7416.log
#
# If you would like to submit a bug report, please visit:
# https://bugreport.java.com/bugreport/crash.jsp
#
Segmentation fault
So there is a slight difference in behaviour (as would be expected, as we don't call abort), but no infinite loop in this case.
05-01-2023

The empty mdmp file is a bug in the VM. We create the file in os::check_dump_limit, but if UseOSErrorReporting is on we never actually generate the dump. So we should create the file only if UseOSErrorReporting is false. Otherwise, on Windows, UseOSErrorReporting should prevent the VM from terminating the process and return control to the OS. It is up to the user to have configured what happens then. They may have enabled dumps: https://learn.microsoft.com/en-us/windows/win32/wer/collecting-user-mode-dumps or debugging: https://learn.microsoft.com/en-us/windows/win32/debug/configuring-automatic-debugging#configuring-automatic-debugging-for-application-crashes but by default the process will just terminate and there will not be a minidump. If UseOSErrorReporting is not allowing the process to terminate, then it may be a bug in the VM, because our own error code is still doing things it should not do if we are going to return control to the OS. It is also possible we have a bug in our SEH logic.
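The guard proposed in this comment can be sketched as a single predicate. This is an illustrative model only, written in Java rather than as HotSpot code: the class and method names are hypothetical, and the two parameters merely stand in for the CreateCoredumpOnCrash and UseOSErrorReporting flags.

```java
/** Illustrative model of the proposed dump-file guard (hypothetical names, not HotSpot code). */
final class DumpFileGuardSketch {

    /**
     * Pre-create the dump file only when the VM itself is going to write the dump.
     * With UseOSErrorReporting on, the VM never generates the dump, so creating
     * the file up front would leave an empty file behind.
     */
    static boolean shouldCreateDumpFile(boolean createCoredumpOnCrash,
                                        boolean useOSErrorReporting) {
        return createCoredumpOnCrash && !useOSErrorReporting;
    }
}
```

With both flags on, the predicate returns false, so no file would be created and no empty mdmp would be left behind.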
05-10-2022

UseOSErrorReporting prevents our error handler from aborting the VM and, in theory, allows the exception to propagate back up to Windows so that it can perform its own crash reporting logic (WER). What Windows will do in that case depends on how it is configured. In the "old days" you had to have DrWatson installed and running to get any kind of crash reporting - ref JDK-4997835. But I agree that the utility of allowing this seems very limited and we have no real way to test that it actually does allow more useful WER interactions.
04-10-2022

I linked to the old bug JDK-4997835, but that problem was solved long ago. By default, we get both an hs_err_pid*log and an mdmp file with -XX:+CreateCoredumpOnCrash. When I look at this mdmp file in the WinDbg debugger, it doesn't print the stack of the crashing thread correctly, so [~pchilanomate] dug it out for me by reassigning rsp to some value on the stack and then printing the stack. I thought UseOSErrorReporting would prevent the stack of the crashing thread from being unreadable, but it didn't. It resulted in an empty mdmp file, which is even worse. We will never support WER, so I'm wondering if UseOSErrorReporting has any use to anyone who uses Windows. If not, we should deprecate it.
04-10-2022

If I pass UseOSErrorReporting to a test on Windows that crashes, I end up with an empty mdmp file, so this option appears to be the opposite of useful. It appears broken. Maybe we should remove it.
03-10-2022

I don't have access to any Windows systems so I can't assess what we should do with this option.
17-08-2022

Maybe we should deprecate UseOSErrorReporting since it doesn't seem to work and we don't get error reports from Microsoft WER.
23-06-2022

UseOSErrorReporting used to bypass our error reporter and in fact stopped the program exactly at the crash. That way, if you were connected to Visual Studio or wanted your report to go to WER, that would be the setting you'd use. Since we don't get reports from Microsoft WER, we don't really want to enable that option in production. Maybe this option should only be declared for Windows.
29-07-2020

Would someone with Windows (WER) experience please investigate?
29-07-2020