JDK-8292166 : POSIX signal handling issues with pc access
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 20
  • Priority: P4
  • Status: Closed
  • Resolution: Won't Fix
  • Submitted: 2022-08-10
  • Updated: 2024-07-11
  • Resolved: 2024-07-11
Related Reports
Relates :  
Relates :  
Description
JDK-8283326 made some changes to the POSIX signal handling code.  Some of these issue might have pre-dated that.

The signal handler has an assert that the ucVoid argument is non-NULL. Later, when extracting the pc from the ucontext, it only does so if uc (obtained from ucVoid) is non-NULL. But it can't be NULL because of the assertion.

The pc is extracted from the context even if it isn't needed. The signal may have already been handled by handle_assert_poison_fault.  The pre-8283326 checked signal_was_handled.  (Also, see below regarding signal_was_handled.)

The static variant of handle_safefetch ignores the pc argument and refetches it from the ucontext. The setjmp variant of handle_safefetch also ignores the pc argument (naming it ignored1). Maybe these functions should just not take the pc as an argument. That would push the extraction code further down in the signal handler. Alternatively, the static variant could use the passed in argument rather than re-extracting it. (The static variant of handle_safefetch now uses the passed pc argument. The fix was made in JDK-8294053.)

Extraction of the pc involves some special cases. The static variant of handle_safefetch doesn't deal with those cases. (They don't seem to apply, other than for ZERO, so okay.) There is also similar but (inconsistently) different code in the crash_handler function in vmError_posix.cpp. Maybe those places could share code.

crash_handler also checks for a the ucVoid argument being NULL; is that actually possible here?

It seems like the POSIX signal handler would be simpler if all places that set signal_was_handled instead checked the assigned-from value and returned true if that value is true. That would eliminate the signal_was_handled variable and all the downstream checks.  But maybe this should be a separate issue.


Comments
Runtime Triage: This is not on our current list of priorities. We will consider this feature if we receive additional customer requirements.
11-07-2024

The issue tends be what implementations actually do, more so than what specs allow. Linux/glibc does not always adhere to Posix. In any case we have many such conservative code fragments in the VM. We can rely on an assert when we are checking something we internally control, but not when it comes from an external source.
11-08-2022

We can probably safely assume the context to be not NULL. Posix does not give any leeway to the implementation to leave the context NULL (https://pubs.opengroup.org/onlinepubs/009604499/functions/sigaction.html) : "If SA_SIGINFO is set and the signal is caught, the signal-catching function shall be entered as: void func(int signo, siginfo_t *info, void *context); .... ; the third argument can be cast to a pointer to an object of type ucontext_t to refer to the receiving thread's context that was interrupted when the signal was delivered. " And practically, we all run masses of tests with debug VMs and that assert never fired, so it is probably okay.
10-08-2022

> But it can't be NULL because of the assertion. The assertion in not present in a product build. If it is unexpectedly NULL then we don't want to trigger a crash.
10-08-2022