CR 7011862 showed how a StackOverflowError in one thread can leave system data structures in an unusable state. This is the insidious nature of these exceptions: they are effectively asynchronous, and you cannot write practical code to guard against them.
A number of simple suggestions were made as to how we might better handle this situation. These include:
1. Keep a global count of all such exceptions (StackOverflowError and other VirtualMachineErrors, probably excluding OOME) and report that count as part of the normal SIGQUIT / ctrl-\ stack dump, so that when a system is "hung" you can see that threads have encountered these kinds of errors at some point.
2. Add a product VM flag, e.g. ReportVirtualMachineErrors, that will report the point at which such exceptions are first thrown (again, OOME would likely be excluded).
3. Add a product VM flag, e.g. VirtualMachineErrorsAreFatal, that will cause the throwing of one of these errors to be converted into a fatal error (i.e. guarantee(false, "...")). This not only allows extra debugging of the call stack but also allows applications to ensure they will not continue in the presence of potentially data-corrupting exceptions. (Again, exclude OOME.)
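Until something like the flags above exists in the VM, suggestions 1 and 3 can be roughly approximated at the application level. The following is only a sketch under stated assumptions: the class name VmErrorMonitor, its "fatal" switch and the recurse() helper are hypothetical and are not part of any JDK API. It counts VirtualMachineErrors other than OOME that escape a thread and can optionally halt the process. Note the key limitation: an uncaught-exception handler only sees errors that propagate out of a thread, so an error that is caught and swallowed somewhere up the stack goes uncounted, which is exactly the case a VM-level count would cover.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical application-level approximation of suggestions 1 and 3. */
final class VmErrorMonitor implements Thread.UncaughtExceptionHandler {
    // Process-wide count of non-OOME VirtualMachineErrors that escaped a thread.
    private static final AtomicLong COUNT = new AtomicLong();

    // When true, mimic a "VirtualMachineErrorsAreFatal"-style policy (assumption).
    private final boolean fatal;

    VmErrorMonitor(boolean fatal) {
        this.fatal = fatal;
    }

    static long count() {
        return COUNT.get();
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        // Exclude OutOfMemoryError, as the suggestions above do.
        if (e instanceof VirtualMachineError && !(e instanceof OutOfMemoryError)) {
            long n = COUNT.incrementAndGet();
            System.err.println("VirtualMachineError #" + n
                    + " in thread " + t.getName() + ": " + e);
            if (fatal) {
                // Stop immediately without running shutdown hooks, so no code
                // continues on top of possibly corrupted state.
                Runtime.getRuntime().halt(1);
            }
        }
    }

    // Unbounded recursion, used only to provoke a StackOverflowError.
    static void recurse() {
        recurse();
    }

    public static void main(String[] args) throws InterruptedException {
        Thread.setDefaultUncaughtExceptionHandler(new VmErrorMonitor(false));
        Thread t = new Thread(VmErrorMonitor::recurse, "overflow-demo");
        t.start();
        t.join();
        System.out.println("VirtualMachineErrors seen: " + count());
    }
}
```

Constructing the handler with fatal set to true gives the fail-fast behaviour of suggestion 3 for errors that reach the handler; unlike guarantee(false, "..."), it produces no hs_err file and no VM-level call stack at the point of the throw.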
A more ambitious enhancement would be to try to identify key classes/methods and ensure that on entry to them there is sufficient stack to allow successful execution. At the language level, as Dave Dice notes, the ability to disable such exceptions around critical regions of code, à la .NET, could be useful (of course, implementing that requires establishing how much stack will be needed). As Dave notes, "You can find StackOverflowError thrown from really unexpected places because of deopt and OSR".