JDK-8223456 : Delayed starting of debugging via jcmd
  • Type: CSR
  • Component: core-svc
  • Sub-Component: debugger
  • Priority: P4
  • Status: Closed
  • Resolution: Approved
  • Fix Versions: 12
  • Submitted: 2019-05-07
  • Updated: 2019-06-28
  • Resolved: 2019-06-24
Related Reports
CSR :  
Description
Summary
-------

Allow the start of the JDWP debugging backend to be deferred until it is requested via a diagnostic command.

Problem
-------

The JDK already provides a mechanism to defer the start of debugging until specific exceptions are thrown or left uncaught, via the `onthrow` and `onuncaught` JDWP options.

However, a JVM might enter a 'bad' state without throwing an exception (e.g. by being in an endless loop). Some of these problematic states cannot be analyzed with the usual tools such as stack or heap dumps, but only with a Java debugger. It would therefore be useful to be able to enable debugging at the moment such a bad state is actually encountered.

Solution
--------

Analogously to the `onthrow`/`onuncaught` options we add a boolean `onjcmd` option to the JDWP agentlib. If it is enabled, we don't start debugging directly, but wait for it to be requested via a diagnostic command.

The option is named `onjcmd`, because this is how the user will usually start the debugging, even if technically it could be started otherwise, e.g. programmatically via the `DiagnosticCommandMBean` or via `jconsole`. The latter methods are not very well known in the Java community.

Like the `onthrow`/`onuncaught` options, the `onjcmd` option is currently only supported when the `server=y` JDWP option is in effect.

In contrast to the `onthrow`/`onuncaught` options, the `launch` option is currently not supported by the `onjcmd` option.

The JDWP library exports a JNI call that starts debugging on demand when `onjcmd=y` was specified on the agentlib command line. The first call initializes the debugging system; subsequent calls currently just return.

This JNI call is then used in a diagnostic command, which is exported to `jcmd` and to the `DiagnosticCommandMBean` under the name `VM.start_java_debugging`. It currently supports no additional options.
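
For the programmatic route, a minimal Java sketch might look like the following. It assumes the usual `DiagnosticCommandMBean` naming convention (characters are lowercased up to the first separator, `.` and `_` are dropped, and the character after each separator is uppercased), under which `VM.start_java_debugging` would map to an operation named `vmStartJavaDebugging`. The class and method names are illustrative only, and the sketch merely checks whether the operation is registered in the current VM; it does not invoke it.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanOperationInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Illustrative helper, not part of the JDK.
class StartDebuggingLookup {

    // ObjectName under which HotSpot registers the DiagnosticCommandMBean.
    static final String DCMD_MBEAN = "com.sun.management:type=DiagnosticCommand";

    // Assumed mapping from a jcmd command name to an MBean operation name.
    static String toOperationName(String command) {
        StringBuilder sb = new StringBuilder();
        boolean lower = true;   // lowercase everything before the first separator
        boolean upper = false;  // uppercase the character right after a separator
        for (char c : command.toCharArray()) {
            if (c == '.' || c == '_') {
                lower = false;
                upper = true;
            } else if (upper) {
                sb.append(Character.toUpperCase(c));
                upper = false;
            } else if (lower) {
                sb.append(Character.toLowerCase(c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Reports whether the running VM exposes the given diagnostic command.
    static boolean isRegistered(String command) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName(DCMD_MBEAN);
        if (!server.isRegistered(name)) {
            return false;
        }
        String operation = toOperationName(command);
        for (MBeanOperationInfo info : server.getMBeanInfo(name).getOperations()) {
            if (info.getName().equals(operation)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        String command = "VM.start_java_debugging";
        System.out.println(toOperationName(command)
                + " registered: " + isRegistered(command));
    }
}
```

Whether the operation is present depends on the JDK release in use, so the lookup is kept separate from any attempt to call it.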

Specification
-------------

A boolean JDWP option named `onjcmd` is added. 

It cannot currently be combined with the `server=n` configuration or the `launch` option.
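
The two restrictions can be modelled with a small Java sketch. This is not the agent's actual option parser: the class name, the error messages, and the simplified `name=value` splitting are assumptions made for illustration, as is treating an absent `server` option as `server=n`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative model of the documented constraints; not the agent's real parser.
class OnJcmdOptionCheck {

    // Splits "name=value,name=value" into a map (simplified: no quoting support).
    static Map<String, String> parse(String options) {
        Map<String, String> result = new HashMap<>();
        for (String pair : options.split(",")) {
            String[] kv = pair.split("=", 2);
            result.put(kv[0].trim(), kv.length > 1 ? kv[1].trim() : "");
        }
        return result;
    }

    // Returns a description of the problem if the combination is rejected,
    // or an empty Optional if the options are acceptable under this model.
    static Optional<String> validate(String options) {
        Map<String, String> opts = parse(options);
        boolean onJcmd = "y".equals(opts.get("onjcmd"));
        boolean server = "y".equals(opts.get("server")); // assumed default: server=n
        if (onJcmd && !server) {
            return Optional.of("onjcmd=y requires server=y");
        }
        if (onJcmd && opts.containsKey("launch")) {
            return Optional.of("onjcmd=y cannot be combined with launch");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(validate("transport=dt_socket,server=y,onjcmd=y,address=8000"));
        System.out.println(validate("transport=dt_socket,server=n,onjcmd=y,address=8000"));
    }
}
```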

A diagnostic command named `VM.start_java_debugging` is added.

If the `onjcmd` option is enabled (`onjcmd=y`), the first invocation of the `VM.start_java_debugging` command tries to start the debugging backend. If this cannot be completed, the feature is disabled, but the VM does not exit. Otherwise the debugging backend is started in server mode.

Further calls to `VM.start_java_debugging` currently have no effect.
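
The call semantics described above amount to a one-shot state machine, which can be sketched in Java as follows. This is only a model of the specified behaviour with `onjcmd=y` assumed to be in effect: the class, the state names, and the `BooleanSupplier` standing in for the backend start are hypothetical and do not reflect the native implementation.

```java
import java.util.function.BooleanSupplier;

// Hypothetical model of the specified behaviour; the real logic lives in the JDWP agent.
class OnDemandStart {

    enum State { WAITING, STARTED, DISABLED }

    private State state = State.WAITING;

    // Models one invocation of VM.start_java_debugging.
    // startBackend stands in for the backend initialization and returns true on success.
    synchronized State request(BooleanSupplier startBackend) {
        if (state != State.WAITING) {
            return state; // further calls have no effect
        }
        boolean started;
        try {
            started = startBackend.getAsBoolean();
        } catch (RuntimeException e) {
            started = false; // a failed start disables the feature; the VM keeps running
        }
        state = started ? State.STARTED : State.DISABLED;
        return state;
    }

    public static void main(String[] args) {
        OnDemandStart debugging = new OnDemandStart();
        System.out.println(debugging.request(() -> true));
        System.out.println(debugging.request(() -> true));
    }
}
```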

The permission required to start debugging via `VM.start_java_debugging` will be `"java.lang.management.ManagementPermission"` `"monitor"`.
Comments
Thanks for reviewing and approving this CSR. I have created task JDK-8226941 to collect follow up items.
28-06-2019

After consultation with others including [~alanb] and [~mr], I've concluded there is a lack of technical consensus on the appropriateness of the feature in its current state for the platform. As noted in the CSR FAQ (https://wiki.openjdk.java.net/display/csr/CSR+FAQs): "In exceptional circumstances, the need for a CSR review may be recognized only after a push has already occurred. In such cases, a retroactive CSR review can be conducted. The results of such a retroactive review may require updates to the change, up to and including complete removal of the change."
Administratively, I'm retroactively voting to approve this CSR as it has already been pushed in JDK 12; however, given the lack of consensus, I've filed the follow-up bug JDK-8226608 to:
* hide the onjcmd option from the help output
* explore hiding "VM.start_java_debugging" from the "jcmd <pid> help"
This bug needs to be addressed before JDK 13 rampdown 2. Please also consider restarting a broader discussion on serviceability-dev about exploring possible approaches to addressing problems in this space (https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-May/028226.html).
22-06-2019

Thanks, Serguei, for your input on this. As for renaming of the flag: I could also imagine "standby", instead of "ondemand". What needs to be done to get this item done? As for renaming of the option, I think we can open another bug/CSR referring to this one. It should probably be done in JDK13 time. I also think that JDK-8224673 should be resolved for JDK13. However, we're waiting with further activity on that one until this CSR is resolved. For further activities in the debugging on demand space, I'm willing to open another bug for evaluation purposes. In there we can collect further tasks, such as looking into com.sun.tools.attach.VirtualMachine enhancements and further items we also have ideas for in the SAP team. Maybe in the end it can grow to a JEP...
21-06-2019

In general, this looks okay to me. I like the suggestion from Volker S. to rename the flag "onjcmd" to "ondemand". This change impacts the JPDA spec document: "Connection and Invocation Details" (conninv.html). So, a spec document update has to be suggested in this CSR. Unfortunately the source of this document is still located in a closed repository. As Alan suggested, it would be worth considering extending the class com.sun.tools.attach.VirtualMachine with an abstract method startDebugAgent() or startJDWPAgent(), similar to startManagementAgent. But it will need to be separated into a different CSR.
07-06-2019

Agreed and updated accordingly.
28-05-2019

This CSR should reflect what was actually pushed IMO, then 8224673 should have its own CSR to change it to "control".
28-05-2019

Hi David, I have added the permissions as proposed by JDK-8224673 (control). So would this be the CSR for JDK-8224673 then as well? Or shall we file another CSR for it? Thanks Christoph
28-05-2019

This request should mention the permissions needed to use this facility - ref the proposed change to that permission in JDK-8224673.
28-05-2019

At the time https://bugs.openjdk.java.net/browse/JDK-8214892 was resolved (JDK12), there was no CSR request. The need for it was discovered only later. So we process this CSR request retroactively. As for the naming of the debugging agent option which was set to `onjcmd`, I'd rather suggest something like `standby`. But if we were to change it, e.g. for OpenJDK 13 delivery, I suggest making this change with another CSR. So, I consider this reviewed from my end.
27-05-2019