JDK-8243244 : Add option to jcmd to write a gzipped heap dump
  • Type: CSR
  • Component: hotspot
  • Sub-Component: svc
  • Priority: P4
  • Status: Closed
  • Resolution: Approved
  • Fix Versions: 15
  • Submitted: 2020-04-21
  • Updated: 2021-11-25
  • Resolved: 2020-06-02
Description
Summary
-------
Add an option to write a gzip compressed heap dump via the GC.heap_dump diagnostic command.

Problem
-------
With ever-increasing heap sizes, creating a heap dump takes longer and the resulting file uses more disk space.

Solution
--------
Add an option to directly write a gzipped heap dump via the diagnostic command GC.heap_dump. Since the deflate mechanism used in the gzip format is relatively slow, the compression is done in parallel.

When writing to a slow backing store, compression can make writing the heap dump faster, because less data has to be written. In practically all cases, a compression ratio of about 3 or more should be achieved.
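
The trade-off between compression level and output size can be explored with a small sketch. It uses the standard `java.util.zip` classes on synthetic, repetitive data; it is not the parallel native implementation described here, and the class name `GzipLevelDemo` and the sample data are made up for illustration. Real heap dumps will compress differently.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of the JDK or of this change.
final class GzipLevelDemo {

    // Compresses data into the gzip format using the given deflate level (1..9).
    static byte[] gzip(byte[] data, int level) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // GZIPOutputStream has no level parameter, so set it on the
        // protected Deflater field before any data is written.
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer) {
            { def.setLevel(level); }
        }) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decompresses gzip data back into the original bytes.
    static byte[] gunzip(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Synthetic, highly repetitive input (an assumption, not real heap data).
    static byte[] sampleData() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10_000; i++) {
            sb.append("java.lang.Object@").append(i % 100).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] data = sampleData();
        for (int level : new int[] {1, 9}) {
            byte[] compressed = gzip(data, level);
            System.out.printf("level %d: %d -> %d bytes%n", level, data.length, compressed.length);
        }
    }
}
```

Running `main` prints the compressed size at levels 1 and 9, so the sizes can be compared for a given input.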

The diagnostic command can be accessed via the jcmd tool. An example would be:

    jcmd <pid|main class> GC.heap_dump -gz=4 dump.hprof.gz

This triggers a gzip-compressed heap dump of the specified VM, using a compression level of 4.

For a discussion of alternative solutions see https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-February/030369.html

Specification
-------------
Add an integer option called **gz** to the GC.heap_dump diagnostic command. If it is specified, it will enable the gzip compression of the written heap dump. The supplied value is the compression level. It can range from 1 (fastest) to 9 (slowest, but best compression). The recommended level is 1.
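
The rules above (an integer value is mandatory and must lie between 1 and 9) can be sketched as a small validation routine. This is only an illustration of the specified behaviour, not the HotSpot option parser; the class `GzOption` and the method `parseLevel` are hypothetical names.

```java
// Hypothetical helper, not part of the JDK. It mirrors the rules stated
// in the specification: the level is mandatory and must be in 1..9.
final class GzOption {
    static final int MIN_LEVEL = 1; // fastest
    static final int MAX_LEVEL = 9; // slowest, but best compression

    // Parses an argument of the form "-gz=N" and returns N.
    // Throws IllegalArgumentException if the value is missing,
    // not an integer, or outside the allowed range.
    static int parseLevel(String arg) {
        String prefix = "-gz=";
        if (arg == null || !arg.startsWith(prefix)) {
            throw new IllegalArgumentException("expected -gz=<level>, got: " + arg);
        }
        int level;
        try {
            level = Integer.parseInt(arg.substring(prefix.length()));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("compression level must be an integer: " + arg, e);
        }
        if (level < MIN_LEVEL || level > MAX_LEVEL) {
            throw new IllegalArgumentException(
                    "compression level must be between " + MIN_LEVEL + " and " + MAX_LEVEL + ": " + level);
        }
        return level;
    }

    public static void main(String[] args) {
        System.out.println(parseLevel("-gz=4"));
    }
}
```

Under these assumptions, `-gz=4` is accepted, while `-gz`, `-gz=` and `-gz=0` are all rejected, which matches the decision in the comments to require a level and to drop level 0.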

Comments
Sorry for the delay -- I've read over the code review thread -- moving to Approved.
02-06-2020

Thanks for implementing version 3), it seems to have the most common acceptance. Reviewed.
18-05-2020

I'm against #2. The compression level 0 (no compression) does not make sense to me, as it will only cause confusion. The modified #1 (gz[=1..10]) looks a little bit better, but whether to implement it depends on your resources. Otherwise, the modified #3 (-gz=1..10) would work too. The situation where a customer wants to use gzip but has no idea what compression level to use looks a little bit strange to me. I guess using the minimal possible compression level should work. The #3 can be converted to #1 later if necessary.
11-05-2020

I've changed the proposal to #3. You always have to specify the compression level, but no new feature is needed in the option code. The compression level of 0 (store) has also been removed.
11-05-2020

So we have several options for the option ;) here:

1. One flag, level optional: -gz[=0..10]
2. Two flags, level optional: -gz [-gz-level=0..10]
3. One flag, force level: -gz=0..10

1.) seems the most obvious implementation of the flag. But implemented as an int, it changes the behaviour of all int flags. The old behaviour forces a value for int flags. The new behaviour would set a default value if a value is omitted (here: 1). (This change of behaviour would require a CSR in itself, I guess.) Therefore Ralf proposed to implement it as a String, so that he can parse the int value in his own code.

2.) is more awkward because there are two flags. But it leaves the user the possibility to enable compression without knowing any details about levels, by just using -gz. It can be implemented with the existing boolean and int flags.

3.) keeps everything in one flag, but it forces the user to set and thus understand the compression levels. It can be implemented with the existing int flag.

I would vote against changing the behaviour of int flags. It would require checking all existing int flags to have a proper default in place. Also, I think having the simple option -gz is very useful. It probably covers 95% of the use cases.

A separate issue discussed here is whether compression level 0 should be exposed. When does it make sense to write gzip format but not compress the file?
08-05-2020

I think we should either find a way to allow -gz[=compression-level], where the compression level part is completely optional, or we should go with two separate options, -gz and -gz-level=... My personal preference would be the first one. But having a mandatory "=*" part in -gz, e.g. -gz=1, is in my eyes the worst compromise. Would it be hard to create an "optional int" option in the diagnostics framework?
08-05-2020

I do not see it as a problem to always specify the compression level. Is it a big problem for 99% of users to always pass the option -gz=1 instead of -gz? We should probably disallow the option -gz=0, as it does not make sense to use and just invites confusion.

> And if one really needs to specify the compression level, using a second option should be OK.

It adds extra complexity to the interface.
07-05-2020

The main reason is that the diagnostic framework does not support an integer option that also works like a boolean option. The tests in jdk/jfr/startupargs/TestBadOptionValues.java even check that you get an error for an integer option without a value specified. The main use case should be not specifying the compression level, since this is what is right for 99% of the use cases. And in this case you only have to specify the -gz option. This is easy and hard to screw up. Always needing to specify a compression level is more error-prone (e.g. specifying 0, which means no compression). And if one really needs to specify the compression level, using a second option should be OK.
05-05-2020

The CSR is probably the best place to sort out such questions. It is still not clear to me why we could not simplify the options to -gz=compression-level, where compression-level is an integer or long number. We could try to make the compression-level part optional: -gz[=compression-level]. Another approach is to make it mandatory. I do not see any problem for users in using -gz=0 or -gz=1 if they do not care about the compression level.
04-05-2020