JDK-8226512 : JFR Event Streaming
  • Type: CSR
  • Component: hotspot
  • Sub-Component: jfr
  • Priority: P3
  • Status: Closed
  • Resolution: Approved
  • Fix Versions: 14
  • Submitted: 2019-06-20
  • Updated: 2019-09-30
  • Resolved: 2019-09-28
Related Reports
CSR :  
Description
Summary
-------

Expose JDK Flight Recorder data for continuous monitoring.

Problem
-------

The HotSpot VM emits more than 500 data points using JFR, most of them not available through other means besides parsing log files.

To consume the data today, a user must start a recording, stop it, dump the contents to disk and then parse the recording file. This works well for application profiling, where typically at least a minute of data is being recorded at a time, but not for monitoring purposes.

Solution
--------

Add an API to the jdk.jfr.consumer package that allows users to read (stream) recording data directly from the disk repository without dumping a recording file. There will be an interface, jdk.jfr.consumer.EventStream, that provides read capabilities for in-process streams, out-of-process streams, and ordinary recording files.

    public interface EventStream extends AutoCloseable {
      public static EventStream openRepository();
      public static EventStream openRepository(Path directory);
      public static EventStream openFile(Path file);

      void setStartTime(Instant startTime);
      void setEndTime(Instant endTime);
      void setOrdered(boolean ordered);
      void setReuse(boolean reuse);

      void onEvent(Consumer<RecordedEvent> action);
      void onEvent(String eventName, Consumer<RecordedEvent> action);
      void onFlush(Runnable action);
      void onClose(Runnable action);
      void onError(Runnable action);
      void remove(Object action);
 
      void start();
      void startAsync();
    
      void awaitTermination();
      void awaitTermination(Duration duration);
      void close();
    }
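
As a minimal sketch of the file-based case, the following writes a few events the existing way (start, stop, dump) and then reads them back through `EventStream.openFile`. The event class `HelloEvent`, its name `demo.Hello`, and the helper methods are hypothetical and exist only for illustration; the calls used are those available in JDK 14 and later, which may differ in detail from the listing above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.Event;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.consumer.EventStream;

class FileStreamSketch {

    // Hypothetical application event, defined only for this illustration.
    @Name("demo.Hello")
    static class HelloEvent extends Event {
        String message;
    }

    // Produces a recording file the existing way: start, stop, dump.
    static Path writeRecording(String... messages) throws IOException {
        Path file = Files.createTempFile("demo", ".jfr");
        try (Recording recording = new Recording()) {
            recording.enable(HelloEvent.class);
            recording.start();
            for (String m : messages) {
                HelloEvent event = new HelloEvent();
                event.message = m;
                event.commit();
            }
            recording.stop();
            recording.dump(file);
        }
        return file;
    }

    // Reads the file back through the callback-based stream API.
    static List<String> readMessages(Path file) throws IOException {
        List<String> result = new ArrayList<>();
        try (EventStream stream = EventStream.openFile(file)) {
            stream.onEvent("demo.Hello", e -> result.add(e.getString("message")));
            stream.start(); // runs the callbacks in the current thread
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path file = writeRecording("a", "b", "c");
        System.out.println(readMessages(file));
    }
}
```

For a recording file, `start()` returns once the end of the file is reached; for a repository stream it blocks until the stream is closed, which is why `startAsync()` is offered as well.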

There will be a class, jdk.jfr.consumer.RecordingStream, that implements the EventStream interface and provides control capabilities and a retention policy.

    public final class RecordingStream implements EventStream {
 
      public RecordingStream();
      public RecordingStream(Configuration configuration);
 
      public EventSettings enable(String eventName);
      public EventSettings enable(Class<? extends Event> eventClass);
      public EventSettings disable(String eventName);
      public EventSettings disable(Class<? extends Event> eventClass);
      public void setSettings(Map<String, String> settings);
 
      public void setMaxAge(Duration maxAge);
      public void setMaxSize(long maxSize);
      public void setFlushInterval(Duration interval);

      // Implementation of EventStream interface
    }
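
A sketch of in-process monitoring with RecordingStream, assuming a JDK where the class is available (14 or later). `TickEvent`, `demo.Tick`, and `observeTick` are hypothetical names used for illustration; a real monitor would more likely subscribe to JDK events such as `jdk.CPULoad`.

```java
import java.time.Duration;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import jdk.jfr.Event;
import jdk.jfr.Name;
import jdk.jfr.consumer.RecordingStream;

class RecordingStreamSketch {

    // Hypothetical application event, defined only for this illustration.
    @Name("demo.Tick")
    static class TickEvent extends Event {
        int count;
    }

    // Starts a stream, emits one event, and waits until the callback sees it.
    static boolean observeTick(Duration timeout) throws InterruptedException {
        CountDownLatch seen = new CountDownLatch(1);
        try (RecordingStream stream = new RecordingStream()) {
            stream.enable(TickEvent.class);          // control capability
            stream.setMaxAge(Duration.ofMinutes(1)); // retention policy
            stream.onEvent("demo.Tick", e -> seen.countDown());
            stream.startAsync();                     // callbacks run in a separate thread

            TickEvent tick = new TickEvent();
            tick.count = 1;
            tick.commit();

            // The event becomes visible after the next flush to the repository.
            return seen.await(timeout.toMillis(), TimeUnit.MILLISECONDS);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("delivered: " + observeTick(Duration.ofSeconds(10)));
    }
}
```

Closing the stream in the try-with-resources statement also ends the underlying recording, so no separate stop or dump step is needed.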

Two methods will also be added to the jdk.jfr.Recording class that allow users to control how often data should be flushed.

    void setFlushInterval(Duration interval);
    Duration getFlushInterval();

The -XX:StartFlightRecording flag will get a new option called flush-interval that specifies how often data should be flushed. The diagnostic command JFR.start will get an option with the same name for setting the interval at runtime.

To allow access to the disk repository from other processes, a system property ("jdk.jfr.repository") will be added so that a client can discover the file path using the attach API.
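
The discovery step could be sketched as follows, assuming the property is visible through `VirtualMachine.getSystemProperties()` of the attach API (module jdk.attach). The class `RepositoryLookup` and its methods are hypothetical helpers, not part of the proposal.

```java
import com.sun.tools.attach.VirtualMachine;
import java.nio.file.Path;
import java.util.Optional;
import java.util.Properties;

class RepositoryLookup {

    // Property name taken from the proposal text.
    static final String REPOSITORY_PROPERTY = "jdk.jfr.repository";

    // Extracts the repository path, if the target JVM has published one.
    static Optional<Path> repositoryFrom(Properties properties) {
        return Optional.ofNullable(properties.getProperty(REPOSITORY_PROPERTY))
                       .map(Path::of);
    }

    // Attaches to the JVM with the given process id and looks up the path.
    // Assumption: the property is set only while Flight Recorder is active.
    static Optional<Path> findRepository(String pid) throws Exception {
        VirtualMachine vm = VirtualMachine.attach(pid);
        try {
            return repositoryFrom(vm.getSystemProperties());
        } finally {
            vm.detach();
        }
    }
}
```

A client would then pass the resulting path to `EventStream.openRepository(Path)` to read events from the other process.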

Specification
-------------

Command line:

    -XX:StartFlightRecording:flush-interval=<NANOTIME>

Diagnostic command:

    JFR.start flush-interval=<NANOTIME>

See attached diff for changes to EventStream, RecordingStream and Recording.
Comments
Added "The size is measured in bytes." to the setMaxSize method and updated api-changes.txt and javadoc.zip.
30-09-2019

Moving to Approved. For the setMaxSize method, I suggested stating explicitly that the size is measured in bytes.
28-09-2019

Thanks for the feedback. See updated diff.

- Changed from <code>...</code> to {@code ... } where appropriate.
- Added "@since 14" to setFlushInterval and getFlushInterval.
- Changed to var in example code with try-with-resources.
- Added javadoc to EventStream::close stating that closing a closed stream has no effect.
- Fixed typo in RecordingStream::disable.

When it comes to an isStarted() method, the expected usage is a local context where the client is aware of the state of the EventStream. Even if that is not the case, a state change may occur after the call to isStarted() and before the invocation of the start() method, for example if the stream is started from another thread.
27-09-2019

In Recording, as a code review comment, in

    + * @throws IllegalArgumentException if <code>interval</code> is negative
    + @throws IllegalArgumentException if <code>maxAge</code> is negative

and other similar locations, please use "{@code foo}" rather than `<code>foo</code>`.

The setFlushInterval method in Recording should have an @since tag.

Also as a code review suggestion, the resource variables in the try-with-resources statements could use var.

In EventStream, should there be a "started" predicate method to allow clients to test and avoid IllegalStateException?

I recommend having explicit requirements on whether calling close on an already closed stream is an error condition. By default, not having this be an error (idempotent close) is preferable.

Typo: "is" -> "are" in

    + * in different class loaders), then all events that match the name is
    + * disabled. To disable a specific class, use the {@link #disable(Class)}
24-09-2019

I attached a diff for the three classes that are added or changed. It will hopefully make it easier to see what is being changed.
23-09-2019

For CSR review purposes it is not convenient to look over the javadoc output for the entire JDK. Please attach and include some artifact which highlights the changes of this CSR, such as a webrev or specdiff. Thanks.
21-09-2019

1) Javadoc will be added before finalizing the CSR.

2) We would like the API to work well with reactive frameworks and we considered using the interfaces in java.util.concurrent.Flow, but decided against it:

- It made the API cumbersome to use (without any real added benefit).
- There is no easy way to regulate the amount of data produced by the JVM using back pressure.
- Flight Recorder already has two policies (maxAge and maxSize) to handle overflow.
- It's a one-liner to publish events from a j.u.f.Consumer to a reactive framework.

3) Given a reasonable heap size, i.e. 100 - 200 MB, it is not possible to run out of RAM using the API. If a consumer is not able to keep up, old data (on disk) will be discarded if maxAge and maxSize have been set.
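
The one-liner mentioned in point 2 could look like the sketch below, which uses java.util.concurrent.SubmissionPublisher as a stand-in for a reactive framework. `FlowBridgeSketch`, `bridge`, and `collect` are hypothetical names; with JFR, the consumer returned by `bridge` would be passed to `onEvent` on a stream whose publisher carries RecordedEvent items.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class FlowBridgeSketch {

    // The bridge itself: a method reference turns a publisher into a Consumer.
    static <T> Consumer<T> bridge(SubmissionPublisher<T> publisher) {
        return publisher::submit;
    }

    // Pushes the items through the bridge and returns what a subscriber received.
    static <T> List<T> collect(List<T> items) throws InterruptedException {
        List<T> received = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(1);
        try (SubmissionPublisher<T> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(new Flow.Subscriber<T>() {
                @Override public void onSubscribe(Flow.Subscription s) { s.request(Long.MAX_VALUE); }
                @Override public void onNext(T item) { received.add(item); }
                @Override public void onError(Throwable t) { done.countDown(); }
                @Override public void onComplete() { done.countDown(); }
            });
            Consumer<T> action = bridge(publisher); // with JFR: stream.onEvent(action)
            items.forEach(action);
        } // closing the publisher signals onComplete after pending items
        done.await(5, TimeUnit.SECONDS);
        return received;
    }
}
```

Note that `submit` blocks when a subscriber's buffer is full, so a slow subscriber would stall the thread that delivers the events rather than slow down the JVM that produces them.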
26-06-2019

Moving to Provisional. A few comments before the request is finalized for the second phase of CSR review:

* Please have all methods and classes intended to be used by clients of the API have full javadoc.
* Is this API meant to be a reactive stream as in java.util.concurrent.Flow?
* Is it possible/desirable to run the buffer out of memory using this API?
24-06-2019