JDK-8238169 : BasicDirectoryModel getDirectories and DoChangeContents.run can deadlock
  • Type: Bug
  • Component: client-libs
  • Sub-Component: javax.swing
  • Affected Version: 11,13,14,15,17,21
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2020-01-29
  • Updated: 2024-06-14
  • Resolved: 2024-03-21
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 11 JDK 17 JDK 21 JDK 22 JDK 23 JDK 8
11.0.25-oracleFixed 17.0.13-oracleFixed 21.0.5-oracleFixed 22.0.2Fixed 23 b16Fixed 8u431Fixed
Related Reports
Duplicate :  
Relates :  
Description
Report by Adrian Nistor <adrnisto@amazon.com>.

The code in BasicDirectoryModel.java acquires synchronization objects in different orders. One acquires DoChangeContents.this and then fileCache, the other acquires fileCache and then DoChangeContents.this.

Acquiring synchronization objects in different orders can cause a circular wait (deadlock).

Here is the first order: (1) first acquire DoChangeContents.this and then fileCache:

    enter public method run() at line 524
    acquire synchronization on DoChangeContents.this at entry to run() because run() is a synchronized method --- see line 524
    acquire synchronization on fileCache at line 528

Here is the second order: (2) first acquire fileCache and then DoChangeContents.this:

    enter public method renameFile() at line 176
    acquire synchronization on fileCache at line 177
    call validateFileCache() at line 179
    call cancelRunnables() at line 157
    call cancelRunnables(Vector) at line 397
    call cancel() at line 392
    acquire synchronization on DoChangeContents.this at entry to cancel() because cancel() is a synchronized method --- see line 520

A straight-forward fix always acquires synchronization in only one order: first fileCache and then DoChangeContents.this. Swap the order of synchronized (this) and synchronized (fileCache) at lines 524 and 528.

Removing the synchronized keyword from the method signature and replacing it with synchronized (this) ensures the same behavior.

public void run() { <<<<<<<<<<<<<<<<<<<<<<<<<<<<< removed synchronized from here and moved it to synchronized (this) 2 lines below
    synchronized(fileCache) {   <<<<<<<<<<<<<<<<<<<<<<< added this statement instead of the same statement 5 lines below
        synchronized(this) { <<<<<<<<<<<<<<<<<<<<< added this instead of the synchronized keyword in the method signature 2 lines above
            if (fetchID == fid && doFire) {
                int remSize = (remFiles == null) ? 0 : remFiles.size();
                int addSize = (addFiles == null) ? 0 : addFiles.size();
                synchronized(fileCache) {   <<<<<<<<<<<<<<<< ***remove*** from here and moved it 5 lines above 
... ... ... ... ... ... ... ... 
        }
    }
}

The downside is that the fileCache synchronized region is larger than before.


Comments
[jdk11u-fix-request] Approval Request from Martin Should get backported for parity with 11.0.25-oracle. Applies cleanly and tier 1-4 have passed.
14-06-2024

A pull request was submitted for review. URL: https://git.openjdk.org/jdk11u-dev/pull/2768 Date: 2024-06-13 12:53:53 +0000
13-06-2024

[jdk17u-fix-request] Approval Request from Martin Should get backported for parity with 17.0.13-oracle. Applies cleanly and tier 1-4 have passed.
11-06-2024

A pull request was submitted for review. URL: https://git.openjdk.org/jdk17u-dev/pull/2558 Date: 2024-06-10 09:25:57 +0000
10-06-2024

A pull request was submitted for review. URL: https://git.openjdk.org/jdk17u-dev/pull/2539 Date: 2024-06-05 14:06:32 +0000
05-06-2024

[jdk21u-fix-request] Approval Request from Martin Should get backported for parity with 21.0.5-oracle. Applies cleanly and tier 1-4 have passed.
01-06-2024

A pull request was submitted for review. URL: https://git.openjdk.org/jdk21u-dev/pull/633 Date: 2024-05-31 15:43:11 +0000
31-05-2024

Fix request: 22u This bug is resolved by the fix JDK-8325179 which is already approved for 22u. Clean backport. Medium risk. Pull request: https://github.com/openjdk/jdk22u/pull/164
24-04-2024

A pull request was submitted for review. URL: https://git.openjdk.org/jdk22u/pull/164 Date: 2024-04-24 15:12:52 +0000
24-04-2024

Changeset: e66788c1 Author: Alexey Ivanov <aivanov@openjdk.org> Date: 2024-03-21 16:03:30 +0000 URL: https://git.openjdk.org/jdk/commit/e66788c16563d343f6cccd2807a251ccc6f9b64a
21-03-2024

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/18111 Date: 2024-03-04 20:21:30 +0000
04-03-2024

This bug will be resolved by JDK-8325179 which removes the cancelRunnables method as well as removes the synchronized modifier from DoChangeContents.run.
04-03-2024

https://github.com/openjdk/jdk/pull/17462#issuecomment-1910079621 Does run need to be synchronised? Probably not, only reading the value of doFire needs to be. The following changeset should be enough to avoid the deadlock: - public synchronized void run() { - if (fetchID.get() == fid && doFire) { + public void run() { + synchronized (this) { + if (fetchID.get() != fid || !doFire) { + return; } Then the body of the `if` is run outside `synchronized (this)` block.
25-01-2024

Synchronization on "this" in DoChangeContents does not make any sense, the "run" method must be executed on one thread(EDT) only via SwingUtilities.invokeLater(), and "cancel" flag could be marked as volatile.
31-01-2020