JDK-6649594 : Intermittent IOExceptions during dynamic attach on linux and solaris
  • Type: Bug
  • Component: hotspot
  • Sub-Component: svc
  • Affected Version: 6u21,7
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic,linux
  • CPU: generic
  • Submitted: 2008-01-11
  • Updated: 2014-07-15
  • Resolved: 2011-03-08
  • Fixed Versions: 6u21p (JDK 6), 7 (JDK 7), hs19 (Other)
Description
While running tests that use dynamic attach (the com.sun.tools.attach API), I found that on Linux an attempt to attach intermittently fails with one of two kinds of exception:

java.io.IOException: well-known file is not secure
        at sun.tools.attach.LinuxVirtualMachine.checkPermissions(Native Method)
        at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:111)
        at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:63)
        at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:213)

java.io.IOException: Connection refused
        at sun.tools.attach.LinuxVirtualMachine.connect(Native Method)
        at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:118)
        at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:63)
        at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:213)

To reproduce, extract the attached archive and run the script test_attach.sh; the first script parameter is the JDK under test. The script runs a simple dynamic-attach test in a loop. The test fails very intermittently (about 1 run in 500), but because we have many tests that use dynamic attach, the chance of hitting such a failure is fairly high.
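The loop described above can be sketched roughly as follows. This is not the attached test_attach.sh, and the class and interface names (AttachLoop, Attempt) are made up for illustration. A real attempt would presumably start a fresh target VM and call com.sun.tools.attach.VirtualMachine.attach(pid) followed by detach(); a stand-in attempt is used here so the sketch runs without a target process.

```java
import java.io.IOException;

/** Hypothetical harness; names are illustrative and not taken from the attached archive. */
final class AttachLoop {
    /** One attach attempt. A real caller would wrap VirtualMachine.attach(pid) and detach(). */
    interface Attempt {
        void run() throws IOException;
    }

    /** Runs the attempt the given number of times and returns how many runs threw IOException. */
    static int countFailures(Attempt attempt, int iterations) {
        int failures = 0;
        for (int i = 0; i < iterations; i++) {
            try {
                attempt.run();
            } catch (IOException e) {
                failures++;
                System.err.println("iteration " + i + ": " + e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Stand-in attempt that fails on every 500th call (assumed rate, for demonstration only).
        int[] calls = {0};
        Attempt fake = () -> {
            if (++calls[0] % 500 == 0) {
                throw new IOException("well-known file is not secure");
            }
        };
        System.out.println("failures: " + countFailures(fake, 1000));
    }
}
```

Counting failures instead of stopping at the first one makes it easier to estimate the failure rate over a long run.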

The failure reproduces, for example, on vmsqe-p4-08.russia (SuSE 10) and vmsqe-xeon-02.russia (RedHat 4).
This failure mode has been seen in nightly testing. Here is the
entry from my analysis report:

New MM_REGRESSION failures (from 2008.02.28)
*   sun/management/jmxremote/bootstrap/LocalManagementTest.sh
        This test failed due to "IOException: well-known file is not
        secure" on Linux IA32 Server VM (machine jtg-linux17).

        Update: This failure might be related to the following:
                6649594 3/4 Intermittent IOExceptions during dynamic
                            attach on linux

            I'm checking with Alan.
This failure mode has been seen in nightly testing. Here is the
entry from my analysis report:

New nsk.jvmti failures (from 2008.09.18)
    nsk/jvmti/AttachOnDemand/attach039
        This test failed due to "IOException: well-known file is not
        secure" on Linux IA32 Server VM (machine tq-win2k-exe).
Another sighting from nightly testing:

New nsk.jvmti failures (from 2008.09.26)
    nsk/jvmti/AttachOnDemand/attach015
        This test failed due to "IOException: well-known file is not
        secure" on Linux AMD64 Server VM (machine wowamd). This is
        an occurrence of the following bug:

            6649594 3/4 Intermittent IOExceptions during dynamic attach on linux

        I will copy this entry to 6649594.
Another sighting from nightly testing:

New nsk.jvmti failures (from 2008.09.30)
*   nsk/jvmti/AttachOnDemand/attach045
        This test failed due to "IOException: well-known file is not
        secure" on Linux IA32 Client VM (machine wowamd). This is an
        occurrence of the following bug:

            6649594 3/4 Intermittent IOExceptions during dynamic attach on linux
        I will copy this entry to 6649594.
We also regularly see intermittent failures in attach-on-demand tests with the following exception:
java.io.IOException: Bad file number
        at sun.tools.attach.SolarisVirtualMachine.enqueue(Native Method)
        at sun.tools.attach.SolarisVirtualMachine.execute(SolarisVirtualMachine.java:107)
        at sun.tools.attach.HotSpotVirtualMachine.loadAgentLibrary(HotSpotVirtualMachine.java:40)
        at sun.tools.attach.HotSpotVirtualMachine.loadAgentLibrary(HotSpotVirtualMachine.java:61)
        at sun.tools.attach.HotSpotVirtualMachine.loadAgent(HotSpotVirtualMachine.java:85)
        at nsk.share.aod.AgentsAttacher.tryToLoadAgent(AgentsAttacher.java:68)
        at nsk.share.aod.AgentsAttacher.attachAgents(AgentsAttacher.java:48)
        at nsk.share.aod.AODTestRunner.doTestActions(AODTestRunner.java:60)
        at nsk.share.aod.AODTestRunner.runTest(AODTestRunner.java:112)
        at nsk.share.aod.AODTestRunner.main(AODTestRunner.java:144)

I tried running the attach-on-demand tests with -XX:+StartAttachListener; with this flag the tests do not fail.
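For a test harness that launches the target VM itself, passing the flag might look like the sketch below. Only the -XX:+StartAttachListener option comes from this report; the class name TargetLauncher and the main class "TargetApp" are placeholders.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; only the -XX:+StartAttachListener flag comes from the report. */
final class TargetLauncher {
    /** Builds a command line for a target VM that starts its attach listener at VM startup. */
    static List<String> command(String javaHome, String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaHome + File.separator + "bin" + File.separator + "java");
        cmd.add("-XX:+StartAttachListener");
        cmd.add(mainClass);
        return cmd;
    }

    public static void main(String[] args) {
        // "TargetApp" is a placeholder main class.
        List<String> cmd = command(System.getProperty("java.home"), "TargetApp");
        System.out.println(String.join(" ", cmd));
        // To actually launch the target: new ProcessBuilder(cmd).inheritIO().start();
    }
}
```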

It looks like this bug also affects JDI tests that use the com.sun.jdi.ProcessAttach connector:
nsk/jdi/AttachingConnector/attach/attach004
nsk/jdi/AttachingConnector/attachnosuspend/attachnosuspend002

Examples of test failures:
http://vmsqe.russia.sun.com/execution/results/JDK7/PROMOTION/VM/b39/ConcMarkSweepGC_2/vm/solaris-i586/server/comp/solaris-i586_server_comp_nsk.jdi.testlist/ResultDir/attachnosuspend002/attachnosuspend002.log
http://vmsqe.russia.sun.com/execution/results/JDK7/PROMOTION/VM/b39/ConcMarkSweepGC_2/vm/solaris-sparcv9/server/mixed/solaris-sparcv9_server_mixed_nsk.jdi.testlist/ResultDir/attach004/attach004.log
Adding a number of entries from my nightly analysis report:

New nsk.jvmti failures (from 2009.01.30)
    nsk/jvmti/AttachOnDemand/attach003
        This test failed due to "ERROR: Unexpected IOException during
        VirtualMachine.attach: java.io.IOException: Permission denied"
        on Solaris AMD64 Server VM (machine vm-v20z-29).

New nsk.jvmti failures (from 2009.01.23)
    nsk/jvmti/AttachOnDemand/attach013
        This test failed due to "IOException: well-known file is not
        secure" on Linux IA32 Server VM -Xmixed (machine jtg-linux17).

New nsk.jvmti failures (from 2009.01.05)
    nsk/jvmti/AttachOnDemand/attach037
        This test failed due to "ERROR: Unexpected IOException during
        VirtualMachine.attach: java.io.IOException: Connection refused"
        on Linux AMD64 Server VM -Xmixed (machine vm-v20z-3).

New nsk.jvmti failures (from 2008.12.12)
    nsk/jvmti/AttachOnDemand/attach020
        This test failed due to "IOException: well-known file is not
        secure" on Linux AMD64 Server VM (machine vmnightly6).

New nsk.jvmti failures (from 2008.11.28)
    nsk/jvmti/AttachOnDemand/attach042
        This test failed due to "ERROR: Unexpected IOException during
        VirtualMachine.attach: java.io.IOException: Permission denied"
        on Solaris SPARC Server VM (machine vm-v215-02).

New nsk.jvmti failures (from 2008.11.26)
    nsk/jvmti/AttachOnDemand/attach010
        This test failed due to "ERROR: Unexpected IOException during
        VirtualMachine.attach: java.io.IOException: Connection refused"
        on Linux IA32 Client VM (machine inclusion).

New nsk.jvmti failures (from 2008.10.30)
    nsk/jvmti/AttachOnDemand/attach014
        This test failed due to "IOException: well-known file is not
        secure" on Linux AMD64 Server VM (machine wowamd).

        Update: In the 2009.01.31 nightly, the test failed due to
            "ERROR: Unexpected IOException during VirtualMachine.attach:
            java.io.IOException: Connection refused" on Linux AMD64
            Server VM -Xcomp (machine intelsdv17).

New nsk.jvmti failures (from 2008.10.01)
    nsk/jvmti/AttachOnDemand/attach012
        This test failed due to "ERROR: Unexpected IOException during
        VirtualMachine.attach: java.io.IOException: Connection refused"
        on Linux IA32 Server VM (machine intelsdv17).

Comments
EVALUATION http://hg.openjdk.java.net/jdk7/hotspot/hotspot/rev/a81afd9c293c
02-08-2010

EVALUATION http://hg.openjdk.java.net/jdk7/hotspot-rt/hotspot/rev/a81afd9c293c
16-07-2010

EVALUATION As Nicolay pointed out, the proposed workaround has a problem (see 6786948), so it would be good to fix the underlying bug.
15-01-2009

EVALUATION This is a timing bug in the initialization of the attach mechanism. The client can observe the well-known file before the socket file is completely initialized. This can be fixed by creating the file under a different name and renaming it to the correct name once the listener is started. It can be worked around in the test environment by starting the VM with the -XX:+StartAttachListener option.
11-01-2008
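
The rename approach from the 11-01-2008 evaluation can be illustrated with a small sketch. This is not the HotSpot change itself (that is native code in the changeset linked above), only the general "prepare under a temporary name, then rename" pattern. The class name, the ".tmp" suffix, and the file name "listener.sock" are assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Illustrative only; not the HotSpot implementation. */
final class PublishByRename {
    /**
     * Prepares the file under a temporary name in the same directory, then renames it
     * to the well-known name, so an observer never sees a half-initialized file.
     */
    static Path publish(Path dir, String wellKnownName, byte[] contents) {
        Path tmp = dir.resolve(wellKnownName + ".tmp"); // assumed temporary suffix
        Path target = dir.resolve(wellKnownName);
        try {
            // Stand-in for the real initialization (binding the socket and starting to listen).
            Files.write(tmp, contents);
            // Requests an atomic rename; Files.move throws if the file system cannot do one.
            return Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("attach-demo");
        Path published = publish(dir, "listener.sock", new byte[] {1});
        System.out.println("published: " + Files.exists(published));
    }
}
```

Keeping the temporary file in the same directory as the final name means the rename stays within one file system, which an atomic move generally requires.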