Bug ID: JDK-6649594 Intermittent IOExceptions during dynamic attach on linux and solaris

Details
Type:
Bug
Submit Date:
2008-01-11
Status:
Closed
Updated Date:
2012-10-22
Project Name:
JDK
Resolved Date:
2011-03-08
Component:
hotspot
OS:
linux,generic
Sub-Component:
svc
CPU:
generic
Priority:
P3
Resolution:
Fixed
Affected Versions:
6u21,7
Fixed Versions:
hs19 (b06)


Description
While running tests that use dynamic attach (the com.sun.tools.attach API), I found that on Linux an attempt to attach intermittently fails with one of two kinds of exception:

java.io.IOException: well-known file is not secure
        at sun.tools.attach.LinuxVirtualMachine.checkPermissions(Native Method)
        at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:111)
        at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:63)
        at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:213)

java.io.IOException: Connection refused
        at sun.tools.attach.LinuxVirtualMachine.connect(Native Method)
        at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:118)
        at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:63)
        at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:213)

To reproduce, extract the attached archive and run the script test_attach.sh, passing the JDK under test as the first parameter. The script runs a simple dynamic-attach test in a loop. The test fails very rarely (about 1 run in 500), but because many of our tests use dynamic attach, the chance of hitting such a failure is fairly high.
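
For test code that has to tolerate this failure until a fix is available, one possible mitigation is to retry the attach when an IOException is thrown. The following is only a sketch under stated assumptions: the class AttachRetry, the method retryOnIOException and the retry parameters are hypothetical names and are not part of the attached test or of the attach API. It retries any action that throws IOException, so it masks the race rather than fixing it.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class AttachRetry {
    /**
     * Runs the action and retries it only when it throws IOException,
     * up to maxAttempts times, sleeping delayMillis between attempts.
     * Hypothetical helper, not part of com.sun.tools.attach.
     */
    public static <T> T retryOnIOException(Callable<T> action, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (IOException e) {
                last = e; // remember the failure and try again
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last; // every attempt failed with IOException
    }

    /** Self-check with a fake action that fails twice and then succeeds. */
    public static int demo() {
        final int[] calls = {0};
        try {
            String result = retryOnIOException(() -> {
                if (++calls[0] < 3) {
                    throw new IOException("Connection refused");
                }
                return "attached";
            }, 5, 1L);
            return "attached".equals(result) ? calls[0] : -1;
        } catch (Exception e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("attempts needed: " + demo());
    }
}
```

With the attach API on the class path, the same helper could wrap the real call, for example retryOnIOException(() -> VirtualMachine.attach(pid), 5, 100), where pid is the target process id as a String.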

The failure reproduces, for example, on vmsqe-p4-08.russia (SuSe 10) and vmsqe-xeon-02.russia (RedHat 4).
This failure mode has been seen in nightly testing. Here is the
entry from my analysis report:

New MM_REGRESSION failures (from 2008.02.28)
*   sun/management/jmxremote/bootstrap/LocalManagementTest.sh
        This test failed due to "IOException: well-known file is not
        secure" on Linux IA32 Server VM (machine jtg-linux17).

        Update: This failure might be related to the following:
                6649594 3/4 Intermittent IOExceptions during dynamic
                            attach on linux

            I'm checking with Alan.
This failure mode has been seen in nightly testing. Here is the
entry from my analysis report:

New nsk.jvmti failures (from 2008.09.18)
    nsk/jvmti/AttachOnDemand/attach039
        This test failed due to "IOException: well-known file is not
        secure" on Linux IA32 Server VM (machine tq-win2k-exe).
Another sighting from nightly testing:

New nsk.jvmti failures (from 2008.09.26)
    nsk/jvmti/AttachOnDemand/attach015
        This test failed due to "IOException: well-known file is not
        secure" on Linux AMD64 Server VM (machine wowamd). This is
        an occurrence of the following bug:

            6649594 3/4 Intermittent IOExceptions during dynamic attach on linux

        I will copy this entry to 6649594.
Another sighting from nightly testing:

New nsk.jvmti failures (from 2008.09.30)
*   nsk/jvmti/AttachOnDemand/attach045
        This test failed due to "IOException: well-known file is not
        secure" on Linux IA32 Client VM (machine wowamd). This is an
        occurrence of the following bug:

            6649594 3/4 Intermittent IOExceptions during dynamic attach on linux
        I will copy this entry to 6649594.
We also regularly see intermittent failures in attach-on-demand tests with the following exception:
java.io.IOException: Bad file number
        at sun.tools.attach.SolarisVirtualMachine.enqueue(Native Method)
        at sun.tools.attach.SolarisVirtualMachine.execute(SolarisVirtualMachine.java:107)
        at sun.tools.attach.HotSpotVirtualMachine.loadAgentLibrary(HotSpotVirtualMachine.java:40)
        at sun.tools.attach.HotSpotVirtualMachine.loadAgentLibrary(HotSpotVirtualMachine.java:61)
        at sun.tools.attach.HotSpotVirtualMachine.loadAgent(HotSpotVirtualMachine.java:85)
        at nsk.share.aod.AgentsAttacher.tryToLoadAgent(AgentsAttacher.java:68)
        at nsk.share.aod.AgentsAttacher.attachAgents(AgentsAttacher.java:48)
        at nsk.share.aod.AODTestRunner.doTestActions(AODTestRunner.java:60)
        at nsk.share.aod.AODTestRunner.runTest(AODTestRunner.java:112)
        at nsk.share.aod.AODTestRunner.main(AODTestRunner.java:144)

I tried running the attach-on-demand tests with -XX:+StartAttachListener; with this flag the tests don't fail.
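
As an illustration of that workaround, a test harness could launch the target VM with the flag already set. This is a minimal sketch: the class name StartWithAttachListener, the method buildCommand and the main class TargetApp are hypothetical placeholders, and only the -XX:+StartAttachListener flag itself comes from this report.

```java
import java.util.ArrayList;
import java.util.List;

public class StartWithAttachListener {
    /**
     * Builds a command line that starts the target VM with the attach
     * listener enabled at startup rather than on first attach.
     */
    public static List<String> buildCommand(String javaHome, String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaHome + "/bin/java");       // Linux/Solaris path layout assumed
        cmd.add("-XX:+StartAttachListener");   // flag named in this report
        cmd.add(mainClass);                    // hypothetical target application
        return cmd;
    }

    public static void main(String[] args) {
        String mainClass = args.length > 0 ? args[0] : "TargetApp";
        List<String> cmd = buildCommand(System.getProperty("java.home"), mainClass);
        System.out.println(String.join(" ", cmd));
        // To actually launch the target VM:
        // new ProcessBuilder(cmd).inheritIO().start();
    }
}
```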

It looks like this bug also affects JDI tests that use the com.sun.jdi.ProcessAttach connector:
nsk/jdi/AttachingConnector/attach/attach004
nsk/jdi/AttachingConnector/attachnosuspend/attachnosuspend002

Examples of test failures:
http://vmsqe.russia.sun.com/execution/results/JDK7/PROMOTION/VM/b39/ConcMarkSweepGC_2/vm/solaris-i586/server/comp/solaris-i586_server_comp_nsk.jdi.testlist/ResultDir/attachnosuspend002/attachnosuspend002.log
http://vmsqe.russia.sun.com/execution/results/JDK7/PROMOTION/VM/b39/ConcMarkSweepGC_2/vm/solaris-sparcv9/server/mixed/solaris-sparcv9_server_mixed_nsk.jdi.testlist/ResultDir/attach004/attach004.log
Adding a number of entries from my nightly analysis report:

New nsk.jvmti failures (from 2009.01.30)
    nsk/jvmti/AttachOnDemand/attach003
        This test failed due to "ERROR: Unexpected IOException during
        VirtualMachine.attach: java.io.IOException: Permission denied"
        on Solaris AMD64 Server VM (machine vm-v20z-29).

New nsk.jvmti failures (from 2009.01.23)
    nsk/jvmti/AttachOnDemand/attach013
        This test failed due to "IOException: well-known file is not
        secure" on Linux IA32 Server VM -Xmixed (machine jtg-linux17).

New nsk.jvmti failures (from 2009.01.05)
    nsk/jvmti/AttachOnDemand/attach037
        This test failed due to "ERROR: Unexpected IOException during
        VirtualMachine.attach: java.io.IOException: Connection refused"
        on Linux AMD64 Server VM -Xmixed (machine vm-v20z-3).

New nsk.jvmti failures (from 2008.12.12)
    nsk/jvmti/AttachOnDemand/attach020
        This test failed due to "IOException: well-known file is not
        secure" on Linux AMD64 Server VM (machine vmnightly6).

New nsk.jvmti failures (from 2008.11.28)
    nsk/jvmti/AttachOnDemand/attach042
        This test failed due to "ERROR: Unexpected IOException during
        VirtualMachine.attach: java.io.IOException: Permission denied"
        on Solaris SPARC Server VM (machine vm-v215-02).

New nsk.jvmti failures (from 2008.11.26)
    nsk/jvmti/AttachOnDemand/attach010
        This test failed due to "ERROR: Unexpected IOException during
        VirtualMachine.attach: java.io.IOException: Connection refused"
        on Linux IA32 Client VM (machine inclusion).

New nsk.jvmti failures (from 2008.10.30)
    nsk/jvmti/AttachOnDemand/attach014
        This test failed due to "IOException: well-known file is not
        secure" on Linux AMD64 Server VM (machine wowamd).

        Update: In the 2009.01.31 nightly, the test failed due to
            "ERROR: Unexpected IOException during VirtualMachine.attach:
            java.io.IOException: Connection refused" on Linux AMD64
            Server VM -Xcomp (machine intelsdv17).

New nsk.jvmti failures (from 2008.10.01)
    nsk/jvmti/AttachOnDemand/attach012
        This test failed due to "ERROR: Unexpected IOException during
        VirtualMachine.attach: java.io.IOException: Connection refused"
        on Linux IA32 Server VM (machine intelsdv17).

                                    

Comments
EVALUATION

This is a timing bug in the initialization of the attach mechanism. The client can observe the well-known file before the socket file is completely initialized. This can be fixed by renaming the socket file to its correct name only once the listener is started. It can be worked around in the test environment by starting the VM with the -XX:+StartAttachListener option.
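
The "initialize first, rename last" idea can be sketched in Java with an ordinary file. This is only an analogy under stated assumptions: the real attach listener lives in HotSpot native code and uses a Unix domain socket file, while the class AtomicPublish, the method publish and the file names below are hypothetical. The point it shows is that a client looking for the well-known name never sees a half-initialized file, because the name appears only through an atomic rename.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

public class AtomicPublish {
    /**
     * Writes the contents under a temporary name in the same directory,
     * then atomically renames the file to the well-known name.
     */
    public static void publish(Path wellKnown, byte[] contents) throws IOException {
        Path tmp = wellKnown.resolveSibling(wellKnown.getFileName() + ".tmp");
        Files.write(tmp, contents);                                 // fully initialize first
        Files.move(tmp, wellKnown, StandardCopyOption.ATOMIC_MOVE); // publish last
    }

    /** Self-check: after publish, only the complete well-known file exists. */
    public static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("attach-demo");
            Path wellKnown = dir.resolve("listener.ready");
            byte[] data = "ready".getBytes(StandardCharsets.UTF_8);
            publish(wellKnown, data);
            boolean ok = Files.exists(wellKnown)
                    && !Files.exists(dir.resolve("listener.ready.tmp"))
                    && Arrays.equals(data, Files.readAllBytes(wellKnown));
            Files.delete(wellKnown);
            Files.delete(dir);
            return ok;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("published atomically: " + demo());
    }
}
```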
                                     
2008-01-11
EVALUATION

As pointed out by Nicolay, the proposed workaround has a problem; see 6786948.
So it would be good to fix the underlying problem.
                                     
2009-01-15
EVALUATION

http://hg.openjdk.java.net/jdk7/hotspot-rt/hotspot/rev/a81afd9c293c
                                     
2010-07-16
EVALUATION

http://hg.openjdk.java.net/jdk7/hotspot/hotspot/rev/a81afd9c293c
                                     
2010-08-02


