JDK-8237775 : Core file generation will silently fail if /cores directory has wrong permissions set
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 15
  • Priority: P4
  • Status: Closed
  • Resolution: Won't Fix
  • Submitted: 2020-01-23
  • Updated: 2022-01-04
  • Resolved: 2022-01-04
Description
After reinstalling OS X, my system had these permissions set on the "/cores" folder:

drwxr-xr-x    2 root  wheel    64 Aug 23 18:01 cores

With these permissions, the abort() call in our crash handler fails to generate the core file, so the hs_err_pid crash log file ends up lying:

# Core dump will be written. Default location: /cores/core.64980
...
Dumping core ...

because the core file is NOT written.

The solution is for the user to do:

sudo chmod a+w /cores

but it would be nice if hs_err_pid would say something about it.

For example it already says this:

# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again

if the core size limit (ulimit) is not set, so it should say something similar when the permissions on the "/cores" folder prevent core files from being written.
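
To make the request concrete, here is a minimal sketch of how the header line could be selected. It is not HotSpot code: the helper name core_dump_header and the wording of the "not writable" message are hypothetical; only the other two messages are quoted from the hs_err output above.

```cpp
#include <string>

// Hypothetical helper (not a HotSpot function): picks the hs_err header line.
//   limit_ok      the core size limit (ulimit -c) allows a dump
//   dir_writable  the core directory was found to be writable
std::string core_dump_header(bool limit_ok, bool dir_writable,
                             const std::string& core_path) {
    if (!limit_ok) {
        // Wording quoted from the existing hs_err message.
        return "# No core dump will be written. Core dumps have been disabled. "
               "To enable core dumping, try \"ulimit -c unlimited\" "
               "before starting Java again";
    }
    if (!dir_writable) {
        // Derive the directory from the expected core file path.
        std::string::size_type slash = core_path.find_last_of('/');
        std::string dir = (slash == std::string::npos) ? "."
                        : (slash == 0)                 ? "/"
                        : core_path.substr(0, slash);
        // Hypothetical wording for the proposed warning.
        return "# No core dump will be written. Directory " + dir +
               " is not writable. To enable core dumping, try \"sudo chmod a+w " +
               dir + "\" before starting Java again";
    }
    return "# Core dump will be written. Default location: " + core_path;
}
```

With limit_ok and dir_writable both true, the function returns the line already shown above; the second branch is the only new behaviour being asked for.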
Comments
RT Triage: We agree with David's assessment that HotSpot should not attempt to write core files in an alternative location. The hs_err log already contains a message about the inability to write a core file if this happens. Closing as will not fix.
04-01-2022

The conflicting feedback on requirements calls for reconsideration. At this point I believe that:
- the fix should only apply to macOS
- any lookup or file operations should be kept out of the handler (but not in startup either?)
- the crash handler should print a warning only if needed, and the warning should be short and not modify the existing message too much (what is too much here?)
I would still personally like to see something done here. Retargeting to after jdk15.
27-05-2020

For testing: jdk/bin/java -XX:ErrorHandlerTest=12 --version
18-05-2020

SA has this check in its tests:

    if (Platform.isOSX()) {
        File coresDir = new File("/cores");
        if (!coresDir.isDirectory()) {
            cleanup();
            throw new Error(coresDir + " is not a directory");
        }
        // the /cores directory is usually not writable on macOS 10.15
        if (!coresDir.canWrite()) {
            cleanup();
            throw new SkippedException("Directory \"" + coresDir + "\" is not writable");
        }
    }

It checks whether the "/cores" folder is writable and skips the test if it is not.
12-05-2020

I do not think we should be doing anything to mess with the user environment. Applications get deployed in environments where sys admins have taken great pains to try and control things like core file placement, and the JVM should not try to work around those constraints. If we can't generate a core file then we tell the user. If they really want a core file then they need to adjust their environment accordingly.
08-05-2020

We could redirect "core" files, using sysctl, to the same folder as the one where hs_err crash log files go. That would make collecting and preserving the info simpler for the user and any test harness environments ... (?) Also, doing it that way would render this issue moot. And if we could do it reliably on all the platforms, it would render any issues related to where the "core" files go, moot as well.
07-05-2020

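Note: to show the core file pattern and location, we can use "sysctl":

# sysctl -n kern.corefile
/cores/core.%P

but that can be changed. On Linux, for example, the equivalent setting (kernel.core_pattern rather than the macOS kern.corefile) is changed with:

# sysctl -w kernel.core_pattern='/tmp/core_%e.%p'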
07-05-2020
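
The same lookup can be done programmatically. The following is a rough sketch, not something HotSpot is known to do: the function names core_pattern and core_dir_from_pattern are made up for illustration, the macOS branch assumes sysctlbyname with the "kern.corefile" key shown above, and the other branch assumes Linux and reads /proc/sys/kernel/core_pattern.

```cpp
#include <fstream>
#include <string>
#ifdef __APPLE__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Hypothetical helper: returns the kernel's core file pattern,
// or an empty string if it cannot be read.
std::string core_pattern() {
#ifdef __APPLE__
    char buf[1024] = {0};
    size_t len = sizeof(buf) - 1;  // keep the final NUL in place
    if (sysctlbyname("kern.corefile", buf, &len, nullptr, 0) != 0) return "";
    return std::string(buf);
#else
    // Assumes Linux: same value as `sysctl -n kernel.core_pattern`.
    std::ifstream in("/proc/sys/kernel/core_pattern");
    std::string line;
    std::getline(in, line);
    return line;
#endif
}

// Hypothetical helper: directory part of an absolute pattern.
// Returns "" when the pattern is not absolute (a relative pattern resolves
// against the crashing process's working directory, and a Linux pattern
// starting with '|' is piped to a program and has no directory at all).
std::string core_dir_from_pattern(const std::string& pattern) {
    if (pattern.empty() || pattern[0] != '/') return "";
    std::string::size_type slash = pattern.find_last_of('/');
    return slash == 0 ? "/" : pattern.substr(0, slash);
}
```

For the macOS default above, core_dir_from_pattern("/cores/core.%P") yields "/cores", which is the directory whose permissions matter for this issue.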

If we can't do complex tasks during the crash handling and we don't want to handle them during startup (because they may introduce a delay) could we handle such tasks during "available" time some time after startup?
07-05-2020

Some links: http://www.osxbook.com/book/bonus/chapter8/core/ https://developer.apple.com/library/archive/technotes/tn2124/_index.html
07-05-2020

An unset "ulimit" or wrong permissions on the "/cores" folder will equally result in no core file being written, so, in my opinion, they should be treated equally by telling the user what to do to generate the desired core file. Maybe we can look up the permissions and cache them at startup? I can't imagine this would be expensive. I agree, though, that doing anything nontrivial in our signal handler for this should not be entertained.
24-01-2020

I didn't say we should remove those checks we already have, just that we shouldn't go to unreasonable lengths to try and cover all possibilities. This all executes from a signal handler potentially and should be avoiding things that are not async-signal-safe, but instead we keep adding more and more general code execution to the hs_err processing. In this case how can you determine the necessary permissions accurately, other than by trying to write to the location? Checking the permission bits of the directory is only an approximation without going into detailed user and group id checks.
24-01-2020
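
The gap between "checking the permission bits" and "trying to write" can be illustrated with a short sketch, assuming a POSIX system. The function names and the probe file name are made up for this example.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <string>

// Approximation: is any write bit set in the directory's mode?
// This ignores who owns the directory and who is asking.
bool mode_bits_allow_write(const std::string& dir) {
    struct stat st;
    if (stat(dir.c_str(), &st) != 0 || !S_ISDIR(st.st_mode)) return false;
    return (st.st_mode & (S_IWUSR | S_IWGRP | S_IWOTH)) != 0;
}

// Probe: actually try to create (and then remove) a file in the directory.
bool can_create_file_in(const std::string& dir) {
    std::string probe = dir + "/.core_probe." + std::to_string(getpid());
    int fd = open(probe.c_str(), O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd < 0) return false;
    close(fd);
    unlink(probe.c_str());
    return true;
}
```

For the directory from the description (drwxr-xr-x, owned by root), the first function returns true because the owner write bit is set, while the second fails for any non-root user. The probe has its own cost: it performs real file operations and briefly leaves a file in the user's directory.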

We already have code in "os::check_dump_limit()" to check the ulimit, so if we were to go down that route, should we remove that code as well? I'd rather our code was smart enough to check the requirements and offer the user specific ways to solve the problem, which is doable for both the "ulimit" settings and the "/cores" folder permissions.
24-01-2020
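
For reference, a core size limit check of the kind mentioned above can be sketched with getrlimit(2). This is an illustration under the assumption of a POSIX system, not the actual body of os::check_dump_limit(); the CoreLimit enum and query_core_limit are hypothetical names.

```cpp
#include <sys/resource.h>

enum class CoreLimit { Disabled, Limited, Unlimited, Unknown };

// Classifies the soft RLIMIT_CORE limit, i.e. what `ulimit -c` reports.
CoreLimit query_core_limit() {
    struct rlimit rl;
    if (getrlimit(RLIMIT_CORE, &rl) != 0) return CoreLimit::Unknown;
    if (rl.rlim_cur == 0) return CoreLimit::Disabled;              // no core file at all
    if (rl.rlim_cur == RLIM_INFINITY) return CoreLimit::Unlimited;
    return CoreLimit::Limited;                                     // core may be truncated
}
```

The limit is a single value owned by the process, so this check is cheap and exact; whether the core directory is writable depends on state outside the process, which is where the disagreement in this thread comes from.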

I'd prefer to write a looser error message, such as "a core dump should have been written ...", than jump through hoops trying to figure out all the ways in which we can fail to write a core file.
23-01-2020