JDK-8148425 : strerror() function is not thread-safe
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 9
  • Priority: P4
  • Status: Closed
  • Resolution: Fixed
  • OS: linux
  • Submitted: 2016-01-28
  • Updated: 2018-12-06
  • Resolved: 2016-03-16
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 10 JDK 9
10Fixed 9Fixed
Related Reports
Relates :  
Relates :  
Relates :  
Relates :  
Description
In many places in the VM code if we detect an error from a library call we use strerror(errno) to produce a human-readable description of the error. Unfortunately strerror may not be thread-safe, particularly when the content is localized, due to use of a static buffer:

DESCRIPTION
       The  strerror()  function  returns  a  pointer  to  a string that describes the error code passed in the argument errnum, possibly using the LC_MESSAGES part of the current locale  to  select  the  appropriate  language.   This  string  must  not  be  modified by the application, but may be modified by a subsequent call to perror(3) or strerror().  No library function will modify this string.

       The strerror_r() function is similar to strerror(), but is thread safe.  This function is available in  two  versions:  an XSI-compliant version specified in POSIX.1-2001 (available since glibc 2.3.4), and a GNU-specific version (available since glibc 2.0). 

----

Switching to strerror_r seems reasonable, but some usage contexts make it difficult to define a local buffer - in particular unified-logging expressions where we don't want to have to define the buffer if logging is not enabled. Simple wrapping of the call also doesn't work as we can't use the buffer after the call returns.
Comments
AAARRGGHH! I never noticed the Fix Version was set for 10 now we have a backport created for this. I'll have to manually close this.
16-03-2016

Fixed - See "backport".
16-03-2016

I think stringified versions of EINVAL etc would be fine.
25-02-2016

Colleen, you are right, this could be made more pretty. I also am curious whether other programmers would prefer the full strerror() text? I always assume that every programmer rather reads "EINVAL" instead of (potentially localized) "File not found", but this may be my personal taste.
24-02-2016

I wonder if you can do it as an X macro instead though because this is really pretty gross.
23-02-2016

Both strerror and strerror_s print localized strings (which is a source for potential thread unsafety with strerror()). Do we really need this localization? It brings its own problems, apart from the added complexity to strerror: - It makes logs unreadable, because you get snippets in foreign languages. - Even worse, if locales are not setup right, you may not see anything (e.g. "????" for japanese LC_MESSAGES). See e.g. https://sourceware.org/bugzilla/show_bug.cgi?id=15940 Lets compare this with a simple solution where we print the errno define as string, e.g. "EINVAL" for EINVAL. This would be easier because instead of lengthy, potentially unreadable error strings you need to mentally translare back to the errno value - because, after all, this is what you really want - you get those errno values right away. Finally, literal errno values like "EINVAL" are also easier to grep for, when grepping log files. With that in mind, we could roll our own strerror, similar to e.g. os::Posix::get_signal_name(): const char* os::errno_name(int e) { switch e case EINVAL: return "EINVAL"; ... } This would be threadsafe, simple and fast, and very easy to use for logging. For an example of how this could be implemented see my first webrev for 8149036 here: http://cr.openjdk.java.net/~stuefe/webrevs/8149036-add-tracing-for-thread-events/webrev.00/webrev/src/share/vm/runtime/os.cpp.udiff.html
23-02-2016

I do not think that os::lasterror() does any good. It has a number of drawbacks: - It samples errno and, in case of windows, GetLastError when being called. This does not fit in cases where errno retrieval and printing are apart, e.g. in this construct: handle h = shmget(); void* p = shmat(h); if (!p) { int e = errno; // before doing anything else, lets first remove shm shmctl(RMID); // now lets report errno, which may invoke complicated code sequences (e.g. logging) log(e); } In fact, I think it is a good idea to make the programmer pass in errno himself to whatever printing function is used, to make bugs less likely where errno is overwritten by follow-up API calls. - It also is not a good fit for cases where errno is returned by the function as return value, not stored into errno. For example, "thr_create" on solaris. - Windows: mixing GetLastError (plus the expensive FormatMessage) in with strerror() on Windows does not make sense and is really ugly. It is unnecessary: when I call an API, I know the error semantics (or I should know them) - call a win32 API, use GetLastError, call a Posix API, use errno. The API treats GetLastError with a preference, but nothing in the specs says that I cannot have a GetLastError() value and an errno - both != 0 - which are unrelated to each other. I may have called a POSIX api and really only want the errno printed, but I get a GetLastError() which hang around from my last win32 call eons ago. - Finally, it returns the size of the characters written - for easy use in logging, it would make more sense to return the pointer to the user buffer, to be able to use it like this: log("error: %s", os::lasterror(mybuf, sizeof(mybuf), errno); os::lasterror() looks like something introduced especially for windows (all other platforms may just as well use strerror_s) but is a bad idea there, imho.
23-02-2016

Yes this is the same issue as reported in JDK-6223913 - thanks for finding that. As I noted above the main area of concern with thread-safety is when localization is used; otherwise we do expect one of the pre-canned constant error strings to be returned. This is one of the factors that makes this a lower priority - the other main factor is the potential rarity of two threads calling strerror at the same time. The main concern with moving to strerror_r, as noted in both referenced bugs, is the fact there are two different versions and historically it seems the GNU version has been unreliable. Well hopefully by now that situation has improved. And yes in shared code we would certainly want to use something completely portable so a call to os::lasterror (with it calling strerror_r) would be the way to go. Interesting to see that despite the writeup in JDK-6596471 the JDK code in getLastErrorString actually uses the Xopen variant: #if defined(LINUX) && (defined(_GNU_SOURCE) || \ (defined(_POSIX_C_SOURCE) && _POSIX_C_SOURCE < 200112L \ && defined(_XOPEN_SOURCE) && _XOPEN_SOURCE < 600)) extern int __xpg_strerror_r(int, char *, size_t); #define strerror_r(a, b, c) __xpg_strerror_r((a), (b), (c)) #endif Another simple possibility, given the usage context, might be to introduce use of a mutex around the call to strerror in os::lasterror.
29-01-2016

I think this is the same as JDK-6223913, which was ultimately closed as won't fix, based on analysis discussed in the 2005-09-14 comment by Hui Huang. See also JDK-6596471, also marked won't fix and referring to JDK-6223913. And instead of using strerror[_r] in shared code, perhaps we should be using something similar to os::lasterror, which has additional stuff for Windows. The jdk repo has getErrorString and getLastErrorString.
28-01-2016