JDK-8246348 : Crash in libpango on Ubuntu 20.04 with some unicode chars
  • Type: Bug
  • Component: javafx
  • Sub-Component: graphics
  • Affected Version: 8,openjfx11,openjfx14,openjfx15
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2020-06-02
  • Updated: 2020-08-25
  • Resolved: 2020-06-15
Fixed Versions
  • JDK 8: 8u271
  • Other: openjfx11.0.9
Description
To reproduce, run HelloWebView on Ubuntu 20.04 as follows:

$ java HelloWebView https://gluonhq.com/

This will crash and generate a core file, but no hs_err_pid* file. Here is the stack trace from the core file:

(gdb) where
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f0b53757859 in __GI_abort () at abort.c:79
#2  0x00007f0b10e55b63 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x00007f0b10eb2b4f in g_assertion_message_expr ()
   from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#4  0x00007f0b10fdf44e in ?? () from /lib/x86_64-linux-gnu/libpango-1.0.so.0
#5  0x00007f0b10fe01d8 in pango_itemize_with_base_dir ()
   from /lib/x86_64-linux-gnu/libpango-1.0.so.0
#6  0x00007f0b10fe02e9 in pango_itemize ()
   from /lib/x86_64-linux-gnu/libpango-1.0.so.0
#7  0x00007f0b346e65b0 in ?? ()
#8  0x00007f0b4c55bc20 in ?? ()
#9  0x0000000000000000 in ?? ()

The attached test program also reproduces this crash. In this case we do get an hs_err_pid* file.

$ java UnicodeTextTest
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fa93dc0b2ac, pid=13889, tid=13906
#
# JRE version: Java(TM) SE Runtime Environment (14.0+36) (build 14+36-1461)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (14+36-1461, mixed mode, sharing, tiered, compressed oops, serial gc, linux-amd64)
# Problematic frame:
# C  [libpango-1.0.so.0+0x1d2ac]
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P %E" (or dumping to /home/kcr/javafx/tmp/core.13889)
#
# An error report file with more information is saved as:
# /home/kcr/javafx/tmp/hs_err_pid13889.log
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
Aborted (core dumped)


Here is the stack trace from the core dump file:
(gdb) where
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fa98037e859 in __GI_abort () at abort.c:79
#2  0x00007fa97f40cf4b in os::abort(bool, void*, void const*) [clone .cold.64]
    () from /home/kcr/jdks/jdk-14/lib/server/libjvm.so
#3  0x00007fa97ff1aa86 in VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long) () from /home/kcr/jdks/jdk-14/lib/server/libjvm.so
#4  0x00007fa97ff1b40b in VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*, char const*, ...) ()
   from /home/kcr/jdks/jdk-14/lib/server/libjvm.so
#5  0x00007fa97ff1b43e in VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*) () from /home/kcr/jdks/jdk-14/lib/server/libjvm.so
#6  0x00007fa97fd1237c in JVM_handle_linux_signal ()
   from /home/kcr/jdks/jdk-14/lib/server/libjvm.so
#7  0x00007fa97fd061f8 in signalHandler(int, siginfo*, void*) ()
   from /home/kcr/jdks/jdk-14/lib/server/libjvm.so
#8  <signal handler called>
#9  0x00007fa93dc0b2ac in ?? () from /lib/x86_64-linux-gnu/libpango-1.0.so.0
#10 0x00007fa93dc0b8c0 in ?? () from /lib/x86_64-linux-gnu/libpango-1.0.so.0
#11 0x00007fa93dc0d1c7 in pango_itemize_with_base_dir ()
   from /lib/x86_64-linux-gnu/libpango-1.0.so.0
#12 0x00007fa93dc0d2e9 in pango_itemize ()
   from /lib/x86_64-linux-gnu/libpango-1.0.so.0
#13 0x00007fa9606e65b0 in ?? ()
#14 0x00007fa978560800 in ?? ()
#15 0x0000000000000000 in ?? ()

Comments
Changeset: bf2e972d
Author: Johan Vos <jvos@openjdk.org>
Date: 2020-06-15 17:12:55 +0000
URL: https://git.openjdk.java.net/jfx/commit/bf2e972d
15-06-2020

I just attached UnicodeTextTest2.java, a slightly modified version of UnicodeTextTest.java that doesn't use a null character. It doesn't crash with the current openjfx14 or 15-ea builds, and can be used as a regression test of the proposed fix for this bug.
11-06-2020

Both issues are addressed in https://github.com/openjdk/jfx/pull/249. The check for 0 chars is trivial (but might be expensive). The other issue requires us to know the number of Unicode code points in the (part of the) char array that is sent to Pango. The char array that we send is determined by the combination of the char[] and the TextRun, both supplied to PangoGlyphLayout.layout(TextRun, PGFont, Strike, char[]). The problem is that TextRun only contains the begin and end position (and length) counted in UTF-16 code units. Hence, a Unicode code point that is encoded as a surrogate pair is counted as 2 characters. The `g_utf8_offset_to_pointer` calls require the number of Unicode code points, which can thus differ from the number of characters. In the PR, the conversion from UTF-16 to UTF-8 is done per TextRun, for the UTF-16 characters in the TextRun-specific subrange of the char[]. Invoking `utf8_strlen` then returns the correct number of Unicode code points. The drawback of this approach is some bookkeeping inside the PangoGlyphLayout class. Another solution would be to store the begin/end index, counted in Unicode code points, on the TextRun itself, but since TextRun is shared with non-Pango implementations as well, this sounds more intrusive.
10-06-2020
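
The surrogate-pair mismatch described in the 10-06-2020 comment can be illustrated with a minimal Java sketch. It is not the code from the PR: the class name `CodePointCountSketch` and its helper methods are hypothetical, and the `start`/`length` parameters only stand in for the UTF-16 positions that a TextRun carries. It shows that the code point count of a subrange can differ from its UTF-16 length, and that converting just that subrange to UTF-8 gives yet another (byte) length.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of JavaFX or of the PR.
class CodePointCountSketch {

    // Number of Unicode code points in text[start, start + length),
    // where start and length are counted in UTF-16 code units.
    static int codePoints(char[] text, int start, int length) {
        return Character.codePointCount(text, start, length);
    }

    // UTF-8 encoding of the same subrange only.
    static byte[] toUtf8(char[] text, int start, int length) {
        return new String(text, start, length).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (encoded as a surrogate pair in UTF-16), 'b'
        char[] text = "a\uD83D\uDE00b".toCharArray();

        System.out.println("UTF-16 code units: " + text.length);                     // 4
        System.out.println("code points:       " + codePoints(text, 0, text.length)); // 3
        System.out.println("UTF-8 bytes:       " + toUtf8(text, 0, text.length).length); // 6
    }
}
```

Any offset handed to a native function that expects code points would have to be the second number, not the first.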

The problem is wider than just the 0 character at the beginning, which gives a crash. The assertions in the web sample are due to the same cause: issues with g_utf8_strlen with embedded null (see e.g. https://mail.gnome.org/archives/commits-list/2009-July/msg03917.html). Pango has its own implementation (pango_utf8_strlen), but there are 2 places where g_utf8_strlen is used. I replaced g_utf8_strlen with pango_utf8_strlen in the Pango source code, and then both samples work correctly.
05-06-2020
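
Since embedded null characters are what trigger the problem, a caller-side guard could scan the subrange before it is handed to native code. The following is only a sketch under that assumption: `EmbeddedNulCheckSketch` and `indexOfNul` are hypothetical names, and the linear scan is what makes such a check potentially expensive on long texts.

```java
// Hypothetical illustration; not part of JavaFX or Pango.
class EmbeddedNulCheckSketch {

    // Returns the index of the first U+0000 in text[start, start + length),
    // or -1 if the subrange contains none.
    static int indexOfNul(char[] text, int start, int length) {
        for (int i = start; i < start + length; i++) {
            if (text[i] == '\u0000') {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        char[] withNul = "ab\u0000c".toCharArray();
        char[] clean = "abc".toCharArray();

        System.out.println(indexOfNul(withNul, 0, withNul.length)); // 2
        System.out.println(indexOfNul(clean, 0, clean.length));     // -1
    }
}
```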

This problem is caused by a change in libpango-1.0.so. In the version tagged 1.40.14 (which is used in Ubuntu 18.04), the mini-fribidi implementation was used. In 1.47, which is used in Ubuntu 20.04, Pango uses its own implementation. That new implementation uses g_utf8_strlen() to find the UTF-8 length of a char*. However, if there is a 0 char in the array, the value differs from that of the old implementation (e.g. if the text starts with 0, the value of g_utf8_strlen is 0). That value is later used to allocate memory, so not enough memory is allocated. The test tries to display a String starting with Character(0), hence the crash.
05-06-2020
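
The under-counting described in the first 05-06-2020 comment can be simulated in plain Java. This sketch does not call GLib or Pango: `nulTerminatedCodePoints` is a hypothetical stand-in that mimics a length computed with C-string semantics (stop at the first zero byte), while `boundedCodePoints` counts over the whole buffer. A buffer sized from the first value would be too small for the text that is actually processed.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical simulation; it does not invoke g_utf8_strlen or any native code.
class NulTerminatedLengthSketch {

    // Counts code points in UTF-8 data, stopping at the first zero byte,
    // the way a NUL-terminated C string would be measured.
    static int nulTerminatedCodePoints(byte[] utf8) {
        int count = 0;
        for (byte b : utf8) {
            if (b == 0) {
                break;
            }
            if ((b & 0xC0) != 0x80) { // skip UTF-8 continuation bytes
                count++;
            }
        }
        return count;
    }

    // Counts code points over the full byte array, zero bytes included.
    static int boundedCodePoints(byte[] utf8) {
        int count = 0;
        for (byte b : utf8) {
            if ((b & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Text starting with Character(0), as in the failing test.
        byte[] utf8 = "\u0000abc".getBytes(StandardCharsets.UTF_8);

        System.out.println(nulTerminatedCodePoints(utf8)); // 0
        System.out.println(boundedCodePoints(utf8));       // 4
    }
}
```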