Using the Monkey Tester app (or any simple app that contains an editable TextArea or TextField), insert the following string:
"Bhojpuri 𑂦𑂷𑂔𑂣𑂳𑂩𑂲 test"
or, using Unicode escapes:
"Bhojpuri \ud804\udca6\ud804\udcb7\ud804\udc94\ud804\udca3\ud804\udcb3\ud804\udca9\ud804\udcb2 test"
- position the cursor at the end of the text
- press Option+LEFT twice
Observed: the cursor first jumps to the beginning of the word 'test', then to the beginning of the word 'Bhojpuri', skipping the word '𑂦𑂷𑂔𑂣𑂳𑂩𑂲' in between.
Expected: after the second key press, the cursor should be positioned at the beginning of the word '𑂦𑂷𑂔𑂣𑂳𑂩𑂲'.
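The culprit is the code at TextInputControl:814, which calls Character.isLetterOrDigit(char) on a char that is one half of a surrogate pair, instead of calling Character.isLetterOrDigit(int) on the full code point. A lone surrogate is never classified as a letter or digit, so a word made of supplementary characters is not recognized as a word.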
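The difference between the two overloads can be reproduced outside JavaFX. The following is a minimal sketch, not the TextInputControl code itself: the class and helper names (SurrogateCheck, isWordCharByUnit, isWordCharByCodePoint) are hypothetical and exist only for illustration. It assumes a JDK whose Unicode tables include the Kaithi block.

```java
class SurrogateCheck {
    // Same string as in the report; each Kaithi character is a surrogate pair.
    static final String TEXT =
        "Bhojpuri \ud804\udca6\ud804\udcb7\ud804\udc94\ud804\udca3"
        + "\ud804\udcb3\ud804\udca9\ud804\udcb2 test";

    // Hypothetical helper: tests a single UTF-16 code unit (the buggy pattern).
    static boolean isWordCharByUnit(String s, int index) {
        return Character.isLetterOrDigit(s.charAt(index));
    }

    // Hypothetical helper: tests the full code point starting at index.
    static boolean isWordCharByCodePoint(String s, int index) {
        return Character.isLetterOrDigit(s.codePointAt(index));
    }

    public static void main(String[] args) {
        int index = 9; // first code unit of the Kaithi word

        // The high surrogate alone is not a letter or digit.
        System.out.println(isWordCharByUnit(TEXT, index));      // false

        // The combined code point U+110A6 is a letter.
        System.out.println(isWordCharByCodePoint(TEXT, index)); // true
    }
}
```

With the char overload, the position at the start of the Kaithi word looks like a non-word character, which is consistent with the cursor skipping it.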
The 'next word' function works correctly.
The issue is observed on macOS 14.4.1.
Added a unit test to TextInputControlTest:
    @Test
    public void previousWord_Bhojpuri() {
        // "Bhojpuri \ud804\udca6\ud804\udcb7\ud804\udc94\ud804\udca3\ud804\udcb3\ud804\udca9\ud804\udcb2 test"
        textInput.setText("Bhojpuri 𑂦𑂷𑂔𑂣𑂳𑂩𑂲 test");
        textInput.end();
        verifyCaret(28);
        textInput.previousWord(); // at the beginning of "test"
        verifyCaret(24);
        textInput.previousWord(); // at the beginning of "𑂦𑂷𑂔𑂣𑂳𑂩𑂲"
        verifyCaret(9);
        textInput.previousWord(); // at the beginning of "Bhojpuri"
        verifyCaret(0);
    }
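The caret positions in the test are UTF-16 code-unit offsets, so each of the seven Kaithi characters counts as two. The sketch below only checks that arithmetic with plain String methods. The class name CaretOffsets and its helpers are hypothetical and are not part of the test suite.

```java
class CaretOffsets {
    static final String KAITHI =
        "\ud804\udca6\ud804\udcb7\ud804\udc94\ud804\udca3"
        + "\ud804\udcb3\ud804\udca9\ud804\udcb2";
    static final String TEXT = "Bhojpuri " + KAITHI + " test";

    // Offset of the Kaithi word, in UTF-16 code units.
    static int kaithiStart() {
        return TEXT.indexOf(KAITHI);
    }

    // Offset of the word "test", in UTF-16 code units.
    static int testStart() {
        return TEXT.lastIndexOf("test");
    }

    public static void main(String[] args) {
        System.out.println(TEXT.length());                         // 28 code units
        System.out.println(TEXT.codePointCount(0, TEXT.length())); // 21 code points
        System.out.println(testStart());                           // 24
        System.out.println(kaithiStart());                         // 9
    }
}
```

The values 28, 24, and 9 match the arguments passed to verifyCaret in the test: 9 code units for "Bhojpuri ", 14 for the Kaithi word, 1 for the space, and 4 for "test".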