> >> I have been looking into the definition of [character set]
> >> expressions in Java regular expressions, to understand what needs to
> >> be done to make ICU be compatible, or more compatible at least.
> >> There does not appear to be any formal definition for [set
> >> expressions], or at least not that I can find.
> >> In my tests, one aspect of the behavior seems really odd. It would
> >> be good if we could find out from Sun whether it was really intended
> >> to work the way that it does.
> >> The question concerns the negation of a set: for example,
> >> [^0-9] to match everything except the ASCII digits.
> >> In Java, the negation does _not_ apply to anything appearing in
> >> nested [brackets].
> >> So [^c] does not match "c", as you would expect.
> >> [^[c]] does match "c". Not what I would expect.
> >> [[^c]] does not match "c".
> >> The same holds true for ranges or property expressions - if they're
> >> inside brackets, a negation at an outer level does not affect them.
> >> [^a-z] is the opposite of [^[a-z]].
> >> And the same seems to hold for set expressions with &&, although the
> >> cases become hard to understand.
> >> Perl and POSIX behavior provides no guidance here, as neither
> >> supports nested brackets at all - a '[' is not special within a
> >> set, and just becomes yet another member of the set.
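
The cases quoted above can be checked directly with a small probe. This is only a sketch: the class name SetNegationProbe and its matches helper are made up for illustration, and the only library call used is java.util.regex.Pattern.matches. No expected results are listed for the nested forms, since what they print may depend on the Java release being tested, so the output is best read next to the claims in the quoted message.

```java
import java.util.regex.Pattern;

// Hypothetical helper class (not part of the JDK or ICU) for probing how
// java.util.regex treats negation around nested bracket sets.
class SetNegationProbe {

    // Returns true if the entire input matches the regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Plain negation, nested forms, and one && case from the discussion.
        String[] patterns = {
            "[^c]", "[^[c]]", "[[^c]]",
            "[^a-z]", "[^[a-z]]",
            "[^a-z&&[c]]"
        };
        for (String p : patterns) {
            System.out.println(p + " matches \"c\": " + matches(p, "c"));
        }
    }
}
```

Running the same patterns through ICU would give a side-by-side view of where the two engines disagree.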