Bug ID: JDK-8328821 Map.of() derived view collection mutators should throw UnsupportedOperationException

Other
tbdUnresolved

This most likely won't make it to 24: in the short span of 4 weeks we have over 10 JEPs to be integrated.
04-11-2024
Are there any more thoughts on this? I haven't been able to collect any more data or insight since what I shared previously. The motivation for asking is that I was reminded we're approaching the branch cut for JDK 24, maybe this should be retargeted for a future release?
04-11-2024
I also see this as a case where the decision was made, and we simply need to plumb that decision through more consistently (in specs and in implementation). I also think we have ample reason to believe the compatibility fallout will be minimal so I hope this change can move forward.
12-09-2024
I think it's possible to overstate the analogy with the Hashtable and Provider issue that's described in the Secure Coding Guidelines. The issue there is that Provider was enforcing certain invariants. When Hashtable had new methods added, it became possible to violate those invariants. Provider was clearly broken by this change. It's not strictly about mutation, either; Provider might have had invariants that only authorized callers were allowed to read its entries, which might also have been violated by new methods in Hashtable. The concern here is more about API design style, regarding edge cases that don't matter, from the standpoint of enforcing invariants. Given Map.of().entrySet().remove("x") should it return false or throw UOE? In fact remove("x") can never remove anything from an entrySet(), because a String is never equal to a Map.Entry, and thus no invariants can ever be violated. It seems likely this code fragment is a programming error. This is thus more of a design style issue, whether such cases should be ignored or throw an exception. As I've said previously the design style I chose was to be stricter and to throw exceptions. But the initial implementation of this section of code wasn't consistent with that. We're now faced with changing the behavior long after the initial implementation was delivered. That's why the decision is now more about compatibility than it is about correctness.
12-09-2024
I feel this situation is somewhat like another referred in the secure coding guidelines: https://www.oracle.com/java/technologies/javase/seccodeguide.html#4-6 I would argue this flaw is equivalent to that missing override to enforce checks in 1.2, albeit this flaw has a much smaller impact that it doesn't mutate the collection. Thus, we should add these missing overrides, possibly even in the Collections wrapper classes.
10-09-2024
> I've been looking at experimenting with making HashMap's derived view collections immutable and running a bunch of tests just to get a sense of how often any kind of mutation of derived view collections happens. I'll report back if I'm able to get some numbers. That experiment showed that it's very rare for code to do any kind of mutation of derived view collections, but it's less rare for a particular test to transitively depend on some code that does rely on mutation, so the results weren't super conclusive. And also we don't actually care about the mutation of derived view collections of HashMap, only of Map.of(). [~liach] have you had a chance to give this any more thought, or is there anything you can share on your side?
10-09-2024
> I doubt there are many code patterns that perform mutation. It would be interesting to find out how many there are, and what proportion there are of all calls I've been looking at experimenting with making HashMap's derived view collections immutable and running a bunch of tests just to get a sense of how often any kind of mutation of derived view collections happens. I'll report back if I'm able to get some numbers.
25-07-2024
> I think we should take the kve collections of `Collections`' `emptyMap` `singletonMap` and `unmodifiableMap` into account [~liach] returning to this part, I investigated and got different results than you. Do you see anything I'm missing? Depending on what the overall assessment of the risk is here, perhaps those views should also be throwing UOE. I got: Collections$UnmodifiableMap k/v/e x x x x x x x x Collections$EmptyMap k/v/e x - - - - - - - Collections$SingletonMap k/v/e x - x x - - x - Looking at the implementation it seems like e.g. EmptyMap#clear definitely doesn't throw UOE: https://github.com/openjdk/jdk/blob/b5b5a5b8e54ebd766456d2a4781a806ff97816a8/src/java.base/share/classes/java/util/Collections.java#L4933 I used this modification of MapView: https://gist.githubusercontent.com/cushon/3596dfb1ae4ee38f1067d3941e9f22f2/raw/17ecf6459a6aa25e9678be4cd786f00868211a6b/MapView.java
25-07-2024
It's more of a survey than an experiment. I agree it would be difficult to filter calls to entrySet() (and other view collections calls) do find only the calls on unmodifiable Maps. So yes if the survey covers entrySet() calls to all kinds of maps, then that's probably ok. As you note usages that perform mutation (probably) won't be on unmodifiable Maps. First, I doubt there are many code patterns that perform mutation. It would be interesting to find out how many there are, and what proportion there are of all calls. And if there are examples of code that performs mutation, are they such that the code in question "knows" that the Map is mutable? For example, does it occur over a Map stored in a private field of the same class, which presumably created the Map knowing that it would be mutated? Or is it some method that can accept a Map from anywhere? As for the corpus, we usually search through a snapshot of Maven central. Yes it would be helpful for everybody to be able to look through the actual source code. However, if you search Google's corpus and can come up with some general statistics, that might be useful.
22-07-2024
Stuart to check my understanding of the experiment you'd like to see: * Is the idea to survey usages of derived view collections where the view was definitely derived from Map.of(), or to survey a sample of all usages of the derived view collections? There are going to be some usages that do mutation, but where the derived sets are always backed by a mutable implementation. It's hard in general to determine statically whether or not a particular usage is safe in practice, which is why I started with the dynamic 'run all the tests' approach. * Do you have a suggestion for a corpus you'd like to see an analysis of? I don't think reviewing usages in Google's corpus would provide additional information, and I have less good tools for analyzing other ones.
22-07-2024
> If Liam's change passed all the tests in Google, I think reviewing a random 100 usages anywhere would be quite unlikely to turn anything up. I think this is likely true, but this is still conjecture. I'd like this not to be used as an argument for avoiding this effort. I suppose there's a risk that it ends up being a waste of time. However, if in fact it doesn't turn anything up, it helps confirm the conjecture. And there's always the possibility that we might be surprised by something! So I'd encourage us to proceed with this. Any volunteers? Note that the compatibility questions in the CSR for JDK-8316493 would, I think, be answered by the same analysis. As far as I can see that change has been held up because we're trying to figure out how to assess the compatibility issues. [~pminborg] See my description of a survey-based technique in my comment above. https://bugs.openjdk.org/browse/JDK-8328821?focusedId=14689541&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14689541
20-07-2024
I believe you're absolutely right that `entrySet` usages fall into a narrow range of patterns, with outliers being very rare. If Liam's change passed all the tests in Google, I think reviewing a random 100 usages anywhere would be quite unlikely to turn anything up.
15-07-2024
There are clearly tradeoffs here, both on the positive and negative sides. On the positive side there is the consistency argument. Usually we don't do things solely for the sake of consistency, especially if there is a potential compatibility issue. Thus I believe an additional positive argument is necessary. I don't think there is a correctness argument to be made here. No collection violates its invariants. The issue is that something that (we think) ought to be an error is simply a no-op. There is thus an argument that this potentially catches errors. For example, if somebody accidentally calls a mutator method such as removeAll() that usually happens to be a no-op because its argument is an empty collection, it would blow up if something were to change such that the argument became non-empty. (And if you could prove the argument is always empty, then why bother calling the method at all?) Arguably this is a programming error. This is sort of a fail-fast approach, where the library attempts to detect and flag programming errors, rather similar to ConcurrentModificationException. (But note that in that case there is a "best effort" level of service, not a guarantee, and there are cases where CME isn't thrown.) In any case there needs to be a clearer, positive justification that's more than just "consistency." On the negative side, there are some people who actually prefer the no-op behavior. I'm not sure I can represent this position adequately. Also on the negative side, it seems like code analysis can't do much to detect incompatibilities. We'd have to find calls to mutator methods on the result of Map.entrySet(). But the map would need to be one of the unmodifiable maps (as opposed to, say, a HashMap) and the argument would need to be such that the call is a no-op that would change to throw an exception (e.g., an empty collection argument to removeAll). This seems like it would require some data flow analysis. If somebody knows how to do that over a corpus, great, but if not, we'll need to consider alternatives. Let me propose a survey-based technique. I've used something similar in Sequenced Collections (see the compatibility document attached to the CSR JDK-8266572). The technique goes something like this: examine 100 cases where the Map.entrySet() method is used, and see what the code does. Do similar analysis for keySet() and values(). Draw conclusions from any patterns that emerge. First, this seems like rather a lot of manual effort. (What! No automation!?) However, I've found that most uses of the collections API are fairly stylized, and after looking at a handful of examples, they fall naturally into buckets, and it's pretty easy to figure out what bucket a case belongs in. Overall this should be only a few hours' effort. In addition, I've always found it educational to look at a bunch of other people's code to see what they're doing with our APIs, especially ones where we're considering making an incompatible change! Second, I've found that actual usage often matches one's intuition about the APIs, and that it's often possible to reason about coding patterns in general. For example, I'd conjecture that the majority of use cases of Map.entrySet() are of the form: for (var entry : someMap.entrySet()) { doSomethingWith(entry.getKey(), entry.getValue()); } or possibly a variant that streams the elements of the entrySet(). In both cases, this is totally safe: the entrySet is iterated or streamed, and there is no possible way to invoke any mutators on the view collection, thus none of this code would be affected by this change. It would be good to have a survey that verifies this conjecture. On the other hand, there is the possibility that a survey would invalidate this conjecture, in which case we've probably learned something interesting! Third, I pulled the number 100 out of the air. It could be something else. It needs to be large enough to give us some assurance that we've chosen a representative sample of code, and for various coding patterns to emerge. Of course it can't be definitive. It should be large enough though to convince us that if some case would be broken by the change, then it's probably an outlier. Maybe 50 cases of each is sufficient. But 5 is too small. (A similar sort of analysis might be useful for JDK-8316493 -- remove cached view fields from AbstractMap. Indeed, maybe the same analysis will provide us with useful results for both that issue and this one.) Is this a reasonable approach? Any volunteers?
12-07-2024
> I'm am going to do some more testing on Google's codebase with the latest version of the PR (your option 3), and see if I can find evidence of compatibility impact I did this, and there was no impact to any of our tests. That doesn't say anything definitive about compatibility impact to the ecosystem, but I think it's encouraging and that option (3) is worth seriously considering. Are there other ideas for how to evaluate the compatibility impact, and weigh the tradeoffs between the different options here?
12-07-2024
[~cushon] Thanks for doing additional testing. I'm all for greater consistency, but I'm also concerned about compatibility. Of course additional testing doesn't prove anything, but the additional data seems like it ought to be helpful in making a decision.
11-07-2024
[~smarks] thanks very much for the analysis and for unpacking the options. I'm am going to do some more testing on Google's codebase with the latest version of the PR (your option 3), and see if I can find evidence of compatibility impact. If I don't find anything that doesn't prove anything, but it's at least some signal about the potential impact.
11-07-2024
I think we should take the kve collections of `Collections`' `emptyMap` `singletonMap` and `unmodifiableMap` into account: they are emptyMap/kve x x x - x x x x singletonMap/kve x x x x x x x x unmodifiableMap/kve x x x ~ x x x x (~ means depends on the backing map) I think this is defintely a consistency improvement. And in the release note we write, we can simply note that Map.of now behaves similar to Collections unmodifiable maps.
11-07-2024
OK, I finally got a chance to dig into this. The change isn't necessarily wrong, but the actual history and the current change doesn't quite match what's been written thus far. And to be fair the approach has changed somewhat over time. Anyway, here's the complete rundown on the history. My original design intent for the unmodifiable collections -- including all the Map views -- is to throw UOE unconditionally, even if the operation wouldn't actually perform a mutation. A typical example is removeAll(arg) where arg is an empty collection. I attempted to write this intent into the specification, but I probably wasn't entirely successful at doing so. It probably isn't useful to try to read and interpret the specification in order to determine correctness. Instead, if we still agree with the design intent, we should implement that; but we also need to consider changes in behavior and possible compatibility impact on existing binaries. We can then adjust the implementation and specification accordingly. I wrote a little test program (attached) that tests the mutator methods cases of Collection on all the views of different-sized unmodifiable Maps (which can have different underlying implementations and which also differ across releases). I also tested various sizes of unmodifiable List and Set for good measure. The operations are [add, addAll, clear, itrm, remove, removeAll, removeIf, retainAll]. All are no-ops except for add and clear which may attempt actual mutation. Also, the "itrm" operation is removal of an item using an Iterator. This is also an actual mutation attempt, except for the case of an empty collection, in which case the operation is skipped. In the tables below, each row is a different case: list, set, and maps of different sizes, with operations performed on the map views (keySet, values, and entrySet) but not on the map itself. The result of each operation is shown in columns of each row, in the order of operations as listed above. An "x" result indicates that UnsupportedOperationException was thrown, and a "-" result indicates that the method call returned normally (or that the removal via iterator was not attempted because the collection was empty). I ran the test on all JDK versions from 9 through 22, and on the mainline with the current PR as of 2024-07-10 applied. On JDK 9 and JDK 10, the results are as follows: list0 x x x - x x x x list1 x x x x x x x x list2 x x x x x x x x list3 x x x x x x x x set0 x x x - x x x x set1 x x x x x x x x set2 x x x x x x x x set3 x x x x x x x x map0k x - x - - - - - map0v x - x - - - - - map0e x x x - x x x x map1k x - x x - - - - map1v x - x x - - - - map1e x x x x x x x x map2k x - x x - - - - map2v x - x x - - - - map2e x - x x - - - - On all JDK versions from 11 to 22 (inclusive), the results are: list0 x x x - x x x x list1 x x x x x x x x list2 x x x x x x x x list3 x x x x x x x x set0 x x x - x x x x set1 x x x x x x x x set2 x x x x x x x x set3 x x x x x x x x map0k x - x - - - - - map0v x - x - - - - - map0e x - - - - - - - map1k x - x x - - - - map1v x - x x - - - - map1e x x x x x x x x map2k x - x x - - - - map2v x - x x - - - - map2e x - x x - - - - Overall, the results show that UOE was thrown unconditionally for List and Set, conforming to my design intent. However, the map views deviate quite a bit, and many cases return normally without throwing UOE. The only difference is the map0e result between JDK 10 and JDK 11. (This is the regression that was uncovered by Sorin's and Liam's initial analysis.) map0e x x x - x x x x // JDK 9-10 map0e x - - - - - - - // JDK 11-22 That is, a zero-entry Map's entrySet changed behavior such that addAll, clear, remove, removeAll, removeIf, retainAll calls that don't attempt actual mutation now return normally, whereas previously these cases would throw UOE. The results with the change from the PR applied are as follows: list0 x x x - x x x x list1 x x x x x x x x list2 x x x x x x x x list3 x x x x x x x x set0 x x x - x x x x set1 x x x x x x x x set2 x x x x x x x x set3 x x x x x x x x map0k x x x - x x x x map0v x x x - x x x x map0e x x x - x x x x map1k x x x x x x x x map1v x x x x x x x x map1e x x x x x x x x map2k x x x x x x x x map2v x x x x x x x x map2e x x x x x x x x This comports with my original design intent, which is to throw UOE unconditionally even if a mutator method doesn't actually attempt a mutation. As such I can hardly argue against it. However, it's rather a bigger change than I had expected, and I hadn't considered the compatibility issues arising from changing this many cases. From reading the various discussions over time, it seems like there is a spectrum of possibilities from which we can choose: 0) Do nothing. Even though certain calls return normally instead of throwing UOE, no modifications are permitted to unmodifiable collections. Thus, the collections still preserve their invariants. This is the most compatible option, but it's also the least consistent, and it doesn't comport with my original intent. 1) Restore the throwing behavior of clear() on the entrySet of an empty Map. This is what was originally described by Sorin. I'm not sure why we would bother to change only the clear() method though. 2) Restore the throwing behavior of all mutator methods on the entrySet of an empty Map. This reverts the regression that occurred between JDK 10 and JDK 11, and restores the JDK 9-10 behavior. 3) Restore the throwing behavior of JDK 9-10 and also change the return-normally behavior to throw UOE for all mutator methods of all map views. This is what's currently in the PR (as of 2024-07-10). This is the most consistent behavior but it's also the biggest change, and thus imposes the most compatibility risk. Sorry it's taken me so long to get to this point. However, until I did all this analysis it wasn't clear to me that this change goes much farther than restoring the JDK 9-10 behavior. Instead, it goes much farther and establishes throwing behavior in a bunch of cases where exceptions were never thrown before. This isn't necessarily wrong. However, we need to give more thought to the compatibility implications of this change.
11-07-2024
A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk/pull/18522 Date: 2024-03-27 17:36:28 +0000
09-07-2024
[~cushon], yes a behavior change like this would require a CSR. HTH
28-03-2024
I tried backporting the fix and running all of Google's internal tests with it applied, and didn't find any regressions. That obviously doesn't guarantee the absence of any compatibility impact, but I think it's probably very minimal. Would you recommend a CSR and/or relnotes for this?
28-03-2024
Thanks for the historical analysis. I didn't realize this was actually a change in behavior between 10 and 11. Clearly it shows there are some holes in the test coverage. I don't think the compatibility issue is a blocker. It's an issue that we should keep an eye on, though. I've run into this kind of situation before, and I think it's difficult to get any information from corpus searches. It's fairly easy to search for something that's present; it's much harder to search for the something that's absent. However, I'd be interested if you can come up with anything though.
27-03-2024
Thanks Stuart, Sorin reports that this was originally noticed upgrading to JDK 11, and it does seem to have regressed between JDK 10 and 11, possibly as a result of the changes in https://bugs.openjdk.org/browse/JDK-8193128. $ jshell \| Welcome to JShell -- Version 10.0.2 jshell> Map.of().entrySet().addAll(List.of()) \| java.lang.UnsupportedOperationException thrown \| at ImmutableCollections.uoe (ImmutableCollections.java:71) \| at ImmutableCollections$AbstractImmutableSet.addAll (ImmutableCollections.java:282) \| at (#1:1) $ jshell \| Welcome to JShell -- Version 11.0.16 jshell> Map.of().entrySet().addAll(List.of()) $1 ==> false We don't have a lot of data either way on the impact of changing the behaviour for Map.of().entrySet(). I can do some more corpus research to see if I can find code that has started to rely on UnsupportedOperationException not being thrown in these cases.
26-03-2024
Thanks for digging into this. The principle here is that all mutator methods on the unmodifiable collections should throw UOE. Unfortunately some of the implementations are based on AbstractCollection (or other AbstractX classes) which don't throw UOE if they don't actually attempt an operation. This frequently occurs when an empty collection is involved, as noted here. I tried to override all the mutator methods in all cases to avoid this issue, but I had a feeling there were still a few remaining cases lurking. There are subtle tradeoffs around compatibility here. The no-op behavior is probably quite long-standing, and fixing it is arguably correct, but it might cause breakage in existing code that implicit relies on it. The issue with Map.of()'s entrySet applies to several mutators, not just clear(): Map.of().entrySet().addAll(List.of()) Map.of().entrySet().clear() Map.of().entrySet().remove("x") Map.of().entrySet().removeIf(x -> false) Map.of().entrySet().removeAll(List.of()) Map.of().entrySet().retainAll(List.of()) all are no-ops, when the should throw UOE. As an aside, this kind of behavior is present in some of the original utility collections like Collections.singleton(). For example, Collections.singleton("x").remove("y") Collections.singleton("x").addAll(List.of()) are no-ops, but Set.of("x").remove("y") Set.of("x").addAll(List.of()) throw UOE. It's not clear whether it's worth "fixing" these even longer-standing behaviors.
22-03-2024
Reading the spec on unmodifiable collections, https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/Collection.html#unmodifiable mentions: > For a collection to be properly unmodifiable, any view collections derived from it must also be unmodifiable. For example, if a List is unmodifiable, the List returned by List.subList is also unmodifiable. Making the derived EntrySet unmodifiable seems consistent with that.
22-03-2024

CSR :	JDK-8329284 - Map.of() derived view collection mutators should throw UnsupportedOperationException
Relates :	JDK-8316493 - Remove the caching fields in AbstractMap