Parsing manifests takes a noticeable share of startup time, especially for signed jars, where manifests are large
and carry attributes for every class. For the FX runtime, several thousand Attributes.Name objects are created
(even without full security validation).
We should make this code faster and less memory-hungry.
In particular, the following improvements are possible:
1) Attributes.Name.equals() should compare the name references before falling back to the case-insensitive comparator.
In the majority of real-world cases the names are identical and interned.
2) Attributes.Name objects are mostly used for hash map lookups, i.e. hashCode() will be called for almost every Attributes.Name instance. We can precalculate the hash code at creation/validation time. This helps to avoid multiple iterations over the char array.
3) Attribute names often occur multiple times in the same jar and across different jars.
One very typical scenario is signed jars, where every entry has a signature attribute with the same name.
We can reuse Attributes.Name instances, at least within the same Manifest;
this helps to reduce footprint
(in the simplest FX test case the number of Name instances dropped from 3500 to 50).
See the suggested fix in the attachment for further details.
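
For illustration only (this is not the attached fix), the three ideas above could be sketched roughly as follows. AttrName and AttrNameCache are hypothetical stand-ins for Attributes.Name and a per-Manifest cache; the sketch assumes names are plain ASCII and skips the name validation the real class performs.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for java.util.jar.Attributes.Name (not the JDK source).
final class AttrName {
    private final String name;
    private final int hash; // idea 2: computed once, at construction time

    AttrName(String name) {
        this.name = name;
        // Case-insensitive hash; assumes ASCII-only attribute names.
        this.hash = name.toLowerCase(Locale.ROOT).hashCode();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof AttrName)) return false;
        String other = ((AttrName) o).name;
        // Idea 1: cheap reference check first, case-insensitive compare only as fallback.
        return name == other || name.equalsIgnoreCase(other);
    }

    @Override
    public int hashCode() {
        return hash; // no repeated pass over the characters
    }

    @Override
    public String toString() {
        return name;
    }
}

// Hypothetical per-manifest cache (idea 3): one AttrName per distinct spelling.
final class AttrNameCache {
    private final Map<String, AttrName> cache = new HashMap<>();

    AttrName get(String name) {
        return cache.computeIfAbsent(name, AttrName::new);
    }

    int size() {
        return cache.size();
    }
}
```

Note that the cache in this sketch is keyed by exact spelling, so "Name" and "name" would yield two instances that are still equal to each other; whether that matters depends on how the real fix keys its lookups.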
Yet another idea is to avoid using the UTF-8 decoder where possible.
The decoder was significantly improved, but for short strings the overhead of wrapping arrays with byte buffers, etc. still seems to be significant. The majority of real-world manifests/attributes are ASCII anyway.
A quick test that replaced the UTF-8 based String construction with an ASCII-only one shows an additional 4% improvement on the same test.
One possible implementation approach is the following:
1) Check whether any non-ASCII bytes (with code > 0x7f) were read in FastInputStream.readLine() (we iterate through the byte array there anyway).
2) If all bytes in the last line read were ASCII, use the ASCII-only String constructor.
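
A minimal sketch of this approach, under a few assumptions: LineDecoder is a hypothetical helper, the ASCII check is done in a separate pass here (the proposal would fold it into the existing readLine() loop), and ISO-8859-1 decoding stands in for whichever ASCII-only String constructor the fix would actually use.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of java.util.jar.
final class LineDecoder {

    // Returns true if every byte in buf[off, off+len) is <= 0x7f.
    static boolean isAscii(byte[] buf, int off, int len) {
        for (int i = off; i < off + len; i++) {
            if (buf[i] < 0) { // Java bytes are signed: values > 0x7f are negative
                return false;
            }
        }
        return true;
    }

    // Decodes one manifest line, taking the cheap path for ASCII-only input.
    static String decode(byte[] buf, int off, int len) {
        if (isAscii(buf, off, len)) {
            // For bytes <= 0x7f, ISO-8859-1 and UTF-8 decode identically,
            // so this path can bypass the UTF-8 decoder.
            return new String(buf, off, len, StandardCharsets.ISO_8859_1);
        }
        return new String(buf, off, len, StandardCharsets.UTF_8);
    }
}
```

Both paths return the same String for ASCII input, so the fast path only changes how the result is produced, not what it is.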