Object headers consume 14% of the average heap dump (assuming a 32-bit JVM with 8-byte headers). On 64-bit JVMs, whose headers are larger, the share will be higher. The fields in the object header should be trimmed to get down to a 4-byte header.
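To put a rough number on the payoff, here is a back-of-the-envelope sketch. It takes the 14% figure and the 8-byte header at face value and assumes every object shrinks by exactly the bytes removed from its header. That ignores alignment padding (HotSpot aligns objects to 8 bytes by default), so a real heap would save less. The class and method names are made up for illustration.

```java
// Rough estimate only: assumes no alignment padding, so every byte
// removed from the header is a byte removed from the object.
final class HeaderSavings {
    // If headers of headerBytes make up headerShare of the heap,
    // the implied average object size is headerBytes / headerShare.
    static double averageObjectBytes(double headerBytes, double headerShare) {
        return headerBytes / headerShare;
    }

    // Fraction of the heap saved by shrinking the header from
    // oldHeaderBytes to newHeaderBytes.
    static double heapReduction(double oldHeaderBytes, double newHeaderBytes, double headerShare) {
        return (oldHeaderBytes - newHeaderBytes) / averageObjectBytes(oldHeaderBytes, headerShare);
    }

    public static void main(String[] args) {
        System.out.printf("average object: %.1f bytes%n", averageObjectBytes(8, 0.14));
        System.out.printf("heap reduction: %.1f%%%n", 100 * heapReduction(8, 4, 0.14));
    }
}
```

Under those assumptions the average object is about 57 bytes, and halving the header saves about 7% of the heap.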
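Identity hash codes could be moved to a side table, something like a WeakHashMap<Object, Integer>. (It would have to compare keys by reference rather than by equals(), which the stock WeakHashMap does not do.) This will make identity hash code lookup slower. I assume identity hash codes aren't used very much. If so, then making the operation slower won't have a significant impact.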
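A minimal sketch of what such a side table might look like, written as ordinary Java rather than VM code. The class name IdentityHashSideTable and its method are hypothetical. One caveat: the sketch buckets its keys with System.identityHashCode, which is circular, since that is the very value being moved out of the header. Inside the VM the table would presumably have to key on the object's address and be fixed up whenever the GC moves an object.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical side table: identity hash codes live here, not in the header.
final class IdentityHashSideTable {
    // Weak key that compares by reference identity, unlike WeakHashMap's keys.
    private static final class Key extends WeakReference<Object> {
        private final int bucket;

        Key(Object referent, ReferenceQueue<Object> queue) {
            super(referent, queue);
            // Illustration only: a real VM could not lean on identityHashCode here.
            this.bucket = System.identityHashCode(referent);
        }

        @Override public int hashCode() { return bucket; }

        @Override public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Key)) return false;
            Object mine = get();
            return mine != null && mine == ((Key) other).get();
        }
    }

    private final Map<Key, Integer> hashes = new HashMap<>();
    private final ReferenceQueue<Object> cleared = new ReferenceQueue<>();
    private int nextHash = 1; // a counter keeps the sketch deterministic

    // Returns the hash assigned to o, assigning one on first use.
    synchronized int identityHash(Object o) {
        expungeCleared();
        Integer existing = hashes.get(new Key(o, null)); // probe key, no queue
        if (existing != null) return existing;
        int assigned = nextHash++;
        hashes.put(new Key(o, cleared), assigned);
        return assigned;
    }

    // Drops entries whose objects have been garbage collected.
    private void expungeCleared() {
        for (Object ref = cleared.poll(); ref != null; ref = cleared.poll()) {
            hashes.remove(ref);
        }
    }
}
```

Every lookup now costs a lock, an allocation for the probe key and a hash probe, which is the slowdown being traded for the smaller header.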
GC age can be removed. Instead of storing the age in each object, let's be smarter about how we organize the young generation. For G1 GC, each region should track the age of the objects it holds. When a collection copies live objects, they should be moved to a region of the appropriate age. For CMS, the survivor space should keep objects ordered by age, with a pointer to the beginning of each age group. Objects newly arriving in the survivor space go at the end of the list. In a sense, the survivor space acts like a queue.
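The queue idea can be modeled in a few lines. This is a toy sketch, not collector code: the class name and the tenuring threshold parameter are assumptions, each age group is a plain list where a real survivor space would use contiguous memory with boundary pointers, and objects that die while sitting in the survivor space are ignored. The point is only that an object's age falls out of its position in the queue, so nothing has to be stored in the object.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Toy model: the survivor space is a queue of age groups, oldest at the head.
final class AgeQueueSurvivorSpace {
    private final int tenuringThreshold;
    private final Deque<List<Object>> ageGroups = new ArrayDeque<>();

    AgeQueueSurvivorSpace(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    // One young collection: new survivors join the tail as the youngest group.
    // Every existing group ages by one simply because it is now one step
    // further from the tail. Returns the group that is promoted, if any.
    List<Object> collect(List<Object> edenSurvivors) {
        ageGroups.addLast(new ArrayList<>(edenSurvivors));
        if (ageGroups.size() > tenuringThreshold) {
            return ageGroups.removeFirst();
        }
        return Collections.emptyList();
    }

    // Age is the distance from the tail; -1 if the object is not in this space.
    int ageOf(Object o) {
        int age = 1;
        for (Iterator<List<Object>> it = ageGroups.descendingIterator(); it.hasNext(); age++) {
            for (Object candidate : it.next()) {
                if (candidate == o) return age;
            }
        }
        return -1;
    }
}
```

The same bookkeeping would carry over to G1 by replacing each list with a set of regions tagged with one age.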
The JavaThread* and the 3 lock flags can probably be removed if biased locking is no longer being used. If lock ownership is still needed for thin/stack locks, a weak concurrent map from object to owning thread (a hypothetical WeakConcurrentHashMap<Object, Thread>; the JDK has no such class) can be maintained to track it.
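Here is one way an ownership side table could be sketched. Note that it swaps the weak map for a plain ConcurrentHashMap, on the assumption that an entry exists only while the lock is held, and a held lock's object is still referenced by the owning thread. Whether that assumption holds inside the VM is not something this sketch settles. LockOwnerTable and IdentityKey are hypothetical names, and the sketch keeps no recursion count for reentrant locking.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical side table mapping a locked object to the thread that owns it.
final class LockOwnerTable {
    // Wrapper so that the map compares objects by reference, not equals().
    private static final class IdentityKey {
        final Object ref;

        IdentityKey(Object ref) { this.ref = ref; }

        @Override public boolean equals(Object o) {
            return o instanceof IdentityKey && ((IdentityKey) o).ref == ref;
        }

        @Override public int hashCode() { return System.identityHashCode(ref); }
    }

    private final ConcurrentHashMap<IdentityKey, Thread> owners = new ConcurrentHashMap<>();

    // Records t as owner if the object is unowned; true if t owns it afterwards.
    boolean tryAcquire(Object o, Thread t) {
        Thread previous = owners.putIfAbsent(new IdentityKey(o), t);
        return previous == null || previous == t;
    }

    // Current owner, or null if the object is not locked.
    Thread ownerOf(Object o) {
        return owners.get(new IdentityKey(o));
    }

    // Removes the entry only if t is the recorded owner.
    boolean release(Object o, Thread t) {
        return owners.remove(new IdentityKey(o), t);
    }
}
```

Because entries are removed on release, the table stays proportional to the number of currently held locks, not to the number of objects ever locked.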
That leaves the Klass*. This can be turned into a 32-bit index into an array of Klass entries.
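The index scheme is simple enough to sketch directly. KlassTable is a hypothetical name, and java.lang.Class stands in for the VM's internal Klass structure. Each class is assigned the next free slot the first time it is seen, and the 4-byte header would hold nothing but that slot number.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class table: the object header stores an int index, not a pointer.
final class KlassTable {
    private final List<Class<?>> byIndex = new ArrayList<>();
    private final Map<Class<?>, Integer> indexOf = new HashMap<>();

    // Returns the index for k, assigning the next free slot on first use.
    synchronized int intern(Class<?> k) {
        Integer existing = indexOf.get(k);
        if (existing != null) return existing;
        int index = byIndex.size();
        byIndex.add(k);
        indexOf.put(k, index);
        return index;
    }

    // Maps a header index back to its class.
    synchronized Class<?> resolve(int index) {
        return byIndex.get(index);
    }
}
```

A 32-bit index leaves room for roughly four billion classes if treated as unsigned, far more than any realistic application loads. The sketch never reclaims a slot, so class unloading would need extra handling.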