JDK-8034065 : GCC 4.3 and later doesn't export vtable symbols any more which seem to be needed by SA
  • Type: Bug
  • Component: hotspot
  • Sub-Component: svc-agent
  • Affected Version: 9
  • Priority: P4
  • Status: Resolved
  • Resolution: Not an Issue
  • Submitted: 2014-02-10
  • Updated: 2020-06-09
  • Resolved: 2020-06-09
Description
During the investigation of why we need both linker export maps and -fvisibility=hidden/__attribute__((visibility("default"))) (see http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-February/012426.html), I found out that vtable symbols aren't exported at all any more if we are using GCC 4.3 and later, even if they are listed in the export maps.

The question is now if these vtable symbols are really needed by the serviceability agent SA (in which case we should find another way of exporting them) or not (in which case we should simply drop the magic which generates the vtable symbols in the map files (i.e. build_vm_def.sh) from the build).

Here are some more details:

The vtable symbols are defined as weak symbols in the object files like so:

0000000000000000 V _ZTV10ArrayKlass

If such an object file is linked into libjvm.so with gcc 4.1.2 and
without a map file, the symbol turns into a local data object
like so:

0000000000e37160 d _ZTV10ArrayKlass

However, if we use a map file which specifies that the symbol
'_ZTV10ArrayKlass' should be exported, libjvm.so will contain the
following global symbol:

0000000001423240 D _ZTV10ArrayKlass

Now I think this is the expected result of the whole dynamic map-file
generation process. However, with gcc 4.3 and later, there's no
difference if the corresponding vtable symbol is mentioned in the
export map or not. The resulting libjvm.so always only contains a
local data object (just use 'nm --defined-only --extern-only
libjvm.so' to check).
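
The three states described above (weak in the object file, local or global in libjvm.so) can be told apart from the one-letter type column that nm prints. The following is only a minimal sketch for illustration, not part of the HotSpot build: the class NmVtableCheck and its methods are hypothetical, the sample lines are modeled on the nm output quoted in this description, and only the common type letters are handled (special cases such as 'u' are ignored).

```java
/** Classifies lines of nm output. Hypothetical helper, not part of HotSpot. */
final class NmVtableCheck {
    enum Binding { GLOBAL, LOCAL, WEAK, UNKNOWN }

    /** Returns the binding implied by nm's one-letter symbol type column. */
    static Binding bindingOf(String nmLine) {
        String[] fields = nmLine.trim().split("\\s+");
        if (fields.length != 3 || fields[1].length() != 1) {
            return Binding.UNKNOWN; // e.g. undefined symbols have no address column
        }
        char type = fields[1].charAt(0);
        // GNU nm marks weak symbols with V/v (weak objects) or W/w (other weak symbols).
        if ("VvWw".indexOf(type) >= 0) return Binding.WEAK;
        // Otherwise upper case generally means global, lower case local.
        if (Character.isUpperCase(type)) return Binding.GLOBAL;
        if (Character.isLowerCase(type)) return Binding.LOCAL;
        return Binding.UNKNOWN;
    }

    /** True if the line names a C++ vtable (Itanium ABI mangling prefix _ZTV). */
    static boolean isVtable(String nmLine) {
        String[] fields = nmLine.trim().split("\\s+");
        return fields.length == 3 && fields[2].startsWith("_ZTV");
    }

    public static void main(String[] args) {
        String[] lines = {
            "0000000000000000 V _ZTV10ArrayKlass", // object file
            "0000000000e37160 d _ZTV10ArrayKlass", // libjvm.so, not exported
            "0000000001423240 D _ZTV10ArrayKlass"  // libjvm.so, exported via map file
        };
        for (String line : lines) {
            if (isVtable(line)) {
                System.out.println(bindingOf(line) + "  " + line);
            }
        }
    }
}
```

Feeding it the output of 'nm --defined-only libjvm.so' line by line would show whether a given build ends up with local or global vtable symbols.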


Comments
Closing as Not An Issue since SA does not rely on exported vtable symbols. It looks them up directly in the ELF file.
09-06-2020

I opened JDK-8247272 to cover "whatis" not working. It turns out that the Java-level ELF file support has never worked for 64-bit, but there is also separate native support that does work. This is why calls to getFile().getHeader().getELFSymbol(offset) fail but calls to dbg.lookup(addr) work. The underlying (critical) mechanism for instantiating Java wrappers for C++ objects relies on the native support, so it has continued to work.
09-06-2020

Ok. I added comments there, but in summary I don't think SA needs explicit exports of symbols (vtable or VMStructs), since it looks them up in the ELF file and can convert what it finds in the ELF file to an in-process memory address.
08-06-2020
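
The conversion mentioned in the comment above, from a symbol found in the ELF file to an in-process address, might be illustrated as follows. This is a minimal sketch, assuming libjvm.so is a shared object whose symbol values are offsets relative to the address it was loaded at. The class LoadBiasSketch and the load base are hypothetical; the symbol value is the one from the nm output in the description. It is not the code SA uses.

```java
/** Illustrates symbol value <-> in-process address conversion. Hypothetical helper. */
final class LoadBiasSketch {
    /** In-process address of a symbol, given where the library was mapped. */
    static long toRuntimeAddress(long loadBase, long symbolValue) {
        return loadBase + symbolValue;
    }

    /** Inverse direction: the value to search for in the ELF symbol table. */
    static long toSymbolValue(long loadBase, long runtimeAddress) {
        return runtimeAddress - loadBase;
    }

    public static void main(String[] args) {
        long loadBase = 0x7f106b000000L; // assumed mapping address of libjvm.so
        long vtableValue = 0xe37160L;    // _ZTV10ArrayKlass as shown by nm
        long address = toRuntimeAddress(loadBase, vtableValue);
        System.out.println("0x" + Long.toHexString(address));
    }
}
```

Because the lookup reads the symbol table from the file, it works the same way whether the symbol is local or global, which is consistent with the conclusion that explicit exports are not needed.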

Also see JDK-8017234 which discusses what is needed to move away from map files in Hotspot.
08-06-2020

From the build perspective, I would be interested in no longer doing all the symbol processing we're currently doing. I am not sure why we are doing it (apart from the "we've always done it this way!" reply), and it adds a considerable amount of complexity to the hotspot build. As I remember, the last time I asked around about it, there was some handwaving about "SA needing it". I'm trying to figure out if that is perhaps not the case anymore.
08-06-2020

I pointed out above that getFile().getHeader().getELFSymbol(offset) seems to no longer work when trying to find the symbol at a certain address, but the following does work: ClosestSymbol nativeSymbol = dbg.lookup(dbg.getAddressValue(a)); The lookup call ends up in LinuxDebuggerLocal::lookupByAddress0(), which is in libsaproc. I figured we must need some sort of vtable symbol support in order to do the numerous mappings of addresses to VM types that SA does. This is the reverse of the address-to-symbol lookup and instead does symbol-to-address, but has similar requirements on the underlying support. I tracked down how this is done, and it too relies on libsaproc. It calls LinuxDebuggerLocal_lookupByName0(), which itself looks in the ELF file, but seems to be working, unlike the Java support. So the vtable symbols are in the ELF file, but it does not appear that getFile().getHeader().getELFSymbol(offset) works for any symbol, even ones that gcc exports. You instead need to use the libsaproc native code to do symbol lookups.
05-06-2020
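
The two lookup directions discussed in the comment above (address to closest symbol, and symbol name to address) can be sketched with two maps. This is only an illustration under stated assumptions: SymbolTableSketch and its methods are hypothetical and are not how libsaproc implements lookupByAddress0() or lookupByName0(). Addresses are assumed to fit in a positive long, and the second symbol's address is made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy symbol table supporting both lookup directions. Hypothetical helper. */
final class SymbolTableSketch {
    /** Nearest symbol at or below an address, plus the distance to it. */
    static final class Closest {
        final String name;
        final long offset;
        Closest(String name, long offset) { this.name = name; this.offset = offset; }
    }

    private final TreeMap<Long, String> byAddress = new TreeMap<>();
    private final Map<String, Long> byName = new HashMap<>();

    void add(String name, long address) {
        byAddress.put(address, name);
        byName.put(name, address);
    }

    /** Symbol to address; returns null if the name is unknown. */
    Long lookupByName(String name) {
        return byName.get(name);
    }

    /** Address to closest preceding symbol; returns null if none precedes it. */
    Closest lookupByAddress(long address) {
        Map.Entry<Long, String> e = byAddress.floorEntry(address);
        return e == null ? null : new Closest(e.getValue(), address - e.getKey());
    }

    public static void main(String[] args) {
        SymbolTableSketch table = new SymbolTableSketch();
        table.add("_ZTV10ArrayKlass", 0xe37160L);   // value from the description
        table.add("gHotSpotVMStructs", 0xe40000L);  // address invented for the example
        Closest c = table.lookupByAddress(0xe37170L);
        System.out.println(c.name + "+0x" + Long.toHexString(c.offset));
    }
}
```

Both directions only need the symbol to be present in the table that is read from the file, not exported from the library.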

Actually I meant "whatis", not "where". I think we should close this CR as "Not an Issue" and open a new one for the symbol lookup failure that we see on linux (and probably OSX too). Although I have a fix for it (another approach to getting to the symbol), it would be nice to first figure out why getFile().getHeader().getELFSymbol(offset) does not seem to work. I assume it did at some point.
02-06-2020

I was only referring to symbols which nm displays, but which for some reason getFile().getHeader().getELFSymbol(offset) does not seem to find. This causes commands like "whatis" to not be able to map an address to a symbol. Yes, I was able to fix this by using LinuxDebugger.dbg(), which seems to use the proc API to reverse look up symbols. I don't know if this will find vtable symbols also. Probably not. It might be that there are two separate issues here. One is the lack of vtable symbols, which even nm does not seem to display, and the other is the failure to look up symbols that are present (and that nm displays), which is impacting commands like "whatis". I currently don't know how or even if SA uses vtable symbols. In the description of this bug Dmitriy lists 3 commands that don't seem to work due to lack of vtables, but never mentions where he got the address being used, so I have no way of reproducing it. If I could reproduce it then I could turn on verbose mode and get the full back trace of the exception and debug further. As far as I can tell, the WrongTypeException just means the address does not represent a type supported by dumpreplaydata (MetaData subclass or NMethod), jdis (Method), or printmdo (MethodData). You cannot simply infer that it means there is a lack of vtable symbols, or that it is caused by a lack of vtable symbols. These commands all use <type>.instantiateWrapperFor(), which will throw WrongTypeException if passed an address that is not of the type requested. There are plenty of clhsdb tests that use these commands and others that use instantiateWrapperFor(), and the only cases I've seen so far of getting WrongTypeException are due to other bugs, many of which I have fixed recently, but which have nothing to do with vtable symbols.
02-06-2020

No, no significant input. The pointer to the vtbl was used to deal with the Java class metadata when Java had permgen; I'm not sure it has any value nowadays. And I agree with Chris: it looks like "where" not working is a separate issue.
02-06-2020

[~dsamersoff] Dmitriy, do you have any input to Chris to help him debug this?
02-06-2020

[~cjplummer] I'm not sure I understand the consequences of your findings. Are you saying that you've found a way to implement the functionality in SA without these vtable maps?
01-06-2020

Yep, that works. Unfortunately, though, I had to add Linux-specific code to PointerFinder: LinuxDebugger dbg = (LinuxDebugger)VM.getVM().getDebugger(); ClosestSymbol nativeSymbol = dbg.lookup(dbg.getAddressValue(a)); There's no OS-agnostic API in JVMDebugger to allow getting to the lookup() API that exists in all our posix debuggers. WindbgDebugger doesn't have lookup(), but probably doesn't need it either. The PStack code gets around this lack of abstraction by going through CFrame.closestSymbolToPC(), for which there are platform-specific implementations. So probably what is needed here is to declare lookup() as abstract in the JVMDebugger interface, and then add an implementation to WindbgDebuggerLocal that probably just continues to rely on DSO.closestSymbolToPC(pc).
29-05-2020
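
The abstraction proposed in the comment above could look roughly like the sketch below. Everything here is a hypothetical stand-in written for illustration: NativeSymbolLookup plays the role of a lookup() declared on the common debugger interface, and the two implementations only mimic the POSIX (native) and Windows (DSO-based) paths. None of these types exist in SA, and the real lookup() returns a ClosestSymbol rather than a string.

```java
import java.util.Optional;
import java.util.function.LongFunction;

/** Hypothetical OS-agnostic lookup API; stands in for lookup() on JVMDebugger. */
interface NativeSymbolLookup {
    Optional<String> lookup(long address);
}

/** Stand-in for the POSIX debuggers, which already have a native lookup. */
final class PosixLookupSketch implements NativeSymbolLookup {
    @Override
    public Optional<String> lookup(long address) {
        // In SA this would end up in native code; here one symbol is hard-coded.
        return address == 0x1000L ? Optional.of("JVM_Example") : Optional.empty();
    }
}

/** Stand-in for a Windows debugger that delegates to a DSO-style lookup. */
final class WindowsLookupSketch implements NativeSymbolLookup {
    private final LongFunction<String> dsoLookup; // may return null if not found

    WindowsLookupSketch(LongFunction<String> dsoLookup) {
        this.dsoLookup = dsoLookup;
    }

    @Override
    public Optional<String> lookup(long address) {
        return Optional.ofNullable(dsoLookup.apply(address));
    }
}

/** Caller that no longer needs a platform-specific cast. */
final class PointerFinderSketch {
    static String describe(NativeSymbolLookup dbg, long address) {
        return dbg.lookup(address).map(s -> "native symbol " + s).orElse("unknown");
    }

    public static void main(String[] args) {
        System.out.println(describe(new PosixLookupSketch(), 0x1000L));
        System.out.println(describe(new WindowsLookupSketch(a -> null), 0x1000L));
    }
}
```

The point of the design is that the caller depends only on the interface, so the cast to LinuxDebugger shown in the comment would no longer be necessary.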

I think I found one bit of SA functionality that is not working because of this. DSO.closestSymbolToPC(pc) seems to always fail. It does the following, which never seems to get a result: ELFSymbol sym = getFile().getHeader().getELFSymbol(offset); I even tried with some global symbols that nm says are exported from libjvm.so, such as gHotSpotVMStructs and various JVM_XXX functions. This causes the old "whatis" command to fail to find the symbol closest to the address, although "whatis" is currently only supported on 8 and earlier (I plan on fixing that soon). I've also modified "findpc" to try to look up the symbol for an address like "whatis" does, and it also fails. However, I noticed pstack does print out native symbol names, so I looked into how. It starts off with: CFrame f = cdbg.topFrameForThread(th); ... ClosestSymbol sym = f.closestSymbolToPC(); Don't be fooled by the similar-looking CFrame.closestSymbolToPC(). It is completely unrelated to LoadObject/DSO.closestSymbolToPC(pc). Rather than doing the ELF symbol lookup as mentioned above, it ends up in the native LinuxDebuggerLocal::lookupByAddress0(), which does: struct ps_prochandle* ph = get_proc_handle(env, this_obj); sym = symbol_for_pc(ph, (uintptr_t) addr, &offset); So this seems to work. I'll see if I can get my "findpc" support (which is actually implemented in PointerFinder) to rely on this same approach.
29-05-2020

Dmitry shows some failures that are apparently due to the lack of _ZTV* symbols, although I'm not so sure you can conclude that is the case from his comments, since I have no idea where the address he is using comes from. It could in fact simply be an invalid address. We do however have a couple of other CRs filed with the same WrongTypeException. They are JDK-8200217 and JDK-8235220. Possibly they are related to this issue somehow.
17-03-2020

This has apparently been broken for 4-5 years. Is this functionality still required, or can we drop this, close the bug, and finally clean out the related messy (but broken) code in the make files?
15-01-2019

A couple of SA commands are broken because of the absence of _ZTV* symbols.

Opening core file, please wait...
hsdb> dumpreplaydata 0x00007f106c0a6000
Error: sun.jvm.hotspot.types.WrongTypeException: No suitable match for type of address 0x00007f106c0a6000
hsdb> jdis 0x00007f106c0a6000
Error: sun.jvm.hotspot.types.WrongTypeException: No suitable match for type of address 0x00007f106c0a6000
hsdb> printmdo 0x00007f106c0a6000
Error: sun.jvm.hotspot.types.WrongTypeException: No suitable match for type of address 0x00007f106c0a6000
hsdb>
14-02-2014