JDK-8163015 : Windows os::check_heap() fails with fatal error: corrupted C heap
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 9
  • Priority: P3
  • Status: Closed
  • Resolution: Not an Issue
  • OS: windows
  • CPU: x86
  • Submitted: 2016-08-03
  • Updated: 2016-08-04
  • Resolved: 2016-08-04
Description
Heap verification fails on Windows with:

# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (c:/jprt/T/P1/093716.dpochepk/s/hotspot/src/os/windows/vm/os_windows.cpp:5307), pid=52528, tid=105564
#  fatal error: corrupted C heap
#
# JRE version: Java(TM) SE Runtime Environment (9.0) (fastdebug build 9-internal+0-2016-08-01-093716.dpochepk.hs-comp)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 9-internal+0-2016-08-01-093716.dpochepk.hs-comp, mixed mode, tiered, compressed oops, g1 gc, windows-amd64)

Comments
[~iveresov] Agree. I filed JDK-8163146 - Disable os::check_heap on Windows by default
04-08-2016

If it doesn't work, why not disable it by default and put it under a develop flag? Otherwise we would be forced (by the current gatekeeping policy) to file this bug over and over again.
04-08-2016

Closing this bug as not an issue since it's most likely a spurious failure in HeapValidate().
04-08-2016

The error reported by os_windows.cpp is:
    [C heap has been corrupted (time: 1 allocations)
    corrupted block near address 0x3f57c0, length 40, count 2688
The offending address is within heap 00340000, and "!heap -v 00340000" (validate the heap) shows no corruption:
    0:000> !heap
    Index   Address  Name      Debugging options enabled
      1:   00340000
      2:   00010000
      3:   00020000
      4:   006a0000
      5:   02130000
      6:   002a0000
      7:   1d9a0000
      8:   01f80000
      9:   1dd40000
    0:000> !heap -v 00340000
    Index   Address  Name      Debugging options enabled
      1:   00340000
        Segment at 0000000000340000 to 0000000000440000 (00100000 bytes committed)
        Segment at 000000001f4d0000 to 000000001f5d0000 (00100000 bytes committed)
        Segment at 000000006aba0000 to 000000006ada0000 (00200000 bytes committed)
        Segment at 0000000071010000 to 0000000071410000 (00400000 bytes committed)
        Segment at 0000000073490000 to 0000000073c90000 (00420000 bytes committed)
        Flags:                00000002
        ForceFlags:           00000000
        Granularity:          16 bytes
        Segment Reserve:      01000000
        Segment Commit:       00002000
        DeCommit Block Thres: 00000400
        DeCommit Total Thres: 00001000
        Total Free Size:      0000a495
        Max. Allocation Size: 000007fffffdefff
        Lock Variable at:     0000000000340208
        Next TagIndex:        0000
        Maximum TagIndex:     0000
        Tag Entries:          00000000
        PsuedoTag Entries:    00000000
        Virtual Alloc List:   00340118
        Unable to read nt!_HEAP_VIRTUAL_ALLOC_ENTRY structure at 0000000000440000
        Uncommitted ranges:   003400f8
        FreeList[ 00 ] at 0000000000340158: 0000000073827f10 . 000000007139b490 (60 blocks)
I can walk out all blocks inside the heap without any issue. Note that 00000000003f57c0 is a sub-segment within the block 00000000003f5750:
    0:000> !heap -p -all
    _HEAP @ 340000
    _LFH_HEAP @ 440040
    _HEAP_SEGMENT @ 340000
    CommittedRange @ 340a80
      HEAP_ENTRY       Size Prev Flags UserPtr          UserSize - state
    * 0000000000340a80 0086 0000 [00]  0000000000340a90 00850 - (busy)
    Line 0007
      00000000003412e0 0003 0086 [00]  00000000003412f0 00028 - (busy)
      ...
      00000000003f4c90 00ac 00ac [00]  00000000003f4ca0 00ab0 - (busy)
    * 00000000003f5750 0080 00ac [00]  00000000003f5760 007f0 - (busy)
    Line 2783
      00000000003f5780 0003 0080 [00]  00000000003f5790 00020 - (busy)
      00000000003f57b0 0003 0003 [00]  00000000003f57c0 00020 - (busy) <<<<< Reported as corrupt
      00000000003f57e0 0003 0003 [00]  00000000003f57f0 00020 - (busy)
      00000000003f5810 0003 0003 [00]  00000000003f5820 00020 - (busy)
      00000000003f5840 0003 0003 [00]  00000000003f5850 00020 - (busy)
      ...
      00000000003f5f00 0003 0003 [00]  00000000003f5f10 00020 - (busy)
      00000000003f5f50 00ac 0003 [00]  00000000003f5f60 00ab0 - (busy)
    * 00000000003f6a10 0200 00ac [00]  00000000003f6a20 01ff0 - (busy)
      00000000003f6a40 001a 0200 [00]  00000000003f6a50 00190 - (busy)
      00000000003f6be0 001a 001a [00]  00000000003f6bf0 00190 - (busy)
"heap -i" of the several blocks leading to the failure shows no corruption:
    0:000> !heap -i 00000000003f5750
    Detailed information for block entry 00000000003f5750
    Assumed heap       : 0x0000000000340000 (Use !heap -i NewHeapHandle to change)
    Header content     : 0xD73280DD 0x10006591 (decoded : 0x89090080 0x100000AC)
    Owning segment     : 0x0000000000340000 (offset 0)
    Block flags        : 0x1 (busy )
    Total block size   : 0x80 units (0x800 bytes)
    Requested size     : 0x7f0 bytes (unused 0x10 bytes)
    Previous block size: 0xac units (0xac0 bytes)
    Block CRC          : OK - 0x89
    Previous block     : 0x00000000003f4c90
    Next block         : 0x00000000003f5f50
    0:000> !heap -i 00000000003f5780
    Detailed information for block entry 00000000003f5780
    Assumed heap       : 0x0000000000340000 (Use !heap -i NewHeapHandle to change)
    Header content     : 0x4BBC3E9F 0x900000D5
    Block flags        : 0x1 LFH (busy )
    Total block size   : 0x3 units (0x30 bytes)
    Requested size     : 0x20 bytes (unused 0x10 bytes)
    Subsegment         : 0x00000000003ac1b0
    0:000> !heap -i 00000000003f57b0 <<<<<<<<<<<< CRASHING BLOCK
    Detailed information for block entry 00000000003f57b0
    Assumed heap       : 0x0000000000340000 (Use !heap -i NewHeapHandle to change)
    Header content     : 0x4BBC3E9C 0x900000D5
    Block flags        : 0x1 LFH (busy )
    Total block size   : 0x3 units (0x30 bytes)
    Requested size     : 0x20 bytes (unused 0x10 bytes)
    Subsegment         : 0x00000000003ac1b0
    0:000> !heap -i 00000000003f57e0
    Detailed information for block entry 00000000003f57e0
    Assumed heap       : 0x0000000000340000 (Use !heap -i NewHeapHandle to change)
    Header content     : 0x4BBC3E99 0x900000D5
    Block flags        : 0x1 LFH (busy )
    Total block size   : 0x3 units (0x30 bytes)
    Requested size     : 0x20 bytes (unused 0x10 bytes)
    Subsegment         : 0x00000000003ac1b0
Inside Visual Studio, the previous 2 logged blocks (in saved_heap_entries) are the same as we have observed in the "heap -i" commands above
    lpData=0x00000000003f4ca0 cbData=0x00000ab0 cbOverhead=0x10
    lpData=0x00000000003f5790 cbData=0x00000028 cbOverhead=0x08
Thus, no sign of corruption is found in windbg.
04-08-2016
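The address arithmetic behind the observation above can be checked by hand. The following is a minimal sketch, not part of the original analysis: it assumes the 16-byte granularity reported by "!heap -v" and assumes that the user pointer sits 0x10 bytes past the entry address, as every row of the "!heap -p -all" listing suggests. The helper names are hypothetical.

```python
# Values are copied from the WinDbg output in the comment above.
GRANULARITY = 16      # bytes per heap unit ("Granularity: 16 bytes")
HEADER_BYTES = 0x10   # assumption: UserPtr = HEAP_ENTRY + 0x10, as seen in the listing


def block_range(entry, size_units):
    """Return (start, end) of a heap block; end is exclusive."""
    return entry, entry + size_units * GRANULARITY


def user_ptr(entry):
    """Return the user-visible pointer for a heap entry (assumed fixed header size)."""
    return entry + HEADER_BYTES


# Enclosing block 0x3f5750: "Total block size: 0x80 units (0x800 bytes)".
outer_start, outer_end = block_range(0x3F5750, 0x80)

reported = 0x3F57C0   # address reported as corrupt by os_windows.cpp
lfh_entry = 0x3F57B0  # LFH entry flagged in the listing

inside_outer = outer_start <= reported < outer_end
matches_lfh_user_ptr = user_ptr(lfh_entry) == reported

print(hex(outer_end), inside_outer, matches_lfh_user_ptr)
```

The computed end of the enclosing block, 0x3f5f50, agrees with the "Next block" value that "!heap -i 00000000003f5750" prints, and the reported address is exactly the user pointer of the LFH entry at 0x3f57b0.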

Thanks, Christian. I'll try to patch the PDB for the current dump using the "PDB Type Theft" script and see if that works.
03-08-2016

The symbols for that version of ntdll.dll seem to be missing the type information that !heap needs:
    00000000`77700000 00000000`778a9000 ntdll (pdb symbols) c:\temp\symbols1\ntdll.pdb\1EA2E1024B9149A883257AD45C8E45CB2\ntdll.pdb
    Loaded symbol image file: ntdll.dll
    Image path: C:\Windows\System32\ntdll.dll
    Image name: ntdll.dll
    Browse all global symbols functions data
    Timestamp: Wed Jul 15 14:08:22 2015 (55A6A196)
    CheckSum: 001AE9F4
    ImageSize: 001A9000
    File version: 6.1.7601.18933
    Product version: 6.1.7601.18933
    File flags: 0 (Mask 3F)
    File OS: 40004 NT Win32
    File type: 2.0 Dll
    File date: 00000000.00000000
    Translations: 0409.04b0
    CompanyName: Microsoft Corporation
    ProductName: Microsoft® Windows® Operating System
    InternalName: ntdll.dll
    OriginalFilename: ntdll.dll
    ProductVersion: 6.1.7601.18933
    FileVersion: 6.1.7601.18933 (win7sp1_gdr.150715-0600)
    FileDescription: NT Layer DLL
    LegalCopyright: © Microsoft Corporation. All rights reserved.
The discussion mentioned in the previous comment is about an ntdll from mid-2015, just like this one. It looks like the symbols were stripped more than usual; apparently this has been corrected in later versions of ntdll.
03-08-2016
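The hexadecimal value that WinDbg prints next to the module timestamp is a count of seconds since the Unix epoch, so the build date of this ntdll.dll can be confirmed independently. A minimal sketch follows; the function name is hypothetical, and the result is in UTC, whereas WinDbg displays the debugger host's local time.

```python
from datetime import datetime, timezone


def pe_timestamp_to_utc(hex_stamp):
    """Convert a hexadecimal module timestamp (seconds since 1970-01-01 UTC) to a datetime."""
    return datetime.fromtimestamp(int(hex_stamp, 16), tz=timezone.utc)


# "55A6A196" is the value shown on the "Timestamp:" line for ntdll.dll above.
build_time = pe_timestamp_to_utc("55A6A196")
print(build_time.isoformat())  # 2015-07-15T18:08:22+00:00
```

The UTC result is four hours ahead of the 14:08:22 shown by WinDbg, which is consistent with the dump having been inspected on a machine at UTC-4, and it confirms a mid-July 2015 build.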

Found more info about the missing _HEAP_ENTRY type information: Microsoft seems to have removed type information from ntdll.pdb in recent Windows updates.
    https://www.osronline.com/ShowThread.cfm?link=269221
    http://stackoverflow.com/questions/32217038/ntdll-module-not-loading-correctly-in-windbg-but-why
The stackoverflow page provides a work-around:
    UPDATE: 10/12/2015: Possible workaround using the PDB Type Theft python script which copies type information from one PDB to another. The usage would be to copy the type information from an older PDB that has the type information that was removed in later PDBs. This link has all the details:
    http://h30499.www3.hp.com/t5/HP-Security-Research-Blog/PDB-Type-Theft/ba-p/6801065#.Vhv2gPm6fmE
03-08-2016

When I load the MDMP file inside WinDBG and run !heap, WinDBG complains:
    0:000> !heap
    *************************************************************************
    ***                                                                   ***
    ***                                                                   ***
    ***    Either you specified an unqualified symbol, or your debugger   ***
    ***    doesn't have full symbol information.  Unqualified symbol      ***
    ***    resolution is turned off by default. Please either specify a   ***
    ***    fully qualified symbol module!symbolname, or enable resolution ***
    ***    of unqualified symbols by typing ".symopt- 100". Note that     ***
    ***    enabling unqualified symbol resolution with network symbol     ***
    ***    server shares in the symbol path may cause the debugger to     ***
    ***    appear to hang for long periods of time when an incorrect      ***
    ***    symbol name is typed or the network symbol server is down.     ***
    ***                                                                   ***
    ***    For some commands to work properly, your symbol path           ***
    ***    must point to .pdb files that have full type information.      ***
    ***                                                                   ***
    ***    Certain .pdb files (such as the public OS symbols) do not      ***
    ***    contain the required information.  Contact the group that      ***
    ***    provided you with these symbols if you need this command to    ***
    ***    work.                                                          ***
    ***                                                                   ***
    ***    Type referenced: ntdll!_HEAP_ENTRY                             ***
    ***                                                                   ***
    *************************************************************************
    Invalid type information
[~ctornqvi] could you help?
03-08-2016