JDK-6511991 : add support for real temporaries in adlc
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 5.0,7
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic,solaris_9
  • CPU: generic,sparc
  • Submitted: 2007-01-11
  • Updated: 2011-02-09
  • Resolved: 2007-03-15
JDK 6        JDK 7      Other
6u4 Fixed    7 Fixed    hs10 Fixed
Description
Often when writing complex instruction definitions in an ad file, temporary registers are needed for code generation.  KILLs can be used in some cases, but this requires a fixed register for the temporary, which overly constrains the register allocator and results in more shuffling of registers than is really necessary.  This is particularly noticeable in i486.ad.

Comments
EVALUATION TEMPs can be created by having Expand create new nodes to represent them.
14-02-2007

SUGGESTED FIX
Job ID: 20070213174113.never.6511991
Original workspace: smite:/export/ws/6511991
Submitter: never
Archived data: /net/prt-archiver.sfbay/data/archived_workspaces/main/c2_baseline/2007/20070213174113.never.6511991/
Webrev: http://prt-web.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/main/c2_baseline/2007/20070213174113.never.6511991/workspace/webrevs/webrev-2007.02.13/index.html

Fixed 6511991: add support for real temporaries in adlc

Often when writing complex instruction definitions in an ad file, temporary registers are needed for code generation. KILLs can be used in some cases, but this requires a fixed register for the temporary, which overly constrains the register allocator and results in more shuffling of registers than is really necessary. This is particularly noticeable in i486.ad.

This change adds a new effect called TEMP, which is like a synthetic USE. USEs represent real inputs to the MachNode and come from the inputs of the match rule. KILLs don't correspond to inputs to the MachNode, so they can't be assigned a register. KILLs don't interfere with the inputs to the node either, so they aren't very useful for creating temporaries. TEMP can also be used to modify DEFs, which means that the DEF will interfere with the inputs, guaranteeing that the output register is different from any of the inputs.

There are some minor restrictions on their use. TEMPs must come before any KILLs in the argument list of the instruction. This is because of the machinery in adlc that numbers the inputs. Making it flexible was too complicated, but adlc will complain when you violate this rule.

I changed all the ad files in the places where it made sense. In sparc.ad I left alone the uses of O7 as a temp, since O7 isn't part of the allocatable register sets, so broadening the mask for those uses wouldn't help register pressure. I also fixed a lot of code to use operand names instead of hard-coding the register names.

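The ordering restriction described above (TEMPs before KILLs in the argument list) can be pictured with a small check. This is only a sketch in C++: the `Effect` enum, the `Operand` struct, and both function names are hypothetical and are not taken from the adlc sources; the real check lives inside adlc's input-numbering machinery and may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the effects named in the report; not adlc's real types.
enum class Effect { USE, DEF, KILL, TEMP };

struct Operand {
    std::string name;
    Effect effect;
};

// Returns the index of the first TEMP that appears after a KILL,
// or operands.size() if no TEMP follows a KILL.
std::size_t first_misplaced_temp(const std::vector<Operand>& operands) {
    bool seen_kill = false;
    for (std::size_t i = 0; i < operands.size(); ++i) {
        if (operands[i].effect == Effect::KILL) {
            seen_kill = true;
        } else if (operands[i].effect == Effect::TEMP && seen_kill) {
            return i;
        }
    }
    return operands.size();
}

// True when every TEMP precedes every KILL in the argument list.
bool temps_precede_kills(const std::vector<Operand>& operands) {
    const std::size_t bad = first_misplaced_temp(operands);
    assert(bad <= operands.size());  // the helper never returns a larger index
    return bad == operands.size();
}
```

Under these assumptions, an argument list ordered DEF, USE, TEMP, KILL passes, while DEF, KILL, TEMP is rejected at index 2, which is the kind of case the report says adlc complains about.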
I had to work around a bug in the iterator model used in adlc: the iterators are internal instead of external, so two different pieces of code can't iterate over the same object simultaneously. I added some code to preserve the state of the iterator when ComponentLists use iteration internally, so that queries on a ComponentList won't break users that are also iterating over the list.

I added field name printing to the opto assembly output when the ciField is available from the adr_type. Printing now looks like this:

    fd7   B129: #  B141 B130 <- B128  Freq: 18860
    fd7   MOV    EBX,[EBP + #8] ! Field java/util/HashMap$Entry.key
    fda   MOV    EAX,[EBP + #12] ! Field java/util/HashMap$Entry.value
    fdd   CMPu   EBX,EDX
    fdf   Jeq,us B141  P=0.027625 C=18860.000000

Register allocation time doesn't appear to be affected, and performance looks pretty much like a wash, though there are tiny regressions and improvements in some of the subbenchmarks. The current refworkload data is at the end.

http://javaweb.sfbay/~never/webrev/6511991

Approved by:
Reviewed by:
Fix verified (y/n): y

sunblade 2500 2x1.2G 2G RAM
============================================================================
t1: reference_server
  Benchmark           Samples        Mean     Stdev
  jetstream                15       66.31      1.99
  scimark                  15       74.79      0.87
  specjbb2000              15    25853.12    180.32
  specjbb2005              15     8419.68    243.87
  specjvm98                15      140.23      0.71
  volano25                 15    12443.73    220.08
--------------------------------------------------------------------------
  Weighted Geomean                  1767.89
============================================================================
t2: reference_server
  Benchmark           Samples        Mean     Stdev   %Diff       P  Significant
  jetstream                15       65.85      1.37   -0.70   0.463  *
  scimark                  15       75.20      0.92    0.54   0.226  *
  specjbb2000              15    25817.19    151.87   -0.14   0.560  *
  specjbb2005              15     8390.93    212.50   -0.34   0.733  *
  specjvm98                15      139.67      0.57   -0.39   0.026  *
  volano25                 15    12514.67    265.19    0.57   0.432  *
--------------------------------------------------------------------------
  Weighted Geomean                  1767.16             -0.04
============================================================================

hsdev-5 8x2.6G 2G RAM
============================================================================
t1: reference_server
  Benchmark           Samples        Mean     Stdev
  jetstream                15      149.89      1.98
  scimark                  15      306.71      0.86
  specjbb2000              15   118245.86   2226.14
  specjbb2005              15    10552.48    201.69
  specjvm98                15      320.76      1.75
  volano25                 15   113678.13   7787.67
--------------------------------------------------------------------------
  Weighted Geomean                  5551.44
============================================================================
t2: reference_server
  Benchmark           Samples        Mean     Stdev   %Diff       P  Significant
  jetstream                15      155.59      1.38    3.80   0.000  Yes
  scimark                  15      306.32      0.65   -0.13   0.169  *
  specjbb2000              15   118370.99   1480.00    0.11   0.858  *
  specjbb2005              15    10659.55    101.61    1.01   0.081  *
  specjvm98                15      322.91      1.63    0.67   0.002  Yes
  volano25                 15   113387.80   4217.76   -0.26   0.900  *
--------------------------------------------------------------------------
  Weighted Geomean                  5588.83              0.67
============================================================================

The jetstream gain comes from a 13% improvement in Copy, which just seems like an aberration. The generated code isn't significantly different.
14-02-2007
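
The iterator workaround mentioned in the suggested fix (preserving iterator state when a list iterates over itself internally) can be illustrated with a small C++ sketch. Everything here is an assumption made for illustration: `CursorList`, `PreserveCursor`, `next()`, and `count()` are hypothetical names, and the actual ComponentList code in adlc may save and restore its state in a different way.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical list with a single internal cursor; not adlc's ComponentList.
class CursorList {
public:
    explicit CursorList(std::vector<std::string> items)
        : items_(std::move(items)), cursor_(0) {}

    // Moves the internal cursor back to the first element.
    void reset() { cursor_ = 0; }

    // Returns the next item and advances, or nullptr at the end of the list.
    const std::string* next() {
        assert(cursor_ <= items_.size());
        if (cursor_ == items_.size()) return nullptr;
        return &items_[cursor_++];
    }

    // A query that iterates internally. The guard saves the caller's cursor
    // and restores it on exit, so an outer loop over the same list continues
    // from where it left off.
    std::size_t count(const std::string& value) {
        PreserveCursor guard(*this);
        reset();
        std::size_t n = 0;
        while (const std::string* s = next()) {
            if (*s == value) ++n;
        }
        return n;
    }

private:
    // RAII helper: remembers the cursor on construction, restores it on destruction.
    class PreserveCursor {
    public:
        explicit PreserveCursor(CursorList& list)
            : list_(list), saved_(list.cursor_) {}
        ~PreserveCursor() { list_.cursor_ = saved_; }
        PreserveCursor(const PreserveCursor&) = delete;
        PreserveCursor& operator=(const PreserveCursor&) = delete;
    private:
        CursorList& list_;
        std::size_t saved_;
    };

    std::vector<std::string> items_;
    std::size_t cursor_;
};
```

Without the guard, a caller that is halfway through the list and calls `count()` would find its cursor at the end afterwards. With the guard, the outer iteration resumes at the element it had reached. External iterators would avoid the problem entirely, but the report describes keeping the internal model and patching around it.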