Bug ID: JDK-6511991 add support for real temporaries in adlc

Details
Type: Enhancement
Submit Date: 2007-01-11
Status: Resolved
Updated Date: 2011-02-09
Project Name: JDK
Resolved Date: 2007-03-15
Component: hotspot
OS: solaris_9,generic
Sub-Component: compiler
CPU: sparc,generic
Priority: P4
Resolution: Fixed
Affected Versions: 5.0,7
Fixed Versions: hs10 (b10)


Description
Often when writing complex instruction definitions in an ad file, temporary registers are needed for code generation.  KILLs can be used in some cases, but this requires having a fixed register for the temporary, which overly constrains the register allocator and results in more shuffling of registers than is really necessary.  This is particularly noticeable in i486.ad.

                                    

Comments
EVALUATION

TEMPs can be created by having Expand create new nodes to represent them.
                                     
2007-02-14
SUGGESTED FIX

Job ID:                 20070213174113.never.6511991
Original workspace:     smite:/export/ws/6511991
Submitter:              never
Archived data:          /net/prt-archiver.sfbay/data/archived_workspaces/main/c2_baseline/2007/20070213174113.never.6511991/
Webrev:                 http://prt-web.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/main/c2_baseline/2007/20070213174113.never.6511991/workspace/webrevs/webrev-2007.02.13/index.html

Fixed 6511991: add support for real temporaries in adlc

Often when writing complex instruction definitions in an ad file,
temporary registers are needed for code generation.  KILLs can be used
in some cases, but this requires having a fixed register for the
temporary, which overly constrains the register allocator and results
in more shuffling of registers than is really necessary.  This is
particularly noticeable in i486.ad.

This change adds a new effect called TEMP, which is like a synthetic
USE.  USEs represent real inputs to the MachNode and come from the
inputs of the match rule.  KILLs don't correspond to inputs of the
MachNode, so they can't be assigned a register.  KILLs also don't
interfere with the inputs to the node, so they aren't very useful for
creating temporaries.

TEMP can also be used to modify DEFs, which means that the DEF will
interfere with the inputs, guaranteeing that the output register is
different from any of the inputs.
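
As a rough illustration of the difference between a fixed-register KILL
and a TEMP, here is a toy model in C++.  It is only a sketch: registers
are modeled as bit positions in a mask, and the names RegMask,
fixed_temp_needs_shuffle and pick_temp are hypothetical.  It does not
reflect how C2's register allocator is actually implemented.

```cpp
#include <cstdint>

// Toy model only: each register is one bit position in a 32-bit mask.
using RegMask = std::uint32_t;

// KILL-style temporary: the instruction names one fixed register.  If an
// input already lives there, the allocator has to move it out of the way.
bool fixed_temp_needs_shuffle(int fixed_reg, RegMask input_regs) {
    return ((input_regs >> fixed_reg) & 1u) != 0;
}

// TEMP-style temporary: any register in the operand's class will do, as
// long as it does not overlap the inputs (the TEMP interferes with them).
// Returns the lowest suitable register number, or -1 if none is free.
int pick_temp(RegMask class_mask, RegMask input_regs) {
    RegMask candidates = class_mask & ~input_regs;
    for (int r = 0; r < 32; ++r) {
        if ((candidates >> r) & 1u) return r;
    }
    return -1;  // class exhausted; a real allocator would have to spill
}
```

With inputs in registers 0 and 1, a temporary fixed to register 0 forces
a shuffle, while pick_temp over a four-register class simply returns
register 2.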

There are some minor restrictions on their use.  TEMPs must come
before any KILLs in the argument list of the instruction.  This is
because of the machinery in adlc having to do with the numbering of
inputs.  Making it flexible was too complicated, but adlc will
complain when you violate this rule.
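
The ordering rule can be sketched as a simple check over the effects of
an instruction's arguments, taken in argument-list order.  This is a
minimal sketch, not adlc's code: the Effect enum and the function
first_misplaced_temp are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-argument effect, listed in argument-list order.
enum class Effect { USE, DEF, TEMP, KILL };

// Returns the index of the first TEMP that appears after a KILL,
// or -1 if every TEMP precedes every KILL.
int first_misplaced_temp(const std::vector<Effect>& effects) {
    bool seen_kill = false;
    for (std::size_t i = 0; i < effects.size(); ++i) {
        if (effects[i] == Effect::KILL) {
            seen_kill = true;
        } else if (effects[i] == Effect::TEMP && seen_kill) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

A checker along these lines would accept DEF, USE, TEMP, KILL and
report index 3 for DEF, USE, KILL, TEMP.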

I changed all the ad files in the places where it made sense.  In
sparc.ad I left alone the uses of O7 as a temp, since O7 isn't part
of the allocatable register sets, so broadening the mask for those uses
wouldn't help register pressure.  I also fixed a lot of code to use
operand names instead of hard-coding register names.

I had to work around a bug in the iterator model used in adlc: the
iterators are internal instead of external, so two different pieces of
code can't iterate the same object simultaneously.  I added some code
to preserve the state of the iterator when ComponentLists use
iteration internally, so that queries on a ComponentList won't break
users which are also iterating the list.
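
The iterator problem and the workaround can be sketched as follows.
This is a simplified stand-in, assuming a list whose single cursor
lives inside the list object; the class ComponentListSketch and its
members are hypothetical and are not taken from the adlc sources.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical list with an *internal* iterator: the cursor is stored in
// the list itself, so there is only one iteration position at a time.
class ComponentListSketch {
public:
    void add(const std::string& name) { items_.push_back(name); }

    void reset() { cursor_ = 0; }

    // Returns the next item and advances the cursor, or nullptr at the end.
    const std::string* iter() {
        return cursor_ < items_.size() ? &items_[cursor_++] : nullptr;
    }

    // A query that iterates internally.  It saves the caller's cursor and
    // restores it afterwards, so an outer loop over the list is not disturbed.
    int count_matching(const std::string& name) {
        const std::size_t saved = cursor_;
        int n = 0;
        reset();
        while (const std::string* s = iter()) {
            if (*s == name) ++n;
        }
        cursor_ = saved;  // without this line, the outer loop would end early
        return n;
    }

private:
    std::vector<std::string> items_;
    std::size_t cursor_ = 0;
};
```

Without the save and restore around the internal loop, a caller that
invokes count_matching in the middle of its own iteration would find
the cursor at the end of the list and stop early.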

I added field name printing in the opto assembly output when the
ciField is available from the adr_type.  So printing now looks like
this:

fd7   B129: #   B141 B130 <- B128  Freq: 18860
fd7     MOV    EBX,[EBP + #8] ! Field java/util/HashMap$Entry.key
fda     MOV    EAX,[EBP + #12] ! Field java/util/HashMap$Entry.value
fdd     CMPu   EBX,EDX
fdf     Jeq,us B141  P=0.027625 C=18860.000000

Register allocation time doesn't appear to be affected and performance
looks pretty much like a wash though there are tiny regressions and
improvements in some of the subbenchmarks.  The current refworkload
data is at the end.


http://javaweb.sfbay/~never/webrev/6511991

Approved by:
Reviewed by:

Fix verified (y/n): y

sunblade 2500 2x1.2G 2G RAM
============================================================================
t1: reference_server
  Benchmark         Samples        Mean     Stdev
  jetstream              15       66.31      1.99
  scimark                15       74.79      0.87
  specjbb2000            15    25853.12    180.32
  specjbb2005            15     8419.68    243.87
  specjvm98              15      140.23      0.71
  volano25               15    12443.73    220.08
  --------------------------------------------------------------------------
  Weighted Geomean              1767.89
============================================================================
t2: reference_server
  Benchmark         Samples        Mean     Stdev   %Diff    P   Significant
  jetstream              15       65.85      1.37   -0.70 0.463            *
  scimark                15       75.20      0.92    0.54 0.226            *
  specjbb2000            15    25817.19    151.87   -0.14 0.560            *
  specjbb2005            15     8390.93    212.50   -0.34 0.733            *
  specjvm98              15      139.67      0.57   -0.39 0.026            *
  volano25               15    12514.67    265.19    0.57 0.432            *
  --------------------------------------------------------------------------
  Weighted Geomean              1767.16             -0.04
============================================================================

hsdev-5 8x2.6G 2G RAM
============================================================================
t1: reference_server
  Benchmark         Samples        Mean     Stdev
  jetstream              15      149.89      1.98
  scimark                15      306.71      0.86
  specjbb2000            15   118245.86   2226.14
  specjbb2005            15    10552.48    201.69
  specjvm98              15      320.76      1.75
  volano25               15   113678.13   7787.67
  --------------------------------------------------------------------------
  Weighted Geomean              5551.44
============================================================================
t2: reference_server
  Benchmark         Samples        Mean     Stdev   %Diff    P   Significant
  jetstream              15      155.59      1.38    3.80 0.000          Yes
  scimark                15      306.32      0.65   -0.13 0.169            *
  specjbb2000            15   118370.99   1480.00    0.11 0.858            *
  specjbb2005            15    10659.55    101.61    1.01 0.081            *
  specjvm98              15      322.91      1.63    0.67 0.002          Yes
  volano25               15   113387.80   4217.76   -0.26 0.900            *
  --------------------------------------------------------------------------
  Weighted Geomean              5588.83              0.67
============================================================================

The jetstream improvement comes from a 13% improvement in Copy, which
just seems like an aberration.  The generated code isn't significantly
different.
                                     
2007-02-14


