Commit Graph

326 Commits

Author SHA1 Message Date
Andrew Trick
10ffc2b6c2 Various bits of framework needed for precise machine-level selection
DAG scheduling during isel. Most new functionality is currently
guarded by -enable-sched-cycles and -enable-sched-hazard.

Added InstrItineraryData::IssueWidth field, currently derived from
ARM itineraries, but could be initialized differently on other targets.

Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is
active, and if so how many cycles of state it holds.

Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry
into the scheduler's available queue.

ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to
get information about it's SUnits, provides RecedeCycle for bottom-up
scheduling, correctly computes scoreboard depth, tracks IssueCount, and
considers potential stall cycles when checking for hazards.

ScheduleDAGRRList now models machine cycles and hazards (under
flags). It tracks MinAvailableCycle, drives the hazard recognizer and
priority queue's ready filter, manages a new PendingQueue, properly
accounts for stall cycles, etc.

llvm-svn: 122541
2010-12-24 05:03:26 +00:00
Chris Lattner
11a33811b6 flags -> glue for selectiondag
llvm-svn: 122509
2010-12-23 17:24:32 +00:00
Andrew Trick
528fad91d2 Reorganize ListScheduleBottomUp in preparation for modeling machine cycles and instruction issue.
llvm-svn: 122491
2010-12-23 05:42:20 +00:00
Andrew Trick
a52f325c35 Converted LiveRegCycles to LiveRegGens. It's easier to work with and allows multiple nodes per cycle.
llvm-svn: 122474
2010-12-23 04:16:14 +00:00
Andrew Trick
12acde11cb In CheckForLiveRegDef use TRI->getOverlaps.
llvm-svn: 122473
2010-12-23 03:43:21 +00:00
Andrew Trick
033efdf4d7 Fixes PR8823: add-with-overflow-128.ll
In the bottom-up selection DAG scheduling, handle two-address
instructions that read/write unspillable registers. Treat
the entire chain of two-address nodes as a single live range.

llvm-svn: 122472
2010-12-23 03:15:51 +00:00
Andrew Trick
fbb3ed8774 In DelayForLiveRegsBottomUp, handle instructions that read and write
the same physical register. Simplifies the fix from the previous
checkin r122211.

llvm-svn: 122370
2010-12-21 22:27:44 +00:00
Andrew Trick
2085a96513 whitespace
llvm-svn: 122368
2010-12-21 22:25:04 +00:00
Chris Lattner
3e5fbd74ed rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for
something that just glues two nodes together, even if it is
sometimes used for flags.

llvm-svn: 122310
2010-12-21 02:38:05 +00:00
Chris Lattner
981afd206b Fix a bug in the scheduler's handling of "unspillable" vregs.
Imagine we see:

EFLAGS = inst1
EFLAGS = inst2 FLAGS
gpr = inst3 EFLAGS

Previously, we would refuse to schedule inst2 because it clobbers
the EFLAGS of the predecessor.  However, it also uses the EFLAGS
of the predecessor, so it is safe to emit.  SDep edges ensure that
the right order happens already anyway.

This fixes 2 testsuite crashes with the X86 patch I'm going to
commit next.

llvm-svn: 122211
2010-12-20 00:55:43 +00:00
Chris Lattner
0cfe884874 the result of CheckForLiveRegDef is dead, remove it.
llvm-svn: 122209
2010-12-20 00:51:56 +00:00
Evan Cheng
debf9c502a Two sets of changes. Sorry they are intermingled.
1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to
   "optimize for latency". Call instructions don't have the right latency and
   this is more likely to use introduce spills.
2. Fix if-converter cost function. For ARM, it should use instruction latencies,
   not # of micro-ops since multi-latency instructions is completely executed
   even when the predicate is false. Also, some instruction will be "slower"
   when they are predicated due to the register def becoming implicit input.
   rdar://8598427

llvm-svn: 118135
2010-11-03 00:45:17 +00:00
Evan Cheng
6c1414f9c2 Avoiding overly aggressive latency scheduling. If the two nodes share an
operand and one of them has a single use that is a live out copy, favor the
one that is live out. Otherwise it will be difficult to eliminate the copy
if the instruction is a loop induction variable update. e.g.

BB:
sub r1, r3, #1
str r0, [r2, r3]
mov r3, r1
cmp
bne BB

=>

BB:
str r0, [r2, r3]
sub r3, r3, #1
cmp
bne BB

This fixed the recent 256.bzip2 regression.

llvm-svn: 117675
2010-10-29 18:09:28 +00:00
Evan Cheng
e6d6c5dd11 The "excess register pressure" returned by HighRegPressure() is not accurate enough to factor into scheduling priority. Eliminate it and add early exits to speed up scheduling.
llvm-svn: 109449
2010-07-26 21:49:07 +00:00
Duncan Sands
136a6f0dbb Pacify gcc-4.5 which wrongly thinks that RExcess (passed as the Excess parameter)
may be used uninitialized in the callers of HighRegPressure.

llvm-svn: 109393
2010-07-26 07:54:17 +00:00
Evan Cheng
8ae3ecad2b Add comments.
llvm-svn: 109383
2010-07-25 18:59:43 +00:00
Bob Wilson
280ce9984e Fix crashes when scheduling a CopyToReg node -- getMachineOpcode asserts on
those.  Radar 8231572.

llvm-svn: 109367
2010-07-25 05:34:27 +00:00
Evan Cheng
37b740c4bf Add an ILP scheduler. This is a register pressure aware scheduler that's
appropriate for targets without detailed instruction iterineries.
The scheduler schedules for increased instruction level parallelism in
low register pressure situation; it schedules to reduce register pressure
when the register pressure becomes high.

On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2
by 16%.

llvm-svn: 109300
2010-07-24 00:39:05 +00:00
Evan Cheng
df907f4594 - Allow target to specify when is register pressure "too high". In most cases,
it's too late to start backing off aggressive latency scheduling when most
  of the registers are in use so the threshold should be a bit tighter.
- Correctly handle live out's and extract_subreg etc.
- Enable register pressure aware scheduling by default for hybrid scheduler.
  For ARM, this is almost always a win on # of instructions. It's runtime
  neutral for most of the tests. But for some kernels with high register
  pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by
  54 and sped up by 20%.

llvm-svn: 109279
2010-07-23 22:39:59 +00:00
Evan Cheng
bf32e54bac Re-apply r109079 with fix.
llvm-svn: 109083
2010-07-22 06:24:48 +00:00
Owen Anderson
6c55cccf87 Revert r109079, which broke a lot of CodeGen tests.
llvm-svn: 109082
2010-07-22 06:01:28 +00:00
Evan Cheng
bd81bff672 Initialize RegLimit only when register pressure is being tracked.
llvm-svn: 109079
2010-07-22 05:18:41 +00:00
Evan Cheng
285903853f More register pressure aware scheduling work.
llvm-svn: 109064
2010-07-21 23:53:58 +00:00
Evan Cheng
a77f3d3b37 Teach bottom up pre-ra scheduler to track register pressure. Work in progress.
llvm-svn: 108991
2010-07-21 06:09:07 +00:00
Rafael Espindola
38a7d7cbc3 Add a VT argument to getMinimalPhysRegClass and replace the copy related uses
of getPhysicalRegisterRegClass with it.

If we want to make a copy (or estimate its cost), it is better to use the
smallest class as more efficient operations might be possible.

llvm-svn: 107140
2010-06-29 14:02:34 +00:00
Oscar Fuentes
a97311f152 Use llvm::next' instead of next' to make VC++ 2010 happy.
llvm-svn: 105168
2010-05-30 13:14:21 +00:00
Evan Cheng
cc2efe11db Fix some latency computation bugs: if the use is not a machine opcode do not just return zero.
llvm-svn: 105061
2010-05-28 23:26:21 +00:00
Dan Gohman
52c2738324 Eliminate the use of PriorityQueue and just use a std::vector,
implementing pop with a linear search for a "best" element. The priority
queue was a neat idea, but in practice the comparison functions depend
on dynamic information.

llvm-svn: 104718
2010-05-26 18:52:00 +00:00
Dan Gohman
1e5d0b0456 Delete an unused function.
llvm-svn: 104716
2010-05-26 18:34:12 +00:00
Dan Gohman
7c00576a62 Change push_all to a non-virtual function and implement it in the
base class, since all the implementations are the same.

llvm-svn: 104659
2010-05-26 01:10:55 +00:00
Evan Cheng
725211e948 Rename -pre-RA-sched=hybrid to -pre-RA-sched=list-hybrid.
llvm-svn: 104306
2010-05-21 00:42:32 +00:00
Evan Cheng
4401f8873c Allow targets more controls on what nodes are scheduled by reg pressure, what for latency in hybrid mode.
llvm-svn: 104293
2010-05-20 23:26:43 +00:00
Evan Cheng
bdd062dae0 Add a hybrid bottom up scheduler that reduce register usage while avoiding
pipeline stall. It's useful for targets like ARM cortex-a8. NEON has a lot
of long latency instructions so a strict register pressure reduction
scheduler does not work well.
Early experiments show this speeds up some NEON loops by over 30%.

llvm-svn: 104216
2010-05-20 06:13:19 +00:00
Chris Lattner
3b9f02a2aa Three changes:
1. Introduce some enums and accessors in the InlineAsm class
   that eliminate a ton of magic numbers when handling inline
   asm SDNode.
2. Add a new MDNodeSDNode selection dag node type that holds
   a MDNode (shocking!)
3. Add a new argument to ISD::INLINEASM nodes that hold !srcloc
   metadata, propagating it to the instruction emitter, which
   drops it.

No functionality change.

llvm-svn: 100605
2010-04-07 05:20:54 +00:00
Chris Lattner
b06015aa69 move target-independent opcodes out of TargetInstrInfo
into TargetOpcodes.h.  #include the new TargetOpcodes.h
into MachineInstr.  Add new inline accessors (like isPHI())
to MachineInstr, and start using them throughout the 
codebase.

llvm-svn: 95687
2010-02-09 19:54:29 +00:00
Evan Cheng
3b245876c0 When the scheduler unfold a load folding instruction it move some of the predecessors to the unfolded load. It decides what gets moved to the load by checking whether the new load is using the predecessor as an operand. The check neglects the cases whether the predecessor is a flagged scheduling unit.
rdar://7604000

llvm-svn: 95339
2010-02-05 01:27:11 +00:00
Bill Wendling
8cbc25d945 Remove the '-disable-scheduling' flag and replace it with the 'source' option of
the '-pre-RA-sched' flag. It actually makes more sense to do it this way. Also,
keep track of the SDNode ordering by default. Eventually, we would like to make
this ordering a way to break a "tie" in the scheduler. However, doing that now
breaks the "CodeGen/X86/abi-isel.ll" test for 32-bit Linux.

llvm-svn: 94308
2010-01-23 10:26:57 +00:00
Bill Wendling
c075acbb54 The previous code could potentially cause a cycle. Allow ordering w.r.t. a 0 order.
llvm-svn: 92810
2010-01-06 00:23:35 +00:00
Bill Wendling
578865ff3d Only check the ordering if there is an ordering for each nodes.
llvm-svn: 92807
2010-01-06 00:09:23 +00:00
Bill Wendling
0a7056fe52 Add a semi-primitive form of scheduling via the "SDNode ordering" to the
bottom-up scheduler. We prefer the lower order number.

llvm-svn: 92806
2010-01-05 23:48:12 +00:00
David Greene
f34d7ac9f1 Change errs() to dbgs().
llvm-svn: 92576
2010-01-05 01:24:54 +00:00
Nick Lewycky
974e12b2d3 Remove includes of Support/Compiler.h that are no longer needed after the
VISIBILITY_HIDDEN removal.

llvm-svn: 85043
2009-10-25 06:57:41 +00:00
Nick Lewycky
02d5f77d26 Remove VISIBILITY_HIDDEN from class/struct found inside anonymous namespaces.
Chris claims we should never have visibility_hidden inside any .cpp file but
that's still not true even after this commit.

llvm-svn: 85042
2009-10-25 06:33:48 +00:00
Dan Gohman
918ec53c64 The ScheduleDAG framework now requires an AliasAnalysis argument, though
it isn't needed in the ScheduleDAGSDNodes schedulers.

llvm-svn: 83691
2009-10-09 23:33:48 +00:00
Reid Kleckner
cea8dab1d1 Silence comparison always false warning in -Asserts mode.
llvm-svn: 83164
2009-09-30 20:43:07 +00:00
Reid Kleckner
8ff5c19ebd Fix integer overflow in instruction scheduling. This can happen if we have
basic blocks that are so long that their size overflows a short.

Also assert that overflow does not happen in the future, as requested by Evan.

This fixes PR4401.

llvm-svn: 83159
2009-09-30 20:15:38 +00:00
Chris Lattner
317dbbcfb1 eliminate uses of cerr()
llvm-svn: 79834
2009-08-23 07:05:07 +00:00
Chris Lattner
4dc3edde9f remove a few DOUTs here and there.
llvm-svn: 79832
2009-08-23 06:35:02 +00:00
Owen Anderson
9f94459d24 Split EVT into MVT and EVT, the former representing _just_ a primitive type, while
the latter is capable of representing either a primitive or an extended type.

llvm-svn: 78713
2009-08-11 20:47:22 +00:00
Owen Anderson
53aa7a960c Rename MVT to EVT, in preparation for splitting SimpleValueType out into its own struct type.
llvm-svn: 78610
2009-08-10 22:56:29 +00:00
Torok Edwin
fbcc663cbf llvm_unreachable->llvm_unreachable(0), LLVM_UNREACHABLE->llvm_unreachable.
This adds location info for all llvm_unreachable calls (which is a macro now) in
!NDEBUG builds.
In NDEBUG builds location info and the message is off (it only prints
"UREACHABLE executed").

llvm-svn: 75640
2009-07-14 16:55:14 +00:00
Torok Edwin
56d0659726 assert(0) -> LLVM_UNREACHABLE.
Make llvm_unreachable take an optional string, thus moving the cerr<< out of
line.
LLVM_UNREACHABLE is now a simple wrapper that makes the message go away for
NDEBUG builds.

llvm-svn: 75379
2009-07-11 20:10:48 +00:00
Bill Wendling
026e5d7667 Instead of passing in an unsigned value for the optimization level, use an enum,
which better identifies what the optimization is doing. And is more flexible for
future uses.

llvm-svn: 70440
2009-04-29 23:29:43 +00:00
Bill Wendling
084669a1c9 Second attempt:
Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want to
use the old behavior, the flag is -O0. This change allows for finer-grained
control over which optimizations are run at different -O levels.

Most of this work was pretty mechanical. The majority of the fixes came from
verifying that a "fast" variable wasn't used anymore. The JIT still uses a
"Fast" flag. I'll change the JIT with a follow-up patch.

llvm-svn: 70343
2009-04-29 00:15:41 +00:00
Bill Wendling
56f2987a87 r70270 isn't ready yet. Back this out. Sorry for the noise.
llvm-svn: 70275
2009-04-28 01:04:53 +00:00
Bill Wendling
d0ae15946c Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want to
use the old behavior, the flag is -O0. This change allows for finer-grained
control over which optimizations are run at different -O levels.

Most of this work was pretty mechanical. The majority of the fixes came from
verifying that a "fast" variable wasn't used anymore. The JIT still uses a
"Fast" flag. I'm not 100% sure if it's necessary to change it there...

llvm-svn: 70270
2009-04-28 00:21:31 +00:00
Dan Gohman
eefba6bbe0 In the list-burr's pseudo two-addr dependency heuristics, don't
add dependencies on nodes with exactly one successor which is a
COPY_TO_REGCLASS node. In the case that the copy is coalesced
away, the dependence should be on the user of the copy, rather
than the copy itself.

llvm-svn: 69309
2009-04-16 20:59:02 +00:00
Dan Gohman
3027bb6953 Handle SUBREG_TO_REG instructions with the same heuristics
as INSERT_SUBREG instructions in the list-burr scheduler.

llvm-svn: 69308
2009-04-16 20:57:10 +00:00
Dan Gohman
f3746cbc56 Minor compile-time optimization; don't bother checking
canClobberPhysRegDefs if the successor node doesn't
clobber any physical registers.

llvm-svn: 67587
2009-03-24 00:50:07 +00:00
Dan Gohman
9a658d72db Add a pre-pass to the burr-list scheduler which makes adjustments to
help out the register pressure reduction heuristics in the case of
nodes with multiple uses. Currently this uses very conservative
heuristics, so it doesn't have a broad impact, but in cases where it
does help it can make a big difference.

llvm-svn: 67586
2009-03-24 00:49:12 +00:00
Dan Gohman
ed0e8d44ce When unfolding a load during scheduling, the new operator node has
a data dependency on the load node, so it really needs a
data-dependence edge to the load node, even if the load previously
existed.

And add a few comments.

llvm-svn: 67554
2009-03-23 20:20:43 +00:00
Dan Gohman
a366da1bf7 Fix canClobberPhysRegDefs to check all SDNodes grouped together
in an SUnit, instead of just the first one. This fix is needed
by some upcoming scheduler changes.

llvm-svn: 67531
2009-03-23 16:23:01 +00:00
Evan Cheng
2e55923fba For inline asm output operand that matches an input. Encode the input operand index in the high bits.
llvm-svn: 67387
2009-03-20 18:03:34 +00:00
Dan Gohman
a19c662a83 Fix a typo in a comment.
llvm-svn: 66843
2009-03-12 23:55:10 +00:00
Dan Gohman
15af5524a4 Fix ScheduleDAGRRList::CopyAndMoveSuccessors' handling of nodes
with multiple chain operands. This can occur when the scheduler
has added chain operands to a node that already has a chain
operand, in order to handle physical register dependencies.

This fixes an llvm-gcc bootstrap failure on x86-64 introduced
in r66058.

llvm-svn: 66240
2009-03-06 02:23:01 +00:00
Evan Cheng
b8905c4e2c Fix PR3701. 1. X86 target renamed eflags register to flags. This matches what llvm-gcc generates so codegen knows flags register is being clobbered by inline asm. 2. BURR scheduler should also check if inline asm nodes can clobber "live" physical registers. Previously it was only checking target nodes with implicit defs.
llvm-svn: 65996
2009-03-04 01:41:49 +00:00
Evan Cheng
b570499c25 Oops. Last second clean up messed things up.
llvm-svn: 64373
2009-02-12 09:52:13 +00:00
Evan Cheng
3a14efacb6 Replace one of burr scheduling heuristic with something more sensible. Now calcMaxScratches simply compute the number of true data dependencies. This actually improve a couple of tests in dejagnu suite as many tests in llvm nightly test suite.
llvm-svn: 64369
2009-02-12 08:59:45 +00:00
Dan Gohman
45889d24fe Fix a comment.
llvm-svn: 64328
2009-02-11 21:32:08 +00:00
Dan Gohman
6571ef3577 Don't use special heuristics for nodes with no data predecessors
unless they actually have data successors, and likewise for nodes
with no data successors unless they actually have data precessors.

llvm-svn: 64327
2009-02-11 21:29:39 +00:00
Dan Gohman
298a2946f1 Delete the heuristic for non-livein CopyFromReg nodes. Non-liveinness
is determined by whether the node has a Flag operand. However, if the
node does have a Flag operand, it will be glued to its register's
def, so the heuristic would end up spuriously applying to whatever
node is the def.

llvm-svn: 64319
2009-02-11 20:25:59 +00:00
Dan Gohman
dfaf646c34 When scheduling a block in parts, keep track of the overall
instruction index across each part. Instruction indices are used
to make live range queries, and live ranges can extend beyond
scheduling region boundaries.

Refactor the ScheduleDAGSDNodes class some more so that it
doesn't have to worry about this additional information.

llvm-svn: 64288
2009-02-11 04:27:20 +00:00
Dan Gohman
b95434356c Factor out more code for computing register live-range informationfor
scheduling, and generalize is so that preserves state across
scheduling regions. This fixes incorrect live-range information around
terminators and labels, which are effective region boundaries.

In place of looking for terminators to anchor inter-block dependencies,
introduce special entry and exit scheduling units for this purpose.

llvm-svn: 64254
2009-02-10 23:27:53 +00:00
Evan Cheng
ce3bbe515b Fix PR3457: Ignore control successors when looking for closest scheduled successor. A control successor doesn't read result(s) produced by the scheduling unit being evaluated.
llvm-svn: 64210
2009-02-10 08:30:11 +00:00
Dan Gohman
483377c639 Move ScheduleDAGSDNodes.h to be a private header. Front-ends
that previously included this header should include
SchedulerRegistry.h instead.

llvm-svn: 63937
2009-02-06 17:22:58 +00:00
Dan Gohman
60d6844aa8 Make a few things const, fix some comments, and simplify
some assertions.

llvm-svn: 63328
2009-01-29 19:49:27 +00:00
Dan Gohman
619ef48a52 Move a few containers out of ScheduleDAGInstrs::BuildSchedGraph
and into the ScheduleDAGInstrs class, so that they don't get
destructed and re-constructed for each block. This fixes a
compile-time hot spot in the post-pass scheduler.

To help facilitate this, tidy and do some minor reorganization
in the scheduler constructor functions.

llvm-svn: 62275
2009-01-15 19:20:50 +00:00
Dan Gohman
1407484178 The list-td and list-tdrr schedulers don't yet support physreg
scheduling dependencies. Add assertion checks to help catch
this.

It appears the Mips target defaults to list-td, and it has a
regression test that uses a physreg dependence. Such code was
liable to be miscompiled, and now evokes an assertion failure.

llvm-svn: 62177
2009-01-13 20:24:13 +00:00
Evan Cheng
b2c42c648d Fix PR3241: Currently EmitCopyFromReg emits a copy from the physical register to a virtual register unless it requires an expensive cross class copy. That means we are only treating "expensive to copy" register dependency as physical register dependency.
Also future proof the scheduler to handle "normal" physical register dependencies. The code is not exercised yet.

llvm-svn: 62074
2009-01-12 03:19:55 +00:00
Evan Cheng
0c4fe2600a Minor debug output tweak.
llvm-svn: 62005
2009-01-09 20:42:34 +00:00
Dan Gohman
261ee6be57 Remove redundant 'else's. No functionality change.
llvm-svn: 61891
2009-01-07 22:30:55 +00:00
Dan Gohman
bf8e5204d1 Update these argument lists for the isNormalMemory
argument. This doesn't affect current functionality.

llvm-svn: 61779
2009-01-06 01:28:56 +00:00
Dan Gohman
79c3516912 Use a latency value of 0 for the artificial edges inserted by
AddPseudoTwoAddrDeps. This lets the scheduling infrastructure
avoid recalculating node heights. In very large testcases this
was a major bottleneck. Thanks to Roman Levenstein for finding
this!

As a side effect, fold-pcmpeqd-0.ll is now scheduled better
and it no longer requires spilling on x86-32.

llvm-svn: 61778
2009-01-06 01:19:04 +00:00
Dan Gohman
906152a20f Tidy up #includes, deleting a bunch of unnecessary #includes.
llvm-svn: 61715
2009-01-05 17:59:02 +00:00
Dan Gohman
4d41fdf4ca CommuteNodesToReducePressure() is now removed.
llvm-svn: 61612
2009-01-03 19:19:30 +00:00
Dan Gohman
1be2e9650e Remove the code from the scheduler that commuted two-address
instructions to avoid copies, because TwoAddressInstructionPass
also does this optimization.  The scheduler's version didn't
account for live-out values, which resulted in spurious commutes
and missed opportunities.

Now, TwoAddressInstructionPass handles all the opportunities,
instead of just those that the scheduler missed. The result is
usually the same, though there are occasional trivial differences
resulting from the avoidance of spurious commutes.

llvm-svn: 61611
2009-01-03 18:01:46 +00:00
Dan Gohman
04543e719e Rename BuildSchedUnits to BuildSchedGraph, and refactor the
code in ScheduleDAGSDNodes' BuildSchedGraph into separate functions.

llvm-svn: 61376
2008-12-23 18:36:58 +00:00
Dan Gohman
dddc1ac7ea Fix some register-alias-related bugs in the post-RA scheduler liveness
computation code. Also, avoid adding output-depenency edges when both
defs are dead, which frequently happens with EFLAGS defs.

Compute Depth and Height lazily, and always in terms of edge latency
values. For the schedulers that don't care about latency, edge latencies
are set to 1.

Eliminate Cycle and CycleBound, and LatencyPriorityQueue's Latencies array.
These are all subsumed by the Depth and Height fields.

llvm-svn: 61073
2008-12-16 03:25:46 +00:00
Dan Gohman
17214e633d Make addPred and removePred return void, since the return value is not
currently used by anything.

llvm-svn: 61066
2008-12-16 01:00:55 +00:00
Dan Gohman
2d170896ee Rewrite the SDep class, and simplify some of the related code.
The Cost field is removed. It was only being used in a very limited way,
to indicate when the scheduler should attempt to protect a live register,
and it isn't really needed to do that. If we ever want the scheduler to
start inserting copies in non-prohibitive situations, we'll have to
rethink some things anyway.

A Latency field is added. Instead of giving each node a single
fixed latency, each edge can have its own latency. This will eventually
be used to model various micro-architecture properties more accurately.

The PointerIntPair class and an internal union are now used, which
reduce the overall size.

llvm-svn: 60806
2008-12-09 22:54:47 +00:00
Dan Gohman
30cad9c192 Make debug output more informative.
llvm-svn: 60524
2008-12-04 02:14:57 +00:00
Dan Gohman
ad2134d45d Initial support for anti-dependence breaking. Currently this code does not
introduce any new spilling; it just uses unused registers.

Refactor the SUnit topological sort code out of the RRList scheduler and
make use of it to help with the post-pass scheduler.

llvm-svn: 59999
2008-11-25 00:52:40 +00:00
Dan Gohman
5cc12a8e31 Check in the rest of this change. The isAntiDep flag needs to be passed
to removePred because an SUnit can both data-depend and anti-depend
on the same SUnit.

llvm-svn: 59969
2008-11-24 17:33:52 +00:00
Dan Gohman
f00cef4491 Add a flag to SDep for tracking which edges are anti-dependence edges.
llvm-svn: 59785
2008-11-21 02:27:52 +00:00
Dan Gohman
67b35bd4d1 Rename SDep's isSpecial to isArtificial, to make this field a little
less mysterious.

llvm-svn: 59782
2008-11-21 02:18:56 +00:00
Dan Gohman
63be531e09 Remove the CycleBound computation code from the ScheduleDAGRRList
schedulers. This doesn't have much immediate impact because
targets that use these schedulers by default don't yet provide
pipeline information.

This code also didn't have the benefit of register pressure
information. Also, removing it will avoid problems with list-burr
suddenly starting to do latency-oriented scheduling on x86 when we
start providing pipeline data, which would increase spilling.

llvm-svn: 59775
2008-11-21 01:30:54 +00:00
Dan Gohman
c602dd407c Change these schedulers to not emit no-ops. It turns out that
the RR scheduler actually does look at latency values, but it
doesn't use a hazard recognizer so it has no way to know when
a no-op is needed, as opposed to just stalling and incrementing
the cycle count.

llvm-svn: 59759
2008-11-21 00:10:42 +00:00
Dan Gohman
8e066a1349 Remove a remnant of list-burr's fast mode.
llvm-svn: 59702
2008-11-20 03:32:45 +00:00
Dan Gohman
186f65d275 Factor out the SethiUllman numbering logic from the list-burr and
list-tdrr schedulers into a common base class.

llvm-svn: 59701
2008-11-20 03:30:37 +00:00
Dan Gohman
fd08af4ee7 Remove the "fast" form of the list-burr scheduler, and use the
dedicated "fast" scheduler in -fast mode instead, which is
faster. This speeds up llc -fast by a few percent on some
testcases -- the speedup only happens for code not handled by
fast-isel.

llvm-svn: 59700
2008-11-20 03:11:19 +00:00
Dan Gohman
3f656dfa03 Facter AddPseudoTwoAddrDeps and associated infrasructure out of
the list-burr scheduler so that it can be used by the list-tdrr
scheduler too.

llvm-svn: 59698
2008-11-20 02:45:51 +00:00
Dan Gohman
4ce15e12b9 Factor out the code for verifying the work of the scheduler,
extend it a bit, and make use of it in all schedulers, to
ensure consistent checking.

llvm-svn: 59689
2008-11-20 01:26:25 +00:00
Dan Gohman
60cb69e665 Experimental post-pass scheduling support. Post-pass scheduling
is currently off by default, and can be enabled with
-disable-post-RA-scheduler=false.

This doesn't have a significant impact on most code yet because it doesn't
yet do anything to address anti-dependencies and it doesn't attempt to
disambiguate memory references. Also, several popular targets
don't have pipeline descriptions yet.

The majority of the changes here are splitting the SelectionDAG-specific
code out of ScheduleDAG, so that ScheduleDAG can be moved to
libLLVMCodeGen.a. The interface between ScheduleDAG-using code and
the rest of the scheduling code is somewhat rough and will evolve.

llvm-svn: 59676
2008-11-19 23:18:57 +00:00
Dan Gohman
82016c243b Rearrange code to reduce the nesting level. No functionality change.
llvm-svn: 59580
2008-11-19 02:00:32 +00:00
Dan Gohman
6e58726416 Tidy up ScheduleNodeBottomUp methods, and make them more
consistent with ScheduleNodeTopDown methods.

llvm-svn: 59550
2008-11-18 21:22:20 +00:00
Dan Gohman
22d07b14bc Change SUnit's dump method to take a ScheduleDAG* instead of
a SelectionDAG*.

llvm-svn: 59488
2008-11-18 02:06:40 +00:00
Dan Gohman
5ebdb98a6e Avoid using a loop in ReleasePred and ReleaseSucc methods to compute the
new CycleBound value. Instead, just update CycleBound on each call.
Also, make ReleasePred and ReleaseSucc methods more consistent accross
the various schedulers.

This also happens to make ScheduleDAGRRList's CycleBound computation
somewhat more interesting, though it still doesn't have any noticeable
effect, because no current targets that use the register-pressure
reduction scheduler provide pipeline models.

llvm-svn: 59475
2008-11-18 00:38:59 +00:00
Dan Gohman
92a36d7a78 Eliminate some trivial differences between the ScheduleNodeTopDown
functions in these two schedulers.

llvm-svn: 59465
2008-11-17 21:31:02 +00:00
Dan Gohman
072734ebd6 Remove the FlaggedNodes member from SUnit. Instead of requiring each SUnit
to carry a SmallVector of flagged nodes, just calculate the flagged nodes
dynamically when they are needed.

The local-liveness change is due to a trivial scheduling change where
the scheduler arbitrary decision differently.

llvm-svn: 59273
2008-11-13 23:24:17 +00:00
Dan Gohman
1ddfcba5be Make the Node member of SUnit private, and add accessors.
llvm-svn: 59264
2008-11-13 21:36:12 +00:00
Dan Gohman
5a390b974c Change ScheduleDAG's DAG member from a reference to a pointer, to prepare
for the possibility of scheduling without a SelectionDAG being present.

llvm-svn: 59263
2008-11-13 21:21:28 +00:00
Dan Gohman
e52e0897e2 In ScheduleDAGRRList::CopyAndMoveSuccessors, create the SUnit for the load
before creating the SUnit for the operation that it was unfolded from. This
allows each SUnit to have all of its predecessor SUnits available at the time
it is created. I don't know yet if this will be absolutely required, but it
is a little tidier to do it this way.

llvm-svn: 59083
2008-11-11 21:34:44 +00:00
Dan Gohman
5499e89d06 Change the scheduler accessor methods to accept an explicit TargetMachine
argument instead of taking the SelectionDAG's TargetMachine. This is
needed for some upcoming scheduler changes.

llvm-svn: 59055
2008-11-11 17:50:47 +00:00
Dan Gohman
50c76beeb0 Remove some unused virtual function bodies.
llvm-svn: 58524
2008-10-31 19:06:33 +00:00
Dan Gohman
9c4b7d5c4f Fix command-line option printing to print two spaces where needed,
instead of requiring all "short description" strings to begin with
two spaces. This makes these strings less mysterious, and it fixes
some cases where short description strings mistakenly did not
begin with two spaces.

llvm-svn: 57521
2008-10-14 20:25:08 +00:00
Dan Gohman
c07f686665 Replace the LiveRegs SmallSet with a simple counter that keeps
track of the number of live registers, which is all the set was
being used for.

llvm-svn: 56498
2008-09-23 18:50:48 +00:00
Dan Gohman
6ab52a8018 Don't worry about clobbering physical register defs that aren't used.
llvm-svn: 56281
2008-09-17 15:25:49 +00:00
Gabor Greif
f304a7aa4d erect abstraction boundaries for accessing SDValue members, rename Val -> Node to reflect semantics
llvm-svn: 55504
2008-08-28 21:40:38 +00:00
Dan Gohman
3a3a52de58 Optimize ScheduleDAGRRList's topological sort to use one pass instead
of two, and to not need a scratch std::vector. Also, compute the ordering
immediately in the result array, instead of in another scratch std::vector
that is copied to the result array.

llvm-svn: 55421
2008-08-27 16:29:48 +00:00
Gabor Greif
abfdf928d8 disallow direct access to SDValue::ResNo, provide a getter instead
llvm-svn: 55394
2008-08-26 22:36:50 +00:00
Dan Gohman
23785a1679 Correct the filename in the top-of-file comment.
llvm-svn: 54688
2008-08-12 17:42:33 +00:00
Dan Gohman
e955c481fd Fix several const-correctness issues, resolving some -Wcast-qual warnings.
llvm-svn: 54349
2008-08-05 14:45:15 +00:00
Dan Gohman
2ce6f2ad5e Rename SDOperand to SDValue.
llvm-svn: 54128
2008-07-27 21:46:04 +00:00
Dan Gohman
1705968102 Add a new function, ReplaceAllUsesOfValuesWith, which handles bulk
replacement of multiple values. This is slightly more efficient
than doing multiple ReplaceAllUsesOfValueWith calls, and theoretically
could be optimized even further. However, an important property of this
new function is that it handles the case where the source value set and
destination value set overlap. This makes it feasible for isel to use
SelectNodeTo in many very common cases, which is advantageous because
SelectNodeTo avoids a temporary node and it doesn't require CSEMap
updates for users of values that don't change position.

Revamp MorphNodeTo, which is what does all the work of SelectNodeTo, to
handle operand lists more efficiently, and to correctly handle a number
of corner cases to which its new wider use exposes it.

This commit also includes a change to the encoding of post-isel opcodes
in SDNodes; now instead of being sandwiched between the target-independent
pre-isel opcodes and the target-dependent pre-isel opcodes, post-isel
opcodes are now represented as negative values. This makes it possible
to test if an opcode is pre-isel or post-isel without having to know
the size of the current target's post-isel instruction set.

These changes speed up llc overall by 3% and reduce memory usage by 10%
on the InstructionCombining.cpp testcase with -fast and -regalloc=local.

llvm-svn: 53728
2008-07-17 19:10:17 +00:00
Dan Gohman
adec96f438 Reapply 53476 and 53480, with a fix so that it properly updates
the BB member to the current basic block after emitting
instructions.

llvm-svn: 53567
2008-07-14 18:19:29 +00:00
Evan Cheng
ef8412c822 Back out 53476 and 53480 for now. Somehow they cause llc to miscompile 179.art.
llvm-svn: 53502
2008-07-12 01:38:51 +00:00
Dan Gohman
f4cd404e6f Factor out debugging code into the common base class.
llvm-svn: 53480
2008-07-11 22:36:22 +00:00
Dan Gohman
36a69373dc Add support for putting NamedRegionTimers in TimerGroups, and
use a timer group for the timers in SelectionDAGISel. Also,
Split scheduling out from emitting, to give each their own
timer.

llvm-svn: 53476
2008-07-11 21:54:34 +00:00
Evan Cheng
7e4abde27c - Use a faster priority comparison function if -fast.
- Code clean up.

llvm-svn: 53010
2008-07-02 09:23:51 +00:00
Evan Cheng
2c9773155a Do not use computationally expensive scheduling heuristics with -fast.
llvm-svn: 52971
2008-07-01 18:05:03 +00:00
Dan Gohman
fa63cc4e91 Move a DenseMap's declaration outside of a loop, and just call
clear() on each iteration. This avoids allocating and deallocating
all of DenseMap's memory on each iteration.

llvm-svn: 52642
2008-06-23 21:15:00 +00:00
Dan Gohman
ea0452016e canClobberPhysRegDefs shouldn't called without checking hasPhysRegDefs;
check this with an assert.

llvm-svn: 52603
2008-06-21 22:05:24 +00:00
Dan Gohman
46520a25a4 Remove ScheduleDAG's SUnitMap altogether. Instead, use SDNode's NodeId
field, which is otherwise unused after instruction selection, as an index
into the SUnit array.

llvm-svn: 52583
2008-06-21 19:18:17 +00:00
Dan Gohman
a4db3352f9 Add a priority queue class, which is a wrapper around std::priority_queue
and provides fairly efficient removal of arbitrary elements. Switch
ScheduleDAGRRList from std::set to this new priority queue.

llvm-svn: 52582
2008-06-21 18:35:25 +00:00
Dan Gohman
e6e1348275 Change ScheduleDAG's SUnitMap from DenseMap<SDNode*, vector<SUnit*> >
to DenseMap<SDNode*, SUnit*>, and adjust the way cloned SUnit nodes are
handled so that only the original node needs to be in the map.
This speeds up llc on 447.dealII.llvm.bc by about 2%.

llvm-svn: 52576
2008-06-21 15:52:51 +00:00
Dan Gohman
4b49be1cbe Simplify some template parameterization.
llvm-svn: 52571
2008-06-21 01:08:22 +00:00
Duncan Sands
13237ac3b9 Wrap MVT::ValueType in a struct to get type safety
and better control the abstraction.  Rename the type
to MVT.  To update out-of-tree patches, the main
thing to do is to rename MVT::ValueType to MVT, and
rewrite expressions like MVT::getSizeInBits(VT) in
the form VT.getSizeInBits().  Use VT.getSimpleVT()
to extract a MVT::SimpleValueType for use in switch
statements (you will get an assert failure if VT is
an extended value type - these shouldn't exist after
type legalization).
This results in a small speedup of codegen and no
new testsuite failures (x86-64 linux).

llvm-svn: 52044
2008-06-06 12:08:01 +00:00
Duncan Sands
70424d195a Silence the compiler warning differently. The
original method caused gcc-4.2 to complain.

llvm-svn: 51186
2008-05-16 09:19:16 +00:00
Evan Cheng
763ec13862 Silence some compiler warnings.
llvm-svn: 51115
2008-05-14 20:07:51 +00:00
Roman Levenstein
6b37114590 Use std::set instead of std::priority_queue for the RegReductionPriorityQueue.
This removes the existing bottleneck related to the removal of elements from 
the middle of the queue.

Also fixes a subtle bug in ScheduleDAGRRList::CapturePred:
It was updating the state of the SUnit before removing it. As a result, the
comparison operators were working incorrectly and this SUnit could not be removed 
from the queue properly.

Reviewed by Evan and Dan. Approved by Dan.

llvm-svn: 50412
2008-04-29 09:07:59 +00:00
Dan Gohman
82b6673c44 Fix the new scheduler assertion checks to work when
the scheduler has inserted no-ops. This fixes
the 2006-07-03-schedulers.ll regression on ppc32.

llvm-svn: 49747
2008-04-15 22:40:14 +00:00
Dan Gohman
4370f26750 Treat EntryToken nodes as "passive" so that they aren't added to the
ScheduleDAG; they don't correspond to any actual instructions so they
don't need to be scheduled.

This fixes a bug where the EntryToken was being scheduled multiple
times in some cases, though it ended up not causing any trouble because 
EntryToken doesn't expand into anything. With this fixed the schedulers
reliably schedule the expected number of units, so we can check this
with an assertion.

This requires a tweak to test/CodeGen/X86/loop-hoist.ll because it
ends up getting scheduled differently in a trivial way, though it was
enough to fool the prcontext+grep that the test does.

llvm-svn: 49701
2008-04-15 01:22:18 +00:00
Evan Cheng
16d72072df Cosmetic changes.
llvm-svn: 48947
2008-03-29 18:34:22 +00:00
Chris Lattner
a148acdc82 ifdef out a dead function. Should this be removed?
llvm-svn: 48916
2008-03-28 15:36:27 +00:00
Roman Levenstein
30d09518b5 Fix spelling. Thanks, Duncan! :-)
llvm-svn: 48873
2008-03-27 09:44:37 +00:00
Roman Levenstein
bc674501ba Speed-up the SumOfUnscheduledPredsOfSuccs by introducing a new function
called LimitedSumOfUnscheduledPredsOfSuccs. It terminates the computation
after a given treshold is reached. This new function is always faster, but
brings real wins only on bigger test-cases.

The old function SumOfUnscheduledPredsOfSuccs is left in-place for now and therefore a warning about an unused static function is produced.

llvm-svn: 48872
2008-03-27 09:14:57 +00:00
Roman Levenstein
733a4d6e85 Fixed some spelling errors. Thanks, Duncan!
llvm-svn: 48819
2008-03-26 11:23:38 +00:00
Roman Levenstein
7e71b4baaf Some improvements related to the computation of isReachable.
This fixes Bugzilla #1835 (http://llvm.org/bugs/show_bug.cgi?id=1835).
This patched is reviewed by Tanya and Dan. Dan tested and approved it.

The reason for the bad performance of the old algorithm is that it is very naive and scans every
time all nodes of the DAG in the worst case.

This patch introduces  a new algorithm based on the paper "Online algorithms
for maintaining the topological order of a directed acyclic graph" by
David J.Pearce and Paul H.J.Kelly. This is the MNR algorithm. It has a
linear time worst-case and performs much better in most situations.

The paper can be found here:
http://fano.ics.uci.edu/cites/Document/Online-algorithms-for-maintaining-the-topological-order-of-a-directed-acyclic-graph.html

The main idea of the new algorithm is to compute the topological ordering of the SNodes in the
DAG and to maintain it even after DAG modifications. The topological ordering allows for very fast 
node reachability checks. 

Tests on very big  input files with tens of thousands of instructions in a BB indicate huge 
speed-ups (up to 10x compilation time improvement) compared to the old version.

llvm-svn: 48817
2008-03-26 09:18:09 +00:00
Dan Gohman
fd227e9c3a Fix typos.
llvm-svn: 48779
2008-03-25 17:10:29 +00:00
Evan Cheng
e88a625ecd When the register allocator runs out of registers, spill a physical register around the def's and use's of the interval being allocated to make it possible for the interval to target a register and spill it right away and restore a register for uses. This likely generates terrible code but is before than aborting.
llvm-svn: 48218
2008-03-11 07:19:34 +00:00