Commit Graph

1363 Commits

Author SHA1 Message Date
Hans Wennborg
c944c13dc1 SelectionDAGISel: rangeify a loop
llvm-svn: 266478
2016-04-15 21:45:09 +00:00
Tim Shen
0012756489 [SSP] Remove llvm.stackprotectorcheck.
This is a cleanup patch for SSP support in LLVM. There is no functional change.
llvm.stackprotectorcheck is not needed, because SelectionDAG isn't
actually lowering it in SelectBasicBlock; rather, it adds check code in
FinishBasicBlock, ignoring the position where the intrinsic is inserted
(See FindSplitPointForStackProtector()).

llvm-svn: 265851
2016-04-08 21:26:31 +00:00
Manman Ren
e221a870d3 Swift Calling Convention: swifterror target-independent change.
At IR level, the swifterror argument is an input argument with type
ErrorObject**. For targets that support swifterror, we want to optimize it
to behave as an inout value with type ErrorObject*; it will be passed in a
fixed physical register.

The main idea is to track the virtual registers for each swifterror value. We
define swifterror values as AllocaInsts with swifterror attribute or a function
argument with swifterror attribute.

In SelectionDAGISel.cpp, we set up swifterror values (SwiftErrorVals) before
handling the basic blocks.

When iterating over all basic blocks in RPO, before actually visiting the basic
block, we call mergeIncomingSwiftErrors to merge incoming swifterror values when
there are multiple predecessors or to simply propagate them. There, we create a
virtual register for each swifterror value in the entry block. For predecessors
that are not yet visited, we create virtual registers to hold the swifterror
values at the end of the predecessor. The assignments are saved in
SwiftErrorWorklist and will be materialized at the end of visiting the basic
block.

When visiting a load from a swifterror value, we copy from the current virtual
register assignment. When visiting a store to a swifterror value, we create a
virtual register to hold the swifterror value and update SwiftErrorMap to
track the current virtual register assignment.

Differential Revision: http://reviews.llvm.org/D18108

llvm-svn: 265433
2016-04-05 18:13:16 +00:00
Manman Ren
9dd8c14674 CXX TLS: collect return blocks after SelectAllBasicBlocks.
It is incorrect to get the corresponding MBB for a ReturnInst before
SelectAllBasicBlocks since SelectAllBasicBlocks can change the
correspondence between a ReturnInst and the MBB it is in.

PR27062

llvm-svn: 264358
2016-03-24 23:21:29 +00:00
Manman Ren
2828c57b6f [CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86.
Since at O0, explicit copies via SplitCSR may not be removed even if
they are unnecessary, we choose not to use SplitCSR at O0.

llvm-svn: 263855
2016-03-18 23:38:49 +00:00
Craig Topper
267bdb2094 [CodeGen] Add space-optimized EmitMergeInputChains1_2 to the DAG isel matching tables. Shaves about 5100 bytes from the X86 matcher table. NFC
llvm-svn: 262815
2016-03-07 07:29:12 +00:00
Eugene Zelenko
ecefe5a81f Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes.
Differential revision: http://reviews.llvm.org/D16793

llvm-svn: 259539
2016-02-02 18:20:45 +00:00
Tim Shen
3b428cb764 [SelectionDAG] Eliminate exponential behavior in WalkChainUsers
llvm-svn: 259315
2016-01-31 03:59:34 +00:00
Matthias Braun
b30f2f5141 Avoid overly large SmallPtrSet/SmallSet
These sets perform linear searching in small mode so it is never a good
idea to use SmallSize/N bigger than 32.

llvm-svn: 259283
2016-01-30 01:24:31 +00:00
Benjamin Kramer
58e1998520 Don't check if a list is empty with ilist::size.
ilist::size() is O(n) while ilist::empty() is O(1)

llvm-svn: 258636
2016-01-23 20:58:09 +00:00
David Majnemer
3463e696fb [X86] Don't alter HasOpaqueSPAdjustment after we've relied on it
We rely on HasOpaqueSPAdjustment not changing after we've calculated
things based on it.  Things like whether or not we can use 'rep;movs' to
copy bytes around, that sort of thing.  If it changes, invariants in the
backend will quietly break.  This situation arose when we had a call to
memcpy *and* a COPY of the FLAGS register where we would attempt to
reference local variables using %esi, a register that was clobbered by
the 'rep;movs'.

This fixes PR26124.

llvm-svn: 257730
2016-01-14 01:20:03 +00:00
David Majnemer
ca1c9f074f [X86] Make hasFP constant time
We need a frame pointer if there is a push/pop sequence after the
prologue in order to unwind the stack.  Scanning the instructions to
figure out if this happened made hasFP not constant-time which is a
violation of expectations.  Let's compute this up-front and reuse that
computation when we need it.

llvm-svn: 256730
2016-01-04 04:49:41 +00:00
Manman Ren
3e3edc91f9 CXX_FAST_TLS calling convention: target independent portion.
Update supportSplitCSR's interface to take machine function instead of the
calling convention.

Review comments for http://reviews.llvm.org/D15341

llvm-svn: 255818
2015-12-16 20:45:48 +00:00
Manman Ren
abc7c1d1d2 CXX_FAST_TLS calling convention: target independent portion.
The access function has a short entry and a short exit, the initialization
block is only run the first time. To improve the performance, we want to
have a short frame at the entry and exit.

We explicitly handle most of the CSRs via copies. Only the CSRs that are not
handled via copies will be in CSR_SaveList.

Frame lowering and prologue/epilogue insertion will generate a short frame
in the entry and exit according to CSR_SaveList. The majority of the CSRs will
be handled by register allcoator. Register allocator will try to spill and
reload them in the initialization block.

We add CSRsViaCopy, it will be explicitly handled during lowering.

1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target
   supports it for the given calling convention and the function has only return
   exits). We also call TLI->initializeSplitCSR to perform initialization.
2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to
   virtual registers at beginning of the entry block and copies from virtual
   registers to CSRsViaCopy at beginning of the exit blocks.
3> we also need to make sure the explicit copies will not be eliminated.

rdar://problem/23557469

Differential Revision: http://reviews.llvm.org/D15340

llvm-svn: 255353
2015-12-11 18:24:30 +00:00
David Majnemer
70497c696a Move EH-specific helper functions to a more appropriate place
No functionality change is intended.

llvm-svn: 254562
2015-12-02 23:06:39 +00:00
Paul Robinson
a2550a6da3 Have 'optnone' respect the -fast-isel=false option.
This is primarily useful for debugging optnone v. ISel issues.

Differential Revision: http://reviews.llvm.org/D14792

llvm-svn: 254335
2015-11-30 21:56:16 +00:00
Cong Hou
1938f2eb98 Let SelectionDAG start to use probability-based interface to add successors.
The patch in http://reviews.llvm.org/D13745 is broken into four parts:

1. New interfaces without functional changes.
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights.
3. Use new interfaces in all other passes.
4. Remove old interfaces.

This the second patch above. In this patch SelectionDAG starts to use
probability-based interfaces in MBB to add successors but other MC passes are
still using weight-based interfaces. Therefore, we need to maintain correct
weight list in MBB even when probability-based interfaces are used. This is
done by updating weight list in probability-based interfaces by treating the
numerator of probabilities as weights. This change affects many test cases
that check successor weight values. I will update those test cases once this
patch looks good to you.


Differential revision: http://reviews.llvm.org/D14361

llvm-svn: 253965
2015-11-24 08:51:23 +00:00
Daniel Sanders
b700203c8b Partially revert r253662: some unrelated work was accidentally committed with it.
Sorry.

llvm-svn: 253663
2015-11-20 13:16:35 +00:00
Daniel Sanders
be9db3c00a Revert the revert 253497 and 253539 - These commits aren't the cause of the clang-cmake-mips failures.
Sorry for the noise.

llvm-svn: 253662
2015-11-20 13:13:53 +00:00
Joseph Tremoulet
f748c8937e [WinEH] Update exception pointer registers
Summary:
The CLR's personality routine passes these in rdx/edx, not rax/eax.

Make getExceptionPointerRegister a virtual method parameterized by
personality function to allow making this distinction.

Similarly make getExceptionSelectorRegister a virtual method parameterized
by personality function, for symmetry.


Reviewers: pgavlin, majnemer, rnk

Subscribers: jyknight, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D14344

llvm-svn: 252383
2015-11-07 01:11:31 +00:00
Duncan P. N. Exon Smith
e400a7d412 SelectionDAG: Remove implicit ilist iterator conversions, NFC
llvm-svn: 250214
2015-10-13 19:47:46 +00:00
Reid Kleckner
9abb3c06a6 Don't call PrepareEHLandingPad on non EH pads
This was a minor bug in r249492. Calling PrepareEHLandingPad on a
non-landingpad was a no-op, but it attempted to get the generic pointer
register class, which apparently doesn't exist for some targets.

llvm-svn: 250068
2015-10-12 17:42:32 +00:00
Reid Kleckner
14e773500e [WinEH] Delete the old landingpad implementation of Windows EH
The new implementation works at least as well as the old implementation
did.

Also delete the associated preparation tests. They don't exercise
interesting corner cases of the new implementation. All the codegen
tests of the EH tables have already been ported.

llvm-svn: 249918
2015-10-09 23:34:53 +00:00
Reid Kleckner
ebef256269 [SEH] Fix llvm.eh.exceptioncode fast register allocation assertion
I called the wrong MachineBasicBlock::addLiveIn() overload.

llvm-svn: 249786
2015-10-09 00:15:13 +00:00
Reid Kleckner
72ba70418f [SEH] Add llvm.eh.exceptioncode intrinsic
This will support the Clang __exception_code intrinsic.

llvm-svn: 249492
2015-10-07 00:27:33 +00:00
Joseph Tremoulet
2afea5438f [WinEH] Recognize CoreCLR personality function
Summary:
 - Add CoreCLR to if/else ladders and switches as appropriate.
 - Rename isMSVCEHPersonality to isFuncletEHPersonality to better
   reflect what it captures.

Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: pgavlin, AndyAyers, llvm-commits

Differential Revision: http://reviews.llvm.org/D13449

llvm-svn: 249455
2015-10-06 20:28:16 +00:00
Chandler Carruth
2e4ca848f4 Add an explicit 'inline' specifier to these static functions. GCC is
warning on them having always_inline attribute for reasons I don't fully
understand -- static functions are just as inlinable as inline
functions in terms of linkage.

llvm-svn: 247334
2015-09-10 20:34:57 +00:00
Chandler Carruth
7b560d40bd [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.

This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:

- FunctionAAResults is a type-erasing alias analysis results aggregation
  interface to walk a single query across a range of results from
  different alias analyses. Currently this is function-specific as we
  always assume that aliasing queries are *within* a function.

- AAResultBase is a CRTP utility providing stub implementations of
  various parts of the alias analysis result concept, notably in several
  cases in terms of other more general parts of the interface. This can
  be used to implement only a narrow part of the interface rather than
  the entire interface. This isn't really ideal, this logic should be
  hoisted into FunctionAAResults as currently it will cause
  a significant amount of redundant work, but it faithfully models the
  behavior of the prior infrastructure.

- All the alias analysis passes are ported to be wrapper passes for the
  legacy PM and new-style analysis passes for the new PM with a shared
  result object. In some cases (most notably CFL), this is an extremely
  naive approach that we should revisit when we can specialize for the
  new pass manager.

- BasicAA has been restructured to reflect that it is much more
  fundamentally a function analysis because it uses dominator trees and
  loop info that need to be constructed for each function.

All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.

The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.

This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.

Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.

One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.

Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.

Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.

Differential Revision: http://reviews.llvm.org/D12080

llvm-svn: 247167
2015-09-09 17:55:00 +00:00
Reid Kleckner
51189f0a1d [WinEH] Avoid creating MBBs for LLVM BBs that cannot contain code
Typically these are catchpads, which hold data used to decide whether to
catch the exception or continue unwinding. We also shouldn't create MBBs
for catchendpads, cleanupendpads, or terminatepads, since no real code
can live in them.

This fixes a problem where MI passes (like the register allocator) would
try to put code into catchpad blocks, which are not executed by the
runtime. In the new world, blocks ending in invokes now have many
possible successors.

llvm-svn: 247102
2015-09-08 23:28:38 +00:00
Cong Hou
511298b919 Distribute the weight on the edge from switch to default statement to edges generated in lowering switch.
Currently, when edge weights are assigned to edges that are created when lowering switch statement, the weight on the edge to default statement (let's call it "default weight" here) is not considered. We need to distribute this weight properly. However, without value profiling, we have no idea how to distribute it. In this patch, I applied the heuristic that this weight is evenly distributed to successors.

For example, given a switch statement with cases 1,2,3,5,10,11,20, and every edge from switch to each successor has weight 10. If there is a binary search tree built to test if n < 10, then its two out-edges will have weight 4x10+10/2 = 45 and 3x10 + 10/2 = 35 respectively (currently they are 40 and 30 without considering the default weight). Each distribution (which is 5 here) will be stored in each SwitchWorkListItem for further distribution.

There are some exceptions:

For a jump table header which doesn't have any edge to default statement, we don't distribute the default weight to it.
For a bit test header which covers a contiguous range and hence has no edges to default statement, we don't distribute the default weight to it.
When the branch checks a single value or a contiguous range with no edge to default statement, we don't distribute the default weight to it.
In other cases, the default weight is evenly distributed to successors.

Differential Revision: http://reviews.llvm.org/D12418

llvm-svn: 246522
2015-09-01 01:42:16 +00:00
Reid Kleckner
e00faf8ce1 [EH] Handle non-Function personalities like unknown personalities
Also delete and simplify a lot of MachineModuleInfo code that used to be
needed to handle personalities on landingpads.  Now that the personality
is on the LLVM Function, we no longer need to track it this way on MMI.
Certainly it should not live on LandingPadInfo.

llvm-svn: 246478
2015-08-31 20:02:16 +00:00
Reid Kleckner
0e2882345d [WinEH] Add some support for code generating catchpad
We can now run 32-bit programs with empty catch bodies.  The next step
is to change PEI so that we get funclet prologues and epilogues.

llvm-svn: 246235
2015-08-27 23:27:47 +00:00
Cong Hou
cd59591396 Remove the final bit test during lowering switch statement if all cases in bit test cover a contiguous range.
When lowering switch statement, if bit tests are used then LLVM will always generates a jump to the default statement in the last bit test. However, this is not necessary when all cases in bit tests cover a contiguous range. This is because when generating the bit tests header MBB, there is a range check that guarantees cases in bit tests won't go outside of [low, high], where low and high are minimum and maximum case values in the bit tests. This patch checks if this is the case and then doesn't emit jump to default statement and hence saves a bit test and a branch.

Differential Revision: http://reviews.llvm.org/D12249

llvm-svn: 245976
2015-08-25 21:34:38 +00:00
Mehdi Amini
b58f8137c1 Move the Target way of overriding DAG Scheduler to a target hook
Summary:
The previous way of overriding it was relying on calling "setDefault"
on the global registry, which implies global mutable state.

Reviewers: echristo, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11538

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 243388
2015-07-28 06:18:04 +00:00
Chandler Carruth
96ada25bf3 [PM/AA] Remove all of the dead AliasAnalysis pointers being threaded
through APIs that are no longer necessary now that the update API has
been removed.

This will make changes to the AA interfaces significantly less
disruptive (I hope). Either way, it seems like a really nice cleanup.

llvm-svn: 242882
2015-07-22 09:52:54 +00:00
Cong Hou
ab23bfbc0e Create a wrapper pass for BranchProbabilityInfo.
This new wrapper pass is useful when we want to do branch probability analysis conditionally (e.g. only in PGO mode) but don't want to add one more pass dependence.

http://reviews.llvm.org/D11241

llvm-svn: 242349
2015-07-15 22:48:29 +00:00
Pat Gavlin
a717f255b6 Allow {e,r}bp as the target of {read,write}_register.
This patch allows the read_register and write_register intrinsics to
read/write the RBP/EBP registers on X86 iff the targeted register is
the frame pointer for the containing function.

Differential Revision: http://reviews.llvm.org/D10977

llvm-svn: 241827
2015-07-09 17:40:29 +00:00
Mehdi Amini
44ede33a69 Make TargetLowering::getPointerTy() taking DataLayout as an argument
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.

Reviewers: echristo

Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits

Differential Revision: http://reviews.llvm.org/D11028

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241775
2015-07-09 02:09:04 +00:00
Pete Cooper
485d1146db Convert a bunch of loops to foreach. NFC.
This uses the new SDNode::op_values() iterator range committed in r240805.

llvm-svn: 240822
2015-06-26 19:37:02 +00:00
Alexander Kornienko
f00654e31b Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)
Apparently, the style needs to be agreed upon first.

llvm-svn: 240390
2015-06-23 09:49:53 +00:00
Rafael Espindola
36b718fc74 Avoid a Symbol -> Name -> Symbol conversion.
Before this we were producing a TargetExternalSymbol from a MCSymbol.
That meant extracting the symbol name and fetching the symbol again
down the pipeline.

This patch adds a DAG.getMCSymbol that lets the MCSymbol pass unchanged on the
DAG.

Doing so removes the need for MO_NOPREFIX and fixes the root cause of pr23900,
allowing r240130 to be committed again.

llvm-svn: 240300
2015-06-22 17:46:53 +00:00
Alexander Kornienko
70bc5f1398 Fixed/added namespace ending comments using clang-tidy. NFC
The patch is generated using this command:

tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
  -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
  llvm/lib/


Thanks to Eugene Kosov for the original patch!

llvm-svn: 240137
2015-06-19 15:57:42 +00:00
David Majnemer
7fddeccb8b Move the personality function from LandingPadInst to Function
The personality routine currently lives in the LandingPadInst.

This isn't desirable because:
- All LandingPadInsts in the same function must have the same
  personality routine.  This means that each LandingPadInst beyond the
  first has an operand which produces no additional information.

- There is ongoing work to introduce EH IR constructs other than
  LandingPadInst.  Moving the personality routine off of any one
  particular Instruction and onto the parent function seems a lot better
  than have N different places a personality function can sneak onto an
  exceptional function.

Differential Revision: http://reviews.llvm.org/D10429

llvm-svn: 239940
2015-06-17 20:52:32 +00:00
Hal Finkel
44b81ee40b Preserve the order of READ_REGISTER and WRITE_REGISTER
At the present time, we don't have a way to represent general dependency
relationships, so everything is represented using memory dependency. In order
to preserve the data dependency of a READ_REGISTER on WRITE_REGISTER, we need
to model WRITE_REGISTER as writing (which we had been doing) and model
READ_REGISTER as reading (which we had not been doing). Fix this, and also the
way that the chain operands were generated at the SDAG level.

Patch by Nicholas Paul Johnson, thanks! Test case by me.

llvm-svn: 237584
2015-05-18 16:42:10 +00:00
Pete Cooper
e4bb07ecff [Fast-ISel] Clear kill flags on registers replaced by updateValueMap.
When selecting an extract instruction, we don't actually generate code but instead work out which register we are reading, and rewrite uses of the extract def to the source register.  This is done via updateValueMap,.

However, its possible that the source register we are rewriting *to* to also have uses.  If those uses are after a kill of the value we are rewriting *from* then we have uses after a kill and the verifier fails.

This code checks for the case where the to register is also used, and if so it clears all kill on the from register.  This is conservative, but better that always clearing kills on the from register.

llvm-svn: 236897
2015-05-08 20:46:54 +00:00
Duncan P. N. Exon Smith
a9308c49ef IR: Give 'DI' prefix to debug info metadata
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`.  The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.

Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one.  It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs.  YMMV of
course.

Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py.  I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three.  It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).

Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.

llvm-svn: 236120
2015-04-29 16:38:44 +00:00
Sergey Dmitrouk
842a51bad8 Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes"
[DebugInfo] Add debug locations to constant SD nodes

This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).

Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.

Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.

This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.

Differential Revision: http://reviews.llvm.org/D9084

llvm-svn: 235989
2015-04-28 14:05:47 +00:00
Daniel Jasper
48e93f7181 Revert "[DebugInfo] Add debug locations to constant SD nodes"
This breaks a test:
http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870

llvm-svn: 235987
2015-04-28 13:38:35 +00:00
Sergey Dmitrouk
adb4c69d5c [DebugInfo] Add debug locations to constant SD nodes
This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).

Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.

Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.

This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.

Differential Revision: http://reviews.llvm.org/D9084

llvm-svn: 235977
2015-04-28 11:56:37 +00:00
Reid Kleckner
cfbfe6f29c [SEH] Implement GetExceptionCode in __except blocks
This introduces an intrinsic called llvm.eh.exceptioncode. It is lowered
by copying the EAX value live into whatever basic block it is called
from. Obviously, this only works if you insert it late during codegen,
because otherwise mid-level passes might reschedule it.

llvm-svn: 235768
2015-04-24 20:25:05 +00:00
Reid Kleckner
5c5facc2ce Re-commit "[SEH] Remove the old __C_specific_handler code now that WinEHPrepare works"
This reverts commit r235617.

r235649 should have addressed the problems.

llvm-svn: 235667
2015-04-23 23:22:33 +00:00
Reid Kleckner
909ea7e6b8 Revert "[SEH] Remove the old __C_specific_handler code now that WinEHPrepare works"
We still have some "uses remain after removal" issues in -O0 builds.

This reverts commit r235557.

llvm-svn: 235617
2015-04-23 18:34:01 +00:00
Hans Wennborg
0867b151c9 Re-commit r235560: Switch lowering: extract jump tables and bit tests before building binary tree (PR22262)
Third time's the charm. The previous commit was reverted as a
reverse for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--'
on an iterator at the beginning of a vector, causing asserts
when using debugging iterators. This commit fixes that.

llvm-svn: 235608
2015-04-23 16:45:24 +00:00
Aaron Ballman
0be238cebd Revert r235560; this commit was causing several failed assertions in Debug builds using MSVC's STL. The iterator is being used outside of its valid range.
llvm-svn: 235597
2015-04-23 13:41:59 +00:00
Hans Wennborg
15823d49b6 Switch lowering: extract jump tables and bit tests before building binary tree (PR22262)
This is a re-commit of r235101, which also fixes the problems with the previous patch:

- Switches with only a default case and non-fallthrough were handled incorrectly

- The previous patch tickled a bug in PowerPC Early-Return Creation which is fixed here.

> This is a major rewrite of the SelectionDAG switch lowering. The previous code
> would lower switches as a binary tre, discovering clusters of cases
> suitable for lowering by jump tables or bit tests as it went along. To increase
> the likelihood of finding jump tables, the binary tree pivot was selected to
> maximize case density on both sides of the pivot.
>
> By not selecting the pivot in the middle, the binary trees would not always
> be balanced, leading to performance problems in the generated code.
>
> This patch rewrites the lowering to search for clusters of cases
> suitable for jump tables or bit tests first, and then builds the binary
> tree around those clusters. This way, the binary tree will always be balanced.
>
> This has the added benefit of decoupling the different aspects of the lowering:
> tree building and jump table or bit tests finding are now easier to tweak
> separately.
>
> For example, this will enable us to balance the tree based on profile info
> in the future.
>
> The algorithm for finding jump tables is quadratic, whereas the previous algorithm
> was O(n log n) for common cases, and quadratic only in the worst-case. This
> doesn't seem to be major problem in practice, e.g. compiling a file consisting
> of a 10k-case switch was only 30% slower, and such large switches should be rare
> in practice. Compiling e.g. gcc.c showed no compile-time difference.  If this
> does turn out to be a problem, we could limit the search space of the algorithm.
>
> This commit also disables all optimizations during switch lowering in -O0.
>
> Differential Revision: http://reviews.llvm.org/D8649

llvm-svn: 235560
2015-04-22 23:14:56 +00:00
Reid Kleckner
64a2a6a473 [SEH] Remove the old __C_specific_handler code now that WinEHPrepare works
This removes the -sehprepare flag and makes __C_specific_handler
functions always to use WinEHPrepare.

This was tested by building all of chromium_builder_tests and running a
few tests that use SEH, but if something breaks, we can revert this.

llvm-svn: 235557
2015-04-22 22:13:09 +00:00
Reid Kleckner
d2a1a51996 Re-land r235154-r235156 under the existing -sehprepare flag
Keep the old SEH fan-in lowering on by default for now, since projects
rely on it.  This will make it easy to test this change with a simple
flag flip.

llvm-svn: 235399
2015-04-21 18:23:57 +00:00
Nico Weber
a762fa6c98 Revert r235154-r235156, they cause asserts when building win64 code (http://crbug.com/477988)
llvm-svn: 235170
2015-04-17 09:10:43 +00:00
Reid Kleckner
d4523e3c51 [SEH] Reimplement x64 SEH using WinEHPrepare
This now emits simple, unoptimized xdata tables for __C_specific_handler
based on the handlers listed in @llvm.eh.actions calls produced by
WinEHPrepare.

This adds support for running __finally blocks when exceptions are
thrown, and removes the old landingpad fan-in codepath.

I ran some manual execution tests on small basic test cases with and
without optimization, as well as on Chrome base_unittests, which uses a
small amount of SEH.  I'm sure there are bugs, and we may need to
revert.

llvm-svn: 235154
2015-04-17 01:01:27 +00:00
Hans Wennborg
a9e2057416 Revert the switch lowering change (r235101, r235103, r235106)
Looks like it broke the sanitizer-ppc64-linux1 build. Reverting for now.

llvm-svn: 235108
2015-04-16 15:43:26 +00:00
Hans Wennborg
d403664ed8 Switch lowering: extract jump tables and bit tests before building binary tree (PR22262)
This is a major rewrite of the SelectionDAG switch lowering. The previous code
would lower switches as a binary tre, discovering clusters of cases
suitable for lowering by jump tables or bit tests as it went along. To increase
the likelihood of finding jump tables, the binary tree pivot was selected to
maximize case density on both sides of the pivot.

By not selecting the pivot in the middle, the binary trees would not always
be balanced, leading to performance problems in the generated code.

This patch rewrites the lowering to search for clusters of cases
suitable for jump tables or bit tests first, and then builds the binary
tree around those clusters. This way, the binary tree will always be balanced.

This has the added benefit of decoupling the different aspects of the lowering:
tree building and jump table or bit tests finding are now easier to tweak
separately.

For example, this will enable us to balance the tree based on profile info
in the future.

The algorithm for finding jump tables is O(n^2), whereas the previous algorithm
was O(n log n) for common cases, and quadratic only in the worst-case. This
doesn't seem to be major problem in practice, e.g. compiling a file consisting
of a 10k-case switch was only 30% slower, and such large switches should be rare
in practice. Compiling e.g. gcc.c showed no compile-time difference.  If this
does turn out to be a problem, we could limit the search space of the algorithm.

This commit also disables all optimizations during switch lowering in -O0.

Differential Revision: http://reviews.llvm.org/D8649

llvm-svn: 235101
2015-04-16 14:49:23 +00:00
Reid Kleckner
3e9fadfbc8 [WinEH] Try to make the MachineFunction CFG more accurate
This avoids emitting code for unreachable landingpad blocks that contain
calls to llvm.eh.actions and indirectbr.

It's also a first step towards unifying the SEH and WinEH lowering
codepaths. I'm keeping the old fan-in lowering of SEH around until the
preparation version works well enough that we can switch over without
breaking existing users.

llvm-svn: 235037
2015-04-15 18:48:15 +00:00
Alexander Kornienko
f817c1cb9a Use 'override/final' instead of 'virtual' for overridden methods
The patch is generated using clang-tidy misc-use-override check.

This command was used:

  tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py \
    -checks='-*,misc-use-override' -header-filter='llvm|clang' \
    -j=32 -fix -format

http://reviews.llvm.org/D8925

llvm-svn: 234679
2015-04-11 02:11:45 +00:00
Duncan P. N. Exon Smith
e686f1591f CodeGen: Stop using DIDescriptor::is*() and auto-casting
Same as r234255, but for lib/CodeGen and lib/Target.

llvm-svn: 234258
2015-04-06 23:27:40 +00:00
Duncan P. N. Exon Smith
3bef6a3803 CodeGen: Assert that inlined-at locations agree
As a follow-up to r234021, assert that a debug info intrinsic variable's
`MDLocalVariable::getInlinedAt()` always matches the
`MDLocation::getInlinedAt()` of its `!dbg` attachment.

The goal here is to get rid of `MDLocalVariable::getInlinedAt()`
entirely (PR22778), but I'll let these assertions bake for a while
first.

If you have an out-of-tree backend that just broke, you're probably
attaching the wrong `DebugLoc` to a `DBG_VALUE` instruction.  The one
you want is the location that was attached to the corresponding
`@llvm.dbg.declare` or `@llvm.dbg.value` call that you started with.

llvm-svn: 234038
2015-04-03 19:20:26 +00:00
David Majnemer
cde33036ed [WinEH] Run cleanup handlers when an exception is thrown
Generate tables in the .xdata section representing what actions to take
when an exception is thrown.  This currently fills in state for
cleanups, catch handlers are still unfinished.

llvm-svn: 233636
2015-03-30 22:58:10 +00:00
Yaron Keren
75e0c4b060 Remove superfluous .str() and replace std::string concatenation with Twine.
llvm-svn: 233392
2015-03-27 17:51:30 +00:00
Hans Wennborg
81cfeb1999 SelectionDAGIsel: Fix comment about terminators being "handled below".
That changed in r102128.

llvm-svn: 232692
2015-03-19 00:02:22 +00:00
Daniel Sanders
60f1db0525 Recommit r232027 with PR22883 fixed: Add infrastructure for support of multiple memory constraints.
The operand flag word for ISD::INLINEASM nodes now contains a 15-bit
memory constraint ID when the operand kind is Kind_Mem. This constraint
ID is a numeric equivalent to the constraint code string and is converted
with a target specific hook in TargetLowering.

This patch maps all memory constraints to InlineAsm::Constraint_m so there
is no functional change at this point. It just proves that using these
previously unused bits in the encoding of the flag word doesn't break
anything.

The next patch will make each target preserve the current mapping of
everything to Constraint_m for itself while changing the target independent
implementation of the hook to return Constraint_Unknown appropriately. Each
target will then be adapted in separate patches to use appropriate
Constraint_* values.

PR22883 was caused the matching operands copying the whole of the operand flags
for the matched operand. This included the constraint id which needed to be
replaced with the operand number. This has been fixed with a conversion
function. Following on from this, matching operands also used the operand
number as the constraint id. This has been fixed by looking up the matched
operand and taking it from there. 

llvm-svn: 232165
2015-03-13 12:45:09 +00:00
Hal Finkel
e78e52ba9b Revert "r232027 - Add infrastructure for support of multiple memory constraints"
This (r232027) has caused PR22883; so it seems those bits might be used by
something else after all. Reverting until we can figure out what else to do.

Original commit message:

The operand flag word for ISD::INLINEASM nodes now contains a 15-bit
memory constraint ID when the operand kind is Kind_Mem. This constraint
ID is a numeric equivalent to the constraint code string and is converted
with a target specific hook in TargetLowering.

This patch maps all memory constraints to InlineAsm::Constraint_m so there
is no functional change at this point. It just proves that using these
previously unused bits in the encoding of the flag word doesn't break anything.

The next patch will make each target preserve the current mapping of
everything to Constraint_m for itself while changing the target independent
implementation of the hook to return Constraint_Unknown appropriately. Each
target will then be adapted in separate patches to use appropriate Constraint_*
values.

llvm-svn: 232093
2015-03-12 20:09:39 +00:00
Daniel Sanders
41c072e63b Add infrastructure for support of multiple memory constraints.
Summary:
The operand flag word for ISD::INLINEASM nodes now contains a 15-bit
memory constraint ID when the operand kind is Kind_Mem. This constraint
ID is a numeric equivalent to the constraint code string and is converted
with a target specific hook in TargetLowering.

This patch maps all memory constraints to InlineAsm::Constraint_m so there
is no functional change at this point. It just proves that using these
previously unused bits in the encoding of the flag word doesn't break anything.

The next patch will make each target preserve the current mapping of
everything to Constraint_m for itself while changing the target independent
implementation of the hook to return Constraint_Unknown appropriately. Each
target will then be adapted in separate patches to use appropriate Constraint_*
values.

Reviewers: hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D8171

llvm-svn: 232027
2015-03-12 11:00:48 +00:00
Eric Christopher
5f141b03fa Remove useMachineScheduler and replace it with subtarget options
that control, individually, all of the disparate things it was
controlling.

At the same time move a FIXME in the Hexagon port to a new
subtarget function that will enable a user of the machine
scheduler to avoid using the source scheduler for pre-RA-scheduling.
The FIXME would have this removed, but involves either testcase
changes or adding -pre-RA-sched=source to a few testcases.

llvm-svn: 231980
2015-03-11 22:56:10 +00:00
Benjamin Kramer
867bfc53ee Make constant arrays that are passed to functions as const.
In theory this allows the compiler to skip materializing the array on
the stack. In practice clang often fails to do that, but that's a
different story. NFC.

llvm-svn: 231571
2015-03-07 17:41:00 +00:00
Mehdi Amini
367bfa42d8 Use report_fatal_error instead of unreachable for -fast-isel-abort
Suggestion by Andrea Di Biagio

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231201
2015-03-04 01:48:39 +00:00
Mehdi Amini
04f0f5ba61 Fixup for recent -fast-isel-abort change: code didn't match description
Level 1 should abort for all instructions but call/terminators/args.
Instead it was aborting only if the level was > 2

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 230861
2015-02-28 19:34:54 +00:00
Mehdi Amini
945a660cbc Change the fast-isel-abort option from bool to int to enable "levels"
Summary:
Currently fast-isel-abort will only abort for regular instructions,
and just warn for function calls, terminators, function arguments.
There is already fast-isel-abort-args but nothing for calls and
terminators.

This change turns the fast-isel-abort options into an integer option,
so that multiple levels of strictness can be defined.
This will help no being surprised when the "abort" option indeed does
not abort, and enables the possibility to write test that verifies
that no intrinsics are forgotten by fast-isel.

Reviewers: resistor, echristo

Subscribers: jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D7941

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 230775
2015-02-27 18:32:11 +00:00
Reid Kleckner
2d5fb68ee0 Unify the two EH personality classification routines I wrote
We only need one.

llvm-svn: 229193
2015-02-14 00:21:02 +00:00
Chandler Carruth
71f308adb7 Re-sort #include lines using my handy dandy ./utils/sort_includes.py
script. This is in preparation for changes to lots of include lines.

llvm-svn: 229088
2015-02-13 09:09:03 +00:00
Philip Reames
56a03938f7 Revert GCStrategy ownership changes
This change reverts the interesting parts of 226311 (and 227046).  This change introduced two problems, and I've been convinced that an alternate approach is preferrable anyways.

The bugs were:
- Registery appears to require all users be within the same linkage unit.  After this change, asking for "statepoint-example" in Transform/ would sometimes get you nullptr, whereas asking the same question in CodeGen would return the right GCStrategy.  The correct long term fix is to get rid of the utter hack which is Registry, but I don't have time for that right now.  227046 appears to have been an attempt to fix this, but I don't believe it does so completely.
- GCMetadataPrinter::finishAssembly was being called more than once per GCStrategy.  Each Strategy was being added to the GCModuleInfo multiple times.

Once I get time again, I'm going to split GCModuleInfo into the gc.root specific part and a GCStrategy owning Analysis pass.  I'm probably also going to kill off the Registry.  Once that's done, I'll move the new GCStrategyAnalysis and all built in GCStrategies into Analysis.  (As original suggested by Chandler.)  This will accomplish my original goal of being able to access GCStrategy from Transform/  without adding all of the builtin GCs to IR/.  

llvm-svn: 227109
2015-01-26 18:26:35 +00:00
Reid Kleckner
5cc1569c54 Classify functions by EH personality type rather than using the triple
This mostly reverts commit r222062 and replaces it with a new enum. At
some point this enum will grow at least for other MSVC EH personalities.

Also beefs up the way we were sniffing the personality function.
Previously we would emit the Itanium LSDA despite using
__C_specific_handler.

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D6987

llvm-svn: 226920
2015-01-23 18:49:01 +00:00
Chandler Carruth
37df2cfbf8 [PM] Remove the Pass argument from all of the critical edge splitting
APIs and replace it and numerous booleans with an option struct.

The critical edge splitting API has a really large surface of flags and
so it seems worth burning a small option struct / builder. This struct
can be constructed with the various preserved analyses and then flags
can be flipped in a builder style.

The various users are now responsible for directly passing along their
analysis information. This should be enough for the critical edge
splitting to work cleanly with the new pass manager as well.

This API is still pretty crufty and could be cleaned up a lot, but I've
focused on this change just threading an option struct rather than
a pass through the API.

llvm-svn: 226456
2015-01-19 12:09:11 +00:00
Philip Reames
2b45395876 Move ownership of GCStrategy objects to LLVMContext
Note: This change ended up being slightly more controversial than expected.  Chandler has tentatively okayed this for the moment, but I may be revisiting this in the near future after we settle some high level questions.

Rather than have the GCStrategy object owned by the GCModuleInfo - which is an immutable analysis pass used mainly by gc.root - have it be owned by the LLVMContext. This simplifies the ownership logic (i.e. can you have two instances of the same strategy at once?), but more importantly, allows us to access the GCStrategy in the middle end optimizer. To this end, I add an accessor through Function which becomes the canonical way to get at a GCStrategy instance.

In the near future, this will allows me to move some of the checks from http://reviews.llvm.org/D6808 into the Verifier itself, and to introduce optimization legality predicates for some of the recent additions to InstCombine. (These will follow as separate changes.)

Differential Revision: http://reviews.llvm.org/D6811

llvm-svn: 226311
2015-01-16 20:07:33 +00:00
Mehdi Amini
fa546b29a0 Fix SelectionDAG -view-*-dags filtering
llvm-svn: 226163
2015-01-15 12:03:32 +00:00
Chandler Carruth
b98f63dbdb [PM] Separate the TargetLibraryInfo object from the immutable pass.
The pass is really just a means of accessing a cached instance of the
TargetLibraryInfo object, and this way we can re-use that object for the
new pass manager as its result.

Lots of delta, but nothing interesting happening here. This is the
common pattern that is developing to allow analyses to live in both the
old and new pass manager -- a wrapper pass in the old pass manager
emulates the separation intrinsic to the new pass manager between the
result and pass for analyses.

llvm-svn: 226157
2015-01-15 10:41:28 +00:00
Chandler Carruth
62d4215baa [PM] Move TargetLibraryInfo into the Analysis library.
While the term "Target" is in the name, it doesn't really have to do
with the LLVM Target library -- this isn't an abstraction which LLVM
targets generally need to implement or extend. It has much more to do
with modeling the various runtime libraries on different OSes and with
different runtime environments. The "target" in this sense is the more
general sense of a target of cross compilation.

This is in preparation for porting this analysis to the new pass
manager.

No functionality changed, and updates inbound for Clang and Polly.

llvm-svn: 226078
2015-01-15 02:16:27 +00:00
Reid Kleckner
9b5eaf0d5a Emit the Itanium LSDA for unknown EH personalities on Win64
This fixes lots of generic CodeGen tests that use __gcc_personality_v0.
This suggests that using ExceptionHandling::MSVC was a mistake, and we
should instead classify each function by personality function. This
would, for example, allow us to LTO a binary containing uses of SEH and
Itanium EH.

llvm-svn: 226019
2015-01-14 18:50:10 +00:00
Mehdi Amini
d8976b8ed3 SelectionDAG: add a -filter-view-dags option to llc
This option takes the name of the basic block you want to visualize
with -view-*-dags

Differential Revision: http://reviews.llvm.org/D6948

llvm-svn: 225953
2015-01-14 06:03:18 +00:00
Reid Kleckner
0a57f65514 CodeGen support for x86_64 SEH catch handlers in LLVM
This adds handling for ExceptionHandling::MSVC, used by the
x86_64-pc-windows-msvc triple. It assumes that filter functions have
already been outlined in either the frontend or the backend. Filter
functions are used in place of the landingpad catch clause type info
operands. In catch clause order, the first filter to return true will
catch the exception.

The C specific handler table expects the landing pad to be split into
one block per handler, but LLVM IR uses a single landing pad for all
possible unwind actions. This patch papers over the mismatch by
synthesizing single instruction BBs for every catch clause to fill in
the EH selector that the landing pad block expects.

Missing functionality:
- Accessing data in the parent frame from outlined filters
- Cleanups (from __finally) are unsupported, as they will require
  outlining and parent frame access
- Filter clauses are unsupported, as there's no clear analogue in SEH

In other words, this is the minimal set of changes needed to write IR to
catch arbitrary exceptions and resume normal execution.

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D6300

llvm-svn: 225904
2015-01-14 01:05:27 +00:00
David Blaikie
70573dcd9f Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>
This is to be consistent with StringSet and ultimately with the standard
library's associative container insert function.

This lead to updating SmallSet::insert to return pair<iterator, bool>,
and then to update SmallPtrSet::insert to return pair<iterator, bool>,
and then to update all the existing users of those functions...

llvm-svn: 222334
2014-11-19 07:49:26 +00:00
Eric Christopher
2ae2de7562 Remove uses of the TargetMachine from FunctionLoweringInfo
via caching TargetLowering and using the MachineFunction.

llvm-svn: 219375
2014-10-09 00:57:31 +00:00
Eric Christopher
d2670a34c9 Grab the TargetRegisterInfo off of the subtarget from the
MachineFunction rather than a lookup on the TargetMachine
to avoid unnecessary lookups.

llvm-svn: 219291
2014-10-08 09:50:52 +00:00
Eric Christopher
b17140de35 Cache TargetLowering on SelectionDAGISel and update previous
calls to getTargetLowering() with the cached variable.

llvm-svn: 219284
2014-10-08 07:32:17 +00:00
Eric Christopher
60eb343e83 Cache SelectionDAGISel TargetInstrInfo lookups on the class and
propagate. Also use the TargetSubtargetInfo and the MachineFunction
and move TargetRegisterInfo query closer to uses.

llvm-svn: 219273
2014-10-08 01:58:03 +00:00
Eric Christopher
896044822e Reset the target options and optimization level as the first
thing we do inside selection dag. This code needs to be
migrated to queries on the function rather than global
data, but this organizes things before we start grabbing
the subtarget.

llvm-svn: 219271
2014-10-08 01:58:01 +00:00
Eric Christopher
8d07f44560 Have the selection dag grab TargetLowering off of the subtarget
inside init rather than have it passed in as an argument.

llvm-svn: 219270
2014-10-08 01:57:58 +00:00
Adam Nemet
ff63a2dc51 [ISel] Keep matching state consistent when folding during X86 address match
In the X86 backend, matching an address is initiated by the 'addr' complex
pattern and its friends.  During this process we may reassociate and-of-shift
into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the
shift into the scale of the address.

However as demonstrated by the testcase, this can trigger CSE of not only the
shift and the AND which the code is prepared for but also the underlying load
node.  In the testcase this node is sitting in the RecordedNode and MatchScope
data structures of the matcher and becomes a deleted node upon CSE.  Returning
from the complex pattern function, we try to access it again hitting an assert
because the node is no longer a load even though this was checked before.

Now obviously changing the DAG this late is bending the rules but I think it
makes sense somewhat.  Outside of addresses we prefer and-of-shift because it
may lead to smaller immediates (FoldMaskAndShiftToScale is an even better
example because it create a non-canonical node).  We currently don't recognize
addresses during DAGCombiner where arguably this canonicalization should be
performed.  On the other hand, having this in the matcher allows us to cover
all the cases where an address can be used in an instruction.

I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for
the new shift node in FoldMaskedShiftToScaledMask.  This RAUW is responsible
for initiating the recursive CSE on users
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it
is not strictly necessary since the shift is hooked into the visited user.  Of
course it's safer to keep the DAG consistent at all times (e.g. for accurate
number of uses, etc.).

So rather than changing the fundamentals, I've decided to continue along the
previous patches and detect the CSE.  This patch installs a very targeted
DAGUpdateListener for the duration of a complex-pattern match and updates the
matching state accordingly.  (Previous patches used HandleSDNode to detect the
CSE but that's not practical here).  The listener is only installed on X86.

I tested that there is no measurable overhead due to this while running
through the spec2k BC files with llc.  The only thing we pay for is the
creation of the listener.  The callback never ever triggers in spec2k since
this is a corner case.

Fixes rdar://problem/18206171

llvm-svn: 219009
2014-10-03 20:00:34 +00:00
Adrian Prantl
87b7eb9d0f Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

Note: I accidentally committed a bogus older version of this patch previously.
llvm-svn: 218787
2014-10-01 18:55:02 +00:00
Adrian Prantl
b458dc2eee Revert r218778 while investigating buldbot breakage.
"Move the complex address expression out of DIVariable and into an extra"

llvm-svn: 218782
2014-10-01 18:10:54 +00:00
Adrian Prantl
25a7174e7a Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

llvm-svn: 218778
2014-10-01 17:55:39 +00:00
Eric Christopher
3976f78247 Move resetTargetOptions from taking a MachineFunction to a Function
since we are accessing the TargetMachine that we're a member
function of.

llvm-svn: 218489
2014-09-26 01:28:10 +00:00