Commit Graph

25526 Commits

Author SHA1 Message Date
Craig Topper
826f44b550 [TargetLowering][AMDGPU] Remove the SimplifyDemandedBits function that takes a User and OpIdx. Stop using it in AMDGPU target for simplifyI24.
As we saw in D56057 when we tried to use this function on X86, it's unsafe. It allows the operand node to have multiple users, but doesn't prevent recursing past the first node when it does have multiple users. This can cause other simplifications earlier in the graph without regard to what bits are needed by the other users of the first node. Ideally all we should do to the first node if it has multiple uses is bypass it when its not needed by the user we started from. Doing any other transformation that SimplifyDemandedBits can do like turning ZEXT/SEXT into AEXT would result in an increase in instructions.

Fortunately, we already have a function that can do just that, GetDemandedBits. It will only make transformations that involve bypassing a node.

This patch changes AMDGPU's simplifyI24, to use a combination of GetDemandedBits to handle the multiple use simplifications. And then uses the regular SimplifyDemandedBits on each operand to handle simplifications allowed when the operand only has a single use. Unfortunately, GetDemandedBits simplifies constants more aggressively than SimplifyDemandedBits. This caused the -7 constant in the changed test to be simplified to remove the upper bits. I had to modify computeKnownBits to account for this by ignoring the upper 8 bits of the input.

Differential Revision: https://reviews.llvm.org/D56087

llvm-svn: 350560
2019-01-07 19:30:43 +00:00
Lama Saba
f385c21f79 Revert "Resubmit rL345008 "Split MachinePipeliner code into header and cpp files""
This reverts commit rL350493
issues related to modules  still appear in http://green.lab.llvm.org/green/job/lldb-cmake

llvm-svn: 350497
2019-01-06 16:39:14 +00:00
Lama Saba
ea9d555b83 Resubmit rL345008 "Split MachinePipeliner code into header and cpp files"
Resubmitted in rL345290 and reverted in rL350345 due to failures in
http://green.lab.llvm.org/green/job/lldb-cmake/
Resubmitting after a workaround to lldb-cmake failure was
committed in rL350346, more info in https://reviews.llvm.org/D56084

llvm-svn: 350493
2019-01-06 15:45:40 +00:00
Craig Topper
57fc891c1b [LegalizeVectorOps] Add FSHL/FSHR to the list of vector operations that should be handled.
The FSHL/FSHR nodes are handled in the expand function, but they need to also be listed in the code that queries for the operation action too.

llvm-svn: 350490
2019-01-06 07:06:35 +00:00
Stanislav Mekhanoshin
35a3a3bd11 Added single use check to ShrinkDemandedConstant
Fixes cvt_f32_ubyte combine. performCvtF32UByteNCombine() could shrink
source node to demanded bits only even if there are other uses.

Differential Revision: https://reviews.llvm.org/D56289

llvm-svn: 350475
2019-01-05 19:20:00 +00:00
Craig Topper
cfeb1cf9af [X86] Add INSERT_SUBVECTOR to ComputeNumSignBits
This adds support for calculating sign bits of insert_subvector. I based it on the computeKnownBits.

My motivating case is propagating sign bits information across basic blocks on AVX targets where concatenating using insert_subvector is common.

Differential Revision: https://reviews.llvm.org/D56283

llvm-svn: 350432
2019-01-04 20:50:59 +00:00
Sanjay Patel
9633d76a40 [DAGCombiner][x86] scalarize binop followed by extractelement
As noted in PR39973 and D55558:
https://bugs.llvm.org/show_bug.cgi?id=39973
...this is a partial implementation of a fold that we do as an IR canonicalization in instcombine:

// extelt (binop X, Y), Index --> binop (extelt X, Index), (extelt Y, Index)

We want to have this in the DAG too because as we can see in some of the test diffs (reductions), 
the pattern may not be visible in IR.

Given that this is already an IR canonicalization, any backend that would prefer a vector op over 
a scalar op is expected to already have the reverse transform in DAG lowering (not sure if that's
a realistic expectation though). The transform is limited with a TLI hook because there's an
existing transform in CodeGenPrepare that tries to do the opposite transform.

Differential Revision: https://reviews.llvm.org/D55722

llvm-svn: 350354
2019-01-03 21:31:16 +00:00
Stefan Granitz
a9b7ca472d Revert "Resubmit rL345008 "Split MachinePipeliner code into header and cpp files""
This reverts commit r350290.

llvm-svn: 350345
2019-01-03 19:09:24 +00:00
Lama Saba
4d752a88e8 Resubmit rL345008 "Split MachinePipeliner code into header and cpp files"
The commit caused unclear failures in http://green.lab.llvm.org/green//job/lldb-cmake/
will revert if the error reappears

Differential Revision: https://reviews.llvm.org/D56084

llvm-svn: 350290
2019-01-03 10:03:54 +00:00
Markus Lavin
72b9deb21f [CodeGen] Skip over dbg-instr in twoaddr pass
A DBG_VALUE between a two-address instruction and a following COPY
would prevent rescheduleMIBelowKill optimization inside
TwoAddressInstructionPass.

Differential Revision: https://reviews.llvm.org/D55987

llvm-svn: 350289
2019-01-03 08:36:06 +00:00
Craig Topper
8dd7bd2cd7 [DAGCombiner] After performing the division by constant optimization for a DIV or REM node, replace the users of the corresponding REM or DIV node if it exists.
Currently we expand the two nodes separately. This gives DAG combiner an opportunity to optimize the expanded sequence taking into account only one set of users. When we expand the other node we'll create the expansion again, but might not be able to optimize it the same way. So the nodes won't CSE and we'll have two similarish sequences in the same basic block. By expanding both nodes at the same time we'll avoid prematurely optimizing the expansion until both the division and remainder have been replaced.

Improves the test case from PR38217. There may be additional opportunities after this.

Differential Revision: https://reviews.llvm.org/D56145

llvm-svn: 350239
2019-01-02 18:19:07 +00:00
Craig Topper
3109f3a4ab [LegalizeIntegerTypes] When promoting the result of an extract_vector_elt also promote the input type if necessary
By also promoting the input type we get a better idea for what scalar type to use. This can provide better results if the result of the extract is sign extended. What was previously happening is that the extract result would be legalized, sometime later the input of the sign extend would be legalized using the result of the extract. Then later the extract input would be legalized forcing a truncate into the input of the sign extend using a replace all uses. This requires DAG combine to combine out the sext/truncate pair. But sometimes we visited the truncate first and messed things up before the sext could be combined.

By creating the extract with the correct scalar type when we create legalize the result type, the truncate will be added right away. Then when the sign_extend input is legalized it will create an any_extend of the truncate which can be optimized by getNode to maybe remove the truncate. And then a sign_extend_inreg. Now DAG combine doesn't have to worry about getting rid of the extend.

This fixes the regression on X86 in D56156.

Differential Revision: https://reviews.llvm.org/D56176

llvm-svn: 350236
2019-01-02 17:58:30 +00:00
Craig Topper
c562fae02b [DAGCombiner][X86][PowerPC] Teach visitSIGN_EXTEND_INREG to fold (sext_in_reg (aext/sext x)) -> (sext x) when x has more than 1 sign bit and the sext_inreg is from one of them.
If x has multiple sign bits than it doesn't matter which one we extend from so we can sext from x's msb instead.

The X86 setcc-combine.ll changes are a little weird. It appears we ended up with a (sext_inreg (aext (trunc (extractelt)))) after type legalization. The sext_inreg+aext now gets optimized by this combine to leave (sext (trunc (extractelt))). Then we visit the trunc before we visit the sext. This ends up changing the truncate to an extractvectorelt from a bitcasted vector. I have a follow up patch to fix this.

Differential Revision: https://reviews.llvm.org/D56156

llvm-svn: 350235
2019-01-02 17:58:27 +00:00
Ayonam Ray
e00606a1b2 Reversing the commit in revision 350186. Revision causes regression in 4
tests.

llvm-svn: 350187
2019-01-01 07:28:55 +00:00
Ayonam Ray
c471bb2e67 Omit range checks from jump tables when lowering switches with unreachable
default

During the lowering of a switch that would result in the generation of a jump
table, a range check is performed before indexing into the jump table, for the
switch value being outside the jump table range and a conditional branch is
inserted to jump to the default block. In case the default block is
unreachable, this conditional jump can be omitted. This patch implements
omitting this conditional branch for unreachable defaults.

Review Reference: D52002

llvm-svn: 350186
2019-01-01 06:37:50 +00:00
Craig Topper
ed3ffae4a4 [SelectionDAG] Add SIGN_EXTEND_VECTOR_INREG support to computeKnownBits.
Differential Revision: https://reviews.llvm.org/D56168

llvm-svn: 350179
2018-12-31 19:09:30 +00:00
Craig Topper
802c4979ae [DAGCombiner] Add missing one use check on the shuffle in the bitcast(shuffle(bitcast(s0),bitcast(s1))) -> shuffle(s0,s1) transform.
Found while trying out some other changes so I don't really have a test case.

llvm-svn: 350172
2018-12-31 05:40:46 +00:00
Kang Zhang
4aa6453767 [PowerPC] Fix ADDE, SUBE do not know how to promote operator
Summary:
This patch is created to fix the Bugzilla bug 39815:
https://bugs.llvm.org/show_bug.cgi?id=39815 

This patch is to support promotion integer result for the instruction ADDE, SUBE.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D56119

llvm-svn: 350161
2018-12-30 07:48:09 +00:00
Richard Trieu
a87b70d1db Add vtable anchor to classes.
llvm-svn: 350142
2018-12-29 02:02:13 +00:00
Reid Kleckner
c168c6f86f [codeview] Check if this 'this' type of a method is a pointer
Fixes crash reported after r347354 for frontends that don't always emit
'this' pointers for methods. Now we will silently produce debug info
that makes functions like this look like static methods, which seems
reasonable.

llvm-svn: 350073
2018-12-26 21:52:17 +00:00
Justin Lebar
49fac56ea3 [NVPTX] Allow libcalls that are defined in the current module.
The patch adds a possibility to make library calls on NVPTX.

An important thing about library functions - they must be defined within
the current module. This basically should guarantee that we produce a
valid PTX assembly (without calls to not defined functions). The one who
wants to use the libcalls is probably will have to link against
compiler-rt or any other implementation.

Currently, it's completely impossible to make library calls because of
error LLVM ERROR: Cannot select: i32 = ExternalSymbol '...'. But we can
lower ExternalSymbol to TargetExternalSymbol and verify if the function
definition is available.

Also, there was an issue with a DAG during legalisation. When we expand
instruction into libcall, the inner call-chain isn't being "integrated"
into outer chain. Since the last "data-flow" (call retval load) node is
located in call-chain earlier than CALLSEQ_END node, the latter becomes
a leaf and therefore a dead node (and is being removed quite fast).
Proposed here solution relies on another data-flow pseudo nodes
(ProxyReg) which purpose is only to keep CALLSEQ_END at legalisation and
instruction selection phases - we remove the pseudo instructions before
register scheduling phase.

Patch by Denys Zariaiev!

Differential Revision: https://reviews.llvm.org/D34708

llvm-svn: 350069
2018-12-26 19:12:31 +00:00
Petar Avramovic
09dff33349 [MIPS GlobalISel] Select G_SELECT
Add widen scalar for type index 1 (i1 condition) for G_SELECT.
Select G_SELECT for pointer, s32(integer) and smaller low level
types on MIPS32.

Differential Revision: https://reviews.llvm.org/D56001

llvm-svn: 350063
2018-12-25 14:42:30 +00:00
Craig Topper
0229da8f07 [X86] Use GetDemandedBits to simplify the operands of PMULDQ/PMULUDQ.
This is an alternative to what I attempted in D56057.

GetDemandedBits is a special version of SimplifyDemandedBits that allows simplifications even when the operand has other uses. GetDemandedBits will only do simplifications that allow a node to be bypassed. It won't create new nodes or alter any of the other users.

I had to add support for bypassing SIGN_EXTEND_INREG to GetDemandedBits.

Based on a patch that Simon Pilgrim sent me in email.

Fixes PR40142.

llvm-svn: 350059
2018-12-24 19:40:20 +00:00
Max Kazantsev
0b455c2b71 Revert rL350048 and rL350050
These patches have broken almost all buildbots on test
DebugInfo/X86/addr_comments.ll. Reverting to green.

llvm-svn: 350052
2018-12-24 10:30:04 +00:00
David Blaikie
5353e64935 Fix build - follow-up to r350048 which broke headerless (v4) address pool
llvm-svn: 350050
2018-12-24 07:56:40 +00:00
David Blaikie
e20bf9ab91 DebugInfo: Use assembly label arithmetic for address pool size for easier reading/editing
llvm-svn: 350048
2018-12-24 07:35:10 +00:00
David Blaikie
d671eb7e7c DebugInfo: Add assembly comments for debug_addr contribution header fields
llvm-svn: 350047
2018-12-24 07:09:50 +00:00
George Burgess IV
610c76534f [SelectionDAGBuilder] Use ::precise LocationSizes; NFC
More migration so we can disable the implicit int -> LocationSize
conversion.

All of these are either scatter/gather'ed vector instructions, or direct
loads. Hence, they're all precise.

Perhaps if we see way more getTypeStoreSize calls, we can make a
getTypeStoreLocationSize (or similar) as a wrapper that applies this
::precise. Doesn't appear that it's a good idea to make getTypeStoreSize
return a LocationSize itself, however.

llvm-svn: 350042
2018-12-24 05:34:21 +00:00
Sanjay Patel
93f1074677 [DAGCombiner] limit shuffle to extend transform (PR40146)
It's dangerous to knowingly create an illegal vector type
no matter what stage of combining we're in.

This prevents the missed folding/scalarization seen in:
https://bugs.llvm.org/show_bug.cgi?id=40146

llvm-svn: 350034
2018-12-23 20:48:31 +00:00
Sanjay Patel
9933574ac3 [DAGCombiner] allow hoisting vector bitwise logic ahead of extends
llvm-svn: 350032
2018-12-23 19:58:16 +00:00
Sanjay Patel
4b537aaf6d [DAGCombiner] allow narrowing of add followed by truncate
trunc (add X, C ) --> add (trunc X), C'

If we're throwing away the top bits of an 'add' instruction, do it in the narrow destination type.
This makes the truncate-able opcode list identical to the sibling transform done in IR (in instcombine).

This change used to show regressions for x86, but those are gone after D55494. 
This gets us closer to deleting the x86 custom function (combineTruncatedArithmetic) 
that does almost the same thing.

Differential Revision: https://reviews.llvm.org/D55866

llvm-svn: 350006
2018-12-22 17:10:31 +00:00
Vedant Kumar
b264d69de7 [IR] Add Instruction::isLifetimeStartOrEnd, NFC
Instruction::isLifetimeStartOrEnd() checks whether an Instruction is an
llvm.lifetime.start or an llvm.lifetime.end intrinsic.

This was suggested as a cleanup in D55967.

Differential Revision: https://reviews.llvm.org/D56019

llvm-svn: 349964
2018-12-21 21:49:40 +00:00
Sanjay Patel
47a6129e26 [DAGCombiner] simplify code leading to scalarizeExtractedVectorLoad; NFC
llvm-svn: 349958
2018-12-21 21:26:30 +00:00
Jessica Paquette
453ab1db5b [GlobalISel][AArch64] Add support for widening G_FCEIL
This adds support for widening G_FCEIL in LegalizerHelper and
AArch64LegalizerInfo. More specifically, it teaches the AArch64 legalizer to
widen G_FCEIL from a 16-bit float to a 32-bit float when the subtarget doesn't
support full FP 16.

This also updates AArch64/f16-instructions.ll to show that we perform the
correct transformation.

llvm-svn: 349927
2018-12-21 17:05:26 +00:00
Simon Pilgrim
911dce2f30 [SelectionDAG] Always use the version of computeKnownBits that returns a value. NFCI.
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version.

llvm-svn: 349907
2018-12-21 14:56:18 +00:00
Eli Friedman
b1bbd5dca3 [ARM] Complete the Thumb1 shift+and->shift+shift transforms.
This saves materializing the immediate.  The additional forms are less
common (they don't usually show up for bitfield insert/extract), but
they're still relevant.

I had to add a new target hook to prevent DAGCombine from reversing the
transform. That isn't the only possible way to solve the conflict, but
it seems straightforward enough.

Differential Revision: https://reviews.llvm.org/D55630

llvm-svn: 349857
2018-12-20 23:39:54 +00:00
David Blaikie
b3c56af49b DebugInfo: Fix for missing comp_dir handling with r349207
When deciding lazily whether a CU would be split or non-split I
accidentally dropped some handling for the line tables comp_dir (by
doing it lazily it was too late to be handled properly by the MC line
table code).

Move that bit of the code back to the non-lazy place.

llvm-svn: 349819
2018-12-20 20:46:55 +00:00
Brock Wyma
b17464e4e8 [CodeView] Emit global variables within lexical scopes to limit visibility
Emit static locals within the correct lexical scope so variables with the same
name will not confuse the debugger into getting the wrong value.

Differential Revision: https://reviews.llvm.org/D55336

llvm-svn: 349777
2018-12-20 17:33:45 +00:00
Simon Pilgrim
b208255fe0 [SelectionDAGBuilder] Enable funnel shift building to custom rotates
This patch enables funnel shift -> rotate building for all ROTL/ROTR custom/legal operations.

AFAICT X86 was the last target that was missing modulo support (PR38243), but I've tried to CC stakeholders for every target that has ROTL/ROTR custom handling for their final OK.

Differential Revision: https://reviews.llvm.org/D55747

llvm-svn: 349765
2018-12-20 14:56:44 +00:00
Clement Courbet
36a3480385 Re-land r349731 "[CodeGen][ExpandMemcmp] Add an option for allowing overlapping loads.
Update PPC ir following GEP->bitcat to bitcat->GEP->bitcat change.

llvm-svn: 349747
2018-12-20 13:01:04 +00:00
Clement Courbet
e22cf4d7cb Revert r349731 "[CodeGen][ExpandMemcmp] Add an option for allowing overlapping loads."
Forgot to update PowerPC tests for the GEP->bitcast change.

llvm-svn: 349733
2018-12-20 09:58:33 +00:00
Clement Courbet
1bb6e1b0f2 [CodeGen][ExpandMemcmp] Add an option for allowing overlapping loads.
Summary:
This allows expanding {7,11,13,14,15,21,22,23,25,26,27,28,29,30,31}-byte memcmp
in just two loads on X86. These were previously calling memcmp.

Reviewers: spatel, gchatelet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D55263

llvm-svn: 349731
2018-12-20 09:13:47 +00:00
Craig Topper
bd788ce5db [DAGCombiner] Fix a place that was creating a SIGN_EXTEND with an extra operand.
llvm-svn: 349726
2018-12-20 05:28:06 +00:00
Matt Davis
87b2268c0c [DwarfExpression] Fix a typo in a doxygen comment. NFC.
llvm-svn: 349703
2018-12-20 00:01:57 +00:00
Eli Friedman
a69084ffa8 [CodeGenPrepare] Fix bad IR created by large offset GEP splitting.
Creating the IR builder, then modifying the CFG, leads to an IRBuilder
where the BB and insertion point are inconsistent, so new instructions
have the wrong parent.

Modified an existing test because the test wasn't covering anything
useful (the "invoke" was not actually an invoke by the time we hit the
code in question).

Differential Revision: https://reviews.llvm.org/D55729

llvm-svn: 349693
2018-12-19 22:52:04 +00:00
Rhys Perry
972273d1d3 Fix test commit
Seems that was actually a eight space tab...

llvm-svn: 349690
2018-12-19 22:33:42 +00:00
Rhys Perry
111bf831de Test commit
Replace tab with 4 spaces.

llvm-svn: 349689
2018-12-19 22:26:51 +00:00
Jessica Paquette
3560e93dc1 [GlobalISel][AArch64] Add support for @llvm.ceil
This adds a G_FCEIL generic instruction and uses it in AArch64. This adds
selection for floating point ceil where it has a supported, dedicated
instruction. Other cases aren't handled here.

It updates the relevant gisel tests and adds a select-ceil test. It also adds a
check to arm64-vcvt.ll which ensures that we don't fall back when we run into
one of the relevant cases.

llvm-svn: 349664
2018-12-19 19:01:36 +00:00
Simon Pilgrim
2ae3a91656 [SelectionDAG] Optional handling of UNDEF elements in matchBinaryPredicate (part 2 of 2)
Now that SimplifyDemandedBits/SimplifyDemandedVectorElts is simplifying vector elements, we're seeing more constant BUILD_VECTOR containing undefs.

This patch provides opt-in support for UNDEF elements in matchBinaryPredicate, passing NULL instead of the result ConstantSDNode* argument.

I've updated the (or (and X, c1), c2) -> (and (or X, c2), c1|c2) fold to demonstrate its use, which I believe is safe for undef cases.

Differential Revision: https://reviews.llvm.org/D55822

llvm-svn: 349629
2018-12-19 14:09:38 +00:00
Simon Pilgrim
47ff0431e9 [SelectionDAG] Optional handling of UNDEF elements in matchBinaryPredicate (part 1 of 2)
Now that SimplifyDemandedBits/SimplifyDemandedVectorElts is simplifying vector elements, we're seeing more constant BUILD_VECTOR containing undefs.

This patch provides opt-in support for UNDEF elements in matchBinaryPredicate, passing NULL instead of the result ConstantSDNode* argument.

Differential Revision: https://reviews.llvm.org/D55822

llvm-svn: 349628
2018-12-19 14:09:09 +00:00