Commit Graph

9448 Commits

Author SHA1 Message Date
Simon Pilgrim
f6c898e12f [TargetLowering] Add ISD::AND handling to SimplifyDemandedVectorElts
If either of the operand elements are zero then we know the result element is going to be zero (even if the other element is undef).

Differential Revision: https://reviews.llvm.org/D55558

llvm-svn: 348926
2018-12-12 13:43:07 +00:00
Leonard Chan
118e53fd63 [Intrinsic] Signed Fixed Point Multiplication Intrinsic
Add an intrinsic that takes 2 signed integers with the scale of them provided
as the third argument and performs fixed point multiplication on them.

This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.

Differential Revision: https://reviews.llvm.org/D54719

llvm-svn: 348912
2018-12-12 06:29:14 +00:00
Clement Courbet
8b6434bbb9 Revert r348843 "[CodeGen] Allow mempcy/memset to generate small overlapping stores."
Breaks ARM/memcpy-inline.ll

llvm-svn: 348844
2018-12-11 13:38:43 +00:00
Clement Courbet
93b3445770 [CodeGen] Allow mempcy/memset to generate small overlapping stores.
Summary:
All targets either just return false here or properly model `Fast`, so I
don't think there is any reason to prevent CodeGen from doing the right
thing here.

Subscribers: nemanjai, javed.absar, eraman, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D55365

llvm-svn: 348843
2018-12-11 13:15:56 +00:00
Simon Pilgrim
f6371f5f23 [TargetLowering] Add ISD::EXTRACT_VECTOR_ELT support to SimplifyDemandedBits
Let SimplifyDemandedBits attempt to simplify all elements of a vector extraction.

Part of PR39689.

llvm-svn: 348839
2018-12-11 11:08:40 +00:00
Simon Pilgrim
fc2c9af99c [TargetLowering] Add UNDEF folding to SimplifyDemandedVectorElts
If all the demanded elements of the SimplifyDemandedVectorElts are known to be UNDEF, we can simplify to an ISD::UNDEF node.

Zero constant folding will be handled in a future patch - its a little trickier as we often have bitcasted zero values.

Differential Revision: https://reviews.llvm.org/D55511

llvm-svn: 348784
2018-12-10 18:29:46 +00:00
Simon Pilgrim
c73a955370 [DAGCombiner] Remove unnecessary recursive DAGCombiner::visitINSERT_SUBVECTOR call.
As discussed on D55511, this caused an issue if the inner node deletes a node that the outer node depends upon. As it doesn't affect any lit-tests and I've only been able to expose this with the D55511 change I'm committing this now.

llvm-svn: 348781
2018-12-10 18:18:50 +00:00
Francis Visoiu Mistrih
753efe3584 [DAGCombiner] Use the result value type in visitCONCAT_VECTORS
This triggers an assert when combining concat_vectors of a bitcast of
merge_values.

With asserts disabled, it fails to select:
fatal error: error in backend: Cannot select: 0x7ff19d000e90: i32 = any_extend 0x7ff19d000ae8
  0x7ff19d000ae8: f64,ch = CopyFromReg 0x7ff19d000c20:1, Register:f64 %1
    0x7ff19d000b50: f64 = Register %1
In function: d

Differential Revision: https://reviews.llvm.org/D55507

llvm-svn: 348759
2018-12-10 14:31:34 +00:00
Jeremy Morse
a06b163d5c [DebugInfo] Don't drop dbg.value's of nullptr
Currently, dbg.value's of "nullptr" are dropped when entering a SelectionDAG --
apparently just because of an oversight when recognising Values that are
constant (see PR39787). This patch adds ConstantPointerNull to the list of
constants that can be turned into DBG_VALUEs.

The matter of what bit-value a null pointer constant in LLVM has was raised
in this mailing list thread:

  http://lists.llvm.org/pipermail/llvm-dev/2018-December/128234.html

Where it transpires LLVM relies on (IR) null pointers being zero valued,
thus I've baked this assumption into the patch.

Differential Revision: https://reviews.llvm.org/D55227

llvm-svn: 348753
2018-12-10 12:04:08 +00:00
Jeremy Morse
045c67769d [DebugInfo] Emit undef DBG_VALUEs when SDNodes are optimised out
This is a fix for PR39896, where dbg.value's of SDNodes that have been
optimised out do not lead to "DBG_VALUE undef" instructions being created.
Such undef instructions are necessary to terminate earlier variable
ranges, otherwise variable values leak past the point where they're valid.

The "invalidated" flag of SDDbgValue is currently being abused to mean two
things:
 * The corresponding SDNode is now invalid
 * This SDDbgValue should not be emitted
Of which there are several legitimate combinations of meaning:
 * The SDNode has been invalidated and we should emit "DBG_VALUE undef"
 * The SDNode has been invalidated but the debug data was salvaged, don't
   emit anything for this SDDbgValue
 * This SDDbgValue has been emitted

This patch introduces distinct "Emitted" and "Invalidated" fields to the
SDDbgValue class, updates users accordingly, and generates "undef"
DBG_VALUEs for invalidated records. Awkwardly, there are circumstances
where we emit SDDbgValue's twice, specifically DebugInfo/X86/dbg-addr-dse.ll
which I've preserved.

Differential Revision: https://reviews.llvm.org/D55372

llvm-svn: 348751
2018-12-10 11:20:47 +00:00
Sanjay Patel
e767bf4468 [DAGCombiner] re-enable truncation of binops
This is effectively re-committing the changes from:
rL347917 (D54640)
rL348195 (D55126)
...which were effectively reverted here:
rL348604
...because the code had a bug that could induce infinite looping
or eventual out-of-memory compilation.

The bug was that this code did not guard against transforming
opaque constants. More details are in the post-commit mailing
list thread for r347917. A reduced test for that is included
in the x86 bool-math.ll file. (I wasn't able to reduce a PPC
backend test for this, but it was almost the same pattern.)

Original commit message for r347917:

The motivating case for this is shown in:
https://bugs.llvm.org/show_bug.cgi?id=32023
and the corresponding rot16.ll regression tests.

Because x86 scalar shift amounts are i8 values, we can end up with trunc-binop-trunc
sequences that don't get folded in IR.

As the TODO comments suggest, there will be regressions if we extend this (for x86,
we mostly seem to be missing LEA opportunities, but there are likely vector folds
missing too). I think those should be considered existing bugs because this is the
same transform that we do as an IR canonicalization in instcombine. We just need
more tests to make those visible independent of this patch.

llvm-svn: 348706
2018-12-08 16:07:38 +00:00
Craig Topper
b4c96f5a32 [SelectionDAG] Remove ISD::ADDC/ADDE from some undef handling code in getNode. NFCI
These nodes should have two results. A real VT and a Glue. But this code would have returned Undef which would only be a single result. But we're in the single result version of getNode so these opcodes should never be seen by this function anyway.

llvm-svn: 348670
2018-12-08 00:27:34 +00:00
Pete Cooper
782a490dfb Follow-up from r348441 to add the rest of the objc ARC intrinsics.
This adds the other intrinsics used by ARC and codegen's them to their respective runtime methods.

llvm-svn: 348646
2018-12-07 21:28:47 +00:00
Sanjay Patel
bc47ff86fe [DAGCombiner] split trunc from extend in hoistLogicOpWithSameOpcodeHands; NFC
This duplicates several shared checks, but we need to split
this up to fix underlying bugs in smaller steps.

llvm-svn: 348627
2018-12-07 18:51:08 +00:00
Sanjay Patel
3af4ae9735 [DAGCombiner] disable truncation of binops by default
As discussed in the post-commit thread of r347917, this
transform is fighting with an existing transform causing
an infinite loop or out-of-memory, so this is effectively 
reverting r347917 and its follow-up r348195 while we
investigate the bug.

llvm-svn: 348604
2018-12-07 15:47:52 +00:00
Sanjay Patel
bb796cd61c [DAGCombiner] remove explicit calls to AddToWorkList; NFCI
As noted in the post-commit thread for rL347917:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181203/608936.html
...we don't need to repeat these calls because the combiner does it automatically.

llvm-svn: 348597
2018-12-07 15:00:56 +00:00
Simon Pilgrim
d498dee7a2 [SelectionDAG] Don't pass on DemandedElts when handling SCALAR_TO_VECTOR
Fixes an assertion:

llc: lib/CodeGen/SelectionDAG/SelectionDAG.cpp:2200: llvm::KnownBits llvm::SelectionDAG::computeKnownBits(llvm::SDValue, const llvm::APInt&, unsigned int) const: Assertion `(!Op.getValueType().isVector() || NumElts == Op.getValueType().getVectorNumElements()) && "Unexpected vector size"' failed.

Committed on behalf of: @pendingchaos (Rhys Perry)

Differential Revision: https://reviews.llvm.org/D55223

llvm-svn: 348574
2018-12-07 09:18:44 +00:00
Sanjay Patel
c6441c8547 [DAGCombiner] use root SDLoc for all nodes created by logic fold
If this is not a valid way to assign an SDLoc, then we get this
wrong all over SDAG.

I don't know enough about the SDAG to explain this. IIUC, theoretically,
debug info is not supposed to affect codegen. But here it has clearly
affected 3 different targets, and the x86 change is an actual improvement.

llvm-svn: 348552
2018-12-07 00:01:57 +00:00
Sanjay Patel
86cb679851 [DAGCombiner] don't bother saving a SDLoc for a node that's dead; NFCI
We shouldn't care about the debug location for a node that
we're creating, but attaching the root of the pattern should
be the best effort. (If this is not true, then we are doing
it wrong all over the SDAG).

This is no-functional-change-intended, and there are no
regression test diffs...and that's what I expected. But
there's a similar line above this diff, where those
assumptions apparently do not hold.

llvm-svn: 348550
2018-12-06 23:53:58 +00:00
Sanjay Patel
276cef343c [DAGCombiner] more clean up in hoistLogicOpWithSameOpcodeHands(); NFC
This code can still misbehave.

llvm-svn: 348547
2018-12-06 23:39:28 +00:00
Sanjay Patel
70af85b0ac [DAGCombiner] don't group bswap with casts in logic hoisting fold
This was probably organized as it was because bswap is a unary op.
But that's where the similarity to the other opcodes ends. We should
not limit this transform to scalars, and we should not try it if
either input has other uses. This is another step towards trying to
clean this whole function up to prevent it from causing infinite loops
and memory explosions. 

Earlier commits in this series:
rL348501
rL348508
rL348518

llvm-svn: 348534
2018-12-06 22:10:44 +00:00
Sanjay Patel
03a3ef2a0c [DAGCombiner] reduce indent; NFC
Unlike some of the folds in hoistLogicOpWithSameOpcodeHands()
above this shuffle transform, this has the expected hasOneUse()
checks in place.

llvm-svn: 348523
2018-12-06 20:02:47 +00:00
Andrea Di Biagio
52a2bac583 [DagCombiner][X86] Simplify a ConcatVectors of a scalar_to_vector with undef.
This patch introduces a new DAGCombiner rule to simplify concat_vectors nodes:

concat_vectors( bitcast (scalar_to_vector %A), UNDEF)
    --> bitcast (scalar_to_vector %A)

This patch only partially addresses PR39257. In particular, it is enough to fix
one of the two problematic cases mentioned in PR39257. However, it is not enough
to fix the original test case posted by Craig; that particular case would
probably require a more complicated approach (and knowledge about used bits).

Before this patch, we used to generate the following code for function PR39257
(-mtriple=x86_64 , -mattr=+avx):

vmovsd  (%rdi), %xmm0           # xmm0 = mem[0],zero
vxorps  %xmm1, %xmm1, %xmm1
vblendps        $3, %xmm0, %xmm1, %xmm0 # xmm0 = xmm0[0,1],xmm1[2,3]
vmovaps %ymm0, (%rsi)
vzeroupper
retq

Now we generate this:

vmovsd  (%rdi), %xmm0           # xmm0 = mem[0],zero
vmovaps %ymm0, (%rsi)
vzeroupper
retq

As a side note: that VZEROUPPER is completely redundant...

I guess the vzeroupper insertion pass doesn't realize that the definition of
%xmm0 from vmovsd is already zeroing the upper half of %ymm0. Note that on
%-mcpu=btver2, we don't get that vzeroupper because pass vzeroupper insertion
%pass is disabled.

Differential Revision: https://reviews.llvm.org/D55274

llvm-svn: 348522
2018-12-06 19:55:38 +00:00
Sanjay Patel
bfc7ffa40f [DAGCombiner] don't hoist logic op if operands have other uses, part 2
The PPC test with 2 extra uses seems clearly better by avoiding this transform. 
With 1 extra use, we also prevent an extra register move (although that might
be an RA problem). The general rule should be to only make a change here if
it is always profitable. The x86 diffs are all neutral.

llvm-svn: 348518
2018-12-06 19:18:56 +00:00
Sanjay Patel
c3717cd0d5 [DAGCombiner] don't hoist logic op if operands have other uses
The AVX512 diffs are neutral, but the bswap test shows a clear overreach in 
hoistLogicOpWithSameOpcodeHands(). If we don't check for other uses, we can 
increase the instruction count.

This could also fight with transforms trying to go in the opposite direction 
and possibly blow up/infinite loop. This might be enough to solve the bug 
noted here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181203/608593.html

I did not add the hasOneUse() checks to all opcodes because I see a perf 
regression for at least one opcode. We may decide that's irrelevant in the
face of potential compiler crashing, but I'll see if I can salvage that first.

llvm-svn: 348508
2018-12-06 18:16:32 +00:00
Sanjay Patel
e9bf78fa23 [DAGCombiner] refactor function that hoists bitwise logic; NFCI
Added FIXME and TODO comments for lack of safety checks.
This function is a suspect in out-of-memory errors as discussed in
the follow-up thread to r347917:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181203/608593.html

llvm-svn: 348501
2018-12-06 17:08:03 +00:00
Simon Pilgrim
105a366254 DAGCombiner::visitINSERT_VECTOR_ELT - pull out repeated VT.getVectorNumElements(). NFCI.
llvm-svn: 348494
2018-12-06 15:39:25 +00:00
Pete Cooper
e13d0992dc Add objc.* ARC intrinsics and codegen them to their runtime methods.
Reviewers: erik.pilkington, ahatanak

Differential Revision: https://reviews.llvm.org/D55233

llvm-svn: 348441
2018-12-06 00:52:54 +00:00
Sanjay Patel
33a448f935 [DAGCombiner] don't try to extract a fraction of a vector binop and crash (PR39893)
Because we're potentially peeking through a bitcast in this transform,
we need to use overall bitwidths rather than number of elements to
determine when it's safe to proceed.

Should fix:
https://bugs.llvm.org/show_bug.cgi?id=39893

llvm-svn: 348383
2018-12-05 17:10:30 +00:00
Simon Pilgrim
8fdaf5c915 [TargetLowering] Remove ISD::ANY_EXTEND/ANY_EXTEND_VECTOR_INREG opcodes from SimplifyDemandedVectorElts
These have no test coverage and the KnownZero flags can't be guaranteed unlike SIGN/ZERO_EXTEND cases.

llvm-svn: 348361
2018-12-05 12:20:05 +00:00
Simon Pilgrim
180639afe5 [SelectionDAG] Initial support for FSHL/FSHR funnel shift opcodes (PR39467)
This is an initial patch to add a minimum level of support for funnel shifts to the SelectionDAG and to begin wiring it up to the X86 SHLD/SHRD instructions.

Some partial legalization code has been added to handle the case for 'SlowSHLD' where we want to expand instead and I've added a few DAG combines so we don't get regressions from the existing DAG builder expansion code.

Differential Revision: https://reviews.llvm.org/D54698

llvm-svn: 348353
2018-12-05 11:12:12 +00:00
Simon Pilgrim
cd8a152b18 Remove superfluous comments. NFCI.
As requested in D54698.

llvm-svn: 348350
2018-12-05 10:45:44 +00:00
Simon Pilgrim
d24730cdda [TargetLowering] SimplifyDemandedVectorElts - don't alter DemandedElts mask
Fix potential issue with the ISD::INSERT_VECTOR_ELT case tweaking the DemandedElts mask instead of using a local copy - so later uses of the mask use the tweaked version.....

Noticed while investigating adding zero/undef folding to SimplifyDemandedVectorElts and the altered DemandedElts mask was causing mismatches.

llvm-svn: 348348
2018-12-05 10:37:45 +00:00
Amara Emerson
814a6794ba [SelectionDAG] Split very large token factors for loads into 64k chunks.
There's a 64k limit on the number of SDNode operands, and some very large
functions with 64k or more loads can cause crashes due to this limit being hit
when a TokenFactor with this many operands is created. To fix this, create
sub-tokenfactors if we've exceeded the limit.

No test case as it requires a very large function.

rdar://45196621

Differential Revision: https://reviews.llvm.org/D55073

llvm-svn: 348324
2018-12-05 00:41:30 +00:00
Nirav Dave
ce26c27b2a [SelectionDAG] Redefine isGAPlusOffset in terms of unwrapAddress. NFCI.
llvm-svn: 348288
2018-12-04 17:59:43 +00:00
Simon Pilgrim
0add090e24 [TargetLowering] expandFP_TO_UINT - avoid FPE due to out of range conversion (PR17686)
PR17686 demonstrates that for some targets FP exceptions can fire in cases where the FP_TO_UINT is expanded using a FP_TO_SINT instruction.

The existing code converts both the inrange and outofrange cases using FP_TO_SINT and then selects the result, this patch changes this for 'strict' cases to pre-select the FP_TO_SINT input and the offset adjustment.

The X87 cases don't need the strict flag but generates much nicer code with it....

Differential Revision: https://reviews.llvm.org/D53794

llvm-svn: 348251
2018-12-04 11:21:30 +00:00
Simon Pilgrim
666261cdc8 [TargetLowering] Add SimplifyDemandedVectorElts support to EXTEND opcodes
Add support for ISD::*_EXTEND and ISD::*_EXTEND_VECTOR_INREG opcodes.

The extra broadcast in trunc-subvector.ll will be fixed in an upcoming patch.

llvm-svn: 348246
2018-12-04 10:41:06 +00:00
Sanjay Patel
d24f63477d [DAGCombiner] narrow truncated vector binops when legal
This is the smallest vector enhancement I could find to D54640.
Here, we're allowing narrowing to only legal vector ops because we'll see
regressions without that. All of the test diffs are wins from what I can tell.
With AVX/AVX512, we can shrink ymm/zmm ops to xmm.

x86 vector multiplies are the problem case that we're avoiding due to the
patchwork ISA, and it's not clear to me if we can dance around those
regressions using TLI hooks or if we need preliminary patches to plug those
holes.

Differential Revision: https://reviews.llvm.org/D55126

llvm-svn: 348195
2018-12-03 21:57:35 +00:00
Craig Topper
e35b01f8ea [X86] Add DAG combine to combine a v8i32->v8i16 truncate with a packuswb that truncates v8i16->v8i8.
Summary:
Under -x86-experimental-vector-widening-legalization, fp_to_uint/fp_to_sint with a smaller than 128 bit vector type results are custom type legalized by promoting the result to a 128 bit vector by promoting the elements, inserting an assertzext/assertsext, then truncating back to original type. The truncate will be further legalizdd to a pack shuffle. In the case of a v8i8 result type, we'll end up with a v8i16 fp_to_sint. This will need to be further legalized during vector op legalization by promoting to v8i32 and then truncating again. Under avx2 this produces good code with two pack instructions, but Under avx512 this will result in a truncate instruction and a packuswb instruction. But we should be able to get away with a single truncate instruction.

The other option is to promote all the way to vXi32 result type during the first type legalization. But in some experimentation that seemed to require more work to produce good code for other configurations.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54836

llvm-svn: 348158
2018-12-03 18:26:24 +00:00
Sanjay Patel
b205606d3e [SelectionDAG] fold constant with undef vector per element
This makes the SDAG behavior consistent with the way we do this in IR.
It's possible that we were getting the wrong answer before. For example,
'xor undef, undef --> 0' but 'xor undef, C' --> undef. 

But the most practical improvement is likely as shown in the tests here - 
for FP, we were overconstraining undef lanes to NaN, and that can prevent 
vector simplifications/narrowing (see D51553).

llvm-svn: 348090
2018-12-02 13:48:42 +00:00
Sanjay Patel
2daceedf92 [DAGCombiner] guard against an oversized shift crash
This change prevents the crash noted in the post-commit comments 
for rL347478 :
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181119/605166.html

We can't guarantee that an oversized shift amount is folded away, 
so we have to check for it.

Note that I committed an incomplete fix for that crash with:
rL347502

But as discussed here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181126/605679.html
...we have to try harder.

So I'm not sure how to expose the bug now (and apparently no fuzzers have found 
a way yet either).

On the plus side, we have discovered that we're missing real optimizations by 
not simplifying nodes sooner, so the earlier fix still has value, and there's 
likely more value in extending that so we can simplify more opcodes and simplify 
when doing RAUW and/or putting nodes on the combiner worklist.

Differential Revision: https://reviews.llvm.org/D54954

llvm-svn: 348089
2018-12-02 13:33:56 +00:00
Simon Pilgrim
e017ed3245 [SelectionDAG] Improve SimplifyDemandedBits to SimplifyDemandedVectorElts simplification
D52935 introduced the ability for SimplifyDemandedBits to call SimplifyDemandedVectorElts through BITCASTs if the demanded bit mask entirely covered the sub element.

This patch relaxes this to demanding an element if we need any bit from it.

Differential Revision: https://reviews.llvm.org/D54761

llvm-svn: 348073
2018-12-01 12:08:55 +00:00
Nicolai Haehnle
a9cc92c247 AMDGPU: Fix various issues around the VirtReg2Value mapping
Summary:
The VirtReg2Value mapping is crucial for getting consistently
reliable divergence information into the SelectionDAG. This
patch fixes a bunch of issues that lead to incorrect divergence
info and introduces tight assertions to ensure we don't regress:

1. VirtReg2Value is generated lazily; there were some cases where
   a lookup was performed before all relevant virtual registers were
   created, leading to an out-of-sync mapping. Those cases were:

  - Complex code to lower formal arguments that generated CopyFromReg
    nodes from live-in registers (fixed by never querying the mapping
    for live-in registers).

  - Code that generates CopyToReg for formal arguments that are used
    outside the entry basic block (fixed by never querying the
    mapping for Register nodes, which don't need the divergence info
    anyway).

2. For complex values that are lowered to a sequence of registers,
   all registers must be reflected in the VirtReg2Value mapping.

I am not adding any new tests, since I'm not actually aware of any
bugs that these problems are causing with trunk as-is. However,
I recently added a test case (in r346423) which fails when D53283 is
applied without this change. Also, the new assertions should provide
most of the effective test coverage.

There is one test change in sdwa-peephole.ll. The underlying issue
is that since the divergence info is now correct, the DAGISel will
select V_OR_B32 directly instead of S_OR_B32. This leads to an extra
COPY which affects the behavior of MachineLICM in a way that ends up
with the S_MOV_B32 with the constant in a different basic block than
the V_OR_B32, which is presumably what defeats the peephole.

Reviewers: alex-t, arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D54340

llvm-svn: 348049
2018-11-30 22:55:29 +00:00
Sanjay Patel
1901a12e76 [SelectionDAG] fold FP binops with 2 undef operands to undef
llvm-svn: 348016
2018-11-30 18:38:52 +00:00
Than McIntosh
0e0a8a3fee [CodeGen] Prefer static frame index for STATEPOINT liveness args
Summary:
If a given liveness arg of STATEPOINT is at a fixed frame index
(e.g. a function argument passed on stack), prefer to use this
fixed location even the address is also in a register. If we use
the register it will generate a spill, which is not necessary
since the fixed frame index can be directly recorded in the stack
map.

Patch by Cherry Zhang <cherryyz@google.com>.

Reviewers: thanm, niravd, reames

Reviewed By: reames

Subscribers: cherryyz, reames, anna, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D53889

llvm-svn: 347998
2018-11-30 16:22:41 +00:00
Nicolai Haehnle
445b0b6260 TableGen/ISel: Allow PatFrag predicate code to access captured operands
Summary:
This simplifies writing predicates for pattern fragments that are
automatically re-associated or commuted.

For example, a followup patch adds patterns for fragments of the form
(add (shl $x, $y), $z) to the AMDGPU backend. Such patterns are
automatically commuted to (add $z, (shl $x, $y)), which makes it basically
impossible to refer to $x, $y, and $z generically in the PredicateCode.

With this change, the PredicateCode can refer to $x, $y, and $z simply
as `Operands[i]`.

Test confirmed that there are no changes to any of the generated files
when building all (non-experimental) targets.

Change-Id: I61c00ace7eed42c1d4edc4c5351174b56b77a79c

Reviewers: arsenm, rampitec, RKSimon, craig.topper, hfinkel, uweigand

Subscribers: wdng, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D51994

llvm-svn: 347992
2018-11-30 14:15:13 +00:00
Alex Bradbury
fca95cfee9 [SelectionDAG] Support result type promotion for FLT_ROUNDS_
For targets where i32 is not a legal type (e.g. 64-bit RISC-V), 
LegalizeIntegerTypes must promote the result of ISD::FLT_ROUNDS_.

Differential Revision: https://reviews.llvm.org/D53820

llvm-svn: 347986
2018-11-30 13:18:33 +00:00
Alex Bradbury
bd24c7b045 [SelectionDAG] Support promotion of PREFETCH operands
For targets where i32 is not a legal type (e.g. 64-bit RISC-V), 
LegalizeIntegerTypes must promote the operands of ISD::PREFETCH.

Differential Revision: https://reviews.llvm.org/D53281

llvm-svn: 347980
2018-11-30 10:06:31 +00:00
Alex Bradbury
36e0fd1d39 [SelectionDAG] Support promotion of FRAMEADDR/RETURNADDR operands
For targets where i32 is not a legal type (e.g. 64-bit RISC-V), 
LegalizeIntegerTypes must promote the operand.

Differential Revision: https://reviews.llvm.org/D53279

llvm-svn: 347978
2018-11-30 10:02:06 +00:00
Alex Bradbury
e0e62e97df [TargetLowering][RISCV] Introduce isSExtCheaperThanZExt hook and implement for RISC-V
DAGTypeLegalizer::PromoteSetCCOperands currently prefers to zero-extend 
operands when it is able to do so. For some targets this is more expensive 
than a sign-extension, which is also a valid choice. Introduce the 
isSExtCheaperThanZExt hook and use it in the new SExtOrZExtPromotedInteger 
helper. On RISC-V, we prefer sign-extension for FromTy == MVT::i32 and ToTy == 
MVT::i64, as it can be performed using a single instruction.

Differential Revision: https://reviews.llvm.org/D52978

llvm-svn: 347977
2018-11-30 09:56:54 +00:00
Sanjay Patel
8d27144251 [DAGCombiner] narrow truncated binops
The motivating case for this is shown in:
https://bugs.llvm.org/show_bug.cgi?id=32023
and the corresponding rot16.ll regression tests.

Because x86 scalar shift amounts are i8 values, we can end up with trunc-binop-trunc 
sequences that don't get folded in IR.

As the TODO comments suggest, there will be regressions if we extend this (for x86, 
we mostly seem to be missing LEA opportunities, but there are likely vector folds 
missing too). I think those should be considered existing bugs because this is the 
same transform that we do as an IR canonicalization in instcombine. We just need 
more tests to make those visible independent of this patch.

Differential Revision: https://reviews.llvm.org/D54640

llvm-svn: 347917
2018-11-29 20:58:26 +00:00
Craig Topper
129d529ab3 [SelectionDAG][AArch64][X86] Move legalization of vector MULHS/MULHU from LegalizeDAG to LegalizeVectorOps
I believe we should be legalizing these with the rest of vector binary operations. If any custom lowering is required for these nodes, this will give the DAG combine between LegalizeVectorOps and LegalizeDAG to run on the custom code before constant build_vectors are lowered in LegalizeDAG.

I've moved MULHU/MULHS handling in AArch64 from Lowering to isel. Moving the lowering earlier caused build_vector+extract_subvector simplifications to kick in which made the generated code worse.

Differential Revision: https://reviews.llvm.org/D54276

llvm-svn: 347902
2018-11-29 19:36:17 +00:00
Craig Topper
b955bf382c [LegalizeVectorTypes][X86][ARM][AArch64][PowerPC] Don't use SplitVecOp_TruncateHelper for FP_TO_SINT/UINT.
SplitVecOp_TruncateHelper tries to promote the result type while splitting FP_TO_SINT/UINT. It then concatenates the result and introduces a truncate to the original result type. But it does this without inserting the AssertZExt/AssertSExt that the regular result type promotion would insert. Nor does it turn FP_TO_UINT into FP_TO_SINT the way normal result type promotion for these operations does. This is bad on X86 which doesn't support FP_TO_SINT until AVX512.

This patch disables the use of SplitVecOp_TruncateHelper for these operations and just lets normal promotion handle it. I've tweaked a couple things in X86ISelLowering to avoid a few obvious regressions there. I believe all the changes on X86 are improvements. The other targets look neutral.

Differential Revision: https://reviews.llvm.org/D54906

llvm-svn: 347593
2018-11-26 21:12:39 +00:00
Craig Topper
923f463ef2 [SelectionDAG] Teach BaseIndexOffset::match to unwrap the base after looking through an add/or
We might find a target specific node that needs to be unwrapped after we look through an add/or. Otherwise we get inconsistent results if one pointer is just X86WrapperRIP and the other is (add X86WrapperRIP, C)

Differential Revision: https://reviews.llvm.org/D54818

llvm-svn: 347591
2018-11-26 20:16:33 +00:00
Sanjay Patel
7336e7c67a [x86] limit transform for select-of-fp-constants
This should likely be adjusted to limit this transform
further, but these diffs should be clear wins.

If we have blendv/conditional move, then we should assume 
those are cheap ops. The loads become independent of the
compare, so those can be speculated before we need to use 
the values in the blend/mov.

llvm-svn: 347526
2018-11-25 17:27:02 +00:00
Sanjay Patel
04435677d0 [SelectionDAG] move constant or splat functions to common location
rL347502 moved the null sibling, so we should group all of these
together. I'm not sure why these aren't methods of the SDValue
class itself, but that's another patch if that's possible.

llvm-svn: 347523
2018-11-25 16:09:32 +00:00
Sanjay Patel
7e119c0400 [DAG] consolidate shift simplifications
...and use them to avoid creating obviously undef values as
discussed in the post-commit thread for r347478.

The diffs in vector div/rem show that we were missing real
optimizations by creating bogus shift nodes.

llvm-svn: 347502
2018-11-23 20:05:12 +00:00
Craig Topper
0ec17884de [LegalizeVectorTypes] Don't use SplitVecOp_TruncateHelper if we're heading towards scalarizing the type.
This code takes a truncate, fp_to_int, or int_to_fp with a legal result type and an input type that needs to be split and enlarges the elements in the result type before doing the split. Then inserts a follow up truncate or fp_round after concatenating the two halves back together.

But if the input type of the original op is being split on its way to ultimately being scalarized we're just going to end up building a vector from scalars and then truncating or rounding it in the vector register. Seems kind of silly to enlarge the result element type of the operation only to end up with scalar code and then building a vector with large elements only to make the elements smaller again in the vector register. Seems better to just try to get away producing smaller result types in the scalarized code.

The X86 test case that changes is a pretty contrived test case that exists because of a bug we used to have in our AVG matching code. I think the code is better now, but its not realistic anyway.

llvm-svn: 347482
2018-11-23 02:32:13 +00:00
Craig Topper
b239763384 [LegalizeVectorTypes] Have SplitVecOp_TruncateHelper fall back to SplitVecOp_UnaryOp if splitting the output type would be a legal type.
SplitVecOp_TruncateHelper tries to introduce a multilevel truncate to avoid scalarization. But if splitting the result type would still be a legal type we don't need to do that.

The comment block at the top of the function implied that this was already implemented. I looked back through the history and it doesn't look to have ever been checked.

llvm-svn: 347479
2018-11-22 22:56:52 +00:00
Sanjay Patel
3e80019275 [DAGCombiner] form 'not' ops ahead of shifts (PR39657)
We fail to canonicalize IR this way (prefer 'not' ops to arbitrary 'xor'),
but that would not matter without this patch because DAGCombiner was 
reversing that transform. I think we need this transform in the backend 
regardless of what happens in IR to catch cases where the shift-xor 
is formed late from GEP or other ops.

https://rise4fun.com/Alive/NC1

  Name: shl
  Pre: (-1 << C2) == C1
  %shl = shl i8 %x, C2
  %r = xor i8 %shl, C1
  =>
  %not = xor i8 %x, -1
  %r = shl i8 %not, C2
  
  Name: shr
  Pre: (-1 u>> C2) == C1
  %sh = lshr i8 %x, C2
  %r = xor i8 %sh, C1
  =>
  %not = xor i8 %x, -1
  %r = lshr i8 %not, C2

https://bugs.llvm.org/show_bug.cgi?id=39657

llvm-svn: 347478
2018-11-22 19:24:10 +00:00
Sanjay Patel
20935e0ab5 [DAGCombiner] refactor select-of-FP-constants transform
This transform needs to be limited. 

We are converting to a constant pool load very early, and we 
are turning loads that are independent of the select condition 
(and therefore speculatable) into a dependent non-speculatable 
load.

We may also be transferring a condition code from an FP register
to integer to create that dependent load.

llvm-svn: 347424
2018-11-21 20:54:47 +00:00
Sanjay Patel
1c74747478 [DAGCombiner] reduce code duplication; NFC
llvm-svn: 347410
2018-11-21 20:00:32 +00:00
Simon Pilgrim
5448339889 [TargetLowering] SimplifyDemandedBits - only reduce known bits for integer constants
Avoids fuzzing crash found by Mikael Holmén.

llvm-svn: 347393
2018-11-21 14:26:19 +00:00
Nikita Popov
c5b152bdef Test commit: Delete trailing space in comment
llvm-svn: 347385
2018-11-21 10:57:22 +00:00
Sanjay Patel
357053f289 [DAGCombiner] look through bitcasts when trying to narrow vector binops
This is another step in vector narrowing - a follow-up to D53784
(and hoping to eventually squash potential regressions seen in
D51553).

The x86 test diffs are wins, but the AArch64 diff is probably not.
That problem already exists independent of this patch (see PR39722), but it
went unnoticed in the previous patch because there were no regression tests
that showed the possibility.

The x86 diff in i64-mem-copy.ll is close. Given the frequency throttling
concerns with using wider vector ops, an extra extract to reduce vector
width is the right trade-off at this level of codegen.

Differential Revision: https://reviews.llvm.org/D54392

llvm-svn: 347356
2018-11-20 22:26:35 +00:00
Simon Pilgrim
3735105961 [DAGCombine] Add calls to SimplifyDemandedVectorElts from visitINSERT_SUBVECTOR (PR37989)
This uncovered an off-by-one typo in SimplifyDemandedVectorElts's INSERT_SUBVECTOR handling as its bounds check was bailing on safe indices.

llvm-svn: 347313
2018-11-20 15:23:50 +00:00
Simon Pilgrim
b356d0463e [TargetLowering] Improve SimplifyDemandedVectorElts/SimplifyDemandedBits support
For bitcast nodes from larger element types, add the ability for SimplifyDemandedVectorElts to call SimplifyDemandedBits by merging the elts mask to a bits mask.

I've raised https://bugs.llvm.org/show_bug.cgi?id=39689 to deal with the few places where SimplifyDemandedBits's lack of vector handling is a problem.

Differential Revision: https://reviews.llvm.org/D54679

llvm-svn: 347301
2018-11-20 12:02:16 +00:00
Craig Topper
4954c66430 [SelectionDAG] Compute known bits and num sign bits for live out vector registers. Use it to add AssertZExt/AssertSExt in the live in basic blocks
Summary:
We already support this for scalars, but it was explicitly disabled for vectors. In the updated test cases this allows us to see the upper bits are zero to use less multiply instructions to emulate a 64 bit multiply.

This should help with this ispc issue that a coworker pointed me to https://github.com/ispc/ispc/issues/1362

Reviewers: spatel, efriedma, RKSimon, arsenm

Reviewed By: spatel

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D54725

llvm-svn: 347287
2018-11-20 04:30:26 +00:00
Sanjay Patel
a36c444471 [DAGCombiner] reduce code duplication in visitXOR; NFC
llvm-svn: 347278
2018-11-20 00:51:45 +00:00
Stanislav Mekhanoshin
54ebfe8aee Implement computeKnownBits for scalar_to_vector
Differential Revision: https://reviews.llvm.org/D54728

llvm-svn: 347274
2018-11-19 23:34:07 +00:00
Simon Pilgrim
740122fb8c [DAGCombine] SimplifyNodeWithTwoResults - ensure same legalization for LO/HI operands (PR21207)
Consistently use (!LegalOperations || isOperationLegalOrCustom) for all node pairs.

Differential Revision: https://reviews.llvm.org/D53478

llvm-svn: 347255
2018-11-19 19:37:59 +00:00
Simon Pilgrim
de3605f56b [TargetLowering] expandFP_TO_UINT - improve fp16 support
As discussed on D53794, for float types with ranges smaller than the destination integer type, then we should be able to just use a regular FP_TO_SINT opcode.

I thought we'd need to provide MSA test cases for very small integer types as well (fp16 -> i8 etc.), but it turns out that promotion will kick in so they're unnecessary.

Differential Revision: https://reviews.llvm.org/D54703

llvm-svn: 347251
2018-11-19 19:16:13 +00:00
Sanjay Patel
b25adf5edb [SelectionDAG] simplify vector select with undef operand(s)
llvm-svn: 347227
2018-11-19 17:06:05 +00:00
Sanjay Patel
a1dca3553e [SelectionDAG] simplify select FP with undef condition
llvm-svn: 347212
2018-11-19 14:42:28 +00:00
Sanjay Patel
c036d844be [SelectionDAG] add simplifySelect() to reduce code duplication; NFC
This should be extended to handle FP and vectors in follow-up patches.

llvm-svn: 347210
2018-11-19 14:35:22 +00:00
Sanjay Patel
8c0cd77bff [DAG] add undef simplifications for select nodes
Sadly, this duplicates (twice) the logic from InstSimplify. There
might be some way to at least share the DAG versions of the code, 
but copying the folds seems to be the standard method to ensure 
that we don't miss these folds. 

Unlike in IR, we don't run DAGCombiner to fixpoint, so there's no 
way to ensure that we do these kinds of simplifications unless the 
code is repeated at node creation time and during combines.

There were other tests that would become worthless with this
improvement that I changed as pre-commits:
rL347161
rL347164
rL347165
rL347166
rL347167

I'm not sure how to salvage the remaining tests (diffs in this patch).
So the x86 tests verify that the new code is working as intended.
The AMDGPU test is actually similar to my motivating case: we have
some undef value that has survived to machine IR in an x86 test, and 
then it gets folded in some weird way, or we crash if we don't transfer
the undef flag. But we would have been better off never getting to that
point by doing these simplifications.

This will lead back to PR32023 someday...
https://bugs.llvm.org/show_bug.cgi?id=32023

llvm-svn: 347170
2018-11-18 17:36:23 +00:00
Sanjay Patel
42c22a1f87 [SelectionDAG] simplify code; NFC
llvm-svn: 347160
2018-11-18 14:39:03 +00:00
Fangrui Song
7570932977 Use llvm::copy. NFC
llvm-svn: 347126
2018-11-17 01:44:25 +00:00
Stanislav Mekhanoshin
0ff7c8309d DAG combiner: fold (select, C, X, undef) -> X
Differential Revision: https://reviews.llvm.org/D54646

llvm-svn: 347110
2018-11-16 23:13:38 +00:00
Craig Topper
9e97054211 [LegalizeVectorOps] After custom legalizing an extending load or a truncating store, make sure the custom code is also legal.
For example, on X86 we emit a sign_extend_vector_inreg from LowerLoad and without sse4.1 this node will need further legalization. Previously this sign_extend_vector_inreg was being custom lowered during DAG legalization instead of vector op legalization.

Unfortunately, this doesn't seem to matter for the output of any existing lit tests.

llvm-svn: 347094
2018-11-16 21:04:58 +00:00
Simon Pilgrim
3c8baf4f90 [TargetLowering] Cleanup more of the EXTEND demanded bits cases so that they match. NFCI.
Use the same variable names etc.

llvm-svn: 347045
2018-11-16 12:26:26 +00:00
Sam Parker
ab99cfab21 [DAGCombine] Fix non-deterministic debug output
PR37970 reported non-deterministic debug output, this was caused by
iterating through a set and not a a vector.

bugzilla: https://bugs.llvm.org/show_bug.cgi?id=37970

Differential Revision: https://reviews.llvm.org/D54570

llvm-svn: 347037
2018-11-16 08:35:19 +00:00
Craig Topper
bac7d9735a [LegalizeVectorTypes] Teach WidenVecRes_Convert to turn ANY_EXTEND into ANY_EXTEND_VECTOR_INREG when the input and output types need to be widened to the same width.
If we don't do it here, DAGCombine will just end up creating it from the scalar any_extend+build_vector so might as well save a step.

llvm-svn: 347034
2018-11-16 07:13:34 +00:00
Stanislav Mekhanoshin
35de877e8c Fixed DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT i1 handling
Legalizer used to request an ext load from i8 to i1 when promoting
vector element type to i8. Fixed.

Differential Revision: https://reviews.llvm.org/D54440

llvm-svn: 346795
2018-11-13 20:26:27 +00:00
Craig Topper
aca8390216 [SelectionDAG][X86] Relax restriction on the width of an input to *_EXTEND_VECTOR_INREG. Use them and regular *_EXTEND to replace the X86 specific VSEXT/VZEXT opcodes
Previously, the extend_vector_inreg opcode required their input register to be the same total width as their output. But this doesn't match up with how the X86 instructions are defined. For X86 the input just needs to be a legal type with at least enough elements to cover the output.

This patch weakens the check on these nodes and allows them to be used as long as they have more input elements than output elements. I haven't changed type legalization behavior so it will still create them with matching input and output sizes.

X86 will custom legalize these nodes by shrinking the input to be a 128 bit vector and once we've done that we treat them as legal operations. We still have one case during type legalization where we must custom handle v64i8 on avx512f targets without avx512bw where v64i8 isn't a legal type. In this case we will custom type legalize to a *extend_vector_inreg with a v16i8 input. After that the input is a legal type so type legalization should ignore the node and doesn't need to know about the relaxed restriction. We are no longer allowed to use the default expansion for these nodes during vector op legalization since the default expansion uses a shuffle which required the widths to match. Custom legalization for all types will prevent us from reaching the default expansion code.

I believe DAG combine works correctly with the released restriction because it doesn't check the number of input elements.

The rest of the patch is changing X86 to use either the vector_inreg nodes or the regular zero_extend/sign_extend nodes. I had to add additional isel patterns to handle any_extend during isel since simplifydemandedbits can create them at any time so we can't legalize to zero_extend before isel. We don't yet create any_extend_vector_inreg in simplifydemandedbits.

Differential Revision: https://reviews.llvm.org/D54346

llvm-svn: 346784
2018-11-13 19:45:21 +00:00
Cameron McInally
cbde0d9c7b [IR] Add a dedicated FNeg IR Instruction
The IEEE-754 Standard makes it clear that fneg(x) and
fsub(-0.0, x) are two different operations. The former is a bitwise
operation, while the latter is an arithmetic operation. This patch
creates a dedicated FNeg IR Instruction to model that behavior.

Differential Revision: https://reviews.llvm.org/D53877

llvm-svn: 346774
2018-11-13 18:15:47 +00:00
Craig Topper
0b33b468a1 [DAGCombiner] Enable tryToFoldExtendOfConstant to run after legalize vector ops
It should be ok to create a new build_vector after legal operations so long as it doesn't cause an infinite loop in DAG combiner.

Unfortunately, X86's custom constant folding in combineVSZext is hiding any test changes from this. But I'm trying to get to a point where that X86 specific code isn't necessary at all.

Differential Revision: https://reviews.llvm.org/D54285

llvm-svn: 346728
2018-11-13 01:59:32 +00:00
Nirav Dave
a395e2df56 [DAGCombiner] Fix load-store forwarding of indexed loads.
Summary:
Handle extra output from index loads in cases where we wish to
forward a load value directly from a preceeding store.

Fixes PR39571.

Reviewers: peter.smith, rengolin

Subscribers: javed.absar, hiraditya, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D54265

llvm-svn: 346654
2018-11-12 14:05:40 +00:00
Craig Topper
d23cdbbeb2 [DAGCombiner] Make tryToFoldExtendOfConstant return an SDValue instead of an SDNode*. NFC
Removes the need to call getNode internally and to recreate an SDValue after the call.

llvm-svn: 346600
2018-11-10 23:46:03 +00:00
Sanjay Patel
0a515595a7 [x86] allow vector load narrowing with multi-use values
This is a long-awaited follow-up suggested in D33578. Since then, we've picked up even more
opportunities for vector narrowing from changes like D53784, so there are a lot of test diffs.
Apart from 2-3 strange cases, these are all wins.

I've structured this to be no-functional-change-intended for any target except for x86
because I couldn't tell if AArch64, ARM, and AMDGPU would improve or not. All of those
targets have existing regression tests (4, 4, 10 files respectively) that would be
affected. Also, Hexagon overrides the shouldReduceLoadWidth() hook, but doesn't show
any regression test diffs. The trade-off is deciding if an extra vector load is better
than a single wide load + extract_subvector.

For x86, this is almost always better (on paper at least) because we often can fold
loads into subsequent ops and not increase the official instruction count. There's also
some unknown -- but potentially large -- benefit from using narrower vector ops if wide
ops are implemented with multiple uops and/or frequency throttling is avoided.

Differential Revision: https://reviews.llvm.org/D54073

llvm-svn: 346595
2018-11-10 20:05:31 +00:00
Craig Topper
f2e65f8636 [SelectionDAG] Fix a -Wparentheses warning from gcc in an assert. NFC
gcc wants parentheses around the logical OR since there is a logical AND for the string.

llvm-svn: 346564
2018-11-09 23:11:30 +00:00
Craig Topper
9a7e19b8f2 [DAGCombiner][X86][Mips] Enable combineShuffleOfScalars to run between vector op legalization and DAG legalization. Fix bad one use check in combineShuffleOfScalars
It's possible for vector op legalization to generate a shuffle. If that happens we should give a chance for DAG combine to combine that with a build_vector input.

I also fixed a bug in combineShuffleOfScalars that was considering the number of uses on a undef input to a shuffle. We don't care how many times undef is used.

Differential Revision: https://reviews.llvm.org/D54283

llvm-svn: 346530
2018-11-09 18:04:34 +00:00
Serge Guelton
86f8b70f1b Type safe version of MachinePassRegistry
Previous version used type erasure through a `void* (*)()` pointer,
which triggered gcc warning and implied a lot of reinterpret_cast.

This version should make it harder to hit ourselves in the foot.

Differential revision: https://reviews.llvm.org/D54203

llvm-svn: 346522
2018-11-09 17:19:45 +00:00
Alexandros Lamprineas
e15c982f6d [SelectionDAG] swap select_cc operands to enable folding
The DAGCombiner tries to SimplifySelectCC as follows:

  select_cc(x, y, 16, 0, cc) -> shl(zext(set_cc(x, y, cc)), 4)

It can't cope with the situation of reordered operands:

  select_cc(x, y, 0, 16, cc)

In that case we just need to swap the operands and invert the Condition Code:

  select_cc(x, y, 16, 0, ~cc)

Differential Revision: https://reviews.llvm.org/D53236

llvm-svn: 346484
2018-11-09 11:09:40 +00:00
Craig Topper
8cca8bd4aa [SelectionDAG] Assert on the width of DemandedElts argument to computeKnownBits for all vector typed operations not just build_vector.
Fix AArch64 unit test that fails with the assertion added.

llvm-svn: 346437
2018-11-08 20:29:17 +00:00
Nirav Dave
6ce9f72f76 [DAGCombine] Improve alias analysis for chain of independent stores.
FindBetterNeighborChains simulateanously improves the chain
dependencies of a chain of related stores avoiding the generation of
extra token factors. For chains longer than the GatherAllAliasDepths,
stores further down in the chain will necessarily fail, a potentially
significant waste and preventing otherwise trivial parallelization.

This patch directly parallelize the chains of stores before improving
each store. This generally improves DAG-level parallelism.

Reviewers: courbet, spatel, RKSimon, bogner, efriedma, craig.topper, rnk

Subscribers: sdardis, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D53552

llvm-svn: 346432
2018-11-08 19:14:20 +00:00
James Y Knight
72f76bf230 Add support for llvm.is.constant intrinsic (PR4898)
This adds the llvm-side support for post-inlining evaluation of the
__builtin_constant_p GCC intrinsic.

Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call
to a function where canConstantFoldTo returns true, and one of the
arguments is a struct.

Updated from patch initially by Janusz Sobczak.

Differential Revision: https://reviews.llvm.org/D4276

llvm-svn: 346322
2018-11-07 15:24:12 +00:00
Cameron McInally
9757d5d6c1 [FPEnv] Add constrained CEIL/FLOOR/ROUND/TRUNC intrinsics
Differential Revision: https://reviews.llvm.org/D53411

llvm-svn: 346141
2018-11-05 15:59:49 +00:00
Simon Pilgrim
6bd468bd8b [TargetLowering] Begin generalizing TargetLowering::expandFP_TO_SINT support. NFCI.
Prior to initial work to add vector expansion support, remove assumptions that we're working on scalar types.

llvm-svn: 346139
2018-11-05 15:49:09 +00:00
Craig Topper
8f2f2a76b9 [DAGCombiner] Use tryFoldToZero to simplify some code and make it work correctly between LegalTypes and LegalOperations.
The original code avoided creating a zero vector after type legalization, but if we're after type legalization the type we have is legal. The real hazard we need to avoid is creating a build vector after op legalization. tryFoldToZero takes care of checking for this.

llvm-svn: 346119
2018-11-05 05:53:06 +00:00