341 Commits

Author SHA1 Message Date
Hiroshi Yamauchi
e6a3dc7699 Simplify more cases of logical ops of masked icmps.
Summary:
For example,

((X & 255) != 0) && ((X & 15) == 8) -> ((X & 15) == 8).
((X & 7) != 0) && ((X & 15) == 8) -> false.
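
Both examples can be confirmed exhaustively at a small bit width. The following is only a sanity-check sketch in Python over all 8-bit values, not the InstCombine implementation; the function names are invented for illustration.

```python
def masked_pair_subset(x):
    # ((X & 255) != 0) && ((X & 15) == 8)
    return (x & 255) != 0 and (x & 15) == 8


def masked_single(x):
    # ((X & 15) == 8)
    return (x & 15) == 8


def masked_pair_contradiction(x):
    # ((X & 7) != 0) && ((X & 15) == 8)
    return (x & 7) != 0 and (x & 15) == 8


def first_fold_holds():
    # The first compare is implied by the second, so it can be dropped.
    return all(masked_pair_subset(x) == masked_single(x) for x in range(256))


def second_fold_holds():
    # (X & 15) == 8 forces the low three bits to zero, so the pair is never true.
    return not any(masked_pair_contradiction(x) for x in range(256))
```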

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43835

llvm-svn: 327450
2018-03-13 21:13:18 +00:00
Craig Topper
ee99aa4dd0 [InstCombine] Replace calls to getNumUses with hasNUses or hasNUsesOrMore
getNumUses is a linear-time operation: it traverses the user linked list to the end and counts as it goes. Since we are only interested in small constant counts, we should use hasNUses or hasNUsesOrMore, which terminate the traversal as soon as they can provide the answer.
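
The difference between counting everything and stopping early can be sketched outside LLVM. The Python below is a rough analogy under assumed names (get_num_uses, has_n_uses, has_n_uses_or_more and the counted helper are all hypothetical), with a generator standing in for the use list; it is not the LLVM code.

```python
from itertools import islice


def get_num_uses(uses):
    # Linear: walks the whole sequence and counts every entry.
    return sum(1 for _ in uses)


def has_n_uses(uses, n):
    # Pulls at most n + 1 entries: exactly n means the (n + 1)-th is absent.
    return len(list(islice(uses, n + 1))) == n


def has_n_uses_or_more(uses, n):
    # Pulls at most n entries.
    return len(list(islice(uses, n))) == n


def counted(items, counter):
    # Helper that records how many entries were actually visited.
    for item in items:
        counter[0] += 1
        yield item


visited = [0]
has_n_uses(counted(range(1000), visited), 2)
# visited[0] is now 3, although the sequence has 1000 entries.
```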

There are still two other locations in InstCombine, but changing those would force a rebase of D44266 which if accepted would remove them.

Differential Revision: https://reviews.llvm.org/D44398

llvm-svn: 327315
2018-03-12 18:46:05 +00:00
Sanjay Patel
8fdd87f929 [InstCombine] move constant check into foldBinOpIntoSelectOrPhi; NFCI
Also, rename 'foldOpWithConstantIntoOperand' because that's annoyingly 
vague. The constant check is redundant in some cases, but it allows 
removing duplication for most of the calls.

llvm-svn: 326329
2018-02-28 16:36:24 +00:00
Simon Pilgrim
19495198af [InstCombine] Add constant vector support for ~(C >> Y) --> ~C >> Y
Includes adding m_NonNegative constant pattern matcher
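
The title writes only ">>", so the shift kinds are an assumption here: the sketch below assumes the fold pairs an arithmetic shift of a negative constant with a logical shift of its complement, and a logical shift of a non-negative constant with an arithmetic shift of its complement. Under that assumption, both directions can be checked exhaustively at 8 bits; the helper names are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1


def to_signed(v):
    v &= MASK
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v


def lshr(v, y):
    # Logical shift right: zero fill.
    return (v & MASK) >> y


def ashr(v, y):
    # Arithmetic shift right: sign fill.
    return (to_signed(v) >> y) & MASK


def bit_not(v):
    return ~v & MASK


def not_of_shifted_constant_holds():
    for c in range(1 << BITS):
        for y in range(BITS):
            if to_signed(c) < 0:
                # Assumed form: ~(C >>s Y) == ~C >>u Y for negative C.
                if bit_not(ashr(c, y)) != lshr(bit_not(c), y):
                    return False
            else:
                # Assumed form: ~(C >>u Y) == ~C >>s Y for non-negative C.
                if bit_not(lshr(c, y)) != ashr(bit_not(c), y):
                    return False
    return True
```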

llvm-svn: 324825
2018-02-10 21:46:09 +00:00
Sanjay Patel
1d68112c4b [InstCombine] narrow masked zexted binops (PR35792)
This is guarded by shouldChangeType(), so the tests show that
we don't do the fold if the narrower type is not legal. Note
that there is a proposal (D42424) that would change the results
for the specific cases shown in these tests. That difference is
also discussed in PR35792:
https://bugs.llvm.org/show_bug.cgi?id=35792

Alive proofs for the cases handled here as well as the bitwise 
logic binops that we should already do better on:
https://rise4fun.com/Alive/c97
https://rise4fun.com/Alive/Lc5E
https://rise4fun.com/Alive/kdf

llvm-svn: 323437
2018-01-25 16:34:36 +00:00
Sanjay Patel
5a0cdac174 [InstCombine] canonicalize shifty abs(): ashr+add+xor --> cmp+neg+sel
We want to do this for 2 reasons:
1. Value tracking does not recognize the ashr variant, so it would fail to match for cases like D39766.
2. DAGCombiner does better at producing optimal codegen when we have the cmp+sel pattern.

More detail about what happens in the backend:
1. DAGCombiner has a generic transform for all targets to convert the scalar cmp+sel variant of abs 
   into the shift variant. That is the opposite of this IR canonicalization.
2. DAGCombiner has a generic transform for all targets to convert the vector cmp+sel variant of abs 
   into either an ABS node or the shift variant. That is again the opposite of this IR canonicalization.
3. DAGCombiner has a generic transform for all targets to convert the exact shift variants produced by #1 or #2
   into an ISD::ABS node. Note: It would be an efficiency improvement if we had #1 go directly to an ABS node 
   when that's legal/custom.
4. The pattern matching above is incomplete, so it is possible to escape the intended/optimal codegen in a 
   variety of ways.
   a. For #2, the vector path is missing the case for setlt with a '1' constant.
   b. For #3, we are missing a match for commuted versions of the shift variants.
5. Therefore, this IR canonicalization can only help get us to the optimal codegen. The version of cmp+sel 
   produced by this patch will be recognized in the DAG and converted to an ABS node when possible or the 
   shift sequence when not.
6. In the following examples with this patch applied, we may get conditional moves rather than the shift 
   produced by the generic DAGCombiner transforms. The conditional move is created using a target-specific 
   decision for any given target. Whether it is optimal or not for a particular subtarget may be up for debate.

define i32 @abs_shifty(i32 %x) {
  %signbit = ashr i32 %x, 31 
  %add = add i32 %signbit, %x  
  %abs = xor i32 %signbit, %add 
  ret i32 %abs
}

define i32 @abs_cmpsubsel(i32 %x) {
  %cmp = icmp slt i32 %x, zeroinitializer
  %sub = sub i32 zeroinitializer, %x
  %abs = select i1 %cmp, i32 %sub, i32 %x
  ret i32 %abs
}

define <4 x i32> @abs_shifty_vec(<4 x i32> %x) {
  %signbit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31> 
  %add = add <4 x i32> %signbit, %x  
  %abs = xor <4 x i32> %signbit, %add 
  ret <4 x i32> %abs
}

define <4 x i32> @abs_cmpsubsel_vec(<4 x i32> %x) {
  %cmp = icmp slt <4 x i32> %x, zeroinitializer
  %sub = sub <4 x i32> zeroinitializer, %x
  %abs = select <4 x i1> %cmp, <4 x i32> %sub, <4 x i32> %x
  ret <4 x i32> %abs
}
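
As a sanity check that the shift variant and the cmp+sub+select variant compute the same value, here is a small Python model of the two scalar functions on 32-bit values. It is only a sketch with hand-picked samples (including INT_MIN, where both forms wrap to the same bit pattern), not part of the patch.

```python
BITS = 32
MASK = (1 << BITS) - 1


def to_signed(v):
    v &= MASK
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v


def abs_shifty(x):
    # ashr x, 31 gives 0 for non-negative x and all-ones for negative x.
    signbit = (to_signed(x) >> 31) & MASK
    add = (signbit + x) & MASK
    return signbit ^ add


def abs_cmpsubsel(x):
    # select (x <s 0), (0 - x), x
    return (0 - x) & MASK if to_signed(x) < 0 else x & MASK


SAMPLES = [0, 1, 5, 0x7FFFFFFF, -1 & MASK, -5 & MASK, 0x80000000]


def variants_agree():
    return all(abs_shifty(x) == abs_cmpsubsel(x) for x in SAMPLES)
```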

> $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=x86_64 -mattr=avx 
> abs_shifty:
> 	movl	%edi, %eax
> 	negl	%eax
> 	cmovll	%edi, %eax
> 	retq
> 
> abs_cmpsubsel:
> 	movl	%edi, %eax
> 	negl	%eax
> 	cmovll	%edi, %eax
> 	retq
> 
> abs_shifty_vec:
> 	vpabsd	%xmm0, %xmm0
> 	retq
> 
> abs_cmpsubsel_vec:
> 	vpabsd	%xmm0, %xmm0
> 	retq
> 
> $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=aarch64
> abs_shifty:
> 	cmp	w0, #0                  // =0
> 	cneg	w0, w0, mi
> 	ret
> 
> abs_cmpsubsel: 
> 	cmp	w0, #0                  // =0
> 	cneg	w0, w0, mi
> 	ret
>                                        
> abs_shifty_vec: 
> 	abs	v0.4s, v0.4s
> 	ret
> 
> abs_cmpsubsel_vec: 
> 	abs	v0.4s, v0.4s
> 	ret
> 
> $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=powerpc64le 
> abs_shifty:  
> 	srawi 4, 3, 31
> 	add 3, 3, 4
> 	xor 3, 3, 4
> 	blr
> 
> abs_cmpsubsel:
> 	srawi 4, 3, 31
> 	add 3, 3, 4
> 	xor 3, 3, 4
> 	blr
> 
> abs_shifty_vec:   
> 	vspltisw 3, -16
> 	vspltisw 4, 15
> 	vsubuwm 3, 4, 3
> 	vsraw 3, 2, 3
> 	vadduwm 2, 2, 3
> 	xxlxor 34, 34, 35
> 	blr
> 
> abs_cmpsubsel_vec: 
> 	vspltisw 3, -16
> 	vspltisw 4, 15
> 	vsubuwm 3, 4, 3
> 	vsraw 3, 2, 3
> 	vadduwm 2, 2, 3
> 	xxlxor 34, 34, 35
> 	blr
>

Differential Revision: https://reviews.llvm.org/D40984

llvm-svn: 320921
2017-12-16 16:41:17 +00:00
Sanjay Patel
6840c5ff75 [ValueTracking, InstCombine] canonicalize fcmp ord/uno with non-NAN ops to null constants
This is a preliminary step towards solving the remaining part of PR27145 - IR for isfinite():
https://bugs.llvm.org/show_bug.cgi?id=27145

In order to solve that one more generally, we need to add matching for and/or of fcmp ord/uno
with a constant operand.

But while looking at those patterns, I realized we were missing a canonicalization for nonzero
constants. Rather than limiting to just folds for constants, we're adding a general value
tracking method for this based on an existing DAG helper.

By transforming everything to 0.0, we can simplify the existing code in foldLogicOfFCmps()
and pick up missing vector folds.
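
The reason any non-NaN operand can be replaced by 0.0 is that ord/uno only ask whether an operand is NaN. A minimal Python model (fcmp_ord and fcmp_uno are stand-in names, not LLVM APIs) illustrates this:

```python
import math


def fcmp_ord(a, b):
    # True when neither operand is NaN.
    return not (math.isnan(a) or math.isnan(b))


def fcmp_uno(a, b):
    # True when at least one operand is NaN.
    return math.isnan(a) or math.isnan(b)


VALUES = [0.0, -0.0, 1.5, -3.0, math.inf, -math.inf, math.nan]
NON_NAN_OPERANDS = [1.0, -42.0, math.inf]


def canonical_form_matches():
    # Replacing a known non-NaN operand with 0.0 never changes the result.
    return all(
        fcmp_ord(x, c) == fcmp_ord(x, 0.0) and fcmp_uno(x, c) == fcmp_uno(x, 0.0)
        for x in VALUES
        for c in NON_NAN_OPERANDS
    )
```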

Differential Revision: https://reviews.llvm.org/D37427

llvm-svn: 312591
2017-09-05 23:13:13 +00:00
Sanjay Patel
bc6da4e40f [InstCombine] replace unnecessary fcmp fold with assert
See https://reviews.llvm.org/rL312411 for related InstSimplify tests.

llvm-svn: 312421
2017-09-02 18:10:29 +00:00
Sanjay Patel
64fc5daf42 [InstCombine] combine foldAndOfFCmps and foldOrOfFcmps; NFCI
In addition to removing chunks of duplicated code, we don't
want these to diverge. If there's a fold for one, there
should be a fold of the other via DeMorgan's Laws.

llvm-svn: 312420
2017-09-02 17:53:33 +00:00
Sanjay Patel
275bb5a14e [InstCombine] fix misnamed locals and use them to reduce code; NFCI
We had these locals:
Value *Op0RHS = LHS->getOperand(1);
Value *Op1LHS = RHS->getOperand(0);
...so we confusingly transposed the meaning of left/right and op0/op1.

llvm-svn: 312418
2017-09-02 17:17:17 +00:00
Sanjay Patel
da6f9b2fee [InstCombine] remove unnecessary code; NFC
llvm-svn: 312416
2017-09-02 16:32:37 +00:00
Sanjay Patel
4c52f765a5 [InstCombine] move related functions next to each other; NFC
This makes it easier to see that they're almost duplicates.
As with the similar icmp functions, there should be identical 
folds for both logic ops because those are DeMorganized variants.

llvm-svn: 312415
2017-09-02 16:30:27 +00:00
Craig Topper
d3b465606a [InstCombine] Don't require the compare types to be the same in getMaskedTypeForICmpPair.
A future patch will make the code look through truncates feeding the compare. So the compares might be different types but the pretruncated types might be the same.

This should be safe because we still require the same Value* to be used truncated or not in both compares. So that serves to ensure the types are the same.

llvm-svn: 312381
2017-09-01 21:27:31 +00:00
Craig Topper
085c1f4dea [InstCombine] When converting decomposeBitTestICmp's APInt return to ConstantInt, make sure we use the type from the Value* that was also returned from decomposeBitTestICmp.
Previously we used the type from the LHS of the compare, but a future patch will change decomposeBitTestICmp to look through truncates so it will return a pretruncated Value* and the type needs to match that.

llvm-svn: 312380
2017-09-01 21:27:29 +00:00
Craig Topper
ec4b82571c [InstCombine] Remove check for sext of vector icmp from shouldOptimizeCast
Looks like for 'and' and 'or' we end up performing at least some of the transformations this is blocking in a roundabout way anyway.

For 'and sext(cmp1), sext(cmp2)' we end up later turning it into 'select cmp1, sext(cmp2), 0'. Then we optimize that back to sext (and cmp1, cmp2). This is the same result we would have gotten if shouldOptimizeCast hadn't blocked it. We do something analogous for 'or'.

With this patch we allow that transformation to happen directly in foldCastedBitwiseLogic. And we now support the same thing for 'xor'. This is definitely opening up many other cases, but since we already went around it for some cases hopefully it's ok.

Differential Revision: https://reviews.llvm.org/D36213

llvm-svn: 311508
2017-08-22 23:40:15 +00:00
Craig Topper
775ffcc8f5 [InstCombine] Move the checks for pointer types in getMaskedTypeForICmpPair earlier in the function
I don't think there's any reason to have them scattered about and on all 4 operands. We already have an early check that both compares must be the same type, and within a given compare the LHS and RHS must have the same type. Beyond that, I don't think there's any way this function returns anything valid for pointer types. So let's just return early and be done with it.

Differential Revision: https://reviews.llvm.org/D36561

llvm-svn: 311383
2017-08-21 21:00:45 +00:00
Craig Topper
0aa3a19512 Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify"
This recommits r310869, with the moved files and no extra changes.

Original commit message:

This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.

I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself.

I also had to make decomposeBitTest support vectors since InstSimplify needs that.

As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.
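
For readers unfamiliar with the idea, a bit-test decomposition rewrites a compare as (X & Mask) == 0 or (X & Mask) != 0. The exact set of predicates handled is not spelled out in this message, so the four equivalences below are an assumed illustration (two signed, plus the ult and ugt forms with power-of-two bounds), checked exhaustively at 8 bits in Python.

```python
BITS = 8
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def to_signed(v):
    v &= MASK
    return v - (1 << BITS) if v & SIGN else v


def bit_test_equivalences_hold():
    for x in range(1 << BITS):
        # X <s 0   <=>  (X & signmask) != 0
        if (to_signed(x) < 0) != ((x & SIGN) != 0):
            return False
        # X >s -1  <=>  (X & signmask) == 0
        if (to_signed(x) > -1) != ((x & SIGN) == 0):
            return False
        for n in range(BITS):
            p = 1 << n
            # X <u 2^n      <=>  (X & -2^n) == 0
            if (x < p) != ((x & (-p & MASK)) == 0):
                return False
            # X >u 2^n - 1  <=>  (X & ~(2^n - 1)) != 0
            if (x > p - 1) != ((x & (~(p - 1) & MASK)) != 0):
                return False
    return True
```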

Differential Revision: https://reviews.llvm.org/D36593

llvm-svn: 310889
2017-08-14 21:39:51 +00:00
Craig Topper
69fa8e0d99 Revert r310869 "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify"
Failed to add the two files that moved. And then added an extra change I didn't mean to while trying to fix that. Reverting everything.

llvm-svn: 310873
2017-08-14 19:09:32 +00:00
Craig Topper
2f0b450666 [InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify
This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.

I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself.

I also had to make decomposeBitTest support vectors since InstSimplify needs that.

As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.

Differential Revision: https://reviews.llvm.org/D36593

llvm-svn: 310869
2017-08-14 18:49:42 +00:00
Craig Topper
f720099007 [InstCombine] Simplify and inline FoldOrWithConstants/FoldXorWithConstants
Summary:
These functions were overly complicated. The body of this function was rechecking for an And operation to find the constant, but we already knew we were looking at two Ands ORed together and the pieces are in variables. We already had earlier nearby code that checked for ConstantInts. So just inline the remaining parts into the earlier code.

Next step is to use m_APInt instead of ConstantInt.

Reviewers: spatel, efriedma, davide, majnemer

Reviewed By: spatel

Subscribers: zzheng, llvm-commits

Differential Revision: https://reviews.llvm.org/D36439

llvm-svn: 310806
2017-08-14 00:04:21 +00:00
Craig Topper
9a6110b2d3 [InstCombine] Make (X|C1)^C2 -> X^(C1^C2) iff X&~C1 == 0 work for splat vectors
This also corrects the description to match what was actually implemented. The old comment said X^(C1|C2), but it implemented X^((C1|C2)&~(C1&C2)). I believe ((C1|C2)&~(C1&C2)) is equivalent to (C1^C2).
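
The claimed equivalence between ((C1|C2)&~(C1&C2)) and (C1^C2) is easy to confirm by brute force. A small Python check over all pairs of 8-bit constants (function names are illustrative only):

```python
def implemented_form(c1, c2):
    # (C1 | C2) & ~(C1 & C2)
    return (c1 | c2) & ~(c1 & c2)


def xor_form(c1, c2):
    # C1 ^ C2
    return c1 ^ c2


def forms_are_equivalent():
    return all(
        implemented_form(c1, c2) == xor_form(c1, c2)
        for c1 in range(256)
        for c2 in range(256)
    )
```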

Differential Revision: https://reviews.llvm.org/D36505

llvm-svn: 310658
2017-08-10 20:35:34 +00:00
Craig Topper
57b4d8646b [InstCombine] Fix a crash in getSelectCondition if we happen to have two inverse vectors of i1 constants.
We used to try to truncate the constant vector to vXi1, but if it's already i1 this would fail. Instead we now use IRBuilder::getZExtOrTrunc which should check the type and only create a trunc if needed. I believe this should trigger constant folding in the IRBuilder and ultimately do the same thing just with the additional type check.

llvm-svn: 310639
2017-08-10 17:48:14 +00:00
Craig Topper
5706c01c0b [InstCombine] Use regular dyn_cast instead of a matcher for a simple case. NFC
llvm-svn: 310446
2017-08-09 06:17:48 +00:00
Aaron Ballman
428f0fe910 Removing an unused variable that was missed with the refactoring in r310272; NFC.
llvm-svn: 310285
2017-08-07 19:26:17 +00:00
Craig Topper
7091a743b4 [InstCombine] Support (X | C1) & C2 --> (X & C2^(C1&C2)) | (C1&C2) for vector splats
Note the original code I deleted incorrectly listed this as (X | C1) & C2 --> (X & C2^(C1&C2)) | C1, which is only valid if C1 is a subset of C2. This relied on SimplifyDemandedBits to remove any extra bits from C1 before we got to that code.

My new implementation avoids relying on that behavior so that it can be naively verified with alive.
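
The difference between the two formulations can be seen with a brute-force check at 4 bits. This Python sketch (names are hypothetical) confirms that the new form always matches (X | C1) & C2, while the form from the old comment matches only when C1 has no bits outside C2.

```python
def original(x, c1, c2):
    return (x | c1) & c2


def new_form(x, c1, c2):
    # (X & C2^(C1&C2)) | (C1&C2)
    return (x & (c2 ^ (c1 & c2))) | (c1 & c2)


def old_comment_form(x, c1, c2):
    # (X & C2^(C1&C2)) | C1
    return (x & (c2 ^ (c1 & c2))) | c1


def new_form_always_valid():
    r = range(16)
    return all(original(x, a, b) == new_form(x, a, b) for x in r for a in r for b in r)


def old_form_valid_only_for_subsets():
    r = range(16)
    for c1 in r:
        for c2 in r:
            is_subset = (c1 & ~c2) == 0
            matches = all(original(x, c1, c2) == old_comment_form(x, c1, c2) for x in r)
            if matches != is_subset:
                return False
    return True
```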

Differential Revision: https://reviews.llvm.org/D36384

llvm-svn: 310272
2017-08-07 18:10:39 +00:00
Craig Topper
576fb91aef [InstCombine] Remove shift handling from OptAndOp.
Summary: This is all handled by SimplifyDemandedBits.

Reviewers: spatel, davide

Reviewed By: davide

Subscribers: davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D36382

llvm-svn: 310234
2017-08-06 23:30:49 +00:00
Craig Topper
a1693a2ed3 [InstCombine] Support (X ^ C1) & C2 --> (X & C2) ^ (C1&C2) for vector splats.
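
This fold is just AND distributing over XOR. A quick exhaustive Python check at 4 bits, with illustrative function names:

```python
def before_fold(x, c1, c2):
    # (X ^ C1) & C2
    return (x ^ c1) & c2


def after_fold(x, c1, c2):
    # (X & C2) ^ (C1 & C2)
    return (x & c2) ^ (c1 & c2)


def fold_is_valid():
    r = range(16)
    return all(before_fold(x, a, b) == after_fold(x, a, b) for x in r for a in r for b in r)
```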
llvm-svn: 310233
2017-08-06 23:11:49 +00:00
Craig Topper
9cbdbefd0f [InstCombine] Support '(C - X) ^ signmask -> (C + signmask - X)' and '(X + C) ^ signmask -> (X + C + signmask)' for vector splats.
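
Both folds rest on the fact that XOR with the sign mask is the same as adding the sign mask modulo 2^n. A brute-force Python sketch at 8 bits (scalar values only; names are hypothetical):

```python
BITS = 8
MASK = (1 << BITS) - 1
SIGNMASK = 1 << (BITS - 1)


def sub_then_xor(c, x):
    # (C - X) ^ signmask
    return ((c - x) & MASK) ^ SIGNMASK


def folded_sub(c, x):
    # (C + signmask) - X
    return (c + SIGNMASK - x) & MASK


def add_then_xor(x, c):
    # (X + C) ^ signmask
    return ((x + c) & MASK) ^ SIGNMASK


def folded_add(x, c):
    # X + (C + signmask)
    return (x + c + SIGNMASK) & MASK


def signmask_folds_hold():
    r = range(1 << BITS)
    return all(
        sub_then_xor(c, x) == folded_sub(c, x) and add_then_xor(x, c) == folded_add(x, c)
        for c in r
        for x in r
    )
```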
llvm-svn: 310232
2017-08-06 22:17:21 +00:00
Craig Topper
b5bf016015 [InstCombine] Support ~(c-X) --> X+(-c-1) and ~(X-c) --> (-c-1)-X for splat vectors.
llvm-svn: 310195
2017-08-06 06:28:41 +00:00
Craig Topper
9ffda5ab86 [InstCombine] Fold (C - X) ^ signmask -> (C + signmask - X).
llvm-svn: 310186
2017-08-05 20:00:44 +00:00
Craig Topper
760ff6ee87 [InstCombine] Remove the (not (sext)) case from foldBoolSextMaskToSelect and inline the remaining code to match visitOr
Summary:
The (not (sext)) case is really (xor (sext), -1) which should have been simplified to (sext (xor, 1)) before we got here. So we shouldn't need to handle it.

With that taken care of, we only need two cases, so we don't need the swap anymore. This puts us in sync with the equivalent code in visitOr, so inline this to match.
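
The equivalence behind dropping the (not (sext)) case can be checked directly for a one-bit input. A minimal Python model, assuming an 8-bit destination type and an invented sext_i1 helper:

```python
BITS = 8
ALL_ONES = (1 << BITS) - 1


def sext_i1(b):
    # Sign-extend a 1-bit value: 0 -> 0, 1 -> all ones.
    return ALL_ONES if b else 0


def not_of_sext(b):
    # xor (sext b), -1
    return sext_i1(b) ^ ALL_ONES


def sext_of_not(b):
    # sext (xor b, 1)
    return sext_i1(b ^ 1)
```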

Reviewers: spatel, eli.friedman, majnemer

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36240

llvm-svn: 310063
2017-08-04 16:07:20 +00:00
Craig Topper
4068e4eec5 [InstCombine] Remove explicit code for folding (xor(zext(cmp)), 1) and (xor(sext(cmp)), -1) to ext(!cmp).
As far as I can tell this should be handled by foldCastedBitwiseLogic which is called later in visitXor.

Differential Revision: https://reviews.llvm.org/D36214

llvm-svn: 309882
2017-08-02 20:30:27 +00:00
Craig Topper
ae9b87d10c [InstCombine] Support sext in foldLogicCastConstant
This adds support for sext in foldLogicCastConstant. This is a prerequisite for D36214.

Differential Revision: https://reviews.llvm.org/D36234

llvm-svn: 309880
2017-08-02 20:25:56 +00:00
Sanjay Patel
dac0ab272c [InstCombine] allow mask hoisting transform for vector types
llvm-svn: 309627
2017-07-31 21:01:53 +00:00
Craig Topper
2072aca51c [InstCombine] Move (0 - x) & 1 --> x & 1 to SimplifyDemandedUseBits.
This removes a dedicated matcher and allows us to support more than just an AND masking the lower bit.
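
Negation never changes the lowest bit, which is why only that bit being demanded lets the negation be dropped. A short exhaustive check at 8 bits in Python (illustrative names):

```python
BITS = 8
MASK = (1 << BITS) - 1


def neg_and_one(x):
    # (0 - x) & 1
    return ((0 - x) & MASK) & 1


def and_one(x):
    # x & 1
    return x & 1


def low_bit_unchanged_by_negation():
    return all(neg_and_one(x) == and_one(x) for x in range(1 << BITS))
```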

llvm-svn: 308124
2017-07-16 05:37:58 +00:00
Craig Topper
d918d5b36b [InstCombine] Improve the expansion in SimplifyUsingDistributiveLaws to handle cases where one side doesn't simplify, but the other side resolves to an identity value
Summary:
If one side simplifies to the identity value for inner opcode, we can replace the value with just the operation that can't be simplified.

I've removed a couple now unneeded special cases in visitAnd and visitOr. There are probably other cases I missed.
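
One instance of this pattern, chosen here purely for illustration and not taken from the patch: expanding (A | B) & ~A gives (A & ~A) | (B & ~A). The left side simplifies to 0, the identity for 'or', so the whole expression reduces to B & ~A. A Python brute-force check at 4 bits:

```python
MASK = 0xF


def unexpanded(a, b):
    # (A | B) & ~A
    return (a | b) & (~a & MASK)


def surviving_side(a, b):
    # B & ~A, the side of the expansion that does not simplify
    return b & (~a & MASK)


def identity_side(a):
    # A & ~A, which is always the 'or' identity value 0
    return a & (~a & MASK)


def expansion_reduces_to_one_side():
    r = range(16)
    return all(
        identity_side(a) == 0 and unexpanded(a, b) == surviving_side(a, b)
        for a in r
        for b in r
    )
```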

Reviewers: spatel, majnemer, hfinkel, dberlin

Reviewed By: spatel

Subscribers: grandinj, llvm-commits, spatel

Differential Revision: https://reviews.llvm.org/D35451

llvm-svn: 308111
2017-07-15 21:49:49 +00:00
Sanjay Patel
3437ee2740 [InstCombine] improve (1 << x) & 1 --> zext(x == 0) folding
1. Add a one-use check to prevent increasing instruction count.
2. Generalize the pattern matching to include vector types.
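
For in-range shift amounts the fold is easy to confirm: 1 << x has its low bit set only when x is 0. A Python check at 8 bits, with hypothetical function names and shift amounts restricted to less than the bit width:

```python
BITS = 8
MASK = (1 << BITS) - 1


def shl_and_one(x):
    # (1 << x) & 1
    return ((1 << x) & MASK) & 1


def zext_is_zero(x):
    # zext(x == 0)
    return 1 if x == 0 else 0


def fold_holds_for_valid_shifts():
    return all(shl_and_one(x) == zext_is_zero(x) for x in range(BITS))
```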

llvm-svn: 308105
2017-07-15 17:26:01 +00:00
Sanjay Patel
55b9f88ecc [InstCombine] allow (0 - x) & 1 --> x & 1 for vectors
llvm-svn: 308098
2017-07-15 15:29:47 +00:00
Sanjay Patel
27339133a7 [InstCombine] remove dead code/tests; NFCI
These patterns and tests were added to InstSimplify with:
https://reviews.llvm.org/rL303004

llvm-svn: 308096
2017-07-15 15:01:33 +00:00
Craig Topper
fde4723ebe [IR] Add Type::isIntOrIntVectorTy(unsigned) similar to the existing isIntegerTy(unsigned), but also works for vectors.
llvm-svn: 307492
2017-07-09 07:04:03 +00:00
Craig Topper
bb4069e439 [InstCombine] Make InstCombine's IRBuilder be passed by reference everywhere
Previously the InstCombiner class contained a pointer to an IR builder that had been passed to the constructor. Sometimes this would be passed to helper functions as either a pointer or the pointer would be dereferenced to be passed by reference.

This patch makes it a reference everywhere, including the InstCombiner class itself, so there is no more inconsistency. This is a large but mechanical patch. I've done very minimal formatting changes on it despite what clang-format wanted to do.

llvm-svn: 307451
2017-07-07 23:16:26 +00:00
Craig Topper
79ab643da8 [Constants] If we already have a ConstantInt*, prefer to use isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI
Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne.

llvm-svn: 307292
2017-07-06 18:39:47 +00:00
Craig Topper
95e4142f94 [InstCombine] Change helper method to a file local static method. NFC
llvm-svn: 307275
2017-07-06 16:24:23 +00:00
Craig Topper
fc42acef92 [InstCombine] Clarify comment to mention other transform that it does. NFC
llvm-svn: 307274
2017-07-06 16:24:22 +00:00
Craig Topper
22795de20a [InstCombine] Add single use checks to SimplifyBSwap to ensure we are really saving instructions
Bswap isn't a simple operation so we need to make sure we are really removing a call to it before doing these simplifications.

For the case when both LHS and RHS are bswaps, I've allowed it to be moved if either LHS or RHS has a single use. That at least allows us to move it later, where it might find another bswap to combine with, and it decreases the use count on the other side, so maybe the other user can be optimized.

Differential Revision: https://reviews.llvm.org/D34974

llvm-svn: 307273
2017-07-06 16:24:21 +00:00
Craig Topper
cc418b656a [InstCombine] Use CmpInst::Predicate with m_Cmp instead of ICmpInst::Predicate. NFC
There isn't really an ICmpInst version so we're just accessing the CmpInst version through inheritance.

llvm-svn: 307199
2017-07-05 20:31:00 +00:00
Craig Topper
8036970008 [InstCombine] Add a TODO for a probable missing single use check. NFC
Will try to fix it soon, but in case I forget.

llvm-svn: 307003
2017-07-03 05:54:16 +00:00
Craig Topper
766ce6e9cf [InstCombine] Support BITWISE_OP( BSWAP(x), CONSTANT ) -> BSWAP( BITWISE_OP(x, BSWAP(CONSTANT) ) ) for splat vectors.
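
The fold works because a byte swap only permutes bits, so it commutes with any bitwise operation. A scalar 32-bit Python sketch (bswap32 is a stand-in for the intrinsic; the samples are arbitrary):

```python
import operator


def bswap32(v):
    # Reverse the byte order of a 32-bit value.
    return int.from_bytes((v & 0xFFFFFFFF).to_bytes(4, "little"), "big")


SAMPLES = [0x00000000, 0x12345678, 0xFF00FF00, 0xDEADBEEF, 0x000000FF]
BITWISE_OPS = [operator.and_, operator.or_, operator.xor]


def constant_fold_holds():
    # OP(BSWAP(x), C) == BSWAP(OP(x, BSWAP(C)))
    return all(
        op(bswap32(x), c) == bswap32(op(x, bswap32(c)))
        for op in BITWISE_OPS
        for x in SAMPLES
        for c in SAMPLES
    )
```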
llvm-svn: 307002
2017-07-03 05:54:15 +00:00
Craig Topper
32fce4d647 [InstCombine] Remove support for BITWISE_OP(CONSTANT, BSWAP(x)) -> BSWAP(OP(BSWAP(CONSTANT), x)).
Constants were already canonicalized to the right hand side before we got here.

llvm-svn: 307000
2017-07-03 05:54:13 +00:00
Craig Topper
1e4643a98e [InstCombine] Support BITWISE_OP(BSWAP(A),BSWAP(B))->BSWAP(BITWISE_OP(A, B)) for vectors.
llvm-svn: 306999
2017-07-03 05:54:13 +00:00