Commit Graph

1020 Commits

Author SHA1 Message Date
Tim Northover
819bfb5a25 DAGCombiner: make sure or/shl/srl really has zero high bits before forming bswap
We want to convert code like (or (srl N, 8), (shl N, 8)) into (srl (bswap N),
const), but this is only valid if the bits above 16 on the source pattern are
0, the checks we were doing on this were slightly wrong before.

llvm-svn: 189348
2013-08-27 13:46:45 +00:00
Tom Stellard
838e2344ec SelectionDAG: Remove unnecessary uses of TargetLowering::getPointerTy()
If we have a binary operation like ISD:ADD, we can set the result type
equal to the result type of one of its operands rather than using
TargetLowering::getPointerTy().

Also, any use of DAG.getIntPtrConstant(C) as an operand for a binary
operation can be replaced with:
DAG.getConstant(C, OtherOperand.getValueType());

llvm-svn: 189227
2013-08-26 15:06:10 +00:00
Juergen Ributzka
3db39dc1ae Teach BaseIndexOffset::match to identify base pointers in loops.
The small utility function that pattern matches Base + Index +
Offset patterns for loads and stores fails to recognize the base
pointer for loads/stores from/into an array at offset 0 inside a
loop. As a result DAGCombiner::MergeConsecutiveStores was not able
to merge all stores.

This commit fixes the issue by adding an additional pattern match
and also a test case.

Reviewer: Nadav
llvm-svn: 188936
2013-08-21 21:53:38 +00:00
Craig Topper
d9c2783d8f Replace getValueType().getSimpleVT() with getSimpleValueType().
llvm-svn: 188442
2013-08-15 02:44:19 +00:00
Jim Grosbach
327ccc787e DAG: Combine (and (setne X, 0), (setne X, -1)) -> (setuge (add X, 1), 2)
A common idiom is to use zero and all-ones as sentinal values and to
check for both in a single conditional ("x != 0 && x != (unsigned)-1").
That generates code, for i32, like:
  testl %edi, %edi
  setne %al
  cmpl  $-1, %edi
  setne %cl
  andb  %al, %cl

With this transform, we generate the simpler:
  incl  %edi
  cmpl  $1, %edi
  seta  %al

Similar improvements for other integer sizes and on other platforms. In
general, combining the two setcc instructions into one is better.

rdar://14689217

llvm-svn: 188315
2013-08-13 21:30:58 +00:00
Craig Topper
309dfefb6f Optimize mask generation for one of the DAG combiner shufflevector cases.
llvm-svn: 187961
2013-08-08 07:38:55 +00:00
Tom Stellard
d42c594960 TargetLowering: Add getVectorIdxTy() function v2
This virtual function can be implemented by targets to specify the type
to use for the index operand of INSERT_VECTOR_ELT, EXTRACT_VECTOR_ELT,
INSERT_SUBVECTOR, EXTRACT_SUBVECTOR.  The default implementation returns
the result from TargetLowering::getPointerTy()

The previous code was using TargetLowering::getPointerTy() for vector
indices, because this is guaranteed to be legal on all targets.  However,
using TargetLowering::getPointerTy() can be a problem for targets with
pointer sizes that differ across address spaces.  On such targets,
when vectors need to be loaded or stored to an address space other than the
default 'zero' address space (which is the address space assumed by
TargetLowering::getPointerTy()), having an index that
is a different size than the pointer can lead to inefficient
pointer calculations, (e.g. 64-bit adds for a 32-bit address space).

There is no intended functionality change with this patch.

llvm-svn: 187748
2013-08-05 22:22:01 +00:00
Quentin Colombet
6bf4baa408 [DAGCombiner] insert_vector_elt: Avoid building a vector twice.
This patch prevents the following combine when the input vector is used more
than once.
insert_vector_elt (build_vector elt0, ..., eltN), NewEltIdx, idx
=>
build_vector elt0, ..., NewEltIdx, ..., eltN 

The reasons are:
- Building a vector may be expensive, so try to reuse the existing part of a
  vector instead of creating a new one (think big vectors).
- elt0 to eltN now have two users instead of one. This may prevent some other
  optimizations.

llvm-svn: 187396
2013-07-30 00:24:09 +00:00
Tom Stellard
c54731aa9d DAGCombiner: Pass the correct type to TargetLowering::isF(Abs|Neg)Free
This commit also implements these functions for R600 and removes a test
case that was relying on the buggy behavior.

llvm-svn: 187007
2013-07-23 23:55:03 +00:00
Craig Topper
b94011fd28 Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size.
llvm-svn: 186274
2013-07-14 04:42:23 +00:00
Craig Topper
e0b711864c Pass SmallVector by const reference instead of by value.
llvm-svn: 186243
2013-07-13 07:43:40 +00:00
Stephen Lin
10947502e5 Remove trailing whitespac
llvm-svn: 186032
2013-07-10 20:47:39 +00:00
Stephen Lin
73de7bf5de AArch64/PowerPC/SystemZ/X86: This patch fixes the interface, usage, and all
in-tree implementations of TargetLoweringBase::isFMAFasterThanMulAndAdd in
order to resolve the following issues with fmuladd (i.e. optional FMA)
intrinsics:

1. On X86(-64) targets, ISD::FMA nodes are formed when lowering fmuladd
intrinsics even if the subtarget does not support FMA instructions, leading
to laughably bad code generation in some situations.

2. On AArch64 targets, ISD::FMA nodes are formed for operations on fp128,
resulting in a call to a software fp128 FMA implementation.

3. On PowerPC targets, FMAs are not generated from fmuladd intrinsics on types
like v2f32, v8f32, v4f64, etc., even though they promote, split, scalarize,
etc. to types that support hardware FMAs.

The function has also been slightly renamed for consistency and to force a
merge/build conflict for any out-of-tree target implementing it. To resolve,
see comments and fixed in-tree examples.

llvm-svn: 185956
2013-07-09 18:16:56 +00:00
Hal Finkel
6c29bd9088 DAGCombine tryFoldToZero cannot create illegal types after type legalization
When folding sub x, x (and other similar constructs), where x is a vector, the
result is a vector of zeros. After type legalization, make sure that the input
zero elements have a legal type. This type may be larger than the result's
vector element type.

This was another bug found by llvm-stress.

llvm-svn: 185949
2013-07-09 17:02:45 +00:00
Stephen Lin
8e8424eb17 Style fixes: remove unnecessary braces for one-statement if blocks, no else after return, etc. No funcionality change.
llvm-svn: 185893
2013-07-09 00:44:49 +00:00
Stephen Lin
cfe7f352c7 Remove trailing whitespace from SelectionDAG/*.cpp
llvm-svn: 185780
2013-07-08 00:37:03 +00:00
Benjamin Kramer
c7332b2796 DAGCombiner: Don't drop extension behavior when shrinking a load when unsafe.
ReduceLoadWidth unconditionally drops extensions from loads. Limit it to the
case when all of the bits the extension would otherwise produce are dropped by
the shrink. It would be possible to shrink the load in more cases by merging
the extensions, but this isn't trivial and a very rare case. I left a TODO for
that case.

Fixes PR16551.

llvm-svn: 185755
2013-07-06 14:05:09 +00:00
Tim Northover
6823900e55 DAGCombiner: fix use-counting issue when forming zextload
DAGCombiner was counting all uses of a load node  when considering whether it's
worth combining into a zextload. Really, it wants to ignore the chain and just
count real uses.

rdar://problem/13896307

llvm-svn: 185419
2013-07-02 09:58:53 +00:00
Elena Demikhovsky
fed077be03 Fixed a comment.
llvm-svn: 184933
2013-06-26 12:15:53 +00:00
Elena Demikhovsky
6769c50d9e Optimized integer vector multiplication operation by replacing it with shift/xor/sub when it is possible. Fixed a bug in SDIV, where the const operand is not a splat constant vector.
llvm-svn: 184931
2013-06-26 10:55:03 +00:00
Michael Liao
62ebfd8786 Fix PR16360
When (srl (anyextend x), c) is folded into (anyextend (srl x, c)), the
high bits are not cleared. Add 'and' to clear off them.

llvm-svn: 184575
2013-06-21 18:45:27 +00:00
Stephen Lin
605207fe75 SelectionDAG: slightly refactor DAGCombiner::visitSELECT_CC to avoid redudant checks...
This doesn't really effect performance due to all the relevant calls being transparent but is clearer. 

llvm-svn: 184027
2013-06-15 04:03:33 +00:00
Matt Arsenault
d2f0332a29 Introduce getSelect usage and use more getSelectCC
llvm-svn: 184012
2013-06-14 22:04:37 +00:00
Stephen Lin
4e69d01b67 SelectionDAG: minor fix to order of operands in comments to match the code
llvm-svn: 184008
2013-06-14 21:33:58 +00:00
Stephen Lin
e31f2d2d54 SelectionDAG: Fix incorrect condition checks in some cases of folding FADD/FMUL combinations; also improve accuracy of comments
llvm-svn: 183993
2013-06-14 18:17:35 +00:00
Andrew Trick
ef9de2a739 Track IR ordering of SelectionDAG nodes 2/4.
Change SelectionDAG::getXXXNode() interfaces as well as call sites of
these functions to pass in SDLoc instead of DebugLoc.

llvm-svn: 182703
2013-05-25 02:42:55 +00:00
Michael J. Spencer
df1ecbd734 Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros.
llvm-svn: 182680
2013-05-24 22:23:49 +00:00
Matt Arsenault
75865923c9 Add LLVMContext argument to getSetCCResultType
llvm-svn: 182180
2013-05-18 00:21:46 +00:00
Matt Arsenault
04126234e5 Replace redundant code
Use EVT::changeExtendedVectorElementTypeToInteger instead of doing the
same thing that it does

llvm-svn: 182165
2013-05-17 21:43:43 +00:00
Bob Wilson
c5c0823724 Remove redundant variable introduced by r181682.
llvm-svn: 181721
2013-05-13 19:02:31 +00:00
Hao Liu
bc60196951 Fix PR15950 A bug in DAG Combiner about undef mask
llvm-svn: 181682
2013-05-13 02:07:05 +00:00
Benjamin Kramer
a5d59333b3 DAGCombiner: Generate a correct constant for vector types when folding (xor (and)) into (and (not)).
PR15948.

llvm-svn: 181597
2013-05-10 14:09:52 +00:00
David Majnemer
386ab7f872 DAGCombiner: Simplify inverted bit tests
Fold (xor (and x, y), y) -> (and (not x), y)

This removes an opportunity for a constant to appear twice.

llvm-svn: 181395
2013-05-08 06:44:42 +00:00
Michael Kuperstein
ac868757d0 Fix slightly too aggressive conact_vector optimization.
(Would sometimes optimize away conacts used to extend a vector with undef values)

llvm-svn: 181186
2013-05-06 08:06:13 +00:00
Nadav Rotem
e5a2dda372 Optimize away nop CONCAT_VECTOR nodes.
Optimize CONCAT_VECTOR nodes that merge EXTRACT_SUBVECTOR values that extract from the same vector.

rdar://13402653
PR15866

llvm-svn: 180871
2013-05-01 19:18:51 +00:00
Silviu Baranga
af7e8c367f Re-write the address propagation code for pre-indexed loads/stores to take into account some previously misssed cases (PRE_DEC addressing mode, the offset and base address are swapped, etc). This should fix PR15581.
llvm-svn: 180609
2013-04-26 15:52:24 +00:00
Benjamin Kramer
d56ffc709d DAGCombiner: Canonicalize vector integer abs in the same way we do it for scalars.
This already helps SSE2 x86 a lot because it lacks an efficient way to
represent a vector select. The long term goal is to enable the backend to match
a canonicalized pattern into a single instruction (e.g. vabs or pabs).

llvm-svn: 180597
2013-04-26 09:19:19 +00:00
Owen Anderson
2d4cca35c3 DAGCombine should not aggressively fold SEXT(VSETCC(...)) into a wider VSETCC without first checking the target's vector boolean contents.
This exposed an issue with PowerPC AltiVec where it appears it was setting the wrong vector boolean contents.  The included change
fixes the PowerPC tests, and was OK'd by Hal.

llvm-svn: 180129
2013-04-23 18:09:28 +00:00
Tim Northover
a2b533906a Remove unused MEMBARRIER DAG node; it's been replaced by ATOMIC_FENCE.
llvm-svn: 179939
2013-04-20 12:32:17 +00:00
Benjamin Kramer
bbae991db6 DAGCombiner: Fold a shuffle on CONCAT_VECTORS into a new CONCAT_VECTORS if possible.
This pattern occurs in SROA output due to the way vector arguments are lowered
on ARM.

The testcase from PR15525 now compiles into this, which is better than the code
we got with the old scalarrepl:
_Store:
	ldr.w	r9, [sp]
	vmov	d17, r3, r9
	vmov	d16, r1, r2
	vst1.8	{d16, d17}, [r0]
	bx	lr

Differential Revision: http://llvm-reviews.chandlerc.com/D647

llvm-svn: 179106
2013-04-09 17:41:43 +00:00
Arnold Schwaighofer
d6c6e868b2 DAGCombiner: Merge store/loads when we have extload/truncstores
This is helps on architectures where i8,i16 are not legal but we have byte, and
short loads/stores. Allowing us to merge copies like the one below on ARM.

copy(char *a, char *b, int n) {
 do {
   int t0 = a[0];
   int t1 = a[1];
   b[0] = t0;
   b[1] = t1;

radar://13536387

llvm-svn: 178546
2013-04-02 15:58:51 +00:00
Arnold Schwaighofer
6752366ed7 Merge load/store sequences with adresses: base + index + offset
We would also like to merge sequences that involve a variable index like in the
example below.

    int index = *idx++
    int i0 = c[index+0];
    int i1 = c[index+1];
    b[0] = i0;
    b[1] = i1;

By extending the parsing of the base pointer to handle dags that contain a
base, index, and offset we can handle examples like the one above.

The dag for the code above will look something like:

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i8 load %index))))

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i32 add (i32 signextend (i8 load %index))
                                         (i32 1)))))

The code that parses the tree ignores the intermediate sign extensions. However,
if there is a sign extension it needs to be on all indexes.

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (add (i8 load %index)
                                     (i8 1))))
 vs

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i32 add (i32 signextend (i8 load %index))
                                         (i32 1)))))
radar://13536387

llvm-svn: 178483
2013-04-01 18:12:58 +00:00
Benjamin Kramer
9335443236 DAGCombine: visitXOR can replace a node without returning it, bail out in that case.
Fixes the crash reported in PR15608.

llvm-svn: 178429
2013-03-30 21:28:18 +00:00
Michael Liao
bb05a1d7b5 Enhance folding of (extract_subvec (insert_subvec V1, V2, IIdx), EIdx)
- Handle the case where the result of 'insert_subvect' is bitcasted
  before 'extract_subvec'. This removes the redundant insertf128/extractf128
  pair on unaligned 256-bit vector load/store on vectors of non 64-bit integer.

llvm-svn: 177945
2013-03-25 23:47:35 +00:00
Shuxin Yang
93b1f12ac1 Disable some unsafe-fp-math DAG-combine transformation after legalization.
For instance, following transformation will be disabled:
    x + x + x => 3.0f * x;

The problem of these transformations is that it introduces a FP constant, which
following Instruction-Selection pass cannot handle.

Reviewed by Nadav, thanks a lot!

rdar://13445387

llvm-svn: 177933
2013-03-25 22:52:29 +00:00
Richard Relph
61046a9727 Avoid generating ISD::SELECT for vector operands to SIGN_EXTEND
llvm-svn: 176881
2013-03-12 18:17:18 +00:00
Tom Stellard
b1588fc057 DAGCombiner: Use correct value type for checking legality of BR_CC v3
LegalizeDAG.cpp uses the value of the comparison operands when checking
the legality of BR_CC, so DAGCombiner should do the same.

v2:
  - Expand more BR_CC value types for NVPTX

v3:
  - Expand correct BR_CC value types for Hexagon, Mips, and XCore.

llvm-svn: 176694
2013-03-08 15:36:57 +00:00
Benjamin Kramer
3238dc0c61 DAGCombiner: Make the post-legalize vector op optimization more aggressive.
A legal BUILD_VECTOR goes in and gets constant folded into another legal
BUILD_VECTOR so we don't lose any legality here. The problematic PPC
optimization that made this check necessary was fixed recently.

llvm-svn: 175759
2013-02-21 15:24:35 +00:00
Arnold Schwaighofer
3f9568e921 DAGCombiner: Fold pointless truncate, bitcast, buildvector series
(2xi32) (truncate ((2xi64) bitcast (buildvector i32 a, i32 x, i32 b, i32 y)))
can be folded into a (2xi32) (buildvector i32 a, i32 b).

Such a DAG would cause uneccessary vdup instructions followed by vmovn
instructions.

We generate this code on ARM NEON for a setcc olt, 2xf64, 2xf64. For example, in
the vectorized version of the code below.

double A[N];
double B[N];

void test_double_compare_to_double() {
  int i;
  for(i=0;i<N;i++)
    A[i] = (double)(A[i] < B[i]);
}

radar://13191881

Fixes bug 15283.

llvm-svn: 175670
2013-02-20 21:33:32 +00:00
Nadav Rotem
495b1a43c1 Dont merge consecutive loads/stores into vectors when noimplicitfloat is used.
llvm-svn: 175190
2013-02-14 18:28:52 +00:00