Commit Graph

733 Commits

Duncan Sands
37c1f5267b Allow these transforms for types like i256 while
still excluding types like i1 (not byte sized)
and i120 (loading an i120 requires loading an i64,
an i32, an i16 and an i8, which is expensive). 

llvm-svn: 52310
2008-06-16 08:14:38 +00:00
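
The width test described here amounts to "a power-of-two number of bits, at least a byte"; a minimal sketch of such a predicate (hypothetical helper, not the tree's code):

// True for i8, i32, i256; false for i1 (not byte sized) and for
// i120 (15 bytes: loading it takes an i64, an i32, an i16 and an i8).
static bool isRoundWidth(unsigned BitWidth) {
  return BitWidth >= 8 && (BitWidth & (BitWidth - 1)) == 0;
}
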
Duncan Sands
075293ff46 The transforms in visitEXTRACT_VECTOR_ELT are
not valid if the load is volatile.  Hopefully
all wrong DAG combiner transforms of volatile
loads and stores have now been caught.

llvm-svn: 52293
2008-06-15 20:12:31 +00:00
Duncan Sands
b1bfff53fe Remove a redundant AfterLegalize check. Turn
on some code when !AfterLegalize - but since
this whole code section is turned off by an
"if (0)" it's not really turning anything on.

llvm-svn: 52276
2008-06-14 17:48:34 +00:00
Duncan Sands
8651e9c584 Disable some DAG combiner optimizations that may be
wrong for volatile loads and stores.  In fact this
is almost all of them!  There are three types of
problems: (1) it is wrong to change the width of
a volatile memory access.  These may be used to
do memory mapped i/o, in which case a load can have
an effect even if the result is not used.  Consider
loading an i32 but only using the lower 8 bits.  It
is wrong to change this into a load of an i8, because
you are no longer tickling the other three bytes.  It
is also unwise to make a load/store wider.  For
example, changing an i16 load into an i32 load is
wrong no matter how aligned things are, since the
fact of loading an additional 2 bytes can have
i/o side-effects.  (2) it is wrong to change the
number of volatile load/stores: they may be counted
by the hardware.  (3) it is wrong to change a volatile
load/store that requires one memory access into one
that requires several.  For example on x86-32, you
can store a double in one processor operation, but to
store an i64 requires two (two i32 stores).  In a
multi-threaded program you may want to bitcast an i64
to a double and store as a double because that will
occur atomically, and be indivisible to other threads.
So it would be wrong to convert the store-of-double
into a store of an i64, because this will become two
i32 stores - no longer atomic.  My policy here is
to say that the number of processor operations for
an illegal operation is undefined.  So it is alright
to change a store of an i64 (requires at least two
stores; but could be validly lowered to memcpy for
example) into a store of double (one processor op).
In short, if the new store is legal and has the same
size then I say that the transform is ok.  It would
also be possible to say that transforms are always
ok if before they were illegal, whether after they
are illegal or not, but that's more awkward to do
and I doubt it buys us anything much.
However this exposed an interesting thing - on x86-32
a store of i64 is considered legal!  That is because
operations are marked legal by default, regardless of
whether the type is legal or not.  In some ways this
is clever: before type legalization this means that
operations on illegal types are considered legal;
after type legalization there are no illegal types
so now operations are only legal if they really are.
But I consider this to be too cunning for mere mortals.
Better to do things explicitly by testing AfterLegalize.
So I have changed things so that operations with illegal
types are considered illegal - indeed they can never
map to a machine operation.  However this means that
the DAG combiner is more conservative because before
it was "accidentally" performing transforms where the
type was illegal because the operation was nonetheless
marked legal.  So in a few such places I added a check
on AfterLegalize, which I suppose was actually just
forgotten before.  This causes the DAG combiner to do
slightly more than it used to, which resulted in the X86
backend blowing up because it got a slightly surprising
node it wasn't expecting, so I tweaked it.

llvm-svn: 52254
2008-06-13 19:07:40 +00:00
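
A small C++ illustration of point (1), using a hypothetical memory-mapped device register (not from the tree): the access width is part of the semantics, so narrowing a volatile load is observable.

#include <cstdint>

// STATUS is assumed to be a memory-mapped 32-bit device register.
// Reading it as one 32-bit access may clear it or advance a FIFO;
// reading only one of its bytes is a different bus transaction with
// different side effects.
volatile uint32_t *const STATUS =
    reinterpret_cast<volatile uint32_t *>(0x40000000);

uint8_t low_byte() {
  uint32_t v = *STATUS;            // must remain a single i32 load
  return static_cast<uint8_t>(v);  // narrowing happens in a register,
                                   // not by shrinking the memory access
}
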
Duncan Sands
bf17080ec2 Sometimes (rarely) nodes held in LegalizeTypes
maps can be deleted.  This happens when RAUW
replaces a node N with another equivalent node
E, deleting the first node.  Solve this by
adding (N, E) to ReplacedNodes, which is already
used to remap nodes to replacements.  This means
that deleted nodes are being allowed in maps,
which can be delicate: the memory may be reused
for a new node which might get confused with the
old deleted node pointer hanging around in the
maps, so detect this and flush out maps if it
occurs (ExpungeNode).  The expunging operation
is expensive, however it never occurs during
a llvm-gcc bootstrap or anywhere in the nightly
testsuite.  It occurs three times in "make check":
Alpha/illegal-element-type.ll,
PowerPC/illegal-element-type.ll and
X86/mmx-shift.ll.  If expunging proves to be too
expensive then there are other more complicated
ways of solving the problem.
In the normal case this patch adds the overhead
of a few more map lookups, which is hopefully
negligible.

llvm-svn: 52214
2008-06-11 11:42:12 +00:00
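
The remapping is transitive, since a replacement node may itself have been replaced later; a minimal sketch of the lookup (assumed shape, the real LegalizeTypes code differs in detail):

#include <map>

// ReplacedNodes maps a dead node to the node that replaced it; chase
// the chain of replacements until a live node is reached.
static SDNode *remapNode(SDNode *N,
                         const std::map<SDNode *, SDNode *> &ReplacedNodes) {
  for (auto I = ReplacedNodes.find(N); I != ReplacedNodes.end();
       I = ReplacedNodes.find(N))
    N = I->second;
  return N;
}
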
Duncan Sands
67d0f332d5 Various tweaks related to apint codegen. No functionality
change for non-funky-sized integers.

llvm-svn: 52151
2008-06-09 15:48:25 +00:00
Duncan Sands
93b6609ae2 Remove some DAG combiner assumptions about sizes
of integer types.  Fix the isMask APInt method to
actually work (hopefully) rather than crashing
because it adds apints of different bitwidths.
It looks like isShiftedMask is also broken, but
I'm leaving that one to the APInt people (it is
not used anywhere).

llvm-svn: 52142
2008-06-09 11:32:28 +00:00
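
For reference, a correct mask test needs no cross-width arithmetic: a value is a mask (a contiguous run of low ones) exactly when it is nonzero and adding one clears every set bit. A sketch on a plain integer (the APInt version is analogous):

#include <cstdint>

// True for values of the form 0...01...1 (e.g. 0x0000FFFF); false
// for 0 and for non-contiguous patterns (e.g. 0x0F0F).
static bool isMask(uint64_t V) {
  return V != 0 && ((V + 1) & V) == 0;
}
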
Duncan Sands
11dd424539 Remove comparison methods for MVT. The main cause
of apint codegen failure is the DAG combiner doing
the wrong thing because it was comparing MVT's using
< rather than comparing the number of bits.  Removing
the < method makes this mistake impossible to commit.
Instead, add helper methods for comparing bits and use
them.

llvm-svn: 52098
2008-06-08 20:54:56 +00:00
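
A sketch of the helpers' shape (in the tree they are MVT members; free functions here for brevity):

// Compare widths explicitly; the removed operator< compared enum
// order, which is meaningless once vector types are involved.
bool bitsLT(MVT A, MVT B) { return A.getSizeInBits() <  B.getSizeInBits(); }
bool bitsLE(MVT A, MVT B) { return A.getSizeInBits() <= B.getSizeInBits(); }
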
Duncan Sands
13237ac3b9 Wrap MVT::ValueType in a struct to get type safety
and better control the abstraction.  Rename the type
to MVT.  To update out-of-tree patches, the main
thing to do is to rename MVT::ValueType to MVT, and
rewrite expressions like MVT::getSizeInBits(VT) in
the form VT.getSizeInBits().  Use VT.getSimpleVT()
to extract a MVT::SimpleValueType for use in switch
statements (you will get an assert failure if VT is
an extended value type - these shouldn't exist after
type legalization).
This results in a small speedup of codegen and no
new testsuite failures (x86-64 linux).

llvm-svn: 52044
2008-06-06 12:08:01 +00:00
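
For out-of-tree code the mechanical update looks roughly like this (illustrative fragment, not from a real patch):

// Before: MVT::ValueType was a bare enum value.
MVT::ValueType VT = Op.getValueType();
unsigned Bits = MVT::getSizeInBits(VT);

// After: MVT is a struct, so the queries become member functions.
MVT VT = Op.getValueType();
unsigned Bits = VT.getSizeInBits();
switch (VT.getSimpleVT()) {  // asserts if VT is an extended type
default: break;
}
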
Dan Gohman
643b3a0581 Add #includes to make some dependencies explicit.
llvm-svn: 51496
2008-05-23 20:40:06 +00:00
Dan Gohman
c1a4e212a3 Code simplification.
llvm-svn: 51345
2008-05-20 20:56:33 +00:00
Evan Cheng
1120279ae6 Instead of doing a vector load, a shuffle, and then extracting an element, load the element directly from the address at an offset.
pshufd $1, (%rdi), %xmm0
        movd %xmm0, %eax
=>
        movl 4(%rdi), %eax

llvm-svn: 51026
2008-05-13 08:35:03 +00:00
Evan Cheng
b980f6fb3d Xform bitconvert(build_pair(load a, load b)) to a single load if the load locations are at the right offset from each other.
llvm-svn: 51008
2008-05-12 23:04:07 +00:00
Dan Gohman
c968c1f592 Evan pointed out that folding sext to zext may not be correct
if the zext is not legal.

llvm-svn: 50368
2008-04-28 18:47:17 +00:00
Dan Gohman
3eb10f758e Teach DAGCombine to convert (sext x) to (zext x) when the
sign-bit of x is known to be zero.

llvm-svn: 50357
2008-04-28 16:58:24 +00:00
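
Taken together with the follow-up fix just above it, the combine needs both a known-bits check and a legality check; a hedged sketch of the shape (not the exact tree code):

// fold (sext x) -> (zext x) when the sign bit of x is known zero,
// but only if zext is legal here (or we are still pre-legalize).
if (DAG.SignBitIsZero(N0) &&
    (!AfterLegalize || TLI.isOperationLegal(ISD::ZERO_EXTEND, VT)))
  return DAG.getNode(ISD::ZERO_EXTEND, VT, N0);
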
Roman Levenstein
a3ee1a38a3 Ongoing work on improving the instruction selection infrastructure:
Rename SDOperandImpl back to SDOperand.
Introduce the SDUse class that represents a use of the SDNode referred by
an SDOperand. Now it is more similar to Use/Value classes.

Patch is approved by Dan Gohman.

llvm-svn: 49795
2008-04-16 16:15:27 +00:00
Roman Levenstein
51f532f92d Re-commit of the r48822, where the infinite looping problem discovered
by Dan Gohman is fixed.

llvm-svn: 49330
2008-04-07 10:06:32 +00:00
Evan Cheng
025cea1126 Backing out 48822 temporarily.
llvm-svn: 49124
2008-04-03 03:13:16 +00:00
Dan Gohman
f549b26254 Fix a DAGCombiner optimization to respect volatile qualification.
llvm-svn: 48994
2008-03-31 20:32:52 +00:00
Roman Levenstein
358e04a185 Use a linked data structure for the uses lists of an SDNode, just like
LLVM Value/Use does and MachineRegisterInfo/MachineOperand does.
This allows constant time for all uses list maintenance operations.

The idea was suggested by Chris. Reviewed by Evan and Dan.
Patch is tested and approved by Dan.

On normal use-cases compilation speed is not affected. On very big basic
blocks there are compilation speedups in the range of 15-20% or even better. 

llvm-svn: 48822
2008-03-26 12:39:26 +00:00
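
The constant-time claim comes from each use record carrying prev/next links, so insertion and removal never scan the list; a compact sketch of the idea (simplified, not the actual SDUse layout):

struct Use {
  Use **Prev = nullptr;  // address of the link that points at this use
  Use *Next = nullptr;

  void addToList(Use **List) {  // O(1) push-front
    Next = *List;
    if (Next) Next->Prev = &Next;
    Prev = List;
    *List = this;
  }
  void removeFromList() {       // O(1) unlink
    *Prev = Next;
    if (Next) Next->Prev = Prev;
  }
};
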
Evan Cheng
df1690dc7c Handle a special case xor undef, undef -> 0. Technically this should be transformed to undef. But this is such a common idiom (misuse) we are going to handle it.
llvm-svn: 48792
2008-03-25 20:08:07 +00:00
Evan Cheng
fe7610f37f Remove an unneeded test.
llvm-svn: 48755
2008-03-24 23:55:16 +00:00
Evan Cheng
31604a62f6 Teach DAG combiner to commute commutable binary nodes in order to achieve sdisel CSE.
llvm-svn: 48673
2008-03-22 01:55:50 +00:00
Christopher Lamb
3e9f49716e Check even more carefully before applying this DAGCombine transform.
llvm-svn: 48580
2008-03-20 04:31:39 +00:00
Evan Cheng
7a3e750fd2 Fix this xform: (sra (shl X, m), result_size) -> (sign_extend (trunc (shl X, result_size - n - m)))
llvm-svn: 48578
2008-03-20 02:18:41 +00:00
Christopher Lamb
8fe9109469 Fix X86's isTruncateFree to not claim that truncate to i1 is free. This fixes Bill's testcase that failed for r48491.
llvm-svn: 48542
2008-03-19 08:30:06 +00:00
Bill Wendling
efb4d9ef80 Temporarily revert r48491. It's breaking test/CodeGen/X86/xorl.ll.
llvm-svn: 48510
2008-03-18 22:29:51 +00:00
Christopher Lamb
3e408d4d82 Target independent DAG transform to use truncate for field extraction + sign extend on targets where this is profitable. Passes nightly on x86-64.
llvm-svn: 48491
2008-03-18 16:46:39 +00:00
Dan Gohman
b72127ac4c More APInt-ification.
llvm-svn: 48344
2008-03-13 22:13:53 +00:00
Evan Cheng
99ee78ef63 Clean up my own mess.
X86 lowering normalizes vector 0 to v4i32. However, DAGCombine can fold (sub x, x) -> 0 after legalization. It can create a zero vector of a type that's not expected (e.g. v8i16). We don't want to disable the optimization since leaving a (sub x, x) is really bad. Add isel patterns for other types of vector 0 to ensure correctness. It's highly unlikely to happen other than in bugpoint-reduced test cases.

llvm-svn: 48279
2008-03-12 07:02:50 +00:00
Evan Cheng
0903aef2ff Total brain cramp.
llvm-svn: 48274
2008-03-12 02:05:05 +00:00
Evan Cheng
b9e4280e94 Somewhat better solution.
llvm-svn: 48170
2008-03-10 19:58:22 +00:00
Scott Michel
a6729e8666 Give TargetLowering::getSetCCResultType() a parameter so that ISD::SETCC's
return ValueType can depend on its operands' ValueType.

This is a cosmetic change, no functionality impacted.

llvm-svn: 48145
2008-03-10 15:42:14 +00:00
Evan Cheng
831ae49599 Doh
llvm-svn: 48140
2008-03-10 07:59:01 +00:00
Evan Cheng
b5d11980d9 Avoid creating BUILD_VECTOR of all zero elements of "non-normalized" type (e.g. v8i16 on x86) after the legalizer. Instruction selection does not expect to see them. In all likelihood this can only be an issue in a bugpoint-reduced test case.
llvm-svn: 48136
2008-03-10 07:19:13 +00:00
Evan Cheng
567d2e5b57 Rename isOperand() to isOperandOf() (and other similar methods). It always confuses me.
llvm-svn: 47872
2008-03-04 00:41:45 +00:00
Dan Gohman
e1c4f99549 Misc. APInt-ification in the DAGCombiner.
llvm-svn: 47869
2008-03-03 23:51:38 +00:00
Dan Gohman
ae2b6fbb8e Convert SimplifyDemandedMask and ShrinkDemandedConstant to use APInt.
Change several cases in SimplifyDemandedMask that don't ever do any
simplifying to reuse the logic in ComputeMaskedBits instead of
duplicating it.

llvm-svn: 47648
2008-02-27 00:25:32 +00:00
Chris Lattner
07c83cc86e Fix PR2096, a regression introduced with my patch last night. This
also fixes cfrac, flops, and 175.vpr

llvm-svn: 47605
2008-02-26 17:09:59 +00:00
Chris Lattner
e7c14013f5 Fix isNegatibleForFree to not return true for ConstantFP nodes
after legalize.  Just because a constant is legal (e.g. 0.0 in SSE) 
doesn't mean that its negated value is legal (-0.0).  We could make
this stronger by checking to see if the negated constant is actually
legal post negation, but it doesn't seem like a big deal.

llvm-svn: 47591
2008-02-26 07:04:54 +00:00
Dan Gohman
1f372edd97 Convert MaskedValueIsZero and all its users to use APInt. Also add
a SignBitIsZero function to simplify a common use case.

llvm-svn: 47561
2008-02-25 21:11:39 +00:00
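
The SignBitIsZero convenience presumably reduces to a single masked query; a plausible sketch (modern spellings assumed):

// Common pattern: "is the sign bit of Op known to be zero?"
bool SelectionDAG::SignBitIsZero(SDValue Op) const {
  unsigned BitWidth = Op.getValueType().getSizeInBits();
  return MaskedValueIsZero(Op, APInt::getSignBit(BitWidth));
}
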
Dan Gohman
360c86aed5 Add explicit keywords.
llvm-svn: 47382
2008-02-20 16:44:09 +00:00
Dan Gohman
d0ff91dac5 Convert DAGCombiner to use the APInt form of ComputeMaskedBits.
llvm-svn: 47381
2008-02-20 16:33:30 +00:00
Anton Korobeynikov
035eaacd1f Update gcc 4.3 warnings fix patch with recent head changes
llvm-svn: 47368
2008-02-20 11:10:28 +00:00
Evan Cheng
6200c225e0 - When the DAG combiner is folding a bit convert into a BUILD_VECTOR, it should check whether it's essentially a SCALAR_TO_VECTOR. Avoid turning (v8i16) <10, u, u, u> into <10, 0, u, u, u, u, u, u>. Instead, simply convert it to a SCALAR_TO_VECTOR of the proper type.
- X86 now normalizes SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC.

llvm-svn: 47290
2008-02-18 23:04:32 +00:00
Chris Lattner
ee322b44a4 teach dag combiner how to eliminate MERGE_VALUES nodes.
llvm-svn: 47052
2008-02-13 07:25:05 +00:00
Duncan Sands
7377f5fbe3 Add a isBigEndian method to complement isLittleEndian.
llvm-svn: 46954
2008-02-11 10:37:04 +00:00
Bill Wendling
9c2ce9a32d Return "(c1 + c2)" instead of yet another ADD node (which made this a
no-op).

llvm-svn: 46922
2008-02-10 08:10:24 +00:00
Chris Lattner
d2d166ea9f the world doesn't need my debugging code.
llvm-svn: 46678
2008-02-03 07:01:05 +00:00
Chris Lattner
b2b9d6f0fb Change the 'global modification' APIs in SelectionDAG to take a new
DAGUpdateListener object pointer instead of just returning a vector 
of deleted nodes.  This makes the interfaces more efficient (no more
allocating a vector [at least a malloc], filling it in, then walking
it) and more clean.  This also allows the client to be notified of
nodes that are *changed* but not deleted.

llvm-svn: 46677
2008-02-03 06:49:24 +00:00
Dan Gohman
47a7d6fafe Factor the addressing mode and the load/store VT out of LoadSDNode
and StoreSDNode into their common base class LSBaseSDNode. Member
functions getLoadedVT and getStoredVT are replaced with the common
getMemoryVT to simplify code that will handle both loads and stores.

llvm-svn: 46538
2008-01-30 00:15:11 +00:00
Dan Gohman
70de4cb1cd Use empty() instead of comparing size() with zero.
llvm-svn: 46514
2008-01-29 13:02:09 +00:00
Chris Lattner
2ee91f4300 Fix PowerPC/./2007-10-18-PtrArithmetic.ll
llvm-svn: 46424
2008-01-27 23:32:17 +00:00
Chris Lattner
d0496d0433 fix a crash on CodeGen/X86/vector-rem.ll
llvm-svn: 46422
2008-01-27 23:21:58 +00:00
Chris Lattner
888560d62c Implement some dag combines that allow doing fneg/fabs/fcopysign in integer
registers if used by a bitconvert or using a bitconvert.  This allows us to
avoid constant pool loads and use cheaper integer instructions when the
values come from or end up in integer regs anyway.  For example, we now 
compile CodeGen/X86/fp-in-intregs.ll to:

_test1:
	movl	$2147483648, %eax
	xorl	4(%esp), %eax
	ret
_test2:
	movl	$1065353216, %eax
	orl	4(%esp), %eax
	andl	$3212836864, %eax
	ret

Instead of:
_test1:
	movss	4(%esp), %xmm0
	xorps	LCPI2_0, %xmm0
	movd	%xmm0, %eax
	ret
_test2:
	movss	4(%esp), %xmm0
	andps	LCPI3_0, %xmm0
	movss	LCPI3_1, %xmm1
	andps	LCPI3_2, %xmm1
	orps	%xmm0, %xmm1
	movd	%xmm1, %eax
	ret

bitconverts can happen due to various calling conventions that require
fp values to be passed in integer regs in some cases, e.g. when returning
a complex.

llvm-svn: 46414
2008-01-27 17:42:27 +00:00
Chris Lattner
e30e33af4f Infer alignment of loads and increase their alignment when we can tell they are
from the stack.  This allows us to compile stack-align.ll to:

_test:
	movsd	LCPI1_0, %xmm0
	movapd	%xmm0, %xmm1
***	andpd	4(%esp), %xmm1
	andpd	_G, %xmm0
	addsd	%xmm1, %xmm0
	movl	20(%esp), %eax
	movsd	%xmm0, (%eax)
	ret

instead of:

_test:
	movsd	LCPI1_0, %xmm0
**	movsd	4(%esp), %xmm1
**	andpd	%xmm0, %xmm1
	andpd	_G, %xmm0
	addsd	%xmm1, %xmm0
	movl	20(%esp), %eax
	movsd	%xmm0, (%eax)
	ret

llvm-svn: 46401
2008-01-26 19:45:50 +00:00
Chris Lattner
31e9edce1c Fix some bugs in SimplifyNodeWithTwoResults where it would call deletenode to
delete a node even if it was not dead in some cases.  Instead, just add it to
the worklist.  Also, make sure to use the CombineTo methods, as it was doing
things that were unsafe: the top level combine loop could touch dangling memory.

This fixes CodeGen/Generic/2008-01-25-dag-combine-mul.ll

llvm-svn: 46384
2008-01-26 01:09:19 +00:00
Chris Lattner
cb3cf546c3 reduce indentation
llvm-svn: 46377
2008-01-25 23:34:24 +00:00
Chris Lattner
2d7a830ff3 Add skeletal code to increase the alignment of loads and stores when
we can infer it.  This will eventually help stuff, though it doesn't
do much right now because all fixed FI's have an alignment of 1.

llvm-svn: 46349
2008-01-25 07:20:16 +00:00
Chris Lattner
34ed27c46d clarify a comment, thanks Duncan.
llvm-svn: 46313
2008-01-24 17:10:01 +00:00
Chris Lattner
e97fa8cdf0 Fix this buggy transformation. Two observations:
1. we already know the value is dead, so don't bother replacing 
   it with undef.
2. The very case the comment describes actually makes the load
   live which asserts in deletenode.  If we do the replacement
   and the node becomes live, just treat it as new.  This fixes
   a failure on X86/2008-01-16-InvalidDAGCombineXform.ll with
   some local changes in my tree.

llvm-svn: 46306
2008-01-24 07:57:06 +00:00
Chris Lattner
d66eac62fd The dag combiner is missing revisiting nodes that it really should, and thus leaving
dead stuff around.  This gets fed into the isel pass and keeps certain foldings from
happening because nodes have extraneous uses floating around.  For example, if we turned
foo(bar(x)) -> baz(x), we sometimes left bar(x) around.

llvm-svn: 46305
2008-01-24 07:18:21 +00:00
Chris Lattner
0feb1b0f84 fold fp_round(fp_round(x)) -> fp_round(x).
llvm-svn: 46304
2008-01-24 06:45:35 +00:00
Chris Lattner
1ea55cf816 This commit changes:
1. Legalize now always promotes truncstore of i1 to i8. 
2. Remove patterns and gunk related to truncstore i1 from targets.
3. Rename the StoreXAction stuff to TruncStoreAction in TLI.
4. Make the TLI TruncStoreAction table a 2d table to handle from/to conversions.
5. Mark a wide variety of invalid truncstores as such in various targets, e.g.
   X86 currently doesn't support truncstore of any of its integer types.
6. Add legalize support for truncstores with invalid value input types.
7. Add a dag combine transform to turn store(truncate) into truncstore when
   safe.

The latter allows us to compile CodeGen/X86/storetrunc-fp.ll to:

_foo:
	fldt	20(%esp)
	fldt	4(%esp)
	faddp	%st(1)
	movl	36(%esp), %eax
	fstps	(%eax)
	ret

instead of:

_foo:
	subl	$4, %esp
	fldt	24(%esp)
	fldt	8(%esp)
	faddp	%st(1)
	fstps	(%esp)
	movl	40(%esp), %eax
	movss	(%esp), %xmm0
	movss	%xmm0, (%eax)
	addl	$4, %esp
	ret

llvm-svn: 46140
2008-01-17 19:59:44 +00:00
Chris Lattner
7eabed3521 code cleanups, no functionality change.
llvm-svn: 46126
2008-01-17 07:20:38 +00:00
Chris Lattner
72733e573b * Introduce a new SelectionDAG::getIntPtrConstant method
and switch various codegen pieces and the X86 backend over
  to using it.

* Add some comments to SelectionDAGNodes.h

* Introduce a second argument to FP_ROUND, which indicates
  whether the FP_ROUND changes the value of its input. If
  not it is safe to xform things like fp_extend(fp_round(x)) -> x.

llvm-svn: 46125
2008-01-17 07:00:52 +00:00
Evan Cheng
7be1528004 Fixes a nasty dag combiner bug that causes a bunch of tests to fail at -O0.
It's not safe to use the two value CombineTo variant to combine away a dead load.
e.g. 
v1, chain2 = load chain1, loc
v2, chain3 = load chain2, loc
v3         = add v2, c 
Now we replace use of v1 with undef, use of chain2 with chain1.
ReplaceAllUsesWith() will iterate through uses of the first load and update operands:
v1, chain2 = load chain1, loc
v2, chain3 = load chain1, loc
v3         = add v2, c 
Now the second load is the same as the first load, SelectionDAG cse will ensure
the use of second load is replaced with the first load.
v1, chain2 = load chain1, loc
v3         = add v1, c
Then v1 is replaced with undef and bad things happen.

llvm-svn: 46099
2008-01-16 23:11:54 +00:00
Chris Lattner
2e50a6f90f Factor the ReachesChainWithoutSideEffects out of dag combiner into
a public SDOperand::reachesChainWithoutSideEffects method.  No 
functionality change.

llvm-svn: 46050
2008-01-16 05:49:24 +00:00
Chris Lattner
51b01bf8a5 Make load->store deletion a bit smarter. This allows us to compile this:
void test(long long *P) { *P ^= 1; }

into just:

_test:
	movl	4(%esp), %eax
	xorl	$1, (%eax)
	ret

instead of code like this:

_test:
	movl	4(%esp), %ecx
        xorl    $1, (%ecx)
	movl	4(%ecx), %edx
	movl	%edx, 4(%ecx)
	ret

llvm-svn: 45762
2008-01-08 23:08:06 +00:00
Chris Lattner
f3ebc3f3d2 Remove attribution from file headers, per discussion on llvmdev.
llvm-svn: 45418
2007-12-29 20:36:04 +00:00
Chris Lattner
2de9b85297 make sure not to zap volatile stores, thanks a lot to Dale for noticing this!
llvm-svn: 45402
2007-12-29 07:15:45 +00:00
Chris Lattner
5919b48fe9 don't fold fp_round(fp_extend(load)) -> fp_round(extload)
llvm-svn: 45400
2007-12-29 06:55:23 +00:00
Chris Lattner
3f9c6a7260 Delete a store whose input is a load from the same pointer:
x = load p
  store x -> p

llvm-svn: 45398
2007-12-29 06:26:16 +00:00
Chris Lattner
efd1cddb5a Tell TargetLoweringOpt whether it is running before
or after legalize.

llvm-svn: 45321
2007-12-22 20:56:36 +00:00
Evan Cheng
9f06e5e2df Don't leave newly created nodes around if it turns out they are not needed.
llvm-svn: 45186
2007-12-19 01:34:38 +00:00
Dale Johannesen
5eff4de9c8 Redo previous patch so optimization only done for i1.
Simpler and safer.

llvm-svn: 44663
2007-12-06 17:53:31 +00:00
Chris Lattner
eedaf92fcf third time around: instead of disabling this completely,
only disable it if we don't know it will be obviously profitable.
Still fixme, but less so. :)

llvm-svn: 44658
2007-12-06 07:47:55 +00:00
Chris Lattner
b5fdfb9612 Actually, disable this code for now. More analysis and improvements to
the X86 backend are needed before this should be enabled by default.

llvm-svn: 44657
2007-12-06 07:44:31 +00:00
Chris Lattner
7c709a5d08 implement a readme entry, compiling the code into:
_foo:
	movl	$12, %eax
	andl	4(%esp), %eax
	movl	_array(%eax), %eax
	ret

instead of:

_foo:
	movl	4(%esp), %eax
	shrl	$2, %eax
	andl	$3, %eax
	movl	_array(,%eax,4), %eax
	ret

As it turns out, this triggers all the time, in a wide variety of
situations, for example, I see diffs like this in various programs:

-       movl    8(%eax), %eax
-       shll    $2, %eax
-       andl    $1020, %eax
-       movl    (%esi,%eax), %eax
+       movzbl  8(%eax), %eax
+       movl    (%esi,%eax,4), %eax


-       shll    $2, %edx
-       andl    $1020, %edx
-       movl    (%edi,%edx), %edx
+       andl    $255, %edx
+       movl    (%edi,%edx,4), %edx

Unfortunately, I also see stuff like this, which can be fixed in the
X86 backend:

-       andl    $85, %ebx
-       addl    _bit_count(,%ebx,4), %ebp
+       shll    $2, %ebx
+       andl    $340, %ebx
+       addl    _bit_count(%ebx), %ebp

llvm-svn: 44656
2007-12-06 07:33:36 +00:00
Dale Johannesen
05bbbda78a Fix PR1842.
llvm-svn: 44649
2007-12-06 01:43:46 +00:00
Dan Gohman
9a69341725 Don't lower srem/urem X%C to X-X/C*C unless the division is actually
optimized. This avoids creating illegal divisions when the combiner is
running after legalize; this fixes PR1815. Also, it produces better
code in the included testcase by avoiding the subtract and multiply
when the division isn't optimized.

llvm-svn: 44341
2007-11-26 23:46:11 +00:00
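
The rewrite in question is the usual remainder-by-constant expansion; a sketch of its shape with the guard this patch adds (combineDivLike is a hypothetical stand-in for re-running the division combine):

// X % C  ==>  X - (X / C) * C, but only when the division by the
// constant itself folds to multiplies/shifts; otherwise we would wrap
// a real (and, post-legalize, possibly illegal) division in an extra
// mul and sub for nothing.
SDValue Div = DAG.getNode(ISD::SDIV, VT, N0, N1);
if (SDValue OptimizedDiv = combineDivLike(Div))  // hypothetical helper
  return DAG.getNode(ISD::SUB, VT, N0,
                     DAG.getNode(ISD::MUL, VT, OptimizedDiv, N1));
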
Duncan Sands
e795efea5b Move MinAlign to MathExtras.h.
llvm-svn: 43944
2007-11-09 13:41:39 +00:00
Duncan Sands
e7a9ac929f Fix some load/store logic that would be wrong for
apints on big-endian machines if the bitwidth is
not a multiple of 8.  Introduce a new helper,
MVT::getStoreSizeInBits, and use it.

llvm-svn: 43934
2007-11-09 08:57:19 +00:00
Evan Cheng
ece4c68b82 If both parts of smul_lohi, etc. are used, don't simplify. If only one part is used, try to simplify it.
llvm-svn: 43888
2007-11-08 09:25:29 +00:00
Evan Cheng
0747bc1df6 Typo.
llvm-svn: 43511
2007-10-30 20:11:21 +00:00
Dan Gohman
ae95d72a52 Fix a DAGCombiner abort on a bitcast from a scalar to a vector.
llvm-svn: 43470
2007-10-29 20:44:42 +00:00
Evan Cheng
e106e2f142 Enable the fold (sext (load x)) -> (sext (truncate (sextload x)))
in more cases. Previously it was restricted by requiring that the load have
exactly one use. Now the restriction is loosened by also allowing setcc uses to be
"extended" (e.g. setcc x, c, eq -> setcc sext(x), sext(c), eq).

llvm-svn: 43465
2007-10-29 19:58:20 +00:00
Duncan Sands
1826deda68 The guaranteed alignment of ptr+offset is only the minimum
of offset and the alignment of ptr if these are both powers of
2.  While the ptr alignment is guaranteed to be a power of 2,
there is no reason to think that offset is.  For example, if
offset is 12 (the size of a long double on x86-32 linux) and
the alignment of ptr is 8, then the alignment of ptr+offset
will in general be 4, not 8.  Introduce a function MinAlign,
lifted from gcc, for computing the minimum guaranteed alignment.
I've tried to fix up everywhere under lib/CodeGen/SelectionDAG/.
I also changed some places that weren't wrong (because both values
were a power of 2), as a defensive change against people copying
and pasting the code.
Hopefully someone who cares about alignment will review the rest
of LLVM and fix up the remaining places.  Since I'm on x86 I'm
not very motivated to do this myself...

llvm-svn: 43421
2007-10-28 12:59:45 +00:00
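
The helper is the classic lowest-set-bit trick: the largest power of two dividing both values, which for power-of-two alignments is exactly the guaranteed alignment of the sum. A sketch consistent with the description above:

#include <cstdint>

// Smallest power of two dividing both A and B, i.e. the lowest set
// bit of (A | B).  For the example above, MinAlign(8, 12) == 4.
static uint64_t MinAlign(uint64_t A, uint64_t B) {
  return (A | B) & (~(A | B) + 1);
}
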
Dale Johannesen
6802d0c96f Redo "last ppc long double fix" as Chris wants.
llvm-svn: 43189
2007-10-19 20:29:00 +00:00
Dale Johannesen
10432e5a67 More ppcf128 issues (maybe the last)?
llvm-svn: 43160
2007-10-19 00:59:18 +00:00
Dale Johannesen
e5facd51cb Disable attempts to constant fold PPC f128.
Remove the assumption that this will happen from
various places.

llvm-svn: 43053
2007-10-16 23:38:29 +00:00
Chris Lattner
3cfb56d489 One mundane change: Change ReplaceAllUsesOfValueWith to *optionally*
take a deleted nodes vector, instead of requiring it.

One more significant change:  Implement the start of a legalizer that
just works on types.  This legalizer is designed to run before the 
operation legalizer and ensure just that the input dag is transformed
into an output dag whose operand and result types are all legal, even
if the operations on those types are not.

This design/impl has the following advantages:

1. When finished, this will *significantly* reduce the amount of code in
   LegalizeDAG.cpp.  It will remove all the code related to promotion and
   expansion as well as splitting and scalarizing vectors.
2. The new code is very simple, idiomatic, and modular: unlike 
   LegalizeDAG.cpp, it has no 3000 line long functions. :)
3. The implementation is completely iterative instead of recursive, good
   for hacking on large dags without blowing out your stack.
4. The implementation updates nodes in place when possible instead of 
   deallocating and reallocating the entire graph that points to some 
   mutated node.
5. The code nicely separates out handling of operations with invalid 
   results from operations with invalid operands, making some cases
   simpler and easier to understand.
6. The new -debug-only=legalize-types option is very very handy :), 
   allowing you to easily understand what legalize types is doing.

This is not yet done.  Until the ifdef added to SelectionDAGISel.cpp is
enabled, this does nothing.  However, this code is sufficient to legalize
all of the code in 186.crafty, olden and freebench on an x86 machine.  The
biggest issues are:

1. Vectors aren't implemented at all yet
2. SoftFP is a mess, I need to talk to Evan about it.
3. No lowering to libcalls is implemented yet.
4. Various operations are missing etc.
5. There are FIXME's for stuff I hax0r'd out, like softfp.

Hey, at least it is a step in the right direction :).  If you'd like to help,
just enable the #ifdef in SelectionDAGISel.cpp and compile code with it.  If
this explodes it will tell you what needs to be implemented.  Help is 
certainly appreciated.

Once this goes in, we can do three things:

1. Add a new pass of dag combine between the "type legalizer" and "operation
   legalizer" passes.  This will let us catch some long-standing isel issues
   that we miss because operation legalization often obfuscates the dag with
   target-specific nodes.
2. We can rip out all of the type legalization code from LegalizeDAG.cpp,
   making it much smaller and simpler.  When that happens we can then 
   reimplement the core functionality left in it in a much more efficient and
   non-recursive way.
3. Once the whole legalizer is non-recursive, we can implement whole-function
   selectiondags maybe...

llvm-svn: 42981
2007-10-15 06:10:22 +00:00
Chris Lattner
f47e30627a Enhance the truncstore optimization code to handle shifted
values and propagate demanded bits through them in simple cases.

This allows this code:
void foo(char *P) {
   strcpy(P, "abc");
}
to compile to:

_foo:
        ldrb r3, [r1]
        ldrb r2, [r1, #+1]
        ldrb r12, [r1, #+2]!
        ldrb r1, [r1, #+1]
        strb r1, [r0, #+3]
        strb r2, [r0, #+1]
        strb r12, [r0, #+2]
        strb r3, [r0]
        bx lr

instead of:

_foo:
        ldrb r3, [r1, #+3]
        ldrb r2, [r1, #+2]
        orr r3, r2, r3, lsl #8
        ldrb r2, [r1, #+1]
        ldrb r1, [r1]
        orr r2, r1, r2, lsl #8
        orr r3, r2, r3, lsl #16
        strb r3, [r0]
        mov r2, r3, lsr #24
        strb r2, [r0, #+3]
        mov r2, r3, lsr #16
        strb r2, [r0, #+2]
        mov r3, r3, lsr #8
        strb r3, [r0, #+1]
        bx lr

testcase here: test/CodeGen/ARM/truncstore-dag-combine.ll

This also helps occasionally for X86 and other cases not involving 
unaligned load/stores.

llvm-svn: 42954
2007-10-13 06:58:48 +00:00
Chris Lattner
5e6fe054a2 Add a simple optimization to simplify the input to
truncate and truncstore instructions, based on the 
knowledge that they don't demand the top bits.

llvm-svn: 42952
2007-10-13 06:35:54 +00:00
Duncan Sands
56ab90d3ad Correct swapped arguments to getConstant.
llvm-svn: 42824
2007-10-10 09:54:50 +00:00
Dan Gohman
5c6d0c3b99 DAGCombiner support for UDIVREM/SDIVREM and UMUL_LOHI/SMUL_LOHI.
Check if one of the two results is unneeded, to see if a simpler operator
could be used. Also check to see if each of the two computations could be
simplified if they were split into separate operators. Factor out the code
that calls visit() so that it can be used for this purpose.

llvm-svn: 42759
2007-10-08 17:57:15 +00:00
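
A sketch of the first of those checks (assumed shape; the real code routes replacements through CombineTo since the node has two results):

SDValue Op0 = N->getOperand(0), Op1 = N->getOperand(1);
// If only one half of a UDIVREM is used, fall back to the simpler
// single-result operator so later combines can see it.
if (!N->hasAnyUseOfValue(1))  // remainder unused: just divide
  return DAG.getNode(ISD::UDIV, VT, Op0, Op1);
if (!N->hasAnyUseOfValue(0))  // quotient unused: just take the rem
  return DAG.getNode(ISD::UREM, VT, Op0, Op1);
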
Evan Cheng
0de312dd7d Reapply 42677.
llvm-svn: 42692
2007-10-06 08:19:55 +00:00
Chris Lattner
82217bd155 revert evan's patch until the header is committed
llvm-svn: 42686
2007-10-06 06:08:17 +00:00
Evan Cheng
f4b5d491df Added DAG xforms. e.g.
(vextract (v4f32 s2v (f32 load $addr)), 0) -> (f32 load $addr) 
(vextract (v4i32 bc (v4f32 s2v (f32 load $addr))), 0) -> (i32 load $addr)
Remove x86 specific patterns.

llvm-svn: 42677
2007-10-06 02:46:29 +00:00
Evan Cheng
e2e8f2d96b Fix a bogus splat xform:
shuffle <undef, undef, x, undef>, <undef, undef, undef, undef>, <2, 2, 2, 2>
!=
<undef, undef, x, undef>

llvm-svn: 42111
2007-09-18 21:54:37 +00:00
Dale Johannesen
af12b57405 Prevent crash on long double.
llvm-svn: 42103
2007-09-18 18:36:59 +00:00
Dale Johannesen
028084efe5 Revise previous patch per review comments.
Next round of x87 long double stuff.
Getting close now, basically works.

llvm-svn: 41875
2007-09-12 03:30:33 +00:00
Dale Johannesen
245dceb06d Add APInt interfaces to APFloat (allows direct
access to bits).  Use them in place of float and
double interfaces where appropriate.
First bits of x86 long double constants handling 
(untested, probably does not work).

llvm-svn: 41858
2007-09-11 18:32:33 +00:00
Chris Lattner
58c227bd09 Emit:
cmpl    %eax, %ecx
        setae   %al
        movzbl  %al, %eax

instead of:

        cmpl    %eax, %ecx
        setb    %al
        xorb    $1, %al
        movzbl  %al, %eax

when using logical not of a C comparison.

llvm-svn: 41807
2007-09-10 21:39:07 +00:00
Dale Johannesen
446b900192 Add mod, copysign, abs operations to APFloat.
Implement some constant folding in SelectionDAG and
DAGCombiner using APFloat.  Remove double versions
of constructor and getValue from ConstantFPSDNode.

llvm-svn: 41664
2007-08-31 23:34:27 +00:00
Dan Gohman
9625d812c9 Make DAGCombiner's global alias analysis query more precise in the case
where both pointers have non-zero offsets.

llvm-svn: 41491
2007-08-27 16:32:11 +00:00
Dale Johannesen
b6d2bec418 Revise per review comments.
llvm-svn: 41409
2007-08-26 01:18:27 +00:00
Dale Johannesen
2cfcf70f82 Add APFloat interface to ConstantFPSDNode. Change
over uses in DAGCombiner.  Fix interfaces to work
with APFloats.

llvm-svn: 41407
2007-08-25 22:10:57 +00:00
Evan Cheng
f5a23abf37 Fold C ? 0 : 1 to ~C or zext(~C) or trunc(~C) depending on the types.
llvm-svn: 41163
2007-08-18 05:57:05 +00:00
Dan Gohman
30f060be80 Fix the alias analysis query in DAGCombiner to not add in two
offsets. The SrcValueOffset values are the real offsets from the
SrcValue base pointers.

llvm-svn: 40534
2007-07-26 16:14:06 +00:00
Dan Gohman
80f9f077e3 Don't call SimplifyVBinOp for non-vector operations, following earlier review
feedback. This theoretically makes the common (scalar) case more efficient.

llvm-svn: 39823
2007-07-13 20:03:40 +00:00
Dan Gohman
adb3d37c07 Fix a bug in the folding of binary operators to undef.
Thanks to Lauro for spotting this!

llvm-svn: 38491
2007-07-10 15:19:29 +00:00
Dan Gohman
fa91282dbf Fix the folding of undef in several binary operators to recognize
undef in either the left or right operand.

llvm-svn: 38489
2007-07-10 14:20:37 +00:00
Dan Gohman
2af3063337 Preserve volatililty and alignment information when lowering or
simplifying loads and stores.

llvm-svn: 38473
2007-07-09 22:18:38 +00:00
Chris Lattner
6caf8fdd04 Fix this warning:
DAGCombiner.cpp: In member function 'llvm::SDOperand<unnamed>::DAGCombiner::visitOR(llvm::SDNode*)':
DAGCombiner.cpp:1608: warning: passing negative value '-0x00000000000000001' for argument 1 to 'llvm::SDOperand llvm::SelectionDAG::getConstant(uint64_t, llvm::MVT::ValueType, bool)'

oiy.

llvm-svn: 38458
2007-07-09 16:16:34 +00:00
Dan Gohman
06563a8702 Fix several over-aggressive folds for undef nodes in dagcombine, to
follow the rules for undef used in instcombine.

llvm-svn: 37851
2007-07-03 14:03:57 +00:00
Dan Gohman
9a70823375 Teach GetNegatedExpression to negate 0-B to B in UnsafeFPMath mode, and
visitFSUB to fold 0-B to -B in UnsafeFPMath mode. Also change visitFNEG
to use isNegatibleForFree/GetNegatedExpression instead of doing a subset
of the same thing manually.

This fixes test/CodeGen/X86/negative-sin.ll.

llvm-svn: 37842
2007-07-02 15:48:56 +00:00
Dan Gohman
a866514528 Generalize MVT::ValueType and associated functions to be able to represent
extended vector types. Remove the special SDNode opcodes used for pre-legalize
vector operations, and the special MVT::Vector type used with them. Adjust
lowering and legalize to work with the normal SDNode kinds instead, and to
use the normal MVT functions to work with vector types instead of using the
two special operands that the pre-legalize nodes held.

This allows pre-legalize and post-legalize DAGs, and the code that operates
on them, to be more consistent. Pre-legalize vector operators can be handled
more consistently with scalar operators. And, -view-dag-combine1-dags and
-view-legalize-dags now look prettier for vector code.

llvm-svn: 37719
2007-06-25 16:23:39 +00:00
Dan Gohman
309d3d51b3 Move ComputeMaskedBits, MaskedValueIsZero, and ComputeNumSignBits from
TargetLowering to SelectionDAG so that they have more convenient
access to the current DAG, in preparation for the ValueType routines
being changed from standalone functions to members of SelectionDAG for
the pre-legalize vector type changes.

llvm-svn: 37704
2007-06-22 14:59:07 +00:00
Evan Cheng
aa5f5d960d Xforms:
(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))
(sub x, (select cc, 0, c)) -> (select cc, x, (sub, x, c))

llvm-svn: 37685
2007-06-21 07:39:16 +00:00
Dan Gohman
a7644dd9b9 Pass a SelectionDAG into SDNode::dump everywhere it's used, in prepration
for needing the DAG node to print pre-legalize extended value types, and
to get better debug messages with target-specific nodes.

llvm-svn: 37656
2007-06-19 14:13:56 +00:00
Dan Gohman
5c4413120f Rename MVT::getVectorBaseType to MVT::getVectorElementType.
llvm-svn: 37579
2007-06-14 22:58:02 +00:00
Chris Lattner
4698083b96 tighten up recursion depth again
llvm-svn: 37330
2007-05-25 02:19:06 +00:00
Evan Cheng
a4d187b8ce Fix a typo that caused the combiner to create a malformed pre-indexed store where the value stored is the same as the base pointer.
llvm-svn: 37318
2007-05-24 02:35:39 +00:00
Chris Lattner
6509c0673f prevent exponential recursion in isNegatibleForFree
llvm-svn: 37310
2007-05-23 07:35:22 +00:00
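
The standard fix for this kind of blow-up is a depth cut-off threaded through the recursion; a minimal sketch (limit value assumed):

static bool isNegatibleForFree(SDValue Op, unsigned Depth = 0) {
  // FADD/FSUB/FMUL inspect both operands, so the recursion is
  // exponential in the worst case; give up past a fixed depth.
  if (Depth > 6)
    return false;
  // ... otherwise classify Op, recursing with Depth + 1 ...
  return false;
}
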
Dan Gohman
b539df3389 Qualify calls to getTypeForValueType with MVT:: too.
llvm-svn: 37233
2007-05-18 18:41:29 +00:00
Dale Johannesen
7a6c175e7a Don't fold bitconvert(load) for preinc/postdec loads. Likewise stores.
llvm-svn: 37130
2007-05-16 22:45:30 +00:00
Chris Lattner
48fb92f75d Use a ptr set instead of a linear search to unique TokenFactor operands.
This fixes PR1423

llvm-svn: 37102
2007-05-16 06:37:59 +00:00
Evan Cheng
288f133c71 Bug fix: should check ABI alignment, not pref. alignment.
llvm-svn: 37094
2007-05-16 02:04:50 +00:00
Lauro Ramos Venancio
3f142cbca2 Fix an infinite recursion in GetNegatedExpression.
llvm-svn: 37086
2007-05-15 17:05:43 +00:00
Chris Lattner
e49c974a7c implement a simple fneg optimization/propagation thing. This compiles:
CodeGen/PowerPC/fneg.ll into:

_t4:
        fmul f0, f3, f4
        fmadd f1, f1, f2, f0
        blr

instead of:

_t4:
        fneg f0, f3
        fmul f0, f0, f4
        fmsub f1, f1, f2, f0
        blr

llvm-svn: 37054
2007-05-14 22:04:50 +00:00
Evan Cheng
f325c2a65e Can't fold the bit_convert if the store is a truncating store.
llvm-svn: 36962
2007-05-09 21:49:47 +00:00
Evan Cheng
562e45692e Forgot a check.
llvm-svn: 36910
2007-05-07 21:36:06 +00:00
Evan Cheng
a4cf58a103 Enable a couple of xforms:
- (store (bitconvert v)) -> (store v) if resultant store does not require
higher alignment
- (bitconvert (load v)) -> (load (bitconvert*)v) if resultant load does not
require higher alignment

llvm-svn: 36908
2007-05-07 21:27:48 +00:00
Evan Cheng
044a0a8cfb Don't create indexed load / store with zero offset!
llvm-svn: 36716
2007-05-03 23:52:19 +00:00
Evan Cheng
b68343cdd8 Forgot about chain result; also UNDEF cannot have multiple values.
llvm-svn: 36622
2007-05-01 08:53:39 +00:00
Evan Cheng
a684cd23a5 * Only turn a load to UNDEF if all of its outputs have no uses (indexed loads
produce two results.)
* Do not touch volatile loads.

llvm-svn: 36604
2007-05-01 00:38:21 +00:00
Christopher Lamb
8af6d5896f PR400 phase 2. Propagate attributed load/store information through DAGs.
llvm-svn: 36356
2007-04-22 23:15:30 +00:00
Reid Spencer
0c1349e6bc Revert Christopher Lamb's load/store alignment changes.
llvm-svn: 36309
2007-04-21 18:36:27 +00:00
Christopher Lamb
bff50208c8 add support for alignment attributes on load/store instructions
llvm-svn: 36301
2007-04-21 08:16:25 +00:00
Chris Lattner
f03c90bee6 allow SRL to simplify its operands, as it doesn't demand all bits as input.
llvm-svn: 36245
2007-04-18 03:06:49 +00:00
Chris Lattner
bf14f20632 When replacing a node in SimplifyDemandedBits, if the old node used any
single-use nodes, they will be dead soon.  Make sure to remove them before
processing other nodes.  This implements CodeGen/X86/shl_elim.ll

llvm-svn: 36244
2007-04-18 03:05:22 +00:00
Chris Lattner
9ad5915559 SIGN_EXTEND_INREG does not demand its top bits. Give SimplifyDemandedBits
a chance to hack on it.  This compiles:

int baz(long long a) { return (short)(((int)(a >>24)) >> 9); }

into:
_baz:
        slwi r2, r3, 8
        srwi r2, r2, 9
        extsh r3, r2
        blr

instead of:

_baz:
        srwi r2, r4, 24
        rlwimi r2, r3, 8, 0, 23
        srwi r2, r2, 9
        extsh r3, r2
        blr

This implements CodeGen/PowerPC/sign_ext_inreg1.ll

llvm-svn: 36212
2007-04-17 19:03:21 +00:00
Chris Lattner
18e4ac4107 fix an infinite loop compiling ldecod, notice by JeffC.
llvm-svn: 35910
2007-04-11 16:51:53 +00:00
Chris Lattner
a083ffcad7 Fix this harder.
llvm-svn: 35888
2007-04-11 06:50:51 +00:00
Chris Lattner
c5f85d3738 don't create shifts by zero, fix some problems with my previous patch
llvm-svn: 35887
2007-04-11 06:43:25 +00:00
Chris Lattner
65786b078c Teach the codegen to turn [aez]ext (setcc) -> selectcc of 1/0, which often
allows other simplifications.  For example, this compiles:
int isnegative(unsigned int X) {
   return !(X < 2147483648U);
}

Into this code:

x86:
        movl 4(%esp), %eax
        shrl $31, %eax
        ret
arm:
        mov r0, r0, lsr #31
        bx lr
thumb:
        lsr r0, r0, #31
        bx lr

instead of:

x86:
        cmpl $0, 4(%esp)
        sets %al
        movzbl %al, %eax
        ret

arm:
        mov r3, #0
        cmp r0, #0
        movlt r3, #1
        mov r0, r3
        bx lr

thumb:
        mov r2, #1
        mov r1, #0
        cmp r0, #0
        blt LBB1_2      @entry
LBB1_1: @entry
        cpy r2, r1
LBB1_2: @entry
        cpy r0, r2
        bx lr

Testcase here: test/CodeGen/Generic/ispositive.ll

llvm-svn: 35883
2007-04-11 05:32:27 +00:00
Chris Lattner
41189c63cc Codegen integer abs more efficiently using the trick from the PPC CWG. This
improves codegen on many architectures.  Tests committed as CodeGen/*/iabs.ll

X86 Old:			X86 New:
_test:				_test:
   movl 4(%esp), %ecx		   movl 4(%esp), %eax
   movl %ecx, %eax		   movl %eax, %ecx
   negl %eax			   sarl $31, %ecx
   testl %ecx, %ecx		   addl %ecx, %eax
   cmovns %ecx, %eax		   xorl %ecx, %eax
   ret				   ret

PPC Old:			PPC New:
_test:				_test:
   cmpwi cr0, r3, -1		   srawi r2, r3, 31
   neg r2, r3			   add r3, r3, r2
   bgt cr0, LBB1_2 ;		   xor r3, r3, r2
LBB1_1: ;			   blr
   mr r3, r2
LBB1_2: ;
   blr

ARM Old:			ARM New:
_test:				_test:
   rsb r3, r0, #0		   add r3, r0, r0, asr #31
   cmp r0, #0			   eor r0, r3, r0, asr #31
   movge r3, r0			   bx lr
   mov r0, r3
   bx lr

Thumb Old:			Thumb New:
_test:				_test:
   neg r2, r0			   asr r2, r0, #31
   cmp r0, #0			   add r0, r0, r2
   bge LBB1_2			   eor r0, r2
LBB1_1: @			   bx lr
   cpy r0, r2
LBB1_2: @
   bx lr


Sparc Old:			Sparc New:
test:				test:
   save -96, %o6, %o6		   save -96, %o6, %o6
   sethi 0, %l0			   sra %i0, 31, %l0
   sub %l0, %i0, %l0		   add %i0, %l0, %l1
   subcc %i0, -1, %l1		   xor %l1, %l0, %i0
   bg .BB1_2			   restore %g0, %g0, %g0
   nop				   retl
.BB1_1:				   nop
   or %g0, %l0, %i0
.BB1_2:
   restore %g0, %g0, %g0
   retl
   nop

It also helps alpha/ia64 :)

llvm-svn: 35881
2007-04-11 05:11:38 +00:00
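
The branchless pattern all of these targets now emit corresponds to this C-level identity (illustrative):

// abs(x) without a branch: m is 0 for non-negative x and -1 (all
// ones) for negative x, so (x + m) ^ m conditionally negates x.
int iabs(int x) {
  int m = x >> 31;     // arithmetic shift: 0 or -1
  return (x + m) ^ m;  // x unchanged, or (x - 1) ^ -1 == -x
}
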
Scott Michel
16627a542f 1. Insert custom lowering hooks for ISD::ROTR and ISD::ROTL.
2. Help DAGCombiner recognize zero/sign/any-extended versions of ROTR and ROTL
patterns. This was motivated by the X86/rotate.ll testcase, which should now
generate code for other platforms (and soon-to-come platforms.) Rewrote code
slightly to make it easier to read.

llvm-svn: 35605
2007-04-02 21:36:32 +00:00
Dale Johannesen
4bbd2eefba Fix incorrect combination of different loads. Reenable zext-over-truncate
combination.

llvm-svn: 35517
2007-03-30 21:38:07 +00:00
Evan Cheng
ccee35fd0d Disable load width reduction xform of variant (zext (truncate load x)) for
big endian targets until llvm-gcc build issue has been resolved.

llvm-svn: 35449
2007-03-29 07:56:46 +00:00
Evan Cheng
8275f0e0af SIGN_EXTEND_INREG requires one extra operand, a ValueType node.
llvm-svn: 35350
2007-03-26 07:12:51 +00:00
Evan Cheng
b7051f596a Adjust offset to compensate for big endian machines.
llvm-svn: 35293
2007-03-24 00:02:43 +00:00
Evan Cheng
a883b58caf Make sure SEXTLOAD of the specific type is supported on the target.
llvm-svn: 35289
2007-03-23 22:13:36 +00:00
Evan Cheng
e2f5f24e8e Also replace uses of SRL if that's also folded during ReduceLoadWidth().
llvm-svn: 35286
2007-03-23 20:55:21 +00:00
Evan Cheng
a824e79f06 A couple of bug fixes for reducing load width xform:
1. Address offset is in bytes.
2. Make sure truncate node uses are replaced with new load.

llvm-svn: 35274
2007-03-23 02:16:52 +00:00
Evan Cheng
464dc9b74c More opportunities to reduce load size.
llvm-svn: 35254
2007-03-22 01:54:19 +00:00
Evan Cheng
d63baead9b fold (truncate (srl (load x), c)) -> (smaller load (x+c/vt bits))
llvm-svn: 35239
2007-03-21 20:14:05 +00:00
Evan Cheng
8a1d09d079 Avoid combining indexed load further.
llvm-svn: 35005
2007-03-07 08:07:03 +00:00
Chris Lattner
47206667c0 fold away addc nodes when we know there cannot be a carry-out.
llvm-svn: 34913
2007-03-04 20:40:38 +00:00
Chris Lattner
2dcc6e7f58 generalize
llvm-svn: 34910
2007-03-04 20:08:45 +00:00
Chris Lattner
e2e13caeb2 canonicalize constants to the RHS of addc/adde. If nothing uses the carry out of
addc, turn it into add.

This allows us to compile:

long long test(long long A, unsigned B) {
  return (A + ((long long)B << 32)) & 123;
}

into:

_test:
        movl $123, %eax
        andl 4(%esp), %eax
        xorl %edx, %edx
        ret

instead of:
_test:
        xorl %edx, %edx
        movl %edx, %eax
        addl 4(%esp), %eax   ;; add of zero
        andl $123, %eax
        ret

llvm-svn: 34909
2007-03-04 20:03:15 +00:00
Chris Lattner
fce448f856 Fold (sext (truncate x)) more aggressively, by avoiding creation of a
sextinreg if not needed.   This is useful in two cases: before legalize,
it avoids creating a sextinreg that will be trivially removed.  After legalize
if the target doesn't support sextinreg, the trunc/sext would not have been
removed before.

llvm-svn: 34621
2007-02-26 03:13:59 +00:00
Evan Cheng
92658d5648 Move SimplifySetCC to TargetLowering and allow it to be shared with legalizer.
llvm-svn: 34065
2007-02-08 22:13:59 +00:00
Evan Cheng
00a640dbe0 Fix for PR1108: type of insert_vector_elt index operand is PtrVT, not MVT::i32.
llvm-svn: 33398
2007-01-20 10:10:26 +00:00
Evan Cheng
9201100b29 Remove this xform:
(shl (add x, c1), c2) -> (add (shl x, c2), c1<<c2)
Replace it with:
(add (shl (add x, c1), c2), ) -> (add (add (shl x, c2), c1<<c2), )

This fixes test/CodeGen/ARM/smul.ll

llvm-svn: 33361
2007-01-19 17:51:44 +00:00
Chris Lattner
4dc4489286 Fix PR1114 and CodeGen/Generic/2007-01-15-LoadSelectCycle.ll by being
careful when folding "c ? load p : load q" that C doesn't reach either load.
If so, folding this into load (c ? p : q) will induce a cycle in the graph.

llvm-svn: 33251
2007-01-16 05:59:59 +00:00
Chris Lattner
f70c5cd5db add options to view the dags before the first or second pass of dag combine.
llvm-svn: 33249
2007-01-16 04:55:25 +00:00
Chris Lattner
0199fd6d59 Implement some trivial FP foldings when -enable-unsafe-fp-math is specified.
This implements CodeGen/PowerPC/unsafe-math.ll

llvm-svn: 33024
2007-01-08 23:04:05 +00:00
Chris Lattner
aee775a6b7 Eliminate static ctors from Statistics
llvm-svn: 32698
2006-12-19 22:41:21 +00:00
Evan Cheng
28cf4277bb Cannot combine an indexed load / store any further.
llvm-svn: 32629
2006-12-16 06:25:23 +00:00
Jim Laskey
26df19ace6 This code was usurping the sextload expand in the legalizer. Just make
sure the right conditions are checked.

llvm-svn: 32611
2006-12-15 21:38:30 +00:00
Chris Lattner
b7524b6d0e make this code more aggressive about turning store fpimm into store int imm.
This is not sufficient to fix X86/store-fp-constant.ll

llvm-svn: 32465
2006-12-12 04:16:14 +00:00
Evan Cheng
218369881f Don't convert store double C, Ptr to store long C, Ptr if i64 is not a legal type.
llvm-svn: 32434
2006-12-11 17:25:19 +00:00
Nate Begeman
8e20c760fa Move something that should be in the dag combiner from the legalizer to the
dag combiner.

llvm-svn: 32431
2006-12-11 02:23:46 +00:00
Chris Lattner
d9f04e4875 Fix CodeGen/PowerPC/2006-12-07-SelectCrash.ll on PPC64
llvm-svn: 32336
2006-12-07 22:36:47 +00:00
Bill Wendling
22e978a736 Removing even more <iostream> includes.
llvm-svn: 32320
2006-12-07 20:04:42 +00:00
Chris Lattner
700b873130 Detemplatize the Statistic class. The only type it is instantiated with
is 'unsigned'.

llvm-svn: 32279
2006-12-06 17:46:33 +00:00
Chris Lattner
3da631f29a For better or worse, a load from i1 is assumed to be zero extended. Do not
form an i1 load out of larger loads that may not be zext'd.

llvm-svn: 31933
2006-11-27 04:40:53 +00:00
Chris Lattner
3676a994ca Fix PR1011 and CodeGen/Generic/2006-11-20-DAGCombineCrash.ll
llvm-svn: 31878
2006-11-20 18:05:46 +00:00
Evan Cheng
f64da389f8 Fix an incorrectly inverted condition.
llvm-svn: 31773
2006-11-16 00:08:20 +00:00
Chris Lattner
a0a8003f59 disallow preinc of a frameindex. This is not profitable and causes 2-addr
pass to explode.  This fixes a bunch of llc-beta failures on ppc last night.

llvm-svn: 31661
2006-11-11 01:00:15 +00:00
Chris Lattner
eabc15c1d8 reduce indentation by using early exits. No functionality change.
llvm-svn: 31660
2006-11-11 00:56:29 +00:00
Chris Lattner
ffad2166e1 move big chunks of code out-of-line, no functionality change.
llvm-svn: 31658
2006-11-11 00:39:41 +00:00
Chris Lattner
4eac5f59e6 Fix a dag combiner bug exposed by my recent instcombine patch. This fixes
CodeGen/Generic/2006-11-10-DAGCombineMiscompile.ll and PPC gsm/toast

llvm-svn: 31644
2006-11-10 21:37:15 +00:00
Evan Cheng
13440b025c When forming a pre-indexed store, make sure the pointer isn't the same as, or a predecessor of, the value being stored. That would cause a cycle.
llvm-svn: 31631
2006-11-10 08:28:11 +00:00
Evan Cheng
6878378390 Don't attempt expensive pre-/post- indexed dag combine if target does not support them.
llvm-svn: 31598
2006-11-09 19:10:46 +00:00
Evan Cheng
b15000736c Rename ISD::MemOpAddrMode to ISD::MemIndexedMode
llvm-svn: 31595
2006-11-09 17:55:04 +00:00
Evan Cheng
b58e06bc9e getPostIndexedAddressParts change: passes in load/store instead of its loaded / stored VT.
llvm-svn: 31584
2006-11-09 04:29:46 +00:00
Evan Cheng
85e54223cd Match more post-indexed ops.
llvm-svn: 31569
2006-11-08 20:27:27 +00:00
Jim Laskey
61feeb90f9 Remove redundant <cmath>.
llvm-svn: 31561
2006-11-08 19:16:44 +00:00
Evan Cheng
0303cb9b33 - When performing pre-/post- indexed load/store transformation, do not worry
about whether the new base ptr would be live below the load/store. Let the
  two-address pass split it back to non-indexed ops.
- Minor tweaks / fixes.

llvm-svn: 31544
2006-11-08 08:30:28 +00:00
Evan Cheng
6072435756 Fixed a minor bug preventing some pre-indexed load / store transformation.
llvm-svn: 31543
2006-11-08 06:56:05 +00:00
Evan Cheng
d48f7dd250 Fix a obscure post-indexed load / store dag combine bug.
llvm-svn: 31537
2006-11-08 02:38:55 +00:00
Evan Cheng
60c6846d21 Add post-indexed load / store transformations.
llvm-svn: 31498
2006-11-07 09:03:05 +00:00
Evan Cheng
eb99bd736a Add comment.
llvm-svn: 31473
2006-11-06 08:14:30 +00:00
Jeff Cohen
7d6f3db3e2 Unbreak VC++ build.
llvm-svn: 31464
2006-11-05 19:31:28 +00:00
Evan Cheng
33157700d9 Added pre-indexed store support.
llvm-svn: 31459
2006-11-05 09:31:14 +00:00
Evan Cheng
1dfd26a151 Rename
llvm-svn: 31413
2006-11-03 07:21:16 +00:00
Reid Spencer
52f958741a Remove dead variable. Fix 80 column violations.
llvm-svn: 31412
2006-11-03 03:30:34 +00:00
Evan Cheng
357017f4a9 Added DAG combiner transformation to generate pre-indexed loads.
llvm-svn: 31410
2006-11-03 03:06:21 +00:00
Reid Spencer
de46e48420 For PR786:
Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
fall out by removing unused variables. Remaining warnings have to do with
unused functions (I didn't want to delete code without review) and unused
variables in generated code. Maintainers should clean up the remaining
issues when they see them. All changes pass DejaGnu tests and Olden.

llvm-svn: 31380
2006-11-02 20:25:50 +00:00
Jim Laskey
55e4dcad36 Add option for controlling inclusion of global AA.
llvm-svn: 31040
2006-10-18 19:08:31 +00:00
Jim Laskey
a15b0ebb5e Use global info for alias analysis.
llvm-svn: 31035
2006-10-18 12:29:57 +00:00
Chris Lattner
327b88b102 Fix CodeGen/PowerPC/2006-10-17-brcc-miscompile.ll
llvm-svn: 31019
2006-10-17 21:24:15 +00:00
Jim Laskey
e7d2c24a7d Make it simpler to dump DAGs while in DAGCombiner. Remove a nasty optimization.
llvm-svn: 31009
2006-10-17 19:33:52 +00:00
Evan Cheng
1e3a39cd08 Make sure operand does have size and element type operands.
llvm-svn: 30999
2006-10-17 17:06:35 +00:00
Evan Cheng
f3ae00a64a Be careful when looking through a vbit_convert. Optimizing this:
(vector_shuffle
  (vbitconvert (vbuildvector (copyfromreg v4f32), 1, v4f32), 4, f32),
  (undef, undef, undef, undef), (0, 0, 0, 0), 4, f32)
to the
  vbitconvert
is a very bad idea.

llvm-svn: 30989
2006-10-16 22:49:37 +00:00
Jim Laskey
dcb2b83886 Pass AliasAnalysis thru to DAGCombiner.
llvm-svn: 30984
2006-10-16 20:52:31 +00:00
Jim Laskey
3bf4f3bd60 Tidy up after truncstore changes.
llvm-svn: 30961
2006-10-14 12:14:27 +00:00
Chris Lattner
6a1b2de8c4 Make sure that the node returned by SimplifySetCC is added to the worklist
so that it can be deleted if unused.

llvm-svn: 30955
2006-10-14 03:52:46 +00:00
Chris Lattner
0626bd2fbc fold setcc of a setcc.
llvm-svn: 30953
2006-10-14 01:02:29 +00:00
Chris Lattner
bd9acad805 When SimplifySetCC was moved to the DAGCombiner, it was never removed from
SelectionDAG and it has since bitrotted.  Remove the copy from SelectionDAG.
Next, remove the constant folding piece of DAGCombiner::SimplifySetCC into
a new FoldSetCC method which can be used by getNode() and SimplifySetCC.

This fixes obscure bugs.

llvm-svn: 30952
2006-10-14 00:41:01 +00:00
Jim Laskey
dcf983ce41 Reduce the workload by not adding chain users to work list.
llvm-svn: 30948
2006-10-13 23:32:28 +00:00
Evan Cheng
ab51cf2e78 Merge ISD::TRUNCSTORE to ISD::STORE. Switch to using StoreSDNode.
llvm-svn: 30945
2006-10-13 21:14:26 +00:00
Chris Lattner
d0620d2773 Lower X%C into X/C+stuff. This allows the 'division by a constant' logic to
apply to rems as well as divs.  This fixes PR945 and speeds up ReedSolomon
from 14.57s to 10.90s (which is now faster than gcc).

It compiles CodeGen/X86/rem.ll into:

_test1:
        subl $4, %esp
        movl %esi, (%esp)
        movl $2155905153, %ecx
        movl 8(%esp), %esi
        movl %esi, %eax
        imull %ecx
        addl %esi, %edx
        movl %edx, %eax
        shrl $31, %eax
        sarl $7, %edx
        addl %eax, %edx
        imull $255, %edx, %eax
        subl %eax, %esi
        movl %esi, %eax
        movl (%esp), %esi
        addl $4, %esp
        ret
_test2:
        movl 4(%esp), %eax
        movl %eax, %ecx
        sarl $31, %ecx
        shrl $24, %ecx
        addl %eax, %ecx
        andl $4294967040, %ecx
        subl %ecx, %eax
        ret
_test3:
        subl $4, %esp
        movl %esi, (%esp)
        movl $2155905153, %ecx
        movl 8(%esp), %esi
        movl %esi, %eax
        mull %ecx
        shrl $7, %edx
        imull $255, %edx, %eax
        subl %eax, %esi
        movl %esi, %eax
        movl (%esp), %esi
        addl $4, %esp
        ret

instead of div/idiv instructions.

llvm-svn: 30920
2006-10-12 20:58:32 +00:00
Chris Lattner
2e33fb453b add a minor dag combine noticed when looking at PR945
llvm-svn: 30915
2006-10-12 20:23:19 +00:00
Jim Laskey
df2ccc395e D'oh - need to use the right kind of store.
llvm-svn: 30903
2006-10-12 15:22:24 +00:00
Jim Laskey
a13b9c7aa4 Alias analysis of TRUNCSTORE.
llvm-svn: 30889
2006-10-11 18:55:16 +00:00
Jim Laskey
0f7c328ae7 Handle aliasing of loadext.
llvm-svn: 30883
2006-10-11 17:47:52 +00:00
Jim Laskey
08edf332ed Fix regression in combiner alias analysis.
llvm-svn: 30880
2006-10-11 13:47:09 +00:00
Evan Cheng
d35734bd1f Naming consistency.
llvm-svn: 30878
2006-10-11 07:10:22 +00:00
Evan Cheng
e71fe34d75 Reflects ISD::LOAD / ISD::LOADX / LoadSDNode changes.
llvm-svn: 30844
2006-10-09 20:57:25 +00:00
Chris Lattner
5ab6d8b3fc Eliminate more token factors by taking advantage of transitivity:
if TF depends on A and B, and A depends on B, TF just needs to depend on
A.  With Jim's alias-analysis stuff enabled, this compiles the testcase in
PR892 into:

__Z4test3Val:
        subl $44, %esp
        call L__Z3foov$stub
        movl %edx, 28(%esp)
        movl %eax, 32(%esp)
        movl %eax, 24(%esp)
        movl %edx, 36(%esp)
        movl 52(%esp), %ecx
        movl %ecx, 4(%esp)
        movl %eax, 8(%esp)
        movl %edx, 12(%esp)
        movl 48(%esp), %eax
        movl %eax, (%esp)
        call L__Z3bar3ValS_$stub
        addl $44, %esp
        ret

instead of:

__Z4test3Val:
        subl $44, %esp
        call L__Z3foov$stub
        movl %eax, 24(%esp)
        movl %edx, 28(%esp)
        movl 24(%esp), %eax
        movl %eax, 32(%esp)
        movl 28(%esp), %eax
        movl %eax, 36(%esp)
        movl 32(%esp), %eax
        movl 36(%esp), %ecx
        movl 52(%esp), %edx
        movl %edx, 4(%esp)
        movl %eax, 8(%esp)
        movl %ecx, 12(%esp)
        movl 48(%esp), %eax
        movl %eax, (%esp)
        call L__Z3bar3ValS_$stub
        addl $44, %esp
        ret

llvm-svn: 30821
2006-10-08 22:57:01 +00:00
Jim Laskey
0463e08005 Combiner alias analysis passes Multisource (release-asserts).
llvm-svn: 30818
2006-10-07 23:37:56 +00:00
Evan Cheng
df9ac47e5e Make use of getStore().
llvm-svn: 30759
2006-10-05 23:01:46 +00:00
Jim Laskey
6549d22ef9 Alias analysis code clean ups.
llvm-svn: 30753
2006-10-05 15:07:25 +00:00
Jim Laskey
708d0db2d8 More extensive alias analysis.
llvm-svn: 30721
2006-10-04 16:53:27 +00:00
Evan Cheng
5d9fd977d3 Combine ISD::EXTLOAD, ISD::SEXTLOAD, ISD::ZEXTLOAD into ISD::LOADX. Add an
extra operand to LOADX to specify the exact value extension type.

llvm-svn: 30714
2006-10-04 00:56:09 +00:00
Jim Laskey
60832693a7 Load chain check is not needed
llvm-svn: 30613
2006-09-26 17:44:58 +00:00
Jim Laskey
dde51671e5 Chain can be any operand
llvm-svn: 30611
2006-09-26 09:32:41 +00:00
Jim Laskey
5f3e0af9d0 Wrong size for load
llvm-svn: 30610
2006-09-26 08:14:06 +00:00
Jim Laskey
b4a864d533 Can't move a load node if its chain is not used.
llvm-svn: 30609
2006-09-26 07:37:42 +00:00
Jim Laskey
7aa0638aa9 Accidental enable of bad code
llvm-svn: 30601
2006-09-25 21:11:32 +00:00
Jim Laskey
b5534e5c28 Fix chain dropping in load and drop unused stores in ret blocks.
llvm-svn: 30600
2006-09-25 19:32:58 +00:00
Jim Laskey
d07be232ba Core antialiasing for load and store.
llvm-svn: 30597
2006-09-25 16:29:54 +00:00
Evan Cheng
449a0c7e33 Make it work for DAG combine of multi-value nodes.
llvm-svn: 30573
2006-09-21 19:04:05 +00:00
Jim Laskey
35f7eebb49 core corrections
llvm-svn: 30570
2006-09-21 17:35:47 +00:00
Jim Laskey
5d19d59017 Basic "in frame" alias analysis.
llvm-svn: 30568
2006-09-21 16:28:59 +00:00
Chris Lattner
082db3f9aa fold (aext (and (trunc x), cst)) -> (and x, cst).
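
A C-level view of why this is sound (shown with zero extension; any_extend
is even weaker since its high bits are undefined). The constant must fit in
the narrow type; values here are illustrative:

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
            uint32_t x = 0xDEADBEEF;
            /* extend(and(trunc(x), cst)) == and(x, cst) when cst fits in i8 */
            assert((uint32_t)((uint8_t)x & 0x3Fu) == (x & 0x3Fu));
            return 0;
        }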
llvm-svn: 30561
2006-09-21 06:40:43 +00:00
Chris Lattner
fa9f92cf65 Check the right value type. This fixes 186.crafty on x86
llvm-svn: 30560
2006-09-21 06:17:39 +00:00
Chris Lattner
8d8a3bf9c9 Compile:
int %test(ulong *%tmp) {
        %tmp = load ulong* %tmp         ; <ulong> [#uses=1]
        %tmp.mask = shr ulong %tmp, ubyte 50            ; <ulong> [#uses=1]
        %tmp.mask = cast ulong %tmp.mask to ubyte
        %tmp2 = and ubyte %tmp.mask, 3          ; <ubyte> [#uses=1]
        %tmp2 = cast ubyte %tmp2 to int         ; <int> [#uses=1]
        ret int %tmp2
}

to:

_test:
        movl 4(%esp), %eax
        movl 4(%eax), %eax
        shrl $18, %eax
        andl $3, %eax
        ret

instead of:

_test:
        movl 4(%esp), %eax
        movl 4(%eax), %eax
        shrl $18, %eax
        # TRUNCATE movb %al, %al
        andb $3, %al
        movzbl %al, %eax
        ret

llvm-svn: 30558
2006-09-21 06:14:31 +00:00
Chris Lattner
a31f0a622b Generalize (zext (truncate x)) and (sext (truncate x)) folding to work when
the src/dst are not the same size.  This catches things like "truncate
32-bit X to 8 bits, then zext to 16", which happens a bit on X86.
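
The mixed-width case, sketched in C with illustrative values:

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
            uint32_t x = 0x12345678;
            /* truncate 32 -> 8, then zext 8 -> 16 ... */
            uint16_t a = (uint16_t)(uint8_t)x;
            /* ... folds to a single mask in the wider type */
            uint16_t b = (uint16_t)(x & 0xFF);
            assert(a == b);
            return 0;
        }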

llvm-svn: 30557
2006-09-21 06:00:20 +00:00
Chris Lattner
c8cd62d381 Compile:
int test3(int a, int b) { return (a < 0) ? a : 0; }

to:

_test3:
        srawi r2, r3, 31
        and r3, r2, r3
        blr

instead of:

_test3:
        cmpwi cr0, r3, 1
        li r2, 0
        blt cr0, LBB2_2 ;entry
LBB2_1: ;entry
        mr r3, r2
LBB2_2: ;entry
        blr


This implements: PowerPC/select_lt0.ll:seli32_a_a
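
The branch-free trick behind this: a >> 31 is all ones exactly when a is
negative, so the select collapses to an and. A quick C check, assuming the
usual arithmetic right shift on signed int:

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
            int32_t tests[] = { -5, 0, 7, INT32_MIN, INT32_MAX };
            for (int i = 0; i < 5; ++i) {
                int32_t a = tests[i];
                assert(((a < 0) ? a : 0) == (a & (a >> 31)));
            }
            return 0;
        }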

llvm-svn: 30517
2006-09-20 06:41:35 +00:00
Chris Lattner
8746e2cd57 Fold the full generality of (any_extend (truncate x))
llvm-svn: 30514
2006-09-20 06:29:17 +00:00
Chris Lattner
8b68decb27 Two things:
1. Teach SimplifySetCC that '(srl (ctlz x), 5) == 0' is really x != 0.
2. Teach visitSELECT_CC to use SimplifySetCC instead of calling it and
   ignoring the result.  This allows us to compile:

bool %test(ulong %x) {
  %tmp = setlt ulong %x, 4294967296
  ret bool %tmp
}

to:

_test:
        cntlzw r2, r3
        cmplwi cr0, r3, 1
        srwi r2, r2, 5
        li r3, 0
        beq cr0, LBB1_2 ;
LBB1_1: ;
        mr r3, r2
LBB1_2: ;
        blr

instead of:

_test:
        addi r2, r3, -1
        cntlzw r2, r2
        cntlzw r3, r3
        srwi r2, r2, 5
        cmplwi cr0, r2, 0
        srwi r2, r3, 5
        li r3, 0
        bne cr0, LBB1_2 ;
LBB1_1: ;
        mr r3, r2
LBB1_2: ;
        blr

This isn't wonderful, but it's an improvement.
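
The first point, checked in C (ctlz32 is a hypothetical portable helper
that returns 32 for zero, matching cntlzw):

        #include <assert.h>
        #include <stdint.h>

        /* count leading zeros of a 32-bit value; 32 when x == 0 */
        static unsigned ctlz32(uint32_t x) {
            unsigned n = 0;
            while (n < 32 && !(x & 0x80000000u)) { x <<= 1; ++n; }
            return n;
        }

        int main(void) {
            uint32_t tests[] = { 0, 1, 0x80000000u, 0xFFFFFFFFu, 42 };
            for (int i = 0; i < 5; ++i) {
                uint32_t x = tests[i];
                /* ctlz32(x) >> 5 is 1 only when ctlz32(x) == 32, i.e. x == 0,
                   so '(srl (ctlz x), 5) == 0' is exactly 'x != 0' */
                assert(((ctlz32(x) >> 5) == 0) == (x != 0));
            }
            return 0;
        }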

llvm-svn: 30513
2006-09-20 06:19:26 +00:00
Chris Lattner
46d710e6ea Fold (X & C1) | (Y & C2) -> (X|Y) & C3 when possible.
This implements CodeGen/X86/and-or-fold.ll
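
One instance of "when possible" is C1 == C2, where the common mask factors
out; a quick check with illustrative constants:

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
            uint32_t X = 0xDEADBEEF, Y = 0x12345678, C = 0x00FF00FFu;
            /* (X & C) | (Y & C) == (X | Y) & C */
            assert(((X & C) | (Y & C)) == ((X | Y) & C));
            return 0;
        }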

llvm-svn: 30379
2006-09-14 21:11:37 +00:00
Chris Lattner
97614c86ce Split rotate matching code out to its own function. Make it stronger by
matching things like ((x >> c1) & c2) | ((x << c3) & c4) to (rot x, c5) & c6
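
A sketch of the shape being matched, with illustrative constants (here
c1 + c3 == 32, so the two shifts are the halves of a rotate and the masks
OR together):

        #include <assert.h>
        #include <stdint.h>

        static uint32_t rotl32(uint32_t x, unsigned c) {
            return (x << c) | (x >> (32 - c));   /* c in 1..31 */
        }

        int main(void) {
            uint32_t x = 0xCAFEBABEu;
            /* ((x >> 24) & 0xF0) | ((x << 8) & 0x0F00)
               == (rot x, 8) & (0xF0 | 0x0F00) */
            uint32_t a = ((x >> 24) & 0xF0u) | ((x << 8) & 0x0F00u);
            uint32_t b = rotl32(x, 8) & 0x0FF0u;
            assert(a == b);
            return 0;
        }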

llvm-svn: 30376
2006-09-14 20:50:57 +00:00
Evan Cheng
31305c45da DAG combiner fix for rotates. Previously the outermost condition checked
for ROTL availability, which prevented it from forming ROTR for targets that
have only ROTR.

llvm-svn: 29997
2006-08-31 07:41:12 +00:00
Evan Cheng
e5570a4c3f Move isCommutativeBinOp out of SelectionDAG.cpp and DAGCombiner.cpp and make it a static method of SelectionDAG.
llvm-svn: 29951
2006-08-29 06:42:35 +00:00
Chris Lattner
3d27be1333 s|llvm/Support/Visibility.h|llvm/Support/Compiler.h|
llvm-svn: 29911
2006-08-27 12:54:02 +00:00
Chris Lattner
6f22ebd8be change internal impl of dag combiner so that calls to CombineTo never have to
make a temporary vector.

llvm-svn: 29618
2006-08-11 17:56:38 +00:00
Chris Lattner
a2f4086828 Change one ReplaceAllUsesWith method to take an array of operands to replace
instead of a vector of operands.

llvm-svn: 29616
2006-08-11 17:46:28 +00:00
Chris Lattner
c24a1d3093 Start eliminating temporary vectors used to create DAG nodes. Instead, pass
in the start of an array and a count of operands where applicable.  In many
cases, the number of operands is known, so this static array can be allocated
on the stack, avoiding the heap.  In many other cases, a SmallVector can be
used, which has the same benefit in the common cases.

I updated a lot of code calling getNode that takes a vector, but ran out of
time.  The rest of the code should be updated, and these methods should be
removed.

We should also do the same thing to eliminate the methods that take a
vector of MVT::ValueTypes.

It would be extra nice to convert the dagiselemitter to avoid creating vectors
for operands when calling getTargetNode.

llvm-svn: 29566
2006-08-08 02:23:42 +00:00
Reid Spencer
658b9476f0 Initialize some variables the compiler warns about.
llvm-svn: 29277
2006-07-25 20:44:41 +00:00
Evan Cheng
7c970b98d0 If a shuffle is a splat, check if the argument is a build_vector with all elements being the same. If so, return the argument.
llvm-svn: 29242
2006-07-21 08:25:53 +00:00
Evan Cheng
8472e0c4af If a shuffle is unary, i.e. one of the vector arguments is not needed, turn the
operand into an undef and adjust the mask accordingly.

llvm-svn: 29232
2006-07-20 22:44:41 +00:00
Andrew Lenharth
ec104a2b41 80 cols
llvm-svn: 29221
2006-07-20 17:43:27 +00:00
Andrew Lenharth
c496b418b5 Reduce number of exported symbols
llvm-svn: 29220
2006-07-20 17:28:38 +00:00
Chris Lattner
54a34cd20b Mark these two classes as hidden, shrinking libllvmgcc.dylib by 25K
llvm-svn: 28970
2006-06-28 21:58:30 +00:00
Andrew Lenharth
0e57b2cb92 Start on my todo list
llvm-svn: 28752
2006-06-12 16:07:18 +00:00
Evan Cheng
64d2846017 visitVBinOp: Can't fold divide by zero!
llvm-svn: 28584
2006-05-31 06:08:35 +00:00
Chris Lattner
8f872d2091 Fix a nasty dag combiner bug that caused nondeterministic crashes (MY FAVORITE!):
SimplifySelectOps would eliminate a Select, delete it, then return true.

The clients would see that it did something and return null.

The top level would see a null return, and decide that nothing happened,
proceeding to process the node in other ways: boom.

The fix is simple: clients of SimplifySelectOps should return the select
node itself.

In order to catch really obnoxious bugs like this in the future, add an
assert that nodes are not deleted.  We do this by checking for a sentry node
type that the SDNode dtor sets when a node is destroyed.

llvm-svn: 28514
2006-05-27 00:43:02 +00:00
Andrew Lenharth
1dc9ec5874 Move this code to a common place
llvm-svn: 28329
2006-05-16 17:42:15 +00:00
Chris Lattner
afe72481f6 Comment out dead variables
llvm-svn: 28252
2006-05-12 17:57:54 +00:00
Chris Lattner
66adee93aa Two simplifications for token factor nodes: simplify tf(x,x) -> x.
simplify tf(x,y,y,z) -> tf(x,y,z).

llvm-svn: 28233
2006-05-12 05:01:37 +00:00
Evan Cheng
2c74848af1 Debugging info
llvm-svn: 28200
2006-05-09 06:55:15 +00:00
Chris Lattner
446e1ef26a Make the case I just checked in stronger. Now we compile this:
short test2(short X, short x) {
  int Y = (short)(X+x);
  return Y >> 1;
}

to:

_test2:
        add r2, r3, r4
        extsh r2, r2
        srawi r3, r2, 1
        blr

instead of:

_test2:
        add r2, r3, r4
        extsh r2, r2
        srwi r2, r2, 1
        extsh r3, r2
        blr

llvm-svn: 28175
2006-05-08 21:18:59 +00:00
Chris Lattner
29062da0ac Implement and_sext.ll:test3, generating:
_test4:
        srawi r3, r3, 16
        blr

instead of:

_test4:
        srwi r2, r3, 16
        extsh r3, r2
        blr

for:

short test4(unsigned X) {
  return (X >> 16);
}

llvm-svn: 28174
2006-05-08 20:59:41 +00:00
Chris Lattner
2935d8190c Compile this:
short test4(unsigned X) {
  return (X >> 16);
}

to:

_test4:
        movl 4(%esp), %eax
        sarl $16, %eax
        ret

instead of:

_test4:
        movl $-65536, %eax
        andl 4(%esp), %eax
        sarl $16, %eax
        ret

llvm-svn: 28171
2006-05-08 20:51:54 +00:00
Nate Begeman
e5ce5bb6da Fix PR772
llvm-svn: 28161
2006-05-08 01:35:01 +00:00
Chris Lattner
7e7bcf3a54 Simplify some code, add a couple minor missed folds
llvm-svn: 28152
2006-05-06 23:06:26 +00:00
Chris Lattner
2a4d7b845b remove cases handled elsewhere
llvm-svn: 28150
2006-05-06 22:43:44 +00:00
Chris Lattner
1ecb2a2dac Use the new TargetLowering::ComputeNumSignBits method to eliminate
sign_extend_inreg operations.  Though ComputeNumSignBits is still rudimentary,
this is enough to compile this:

short test(short X, short x) {
  int Y = X+x;
  return (Y >> 1);
}
short test2(short X, short x) {
  int Y = (short)(X+x);
  return Y >> 1;
}

into:

_test:
        add r2, r3, r4
        srawi r3, r2, 1
        blr
_test2:
        add r2, r3, r4
        extsh r2, r2
        srawi r3, r2, 1
        blr

instead of:

_test:
        add r2, r3, r4
        srawi r2, r2, 1
        extsh r3, r2
        blr
_test2:
        add r2, r3, r4
        extsh r2, r2
        srawi r2, r2, 1
        extsh r3, r2
        blr

llvm-svn: 28146
2006-05-06 09:30:03 +00:00
Chris Lattner
907e392dba Fold trunc(any_ext). This gives stuff like:
27,28c27
<       movzwl %di, %edi
<       movl %edi, %ebx
---
>       movw %di, %bx

llvm-svn: 28137
2006-05-05 22:56:26 +00:00
Chris Lattner
57f8c5a387 Shrink shifts when possible.
llvm-svn: 28136
2006-05-05 22:53:17 +00:00
Chris Lattner
3d26577396 Fold (fpext (load x)) -> (extload x)
llvm-svn: 28130
2006-05-05 21:34:35 +00:00
Chris Lattner
25a5283a86 Fold some common code.
llvm-svn: 28124
2006-05-05 06:32:04 +00:00
Chris Lattner
002ee91457 Implement:
  // fold (and (sext x), (sext y)) -> (sext (and x, y))
  // fold (or  (sext x), (sext y)) -> (sext (or  x, y))
  // fold (xor (sext x), (sext y)) -> (sext (xor x, y))
  // fold (and (aext x), (aext y)) -> (aext (and x, y))
  // fold (or  (aext x), (aext y)) -> (aext (or  x, y))
  // fold (xor (aext x), (aext y)) -> (aext (xor x, y))
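
These are sound because sign (and any) extension distributes over bitwise
ops; a quick C check of the sext case with illustrative values:

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
            int16_t x = -12345, y = 0x2B67;
            assert(((int32_t)x & (int32_t)y) == (int32_t)(int16_t)(x & y));
            assert(((int32_t)x | (int32_t)y) == (int32_t)(int16_t)(x | y));
            assert(((int32_t)x ^ (int32_t)y) == (int32_t)(int16_t)(x ^ y));
            return 0;
        }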

llvm-svn: 28123
2006-05-05 06:31:05 +00:00
Chris Lattner
5ac4293606 Pull and through and/or/xor. This compiles some bitfield code to:
        mov EAX, DWORD PTR [ESP + 4]
        mov ECX, DWORD PTR [EAX]
        mov EDX, ECX
        add EDX, EDX
        or EDX, ECX
        and EDX, -2147483648
        and ECX, 2147483647
        or EDX, ECX
        mov DWORD PTR [EAX], EDX
        ret

instead of:

        sub ESP, 4
        mov DWORD PTR [ESP], ESI
        mov EAX, DWORD PTR [ESP + 8]
        mov ECX, DWORD PTR [EAX]
        mov EDX, ECX
        add EDX, EDX
        mov ESI, ECX
        and ESI, -2147483648
        and EDX, -2147483648
        or EDX, ESI
        and ECX, 2147483647
        or EDX, ECX
        mov DWORD PTR [EAX], EDX
        mov ESI, DWORD PTR [ESP]
        add ESP, 4
        ret

llvm-svn: 28122
2006-05-05 06:10:43 +00:00
Chris Lattner
812646aa0c Implement a variety of simplifications for ANY_EXTEND.
llvm-svn: 28121
2006-05-05 05:58:59 +00:00
Chris Lattner
8d6fc20181 Factor some code, add these transformations:
  // fold (and (trunc x), (trunc y)) -> (trunc (and x, y))
  // fold (or  (trunc x), (trunc y)) -> (trunc (or  x, y))
  // fold (xor (trunc x), (trunc y)) -> (trunc (xor x, y))
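
The mirror image of the extend folds: truncation also distributes over
bitwise ops. A quick C check:

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
            uint32_t x = 0xDEADBEEFu, y = 0x0F0F1234u;
            assert(((uint16_t)x & (uint16_t)y) == (uint16_t)(x & y));
            assert(((uint16_t)x | (uint16_t)y) == (uint16_t)(x | y));
            assert(((uint16_t)x ^ (uint16_t)y) == (uint16_t)(x ^ y));
            return 0;
        }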

llvm-svn: 28120
2006-05-05 05:51:50 +00:00
Chris Lattner
2b48a94413 Remove a bogus transformation. This fixes SingleSource/UnitTests/2006-01-23-InitializedBitField.c
with some changes I have to the new CFE.

llvm-svn: 28022
2006-04-28 23:33:20 +00:00
Chris Lattner
662e940f73 Fix a couple more memory issues
llvm-svn: 27930
2006-04-21 15:32:26 +00:00
Chris Lattner
cc47ab3305 Fix a really subtle and obnoxious memory bug that caused issues with an
llvm-gcc4 bootstrap.  Whenever a node is deleted by the dag combiner, it
*must* be returned by the visit function, or the dag combiner will not
know that the node has been processed (and will, e.g., send it to the
target dag combine xforms).

llvm-svn: 27922
2006-04-20 23:55:59 +00:00
Evan Cheng
a320abc494 Turn a VAND into a VECTOR_SHUFFLE if applicable.
DAG combiner can turn a VAND V, <-1, 0, -1, -1>, i.e. clearing vector elements,
into a vector shuffle with a zero vector. It only does so when TLI tells it
the xform is profitable.

llvm-svn: 27874
2006-04-20 08:56:16 +00:00
Chris Lattner
e1401e3610 Canonicalize vvector_shuffle(x,x) -> vvector_shuffle(x,undef) to enable patterns
to match again :)

llvm-svn: 27533
2006-04-08 05:34:25 +00:00
Chris Lattner
098c01e94e Codegen shufflevector as VVECTOR_SHUFFLE
llvm-svn: 27529
2006-04-08 04:15:24 +00:00
Evan Cheng
613996c55e 1. If both vector operands of a vector_shuffle are undef, turn it into an undef.
2. A shuffle mask element can also be an undef.

llvm-svn: 27472
2006-04-06 23:20:43 +00:00
Chris Lattner
4ea52cac01 Do not create ZEXTLOAD's unless we are before legalize or the operation is
legal.

llvm-svn: 27402
2006-04-04 17:39:18 +00:00
Chris Lattner
e1e3adf802 Add a missing check; this fixes UnitTests/Vector/sumarray.c
llvm-svn: 27375
2006-04-03 17:29:28 +00:00
Chris Lattner
04c00fc844 Add a missing check whose absence broke a bunch of vector tests.
llvm-svn: 27374
2006-04-03 17:21:50 +00:00
Andrew Lenharth
94f012f606 back this out
llvm-svn: 27367
2006-04-03 03:16:50 +00:00
Andrew Lenharth
015eaf5f33 This should be a win on every arch
llvm-svn: 27364
2006-04-02 21:42:45 +00:00
Chris Lattner
4993249a04 Add a little dag combine to compile this:
int %AreSecondAndThirdElementsBothNegative(<4 x float>* %in) {
entry:
        %tmp1 = load <4 x float>* %in           ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.ppc.altivec.vcmpgefp.p( int 1, <4 x float> < float 0x7FF8000000000000, float 0.000000e+00, float 0.000000e+00, float 0x7FF8000000000000 >, <4 x float> %tmp1 )           ; <int> [#uses=1]
        %tmp = seteq int %tmp, 0                ; <bool> [#uses=1]
        %tmp3 = cast bool %tmp to int           ; <int> [#uses=1]
        ret int %tmp3
}

into this:

_AreSecondAndThirdElementsBothNegative:
        mfspr r2, 256
        oris r4, r2, 49152
        mtspr 256, r4
        li r4, lo16(LCPI1_0)
        lis r5, ha16(LCPI1_0)
        lvx v0, 0, r3
        lvx v1, r5, r4
        vcmpgefp. v0, v1, v0
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        mtspr 256, r2
        blr

instead of this:

_AreSecondAndThirdElementsBothNegative:
        mfspr r2, 256
        oris r4, r2, 49152
        mtspr 256, r4
        li r4, lo16(LCPI1_0)
        lis r5, ha16(LCPI1_0)
        lvx v0, 0, r3
        lvx v1, r5, r4
        vcmpgefp. v0, v1, v0
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        xori r3, r3, 1
        cntlzw r3, r3
        srwi r3, r3, 5
        mtspr 256, r2
        blr

llvm-svn: 27356
2006-04-02 06:11:11 +00:00
Chris Lattner
0442a18758 Constant fold all of the vector binops. This allows us to compile this:
"vector unsigned char mergeLowHigh = (vector unsigned char)
( 8, 9, 10, 11, 16, 17, 18, 19, 12, 13, 14, 15, 20, 21, 22, 23 );
vector unsigned char mergeHighLow = vec_xor( mergeLowHigh, vec_splat_u8(8));"

aka:

void %test2(<16 x sbyte>* %P) {
  store <16 x sbyte> cast (<4 x int> xor (<4 x int> cast (<16 x ubyte> < ubyte 8, ubyte 9, ubyte 10, ubyte 11, ubyte 16, ubyte 17, ubyte 18, ubyte 19, ubyte 12, ubyte 13, ubyte 14, ubyte 15, ubyte 20, ubyte 21, ubyte 22, ubyte 23 > to <4 x int>), <4 x int> cast (<16 x sbyte> < sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8 > to <4 x int>)) to <16 x sbyte>), <16 x sbyte> * %P
  ret void
}

into this:

_test2:
        mfspr r2, 256
        oris r4, r2, 32768
        mtspr 256, r4
        li r4, lo16(LCPI2_0)
        lis r5, ha16(LCPI2_0)
        lvx v0, r5, r4
        stvx v0, 0, r3
        mtspr 256, r2
        blr

instead of this:

_test2:
        mfspr r2, 256
        oris r4, r2, 49152
        mtspr 256, r4
        li r4, lo16(LCPI2_0)
        lis r5, ha16(LCPI2_0)
        vspltisb v0, 8
        lvx v1, r5, r4
        vxor v0, v1, v0
        stvx v0, 0, r3
        mtspr 256, r2
        blr

... which occurs here:
http://developer.apple.com/hardware/ve/calcspeed.html

llvm-svn: 27343
2006-04-02 03:25:57 +00:00
Chris Lattner
e4e64b6b85 Implement constant folding of bit_convert of arbitrary constant vbuild_vector nodes.
llvm-svn: 27341
2006-04-02 02:53:43 +00:00
Chris Lattner
39dcf1a9e2 Delete identity shuffles, implementing CodeGen/Generic/vector-identity-shuffle.ll
llvm-svn: 27317
2006-03-31 22:16:43 +00:00
Chris Lattner
7e30af3887 Remove dead *extloads. This allows us to codegen vector.ll:test_extract_elt
to:

test_extract_elt:
        alloc r3 = ar.pfs,0,1,0,0
        adds r8 = 12, r32
        ;;
        ldfs f8 = [r8]
        mov ar.pfs = r3
        br.ret.sptk.many rp

instead of:

test_extract_elt:
        alloc r3 = ar.pfs,0,1,0,0
        adds r8 = 28, r32
        adds r9 = 24, r32
        adds r10 = 20, r32
        adds r11 = 16, r32
        ;;
        ldfs f6 = [r8]
        ;;
        ldfs f6 = [r9]
        adds r8 = 12, r32
        adds r9 = 8, r32
        adds r14 = 4, r32
        ;;
        ldfs f6 = [r10]
        ;;
        ldfs f6 = [r11]
        ldfs f8 = [r8]
        ;;
        ldfs f6 = [r9]
        ;;
        ldfs f6 = [r14]
        ;;
        ldfs f6 = [r32]
        mov ar.pfs = r3
        br.ret.sptk.many rp

llvm-svn: 27297
2006-03-31 18:10:41 +00:00
Chris Lattner
2d8551c85b Delete dead loads in the dag. This allows us to compile
vector.ll:test_extract_elt2 into:

_test_extract_elt2:
        lfd f1, 32(r3)
        blr

instead of:

_test_extract_elt2:
        lfd f0, 56(r3)
        lfd f0, 48(r3)
        lfd f0, 40(r3)
        lfd f1, 32(r3)
        lfd f0, 24(r3)
        lfd f0, 16(r3)
        lfd f0, 8(r3)
        lfd f0, 0(r3)
        blr

llvm-svn: 27296
2006-03-31 18:06:18 +00:00
Chris Lattner
20e619fba3 When building a VVECTOR_SHUFFLE node from extract_element operations, make
sure to build it as SHUFFLE(X, undef, mask), not SHUFFLE(X, X, mask).

The latter is not canonical form, and prevents the PPC splat pattern from
matching.  For a particular splat, we go from generating this:

	li r10, lo16(LCPI1_0)
	lis r11, ha16(LCPI1_0)
	lvx v3, r11, r10
	vperm v3, v2, v2, v3

to generating:

	vspltw v3, v2, 3

llvm-svn: 27236
2006-03-28 22:19:47 +00:00
Chris Lattner
a46dfe80c8 Canonicalize VECTOR_SHUFFLE(X, X, Y) -> VECTOR_SHUFFLE(X,undef,Y')
llvm-svn: 27235
2006-03-28 22:11:53 +00:00
Chris Lattner
c9992548fc Turn a series of extract_element's feeding a build_vector into a
vector_shuffle node.  For this:

void test(__m128 *res, __m128 *A, __m128 *B) {
  *res = _mm_unpacklo_ps(*A, *B);
}

we now produce this code:

_test:
        movl 8(%esp), %eax
        movaps (%eax), %xmm0
        movl 12(%esp), %eax
        unpcklps (%eax), %xmm0
        movl 4(%esp), %eax
        movaps %xmm0, (%eax)
        ret

instead of this:

_test:
        subl $76, %esp
        movl 88(%esp), %eax
        movaps (%eax), %xmm0
        movaps %xmm0, (%esp)
        movaps %xmm0, 32(%esp)
        movss 4(%esp), %xmm0
        movss 32(%esp), %xmm1
        unpcklps %xmm0, %xmm1
        movl 84(%esp), %eax
        movaps (%eax), %xmm0
        movaps %xmm0, 16(%esp)
        movaps %xmm0, 48(%esp)
        movss 20(%esp), %xmm0
        movss 48(%esp), %xmm2
        unpcklps %xmm0, %xmm2
        unpcklps %xmm1, %xmm2
        movl 80(%esp), %eax
        movaps %xmm2, (%eax)
        addl $76, %esp
        ret

GCC produces this (with -fomit-frame-pointer):

_test:
        subl    $12, %esp
        movl    20(%esp), %eax
        movaps  (%eax), %xmm0
        movl    24(%esp), %eax
        unpcklps        (%eax), %xmm0
        movl    16(%esp), %eax
        movaps  %xmm0, (%eax)
        addl    $12, %esp
        ret

llvm-svn: 27233
2006-03-28 20:28:38 +00:00
Chris Lattner
b7163598f9 Don't crash on X^X if X is a vector. Instead, produce a vector of zeros.
llvm-svn: 27229
2006-03-28 19:11:05 +00:00
Chris Lattner
dc1eab5886 Don't call SimplifyDemandedBits on vectors
llvm-svn: 27128
2006-03-25 22:19:00 +00:00
Chris Lattner
5336a59e4b fold insertelement(buildvector) -> buildvector if the inserted element # is
a constant.  This implements test_constant_insert in CodeGen/Generic/vector.ll

llvm-svn: 26851
2006-03-19 01:27:56 +00:00
Nate Begeman
bb01d4f272 Remove BRTWOWAY*
Make the PPC backend not dependent on BRTWOWAY_CC and make the branch
selector smarter about the code it generates, fixing a case in the
readme.

llvm-svn: 26814
2006-03-17 01:40:33 +00:00
Chris Lattner
68ac09d5cb make sure dead token factor nodes are removed by the dag combiner.
llvm-svn: 26731
2006-03-13 18:37:30 +00:00
Chris Lattner
d8c2a48d58 Fold X+Y -> X|Y when safe. This implements:
Regression/CodeGen/PowerPC/and_add.ll

a case that occurs with dynamic allocas of constant size.
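
"Safe" means the operands share no set bits, so no carries can occur and
add and or agree. A quick check with illustrative values:

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
            uint32_t x = 0xFFFF0000u, y = 0x00000FFFu;
            assert((x & y) == 0);        /* disjoint bits: no carries */
            assert(x + y == (x | y));    /* so X+Y == X|Y */
            return 0;
        }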

llvm-svn: 26727
2006-03-13 06:51:27 +00:00
Chris Lattner
8bb6cb7d7b add a couple of missing folds
llvm-svn: 26724
2006-03-13 06:26:26 +00:00
Chris Lattner
bdaf4f38b5 Reinstate this now that the offending opposite xform has been removed.
llvm-svn: 26548
2006-03-05 19:53:55 +00:00
Evan Cheng
d428e22c07 Back out fold (shl (add x, c1), c2) -> (add (shl x, c2), c1<<c2) for now.
It's causing an infinite loop compiling ldecod on x86 / Darwin.

llvm-svn: 26544
2006-03-05 07:30:16 +00:00
Chris Lattner
3bc4050217 Add some simple copysign folds
llvm-svn: 26543
2006-03-05 05:30:57 +00:00
Chris Lattner
f29f5204cc fold (mul (add x, c1), c2) -> (add (mul x, c2), c1*c2)
fold (shl (add x, c1), c2) -> (add (shl x, c2), c1<<c2)

This allows us to compile CodeGen/PowerPC/addi-reassoc.ll into:

_test1:
        slwi r2, r4, 4
        add r2, r2, r3
        lwz r3, 36(r2)
        blr
_test2:
        mulli r2, r4, 5
        add r2, r2, r3
        lbz r2, 11(r2)
        extsb r3, r2
        blr

instead of:

_test1:
        addi r2, r4, 2
        slwi r2, r2, 4
        add r2, r3, r2
        lwz r3, 4(r2)
        blr
_test2:
        addi r2, r4, 2
        mulli r2, r2, 5
        add r2, r3, r2
        lbz r2, 1(r2)
        extsb r3, r2
        blr

llvm-svn: 26535
2006-03-04 23:33:26 +00:00
Chris Lattner
0db2f2c689 Fix CodeGen/Generic/2006-03-01-dagcombineinfloop.ll, an infinite loop
in the dag combiner on 176.gcc on x86.

llvm-svn: 26459
2006-03-01 21:47:21 +00:00
Chris Lattner
232024edb8 Fix a typo evan noticed
llvm-svn: 26454
2006-03-01 19:55:35 +00:00
Chris Lattner
bc1c85beea Add support for target-specific dag combines
llvm-svn: 26443
2006-03-01 04:53:38 +00:00
Chris Lattner
fbcd62d3bb Add a new AddToWorkList method, start using it
llvm-svn: 26441
2006-03-01 04:03:14 +00:00
Chris Lattner
324871ef1a Pull shifts by a constant through multiplies (a form of reassociation),
implementing Regression/CodeGen/X86/mul-shift-reassoc.ll

llvm-svn: 26440
2006-03-01 03:44:24 +00:00
Evan Cheng
b97aab4371 Vector ops lowering.
llvm-svn: 26436
2006-03-01 01:09:54 +00:00
Chris Lattner
f0032b350c Compile:
unsigned foo4(unsigned short *P) { return *P & 255; }
unsigned foo5(short *P) { return *P & 255; }

to:

_foo4:
        lbz r3,1(r3)
        blr
_foo5:
        lbz r3,1(r3)
        blr

not:

_foo4:
        lhz r2, 0(r3)
        rlwinm r3, r2, 0, 24, 31
        blr
_foo5:
        lhz r2, 0(r3)
        rlwinm r3, r2, 0, 24, 31
        blr

llvm-svn: 26419
2006-02-28 06:49:37 +00:00
Chris Lattner
bdbc4476d9 Fold "and (LOAD P), 255" -> zextload. This allows us to compile:
unsigned foo3(unsigned *P) { return *P & 255; }
as:
_foo3:
        lbz r3, 3(r3)
        blr

instead of:

_foo3:
        lwz r2, 0(r3)
        rlwinm r3, r2, 0, 24, 31
        blr

and:

unsigned short foo2(float a) { return a; }

as:
_foo2:
        fctiwz f0, f1
        stfd f0, -8(r1)
        lhz r3, -2(r1)
        blr

instead of:

_foo2:
        fctiwz f0, f1
        stfd f0, -8(r1)
        lwz r2, -4(r1)
        rlwinm r3, r2, 0, 16, 31
        blr

llvm-svn: 26417
2006-02-28 06:35:35 +00:00
Chris Lattner
0f8a727c49 fold (sra (sra x, c1), c2) -> (sra x, c1+c2)
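
In C terms, assuming the usual arithmetic right shift on signed values and
c1 + c2 below the bit width (the combiner must guard that case):

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
            int32_t x = -123456789;
            assert(((x >> 5) >> 7) == (x >> 12));
            return 0;
        }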
llvm-svn: 26416
2006-02-28 06:23:04 +00:00
Chris Lattner
47ee42829d remove some completed notes
llvm-svn: 26390
2006-02-27 00:39:31 +00:00
Chris Lattner
301f45cf6f Fix a problem Nate and Duraid reported where simplifying nodes can cause
them to get resurrected, in which case deleting the undead nodes is
unfriendly.

llvm-svn: 26291
2006-02-20 06:51:04 +00:00
Nate Begeman
abac61603f Add checks to make sure we don't create bogus extend nodes, and fix a bug
where we were doing exactly that, which was causing failures on x86 and
alpha.

llvm-svn: 26284
2006-02-18 02:40:58 +00:00
Chris Lattner
375e1a71cc Fix a tricky issue in the SimplifyDemandedBits code where CombineTo wasn't
exactly the API we wanted to call into.  This fixes the crash on crafty last
night.

llvm-svn: 26269
2006-02-17 21:58:01 +00:00
Nate Begeman
fb5dbadf15 Clean up DemandedBitsAreZero interface
Make more use of the new mask helpers in valuetypes.h
Combine (sra (srl x, c1), c1) -> sext_inreg if legal

llvm-svn: 26263
2006-02-17 19:54:08 +00:00
Nate Begeman
57b3567552 Don't expand sdiv by power of two before legalize, since it will likely
generate illegal nodes.

llvm-svn: 26261
2006-02-17 07:26:20 +00:00
Nate Begeman
5965bd19f8 kill ADD_PARTS & SUB_PARTS and replace them with fancy new ADDC, ADDE, SUBC
and SUBE nodes that actually expose what's going on and allow for
significant simplifications in the targets.
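
A sketch of what the expansion does for a 64-bit add on a 32-bit target:
ADDC produces the low word plus a carry, ADDE consumes it. This is plain
illustrative C, not the node API:

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
            uint64_t a = 0x00000001FFFFFFFFull, b = 0x0000000100000001ull;
            uint32_t alo = (uint32_t)a, ahi = (uint32_t)(a >> 32);
            uint32_t blo = (uint32_t)b, bhi = (uint32_t)(b >> 32);

            uint32_t lo    = alo + blo;          /* ADDC: low halves */
            uint32_t carry = lo < alo;           /* carry out of the low add */
            uint32_t hi    = ahi + bhi + carry;  /* ADDE: high halves + carry */

            assert((((uint64_t)hi << 32) | lo) == a + b);
            return 0;
        }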

llvm-svn: 26255
2006-02-17 05:43:56 +00:00
Nate Begeman
8a77efe4f7 Rework the SelectionDAG-based implementations of SimplifyDemandedBits
and ComputeMaskedBits to match the new improved versions in instcombine.
Tested against all of multisource/benchmarks on ppc.

llvm-svn: 26238
2006-02-16 21:11:51 +00:00
Chris Lattner
471627c49d Lowering of sdiv X, pow2 was broken; this fixes it. This patch was written
by Nate; I'm just committing it for him.

llvm-svn: 26230
2006-02-16 08:02:36 +00:00
Jim Laskey
2eea436192 Should not combine ISD::LOCATIONs until we have a scheme to remove them from
MachineDebugInfo tables.

llvm-svn: 26216
2006-02-15 19:34:44 +00:00
Chris Lattner
a10e23c19f Compile this:
        xori r6, r2, 1
        rlwinm r6, r6, 0, 31, 31
        cmpwi cr0, r6, 0
        bne cr0, LBB1_3 ; endif

to this:

        rlwinm r6, r2, 0, 31, 31
        cmpwi cr0, r6, 0
        beq cr0, LBB1_3 ; endif

llvm-svn: 26047
2006-02-08 02:13:15 +00:00
Nate Begeman
8c9cd461df Back out previous commit, it isn't safe.
llvm-svn: 26006
2006-02-05 08:23:00 +00:00
Nate Begeman
3dc8b89493 fold c1 << (x + c2) into (c1 << c2) << x. fix a warning.
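
The reassociation in C, keeping the total shift count in range (constants
are illustrative; unsigned overflow wraps identically on both sides):

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
            uint32_t c1 = 5, c2 = 3;
            for (unsigned x = 0; x + c2 < 32; ++x)
                assert((c1 << (x + c2)) == ((c1 << c2) << x));
            return 0;
        }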
llvm-svn: 26005
2006-02-05 08:07:24 +00:00
Nate Begeman
c89fdf1eb3 Handle urem by shifted powers of 2.
llvm-svn: 26001
2006-02-05 07:36:48 +00:00
Nate Begeman
25d178bece handle combining A / (B << N) into A >>u (log2(B)+N) when B is a power of 2
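
In C, with B a power of two (B << N is then also a power of two, so the
unsigned divide is a logical shift right). Constants are illustrative:

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
            uint32_t A = 0xDEADBEEFu;
            unsigned log2B = 3, N = 2;           /* B = 8 */
            assert(A / (8u << N) == A >> (log2B + N));
            return 0;
        }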
llvm-svn: 26000
2006-02-05 07:20:23 +00:00
Nate Begeman
dc7bba9ffe Add a framework for eliminating instructions that produce undemanded bits.
llvm-svn: 25945
2006-02-03 22:24:05 +00:00
Nate Begeman
22e251abf1 Add common code for reassociating ops in the dag combiner
llvm-svn: 25934
2006-02-03 06:46:56 +00:00
Chris Lattner
49beaf40fc Turn any_extend nodes into zero_extend nodes when it allows us to remove an
and instruction.  This allows us to compile stuff like this:

bool %X(int %X) {
        %Y = add int %X, 14
        %Z = setne int %Y, 12345
        ret bool %Z
}

to this:

_X:
        cmpl $12331, 4(%esp)
        setne %al
        movzbl %al, %eax
        ret

instead of this:

_X:
        cmpl $12331, 4(%esp)
        setne %al
        movzbl %al, %eax
        andl $1, %eax
        ret

This occurs quite a bit with the X86 backend.  For example, 25 times in
lambda, 30 times in 177.mesa, 14 times in galgel,  70 times in fma3d,
25 times in vpr, several hundred times in gcc, ~45 times in crafty,
~60 times in parser, ~140 times in eon, 110 times in perlbmk, 55 on gap,
16 times on bzip2, 14 times on twolf, and 1-2 times in many other SPEC2K
programs.

llvm-svn: 25901
2006-02-02 07:17:31 +00:00
Chris Lattner
49ce35542f add two dag combines:
(C1-X) == C2 --> X == C1-C2
(X+C1) == C2 --> X == C2-C1

This allows us to compile this:

bool %X(int %X) {
        %Y = add int %X, 14
        %Z = setne int %Y, 12345
        ret bool %Z
}

into this:

_X:
        cmpl $12331, 4(%esp)
        setne %al
        movzbl %al, %eax
        andl $1, %eax
        ret

not this:

_X:
        movl $14, %eax
        addl 4(%esp), %eax
        cmpl $12345, %eax
        setne %al
        movzbl %al, %eax
        andl $1, %eax
        ret

Testcase here: Regression/CodeGen/X86/compare-add.ll

nukage of the and coming up next.

llvm-svn: 25898
2006-02-02 06:36:13 +00:00
Nate Begeman
7e7f439f85 Fix some of the stuff in the PPC README file, and clean up legalization
of the SELECT_CC, BR_CC, and BRTWOWAY_CC nodes.

llvm-svn: 25875
2006-02-01 07:19:44 +00:00
Chris Lattner
f0b24d2dc0 Move MaskedValueIsZero from the DAGCombiner to the TargetLowering interface, making isMaskedValueZeroForTargetNode simpler and usable from other parts of the compiler.
llvm-svn: 25803
2006-01-30 04:09:27 +00:00
Chris Lattner
3b40e64aa3 pass the address of MaskedValueIsZero into isMaskedValueZeroForTargetNode,
to permit recursion

llvm-svn: 25799
2006-01-30 03:49:37 +00:00
Chris Lattner
678da98835 eliminate uses of SelectionDAG::getBR2Way_CC
llvm-svn: 25767
2006-01-29 06:00:45 +00:00
Nate Begeman
af397cec0b Add a missing case to the dag combiner.
llvm-svn: 25723
2006-01-28 01:06:30 +00:00
Chris Lattner
de02d7727f Add explicit #includes of <iostream>
llvm-svn: 25515
2006-01-22 23:41:00 +00:00
Nate Begeman
569c439567 Get rid of code in the DAGCombiner that is duplicated in SelectionDAG.cpp
Now all constant folding in the code generator is in one place.

llvm-svn: 25426
2006-01-18 22:35:16 +00:00
Chris Lattner
5fee908be5 Fix a backwards conditional that caused an inf loop in some cases. This
fixes: test/Regression/CodeGen/Generic/2005-01-18-SetUO-InfLoop.ll

llvm-svn: 25419
2006-01-18 19:13:41 +00:00