Chris Lattner
2d8551c85b
Delete dead loads in the dag. This allows us to compile
...
vector.ll:test_extract_elt2 into:
_test_extract_elt2:
lfd f1, 32(r3)
blr
instead of:
_test_extract_elt2:
lfd f0, 56(r3)
lfd f0, 48(r3)
lfd f0, 40(r3)
lfd f1, 32(r3)
lfd f0, 24(r3)
lfd f0, 16(r3)
lfd f0, 8(r3)
lfd f0, 0(r3)
blr
llvm-svn: 27296
2006-03-31 18:06:18 +00:00
Chris Lattner
20e619fba3
When building a VVECTOR_SHUFFLE node from extract_element operations, make
...
sure to build it as SHUFFLE(X, undef, mask), not SHUFFLE(X, X, mask).
The latter is not canonical form, and prevents the PPC splat pattern from
matching. For a particular splat, we go from generating this:
li r10, lo16(LCPI1_0)
lis r11, ha16(LCPI1_0)
lvx v3, r11, r10
vperm v3, v2, v2, v3
to generating:
vspltw v3, v2, 3
llvm-svn: 27236
2006-03-28 22:19:47 +00:00
Chris Lattner
a46dfe80c8
Canonicalize VECTOR_SHUFFLE(X, X, Y) -> VECTOR_SHUFFLE(X,undef,Y')
...
llvm-svn: 27235
2006-03-28 22:11:53 +00:00
Chris Lattner
c9992548fc
Turn a series of extract_element's feeding a build_vector into a
...
vector_shuffle node. For this:
void test(__m128 *res, __m128 *A, __m128 *B) {
*res = _mm_unpacklo_ps(*A, *B);
}
we now produce this code:
_test:
movl 8(%esp), %eax
movaps (%eax), %xmm0
movl 12(%esp), %eax
unpcklps (%eax), %xmm0
movl 4(%esp), %eax
movaps %xmm0, (%eax)
ret
instead of this:
_test:
subl $76, %esp
movl 88(%esp), %eax
movaps (%eax), %xmm0
movaps %xmm0, (%esp)
movaps %xmm0, 32(%esp)
movss 4(%esp), %xmm0
movss 32(%esp), %xmm1
unpcklps %xmm0, %xmm1
movl 84(%esp), %eax
movaps (%eax), %xmm0
movaps %xmm0, 16(%esp)
movaps %xmm0, 48(%esp)
movss 20(%esp), %xmm0
movss 48(%esp), %xmm2
unpcklps %xmm0, %xmm2
unpcklps %xmm1, %xmm2
movl 80(%esp), %eax
movaps %xmm2, (%eax)
addl $76, %esp
ret
GCC produces this (with -fomit-frame-pointer):
_test:
subl $12, %esp
movl 20(%esp), %eax
movaps (%eax), %xmm0
movl 24(%esp), %eax
unpcklps (%eax), %xmm0
movl 16(%esp), %eax
movaps %xmm0, (%eax)
addl $12, %esp
ret
llvm-svn: 27233
2006-03-28 20:28:38 +00:00
Chris Lattner
b7163598f9
Don't crash on X^X if X is a vector. Instead, produce a vector of zeros.
...
llvm-svn: 27229
2006-03-28 19:11:05 +00:00
Chris Lattner
dc1eab5886
Don't call SimplifyDemandedBits on vectors
...
llvm-svn: 27128
2006-03-25 22:19:00 +00:00
Chris Lattner
5336a59e4b
fold insertelement(buildvector) -> buildvector if the inserted element # is
...
a constant. This implements test_constant_insert in CodeGen/Generic/vector.ll
llvm-svn: 26851
2006-03-19 01:27:56 +00:00
Nate Begeman
bb01d4f272
Remove BRTWOWAY*
...
Make the PPC backend not dependent on BRTWOWAY_CC and make the branch
selector smarter about the code it generates, fixing a case in the
readme.
llvm-svn: 26814
2006-03-17 01:40:33 +00:00
Chris Lattner
68ac09d5cb
make sure dead token factor nodes are removed by the dag combiner.
...
llvm-svn: 26731
2006-03-13 18:37:30 +00:00
Chris Lattner
d8c2a48d58
Fold X+Y -> X|Y when safe. This implements:
...
Regression/CodeGen/PowerPC/and_add.ll
a case that occurs with dynamic allocas of constant size.
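The fold is safe exactly when the two operands have no bits in common, so the addition can never carry. A minimal standalone check of that identity (illustrative values, not the DAGCombiner code):

#include <cassert>
#include <cstdint>

int main() {
  // With no bit set in both operands, addition generates no carries,
  // so X + Y and X | Y produce the same value.
  uint32_t X = 0xF0F0F0F0, Y = 0x01010101;
  assert((X & Y) == 0);        // the "when safe" precondition
  assert(X + Y == (X | Y));    // the fold itself
  return 0;
}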
llvm-svn: 26727
2006-03-13 06:51:27 +00:00
Chris Lattner
8bb6cb7d7b
add a couple of missing folds
...
llvm-svn: 26724
2006-03-13 06:26:26 +00:00
Chris Lattner
bdaf4f38b5
Reinstate this now that the offending opposite xform has been removed.
...
llvm-svn: 26548
2006-03-05 19:53:55 +00:00
Evan Cheng
d428e22c07
Back out fold (shl (add x, c1), c2) -> (add (shl x, c2), c1<<c2) for now.
...
It's causing an infinite loop compiling ldecod on x86 / Darwin.
llvm-svn: 26544
2006-03-05 07:30:16 +00:00
Chris Lattner
3bc4050217
Add some simple copysign folds
...
llvm-svn: 26543
2006-03-05 05:30:57 +00:00
Chris Lattner
f29f5204cc
fold (mul (add x, c1), c2) -> (add (mul x, c2), c1*c2)
...
fold (shl (add x, c1), c2) -> (add (shl x, c2), c1<<c2)
This allows us to compile CodeGen/PowerPC/addi-reassoc.ll into:
_test1:
slwi r2, r4, 4
add r2, r2, r3
lwz r3, 36(r2)
blr
_test2:
mulli r2, r4, 5
add r2, r2, r3
lbz r2, 11(r2)
extsb r3, r2
blr
instead of:
_test1:
addi r2, r4, 2
slwi r2, r2, 4
add r2, r3, r2
lwz r3, 4(r2)
blr
_test2:
addi r2, r4, 2
mulli r2, r2, 5
add r2, r3, r2
lbz r2, 1(r2)
extsb r3, r2
blr
llvm-svn: 26535
2006-03-04 23:33:26 +00:00
Chris Lattner
0db2f2c689
Fix CodeGen/Generic/2006-03-01-dagcombineinfloop.ll, an infinite loop
...
in the dag combiner on 176.gcc on x86.
llvm-svn: 26459
2006-03-01 21:47:21 +00:00
Chris Lattner
232024edb8
Fix a typo evan noticed
...
llvm-svn: 26454
2006-03-01 19:55:35 +00:00
Chris Lattner
bc1c85beea
Add support for target-specific dag combines
...
llvm-svn: 26443
2006-03-01 04:53:38 +00:00
Chris Lattner
fbcd62d3bb
Add a new AddToWorkList method, start using it
...
llvm-svn: 26441
2006-03-01 04:03:14 +00:00
Chris Lattner
324871ef1a
Pull shifts by a constant through multiplies (a form of reassociation),
...
implementing Regression/CodeGen/X86/mul-shift-reassoc.ll
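The reassociation relies on a shift distributing over a multiply in modular arithmetic: (x * c1) << c2 equals x * (c1 << c2). A small illustrative check of that identity with arbitrary values (not the actual combiner code):

#include <cassert>
#include <cstdint>

int main() {
  // (x * c1) << c2 == x * (c1 << c2) for unsigned (wrapping) arithmetic,
  // since both sides compute x * c1 * 2^c2 modulo 2^32.
  uint32_t x = 0x12345678, c1 = 37, c2 = 5;
  assert(((x * c1) << c2) == x * (c1 << c2));
  return 0;
}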
llvm-svn: 26440
2006-03-01 03:44:24 +00:00
Evan Cheng
b97aab4371
Vector ops lowering.
...
llvm-svn: 26436
2006-03-01 01:09:54 +00:00
Chris Lattner
f0032b350c
Compile:
...
unsigned foo4(unsigned short *P) { return *P & 255; }
unsigned foo5(short *P) { return *P & 255; }
to:
_foo4:
lbz r3,1(r3)
blr
_foo5:
lbz r3,1(r3)
blr
not:
_foo4:
lhz r2, 0(r3)
rlwinm r3, r2, 0, 24, 31
blr
_foo5:
lhz r2, 0(r3)
rlwinm r3, r2, 0, 24, 31
blr
llvm-svn: 26419
2006-02-28 06:49:37 +00:00
Chris Lattner
bdbc4476d9
Fold "and (LOAD P), 255" -> zextload. This allows us to compile:
...
unsigned foo3(unsigned *P) { return *P & 255; }
as:
_foo3:
lbz r3, 3(r3)
blr
instead of:
_foo3:
lwz r2, 0(r3)
rlwinm r3, r2, 0, 24, 31
blr
and:
unsigned short foo2(float a) { return a; }
as:
_foo2:
fctiwz f0, f1
stfd f0, -8(r1)
lhz r3, -2(r1)
blr
instead of:
_foo2:
fctiwz f0, f1
stfd f0, -8(r1)
lwz r2, -4(r1)
rlwinm r3, r2, 0, 16, 31
blr
llvm-svn: 26417
2006-02-28 06:35:35 +00:00
Chris Lattner
0f8a727c49
fold (sra (sra x, c1), c2) -> (sra x, c1+c2)
...
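The fold holds as long as the combined shift amount stays below the bit width. A sketch of the identity on int32_t, assuming the usual arithmetic right shift for negative signed values (guaranteed since C++20):

#include <cassert>
#include <cstdint>

int main() {
  // Two arithmetic right shifts compose: (x >> c1) >> c2 == x >> (c1 + c2),
  // provided c1 + c2 is still less than the bit width.
  int32_t x = -123456;
  int c1 = 3, c2 = 4;
  assert(((x >> c1) >> c2) == (x >> (c1 + c2)));
  return 0;
}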
llvm-svn: 26416
2006-02-28 06:23:04 +00:00
Chris Lattner
47ee42829d
remove some completed notes
...
llvm-svn: 26390
2006-02-27 00:39:31 +00:00
Chris Lattner
301f45cf6f
Fix a problem Nate and Duraid reported where simplifying nodes can cause
...
them to get resurrected, in which case deleting the undead nodes is
unfriendly.
llvm-svn: 26291
2006-02-20 06:51:04 +00:00
Nate Begeman
abac61603f
Add checks to make sure we don't create bogus extend nodes, and fix a bug
...
where we were doing exactly that, which was causing failures on x86 and
alpha.
llvm-svn: 26284
2006-02-18 02:40:58 +00:00
Chris Lattner
375e1a71cc
Fix a tricky issue in the SimplifyDemandedBits code where CombineTo wasn't
...
exactly the API we wanted to call into. This fixes the crash on crafty last
night.
llvm-svn: 26269
2006-02-17 21:58:01 +00:00
Nate Begeman
fb5dbadf15
Clean up DemandedBitsAreZero interface
...
Make more use of the new mask helpers in valuetypes.h
Combine (sra (srl x, c1), c1) -> sext_inreg if legal
llvm-svn: 26263
2006-02-17 19:54:08 +00:00
Nate Begeman
57b3567552
Don't expand sdiv by power of two before legalize, since it will likely
...
generate illegal nodes.
llvm-svn: 26261
2006-02-17 07:26:20 +00:00
Nate Begeman
5965bd19f8
kill ADD_PARTS & SUB_PARTS and replace them with fancy new ADDC, ADDE, SUBC
...
and SUBE nodes that actually expose what's going on and allow for
significant simplifications in the targets.
llvm-svn: 26255
2006-02-17 05:43:56 +00:00
Nate Begeman
8a77efe4f7
Rework the SelectionDAG-based implementations of SimplifyDemandedBits
...
and ComputeMaskedBits to match the new improved versions in instcombine.
Tested against all of multisource/benchmarks on ppc.
llvm-svn: 26238
2006-02-16 21:11:51 +00:00
Chris Lattner
471627c49d
Lowering of sdiv X, pow2 was broken; this fixes it. This patch was written
...
by Nate, I'm just committing it for him.
llvm-svn: 26230
2006-02-16 08:02:36 +00:00
Jim Laskey
2eea436192
Should not combine ISD::LOCATIONs until we have a scheme to remove them from
...
MachineDebugInfo tables.
llvm-svn: 26216
2006-02-15 19:34:44 +00:00
Chris Lattner
a10e23c19f
Compile this:
...
xori r6, r2, 1
rlwinm r6, r6, 0, 31, 31
cmpwi cr0, r6, 0
bne cr0, LBB1_3 ; endif
to this:
rlwinm r6, r2, 0, 31, 31
cmpwi cr0, r6, 0
beq cr0, LBB1_3 ; endif
llvm-svn: 26047
2006-02-08 02:13:15 +00:00
Nate Begeman
8c9cd461df
Back out previous commit, it isn't safe.
...
llvm-svn: 26006
2006-02-05 08:23:00 +00:00
Nate Begeman
3dc8b89493
fold c1 << (x + c2) into (c1 << c2) << x. fix a warning.
...
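The rewrite just splits the shift amount: shifting c1 left by x + c2 is the same as pre-shifting the constant by c2 and then shifting by x. An illustrative check, with values chosen so no shift exceeds the bit width:

#include <cassert>
#include <cstdint>

int main() {
  // c1 << (x + c2) == (c1 << c2) << x, as long as x + c2 < 32.
  uint32_t c1 = 5, c2 = 3, x = 4;
  assert((c1 << (x + c2)) == ((c1 << c2) << x));
  return 0;
}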
llvm-svn: 26005
2006-02-05 08:07:24 +00:00
Nate Begeman
c89fdf1eb3
Handle urem by shifted powers of 2.
...
llvm-svn: 26001
2006-02-05 07:36:48 +00:00
Nate Begeman
25d178bece
handle combining A / (B << N) into A >>u (log2(B)+N) when B is a power of 2
...
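For unsigned division, dividing by a power of two shifted left by N is a single logical right shift by log2(B) + N. A minimal check of the identity with arbitrary values (not the combiner code):

#include <cassert>
#include <cstdint>

int main() {
  // With B a power of 2, A / (B << N) == A >> (log2(B) + N) for unsigned A.
  uint32_t A = 1000000, B = 8, N = 2;   // log2(B) == 3
  uint32_t log2B = 3;
  assert(A / (B << N) == A >> (log2B + N));
  return 0;
}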
llvm-svn: 26000
2006-02-05 07:20:23 +00:00
Nate Begeman
dc7bba9ffe
Add a framework for eliminating instructions that produce undemanded bits.
...
llvm-svn: 25945
2006-02-03 22:24:05 +00:00
Nate Begeman
22e251abf1
Add common code for reassociating ops in the dag combiner
...
llvm-svn: 25934
2006-02-03 06:46:56 +00:00
Chris Lattner
49beaf40fc
Turn any_extend nodes into zero_extend nodes when it allows us to remove an
...
and instruction. This allows us to compile stuff like this:
bool %X(int %X) {
%Y = add int %X, 14
%Z = setne int %Y, 12345
ret bool %Z
}
to this:
_X:
cmpl $12331, 4(%esp)
setne %al
movzbl %al, %eax
ret
instead of this:
_X:
cmpl $12331, 4(%esp)
setne %al
movzbl %al, %eax
andl $1, %eax
ret
This occurs quite a bit with the X86 backend. For example, 25 times in
lambda, 30 times in 177.mesa, 14 times in galgel, 70 times in fma3d,
25 times in vpr, several hundred times in gcc, ~45 times in crafty,
~60 times in parser, ~140 times in eon, 110 times in perlbmk, 55 on gap,
16 times on bzip2, 14 times on twolf, and 1-2 times in many other SPEC2K
programs.
llvm-svn: 25901
2006-02-02 07:17:31 +00:00
Chris Lattner
49ce35542f
add two dag combines:
...
(C1-X) == C2 --> X == C1-C2
(X+C1) == C2 --> X == C2-C1
This allows us to compile this:
bool %X(int %X) {
%Y = add int %X, 14
%Z = setne int %Y, 12345
ret bool %Z
}
into this:
_X:
cmpl $12331, 4(%esp)
setne %al
movzbl %al, %eax
andl $1, %eax
ret
not this:
_X:
movl $14, %eax
addl 4(%esp), %eax
cmpl $12345, %eax
setne %al
movzbl %al, %eax
andl $1, %eax
ret
Testcase here: Regression/CodeGen/X86/compare-add.ll
nukage of the and coming up next.
llvm-svn: 25898
2006-02-02 06:36:13 +00:00
Nate Begeman
7e7f439f85
Fix some of the stuff in the PPC README file, and clean up legalization
...
of the SELECT_CC, BR_CC, and BRTWOWAY_CC nodes.
llvm-svn: 25875
2006-02-01 07:19:44 +00:00
Chris Lattner
f0b24d2dc0
Move MaskedValueIsZero from the DAGCombiner to the TargetLowering interface, making isMaskedValueZeroForTargetNode simpler, and useable from other parts of the compiler.
...
llvm-svn: 25803
2006-01-30 04:09:27 +00:00
Chris Lattner
3b40e64aa3
pass the address of MaskedValueIsZero into isMaskedValueZeroForTargetNode,
...
to permit recursion
llvm-svn: 25799
2006-01-30 03:49:37 +00:00
Chris Lattner
678da98835
eliminate uses of SelectionDAG::getBR2Way_CC
...
llvm-svn: 25767
2006-01-29 06:00:45 +00:00
Nate Begeman
af397cec0b
Add a missing case to the dag combiner.
...
llvm-svn: 25723
2006-01-28 01:06:30 +00:00
Chris Lattner
de02d7727f
Add explicit #includes of <iostream>
...
llvm-svn: 25515
2006-01-22 23:41:00 +00:00
Nate Begeman
569c439567
Get rid of code in the DAGCombiner that is duplicated in SelectionDAG.cpp
...
Now all constant folding in the code generator is in one place.
llvm-svn: 25426
2006-01-18 22:35:16 +00:00
Chris Lattner
5fee908be5
Fix a backwards conditional that caused an inf loop in some cases. This
...
fixes: test/Regression/CodeGen/Generic/2005-01-18-SetUO-InfLoop.ll
llvm-svn: 25419
2006-01-18 19:13:41 +00:00
Chris Lattner
fcdb420baf
Disable two transformations that contribute to bus errors on SparcV8.
...
llvm-svn: 25339
2006-01-15 18:58:59 +00:00
Chris Lattner
3470b5dee6
Add a simple missing fold to produce this:
...
subfic r3, r2, 33
instead of this:
subfic r2, r2, 32
addi r3, r2, 1
llvm-svn: 25255
2006-01-12 20:22:43 +00:00
Chris Lattner
b1ee616de9
Don't create rotate instructions in unsupported types, because we don't have
...
promote/expand code yet. This fixes the 177.mesa failure on PPC.
llvm-svn: 25250
2006-01-12 18:57:33 +00:00
Nate Begeman
1b8121b227
Add bswap, rotl, and rotr nodes
...
Add dag combiner code to recognize rotl, rotr
Add ppc code to match rotl
Targets should add rotl/rotr patterns if they have them
llvm-svn: 25222
2006-01-11 21:21:00 +00:00
Evan Cheng
85c973cda9
Revert the previous check-in. Leave shl x, 1 alone for the target to deal with.
...
llvm-svn: 25121
2006-01-06 01:56:02 +00:00
Evan Cheng
b03f9b32d2
fold (shl x, 1) -> (add x, x)
...
llvm-svn: 25120
2006-01-06 01:06:31 +00:00
Jim Laskey
762e9ec06c
Added initial support for DEBUG_LABEL allowing debug specific labels to be
...
inserted in the code.
llvm-svn: 25104
2006-01-05 01:25:28 +00:00
Jim Laskey
0da76a676a
Add unique id to debug location for debug label use (work in progress.)
...
llvm-svn: 25096
2006-01-04 15:04:11 +00:00
Jim Laskey
bdba3e2a46
Remove redundant debug locations.
...
llvm-svn: 24995
2005-12-23 20:08:28 +00:00
Chris Lattner
26943b9691
Simplify store(bitconv(x)) to store(x). This allows us to compile this:
...
void bar(double Y, double *X) {
*X = Y;
}
to this:
bar:
save -96, %o6, %o6
st %i1, [%i2+4]
st %i0, [%i2]
restore %g0, %g0, %g0
retl
nop
instead of this:
bar:
save -104, %o6, %o6
st %i1, [%i6+-4]
st %i0, [%i6+-8]
ldd [%i6+-8], %f0
std %f0, [%i2]
restore %g0, %g0, %g0
retl
nop
on sparcv8.
llvm-svn: 24983
2005-12-23 05:48:07 +00:00
Chris Lattner
54560f6887
fold (conv (load x)) -> (load (conv*)x).
...
This allows us to compile this:
void foo(double);
void bar(double *X) { foo(*X); }
To this:
bar:
save -96, %o6, %o6
ld [%i0+4], %o1
ld [%i0], %o0
call foo
nop
restore %g0, %g0, %g0
retl
nop
instead of this:
bar:
save -104, %o6, %o6
ldd [%i0], %f0
std %f0, [%i6+-8]
ld [%i6+-4], %o1
ld [%i6+-8], %o0
call foo
nop
restore %g0, %g0, %g0
retl
nop
on SparcV8.
llvm-svn: 24982
2005-12-23 05:44:41 +00:00
Chris Lattner
efbbedbf4a
Fold bitconv(bitconv(x)) -> x. We now compile this:
...
void foo(double);
void bar(double X) { foo(X); }
to this:
bar:
save -96, %o6, %o6
or %g0, %i0, %o0
or %g0, %i1, %o1
call foo
nop
restore %g0, %g0, %g0
retl
nop
instead of this:
bar:
save -112, %o6, %o6
st %i1, [%i6+-4]
st %i0, [%i6+-8]
ldd [%i6+-8], %f0
std %f0, [%i6+-16]
ld [%i6+-12], %o1
ld [%i6+-16], %o0
call foo
nop
restore %g0, %g0, %g0
retl
nop
on V8.
llvm-svn: 24981
2005-12-23 05:37:50 +00:00
Chris Lattner
a187460552
constant fold bits_convert in getNode and in the dag combiner for fp<->int
...
conversions. This allows V8 to compiles this:
void %test() {
call float %test2( float 1.000000e+00, float 2.000000e+00, double 3.000000e+00, double* null )
ret void
}
into:
test:
save -96, %o6, %o6
sethi 0, %o3
sethi 1049088, %o2
sethi 1048576, %o1
sethi 1040384, %o0
or %g0, %o3, %o4
call test2
nop
restore %g0, %g0, %g0
retl
nop
instead of:
test:
save -112, %o6, %o6
sethi 0, %o4
sethi 1049088, %l0
st %o4, [%i6+-12]
st %l0, [%i6+-16]
ld [%i6+-12], %o3
ld [%i6+-16], %o2
sethi 1048576, %o1
sethi 1040384, %o0
call test2
nop
restore %g0, %g0, %g0
retl
nop
llvm-svn: 24980
2005-12-23 05:30:37 +00:00
Evan Cheng
9cdc16c6d3
* Fix a GlobalAddress lowering bug.
...
* Teach DAG combiner about X86ISD::SETCC by adding a TargetLowering hook.
llvm-svn: 24921
2005-12-21 23:05:39 +00:00
Chris Lattner
83e4407379
Don't create SEXTLOAD/ZEXTLOAD instructions that the target doesn't support
...
if after legalize. This fixes IA64 failures.
llvm-svn: 24725
2005-12-15 19:02:38 +00:00
Chris Lattner
d39c60fcc8
When folding loads into ops, immediately replace uses of the op with the
...
load. This reduces the number of worklist iterations and avoids missing optimizations
depending on folding of things into sext_inreg nodes (which aren't supported by
all targets).
Tested by Regression/CodeGen/X86/extend.ll:test2
llvm-svn: 24712
2005-12-14 19:25:30 +00:00
Chris Lattner
7dac1083da
Fix the (zext (zextload)) case to trigger, similarly for sign extends.
...
Allow (zext (truncate)) to apply after legalize if the target supports
AND (which all do).
This compiles
short %foo() {
%tmp.0 = load ubyte* %X ; <ubyte> [#uses=1]
%tmp.3 = cast ubyte %tmp.0 to short ; <short> [#uses=1]
ret short %tmp.3
}
to:
_foo:
movzbl _X, %eax
ret
instead of:
_foo:
movzbl _X, %eax
movzbl %al, %eax
ret
thanks to Evan for pointing this out.
llvm-svn: 24709
2005-12-14 19:05:06 +00:00
Chris Lattner
f753d1a574
Fix a miscompilation in crafty due to a recent patch
...
llvm-svn: 24706
2005-12-14 07:58:38 +00:00
Evan Cheng
bce7c47306
Fold (zext (load x)) to (zextload x).
...
llvm-svn: 24702
2005-12-14 02:19:23 +00:00
Chris Lattner
57c882edf8
Only transform (sext (truncate x)) -> (sextinreg x) if before legalize or
...
if the target supports the resultant sextinreg
llvm-svn: 24632
2005-12-07 18:02:05 +00:00
Chris Lattner
cbd3d01a43
Teach the dag combiner to turn a truncate/sign_extend pair into a sextinreg
...
when the types match up. This allows the X86 backend to compile:
sbyte %toggle_value(sbyte* %tmp.1) {
%tmp.2 = load sbyte* %tmp.1
ret sbyte %tmp.2
}
to this:
_toggle_value:
mov %EAX, DWORD PTR [%ESP + 4]
movsx %EAX, BYTE PTR [%EAX]
ret
instead of this:
_toggle_value:
mov %EAX, DWORD PTR [%ESP + 4]
movsx %EAX, BYTE PTR [%EAX]
movsx %EAX, %AL
ret
noticed in Shootout/objinst.
-Chris
llvm-svn: 24630
2005-12-07 07:11:03 +00:00
Jeff Cohen
cf1f782a2f
Fix operator precedence bug caught by VC++.
...
llvm-svn: 24318
2005-11-12 00:59:01 +00:00
Chris Lattner
bf4f233214
Switch the allnodes list from a vector of pointers to an ilist of nodes. This eliminates the vector, allows constant time removal of a node from a graph, and makes iteration over the all nodes list stable when adding
...
nodes to the graph.
llvm-svn: 24263
2005-11-09 23:47:37 +00:00
Nate Begeman
ee065281e8
Fix a crash that Andrew noticed, and add a pair of braces to unconfuse
...
XCode's indenting.
llvm-svn: 24159
2005-11-02 18:42:59 +00:00
Chris Lattner
17df608719
Fix a source of undefined behavior when dealing with 64-bit types. This
...
may fix PR652. Thanks to Andrew for tracking down the problem.
llvm-svn: 24145
2005-11-02 01:47:04 +00:00
Chris Lattner
a70878d4fb
Codegen mul by negative power of two with a shift and negate.
...
This implements test/Regression/CodeGen/PowerPC/mul-neg-power-2.ll,
producing:
_foo:
slwi r2, r3, 1
subfic r3, r2, 63
blr
instead of:
_foo:
mulli r2, r3, -2
addi r3, r2, 63
blr
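In wrapping two's-complement arithmetic, multiplying by -2^k is the same as shifting left by k and negating, which is why the mulli can become slwi plus subfic. A small check of the identity on unsigned (wrapping) 32-bit values, not the actual codegen logic:

#include <cassert>
#include <cstdint>

int main() {
  // x * -(2^k) == -(x << k) modulo 2^32, so the multiply folds to shift + negate.
  uint32_t x = 123, k = 1;
  assert(x * (0u - (1u << k)) == 0u - (x << k));
  // The testcase form: 63 + x * -2 is computed as 63 - (x << 1).
  assert(63u + x * (0u - 2u) == 63u - (x << 1));
  return 0;
}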
llvm-svn: 24106
2005-10-30 06:41:49 +00:00
Chris Lattner
4b6d583d7a
Fix DSE to not nuke dead stores unless the redundant store is the same
...
VT as the killing one. This fixes PR491.
llvm-svn: 24034
2005-10-27 07:10:34 +00:00
Chris Lattner
d8c5c066a1
Add a simple xform that is useful for bitfield operations.
...
llvm-svn: 24029
2005-10-27 05:06:38 +00:00
Chris Lattner
3b409a85eb
Clear a bit in this file that was causing a miscompilation of 178.galgel.
...
llvm-svn: 23980
2005-10-25 18:57:30 +00:00
Chris Lattner
9faa5b7a9a
BuildSDIV and BuildUDIV only work for i32/i64, but they don't check that
...
the input is that type; this caused a failure on gs on X86 last night.
Move the hard checks into Build[US]Div since that is where decisions like
this should be made.
llvm-svn: 23881
2005-10-22 18:50:15 +00:00
Chris Lattner
75ea5b10bf
add a case missing from the dag combiner that exposed the failure on
...
2005-10-21-longlonggtu.ll.
llvm-svn: 23875
2005-10-21 21:23:25 +00:00
Nate Begeman
8f62cd32ad
Fix a typo in the dag combiner, so that this can work on i64 targets
...
llvm-svn: 23856
2005-10-21 01:51:45 +00:00
Nate Begeman
4dd383120f
Invert the TargetLowering flag that controls divide by constant expansion.
...
Add a new flag to TargetLowering indicating if the target has really cheap
signed division by powers of two, make ppc use it. This will probably go
away in the future.
Implement some more ISD::SDIV folds in the dag combiner
Remove now dead code in the x86 backend.
llvm-svn: 23853
2005-10-21 00:02:42 +00:00
Nate Begeman
7efe53d90b
Fix a couple bugs in the const div stuff where we'd generate MULHS/MULHU
...
for types that aren't legal, and fail a "divisor is less than zero"
comparison, which would cause us to drop a subtract.
llvm-svn: 23846
2005-10-20 17:45:03 +00:00
Chris Lattner
a6efeb01f9
don't use llabs, which apparently VC++ doesn't have
...
llvm-svn: 23845
2005-10-20 17:01:00 +00:00
Nate Begeman
c6f067a8c4
Move the target constant divide optimization up into the dag combiner, so
...
that the nodes can be folded with other nodes, and we don't have to duplicate
code in every backend. Alpha will probably want this too.
llvm-svn: 23835
2005-10-20 02:15:44 +00:00
Chris Lattner
6c14c35bd7
Fold (select C, load A, load B) -> load (select C, A, B). This happens quite
...
a lot throughout many programs. In particular, specfp triggers it a bunch for
constant FP nodes when you have code like cond ? 1.0 : -1.0.
If the PPC ISel exposed the loads implicit in pic references to external globals,
we would be able to eliminate a load in cases like this as well:
%X = external global int
%Y = external global int
int* %test4(bool %C) {
%G = select bool %C, int* %X, int* %Y
ret int* %G
}
Note that this breaks things that use SrcValue's (see the fixme), but since nothing
uses them yet, this is ok.
Also, simplify some code to use hasOneUse() on an SDOperand instead of hasNUsesOfValue directly.
llvm-svn: 23781
2005-10-18 06:04:22 +00:00
Nate Begeman
418c6e4045
Implement some feedback from Chris re: constant canonicalization
...
llvm-svn: 23777
2005-10-18 00:28:13 +00:00
Nate Begeman
ec48a1bfbd
fold fmul X, +2.0 -> fadd X, X;
...
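Multiplying by 2.0 only adjusts the exponent, so in IEEE arithmetic X * 2.0 and X + X are exactly equal for every finite X. A quick illustrative check:

#include <cassert>

int main() {
  // X * 2.0 and X + X are both exact in IEEE arithmetic, so they compare equal.
  double X = 3.141592653589793;
  assert(X * 2.0 == X + X);
  return 0;
}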
llvm-svn: 23774
2005-10-17 20:40:11 +00:00
Chris Lattner
eeb2bda2fa
add a trivial fold
...
llvm-svn: 23764
2005-10-17 01:07:11 +00:00
Chris Lattner
e540800d5a
Fix this logic.
...
llvm-svn: 23756
2005-10-15 22:35:40 +00:00
Chris Lattner
17cc9edd33
Add a case we were missing that was causing us to fail CodeGen/PowerPC/rlwinm.ll:test3
...
llvm-svn: 23755
2005-10-15 22:18:08 +00:00
Nate Begeman
6e673b24d3
fold sext_in_reg, sext_in_reg where both have the same VT. This was
...
popping up in Fourinarow.
llvm-svn: 23722
2005-10-14 01:29:07 +00:00
Nate Begeman
d59e5a7abb
Relax the checking on zextload generation a bit, since as sabre pointed out
...
you could be AND'ing with the result of a shift that shifts out all the
bits you care about, in addition to a constant.
Also, move over an add/sub_parts fold from legalize to the dag combiner,
where it works for things other than constants. Woot!
llvm-svn: 23720
2005-10-14 01:12:21 +00:00
Chris Lattner
b8282987f4
Fix the trunc(load) case, finally allowing crafty and povray to pass
...
llvm-svn: 23718
2005-10-13 22:10:05 +00:00
Chris Lattner
dbc5ae3109
Fix some bugs in (sext (load x))
...
llvm-svn: 23717
2005-10-13 21:52:31 +00:00
Nate Begeman
8e022b3d89
Fix the remaining DAGCombiner issues pointed out by sabre. This should fix
...
the remainder of the failures introduced by my patch last night.
llvm-svn: 23714
2005-10-13 18:34:58 +00:00
Chris Lattner
a80f1f6e72
Fix a minor bug in the dag combiner that broke pcompress2 and some other
...
tests.
llvm-svn: 23713
2005-10-13 18:16:34 +00:00
Nate Begeman
02b23c6065
Move some Legalize functionality over to the DAGCombiner where it belongs.
...
Kill some dead code.
llvm-svn: 23706
2005-10-13 03:11:28 +00:00
Nate Begeman
70d28c5e32
Fix a potential bug with two combine-to's back to back that chris pointed
...
out, where after the first CombineTo() call, the node the second CombineTo
wishes to replace may no longer exist.
Fix a very real bug with the truncated load optimization on little endian
targets, which do not need a byte offset added to the load.
llvm-svn: 23704
2005-10-12 23:18:53 +00:00
Nate Begeman
8caf81d617
More cool stuff for the dag combiner. We can now finally handle things
...
like turning:
_foo:
fctiwz f0, f1
stfd f0, -8(r1)
lwz r2, -4(r1)
rlwinm r3, r2, 0, 16, 31
blr
into
_foo:
fctiwz f0,f1
stfd f0,-8(r1)
lhz r3,-2(r1)
blr
Also removed an unnecessary constraint from sra -> srl conversion, which
should take care of the only reason we would ever need to handle sra in
MaskedValueIsZero, AFAIK.
llvm-svn: 23703
2005-10-12 20:40:40 +00:00
Chris Lattner
514f058be1
Fix a powerpc crash on CodeGen/Generic/llvm-ct-intrinsics.ll
...
llvm-svn: 23694
2005-10-11 17:56:34 +00:00
Chris Lattner
c38fb8e2a1
Add a canonicalization that got lost, fixing PowerPC/fold-li.ll:SUB
...
llvm-svn: 23693
2005-10-11 06:07:15 +00:00
Chris Lattner
cc6e53e6ee
clean up some corner cases
...
llvm-svn: 23692
2005-10-10 23:00:08 +00:00
Chris Lattner
04c737091f
Implement trivial DSE. If two stores are neighbors and store to the same
...
location, replace them with a new store of the last value. This occurs
in the same neighborhood in 197.parser, speeding it up about 1.5%
llvm-svn: 23691
2005-10-10 22:31:19 +00:00
Chris Lattner
e260ed8628
Add support for CombineTo, allowing the dag combiner to replace nodes with
...
multiple results.
Use this support to implement trivial store->load forwarding, implementing
CodeGen/PowerPC/store-load-fwd.ll. Though this is the most simple case and
can be extended in the future, it is still useful. For example, it speeds
up 197.parser by 6.2% by avoiding an LSU reject in xalloc:
stw r6, lo16(l5_end_of_array)(r2)
addi r2, r5, -4
stwx r5, r4, r2
- lwzx r5, r4, r2
- rlwinm r5, r5, 0, 0, 30
stwx r5, r4, r2
lwz r2, -4(r4)
ori r2, r2, 1
llvm-svn: 23690
2005-10-10 22:04:48 +00:00
Nate Begeman
6828ed9bfd
Teach the DAGCombiner several new tricks, teaching it how to turn
...
sext_inreg into zext_inreg based on the signbit (fires a lot), srem into
urem, etc.
llvm-svn: 23688
2005-10-10 21:26:48 +00:00
Chris Lattner
7730924067
Fix comment
...
llvm-svn: 23686
2005-10-10 16:52:03 +00:00
Chris Lattner
3d1d4a3d12
Add ISD::ADD to MaskedValueIsZero
...
llvm-svn: 23685
2005-10-10 16:51:40 +00:00
Chris Lattner
6a49b7cabb
add a todo for something I noticed
...
llvm-svn: 23679
2005-10-09 22:59:08 +00:00
Chris Lattner
1d3dc00674
(X & Y) & C == 0 if either X&C or Y&C are zero
...
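The reasoning: (X & Y) & C can only have bits set that are also set in X & C (and in Y & C), so if either of those is already zero the whole expression is zero. A tiny check with arbitrary values:

#include <cassert>
#include <cstdint>

int main() {
  // (X & Y) & C is a subset of X & C, so X & C == 0 forces the whole thing to 0.
  uint32_t X = 0x00FF00FF, Y = 0x12345678, C = 0xFF00FF00;
  assert((X & C) == 0);
  assert(((X & Y) & C) == 0);
  return 0;
}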
llvm-svn: 23678
2005-10-09 22:12:36 +00:00
Nate Begeman
2042aa5b92
Lo and behold, the last bits of SelectionDAG.cpp have been moved over.
...
llvm-svn: 23665
2005-10-08 00:29:44 +00:00
Chris Lattner
fb12624a3f
implement CodeGen/PowerPC/div-2.ll:test2-4 by propagating zero bits through
...
C-X's
llvm-svn: 23662
2005-10-07 15:30:32 +00:00
Chris Lattner
5bcd0dd811
Turn sdivs into udivs when we can prove the sign bits are clear. This
...
implements CodeGen/PowerPC/div-2.ll
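When both operands are provably non-negative (their sign bits are clear), signed and unsigned division agree, which is what makes the cheaper udiv legal. A minimal illustration of that equivalence, separate from the combiner itself:

#include <cassert>
#include <cstdint>

int main() {
  // For non-negative operands, sdiv and udiv compute the same quotient.
  int32_t a = 1000, b = 7;
  assert(a / b == static_cast<int32_t>(static_cast<uint32_t>(a) /
                                       static_cast<uint32_t>(b)));
  return 0;
}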
llvm-svn: 23659
2005-10-07 06:10:46 +00:00
Nate Begeman
bd7df030d2
Check in some more DAGCombiner pieces
...
llvm-svn: 23639
2005-10-05 21:43:42 +00:00
Chris Lattner
a49e16fefa
implement visitBR_CC so that PowerPC/inverted-bool-compares.ll passes
...
with the dag combiner. This speeds up espresso by 8%, reaching performance
parity with the dag-combiner-disabled llc.
llvm-svn: 23636
2005-10-05 06:47:48 +00:00
Chris Lattner
06f1d0f73a
Add a new HandleNode class, which is used to handle (haha) cases in the
...
dead node elim and dag combiner passes where the root is potentially updated.
This fixes a fixme in the dag combiner.
llvm-svn: 23634
2005-10-05 06:35:28 +00:00
Chris Lattner
a6895d180e
Implement the code for PowerPC/inverted-bool-compares.ll, even though
...
that testcase still does not pass with the dag combiner. This is because
not all forms of br* are folded yet.
Also, when we combine a node into another one, delete the node immediately
instead of waiting for the node to potentially come up in the future.
llvm-svn: 23632
2005-10-05 06:11:08 +00:00
Chris Lattner
746d50a01a
Fix a crash compiling Olden/tsp
...
llvm-svn: 23630
2005-10-05 04:45:43 +00:00
Chris Lattner
6f3b577ee6
Add FP versions of the binary operators, keeping the int and fp worlds separate.
...
Though I have done extensive testing, it is possible that this will break
things in configs I can't test. Please let me know if this causes a problem
and I'll fix it ASAP.
llvm-svn: 23504
2005-09-28 22:28:18 +00:00
Nate Begeman
c760f80fed
Stub out the rest of the DAG Combiner. Just need to fill in the
...
select_cc bits and then wrap it in a convenience function for use with
regular select.
llvm-svn: 23389
2005-09-19 22:34:01 +00:00
Nate Begeman
24a7eca282
More DAG combining. Still need the branch instructions, and select_cc
...
llvm-svn: 23371
2005-09-16 00:54:12 +00:00
Chris Lattner
bd39c1a4c6
Add a missing #include, patch courtesy of Baptiste Lepilleur.
...
llvm-svn: 23302
2005-09-09 23:53:39 +00:00
Nate Begeman
049b748c76
Last round of 2-node folds from SD.cpp. Will move on to 3 node ops such
...
as setcc and select next.
llvm-svn: 23295
2005-09-09 19:49:52 +00:00
Nate Begeman
85c1cc4523
Move yet more folds over to the dag combiner from sd.cpp
...
llvm-svn: 23278
2005-09-08 20:18:10 +00:00
Nate Begeman
2cc2c9a79c
Another round of dag combiner changes. This fixes some missing XOR folds
...
as well as fixing how we replace old values with new values.
llvm-svn: 23260
2005-09-07 23:25:52 +00:00
Nate Begeman
6791d63e55
Implement a common missing fold, (add (add x, c1), c2) -> (add x, c1+c2).
...
This restores all of stanford to being identical with and without the dag
combiner with the add folding turned off in sd.cpp.
llvm-svn: 23258
2005-09-07 16:09:19 +00:00
Nate Begeman
007c650699
Add an option to the DAG Combiner to enable it for beta runs, and turn on
...
that option for PowerPC's beta.
llvm-svn: 23253
2005-09-07 00:15:36 +00:00
Nate Begeman
d23739d020
Next round of DAGCombiner changes. This version now passes all the tests
...
I have run so far when run before Legalize. It still needs to pick up the
SetCC folds, and nodes that use SetCC.
llvm-svn: 23243
2005-09-06 04:43:02 +00:00
Nate Begeman
7cea6ef16e
Next round of DAG Combiner changes. Just need to support multiple return
...
values, and then we should be able to hook it up.
llvm-svn: 23231
2005-09-02 21:18:40 +00:00
Nate Begeman
2504fe2613
Implement first round of feedback from chris (there's still a couple things
...
left to do).
llvm-svn: 23195
2005-09-01 23:24:04 +00:00
Nate Begeman
e8f78d1aab
Add the rest of the currently implemented visit routines to the switch
...
statement in visit().
llvm-svn: 23185
2005-09-01 00:33:32 +00:00
Nate Begeman
21158fc485
First pass at the DAG Combiner. It isn't used anywhere yet, but it should
...
be mostly functional. It currently has all folds from SelectionDAG.cpp
that do not involve a condition code.
llvm-svn: 23184
2005-09-01 00:19:25 +00:00