teak-llvm

mirror of https://github.com/Gericom/teak-llvm.git synced 2025-06-29 00:08:59 -04:00

Author	SHA1	Message	Date
Stepan Dyatkovskiy	6a638ec521	Fixed DAGCombiner bug (found and localized by James Molloy): The DAGCombiner tries to optimise a BUILD_VECTOR by checking if it consists purely of get_vector_elts from one or two source vectors. If so, it either makes a concat_vectors node or a shufflevector node. However, it doesn't check the element type width of the underlying vector, so if you have this sequence: Node0: v4i16 = ... Node1: i32 = extract_vector_elt Node0 Node2: i32 = extract_vector_elt Node0 Node3: v16i8 = BUILD_VECTOR Node1, Node2, ... It will attempt to: Node0: v4i16 = ... NewNode1: v16i8 = concat_vectors Node0, ... Where this is actually invalid because the element width is completely different. This causes an assertion failure in the DAG legalization stage. Fix: if the output item type of BUILD_VECTOR differs from the input item type, make concat_vectors based on the input element type and then bitcast it to the output vector type. So the case described above will be transformed to: Node0: v4i16 = ... NewNode1: v8i16 = concat_vectors Node0, ... NewNode2: v16i8 = bitcast NewNode1 llvm-svn: 162195	2012-08-20 07:57:06 +00:00
Owen Anderson	a40319b7f1	Add a roundToIntegral method to APFloat, which can be parameterized over various rounding modes. Use this to implement SelectionDAG constant folding of FFLOOR, FCEIL, and FTRUNC. llvm-svn: 161807	2012-08-13 23:32:49 +00:00
Elena Demikhovsky	3cb3b0045c	Added FMA functionality to X86 target. llvm-svn: 161110	2012-08-01 12:06:00 +00:00
Nadav Rotem	9056076cab	Fixed DAGCombine optimizations which generate select_cc for targets that do not support it (X86 does not lower select_cc). PR: 13428 Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160619	2012-07-23 07:59:50 +00:00
Bill Wendling	d163405df8	Remove tabs. llvm-svn: 160475	2012-07-19 00:04:14 +00:00
Evan Cheng	e6a3b03ee0	Back out r160101 and instead implement a dag combine to recover from instcombine transformation. llvm-svn: 160387	2012-07-17 18:54:11 +00:00
Nadav Rotem	a62368c965	Refactor the code that checks that all operands of a node are UNDEFs. Add a micro-optimization to getNode of CONCAT_VECTORS when both operands are undefs. Can't find a testcase for this because VECTOR_SHUFFLE already handles undef operands, but Duncan suggested that we add this. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160229	2012-07-15 08:38:23 +00:00
Nadav Rotem	018921002e	Add a dagcombine optimization to convert concat_vectors of undefs into a single undef. The unoptimized concat_vectors isd prevented the canonicalization of the vector_shuffle node. llvm-svn: 160221	2012-07-14 21:30:27 +00:00
Owen Anderson	b8844d6744	Only apply the SETCC+SITOFP -> SELECTCC optimization when the SETCC returns an MVT::i1, i.e. before type legalization. This is a speculative fix for a problem on Mips reported by Akira Hatanaka. llvm-svn: 160036	2012-07-11 06:38:55 +00:00
Nadav Rotem	d908ddc186	Improve the loading of load-anyext vectors by allowing the codegen to load multiple scalars and insert them into a vector. Next, we shuffle the elements into the correct places, as before. Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the migration of bitcasts happened too late in the SelectionDAG process. llvm-svn: 159991	2012-07-10 13:25:08 +00:00
Owen Anderson	d4b841f8f9	Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional move, since there are only two possible values. Previously, this would become an integer extension operation, followed by a real integer->float conversion. llvm-svn: 159957	2012-07-09 20:31:12 +00:00
Evan Cheng	4c6f917d34	Make sure type is not extended or untyped before creating a constant of the type. No test case. Found by inspection. llvm-svn: 159179	2012-06-26 01:19:33 +00:00
Lang Hames	b8650f106a	Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a boolean flag to an enum: { Fast, Standard, Strict } (default = Standard). This option controls the creation by optimizations of fused FP ops that store intermediate results in higher precision than IEEE allows (E.g. FMAs). The behavior of this option is intended to match the behaviour specified by a soon-to-be-introduced frontend flag: '-ffuse-fp-ops'. Fast mode - allows formation of fused FP ops whenever they're profitable. Standard mode - allow fusion only for 'blessed' FP ops. At present the only blessed op is the fmuladd intrinsic. In the future more blessed ops may be added. Strict mode - allow fusion only if/when it can be proven that the excess precision won't affect the result. Note: This option only controls formation of fused ops by the optimizers. Fused operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic) will always be honored, regardless of the value of this option. Internally TargetOptions::AllowExcessFPPrecision has been replaced by TargetOptions::AllowFPOpFusion. llvm-svn: 158956	2012-06-22 01:09:09 +00:00
Pete Cooper	5b61422d80	Fix potential crash if DAGCombine on stores sees a half type llvm-svn: 158927	2012-06-21 18:00:39 +00:00
Pete Cooper	fe5b84b404	Add users of a MERGE_VALUE node to the worklist to process again when the node is removed. Sorry, no test case. Found it by inspection of the code llvm-svn: 158839	2012-06-20 19:35:43 +00:00
Hal Finkel	8a31138521	Fix DAGCombine to deal with ext-conversion of pre/post_inc loads. The test case for this will come with the PPC indexed preinc loads commit. llvm-svn: 158822	2012-06-20 15:42:48 +00:00
Lang Hames	39fb1d08dc	Add DAG-combines for aggressive FMA formation. This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or FSUB + FMUL. The combines are performed when: (a) Either AllowExcessFPPrecision option (-enable-excess-fp-precision for llc) OR UnsafeFPMath option (-enable-unsafe-fp-math) are set, and (b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of the FADD/FSUB, and (c) The FMUL only has one user (the FADD/FSUB). If your target has fast FMA instructions you can make use of these combines by overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for types supported by your FMA instruction, and adding patterns to match ISD::FMA to your FMA instructions. llvm-svn: 158757	2012-06-19 22:51:23 +00:00
Lang Hames	a33db65bd9	Make comment slightly more helpful. llvm-svn: 158467	2012-06-14 20:37:15 +00:00
Owen Anderson	0eda3e1de6	Switch the canonical FMA term operand order to match both the comment I wrote and the usual LLVM convention. llvm-svn: 157708	2012-05-30 18:54:50 +00:00
Owen Anderson	c7aaf523e1	Teach DAGCombine to canonicalize the position of a constant in the term operands of an FMA node. llvm-svn: 157707	2012-05-30 18:50:39 +00:00
Jim Grosbach	92f6adc8be	DAGCombiner should not change the type of an extract_vector index. When a combine twiddles an extract_vector, care should be taken to preserve the type of the index operand. No luck extracting a reasonable testcase, unfortunately. rdar://11391009 llvm-svn: 156419	2012-05-08 20:56:07 +00:00
Owen Anderson	ab63d84252	Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled. llvm-svn: 156324	2012-05-07 20:51:25 +00:00
Owen Anderson	41b0665b5b	Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs. llvm-svn: 156029	2012-05-02 22:17:40 +00:00
Owen Anderson	b5f167c660	Teach DAG combine that multiplication by 1.0 can always be constant folded. llvm-svn: 156023	2012-05-02 21:32:35 +00:00
Elena Demikhovsky	8d7e56c409	ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2 llvm-svn: 155309	2012-04-22 09:39:03 +00:00
Jakob Stoklund Olesen	beb9469d5c	Register DAGUpdateListeners with SelectionDAG. Instead of passing listener pointers to RAUW, let SelectionDAG itself keep a linked list of interested listeners. This makes it possible to have multiple listeners active at once, like RAUWUpdateListener was already doing. It also makes it possible to register listeners up the call stack without controlling all RAUW calls below. DAGUpdateListener uses an RAII pattern to add itself to the SelectionDAG list of active listeners. llvm-svn: 155248	2012-04-20 22:08:46 +00:00
Hal Finkel	e0cf6397fd	Remove dead SD nodes after the combining pass. Fixes PR12201. llvm-svn: 154786	2012-04-16 03:33:22 +00:00
Nadav Rotem	9d376b6578	Reapply 154397. Original message: Fix a dagcombine optimization which assumes that the vsetcc result type is always of the same size as the compared values. This is true for SSE/AVX/NEON but not for all targets. llvm-svn: 154490	2012-04-11 08:26:11 +00:00
Duncan Sands	4f53074cca	Add a comment noting that the fdiv -> fmul conversion won't generate multiplication by a denormal, and some tests checking that. llvm-svn: 154431	2012-04-10 20:35:27 +00:00
Owen Anderson	3efc8f22bd	Revert r154397, which was causing make check failures on the buildbots. llvm-svn: 154414	2012-04-10 18:02:12 +00:00
Nadav Rotem	065564d85a	Fix a dagcombine optimization which assumes that the vsetcc result type is always of the same size as the compared values. This is true for SSE/AVX/NEON but not for all targets. llvm-svn: 154397	2012-04-10 14:58:31 +00:00
Anton Korobeynikov	4d1220de34	Transform div to mul with reciprocal only when fp imm is legal. This fixes PR12516 and uncovers one weird problem in legalize (worked around) llvm-svn: 154394	2012-04-10 13:22:49 +00:00
Rafael Espindola	1d9672bdce	Don't try to zExt just to check if an integer constant is zero, it might not fit in a i64. llvm-svn: 154364	2012-04-10 00:16:22 +00:00
Rafael Espindola	8f62b3248e	Pattern match a setcc of boolean value with 0 as a truncate. llvm-svn: 154322	2012-04-09 16:06:03 +00:00
Craig Topper	9c3da316ec	Remove unnecessary type check when combining and/or/xor of swizzles. Move some checks to allow better early out. llvm-svn: 154309	2012-04-09 07:19:09 +00:00
Craig Topper	e5893f64e8	Remove unnecessary 'else' on an 'if' that always returns llvm-svn: 154308	2012-04-09 05:59:53 +00:00
Craig Topper	e3ad4834ae	Optimize code slightly. No functionality change. llvm-svn: 154307	2012-04-09 05:55:33 +00:00
Craig Topper	5894fe430a	Replace some explicit checks with asserts for conditions that should never happen. llvm-svn: 154305	2012-04-09 05:16:56 +00:00
Benjamin Kramer	bb6ff08766	Silence sign-compare warning. llvm-svn: 154297	2012-04-08 19:04:45 +00:00
Duncan Sands	2f1dc3814b	Only have codegen turn fdiv by a constant into fmul by the reciprocal when -ffast-math, i.e. don't just always do it if the reciprocal can be formed exactly. There is already an IR level transform that does that, and it does it more carefully. llvm-svn: 154296	2012-04-08 18:08:12 +00:00
Nadav Rotem	71d07ae5cb	1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new shuffle node because it could introduce new shuffle nodes that were not supported efficiently by the target. 2. Add a more restrictive shuffle-of-shuffle optimization for cases where the second shuffle reverses the transformation of the first shuffle. llvm-svn: 154266	2012-04-07 21:19:08 +00:00
Duncan Sands	5f8397a934	Convert floating point division by a constant into multiplication by the reciprocal if converting to the reciprocal is exact. Do it even if inexact if -ffast-math. This substantially speeds up ac.f90 from the polyhedron benchmarks. llvm-svn: 154265	2012-04-07 20:04:00 +00:00
Rafael Espindola	ba0a6cabb8	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
Owen Anderson	98f2c0c384	Add predicates for checking whether targets have free FNEG and FABS operations, and prevent the DAGCombiner from turning them into bitwise operations if they do. llvm-svn: 153901	2012-04-02 22:10:29 +00:00
Nadav Rotem	702f080767	Optimizing swizzles of complex shuffles may generate additional complex shuffles. Do not try to optimize swizzles of shuffles if the source shuffle has more than a single user, except when the source shuffle is also a swizzle. llvm-svn: 153864	2012-04-02 07:11:12 +00:00
Nadav Rotem	b078350872	This commit contains a few changes that had to go in together. 1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) (and also scalar_to_vector). 2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src). Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B)) 3. Optimize swizzles of shuffles: shuff(shuff(x, y), undef) -> shuff(x, y). 4. Fix an X86ISelLowering optimization which was very bitcast-sensitive. Code which was previously compiled to this: movd (%rsi), %xmm0 movdqa .LCPI0_0(%rip), %xmm2 pshufb %xmm2, %xmm0 movd (%rdi), %xmm1 pshufb %xmm2, %xmm1 pxor %xmm0, %xmm1 pshufb .LCPI0_1(%rip), %xmm1 movd %xmm1, (%rdi) ret Now compiles to this: movl (%rsi), %eax xorl %eax, (%rdi) ret llvm-svn: 153848	2012-04-01 19:31:22 +00:00
Chris Lattner	1cc25e8a40	fix what looks like a real logic bug, found by PVS-Studio (part of PR12357) llvm-svn: 153513	2012-03-27 16:27:21 +00:00
Craig Topper	aaeae98936	When combining (vextract shuffle (load ), <1,u,u,u>), 0) -> (load ), add users of the final load to the worklist too. Needed by changes I'm preparing to make to X86 backend. llvm-svn: 153078	2012-03-20 05:28:39 +00:00
Duncan Sands	3fb2fc6edb	Fix DAG combine which creates illegal vector shuffles. Patch by Heikki Kultala. llvm-svn: 153035	2012-03-19 15:35:44 +00:00
Nadav Rotem	6fd1d32c63	When optimizing certain BUILD_VECTOR nodes into other BUILD_VECTOR nodes, add the new node into the work list because there is a potential for further optimizations. llvm-svn: 152784	2012-03-15 08:49:06 +00:00
Bill Wendling	df170db2f6	Add a xform to the DAG combiner. Transform: (fsub x, (fadd x, y)) -> (fneg y) and (fsub x, (fadd y, x)) -> (fneg y) if 'unsafe math' is specified. <rdar://problem/7540295> llvm-svn: 152777	2012-03-15 05:12:00 +00:00
Evan Cheng	d5f8e5766c	Fortify r152675 a bit. Although I'm not able to come up with a test case that would trigger the truncation case. llvm-svn: 152678	2012-03-13 22:16:11 +00:00
Evan Cheng	7bf83096df	DAG combine incorrectly optimizes (i32 vextract (v4i16 load $addr), c) to (i16 load $addr+c*sizeof(i16)) and replaces uses of (i32 vextract) with the i16 load. It should issue an extload instead: (i32 extload $addr+c*sizeof(i16)). rdar://11035895 llvm-svn: 152675	2012-03-13 22:00:52 +00:00
Benjamin Kramer	e1e549d617	Give dagcombiner's worklist some inline capacity. llvm-svn: 152454	2012-03-10 00:23:58 +00:00
Evan Cheng	80893ce5f5	Extend r148086 to check for [r +/- reg] address mode. This fixes queens performance regression (due to increased register pressure from overly aggressive pre-inc formation). llvm-svn: 152162	2012-03-06 23:33:32 +00:00
Owen Anderson	2ee7c4dfc5	Make it possible for a target to mark FSUB as Expand. This requires providing a default expansion (FADD+FNEG), and teaching DAGCombine not to form FSUBs post-legalize if they are not legal. llvm-svn: 152079	2012-03-06 00:29:31 +00:00
James Molloy	862fe49c55	Teach the DAGCombiner that certain loadext nodes followed by ANDs can be converted to zeroexts. llvm-svn: 150957	2012-02-20 12:02:38 +00:00
James Molloy	920ae8c642	Remove extraneous #include and spelling mistake introduced in r150669. llvm-svn: 150670	2012-02-16 09:48:07 +00:00
James Molloy	67b6b11b52	Modify the algorithm when traversing the DAGCombiner's worklist to be O(log N) for all operations. This fixes a horrible worst case with lots of nodes where 99% of the time was being spent in std::remove. llvm-svn: 150669	2012-02-16 09:17:04 +00:00
Nadav Rotem	0c65064dbe	Fix a bug in DAGCombine for the optimization of BUILD_VECTOR. We can't generate a shuffle node from two vectors of different types. llvm-svn: 150383	2012-02-13 12:42:26 +00:00
Nadav Rotem	34ca89afa8	This patch addresses the problem of poor code generation for the zext v8i8 -> v8i32 on AVX machines. The codegen often scalarizes ANY_EXTEND nodes. The DAGCombiner has two optimizations that can mitigate the problem. First, if all of the operands of a BUILD_VECTOR node are extracted from ZEXT/ANYEXT nodes, then it is possible to create a new simplified BUILD_VECTOR which uses UNDEFS/ZERO values to eliminate the scalar ZEXT/ANYEXT nodes. Second, another dag combine optimization lowers BUILD_VECTOR into a shuffle vector instruction. In the case of zext v8i8->v8i32 on AVX, a value in an XMM register is to be shuffled into a wide YMM register. This patch modifies the second optimization and allows the creation of shuffle vectors even when the newly generated vector and the original vector from which we extract the values are of different types. llvm-svn: 150340	2012-02-12 15:05:31 +00:00
Nadav Rotem	4f4546b73a	Add additional documentation to the extract-and-trunc dagcombine optimization. llvm-svn: 149823	2012-02-05 11:39:23 +00:00
Nadav Rotem	5399f4d6bf	The type-legalizer often scalarizes code. One of the common patterns is extract-and-truncate. In this patch we optimize this pattern and convert the sequence into extract op of a narrow type. This allows the BUILD_VECTOR dag optimizations to construct efficient shuffle operations in many cases. llvm-svn: 149692	2012-02-03 13:18:25 +00:00
Nadav Rotem	fb6ddee0e9	Transform: (EXTRACT_VECTOR_ELT( VECTOR_SHUFFLE )) -> EXTRACT_VECTOR_ELT. llvm-svn: 148337	2012-01-17 21:44:01 +00:00
Craig Topper	02cb0fb136	Teach DAG combiner to turn a BUILD_VECTOR of UNDEFs into an UNDEF of vector type. llvm-svn: 148297	2012-01-17 09:09:48 +00:00
Benjamin Kramer	5a377e28da	DAGCombiner: Deduplicate code. llvm-svn: 148217	2012-01-15 11:50:43 +00:00
Evan Cheng	fa8326334b	DAGCombine's logic for forming pre- and post-indexed loads / stores was being overly conservative. It was concerned about cases where it would prohibit folding simple [r, c] addressing modes. e.g. ldr r0, [r2] ldr r1, [r2, #4] => ldr r0, [r2], #4 ldr r1, [r2] Change the logic to look for such cases which allows it to form indexed memory ops more aggressively. rdar://10674430 llvm-svn: 148086	2012-01-13 01:37:24 +00:00
Chandler Carruth	55b2cdee26	Teach the X86 instruction selection to do some heroic transforms to detect a pattern which can be implemented with a small 'shl' embedded in the addressing mode scale. This happens in real code as follows: unsigned x = my_accelerator_table[input >> 11]; Here we have some lookup table that we look into using the high bits of 'input'. Each entity in the table is 4-bytes, which means this implicitly gets turned into (once lowered out of a GEP): *(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2)); The shift right followed by a shift left is canonicalized to a smaller shift right and masking off the low bits. That hides the shift right which x86 has an addressing mode designed to support. We now detect masks of this form, and produce the longer shift right followed by the proper addressing mode. In addition to saving a (rather large) instruction, this also reduces stalls in Intel chips on benchmarks I've measured. In order for all of this to work, one part of the DAG needs to be canonicalized still further than it currently is. This involves removing pointless 'trunc' nodes between a zextload and a zext. Without that, we end up generating spurious masks and hiding the pattern. llvm-svn: 147936	2012-01-11 08:41:08 +00:00
Craig Topper	0515cd41e4	Replace some uses of hasNUsesOfValue(0, X) with !hasAnyUseOfValue(X) llvm-svn: 147733	2012-01-07 18:31:09 +00:00
Craig Topper	43a1bd6ac7	Add some DAG combines for SUBC/SUBE. If nothing uses the carry/borrow out of subc, turn it into a sub. Turn (subc x, x) into 0 with no borrow. Turn (subc x, 0) into x with no borrow. Turn (subc -1, x) into (xor x, -1) with no borrow. Turn sube with no borrow in into subc. llvm-svn: 147728	2012-01-07 09:06:39 +00:00
Chandler Carruth	e041a30bb9	Prevent a DAGCombine from firing where there are two uses of a combined-away node and the result of the combine isn't substantially smaller than the input, it's just canonicalized. This is the first part of a significant (7%) performance gain for Snappy's hot decompression loop. llvm-svn: 147604	2012-01-05 11:05:55 +00:00
Craig Topper	279c77b677	Implement VECTOR_SHUFFLE canonicalizations during DAG combine. llvm-svn: 147525	2012-01-04 08:07:43 +00:00
Eli Friedman	e96286cdf2	Make sure DAGCombiner doesn't introduce multiple loads from the same memory location. PR10747, part 2. llvm-svn: 147283	2011-12-26 22:49:32 +00:00
Chandler Carruth	637cc6a8aa	Initial CodeGen support for CTTZ/CTLZ where a zero input produces an undefined result. This adds new ISD nodes for the new semantics, selecting them when the LLVM intrinsic indicates that the undef behavior is desired. The new nodes expand trivially to the old nodes, so targets don't actually need to do anything to support these new nodes besides indicating that they should be expanded. I've done this for all the operand types that I could figure out for all the targets. Owners of various targets, please review and let me know if any of these are incorrect. Note that the expand behavior is conservatively correct, and exactly matches LLVM's current behavior with these operations. Ideally this patch will not change behavior in any way. For example the regtest suite finds the exact same instruction sequences coming out of the code generator. That's why there are no new tests here -- all of this is being exercised by the existing test suite. Thanks to Duncan Sands for reviewing the various bits of this patch and helping me get the wrinkles ironed out with expanding for each target. Also thanks to Chris for clarifying through all the discussions that this is indeed the approach he was looking for. That said, there are likely still rough spots. Further review much appreciated. llvm-svn: 146466	2011-12-13 01:56:10 +00:00
Eli Friedman	f9081a8afe	Zap unnecessary isIntDivCheap() check. PR11485. No testcase because this doesn't affect any in-tree target. llvm-svn: 146015	2011-12-07 03:55:52 +00:00
Eli Friedman	0e58cba286	Fix an optimization involving EXTRACT_SUBVECTOR in DAGCombine so it behaves correctly. PR11494. llvm-svn: 145996	2011-12-07 00:11:56 +00:00
Nick Lewycky	50f02cb21b	Move global variables in TargetMachine into new TargetOptions class. As an API change, now you need a TargetOptions object to create a TargetMachine. Clang patch to follow. One small functionality change in PTX. PTX had commented out the machine verifier parts in their copy of printAndVerify. That now calls the version in LLVMTargetMachine. Users of PTX who need verification disabled should rely on not passing the command-line flag to enable it. llvm-svn: 145714	2011-12-02 22:16:29 +00:00
Evan Cheng	4a5b2040e2	Revert r145273 and fix in SelectionDAG::InferPtrAlignment() instead. Conservatively returns zero when the GV does not specify an alignment nor is it initialized. Previously it returned ABI alignment for type of the GV. However, if the type is a "packed" type, then the under-specified alignment is attached to the load / store instructions. In that case, the alignment of the type cannot be trusted. rdar://10464621 llvm-svn: 145300	2011-11-28 22:37:34 +00:00
Evan Cheng	a4b6404cf0	DAG combine should not increase alignment of loads / stores with alignment less than ABI alignment. These are loads / stores from / to "packed" data structures. Their alignments are intentionally under-specified. rdar://10301431 llvm-svn: 145273	2011-11-28 20:42:56 +00:00
Eli Friedman	ff1eaa7578	Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393. llvm-svn: 144863	2011-11-16 23:50:22 +00:00
Jay Foad	70679df664	Remove some unnecessary includes of PseudoSourceValue.h. llvm-svn: 144634	2011-11-15 07:50:46 +00:00
Eli Friedman	9d448e4a42	Don't try to form pre/post-indexed loads/stores until after LegalizeDAG runs. Fixes PR11029. llvm-svn: 144438	2011-11-12 00:35:34 +00:00
Lang Hames	b85fcd07df	Lower mem-ops to unaligned i32/i16 load/stores on ARM where supported. Add support for trimming constants to GetDemandedBits. This fixes some funky constant generation that occurs when stores are expanded for targets that don't support unaligned stores natively. llvm-svn: 144102	2011-11-08 18:56:23 +00:00
Pete Cooper	82cd9e81fc	Added invariant field to the DAG.getLoad method and changed all calls. When this field is true it means that the load is from constant (run-time or compile-time) and so can be hoisted from loops or moved around other memory accesses llvm-svn: 144100	2011-11-08 18:42:53 +00:00
Richard Osborne	561fac4d4e	Don't introduce custom nodes after legalization in TargetLowering::BuildSDIV() and TargetLowering::BuildUDIV(). Fixes PR11283 llvm-svn: 143964	2011-11-07 17:09:05 +00:00
Nadav Rotem	f310361a7d	Cleanup. Document. Make sure that this build_vector optimization only runs before the op legalizer and that the used type is legal. llvm-svn: 143358	2011-10-31 20:08:25 +00:00
Benjamin Kramer	a4eba41b7a	Silence compiler warning. llvm-svn: 143308	2011-10-30 08:39:55 +00:00
Nadav Rotem	bf6568b5d6	Add a new DAGCombine optimization for BUILD_VECTOR. If all of the inputs are zero/any_extended, create a new simple BV which can be further optimized by other BV optimizations. llvm-svn: 143297	2011-10-29 21:23:04 +00:00
Eli Friedman	e9e356ad6b	Don't crash on 128-bit sdiv by constant. Found by inspection. llvm-svn: 143095	2011-10-27 02:06:39 +00:00
Eli Friedman	3e9ef907e0	Remove a couple redundant checks. llvm-svn: 142959	2011-10-25 20:34:22 +00:00
Bob Wilson	681561901d	Fix a DAG combiner assertion failure when constant folding BUILD_VECTORS. svn r139159 caused SelectionDAG::getConstant() to promote BUILD_VECTOR operands with illegal types, even before type legalization. For this testcase, that led to one BUILD_VECTOR with i16 operands and another with promoted i32 operands, which triggered the assertion. llvm-svn: 142370	2011-10-18 17:34:47 +00:00
Dan Gohman	e83e1b2d2c	Fix SimplifySelectCC to add newly created nodes to the DAGCombiner worklist, as it may be possible to perform further optimization on them. llvm-svn: 140349	2011-09-22 23:01:29 +00:00
Bruno Cardoso Lopes	6cb23f6e7f	Add a DAGCombine for subvector extracts to remove useless chains of subvector inserts and extracts. Initial patch by Rackover, Zvi with some tweaks done by me. llvm-svn: 140204	2011-09-20 23:19:33 +00:00
Eli Friedman	b7910b79f5	Make the SelectionDAG verify that all the operands of BUILD_VECTOR have the same type. Teach DAGCombiner::visitINSERT_VECTOR_ELT not to make invalid BUILD_VECTORs. Fixes PR10897. llvm-svn: 139407	2011-09-09 21:04:06 +00:00
Duncan Sands	f2641e1bc1	Add codegen support for vector select (in the IR this means a select with a vector condition); such selects become VSELECT codegen nodes. This patch also removes VSETCC codegen nodes, unifying them with SETCC nodes (codegen was actually often using SETCC for vector SETCC already). This ensures that various DAG combiner optimizations kick in for vector comparisons. Passes dragonegg bootstrap with no testsuite regressions (nightly testsuite as well as "make check-all"). Patch mostly by Nadav Rotem. llvm-svn: 139159	2011-09-06 19:07:46 +00:00
Benjamin Kramer	68ed46ce9a	Roll back the rest of r126557. It's a hack that will break in some obscure cases. llvm-svn: 138130	2011-08-19 22:39:31 +00:00
Nadav Rotem	62da15a330	Revert r137310 because it does not optimize any code on ToT llvm-svn: 137466	2011-08-12 17:15:04 +00:00
Nadav Rotem	61140e1028	[AVX] When joining two XMM registers into a YMM register, make sure that the lower XMM register gets in first. This will allow the SUBREG pattern to eliminate the first vector insertion. llvm-svn: 137310	2011-08-11 16:49:36 +00:00
Eli Friedman	cbd3ba91b7	Make sure this DAGCombine actually returns an UNDEF of the correct type; PR10476. llvm-svn: 135993	2011-07-25 22:25:42 +00:00
Chris Lattner	229907cd11	land David Blaikie's patch to de-constify Type, with a few tweaks. llvm-svn: 135375	2011-07-18 04:54:35 +00:00
Eric Christopher	d6300d2956	Add a dag combine pattern for folding C2-(A+C1) -> (C2-C1)-A Fixes rdar://9761830 llvm-svn: 135123	2011-07-14 01:12:15 +00:00
Lang Hames	5a00499e87	Add functions 'hasPredecessor' and 'hasPredecessorHelper' to SDNode. The hasPredecessorHelper function allows predecessors to be cached to speed up repeated invocations. This fixes PR10186. X.isPredecessorOf(Y) now just calls Y.hasPredecessor(X) Y.hasPredecessor(X) calls Y.hasPredecessorHelper(X, Visited, Worklist) with empty Visited and Worklist sets (i.e. no caching over invocations). Y.hasPredecessorHelper(X, Visited, Worklist) caches search state in Visited and Worklist to speed up repeated calls. The Visited set is searched for X before going to the worklist to further search the DAG if necessary. llvm-svn: 134592	2011-07-07 04:31:51 +00:00
Benjamin Kramer	8665f8d916	Revert a part of r126557 which could create unschedulable DAGs. llvm-svn: 134067	2011-06-29 13:47:25 +00:00
Jay Foad	83be361b8a	Replace the existing forms of ConstantArray::get() with a single form that takes an ArrayRef. llvm-svn: 133615	2011-06-22 09:24:39 +00:00
Evan Cheng	4c0bd9629d	Teach dag combine to match halfword byteswap patterns. 1. (((x) & 0xFF00) >> 8) | (((x) & 0x00FF) << 8) => (bswap x) >> 16 2. ((x&0xff)<<8)|((x&0xff00)>>8)|((x&0xff000000)>>8)|((x&0x00ff0000)<<8) => (rotl (bswap x) 16) This allows us to eliminate most of the def : Pat patterns for ARM rev16 revsh instructions. It catches many more cases for ARM and x86. rdar://9609108 llvm-svn: 133503	2011-06-21 06:01:08 +00:00
Nick Lewycky	6d677cfdd8	Add a DAGCombine for (ext (binop (load x), cst)). llvm-svn: 133124	2011-06-16 01:15:49 +00:00
Nadav Rotem	d2d9bdb2b0	Enable the simplification of truncating-store after fixing the usage of GetDemandedBits (which must operate on the vector element type). Fix a usage of getZeroExtendInReg which must also be done on scalar types. llvm-svn: 133052	2011-06-15 11:19:12 +00:00
Chad Rosier	818e116723	When pattern matching during instruction selection make sure shl x,1 is not converted to add x,x if x is an undef. add undef, undef does not guarantee that the resulting low order bit is zero. Fixes <rdar://problem/9453156> and <rdar://problem/9487392>. llvm-svn: 133022	2011-06-14 22:29:10 +00:00
Nadav Rotem	571ae19af7	Disable trunc-store simplification on vectors. llvm-svn: 132984	2011-06-14 07:18:26 +00:00
Eli Friedman	1877ac9937	Change this DAGCombine to build AND of SHR instead of SHR of AND; this matches the ordering we prefer in instcombine. Part of rdar://9562809. The potential DAGCombine which enforces this more generally messes up some other very fragile patterns, so I'm leaving that alone, at least for now. llvm-svn: 132809	2011-06-09 22:14:44 +00:00
Devang Patel	efec7715ec	Revert 121907 (it causes llc crash) and apply original patch from PR9817. llvm-svn: 131926	2011-05-23 22:04:42 +00:00
Benjamin Kramer	2fd48f2730	Implement mulo x, 2 -> addo x, x in DAGCombiner. llvm-svn: 131800	2011-05-21 18:31:55 +00:00
Dan Gohman	4298df6d86	Misc. code cleanups. llvm-svn: 131495	2011-05-17 22:20:36 +00:00
Nadav Rotem	8a7beb80f0	Fixes a bug in the DAGCombiner. LoadSDNodes have two values (data, chain). If there is a store after the load node, then there is a chain, which means that there is another user. Thus, asking hasOneUser would fail. Instead we ask hasNUsesOfValue on the 'data' value. llvm-svn: 131183	2011-05-11 14:40:50 +00:00
Duncan Sands	6be291a2cd	Indent properly, no functionality change. llvm-svn: 131082	2011-05-09 08:03:33 +00:00
Eli Friedman	55b0acd624	PR9055: extend the fix to PR4050 (r70179) to apply to zext and anyext. Returning a new node makes the code try to replace the old node, which in the included testcase is killed by CSE. llvm-svn: 129650	2011-04-16 23:25:34 +00:00
Owen Anderson	a519284fec	Fix another instance of the DAG combiner not using the correct type for the RHS of a shift. llvm-svn: 129522	2011-04-14 17:30:49 +00:00
Chris Lattner	41c80e89f3	have dag combine zap "store undef", which can be formed during call lowering with undef arguments. llvm-svn: 129185	2011-04-09 02:32:02 +00:00
Cameron Zwarich	8c7bbc09e2	Add a RemoveFromWorklist method to DCI. This is needed to do some complicated transformations in target-specific DAG combines without causing DAGCombiner to delete the same node twice. If you know of a better way to avoid this (see my next patch for an example), please let me know. llvm-svn: 128758	2011-04-02 02:40:26 +00:00
Evan Cheng	adb9c03e41	Avoid replacing the value of a directly stored load with the stored value if the load is indexed. rdar://9117613. llvm-svn: 127440	2011-03-11 00:48:56 +00:00
Stuart Hastings	6b4007dec6	Can't introduce floating-point immediate constants after legalization. Radar 9056407. llvm-svn: 126864	2011-03-02 19:36:30 +00:00
Nadav Rotem	b00913028f	Fix typos in the comments. llvm-svn: 126565	2011-02-27 07:40:43 +00:00
Benjamin Kramer	26691d9660	Add some DAGCombines for (adde 0, 0, glue), which are useful to optimize legalized code for large integer arithmetic. 1. Inform users of ADDEs with two 0 operands that it never sets carry 2. Fold other ADDs or ADDCs into the ADDE if possible It would be neat if we could do the same thing for SETCC+ADD eventually, but we can't do that in target independent code. llvm-svn: 126557	2011-02-26 22:48:07 +00:00
Owen Anderson	b2c80da4ae	Allow targets to specify the type of the RHS of a shift parameterized on the type of the LHS. llvm-svn: 126518	2011-02-25 21:41:48 +00:00
Nadav Rotem	502f1b943f	Enable support for vector sext and trunc: Limit the folding of any_ext and sext into the load operation to scalars. Limit the active-bits trunc optimization to scalars. Document vector trunc and vector sext in LangRef. Similar to commit 126080 (for enabling zext). llvm-svn: 126424	2011-02-24 21:01:34 +00:00
Nadav Rotem	25f2ac948b	Fix 9267; Add vector zext support. The DAGCombiner folds the zext into complex load instructions. This patch prevents this optimization on vectors since none of the supported targets knows how to perform load+vector_zext in one instruction. llvm-svn: 126080	2011-02-20 12:37:50 +00:00
Stuart Hastings	81c4306005	Swap VT and DebugLoc operands of getExtLoad() for consistency with other getNode() methods. Radar 9002173. llvm-svn: 125665	2011-02-16 16:23:55 +00:00
Eric Christopher	e5ca1e0506	Refactor zero folding slightly. Clean up todo. llvm-svn: 125651	2011-02-16 04:50:12 +00:00
Eric Christopher	ef72141a75	The change for PR9190 wasn't quite right. We need to avoid making the transformation if we can't legally create a build vector of the correct type. Check that we can make the transformation first, and add a TODO to refactor this code with similar cases. Fixes: PR9223 and rdar://9000350 llvm-svn: 125631	2011-02-16 01:10:03 +00:00
Chris Lattner	e95d195014	Revisit my fix for PR9028: the issue is that DAGCombine was generating i8 shift amounts for things like i1024 types. Add an assert in getNode to prevent this from occurring in the future, fix the buggy transformation, revert my previous patch, and document this gotcha in ISDOpcodes.h llvm-svn: 125465	2011-02-13 19:09:16 +00:00
Nadav Rotem	db2f54811d	A fix for 9165. The DAGCombiner created illegal BUILD_VECTOR operations. The patch added a check that either illegal operations are allowed or that the created operation is legal. llvm-svn: 125435	2011-02-12 14:40:33 +00:00
Nadav Rotem	a49a02a04f	SimplifySelectOps can only handle selects with a scalar condition. Add a check that the condition is not a vector. llvm-svn: 125398	2011-02-11 19:57:47 +00:00
Nadav Rotem	18f6a33457	Fix #9190. The bug happens when the DAGCombiner attempts to optimize one of the patterns of the SUB opcode. It tries to create a zero of type v2i64. This type is legal on 32bit machines, but the initializer of this vector (i64) is target dependent. Currently, the initializer attempts to create an i64 zero constant, which fails. Added a flag to tell the DAGCombiner to create a legal zero, if we require that the pass would generate legal types. llvm-svn: 125391	2011-02-11 19:20:37 +00:00
Evan Cheng	d42641c6b5	Given a pair of floating point load and store, if there are no other uses of the load, then it may be legal to transform the load and store to integer load and store of the same width. This is done if the target specified the transformation as profitable. e.g. On arm, this can transform: vldr.32 s0, [] vstr.32 s0, [] to ldr r12, [] str r12, [] rdar://8944252 llvm-svn: 124708	2011-02-02 01:06:55 +00:00
Richard Osborne	272e084bca	Fix bug where ReduceLoadWidth was creating illegal ZEXTLOAD instructions. llvm-svn: 124587	2011-01-31 17:41:44 +00:00
Benjamin Kramer	946e1522b6	Teach DAGCombine to fold (sra (trunc (sr x, c1)), c2) -> (trunc (sra x, c1+c2)) when c1 equals the amount of bits that are truncated off. This happens all the time when a smul is promoted to a larger type. On x86-64 we now compile "int test(int x) { return x/10; }" into movslq %edi, %rax imulq $1717986919, %rax, %rax movq %rax, %rcx shrq $63, %rcx sarq $34, %rax <- used to be "shrq $32, %rax; sarl $2, %eax" addl %ecx, %eax This fires 96 times in gcc.c on x86-64. llvm-svn: 124559	2011-01-30 16:38:43 +00:00
Benjamin Kramer	65bb14d368	Add the missing sub identity "A-(A-B) -> B" to DAGCombine. This happens e.g. for code like "X - X%10" where we lower the modulo operation to a series of multiplies and shifts that are then subtracted from X, leading to this missed optimization. llvm-svn: 124532	2011-01-29 12:34:05 +00:00
Anton Korobeynikov	2f93128109	Rename TargetFrameInfo into TargetFrameLowering. Also, put a couple of FIXMEs and fixes here and there. llvm-svn: 123170	2011-01-10 12:39:04 +00:00
Benjamin Kramer	1f4dfbbcb0	DAGCombine add (sext i1), X into sub X, (zext i1) if sext from i1 is illegal. The latter usually compiles into smaller code. example code: unsigned foo(unsigned x, unsigned y) { if (x != 0) y--; return y; } before: _foo: ## @foo cmpl $1, 4(%esp) ## encoding: [0x83,0x7c,0x24,0x04,0x01] sbbl %eax, %eax ## encoding: [0x19,0xc0] notl %eax ## encoding: [0xf7,0xd0] addl 8(%esp), %eax ## encoding: [0x03,0x44,0x24,0x08] ret ## encoding: [0xc3] after: _foo: ## @foo cmpl $1, 4(%esp) ## encoding: [0x83,0x7c,0x24,0x04,0x01] movl 8(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x08] adcl $-1, %eax ## encoding: [0x83,0xd0,0xff] ret ## encoding: [0xc3] llvm-svn: 122455	2010-12-22 23:17:45 +00:00
Chris Lattner	cafc1e60bb	Fix a bug in ReduceLoadWidth that wasn't handling extending loads properly. We miscompiled the testcase into: _test: ## @test movl $128, (%rdi) movzbl 1(%rdi), %eax ret Now we get a proper: _test: ## @test movl $128, (%rdi) movsbl (%rdi), %eax movzbl %ah, %eax ret This fixes PR8757. llvm-svn: 122392	2010-12-22 08:02:57 +00:00
Chris Lattner	9a499e96eb	more cleanups, move a check for "roundedness" earlier to reject unhandled cases faster and simplify code. llvm-svn: 122391	2010-12-22 08:01:44 +00:00
Chris Lattner	222374d886	reduce indentation and improve comments, no functionality change. llvm-svn: 122389	2010-12-22 07:36:50 +00:00
Dale Johannesen	a94e36bbee	Reapply 122353-122355 with fixes. 122354 was wrong; the shift type was needed one place, the shift count type another. The transform in 122355 had the same problem. llvm-svn: 122366	2010-12-21 21:55:50 +00:00
Dale Johannesen	87c47499c6	Revert 122353-122355 for the moment, they broke stuff. llvm-svn: 122360	2010-12-21 21:22:27 +00:00
Dale Johannesen	caf42aa6a4	Add a new transform to DAGCombiner. llvm-svn: 122355	2010-12-21 20:10:51 +00:00
Dale Johannesen	fa5dc82fda	Get the type of a shift from the shift, not from its shift count operand. These should be the same but apparently are not always, and this is cleaner anyway. This improves the code in an existing test. llvm-svn: 122354	2010-12-21 20:06:19 +00:00
Dale Johannesen	d64931df77	Shift by the word size is invalid IR; don't create it. llvm-svn: 122353	2010-12-21 20:00:06 +00:00
Chris Lattner	2a7ff99979	fix some typos llvm-svn: 122349	2010-12-21 18:05:22 +00:00
Chris Lattner	3e5fbd74ed	rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310	2010-12-21 02:38:05 +00:00
Dale Johannesen	0a291a36f2	Cosmetic changes. llvm-svn: 122259	2010-12-20 20:10:50 +00:00

999 Commits