teak-llvm

mirror of https://github.com/Gericom/teak-llvm.git synced 2025-06-20 03:55:48 -04:00

Author	SHA1	Message	Date
Roman Lebedev	1711be78f7	[NFC][DAGCombine] Clarify comment for 'A - (A & (B - 1))' fold	2020-01-03 17:55:42 +03:00
Roman Lebedev	8dab0a4a7d	[DAGCombine][X86][AArch64] 'A - (A & (B - 1))' -> 'A & (0 - B)' fold (PR44448) While we do manage to fold integer-typed IR in middle-end, we can't do that for the main motivational case of pointers. There is @llvm.ptrmask() intrinsic which may or may not be helpful, but i'm not sure it is fully considered canonical yet, not everything is fully aware of it likely. https://rise4fun.com/Alive/ZVdp Name: ptr - (ptr & (alignment-1)) -> ptr & (0 - alignment) %mask = add i64 %alignment, -1 %bias = and i64 %ptr, %mask %r = sub i64 %ptr, %bias => %highbitmask = sub i64 0, %alignment %r = and i64 %ptr, %highbitmask See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499	2020-01-03 13:58:36 +03:00
Matt Arsenault	4d7201e7b9	DAG: Stop trying to fold FP -(x-y) -> y-x in getNode with nsz This was increasing the number of instructions when fsub was legalized on AMDGPU with no signed zeros enabled. This fold should be guarded by hasOneUse, and I don't think getNode should be doing that. The same fold is already done as a regular combine through isNegatibleForFree. This does require duplicating, even though isNegatibleForFree does this combine already (and properly checks hasOneUse) to avoid one PPC regression. In the regression, the outer fneg has nsz but the fsub operand does not. isNegatibleForFree only sees the operand, and doesn't see it's used from a nsz context. A nsz parameter needs to be added and threaded through isNegatibleForFree to avoid this.	2019-12-31 22:49:51 -05:00
Sanjay Patel	8cefc37be5	[DAGCombine] visitEXTRACT_SUBVECTOR - 'little to big' extract_subvector(bitcast()) support This moves the X86 specific transform from rL364407 into DAGCombiner to generically handle 'little to big' cases (for example: extract_subvector(v2i64 bitcast(v16i8))). This allows us to remove both the x86 implementation and the aarch64 bitcast(extract_subvector(bitcast())) combine. Earlier patches that dealt with regressions initially exposed by this patch: rG5e5e99c041e4 rG0b38af89e2c0 Patch by: @RKSimon (Simon Pilgrim) Differential Revision: https://reviews.llvm.org/D63815	2019-12-23 10:11:45 -05:00
Carl Ritson	2791667d2e	[DAGCombiner] Check term use before applying aggressive FSUB optimisations Summary: Without this check unnecessary FMA instructions are generated when the FSUB terms are reused. This also has the side-effect that the same value is computed to different levels of precision, which can create undesirable effects if the results are used together in subsequent computation. Reviewers: arsenm, nhaehnle, foad, tpr, dstuttard, spatel Reviewed By: arsenm Subscribers: jvesely, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71656	2019-12-23 09:37:58 +09:00
Amaury Séchet	ff6567cc77	[DAGCombiner] Add node back in the worklist in topological order in CommitTargetLoweringOpt Summary: Right now, DAGCombiner processes the nodes in an implementation-defined order. This tends to be fragile, as optimisations may or may not kick in depending on the traversal order. This is part of a larger effort to get the DAGCombiner to process its nodes in topological order. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70921	2019-12-17 18:26:16 +01:00
Alex Richardson	11448eeb72	[NFC] Use SelectionDAG::getMemBasePlusOffset() instead of getNode(ISD::ADD) Summary: To find potential opportunities to use getMemBasePlusOffset() I looked at all ISD::ADD uses found with the regex getNode\(ISD::ADD,.+,.+Ptr in lib/CodeGen/SelectionDAG. If this patch is accepted I will convert the files in the individual backends too. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR) to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the number of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel Reviewed By: spatel Subscribers: merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71207	2019-12-13 21:40:03 +00:00
Sanjay Patel	2f0c7fd2db	[DAGCombiner] fold shift-trunc-shift to shift-mask-trunc (2nd try) The initial attempt (rG89633320) botched the logic by reversing the source/dest types. Added x86 tests for additional coverage. The vector tests show a potential improvement (fold vector load instead of broadcasting), but that's a known/existing problem. This fold is done in IR by instcombine, and we have a special form of it already here in DAGCombiner, but we want the more general transform too: https://rise4fun.com/Alive/3jZm Name: general Pre: (C1 + zext(C2) < 64) %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %a = and i64 %s2, zext((1 << (16 - C2)) - 1) %r = trunc %a to i16 Name: special Pre: C1 == 48 %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %r = trunc %s2 to i16 ...because D58017 exposes a regression without this fold.	2019-12-13 14:03:54 -05:00
Alex Richardson	be15dfa88f	[NFC] Use EVT instead of bool for getSetCCInverse() Summary: The use of a boolean isInteger flag (generally initialized using VT.isInteger()) caused errors in our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). In our backend, pointers use a separate ValueType (iFATPTR) and therefore .isInteger() returns false. This meant that getSetCCInverse() was using the floating-point variant and generated incorrect code for us: `(void *)0x12033091e < (void *)0xffffffffffffffff` would return false. Committing this change will significantly reduce our merge conflicts for each upstream merge. Reviewers: spatel, bogner Reviewed By: bogner Subscribers: wuzish, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70917	2019-12-13 12:22:03 +00:00
Sanjay Patel	9432937190	Revert "[DAGCombiner] fold shift-trunc-shift to shift-mask-trunc" This reverts commit `8963332c33`. There was a logic bug typo in this code, but it wasn't visible in the asm for the tests.	2019-12-12 16:24:40 -05:00
Sanjay Patel	8963332c33	[DAGCombiner] fold shift-trunc-shift to shift-mask-trunc This fold is done in IR by instcombine, and we have a special form of it already here in DAGCombiner, but we want the more general transform too: https://rise4fun.com/Alive/3jZm Name: general Pre: (C1 + zext(C2) < 64) %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %a = and i64 %s2, zext((1 << (16 - C2)) - 1) %r = trunc %a to i16 Name: special Pre: C1 == 48 %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %r = trunc %s2 to i16 ...because D58017 exposes a regression without this fold.	2019-12-12 15:44:13 -05:00
Sanjay Patel	b39009bf1d	[DAGCombiner] improve readability This is not quite NFC because I changed the SDLoc to use the more standard 'N' (the starting node for the fold). This transform is a special-case of a more general fold that we do in IR, but it seems like the general fold is needed here too to avoid a potential regression seen in D58017. https://rise4fun.com/Alive/3jZm	2019-12-12 13:16:50 -05:00
Amaury Séchet	c594d14d40	[DAGCombine] Factor oplist operations. NFC	2019-12-02 19:12:03 +01:00
Amaury Séchet	d8d5106225	[SelectionDAG] Reduce assumptions made about levels. NFC	2019-12-02 17:43:13 +01:00
Amaury Séchet	ca818f4550	[DAGCombiner] Peek through vector concats when trying to combine shuffles. Summary: This combine showed up as needed when exploring the regression when processing the DAG in topological order. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68195	2019-11-28 23:57:29 +01:00
David Green	b5315ae8ff	[Codegen][ARM] Add addressing modes from masked loads and stores MVE has a basic symmetry between its normal load/store operations and the masked variants. This means that masked loads and stores can use pre-inc and post-inc addressing modes, just like the standard loads and stores already do. To enable that, this patch adds all the relevant infrastructure for treating masked loads/stores addressing modes in the same way as normal loads/stores. This involves: - Adding an AddressingMode to MaskedLoadStoreSDNode, along with an extra Offset operand that is added after the PtrBase. - Extending the IndexedModeActions from 8bits to 16bits to store the legality of masked operations as well as normal ones. This array is fairly small, so doubling the size still won't make it very large. Offset masked loads can then be controlled with setIndexedMaskedLoadAction, similar to standard loads. - The same methods that combine to indexed loads, such as CombineToPostIndexedLoadStore, are adjusted to handle masked loads in the same way. - The ARM backend is then adjusted to make use of these indexed masked loads/stores. - The X86 backend is adjusted to hopefully be no functional changes. Differential Revision: https://reviews.llvm.org/D70176	2019-11-26 16:21:01 +00:00
Sanjay Patel	214683f3b2	[DAGCombiner] avoid crash on out-of-bounds insert index (PR44139) We already have this simplification at node-creation-time, but the test from: https://bugs.llvm.org/show_bug.cgi?id=44139 ...shows that we can combine our way to an assert/crash too.	2019-11-25 16:24:06 -05:00
Clement Courbet	cb15ba84fe	Reland "[DAGCombiner] Allow zextended load combines." Check that the generated type is simple.	2019-11-22 14:47:18 +01:00
Clement Courbet	88e205525c	Revert "[DAGCombiner] Allow zextended load combines." Breaks some bots.	2019-11-22 09:01:08 +01:00
Clement Courbet	036790f988	[DAGCombiner] Allow zextended load combines. Summary: or(zext(load8(base)), zext(load8(base+1)) -> zext(load16 base) Reviewers: apilipenko, RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70487	2019-11-22 08:40:19 +01:00
Hiroshi Yamauchi	52e377497d	[PGO][PGSO] DAG.shouldOptForSize part. Summary: (Split off of D67120) SelectionDAG::shouldOptForSize changes for profile guided size optimization. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70095	2019-11-21 14:16:00 -08:00
Clement Courbet	252567377c	[DAGCombine][NFC] Use ArrayRef and correctly size SmallVectors. In preparation for D70487.	2019-11-21 08:53:37 +01:00
David Zarzycki	257acbf6ae	[SelectionDAG] Combine U{ADD,SUB}O diamonds into {ADD,SUB}CARRY Summary: Convert (uaddo (uaddo x, y), carryIn) into addcarry x, y, carryIn if-and-only-if the carry flags of the first two uaddo are merged via OR or XOR. Work remaining: match ADD, etc. Reviewers: craig.topper, RKSimon, spatel, niravd, jonpa, uweigand, deadalnix, nikic, lebedev.ri, dmgreen, chfast Reviewed By: lebedev.ri Subscribers: chfast, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70079	2019-11-20 16:25:42 +02:00
Matt Arsenault	7fe9435dc8	Work on cleaning up denormal mode handling Cleanup handling of the denormal-fp-math attribute. Consolidate places checking the allowed names in one place. This is in preparation for introducing FP type specific variants of the denormal-fp-mode attribute. AMDGPU will switch to using this in place of the current hacky use of subtarget features for the denormal mode. Introduce a new header for dealing with FP modes. The constrained intrinsic classes define related enums that should also be moved into this header for uses in other contexts. The verifier could use a check to make sure the denorm-fp-mode attribute is sane, but there currently isn't one. Currently, DAGCombiner incorrectly assumes non-IEEE behavior by default in the one current user. Clang must be taught to start emitting this attribute by default to avoid regressions when this is switched to assume ieee behavior if the attribute isn't present.	2019-11-19 22:01:14 +05:30
Matt Arsenault	b696b9dba7	DAG: Add function context to isFMAFasterThanFMulAndFAdd AMDGPU needs to know the FP mode for the function to answer this correctly when this is removed from the subtarget. AArch64 had to make this more complicated by using this from an IR hook, so add an IR typed overload.	2019-11-19 19:25:26 +05:30
Graham Hunter	3f08ad611a	[SVE][CodeGen] Scalable vector MVT size queries * Implements scalable size queries for MVTs, split out from D53137. * Contains a fix for FindMemType to avoid using scalable vector type to contain non-scalable types. * Explicit casts for several places where implicit integer sign changes or promotion from 32 to 64 bits caused problems. * CodeGenDAGPatterns will treat scalable and non-scalable vector types as different. Reviewers: greened, cameron.mcinally, sdesmalen, rovka Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D66871	2019-11-18 12:30:59 +00:00
Paweł Bylica	1c247dd028	[DAGCombiner] Drop redundant DAG method param. NFC	2019-11-14 14:02:53 +01:00
Paweł Bylica	9b89bda517	[DAGCombiner] Use TLI field already available. NFC	2019-11-14 14:02:52 +01:00
joanlluch	d384ad6b63	[TargetLowering][DAGCombine][MSP430] Shift Amount Threshold in DAGCombine (4) Summary: Replaces ``` unsigned getShiftAmountThreshold(EVT VT) ``` by ``` bool shouldAvoidTransformToShift(EVT VT, unsigned amount) ``` thus giving more flexibility for targets to decide whether particular shift amounts must be considered expensive or not. Updates the MSP430 target with a custom implementation. This continues D69116, D69120, D69326 and updates them, so all of them must be committed before this. Existing tests apply, a few more have been added. Reviewers: asl, spatel Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70042	2019-11-13 09:23:08 +01:00
Philip Reames	db036ee0a4	[X86/Atomics] Correct a few transforms for new atomic lowering This is a partial fix for the issues described in commit message of `027aa27` (the revert of G24609). Unfortunately, I can't provide test coverage for it on its own as the only (known) wrong example is still wrong, but due to a separate issue. These fixes are cases where when performing unrelated DAG combines, we were dropping the atomicity flags entirely.	2019-11-05 13:20:08 -08:00
Thomas Preud'homme	646896a442	Fix PR40644: miscompile indexed FP constant store Summary: Functions replaceStoreOfFPConstant() and OptimizeFloatStore() both replace store of float by a store of an integer unconditionally. However this generates wrong code when the store that is replaced is an indexed or truncating store. This commit solves this issue by adding an early return in these functions when the store being considered is not a normal store. Bug was only observed on out of tree targets, hence the lack of testcase in this commit. Reviewers: efriedma Subscribers: hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68420	2019-11-05 11:07:52 +00:00
Sanjay Patel	113181e9bd	[DAGCombine][MSP430] use shift amount threshold in DAGCombine (2/2) Continuation of: D69116 Contributes to a fix for PR43559: https://bugs.llvm.org/show_bug.cgi?id=43559 See also D69099 and D69116 Use the TLI hook in DAGCombine.cpp to guard against creating shift nodes that are not optimal for a target. Patch by: @joanlluch (Joan LLuch) Differential Revision: https://reviews.llvm.org/D69120	2019-11-04 13:41:41 -05:00
Matt Arsenault	6221767055	DAG: Add DAG argument to isFPExtFoldable For AMDGPU this is dependent on the FP mode, which should eventually not be a property of the subtarget.	2019-10-31 22:32:45 -07:00
Matt Arsenault	1725f28841	DAG: Add new control for ISD::FMAD formation For AMDGPU this depends on whether denormals are enabled in the default FP mode for the function. Currently this is treated as a subtarget feature, so FMAD is selectively legal based on that. I want to move this out of the subtarget features so this can be controlled with a denormal mode attribute. Additionally, this will allow folding based on a future ftz fast math flag.	2019-10-31 07:51:38 -07:00
Sanjay Patel	1ebd4a2e3a	[DAGCombiner] widen any_ext of popcount based on target support This enhances D69127 (rGe6c145e0548e3b3de6eab27e44e1504387cf6b53) to handle the looser "any_extend" cast in addition to zext. This is a prerequisite step for canonicalizing in the other direction (narrow the popcount) in IR - PR43688: https://bugs.llvm.org/show_bug.cgi?id=43688	2019-10-28 10:07:12 -04:00
Kerry McLaughlin	da720a38b9	[AArch64][SVE] Implement masked load intrinsics Summary: Adds support for codegen of masked loads, with non-extending, zero-extending and sign-extending variants. Reviewers: huntergr, rovka, greened, dmgreen Reviewed By: dmgreen Subscribers: dmgreen, samparker, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68877	2019-10-28 10:06:14 +00:00
Sanjay Patel	85a2146c15	[SDAG] fold insert_vector_elt with undef index Similar to: rG4c47617627fb This makes the DAG behavior consistent with IR's insertelement. https://bugs.llvm.org/show_bug.cgi?id=42689 I've tried to maintain test intent for AArch64 and WebAssembly by replacing undef index operands with something else.	2019-10-27 15:28:43 -04:00
Sanjay Patel	e6c145e054	[DAGCombiner] widen zext of popcount based on target support zext (ctpop X) --> ctpop (zext X) This is a prerequisite step for canonicalizing in the other direction (narrow the popcount) in IR - PR43688: https://bugs.llvm.org/show_bug.cgi?id=43688 I'm not sure if any other targets are affected, but I found a missing fold for PPC, so added tests based on that. The reason we widen all the way to 64-bit in these tests is because the initial DAG looks something like this: t5: i8 = ctpop t4 t6: i32 = zero_extend t5 <-- created based on IR, but unused node? t7: i64 = zero_extend t5 Differential Revision: https://reviews.llvm.org/D69127	2019-10-25 14:10:51 -04:00
Simon Pilgrim	a18818207a	Fix cppcheck shadow variable warning. NFCI.	2019-10-24 22:14:36 +01:00
Graham Hunter	84da2596f9	[AArch64][SVE] Add SPLAT_VECTOR ISD Node Adds a new ISD node to replicate a scalar value across all elements of a vector. This is needed for scalable vectors, since BUILD_VECTOR cannot be used. Fixes up default type legalization for scalable vectors after the new MVT type ranges were introduced. At present I only use this node for scalable vectors. A DAGCombine has been added to transform a BUILD_VECTOR into a SPLAT_VECTOR if all elements are the same, but only if the default operation action of Expand has been overridden by the target. I've only added result promotion legalization for scalable vector i8/i16/i32/i64 types in AArch64 for now. Reviewers: t.p.northover, javed.absar, greened, cameron.mcinally, jmolloy Reviewed By: jmolloy Differential Revision: https://reviews.llvm.org/D47775 llvm-svn: 375222	2019-10-18 11:48:35 +00:00
Sam Parker	39af8a3a3b	[DAGCombine][ARM] Enable extending masked loads Add generic DAG combine for extending masked loads. Allow us to generate sext/zext masked loads which can access v4i8, v8i8 and v4i16 memory to produce v4i32, v8i16 and v4i32 respectively. Differential Revision: https://reviews.llvm.org/D68337 llvm-svn: 375085	2019-10-17 07:55:55 +00:00
David Zarzycki	59390efef2	[X86] Make memcmp() use PTEST if possible and also enable AVX1 llvm-svn: 374922	2019-10-15 17:40:12 +00:00
Sanjay Patel	d545c9056e	[DAGCombiner] fold select-of-constants based on sign-bit test Examples: i32 X > -1 ? C1 : -1 --> (X >>s 31) | C1 i8 X < 0 ? C1 : 0 --> (X >>s 7) & C1 This is a small generalization of a fold requested in PR43650: https://bugs.llvm.org/show_bug.cgi?id=43650 The sign-bit of the condition operand can be used as a mask for the true operand: https://rise4fun.com/Alive/paT Note that we already handle some of the patterns (isNegative + scalar) because there's an over-specialized, yet over-reaching fold for that in foldSelectCCToShiftAnd(). It doesn't use any TLI hooks, so I can't easily rip out that code even though we're duplicating part of it here. This fold is guarded by TLI.convertSelectOfConstantsToMath(), so it should not cause problems for targets that prefer select over shift. Also worth noting: I thought we could generalize this further to include the case where the true operand of the select is not constant, but Alive says that may allow poison to pass through where it does not in the original select form of the code. Differential Revision: https://reviews.llvm.org/D68949 llvm-svn: 374902	2019-10-15 15:23:57 +00:00
Sanjay Patel	3b581ac80f	[DAGCombiner] fold vselect-of-constants to shift The diffs suggest that we are missing some more basic analysis/transforms, but this keeps the vector path in sync with the scalar (rL374397). This is again a preliminary step for introducing the reverse transform in IR as proposed in D63382. llvm-svn: 374555	2019-10-11 14:17:56 +00:00
Sanjay Patel	7b904ce724	[DAGCombiner] fold select-of-constants to shift This reverses the scalar canonicalization proposed in D63382. Pre: isPowerOf2(C1) %r = select i1 %cond, i32 C1, i32 0 => %z = zext i1 %cond to i32 %r = shl i32 %z, log2(C1) https://rise4fun.com/Alive/Z50 x86 already tries to fold this pattern, but it isn't done uniformly, so we still see a diff. AArch64 probably should enable the TLI hook to benefit too, but that's a follow-on. llvm-svn: 374397	2019-10-10 17:52:02 +00:00
Sanjay Patel	7f0e7c0b1c	[DAGCombiner] reduce code duplication; NFC llvm-svn: 374370	2019-10-10 15:38:29 +00:00
Amaury Sechet	aaf0507896	[DAGCombine] Match more patterns for half word bswap Summary: It ensures that the bswap is generated even when a part of the subtree already matches a bswap transform. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68250 llvm-svn: 374340	2019-10-10 13:20:10 +00:00
Philip Reames	931120846e	Conservatively add volatility and atomic checks in a few places As background, starting in D66309, I'm working on supporting unordered atomics analogous to volatile flags on normal LoadSDNode/StoreSDNodes for X86. As part of that, I spent some time going through usages of LoadSDNode and StoreSDNode looking for cases where we might have missed a volatility check or need an atomic check. I couldn't find any cases that clearly miscompile - i.e. no test cases - but a couple of pieces of code look suspicious though I can't figure out how to exercise them. This patch adds defensive checks and asserts in the places my manual audit found. If anyone has any ideas on how to either a) disprove any of the checks, or b) hit the bug they might be fixing, I welcome suggestions. Differential Revision: https://reviews.llvm.org/D68419 llvm-svn: 374261	2019-10-09 23:43:33 +00:00
Simon Pilgrim	b4ba3cbda0	[X86][AVX] Access a scalar float/double as a free extract from a broadcast load (PR43217) If a fp scalar is loaded and then used as both a scalar and a vector broadcast, perform the load as a broadcast and then extract the scalar for 'free' from the 0th element. This involved switching the order of the X86ISD::BROADCAST combines so we only convert to X86ISD::BROADCAST_LOAD once all other canonicalizations have been attempted. Adds a DAGCombinerInfo::recursivelyDeleteUnusedNodes wrapper. Fixes PR43217 Differential Revision: https://reviews.llvm.org/D68544 llvm-svn: 373871	2019-10-06 21:11:45 +00:00
Sanjay Patel	f643fabb52	Revert [DAGCombine] Match more patterns for half word bswap This reverts r373850 (git commit `25ba49824d`) This patch appears to cause multiple codegen regression test failures - http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/10680 llvm-svn: 373853	2019-10-06 15:27:34 +00:00
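
Three of the integer folds in this log (8dab0a4a7d, d545c9056e, 7b904ce724) are small enough to sanity-check by brute force. The following is a minimal sketch, not LLVM code: it assumes an 8-bit type modelled with Python integers, and every helper name in it is hypothetical. It only confirms that each "before" and "after" expression quoted in the commit messages agrees for all 8-bit inputs.

```python
BITS = 8                      # assumed width; the commits use i8/i32/i64
MASK = (1 << BITS) - 1


def to_signed(x):
    """Interpret the low BITS bits of x as a two's-complement value."""
    x &= MASK
    return x - (1 << BITS) if x & (1 << (BITS - 1)) else x


def ashr(x, n):
    """Arithmetic shift right (the '>>s' in the commit message)."""
    return (to_signed(x) >> n) & MASK


def check_align_down():
    """8dab0a4a7d: A - (A & (B - 1))  ==  A & (0 - B)."""
    for a in range(1 << BITS):
        for b in range(1 << BITS):
            before = (a - (a & ((b - 1) & MASK))) & MASK
            after = a & ((0 - b) & MASK)
            if before != after:
                return False
    return True


def check_signbit_select():
    """d545c9056e: X > -1 ? C1 : -1 == (X >>s 7) | C1
                   X <  0 ? C1 :  0 == (X >>s 7) & C1"""
    for x in range(1 << BITS):
        sign_splat = ashr(x, BITS - 1)   # all ones if X is negative, else 0
        for c1 in range(1 << BITS):
            sel_or = c1 if to_signed(x) > -1 else MASK
            sel_and = c1 if to_signed(x) < 0 else 0
            if sel_or != (sign_splat | c1) or sel_and != (sign_splat & c1):
                return False
    return True


def check_select_to_shift():
    """7b904ce724: cond ? C1 : 0 == zext(cond) << log2(C1), C1 a power of 2."""
    for cond in (0, 1):
        for log2_c1 in range(BITS):
            c1 = 1 << log2_c1
            select = c1 if cond else 0
            if select != ((cond << log2_c1) & MASK):
                return False
    return True
```

Each checker returns `True` when the two forms match on every input. This is exhaustive only for the assumed 8-bit width; the Alive links in the commit messages are the actual proofs.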
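
The shift-trunc-shift fold from 2f0c7fd2db (first attempted in 8963332c33) can be modelled the same way. This is a hedged sketch that transcribes the "general" and "special" Alive patterns from the commit message into Python integer arithmetic. The function names and the sampling strategy are assumptions made for illustration, and sampled inputs are a spot check rather than a proof.

```python
import random

M64 = (1 << 64) - 1
M16 = (1 << 16) - 1


def shift_trunc_shift(x, c1, c2):
    """Original form: lshr i64, trunc to i16, lshr i16."""
    s = (x & M64) >> c1
    t = s & M16               # trunc i64 -> i16
    return t >> c2


def shift_mask_trunc(x, c1, c2):
    """General folded form: one wide shift, a mask, then trunc."""
    s2 = (x & M64) >> (c1 + c2)
    a = s2 & ((1 << (16 - c2)) - 1)
    return a & M16            # trunc i64 -> i16


def shift_trunc(x, c1, c2):
    """Special folded form (valid when C1 == 48): no mask is needed."""
    return ((x & M64) >> (c1 + c2)) & M16


def folds_agree(samples=200, seed=0):
    """Compare the forms over sampled x and every legal (C1, C2) pair."""
    rng = random.Random(seed)
    xs = [0, 1, M64, 1 << 63] + [rng.getrandbits(64) for _ in range(samples)]
    for x in xs:
        for c2 in range(16):                  # shift amount of the i16 lshr
            for c1 in range(64 - c2):         # precondition: C1 + C2 < 64
                if shift_trunc_shift(x, c1, c2) != shift_mask_trunc(x, c1, c2):
                    return False
            if shift_trunc_shift(x, 48, c2) != shift_trunc(x, 48, c2):
                return False
    return True
```

The mask in the general form keeps only the `16 - C2` bits that survive the narrow shift. With `C1 == 48` the wide shift already leaves at most 16 bits, which is why the special form can drop the mask.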
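
Commit 257acbf6ae combines a chain of two `uaddo` nodes into a single `addcarry` when the two carry flags are merged with OR or XOR. The sketch below illustrates why either merge works, assuming an 8-bit type and hypothetical helper names (`uaddo` and `addcarry` here are plain Python functions, not the SelectionDAG nodes): the two intermediate carries can never both be set, so OR and XOR give the same result.

```python
BITS = 8                      # assumed width for exhaustive checking
MASK = (1 << BITS) - 1


def uaddo(x, y):
    """Unsigned add with overflow: returns (wrapped sum, carry-out)."""
    total = x + y
    return total & MASK, int(total > MASK)


def addcarry(x, y, carry_in):
    """Add with carry-in: returns (wrapped sum, carry-out)."""
    total = x + y + carry_in
    return total & MASK, int(total > MASK)


def check_uaddo_diamond():
    """(uaddo (uaddo x, y), carryIn) with merged carries == addcarry."""
    for x in range(1 << BITS):
        for y in range(1 << BITS):
            for carry_in in (0, 1):
                s1, c1 = uaddo(x, y)
                s2, c2 = uaddo(s1, carry_in)
                want = addcarry(x, y, carry_in)
                if (s2, c1 | c2) != want or (s2, c1 ^ c2) != want:
                    return False
                if c1 and c2:             # both carries set at once
                    return False
    return True
```

If the first add overflows, its wrapped sum is at most `2^BITS - 2`, so adding a carry-in of 1 cannot overflow again. That is the reason the OR and XOR merges are interchangeable in this model.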


2727 Commits