teak-llvm

mirror of https://github.com/Gericom/teak-llvm.git synced 2025-06-22 21:15:40 -04:00

Author	SHA1	Message	Date
Nirav Dave	ac6081cb67	Make library calls sensitive to regparm module flag (Fixes PR3997). Reviewers: mkuper, rnk Subscribers: mehdi_amini, jyknight, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D27050 llvm-svn: 298179	2017-03-18 00:44:07 +00:00
Nirav Dave	6de2c77944	Capitalize ArgListEntry fields. NFC. llvm-svn: 298178	2017-03-18 00:43:57 +00:00
Craig Topper	6ffc044b2f	[DAGCombine] Use APInt::operator\|(uint64_t) instead of creating a temporary APInt and calling APInt::Or. NFC This is more efficient by itself. But this is prep for a future patch that may remove APInt::Or while making operator\| support rvalue references similar to add/sub. llvm-svn: 296981	2017-03-05 01:08:16 +00:00
Simon Pilgrim	ffe2535cf6	Use SelectionDAG::getBuildVector helper function where possible. NFCI. llvm-svn: 293532	2017-01-30 18:53:45 +00:00
Matt Arsenault	0b382a7cb8	DAG: Avoid OOB when legalizing vector indexing If a vector index is out of bounds, the result is supposed to be undefined but is not undefined behavior. Change the legalization for indexing the vector on the stack so that an out of bounds index does not create an out of bounds memory access. llvm-svn: 291604	2017-01-10 22:02:30 +00:00
Nicolai Haehnle	f08dc90253	[SelectionDAG] Add expansion and promotion of [US]MUL_LOHI Summary: Most targets set the action for these nodes to Expand even though there isn't actually any code for them in ExpandNode. Instead, targets simply relied on the fact that no code generates these nodes as long as the nodes aren't legal or custom. However, generating these nodes can be useful e.g. for divide-by-constant in wider integer types. Expand of [US]MUL_LOHI will use MULH[US] when legal or custom, and a sequence of half-width multiplications otherwise. Promote uses a wider multiply. This patch intends to not change the generated code, but indirect effects are possible since expansions/promotions that were previously done in DAGCombine may now be done in LegalizeDAG. See D24822 for a change that actually uses the new expansion. Reviewers: spatel, bkramer, venkatra, efriedma, hfinkel, ast, nadav, tstellarAMD Subscribers: arsenm, jyknight, nemanjai, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D24956 llvm-svn: 289050	2016-12-08 14:08:14 +00:00
Eli Friedman	0a76e3241f	[CodeGen] Fix result type for SMULO/UMULO legalization On some platforms (like MSP430) the second element of the result structure for SMULO/UMULO may have a shorter type than the one returned by SetCC. We need to truncate it to the right type, or else some incorrect code may be generated later on. This fixes issue https://github.com/rust-lang/rust/issues/37829 Patch by Vadzim Dambrouski! Differential Revision: https://reviews.llvm.org/D27154 llvm-svn: 288857	2016-12-06 22:49:36 +00:00
Simon Pilgrim	7a6b6d5656	Fix spelling mistakes in SelectionDAG comments. NFC. Identified by Pedro Giffuni in PR27636. llvm-svn: 287487	2016-11-20 13:14:57 +00:00
Joerg Sonnenberger	1a7eec68a9	Introduce TLI predicative for base-relative Jump Tables. For 64bit ABIs it is common practice to use relative Jump Tables with potentially different relocation bases. As the logic for the jump table itself doesn't depend on the relocation base, make it easier for targets to use the generic logic. Start by dropping the now redundant MIPS logic. Differential Revision: https://reviews.llvm.org/D26578 llvm-svn: 286951	2016-11-15 12:39:46 +00:00
Joerg Sonnenberger	5e31b3ad93	Simplify. llvm-svn: 285802	2016-11-02 12:45:28 +00:00
Tom Stellard	284cf32ab4	LegalizeDAG: Support promoting [US]DIV and [US]REM operations Summary: AMDGPU will need this one i16 is added as a legal type. This is tested by: test/CodeGen/AMDGPU/sdiv.ll test/CodeGen/AMDGPU/sdivrem24.ll test/CodeGen/AMDGPU/udiv.ll test/CodeGen/AMDGPU/udivrem24.ll Reviewers: bogner, efriedma Subscribers: efriedma, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D25699 llvm-svn: 285199	2016-10-26 14:52:25 +00:00
Sanjay Patel	3a3aaf67e0	[DAG] optimize negation of bool Use mask and negate for legalization of i1 source type with SIGN_EXTEND_INREG. With the mask, this should be no worse than 2 shifts. The mask can be eliminated in some cases, so that should be better than 2 shifts. This change exposed some missing folds related to negation: https://reviews.llvm.org/rL284239 https://reviews.llvm.org/rL284395 There may be others, so please let me know if you see any regressions. Differential Revision: https://reviews.llvm.org/D25485 llvm-svn: 284611	2016-10-19 16:58:59 +00:00
Konstantin Zhuravlyov	8ea0246e93	[MachineMemOperand] Move synchronization scope and atomic orderings from SDNode to MachineMemOperand, and remove redundant getAtomic* member functions from SelectionDAG. Differential Revision: https://reviews.llvm.org/D24577 llvm-svn: 284312	2016-10-15 22:01:18 +00:00
Tom Stellard	f80c1875a3	LegalizeDAG: Implement PROMOTE for ISD::BITREVERSE Summary: This operation is promoted the same way was ISD::BSWAP. This will prevent a regression in test/Target/AMDGOU/bitreverse.ll when i16 support is implemented. Reviewers: bogner, hfinkel Subscribers: hfinkel, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D25202 llvm-svn: 284163	2016-10-13 21:03:49 +00:00
Albert Gutowski	795d7d6381	Create llvm.addressofreturnaddress intrinsic Summary: We need a new LLVM intrinsic to implement MS _AddressOfReturnAddress builtin on 64-bit Windows. Reviewers: majnemer, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25293 llvm-svn: 284061	2016-10-12 22:13:19 +00:00
whitequark	7c4fe0e9a3	[SelectionDAG] Fix calling convention in expansion of ?MULO. The SMULO/UMULO DAG nodes, when not directly supported by the target, expand to a multiplication twice as wide. In case that the resulting type is not legal, an __mul?i3 intrinsic is used. Since the type is not legal, the legalizer cannot directly call the intrinsic with the wide arguments; instead, it "pre-lowers" them by splitting them in halves. The "pre-lowering" code in essence made assumptions about the calling convention, specifically that i(N*2) values will be split into two iN values and passed in consecutive registers in little-endian order. This, naturally, breaks on a big-endian system, such as our OR1K out-of-tree backend. Thanks to James Miller <james@aatch.net> for help in debugging. Differential Revision: https://reviews.llvm.org/D25223 llvm-svn: 283203	2016-10-04 09:07:49 +00:00
Sanjay Patel	284582b6d4	getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits(), round 2 ; NFCI llvm-svn: 281498	2016-09-14 16:54:10 +00:00
Sanjay Patel	1ed771f5d7	getVectorElementType().getSizeInBits() -> getScalarSizeInBits() ; NFCI llvm-svn: 281495	2016-09-14 16:37:15 +00:00
Sanjay Patel	b1f0a0f4a8	getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCI llvm-svn: 281493	2016-09-14 16:05:51 +00:00
Sanjay Patel	bd6fca1419	getScalarType().getSizeInBits() -> getScalarSizeInBits() ; NFCI llvm-svn: 281489	2016-09-14 15:21:00 +00:00
Elena Demikhovsky	dcc86d5bb6	Shift-left (ISD::SHL) operation crashes on "DAG Legalization" phase. https://llvm.org/bugs/show_bug.cgi?id=29058. While node legalization we tried to legalize its operands. If an operand node is replaced during legalization the user node may be destroyed. Differential Revision: https://reviews.llvm.org/D24244 llvm-svn: 280862	2016-09-07 20:54:33 +00:00
Hal Finkel	5081ac27c7	Add ISD::EH_DWARF_CFA, simplify @llvm.eh.dwarf.cfa on Mips, fix on PowerPC LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible __builtin_dwarf_cfa() builtin. As pointed out in PR26761, this is currently broken on PowerPC (and likely on ARM as well). Currently, @llvm.eh.dwarf.cfa is lowered using: ADD(FRAMEADDR, FRAME_TO_ARGS_OFFSET) where FRAME_TO_ARGS_OFFSET defaults to the constant zero. On x86, FRAME_TO_ARGS_OFFSET is lowered to 2*SlotSize. This setup, however, does not work for PowerPC. Because of the way that the stack layout works, the canonical frame address is not exactly (FRAMEADDR + FRAME_TO_ARGS_OFFSET) on PowerPC (there is a lower save-area offset as well), so it is not just a matter of implementing FRAME_TO_ARGS_OFFSET for PowerPC (unless we redefine its semantics -- We can do that, since it is currently used only for @llvm.eh.dwarf.cfa lowering, but the better to directly lower the CFA construct itself (since it can be easily represented as a fixed-offset FrameIndex)). Mips currently does this, but by using a custom lowering for ADD that specifically recognizes the (FRAMEADDR, FRAME_TO_ARGS_OFFSET) pattern. This change introduces a ISD::EH_DWARF_CFA node, which by default expands using the existing logic, but can be directly lowered by the target. Mips is updated to use this method (which simplifies its implementation, and I suspect makes it more robust), and updates PowerPC to do the same. Fixes PR26761. Differential Revision: https://reviews.llvm.org/D24038 llvm-svn: 280350	2016-09-01 10:28:47 +00:00
Justin Bogner	cd1d5aaf2e	Replace a few more "fall through" comments with LLVM_FALLTHROUGH Follow up to r278902. I had missed "fall through", with a space. llvm-svn: 278970	2016-08-17 20:30:52 +00:00
Justin Bogner	b03fd12cef	Replace "fallthrough" comments with LLVM_FALLTHROUGH This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead. llvm-svn: 278902	2016-08-17 05:10:15 +00:00
Elliot Colp	82b1468a4d	Disable shrinking of SNaN constants When expanding FP constants, we attempt to shrink doubles to floats and perform an extending load. However, on SystemZ, and possibly on other targets (I've only confirmed the problem on SystemZ), the FP extending load instruction may convert SNaN into QNaN, or may cause an exception. So in the general case, we would still like to shrink FP constants, but SNaNs should be left as doubles. Differential Revision: https://reviews.llvm.org/D22685 llvm-svn: 277602	2016-08-03 15:09:21 +00:00
Simon Pilgrim	820f87a72d	[SelectionDAG] Optimization of BITREVERSE legalization for power-of-2 integer scalar/vector types An extension of D19978, this patch replaces the default BITREVERSE evaluation of individual bit masks+shifts with block mask+shifts when we have integer elements of power-of-2 bits in size. After calling BSWAP to reverse the order of the constituent bytes (which typically follows a similar approach), every neighbouring 4-bits, 2-bits and finally 1-bit pairs are masked off and swapped over with shifts. In doing so we can significantly reduce the number of operations required. Differential Revision: https://reviews.llvm.org/D21578 llvm-svn: 276432	2016-07-22 16:46:25 +00:00
Justin Lebar	9c375817ac	[SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, getStore, and friends. Summary: Instead, we take a single flags arg (a bitset). Also add a default 0 alignment, and change the order of arguments so the alignment comes before the flags. This greatly simplifies many callsites, and fixes a bug in AMDGPUISelLowering, wherein the order of the args to getLoad was inverted. It also greatly simplifies the process of adding another flag to getLoad. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits Differential Revision: http://reviews.llvm.org/D22249 llvm-svn: 275592	2016-07-15 18:27:10 +00:00
Craig Topper	2bd8b4b180	[CodeGen,Target] Remove the version of DAG.getVectorShuffle that takes a pointer to a mask array. Convert all callers to use the ArrayRef version. No functional change intended. For the most part this simplifies all callers. There were two places in X86 that needed an explicit makeArrayRef to shorten a statically sized array. llvm-svn: 274337	2016-07-01 06:54:47 +00:00
Rafael Espindola	db6bd02185	Delete unused includes. NFC. llvm-svn: 274225	2016-06-30 12:19:16 +00:00
Rafael Espindola	b1556c42ce	Use isPositionIndependent in a few more places. I think this converts all the simple cases that really just care about the generated code being position independent or not. The remaining uses are a bit more complicated and are checking things like "is this a library or executable" or "can this symbol be preempted". llvm-svn: 274055	2016-06-28 20:13:36 +00:00
Nirav Dave	bfdb483755	Preserve DebugInfo when replacing values in DAGCombiner Recommiting after correcting over-eager Debug Value transfer fixing PR28270. [DAG] Previously debug values would transfer debuginfo for the selected start node for a replacement which allows for debug to be dropped. Push debug value transfer to occur with node/value replacement in SelectionDAG, remove now extraneous transfers of debug values. This refixes PR9817 which was being incompletely checked in the testsuite. Reviewers: jyknight Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D21037 llvm-svn: 273585	2016-06-23 17:52:57 +00:00
Peter Collingbourne	6717803485	Revert r273456, "Preserve DebugInfo when replacing values in DAGCombiner" as it caused pr28270. llvm-svn: 273518	2016-06-23 00:06:17 +00:00
Nirav Dave	96beb7dee5	Preserve DebugInfo when replacing values in DAGCombiner Recommiting after fixing over-aggressive assertion [DAG] Previously debug values would transfer debuginfo for the selected start node for a replacement which allows for debug to be dropped. Push debug value transfer to occur with node/value replacement in SelectionDAG, remove now extraneous transfers of debug values. This refixes PR9817 which was being incompletely checked in the testsuite. Reviewers: jyknight Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D21037 llvm-svn: 273456	2016-06-22 19:03:26 +00:00
Krzysztof Parzyszek	e116d500a7	[SDAG] Remove FixedArgs parameter from CallLoweringInfo::setCallee The setCallee function will set the number of fixed arguments based on the size of the argument list. The FixedArgs parameter was often explicitly set to 0, leading to a lack of consistent value for non- vararg functions. Differential Revision: http://reviews.llvm.org/D20376 llvm-svn: 273403	2016-06-22 12:54:25 +00:00
Simon Pilgrim	bb8a40fdd2	Strip trailing whitespace llvm-svn: 273264	2016-06-21 14:37:39 +00:00
Daniel Sanders	bf2c03ee69	[arm+x86] Make GNU variants behave like GNU w.r.t combining sin+cos into sincos. Summary: canCombineSinCosLibcall() would previously combine sin+cos into sincos for GNUX32/GNUEABI/GNUEABIHF regardless of whether UnsafeFPMath were set or not. However, GNU would only combine them for UnsafeFPMath because sincos does not set errno like sin and cos do. It seems likely that this is an oversight. Reviewers: t.p.northover Subscribers: t.p.northover, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D21431 llvm-svn: 273259	2016-06-21 12:29:03 +00:00
Nirav Dave	194cb55f37	Revert "Preserve DebugInfo when replacing values in DAGCombiner" Reverting due to assertion failure in lib/CodeGen/SelectionDAG/InstrEmitter.cpp This reverts commit r272792. llvm-svn: 272799	2016-06-15 16:08:50 +00:00
Nirav Dave	a72e308403	Preserve DebugInfo when replacing values in DAGCombiner [DAG] Previously debug values would transfer debuginfo for the selected start node for a replacement which allows for debug to be dropped. Push debug value transfer to occur with node/value replacement in SelectionDAG, remove now extraneous transfers of debug values. This refixes PR9817 which was being incompletely checked in the testsuite. Reviewers: jyknight Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D21037 llvm-svn: 272792	2016-06-15 14:50:08 +00:00
Benjamin Kramer	bdc4956bac	Pass DebugLoc and SDLoc by const ref. This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512	2016-06-12 15:39:02 +00:00
Chad Rosier	1cb56a1850	Remove extra whitespace. NFC. llvm-svn: 269685	2016-05-16 20:03:02 +00:00
Marcin Koscielnicki	bbac890b53	[PR27599] [SystemZ] [SelectionDAG] Fix extension of atomic cmpxchg result. Currently, SelectionDAG assumes 8/16-bit cmpxchg returns either a sign extended result, or a zero extended result. SystemZ takes a third option by returning junk in the high bits (rotated contents of the other bytes in the memory word). In that case, don't use Assert*ext, and zero-extend the result ourselves if a comparison is needed. Differential Revision: http://reviews.llvm.org/D19800 llvm-svn: 269075	2016-05-10 16:49:04 +00:00
Craig Topper	7e5fad66f3	[CodeGen] When promoting CTTZ operations to larger type, don't insert a select to detect if the input is zero to return the original size instead of the extended size. Instead just set the first bit in the zero extended part. llvm-svn: 267280	2016-04-23 05:20:47 +00:00
Matt Arsenault	7846d885ed	LegalizeDAG: Move unaligned load/store expansion to TLI When custom lowered, this is not called if the store is custom lowered. Move it to be a utility function so targets can easily expand unaligned accesses when custom lowering. llvm-svn: 267029	2016-04-21 18:19:11 +00:00
Matt Arsenault	46ba31650e	LegalizeDAG: Don't replace vector store with integer if not legal For the same reason as the corresponding load change. Note that ExpandStore is completely broken for non-byte sized element vector stores, but preserve the current broken behavior which has tests for it. The behavior should be the same, but now introduces a new typed store that is incorrectly split later rather than doing it directly. llvm-svn: 264928	2016-03-30 21:15:18 +00:00
Matt Arsenault	a4b1b6ea05	LegalizeDAG: Don't replace vector load with integer unless legal On AMDGPU we want to be able to promote i64/f64 loads to v2i32. If the access is unaligned, this would conclude that since i64 is legal, it would convert it back to i64 and there is an endless legalization loop. Extract the logic for scalarizing the load into a new TargetLowering function, where this can also replace the custom function AMDGPU has for this. llvm-svn: 264927	2016-03-30 21:15:10 +00:00
Nirav Dave	fa250cad37	Prevent construction of cycle in DAG store merge When merging stores in DAGCombiner, add check to ensure that no dependenices exist that would cause the construction of a cycle in our DAG. This may happen if one store has a data dependence on another instruction (e.g. a load) which itself has a (chain) dependence on another store being merged. These stores cannot be merged safely and doing so results in a cycle that is discovered in LegalizeDAG. This test is only done in cases where Antialias analysis is used (UseAA) as non-AA store merge candidates will be merged logically after all loads which have been checked to not alias. Reviewers: ahatanak, spatel, niravd, arsenm, hfinkel, tstellarAMD, jyknight Subscribers: llvm-commits, tberghammer, danalbert, srhines Differential Revision: http://reviews.llvm.org/D18336 llvm-svn: 264461	2016-03-25 21:06:30 +00:00
Tim Northover	4498eff9bb	CodeGen: extend RHS when splitting ATOMIC_CMP_SWAP_WITH_SUCCESS. If the operation's type has been promoted during type legalization, we need to account for the fact that the high bits of the comparison operand are likely unspecified. The LHS is usually zero-extended, but MIPS sign extends it, so we have to be slightly careful. Patch by Simon Dardis. llvm-svn: 264296	2016-03-24 15:38:38 +00:00
Tim Northover	b49a8a9dbb	CodeGen: check return types match when emitting tail call to builtin. We were just completely ignoring the types when determining whether we could safely emit a libcall as a tail call. This is clearly wrong. Theoretically, we could dig deeper looking for incidental matches (much like the generic code in Analysis.cpp does), but it's probably not worth it for the few libcalls that exist. llvm-svn: 264084	2016-03-22 19:14:38 +00:00
James Y Knight	f44fc5219f	Tweak some atomics functions in preparation for larger changes; NFC. - Rename getATOMIC to getSYNC, as llvm will soon be able to emit both '__sync' libcalls and '__atomic' libcalls, and this function is for the '__sync' ones. - getInsertFencesForAtomic() has been replaced with shouldInsertFencesForAtomic(Instruction), so that the decision can be made per-instruction. This functionality will be used soon. - emitLeadingFence/emitTrailingFence are no longer called if shouldInsertFencesForAtomic returns false, and thus don't need to check the condition themselves. llvm-svn: 263665	2016-03-16 22:12:04 +00:00
Sanjay Patel	5719584129	[DAG] use isUndef() ; NFCI llvm-svn: 263448	2016-03-14 17:28:46 +00:00
Matt Arsenault	a67c4916cf	LegalizeDAG: Use correct ptr type when expanding unaligned load/store This fixes regressions exposed in existing AMDGPU tests in a future commit when all loads are custom lowered. llvm-svn: 262299	2016-03-01 05:13:35 +00:00
Matthias Braun	848e79c578	LegalizeDAG: Fix ExpandFCOPYSIGN assuming the same type on both inputs llvm-svn: 261306	2016-02-19 04:44:19 +00:00
Ahmed Bougacha	f8dfb47c02	[CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI. llvm-svn: 260316	2016-02-09 22:54:12 +00:00
Ahmed Bougacha	60b201b662	[CodeGen] Don't assume fp_to_fp16 produces i16 when legalizing it. Since r230276, we support an improved legalization for f64->f16, which goes through a temporary f32, improving codegen when f32->f16 is legal but not f64->f16. This requires unsafe-fp-math. However, that legalization assumed that the second step, producing a pseudo-softened f16, had type i16. That's not true on targets with illegal i16, such as ARM. Use the initial f64->f16 result type instead. llvm-svn: 257794	2016-01-14 19:45:36 +00:00
Matt Arsenault	5ca3c72c5a	LegalizeDAG: Expand ctlz with ctlz_zero_undef if legal llvm-svn: 257345	2016-01-11 16:37:46 +00:00
Robert Lougher	f0033b29d4	Fix cycle in selection DAG introduced by extractelement legalization During selection DAG legalization, extractelement is replaced with a load instruction. To do this, a temporary store to the stack is used unless an existing store is found that can be re-used. If re-using a store, the chain going out of the store must be replaced by the one going out of the new load (this ensures that any stores that must take place after the store happens after the load, else the value might be overwritten before it is loaded). The problem is, if the extractelement index is dependent on the store replacing the chain will introduce a cycle in the selection DAG (the load uses the index, and by replacing the chain we will make the index dependent on the load). To fix this, if the index is dependent on the store, the store is skipped. This is conservative as we may end up creating an unnecessary extra store to the stack. However, the situation is not expected to occur very often. Differential Revision: http://reviews.llvm.org/D15330 llvm-svn: 255114	2015-12-09 14:34:10 +00:00
Chih-Hung Hsieh	ed7d81e5d4	[X86] Part 1 to fix x86-64 fp128 calling convention. Almost all these changes are conditioned and only apply to the new x86-64 f128 type configuration, which will be enabled in a follow up patch. They are required together to make new f128 work. If there is any error, we should fix or revert them as a whole. These changes should have no impact to current configurations. * Relax type legalization checks to accept new f128 type configuration, whose TypeAction is TypeSoftenFloat, not TypeLegal, but also has TLI.isTypeLegal true. * Relax GetSoftenedFloat to return in some cases f128 type SDValue, which is TLI.isTypeLegal but not "softened" to i128 node. * Allow customized FABS, FNEG, FCOPYSIGN on new f128 type configuration, to generate optimized bitwise operators for libm functions. * Enhance related Lower* functions to handle f128 type. * Enhance DAGTypeLegalizer::run, SoftenFloatResult, and related functions to keep new f128 type in register, and convert f128 operators to library calls. * Fix Combiner, Emitter, Legalizer routines that did not handle f128 type. * Add ExpandConstant to handle i128 constants, ExpandNode to handle ISD::Constant node. * Add one more parameter to getCommonSubClass and firstCommonClass, to guarantee that returned common sub class will contain the specified simple value type. This extra parameter is used by EmitCopyFromReg in InstrEmitter.cpp. * Fix infinite loop in getTypeLegalizationCost when f128 is the value type. * Fix printOperand to handle null operand. * Enhance ISD::BITCAST node to handle f128 constant. * Expand new f128 type for BR_CC, SELECT_CC, SELECT, SETCC nodes. * Enhance X86AsmPrinter to emit f128 values in comments. Differential Revision: http://reviews.llvm.org/D15134 llvm-svn: 254653	2015-12-03 22:02:40 +00:00
Yury Gribov	d7dbb66eb8	Introduce new @llvm.get.dynamic.area.offset.i{32, 64} intrinsics. The @llvm.get.dynamic.area.offset.* intrinsic family is used to get the offset from native stack pointer to the address of the most recent dynamic alloca on the caller's stack. These intrinsics are intendend for use in combination with @llvm.stacksave and @llvm.restore to get a pointer to the most recent dynamic alloca. This is useful, for example, for AddressSanitizer's stack unpoisoning routines. Patch by Max Ostapenko. Differential Revision: http://reviews.llvm.org/D14983 llvm-svn: 254404	2015-12-01 11:40:55 +00:00
Eric Christopher	4675c439aa	Fix some places where we were assuming that memory type had been legalized to a simple type when lowering a truncating store of a vector type. In this case for an EVT we'll return Expand as we should in all of the cases anyhow. The testcase triggered at the one in VectorLegalizer::LegalizeOp, inspection found the rest. llvm-svn: 254061	2015-11-25 09:11:53 +00:00
Hans Wennborg	dcc2500452	X86: More efficient legalization of wide integer compares In particular, this makes the code for 64-bit compares on 32-bit targets much more efficient. Example: define i32 @test_slt(i64 %a, i64 %b) { entry: %cmp = icmp slt i64 %a, %b br i1 %cmp, label %bb1, label %bb2 bb1: ret i32 1 bb2: ret i32 2 } Before this patch: test_slt: movl 4(%esp), %eax movl 8(%esp), %ecx cmpl 12(%esp), %eax setae %al cmpl 16(%esp), %ecx setge %cl je .LBB2_2 movb %cl, %al .LBB2_2: testb %al, %al jne .LBB2_4 movl $1, %eax retl .LBB2_4: movl $2, %eax retl After this patch: test_slt: movl 4(%esp), %eax movl 8(%esp), %ecx cmpl 12(%esp), %eax sbbl 16(%esp), %ecx jge .LBB1_2 movl $1, %eax retl .LBB1_2: movl $2, %eax retl Differential Revision: http://reviews.llvm.org/D14496 llvm-svn: 253572	2015-11-19 16:35:08 +00:00
James Molloy	bb1dbf530a	[SDAG] Fix expansion of BITREVERSE Richard Trieu noted that UBSan detected an overflowing shift, and the obvious fix caused a crash. What was happening was that the shiftee (1U) was indeed too small for the possible range of shifts it had to handle, but also we were using "VT.getSizeInBits()" to get the maximum type bitwidth, but we wanted "VT.getScalarSizeInBits()" to get the vector lane size instead of the entire vector size. Use an APInt for the shift and VT.getScalarSizeInBits(). llvm-svn: 253023	2015-11-13 10:02:36 +00:00
James Molloy	90111f79f9	[SDAG] Introduce a new BITREVERSE node along with a corresponding LLVM intrinsic Several backends have instructions to reverse the order of bits in an integer. Conceptually matching such patterns is similar to @llvm.bswap, and it was mentioned in http://reviews.llvm.org/D14234 that it would be best if these patterns were matched in InstCombine instead of reimplemented in every different target. This patch introduces an intrinsic @llvm.bitreverse.i* that operates similarly to @llvm.bswap. For plumbing purposes there is also a new ISD node ISD::BITREVERSE, with simple expansion and promotion support. The intention is that InstCombine's BSWAP detection logic will be extended to support BITREVERSE too, and @llvm.bitreverse intrinsics emitted (if the backend supports lowering it efficiently). llvm-svn: 252878	2015-11-12 12:29:09 +00:00
Matthias Braun	b9610a6bc2	LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization - Factor out code to query and modify the sign bit of a floatingpoint value as an integer. This also works if none of the targets integer types is big enough to hold all bits of the floatingpoint value. - Legalize FABS(x) as FCOPYSIGN(x, 0.0) if FCOPYSIGN is available, otherwise perform bit manipulation on the sign bit. The previous code used "x >u 0 ? x : -x" which is incorrect for x being -0.0! It also takes 34 instructions on ARM Cortex-M4. With this patch we only require 5: vldr d0, LCPI0_0 vmov r2, r3, d0 lsrs r2, r3, #31 bfi r1, r2, #31, #1 bx lr (This could be further improved if the compiler would recognize that r2, r3 is zero). - Only lower FCOPYSIGN(x, y) = sign(x) ? -FABS(x) : FABS(x) if FABS is available otherwise perform bit manipulation on the sign bit. - Perform the sign(x) test by masking out the sign bit and comparing with 0 rather than shifting the sign bit to the highest position and testing for "<s 0". For x86 copysignl (on 80bit values) this gets us: testl $32768, %eax rather than: shlq $48, %rax sets %al testb %al, %al Differential Revision: http://reviews.llvm.org/D11172 llvm-svn: 252839	2015-11-12 01:02:47 +00:00
Matt Arsenault	aa118e299c	LegalizeDAG: Implement promote for scalar_to_vector This allows avoiding the default Expand behavior which introduces stack usage. Bitcast the scalar and replace the missing elements with undef. This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252632	2015-11-10 18:48:11 +00:00
Matt Arsenault	a46aa641f2	LegalizeDAG: Implement promote for insert_vector_elt This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252631	2015-11-10 18:48:08 +00:00
Matt Arsenault	0b7958a59b	LegalizeDAG: Implement promote for extract_vector_elt This is for AMDGPU to implement v2i64 extract as extract of half of a v4i32. This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252630	2015-11-10 18:48:04 +00:00
Matt Arsenault	29f9663f97	LegalizeDAG: Implement promote for build_vector This will be used in future commits for AMDGPU to promote operations on i64 vectors into operations on 32-bit vector components. This will be used / tested in future AMDGPU commits. llvm-svn: 250945	2015-10-21 21:10:10 +00:00
Artyom Skrobov	7fd67e25aa	Adding support for TargetLoweringBase::LibCall Summary: TargetLoweringBase::Expand is defined as "Try to expand this to other ops, otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between the two possibilities was defined in a rather convoluted way: - if DIVREM is legal, expand to DIVREM - if DIVREM has a custom lowering, expand to DIVREM - if DIVREM libcall is defined and a remainder from the same division is computed elsewhere, expand to a DIVREM libcall - else, expand to a DIV libcall This had the undesirable effect that if both DIV and DIVREM are implemented as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM libcall, even when the remainder isn't used. The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that backends can directly control whether they prefer an expansion or a conversion to a libcall. This makes the generic lowering code even more generic, allowing its reuse in a wider range of target-specific configurations. The useful effect is that ARM backend will now generate a call to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where it doesn't need the remainder. There's no functional change outside the ARM backend. Reviewers: t.p.northover, rengolin Subscribers: t.p.northover, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D13862 llvm-svn: 250826	2015-10-20 13:14:52 +00:00
Artyom Skrobov	b844fa7fc0	Combining DIV+REM->DIVREM doesn't belong in LegalizeDAG; move it over into DAGCombiner. Summary: In addition to moving the code over, this patch amends the DIV,REM -> DIVREM combining to run on all affected nodes at once: if the nodes are converted to DIVREM one at a time, then the resulting DIVREM may get legalized by the backend into something target-specific that we won't be able to recognize and correlate with the remaining nodes. The motivation is to "prepare terrain" for D13862: when we set DIV and REM to be legalized to libcalls, instead of the DIVREM, we otherwise lose the ability to combine them together. To prevent this, we need to take the DIV,REM -> DIVREM combining out of the lowering stage. Reviewers: RKSimon, eli.friedman, rengolin Subscribers: john.brawn, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D13733 llvm-svn: 250825	2015-10-20 13:06:02 +00:00
Duncan P. N. Exon Smith	e400a7d412	SelectionDAG: Remove implicit ilist iterator conversions, NFC llvm-svn: 250214	2015-10-13 19:47:46 +00:00
Sanjay Patel	a260701bbb	propagate fast-math-flags on DAG nodes After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing, so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests: if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is one test case in this patch to prove that point. This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF ( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes. This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the current global settings. Differential Revision: http://reviews.llvm.org/D12095 llvm-svn: 247815	2015-09-16 16:31:21 +00:00
Matt Arsenault	acd68b58ae	SelectionDAG: Support Expand of f16 extloads Currently this hits an assert that extload should always be supported, which assumes integer extloads. This moves a hack out of SI's argument lowering and is covered by existing tests. llvm-svn: 247113	2015-09-09 01:12:27 +00:00
Ahmed Bougacha	f9c19da03a	[CodeGen] Support (and default to) expanding READCYCLECOUNTER to 0. For targets that didn't support this, this will let us respect the langref instead of failing to select. Note that we don't need to change the 32-bit x86/PPC lowerings (to account for the result type/# difference) because they're both custom and bypass type legalization. llvm-svn: 246258	2015-08-28 01:49:59 +00:00
Charles Davis	119525914c	Make variable argument intrinsics behave correctly in a Win64 CC function. Summary: This change makes the variable argument intrinsics, `llvm.va_start` and `llvm.va_copy`, and the `va_arg` instruction behave as they do on Windows inside a `CallingConv::X86_64_Win64` function. It's needed for a Clang patch I have to add support for GCC's `__builtin_ms_va_list` constructs. Reviewers: nadav, asl, eugenis CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1622 llvm-svn: 245990	2015-08-25 23:27:41 +00:00
Ahmed Bougacha	a196661bb0	[CodeGen] Mark the promoted FCOPYSIGN result FP_ROUND as TRUNCating. Now that we can properly promote mismatched FCOPYSIGNs (r244858), we can mark the FP_ROUND on the result as truncating, to expose folding. FCOPYSIGN doesn't change anything but the sign bit, so (fp_round (fcopysign (fpext a), b)) is equivalent to (modulo the sign bit): (fp_round (fpext a)) which is a no-op. llvm-svn: 244862	2015-08-13 01:32:30 +00:00
Ahmed Bougacha	40ded502ff	[CodeGen] When Promoting, don't extend the 2nd FCOPYSIGN operand. We don't care about its type, and there's even a combine that'll fold away the FP_EXTEND if we let it run. However, until it does, we'll have something broken like: (f32 (fp_extend (f64 v))) Scalar f16 follow-up to r243924. llvm-svn: 244858	2015-08-13 01:09:43 +00:00
Alex Lorenz	e40c8a2b26	PseudoSourceValue: Replace global manager with a manager in a machine function. This commit removes the global manager variable which is responsible for storing and allocating pseudo source values and instead it introduces a new manager class named 'PseudoSourceValueManager'. Machine functions now own an instance of the pseudo source value manager class. This commit also modifies the 'get...' methods in the 'MachinePointerInfo' class to construct pseudo source values using the instance of the pseudo source value manager object from the machine function. This commit updates calls to the 'get...' methods from the 'MachinePointerInfo' class in a lot of different files because those calls now need to pass in a reference to a machine function to those methods. This change will make it easier to serialize pseudo source values as it will enable me to transform the mips specific MipsCallEntry PseudoSourceValue subclass into two target independent subclasses. Reviewers: Akira Hatanaka llvm-svn: 244693	2015-08-11 23:09:45 +00:00
Hal Finkel	caf1149b8b	[SDAG] Fix a result chain in ExpandUnalignedLoad On the code path in ExpandUnalignedLoad which expands an unaligned vector/fp value in terms of a legal integer load of the same size, the ChainResult needs to be the chain result of the integer load. No in-tree test case is currently available. Patch by Jan Hranac! llvm-svn: 243956	2015-08-04 06:29:12 +00:00
Sanjay Patel	0f9dcf8b90	move DAGCombiner's allowableAlignment() helper function into the TLI Making allowableAlignment() more accessible was suggested as a predecessor patch for D10662, so I've pulled it into TargetLowering. This let's us remove 4 instances of duplicate logic in LegalizeDAG. There's a subtle functional change in the implementation: the existing allowableAlignment() code was using getPrefTypeAlignment() when checking alignment with the DataLayout and assumed that was fast. In this implementation, we use getABITypeAlignment() and assume that is fast. See the TODO comment or the discussion in the Phab review for future improvements in this implementation (don't use the data layout at all). There are no regression test changes from this difference, and I'm not sure how to expose it via a test. I think we actually do want to provide the 'Fast' param when checking this from DAGCombiner::MergeConsecutiveStores(). Ie, we shouldn't merge stores if the new stores are not going to be fast. But that change will require fixing allowsMisalignedMemoryAccess() overrides as noted in D10662. Differential Revision: http://reviews.llvm.org/D10905 llvm-svn: 243549	2015-07-29 18:24:18 +00:00
Matthias Braun	3cd00c1739	Fix __builtin_setjmp in combination with sjlj exception handling. llvm.eh.sjlj.setjmp was used as part of the SjLj exception handling style but is also used in clang to implement __builtin_setjmp. The ARM backend needs to output additional dispatch tables for the SjLj exception handling style, these tables however can't be emitted if llvm.eh.sjlj.setjmp is simply used for __builtin_setjmp and no actual landing pad blocks exist. To solve this issue a new llvm.eh.sjlj.setup_dispatch intrinsic is introduced which is used instead of llvm.eh.sjlj.setjmp in the SjLj exception handling lowering, so we can differentiate between the case where we actually need to setup a dispatch table and the case where we just need the __builtin_setjmp semantic. Differential Revision: http://reviews.llvm.org/D9313 llvm-svn: 242481	2015-07-16 22:34:16 +00:00
Pete Cooper	8acd386969	Use getZExtOrTrunc helper instead of manually doing zext/trunc check. NFC. The code here was doing exactly what is already in getZExtOrTrunc(). Just use that method instead. llvm-svn: 242260	2015-07-15 00:43:54 +00:00
Matthias Braun	75e668ea6e	Revert "LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization" Accidental commit, needs review first. This reverts commit r242107. llvm-svn: 242108	2015-07-14 02:09:57 +00:00
Matthias Braun	4ac4ecdadf	LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization - Factor out code to query and modify the sign bit of a floatingpoint value as an integer. This also works if none of the targets integer types is big enough to hold all bits of the floatingpoint value. - Legalize FABS(x) as FCOPYSIGN(x, 0.0) if FCOPYSIGN is available, otherwise perform bit manipulation on the sign bit. The previous code used "x >u 0 ? x : -x" which is incorrect for x being -0.0! It also takes 34 instructions on ARM Cortex-M4. With this patch we only require 5: vldr d0, LCPI0_0 vmov r2, r3, d0 lsrs r2, r3, #31 bfi r1, r2, #31, #1 bx lr (This could be further improved if the compiler would recognize that r2, r3 is zero). - Only lower FCOPYSIGN(x, y) = sign(x) ? -FABS(x) : FABS(x) if FABS is available otherwise perform bit manipulation on the sign bit. - Perform the sign(x) test by masking out the sign bit and comparing with 0 rather than shifting the sign bit to the highest position and testing for "<s 0". For x86 copysignl (on 80bit values) this gets us: testl $32768, %eax rather than: shlq $48, %rax sets %al testb %al, %al llvm-svn: 242107	2015-07-14 02:08:26 +00:00
Mehdi Amini	9639d650bb	Make TargetLowering::getShiftAmountTy() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11037 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241776	2015-07-09 02:09:20 +00:00
Mehdi Amini	44ede33a69	Make TargetLowering::getPointerTy() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11028 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241775	2015-07-09 02:09:04 +00:00
Mehdi Amini	8ac7a9d57a	Redirect DataLayout from TargetMachine to Module in SelectionDAG Summary: SelectionDAG itself is not invoking directly the DataLayout in the TargetMachine, but the "TargetLowering" class is still using it. I'll address it in a following commit. This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11000 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241618	2015-07-07 19:07:19 +00:00
Pete Cooper	af61ac71e2	Wrap assert loops in #ifndef NDEBUG The body of the loops here only contained asserts. This triggered an unused variable warning on release builds and -Werror on the bots. llvm-svn: 240819	2015-06-26 19:23:20 +00:00
Pete Cooper	8fc121dfc4	Convert a bunch of loops to foreach. NFC. This uses the new SDNode::op_values() iterator range committed in r240805. llvm-svn: 240815	2015-06-26 19:08:33 +00:00
Alexander Kornienko	f00654e31b	Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) Apparently, the style needs to be agreed upon first. llvm-svn: 240390	2015-06-23 09:49:53 +00:00
Alexander Kornienko	70bc5f1398	Fixed/added namespace ending comments using clang-tidy. NFC The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-,llvm-namespace-comment -header-filter='llvm/.\|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137	2015-06-19 15:57:42 +00:00
James Molloy	7e9776b559	Add SDNodes for umin, umax, smin and smax. This adds new SDNodes for signed/unsigned min/max. These nodes are built from select/icmp pairs matched at SDAGBuilder stage. This patch adds the nodes, as well as legalization support and sets them to be "expand" for all targets. NFC for now; this will be tested when I switch AArch64 to using these new nodes. llvm-svn: 237423	2015-05-15 09:03:15 +00:00
Eric Christopher	824f42f209	Migrate existing backends that care about software floating point to use the information in the module rather than TargetOptions. We've had and clang has used the use-soft-float attribute for some time now so have the backends set a subtarget feature based on a particular function now that subtargets are created based on functions and function attributes. For the one middle end soft float check go ahead and create an overloadable TargetLowering::useSoftFloat function that just checks the TargetSubtargetInfo in all cases. Also remove the command line option that hard codes whether or not soft-float is set by using the attribute for all of the target specific test cases - for the generic just go ahead and add the attribute in the one case that showed up. llvm-svn: 237079	2015-05-12 01:26:05 +00:00
Jan Vesely	7539548738	CodeGen: Default overflow operations to expand so we don't have to assume targets are lying Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: ab Differential Revision: http://reviews.llvm.org/D9265 llvm-svn: 236119	2015-04-29 16:30:46 +00:00
Sergey Dmitrouk	842a51bad8	Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes" [DebugInfo] Add debug locations to constant SD nodes This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235989	2015-04-28 14:05:47 +00:00
Daniel Jasper	48e93f7181	Revert "[DebugInfo] Add debug locations to constant SD nodes" This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870 llvm-svn: 235987	2015-04-28 13:38:35 +00:00
Sergey Dmitrouk	adb4c69d5c	[DebugInfo] Add debug locations to constant SD nodes This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235977	2015-04-28 11:56:37 +00:00
Ahmed Bougacha	1ffe7c7d36	[AArch64] Promote f16 operations to f32. For the most common ones (such as fadd), we already did the promotion. Do the same thing for all the others. Currently, we'll just crash/assert on all these operations, as there's no hardware or libcall support whatsoever. f16 (half) is specified as an interchange - not arithmetic - format, and is expected to be promoted to single-precision for arithmetic operations. While there, teach the legalizer about promoting some of the (mostly floating-point) operations that we never needed before. Differential Revision: http://reviews.llvm.org/D8648 See related discussion on the thread for: http://reviews.llvm.org/D8755 llvm-svn: 234550	2015-04-10 00:08:48 +00:00
Ahmed Bougacha	c809761dc0	[CodeGen] Replace the reused stores' chain for extractelt expansion. This fixes a subtle issue that was introduced in r205153. When reusing a store for the extractelement expansion (to load directly from it, inserting of going through the stack), later stores to the same location might have overwritten the data we were expecting to extract from. To fix that, we need to explicitly replace the chain going out of the reused store, so that later stores also have an explicit dependency on the generated element-extracting loads, and can't clobber them. rdar://20066785 Differential Revision: http://reviews.llvm.org/D8180 llvm-svn: 231721	2015-03-09 22:51:05 +00:00
Benjamin Kramer	c54c38e090	SDAG: Merge the meat of two ExpandAtomic implementations. The copies already diverged, don't let them become any worse. Reduce redundancy in code with a little macro metaprogramming. llvm-svn: 231401	2015-03-05 20:04:29 +00:00
Hal Finkel	cec70130ac	[SDAG] Handle LowerOperation returning its input consistently For almost all node types, if the target requested custom lowering, and LowerOperation returned its input, we'd treat the original node as legal. This did not work, however, for many loads and stores, because they follow slightly different code paths, and we did not account for the possibility of LowerOperation returning its input at those call sites. I think that we now handle this consistently everywhere. At the call sites in LegalizeDAG, we used to assert in this case, so there's no functional change for any existing code there. For the call sites in LegalizeVectorOps, this really only affects whether or not we set Changed = true, but I think makes the semantics clearer. No test case here, but it will be covered by an upcoming PowerPC commit adding QPX support. llvm-svn: 230332	2015-02-24 12:59:47 +00:00
Andrea Di Biagio	af3f397b10	[X86] Teach how to custom lower double-to-half conversions under fast-math. This patch teaches the backend how to expand a double-half conversion into a double-float conversion immediately followed by a float-half conversion. We do this only under fast-math, and if float-half conversions are legal for the target. Added test CodeGen/X86/fastmath-float-half-conversion.ll Differential Revision: http://reviews.llvm.org/D7832 llvm-svn: 230276	2015-02-23 22:59:02 +00:00
Matt Arsenault	0dc54c4dee	Add generic fmad DAG node. This allows sharing of FMA forming combines to work with instructions that have the same semantics as a separate multiply and add. This is expand by default, and only formed post legalization so it shouldn't have much impact on targets that do not want it. llvm-svn: 230070	2015-02-20 22:10:33 +00:00
Matt Arsenault	bd22342322	Implement new way of expanding extloads. Now that the source and destination types can be specified, allow doing an expansion that doesn't use an EXTLOAD of the result type. Try to do a legal extload to an intermediate type and extend that if possible. This generalizes the special case custom lowering of extloads R600 has been using to work around this problem. This also happens to fix a bug that would incorrectly use more aligned loads than should be used. llvm-svn: 225925	2015-01-14 01:35:17 +00:00
Ahmed Bougacha	2b6917b020	[SelectionDAG] Allow targets to specify legality of extloads' result type (in addition to the memory type). The LoadExt legalization handling used to only have one type, the memory type. This forced users to assume that as long as the extload for the memory type was declared legal, and the result type was legal, the whole extload was legal. However, this isn't always the case. For instance, on X86, with AVX, this is legal: v4i32 load, zext from v4i8 but this isn't: v4i64 load, zext from v4i8 Whereas v4i64 is (arguably) legal, even without AVX2. Note that the same thing was done a while ago for truncstores (r46140), but I assume no one needed it yet for extloads, so here we go. Calls to getLoadExtAction were changed to add the value type, found manually in the surrounding code. Calls to setLoadExtAction were mechanically changed, by wrapping the call in a loop, to match previous behavior. The loop iterates over the MVT subrange corresponding to the memory type (FP vectors, etc...). I also pulled neighboring setTruncStoreActions into some of the loops; those shouldn't make a difference, as the additional types are illegal. (e.g., i128->i1 truncstores on PPC.) No functional change intended. Differential Revision: http://reviews.llvm.org/D6532 llvm-svn: 225421	2015-01-08 00:51:32 +00:00
Sanjay Patel	eb4a4d5aeb	Don't repeat class/function/variable names in comments. NFC. llvm-svn: 222555	2014-11-21 18:58:38 +00:00
Sanjay Patel	b06441aded	Less space; NFC llvm-svn: 222546	2014-11-21 18:05:59 +00:00
David Blaikie	70573dcd9f	Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> This is to be consistent with StringSet and ultimately with the standard library's associative container insert function. This lead to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions... llvm-svn: 222334	2014-11-19 07:49:26 +00:00
Owen Anderson	b5a259935c	Fix an incorrect chain operand when expanding INSERT_VECTOR operations through the stack. Patch by Daniil Troshkov! llvm-svn: 222254	2014-11-18 20:50:19 +00:00
Matt Arsenault	7c93690be0	Add minnum / maxnum codegen llvm-svn: 220342	2014-10-21 23:01:01 +00:00
Eric Christopher	85de8f98a9	Use the subtarget on the dag to get TargetFrameLowering rather than off the target machine. llvm-svn: 219378	2014-10-09 01:35:27 +00:00
Benjamin Kramer	c6cc58e703	Remove unnecessary copying or replace it with moves in a bunch of places. NFC. llvm-svn: 219061	2014-10-04 16:55:56 +00:00
Sanjay Patel	bb29221129	Replace dead links to "Hacker's Delight" with general references. NFC. llvm-svn: 217814	2014-09-15 19:47:44 +00:00
Chandler Carruth	74ec9e19ee	[SDAG] Re-instate r215611 with a fix to a pesky X86 DAG combine. This combine is essentially combining target-specific nodes back into target independent nodes that it "knows" will be combined yet again by a target independent DAG combine into a different set of target-independent nodes that are legal (not custom though!) and thus "ok". This seems... deeply flawed. The crux of the problem is that we don't combine un-legalized shuffles that are introduced by legalizing other operations, and thus we don't see a very profitable combine opportunity. So the backend just forces the input to that combine to re-appear. However, for this to work, the conditions detected to re-form the unlegalized nodes must be exactly right. Previously, failing this would have caused poor code (if you're lucky) or a crasher when we failed to select instructions. After r215611 we would fall back into the legalizer. In some cases, this just "fixed" the crasher by produces bad code. But in the test case added it caused the legalizer and the dag combiner to iterate forever. The fix is to make the alignment checking in the x86 side of things match the alignment checking in the generic DAG combine exactly. This isn't really a satisfying or principled fix, but it at least make the code work as intended. It also highlights that it would be nice to detect the availability of under aligned loads for a given type rather than bailing on this optimization. I've left a FIXME to document this. Original commit message for r215611 which covers the rest of the chang: [SDAG] Fix a case where we would iteratively legalize a node during combining by replacing it with something else but not re-process the node afterward to remove it. In a truly remarkable stroke of bad luck, this would (in the test case attached) end up getting some other node combined into it without ever getting re-processed. By adding it back on to the worklist, in addition to deleting the dead nodes more quickly we also ensure that if it stops being dead for any reason it makes it back through the legalizer. Without this, the test case will end up failing during instruction selection due to an and node with a type we don't have an instruction pattern for. It took many million runs of the shuffle fuzz tester to find this. llvm-svn: 216537	2014-08-27 11:22:16 +00:00
Nick Lewycky	a4967c2740	Revert r215611 because it caused the infinite loop in bug 20736. There is a reduced testcase in that bug. llvm-svn: 216307	2014-08-23 00:45:03 +00:00
Oliver Stannard	51b1d460cb	[ARM] Enable DP copy, load and store instructions for FPv4-SP The FPv4-SP floating-point unit is generally referred to as single-precision only, but it does have double-precision registers and load, store and GPR<->DPR move instructions which operate on them. This patch enables the use of these registers, the main advantage of which is that we now comply with the AAPCS-VFP calling convention. This partially reverts r209650, which added some AAPCS-VFP support, but did not handle return values or alignment of double arguments in registers. This patch also adds tests for Thumb2 code generation for floating-point instructions and intrinsics, which previously only existed for ARM. llvm-svn: 216172	2014-08-21 12:50:31 +00:00
Oliver Stannard	f5469bec97	Teach the AArch64 backend to handle f16 This allows the AArch64 backend to handle fadd, fsub, fmul and fdiv operations on f16 (half-precision) types by promoting to f32. llvm-svn: 215891	2014-08-18 14:22:39 +00:00
Chandler Carruth	8039b16de7	[SDAG] Fix a case where we would iteratively legalize a node during combining by replacing it with something else but not re-process the node afterward to remove it. In a truly remarkable stroke of bad luck, this would (in the test case attached) end up getting some other node combined into it without ever getting re-processed. By adding it back on to the worklist, in addition to deleting the dead nodes more quickly we also ensure that if it stops being dead for any reason it makes it back through the legalizer. Without this, the test case will end up failing during instruction selection due to an and node with a type we don't have an instruction pattern for. It took many million runs of the shuffle fuzz tester to find this. llvm-svn: 215611	2014-08-14 01:07:37 +00:00
Eric Christopher	d913448b38	Remove the TargetMachine forwards for TargetSubtargetInfo based information and update all callers. No functional change. llvm-svn: 214781	2014-08-04 21:25:23 +00:00
Chandler Carruth	1f52b3da0a	[SDAG] Begin simplifying the way in which the legalizer deletes nodes. This lifts the (very few) places the legalizer would delete dead nodes into the outer loop around the legalizer. This is significantly simpler because it doesn't require the legalizer itself to manage the iterator validity, and it doesn't require the legalizer to be a DAG update listener in order to remove things from the legalized set. It also makes the interface much less contrived for the case of the legalizer running inside the last phase of DAG combining. I'm working on centralizing the deletion of nodes during both legalizing and combining as much as possible. My hope is to remove the need for DAG update listeners from the combiner next, which would remove a costly virtual dispatch chain on every deletion. This in turn should allow us to more aggressively delete DAG nodes during combining which will in turn allow us to combine more aggressively by exposing the actual nodes which have single users to the combine phases. llvm-svn: 214546	2014-08-01 19:49:59 +00:00
Louis Gerbarg	67474e3755	Make sure no loads resulting from load->switch DAGCombine are marked invariant Currently when DAGCombine converts loads feeding a switch into a switch of addresses feeding a load the new load inherits the isInvariant flag of the left side. This is incorrect since invariant loads can be reordered in cases where it is illegal to reoarder normal loads. This patch adds an isInvariant parameter to getExtLoad() and updates all call sites to pass in the data if they have it or false if they don't. It also changes the DAGCombine to use that data to make the right decision when creating the new load. llvm-svn: 214449	2014-07-31 21:45:05 +00:00
Chandler Carruth	b143274ad0	[SDAG] Add DEBUG logging to the legalizer, fixing a "bug" found by inspection in the proccess, and shuffle the logging in the DAG combiner around a bit. With this it is much easier to follow what the legalizer is doing. It should even accurately present most of the strange legalization operations where a single node is replaced by multiple nodes, etc. There is still some information lost (we log SDNodes not SDValues so we don't log which result is used for which thing), but I think this is much closer to a usable system. Notably, this will make it much more apparant when legalization is actually happening inside the combiner, or when there is a cycle caused by interactions of the legalizer and the combiner. The "bug" I fixed here I'm not sure is remotely possible to trigger. We were only adding one of the nodes in a replacement to the updated set rather than all of the nodes in the replacement. Realistically, the worst result of this are nodes not getting back onto the worklist in the DAG combiner. I doubt it is possible to trigger this today, and I certainly don't have any ideas about how, but this at least brings the code into alignment with the principled operation of the routine. llvm-svn: 214105	2014-07-28 17:55:07 +00:00
Matt Arsenault	6f2a526101	Add alignment value to allowsUnalignedMemoryAccess Rename to allowsMisalignedMemoryAccess. On R600, 8 and 16 byte accesses are mostly OK with 4-byte alignment, and don't need to be split into multiple accesses. Vector loads with an alignment of the element type are not uncommon in OpenCL code. llvm-svn: 214055	2014-07-27 17:46:40 +00:00
Chandler Carruth	5a85c7beb8	[SDAG] Add an assert that we don't mess up the number of values when replacing nodes in the legalizer. This caught a number of bugs for me during development. llvm-svn: 214022	2014-07-26 05:53:16 +00:00
Chandler Carruth	98655fa4d8	[SDAG] Simplify the code for handling single-value nodes and add a missing transfer of debug information (without which tests fail). llvm-svn: 214021	2014-07-26 05:52:51 +00:00
Chandler Carruth	411fb407f8	[SDAG] When performing post-legalize DAG combining, run the legalizer over each node in the worklist prior to combining. This allows the combiner to produce new nodes which need to go back through legalization. This is particularly useful when generating operands to target specific nodes in a post-legalize DAG combine where the operands are significantly easier to express as pre-legalized operations. My immediate use case will be PSHUFB formation where we need to build a constant shuffle mask with a build_vector node. This also refactors the relevant functionality in the legalizer to support this, and updates relevant tests. I've spoken to the R600 folks and these changes look like improvements to them. The avx512 change needs to be investigated, I suspect there is a disagreement between the legalizer and the DAG combiner there, but it seems a minor issue so leaving it to be re-evaluated after this patch. Differential Revision: http://reviews.llvm.org/D4564 llvm-svn: 214020	2014-07-26 05:49:40 +00:00
Hal Finkel	cc39b67530	AA metadata refactoring (introduce AAMDNodes) In order to enable the preservation of noalias function parameter information after inlining, and the representation of block-level __restrict__ pointer information (etc.), additional kinds of aliasing metadata will be introduced. This metadata needs to be carried around in AliasAnalysis::Location objects (and MMOs at the SDAG level), and so we need to generalize the current scheme (which is hard-coded to just one TBAA MDNode). This commit introduces only the necessary refactoring to allow for the introduction of other aliasing metadata types, but does not actually introduce any (that will come in a follow-up commit). What it does introduce is a new AAMDNodes structure to hold all of the aliasing metadata nodes associated with a particular memory-accessing instruction, and uses that structure instead of the raw MDNode in AliasAnalysis::Location, etc. No functionality change intended. llvm-svn: 213859	2014-07-24 12:16:19 +00:00
Tim Northover	84ce0a642e	CodeGen: generate single libcall for fptrunc -> f16 operations. Previously we asserted on this code. Currently compiler-rt doesn't actually implement any of these new libcalls, but external help is pretty much the only viable option for LLVM. I've followed the much more generic "__truncST2" naming, as opposed to the odd name for f32 -> f16 truncation. This can obviously be changed later, or overridden by any targets that need to. llvm-svn: 213252	2014-07-17 11:12:12 +00:00
Tim Northover	fd7e424935	CodeGen: extend f16 conversions to permit types > float. This makes the two intrinsics @llvm.convert.from.f16 and @llvm.convert.to.f16 accept types other than simple "float". This is only strictly needed for the truncate operation, since otherwise double rounding occurs and there's no way to represent the strict IEEE conversion. However, for symmetry we allow larger types in the extend too. During legalization, we can expand an "fp16_to_double" operation into two extends for convenience, but abort when the truncate isn't legal. A new libcall is probably needed here. Even after this commit, various target tweaks are needed to actually use the extended intrinsics. I've put these into separate commits for clarity, so there are no actual tests of f64 conversion here. llvm-svn: 213248	2014-07-17 10:51:23 +00:00
Oliver Stannard	6eda6ffc0c	ARM: Allow __fp16 as a function arg or return type for AArch64 ACLE 2.0 allows __fp16 to be used as a function argument or return type. This enables this for AArch64. llvm-svn: 212812	2014-07-11 13:33:46 +00:00
Jan Vesely	eca89d283e	SelectionDAG: Factor FP_TO_SINT lower code out of DAGLegalizer Move the code to a helper function to allow calls from TypeLegalizer. No functionality change intended Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> Reviewed-by: Owen Anderson <resistor@mac.com> llvm-svn: 212772	2014-07-10 22:40:18 +00:00
Daniel Sanders	cbd44c591d	Make it possible for ints/floats to return different values from getBooleanContents() Summary: On MIPS32r6/MIPS64r6, floating point comparisons return 0 or -1 but integer comparisons return 0 or 1. Updated the various uses of getBooleanContents. Two simplifications had to be disabled when float and int boolean contents differ: - ScalarizeVecRes_VSELECT except when the kind of boolean contents is trivially discoverable (i.e. when the condition of the VSELECT is a SETCC node). - visitVSELECT (select C, 0, 1) -> (xor C, 1). Come to think of it, this one could test for the common case of 'C' being a SETCC too. Preserved existing behaviour for all other targets and updated the affected MIPS32r6/MIPS64r6 tests. This also fixes the pi benchmark where the 'low' variable was counting in the wrong direction because it thought it could simply add the result of the comparison. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: hfinkel, jholewinski, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D4389 llvm-svn: 212697	2014-07-10 10:18:12 +00:00
Juergen Ributzka	3bd03c7099	[DAG] Pass the argument list to the CallLoweringInfo via move semantics. NFCI. The argument list vector is never used after it has been passed to the CallLoweringInfo and moving it to the CallLoweringInfo is cleaner and pretty much as cheap as keeping a pointer to it. llvm-svn: 212135	2014-07-01 22:01:54 +00:00
Tom Stellard	aad4659470	SelectionDAG: Expand i64 = FP_TO_SINT i32 llvm-svn: 211108	2014-06-17 16:53:07 +00:00
Tim Northover	65277a2bc0	LegalizeDAG: make sure cast is unsigned before using FP_TO_UINT. It's valid to use FP_TO_SINT when asking for a smaller type (e.g. all "unsigned int16" values fit into a "signed int32"), but the reverse isn't true. Unfortunately, I'm not actually aware of any architecture with asymmetric FP_TO_SINT and FP_TO_UINT handling and the logic happens to work in the symmetric case, so I can't actually write a test for this. llvm-svn: 210986	2014-06-15 09:27:20 +00:00
Tim Northover	420a216817	IR: add "cmpxchg weak" variant to support permitted failure. This commit adds a weak variant of the cmpxchg operation, as described in C++11. A cmpxchg instruction with this modifier is permitted to fail to store, even if the comparison indicated it should. As a result, cmpxchg instructions must return a flag indicating success in addition to their original iN value loaded. Thus, for uniformity all cmpxchg instructions now return "{ iN, i1 }". The second flag is 1 when the store succeeded. At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been added as the natural representation for the new cmpxchg instructions. It is a strong cmpxchg. By default this gets Expanded to the existing ATOMIC_CMP_SWAP during Legalization, so existing backends should see no change in behaviour. If they wish to deal with the enhanced node instead, they can call setOperationAction on it. Beware: as a node with 2 results, it cannot be selected from TableGen. Currently, no use is made of the extra information provided in this patch. Test updates are almost entirely adapting the input IR to the new scheme. Summary for out of tree users: ------------------------------ + Legacy Bitcode files are upgraded during read. + Legacy assembly IR files will be invalid. + Front-ends must adapt to different type for "cmpxchg". + Backends should be unaffected by default. llvm-svn: 210903	2014-06-13 14:24:07 +00:00
Tom Stellard	3ca1bfc728	SelectionDAG: Expand SELECT_CC to SELECT + SETCC This consolidates code from the Hexagon, R600, and XCore targets. No functionality change intended. llvm-svn: 210539	2014-06-10 16:01:22 +00:00
Matt Arsenault	3ee3746374	Fix wrong setcc result type when legalizing uaddo/usubo No test because no in-tree targets change the bitwidth of the setcc type depending on the bitwidth of the compared type. Patch by Ke Bai llvm-svn: 209771	2014-05-28 20:51:42 +00:00
Saleem Abdulrasool	f3a5a5c546	Target: remove old constructors for CallLoweringInfo This is mostly a mechanical change changing all the call sites to the newer chained-function construction pattern. This removes the horrible 15-parameter constructor for the CallLoweringInfo in favour of setting properties of the call via chained functions. No functional change beyond the removal of the old constructors are intended. llvm-svn: 209082	2014-05-17 21:50:17 +00:00
Pete Cooper	7fd1d725b9	Use a logical not when inverting SetCC. This unfortunately doesn't fire on any targets so I couldn't find a test case to trigger it. The problem occurs when a non-i1 setcc is inverted. For example 'i8 = setcc' will get 'xor 0xff' to invert this. This is clearly wrong when the boolean contents are ZeroOrOne. This patch introduces getLogicalNOT and updates SetCC legalisation to use it. Reviewed by Hal Finkel. llvm-svn: 208641	2014-05-12 23:26:58 +00:00
Renato Golin	c7aea40ec6	Implememting named register intrinsics This patch implements the infrastructure to use named register constructs in programs that need access to specific registers (bare metal, kernels, etc). So far, only the stack pointer is supported as a technology preview, but as it is, the intrinsic can already support all non-allocatable registers from any architecture. llvm-svn: 208104	2014-05-06 16:51:25 +00:00
Eric Christopher	83dd2fad2a	We already calculate WideVT above, just reuse it. Patch by Jan Vesely <jan.vesely@rutgers.edu>. llvm-svn: 207455	2014-04-28 22:24:57 +00:00
Craig Topper	48d114bed1	Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>. llvm-svn: 207327	2014-04-26 18:35:24 +00:00
Craig Topper	c0196b1b40	[C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr. llvm-svn: 206142	2014-04-14 00:51:57 +00:00
Tom Stellard	a1a5d9aa2e	SelectionDAG: Use helper function to improve legalization of ISD::MUL The TargetLowering::expandMUL() helper contains lowering code extracted from the DAGTypeLegalizer and allows the SelectionDAGLegalizer to expand more ISD::MUL patterns without having to use a library call. llvm-svn: 206037	2014-04-11 16:12:01 +00:00
Hal Finkel	b811b6d0d1	Add an optional ability to expand larger BUILD_VECTORs with shuffles This adds the ability to expand large (meaning with more than two unique defined values) BUILD_VECTOR nodes in terms of SCALAR_TO_VECTOR and (legal) vector shuffles. There is now no limit of the size we are capable of expanding this way, although we don't currently do this for vectors with many unique values because of the default implementation of TLI's shouldExpandBuildVectorWithShuffles function. There is currently no functional change to any existing targets because the new capabilities are not used unless some target overrides the TLI shouldExpandBuildVectorWithShuffles function. As a result, I've not included a test case for the new functionality in this commit, but regression tests will (at least) be added soon when I commit support for the PPC QPX vector instruction set. The benefit of committing this now is that it makes the shouldExpandBuildVectorWithShuffles callback, which had to be added for other reasons regardless, fully functional. I suspect that other targets will also benefit from tuning the heuristic. llvm-svn: 205243	2014-03-31 19:42:55 +00:00
Hal Finkel	1977514287	Add a TLI hook to control when BUILD_VECTOR might be expanded using shuffles There are two general methods for expanding a BUILD_VECTOR node: 1. Use SCALAR_TO_VECTOR on the defined scalar values and then shuffle them together. 2. Build the vector on the stack and then load it. Currently, we use a fixed heuristic: If there are only one or two unique defined values, then we attempt an expansion in terms of SCALAR_TO_VECTOR and vector shuffles (provided that the required shuffle mask is legal). Otherwise, always expand via the stack. Even when SCALAR_TO_VECTOR is not legal, this can still be a good idea depending on what tricks the target can play when lowering the resulting shuffle. If the target can't do anything special, however, and if SCALAR_TO_VECTOR is expanded via the stack, this heuristic leads to sub-optimal code (two stack loads instead of one). Because only the target knows whether the SCALAR_TO_VECTORs and shuffles for a build vector of a particular type are likely to be optimial, this adds a new TLI function: shouldExpandBuildVectorWithShuffles which takes the vector type and the count of unique defined values. If this function returns true, then method (1) will be used, subject to the constraint that all of the necessary shuffles are legal (as determined by isShuffleMaskLegal). If this function returns false, then method (2) is always used. This commit does not enhance the current code to support expanding a build_vector with more than two unique values using shuffles, but I'll commit an implementation of the more-general case shortly. llvm-svn: 205230	2014-03-31 17:48:10 +00:00
Hal Finkel	90adf0fe06	Make use of previously generated stores in SelectionDAGLegalize::ExpandExtractFromVectorThroughStack When expanding EXTRACT_VECTOR_ELT and EXTRACT_SUBVECTOR using SelectionDAGLegalize::ExpandExtractFromVectorThroughStack, we store the entire vector and then load the piece we want. This is fine in isolation, but generating a new store (and corresponding stack slot) for each extraction ends up producing code of poor quality. When we scalarize a vector operation (using SelectionDAG::UnrollVectorOp for example) we generate one EXTRACT_VECTOR_ELT for each element in the vector. This used to generate one stored copy of the vector for each element in the vector. Now we search the uses of the vector for a suitable store before generating a new one, which results in much more efficient scalarization code. llvm-svn: 205153	2014-03-30 15:10:18 +00:00
Tom Stellard	c9a67a2b6d	SelectionDAG: Allow promotion of SELECT nodes from float to int types And vice-versa, as long as the types are the same width. There are a few R600 tests that will cover this. llvm-svn: 204616	2014-03-24 16:07:28 +00:00
Tim Northover	e94a518a22	IR: add a second ordering operand to cmpxhg for failure The syntax for "cmpxchg" should now look something like: cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic where the second ordering argument gives the required semantics in the case that no exchange takes place. It should be no stronger than the first ordering constraint and cannot be either "release" or "acq_rel" (since no store will have taken place). rdar://problem/15996804 llvm-svn: 203559	2014-03-11 10:48:52 +00:00
Matt Arsenault	95b714c749	Fix non 2-space indentation. llvm-svn: 203514	2014-03-11 00:01:25 +00:00

1 2 3 4 5 ...

1441 Commits