teak-llvm

mirror of https://github.com/Gericom/teak-llvm.git synced 2025-06-20 12:05:48 -04:00

Author	SHA1	Message	Date
Justin Lebar	49fac56ea3	[NVPTX] Allow libcalls that are defined in the current module. The patch adds a possibility to make library calls on NVPTX. An important thing about library functions - they must be defined within the current module. This basically should guarantee that we produce a valid PTX assembly (without calls to undefined functions). Anyone who wants to use the libcalls will probably have to link against compiler-rt or any other implementation. Currently, it's completely impossible to make library calls because of error LLVM ERROR: Cannot select: i32 = ExternalSymbol '...'. But we can lower ExternalSymbol to TargetExternalSymbol and verify if the function definition is available. Also, there was an issue with a DAG during legalisation. When we expand instruction into libcall, the inner call-chain isn't being "integrated" into outer chain. Since the last "data-flow" (call retval load) node is located in call-chain earlier than CALLSEQ_END node, the latter becomes a leaf and therefore a dead node (and is being removed quite fast). The solution proposed here relies on additional data-flow pseudo nodes (ProxyReg) whose only purpose is to keep CALLSEQ_END alive during the legalisation and instruction selection phases - we remove the pseudo instructions before register scheduling phase. Patch by Denys Zariaiev! Differential Revision: https://reviews.llvm.org/D34708 llvm-svn: 350069	2018-12-26 19:12:31 +00:00
Max Kazantsev	28298e9647	[NFC] Use utility function for guards detection llvm-svn: 350064	2018-12-26 08:22:25 +00:00
Petar Avramovic	09dff33349	[MIPS GlobalISel] Select G_SELECT Add widen scalar for type index 1 (i1 condition) for G_SELECT. Select G_SELECT for pointer, s32(integer) and smaller low level types on MIPS32. Differential Revision: https://reviews.llvm.org/D56001 llvm-svn: 350063	2018-12-25 14:42:30 +00:00
Max Kazantsev	9b25bf3960	[NFC] Reuse variables instead of re-calling getParent llvm-svn: 350062	2018-12-25 07:20:06 +00:00
Kang Zhang	d501a1e596	[PowerPC] Fix the bug of ISD::ADDE to set its second return type to glue Summary: This patch is to fix the bug introduced by rL341634. In that commit, the return type of ISD::ADDE is 14224: SDVTList VTs = DAG.getVTList(MVT::i64, MVT::i64), but in fact, the second return type of ISD::ADDE should be MVT::Glue not MVT::i64. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D55977 llvm-svn: 350061	2018-12-25 03:29:51 +00:00
Craig Topper	0229da8f07	[X86] Use GetDemandedBits to simplify the operands of PMULDQ/PMULUDQ. This is an alternative to what I attempted in D56057. GetDemandedBits is a special version of SimplifyDemandedBits that allows simplifications even when the operand has other uses. GetDemandedBits will only do simplifications that allow a node to be bypassed. It won't create new nodes or alter any of the other users. I had to add support for bypassing SIGN_EXTEND_INREG to GetDemandedBits. Based on a patch that Simon Pilgrim sent me in email. Fixes PR40142. llvm-svn: 350059	2018-12-24 19:40:20 +00:00
Eugene Leviant	4dc3a3f746	[HWASAN] Instrument memory intrinsics by default Differential revision: https://reviews.llvm.org/D55926 llvm-svn: 350055	2018-12-24 16:02:48 +00:00
Max Kazantsev	0b455c2b71	Revert rL350048 and rL350050 These patches have broken almost all buildbots on test DebugInfo/X86/addr_comments.ll. Reverting to green. llvm-svn: 350052	2018-12-24 10:30:04 +00:00
David Blaikie	5353e64935	Fix build - follow-up to r350048 which broke headerless (v4) address pool llvm-svn: 350050	2018-12-24 07:56:40 +00:00
Max Kazantsev	edabb9ae56	[LoopSimplifyCFG] Delete dead exiting edges This patch teaches LoopSimplifyCFG to remove dead exiting edges from loops. Differential Revision: https://reviews.llvm.org/D54025 Reviewed By: fedor.sergeev llvm-svn: 350049	2018-12-24 07:41:33 +00:00
David Blaikie	e20bf9ab91	DebugInfo: Use assembly label arithmetic for address pool size for easier reading/editing llvm-svn: 350048	2018-12-24 07:35:10 +00:00
David Blaikie	d671eb7e7c	DebugInfo: Add assembly comments for debug_addr contribution header fields llvm-svn: 350047	2018-12-24 07:09:50 +00:00
David Blaikie	b917c3a41a	llvm-dwarfdump: Skip address index info (and dump only the address, if found) when non-verbose dumping addrx forms There's a few bugs here still - demonstrated with FIXITs in the test. llvm-svn: 350046	2018-12-24 06:52:31 +00:00
Max Kazantsev	347c583772	Return "[LoopSimplifyCFG] Delete dead in-loop blocks" The underlying bug that caused the revert should be fixed by rL348567. Differential Revision: https://reviews.llvm.org/D54023 llvm-svn: 350045	2018-12-24 06:06:17 +00:00
George Burgess IV	7e12875c89	[LoopIdioms] More LocationSize::precise annotations; NFC Both of these places reference memset-like loops. Memset is precise. Trying to keep these patches super small so they're easily post-commit verifiable, as requested in D44748. llvm-svn: 350044	2018-12-24 05:55:50 +00:00
Craig Topper	0adc3fe9e7	[X86] Remove unused variables left after r350041. NFC llvm-svn: 350043	2018-12-24 05:45:45 +00:00
George Burgess IV	610c76534f	[SelectionDAGBuilder] Use ::precise LocationSizes; NFC More migration so we can disable the implicit int -> LocationSize conversion. All of these are either scatter/gather'ed vector instructions, or direct loads. Hence, they're all precise. Perhaps if we see way more getTypeStoreSize calls, we can make a getTypeStoreLocationSize (or similar) as a wrapper that applies this ::precise. Doesn't appear that it's a good idea to make getTypeStoreSize return a LocationSize itself, however. llvm-svn: 350042	2018-12-24 05:34:21 +00:00
Craig Topper	d8217b23ff	[X86] Move the optimization that turns 'CMP (AND+IMM64), 0' into SRL/SHL+TEST to X86ISelDAGToDAG. This cleans more code out of EmitTest. llvm-svn: 350041	2018-12-24 05:27:13 +00:00
Craig Topper	e8c50fc6af	[X86] Remove the ANDN check from EmitTest. Remove the TESTmr isel patterns and add another postprocessing combine for TESTrr+ANDrm->TESTmr. We already have a postprocessing combine for TESTrr+ANDrr->TESTrr. With this we can give ANDN a chance to match first. And clean it up during post processing if we ended up with just a regular AND. This is another step towards my plan to gut EmitTest and do more flag handling during isel matching or by using optimizeCompare. llvm-svn: 350038	2018-12-24 01:10:13 +00:00
Sanjay Patel	93f1074677	[DAGCombiner] limit shuffle to extend transform (PR40146) It's dangerous to knowingly create an illegal vector type no matter what stage of combining we're in. This prevents the missed folding/scalarization seen in: https://bugs.llvm.org/show_bug.cgi?id=40146 llvm-svn: 350034	2018-12-23 20:48:31 +00:00
Sanjay Patel	9933574ac3	[DAGCombiner] allow hoisting vector bitwise logic ahead of extends llvm-svn: 350032	2018-12-23 19:58:16 +00:00
Simon Atanasyan	4553cdafe1	[ORC] Rename register in the OrcMips64 resolver code comments. NFC The `fp` and `s8` register names are synonyms. But `fp` better reflects a purpose of the register. llvm-svn: 350023	2018-12-23 12:05:04 +00:00
Simon Atanasyan	15b68b87d5	[ORC] clang-format OrcMips32 and OrcMips64 code. NFC llvm-svn: 350022	2018-12-23 12:05:00 +00:00
Simon Atanasyan	da29981707	[ORC] Remove redundant instruction from MIPS resolver code. NFC It's redundant to restore the `$a3` register twice. llvm-svn: 350021	2018-12-23 12:04:55 +00:00
George Burgess IV	5e4a03a089	[MemCpyOpt] Use LocationSize instead of ints; NFC Trying to keep these patches super small so they're easily post-commit verifiable, as requested in D44748. srcSize is derived from the size of an alloca, and we quit out if the size of that is > the size of the thing we're copying to. Hence, we should always copy everything over, so these sizes are precise. Don't make srcSize itself a LocationSize, since optionality isn't helpful, and we do some comparisons against other sizes elsewhere in that function. llvm-svn: 350019	2018-12-23 06:40:39 +00:00
Craig Topper	006bac6880	[X86] Return false from hasAndNotCompare if the comparison value is a constant. We won't end up using an ANDN instruction in this case so we should generate the same code we do for pre-BMI targets. llvm-svn: 350018	2018-12-23 05:52:55 +00:00
George Burgess IV	69952979da	[MemoryLocation] Use LocationSize instead of ints; NFC Trying to keep these patches super small so they're easily post-commit verifiable, as requested in D44748. This one sadly isn't super small, but all of the changes here are either to: - libfuncs that are passed a constant size (memcpy, memset, ...) - instructions that store/load a constant size So they have to be precise llvm-svn: 350017	2018-12-23 03:36:44 +00:00
George Burgess IV	8c5413f3f7	[Loads] Use LocationSize instead of ints; NFC Keeping these patches super small so they're easily post-commit verifiable, as requested in D44748. This tries to find literal loads/stores of the given type, so this has to be precise. llvm-svn: 350016	2018-12-23 03:10:56 +00:00
George Burgess IV	1329cf1791	[Lint] Use LocationSize instead of ints; NFC Keeping these patches super small so they're easily post-commit verifiable, as requested in D44748. llvm-svn: 350015	2018-12-23 02:50:08 +00:00
George Burgess IV	685e781d55	[AAEval] Use LocationSize instead of ints; NFC Keeping these patches super small so they're easily post-commit verifiable, as requested in D44748. llvm-svn: 350014	2018-12-23 02:39:58 +00:00
Craig Topper	3cc92a28ce	[X86] Fix an old FIXME about folding the zero constant into the OR instruction we use for sequentially consistent fence in 32-bit mode without SSE2. llvm-svn: 350013	2018-12-23 01:54:43 +00:00
David Blaikie	2a38c17b34	DebugInfo: Accurately propagate the section used by a relocation when accessing ranges defined by low/high_pc This is difficult/not possible to test in LLVM, but is visible as a crash in LLD when parsing DWARF to generate gdb-index. This function is called by llvm-dwarfdump when parsing high_pc for non-verbose output (to print the actual high_pc rather than the low_pc relative value), but in that case llvm-dwarfdump doesn't print section names (if it did, it would hit this problem). We could add some other features to llvm-dwarfdump to expose this, but nothing really springs to my mind. I will add a test to lld, though. llvm-svn: 350010	2018-12-22 22:20:40 +00:00
David Blaikie	25179613f6	llvm-dwarfdump: Dump the section name/number for addr attributes llvm-svn: 350009	2018-12-22 20:34:58 +00:00
George Burgess IV	640be69249	[Analysis] More LocationSize cleanup; NFC Keeping these patches super small so they're easily post-commit verifiable, as requested in D44748. llvm-svn: 350008	2018-12-22 18:23:21 +00:00
Sanjay Patel	4b537aaf6d	[DAGCombiner] allow narrowing of add followed by truncate trunc (add X, C ) --> add (trunc X), C' If we're throwing away the top bits of an 'add' instruction, do it in the narrow destination type. This makes the truncate-able opcode list identical to the sibling transform done in IR (in instcombine). This change used to show regressions for x86, but those are gone after D55494. This gets us closer to deleting the x86 custom function (combineTruncatedArithmetic) that does almost the same thing. Differential Revision: https://reviews.llvm.org/D55866 llvm-svn: 350006	2018-12-22 17:10:31 +00:00
Sanjay Patel	52c02d70e2	[x86] add load fold patterns for movddup with vzext_load The missed load folding noticed in D55898 is visible independent of that change either with an adjusted IR pattern to start or with AVX2/AVX512 (where the build vector becomes a broadcast first; movddup is not produced until we get into isel via tablegen patterns). Differential Revision: https://reviews.llvm.org/D55936 llvm-svn: 350005	2018-12-22 16:59:02 +00:00
David Blaikie	9efb0153f0	llvm-dwarfdump: Remove extraneous space between '(' and 'indexed' When dumping string or address indexes llvm-svn: 349997	2018-12-22 08:43:08 +00:00
David Blaikie	c04d2bf22a	llvm-dwarfdump: Print the section name/number for addr_index attributes (addr attributes coming shortly) llvm-svn: 349996	2018-12-22 08:33:55 +00:00
David Blaikie	87ae80fb2f	DebugInfo: Refactor named section dumping into a reusable helper Currently the section name (& possibly number) is only printed on addresses in ranges - but no reason it couldn't also be displayed on other addresses (like low/high PC). Refactor in that direction by pulling out the section lookup and name ambiguity dumping logic into a reusable helper. llvm-svn: 349995	2018-12-22 08:23:10 +00:00
David Blaikie	e4e0b9f48f	DebugInfo: Remove extra attribute lookup llvm-svn: 349985	2018-12-22 02:24:13 +00:00
Craig Topper	1f02ac3451	[X86] FixupLEAs, reduce number of calls to getOperand and use X86::AddrBaseReg/AddrIndexReg, etc. instead of hardcoded constants. Makes the code a little more readable. llvm-svn: 349983	2018-12-22 01:34:47 +00:00
Justin Lebar	7f41fe3a58	[NVPTX] Reduce stack size in NVPTXAsmPrinter::doInitialization(). NVPTXAsmPrinter::doInitialization() was creating an NVPTXSubtarget on the stack. This object is huge, about 80kb. Also it's slow to create. And it's all redundant; we have one in NVPTXTargetMachine anyway! llvm-svn: 349982	2018-12-22 01:30:37 +00:00
David Blaikie	219c6bd388	libDebugInfo: Refactor error handling in range list parsing Propagate the llvm::Error a little further up. This is NFC for llvm-dwarfdump in this change, but allows ld.lld to emit more precise error messages about which object and archive the erroneous DWARF is in. llvm-svn: 349978	2018-12-22 00:31:02 +00:00
Reid Kleckner	98bbd07cc3	[MC] Enable .file support on COFF and diagnose it on unsupported targets Summary: The "single parameter" .file directive appears to be an ELF-only feature that is intended to insert the main source filename into the string table. I noticed that if you assemble an ELF .s file for COFF, typically it will assert right away on a .file directive near the top of the file. My first change was to make this emit a proper error in the asm parser so that we don't assert so easily. However, COFF actually does have some support for this directive, and if you emit an object file, llvm-mc does not assert. When emitting a COFF object, MC will take those file names and create "debug" symbol table entries for them. I'm not familiar with these kinds of symbol table entries, and I'm not aware of any users of them, but @compnerd added them a while ago. They don't introduce absolute paths, and most main source file paths are short enough that this extra entry shouldn't cause any problems, so I enabled the flag in MCAsmInfoCOFF that indicates that it's supported. This has the side effect of adding an extra debug symbol to every object produced by clang, which is a pretty big functional change. My question is, should we keep the functionality or remove it in the name of symbol table minimalism? Reviewers: mstorsjo, compnerd Subscribers: hiraditya, compnerd, llvm-commits Differential Revision: https://reviews.llvm.org/D55900 llvm-svn: 349976	2018-12-21 23:35:48 +00:00
Mircea Trofin	499a66ecc0	Silence warning in assert introduced in rL349973. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56030 llvm-svn: 349975	2018-12-21 23:02:10 +00:00
Mircea Trofin	b53eeb6f4c	[llvm] API for encoding/decoding DWARF discriminators. Summary: Added a pair of APIs for encoding/decoding the 3 components of a DWARF discriminator described in http://lists.llvm.org/pipermail/llvm-dev/2016-October/106532.html: the base discriminator, the duplication factor (useful in profile-guided optimization) and the copy index (used to identify copies of code in cases like loop unrolling) The encoding packs 3 unsigned values in 32 bits. This CL addresses 2 issues: - communicates overflow back to the user - supports encoding all 3 components together. Current APIs assume a sequencing of events. For example, creating a new discriminator based on an existing one by changing the base discriminator was not supported. Reviewers: davidxl, danielcdh, wmi, dblaikie Reviewed By: dblaikie Subscribers: zzheng, dmgreen, aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D55681 llvm-svn: 349973	2018-12-21 22:48:50 +00:00
David Blaikie	c3f30a7fc6	Reapply: DebugInfo: Assume an absence of ranges or high_pc on a CU means the CU is empty (devoid of code addresses) Originally committed in r349333, reverted in r349353. GCC emitted these unconditionally on/before 4.4/March 2012 Clang emitted these unconditionally on/before 3.5/March 2014 This improves performance when parsing CUs (especially those using split DWARF) that contain no code ranges (such as the mini CUs that may be created by ThinLTO importing - though generally they should be/are avoided, especially for Split DWARF because it produces a lot of very small CUs, which don't scale well in a bunch of other ways too (including size)). The revert was due to a (Google internal) test that had some checked in old object files missing DW_AT_ranges. That's since been fixed. llvm-svn: 349968	2018-12-21 22:25:01 +00:00
Vedant Kumar	b264d69de7	[IR] Add Instruction::isLifetimeStartOrEnd, NFC Instruction::isLifetimeStartOrEnd() checks whether an Instruction is an llvm.lifetime.start or an llvm.lifetime.end intrinsic. This was suggested as a cleanup in D55967. Differential Revision: https://reviews.llvm.org/D56019 llvm-svn: 349964	2018-12-21 21:49:40 +00:00
Craig Topper	e58cd9cbc6	[X86] Add isel patterns to match BMI/TBMI instructions when lowering has turned the root nodes into one of the flag producing binops. This fixes the patterns that have or/and as a root. 'and' is handled differently since they usually have a CMP wrapped around them. I had to look for uses of the CF flag because all these nodes have non-standard CF flag behavior. A real or/xor would always clear CF. In practice we shouldn't be using the CF flag from these nodes as far as I know. Differential Revision: https://reviews.llvm.org/D55813 llvm-svn: 349962	2018-12-21 21:42:43 +00:00
Sanjay Patel	47a6129e26	[DAGCombiner] simplify code leading to scalarizeExtractedVectorLoad; NFC llvm-svn: 349958	2018-12-21 21:26:30 +00:00
Craig Topper	62ec024d3b	[X86] Don't allow optimizeCompareInstr to replace a CMP with BEXTR if the sign flag is used. The BEXTR instruction documents the SF bit as undefined. The TBM BEXTR instruction has the same issue, but I'm not sure how to test it. With the control being an immediate we can determine the sign bit is 0 or the BEXTR would have been removed. Fixes PR40060 Differential Revision: https://reviews.llvm.org/D55807 llvm-svn: 349956	2018-12-21 21:16:26 +00:00
Changpeng Fang	6f539294b5	AMDGPU: Don't peel off the offset if the resulting base could possibly be negative in Indirect addressing. Summary: Don't peel off the offset if the resulting base could possibly be negative in Indirect addressing. This is because the M0 field is unsigned. This patch achieves a similar goal to https://reviews.llvm.org/D55241, but keeps the optimization if the base is known unsigned. Reviewers: arsemn Differential Revision: https://reviews.llvm.org/D55568 llvm-svn: 349951	2018-12-21 20:57:34 +00:00
Armando Montanez	4cc2113114	[TextAPI][elfabi] Fix YAML support for weak symbols Weak symbols are supposed to be supported in the ELF TextAPI implementation, but the YAML handler didn't read or write the `Weak` member of ELFSymbol. This change adds the YAML mapping and updates tests to ensure correct behavior. Differential Revision: https://reviews.llvm.org/D56020 llvm-svn: 349950	2018-12-21 20:45:58 +00:00
Reid Kleckner	84a2d29681	[BasicAA] Fix AA bug on dynamic allocas and stackrestore Summary: BasicAA has special logic for unescaped allocas, which normally applies equally well to dynamic and static allocas. However, llvm.stackrestore has the power to end the lifetime of dynamic allocas, without referring to them directly. stackrestore is already marked with the most conservative memory modification attributes, but because the alloca is not escaped, the normal logic produces incorrect results. I think BasicAA needs a special case here to teach it about the relationship between dynamic allocas and stackrestore. Fixes PR40118 Reviewers: gbiv, efriedma, george.burgess.iv Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55969 llvm-svn: 349945	2018-12-21 19:59:03 +00:00
Anna Thomas	18be3cb606	[RuntimeUnrolling] NFC: Add TODO and comments in connectProlog Currently, runtime unrolling does not support loops where multiple exiting blocks exit to the latchExit. Added TODO and other code clarifications for ConnectProlog code. llvm-svn: 349944	2018-12-21 19:45:05 +00:00
Sanjay Patel	80187b8a17	[x86] add movddup specialization for build vector lowering (PR37502) This is admittedly a narrow fix for the problem: https://bugs.llvm.org/show_bug.cgi?id=37502 ...but as the XOP restriction shows, it's a maze to get this right. In the motivating example, note that we have movddup before SSE4.1 and again with AVX2. That's because insertps isn't available pre-SSE41 and vbroadcast is (more generally) available with AVX2 (and the splat is reduced to movddup via isel pattern). Differential Revision: https://reviews.llvm.org/D55898 llvm-svn: 349937	2018-12-21 18:48:32 +00:00
Florian Hahn	8c9f865e3d	[ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR. Fixes PR35023. Reviewers: MatzeB, t.p.northover, sunfish, qcolombet, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D55909 llvm-svn: 349935	2018-12-21 18:07:10 +00:00
Jessica Paquette	453ab1db5b	[GlobalISel][AArch64] Add support for widening G_FCEIL This adds support for widening G_FCEIL in LegalizerHelper and AArch64LegalizerInfo. More specifically, it teaches the AArch64 legalizer to widen G_FCEIL from a 16-bit float to a 32-bit float when the subtarget doesn't support full FP 16. This also updates AArch64/f16-instructions.ll to show that we perform the correct transformation. llvm-svn: 349927	2018-12-21 17:05:26 +00:00
Evandro Menezes	96c11eceb2	[AArch64] Refactor Exynos predicate (NFC) Change order of conditions in predicate. llvm-svn: 349918	2018-12-21 15:51:34 +00:00
Simon Pilgrim	aa53fc15b7	[XCore] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version. llvm-svn: 349915	2018-12-21 15:35:32 +00:00
Simon Pilgrim	7787a02b23	[Sparc] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version. llvm-svn: 349914	2018-12-21 15:32:36 +00:00
Simon Pilgrim	3c157d3fa3	[AMDGPU] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version. llvm-svn: 349912	2018-12-21 15:29:47 +00:00
Simon Pilgrim	ca8bca2ad3	[WebAssembly] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version. llvm-svn: 349911	2018-12-21 15:25:37 +00:00
Simon Pilgrim	d800ee4861	[ARM] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version. llvm-svn: 349909	2018-12-21 15:15:38 +00:00
Simon Pilgrim	148957f336	[AArch64] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version. llvm-svn: 349908	2018-12-21 15:05:10 +00:00
Simon Pilgrim	911dce2f30	[SelectionDAG] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version. llvm-svn: 349907	2018-12-21 14:56:18 +00:00
Simon Pilgrim	2482c51e99	[SystemZ] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version. llvm-svn: 349906	2018-12-21 14:50:54 +00:00
Simon Pilgrim	d43bdc715c	[Lanai] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version. llvm-svn: 349905	2018-12-21 14:48:35 +00:00
Simon Pilgrim	af1ab22a76	[PPC] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version. llvm-svn: 349903	2018-12-21 14:32:39 +00:00
Simon Pilgrim	57733507fe	[X86] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the old KnownBits output parameter version. llvm-svn: 349902	2018-12-21 14:25:14 +00:00
Fedor Sergeev	2d94c2265e	[NewPM] -print-module-scope -print-after now prints module even after invalidated Loop/SCC -print-after IR printing generally can not print the IR unit (Loop or SCC) which has just been invalidated by the pass. However, when working in -print-module-scope mode even if Loop was invalidated there is still a valid module that we can print. Since we can not access invalidated IR unit from AfterPassInvalidated instrumentation point we can remember the module to be printed before pass. This change introduces BeforePass instrumentation that stores all the information required for module printing into the stack and then after pass (in AfterPassInvalidated) just print whatever has been placed on stack. Reviewed By: philip.pfaffe Differential Revision: https://reviews.llvm.org/D55278 llvm-svn: 349896	2018-12-21 11:49:05 +00:00
Luke Cheeseman	41a9e53500	[Dwarf/AArch64] Return address signing B key dwarf support - When signing return addresses with -msign-return-address=<scope>{+<key>}, either the A key instructions or the B key instructions can be used. To correctly authenticate the return address, the unwinder/debugger must know which key was used to sign the return address. - When an exception is thrown or a break point reached, it may be necessary to unwind the stack. To accomplish this, the unwinder/debugger must be able to first authenticate the return address if it has been signed. - To enable this, the augmentation string of CIEs has been extended to allow inclusion of a 'B' character. Functions that are signed using the B key variant of the instructions should have an FDE whose associated CIE has a 'B' in the augmentation string. - One must also be able to preserve these semantics when first stepping from a high level language into assembly and then, as a second step, into an object file. To achieve this, I have introduced a new assembly directive '.cfi_b_key_frame', that tells the assembler the current frame uses return address signing with the B key. - This ensures that the FDE is associated with a CIE that has 'B' in the augmentation string. Differential Revision: https://reviews.llvm.org/D51798 llvm-svn: 349895	2018-12-21 10:45:08 +00:00
Simon Pilgrim	5d403f6bf8	[X86][SSE] Auto upgrade PADDS/PSUBS intrinsics to SADD_SAT/SSUB_SAT generic intrinsics (llvm) This auto upgrades the signed SSE saturated math intrinsics to SADD_SAT/SSUB_SAT generic intrinsics. Clang counterpart: https://reviews.llvm.org/D55890 Differential Revision: https://reviews.llvm.org/D55894 llvm-svn: 349892	2018-12-21 09:04:14 +00:00
Thomas Lively	b6dac89c87	[WebAssembly] Fix invalid machine instrs in -O0, verify in tests Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55956 llvm-svn: 349889	2018-12-21 06:58:15 +00:00
Matt Arsenault	3eae3c4590	AMDGPU/GlobalISel: RegBankSelect for amdgcn.wqm.vote llvm-svn: 349882	2018-12-21 03:20:54 +00:00
Matt Arsenault	f4c21c575a	AMDGPU/GlobalISel: RegBankSelect for some fp ops llvm-svn: 349880	2018-12-21 03:14:45 +00:00
Matt Arsenault	bee2ad7185	AMDGPU/GlobalISel: Redo legality for build_vector It seems better to avoid using the callback if possible since there are coverage assertions which are disabled if this is used. Also fix missing tests. Only test the legal cases since it seems legalization for build_vector is quite lacking. llvm-svn: 349878	2018-12-21 03:03:11 +00:00
Reid Kleckner	b894ecf903	[memcpyopt] Add debug logs when forwarding memcpy src to dst llvm-svn: 349873	2018-12-21 01:41:20 +00:00
Eli Friedman	3af2f53456	[LoopUnroll] Don't verify domtree by default with +Asserts. This verification is linear in the size of the function, so it can cause a quadratic compile-time explosion in a function with many loops to unroll. Differential Revision: https://reviews.llvm.org/D54732 llvm-svn: 349871	2018-12-21 01:28:49 +00:00
Craig Topper	54f1a7be13	[X86] Refactor hasNoCarryFlagUses and hasNoSignFlagUses in X86ISelDAGToDAG.cpp to translate opcode to condition code using the helpers in X86InstrInfo.cpp. This shortens the switches in X86ISelDAGToDAG.cpp to only need to check condition code instead of a list of opcodes. This also fixes a bug where the memory forms of SETcc were missing from hasNoCarryFlagUses. llvm-svn: 349868	2018-12-21 01:14:25 +00:00
Craig Topper	e0cff10289	[X86] Add memory forms of some SETCC instructions to hasNoCarryFlagUses. Found while working on another patch llvm-svn: 349867	2018-12-21 01:14:23 +00:00
Eli Friedman	b1bbd5dca3	[ARM] Complete the Thumb1 shift+and->shift+shift transforms. This saves materializing the immediate. The additional forms are less common (they don't usually show up for bitfield insert/extract), but they're still relevant. I had to add a new target hook to prevent DAGCombine from reversing the transform. That isn't the only possible way to solve the conflict, but it seems straightforward enough. Differential Revision: https://reviews.llvm.org/D55630 llvm-svn: 349857	2018-12-20 23:39:54 +00:00
Tom Stellard	2f44fbe936	cmake: Remove add_llvm_loadable_module() Summary: This function is very similar to add_llvm_library(), so this patch merges it into add_llvm_library() and replaces all calls to add_llvm_loadable_module(lib ...) with add_llvm_library(lib MODULE ...) Reviewers: philip.pfaffe, beanz, chandlerc Reviewed By: philip.pfaffe Subscribers: chapuni, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D51748 llvm-svn: 349839	2018-12-20 22:04:08 +00:00
Jessica Paquette	a6b9c68a85	[GlobalISel][AArch64] Add G_FCEIL to isPreISelGenericFloatingPointOpcode If you don't do this, then if you hit a G_LOAD in getInstrMapping, you'll end up with GPRs on the G_FCEIL instead of FPRs. This causes a fallback. Add it to the switch, and add a test verifying that this happens. llvm-svn: 349822	2018-12-20 21:14:15 +00:00
David Blaikie	b3c56af49b	DebugInfo: Fix for missing comp_dir handling with r349207 When deciding lazily whether a CU would be split or non-split I accidentally dropped some handling for the line tables comp_dir (by doing it lazily it was too late to be handled properly by the MC line table code). Move that bit of the code back to the non-lazy place. llvm-svn: 349819	2018-12-20 20:46:55 +00:00
Eli Friedman	48397102d0	[MC] [AArch64] Correctly resolve ":abs_g1:3" etc. We have to treat constructs like this as if they were "symbolic", to use the correct codepath to resolve them. This mostly only affects movz etc. because the other uses of classifySymbolRef conservatively treat everything that isn't a constant as if it were a symbol. Differential Revision: https://reviews.llvm.org/D55906 llvm-svn: 349800	2018-12-20 19:46:14 +00:00
Eli Friedman	4648209e16	[MC] [AArch64] Support resolving fixups for abs_g0 etc. This requires a bit more code than other fixups, to distinguish between abs_g0/abs_g1/etc. Actually, I think some of the other fixups are missing some checks, but I won't try to address that here. I haven't seen any real-world code that uses a construct like this, but it clearly should work, and we're considering using it in the implementation of localescape/localrecover on Windows (see https://reviews.llvm.org/D53540). I've verified that binutils produces the same code as llvm-mc for the testcase. This currently doesn't include support for the *_s variants (that requires a bit more work to set the opcode). Differential Revision: https://reviews.llvm.org/D55896 llvm-svn: 349799	2018-12-20 19:38:07 +00:00
Simon Pilgrim	2a25360ae3	[X86] Auto upgrade XOP/AVX512 rotation intrinsics to generic funnel shift intrinsics (llvm) This emits FSHL/FSHR generic intrinsics for the XOP VPROT and AVX512 VPROL/VPROR rotation intrinsics. Clang counterpart: https://reviews.llvm.org/D55937 Differential Revision: https://reviews.llvm.org/D55938 llvm-svn: 349795	2018-12-20 19:01:07 +00:00
Florian Hahn	ef307b8c26	[LAA] Avoid generating RT checks for known deps preventing vectorization. If we found unsafe dependences other than 'unknown', we already know at compile time that they are unsafe and the runtime checks should always fail. So we can avoid generating them in those cases. This should have no negative impact on performance as the runtime checks that would be created previously should always fail. As a sanity check, I measured the test-suite, spec2k and spec2k6 and there were no regressions. Reviewers: Ayal, anemet, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D55798 llvm-svn: 349794	2018-12-20 18:49:09 +00:00
Michael Trent	f44d830e2d	Add PLATFORM constants for iOS, tvOS, and watchOS simulators Summary: Add PLATFORM constants for iOS, tvOS, and watchOS simulators, as well as human readable names for these constants, to the Mach-O file format header files. rdar://46854119 Reviewers: ab, davide Reviewed By: ab, davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55905 llvm-svn: 349779	2018-12-20 17:51:17 +00:00
Yonghong Song	821c93d556	[BPF] Disable relocation for .BTF.ext section Build llvm with assertions on, and then build bcc against this llvm. Run any bcc tool with debug=8 (turning on -g for clang compilation), and you will get the following assertion error: /home/yhs/work/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:888: void llvm::RuntimeDyldELF::resolveBPFRelocation(const llvm::SectionEntry&, uint64_t, uint64_t, uint32_t, int64_t): Assertion `Value <= (4294967295U)' failed. The .BTF.ext ELF section uses Fixups to get the instruction offsets. The data width of the Fixup is 4 bytes since we only need the insn offset within the section. This caused the above error, though, since R_BPF_64_32 expects a 4-byte value and the Runtime Dyld tried to resolve the actual insn address, which is 8 bytes. Actually, the offset within the section is all that we need. Therefore, there is no need to perform any kind of relocation for the .BTF.ext section, and such a relocation will actually cause an incorrect result. This patch changes BPFELFObjectWriter::getRelocType() such that for Fixup Kind FK_Data_4, if the relocation Target is a temporary symbol, the relocation is skipped (ELF::R_BPF_NONE). Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 349778	2018-12-20 17:40:23 +00:00
Brock Wyma	b17464e4e8	[CodeView] Emit global variables within lexical scopes to limit visibility Emit static locals within the correct lexical scope so variables with the same name will not confuse the debugger into getting the wrong value. Differential Revision: https://reviews.llvm.org/D55336 llvm-svn: 349777	2018-12-20 17:33:45 +00:00
Michael Kruse	199427100b	[InstCombine] Preserve access-group metadata. Preserve llvm.access.group metadata when combining store instructions. This was forgotten in r349725. Fixes llvm.org/PR40117 llvm-svn: 349774	2018-12-20 17:11:02 +00:00
Krzysztof Parzyszek	30c42e2ab6	[Hexagon] Add patterns for funnel shifts llvm-svn: 349770	2018-12-20 16:39:20 +00:00
Simon Pilgrim	b208255fe0	[SelectionDAGBuilder] Enable funnel shift building to custom rotates This patch enables funnel shift -> rotate building for all ROTL/ROTR custom/legal operations. AFAICT X86 was the last target that was missing modulo support (PR38243), but I've tried to CC stakeholders for every target that has ROTL/ROTR custom handling for their final OK. Differential Revision: https://reviews.llvm.org/D55747 llvm-svn: 349765	2018-12-20 14:56:44 +00:00
Alex Bradbury	eb3a64a4da	[RISCV] Properly evaluate fixup_riscv_pcrel_lo12 This is an update to D43157 to correctly handle fixup_riscv_pcrel_lo12. Notable changes: Rebased onto trunk; Handle and test S-type; Test case pcrel-hilo.s is merged into relocations.s. D43157 description: VK_RISCV_PCREL_LO has to be handled specially. The MCExpr inside is actually the location of an auipc instruction with a VK_RISCV_PCREL_HI fixup pointing to the real target. Differential Revision: https://reviews.llvm.org/D54029 Patch by Chih-Mao Chen and Michael Spencer. llvm-svn: 349764	2018-12-20 14:52:15 +00:00
Simon Pilgrim	09c081176a	[X86][AVX512] Don't custom lower v16i8 rotations. As discussed on D55747, the expansion to (wider) shifts is better on all AVX512 cases, not just BWI. llvm-svn: 349763	2018-12-20 14:38:35 +00:00
Ulrich Weigand	380bece7af	[SystemZ] "Generic" vector assembler instructions should clobber CC There are several vector instructions which may or may not set the condition code register, depending on the value of an argument. For codegen, we use two versions of the instruction, one that sets CC and one that doesn't, which hard-code appropriate values of that argument. But we also have a "generic" version of the instruction that is used for the assembler/disassembler. These generic versions should always be considered to clobber CC just to be safe. llvm-svn: 349761	2018-12-20 14:24:17 +00:00
Ulrich Weigand	44d37ae38c	[SystemZ] Make better use of VLLEZ This patch fixes two deficiencies in current code that recognizes the VLLEZ idiom: - For the floating-point versions, we have ISel patterns that match on a bitconvert as the top node. In more complex cases, that bitconvert may already have been merged into something else. Fix the patterns to match the inner nodes instead. - For the 64-bit integer versions, depending on the surrounding code, we may get either a DAG tree based on JOIN_DWORDS or one based on INSERT_VECTOR_ELT. Use a PatFrags to simply match both variants. llvm-svn: 349749	2018-12-20 13:05:03 +00:00
Ulrich Weigand	8bb46b0f01	[SystemZ] Make better use of VGEF/VGEG Current code in SystemZDAGToDAGISel::tryGather refuses to perform any transformation if the Load SDNode has more than one use. This (erroneously) counts uses of the chain result, which prevents the optimization in many cases unnecessarily. Fixed by this patch. llvm-svn: 349748	2018-12-20 13:01:20 +00:00
Clement Courbet	36a3480385	Re-land r349731 "[CodeGen][ExpandMemcmp] Add an option for allowing overlapping loads." Update PPC IR following GEP->bitcast to bitcast->GEP->bitcast change. llvm-svn: 349747	2018-12-20 13:01:04 +00:00
Ulrich Weigand	f43b510015	[SystemZ] Make better use of VLDEB We already have special code (DAG combine support for FP_ROUND) to recognize cases where we can use a vector version of VLEDB to perform two floating-point truncates in parallel, but equivalent support for VLDEB (vector floating-point extends) has been missing so far. This patch adds corresponding DAG combine support for FP_EXTEND. llvm-svn: 349746	2018-12-20 12:59:05 +00:00
George Rimar	6367d7a6d1	[yaml2obj/obj2yaml] - Support dumping/parsing ABI version. These tools were assuming ABI version is 0, that is not always true. Patch teaches them to work with that field. Differential revision: https://reviews.llvm.org/D55884 llvm-svn: 349737	2018-12-20 10:43:49 +00:00
Piotr Sobczak	deaacc17fe	[InstCombine][AMDGPU] Handle more buffer intrinsics Summary: Include the following intrinsics in the InstCombine simplification: * amdgcn_raw_buffer_load * amdgcn_raw_buffer_load_format * amdgcn_struct_buffer_load * amdgcn_struct_buffer_load_format Change-Id: I14deceff74bcb21179baf6aa6e94bf39e7d63d5d Reviewers: arsenm Reviewed By: arsenm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D55882 llvm-svn: 349735	2018-12-20 10:08:18 +00:00
Alexander Potapenko	0e3b85a730	[MSan] Don't emit __msan_instrument_asm_load() calls LLVM treats void* pointers passed to assembly routines as pointers to sized types. We used to emit calls to __msan_instrument_asm_load() for every such void*, which sometimes led to false positives. A less error-prone (and truly "conservative") approach is to unpoison only assembly output arguments. llvm-svn: 349734	2018-12-20 10:05:00 +00:00
Clement Courbet	e22cf4d7cb	Revert r349731 "[CodeGen][ExpandMemcmp] Add an option for allowing overlapping loads." Forgot to update PowerPC tests for the GEP->bitcast change. llvm-svn: 349733	2018-12-20 09:58:33 +00:00
Clement Courbet	d4bd3eb85d	[NFC] Fix trailing semicolon after function. lib/Analysis/VectorUtils.cpp:482:2: warning: extra ‘;’ [-Wpedantic] llvm-svn: 349732	2018-12-20 09:20:07 +00:00
Clement Courbet	1bb6e1b0f2	[CodeGen][ExpandMemcmp] Add an option for allowing overlapping loads. Summary: This allows expanding {7,11,13,14,15,21,22,23,25,26,27,28,29,30,31}-byte memcmp in just two loads on X86. These were previously calling memcmp. Reviewers: spatel, gchatelet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55263 llvm-svn: 349731	2018-12-20 09:13:47 +00:00
Eugene Leviant	2d98eb1b2e	[HWASAN] Add support for memory intrinsics Differential revision: https://reviews.llvm.org/D55117 llvm-svn: 349728	2018-12-20 09:04:33 +00:00
Kang Zhang	ca8db48974	[PowerPC] Implement the isSelectSupported() target hook Summary: PowerPC has scalar selects (isel) and vector mask selects (xxsel). But PowerPC does not have vector CR selects, so it does not support scalar condition selects on vectors. In addition to implementing this hook, isSelectSupported() should return false when the SelectSupportKind is ScalarCondVectorVal, so that predictable selects are converted into branch sequences. Reviewed By: steven.zhang, hfinkel Differential Revision: https://reviews.llvm.org/D55754 llvm-svn: 349727	2018-12-20 06:19:59 +00:00
Craig Topper	bd788ce5db	[DAGCombiner] Fix a place that was creating a SIGN_EXTEND with an extra operand. llvm-svn: 349726	2018-12-20 05:28:06 +00:00
Michael Kruse	978ba61536	Introduce llvm.loop.parallel_accesses and llvm.access.group metadata. The current llvm.mem.parallel_loop_access metadata has a problem in that it uses LoopIDs. LoopID unfortunately is not a loop identifier. It is neither unique (there's even a regression test assigning the same LoopID to multiple loops; this can otherwise happen if passes such as LoopVersioning make copies of entire loops) nor persistent (every time a property is removed/added from a LoopID's MDNode, it will also receive a new LoopID; this happens e.g. when calling Loop::setLoopAlreadyUnrolled()). Since most loop transformation passes change the loop attributes (even if it is just to mark that a loop should not be processed again, as llvm.loop.isvectorized does for the versioned and unversioned loop), the parallel access information is lost for any subsequent pass. This patch unlinks LoopIDs and parallel accesses. llvm.mem.parallel_loop_access metadata on instructions is replaced by llvm.access.group metadata. llvm.access.group points to a distinct MDNode with no operands (avoiding the need to ever add/remove operands), called an "access group". Alternatively, it can point to a list of access groups. The LoopID then has an attribute llvm.loop.parallel_accesses with all the access groups that are parallel (no dependencies carried by this loop). This intentionally avoids any kind of "ID". Loops that are cloned or have their attributes modified retain the llvm.loop.parallel_accesses attribute. Access instructions that are cloned point to the same access group. It is not necessary for each access to have its own "ID" MDNode; memory access instructions with the same behavior can be grouped together. The behavior of llvm.mem.parallel_loop_access is not changed by this patch, but should be considered deprecated. Differential Revision: https://reviews.llvm.org/D52116 llvm-svn: 349725	2018-12-20 04:58:07 +00:00
Thomas Lively	feb18fe927	[WebAssembly] Emit a splat for v128 IMPLICIT_DEF Summary: This is a code size savings and is also important to get runnable code while engines do not support v128.const. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55910 llvm-svn: 349724	2018-12-20 04:20:32 +00:00
Amara Emerson	321bfb210a	Fix build errors introduced by r349712 on aarch64 bots. llvm-svn: 349723	2018-12-20 03:27:42 +00:00
Thomas Lively	8dbf29af95	[WebAssembly] Gate unimplemented SIMD ops on flag Summary: Gates v128.const, f32x4.sqrt, f32x4.div, i8x16.extract_lane_u, and i16x8.extract_lane_u on the --wasm-enable-unimplemented-simd flag, since these ops are not implemented yet in V8. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55904 llvm-svn: 349720	2018-12-20 02:10:22 +00:00
Matt Arsenault	4339883710	AMDGPU: Make i1/i64/v2i32 and/or/xor legal The 64-bit types do depend on the register bank, but that's another issue to deal with later. llvm-svn: 349716	2018-12-20 01:35:49 +00:00
Matt Arsenault	8cc98bee8a	AMDGPU/GlobalISel: Fix ValueMapping tables for i1 This was incorrectly selecting SGPR for any i1 values, e.g. G_TRUNC to i1 from a VGPR was still an SGPR. llvm-svn: 349715	2018-12-20 01:33:43 +00:00
Craig Topper	9ca2f5605e	[X86] Disable custom widening of signed/unsigned add/sub saturation intrinsics under -x86-experimental-vector-widening-legalization. Generic legalization should take care of this. llvm-svn: 349714	2018-12-20 01:32:06 +00:00
Amara Emerson	8cb186ce17	[AArch64][GlobalISel] Implement selection of G_MERGE of two s32s into s64. This code pattern is an unfortunate side effect of the way some types get split at call lowering. Ideally we'd either not generate it at all or combine it away in the legalizer artifact combiner. Until then, add selection support anyway, since it accounts for a significant proportion of our current fallbacks on CTMark. rdar://46491420 llvm-svn: 349712	2018-12-20 01:11:04 +00:00
Matt Arsenault	dff33c38e1	AMDGPU/GlobalISel: RegBankSelect for fp conversions llvm-svn: 349709	2018-12-20 00:37:02 +00:00
Matt Arsenault	36d4092173	AMDGPU/GlobalISel: Legality/regbankselect for atomicrmw/atomic_cmpxchg llvm-svn: 349708	2018-12-20 00:33:49 +00:00
Vitaly Buka	07a55f27dc	[asan] Undo special treatment of linkonce_odr and weak_odr Summary: On non-Windows these are already removed by ShouldInstrumentGlobal. On Windows we will wait until we get actual issues with that. Reviewers: pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55899 llvm-svn: 349707	2018-12-20 00:30:27 +00:00
Vitaly Buka	d414e1bbb5	[asan] Prevent folding of globals with redzones Summary: ICF prevented by removing unnamed_addr and local_unnamed_addr for all sanitized globals. Also in general unnamed_addr is not valid here as address now is important for ODR violation detector and redzone poisoning. Before the patch ICF on globals caused: 1. false ODR reports when we register global on the same address more than once 2. globals buffer overflow if we fold variables of smaller type inside of large type. Then the smaller one will poison redzone which overlaps with the larger one. Reviewers: eugenis, pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55857 llvm-svn: 349706	2018-12-20 00:30:18 +00:00
Matt Davis	87b2268c0c	[DwarfExpression] Fix a typo in a doxygen comment. NFC. llvm-svn: 349703	2018-12-20 00:01:57 +00:00
Craig Topper	217b3b20d8	[X86] Remove TLI variable from ReplaceNodeResults. NFC We're already in X86TargetLowering which is a derived class of TargetLowering. We can just call methods directly. llvm-svn: 349695	2018-12-19 23:13:03 +00:00
Rhys Perry	3931ad38b9	AMDGPU: Add patterns for v4i16/v4f16 -> v4i16/v4f16 bitcasts Reviewers: arsenm, tstellar Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D55058 llvm-svn: 349694	2018-12-19 22:53:33 +00:00
Eli Friedman	a69084ffa8	[CodeGenPrepare] Fix bad IR created by large offset GEP splitting. Creating the IR builder, then modifying the CFG, leads to an IRBuilder where the BB and insertion point are inconsistent, so new instructions have the wrong parent. Modified an existing test because the test wasn't covering anything useful (the "invoke" was not actually an invoke by the time we hit the code in question). Differential Revision: https://reviews.llvm.org/D55729 llvm-svn: 349693	2018-12-19 22:52:04 +00:00
Rhys Perry	972273d1d3	Fix test commit Seems that was actually a eight space tab... llvm-svn: 349690	2018-12-19 22:33:42 +00:00
Rhys Perry	111bf831de	Test commit Replace tab with 4 spaces. llvm-svn: 349689	2018-12-19 22:26:51 +00:00
Evandro Menezes	374ccf6768	[AArch64] Improve Exynos predicates Expand the predicate `ExynosResetPred` to include all forms of immediate moves. llvm-svn: 349686	2018-12-19 22:24:36 +00:00
Evandro Menezes	ff827d737a	[AArch64] Use canonical copy idiom Use only the canonical form of the alias for register transfers in the `IsCopyIdiomPred` predicate. llvm-svn: 349685	2018-12-19 22:24:31 +00:00
Nikita Popov	3817ee7908	Revert "[BDCE][DemandedBits] Detect dead uses of undead instructions" This reverts commit r349674. It causes a failure in test-suite enc-3des.execution_time. llvm-svn: 349684	2018-12-19 22:09:02 +00:00
Reid Kleckner	ed3ef41711	[llvm-ar] Simplify string table get-or-insert pattern with .insert, NFC llvm-svn: 349681	2018-12-19 20:54:06 +00:00
Nikita Popov	649e125451	[BDCE][DemandedBits] Detect dead uses of undead instructions This (mostly) fixes https://bugs.llvm.org/show_bug.cgi?id=39771. BDCE currently detects instructions that don't have any demanded bits and replaces their uses with zero. However, if an instruction has multiple uses, then some of the uses may be dead (have no demanded bits) even though the instruction itself is still live. This patch extends DemandedBits/BDCE to detect such uses and replace them with zero. While this will not immediately render any instructions dead, it may lead to simplifications (in the motivating case, by converting a rotate into a simple shift), break dependencies, etc. The implementation tries to strike a balance between analysis power and complexity/memory usage. Originally I wanted to track demanded bits on a per-use level, but ultimately we're only really interested in whether a use is entirely dead or not. I'm using an extra set to track which uses are dead. However, as initially all uses are dead, I'm not storing uses whose user is also dead. This case is checked separately instead. The test case has a couple of cases that are not simplified yet. In particular, we're only looking at uses of instructions right now. I think it would make sense to also extend this to arguments. Furthermore, DemandedBits doesn't yet know some of the tricks that InstCombine does for the demanded bits of bitwise or/and/xor in combination with known bits information. Differential Revision: https://reviews.llvm.org/D55563 llvm-svn: 349674	2018-12-19 19:56:21 +00:00
Craig Topper	d16da2b479	[X86] Remove a bunch of 'else' after returns in reduceVMULWidth. NFC This reduces indentation and makes it obvious this function always returns something. llvm-svn: 349671	2018-12-19 19:39:34 +00:00
David Blaikie	ac69af7ad6	llvm-dwarfdump: Improve/fix pretty printing of array dimensions This is to address post-commit feedback from Paul Robinson on r348954. The original commit misinterprets count and upper bound as the same thing (I thought I saw GCC producing an upper bound the same as Clang's count, but GCC correctly produces an upper bound that's one less than the count (in C, that is, where arrays are zero indexed)). I want to preserve the C-like output for the common case, so in the absence of a lower bound the count (or one greater than the upper bound) is rendered between []. In the trickier cases, where a lower bound is specified, a half-open range is used (eg: lower bound 1, count 2 would be "[1, 3)"), and unknown parts use a '?' (eg: "[1, ?)" or "[?, 7)" or "[?, ? + 3)"). Reviewers: aprantl, probinson, JDevlieghere Differential Revision: https://reviews.llvm.org/D55721 llvm-svn: 349670	2018-12-19 19:34:24 +00:00
Matthew Voss	62fcfc5adb	[ThinLTO] Remove dllimport attribute from locally defined symbols Summary: The LTO/ThinLTO driver currently creates invalid bitcode by setting symbols marked dllimport as dso_local. The compiler often has access to the definition (often dllexport) and the declaration (often dllimport) of an object at link-time, leading to a conflicting declaration. This patch resolves the inconsistency by removing the dllimport attribute. Reviewers: tejohnson, pcc, rnk, echristo Reviewed By: rnk Subscribers: dmikulin, wristow, mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D55627 llvm-svn: 349667	2018-12-19 19:07:45 +00:00
Jessica Paquette	3560e93dc1	[GlobalISel][AArch64] Add support for @llvm.ceil This adds a G_FCEIL generic instruction and uses it in AArch64. This adds selection for floating point ceil where it has a supported, dedicated instruction. Other cases aren't handled here. It updates the relevant gisel tests and adds a select-ceil test. It also adds a check to arm64-vcvt.ll which ensures that we don't fall back when we run into one of the relevant cases. llvm-svn: 349664	2018-12-19 19:01:36 +00:00
Craig Topper	84a00bd98a	[X86] Don't match TESTrr from (cmp (and X, Y), 0) during isel. Defer to post processing The (cmp (and X, Y) 0) pattern is greedy and ends up forming a TESTrr and consuming the and when it might be better to use one of the BMI/TBM instructions like BLSR or BLSI. This patch removes the pattern from isel and adds a post processing check to combine TESTrr+ANDrr into just a TESTrr. With this patch we are able to select the BMI/TBM instructions, but we'll also emit a TESTrr when the result is compared to 0. In many cases the peephole pass will be able to use optimizeCompareInstr to remove the TEST, but it's probably not perfect. Differential Revision: https://reviews.llvm.org/D55870 llvm-svn: 349661	2018-12-19 18:49:13 +00:00
Craig Topper	291470347a	[X86] Fix assert fails in pass X86AvoidSFBPass Fixes https://bugs.llvm.org/show_bug.cgi?id=38743 The function removeRedundantBlockingStores is supposed to remove any blocking stores contained in each other in lockingStoresDispSizeMap. But it currently looks only at the previous one, which will miss some cases that result in an assert. This patch refines the function to check all previous layouts until it finds an uncontained one, so all redundant stores will be removed. Patch by Pengfei Wang Differential Revision: https://reviews.llvm.org/D55642 llvm-svn: 349660	2018-12-19 18:45:57 +00:00
Evandro Menezes	5d409b2278	[AArch64] Improve the Exynos M3 pipeline model llvm-svn: 349652	2018-12-19 17:37:51 +00:00
Anton Afanasyev	ce28791e20	Test commit Fix typos. llvm-svn: 349644	2018-12-19 17:18:40 +00:00
Sanjay Patel	798c5982a0	[ValueTracking] remove unused parameters from helper functions; NFC llvm-svn: 349641	2018-12-19 16:49:18 +00:00
Yonghong Song	7b410ac352	[BPF] Generate BTF DebugInfo under BPF target This patch implements BTF (BPF Type Format). The BTF is the debug info format for BPF, introduced in the below linux patch: `69b693f0ae (diff-06fb1c8825f653d7e539058b72c83332)` and further extended several times, e.g., https://www.spinics.net/lists/netdev/msg534640.html https://www.spinics.net/lists/netdev/msg538464.html https://www.spinics.net/lists/netdev/msg540246.html The main advantage of implementing in LLVM is: . better integration/deployment as no extra tools are needed. . bpf JIT based compilation (like bcc, bpftrace, etc.) can get BTF without much extra effort. . BTF line_info needs selective source codes, which can be easily retrieved when inside the compiler. This patch implemented BTF generation by registering a BPF specific DebugHandler in BPFAsmPrinter. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D55752 llvm-svn: 349640	2018-12-19 16:40:25 +00:00
Peter Wu	f0ad811b54	[Object] Deduplicate long archive member names Summary: Import libraries as created by llvm-dlltool always use the same archive member name for every object file (namely, the DLL library name). Ensure that long names are not repeatedly stored in the string table. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D55860 llvm-svn: 349637	2018-12-19 16:15:05 +00:00
Simon Pilgrim	7bfbf3caa4	[X86][SSE] Auto upgrade PADDUS/PSUBUS intrinsics to UADD_SAT/USUB_SAT generic intrinsics (llvm) Now that we use the generic ISD opcodes, we can use the generic intrinsics directly as well. This fixes the poor fast-isel codegen by not expanding to an easily broken IR code sequence. I'm intending to deal with the signed saturation equivalents as well. Clang counterpart: https://reviews.llvm.org/D55879 Differential Revision: https://reviews.llvm.org/D55855 llvm-svn: 349630	2018-12-19 14:43:36 +00:00
Simon Pilgrim	2ae3a91656	[SelectionDAG] Optional handling of UNDEF elements in matchBinaryPredicate (part 2 of 2) Now that SimplifyDemandedBits/SimplifyDemandedVectorElts is simplifying vector elements, we're seeing more constant BUILD_VECTOR containing undefs. This patch provides opt-in support for UNDEF elements in matchBinaryPredicate, passing NULL instead of the result ConstantSDNode* argument. I've updated the (or (and X, c1), c2) -> (and (or X, c2), c1|c2) fold to demonstrate its use, which I believe is safe for undef cases. Differential Revision: https://reviews.llvm.org/D55822 llvm-svn: 349629	2018-12-19 14:09:38 +00:00
Simon Pilgrim	47ff0431e9	[SelectionDAG] Optional handling of UNDEF elements in matchBinaryPredicate (part 1 of 2) Now that SimplifyDemandedBits/SimplifyDemandedVectorElts is simplifying vector elements, we're seeing more constant BUILD_VECTOR containing undefs. This patch provides opt-in support for UNDEF elements in matchBinaryPredicate, passing NULL instead of the result ConstantSDNode* argument. Differential Revision: https://reviews.llvm.org/D55822 llvm-svn: 349628	2018-12-19 14:09:09 +00:00
Simon Pilgrim	6c95bea072	[TargetLowering] Fix propagation of undefs in zero extension ops (PR40091) As described on PR40091, we have several places where zext (and zext_vector_inreg) fold an undef input into an undef output. For zero extensions this is incorrect, as the output should be guaranteed to at least have the new upper bits set to zero. SimplifyDemandedVectorElts is the worst offender (and it's the most likely to cause new undefs to appear) but DAGCombiner's tryToFoldExtendOfConstant has a similar issue. Thanks to @dmgreen for catching this. Differential Revision: https://reviews.llvm.org/D55883 llvm-svn: 349625	2018-12-19 13:37:59 +00:00
Nico Weber	f7cf1a1a73	Let TableGen write output only if it changed, instead of doing so in cmake, attempt 2 This relands r330742: """ Let TableGen write output only if it changed, instead of doing so in cmake. Removes one subprocess and one temp file from the build for each tablegen invocation. No intended behavior change. """ In particular, if you see rebuilds after this change that you didn't see before this change, that's unintended and it's fine to revert this change again (but let me know). r330742 got reverted because some people reported that llvm-tblgen ran on every build after it. This could happen if the depfile output got deleted without deleting the main .inc output. To fix, make TableGen always write the depfile, but keep writing the main .inc output only if it has changed. This matches what we did in cmake before. Differential Revision: https://reviews.llvm.org/D55842 llvm-svn: 349624	2018-12-19 13:35:53 +00:00
Nicolai Haehnle	8d5e974076	AMDGPU: Use an ABS32_LO relocation for SCRATCH_RSRC_DWORD1 Summary: Using HI here makes no logical sense, since the dword is only 32 bits to begin with. Current Mesa master does not look at the relocation type at all, so this change is fine. Future Mesa will rely on this, however. Change-Id: I91085707834c4ac0370926602b93c94b90e44cb1 Reviewers: arsenm, rampitec, mareko Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D55369 llvm-svn: 349620	2018-12-19 11:55:03 +00:00
Simon Pilgrim	2072b5afbe	[SelectionDAG] Optional handling of UNDEF elements in matchUnaryPredicate Now that SimplifyDemandedBits/SimplifyDemandedVectorElts are simplifying vector elements, we're seeing more constant BUILD_VECTOR containing UNDEFs. This patch provides opt-in handling of UNDEF elements in matchUnaryPredicate, passing NULL instead of the ConstantSDNode* argument. I've updated SelectionDAG::simplifyShift to demonstrate its use. Differential Revision: https://reviews.llvm.org/D55819 llvm-svn: 349616	2018-12-19 10:41:06 +00:00
Carl Ritson	c521ac3a44	AMDGPU/InsertWaitcnts: Update VGPR/SGPR bounds when brackets are merged Summary: Fix an issue where VGPR/SGPR bounds are not properly extended when brackets are merged. This manifests as missing waitcnt insertions when multiple brackets are forwarded to a successor block and the first forward has lower VGPR/SGPR bounds. Irreducible loop test has been extended based on a CTS failure detected for GFX9. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D55602 llvm-svn: 349611	2018-12-19 10:17:49 +00:00
Diana Picus	6c35a1e5af	[ARM GlobalISel] Support G_CONSTANT for Thumb2 All we have to do is mark it as legal. This allows us to select a lot of new patterns handled by TableGen. This patch adds tests for them and splits up the existing test file for binary operators into 2 files, one for arithmetic ops and one for logical ones. llvm-svn: 349610	2018-12-19 09:55:10 +00:00
Matt Arsenault	b110e2277c	AMDGPU/GlobalISel: Regbankselect for fsub llvm-svn: 349608	2018-12-19 09:07:58 +00:00
Martin Storsjo	e84a0b5a9e	[llvm-objcopy] Initial COFF support This is an initial implementation of no-op passthrough copying of COFF with objcopy. Differential Revision: https://reviews.llvm.org/D54939 llvm-svn: 349605	2018-12-19 07:24:38 +00:00
Kewen Lin	a6247e7cf4	[PowerPC]Exploit P9 vabsdu for unsigned vselect patterns For type v4i32/v8i16/v16i8, do the following transforms: (vselect (setcc a, b, setugt), (sub a, b), (sub b, a)) -> (vabsd a, b) (vselect (setcc a, b, setuge), (sub a, b), (sub b, a)) -> (vabsd a, b) (vselect (setcc a, b, setult), (sub b, a), (sub a, b)) -> (vabsd a, b) (vselect (setcc a, b, setule), (sub b, a), (sub a, b)) -> (vabsd a, b) Differential Revision: https://reviews.llvm.org/D55812 llvm-svn: 349599	2018-12-19 03:04:07 +00:00
Evandro Menezes	f03c45d582	[AArch64] Simplify the Exynos M3 pipeline model llvm-svn: 349569	2018-12-18 23:19:57 +00:00
Evandro Menezes	4e39fa4474	[AArch64] Fix instructions order (NFC) llvm-svn: 349568	2018-12-18 23:19:55 +00:00
Yonghong Song	61b189e06f	[DebugInfo] Move several private headers to include directory This patch moved the following files in lib/CodeGen/AsmPrinter/ AsmPrinterHandler.h DbgEntityHistoryCalculator.h DebugHandlerBase.h to include/llvm/CodeGen directory. Such a change will enable Target to extend DebugHandlerBase and emit Target specific debug info sections. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D55755 llvm-svn: 349564	2018-12-18 23:10:17 +00:00
Pete Cooper	a3e0be109c	Preserve the linkage for objc* intrinsics as clang will set them to weak_external in some cases Clang uses weak linkage for objc runtime functions when they are not available on the platform. The intrinsic has this linkage so we just need to pass that on to the runtime call. llvm-svn: 349559	2018-12-18 22:42:08 +00:00
Pete Cooper	d0ffdf8782	Add nonlazybind to objc_retain/objc_release when converting from intrinsics. For performance reasons, clang set nonlazybind on these functions. Now that we are using intrinsics instead of runtime calls, we should set this attribute when creating the runtime functions. llvm-svn: 349558	2018-12-18 22:31:34 +00:00
Florian Hahn	485f2826ba	[LAA] Introduce enum for vectorization safety status (NFC). This patch adds a VectorizationSafetyStatus enum, which will be extended in a follow up patch to distinguish between 'safe with runtime checks' and 'known unsafe' dependences. Reviewers: anemet, anna, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D54892 llvm-svn: 349556	2018-12-18 22:25:11 +00:00
Vitaly Buka	4e4920694c	[asan] Restore ODR-violation detection on vtables Summary: unnamed_addr is still useful for detecting ODR violations on vtables. Still, unnamed_addr with lld and --icf=safe or --icf=all can trigger false reports, which can be avoided with --icf=none or by using private aliases with -fsanitize-address-use-odr-indicator Reviewers: eugenis Reviewed By: eugenis Subscribers: kubamracek, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55799 llvm-svn: 349555	2018-12-18 22:23:30 +00:00
Pete Cooper	f86db5ce9e	Rewrite objc intrinsics to runtime methods in PreISelIntrinsicLowering instead of SDAG. SelectionDAG currently changes these intrinsics to function calls, but that won't work for other ISel's. Also we want to eventually support nonlazybind and weak linkage coming from the front-end which we can't do in SelectionDAG. llvm-svn: 349552	2018-12-18 22:20:03 +00:00
Martin Storsjo	df20c666d6	[AArch64] Avoid crashing on .seh directives in assembly Differential Revision: https://reviews.llvm.org/D55670 llvm-svn: 349549	2018-12-18 22:10:17 +00:00
Kuba Mracek	3760fc9f3d	[asan] In llvm.asan.globals, allow entries to be non-GlobalVariable and skip over them Looks like there are valid reasons why we need to allow bitcasts in llvm.asan.globals, see discussion at https://github.com/apple/swift-llvm/pull/133. Let's look through bitcasts when iterating over entries in the llvm.asan.globals list. Differential Revision: https://reviews.llvm.org/D55794 llvm-svn: 349544	2018-12-18 21:20:17 +00:00
Evandro Menezes	3753c25b8c	[llvm-mca] Dump mask in hex Dump the resources masks as hexadecimal. llvm-svn: 349536	2018-12-18 20:45:50 +00:00
Pete Cooper	be4f571107	Change the objc ARC optimizer to use the new objc.* intrinsics We're moving ARC optimisation and ARC emission in clang away from runtime methods and towards intrinsics. This is the part which actually uses the intrinsics in the ARC optimizer when both analyzing the existing calls and emitting new ones. Differential Revision: https://reviews.llvm.org/D55348 Reviewers: ahatanak llvm-svn: 349534	2018-12-18 20:32:49 +00:00
Craig Topper	18a9d545e1	[X86] Add BSR to isUseDefConvertible. We already had BSF here as part of __builtin_ffs improvements and I was just wondering yesterday whether we should have BSR there. This addresses one issue from PR40090. llvm-svn: 349531	2018-12-18 20:03:54 +00:00
Nikita Popov	20853a7807	[InstCombine] Simplify cttz/ctlz + icmp eq/ne into mask check Checking whether a number has a certain number of trailing / leading zeros means checking whether it is of the form XXXX1000 / 0001XXXX, which can be done with an and+icmp. Related to https://bugs.llvm.org/show_bug.cgi?id=28668. As a next step, this can be extended to non-equality predicates. Differential Revision: https://reviews.llvm.org/D55745 llvm-svn: 349530	2018-12-18 19:59:50 +00:00
Farhana Aleen	59ee2c5362	[AMDGPU] Removed the unnecessary operand size-check-assert from processBaseWithConstOffset(). Summary: 32bit operand sizes are guaranteed by the opcode check AMDGPU::V_ADD_I32_e64 and AMDGPU::V_ADDC_U32_e64. Therefore, we don't need any additional operand size-check-assert. Author: FarhanaAleen llvm-svn: 349529	2018-12-18 19:58:39 +00:00
David Blaikie	693f617763	DebugInfo: Fix missing local imported entities after r349207 Post commit review/bug reported by Pavel Labath - thanks! llvm-svn: 349528	2018-12-18 19:40:22 +00:00
Florian Hahn	5c014037b3	[SCCP] Get rid of redundant call for getPredicateInfoFor (NFC). We can use the result fetched a few lines above. llvm-svn: 349527	2018-12-18 19:37:07 +00:00
Craig Topper	8434ef7d1e	[X86] Don't use SplitOpsAndApply to create ISD::UADDSAT/ISD::USUBSAT nodes. Let type legalization and op legalization deal with it. Now that we've switched to target independent nodes we can rely on generic infrastructure to do the legalization for us. llvm-svn: 349526	2018-12-18 19:29:08 +00:00
Sanjay Patel	e51d5bdb3c	[InstCombine] refactor isCheapToScalarize(); NFC As the FIXME indicates, this has the potential to go overboard. So I'm not sure if it's even worth keeping this vs. iteratively doing simple matches, but we might as well clean it up. llvm-svn: 349523	2018-12-18 19:07:38 +00:00
Nikita Popov	f6058ff140	[X86] Use SADDSAT/SSUBSAT instead of ADDS/SUBS Migrate the X86 backend from X86ISD opcodes ADDS and SUBS to generic ISD opcodes SADDSAT and SSUBSAT. This also improves codegen for @llvm.sadd.sat() and @llvm.ssub.sat() intrinsics. This is a followup to D55787 and part of PR40056. Differential Revision: https://reviews.llvm.org/D55833 llvm-svn: 349520	2018-12-18 18:28:22 +00:00
Craig Topper	20a6db5a84	[X86] Create PSUBUS from (add (umax X, C), -C) InstCombine seems to canonicalize our PSUB pattern into a max with the constant and an add with an inverse of the constant. This patch recognizes this pattern and turns it into PSUBUS. Future work could improve undef element handling. Fixes some of PR40053 Differential Revision: https://reviews.llvm.org/D55780 llvm-svn: 349519	2018-12-18 18:26:25 +00:00
Alexandre Ganea	b536bf5299	Buildfix for r345516 (Clang compilation failing). llvm-svn: 349518	2018-12-18 18:23:36 +00:00
Alexandre Ganea	b67d91e090	[llvm-symbolizer] Omit stderr output when symbolizing a crash Differential revision: https://reviews.llvm.org/D55723 llvm-svn: 349516	2018-12-18 18:13:13 +00:00
Michael Berg	c6a5245cf7	Add FMF management to common fp intrinsics in GlobalIsel Summary: This is the initial code change to facilitate managing FMF flags from Instructions to MI wrt Intrinsics in Global Isel. Eventually the GlobalObserver interface will be added as well, where FMF additions can be tracked for the builder and CSE. Reviewers: aditya_nandakumar, bogner Reviewed By: bogner Subscribers: rovka, kristof.beyls, javed.absar Differential Revision: https://reviews.llvm.org/D55668 llvm-svn: 349514	2018-12-18 17:54:52 +00:00
Michael Kruse	d4eb13c880	[LoopVectorize] Rename pass options. NFC. Rename: NoUnrolling to InterleaveOnlyWhenForced and AlwaysVectorize to !VectorizeOnlyWhenForced Contrary to what the name 'AlwaysVectorize' suggests, it does not unconditionally vectorize all loops, but applies a cost model to determine whether vectorization is profitable to all loops. Hence, passing false will disable the cost model, except when a loop is marked with llvm.loop.vectorize.enable. The 'OnlyWhenForced' suffix (suggested by @hfinkel in D55716) better matches this behavior. Similarly, 'NoUnrolling' disables the profitability cost model for interleaving (a term to distinguish it from unrolling by the LoopUnrollPass); rename it for consistency. Differential Revision: https://reviews.llvm.org/D55785 llvm-svn: 349513	2018-12-18 17:46:09 +00:00
Simon Pilgrim	1411917431	[X86][SSE] Don't use 'sign bit select' vXi8 ROTL lowering for constant rotation amounts Noticed by @spatel on D55747 - we get much better codegen if we use the regular shift expansion. llvm-svn: 349510	2018-12-18 17:31:11 +00:00
Michael Kruse	3284775b70	[LoopUnroll] Honor '#pragma unroll' even with -fno-unroll-loops. When using clang with `-fno-unroll-loops` (implicitly added with `-O1`), the LoopUnrollPass is not added to the (legacy) pass pipeline. This also means that it will not process any loop metadata such as llvm.loop.unroll.enable (which is generated by #pragma unroll), and WarnMissedTransformationsPass emits a warning that a forced transformation has not been applied (see https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610833.html). Such explicit transformations should take precedence over disabling heuristics. This patch unconditionally adds LoopUnrollPass to the optimizing pipeline (that is, it is still not added with `-O0`), but passes a flag indicating whether automatic unrolling is dis-/enabled. This is the same approach as LoopVectorize uses. The new pass manager's pipeline builder has no option to disable unrolling, hence the problem does not apply. Differential Revision: https://reviews.llvm.org/D55716 llvm-svn: 349509	2018-12-18 17:16:05 +00:00
Simon Pilgrim	e9effe9744	[X86][SSE] Don't use 'sign bit select' vXi8 ROTL lowering for splat rotation amounts Noticed by @spatel on D55747 - we get much better codegen if we use the regular shift expansion. llvm-svn: 349500	2018-12-18 16:02:23 +00:00
Petar Avramovic	0a5e4eb776	[MIPS GlobalISel] Select G_SDIV, G_UDIV, G_SREM and G_UREM Add support for s64 libcalls for G_SDIV, G_UDIV, G_SREM and G_UREM and use integer type of correct size when creating arguments for CLI.lowerCall. Select G_SDIV, G_UDIV, G_SREM and G_UREM for types s8, s16, s32 and s64 on MIPS32. Differential Revision: https://reviews.llvm.org/D55651 llvm-svn: 349499	2018-12-18 15:59:51 +00:00
Nikita Popov	665ab08178	[X86] Use UADDSAT/USUBSAT instead of ADDUS/SUBUS Replace the X86ISD opcodes ADDUS and SUBUS with generic ISD opcodes UADDSAT and USUBSAT. As a side-effect, this also makes codegen for the @llvm.uadd.sat and @llvm.usub.sat intrinsics reasonable. This only replaces use in the X86 backend, and does not move any of the ADDUS/SUBUS X86 specific combines into generic codegen. Differential Revision: https://reviews.llvm.org/D55787 llvm-svn: 349481	2018-12-18 13:23:03 +00:00
Nikita Popov	a7d2a235bb	[SelectionDAG][X86] Fix [US](ADD\|SUB)SAT vector legalization, add tests Integer result promotion needs to use the scalar size, and we need support for result widening. This is in preparation for D55787. llvm-svn: 349480	2018-12-18 13:22:53 +00:00
Petar Avramovic	150fd430f6	[MIPS GlobalISel] ClampScalar G_AND G_OR and G_XOR Add narrowScalar for G_AND and G_XOR. Legalize G_AND G_OR and G_XOR for types other than s32 with clampScalar on MIPS32. Differential Revision: https://reviews.llvm.org/D55362 llvm-svn: 349475	2018-12-18 11:36:14 +00:00
Luke Cheeseman	f57d7d8237	[AArch64] - Return address signing dwarf support - Reapply changes initially introduced in r343089 - The architecture info is no longer loaded whenever a DWARFContext is created - The runtime libraries (sanitizers) make use of the dwarf context classes but do not initialise the target info - The architecture of the object can be obtained without loading the target info - Adding a method to the dwarf context to get this information and multiplex the string printing later on Differential Revision: https://reviews.llvm.org/D55774 llvm-svn: 349472	2018-12-18 10:37:42 +00:00
Dylan McKay	f920da009e	[IPO][AVR] Create new Functions in the default address space specified in the data layout This modifies the IPO pass so that it respects any explicit function address space specified in the data layout. In targets with nonzero program address spaces, all functions should, by default, be placed into the default program address space. This is required for Harvard architectures like AVR. Without this, the functions will be marked as residing in data space, and thus not be callable. This has no effect on any in-tree official backends, as none use an explicit program address space in their data layouts. Patch by Tim Neumann. llvm-svn: 349469	2018-12-18 09:52:52 +00:00
Matt Arsenault	c94e26c71d	AMDGPU: Legalize/regbankselect frame_index llvm-svn: 349468	2018-12-18 09:46:13 +00:00
Matt Arsenault	c0ea221068	AMDGPU: Legalize/regbankselect fma llvm-svn: 349467	2018-12-18 09:39:56 +00:00
Simon Pilgrim	af6fbbf18b	[TargetLowering] Fallback from SimplifyDemandedVectorElts to SimplifyDemandedBits For opcodes not covered by SimplifyDemandedVectorElts, SimplifyDemandedBits might be able to help now that it supports demanded elts as well. llvm-svn: 349466	2018-12-18 09:33:25 +00:00
Tim Northover	856628f707	SROA: preserve alignment tags on loads and stores. When splitting up an alloca's uses we were dropping any explicit alignment tags, which means they default to the ABI-required default alignment and this can cause miscompiles if the real value was smaller. Also refactor the TBAA metadata into a parent class since it's shared by both children anyway. llvm-svn: 349465	2018-12-18 09:29:39 +00:00
Matt Arsenault	1ac38ba73f	GlobalISel: Improve crash on invalid mapping If NumBreakDowns is 0, BreakDown is null. This trades a null dereference with an assert somewhere else. llvm-svn: 349464	2018-12-18 09:27:29 +00:00
Matt Arsenault	e01e7c81f2	AMDGPU/GlobalISel: Legalize/regbankselect fneg/fabs/fsub llvm-svn: 349463	2018-12-18 09:19:03 +00:00
Simon Pilgrim	8488a44c34	[X86][SSE] Move VSRAI sign extend in reg fold into SimplifyDemandedBits (VSRAI (VSHLI X, C1), C1) --> X iff NumSignBits(X) > C1 This works better as part of SimplifyDemandedBits than part of the general combine. llvm-svn: 349462	2018-12-18 09:11:34 +00:00
Simon Pilgrim	26c630f416	[X86][SSE] Replace (VSRLI (VSRAI X, Y), 31) -> (VSRLI X, 31) fold. This fold was incredibly specific - replace with a SimplifyDemandedBits fold to remove a VSRAI if only the original sign bit is demanded (it's guaranteed to stay the same). Test change is merely a rescheduling. llvm-svn: 349459	2018-12-18 08:55:47 +00:00
Kristof Beyls	e66bc1f756	Introduce control flow speculation tracking pass for AArch64 The pass implements tracking of control flow miss-speculation into a "taint" register. That taint register can then be used to mask off registers with sensitive data when executing under miss-speculation, a.k.a. "transient execution". This pass is aimed at mitigating against SpectreV1-style vulnerabilities. At the moment, it implements the tracking of miss-speculation of control flow into a taint register, but doesn't implement a mechanism yet to then use that taint register to mask off vulnerable data in registers (something for a follow-on improvement). Possible strategies to mask out vulnerable data that can be implemented on top of this are: - speculative load hardening to automatically mask off data loaded in registers. - using intrinsics to mask off data in registers as indicated by the programmer (see https://lwn.net/Articles/759423/). For AArch64, the following implementation choices are made. Some of these are different than the implementation choices made in the similar pass implemented in X86SpeculativeLoadHardening.cpp, as the instruction set characteristics result in different trade-offs. - The speculation hardening is done after register allocation. With a relative abundance of registers, one register is reserved (X16) to be the taint register. X16 is expected to not clash with other register reservation mechanisms with very high probability because: . The AArch64 ABI doesn't guarantee X16 to be retained across any call. . The only way to request X16 to be used as a programmer is through inline assembly. In the rare case a function explicitly demands to use X16/W16, this pass falls back to hardening against speculation by inserting a DSB SYS/ISB barrier pair which will prevent control flow speculation. - It is easy to insert mask operations at this late stage as we have mask operations available that don't set flags.
- The taint variable contains all-ones when no miss-speculation is detected, and contains all-zeros when miss-speculation is detected. Therefore, when masking, an AND instruction (which only changes the register to be masked, no other side effects) can easily be inserted anywhere that's needed. - The tracking of miss-speculation is done by using a data-flow conditional select instruction (CSEL) to evaluate the flags that were also used to make conditional branch direction decisions. Speculation of the CSEL instruction can be limited with a CSDB instruction - so the combination of CSEL + a later CSDB gives the guarantee that the flags as used in the CSEL aren't speculated. When conditional branch direction gets miss-speculated, the semantics of the inserted CSEL instruction is such that the taint register will contain all zero bits. One key requirement for this to work is that the conditional branch is followed by an execution of the CSEL instruction, where the CSEL instruction needs to use the same flags status as the conditional branch. This means that the conditional branches must not be implemented as one of the AArch64 conditional branches that do not use the flags as input (CB(N)Z and TB(N)Z). This is implemented by ensuring in the instruction selectors to not produce these instructions when speculation hardening is enabled. This pass will assert if it does encounter such an instruction. - On function call boundaries, the miss-speculation state is transferred from the taint register X16 to be encoded in the SP register as value 0. Future extensions/improvements could be: - Implement this functionality using full speculation barriers, akin to the x86-slh-lfence option. This may be more useful for the intrinsics-based approach than for the SLH approach to masking. Note that this pass already inserts the full speculation barriers if the function for some niche reason makes use of X16/W16. 
- no indirect branch misprediction gets protected/instrumented; but this could be done for some indirect branches, such as switch jump tables. Differential Revision: https://reviews.llvm.org/D54896 llvm-svn: 349456	2018-12-18 08:50:02 +00:00
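
The taint-register convention described in the entry above (all-ones when no miss-speculation is detected, all-zeros once it is, with masking done by a plain AND) can be illustrated with a small model. This is only a sketch of the arithmetic, not the pass itself: the function names `csel_update` and `mask_with_taint` are hypothetical, and real CSEL/CSDB behaviour under speculation cannot be reproduced in Python.

```python
# Hypothetical model of the taint-register arithmetic; names are illustrative only.
MASK64 = (1 << 64) - 1
ALL_ONES = MASK64  # taint value when no miss-speculation has been detected


def csel_update(taint: int, flags_say_taken: bool, on_taken_path: bool) -> int:
    """Model the CSEL after a conditional branch.

    If the flags agree with the path actually being executed, the taint is
    kept; otherwise the path was miss-speculated and the taint becomes zero.
    """
    return taint if flags_say_taken == on_taken_path else 0


def mask_with_taint(value: int, taint: int) -> int:
    """Model the AND used to mask a register with the taint register."""
    return value & taint & MASK64


secret = 0xDEADBEEF
ok = csel_update(ALL_ONES, flags_say_taken=True, on_taken_path=True)
bad = csel_update(ALL_ONES, flags_say_taken=True, on_taken_path=False)
print(hex(mask_with_taint(secret, ok)))
print(hex(mask_with_taint(secret, bad)))
```

Because zero ANDed with anything stays zero, a taint that has been cleared once remains cleared through later `csel_update` calls, which matches the "sticky" behaviour the message implies.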


119417 Commits
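
The [InstCombine] cttz/ctlz entry in the log above states that comparing a trailing or leading zero count for equality amounts to checking the XXXX1000 / 0001XXXX bit pattern with an and plus an icmp. The sketch below checks that equivalence exhaustively for 8-bit values. It is an illustration of the identity only, not LLVM code; the helper names and the choice of an 8-bit width are assumptions made for the example, and the case where the count equals the bit width is treated as a comparison against zero.

```python
# Illustrative check of the mask identity; helper names and WIDTH are assumptions.
WIDTH = 8


def cttz(x: int) -> int:
    """Count trailing zeros of a WIDTH-bit value (WIDTH for x == 0)."""
    if x == 0:
        return WIDTH
    n = 0
    while x & 1 == 0:
        x >>= 1
        n += 1
    return n


def ctlz(x: int) -> int:
    """Count leading zeros of a WIDTH-bit value (WIDTH for x == 0)."""
    return WIDTH - x.bit_length()


def cttz_eq_via_mask(x: int, n: int) -> bool:
    """cttz(x) == n expressed as (x & low_mask) == (1 << n)."""
    if n == WIDTH:
        return x == 0
    low_mask = (1 << (n + 1)) - 1
    return (x & low_mask) == (1 << n)


def ctlz_eq_via_mask(x: int, n: int) -> bool:
    """ctlz(x) == n expressed as (x & high_mask) == the single expected bit."""
    if n == WIDTH:
        return x == 0
    bit = 1 << (WIDTH - 1 - n)
    high_mask = ((1 << (n + 1)) - 1) << (WIDTH - 1 - n)
    return (x & high_mask) == bit


# Exhaustive comparison over all 8-bit values and all possible counts.
for x in range(1 << WIDTH):
    for n in range(WIDTH + 1):
        assert (cttz(x) == n) == cttz_eq_via_mask(x, n)
        assert (ctlz(x) == n) == ctlz_eq_via_mask(x, n)
```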