teak-llvm

mirror of https://github.com/Gericom/teak-llvm.git synced 2025-06-21 04:25:45 -04:00

Author	SHA1	Message	Date
Erich Keane	3efe00206f	Implement cpu_dispatch/cpu_specific Multiversioning As documented here: https://software.intel.com/en-us/node/682969 and https://software.intel.com/en-us/node/523346. cpu_dispatch multiversioning is an ICC feature that provides for function multiversioning. This feature is implemented with two attributes: First, cpu_specific, which specifies the individual function versions. Second, cpu_dispatch, which specifies the location of the resolver function and the list of resolvable functions. This is valuable since it provides a mechanism where the resolver's TU can be specified in one location, and the individual implementions each in their own translation units. The goal of this patch is to be source-compatible with ICC, so this implementation diverges from the ICC implementation in a few ways: 1- Linux x86/64 only: This implementation uses ifuncs in order to properly dispatch functions. This is is a valuable performance benefit over the ICC implementation. A future patch will be provided to enable this feature on Windows, but it will obviously more closely fit ICC's implementation. 2- CPU Identification functions: ICC uses a set of custom functions to identify the feature list of the host processor. This patch uses the cpu_supports functionality in order to better align with 'target' multiversioning. 1- cpu_dispatch function def/decl: ICC's cpu_dispatch requires that the function marked cpu_dispatch be an empty definition. This patch supports that as well, however declarations are also permitted, since the linker will solve the issue of multiple emissions. Differential Revision: https://reviews.llvm.org/D47474 llvm-svn: 337552	2018-07-20 14:13:28 +00:00
Erich Keane	be65e874fe	[NFC] Switch CodeGenFunction to use value init instead of member init lists The member init list for the sole constructor for CodeGenFunction has gotten out of hand, so this patch moves the non-parameter-dependent initializations into the member value inits. Note: This is what was intended to be committed in r336726 llvm-svn: 336729	2018-07-10 21:07:50 +00:00
Erich Keane	9960b8f13a	Revert -r336726, which included more files than intended. llvm-svn: 336727	2018-07-10 20:51:41 +00:00
Erich Keane	7b8c12e7cc	[NFC] Switch CodeGenFunction to use value init instead of member init lists The member init list for the sole constructor for CodeGenFunction has gotten out of hand, so this patch moves the non-parameter-dependent initializations into the member value inits. llvm-svn: 336726	2018-07-10 20:46:46 +00:00
Craig Topper	74c10e3236	[Builtins][Attributes][X86] Tag all X86 builtins with their required vector width. Add a min_vector_width function attribute and tag all x86 instrinsics with it This is part of an ongoing attempt at making 512 bit vectors illegal in the X86 backend type legalizer due to CPU frequency penalties associated with wide vectors on Skylake Server CPUs. We want the loop vectorizer to be able to emit IR containing wide vectors as intermediate operations in vectorized code and allow these wide vectors to be legalized to 256 bits by the X86 backend even though we are targetting a CPU that supports 512 bit vectors. This is similar to what happens with an AVX2 CPU, the vectorizer can emit wide vectors and the backend will split them. We want this splitting behavior, but still be able to use new Skylake instructions that work on 256-bit vectors and support things like masking and gather/scatter. Of course if the user uses explicit vector code in their source code we need to not split those operations. Especially if they have used any of the 512-bit vector intrinsics from immintrin.h. And we need to make it so that merely using the intrinsics produces the expected code in order to be backwards compatible. To support this goal, this patch adds a new IR function attribute "min-legal-vector-width" that can indicate the need for a minimum vector width to be legal in the backend. We need to ensure this attribute is set to the largest vector width needed by any intrinsics from immintrin.h that the function uses. The inliner will be reponsible for merging this attribute when a function is inlined. We may also need a way to limit inlining in the future as well, but we can discuss that in the future. To make things more complicated, there are two different ways intrinsics are implemented in immintrin.h. Either as an always_inline function containing calls to builtins(can be target specific or target independent) or vector extension code. Or as a macro wrapper around a taget specific builtin. I believe I've removed all cases where the macro was around a target independent builtin. To support the always_inline function case this patch adds attribute((min_vector_width(128))) that can be used to tag these functions with their vector width. All x86 intrinsic functions that operate on vectors have been tagged with this attribute. To support the macro case, all x86 specific builtins have also been tagged with the vector width that they require. Use of any builtin with this property will implicitly increase the min_vector_width of the function that calls it. I've done this as a new property in the attribute string for the builtin rather than basing it on the type string so that we can opt into it on a per builtin basis and avoid any impact to target independent builtins. There will be future work to support vectors passed as function arguments and supporting inline assembly. And whatever else we can find that isn't covered by this patch. Special thanks to Chandler who suggested this direction and reviewed a preview version of this patch. And thanks to Eric Christopher who has had many conversations with me about this issue. Differential Revision: https://reviews.llvm.org/D48617 llvm-svn: 336583	2018-07-09 19:00:16 +00:00
Peter Collingbourne	e44acadf6a	Implement CFI for indirect calls via a member function pointer. Similarly to CFI on virtual and indirect calls, this implementation tries to use program type information to make the checks as precise as possible. The basic way that it works is as follows, where `C` is the name of the class being defined or the target of a call and the function type is assumed to be `void()`. For virtual calls: - Attach type metadata to the addresses of function pointers in vtables (not the functions themselves) of type `void (B::)()` for each `B` that is a recursive dynamic base class of `C`, including `C` itself. This type metadata has an annotation that the type is for virtual calls (to distinguish it from the non-virtual case). - At the call site, check that the computed address of the function pointer in the vtable has type `void (C::)()`. For non-virtual calls: - Attach type metadata to each non-virtual member function whose address can be taken with a member function pointer. The type of a function in class `C` of type `void()` is each of the types `void (B::)()` where `B` is a most-base class of `C`. A most-base class of `C` is defined as a recursive base class of `C`, including `C` itself, that does not have any bases. - At the call site, check that the function pointer has one of the types `void (B::)()` where `B` is a most-base class of `C`. Differential Revision: https://reviews.llvm.org/D47567 llvm-svn: 335569	2018-06-26 02:15:47 +00:00
Igor Kudrin	eff8f9d178	[CodeGen] Provide source locations for UBSan type checks when emitting constructor calls. Differential Revision: https://reviews.llvm.org/D48531 llvm-svn: 335445	2018-06-25 05:48:04 +00:00
Yaxun Liu	aefdb8ed34	[NFC] Add CreateMemTempWithoutCast and CreateTempAllocaWithoutCast This is partial re-commit of r332982 llvm-svn: 334837	2018-06-15 15:33:22 +00:00
Heejin Ahn	c647919933	[WebAssembly] Use Windows EH instructions for Wasm EH Summary: Because wasm control flow needs to be structured, using WinEH instructions to support wasm EH brings several benefits. This patch makes wasm EH uses Windows EH instructions, with some changes: 1. Because wasm uses a single catch block to catch all C++ exceptions, this merges all catch clauses into a single catchpad, within which we test the EH selector as in Itanium EH. 2. Generates a call to `__clang_call_terminate` in case a cleanup throws. Wasm does not have a runtime to handle this. 3. In case there is no catch-all clause, inserts a call to `__cxa_rethrow` at the end of a catchpad in order to unwind to an enclosing EH scope. Reviewers: majnemer, dschuff Subscribers: jfb, sbc100, jgravelle-google, sunfish, cfe-commits Differential Revision: https://reviews.llvm.org/D44931 llvm-svn: 333703	2018-05-31 22:18:13 +00:00
Simon Tatham	89e31fa7fc	Support __iso_volatile_load8 etc on aarch64-win32. These intrinsics are used by MSVC's header files on AArch64 Windows as well as AArch32, so we should support them for both targets. I've factored them out of CodeGenFunction::EmitARMBuiltinExpr into separate functions that EmitAArch64BuiltinExpr can call as well. Reviewers: javed.absar, mstorsjo Reviewed By: mstorsjo Subscribers: kristof.beyls, cfe-commits Differential Revision: https://reviews.llvm.org/D47476 llvm-svn: 333513	2018-05-30 07:54:05 +00:00
Yaxun Liu	00ddbed298	Revert r332982 Call CreateTempMemWithoutCast for ActiveFlag Due to regression on arm. llvm-svn: 332991	2018-05-22 16:13:07 +00:00
Yaxun Liu	8a60e5db70	Call CreateTempMemWithoutCast for ActiveFlag Introduced CreateMemTempWithoutCast and CreateTemporaryAllocaWithoutCast to emit alloca without casting to default addr space. ActiveFlag is a temporary variable emitted for clean up. It is defined as AllocaInst* type and there is a cast to AlllocaInst in SetActiveFlag. An alloca casted to generic pointer causes assertion in SetActiveFlag. Since there is only load/store of ActiveFlag, it is safe to use the original alloca, therefore use CreateMemTempWithoutCast is called. Differential Revision: https://reviews.llvm.org/D47099 llvm-svn: 332982	2018-05-22 14:36:26 +00:00
Yaxun Liu	a2a9cfab83	CodeGen: Fix invalid bitcast for lifetime.start/end lifetime.start/end expects pointer argument in alloca address space. However in C++ a temporary variable is in default address space. This patch changes API CreateMemTemp and CreateTempAlloca to get the original alloca instruction and pass it lifetime.start/end. It only affects targets with non-zero alloca address space. Differential Revision: https://reviews.llvm.org/D45900 llvm-svn: 332593	2018-05-17 11:16:35 +00:00
Adrian Prantl	9fc8faf9e6	Remove \brief commands from doxygen comments. This is similar to the LLVM change https://reviews.llvm.org/D46290. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46320 llvm-svn: 331834	2018-05-09 01:00:01 +00:00
Akira Hatanaka	ccda3d2970	[CodeGen] Avoid destructing a callee-destructued struct type in a function if a function delegates to another function. Fix a bug introduced in r328731, which caused a struct with ObjC __weak fields that was passed to a function to be destructed twice, once in the callee function and once in another function the callee function delegates to. To prevent this, keep track of the callee-destructed structs passed to a function and disable their cleanups at the point of the call to the delegated function. This reapplies r331016, which was reverted in r331019 because it caused an assertion to fail in EmitDelegateCallArg on a windows bot. I made changes to EmitDelegateCallArg so that it doesn't try to deactivate cleanups for structs that have trivial destructors (cleanups for those structs are never pushed to the cleanup stack in EmitParmDecl). rdar://problem/39194693 Differential Revision: https://reviews.llvm.org/D45382 llvm-svn: 331020	2018-04-27 06:57:00 +00:00
Akira Hatanaka	b4f3637cec	Revert "[CodeGen] Avoid destructing a callee-destructued struct type in a" This reverts commit r331016, which broke a windows bot. http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/11727 llvm-svn: 331019	2018-04-27 05:56:55 +00:00
Akira Hatanaka	e712374496	[CodeGen] Avoid destructing a callee-destructued struct type in a function if a function delegates to another function. Fix a bug introduced in r328731, which caused a struct with ObjC __weak fields that was passed to a function to be destructed twice, once in the callee function and once in another function the callee function delegates to. To prevent this, keep track of the callee-destructed structs passed to a function and disable their cleanups at the point of the call to the delegated function. rdar://problem/39194693 Differential Revision: https://reviews.llvm.org/D45382 llvm-svn: 331016	2018-04-27 04:21:51 +00:00
Keith Wyss	f437e35671	[XRay] Add clang builtin for xray typed events. Summary: A clang builtin for xray typed events. Differs from __xray_customevent(...) by the presence of a type tag that is vended by compiler-rt in typical usage. This allows xray handlers to expand logged events with their type description and plugins to process traced events based on type. This change depends on D45633 for the intrinsic definition. Reviewers: dberris, pelikan, rnk, eizan Subscribers: cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D45716 llvm-svn: 330220	2018-04-17 21:32:43 +00:00
Akira Hatanaka	617e26152d	Add a command line option 'fregister_global_dtors_with_atexit' to register destructor functions annotated with __attribute__((destructor)) using __cxa_atexit or atexit. Register destructor functions annotated with __attribute__((destructor)) calling __cxa_atexit in a synthesized constructor function instead of emitting references to the functions in a special section. The primary reason for adding this option is that we are planning to deprecate the __mod_term_funcs section on Darwin in the future. This feature is enabled by default only on Darwin. Users who do not want this can use command line option 'fno_register_global_dtors_with_atexit' to disable it. rdar://problem/33887655 Differential Revision: https://reviews.llvm.org/D45578 llvm-svn: 330199	2018-04-17 18:41:52 +00:00
Alexey Bataev	ddf3db9b5e	[OPENMP] Code cleanup + formatting, NFC. llvm-svn: 330040	2018-04-13 17:31:06 +00:00
Alexander Kornienko	2a8c18d991	Fix typos in clang Found via codespell -q 3 -I ../clang-whitelist.txt Where whitelist consists of: archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.) Differential revision: https://reviews.llvm.org/D44188 llvm-svn: 329399	2018-04-06 15:14:32 +00:00
Richard Smith	e78fac5126	PR36992: do not store beyond the dsize of a class object unless we know the tail padding is not reused. We track on the AggValueSlot (and through a couple of other initialization actions) whether we're dealing with an object that might share its tail padding with some other object, so that we can avoid emitting stores into the tail padding if that's the case. We still widen stores into tail padding when we can do so. Differential Revision: https://reviews.llvm.org/D45306 llvm-svn: 329342	2018-04-05 20:52:58 +00:00
Reid Kleckner	399d96e39c	[MS] Emit vftable thunks for functions with incomplete prototypes Summary: The following class hierarchy requires that we be able to emit a this-adjusting thunk for B::foo in C's vftable: struct Incomplete; struct A { virtual A* foo(Incomplete p) = 0; }; struct B : virtual A { void foo(Incomplete p) override; }; struct C : B { int c; }; This TU is valid, but lacks a definition of 'Incomplete', which makes it hard to build a thunk for the final overrider, B::foo. Before this change, Clang gives up attempting to emit the thunk, because it assumes that if the parameter types are incomplete, it must be emitting the thunk for optimization purposes. This is untrue for the MS ABI, where the implementation of B::foo has no idea what thunks C's vftable may require. Clang needs to emit the thunk without necessarily having access to the complete prototype of foo. This change makes Clang emit a musttail variadic call when it needs such a thunk. I call these "unprototyped" thunks, because they only prototype the "this" parameter, which must always come first in the MS C++ ABI. These thunks work, but they create ugly LLVM IR. If the call to the thunk is devirtualized, it will be a call to a bitcast of a function pointer. Today, LLVM cannot inline through such a call, but I want to address that soon, because we also use this pattern for virtual member pointer thunks. This change also implements an old FIXME in the code about reusing the thunk's computed CGFunctionInfo as much as possible. Now we don't end up computing the thunk's mangled name and arranging it's prototype up to around three times. Fixes PR25641 Reviewers: rjmccall, rsmith, hans Subscribers: Prazek, cfe-commits Differential Revision: https://reviews.llvm.org/D45112 llvm-svn: 329009	2018-04-02 20:20:33 +00:00
Eric Fiselier	fa752f23cc	[Builtins] Overload __builtin_operator_new/delete to allow forwarding to usual allocation/deallocation functions. Summary: Libc++'s default allocator uses `__builtin_operator_new` and `__builtin_operator_delete` in order to allow the calls to new/delete to be ellided. However, libc++ now needs to support over-aligned types in the default allocator. In order to support this without disabling the existing optimization Clang needs to support calling the aligned new overloads from the builtins. See llvm.org/PR22634 for more information about the libc++ bug. This patch changes `__builtin_operator_new`/`__builtin_operator_delete` to call any usual `operator new`/`operator delete` function. It does this by performing overload resolution with the arguments passed to the builtin to determine which allocation function to call. If the selected function is not a usual allocation function a diagnostic is issued. One open issue is if the `align_val_t` overloads should be considered "usual" when `LangOpts::AlignedAllocation` is disabled. In order to allow libc++ to detect this new behavior the value for `__has_builtin(__builtin_operator_new)` has been updated to `201802`. Reviewers: rsmith, majnemer, aaron.ballman, erik.pilkington, bogner, ahatanak Reviewed By: rsmith Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D43047 llvm-svn: 328134	2018-03-21 19:19:48 +00:00
Akira Hatanaka	797afe3a4e	[CodeGen] Ignore OpaqueValueExprs that are unique references to their source expressions when iterating over a PseudoObjectExpr's semantic subexpression list. Previously the loop in emitPseudoObjectExpr would emit the IR for each OpaqueValueExpr that was in a PseudoObjectExpr's semantic-form expression list and use the result when the OpaqueValueExpr later appeared in other expressions. This caused an assertion failure when AggExprEmitter tried to copy the result of an OpaqueValueExpr and the copied type didn't have trivial copy/move constructors or assignment operators. This patch adds flag IsUnique to OpaqueValueExpr which indicates it is a unique reference to its source expression (it is not used in multiple places). The loop in emitPseudoObjectExpr ignores OpaqueValueExprs that are unique and CodeGen visitors simply traverse the source expressions of such OpaqueValueExprs. rdar://problem/34363596 Differential Revision: https://reviews.llvm.org/D39562 llvm-svn: 327939	2018-03-20 01:47:58 +00:00
Akira Hatanaka	d791e92b5f	[ObjC] Allow declaring __weak pointer fields in C structs in ARC. This patch uses the infrastructure added in r326307 for enabling non-trivial fields to be declared in C structs to allow __weak fields in C structs in ARC. This recommits r327206, which was reverted because it caused module-enabled builders to fail. I discovered that the CXXRecordDecl::CanPassInRegisters flag wasn't being set correctly in some cases after I moved it to RecordDecl. Thanks to Eric Liu for helping me investigate the bug. rdar://problem/33599681 https://reviews.llvm.org/D44095 llvm-svn: 327870	2018-03-19 17:38:40 +00:00
Yaxun Liu	5b330e8d61	Recommit r326946 after reducing CallArgList memory footprint llvm-svn: 327634	2018-03-15 15:25:19 +00:00
Sjoerd Meijer	95da875898	This reverts "r327189 - [ARM] Add ARMv8.2-A FP16 vector intrinsic" This is causing problems in testing, and PR36683 was raised. Reverting it until we have sorted out how to pass f16 vectors. llvm-svn: 327437	2018-03-13 19:38:56 +00:00
Akira Hatanaka	be7daa3d50	Revert "[ObjC] Allow declaring __weak pointer fields in C structs in ARC." This reverts commit r327206 as there were test failures caused by this patch. http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20180312/221427.html llvm-svn: 327294	2018-03-12 17:05:06 +00:00
Akira Hatanaka	c181b127c0	[ObjC] Allow declaring __weak pointer fields in C structs in ARC. This patch uses the infrastructure added in r326307 for enabling non-trivial fields to be declared in C structs to allow __weak fields in C structs in ARC. rdar://problem/33599681 Differential Revision: https://reviews.llvm.org/D44095 llvm-svn: 327206	2018-03-10 06:36:08 +00:00
Richard Smith	007cb6df58	Revert r326946. It caused stack overflows by significantly increasing the size of a CallArgList. llvm-svn: 327195	2018-03-10 01:47:22 +00:00
Abderrazek Zaafrani	5bd68cf742	[ARM] Add ARMv8.2-A FP16 vector intrinsic Add the fp16 neon vector intrinsic for ARM as described in the ARM ACLE document. Reviews in https://reviews.llvm.org/D43650 llvm-svn: 327189	2018-03-09 23:39:34 +00:00
Yaxun Liu	06dd81149f	CodeGen: Fix address space of indirect function argument The indirect function argument is in alloca address space in LLVM IR. However, during Clang codegen for C++, the address space of indirect function argument should match its address space in the source code, i.e., default addr space, even for indirect argument. This is because destructor of the indirect argument may be called in the caller function, and address of the indirect argument may be taken, in either case the indirect function argument is expected to be in default addr space, not the alloca address space. Therefore, the indirect function argument should be mapped to the temp var casted to default address space. The caller will cast it to alloca addr space when passing it to the callee. In the callee, the argument is also casted to the default address space and used. CallArg is refactored to facilitate this fix. Differential Revision: https://reviews.llvm.org/D34367 llvm-svn: 326946	2018-03-07 21:45:40 +00:00
Alexey Bataev	ab4ea225fe	[OPENMP] Fix lifetime of the loop counters. We may emit incorrect lifetime info during codegen for loop counters in OpenMP constructs because of automatic scope cleanup when we needed temporarily locations for private loop counters. llvm-svn: 326922	2018-03-07 18:17:06 +00:00
Akira Hatanaka	7275da0f2e	[ObjC] Allow declaring __strong pointer fields in structs in Objective-C ARC mode. Declaring __strong pointer fields in structs was not allowed in Objective-C ARC until now because that would make the struct non-trivial to default-initialize, copy/move, and destroy, which is not something C was designed to do. This patch lifts that restriction. Special functions for non-trivial C structs are synthesized that are needed to default-initialize, copy/move, and destroy the structs and manage the ownership of the objects the __strong pointer fields point to. Non-trivial structs passed to functions are destructed in the callee function. rdar://problem/33599681 Differential Revision: https://reviews.llvm.org/D41228 llvm-svn: 326307	2018-02-28 07:15:55 +00:00
Yaxun Liu	fa13d015a3	[OpenCL] Fix __enqueue_block for block with captures The following test case causes issue with codegen of __enqueue_block void (^block)(void) = ^{ callee(id, out); }; enqueue_kernel(queue, 0, ndrange, block); Clang first does codegen for block expression in the first line and deletes its block info. Clang then tries to do codegen for the same block expression again for the second line, and fails because the block info is gone. The fix is to do normal codegen for both lines. Introduce an API to OpenCL runtime to record llvm block invoke function and llvm block literal emitted for each AST block expression, and use the recorded information for generating the wrapper kernel. The EmitBlockLiteral APIs are cleaned up to minimize changes to the normal codegen of blocks. Another minor issue is that some clean up AST expression is generated for block with captures, which can be stripped by IgnoreImplicit. Differential Revision: https://reviews.llvm.org/D43240 llvm-svn: 325264	2018-02-15 16:39:19 +00:00
Reid Kleckner	b75a3f04ec	[WinEH] Put funclet bundles on inline asm calls Summary: Fixes PR36247, which is where WinEHPrepare replaces inline asm in funclets with unreachable. Make getBundlesForFunclet return by value to simplify some call sites. Reviewers: smeenai, majnemer Subscribers: eraman, cfe-commits Differential Revision: https://reviews.llvm.org/D43033 llvm-svn: 324689	2018-02-09 00:16:41 +00:00
Sander de Smalen	891af03a55	Recommit rL323952: [DebugInfo] Enable debug information for C99 VLA types. Fixed build issue when building with g++-4.8 (specialization after instantiation). llvm-svn: 324173	2018-02-03 13:55:59 +00:00
Eric Fiselier	88df555d05	Emit label names according to -discard-value-names. Summary: Previously, Clang only emitted label names in assert builds. However there is a CC1 option -discard-value-names that should have been used to control emission instead. This patch removes the NDEBUG preprocessor block and instead allows LLVM to handle removing the names in accordance with the option. Reviewers: erichkeane, aaron.ballman, majnemer Reviewed By: aaron.ballman Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D42829 llvm-svn: 324127	2018-02-02 19:58:34 +00:00
Sander de Smalen	4e9a1264dd	Reverting patch rL323952 due to build errors that I haven't encountered in local builds. llvm-svn: 323956	2018-02-01 12:27:13 +00:00
Sander de Smalen	17c4633e7f	[DebugInfo] Enable debug information for C99 VLA types Summary: This patch enables debugging of C99 VLA types by generating more precise LLVM Debug metadata, using the extended DISubrange 'count' field that takes a DIVariable. This should implement: Bug 30553: Debug info generated for arrays is not what GDB expects (not as good as GCC's) https://bugs.llvm.org/show_bug.cgi?id=30553 Reviewers: echristo, aprantl, dexonsmith, clayborg, pcc, kristof.beyls, dblaikie Reviewed By: aprantl Subscribers: jholewinski, schweitz, davide, fhahn, JDevlieghere, cfe-commits Differential Revision: https://reviews.llvm.org/D41698 llvm-svn: 323952	2018-02-01 11:25:10 +00:00
Ivan A. Kosarev	1860b520a2	[CodeGen] Decorate aggregate accesses with TBAA tags Differential Revision: https://reviews.llvm.org/D41539 llvm-svn: 323421	2018-01-25 14:21:55 +00:00
Alexey Bataev	647dd84422	[OPENMP] Initial codegen for `target teams distribute parallel for simd`. Added host codegen + codegen for devices with default codegen for `#pragma omp target teams distribute parallel for simd` directive. llvm-svn: 322515	2018-01-15 20:59:40 +00:00
John McCall	5cdf038374	Allocate and access NormalCleanupDest with the natural alignment of i32. This alignment can be less than 4 on certain embedded targets, which may not even be able to deal with 4-byte alignment on the stack. Patch by Jacob Young! llvm-svn: 322406	2018-01-12 22:07:01 +00:00
Alexey Bataev	475a7440f1	[OPENMP] Replace calls of getAssociatedStmt(). getAssociatedStmt() returns the outermost captured statement for the OpenMP directive. It may return incorrect region in case of combined constructs. Reworked the code to reduce the number of calls of getAssociatedStmt() and used getInnermostCapturedStmt() and getCapturedStmt() functions instead. In case of firstprivate variables it may lead to an extra allocas generation for private copies even if the variable is passed by value into outlined function and could be used directly as private copy. llvm-svn: 322393	2018-01-12 19:39:11 +00:00
Erich Keane	281d20b601	Implement Attribute Target MultiVersioning GCC's attribute 'target', in addition to being an optimization hint, also allows function multiversioning. We currently have the former implemented, this is the latter's implementation. This works by enabling functions with the same name/signature to coexist, so that they can all be emitted. Multiversion state is stored in the FunctionDecl itself, and SemaDecl manages the definitions. Note that it ends up having to permit redefinition of functions so that they can all be emitted. Additionally, all versions of the function must be emitted, so this also manages that. Note that this includes some additional rules that GCC does not, since defining something as a MultiVersion function after a usage has been made illegal. The only 'history rewriting' that happens is if a function is emitted before it has been converted to a multiversion'ed function, at which point its name needs to be changed. Function templates and virtual functions are NOT yet supported (not supported in GCC either). Additionally, constructors/destructors are disallowed, but the former is planned. llvm-svn: 322028	2018-01-08 21:34:17 +00:00
Carlo Bertolli	52978c3554	[OpenMP] Initial implementation of code generation for pragma 'target teams distribute parallel for' on host https://reviews.llvm.org/D41709 This patch includes code generation and testing for offloading when target device is host. llvm-svn: 321759	2018-01-03 21:12:44 +00:00
Reid Kleckner	06f19a0de0	[WinEH] Allow for multiple terminatepads Fixes verifier errors with Windows EH and OpenMP, which injects a terminate scope around parallel blocks. Fixes PR35778 llvm-svn: 321676	2018-01-02 21:34:16 +00:00
Alexey Bataev	a8a9153a37	[OPENMP] Support for -fopenmp-simd option with compilation of simd loops only. Added support for -fopenmp-simd option that allows compilation of simd-based constructs without emission of OpenMP runtime calls. llvm-svn: 321560	2017-12-29 18:07:07 +00:00
Stephan Bergmann	d71ad177eb	-fsanitize=vptr warnings on bad static types in dynamic_cast and typeid ...when such an operation is done on an object during con-/destruction. This is the cfe part of a patch covering both cfe and compiler-rt. Differential Revision: https://reviews.llvm.org/D40295 llvm-svn: 321519	2017-12-28 12:45:41 +00:00
Alexey Bataev	d2202caeda	[OPENMP] Support for `depend` clauses on `target data update`. Added codegen for `depend` clauses on `target data update` directives. llvm-svn: 321493	2017-12-27 17:58:32 +00:00
Abderrazek Zaafrani	abb890b7be	[AArch64] Enable fp16 data type for the Builtin for AArch64 only. Differential Revision: https:://reviews.llvm.org/D41360 llvm-svn: 321301	2017-12-21 20:10:03 +00:00
Vedant Kumar	09b5bfdd85	[ubsan] Diagnose noreturn functions which return Diagnose 'unreachable' UB when a noreturn function returns. 1. Insert a check at the end of functions marked noreturn. 2. A decl may be marked noreturn in the caller TU, but not marked in the TU where it's defined. To diagnose this scenario, strip away the noreturn attribute on the callee and insert check after calls to it. Testing: check-clang, check-ubsan, check-ubsan-minimal, D40700 rdar://33660464 Differential Revision: https://reviews.llvm.org/D40698 llvm-svn: 321231	2017-12-21 00:10:25 +00:00
Krzysztof Parzyszek	5a6558382c	[Hexagon] Intrinsic support for V62 and V65 llvm-svn: 320609	2017-12-13 19:56:03 +00:00
Alexey Bataev	fbe17fb8a5	[OPENMP] Initial codegen for `target teams distribute simd` directive. Host + generic device codegen for `target teams distribute simd` directive. llvm-svn: 320608	2017-12-13 19:45:06 +00:00
Alexey Bataev	dfa430f694	[OPENMP] Initial codegen for `target teams distribute` directive. Host + default devices codegen for `target teams distribute` directive. llvm-svn: 320149	2017-12-08 15:03:50 +00:00
Vedant Kumar	36347d917f	[ubsan] Use pass_object_size info in bounds checks Teach UBSan's bounds check to opportunistically use pass_object_size information to check array accesses. rdar://33272922 llvm-svn: 320128	2017-12-08 01:51:47 +00:00
Dean Michael Berris	1a5b10d5b4	[XRay][clang] Introduce -fxray-always-emit-customevents Summary: The -fxray-always-emit-customevents flag instructs clang to always emit the LLVM IR for calls to the `__xray_customevent(...)` built-in function. The default behaviour currently respects whether the function has an `[[clang::xray_never_instrument]]` attribute, and thus not lower the appropriate IR code for the custom event built-in. This change allows users calling through to the `__xray_customevent(...)` built-in to always see those calls lowered to the corresponding LLVM IR to lay down instrumentation points for these custom event calls. Using this flag enables us to emit even just the user-provided custom events even while never instrumenting the start/end of the function where they appear. This is useful in cases where "phase markers" using __xray_customevent(...) can have very few instructions, must never be instrumented when entered/exited. Reviewers: rnk, dblaikie, kpw Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D40601 llvm-svn: 319388	2017-11-30 00:04:54 +00:00
Alexey Bataev	f836537516	[OPENMP] Codegen for `target simd` construct. Added codegen support for `target simd` directive. llvm-svn: 318536	2017-11-17 17:57:25 +00:00
Alexey Bataev	2139ed638b	[OPENMP] Add support for cancelling inside target parallel for directive. Added missed support for cancelling of target parallel for construct. llvm-svn: 318434	2017-11-16 18:20:21 +00:00
Hans Wennborg	76c26c1dca	Switch -mcount and -finstrument-functions to emit EnterExitInstrumenter attributes This updates -mcount to use the new attribute names (LLVM r318195), and switches over -finstrument-functions to also use these attributes rather than inserting instrumentation in the frontend. It also adds a new flag, -finstrument-functions-after-inlining, which makes the cygprofile instrumentation get inserted after inlining rather than before. Differential Revision: https://reviews.llvm.org/D39331 llvm-svn: 318199	2017-11-14 21:13:27 +00:00
Gor Nishanov	04491bd8f3	[coroutines] Promote cleanup.dest.slot allocas to registers to avoid storing it in the coroutine frame Summary: We don't want to store cleanup dest slot saved into the coroutine frame (as some of the cleanup code may access them after coroutine frame destroyed). This is an alternative to https://reviews.llvm.org/D37093 It is possible to do this for all functions, but, cursory check showed that in -O0, we get slightly longer function (by 1-3 instructions), thus, we are only limiting cleanup.dest.slot elimination to coroutines. Reviewers: rjmccall, hfinkel, eric_niebler Reviewed By: eric_niebler Subscribers: EricWF, cfe-commits Differential Revision: https://reviews.llvm.org/D39768 llvm-svn: 317981	2017-11-11 17:00:43 +00:00
Hans Wennborg	7bf8201009	Remove declaration of EmitMCountInstrumentation(). NFC The definition was removed in r280355. llvm-svn: 317944	2017-11-10 22:34:23 +00:00
Alexey Bataev	5d7edca316	[OPENMP] Codegen for `#pragma omp target parallel for simd`. Added codegen for `#pragma omp target parallel for simd` and clauses. llvm-svn: 317813	2017-11-09 17:32:15 +00:00
Alexey Bataev	fb0ebecf0e	[OPENMP] Codegen for `#pragma omp target parallel for`. llvm-svn: 317719	2017-11-08 20:16:14 +00:00
Ivan A. Kosarev	b9c59f36fc	[CodeGen] Propagate may-alias'ness of lvalues with TBAA info This patch fixes various places in clang to propagate may-alias TBAA access descriptors during construction of lvalues, thus eliminating the need for the LValueBaseInfo::MayAlias flag. This is part of D38126 reworked to be a separate patch to simplify review. Differential Revision: https://reviews.llvm.org/D39008 llvm-svn: 316988	2017-10-31 11:05:34 +00:00
Ivan A. Kosarev	9f9d157517	[CodeGen] Generate TBAA info for reference loads Differential Revision: https://reviews.llvm.org/D39177 llvm-svn: 316896	2017-10-30 11:49:31 +00:00
Ivan A. Kosarev	d17f12a35d	[CodeGen] Pass TBAA info along with lvalue base info everywhere This patch addresses the rest of the cases where we pass lvalue base info, but do not provide corresponding TBAA info. This patch should not bring in any functional changes. This is part of D38126 reworked to be a separate patch to make reviewing easier. Differential Revision: https://reviews.llvm.org/D38945 llvm-svn: 315986	2017-10-17 10:17:43 +00:00
Ivan A. Kosarev	ed141bab63	[CodeGen] EmitPointerWithAlignment() to generate TBAA info along with LValue base info Differential Revision: https://reviews.llvm.org/D38796 llvm-svn: 315984	2017-10-17 09:12:13 +00:00
Yaxun Liu	c2a87a05f1	[OpenCL] Emit enqueued block as kernel In OpenCL the kernel function and non-kernel function has different calling conventions. For certain targets they have different argument ABIs. Also kernels have special function attributes and metadata for runtime to launch them. The blocks passed to enqueue_kernel is supposed to be executed as kernels. As such, the block invoke function should be emitted as kernel with proper calling convention and argument ABI. This patch emits enqueued block as kernel. If a block is both called directly and passed to enqueue_kernel, separate functions will be generated. Differential Revision: https://reviews.llvm.org/D38134 llvm-svn: 315804	2017-10-14 12:23:50 +00:00
Ivan A. Kosarev	ce601eedf6	Revert "[CodeGen] EmitPointerWithAlignment() to generate TBAA info along with LValue base info", r315731. With this change we fail on the clang-x86_64-linux-selfhost-modules builder. Differential Revision: https://reviews.llvm.org/D38796 llvm-svn: 315739	2017-10-13 19:55:01 +00:00
Ivan A. Kosarev	0e528202b8	[CodeGen] EmitPointerWithAlignment() to generate TBAA info along with LValue base info Differential Revision: https://reviews.llvm.org/D38796 llvm-svn: 315731	2017-10-13 18:40:18 +00:00
Ivan A. Kosarev	78f486d136	[CodeGen] getNaturalTypeAlignment() to generate TBAA info along with LValue base info This patch should not bring in any functional changes. Differential Revision: https://reviews.llvm.org/D38794 llvm-svn: 315708	2017-10-13 16:58:30 +00:00
Ivan A. Kosarev	1590fd3aa8	[CodeGen] EmitLoadOfReference() to generate TBAA info along with LValue base info This patch should not bring in any functional changes. Differential Revision: https://reviews.llvm.org/D38793 llvm-svn: 315705	2017-10-13 16:50:50 +00:00
Ivan A. Kosarev	9029564e8c	[CodeGen] EmitLoadOfPointer() to generate TBAA info along with LValue base info This patch should not bring in any functional changes. Differential Revision: https://reviews.llvm.org/D38791 llvm-svn: 315704	2017-10-13 16:47:22 +00:00
Ivan A. Kosarev	229a6d8d17	[CodeGen] EmitCXXMemberDataPointerAddress() to generate TBAA info along with LValue base info This patch should not bring in any functional changes. Differential Revision: https://reviews.llvm.org/D38788 llvm-svn: 315702	2017-10-13 16:38:32 +00:00
Ivan A. Kosarev	f5f204679b	[CodeGen] Generate TBAA info along with LValue base info This patch enables explicit generation of TBAA information in all cases where LValue base info is propagated or constructed in non-trivial ways. Eventually, we will consider each of these cases to make sure the TBAA information is correct and not too conservative. For now, we just fall back to generating TBAA info from the access type. This patch should not bring in any functional changes. This is part of D38126 reworked to be a separate patch to simplify review. Differential Revision: https://reviews.llvm.org/D38733 llvm-svn: 315575	2017-10-12 11:29:46 +00:00
Ivan A. Kosarev	5f8c0ca53d	[CodeGen] Do not construct complete LValue base info in trivial cases Besides obvious code simplification, avoiding explicit creation of LValueBaseInfo objects makes it easier to make TBAA information to be part of such objects. This is part of D38126 reworked to be a separate patch to simplify review. Differential Revision: https://reviews.llvm.org/D38695 llvm-svn: 315289	2017-10-10 09:39:32 +00:00
Erich Keane	1fe643a6d7	Split X86::BI__builtin_cpu_init handling into own function[NFC] The Cpu Init functionality is required for the target attribute, so this patch simply splits it out into its own function, exactly like CpuIs and CpuSupports. llvm-svn: 315075	2017-10-06 16:40:45 +00:00
Ivan A. Kosarev	383890bad4	Refine generation of TBAA information in clang This patch is an attempt to clarify and simplify generation and propagation of TBAA information. The idea is to pack all values that describe a memory access, namely, base type, access type and offset, into a single structure. This is supposed to make further changes, such as adding support for unions and array members, easier to prepare and review. DecorateInstructionWithTBAA() is no more responsible for converting types to tags. These implicit conversions not only complicate reading the code, but also suggest assigning scalar access tags while we generally prefer full-size struct-path tags. TBAAPathTag is replaced with TBAAAccessInfo; the latter is now the type of the keys of the cache map that translates access descriptors to metadata nodes. Fixed a bug with writing to a wrong map in getTBAABaseTypeMetadata() (former getTBAAStructTypeInfo()). We now check for valid base access types every time we dereference a field. The original code only checks the top-level base type. See isValidBaseType() / isTBAAPathStruct() calls. Some entities have been renamed to sound more adequate and less confusing/misleading in presence of path-aware TBAA information. Now we do not lookup twice for the same cache entry in getAccessTagInfo(). Refined relevant comments and descriptions. Differential Revision: https://reviews.llvm.org/D37826 llvm-svn: 315048	2017-10-06 08:17:48 +00:00
Akira Hatanaka	6b103bc18c	[CodeGen] Emit a helper function for __builtin_os_log_format to reduce code size. Currently clang expands a call to __builtin_os_log_format into a long sequence of instructions at the call site, causing code size to increase in some cases. This commit attempts to reduce code size by emitting a helper function that can be shared by calls to __builtin_os_log_format with similar formats and arguments. The helper function has linkonce_odr linkage to enable the linker to merge identical functions across translation units. Attribute 'noinline' is attached to the helper function at -Oz so that the inliner doesn't inline functions that can potentially be merged. This commit also fixes a bug where the generated IR writes past the end of the buffer when "%m" is the last specifier appearing in the format string passed to __builtin_os_log_format. Original patch by Duncan Exon Smith. rdar://problem/34065973 rdar://problem/34196543 Differential Revision: https://reviews.llvm.org/D38606 llvm-svn: 315045	2017-10-06 07:12:46 +00:00
Ivan A. Kosarev	afc074cc41	Revert r314977 "[CodeGen] Unify generation of scalar and struct-path TBAA tags" D37826 has been mistakenly committed where it should be the patch from D38503. Differential Revision: https://reviews.llvm.org/D38503 llvm-svn: 314978	2017-10-05 11:05:43 +00:00
Ivan A. Kosarev	6fa20cfea3	[CodeGen] Unify generation of scalar and struct-path TBAA tags This patch makes it possible to produce access tags in a uniform manner regardless whether the resulting tag will be a scalar or a struct-path one. getAccessTagInfo() now takes care of the actual translation of access descriptors to tags and can handle all kinds of accesses. Facilities that specific to scalar accesses are eliminated. Some more details: * DecorateInstructionWithTBAA() is not responsible for conversion of types to access tags anymore. Instead, it takes an access descriptor (TBAAAccessInfo) and generates corresponding access tag from it. * getTBAAInfoForVTablePtr() reworked to getTBAAVTablePtrAccessInfo() that now returns the virtual-pointer access descriptor and not the virtual-point type metadata. * Added function getTBAAMayAliasAccessInfo() that returns the descriptor for may-alias accesses. * getTBAAStructTagInfo() renamed to getTBAAAccessTagInfo() as now it is the only way to generate access tag by a given access descriptor. It is capable of producing both scalar and struct-path tags, depending on options and availability of the base access type. getTBAAScalarTagInfo() and its cache ScalarTagMetadataCache are eliminated. * Now that we do not need to care about whether the resulting access tag should be a scalar or struct-path one, getTBAAStructTypeInfo() is renamed to getBaseTypeInfo(). * Added function getTBAAAccessInfo() that constructs access descriptor by a given QualType access type. This is part of D37826 reworked to be a separate patch to simplify review. Differential Revision: https://reviews.llvm.org/D38503 llvm-svn: 314977	2017-10-05 10:47:51 +00:00
Ivan A. Kosarev	a511ed7501	[CodeGen] Introduce generic TBAA access descriptors With this patch we implement a concept of TBAA access descriptors that are capable of representing both scalar and struct-path accesses in a generic way. This is part of D37826 reworked to be a separate patch to simplify review. Differential Revision: https://reviews.llvm.org/D38456 llvm-svn: 314780	2017-10-03 10:52:39 +00:00
Vedant Kumar	24792e3ab1	[ubsan] Add helpers to decide when null/vptr checks are required. NFC. llvm-svn: 314750	2017-10-03 01:27:25 +00:00
Ivan A. Kosarev	289574edc0	[CodeGen] Do not refer to complete TBAA info where we actually deal with just TBAA access types This patch fixes misleading names of entities related to getting, setting and generation of TBAA access type descriptors. This is effectively an attempt to provide a review for D37826 by breaking it into smaller pieces. Differential Revision: https://reviews.llvm.org/D38404 llvm-svn: 314657	2017-10-02 09:54:47 +00:00
Alexey Bataev	f47c4b4184	[OPENMP] Generate implicit map\|firstprivate clauses for target-based directives. If the variable is used in the target-based region but is not found in any private\|mapping clause, then generate implicit firstprivate\|map clauses for these implicitly mapped variables. llvm-svn: 314205	2017-09-26 13:47:31 +00:00
Akira Hatanaka	ba0367a708	[CodeGen][ObjC] Build the global block structure before emitting the body of global block invoke functions. This commit fixes an infinite loop in IRGen that occurs when compiling the following code: void FUNC2() { static void (^const block1)(int) = ^(int a){ if (a--) block1(a); }; } This is how IRGen gets stuck in the infinite loop: 1. GenerateBlockFunction is called to emit the body of "block1". 2. GetAddrOfGlobalBlock is called to get the address of "block1". The function calls getAddrOfGlobalBlockIfEmitted to check whether the global block has been emitted. If it hasn't been emitted, it then tries to emit the body of the block function by calling GenerateBlockFunction, which goes back to step 1. This commit prevents the inifinite loop by building the global block in GenerateBlockFunction before emitting the body of the block function. rdar://problem/34541684 Differential Revision: https://reviews.llvm.org/D38118 llvm-svn: 314029	2017-09-22 21:32:06 +00:00
Alexey Bataev	b7f18c3297	[OPENMP] Handle re-declaration of captured variables in CodeGen. If the captured variable has re-declaration we may end up with the situation where the captured variable is the re-declaration while the referenced variable is the canonical declaration (or vice versa). In this case we may generate wrong code. Patch fixes this situation. llvm-svn: 313995	2017-09-22 16:56:13 +00:00
Vedant Kumar	bb5d485cd3	[ubsan] Function Sanitizer: Don't require writable text segments This change will make it possible to use -fsanitize=function on Darwin and possibly on other platforms. It fixes an issue with the way RTTI is stored into function prologue data. On Darwin, addresses stored in prologue data can't require run-time fixups and must be PC-relative. Run-time fixups are undesirable because they necessitate writable text segments, which can lead to security issues. And absolute addresses are undesirable because they break PIE mode. The fix is to create a private global which points to the RTTI, and then to encode a PC-relative reference to the global into prologue data. Differential Revision: https://reviews.llvm.org/D37597 llvm-svn: 313096	2017-09-13 00:04:35 +00:00
Krasimir Georgiev	46dfb7a39d	Updated two annotations for Store.h and CodeGenFunction.h. Summary: 1.Updated annotations for include/clang/StaticAnalyzer/Core/PathSensitive/Store.h, which belong to the old version of clang. 2.Delete annotations for CodeGenFunction::getEvaluationKind() in clang/lib/CodeGen/CodeGenFunction.h, which belong to the old version of clang. Reviewers: bkramer, krasimir, klimek Reviewed By: bkramer Subscribers: MTC Differential Revision: https://reviews.llvm.org/D36330 Contributed by @MTC! llvm-svn: 312790	2017-09-08 13:44:51 +00:00
Karl-Johan Karlsson	33e205a40f	Debug info: Fixed faulty debug locations for attributed statements Summary: As the attributed statements are considered simple statements no stoppoint was generated before emitting attributed do/while/for/range- statement. This lead to faulty debug locations. Reviewers: echristo, aaron.ballman, dblaikie Reviewed By: dblaikie Subscribers: bjope, aprantl, cfe-commits Differential Revision: https://reviews.llvm.org/D37428 llvm-svn: 312623	2017-09-06 08:47:18 +00:00
Erich Keane	9937b134c5	[CodeGen]Refactor CpuSupports/CPUIs Builtin Code Gen to better work with "target" implementation A small set of refactors that'll make it easier for me to implement 'target' support. First, extract the CPUSupports functionality into its own function. THis has the advantage of not wasting time in this builtin to deal with arguments. Second, pulls both CPUSupports and CPUIs implementation into a member-function, so that it can be called from the resolver generation that I'm working on. Third, creates an overload that takes simply the feature/cpu name (rather than extracting it from a callexpr), since that info isn't available later. Note that despite how the 'diff' looks, the EmitX86CPUSupports function simply takes the implementation out of the 'switch'. llvm-svn: 312355	2017-09-01 19:42:45 +00:00
Alex Lorenz	6cc8317c38	[IRGen] Evaluate constant static variables referenced through member expressions C++ allows us to reference static variables through member expressions. Prior to this commit, non-integer static variables that were referenced using a member expression were always emitted using lvalue loads. The old behaviour introduced an inconsistency between regular uses of static variables and member expressions uses. For example, the following program compiled and linked successfully: struct Foo { constexpr static const char name = "foo"; }; int main() { return Foo::name[0] == 'f'; } but this program failed to link because "Foo::name" wasn't found: struct Foo { constexpr static const char name = "foo"; }; int main() { Foo f; return f.name[0] == 'f'; } This commit ensures that constant static variables referenced through member expressions are emitted in the same way as ordinary static variable references. rdar://33942261 Differential Revision: https://reviews.llvm.org/D36876 llvm-svn: 311772	2017-08-25 10:07:00 +00:00
Alexey Bataev	6a71f364f1	[OPENMP] Fix for PR34014: OpenMP 4.5: Target construct in static method of class fails to map class static variable. If the global variable is captured and it has several redeclarations, sometimes it may lead to a compiler crash. Patch fixes this by working only with canonical declarations. llvm-svn: 311479	2017-08-22 17:54:52 +00:00
Alexey Bataev	8c3edfef6b	[OPENMP] Fix for PR28581: OpenMP linear clause - wrong results. If worksharing construct has at least one linear item, an implicit synchronization point must be emitted to avoid possible conflict with the loading/storing values to the original variables. Added implicit barrier if the linear item is found before actual start of the worksharing construct. llvm-svn: 311013	2017-08-16 15:58:46 +00:00
Reid Kleckner	2d3c421f1c	Clean up some lambda conversion operator code, NFC We don't need special handling in CodeGenFunction::GenerateCode for lambda block pointer conversion operators anymore. The conversion operator emission code immediately calls back to the generic EmitFunctionBody. Rename EmitLambdaStaticInvokeFunction to EmitLambdaStaticInvokeBody for better consistency with the other Emit*Body methods. I'm preparing to do something about PR28299, which touches this code. llvm-svn: 310145	2017-08-04 22:38:06 +00:00
Vedant Kumar	10c3102071	[ubsan] Diagnose invalid uses of builtins (clang) On some targets, passing zero to the clz() or ctz() builtins has undefined behavior. I ran into this issue while debugging UB in __hash_table from libcxx: the bug I was seeing manifested itself differently under -O0 vs -Os, due to a UB call to clz() (see: libcxx/r304617). This patch introduces a check which can detect UB calls to builtins. llvm.org/PR26979 Differential Revision: https://reviews.llvm.org/D34590 llvm-svn: 309459	2017-07-29 00:19:51 +00:00
Erich Keane	0026ed2f9c	Fix double destruction of objects when OpenMP construct is canceled When an omp for loop is canceled the constructed objects are being destructed twice. It looks like the desired code is: { Obj o; If (cancelled) branch-through-cleanups to cancel.exit. } [cleanups] cancel.exit: __kmpc_for_static_fini br cancel.cont (*) cancel.cont: __kmpc_barrier return The problem seems to be the branch to cancel.cont is currently also going through the cleanups calling them again. This change just does a direct branch instead. Patch By: michael.p.rice@intel.com Differential Revision: https://reviews.llvm.org/D35854 llvm-svn: 309288	2017-07-27 16:28:20 +00:00
Richard Smith	ae8d62c9c5	Add branch weights to branches for static initializers. The initializer for a static local variable cannot be hot, because it runs at most once per program. That's not quite the same thing as having a low branch probability, but under the assumption that the function is invoked many times, modeling this as a branch probability seems reasonable. For TLS variables, the situation is less clear, since the initialization side of the branch can run multiple times in a program execution, but we still expect initialization to be rare relative to non-initialization uses. It would seem worthwhile to add a PGO counter along this path to make this estimation more accurate in future. For globals with guarded initialization, we don't yet apply any branch weights. Due to our use of COMDATs, the guard will be reached exactly once per DSO, but we have no idea how many DSOs will define the variable. llvm-svn: 309195	2017-07-26 22:01:09 +00:00

1 2 3 4 5 ...

1350 Commits