The following test case causes issue with codegen of __enqueue_block
void (^block)(void) = ^{ callee(id, out); };
enqueue_kernel(queue, 0, ndrange, block);
Clang first does codegen for block expression in the first line and deletes its block info.
Clang then tries to do codegen for the same block expression again for the second line,
and fails because the block info is gone.
The fix is to do normal codegen for both lines. Introduce an API to OpenCL runtime to
record llvm block invoke function and llvm block literal emitted for each AST block
expression, and use the recorded information for generating the wrapper kernel.
The EmitBlockLiteral APIs are cleaned up to minimize changes to the normal codegen
of blocks.
Another minor issue is that some clean up AST expression is generated for block
with captures, which can be stripped by IgnoreImplicit.
Differential Revision: https://reviews.llvm.org/D43240
llvm-svn: 325264
Summary:
Fixes PR36247, which is where WinEHPrepare replaces inline asm in
funclets with unreachable.
Make getBundlesForFunclet return by value to simplify some call sites.
Reviewers: smeenai, majnemer
Subscribers: eraman, cfe-commits
Differential Revision: https://reviews.llvm.org/D43033
llvm-svn: 324689
Summary:
Previously, Clang only emitted label names in assert builds.
However there is a CC1 option -discard-value-names that should have been used to control emission instead.
This patch removes the NDEBUG preprocessor block and instead allows LLVM to handle removing the names in accordance with the option.
Reviewers: erichkeane, aaron.ballman, majnemer
Reviewed By: aaron.ballman
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D42829
llvm-svn: 324127
Summary:
This patch enables debugging of C99 VLA types by generating more precise
LLVM Debug metadata, using the extended DISubrange 'count' field that
takes a DIVariable.
This should implement:
Bug 30553: Debug info generated for arrays is not what GDB expects (not as good as GCC's)
https://bugs.llvm.org/show_bug.cgi?id=30553
Reviewers: echristo, aprantl, dexonsmith, clayborg, pcc, kristof.beyls, dblaikie
Reviewed By: aprantl
Subscribers: jholewinski, schweitz, davide, fhahn, JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D41698
llvm-svn: 323952
simd`.
Added host codegen + codegen for devices with default codegen for
`#pragma omp target teams distribute parallel for simd` directive.
llvm-svn: 322515
This alignment can be less than 4 on certain embedded targets, which may
not even be able to deal with 4-byte alignment on the stack.
Patch by Jacob Young!
llvm-svn: 322406
getAssociatedStmt() returns the outermost captured statement for the
OpenMP directive. It may return incorrect region in case of combined
constructs. Reworked the code to reduce the number of calls of
getAssociatedStmt() and used getInnermostCapturedStmt() and
getCapturedStmt() functions instead.
In case of firstprivate variables it may lead to an extra allocas
generation for private copies even if the variable is passed by value
into outlined function and could be used directly as private copy.
llvm-svn: 322393
GCC's attribute 'target', in addition to being an optimization hint,
also allows function multiversioning. We currently have the former
implemented, this is the latter's implementation.
This works by enabling functions with the same name/signature to coexist,
so that they can all be emitted. Multiversion state is stored in the
FunctionDecl itself, and SemaDecl manages the definitions.
Note that it ends up having to permit redefinition of functions so
that they can all be emitted. Additionally, all versions of the function
must be emitted, so this also manages that.
Note that this includes some additional rules that GCC does not, since
defining something as a MultiVersion function after a usage has been made illegal.
The only 'history rewriting' that happens is if a function is emitted before
it has been converted to a multiversion'ed function, at which point its name
needs to be changed.
Function templates and virtual functions are NOT yet supported (not supported
in GCC either).
Additionally, constructors/destructors are disallowed, but the former is
planned.
llvm-svn: 322028
only.
Added support for -fopenmp-simd option that allows compilation of
simd-based constructs without emission of OpenMP runtime calls.
llvm-svn: 321560
...when such an operation is done on an object during con-/destruction.
This is the cfe part of a patch covering both cfe and compiler-rt.
Differential Revision: https://reviews.llvm.org/D40295
llvm-svn: 321519
Diagnose 'unreachable' UB when a noreturn function returns.
1. Insert a check at the end of functions marked noreturn.
2. A decl may be marked noreturn in the caller TU, but not marked in
the TU where it's defined. To diagnose this scenario, strip away the
noreturn attribute on the callee and insert check after calls to it.
Testing: check-clang, check-ubsan, check-ubsan-minimal, D40700
rdar://33660464
Differential Revision: https://reviews.llvm.org/D40698
llvm-svn: 321231
Summary:
The -fxray-always-emit-customevents flag instructs clang to always emit
the LLVM IR for calls to the `__xray_customevent(...)` built-in
function. The default behaviour currently respects whether the function
has an `[[clang::xray_never_instrument]]` attribute, and thus not lower
the appropriate IR code for the custom event built-in.
This change allows users calling through to the
`__xray_customevent(...)` built-in to always see those calls lowered to
the corresponding LLVM IR to lay down instrumentation points for these
custom event calls.
Using this flag enables us to emit even just the user-provided custom
events even while never instrumenting the start/end of the function
where they appear. This is useful in cases where "phase markers" using
__xray_customevent(...) can have very few instructions, must never be
instrumented when entered/exited.
Reviewers: rnk, dblaikie, kpw
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D40601
llvm-svn: 319388
This updates -mcount to use the new attribute names (LLVM r318195), and
switches over -finstrument-functions to also use these attributes rather
than inserting instrumentation in the frontend.
It also adds a new flag, -finstrument-functions-after-inlining, which
makes the cygprofile instrumentation get inserted after inlining rather
than before.
Differential Revision: https://reviews.llvm.org/D39331
llvm-svn: 318199
Summary:
We don't want to store cleanup dest slot saved into the coroutine frame (as some of the cleanup code may
access them after coroutine frame destroyed).
This is an alternative to https://reviews.llvm.org/D37093
It is possible to do this for all functions, but, cursory check showed that in -O0, we get slightly longer function (by 1-3 instructions), thus, we are only limiting cleanup.dest.slot elimination to coroutines.
Reviewers: rjmccall, hfinkel, eric_niebler
Reviewed By: eric_niebler
Subscribers: EricWF, cfe-commits
Differential Revision: https://reviews.llvm.org/D39768
llvm-svn: 317981
This patch fixes various places in clang to propagate may-alias
TBAA access descriptors during construction of lvalues, thus
eliminating the need for the LValueBaseInfo::MayAlias flag.
This is part of D38126 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D39008
llvm-svn: 316988
This patch addresses the rest of the cases where we pass lvalue
base info, but do not provide corresponding TBAA info.
This patch should not bring in any functional changes.
This is part of D38126 reworked to be a separate patch to make
reviewing easier.
Differential Revision: https://reviews.llvm.org/D38945
llvm-svn: 315986
In OpenCL the kernel function and non-kernel function has different calling conventions.
For certain targets they have different argument ABIs. Also kernels have special function
attributes and metadata for runtime to launch them.
The blocks passed to enqueue_kernel is supposed to be executed as kernels. As such,
the block invoke function should be emitted as kernel with proper calling convention and
argument ABI.
This patch emits enqueued block as kernel. If a block is both called directly and passed
to enqueue_kernel, separate functions will be generated.
Differential Revision: https://reviews.llvm.org/D38134
llvm-svn: 315804
This patch enables explicit generation of TBAA information in all
cases where LValue base info is propagated or constructed in
non-trivial ways. Eventually, we will consider each of these
cases to make sure the TBAA information is correct and not too
conservative. For now, we just fall back to generating TBAA info
from the access type.
This patch should not bring in any functional changes.
This is part of D38126 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38733
llvm-svn: 315575
Besides obvious code simplification, avoiding explicit creation
of LValueBaseInfo objects makes it easier to make TBAA
information to be part of such objects.
This is part of D38126 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38695
llvm-svn: 315289
The Cpu Init functionality is required for the target
attribute, so this patch simply splits it out into its own
function, exactly like CpuIs and CpuSupports.
llvm-svn: 315075
This patch is an attempt to clarify and simplify generation and
propagation of TBAA information. The idea is to pack all values
that describe a memory access, namely, base type, access type and
offset, into a single structure. This is supposed to make further
changes, such as adding support for unions and array members,
easier to prepare and review.
DecorateInstructionWithTBAA() is no more responsible for
converting types to tags. These implicit conversions not only
complicate reading the code, but also suggest assigning scalar
access tags while we generally prefer full-size struct-path tags.
TBAAPathTag is replaced with TBAAAccessInfo; the latter is now
the type of the keys of the cache map that translates access
descriptors to metadata nodes.
Fixed a bug with writing to a wrong map in
getTBAABaseTypeMetadata() (former getTBAAStructTypeInfo()).
We now check for valid base access types every time we
dereference a field. The original code only checks the top-level
base type. See isValidBaseType() / isTBAAPathStruct() calls.
Some entities have been renamed to sound more adequate and less
confusing/misleading in presence of path-aware TBAA information.
Now we do not lookup twice for the same cache entry in
getAccessTagInfo().
Refined relevant comments and descriptions.
Differential Revision: https://reviews.llvm.org/D37826
llvm-svn: 315048
code size.
Currently clang expands a call to __builtin_os_log_format into a long
sequence of instructions at the call site, causing code size to
increase in some cases.
This commit attempts to reduce code size by emitting a helper function
that can be shared by calls to __builtin_os_log_format with similar
formats and arguments. The helper function has linkonce_odr linkage to
enable the linker to merge identical functions across translation units.
Attribute 'noinline' is attached to the helper function at -Oz so that
the inliner doesn't inline functions that can potentially be merged.
This commit also fixes a bug where the generated IR writes past the end
of the buffer when "%m" is the last specifier appearing in the format
string passed to __builtin_os_log_format.
Original patch by Duncan Exon Smith.
rdar://problem/34065973
rdar://problem/34196543
Differential Revision: https://reviews.llvm.org/D38606
llvm-svn: 315045
This patch makes it possible to produce access tags in a uniform
manner regardless whether the resulting tag will be a scalar or a
struct-path one. getAccessTagInfo() now takes care of the actual
translation of access descriptors to tags and can handle all
kinds of accesses. Facilities that specific to scalar accesses
are eliminated.
Some more details:
* DecorateInstructionWithTBAA() is not responsible for conversion
of types to access tags anymore. Instead, it takes an access
descriptor (TBAAAccessInfo) and generates corresponding access
tag from it.
* getTBAAInfoForVTablePtr() reworked to
getTBAAVTablePtrAccessInfo() that now returns the
virtual-pointer access descriptor and not the virtual-point
type metadata.
* Added function getTBAAMayAliasAccessInfo() that returns the
descriptor for may-alias accesses.
* getTBAAStructTagInfo() renamed to getTBAAAccessTagInfo() as now
it is the only way to generate access tag by a given access
descriptor. It is capable of producing both scalar and
struct-path tags, depending on options and availability of the
base access type. getTBAAScalarTagInfo() and its cache
ScalarTagMetadataCache are eliminated.
* Now that we do not need to care about whether the resulting
access tag should be a scalar or struct-path one,
getTBAAStructTypeInfo() is renamed to getBaseTypeInfo().
* Added function getTBAAAccessInfo() that constructs access
descriptor by a given QualType access type.
This is part of D37826 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38503
llvm-svn: 314977
With this patch we implement a concept of TBAA access descriptors
that are capable of representing both scalar and struct-path
accesses in a generic way.
This is part of D37826 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38456
llvm-svn: 314780