Summary:
- By declaring device variables as `static`, we assume they won't be
addressable from the host side. Thus, no `externally_initialized` is
required.
Reviewers: yaxunl
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D62603
llvm-svn: 361994
Recently D60274 was introduced to allow lld to handle dependent libs. However current
usage of dependent libs (e.g. pragma comment(lib, *) in windows header files) are intended
for host only. Emitting the metadata in device IR causes link error in device path.
Until there is a way to different it dependent libs for device or host, metadata for dependent
libs should be emitted for host only. This patch enforces that.
Differential Revision: https://reviews.llvm.org/D62483
llvm-svn: 361880
representing no such object, and an "Indeterminate" state representing
an uninitialized object. The latter is not yet used, but soon will be.
llvm-svn: 361328
Summary:
This patch adds support for the registration of the requires directives with the runtime.
Each requires directive clause will enable a particular flag to be set.
The set of flags is passed to the runtime to be checked for compatibility with other such flags coming from other object files.
The registration function is called whenever OpenMP is present even if a requires directive is not present. This helps detect cases in which requires directives are used inconsistently.
Reviewers: ABataev, AlexEichenberger, caomhin
Reviewed By: ABataev, AlexEichenberger
Subscribers: jholewinski, guansong, jfb, jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60568
llvm-svn: 361298
Currently, we ignore all dso locality attributes/info when building for
the device and thus all symblos are externally visible and can be
preemted at the runtime. It may lead to incorrect results. We need to
follow the same logic, compiler uses for static/pie builds.
llvm-svn: 361283
This patch implements a limited form of autolinking primarily designed to allow
either the --dependent-library compiler option, or "comment lib" pragmas (
https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017) in
C/C++ e.g. #pragma comment(lib, "foo"), to cause an ELF linker to automatically
add the specified library to the link when processing the input file generated
by the compiler.
Currently this extension is unique to LLVM and LLD. However, care has been taken
to design this feature so that it could be supported by other ELF linkers.
The design goals were to provide:
- A simple linking model for developers to reason about.
- The ability to to override autolinking from the linker command line.
- Source code compatibility, where possible, with "comment lib" pragmas in other
environments (MSVC in particular).
Dependent library support is implemented differently for ELF platforms than on
the other platforms. Primarily this difference is that on ELF we pass the
dependent library specifiers directly to the linker without manipulating them.
This is in contrast to other platforms where they are mapped to a specific
linker option by the compiler. This difference is a result of the greater
variety of ELF linkers and the fact that ELF linkers tend to handle libraries in
a more complicated fashion than on other platforms. This forces us to defer
handling the specifiers to the linker.
In order to achieve a level of source code compatibility with other platforms
we have restricted this feature to work with libraries that meet the following
"reasonable" requirements:
1. There are no competing defined symbols in a given set of libraries, or
if they exist, the program owner doesn't care which is linked to their
program.
2. There may be circular dependencies between libraries.
The binary representation is a mergeable string section (SHF_MERGE,
SHF_STRINGS), called .deplibs, with custom type SHT_LLVM_DEPENDENT_LIBRARIES
(0x6fff4c04). The compiler forms this section by concatenating the arguments of
the "comment lib" pragmas and --dependent-library options in the order they are
encountered. Partial (-r, -Ur) links are handled by concatenating .deplibs
sections with the normal mergeable string section rules. As an example, #pragma
comment(lib, "foo") would result in:
.section ".deplibs","MS",@llvm_dependent_libraries,1
.asciz "foo"
For LTO, equivalent information to the contents of a the .deplibs section can be
retrieved by the LLD for bitcode input files.
LLD processes the dependent library specifiers in the following way:
1. Dependent libraries which are found from the specifiers in .deplibs sections
of relocatable object files are added when the linker decides to include that
file (which could itself be in a library) in the link. Dependent libraries
behave as if they were appended to the command line after all other options. As
a consequence the set of dependent libraries are searched last to resolve
symbols.
2. It is an error if a file cannot be found for a given specifier.
3. Any command line options in effect at the end of the command line parsing apply
to the dependent libraries, e.g. --whole-archive.
4. The linker tries to add a library or relocatable object file from each of the
strings in a .deplibs section by; first, handling the string as if it was
specified on the command line; second, by looking for the string in each of the
library search paths in turn; third, by looking for a lib<string>.a or
lib<string>.so (depending on the current mode of the linker) in each of the
library search paths.
5. A new command line option --no-dependent-libraries tells LLD to ignore the
dependent libraries.
Rationale for the above points:
1. Adding the dependent libraries last makes the process simple to understand
from a developers perspective. All linkers are able to implement this scheme.
2. Error-ing for libraries that are not found seems like better behavior than
failing the link during symbol resolution.
3. It seems useful for the user to be able to apply command line options which
will affect all of the dependent libraries. There is a potential problem of
surprise for developers, who might not realize that these options would apply
to these "invisible" input files; however, despite the potential for surprise,
this is easy for developers to reason about and gives developers the control
that they may require.
4. This algorithm takes into account all of the different ways that ELF linkers
find input files. The different search methods are tried by the linker in most
obvious to least obvious order.
5. I considered adding finer grained control over which dependent libraries were
ignored (e.g. MSVC has /nodefaultlib:<library>); however, I concluded that this
is not necessary: if finer control is required developers can fall back to using
the command line directly.
RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2019-March/131004.html.
Differential Revision: https://reviews.llvm.org/D60274
llvm-svn: 360984
Without this, I get e.g. 'PerformPendingInstantiations' -> 'std::fill',
now I get 'std::fill<unsigned long *, int>'.
Differential Revision: https://reviews.llvm.org/D61822
llvm-svn: 360539
We need to be able to enqueue internal function that initializes
global constructors on the host side. Therefore it has to be
converted to a kernel.
This change factors out common logic for adding kernel metadata
and moves it from CodeGenFunction to CodeGenModule in order to
make it accessible for the extra use case.
Differential revision: https://reviews.llvm.org/D61488
llvm-svn: 360342
Summary:
A COFF stub indirects the reference to a symbol through memory. A
.refptr.$sym global variable pointer is created to refer to $sym.
Typically mingw uses these for external global variable declarations,
but we can use them for weak function declarations as well.
Updates the dso_local classification to add a special case for
extern_weak symbols on COFF in both clang and LLVM.
Fixes PR37598
Reviewers: smeenai, mstorsjo
Subscribers: hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D61615
llvm-svn: 360207
This reverts r359250 (git commit 4730604bd3)
The newly added test should use -cc1 and -emit-llvm and there are other
test failures that need fixing.
llvm-svn: 359251
Statically link certain runtime library functions for MSVC/GNU Windows
environments. This is consistent with MSVC behavior.
Fixes LNK4286 and LNK4217 warnings from link.exe when linking the static
CRT:
LINK : warning LNK4286: symbol '__std_terminate' defined in 'libvcruntime.lib(ehhelpers.obj)' is imported by 'ASAN_NOINST_TEST_OBJECTS.asan_noinst_test.cc.x86_64-calls.o'
LINK : warning LNK4286: symbol '__std_terminate' defined in 'libvcruntime.lib(ehhelpers.obj)' is imported by 'ASAN_NOINST_TEST_OBJECTS.asan_test_main.cc.x86_64-calls.o'
LINK : warning LNK4217: symbol '_CxxThrowException' defined in 'libvcruntime.lib(throw.obj)' is imported by 'ASAN_NOINST_TEST_OBJECTS.gtest-all.cc.x86_64-calls.o' in function '"int `public: static class UnitTest::GetInstance * __cdecl testing::UnitTest::GetInstance(void)'::`1'::dtor$5" (?dtor$5@?0??GetInstance@UnitTest@testing@@SAPEAV12@XZ@4HA)'
Reviewers: mstorsjo, efriedma, TomTan, compnerd, smeenai, mgrang
Subscribers: abdulras, theraven, smeenai, pcc, mehdi_amini, javed.absar, inglorion, kristof.beyls, dexonsmith, cfe-commits
Differential Revision: https://reviews.llvm.org/D55229
llvm-svn: 359250
AMDGPU currently relies on global properties being set before
setTargetProperties is called. Existing targets like MIPS which rely on
setTargetProperties do not rely on the current behavior, so this patch
moves the call later in SetFunctionAttributes.
Differential Revision: https://reviews.llvm.org/D60967
llvm-svn: 359039
This change adds hierarchical "time trace" profiling blocks that can be visualized in Chrome, in a "flame chart" style. Each profiling block can have a "detail" string that for example indicates the file being processed, template name being instantiated, function being optimized etc.
This is taken from GitHub PR: https://github.com/aras-p/llvm-project-20170507/pull/2
Patch by Aras Pranckevičius.
Differential Revision: https://reviews.llvm.org/D58675
llvm-svn: 357340
For the global variables the allocate directive must specify only the
predefined allocator. This allocator must be translated into the correct
form of the address space for the targets that support different address
spaces.
llvm-svn: 356702
This patch adds an XCOFF triple object format type into LLVM.
This XCOFF triple object file type will be used later by object file and assembly generation for the AIX platform.
Differential Revision: https://reviews.llvm.org/D58930
llvm-svn: 355989
Add .stub to kernel stub function name so that it is different from kernel
name in device code. This is necessary to let debugger find correct symbol
for kernel.
Differential Revision: https://reviews.llvm.org/D58518
llvm-svn: 354948
Summary:
- If a string literal is reused directly, need to add necessary address
space casting if the target requires that.
Reviewers: yaxunl
Subscribers: jvesely, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58509
llvm-svn: 354610
__hipRegisterFunction and __hipRegisterVar need to accept device side kernel and variable names
so that HIP runtime can associate kernel stub functions in host code with kernel symbols in fat binaries,
and associate shadow variables in host code with device variables in fat binaries.
Currently, clang assumes kernel functions and device variables have the same name as the kernel
stub functions and shadow variables. However, when host is compiled in windows with MSVC C++
ABI and device is compiled with Itanium C++ ABI (e.g. AMDGPU), kernels and device symbols in fat
binary are mangled differently than host.
This patch gets the device side kernel and variable name by mangling them in the mangle context
of aux target.
Differential Revision: https://reviews.llvm.org/D58163
llvm-svn: 354004
This allows the global visibility controls to be restrictive while still
populating the dynamic symbol table where required.
Differential Revision: https://reviews.llvm.org/D56871
llvm-svn: 353870
The patch in r350643 incorrectly sets the COFF emission based on bits
instead of bytes. This patch converts the 32 via CharUnits to bits to
compare the correct values.
Change-Id: Icf38a16470ad5ae3531374969c033557ddb0d323
llvm-svn: 353411
Emit{Nounwind,}RuntimeCall{,OrInvoke} have been modified to take a
FunctionCallee as an argument, and CreateRuntimeFunction has been
modified to return a FunctionCallee. All callers have been updated.
Additionally, CreateBuiltinFunction is removed, as it was redundant
with CreateRuntimeFunction after some previous changes.
Differential Revision: https://reviews.llvm.org/D57668
llvm-svn: 353184
This argument was added in r254554 in order to support the
pass_object_size attribute. However, in r296076, the attribute's
presence is now also represented in FunctionProtoType's
ExtParameterInfo, and thus it's unnecessary to pass along a separate
FunctionDecl.
The functions modified are:
RequiredArgs::forPrototype{,Plus}, and
CodeGenTypes::ConvertFunctionType.
After this, it's also (again) unnecessary to have a separate
ConvertFunctionType function ConvertType, so convert callers back to
the latter, leaving the former as an internal helper function.
llvm-svn: 352946
This patch implements parsing and sema for "omp declare mapper"
directive. User defined mapper, i.e., declare mapper directive, is a new
feature in OpenMP 5.0. It is introduced to extend existing map clauses
for the purpose of simplifying the copy of complex data structures
between host and device (i.e., deep copy). An example is shown below:
struct S { int len; int *d; };
#pragma omp declare mapper(struct S s) map(s, s.d[0:s.len]) // Memory region that d points to is also mapped using this mapper.
Contributed-by: Lingda Li <lildmh@gmail.com>
Differential Revision: https://reviews.llvm.org/D56326
llvm-svn: 352906
Introduce an option to request global visibility settings be applied to
declarations without a definition or an explicit visibility, rather than
the existing behavior of giving these default visibility. When the
visibility of all or most extern definitions are known this allows for
the same optimisations -fvisibility permits without updating source code
to annotate all declarations.
Differential Revision: https://reviews.llvm.org/D56868
llvm-svn: 352391
This code doesn't need to traverse types, lambdas, template arguments,
etc to detect trivial recursion. We can do a basic statement traversal
instead. This reduces the time spent compiling CodeGenModule.cpp, the
object file size (mostly reduced debug info), and the final executable
size by a small amount. I measured the exe mostly to check how much of
the overhead is from debug info, object file section headers, etc, vs
actual code.
metric | before | after | diff
time (s) | 47.4 | 38.5 | -8.9
obj (kb) | 12888 | 12012 | -876
exe (kb) | 86072 | 85996 | -76
llvm-svn: 352232
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
With commit r351627, LLVM gained the ability to apply (existing) IPO
optimizations on indirections through callbacks, or transitive calls.
The general idea is that we use an abstraction to hide the middle man
and represent the callback call in the context of the initial caller.
It is described in more detail in the commit message of the LLVM patch
r351627, the llvm::AbstractCallSite class description, and the
language reference section on callback-metadata.
This commit enables clang to emit !callback metadata that is
understood by LLVM. It does so in three different cases:
1) For known broker functions declarations that are directly
generated, e.g., __kmpc_fork_call for the OpenMP pragma parallel.
2) For known broker functions that are identified by their name and
source location through the builtin detection, e.g.,
pthread_create from the POSIX thread API.
3) For user annotated functions that carry the "callback(callee, ...)"
attribute. The attribute has to include the name, or index, of
the callback callee and how the passed arguments can be
identified (as many as the callback callee has). See the callback
attribute documentation for detailed information.
Differential Revision: https://reviews.llvm.org/D55483
llvm-svn: 351629
This is an initial implementation for msp430 toolchain including
-mmcu option support
-mhwmult options support
-integrated-as by default
The toolchain uses msp430-elf-as as a linker and supports msp430-gcc toolchain tree.
Patch by Kristina Bessonova!
Differential Revision: https://reviews.llvm.org/D56658
llvm-svn: 351228
After a discussion on the commit thread, it seems the 32 byte alignment
limitation is an MSVC toolchain artifact, not an inherent COFF
restriction. Clarify the comment accordingly, since saying COFF in the
comment but using isKnownWindowsMSVCEnvironment in the conditional is
confusing. Also add a newline before the comment, which is consistent
with the local style.
Differential Revision: https://reviews.llvm.org/D56466
llvm-svn: 350754
As reported in PR33035, LLVM crashes if given a common object with an
alignment of greater than 32 bits. This is because the COFF file format
does not support these alignments, so emitting them is broken anyway.
This patch changes any global definitions greater than 32 bit alignment
to no longer be in 'common'.
https://bugs.llvm.org/show_bug.cgi?id=33035
Differential Revision: https://reviews.llvm.org/D56391
Change-Id: I48609289753b7f3b58c5e2bc1712756750fbd45a
llvm-svn: 350643
The autolinking extension for ELF uses a slightly different format for
encoding the autolink information compared to COFF and MachO. Account
for this in the CGM to ensure that we do not assert when emitting
assembly or an object file.
llvm-svn: 350476
This fixes compiler crash when we attempted to compile this code:
extern __device__ int data;
__device__ int data = 1;
Differential Revision: https://reviews.llvm.org/D56033
llvm-svn: 349981
Implement options in clang to enable recording the driver command-line
in an ELF section.
Implement a new special named metadata, llvm.commandline, to support
frontends embedding their command-line options in IR/ASM/ELF.
This differs from the GCC implementation in some key ways:
* In GCC there is only one command-line possible per compilation-unit,
in LLVM it mirrors llvm.ident and multiple are allowed.
* In GCC individual options are separated by NULL bytes, in LLVM entire
command-lines are separated by NULL bytes. The advantage of the GCC
approach is to clearly delineate options in the face of embedded
spaces. The advantage of the LLVM approach is to support merging
multiple command-lines unambiguously, while handling embedded spaces
with escaping.
Differential Revision: https://reviews.llvm.org/D54487
Clang Differential Revision: https://reviews.llvm.org/D54489
llvm-svn: 349155
The host-side code can't (and should not) access the values that may
only exist on the device side. E.g. address of a __device__ function
does not exist on the host side as we don't generate the code for it there.
Differential Revision: https://reviews.llvm.org/D55663
llvm-svn: 349087
Inline cpu_specific versions referenced before the cpu_dispatch function
weren't properly emitted, since they hadn't been referred to. This
patch ensures that during resolver generation that all appropriate
versions are emitted.
Change-Id: I94c3766aaf9c75ca07a0ad8258efdbb834654ff8
llvm-svn: 348600
Declarations without the attribute were disallowed because it would be
ambiguous which 'target' it was supposed to be on. For example:
void ___attribute__((target("v1"))) foo();
void foo(); // Redecl of above, or fwd decl of below?
void ___attribute__((target("v2"))) foo();
However, a first declaration doesn't have that problem, and erroring
prevents it from working in cases where the forward declaration is
useful.
Additionally, a forward declaration of target==default wouldn't properly
cause multiversioning, so this patch fixes that.
The patch was not split since the 'default' fix would require
implementing the same check for that case, followed by undoing the same
change for the fwd-decl implementation.
Change-Id: I66f2c5bc2477bcd3f7544b9c16c83ece257077b0
llvm-svn: 347805
Summary:
Experience has shown that the functionality is useful. It makes linking
optimized clang with debug info for me a lot faster, 20s to 13s. The
type merging phase of PDB writing goes from 10s to 3s.
This removes the LLVM cl::opt and replaces it with a metadata flag.
After this change, users can do the following to use ghash:
- add -gcodeview-ghash to compiler flags
- replace /DEBUG with /DEBUG:GHASH in linker flags
Reviewers: zturner, hans, thakis, takuto.ikuta
Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D54370
llvm-svn: 347072
As suggested by Richard Smith, and initially put up for review here:
https://reviews.llvm.org/D53341, this patch removes a hack that was used
to ensure that proper target-feature lists were used when emitting
cpu-dispatch (and eventually, target-clones) implementations. As a part
of this, the GlobalDecl object is proliferated to a bunch more
locations.
Originally, this was put up for review (see above) to get acceptance on
the approach, though discussion with Richard in San Diego showed he
approved of the approach taken here. Thus, I believe this is acceptable
for Review-After-commit
Differential Revision: https://reviews.llvm.org/D53341
Change-Id: I0a0bd673340d334d93feac789d653e03d9f6b1d5
llvm-svn: 346757
This patch modifies clang so that, if compiling for a target that
explicitly specifies a nonzero program memory address space, the
constructor list global will have the same address space as the
functions it contains.
AVR is the only in-tree backend which has a nonzero program memory
address space.
Without this, the IR verifier would always fail if a constructor
was used on a Harvard architecture backend.
This has no functional change to any in-tree backends except AVR.
llvm-svn: 346520
The member type creation for a cpu-dispatch function was not correctly
including the 'this' parameter, so ensure that the type is properly
determined. Also, disable defer in the cases of emitting the functoins,
as it can end up resulting in the wrong version being emitted.
Change-Id: I0b8fc5e0b0d1ae1a9d98fd54f35f27f6e5d5d083
llvm-svn: 345838
When a dispatch function was being emitted that had both a generic and a
pentium configuration listed, we would assert. This is because neither
configuration has any 'features' associated with it so they were both
considered the 'default' version. 'pentium' lacks any features because
we implement it in terms of __builtin_cpu_supports (instead of Intel
proprietary checks), which is unable to decern between the two.
The fix for this is to omit the 'generic' version from the dispatcher if
both are present. This permits existing code to compile, and still will
choose the 'best' version available (since 'pentium' is technically
better than 'generic').
Change-Id: I4b69f3e0344e74cbdbb04497845d5895dd05fda0
llvm-svn: 345826
We haven't supported compiling ObjC1 for a long time (and never will again), so
there isn't any reason to keep these separate. This patch replaces
LangOpts::ObjC1 and LangOpts::ObjC2 with LangOpts::ObjC.
Differential revision: https://reviews.llvm.org/D53547
llvm-svn: 345637
This corrects the leader for the swift names. The encoding for 4.2 and
5.0 differ by a single bit on the second character and were swapped.
llvm-svn: 345360
storage class.
To be more in line with what GCC does, switch the condition to be based
on the Static Storage duration instead of the storage class.
Change-Id: I8e959d762433cda48855099353bf3c950b9d54b8
llvm-svn: 345302
Similar to how ICC handles CPU-Dispatch on Windows, this patch uses the
resolver function directly to forward the call to the proper function.
This is not nearly as efficient as IFuncs of course, but is still quite
useful for large functions specifically developed for certain
processors.
This is unfortunately still limited to x86, since it depends on
__builtin_cpu_supports and __builtin_cpu_is, which are x86 builtins.
The naming for the resolver/forwarding function for cpu-dispatch was
taken from ICC's implementation, which uses the unmodified name for this
(no mangling additions). This is possible, since cpu-dispatch uses '.A'
for the 'default' version.
In 'target' multiversioning, this function keeps the '.resolver'
extension in order to keep the default function keeping the default
mangling.
Change-Id: I4731555a39be26c7ad59a2d8fda6fa1a50f73284
Differential Revision: https://reviews.llvm.org/D53586
llvm-svn: 345298
Add a new driver level flag `-fcf-runtime-abi=` that allows one to specify the
runtime ABI for CoreFoundation. This controls the language interoperability.
In particular, this is relevant for generating the CFConstantString classes
(primarily through the `__builtin___CFStringMakeConstantString` builtin) which
construct a reference to the "CFObject"'s `isa` field. This type differs
between swift 4.1 and 4.2+.
Valid values for the new option include:
- objc [default behaviour] - enable ObjectiveC interoperability
- swift-4.1 - enable interoperability with swift 4.1
- swift-4.2 - enable interoperability with swift 4.2
- swift-5.0 - enable interoperability with swift 5.0
- swift [alias] - target the latest swift ABI
Furthermore, swift 4.2+ changed the layout for the CFString when building
CoreFoundation *without* ObjectiveC interoperability. In such a case, a field
was added to the CFObject base type changing it from: <{ const int*, int }> to
<{ uintptr_t, uintptr_t, uint64_t }>.
In swift 5.0, the CFString type will be further adjusted to change the length
from a uint32_t on everything but BE LP64 targets to uint64_t.
Note that the default behaviour for clang remains unchanged and the new layout
must be explicitly opted into via `-fcf-runtime-abi=swift*`.
llvm-svn: 345222
Extract the reference to the ASTContext and Triple and use them throughout the
function. This is simply a cosmetic cleanup while in the area. NFC.
llvm-svn: 345160
For instantiated functions, search the template pattern to see if it marked
inline to determine if InlineHint attribute should be added to the function.
llvm-svn: 344987
Since multiversion variant functions can be inline, in C they become
available-externally linkage. This ends up causing the variants to not
be emitted, and not available to the linker.
The solution is to make sure that multiversion functions are always
emitted by marking them linkonce.
Change-Id: I897aa37c7cbba0c1eb2c57ee881d5000a2113b75
llvm-svn: 344957
This can be used to preserve profiling information across codebase
changes that have widespread impact on mangled names, but across which
most profiling data should still be usable. For example, when switching
from libstdc++ to libc++, or from the old libstdc++ ABI to the new ABI,
or even from a 32-bit to a 64-bit build.
The user can provide a remapping file specifying parts of mangled names
that should be treated as equivalent (eg, std::__1 should be treated as
equivalent to std::__cxx11), and profile data will be treated as
applying to a particular function if its name is equivalent to the name
of a function in the profile data under the provided equivalences. See
the documentation change for a description of how this is configured.
Remapping is supported for both sample-based profiling and instruction
profiling. We do not support remapping indirect branch target
information, but all other profile data should be remapped
appropriately.
Support is only added for the new pass manager. If someone wants to also
add support for this for the old pass manager, doing so should be
straightforward.
llvm-svn: 344199
When ifunc support was added to Clang (r265917) it did not allow
resolvers to take function arguments. This was based on GCC's
documentation, which states resolvers return a pointer and take no
arguments.
However, GCC actually allows resolvers to take arguments, and glibc (on
non-x86 platforms) and FreeBSD (on x86 and arm64) pass some CPU
identification information as arguments to ifunc resolvers. I believe
GCC's documentation is simply incorrect / out-of-date.
FreeBSD already removed the prohibition in their in-tree Clang copy.
Differential Revision: https://reviews.llvm.org/D52703
llvm-svn: 344100
There are a few leftovers of rC343147 that are not (\w+)\.begin but in
the form of ([-[:alnum:]>.]+)\.begin or spanning two lines. Change them
to use the container form in this commit. The 12 occurrences have been
inspected manually for safety.
llvm-svn: 343425
Add support for OMP5.0 requires directive and unified_address clause.
Patches to follow will include support for additional clauses.
Differential Revision: https://reviews.llvm.org/D52359
llvm-svn: 343063
Relanding rL342883 with more fragmented tests to test ELF-specific
section emission separately from broad-scope CFString tests. Now this
tests the following separately
1). CoreFoundation builds and linkage for ELF while building it.
2). CFString ELF section emission outside CF in assembly output.
3). Broad scope `cfstring3.c` tests which cover all object formats at
bitcode level and assembly level (including ELF).
This fixes non-bridged CoreFoundation builds on ELF targets
that use -fconstant-cfstrings. The original changes from differential
for a similar patch to PE/COFF (https://reviews.llvm.org/D44491) did not
check for an edge case where the global could be a constant which surfaced
as an issue when building for ELF because of different linkage semantics.
This patch addresses several issues with crashes related to CF builds on ELF
as well as improves data layout by ensuring string literals that back
the actual CFConstStrings end up in .rodata in line with Mach-O.
Change itself tested with CoreFoundation on Linux x86_64 but should be valid
for BSD-like systems as well that use ELF as the native object format.
Differential Revision: https://reviews.llvm.org/D52344
llvm-svn: 343038
[Clang][CodeGen][ObjC]: Fix non-bridged CoreFoundation builds on ELF targets
that use `-fconstant-cfstrings`. The original changes from differential
for a similar patch to PE/COFF (https://reviews.llvm.org/D44491) did not
check for an edge case where the global could be a constant which surfaced
as an issue when building for ELF because of different linkage semantics.
This patch addresses several issues with crashes related to CF builds on ELF
as well as improves data layout by ensuring string literals that back
the actual CFConstStrings end up in .rodata in line with Mach-O.
Change itself tested with CoreFoundation on Linux x86_64 but should be valid
for BSD-like systems as well that use ELF as the native object format.
Differential Revision: https://reviews.llvm.org/D52344
llvm-svn: 342883
Currently the code-model does not get saved in the module IR, so if a
code model is specified when compiling with LTO, it gets lost and is
not propagated properly to LTO. This patch does what is necessary in
the front end to pass the code-model to the module, so that the back
end can store it in the Module .
Differential Revision: https://reviews.llvm.org/D52323
llvm-svn: 342758
Summary:
Before this change, we only emit the XRay attributes in LLVM IR when the
-fxray-instrument flag is provided. This may cause issues with thinlto
when the final binary is being built/linked with -fxray-instrument, and
the constitutent LLVM IR gets re-lowered with xray instrumentation.
With this change, we can honour the "never-instrument "attributes
provided in the source code and preserve those in the IR. This way, even
in thinlto builds, we retain the attributes which say whether functions
should never be XRay instrumented.
This change addresses llvm.org/PR38922.
Reviewers: mboerger, eizan
Subscribers: mehdi_amini, dexonsmith, cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D52015
llvm-svn: 342200
Previously, both types (plus the future target-clones) of
multiversioning had a separate ResolverOption structure and emission
function. This patch combines the two, at the expense of a slightly
more expensive sorting function.
llvm-svn: 342152
Previously the alignment on the newly created rtti/typeinfo data was largely
not set, meaning that DataLayout::getPreferredAlignment was free to overalign
it to 16 bytes. This causes unnecessary code bloat.
Differential Revision: https://reviews.llvm.org/D51416
llvm-svn: 342053
'declare target'.
All the functions, referenced in implicit|explicit target regions must
be emitted during code emission for the device.
llvm-svn: 341093
Since MinGW supports automatically importing external variables from
DLLs even without the DLLImport attribute, we shouldn't mark them
as DSO local unless we actually know them to be local for sure.
Keep marking thread local variables as DSO local.
Differential Revision: https://reviews.llvm.org/D51382
llvm-svn: 340941
constants by default when there is no optimization.
GCC's option -fno-keep-static-consts can be used to not emit
unused static constants.
In Clang, since default behavior does not keep unused static constants,
-fkeep-static-consts can be used to emit these if required. This could be
useful for producing identification strings like SVN identifiers
inside the object file even though the string isn't used by the program.
Differential Revision: https://reviews.llvm.org/D40925
llvm-svn: 340439
If the function is actually a weak reference, it should not be marked as
deferred definition as this is only a declaration. Patch adds checks for
the definitions if they must be emitted. Otherwise, only declaration is
emitted.
llvm-svn: 340191
The compiler may produce unexpected error messages/crashes when declare
target variables were used. Patch fixes problems with the declarations
marked as declare target to or link.
llvm-svn: 339805
This extension emits the guard cf table without inserting the
instrumentation. Currently that's what clang-cl does with /guard:cf
anyway, but this allows a user to request that explicitly.
Differential Revision: https://reviews.llvm.org/D50513
llvm-svn: 339420
declare target.
According to OpenMP 5.0, variables captured in lambdas in declare target
regions must be considered as implicitly declare target.
llvm-svn: 339152
CUDA 8.0 E.3.9.4 says: Within the body of a __device__ or __global__
function, only __shared__ variables or variables without any device
memory qualifiers may be declared with static storage class.
It is unclear how a function-scope non-const static variable
without device memory qualifier is implemented, therefore only static
const variable without device memory qualifier is allowed, which
can be emitted as a global variable in constant address space.
Currently clang only allows function-scope static variable with
__shared__ qualifier.
This patch also allows function-scope static const variable without
device memory qualifier and emits it as a global variable in constant
address space.
Differential Revision: https://reviews.llvm.org/D49931
llvm-svn: 338188
As documented here: https://software.intel.com/en-us/node/682969 and
https://software.intel.com/en-us/node/523346. cpu_dispatch multiversioning
is an ICC feature that provides for function multiversioning.
This feature is implemented with two attributes: First, cpu_specific,
which specifies the individual function versions. Second, cpu_dispatch,
which specifies the location of the resolver function and the list of
resolvable functions.
This is valuable since it provides a mechanism where the resolver's TU
can be specified in one location, and the individual implementions
each in their own translation units.
The goal of this patch is to be source-compatible with ICC, so this
implementation diverges from the ICC implementation in a few ways:
1- Linux x86/64 only: This implementation uses ifuncs in order to
properly dispatch functions. This is is a valuable performance benefit
over the ICC implementation. A future patch will be provided to enable
this feature on Windows, but it will obviously more closely fit ICC's
implementation.
2- CPU Identification functions: ICC uses a set of custom functions to identify
the feature list of the host processor. This patch uses the cpu_supports
functionality in order to better align with 'target' multiversioning.
1- cpu_dispatch function def/decl: ICC's cpu_dispatch requires that the function
marked cpu_dispatch be an empty definition. This patch supports that as well,
however declarations are also permitted, since the linker will solve the
issue of multiple emissions.
Differential Revision: https://reviews.llvm.org/D47474
llvm-svn: 337552
This patch uses CodeSegAttr to represent __declspec(code_seg) rather than
building on the existing support for #pragma code_seg.
The code_seg declspec is applied on functions and classes. This attribute
enables the placement of code into separate named segments, including compiler-
generated codes and template instantiations.
For more information, please see the following:
https://msdn.microsoft.com/en-us/library/dn636922.aspx
This patch fixes the regression for the support for attribute ((section).
746b78de78
Patch by Soumi Manna (Manna)
Differential Revision: https://reviews.llvm.org/D48841
llvm-svn: 337420
Code in `CodeGenModule::SetFunctionAttributes()` could set an empty
attribute `implicit-section-name` on a function that is affected by
`#pragma clang text="section"`. This is incorrect because the attribute
should contain a valid section name. If the function additionally also
used `__attribute__((section("section")))` then this could result in
emitting the function in a section with an empty name.
The patch fixes the issue by removing the problematic code that sets
empty `implicit-section-name` from
`CodeGenModule::SetFunctionAttributes()` because it is sufficient to set
this attribute only from a similar code in `setNonAliasAttributes()`
when the function is emitted.
Differential Revision: https://reviews.llvm.org/D48916
llvm-svn: 336842
This matches the way NVCC does it. Doing module cleanup at global
destructor phase used to work, but is, apparently, too late for
the CUDA runtime in CUDA-9.2, which ends up crashing with double-free.
Differential Revision: https://reviews.llvm.org/D48613
llvm-svn: 335763
Similarly to CFI on virtual and indirect calls, this implementation
tries to use program type information to make the checks as precise
as possible. The basic way that it works is as follows, where `C`
is the name of the class being defined or the target of a call and
the function type is assumed to be `void()`.
For virtual calls:
- Attach type metadata to the addresses of function pointers in vtables
(not the functions themselves) of type `void (B::*)()` for each `B`
that is a recursive dynamic base class of `C`, including `C` itself.
This type metadata has an annotation that the type is for virtual
calls (to distinguish it from the non-virtual case).
- At the call site, check that the computed address of the function
pointer in the vtable has type `void (C::*)()`.
For non-virtual calls:
- Attach type metadata to each non-virtual member function whose address
can be taken with a member function pointer. The type of a function
in class `C` of type `void()` is each of the types `void (B::*)()`
where `B` is a most-base class of `C`. A most-base class of `C`
is defined as a recursive base class of `C`, including `C` itself,
that does not have any bases.
- At the call site, check that the function pointer has one of the types
`void (B::*)()` where `B` is a most-base class of `C`.
Differential Revision: https://reviews.llvm.org/D47567
llvm-svn: 335569
Currently clang set kernel calling convention for CUDA/HIP after
arranging function, which causes incorrect kernel function type since
it depends on calling convention.
This patch moves setting kernel convention before arranging
function.
Differential Revision: https://reviews.llvm.org/D47733
llvm-svn: 334457
This should reduce the binary size penalty of ASan on Windows. After
r334313, ASan will add red zones to globals in comdats, so we will still
find OOB accesses to string literals.
llvm-svn: 334417
Summary:
When requirement imposed by __target__ attributes on functions
are not satisfied, prefer printing those requirements, which
are explicitly mentioned in the attributes.
This makes such messages more useful, e.g. printing avx512f instead of avx2
in the following scenario:
```
$ cat foo.c
static inline void __attribute__((__always_inline__, __target__("avx512f")))
x(void)
{
}
int main(void)
{
x();
}
$ clang foo.c
foo.c:7:2: error: always_inline function 'x' requires target feature 'avx2', but would be inlined into function 'main' that is compiled without support for 'avx2'
x();
^
1 error generated.
```
bugzilla: https://bugs.llvm.org/show_bug.cgi?id=37338
Reviewers: craig.topper, echristo, dblaikie
Reviewed By: craig.topper, echristo
Differential Revision: https://reviews.llvm.org/D46541
llvm-svn: 334174