This patch adds a flag -fclang-abi-compat that can be used to request that
Clang attempts to be ABI-compatible with some older version of itself.
This is provided on a best-effort basis; right now, this can be used to undo
the ABI change in r310401, reverting Clang to its prior C++ ABI for pass/return
by value of class types affected by that change, and to undo the ABI change in
r262688, reverting Clang to using integer registers rather than SSE registers
for passing <1 x long long> vectors. The intent is that we will maintain this
backwards compatibility path as we make ABI-breaking fixes in future.
The reversion to the old behavior for r310401 is also applied to the PS4 target
since that change is not part of its platform ABI (which is essentially to do
whatever Clang 3.2 did).
llvm-svn: 311823
This patch is intended to enable the use of basic double letter constraints used in GCC extended inline asm {Yi Y2 Yz Y0 Ym Yt}.
Supersedes D35205
llvm counterpart: D36369
Differential Revision: https://reviews.llvm.org/D36371
llvm-svn: 311643
The comment markers accepted by the assembler vary between different targets,
but '//' is always accepted, so we should use that for consistency.
Differential revision: https://reviews.llvm.org/D36666
llvm-svn: 311325
This is causing failures when compiling clang with -O3
as one of the structures used by clang is passed by
value and uses the fastcc calling convention.
Faliures manifest for stage2 mips build.
llvm-svn: 310704
This is an improvement over always using byval for
structs.
This will use registers until ~16 are used, and then
switch back to byval. This needs more work, since I'm
not sure it ever really makes sense to use byval. If
the register limit is exceeded, the arguments still
end up passed on the stack, but with a different ABI.
It also may make sense to base this on number of
registers used for non-struct arguments, rather than
just arguments that appear first in the argument list.
llvm-svn: 310527
OpenCL 2.0 atomic builtin functions have a scope argument which is ideally
represented as synchronization scope argument in LLVM atomic instructions.
Clang supports translating Clang atomic builtin functions to LLVM atomic
instructions. However it currently does not support synchronization scope
of LLVM atomic instructions. Without this, users have to use LLVM assembly
code to implement OpenCL atomic builtin functions.
This patch adds OpenCL 2.0 atomic builtin functions as Clang builtin
functions, which supports generating LLVM atomic instructions with
synchronization scope operand.
Currently only constant memory scope argument is supported. Support of
non-constant memory scope argument will be added later.
Differential Revision: https://reviews.llvm.org/D28691
llvm-svn: 310082
This option when combined with -mgpopt and -membedded-data places all
uninitialized constant variables in the read-only section.
Reviewers: atanasyan, nitesh.jain
Differential Revision: https://reviews.llvm.org/D35917
llvm-svn: 309940
The ARM Runtime ABI document (IHI0043) defines the AEABI floating point
helper functions in 4.1.2 The floating-point helper functions. These
functions always use the base PCS (soft-fp). However helper functions
defined outside of this document such as the complex-number multiply and
divide helpers are not covered by this requirement and should use
hard-float PCS if the target is hard-float as both compiler-rt and libgcc
for a hard-float sysroot implement these functions with a hard-float PCS.
All of the floating point helper functions that are explicitly soft float
are expanded in the llvm ARM backend. This change makes clang not force the
BuiltinCC to AAPCS for AAPCS_VFP. With this change the ARM compiler-rt
tests involving _Complex pass with both hard-fp and soft-fp targets.
Differential Revision: https://reviews.llvm.org/D35538
llvm-svn: 309257
This change is part of the RegCall calling convention support for LLVM.
Existing RegCall implementation was extended to include correct handling of
Complex Long Double type. Complex long double types should be returned/passed
in memory and not register stack. This patch implements this behavior.
Patch by: eandrews
Differential Revision: https://reviews.llvm.org/D35259
llvm-svn: 308769
This patch adds support for the `long_call`, `far`, and `near` attributes
for MIPS targets. The `long_call` and `far` attributes are synonyms. All
these attributes override `-mlong-calls` / `-mno-long-calls` command
line options for particular function.
Differential revision: https://reviews.llvm.org/D35479
llvm-svn: 308667
Certain targets (e.g. amdgcn) require global variable to stay in global or constant address
space. In C or C++ global variables are emitted in the default (generic) address space.
This patch introduces virtual functions TargetCodeGenInfo::getGlobalVarAddressSpace
and TargetInfo::getConstantAddressSpace to handle this in a general approach.
It only affects IR generated for amdgcn target.
Differential Revision: https://reviews.llvm.org/D33842
llvm-svn: 307470
In running some internal vectorcall tests in 32 bit mode, we discovered that the
behavior I'd previously implemented for x64 (and applied to x32) regarding the
assignment of SSE registers was incorrect. See spec here:
https://msdn.microsoft.com/en-us/library/dn375768.aspx
My previous implementation applied register argument position from the x64
version to both. This isn't correct for x86, so this removes and refactors that
section. Additionally, it corrects the integer/int-pointer assignments. Unlike
x64, x86 permits integers to be assigned independent of position.
Finally, the code for 32 bit was cleaned up a little to clarify the intent,
as well as given a descriptive comment.
Differential Revision: https://reviews.llvm.org/D34455
llvm-svn: 305928
Summary: OpenCL and SPIR version metadata must be generated once per module instead of once per mangled global value.
Reviewers: Anastasia, yaxunl
Reviewed By: Anastasia
Subscribers: ahatanak, cfe-commits
Differential Revision: https://reviews.llvm.org/D34235
llvm-svn: 305796
Rationale: OpenCL kernels are called via an explicit runtime API
with arguments set with clSetKernelArg(), not as normal sub-functions.
Return SPIR_KERNEL by default as the kernel calling convention to ensure
the fingerprint is fixed such way that each OpenCL argument gets one
matching argument in the produced kernel function argument list to enable
feasible implementation of clSetKernelArg() with aggregates etc. In case
we would use the default C calling conv here, clSetKernelArg() might
break depending on the target-specific conventions; different targets
might split structs passed as values to multiple function arguments etc.
https://reviews.llvm.org/D33639
llvm-svn: 304389
This patch adds support for the `micromips` and `nomicromips` attributes
for MIPS targets.
Differential revision: https://reviews.llvm.org/D33363
llvm-svn: 303546
Alloca always returns a pointer in alloca address space, which may
be different from the type defined by the language. For example,
in C++ the auto variables are in the default address space. Therefore
cast alloca to the expected address space when necessary.
Differential Revision: https://reviews.llvm.org/D32248
llvm-svn: 303370
Modified MipsABIInfo::classifyArgumentType so that it now coerces
aggregate structures only if the size of said aggregate is less than
16/64 bytes, depending on the ABI.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D32900
with minor changes (use regexp instead of the hardcoded values) to the test.
llvm-svn: 302670
Use variadic templates instead of relying on <cstdarg> + sentinel.
This enforces better type checking and makes code more readable.
Differential revision: https://reviews.llvm.org/D32550
llvm-svn: 302572
Reverting
Modified MipsABIInfo::classifyArgumentType so that it now coerces
aggregate structures only if the size of said aggregate is less than 16/64
bytes, depending on the ABI.
as it broke clang-with-lto-ubuntu builder.
llvm-svn: 302555
Modified MipsABIInfo::classifyArgumentType so that it now coerces aggregate
structures only if the size of said aggregate is less than 16/64 bytes,
depending on the ABI.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D32900
llvm-svn: 302547
It turns out there are some sort-of-but-not-quite empty structs that break all
the rules. For example:
struct SuperEmpty { int arr[0]; };
struct SortOfEmpty { struct SuperEmpty e; };
Both of these have sizeof == 0, even in C++ mode, for GCC compatibility. The
first one also doesn't occupy a register when passed by value in GNU C++ mode,
unlike everything else.
On Darwin, we want to ignore the lot (and especially don't want to try to use
an i0 as we were).
llvm-svn: 302313
These two attributes specify the same info in a different way.
AMGPU BE only checks the latter as a target specific attribute
as opposed to language specific reqd_work_group_size.
This change produces amdgpu_flat_work_group_size out of
reqd_work_group_size if specified.
Differential Revision: https://reviews.llvm.org/D31728
llvm-svn: 299678
Use # as the comment leader for AArch64 auto-release elision marker.
This is to keep it in sync with the value used in swift. When building
libdispatch for Linux AArch64, the auto-release elision marker was
emitted. However, ELF uses # as the comment leader while MachO accepts
both ; and #. Use the common marker for it instead.
llvm-svn: 294877
Summary:
This teaches clang how to parse and lower the 'interrupt' and 'naked'
attributes.
This allows interrupt signal handlers to be written.
Reviewers: aaron.ballman
Subscribers: malcolm.parsons, cfe-commits
Differential Revision: https://reviews.llvm.org/D28451
llvm-svn: 294402
This comes up in V8, which has a Handle template class that wraps a
typed pointer, and is frequently passed by value. The pointer is stored
in the base, HandleBase. This change allows us to pass the struct as a
pointer instead of using byval. This avoids creating tons of temporary
allocas that we copy from during call lowering.
Eventually, it would be good to use FCAs here instead.
llvm-svn: 291917
Front end component (back end changes are D27392). The vectorcall
calling convention was broken subtly in two cases. First,
it didn't properly handle homogeneous vector aggregates (HVAs).
Second, the vectorcall specification requires that only the
first 6 parameters be eligible for register assignment.
This patch fixes both issues.
Differential Revision: https://reviews.llvm.org/D27529
llvm-svn: 291041
At least the plugin used by the LibreOffice build
(<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly
uses those members (through inline functions in LLVM/Clang include files in turn
using them), but they are not exported by utils/extract_symbols.py on Windows,
and accessing data across DLL/EXE boundaries on Windows is generally
problematic.
Differential Revision: https://reviews.llvm.org/D26671
llvm-svn: 289647
In amdgcn target, null pointers in global, constant, and generic address space take value 0 but null pointers in private and local address space take value -1. Currently LLVM assumes all null pointers take value 0, which results in incorrectly translated IR. To workaround this issue, instead of emit null pointers in local and private address space, a null pointer in generic address space is emitted and casted to local and private address space.
Tentative definition of global variables with non-zero initializer will have weak linkage instead of common linkage since common linkage requires zero initializer and does not have explicit section to hold the non-zero value.
Virtual member functions getNullPointer and performAddrSpaceCast are added to TargetCodeGenInfo which by default returns ConstantPointerNull and emitting addrspacecast instruction. A virtual member function getNullPointerValue is added to TargetInfo which by default returns 0. Each target can override these virtual functions to get target specific null pointer and the null pointer value for specific address space, and perform specific translations for addrspacecast.
Wrapper functions getNullPointer is added to CodegenModule and getTargetNullPointerValue is added to ASTContext to facilitate getting the target specific null pointers and their values.
This change has no effect on other targets except amdgcn target. Other targets can provide support of non-zero null pointer in a similar way.
This change only provides support for non-zero null pointer for C and OpenCL. Supporting for other languages will be added later incrementally.
Differential Revision: https://reviews.llvm.org/D26196
llvm-svn: 289252
This patch implements the register call calling convention, which ensures
as many values as possible are passed in registers. CodeGen changes
were committed in https://reviews.llvm.org/rL284108.
Differential Revision: https://reviews.llvm.org/D25204
llvm-svn: 285849
Enable soft-float support on PPC64, as the backend now supports it. Also, the
backend now uses -hard-float instead of +soft-float, so set the target features
accordingly.
Fixes PR26970.
llvm-svn: 283061
__attribute__((amdgpu_flat_work_group_size(<min>, <max>))) - request minimum and maximum flat work group size
__attribute__((amdgpu_waves_per_eu(<min>[, <max>]))) - request minimum and/or maximum waves per execution unit
Differential Revision: https://reviews.llvm.org/D24513
llvm-svn: 282371
Move the definition of `getTriple()` into the header. It would just call
`getTarget().getTriple()`. Inline the definition to allow the compiler to see
the same amount of the layout as previously. Remove the more verbose
`getTarget().getTriple()` in favour of `getTriple()`.
llvm-svn: 281487
The PPC64 DWARF register-size table did not match the ABI specification (or
GCC, for that matter). Fix that, and add a regression test.
Fixes PR27931.
llvm-svn: 280053
Structs are currently handled as pointer + byval, which makes AMDGPU
LLVM backend generate incorrect code when structs are used. This patch
changes struct argument to be handled directly and without flattening,
which Clover (Mesa 3D Gallium OpenCL state tracker) will be able to
handle. Flattening would expand the struct to individual elements and
pass each as a separate argument, which Clover can not
handle. Furthermore, such expansion does not fit the OpenCL
programming model which requires to explicitely specify each argument
index, size and memory location.
Patch by Vedran Miletić
llvm-svn: 279463
We processed unnamed bitfields after our logic for non-vector field
elements in records larger than 128 bits. The vector logic would
determine that the bit-field disqualifies the record from occupying a
register despite the unnamed bit-field not participating in the record
size nor its alignment.
N.B. This behavior matches GCC and ICC.
llvm-svn: 278656
An __m512 vector type wrapped in a structure should be passed in a
vector register.
Our prior implementation was based on a draft version of the psABI.
This fixes PR28975.
N.B. The update to the ABI was made here:
https://github.com/hjl-tools/x86-psABI/commit/30f9c9
llvm-svn: 278655
Summary:
Based on a patch by Michael Mueller.
This attribute specifies that a function can be hooked or patched. This
mechanism was originally devised by Microsoft for hotpatching their
binaries (which they're constantly updating to stay ahead of crackers,
script kiddies, and other ne'er-do-wells on the Internet), but it's now
commonly abused by Windows programs that want to hook API functions. It
is for this reason that this attribute was added to GCC--hence the name,
`ms_hook_prologue`.
Depends on D19908.
Reviewers: rnk, aaron.ballman
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D19909
llvm-svn: 278050
The size of image type is reported incorrectly as size of a pointer to address space 0, which causes error when casting image type to pointers by __builtin_astype.
The fix is to get image address space from TargetInfo then report the size accordingly.
Differential Revision: https://reviews.llvm.org/D22927
llvm-svn: 277647
Summary:
In RenderScript, the size of the argument or return value emitted in the
IR is expected to be the same as the size of corresponding qualified
type. For ARM and AArch64, the coercion performed by Clang can
change the parameter or return value to a type whose size is different
(usually larger) than the original aggregate type. Specifically, this
can happen in the following cases:
- Aggregate parameters of size <= 64 bytes and return values smaller
than 4 bytes on ARM
- Aggregate parameters and return values smaller than bytes on
AArch64
This patch coerces the cases above to an integer array that is the same
size and alignment as the original aggregate. A new field is added to
TargetInfo to detect a RenderScript target and limit this coercion just
to that case.
Tests added to test/CodeGen/renderscript.c
Reviewers: rsmith
Subscribers: aemerson, srhines, llvm-commits
Differential Revision: https://reviews.llvm.org/D22822
llvm-svn: 276904
Allows AMDGCN target to generate images (such as %opencl.image2d_t) in constant address space.
Images will still be generated in global address space by default.
Added tests to existing opencl-types.cl in test\CodeGenOpenCL.
Patch by Aaron En Ye Shi.
Differential Revision: https://reviews.llvm.org/D22523
llvm-svn: 276161
Added the opencl.ocl.version metadata to be emitted with amdgcn. Created a static function emitOCLVerMD which is shared between triple spir and target amdgcn.
Also added new testcases to existing test file, spir_version.cl inside test/CodeGenOpenCL.
Patch by Aaron En Ye Shi.
Differential Revision: https://reviews.llvm.org/D22424
llvm-svn: 276010
Summary:
Summary:
Change Clang calling convention SpirKernel to OpenCLKernel.
Set calling convention OpenCLKernel for amdgcn as well.
Add virtual method .getOpenCLKernelCallingConv() to TargetCodeGenInfo
and use it to set target calling convention for AMDGPU and SPIR.
Update tests.
Reviewers: rsmith, tstellarAMD, Anastasia, yaxunl
Subscribers: kzhuravl, cfe-commits
Differential Revision: http://reviews.llvm.org/D21367
llvm-svn: 274220
smaller than register as argument in variadic functions on
big endian architectures.
Differential Revision: http://reviews.llvm.org/D21611
llvm-svn: 273665
We would incorrectly emit the directive sections due to the missing overridden
methods. We now emit the expected "/DEFAULTLIB" rather than "-l" options for
requested linkage
llvm-svn: 273558
Summary:
Clang does not detect `aapcs-vfp` for the EABIHF environment. The reason is that only GNUEABIHF is considered while choosing calling convention, EABIHF is ignored.
This causes clang to use `aapcs` for EABIHF and add the `arm_aapcscc` specifier to functions in generated IR.
The modified `arm-cc.c` test checks that no calling convention specifier is added to functions for EABIHF, which means the default one is used (`CallingConv::ARM_AAPCS_VFP`).
Reviewers: rengolin, compnerd, t.p.northover
Subscribers: aemerson, rengolin, asl, cfe-commits
Differential Revision: http://reviews.llvm.org/D20219
llvm-svn: 269419
Use a utility function to check whether the number of elements is a power of 2
and drop the redundant upper limit (a 128-bit vector with more than 16 elements
would have each element < 8 bits, not possible).
llvm-svn: 268422
Before this change, we would pass all non-HFA record arguments on
Windows with byval. Byval often blocks optimizations and results in bad
code generation. Windows now uses the existing workaround that other
x86_32 platforms use.
I also expanded the workaround to handle C++ records with constructors
on Windows. On non-Windows platforms, we have to keep generating the
same LLVM IR prototypes if we want our bitcode to be ABI compatible.
Otherwise we will encounter mismatch issues like PR21573.
Essentially fixes PR27522 in Clang instead of LLVM.
Reviewers: hans
Differential Revision: http://reviews.llvm.org/D19756
llvm-svn: 268261
Summary:
Port rL265324 to SystemZ to allow using the 'swiftcall' attribute on that architecture.
Depends on D19414.
Reviewers: kbarton, rjmccall, uweigand
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D19432
llvm-svn: 267879
SPIR target can be used for C/C++ inputs too (i.e. in OpenCL compatible mode for the libs creation).
Patch by Neil Henning!
Review: http://reviews.llvm.org/D19478
llvm-svn: 267561
Currently, for the ppc64--gnu and aarch64 ABIs, we recognize:
typedef __attribute__((__ext_vector_type__(3))) float v3f32;
typedef __attribute__((__ext_vector_type__(16))) char v16i8;
struct HFA {
v3f32 a;
v16i8 b;
};
as an HFA. Since the first type encountered is used as the base type,
we pass the HFA as:
[2 x <3 x float>]
Which leads to incorrect IR (relying on padding values) when the
second field is used.
Instead, explicitly widen the vector (after size rounding) in
isHomogeneousAggregate.
Differential Revision: http://reviews.llvm.org/D18998
llvm-svn: 266784
Non-owning pointers that cache LLVM types and constants can use
'nullptr' default member initializers so that we don't need to mention
them in the constructor initializer list.
Owning pointers should use std::unique_ptr so that we don't need to
manually delete them in the destructor. They also don't need to be
mentioned in the constructor at that point.
NFC
llvm-svn: 266263
Revert the two changes to thread CodeGenOptions into the TargetInfo allocation
and to fix the layering violation by moving CodeGenOptions into Basic.
Code Generation is arguably not particularly "basic". This addresses Richard's
post-commit review comments. This change purely does the mechanical revert and
will be followed up with an alternate approach to thread the desired information
into TargetInfo.
llvm-svn: 265806
This is a mechanical move of CodeGenOptions from libFrontend to libBasic. This
fixes the layering violation introduced earlier by threading CodeGenOptions into
TargetInfo. It should also fix the modules based self-hosting builds. NFC.
llvm-svn: 265702
Summary:
r246764 handled __fp16 arguments and returns for AAPCS, but skipped this
handling for OpenCL. Simlar to OpenCL, RenderScript also handles __fp16
type natively.
This patch adds the -fnative-half-arguments-and-returns command line
flag to allow such languages to skip this coercion of __fp16.
Reviewers: srhines, olista01
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18138
llvm-svn: 263795
For compatibility with GCC, classify __m64 as SSE.
However, clang is a platform compiler for certain targets; retain our
old behavior on those targets: classify __m64 as integer.
This fixes PR26832.
llvm-svn: 262688
Fixes PR11517 for SPARC.
On most targets, clang lowers va_arg itself, eschewing the use of the
llvm vaarg instruction. This is necessary (at least for now) as the type
argument to the vaarg instruction cannot represent all the ABI
information that is needed to support complex calling conventions.
However, on targets with a simpler varrags ABIs, the LLVM instruction
can work just fine, and clang can simply lower to it. Unfortunately,
even on such targets, vaarg with a struct argument would fail, because
the default lowering to vaarg was naive: it didn't take into account the
ABI attribute computed by classifyArgumentType. In particular, for the
DefaultABIInfo, structs are supposed to be passed indirectly and so
llvm's vaarg instruction should be emitted with a pointer argument.
Now, vaarg instruction emission is able to use computed ABIArgInfo for
the provided argument type, which allows the default ABI support to work
for structs too.
I haven't touched the EmitVAArg implementation for PPC32_SVR4 or XCore,
although I believe both are now redundant, and could be switched over to
use the default implementation as well.
Differential Revision: http://reviews.llvm.org/D16154
llvm-svn: 261717
This uses the general emitVoidPtrVAArg lowering logic for everything, since
this supports all types, and we don't have any special requirements.
llvm-svn: 261557
This modification applies the following Android commit when we have an
Android environment. This is the sole non-renderscript in the Android repo
commit 9212d4fb30a3ca2f4ee966dd2748c35573d9682c
Author: Tim Murray <timmurray@google.com>
Date: Fri Aug 15 16:00:15 2014 -0700
Update vector calling convention for AArch64.
bug 16846318
Change-Id: I3cfd167758b4bd634d8480ee6ba6bb55d61f82a7
Reviewers: srhines, jyknight
Subscribers: mcrosier, aemerson, rengolin, tberghammer, danalbert, srhines
Differential Revision: http://reviews.llvm.org/D17448
llvm-svn: 261533
It can happen that when we only have 1 more register left in the regsave
area we need to store a value bigger than 1 register and therefore we
go to the overflow area. In this case we have to leave the last slot
in the regsave area unused and keep using overflow area. Do this
by storing a limit value to the used register counter in the overflow block.
Issue diagnosed by and solution tested by Mark Millard!
llvm-svn: 261422