Summary:
This is a follow up on https://reviews.llvm.org/D71473#inline-647262.
There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case.
Reviewers: xbolva00, courbet, bollu
Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73099
Collect the calling convention and a number of boolean arguments into a
structure to slightly reduces the number of arguments passed around between
LowerCall_<Subtarget>, FinishCall and a few of the helpers. Also
calulates if a call is indirect once using the exisitng helper and caches the
result replacing several instances where we duplicated the logic determining if
a call is indirect.
These intrinsics and the corresponding ISD nodes were recently added. PPC has
instructions that do this for vectors. Legalize them and add patterns to emit
the satuarting instructions.
Differential revision: https://reviews.llvm.org/D71940
For memcpy/memset/memmove etc., replace ExternalSymbolSDNode with a
MCSymbolSDNode, which have a prefix dot before function name as entry
point symbol.
Differential Revision: https://reviews.llvm.org/D70718
Summary:
This patch pushes the AIX vararg unimplemented error diagnostic later
and allows vararg calls so long as all the arguments can be passed in register.
This patch extends the AIX calling convention implementation to initialize
GPR(s) for vararg float arguments. On AIX, both GPR(s) and FPR are allocated
for floating point arguments. The GPR(s) are only initialized for vararg calls,
otherwise the callee is expected to retrieve the float argument in the FPR.
f64 in AIX PPC32 requires special handling in order to allocated and
initialize 2 GPRs. This is performed with bitcast, SRL, truncation to
initialize one GPR for the MSW and bitcast, truncations to initialize
the other GPR for the LSW.
A future patch will follow to add support for arguments passed on the stack.
Patch provided by: cebowleratibm
Reviewers: sfertile, ZarkoCA, hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D71013
Only PPC seems to be using it, and only checks some simple cases and
doesn't distinguish between FP. Just switch to using LLT to simplify
use from GlobalISel.
We use o suffix to indicate record form instuctions,
(as it is similar to dot '.' in mne?)
This was fine before, as we did not support XO-form.
However, with https://reviews.llvm.org/D66902,
we now have XO-form support.
It becomes confusing now to still use 'o' for record form,
and it is weird to have something like 'Oo' .
This patch rename all 'o' instructions to use '_rec' instead.
Also rename `isDot` to `isRecordForm`.
Reviewed By: #powerpc, hfinkel, nemanjai, steven.zhang, lkail
Differential Revision: https://reviews.llvm.org/D70758
When the "disable-tail-calls" attribute was added, checks were added for
it in various backends. Now this code has proliferated, and it is
something the target is responsible for checking. Move that
responsibility back to the ISels (fast, global, and SD).
There's no major functionality change, except for targets that never
implemented this check.
This LLVM attribute was originally added in
d9699bc7bd (2015).
Reviewers: echristo, MaskRay
Differential Revision: https://reviews.llvm.org/D72118
Commit 0f0330a787 legalized these nodes on PPC without consideration of
unsafe math which means that we get inexact exceptions raised for nearbyint.
Since this doesn't conform to the standard, switch this legalization to depend
on unsafe fp math.
VSX provides a full complement of rounding instructions yet we somehow ended up
with some of them legal and others not. This just legalizes all of the FP
rounding nodes and the FP -> int rounding nodes with unsafe math.
Differential revision: https://reviews.llvm.org/D69949
This is a fix for https://bugs.llvm.org/show_bug.cgi?id=40554
Some CPU's trap to the kernel on unaligned floating point access and there are
kernels that do not handle the interrupt. The program then fails with a SIGBUS
according to the PR. This just switches the default for unaligned access to only
allow it on recent server CPUs that are known to allow this.
Differential revision: https://reviews.llvm.org/D71954
The custom node PPCISD::XXREVERSE has completely the same semantics of generic node ISD::BSWAP.
We need to clean up it as we have the combine rules for bswap in the base class, while nothing for xxreverse.
Differential Revision: https://reviews.llvm.org/D70657
Summary:
Currently, we set legalization action of `ISD::ROTL` vectors as
`Expand` in `PPCISelLowering`. However, we can exploit `vrl(b|h|w|d)`
to lower `ISD::ROTL` directly.
Differential Revision: https://reviews.llvm.org/D71324
static void *ifunc(void) __attribute__((ifunc("resolver")));
void foo() { ifunc(); }
The relocation produced by the ifunc() call:
1. gcc -msecure-plt -fPIC => R_PPC_PLTREL24 r_addend=0x8000
2. gcc -msecure-plt -PIE => R_PPC_PLTREL24 r_addend=0x8000
3. clang -msecure-plt -fPIC => R_PPC_PLTREL24 r_addend=0x8000
4. clang -msecure-plt -fPIE => R_PPC_REL24
4 is incorrect. The R_PPC_REL24 needs a call stub due to ifunc. If this
relocation is mixed with other R_PPC_PLTREL24(r_addend=0x8000) in a
function, both GNU ld and lld (after D71621 fix) may produce a wrong
result.
This patch fixes 4 to use R_PPC_PLTREL24, which matches GCC.
Both GNU ld and lld (after D71621) will be happy.
Reviewed By: sfertile
Differential Revision: https://reviews.llvm.org/D71649
Summary:
The default static (non-PIC, non-PIE) model for 32-bit powerpc does not
use @PLT annotations and relocations in GCC. LLVM shouldn't use @PLT
annotations either, because it breaks secure-PLT linking with (some
versions of?) GNU LD.
Update the available-externally.ll test to reflect that default mode should be
the same as the static relocation, by using the same check prefix.
Reviewed by: sfertile
Differential Revision: https://reviews.llvm.org/D70570
Refactor the splatting of a constant to a vector so that common code is used
both for Power9 and Power8.
Patch by: Anil Mahmud
Differential Revision: https://reviews.llvm.org/D71481
We somehow missed doing this when we were working on Power9 exploitation.
This just adds the missing legalization and cost for producing the vector
intrinsics.
Differential revision: https://reviews.llvm.org/D70436
Extends the desciptor-based indirect call support for 32-bit codegen,
and enables indirect calls for AIX.
In-depth Description:
In a function descriptor based ABI, a function pointer points at a
descriptor structure as opposed to the function's entry point. The
descriptor takes the form of 3 pointers: 1 for the function's entry
point, 1 for the TOC anchor of the module containing the function
definition, and 1 for the environment pointer:
struct FunctionDescriptor {
void *EntryPoint;
void *TOCAnchor;
void *EnvironmentPointer;
};
An indirect call has several steps of loading the the information from
the descriptor into the proper registers for setting up the call. Namely
it has to:
1) Save the caller's TOC pointer into the TOC save slot in the linkage
area, and then load the callee's TOC pointer into the TOC register
(GPR 2 on AIX).
2) Load the function descriptor's entry point into the count register.
3) Load the environment pointer into the environment pointer register
(GPR 11 on AIX).
4) Perform the call by branching on count register.
5) Restore the caller's TOC pointer after returning from the indirect call.
A couple important caveats to the above:
- There is no way to directly load a value from memory into the count register.
Instead we populate the count register by loading the entry point address into
a gpr and then moving the gpr to the count register.
- The TOC restore has to come immediately after the branch on count register
instruction (i.e., the 1st instruction executed after we return from the
call). This is an implementation limitation. We could, in theory, schedule
the restore elsewhere as long as no uses of the TOC pointer fall in between
the call and the restore; however, to keep it simple, we insert a pseudo
instruction that represents both the indirect branch instruction and the
load instruction that restores the caller's TOC from the linkage area. As
they flow through the compiler as a single pseudo instruction, nothing can be
inserted between them and the caller's TOC is then valid at any use.
Differtential Revision: https://reviews.llvm.org/D70724
This has two main effects:
- Optimizes debug info size by saving 221.86 MB of obj file size in a
Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of
object file size.
- Incremental step towards decoupling target intrinsics.
The enums are still compact, so adding and removing a single
target-specific intrinsic will trigger a rebuild of all of LLVM.
Assigning distinct target id spaces is potential future work.
Part of PR34259
Reviewers: efriedma, echristo, MaskRay
Reviewed By: echristo, MaskRay
Differential Revision: https://reviews.llvm.org/D71320
Summary:
This is found during https://reviews.llvm.org/D70758
All the other record forms are having suffix o at the end.
ANDIo8 and ANDISo8 are the only two that put o before 8.
This patch rename them to be consistent with others.
Reviewers: #powerpc, hfinkel, nemanjai, lei, steven.zhang, echristo, jhibbits, joerg
Reviewed By: jhibbits
Subscribers: wuzish, hiraditya, kbarton, shchenz, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70928
Refactor FinishCall to be more easily understandable as a precursor to
implementing indirect calls for AIX. The refactor tries to group similar
code together at the cost of some code duplication. The high level
overview of the refactor:
- Adds a number of helper functions for things like:
* Determining if a call is indirect.
* What the Opcode for a call is.
* Transforming the callee for a direct function call.
* Extracting the Chain operand from a CallSeqStart node.
* Building the operands of the call.
- Adds helpers for building the indirect call DAG nodes
(excluding the call instruction itself which is created in
`FinishCall`).
- Removes PrepareCall, which has been subsumed by the
helpers.
- Rename 'InFlag' to 'Glue'.
- FinishCall has been refactored to:
1) Set TOC pointer usage on the DAG for the TOC based
subtargets.
2) Calculate if a call is indirect.
3) Determine the Opcode to use for the call
instruction.
4) Transform the Callee for direct calls, or build
the DAG nodes for indirect calls.
5) Buildup the call operands.
6) Emit the call instruction.
7) If needed, emit the callSeqEnd Node and
finish lowering by calling `LowerCallResult`
Differential Revision: https://reviews.llvm.org/D70126
This patch adds LowerFormalArguments_AIX, support is added for lowering
int, float, and double formal arguments into general purpose and
floating point registers only.
The aix calling convention testcase have been redone to test for caller
and callee functionality in the same lit test.
Patch by Zarko Todorovski!
Differential Revision: https://reviews.llvm.org/D69578
This is a continuation of D70262
The previous patch as listed above added the future CPU in clang. This patch
adds the future CPU in the PowerPC backend. At this point the patch simply
assumes that a future CPU will have the same characteristics as pwr9. Those
characteristics may change with later patches.
Differential Revision: https://reviews.llvm.org/D70333
Summary:
This patch renames the DarwinDirective (used to identify which CPU was defined)
to CPUDirective. It also adds the getCPUDirective() method and replaces all uses
of getDarwinDirective() with getCPUDirective().
Once this patch lands and downstream users of the getDarwinDirective() method
have switched to the getCPUDirective() method, the old getDarwinDirective()
method will be removed.
Reviewers: nemanjai, hfinkel, power-llvm-team, jsji, echristo, #powerpc, jhibbits
Reviewed By: hfinkel, jsji, jhibbits
Subscribers: hiraditya, shchenz, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70352
If an inline asm statement clobbers a VSX register that overlaps with a
callee-saved Altivec register or FPR, we will not record the clobber and will
therefore violate the ABI. This is clearly a bug so this patch fixes it.
Differential revision: https://reviews.llvm.org/D68576
Summary:
This patch sets up the infrastructure for
1. Associate MCSymbolXCOFF with an MCSectionXCOFF when it could not
get implicitly associated.
2. Generate undefined symbols. The patch itself generates undefined symbol
for external function call only. Generate undefined symbol for external
global variable and external function descriptors will be handled in
separate patch(s) after this is land.
Differential Revision: https://reviews.llvm.org/D70443
Power9 has instructions to implement the semantics of SIGN_EXTEND_INREG for vector type.
Mark it as legal and add the match pattern.
Differential Revision: https://reviews.llvm.org/D69601
This patch lowering jump table, constant pool and block address in assembly.
1. On AIX, jump table index is always relative;
2. Put CPI and JTI into ReadOnlySection until we support unique data sections;
3. Create the temp symbol for block address symbol;
4. Update MIR testcases and add related assembly part;
Differential Revision: https://reviews.llvm.org/D70243
AMDGPU needs to know the FP mode for the function to answer this
correctly when this is removed from the subtarget.
AArch64 had to make this more complicated by using this from an IR
hook, so add an IR typed overload.
This option allows the user to specify the use of absolute jumptables instead
of relative which is the default on most PPC subtargets.
Patch by Kamauu Bridgeman
Differential revision: https://reviews.llvm.org/D69108
VSX provides floating point minimum and maximum instructions that conform
to IEEE semantics. This legalizes the respective nodes and emits VSX code
for them. Furthermore, on Power9 cores we have xsmaxcdp and xsmincdp
instructions that conform to language semantics for the conditional operator
even in the presence of NaNs.
Differential revision: https://reviews.llvm.org/D62993
This patch reworks the AIX call lowering to use CCState. Some defensive errors
are added in this patch to protect from emitting bad code for calling convention
logic that has not been implemented by design. The use of CCState follows the
precedent of other targets and enables the reuse of calling convention logic in
LowerFormalArguments, which will be rewritten to also use CCState in a late
patch.
Patch by Chris Bowler.
Differential Revision: https://reviews.llvm.org/D69101
Replace with the MachineFunction. X86 is the only user, and only uses
it for the function. This removes one obstacle from using this in
GlobalISel. The other is the more tolerable EVT argument.
The X86 use of the function seems questionable to me. It checks hasFP,
before frame lowering.
llvm-svn: 373292
We always(and only) check the NLP flag after calling
classifyGlobalReference to see whether it is accessed
indirectly.
Refactor to code to use isGVIndirectSym instead.
llvm-svn: 372417
We currently produce a load, followed by (possibly a move for integers and) a
splat as separate instructions. VSX has always had a splatting load for
doublewords, but as of Power9, we have it for words as well. This patch just
exploits these instructions.
Differential revision: https://reviews.llvm.org/D63624
llvm-svn: 372139
* Reordered MVT simple types to group scalable vector types
together.
* New range functions in MachineValueType.h to only iterate over
the fixed-length int/fp vector types.
* Stopped backends which don't support scalable vector types from
iterating over scalable types.
Reviewers: sdesmalen, greened
Reviewed By: greened
Differential Revision: https://reviews.llvm.org/D66339
llvm-svn: 372099
Add the missing piece of r372029.
Somehow when the patch for review D61961 was committed, only the test case
went in and the code didn't. This of course caused all kinds of build bot
breaks.
This patch just adds the code for that patch.
Author: Lei Huang
Differential revision: https://reviews.llvm.org/D61961
llvm-svn: 372043