teak-llvm/clang/lib/CodeGen
Simon Tatham 08074cc965 [clang,ARM] Initial ACLE intrinsics for MVE.
This commit sets up the infrastructure for auto-generating <arm_mve.h>
and doing clang-side code generation for the builtins it relies on,
and demonstrates that it works by implementing a representative sample
of the ACLE intrinsics, more or less matching the ones introduced in
LLVM IR by D67158, D68699, D68700.

Like NEON, that header file will provide a set of vector types like
uint16x8_t and C functions with names like vaddq_u32(). Unlike NEON,
the ACLE spec for <arm_mve.h> includes a polymorphism system, so that
you can write plain vaddq() and disambiguate by the vector types you
pass to it.

Unlike the corresponding NEON code, I've arranged to make every user-
facing ACLE intrinsic into a clang builtin, and implement all the code
generation inside clang. So <arm_mve.h> itself contains nothing but
typedefs and function declarations, with the latter all using the new
`__attribute__((__clang_builtin))` system to arrange that the user-
facing function names correspond to the right internal BuiltinIDs.

So the new MveEmitter tablegen system specifies the full sequence of
IRBuilder operations that each user-facing ACLE intrinsic should
translate into. Where possible, the ACLE intrinsics map to standard IR
operations such as vector-typed `add` and `fadd`; where no standard
representation exists, I call down to the sample IR intrinsics
introduced in an earlier commit.

Doing it like this means that you get the polymorphism for free just
by using __attribute__((overloadable)): the clang overload resolution
decides which function declaration is the relevant one, and _then_ its
BuiltinID is looked up, so by the time we're doing code generation,
that's all been resolved by the standard system. It also means that
you get really nice error messages if the user passes the wrong
combination of types: clang will show the declarations from the header
file and explain why each one doesn't match.
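
As a rough illustration of the overload-based polymorphism, here is a minimal C++ sketch. C++ has overloading natively, so it stands in for what `__attribute__((overloadable))` provides in C. The types `u16x8` and `u32x4` are hypothetical stand-ins, not the real <arm_mve.h> vector types, and the bodies are plain loops rather than builtins.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for ACLE vector types (not the <arm_mve.h> definitions).
struct u16x8 { std::array<std::uint16_t, 8> lanes; };
struct u32x4 { std::array<std::uint32_t, 4> lanes; };

// Two declarations share the polymorphic name; overload resolution picks one
// from the argument types before any code generation happens.
u16x8 vaddq(const u16x8& a, const u16x8& b) {
    u16x8 r{};
    for (std::size_t i = 0; i < r.lanes.size(); ++i)
        r.lanes[i] = static_cast<std::uint16_t>(a.lanes[i] + b.lanes[i]);
    return r;
}

u32x4 vaddq(const u32x4& a, const u32x4& b) {
    u32x4 r{};
    for (std::size_t i = 0; i < r.lanes.size(); ++i)
        r.lanes[i] = a.lanes[i] + b.lanes[i];
    return r;
}
```

Calling `vaddq(x, y)` with one `u16x8` and one `u32x4` fails to compile, and the compiler lists both candidate declarations with the reason each was rejected, which is the kind of diagnostic described above.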

(The obvious alternative approach would be to have wrapper functions
in <arm_mve.h> which pass their arguments to the underlying builtins.
But that doesn't work in the case where one of the arguments has to be
a constant integer: the wrapper function can't pass the constantness
through. So you'd have to do that case using a macro instead, and then
use C11 `_Generic` to handle the polymorphism. Then you have to add
horrible workarounds because `_Generic` requires even the untaken
branches to type-check successfully, and _then_ if the user gets the
types wrong, the error message is totally unreadable!)
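
The constantness problem can be sketched in C++ as an analogy. Here a template parameter plays the role of an argument that must be a compile-time constant; `get_lane_builtin` and `GET_LANE` are hypothetical names for this illustration, not real builtins or ACLE intrinsics.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for a builtin whose lane index must be an immediate.
template <int Lane>
std::uint32_t get_lane_builtin(const std::array<std::uint32_t, 4>& v) {
    static_assert(Lane >= 0 && Lane < 4, "lane index must be a constant in 0..3");
    return v[Lane];
}

// A plain wrapper function cannot forward its parameter as that constant:
//
//   std::uint32_t get_lane(const std::array<std::uint32_t, 4>& v, int lane) {
//       return get_lane_builtin<lane>(v);  // error: 'lane' is not a constant expression
//   }
//
// A macro forwards the tokens themselves, so the constant survives.
#define GET_LANE(v, lane) get_lane_builtin<(lane)>(v)
```

In C the macro route is the only option for such arguments, which is what forces the `_Generic` workarounds mentioned above.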

Reviewers: dmgreen, miyuki, ostannard

Subscribers: mgorny, javed.absar, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D67161
2019-10-24 16:33:13 +01:00
ABIInfo.h
Address.h
BackendUtil.cpp Insert module constructors in a module pass 2019-10-11 08:47:03 +00:00
CGAtomic.cpp Codegen - silence static analyzer getAs<> null dereference warnings. NFCI. 2019-10-07 16:42:25 +00:00
CGBlocks.cpp CGBlocks - silence static analyzer getAs<> null dereference warnings. NFCI. 2019-10-04 15:01:54 +00:00
CGBlocks.h
CGBuilder.h Fix parameter name comments using clang-tidy. NFC. 2019-07-16 04:46:31 +00:00
CGBuiltin.cpp [clang,ARM] Initial ACLE intrinsics for MVE. 2019-10-24 16:33:13 +01:00
CGCall.cpp [Alignment] Migrate Attribute::getWith(Stack)Alignment 2019-10-15 12:56:24 +00:00
CGCall.h
CGClass.cpp [OpenCL] Preserve addrspace in CGClass (PR43145) 2019-10-17 14:12:51 +00:00
CGCleanup.cpp [Alignment][Clang][NFC] Add CharUnits::getAsAlign 2019-10-03 13:00:29 +00:00
CGCleanup.h
CGCoroutine.cpp Fix parameter name comments using clang-tidy. NFC. 2019-07-16 04:46:31 +00:00
CGCUDANV.cpp [Alignment][Clang][NFC] Add CharUnits::getAsAlign 2019-10-03 13:00:29 +00:00
CGCUDARuntime.cpp
CGCUDARuntime.h [HIP] Add the interface deriving the stub name of device kernels. 2019-06-17 12:51:36 +00:00
CGCXX.cpp Codegen - silence static analyzer getAs<> null dereference warnings. NFCI. 2019-10-07 16:42:25 +00:00
CGCXXABI.cpp Silence static analyzer getAs<RecordType> null dereference warnings. NFCI. 2019-10-03 11:22:48 +00:00
CGCXXABI.h Improve code generation for thread_local variables: 2019-09-12 20:00:24 +00:00
CGDebugInfo.cpp PCH debug info: Avoid appending the source directory to an absolute path 2019-10-21 16:44:37 +00:00
CGDebugInfo.h Revert "[CGDebugInfo] Simplify EmitFunctionDecl parameters, NFC" 2019-07-11 19:28:07 +00:00
CGDecl.cpp Added support for "#pragma clang section relro=<name>" 2019-10-15 18:31:10 +00:00
CGDeclCXX.cpp [HIP] Add option -fgpu-allow-device-init 2019-10-22 16:06:20 -04:00
CGException.cpp [WebAssembly] Add -fwasm-exceptions for wasm EH 2019-09-12 04:01:37 +00:00
CGExpr.cpp [c++20] Add CXXRewrittenBinaryOperator to represent a comparison 2019-10-19 00:04:38 +00:00
CGExprAgg.cpp [c++20] Add CXXRewrittenBinaryOperator to represent a comparison 2019-10-19 00:04:38 +00:00
CGExprComplex.cpp [c++20] Add CXXRewrittenBinaryOperator to represent a comparison 2019-10-19 00:04:38 +00:00
CGExprConstant.cpp CGExprConstant - silence static analyzer getAs<> null dereference warning. NFCI. 2019-10-16 10:38:40 +00:00
CGExprCXX.cpp CFI: wrong type passed to llvm.type.test with multiple inheritance devirtualization. 2019-10-15 16:32:50 +00:00
CGExprScalar.cpp [c++20] Add CXXRewrittenBinaryOperator to represent a comparison 2019-10-19 00:04:38 +00:00
CGGPUBuiltin.cpp
CGLoopInfo.cpp Recommit "[Clang] Pragma vectorize_width() implies vectorize(enable)" 2019-10-10 08:27:14 +00:00
CGLoopInfo.h Don't keep stale pointers to LoopInfos. 2019-08-19 13:37:41 +00:00
CGNonTrivialStruct.cpp Do a sweep of symbol internalization. NFC. 2019-08-23 19:59:23 +00:00
CGObjC.cpp Properly handle instantiation-dependent array bounds. 2019-10-04 01:25:59 +00:00
CGObjCGNU.cpp Codegen - silence static analyzer getAs<> null dereference warnings. NFCI. 2019-10-07 16:42:25 +00:00
CGObjCMac.cpp [Alignment][Clang][NFC] Add CharUnits::getAsAlign 2019-10-03 13:00:29 +00:00
CGObjCRuntime.cpp
CGObjCRuntime.h
CGOpenCLRuntime.cpp
CGOpenCLRuntime.h
CGOpenMPRuntime.cpp [OPENMP50]Add support for master taskloop simd. 2019-10-18 16:47:35 +00:00
CGOpenMPRuntime.h [Clang][OpenMP Offload] Move offload registration code to the wrapper 2019-10-15 18:42:47 +00:00
CGOpenMPRuntimeNVPTX.cpp [OPENMP50]Add support for master taskloop simd. 2019-10-18 16:47:35 +00:00
CGOpenMPRuntimeNVPTX.h [OPENMP50]Support for declare variant directive for NVPTX target. 2019-10-10 17:28:10 +00:00
CGRecordLayout.h
CGRecordLayoutBuilder.cpp P0840R2: support for [[no_unique_address]] attribute 2019-06-20 20:44:45 +00:00
CGStmt.cpp [OPENMP50]Add support for master taskloop simd. 2019-10-18 16:47:35 +00:00
CGStmtOpenMP.cpp [OPENMP50]Add support for master taskloop simd. 2019-10-18 16:47:35 +00:00
CGValue.h
CGVTables.cpp Reland: Dead Virtual Function Elimination 2019-10-17 09:58:57 +00:00
CGVTables.h
CGVTT.cpp
CMakeLists.txt Make CodeGen depend on ASTMatchers 2019-06-26 14:13:43 +00:00
CodeGenABITypes.cpp Fix parameter name comments using clang-tidy. NFC. 2019-07-16 04:46:31 +00:00
CodeGenAction.cpp Reland "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM" 2019-09-11 16:19:50 +00:00
CodeGenFunction.cpp Reland r374450 with Richard Smith's comments and test fixed. 2019-10-11 14:59:44 +00:00
CodeGenFunction.h [clang,ARM] Initial ACLE intrinsics for MVE. 2019-10-24 16:33:13 +01:00
CodeGenModule.cpp [AMDGPU] Fix assertion due to initializer list 2019-10-20 15:02:22 +00:00
CodeGenModule.h Reland: Dead Virtual Function Elimination 2019-10-17 09:58:57 +00:00
CodeGenPGO.cpp [Clang] Migrate llvm::make_unique to std::make_unique 2019-08-14 23:04:18 +00:00
CodeGenPGO.h Fix uninitialized variable warning in CodeGenPGO constructor. NFCI. 2019-10-02 21:05:21 +00:00
CodeGenTBAA.cpp Fix TBAA representation for zero-sized fields and unnamed bit-fields. 2019-06-22 21:30:43 +00:00
CodeGenTBAA.h
CodeGenTypeCache.h
CodeGenTypes.cpp Silence static analyzer getAs<RecordType> null dereference warnings. NFCI. 2019-10-03 11:22:48 +00:00
CodeGenTypes.h IRGen: Remove StructorType; thread GlobalDecl through more code. NFCI. 2019-03-22 23:05:10 +00:00
ConstantEmitter.h
ConstantInitBuilder.cpp [Alignment][Clang][NFC] Add CharUnits::getAsAlign 2019-10-03 13:00:29 +00:00
CoverageMappingGen.cpp Re-land "Use -fdebug-compilation-dir to form absolute paths in coverage mappings" 2019-10-10 18:01:20 +00:00
CoverageMappingGen.h Re-land "Use -fdebug-compilation-dir to form absolute paths in coverage mappings" 2019-10-10 18:01:20 +00:00
EHScopeStack.h Replace llvm::integer_sequence and friends with the C++14 standard version 2019-08-15 10:56:05 +00:00
ItaniumCXXABI.cpp Reland: Dead Virtual Function Elimination 2019-10-17 09:58:57 +00:00
MacroPPCallbacks.cpp
MacroPPCallbacks.h
MicrosoftCXXABI.cpp Codegen - silence static analyzer getAs<> null dereference warnings. NFCI. 2019-10-07 16:42:25 +00:00
ModuleBuilder.cpp [Clang] Use -main-file-name for source filename if not set 2019-09-30 15:05:35 +00:00
ObjectFilePCHContainerOperations.cpp [Alignment][Clang][NFC] Add CharUnits::getAsAlign 2019-10-03 13:00:29 +00:00
PatternInit.cpp CodeGet: Init 32bit pointers with 0xFFFFFFFF 2019-07-12 17:21:55 +00:00
PatternInit.h Variable auto-init: also auto-init alloca 2019-04-12 00:11:27 +00:00
README.txt
SanitizerMetadata.cpp ARM MTE stack sanitizer. 2019-07-15 20:02:23 +00:00
SanitizerMetadata.h
SwiftCallingConv.cpp
TargetInfo.cpp Remove an useless allocation (from by clang-analyzer/scan-build) 2019-10-08 09:17:46 +00:00
TargetInfo.h [OpenCL][PR41727] Prevent ICE on global dtors 2019-07-15 11:58:10 +00:00
VarBypassDetector.cpp
VarBypassDetector.h

IRgen optimization opportunities.

//===---------------------------------------------------------------------===//

The common pattern of
--
short x; // or char, etc
(x == 10)
--
generates a zext/sext of x which can easily be avoided.
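
A minimal sketch of why the extension appears and why it is avoidable, assuming a 16-bit short and a wider int. The function names are made up for this note. The second function only models a 16-bit compare at the source level; it is not a claim about what IRgen emits for it.

```cpp
#include <cstdint>
#include <type_traits>

// The language promotes x to int before comparing, so a direct lowering
// widens x first (sext for signed types, zext for unsigned ones).
bool compare_promoted(short x) {
    static_assert(std::is_same<decltype(x + 0), int>::value,
                  "short promotes to int in arithmetic");
    return x == 10;
}

// The constant 10 fits in 16 bits, so comparing the low 16 bits of x against
// 10 gives the same answer without widening x.
bool compare_narrow(short x) {
    return static_cast<std::uint16_t>(x) == static_cast<std::uint16_t>(10);
}
```

The two functions agree for every value of x, which is what makes the narrow compare a safe replacement whenever the constant is representable in the narrow type.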

//===---------------------------------------------------------------------===//

Bitfield accesses can be shifted to simplify masking and sign
extension. For example, if the bitfield width is 8 and it is
appropriately aligned then it is a lot shorter to just load the char
directly.
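
A small sketch of the two access strategies, assuming a signed 8-bit field that occupies bits 8..15 of a 32-bit storage unit. The layout and the function names are assumptions made for this note; real bitfield layout is ABI-defined.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Generic path: load the whole unit, shift the field down, mask, sign-extend.
std::int32_t field_via_shift_mask(std::uint32_t storage) {
    std::uint32_t raw = (storage >> 8) & 0xFFu;
    return static_cast<std::int32_t>(raw ^ 0x80u) - 0x80;  // manual sign extension
}

// Short path: the field is byte-sized and byte-aligned, so read that one byte.
std::int32_t field_via_byte_load(const std::uint32_t& storage) {
    unsigned char bytes[sizeof storage];
    std::memcpy(bytes, &storage, sizeof storage);
    // Bits 8..15 live in byte 1 on little-endian hosts and byte 2 on big-endian.
    const std::uint32_t probe = 1;
    unsigned char low;
    std::memcpy(&low, &probe, 1);
    const std::size_t index = (low == 1) ? 1 : 2;
    return (static_cast<std::int32_t>(bytes[index]) ^ 0x80) - 0x80;
}
```

Both functions return the same value for any storage word; the second corresponds to a single narrow load followed by one extension.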

//===---------------------------------------------------------------------===//

It may be worth avoiding the creation of allocas for formal arguments
in the common situation where the argument is never written to and
never has its address taken. The idea would be to begin generating code
by using the argument directly and, if its address is taken or it is
stored to, then generate the alloca and patch up the existing code.

In theory, the same optimization could be a win for block local
variables as long as the declaration dominates all statements in the
block.

NOTE: The main case we care about here is -O0 -g compile-time
performance, and in that scenario we currently need to emit the alloca
anyway in order to emit proper debug info. So this is blocked on
being able to emit debug information which refers to an LLVM
temporary, not an alloca.
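
For reference, the three kinds of formal argument this distinguishes, as a minimal sketch. The function names are made up for this note. At -O0 clang currently gives every one of these arguments a stack slot and stores the incoming value into it.

```cpp
// Never written, address never taken: the incoming values could be used
// directly, with no stack slot at all.
int add_readonly(int a, int b) {
    return a + b;
}

// Stored to: once the assignment is seen, the argument needs a slot and the
// code generated so far would have to be patched to go through it.
int clamp_nonneg(int v) {
    if (v < 0)
        v = 0;
    return v;
}

// Hypothetical helper used only to force an address-of below.
void bump(int* p) {
    ++*p;
}

// Address taken: the argument must live in memory from that point on.
int incremented(int v) {
    bump(&v);
    return v;
}
```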

//===---------------------------------------------------------------------===//

We should try to avoid generating basic blocks that contain only
jumps. At -O0, this penalizes us all the way from IRgen (malloc &
instruction overhead) down through code generation and assembly
time.

On 176.gcc:expr.ll, it looks like over 12% of basic blocks are just
direct branches!
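
A toy model of what a jump-only block costs, written as a sketch only. `Block`, `jump_only_fraction` and `resolve` are hypothetical names, and this measures and skips such blocks after the fact, whereas the point above is for IRgen to avoid creating them in the first place.

```cpp
#include <cstddef>
#include <vector>

// Toy CFG node for this sketch.
struct Block {
    bool jump_only;      // block contains nothing but an unconditional branch
    std::size_t target;  // index of the branch destination (unused otherwise)
};

// Fraction of blocks that do nothing except forward control elsewhere.
double jump_only_fraction(const std::vector<Block>& cfg) {
    if (cfg.empty())
        return 0.0;
    std::size_t n = 0;
    for (const Block& b : cfg)
        if (b.jump_only)
            ++n;
    return static_cast<double>(n) / static_cast<double>(cfg.size());
}

// Where a branch to 'start' really ends up once forwarding blocks are skipped.
// The step bound guards against a cycle made only of jump-only blocks.
std::size_t resolve(const std::vector<Block>& cfg, std::size_t start) {
    std::size_t cur = start;
    for (std::size_t steps = 0; steps < cfg.size() && cfg[cur].jump_only; ++steps)
        cur = cfg[cur].target;
    return cur;
}
```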

//===---------------------------------------------------------------------===//