Commit Graph

345 Commits

Author SHA1 Message Date
Reid Kleckner
357600eab5 Add out of line virtual destructors to all LLVMTargetMachine subclasses
These recently all grew a unique_ptr<TargetLoweringObjectFile> member in
r221878.  When anyone calls a virtual method of a class, clang-cl
requires all virtual methods to be semantically valid. This includes the
implicit virtual destructor, which triggers instantiation of the
unique_ptr destructor, which fails because the type being deleted is
incomplete.

This is just part of the ongoing saga of PR20337, which is affecting
Blink as well. Because the MSVC ABI doesn't have key functions, we end
up referencing the vtable and implicit destructor on any virtual call
through a class. We don't actually end up emitting the dtor, so it'd be
good if we could avoid this unneeded type completion work.

llvm-svn: 222480
2014-11-20 23:37:18 +00:00
Aditya Nandakumar
a27193297f This patch changes the ownership of TLOF from TargetLoweringBase to TargetMachine so that different subtargets could share the TLOF effectively
llvm-svn: 221878
2014-11-13 09:26:31 +00:00
Eric Christopher
3faf2f1e02 Add subtarget caches to aarch64, arm, ppc, and x86.
These will make it easier to test further changes to the
code generation and optimization pipelines as those are
moved to subtargets initialized with target feature and
target cpu.

llvm-svn: 219106
2014-10-06 06:45:36 +00:00
Eric Christopher
a94e592e49 We can grab the options struct from the TargetMachine, no need to
pass it down in the constructor.

llvm-svn: 218929
2014-10-03 00:10:03 +00:00
Eric Christopher
79cc1e3ae7 Reinstate "Nuke the old JIT."
Approved by Jim Grosbach, Lang Hames, Rafael Espindola.

This reinstates commits r215111, 215115, 215116, 215117, 215136.

llvm-svn: 216982
2014-09-02 22:28:02 +00:00
Robin Morisset
59c23cd946 Rename AtomicExpandLoadLinked into AtomicExpand
AtomicExpandLoadLinked is currently rather ARM-specific. This patch is the first of
a group that aim at making it more target-independent. See
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075873.html
for details

The command line option is "atomic-expand"

llvm-svn: 216231
2014-08-21 21:50:01 +00:00
Jonathan Roelofs
5e98ff967b Add a thread-model knob for lowering atomics on baremetal & single threaded systems
http://reviews.llvm.org/D4984

llvm-svn: 216182
2014-08-21 14:35:47 +00:00
Eric Christopher
b9fd9ed37e Temporarily Revert "Nuke the old JIT." as it's not quite ready to
be deleted. This will be reapplied as soon as possible and before
the 3.6 branch date at any rate.

Approved by Jim Grosbach, Lang Hames, Rafael Espindola.

This reverts commits r215111, 215115, 215116, 215117, 215136.

llvm-svn: 215154
2014-08-07 22:02:54 +00:00
Rafael Espindola
f8b27c41e8 Nuke the old JIT.
I am sure we will be finding bits and pieces of dead code for years to
come, but this is a good start.

Thanks to Lang Hames for making MCJIT a good replacement!

llvm-svn: 215111
2014-08-07 14:21:18 +00:00
Eric Christopher
80b24ef429 Move all of the ARM subtarget features down onto the subtarget
rather than the target machine.

llvm-svn: 211799
2014-06-26 19:30:02 +00:00
Eric Christopher
c40e5edbbc Add a new subtarget hook for whether or not we'd like to enable
the atomic load linked expander pass to run for a particular
subtarget. This requires a check of the subtarget and so save
the TargetMachine rather than only TargetLoweringInfo and update
all callers.

llvm-svn: 211314
2014-06-19 21:03:04 +00:00
Eric Christopher
3d19f1388f Move ARMJITInfo off of the TargetMachine and down onto the subtarget.
This required untangling a mess of headers that included around.

This a recommit of r210953 with a fix for the removed accessor
for JITInfo.

llvm-svn: 211233
2014-06-18 22:48:09 +00:00
James Molloy
f6419cfb14 Refactor the disabling of Thumb-1 LDM/STM generation
Originally I switched the LD/ST optimizer off in TargetMachine as it was previously, but Eric has suggested he'd prefer that it be short-circuited in the pass itself.

No functionality change.

llvm-svn: 211037
2014-06-16 16:42:53 +00:00
Eric Christopher
f6db93ab81 Temporarily revert r210953 in an attempt to bring the ARM buildbots
back.

llvm-svn: 210996
2014-06-15 19:55:14 +00:00
Eric Christopher
fb0c26c696 Remove InstrItineraryData off of the TargetMachine - it's already
on the subtarget and just forward the accessor.

llvm-svn: 210955
2014-06-13 23:11:13 +00:00
Eric Christopher
a0cdc005dd Move ARMJITInfo off of the TargetMachine and down onto the subtarget.
This required untangling a mess of headers that included around.

llvm-svn: 210953
2014-06-13 23:04:46 +00:00
Eric Christopher
030294e4c5 Move ARMSelectionDAGInfo from the TargetMachine to the subtarget.
llvm-svn: 210862
2014-06-13 00:20:39 +00:00
Eric Christopher
a47f6804d2 Move to a private function to initialize subtarget dependencies
so we can use initializer lists for the ARMSubtarget and then
use this to initialize a moved DataLayout on the subtarget from
the TargetMachine.

llvm-svn: 210861
2014-06-13 00:20:35 +00:00
Eric Christopher
70e005a171 Have ARMSelectionDAGInfo take a DataLayout as it's argument as the
DAG has access to the subtarget and TargetSelectionDAGInfo only
needs a DataLayout.

llvm-svn: 210859
2014-06-12 23:39:49 +00:00
James Molloy
1417b0be3e Disable the load/store optimization pass for Thumb-1.
Moritz's changes have improved codegen a lot, but further testing showed significant correctness problems. Disable by default until these have been worked out.

Patch by Moritz Roth!

llvm-svn: 210789
2014-06-12 15:18:33 +00:00
Tim Northover
b4ddc0845a ARM & AArch64: make use of common cmpxchg idioms after expansion
The C and C++ semantics for compare_exchange require it to return a bool
indicating success. This gets mapped to LLVM IR which follows each cmpxchg with
an icmp of the value loaded against the desired value.

When lowered to ldxr/stxr loops, this extra comparison is redundant: its
results are implicit in the control-flow of the function.

This commit makes two changes: it replaces that icmp with appropriate PHI
nodes, and then makes sure earlyCSE is called after expansion to actually make
use of the opportunities revealed.

I've also added -{arm,aarch64}-enable-atomic-tidy options, so that
existing fragile tests aren't perturbed too much by the change. Many
of them either rely on undef/unreachable too pervasively to be
restored to something well-defined (particularly while making sure
they test the same obscure assert from many years ago), or depend on a
particular CFG shape, which is disrupted by SimplifyCFG.

rdar://problem/16227836

llvm-svn: 209883
2014-05-30 10:09:59 +00:00
James Molloy
92a15078f1 Enable the Load/Store optimization pass for Thumb1 but make it return immediately for now.
Patch by Moritz Roth!

llvm-svn: 208991
2014-05-16 14:11:38 +00:00
Tim Northover
037f26f212 Atomics: promote ARM's IR-based atomics pass to CodeGen.
Still only 32-bit ARM using it at this stage, but the promotion allows
direct testing via opt and is a reasonably self-contained patch on the
way to switching ARM64.

At this point, other targets should be able to make use of it without
too much difficulty if they want. (See ARM64 commit coming soon for an
example).

llvm-svn: 206485
2014-04-17 18:22:47 +00:00
Tim Northover
c882eb0723 ARM: expand atomic ldrex/strex loops in IR
The previous situation where ATOMIC_LOAD_WHATEVER nodes were expanded
at MachineInstr emission time had grown to be extremely large and
involved, to account for the subtly different code needed for the
various flavours (8/16/32/64 bit, cmpxchg/add/minmax).

Moving this transformation into the IR clears up the code
substantially, and makes future optimisations much easier:

1. an atomicrmw followed by using the *new* value can be more
   efficient. As an IR pass, simple CSE could handle this
   efficiently.
2. Making use of cmpxchg success/failure orderings only has to be done
   in one (simpler) place.
3. The common "cmpxchg; did we store?" idiom can be exposed to
   optimisation.

I intend to gradually improve this situation within the ARM backend
and make sure there are no hidden issues before moving the code out
into CodeGen to be shared with (at least ARM64/AArch64, though I think
PPC & Mips could benefit too).

llvm-svn: 205525
2014-04-03 11:44:58 +00:00
Renato Golin
d93295ea56 Remove duplicated DMB instructions
ARM specific optimiztion, finding places in ARM machine code where 2 dmbs
follow one another, and eliminating one of them.

Patch by Reinoud Elhorst.

llvm-svn: 205409
2014-04-02 09:03:43 +00:00
Christian Pirker
dc9ff75554 ARM: rename ARMle/ARMbe with ARMLE/ARMBE, and Thumble/Thumbbe with ThumbLE/ThumbBE
llvm-svn: 205317
2014-04-01 15:19:30 +00:00
Saleem Abdulrasool
2070088bef ARM: fix typo
llvm-svn: 205233
2014-03-31 18:09:10 +00:00
Christian Pirker
2a11160956 Add ARM big endian Target (armeb, thumbeb)
Reviewed at http://llvm-reviews.chandlerc.com/D3095

llvm-svn: 205007
2014-03-28 14:35:30 +00:00
Craig Topper
a9253267a9 Prune includes in ARM target.
llvm-svn: 204548
2014-03-22 23:51:00 +00:00
Craig Topper
6bc27bf359 [C++11] Add 'override' keyword to virtual methods that override their base class.
llvm-svn: 203433
2014-03-10 02:09:33 +00:00
Tim Northover
f804c178a1 GlobalMerge: move "-global-merge" option to the pass itself.
It's rather odd to have the flag enabling and disabling this pass only affect a
single target.

llvm-svn: 201559
2014-02-18 11:17:29 +00:00
Rafael Espindola
58873566b3 Make the llvm mangler depend only on DataLayout.
Before this patch any program that wanted to know the final symbol name of a
GlobalValue had to link with Target.

This patch implements a compromise solution where the mangler uses DataLayout.
This way, any tool that already links with Target (llc, clang) gets the exact
behavior as before and new IR files can be mangled without linking with Target.

With this patch the mangler is constructed with just a DataLayout and DataLayout
is extended to include the information the Mangler needs.

llvm-svn: 198438
2014-01-03 19:21:54 +00:00
Rafael Espindola
ddb913cc8f Synchronize the NaCl DataLayout strings with the ones in clang.
Patch by Derek Schuff.

llvm-svn: 197640
2013-12-19 00:44:37 +00:00
Tim Northover
f1c31b95e0 ARM: update comment to match reality
llvm-svn: 197570
2013-12-18 14:18:36 +00:00
Tim Northover
44594ad7e2 ARM: set default float ABI based on triple.
Clang sets the float-abi target option manually, but no longer
annotates each function with its ABI. This can lead to confusing
mistmatch between "clang -emit-llvm | llc" and normal clang
invocations.

Besides which, gnueabihf actually *is* hard-float. Defaulting to soft
was just perverse.

llvm-svn: 197554
2013-12-18 09:27:33 +00:00
Rafael Espindola
8c08120dba On APCS, only try to align aggregates to 32 bits instead of 64.
This matches clang's behavior and since it is only a preference, it is not
an ABI issue.

llvm-svn: 197526
2013-12-17 21:36:54 +00:00
Rafael Espindola
9704fd03d1 Handle i64 first for clarity. No functionality change.
llvm-svn: 197524
2013-12-17 21:28:36 +00:00
Rafael Espindola
e89b41495a One last cleanup of LLVM's DataLayout strings.
Produce them in the same order on every target. The order is that of
getStringRepresentation: e|E-i*-f*-v*-a*-s*-n*-S*.

llvm-svn: 197411
2013-12-16 19:31:14 +00:00
Rafael Espindola
bccb9d45ad The preferred alignment defaults to the abi alignment. Omit if it is the same.
llvm-svn: 197400
2013-12-16 18:01:51 +00:00
Rafael Espindola
720ae4f885 Simplify the datalayout string of ARM and AArch64.
No functionality change.

Reviewed by Tim Northover.

llvm-svn: 197172
2013-12-12 17:43:37 +00:00
Rafael Espindola
1d224bd65f Add comments documenting the ARM datalayout string.
llvm-svn: 196850
2013-12-10 00:37:37 +00:00
Rafael Espindola
74d682b443 Simplify further.
Thanks to Jim Grosbach for noticing it.

llvm-svn: 196846
2013-12-10 00:15:35 +00:00
Rafael Espindola
964bf07fb8 Refactor the construction of the DataLayout string on ARM.
llvm-svn: 196843
2013-12-09 23:56:41 +00:00
Weiming Zhao
0da5cc0765 Enable generating legacy IT block for AArch32
By default, the behavior of IT block generation will be determinated
dynamically base on the arch (armv8 vs armv7). This patch adds backend
options: -arm-restrict-it and -arm-no-restrict-it.  The former one
restricts the generation of IT blocks (the same behavior as thumbv8) for
both arches. The later one allows the generation of legacy IT block (the
same behavior as ARMv7 Thumb2) for both arches.

Clang will support -mrestrict-it and -mno-restrict-it, which is
compatible with GCC.

llvm-svn: 194592
2013-11-13 18:29:49 +00:00
Joey Gouly
a5153cb025 [ARMv8] Prevent generation of deprecated IT blocks on ARMv8 in Thumb mode.
IT blocks can only be one instruction lonf, and can only contain a subset of
the 16 instructions.

Patch by Artyom Skrobov!

llvm-svn: 190309
2013-09-09 14:21:49 +00:00
Silviu Baranga
91ddaa1b48 Allow generation of vmla.f32 instructions when targeting Cortex-A15. The patch also adds the VFP4 feature to Cortex-A15 and fixes the DontUseFusedMAC predicate so that we can still generate vmla.f32 instructions on non-darwin targets with VFP4.
llvm-svn: 187349
2013-07-29 09:25:50 +00:00
Bill Wendling
7a639ea2a4 Access the TargetLoweringInfo from the TargetMachine object instead of caching it. The TLI may change between functions. No functionality change.
llvm-svn: 184352
2013-06-19 21:07:11 +00:00
Bill Wendling
afc1036f3e Access the TargetLoweringInfo from the TargetMachine object instead of caching it. The TLI may change between functions. No functionality change.
llvm-svn: 184349
2013-06-19 20:51:24 +00:00
Rafael Espindola
227144c23c Remove the MachineMove class.
It was just a less powerful and more confusing version of
MCCFIInstruction. A side effect is that, since MCCFIInstruction uses
dwarf register numbers, calls to getDwarfRegNum are pushed out, which
should allow further simplifications.

I left the MachineModuleInfo::addFrameMove interface unchanged since
this patch was already fairly big.

llvm-svn: 181680
2013-05-13 01:16:13 +00:00
Silviu Baranga
dc45336d09 Enabling the generation of dependency breakers for partial updates on Cortex-A15. Also fixing a small bug in getting the update clearence for VLD1LNd32.
llvm-svn: 178134
2013-03-27 12:38:44 +00:00
Renato Golin
b4dd6c5945 Avoid NEON SP-FP unless unsafe-math or Darwin
NEON is not IEEE 754 compliant, so we should avoid lowering single-precision
floating point operations with NEON unless unsafe-math is turned on. The
equivalent VFP instructions are IEEE 754 compliant, but in some cores they're
much slower, so some archs/OSs might still request it to be on by default,
such as Swift and Darwin.

llvm-svn: 177651
2013-03-21 18:47:47 +00:00
Silviu Baranga
82dd6ac3bc Adding an A15 specific optimization pass for interactions between S/D/Q registers. The pass handles all the required transformations pre-regalloc.
llvm-svn: 177169
2013-03-15 18:28:25 +00:00
Jim Grosbach
553eb75663 ARM: Fix a few copy-paste errors.
s/X86/ARM/

llvm-svn: 171789
2013-01-07 21:12:13 +00:00
Chandler Carruth
664e354de7 Switch TargetTransformInfo from an immutable analysis pass that requires
a TargetMachine to construct (and thus isn't always available), to an
analysis group that supports layered implementations much like
AliasAnalysis does. This is a pretty massive change, with a few parts
that I was unable to easily separate (sorry), so I'll walk through it.

The first step of this conversion was to make TargetTransformInfo an
analysis group, and to sink the nonce implementations in
ScalarTargetTransformInfo and VectorTargetTranformInfo into
a NoTargetTransformInfo pass. This allows other passes to add a hard
requirement on TTI, and assume they will always get at least on
implementation.

The TargetTransformInfo analysis group leverages the delegation chaining
trick that AliasAnalysis uses, where the base class for the analysis
group delegates to the previous analysis *pass*, allowing all but tho
NoFoo analysis passes to only implement the parts of the interfaces they
support. It also introduces a new trick where each pass in the group
retains a pointer to the top-most pass that has been initialized. This
allows passes to implement one API in terms of another API and benefit
when some other pass above them in the stack has more precise results
for the second API.

The second step of this conversion is to create a pass that implements
the TargetTransformInfo analysis using the target-independent
abstractions in the code generator. This replaces the
ScalarTargetTransformImpl and VectorTargetTransformImpl classes in
lib/Target with a single pass in lib/CodeGen called
BasicTargetTransformInfo. This class actually provides most of the TTI
functionality, basing it upon the TargetLowering abstraction and other
information in the target independent code generator.

The third step of the conversion adds support to all TargetMachines to
register custom analysis passes. This allows building those passes with
access to TargetLowering or other target-specific classes, and it also
allows each target to customize the set of analysis passes desired in
the pass manager. The baseline LLVMTargetMachine implements this
interface to add the BasicTTI pass to the pass manager, and all of the
tools that want to support target-aware TTI passes call this routine on
whatever target machine they end up with to add the appropriate passes.

The fourth step of the conversion created target-specific TTI analysis
passes for the X86 and ARM backends. These passes contain the custom
logic that was previously in their extensions of the
ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces.
I separated them into their own file, as now all of the interface bits
are private and they just expose a function to create the pass itself.
Then I extended these target machines to set up a custom set of analysis
passes, first adding BasicTTI as a fallback, and then adding their
customized TTI implementations.

The fourth step required logic that was shared between the target
independent layer and the specific targets to move to a different
interface, as they no longer derive from each other. As a consequence,
a helper functions were added to TargetLowering representing the common
logic needed both in the target implementation and the codegen
implementation of the TTI pass. While technically this is the only
change that could have been committed separately, it would have been
a nightmare to extract.

The final step of the conversion was just to delete all the old
boilerplate. This got rid of the ScalarTargetTransformInfo and
VectorTargetTransformInfo classes, all of the support in all of the
targets for producing instances of them, and all of the support in the
tools for manually constructing a pass based around them.

Now that TTI is a relatively normal analysis group, two things become
straightforward. First, we can sink it into lib/Analysis which is a more
natural layer for it to live. Second, clients of this interface can
depend on it *always* being available which will simplify their code and
behavior. These (and other) simplifications will follow in subsequent
commits, this one is clearly big enough.

Finally, I'm very aware that much of the comments and documentation
needs to be updated. As soon as I had this working, and plausibly well
commented, I wanted to get it committed and in front of the build bots.
I'll be doing a few passes over documentation later if it sticks.

Commits to update DragonEgg and Clang will be made presently.

llvm-svn: 171681
2013-01-07 01:37:14 +00:00
Chandler Carruth
ed0881b2a6 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Rafael Espindola
d957cb2584 Remove TargetELFWriterInfo.
All the credit goes to Jan Voung for noticing it was dead!

llvm-svn: 166902
2012-10-28 21:34:43 +00:00
Nadav Rotem
2289f2c932 Implement a basic VectorTargetTransformInfo interface to be used by the loop and bb vectorizers for modeling the cost of instructions.
llvm-svn: 166593
2012-10-24 17:22:41 +00:00
Nadav Rotem
5dc203e8f4 Reapply the TargerTransformInfo changes, minus the changes to LSR and Lowerinvoke.
llvm-svn: 166248
2012-10-18 23:22:48 +00:00
Bob Wilson
d6d9ccca38 Temporarily revert the TargetTransform changes.
The TargetTransform changes are breaking LTO bootstraps of clang.  I am
working with Nadav to figure out the problem, but I am reverting it for now
to get our buildbots working.

This reverts svn commits: 165665 165669 165670 165786 165787 165997
and I have also reverted clang svn 165741

llvm-svn: 166168
2012-10-18 05:43:52 +00:00
Nadav Rotem
e10328737d Add a new interface to allow IR-level passes to access codegen-specific information.
llvm-svn: 165665
2012-10-10 22:04:55 +00:00
Micah Villmow
cdfe20b97f Move TargetData to DataLayout.
llvm-svn: 165402
2012-10-08 16:38:25 +00:00
Jush Lu
47172a064f [arm-fast-isel] Add support for ELF PIC.
This is a preliminary step towards ELF support; currently ARMFastISel hasn't
been used for ELF object files yet.

llvm-svn: 164759
2012-09-27 05:21:41 +00:00
Silviu Baranga
b47bb94f93 This patch introduces A15 as a target in LLVM.
llvm-svn: 163803
2012-09-13 15:05:10 +00:00
Bob Wilson
b9b693650a Consistently use AnalysisID types in TargetPassConfig.
This makes it possible to just use a zero value to represent "no pass", so
the phony NoPassID global variable is no longer needed.

llvm-svn: 159568
2012-07-02 19:48:37 +00:00
Bob Wilson
bbd38dd9c0 Add all codegen passes to the PassManager via TargetPassConfig.
This is a preliminary step toward having TargetPassConfig be able to
start and stop the compilation at specified passes for unit testing
and debugging.  No functionality change.

llvm-svn: 159567
2012-07-02 19:48:31 +00:00
Bill Wendling
b12f16e75f Change the PassManager from a reference to a pointer.
The TargetPassManager's default constructor wants to initialize the PassManager
to 'null'. But it's illegal to bind a null reference to a null l-value. Make the
ivar a pointer instead.
PR12468

llvm-svn: 155902
2012-05-01 08:27:43 +00:00
Jim Grosbach
0c509fa6bf Tidy up. 80 columns.
llvm-svn: 154226
2012-04-06 23:43:50 +00:00
Jakob Stoklund Olesen
cdee326ab6 Preserve implicit defs in ARMLoadStoreOptimizer.
When a number of sub-register VLRDS instructions are combined into a
VLDM, preserve any super-register implicit defs. This is required to
keep the register scavenger and machine code verifier happy.

Enable machine code verification after ARMLoadStoreOptimizer.
ARM/2012-01-26-CopyPropKills.ll was failing because of this.

llvm-svn: 153610
2012-03-28 22:50:56 +00:00
Andrew Trick
1fa5bcbe2a Codegen pass definition cleanup. No functionality.
Moving toward a uniform style of pass definition to allow easier target configuration.
Globally declare Pass ID.
Globally declare pass initializer.
Use INITIALIZE_PASS consistently.
Add a call to the initializer from CodeGen.cpp.
Remove redundant "createPass" functions and "getPassName" methods.

While cleaning up declarations, cleaned up comments (sorry for large diff).

llvm-svn: 150100
2012-02-08 21:23:13 +00:00
Andrew Trick
f8ea108c05 TargetPassConfig: confine the MC configuration to TargetMachine.
Passes prior to instructon selection are now split into separate configurable stages.
Header dependencies are simplified.
The bulk of this diff is simply removal of the silly DisableVerify flags.

Sorry for the target header churn. Attempting to stabilize them.

llvm-svn: 149754
2012-02-04 02:56:59 +00:00
Andrew Trick
ccb673659a Added TargetPassConfig. The first little step toward configuring codegen passes.
Allows command line overrides to be centralized in LLVMTargetMachine.cpp.
LLVMTargetMachine can intercept common passes and give precedence to command line overrides.
Allows adding "internal" target configuration options without touching TargetOptions.
Encapsulates the PassManager.
Provides a good point to initialize all CodeGen passes so that Pass ID's can be used in APIs.
Allows modifying the target configuration hooks without rebuilding the world.

llvm-svn: 149672
2012-02-03 05:12:41 +00:00
David Blaikie
a379b18173 Unweaken vtables as per http://llvm.org/docs/CodingStandards.html#ll_virtual_anch
llvm-svn: 146960
2011-12-20 02:50:00 +00:00
Evan Cheng
7fae11b231 - Add MachineInstrBundle.h and MachineInstrBundle.cpp. This includes a function
to finalize MI bundles (i.e. add BUNDLE instruction and computing register def
  and use lists of the BUNDLE instruction) and a pass to unpack bundles.
- Teach more of MachineBasic and MachineInstr methods to be bundle aware.
- Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to
  prevent IT blocks from being broken apart.

llvm-svn: 146542
2011-12-14 02:11:42 +00:00
Nick Lewycky
50f02cb21b Move global variables in TargetMachine into new TargetOptions class. As an API
change, now you need a TargetOptions object to create a TargetMachine. Clang
patch to follow.

One small functionality change in PTX. PTX had commented out the machine
verifier parts in their copy of printAndVerify. That now calls the version in
LLVMTargetMachine. Users of PTX who need verification disabled should rely on
not passing the command-line flag to enable it.

llvm-svn: 145714
2011-12-02 22:16:29 +00:00
Evan Cheng
ecb2908bf9 Sink codegen optimization level into MCCodeGenInfo along side relocation model
and code model. This eliminates the need to pass OptLevel flag all over the
place and makes it possible for any codegen pass to use this information.

llvm-svn: 144788
2011-11-16 08:38:26 +00:00
Devang Patel
76c8563239 svn mv Target/ARM/ARMGlobalMerge.cpp Transforms/Scalar/GlobalMerge.cpp
There is no reason to have simple IR level pass in lib/Target.

llvm-svn: 142200
2011-10-17 17:17:43 +00:00
Lang Hames
de7ab801cc Add a natural stack alignment field to TargetData, and prevent InstCombine from
promoting allocas to preferred alignments that exceed the natural
alignment. This avoids some potentially expensive dynamic stack realignments.

The natural stack alignment is set in target data strings via the "S<size>"
option. Size is in bits and must be a multiple of 8. The natural stack alignment
defaults to "unspecified" (represented by a zero value), and the "unspecified"
value does not prevent any alignment promotions. Target maintainers that care
about avoiding promotions should explicitly add the "S<size>" option to their
target data strings.

llvm-svn: 141599
2011-10-10 23:42:08 +00:00
Jakob Stoklund Olesen
f7ad189033 Use ExecutionDepsFix instead of NEONMoveFix.
This enables NEON domain tracking across basic blocks, but should
otherwise do the same thing.

llvm-svn: 140772
2011-09-29 02:48:41 +00:00
Bill Wendling
a0d5f268a9 Move to ISelLowering.
llvm-svn: 140754
2011-09-29 01:13:55 +00:00
Bill Wendling
dfe5acd34e Ahem...actually *add* the ARMSjLjLowering pass to the pass manager.
llvm-svn: 140718
2011-09-28 20:29:28 +00:00
Bill Wendling
354ff9e348 This is the start of the new SjLj EH preparation pass, which will replace the
current IR-level pass.

The old SjLj EH pass has some problems, especially with the new EH model. Most
significantly, it violates some of the new restrictions the new model has. For
instance, the 'dispatch' table wants to jump to the landing pad, but we cannot
allow that because only an invoke's unwind edge can jump to a landing pad. This
requires us to mangle the code something awful. In addition, we need to keep the
now dead landingpad instructions around instead of CSE'ing them because the
DWARF emitter uses that information (they are dead because no control flow edge
will execute them - the control flow edge from an invoke's unwind is superceded
by the edge coming from the dispatch).

Basically, this pass belongs not at the IR level where SSA is king, but at the
code-gen level, where we have more flexibility.

llvm-svn: 140646
2011-09-27 22:14:12 +00:00
Evan Cheng
9dad430486 Hide -global-merge option.
llvm-svn: 138540
2011-08-25 01:22:49 +00:00
Evan Cheng
f066b2fe99 Add a command line option to disable global merge pass.
llvm-svn: 138536
2011-08-25 01:00:36 +00:00
Evan Cheng
3ca20e64ac Remove a out-of-place comment.
llvm-svn: 138534
2011-08-25 00:54:42 +00:00
Evan Cheng
2bb4035707 Move TargetRegistry and TargetSelect from Target to Support where they belong.
These are strictly utilities for registering targets and components.

llvm-svn: 138450
2011-08-24 18:08:43 +00:00
Evan Cheng
ad5f485957 Sink ARM mc routines into MCTargetDesc.
llvm-svn: 135825
2011-07-23 00:00:19 +00:00
Evan Cheng
efd9b4240f - Move CodeModel from a TargetMachine global option to MCCodeGenInfo.
- Introduce JITDefault code model. This tells targets to set different default
  code model for JIT. This eliminates the ugly hack in TargetMachine where
  code model is changed after construction.

llvm-svn: 135580
2011-07-20 07:51:56 +00:00
Evan Cheng
2129f59637 Introduce MCCodeGenInfo, which keeps information that can affect codegen
(including compilation, assembly). Move relocation model Reloc::Model from
TargetMachine to MCCodeGenInfo so it's accessible even without TargetMachine.

llvm-svn: 135468
2011-07-19 06:37:02 +00:00
Evan Cheng
1705ab00ab Rename createAsmInfo to createMCAsmInfo and move registration code to MCTargetDesc to prepare for next round of changes.
llvm-svn: 135219
2011-07-14 23:50:31 +00:00
Evan Cheng
4d1ca96bfc Eliminate asm parser's dependency on TargetMachine:
- Each target asm parser now creates its own MCSubtatgetInfo (if needed).
- Changed AssemblerPredicate to take subtarget features which tablegen uses
  to generate asm matcher subtarget feature queries. e.g.
  "ModeThumb,FeatureThumb2" is translated to
  "(Bits & ModeThumb) != 0 && (Bits & FeatureThumb2) != 0".

llvm-svn: 134678
2011-07-08 01:53:10 +00:00
Evan Cheng
2bd65363a8 Factor ARM triple parsing out of ARMSubtarget. Another step towards making ARM subtarget info available to MC.
llvm-svn: 134569
2011-07-07 00:08:19 +00:00
Evan Cheng
fe6e405e8c Fix the ridiculous SubtargetFeatures API where it implicitly expects CPU name to
be the first encoded as the first feature. It then uses the CPU name to look up
features / scheduling itineray even though clients know full well the CPU name
being used to query these properties.

The fix is to just have the clients explictly pass the CPU name!

llvm-svn: 134127
2011-06-30 01:53:36 +00:00
Evan Cheng
1b049f5851 Remove TargetOptions.h dependency from ARMSubtarget.
llvm-svn: 133738
2011-06-23 18:15:17 +00:00
Daniel Dunbar
2b9b0e3748 ADT/Triple: Move a variety of clients to using isOSDarwin() and isOSWindows()
predicates.

llvm-svn: 129816
2011-04-19 21:14:45 +00:00
Bob Wilson
0858c3aaed This patch combines several changes from Evan Cheng for rdar://8659675.
Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Enable these fp vmlx codegen changes for Cortex-A9.

llvm-svn: 129775
2011-04-19 18:11:57 +00:00
Jim Grosbach
6ade7e0bac Tidy up.
llvm-svn: 129034
2011-04-06 22:35:47 +00:00
NAKAMURA Takumi
4c14a5cc2c Triple::MinGW64 is deprecated and removed. We can use Triple::MinGW32 generally.
No one uses *-mingw64. mingw-w64 is represented as {i686|x86_64}-w64-mingw32. In llvm side, i686 and x64 can be treated as similar way.

llvm-svn: 125747
2011-02-17 12:24:17 +00:00
Rafael Espindola
b3eca9bb71 Add support for the --noexecstack option.
llvm-svn: 124077
2011-01-23 17:55:27 +00:00
Anton Korobeynikov
2f93128109 Rename TargetFrameInfo into TargetFrameLowering. Also, put couple of FIXMEs and fixes here and there.
llvm-svn: 123170
2011-01-10 12:39:04 +00:00
Evan Cheng
62c7b5bf76 Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Work in progress, only A+B are enabled.

llvm-svn: 120960
2010-12-05 22:04:16 +00:00
Chris Lattner
9fdd10dbce tidy up
llvm-svn: 119462
2010-11-17 05:41:32 +00:00
Anton Korobeynikov
f7183edb59 First step of huge frame-related refactoring: move emit{Prologue,Epilogue} out of TargetRegisterInfo to TargetFrameInfo, which is definitely much better suitable place
llvm-svn: 119097
2010-11-15 00:06:54 +00:00
Eric Christopher
7ae11c6962 Revert the accidental commit I made reverting the previous commit.
llvm-svn: 118835
2010-11-11 20:50:14 +00:00
Eric Christopher
b90f7004cf Revert this temporarily.
llvm-svn: 118827
2010-11-11 19:47:02 +00:00
Rafael Espindola
66e08d43d2 Jim Asked us to move DataLayout on ARM back to the most specialized classes. Do
so and also change X86 for consistency.

Investigating if this can be improved a bit.

llvm-svn: 115469
2010-10-03 18:59:45 +00:00
Jason W Kim
b32124545b I added a new file ARMAsmBackend which stubs out in similar ways to
the eqv X86 class.
For now, I split the ELFARMAsmBackend from the DarwinARMAsmBackend
(also mimicking X86)

Tested against -r115126

llvm-svn: 115129
2010-09-30 02:17:26 +00:00
Nick Lewycky
7d483d352b Resolve this GCC warning:
ARMTargetMachine.cpp:53: error: control reaches end of non-void function

llvm-svn: 114992
2010-09-28 21:40:26 +00:00
Rafael Espindola
69aa15155f Odd additional stub framework for the ARM MC ELF emission.
llc now recognizes the "intent" to support MC/obj emission for ARM, but
given that they are all stubs, it asserts on --filetype=obj --march=arm

Patch by Jason Kim.

llvm-svn: 114856
2010-09-27 18:31:37 +00:00
Bob Wilson
c597fd3b4a Convert some VTBL and VTBX instructions to use pseudo instructions prior to
register allocation.  Remove the NEONPreAllocPass, which is no longer needed.
Yeah!!

llvm-svn: 113818
2010-09-13 23:55:10 +00:00
Evan Cheng
5190f09291 Report error if codegen tries to instantiate a ARM target when the cpu does support it. e.g. cortex-m* processors.
llvm-svn: 110798
2010-08-11 07:17:46 +00:00
Evan Cheng
ce8fb68078 Change -prefer-32bit-thumb to attribute -mattr=+32bit instead to disable more 32-bit to 16-bit optimizations.
llvm-svn: 110584
2010-08-09 18:35:19 +00:00
Evan Cheng
7d8d9a5dd5 Add an option to disable 32 -> 16-bit Thumb2 size reduction pass for experimentation.
llvm-svn: 110579
2010-08-09 17:16:10 +00:00
Anton Korobeynikov
19edda0323 Hook in GlobalMerge pass
llvm-svn: 109359
2010-07-24 21:52:08 +00:00
Evan Cheng
c3525dc0fd Remove early IT block formation. It's not used.
llvm-svn: 107513
2010-07-02 21:07:09 +00:00
Bob Wilson
07aead2f8d Add missing ARM and Thumb data layout info for vector types.
Radar 8128745.

llvm-svn: 106820
2010-06-25 04:41:08 +00:00
Evan Cheng
c26e2f4b70 Oops. IT block formation pass needs to be run at any optimization level.
llvm-svn: 106775
2010-06-24 19:10:14 +00:00
Evan Cheng
119824ed4d Move ARM if-conversion before post-ra scheduling.
llvm-svn: 106355
2010-06-18 23:32:07 +00:00
Evan Cheng
2d51c7c592 Allow ARM if-converter to be run after post allocation scheduling.
- This fixed a number of bugs in if-converter, tail merging, and post-allocation
  scheduler. If-converter now runs branch folding / tail merging first to
  maximize if-conversion opportunities.
- Also changed the t2IT instruction slightly. It now defines the ITSTATE
  register which is read by instructions in the IT block.
- Added Thumb2 specific hazard recognizer to ensure the scheduler doesn't
  change the instruction ordering in the IT block (since IT mask has been
  finalized). It also ensures no other instructions can be scheduled between
  instructions in the IT block.

This is not yet enabled.

llvm-svn: 106344
2010-06-18 23:09:54 +00:00
Evan Cheng
f128bdcb55 Make post-ra scheduling, anti-dep breaking, and register scavenger (conservatively) aware of predicated instructions. This enables ARM to move if-conversion before post-ra scheduler.
llvm-svn: 106091
2010-06-16 07:35:02 +00:00
Evan Cheng
83c64ee8de Typo.
llvm-svn: 105677
2010-06-09 03:49:12 +00:00
Evan Cheng
47cd593023 Thumb2 IT blocks are fairly expensive. When there are multiple selects using
the same condition, it's important to make sure they are scheduled together
to avoid forming multiple IT blocks. I'm adding a pre-regalloc pass that forms
IT blocks early (by re-scheduling instructions and split basic blocks) to
attempt to fix this. This is not turned on by default since I am not sure this
is the right fix.

Another issue is llvm selects are modeled as two-address conditional moves.
This can be very bad when the copies before the conditional moves are not
coalesced away. Teach IT formation pass to move the copies above the IT block
(when legal) to avoid breaking the IT block.

llvm-svn: 105669
2010-06-09 01:46:50 +00:00
Dan Gohman
bb919dfb6b Implement a bunch more TargetSelectionDAGInfo infrastructure.
Move EmitTargetCodeForMemcpy, EmitTargetCodeForMemset, and
EmitTargetCodeForMemmove out of TargetLowering and into
SelectionDAGInfo to exercise this.

llvm-svn: 103481
2010-05-11 17:31:57 +00:00
Anton Korobeynikov
6e01726eae Remove late ARM codegen optimization pass committed by accident.
It is not ready for public yet.

llvm-svn: 100673
2010-04-07 18:23:27 +00:00
Anton Korobeynikov
4fb6a66c8f Move NEON-VFP domain fixer upper, so post-RA scheduler would benefit from it.
llvm-svn: 100668
2010-04-07 18:21:46 +00:00
Anton Korobeynikov
0453de0133 Some initial version of global merger
llvm-svn: 100641
2010-04-07 18:19:07 +00:00
Daniel Dunbar
fed917e078 TargetRegistry: Fix create{AsmInfo,MCDisassembler} to return non-const objects.
llvm-svn: 99097
2010-03-20 22:36:22 +00:00
Chris Lattner
c83cfb9dfa remove dead code.
llvm-svn: 95134
2010-02-02 21:38:59 +00:00
Chris Lattner
0cd6c2a047 eliminate all the dead addSimpleCodeEmitter implementations.
eliminate random "code emitter" stuff in Alpha, except for
the JIT path.  Next up, remove the template cruft.

llvm-svn: 95131
2010-02-02 21:31:47 +00:00
Jim Grosbach
04770f2aa1 For aligned load/store instructions, it's only required to know whether a
function can support dynamic stack realignment. That's a much easier question
to answer at instruction selection stage than whether the function actually
will have dynamic alignment prologue. This allows the removal of the
stack alignment heuristic pass, and improves code quality for cases where
the heuristic would result in dynamic alignment code being generated when
it was not strictly necessary.

llvm-svn: 93885
2010-01-19 18:31:11 +00:00
Jim Grosbach
2c3a6c6589 Factor the stack alignment calculations out into a target independent pass.
No functionality change.

llvm-svn: 90336
2009-12-02 19:30:24 +00:00
Jim Grosbach
01c1cae34d Detect need for autoalignment of the stack earlier to catch spills more
conservatively. eliminateFrameIndex() machinery adjust to handle addr mode
6 (vld1/vst1) used for spills. Fix tests to expect aligned Q-reg spilling

llvm-svn: 88874
2009-11-15 21:45:34 +00:00
Chris Lattner
8714348afd indicate what the native integer types for the target are.
Please verify.

llvm-svn: 86397
2009-11-07 19:07:32 +00:00
Evan Cheng
207b246650 - Add pseudo instructions tLDRpci_pic and t2LDRpci_pic which does a pc-relative
load of a GV from constantpool and then add pc. It allows the code sequence to
  be rematerializable so it would be hoisted by machine licm.
- Add a late pass to break these pseudo instructions into a number of real
  instructions. Also move the code in Thumb2 IT pass that breaks up t2MOVi32imm
  to this pass. This is done before post regalloc scheduling to allow the
  scheduler to proper schedule these instructions. It also allow them to be
  if-converted and shrunk by later passes.

llvm-svn: 86304
2009-11-06 23:52:48 +00:00
Daniel Dunbar
ad36e8aceb Pass StringRef by value.
llvm-svn: 86251
2009-11-06 10:58:06 +00:00
Anton Korobeynikov
76a4774a0d Move subtarget check upper for NEON reg-reg fixup pass.
llvm-svn: 85914
2009-11-03 18:46:11 +00:00
Anton Korobeynikov
d195f9e5c3 Turn neon reg-reg moves fixup code into separate pass. This should reduce the compile time.
llvm-svn: 85850
2009-11-03 01:04:26 +00:00
Bob Wilson
97b9312663 Revert r85346 change to control tail merging by CodeGenOpt::Level.
I'm going to redo this using the OptimizeForSize function attribute.

llvm-svn: 85426
2009-10-28 20:46:46 +00:00
Bob Wilson
9693f9d465 Record CodeGen optimization level in the BranchFolding pass so that we can
use it to control tail merging when there is a tradeoff between performance
and code size.  When there is only 1 instruction in the common tail, we have
been merging.  That can be good for code size but is a definite loss for
performance.  Now we will avoid tail merging in that case when the
optimization level is "Aggressive", i.e., "-O3".  Radar 7338114.

Since the IfConversion pass invokes BranchFolding, it too needs to know
the optimization level.  Note that I removed the RegisterPass instantiation
for IfConversion because it required a default constructor.  If someone
wants to keep that for some reason, we can add a default constructor with
a hard-wired optimization level.

llvm-svn: 85346
2009-10-27 23:49:38 +00:00
Bob Wilson
9d763cc3f8 Revert 84843. Evan, this was breaking some of the if-conversion tests.
llvm-svn: 84868
2009-10-22 16:52:21 +00:00
Evan Cheng
3615b9bef3 Move if-conversion before post-regalloc scheduling so the predicated instruction get scheduled properly.
llvm-svn: 84843
2009-10-22 06:48:32 +00:00
Evan Cheng
344fcd9d61 Trim include.
llvm-svn: 84831
2009-10-22 05:08:49 +00:00
Evan Cheng
2dcee28a61 Move load / store multiple before post-alloc scheduling.
llvm-svn: 83236
2009-10-02 04:57:15 +00:00
Evan Cheng
ce5a8ca3ef Add a option which would move ld/st multiple pass before post-alloc scheduling.
llvm-svn: 83145
2009-09-30 08:53:01 +00:00
Bob Wilson
2dd957fff6 Pass the optimization level when constructing the ARM instruction selector.
Otherwise, it is always set to "default", which prevents debug info from
even being generated during isel.  Radar 7250345.

llvm-svn: 82988
2009-09-28 14:30:20 +00:00
Evan Cheng
a6b9cab822 Enable pre-regalloc load / store multiple pass for Thumb2.
llvm-svn: 82893
2009-09-27 09:46:04 +00:00
Evan Cheng
6a3bdd872c Really remove this option.
llvm-svn: 82838
2009-09-26 02:49:49 +00:00
Evan Cheng
d0fe5abc23 Remove a couple of unused command line options.
llvm-svn: 82837
2009-09-26 02:45:45 +00:00
Jim Grosbach
04f9d2e264 trivial whitespace cleanup
llvm-svn: 81773
2009-09-14 17:27:35 +00:00
Chris Lattner
054574666a rename COFFMCAsmInfo -> MCAsmInfoCOFF, likewise for darwin.
llvm-svn: 79773
2009-08-22 21:03:30 +00:00
Chris Lattner
7b26fce23e Rename TargetAsmInfo (and its subclasses) to MCAsmInfo.
llvm-svn: 79763
2009-08-22 20:48:53 +00:00