Commit Graph

426 Commits

Author SHA1 Message Date
Eli Friedman
b1bbd5dca3 [ARM] Complete the Thumb1 shift+and->shift+shift transforms.
This saves materializing the immediate.  The additional forms are less
common (they don't usually show up for bitfield insert/extract), but
they're still relevant.

I had to add a new target hook to prevent DAGCombine from reversing the
transform. That isn't the only possible way to solve the conflict, but
it seems straightforward enough.

Differential Revision: https://reviews.llvm.org/D55630

llvm-svn: 349857
2018-12-20 23:39:54 +00:00
Tim Northover
5745b6ac3b ARM: use target-specific SUBS node when combining cmp with cmov.
This has two positive effects. First, using a custom node prevents
recombination leading to an infinite loop since the output DAG is notionally a
little more complex than the input one. Using a flag-setting instruction also
allows the subtraction to be folded with the related comparison more easily.

https://reviews.llvm.org/D53190

llvm-svn: 348122
2018-12-03 11:16:21 +00:00
Sjoerd Meijer
ecc7dcb879 [ARM] Don't expand sdiv when optimising for minsize
Don't expand SDIV with an immediate that is a power of 2 if we optimise for
minimum code size. For example:

sdiv %1, i32 4

gets expanded to a sequence of 3 instructions, but this is suboptimal for
minimum code size so instead we just generate a MOV and a SDIV if integer
division is supported.

Differential Revision: https://reviews.llvm.org/D54546

llvm-svn: 347965
2018-11-30 08:14:28 +00:00
Alex Bradbury
79518b02cd [AtomicExpandPass]: Add a hook for custom cmpxchg expansion in IR
This involves changing the shouldExpandAtomicCmpXchgInIR interface, but I have 
updated the in-tree backends using this hook (ARM, AArch64, Hexagon) so they 
will see no functional change. Previously this hook returned bool, but it now 
returns AtomicExpansionKind.

This hook allows targets to select how a given cmpxchg is to be expanded. 
D48131 uses this to expand part-word cmpxchg to a target-specific intrinsic.

See my associated RFC for more info on the motivation for this change 
<http://lists.llvm.org/pipermail/llvm-dev/2018-June/123993.html>.

Differential Revision: https://reviews.llvm.org/D48130

llvm-svn: 342550
2018-09-19 14:51:42 +00:00
Tim Northover
c15d47bb01 ARM: align loops to 4 bytes on Cortex-M3 and Cortex-M4.
The Technical Reference Manuals for these two CPUs state that branching
to an unaligned 32-bit instruction incurs an extra pipeline reload
penalty. That's bad.

This also enables the optimization at -Os since it costs on average one
byte per loop in return for 1 cycle per iteration, which is pretty good
going.

llvm-svn: 342127
2018-09-13 10:28:05 +00:00
Eli Friedman
d5d0a4d27f [ARM] Enable GEP offset splitting for 32-bit ARM.
It has essentially the same benefit it has on 64-bit ARM: it
substantially reduces the number of constants used by large GEP
operations. Seems to be generally helpful across a few different
codebases I've tried.

Differential Revision: https://reviews.llvm.org/D51462

llvm-svn: 341136
2018-08-30 22:18:27 +00:00
Eli Friedman
0d12e90bf5 [ARM] Make PerformSHLSimplify add nodes to the DAG worklist correctly.
Intentionally excluding nodes from the DAGCombine worklist is likely to
lead to weird optimizations and infinite loops, so it's generally a bad
idea.

To avoid the infinite loops, fix DAGCombine to use the
isDesirableToCommuteWithShift target hook before performing the
transforms in question, and implement the target hook in the ARM backend
disable the transforms in question.

Fixes https://bugs.llvm.org/show_bug.cgi?id=38530 . (I don't have a
reduced testcase for that bug. But we should have sufficient test
coverage for PerformSHLSimplify given that we're not playing weird
tricks with the worklist. I can try to bugpoint it if necessary,
though.)

Differential Revision: https://reviews.llvm.org/D50667

llvm-svn: 339734
2018-08-14 22:10:25 +00:00
Eli Friedman
e1687a89e8 [ARM] Adjust AND immediates to make them cheaper to select.
LLVM normally prefers to minimize the number of bits set in an AND
immediate, but that doesn't always match the available ARM instructions.
In Thumb1 mode, prefer uxtb or uxth where possible; otherwise, prefer
a two-instruction sequence movs+ands or movs+bics.

Some potential improvements outlined in
ARMTargetLowering::targetShrinkDemandedConstant, but seems to work
pretty well already.

The ARMISelDAGToDAG fix ensures we don't generate an invalid UBFX
instruction due to a larger-than-expected mask. (It's orthogonal, in
some sense, but as far as I can tell it's either impossible or nearly
impossible to reproduce the bug without this change.)

According to my testing, this seems to consistently improve codesize by
a small amount by forming bic more often for ISD::AND with an immediate.

Differential Revision: https://reviews.llvm.org/D50030

llvm-svn: 339472
2018-08-10 21:21:53 +00:00
Tim Northover
28e0a6f7dd ARM: don't try to over-align large vectors as arguments.
By default LLVM thinks very large vectors get aligned to their size when
passed across functions. Unfortunately no-one told the ARM backend so it
doesn't trigger stack realignment and so accesses can cause the usual
misalignment issues (e.g. a data abort).

This changes the ABI alignment to the stack alignment, which in practice
(and as a bonus) also coincides with the alignment "natural" vectors get.

llvm-svn: 331451
2018-05-03 12:54:25 +00:00
Adrian Prantl
5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Craig Topper
2fa1436206 [IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer.
Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it.

The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly.

Differential Revision: https://reviews.llvm.org/D45017

llvm-svn: 328806
2018-03-29 17:21:10 +00:00
David Blaikie
36a0f226b1 Fix layering by moving ValueTypes.h from CodeGen to IR
ValueTypes.h is implemented in IR already.

llvm-svn: 328397
2018-03-23 23:58:31 +00:00
David Blaikie
13e77db2df Fix layering of MachineValueType.h by moving it from CodeGen to Support
This is used by llvm tblgen as well as by LLVM Targets, so the only
common place is Support for now. (maybe we need another target for these
sorts of things - but for now I'm at least making them correct & we can
make them better if/when people have strong feelings)

llvm-svn: 328395
2018-03-23 23:58:25 +00:00
Christof Douma
4a025cc79d [ARM] Support float literals under XO
When targeting execute-only and fp-armv8, float constants in a compare
resulted in instruction selection failures. This is now fixed by using
vmov.f32 where possible, otherwise the floating point constant is
lowered into a integer constant that is moved into a floating point
register.

This patch also restores using fpcmp with immediate 0 under fp-armv8.

Change-Id: Ie87229706f4ed879a0c0cf66631b6047ed6c6443
llvm-svn: 328313
2018-03-23 13:02:03 +00:00
Sjoerd Meijer
89ea2648bb [ARM] Armv8.2-A FP16 code generation (part 3/3)
This adds most of the FP16 codegen support, but these areas need further work:

- FP16 literals and immediates are not properly supported yet (e.g. literal
  pool needs work),
- Instructions that are generated from intrinsics (e.g. vabs) haven't been
  added.

This will be addressed in follow-up patches.

Differential Revision: https://reviews.llvm.org/D42849

llvm-svn: 324321
2018-02-06 08:43:56 +00:00
Sjoerd Meijer
98d5359ea2 [ARM] Armv8.2-A FP16 code generation (part 2/3)
Half-precision arguments and return values are passed as if it were an int or
float for ARM. This results in truncates and bitcasts to/from i16 and f16
values, which are legalized very early to stack stores/loads. When FullFP16 is
enabled, we want to avoid codegen for these bitcasts as it is unnecessary and
inefficient.

Differential Revision: https://reviews.llvm.org/D42580

llvm-svn: 323861
2018-01-31 10:18:29 +00:00
Andre Vieira
5627c218e1 [ARM] Add codegen for SMMULR, SMMLAR and SMMLSR
This patch teaches the Arm back-end to generate the SMMULR, SMMLAR and SMMLSR
instructions from equivalent IR patterns.

Differential Revision: https://reviews.llvm.org/D41775

llvm-svn: 322361
2018-01-12 09:24:41 +00:00
Joel Galenson
6f4e827e4c [ARM] Optimize {s,u}{add,sub}.with.overflow.
The AArch64 backend contains code to optimize {s,u}{add,sub}.with.overflow during SelectionDAG.  This commit ports that code to the ARM backend.

Differential revision: https://reviews.llvm.org/D35635

llvm-svn: 321224
2017-12-20 22:25:39 +00:00
Florian Hahn
c3aa6d83fd [ARM] Lower unsigned saturation to USAT
Summary:
Implement lower of unsigned saturation on an interval [0, k] where k + 1 is a power of two using USAT instruction in a similar way to how [~k, k] is lowered using SSAT on ARM models that supports it.

Patch by Marten Svanfeldt

Reviewers: t.p.northover, pbarrio, eastig, SjoerdMeijer, javed.absar, fhahn

Reviewed By: fhahn

Subscribers: fhahn, aemerson, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D41348

llvm-svn: 321164
2017-12-20 11:13:57 +00:00
Matthias Braun
f1caa2833f MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

llvm-svn: 320884
2017-12-15 22:22:58 +00:00
Matt Arsenault
7d7adf4f2e TLI: Allow using PSV for intrinsic mem operands
llvm-svn: 320756
2017-12-14 22:34:10 +00:00
Roger Ferrer Ibanez
5ea0f2501f [ARM] Use ADDCARRY / SUBCARRY
This is a preparatory step for D34515.

This change:
 - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
 - lowering is done by first converting the boolean value into the carry flag
   using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
   using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
   operations does the actual addition.
 - for subtraction, given that ISD::SUBCARRY second result is actually a
   borrow, we need to invert the value of the second operand and result before
   and after using ARMISD::SUBE. We need to invert the carry result of
   ARMISD::SUBE to preserve the semantics.
 - given that the generic combiner may lower ISD::ADDCARRY and
   ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
   as well otherwise i64 operations now would require branches. This implies
   updating the corresponding test for unsigned.
 - add new combiner to remove the redundant conversions from/to carry flags
   to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
 - fixes PR34045
 - fixes PR34564
 - fixes PR35103

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 320355
2017-12-11 12:13:45 +00:00
Nirav Dave
7d8f3e0c93 [ARM][AArch64][DAG] Reenable post-legalize store merge
Reenable post-legalize stores with constant merging computation and
corresponding test case.

 * Properly truncate store merge constants
 * Disable merging of truncated stores floating points
 * Ensure merges of constant stores into a single vector are
   constructed from legal elements.

Reviewers: eastig, efriedma

Reviewed By: eastig

Subscribers: spatel, rengolin, aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40701

llvm-svn: 319899
2017-12-06 15:30:13 +00:00
Nirav Dave
3e76e1e89e [DAG][ARM] Revert "Reenable post-legalize store merge"
due to failures in AArch and ARM code gen.

llvm-svn: 319587
2017-12-01 21:55:47 +00:00
Nirav Dave
eb2b24fded [ARM][DAG] Reenable post-legalize store merge
Summary: Reenable post-legalize stores with constant merging computation and cofrresponding test case.

Reviewers: eastig, efriedma

Subscribers: aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40701

llvm-svn: 319547
2017-12-01 14:49:26 +00:00
Nirav Dave
bafaa53c4d [ARM][DAG] Revert Disable post-legalization store merge for ARM
Partially reverting enabling of post-legalization store merge
(r319036) for just ARM backend as it is causing incorrect code
in some Thumb2 cases.

llvm-svn: 319331
2017-11-29 18:06:13 +00:00
David Blaikie
b3bde2ea50 Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490
2017-11-17 01:07:10 +00:00
Roger Ferrer Ibanez
9dfbc10522 Revert r313618 "[ARM] Use ADDCARRY / SUBCARRY"
That change causes PR35103, so reverting until I figure it out.

llvm-svn: 317092
2017-11-01 14:06:57 +00:00
Eugene Zelenko
076468c0d0 [ARM] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 313823
2017-09-20 21:35:51 +00:00
Roger Ferrer Ibanez
8d0180c955 [ARM] Use ADDCARRY / SUBCARRY
This is a preparatory step for D34515.

This change:
 - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
 - lowering is done by first converting the boolean value into the carry flag
   using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
   using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
   operations does the actual addition.
 - for subtraction, given that ISD::SUBCARRY second result is actually a
   borrow, we need to invert the value of the second operand and result before
   and after using ARMISD::SUBE. We need to invert the carry result of
   ARMISD::SUBE to preserve the semantics.
 - given that the generic combiner may lower ISD::ADDCARRY and
   ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
   as well otherwise i64 operations now would require branches. This implies
   updating the corresponding test for unsigned.
 - add new combiner to remove the redundant conversions from/to carry flags
   to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
 - fixes PR34045
 - fixes PR34564

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 313618
2017-09-19 09:05:39 +00:00
Sam Parker
71efbe4c68 [ARM] Implement isTruncateFree
Implement the isTruncateFree hooks, lifted from AArch64, that are
used by TargetTransformInfo. This allows simplifycfg to reduce the
test case into a single basic block.

Differential Revision: https://reviews.llvm.org/D37516

llvm-svn: 313533
2017-09-18 14:28:51 +00:00
Hans Wennborg
8c1eb106bd Revert r313009 "[ARM] Use ADDCARRY / SUBCARRY"
This was causing PR34045 to fire again.

> This is a preparatory step for D34515 and also is being recommitted as its
> first version caused PR34045.
>
> This change:
>  - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
>  - lowering is done by first converting the boolean value into the carry flag
>    using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
>    using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
>    operations does the actual addition.
>  - for subtraction, given that ISD::SUBCARRY second result is actually a
>    borrow, we need to invert the value of the second operand and result before
>    and after using ARMISD::SUBE. We need to invert the carry result of
>    ARMISD::SUBE to preserve the semantics.
>  - given that the generic combiner may lower ISD::ADDCARRY and
>    ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
>    as well otherwise i64 operations now would require branches. This implies
>    updating the corresponding test for unsigned.
>  - add new combiner to remove the redundant conversions from/to carry flags
>    to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
>  - fixes PR34045
>
> Differential Revision: https://reviews.llvm.org/D35192

Also revert follow-up r313010:

> [ARM] Fix typo when creating ISD::SUB nodes
>
> In D35192, I accidentally introduced a typo when creating ISD::SUB nodes,
> giving them two values instead of one.
>
> This fails when the merge_values combiner finds one of these nodes.
>
> This change fixes PR34564.
>
> Differential Revision: https://reviews.llvm.org/D37690

llvm-svn: 313044
2017-09-12 16:24:17 +00:00
Roger Ferrer Ibanez
4f92b4162f [ARM] Use ADDCARRY / SUBCARRY
This is a preparatory step for D34515 and also is being recommitted as its
first version caused PR34045.

This change:
 - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
 - lowering is done by first converting the boolean value into the carry flag
   using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
   using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
   operations does the actual addition.
 - for subtraction, given that ISD::SUBCARRY second result is actually a
   borrow, we need to invert the value of the second operand and result before
   and after using ARMISD::SUBE. We need to invert the carry result of
   ARMISD::SUBE to preserve the semantics.
 - given that the generic combiner may lower ISD::ADDCARRY and
   ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
   as well otherwise i64 operations now would require branches. This implies
   updating the corresponding test for unsigned.
 - add new combiner to remove the redundant conversions from/to carry flags
   to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
 - fixes PR34045

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 313009
2017-09-12 07:40:09 +00:00
Hans Wennborg
075e5a2e2b Revert r312898 "[ARM] Use ADDCARRY / SUBCARRY"
It caused PR34564.

> This is a preparatory step for D34515 and also is being recommitted as its
> first version caused PR34045.
>
> This change:
>  - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
>  - lowering is done by first converting the boolean value into the carry flag
>    using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
>    using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
>    operations does the actual addition.
>  - for subtraction, given that ISD::SUBCARRY second result is actually a
>    borrow, we need to invert the value of the second operand and result before
>    and after using ARMISD::SUBE. We need to invert the carry result of
>    ARMISD::SUBE to preserve the semantics.
>  - given that the generic combiner may lower ISD::ADDCARRY and
>    ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
>    as well otherwise i64 operations now would require branches. This implies
>    updating the corresponding test for unsigned.
>  - add new combiner to remove the redundant conversions from/to carry flags
>    to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
>  - fixes PR34045
>
> Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 312980
2017-09-11 23:52:02 +00:00
Roger Ferrer Ibanez
12b20f2307 [ARM] Use ADDCARRY / SUBCARRY
This is a preparatory step for D34515 and also is being recommitted as its
first version caused PR34045.

This change:
 - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
 - lowering is done by first converting the boolean value into the carry flag
   using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
   using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
   operations does the actual addition.
 - for subtraction, given that ISD::SUBCARRY second result is actually a
   borrow, we need to invert the value of the second operand and result before
   and after using ARMISD::SUBE. We need to invert the carry result of
   ARMISD::SUBE to preserve the semantics.
 - given that the generic combiner may lower ISD::ADDCARRY and
   ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
   as well otherwise i64 operations now would require branches. This implies
   updating the corresponding test for unsigned.
 - add new combiner to remove the redundant conversions from/to carry flags
   to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
 - fixes PR34045

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 312898
2017-09-11 07:38:05 +00:00
Diana Picus
1d101d76c0 Move static helper into ARMTargetLowering. NFC
This exposes the isReadOnly(GlobalValue *) in the ARMTargetLowering so
we can make use of it in GlobalISel as well.

llvm-svn: 312320
2017-09-01 10:44:48 +00:00
Evgeny Astigeevich
540a39adf7 [ARM, Thumb1] Prevent ARMTargetLowering::isLegalAddressingMode from accepting illegal modes
ARMTargetLowering::isLegalAddressingMode can accept illegal addressing modes
for the Thumb1 target. This causes generation of redundant code and affects
performance.

This fixes PR34106: https://bugs.llvm.org/show_bug.cgi?id=34106

Differential Revision: https://reviews.llvm.org/D36467

llvm-svn: 311649
2017-08-24 10:00:25 +00:00
Craig Topper
2251ef95a3 [X86][ARM][TargetLowering] Add SrcVT to isExtractSubvectorCheap
Summary:
Without the SrcVT its hard to know what is really being asked for. For example if your target has 128, 256, and 512 bit vectors. Maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not.

For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors.

Reviewers: RKSimon, zvi, efriedma

Reviewed By: RKSimon

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D36649

llvm-svn: 310793
2017-08-13 17:29:07 +00:00
Nico Weber
bfde70b097 Revert r309923, it caused PR34045.
llvm-svn: 309950
2017-08-03 15:41:26 +00:00
Roger Ferrer Ibanez
854980341b [ARM] Use ADDCARRY / SUBCARRY
This patch:

- makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
- lowering is done by first converting the boolean value into the carry flag
  using (_, C) <- (ARMISD::ADDC R, -1) and converted back to an integer value
  using (R, _) <- (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
  operations does the actual addition.
- for subtraction, given that ISD::SUBCARRY second result is actually a
  borrow, we need to invert the value of the second operand and result before
  and after using ARMISD::SUBE. We need to invert the carry result of
  ARMISD::SUBE to preserve the semantics.
- given that the generic combiner may lower ISD::ADDCARRY and
  ISD::SUBCARRY into ISD::UADDO and ISD::USUBO we need to update their lowering
  as well otherwise i64 operations now would require branches. This implies
  updating the corresponding test for unsigned.
- add new combiner to remove the redundant conversions from/to carry flags
  to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) -> C

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 309923
2017-08-03 07:45:10 +00:00
Zvi Rackover
1b73682243 TargetLowering: Change isShuffleMaskLegal's mask argument type to ArrayRef<int>. NFCI.
Changing mask argument type from const SmallVectorImpl<int>& to
ArrayRef<int>.

This came up in D35700 where a mask is received as an ArrayRef<int> and
we want to pass it to TargetLowering::isShuffleMaskLegal().
Also saves a few lines of code.

llvm-svn: 309085
2017-07-26 08:06:58 +00:00
Jonas Paulsson
024e319489 [SystemZ, LoopStrengthReduce]
This patch makes LSR generate better code for SystemZ in the cases of memory
intrinsics, Load->Store pairs or comparison of immediate with memory.

In order to achieve this, the following common code changes were made:

 * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if
 LSR should do instruction-based addressing evaluations by calling
 isLegalAddressingMode() with the Instruction pointers.
 * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy
 as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address,
 not just loads or stores.

SystemZ changes:

 * isLSRCostLess() implemented with Insns first, and without ImmCost.
 * New function supportedAddressingMode() that is a helper for TTI methods
 looking at Instructions passed via pointers.

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D35262
https://reviews.llvm.org/D35049

llvm-svn: 308729
2017-07-21 11:59:37 +00:00
Nirav Dave
4dcad5dc6b Add DAG argument to canMergeStoresTo NFC.
llvm-svn: 307583
2017-07-10 20:25:54 +00:00
Alexandros Lamprineas
2b2b420563 [ARM] Support constant pools in data when generating execute-only code.
Resubmission of r305387, which was reverted at r305390. The Address
Sanitizer caught a stack-use-after-scope of a Twine variable. This
is now fixed by passing the Twine directly as a function parameter.

The ARM backend asserts against constant pool lowering when it generates
execute-only code in order to prevent the generation of constant pools in
the text section. It appears that target independent optimizations might
generate DAG nodes that represent constant pools. By lowering such nodes
as global addresses we don't violate the semantics of execute-only code
and also it is guaranteed that execute-only behaves correct with the
position-independent addressing modes that support execute-only code.

Differential Revision: https://reviews.llvm.org/D33773

llvm-svn: 305776
2017-06-20 07:20:52 +00:00
Alexandros Lamprineas
1c15ee2631 Revert "[ARM] Support constant pools in data when generating execute-only code."
This reverts commit 3a204faa093c681a1e96c5e0622f50649b761ee0.

I've upset a buildbot which runs the address sanitizer:
ERROR: AddressSanitizer: stack-use-after-scope
lib/Target/ARM/ARMISelLowering.cpp:2690
That Twine variable is used illegally.

llvm-svn: 305390
2017-06-14 15:00:08 +00:00
Alexandros Lamprineas
c582d6e133 [ARM] Support constant pools in data when generating execute-only code.
The ARM backend asserts against constant pool lowering when it generates
execute-only code in order to prevent the generation of constant pools in
the text section. It appears that target independent optimizations might
generate DAG nodes that represent constant pools. By lowering such nodes
as global addresses we don't violate the semantics of execute-only code
and also it is guaranteed that execute-only behaves correct with the
position-independent addressing modes that support execute-only code.

Differential Revision: https://reviews.llvm.org/D33773

llvm-svn: 305387
2017-06-14 13:22:41 +00:00
Nirav Dave
6c910c0dd8 [DAG] Add AddressSpace parameter to canMergeStoresTo. NFC.
llvm-svn: 303673
2017-05-23 18:53:02 +00:00
Tim Shen
04de70d3a7 [Atomic] Remove IsStore/IsLoad in the interface, and pass the instruction instead. NFC.
Now both emitLeadingFence and emitTrailingFence take the instruction
itself, instead of taking IsLoad/IsStore pairs.
Instruction::mayReadFromMemory and Instrucion::mayWriteToMemory are used
for determining those two booleans.

The instruction argument is also useful for later D32763, in
emitTrailingFence. For emitLeadingFence, it seems to have cleaner
interface with the proposed change.

Differential Revision: https://reviews.llvm.org/D32762

llvm-svn: 302539
2017-05-09 15:27:17 +00:00
Matthias Braun
4682ac6c83 ARM: Compute MaxCallFrame size early
This exposes a method in MachineFrameInfo that calculates
MaxCallFrameSize and calls it after instruction selection in the ARM
target.

This avoids
ARMBaseRegisterInfo::canRealignStack()/ARMFrameLowering::hasReservedCallFrame()
giving different answers in early/late phases of codegen.

The testcase shows a particular nasty example result of that where we
would fail to properly align an alloca.

Differential Revision: https://reviews.llvm.org/D32622

llvm-svn: 302303
2017-05-05 22:04:05 +00:00
Sam Parker
df337704f0 [ARM] ACLE Chapter 9 intrinsics
Added the integer data processing intrinsics from ACLE v2.1 Chapter 9
but I have missed out the saturation_occurred intrinsics for now. For
the instructions that read and write the GE bits, a chain is included
and the only instruction that reads these flags (sel) is only
selectable via the implemented intrinsic.

Differential Revision: https://reviews.llvm.org/D32281

llvm-svn: 302126
2017-05-04 07:31:28 +00:00