Commit Graph

334 Commits

Author SHA1 Message Date
Teresa Johnson
716abbeb43 [HotColdSplit] Move splitting after instrumented PGO use
Summary:
Follow up to D57082 which moved splitting earlier in the pipeline, in
order to perform it before inlining. However, it was moved too early,
before the IR is annotated with instrumented PGO data. This caused the
splitting to incorrectly determine cold functions.

Move it to just after PGO annotation (still before inlining), in both
pass managers.

Reviewers: vsk, hiraditya, sebpop

Subscribers: mehdi_amini, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57805

llvm-svn: 353270
2019-02-06 04:29:39 +00:00
Vedant Kumar
ef1ebed1c6 [HotColdSplit] Move splitting earlier in the pipeline
Performing splitting early has several advantages:

  - Inhibiting inlining of cold code early improves code size. Compared
    to scheduling splitting at the end of the pipeline, this cuts code
    size growth in half within the iOS shared cache (0.69% to 0.34%).

  - Inhibiting inlining of cold code improves compile time. There's no
    need to inline split cold functions, or to inline as much *within*
    those split functions as they are marked `minsize`.

  - During LTO, extra work is only done in the pre-link step. Less code
    must be inlined during cross-module inlining.

An additional motivation here is that the most common cold regions
identified by the static/conservative splitting heuristic can (a) be
found before inlining and (b) do not grow after inlining. E.g.
__assert_fail, os_log_error.

The disadvantages are:

  - Some opportunities for splitting out cold code may be missed. This
    gap can potentially be narrowed by adding a worklist algorithm to the
    splitting pass.

  - Some opportunities to reduce code size may be lost (e.g. store
    sinking, when one side of the CFG diamond is split). This does not
    outweigh the code size benefits of splitting earlier.

On net, splitting early in the pipeline has substantial code size
benefits, and no major effects on memory locality or performance. We
measured memory locality using ktrace data, and consistently found that
10% fewer pages were needed to capture 95% of text page faults in key
iOS benchmarks. We measured performance on frequency-stabilized iOS
devices using LNT+externals.

This reverses course on the decision made to schedule splitting late in
r344869 (D53437).

Differential Revision: https://reviews.llvm.org/D57082

llvm-svn: 352080
2019-01-24 18:55:49 +00:00
Chandler Carruth
2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Wei Mi
3bcccdfe38 [SampleFDO] Skip profile reading when flattened profile used in ThinLTO postlink
If the sample profile has no inlining hierachy information included, we call
the sample profile is flattened. For flattened profile, in ThinLTO postlink
phase, SampleProfileLoader's hot function inlining and profile annotation will
do nothing, so it is better to save the effort to read in the profile and run
the sample profile loader pass. It is helpful for reducing compile time when
the flattened profile is huge.

Differential Revision: https://reviews.llvm.org/D54819

llvm-svn: 351476
2019-01-17 20:48:34 +00:00
Wei Mi
79c4408aa2 Fix a mistake in rL351392.
PGOInstrGen should be initialized to "" instead of false.

llvm-svn: 351397
2019-01-16 23:31:40 +00:00
Wei Mi
c876e3d42b [PGO] Make pgo related options in opt more consistent.
Currently we have pgo options defined in PassManagerBuilder.cpp only for
instrument pgo, but not for sample pgo. We also have pgo options defined
in NewPMDriver.cpp in opt only for new pass manager and for all kinds of
pgo. They have some inconsistency.

To make the options more consistent and make tests writing easier, the
patch let old pass manager to share the same pgo options with new pass
manager in opt, and removes the options in PassManagerBuilder.cpp.

Differential Revision: https://reviews.llvm.org/D56749

llvm-svn: 351392
2019-01-16 23:19:02 +00:00
Teresa Johnson
853b962416 [ThinLTO] Handle chains of aliases
At -O0, globalopt is not run during the compile step, and we can have a
chain of an alias having an immediate aliasee of another alias. The
summaries are constructed assuming aliases in a canonical form
(flattened chains), and as a result only the base object but no
intermediate aliases were preserved.

Fix by adding a pass that canonicalize aliases, which ensures each
alias is a direct alias of the base object.

Reviewers: pcc, davidxl

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D54507

llvm-svn: 350423
2019-01-04 19:04:54 +00:00
Michael Kruse
d4eb13c880 [LoopVectorize] Rename pass options. NFC.
Rename:
NoUnrolling to InterleaveOnlyWhenForced
and
AlwaysVectorize to !VectorizeOnlyWhenForced

Contrary to what the name 'AlwaysVectorize' suggests, it does not
unconditionally vectorize all loops, but applies a cost model to
determine whether vectorization is profitable to all loops. Hence,
passing false will disable the cost model, except when a loop is marked
with llvm.loop.vectorize.enable. The 'OnlyWhenForced' suffix (suggested
by @hfinkel in D55716) better matches this behavior.

Similarly, 'NoUnrolling' disables the profitability cost model for
interleaving (a term to distinguish it from unrolling by the
LoopUnrollPass); rename it for consistency.

Differential Revision: https://reviews.llvm.org/D55785

llvm-svn: 349513
2018-12-18 17:46:09 +00:00
Michael Kruse
3284775b70 [LoopUnroll] Honor '#pragma unroll' even with -fno-unroll-loops.
When using clang with `-fno-unroll-loops` (implicitly added with `-O1`),
the LoopUnrollPass is not not added to the (legacy) pass pipeline. This
also means that it will not process any loop metadata such as
llvm.loop.unroll.enable (which is generated by #pragma unroll or
WarnMissedTransformationsPass emits a warning that a forced
transformation has not been applied (see
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610833.html).
Such explicit transformations should take precedence over disabling
heuristics.

This patch unconditionally adds LoopUnrollPass to the optimizing
pipeline (that is, it is still not added with `-O0`), but passes a flag
indicating whether automatic unrolling is dis-/enabled. This is the same
approach as LoopVectorize uses.

The new pass manager's pipeline builder has no option to disable
unrolling, hence the problem does not apply.

Differential Revision: https://reviews.llvm.org/D55716

llvm-svn: 349509
2018-12-18 17:16:05 +00:00
Michael Kruse
7244852557 [Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.
When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g.

    #pragma clang loop unroll_and_jam(enable)
    #pragma clang loop distribute(enable)

is the same as

    #pragma clang loop distribute(enable)
    #pragma clang loop unroll_and_jam(enable)

and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used.

This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance,

    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.unroll_and_jam.enable"}
    !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3}
    !3 = !{!"llvm.loop.distribute.enable"}

defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop.

Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account.

For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations.

Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated.

To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied.

With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling).

Reviewed By: hfinkel, dmgreen

Differential Revision: https://reviews.llvm.org/D49281
Differential Revision: https://reviews.llvm.org/D55288

llvm-svn: 348944
2018-12-12 17:32:52 +00:00
Xin Tong
642c8d3575 [LTO] Load sample profile in LTO link step.
Summary:
Load sample profile in LTO link step.
ThinLTO calls populateModulePassManager to load the profile

Reviewers: tejohnson, davidxl, danielcdh

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D54564

llvm-svn: 346971
2018-11-15 18:06:42 +00:00
Aditya Kumar
d9e2e383a9 Schedule Hot Cold Splitting pass after most optimization passes
Summary:
In the new+old pass manager, hot cold splitting was schedule too early.
Thanks to Vedant for pointing this out.

Reviewers: sebpop, vsk

Reviewed By: sebpop, vsk

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D53437

llvm-svn: 344869
2018-10-21 18:11:56 +00:00
Eric Christopher
dcf1d97c5c Temporarily revert "[GVNHoist] Re-enable GVNHoist by default"
This reverts commit r342387 as it's showing significant performance
regressions in a number of benchmarks. Followed up with the
committer and original thread with an example and will get performance
numbers before recommitting.

llvm-svn: 343522
2018-10-01 18:57:08 +00:00
Florian Hahn
8600fee52e Recommit r343308: [LoopInterchange] Turn into a loop pass.
llvm-svn: 343450
2018-10-01 09:59:48 +00:00
whitequark
29b2980159 Revert "[LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints"
This reverts commit c4baf7c2f06ff5459c4f5998ce980346e72bff97.

Broke the bots, and should really be in Transforms/Coroutines
instead.

llvm-svn: 343337
2018-09-28 16:45:18 +00:00
whitequark
937afbc365 [LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints
Summary: This patch adds bindings to C and Go for addCoroutinePassesToExtensionPoints, which is used to add coroutine passes to the correct locations in PassManagerBuilder.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: mehdi_amini, modocache, llvm-commits

Differential Revision: https://reviews.llvm.org/D51642

llvm-svn: 343336
2018-09-28 16:38:11 +00:00
Florian Hahn
8d72ecc36f Revert r343308: [LoopInterchange] Turn into a loop pass.
llvm-svn: 343310
2018-09-28 10:20:07 +00:00
Florian Hahn
0694c159f7 [LoopInterchange] Turn into a loop pass.
This patch turns LoopInterchange into a loop pass. It now only
considers top-level loops and tries to move the innermost loop to the
optimal position within the loop nest. By only looking at top-level
loops, we might miss a few opportunities the function pass would get
(e.g. if we have a loop nest of 3 loops, in the function pass
we might process loops at level 1 and 2 and move the inner most loop to
level 1, and then we process loops at levels 0, 1, 2 and interchange
again, because we now have a different inner loop). But I think it would
be better to handle such cases by picking the best inner loop from the
start and avoid re-visiting the same loops again.

The biggest advantage of it being a function pass is that it interacts
nicely with the other loop passes. Without this patch, there are some
performance regressions on AArch64 with loop interchanging enabled,
where no loops were interchanged, but we missed out on some other loop
optimizations.

It also removes the SimplifyCFG run. We are just changing branches, so
the CFG should not be more complicated, besides the additional 'unique'
preheaders this pass might create.


Reviewers: chandlerc, efriedma, mcrosier, javed.absar, xbolva00

Reviewed By: xbolva00

Differential Revision: https://reviews.llvm.org/D51702

llvm-svn: 343308
2018-09-28 09:45:50 +00:00
Alexandros Lamprineas
8a1c374b2e [GVNHoist] Re-enable GVNHoist by default
Rebase rL341954 since https://bugs.llvm.org/show_bug.cgi?id=38912
has been fixed by rL342055.

Precommit testing performed:
* Overnight runs of csmith comparing the output between programs
  compiled with gvn-hoist enabled/disabled.
* Bootstrap builds of clang with UbSan/ASan configurations.

llvm-svn: 342387
2018-09-17 12:24:55 +00:00
Alexandros Lamprineas
fe0512d575 Revert "[GVNHoist] Re-enable GVNHoist by default"
This reverts rL341954.

The builder `sanitizer-x86_64-linux-bootstrap-ubsan` has been
failing with timeouts at stage2 clang/ubsan:

[3065/3073] Linking CXX executable bin/lld
command timed out: 1200 seconds without output running python
../sanitizer_buildbot/sanitizers/buildbot_selector.py,
attempting to kill

llvm-svn: 342001
2018-09-11 22:10:57 +00:00
Alexandros Lamprineas
db18e972d7 [GVNHoist] Re-enable GVNHoist by default
Rebase rL340922 since https://bugs.llvm.org/show_bug.cgi?id=38807
has been fixed by rL341947.

llvm-svn: 341954
2018-09-11 15:55:45 +00:00
Aditya Kumar
801394a3d7 Hot cold splitting pass
Find cold blocks based on profile information (or optionally with static analysis).
Forward propagate profile information to all cold-blocks.
Outline a cold region.
Set calling conv and prof hint for the callsite of the outlined function.

Worked in collaboration with: Sebastian Pop <s.pop@samsung.com>
Differential Revision: https://reviews.llvm.org/D50658

llvm-svn: 341669
2018-09-07 15:03:49 +00:00
Hiroshi Yamauchi
9775a620b0 [PGO] Control Height Reduction
Summary:
Control height reduction merges conditional blocks of code and reduces the
number of conditional branches in the hot path based on profiles.

if (hot_cond1) { // Likely true.
  do_stg_hot1();
}
if (hot_cond2) { // Likely true.
  do_stg_hot2();
}

->

if (hot_cond1 && hot_cond2) { // Hot path.
  do_stg_hot1();
  do_stg_hot2();
} else { // Cold path.
  if (hot_cond1) {
    do_stg_hot1();
  }
  if (hot_cond2) {
    do_stg_hot2();
  }
}

This speeds up some internal benchmarks up to ~30%.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: xbolva00, dmgreen, mehdi_amini, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D50591

llvm-svn: 341386
2018-09-04 17:19:13 +00:00
Alexandros Lamprineas
f6db5bcd38 Revert r340922 "[GVNHoist] Re-enable GVNHoist by default"
Another sanitizer buildbot failed this time at bootstrap when
compiling SemaTemplateInstantiate.cpp with this assertion:
`dominates(MD, U) && "Memory Def does not dominate it's uses"'.

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/15047

llvm-svn: 340925
2018-08-29 13:00:55 +00:00
Alexandros Lamprineas
c03b9b8854 [GVNHoist] Re-enable GVNHoist by default
Rebase rL338240 since the excessive memory usage observed when using
GVNHoist with UBSan has been fixed by rL340818.

Differential Revision: https://reviews.llvm.org/D49858

llvm-svn: 340922
2018-08-29 11:58:34 +00:00
Vlad Tsyrklevich
1c7160e85f Revert "[GVNHoist] Re-enable GVNHoist by default"
This reverts commit r338240 because it was causing OOMs on the UBSan
buildbot when building clang/lib/Sema/SemaChecking.cpp

llvm-svn: 338297
2018-07-30 20:07:33 +00:00
Alexandros Lamprineas
de3ca964c1 [GVNHoist] Re-enable GVNHoist by default
My initial motivation for this came from https://reviews.llvm.org/D48122,
where it was pointed out that my change didn't fit well in SimplifyCFG and
therefore using GVNHoist was a better way to go. GVNHoist has been disabled
for a while as there was a list of bugs related to it.

I have fixed the following bugs:

https://bugs.llvm.org/show_bug.cgi?id=37808 -> https://reviews.llvm.org/D48372 (rL337149)
https://bugs.llvm.org/show_bug.cgi?id=36787 -> https://reviews.llvm.org/D49555 (rL337674)
https://bugs.llvm.org/show_bug.cgi?id=37445 -> https://reviews.llvm.org/D49425 (rL337680)

The next two bugs no longer occur, and it's unclear which commit fixed them:

https://bugs.llvm.org/show_bug.cgi?id=36635
https://bugs.llvm.org/show_bug.cgi?id=37791

I investigated this one and proved to be unrelated to GVNHoist, but a genuine bug in NewGvn:

https://bugs.llvm.org/show_bug.cgi?id=37660

To convince myself GVNHoist is in a good state I made a successful bootstrap build of LLVM.
Merging this change now in order to make it to the LLVM 7.0.0 branch.

Differential Revision: https://reviews.llvm.org/D49858

llvm-svn: 338240
2018-07-30 10:50:18 +00:00
Teresa Johnson
e214fdeb69 [ThinLTO] Ensure the TargetLibraryInfo is constructed early enough
Summary:
Without this change, the WholeProgramDevirt pass, which requires the
TargetLibraryInfo, will construct one from the default triple.

Fixes PR38139.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49278

llvm-svn: 337750
2018-07-23 21:58:19 +00:00
David Green
963401d2be [UnrollAndJam] New Unroll and Jam pass
This is a simple implementation of the unroll-and-jam classical loop
optimisation.

The basic idea is that we take an outer loop of the form:

  for i..
    ForeBlocks(i)
    for j..
      SubLoopBlocks(i, j)
    AftBlocks(i)

Instead of doing normal inner or outer unrolling, we unroll as follows:

  for i... i+=2
    ForeBlocks(i)
    ForeBlocks(i+1)
    for j..
      SubLoopBlocks(i, j)
      SubLoopBlocks(i+1, j)
    AftBlocks(i)
    AftBlocks(i+1)
  Remainder Loop

So we have unrolled the outer loop, then jammed the two inner loops into
one. This can lead to a simpler inner loop if memory accesses can be shared
between the now jammed loops.

To do this we have to prove that this is all safe, both for the memory
accesses (using dependence analysis) and that ForeBlocks(i+1) can move before
AftBlocks(i) and SubLoopBlocks(i, j).

Differential Revision: https://reviews.llvm.org/D41953

llvm-svn: 336062
2018-07-01 12:47:30 +00:00
Chandler Carruth
7c557f804d [instsimplify] Move the instsimplify pass to use more obvious file names
and diretory.

Also cleans up all the associated naming to be consistent and removes
the public access to the pass ID which was unused in LLVM.

Also runs clang-format over parts that changed, which generally cleans
up a bunch of formatting.

This is in preparation for doing some internal cleanups to the pass.

Differential Revision: https://reviews.llvm.org/D47352

llvm-svn: 336028
2018-06-29 23:36:03 +00:00
Tobias Edler von Koch
7609cb83e6 Re-land "[LTO] Enable module summary emission by default for regular LTO"
Since we are now producing a summary also for regular LTO builds, we
need to run the NameAnonGlobals pass in those cases as well (the
summary cannot handle anonymous globals).

See https://reviews.llvm.org/D34156 for details on the original change.

This reverts commit 6c9ee4a4a438a8059aacc809b2dd57128fccd6b3.

llvm-svn: 335385
2018-06-22 20:23:21 +00:00
Chandler Carruth
aa5f4d2e23 Revert r335306 (and r335314) - the Call Graph Profile pass.
This is the first pass in the main pipeline to use the legacy PM's
ability to run function analyses "on demand". Unfortunately, it turns
out there are bugs in that somewhat-hacky approach. At the very least,
it leaks memory and doesn't support -debug-pass=Structure. Unclear if
there are larger issues or not, but this should get the sanitizer bots
back to green by fixing the memory leaks.

llvm-svn: 335320
2018-06-22 05:33:57 +00:00
Michael J. Spencer
fc93dd8e18 [Instrumentation] Add Call Graph Profile pass
This patch adds support for generating a call graph profile from Branch Frequency Info.

The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight.

After scanning all the functions, it generates an appending module flag containing the data. The format looks like:

!llvm.module.flags = !{!0}

!0 = !{i32 5, !"CG Profile", !1}
!1 = !{!2, !3, !4} ; List of edges
!2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
!3 = !{void (i1)* @freq, void ()* @a, i64 11}
!4 = !{void (i1)* @freq, void ()* @b, i64 20}

Differential Revision: https://reviews.llvm.org/D48105

llvm-svn: 335306
2018-06-21 23:31:10 +00:00
Chandler Carruth
71fd27043e [PM/LoopUnswitch] When using the new SimpleLoopUnswitch pass, schedule
loop-cleanup passes at the beginning of the loop pass pipeline, and
re-enqueue loops after even trivial unswitching.

This will allow us to much more consistently avoid simplifying code
while doing trivial unswitching. I've also added a test case that
specifically shows effective iteration using this technique.

I've unconditionally updated the new PM as that is always using the
SimpleLoopUnswitch pass, and I've made the pipeline changes for the old
PM conditional on using this new unswitch pass. I added a bunch of
comments to the loop pass pipeline in the old PM to make it more clear
what is going on when reviewing.

Hopefully this will unblock doing *partial* unswitching instead of just
full unswitching.

Differential Revision: https://reviews.llvm.org/D47408

llvm-svn: 333493
2018-05-30 02:46:45 +00:00
David Green
aee7ad0cde Revert 333358 as it's failing on some builders.
I'm guessing the tests reply on the ARM backend being built.

llvm-svn: 333359
2018-05-27 12:54:33 +00:00
David Green
3034281b43 [UnrollAndJam] Add a new Unroll and Jam pass
This is a simple implementation of the unroll-and-jam classical loop
optimisation.

The basic idea is that we take an outer loop of the form:

for i..
  ForeBlocks(i)
  for j..
    SubLoopBlocks(i, j)
  AftBlocks(i)

Instead of doing normal inner or outer unrolling, we unroll as follows:

for i... i+=2
  ForeBlocks(i)
  ForeBlocks(i+1)
  for j..
    SubLoopBlocks(i, j)
    SubLoopBlocks(i+1, j)
  AftBlocks(i)
  AftBlocks(i+1)
Remainder

So we have unrolled the outer loop, then jammed the two inner loops into
one. This can lead to a simpler inner loop if memory accesses can be shared
between the now-jammed loops.

To do this we have to prove that this is all safe, both for the memory
accesses (using dependence analysis) and that ForeBlocks(i+1) can move before
AftBlocks(i) and SubLoopBlocks(i, j).

Differential Revision: https://reviews.llvm.org/D41953

llvm-svn: 333358
2018-05-27 12:11:21 +00:00
Vlad Tsyrklevich
b768d235a9 Revert "Enable EliminateAvailableExternally pass for -O1"
This reverts commit r330961 because it breaks a handful of clang tests.

llvm-svn: 330964
2018-04-26 17:54:53 +00:00
Vlad Tsyrklevich
42c5a9c29a Enable EliminateAvailableExternally pass for -O1
Summary:
Follow-up to D43690, the EliminateAvailableExternally pass currently
runs under -O0 and -O2 and up. Under -O1 we would still want to drop
available_externally symbols to reduce space without inlining having
run.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: mehdi_amini, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D46093

llvm-svn: 330961
2018-04-26 17:33:24 +00:00
David Blaikie
ba47dd16c5 Fix some layering in AggressiveInstCombine (avoiding inclusion of Scalar.h)
llvm-svn: 330726
2018-04-24 15:40:07 +00:00
David Blaikie
a27771b62f InstCombine: Fix layering by not including Scalar.h in InstCombine
(notionally Scalar.h is part of libLLVMScalarOpts, so it shouldn't be
included by InstCombine which doesn't/shouldn't need to depend on
ScalarOpts)

llvm-svn: 330669
2018-04-24 00:48:59 +00:00
David Blaikie
a373d18eb7 Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h
Fixes layering - Transforms/Utils shouldn't depend on including a Scalar
or IPO header, because Scalar and IPO depend on Utils.

llvm-svn: 328717
2018-03-28 17:44:36 +00:00
Amjad Aboud
f1f57a3137 Another try to commit 323321 (aggressive instruction combine).
llvm-svn: 323416
2018-01-25 12:06:32 +00:00
Amjad Aboud
d53504e379 Reverted 323321.
llvm-svn: 323326
2018-01-24 14:48:49 +00:00
Amjad Aboud
e4453233d7 [InstCombine] Introducing Aggressive Instruction Combine pass (-aggressive-instcombine).
Combine expression patterns to form expressions with fewer, simple instructions.
This pass does not modify the CFG.

For example, this pass reduce width of expressions post-dominated by TruncInst
into smaller width when applicable.

It differs from instcombine pass in that it contains pattern optimization that
requires higher complexity than the O(1), thus, it should run fewer times than
instcombine pass.

Differential Revision: https://reviews.llvm.org/D38313

llvm-svn: 323321
2018-01-24 12:42:42 +00:00
Sanjay Patel
0ab0c1a201 [SimplifyCFG] don't sink common insts too soon (PR34603)
This should solve:
https://bugs.llvm.org/show_bug.cgi?id=34603
...by preventing SimplifyCFG from altering redundant instructions before early-cse has a chance to run.
It changes the default (canonical-forming) behavior of SimplifyCFG, so we're only doing the
sinking transform later in the optimization pipeline.

Differential Revision: https://reviews.llvm.org/D38566

llvm-svn: 320749
2017-12-14 22:05:20 +00:00
Michael Zolotukhin
6af4f232b5 Remove redundant includes from lib/Transforms.
llvm-svn: 320628
2017-12-13 21:31:01 +00:00
Hans Wennborg
e1ecd61b98 Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining
Clang implements the -finstrument-functions flag inherited from GCC, which
inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit.

This is useful for getting a trace of how the functions in a program are
executed. Normally, the calls remain even if a function is inlined into another
function, but it is useful to be able to turn this off for users who are
interested in a lower-level trace, i.e. one that reflects what functions are
called post-inlining. (We use this to generate link order files for Chromium.)

LLVM already has a pass for inserting similar instrumentation calls to
mcount(), which it does after inlining. This patch renames and extends that
pass to handle calls both to mcount and the cygprofile functions, before and/or
after inlining as controlled by function attributes.

Differential Revision: https://reviews.llvm.org/D39287

llvm-svn: 318195
2017-11-14 21:09:45 +00:00
Teresa Johnson
07ec7d59c2 [ThinLTO] Ensure sanitizer passes are run
Summary:
In ThinLTO compilation, we exit populateModulePassManager early and
were not adding PM extension passes meant to run at the end of the
pipeline. This includes sanitizer passes. Add these passes before
the early exit.

A test will be added to projects/compiler-rt.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D39565

llvm-svn: 317714
2017-11-08 19:45:52 +00:00
Jun Bum Lim
0c99007db1 Recommit r317351 : Add CallSiteSplitting pass
This recommit r317351 after fixing a buildbot failure.

Original commit message:

    Summary:
    This change add a pass which tries to split a call-site to pass
    more constrained arguments if its argument is predicated in the control flow
    so that we can expose better context to the later passes (e.g, inliner, jump
    threading, or IPA-CP based function cloning, etc.).
    As of now we support two cases :

    1) If a call site is dominated by an OR condition and if any of its arguments
    are predicated on this OR condition, try to split the condition with more
    constrained arguments. For example, in the code below, we try to split the
    call site since we can predicate the argument (ptr) based on the OR condition.

    Split from :
          if (!ptr || c)
            callee(ptr);
    to :
          if (!ptr)
            callee(null ptr)  // set the known constant value
          else if (c)
            callee(nonnull ptr)  // set non-null attribute in the argument

    2) We can also split a call-site based on constant incoming values of a PHI
    For example,
    from :
          BB0:
           %c = icmp eq i32 %i1, %i2
           br i1 %c, label %BB2, label %BB1
          BB1:
           br label %BB2
          BB2:
           %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ]
           call void @bar(i32 %p)
    to
          BB0:
           %c = icmp eq i32 %i1, %i2
           br i1 %c, label %BB2-split0, label %BB1
          BB1:
           br label %BB2-split1
          BB2-split0:
           call void @bar(i32 0)
           br label %BB2
          BB2-split1:
           call void @bar(i32 1)
           br label %BB2
          BB2:
           %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ]

llvm-svn: 317362
2017-11-03 20:41:16 +00:00
Jun Bum Lim
0eb1c2d63a Revert "Add CallSiteSplitting pass"
Revert due to Buildbot failure.

This reverts commit r317351.

llvm-svn: 317353
2017-11-03 19:17:11 +00:00