For instructions in uniform set, they will not have vector versions so
add them to VecValuesToIgnore.
For induction vars, those only used in uniform instructions or consecutive
ptrs instructions have already been added to VecValuesToIgnore above. For
those induction vars which are only used in uniform instructions or
non-consecutive/non-gather scatter ptr instructions, the related phi and
update will also be added into VecValuesToIgnore set.
The change will make the vector RegUsages estimation less conservative.
Differential Revision: https://reviews.llvm.org/D20474
llvm-svn: 275912
This patch swaps A and B in the interleaved access analysis and clarifies
related comments. The algorithm is more intuitive if we let access A precede
access B in program order rather than the reverse. This change was requested in
the review of D19984.
llvm-svn: 275567
We now collect all accesses with a constant stride, not just the ones with a
stride greater than one. This change was requested in the review of D19984.
llvm-svn: 275473
This patch allows the formation of interleaved access groups in loops
containing predicated blocks. However, the predicated accesses are prevented
from forming groups.
Differential Revision: https://reviews.llvm.org/D19694
llvm-svn: 275471
This patch prevents increases in the number of instructions, pre-instcombine,
due to induction variable scalarization. An increase in instructions can lead
to an increase in the compile-time required to simplify the induction
variables. We now maintain a new map for scalarized induction variables to
prevent us from converting between the scalar and vector forms.
This patch should resolve compile-time regressions seen after r274627.
llvm-svn: 275419
The LCSSA pass itself will not generate several redundant PHI nodes in a single
exit block. However, such redundant PHI nodes don't violate LCSSA form, and may
be introduced by passes that preserve LCSSA, and/or preserved by the LCSSA pass
itself. So, assuming a single PHI node per exit block is not safe.
llvm-svn: 275217
We currently always vectorize induction variables. However, if an induction
variable is only used for counting loop iterations or computing addresses with
getelementptr instructions, we don't need to do this. Vectorizing these trivial
induction variables can create vector code that is difficult to simplify later
on. This is especially true when the unroll factor is greater than one, and we
create vector arithmetic when computing step vectors. With this patch, we check
if an induction variable is only used for counting iterations or computing
addresses, and if so, scalarize the arithmetic when computing step vectors
instead. This allows for greater simplification.
This patch addresses the suboptimal pointer arithmetic sequence seen in
PR27881.
Reference: https://llvm.org/bugs/show_bug.cgi?id=27881
Differential Revision: http://reviews.llvm.org/D21620
llvm-svn: 274627
This patch also removes the SCEV variants of getStepVector() since they have no
uses after the refactoring.
Differential Revision: http://reviews.llvm.org/D21903
llvm-svn: 274558
Except the seed uniform instructions (conditional branch and consecutive ptr
instructions), dependencies to be added into uniform set should only be used
by existing uniform instructions or intructions outside of current loop.
Differential Revision: http://reviews.llvm.org/D21755
llvm-svn: 274262
For the new hotness attribute, the API will take the pass rather than
the pass name so we can no longer play the trick of AlwaysPrint being a
special pass name. This adds a getter to help the transition.
There is also a corresponding clang patch.
llvm-svn: 274100
It did not handle correctly cases without GEP.
The following loop wasn't vectorized:
for (int i=0; i<len; i++)
*to++ = *from++;
I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1.
Re-commit rL273257 - revision: http://reviews.llvm.org/D20789
llvm-svn: 273864
The interleaved access analysis currently assumes that the inserted run-time
pointer aliasing checks ensure the absence of dependences that would prevent
its instruction reordering. However, this is not the case.
Issues can arise from how code generation is performed for interleaved groups.
For a load group, all loads in the group are essentially moved to the location
of the first load in program order, and for a store group, all stores in the
group are moved to the location of the last store. For groups having members
involved in a dependence relation with any other instruction in the loop, this
reordering can violate the dependence.
This patch teaches the interleaved access analysis how to avoid breaking such
dependences, and should fix PR27626.
An assumption of the original analysis was that the accesses had been collected
in "program order". The analysis was then simplified by visiting the accesses
bottom-up. However, this ordering was never guaranteed for anything other than
single basic block loops. Thus, this patch also enforces the desired ordering.
Reference: https://llvm.org/bugs/show_bug.cgi?id=27626
Differential Revision: http://reviews.llvm.org/D19984
llvm-svn: 273687
It did not handle correctly cases without GEP.
The following loop wasn't vectorized:
for (int i=0; i<len; i++)
*to++ = *from++;
I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1.
Differential revision: http://reviews.llvm.org/D20789
llvm-svn: 273257
This is a functional change for LLE and LDist. The other clients (LV,
LVerLICM) already had this explicitly enabled.
The temporary boolean parameter to LAA is removed that allowed turning
off speculation of symbolic strides. This makes LAA's caching interface
LAA::getInfo only take the loop as the parameter. This makes the
interface more friendly to the new Pass Manager.
The flag -enable-mem-access-versioning is moved from LV to a LAA which
now allows turning off speculation globally.
llvm-svn: 273064
This is still NFCI, so the list of clients that allow symbolic stride
speculation does not change (yes: LV and LoopVersioningLICM, no: LLE,
LDist). However since the symbolic strides are now managed by LAA
rather than passed by client a new bool parameter is used to enable
symbolic stride speculation.
The existing test Transforms/LoopVectorize/version-mem-access.ll checks
that stride speculation is performed for LV.
The previously added test Transforms/LoopLoadElim/symbolic-stride.ll
ensures that no speculation is performed for LLE.
The next patch will change the functionality and turn on symbolic stride
speculation in all of LAA's clients and remove the bool parameter.
llvm-svn: 272970
Turns out SymbolicStrides is actually used in canVectorizeWithIfConvert
before it gets set up in canVectorizeMemory.
This works fine as long as SymbolicStrides resides in LV since we just
have an empty map. Based on this the conclusion is made that there are
no symbolic strides which is conservatively correct.
However once SymbolicStrides becomes part of LAI, LAI is nullptr at this
point so we need to differentiate the uninitialized state by returning a
nullptr for SymbolicStrides.
llvm-svn: 272966
LoopVectorizationLegality holds a constant reference to LAI, so this
will have to be const as well.
Also added missed function comment.
llvm-svn: 272851
r272715 broke libcxx because it did not correctly handle cases where the
last iteration of one IV is the second-to-last iteration of another.
Original commit message:
Vectorizing loops with "escaping" IVs has been disabled since r190790, due to
PR17179. This re-enables it, with support for external use of both
"post-increment" (last iteration) and "pre-increment" (second-to-last iteration)
IVs.
llvm-svn: 272742
Vectorizing loops with "escaping" IVs has been disabled since r190790, due to
PR17179. This re-enables it, with support for external use of both
"post-increment" (last iteration) and "pre-increment" (second-to-last iteration)
IVs.
Differential Revision: http://reviews.llvm.org/D21048
llvm-svn: 272715
Previously, we materialized secondary vector IVs from the primary scalar IV,
by offseting the primary to match the correct start value, and then broadcasting
it - inside the loop body. Instead, we can use a real vector IV, like we do for
the primary.
This enables using vector IVs for secondary integer IVs whose type matches the
type of the primary.
Differential Revision: http://reviews.llvm.org/D20932
llvm-svn: 272283
Previously, whenever we needed a vector IV, we would create it on the fly,
by splatting the scalar IV and adding a step vector. Instead, we can create a
real vector IV. This tends to save a couple of instructions per iteration.
This only changes the behavior for the most basic case - integer primary
IVs with a constant step.
Differential Revision: http://reviews.llvm.org/D20315
llvm-svn: 271410
In truncateToMinimalBitwidths() we were RAUW'ing an instruction then erasing it. However, that intruction could be cached in the map we're iterating over. The first check is "I->use_empty()" which in most cases would return true, as the (deleted) object was RAUW'd first so would have zero use count. However in some cases the object could have been polluted or written over and this wouldn't be the case. Also it makes valgrind, asan and traditionalists who don't like their compiler to crash sad.
No testcase as there are no externally visible symptoms apart from a crash if the stars align.
Fixes PR26509.
llvm-svn: 269908
The selection of the vectorization factor currently doesn't consider
interleaved accesses. The vectorization factor is based on the maximum safe
dependence distance computed by LAA. However, for loops with interleaved
groups, we should instead base the vectorization factor on the maximum safe
dependence distance divided by the maximum interleave factor of all the
interleaved groups. Interleaved accesses not in a group will be scalarized.
Differential Revision: http://reviews.llvm.org/D20241
llvm-svn: 269659
LoopVectorBody was changed from a single pointer to a SmallVector when
store predication was introduced in r200270. Since r247139, store predication
no longer splits the vector loop body in-place, so we can go back to having
a single LoopVectorBody block.
This reverts the no-longer-needed changes from r200270.
llvm-svn: 269321
Allow vectorization when the step is a loop-invariant variable.
This is the loop example that is getting vectorized after the patch:
int int_inc;
int bar(int init, int *restrict A, int N) {
int x = init;
for (int i=0;i<N;i++){
A[i] = x;
x += int_inc;
}
return x;
}
"x" is an induction variable with *loop-invariant* step.
But it is not a primary induction. Primary induction variable with non-constant step is not handled yet.
Differential Revision: http://reviews.llvm.org/D19258
llvm-svn: 269023