| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this change we created an additional reload in the copy of the incoming
block of a PHI node to reload the incoming value, even though the necessary
value has already been made available by the normally generated scalar loads.
In this change, we drop the code that generates this redundant reload and
instead just reuse the scalar value already available.
Besides making the generated code slightly cleaner, this change also makes sure
that scalar loads go through the normal logic, which means they can be remapped
(e.g. to array slots) and corresponding code is generated to load from the
remapped location. Without this change, the original scalar load at the
beginning of the non-affine region would have been remapped, but the redundant
scalar load would continue to load from the old PHI slot location.
It might be possible to further simplify the code in addOperandToPHI,
but this would not only mean to pull out getNewValue, but to also change the
insertion point update logic. As this did not work when trying it the first
time, this change is likely not trivial. To not introduce bugs last minute, we
postpone further simplications to a subsequent commit.
We also document the current behavior a little bit better.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D28892
llvm-svn: 292486
|
|
|
|
|
|
|
|
| |
Making certain values 'const' to just cast it away a little later mainly
obfuscates the code. Hence, we just drop the 'const' parts.
Suggested-by: Michael Kruse <llvm@meinersbur.de>
llvm-svn: 292480
|
|
|
|
| |
llvm-svn: 292471
|
|
|
|
| |
llvm-svn: 292123
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To benefit of the type safety guarantees of C++11 typed enums, which would have
caught the type mismatch fixed in r291960, we make MemoryKind a typed enum.
This change also allows us to drop the 'MK_' prefix and to instead use the more
descriptive full name of the enum as prefix. To reduce the amount of typing
needed, we use this opportunity to move MemoryKind from ScopArrayInfo to a
global scope, which means the ScopArrayInfo:: prefix is not needed. This move
also makes historically sense. In the beginning of Polly we had different
MemoryKind enums in both MemoryAccess and ScopArrayInfo, which were later
canonicalized to one. During this canonicalization we just choose the enum in
ScopArrayInfo, but did not consider to move this shared enum to global scope.
Reviewed-by: Michael Kruse <llvm@meinersbur.de>
Differential Revision: https://reviews.llvm.org/D28090
llvm-svn: 292030
|
|
|
|
|
|
|
| |
The helper function polly::canSynthesize() does not directly use the LoopInfo
analysis, hence remove it from its argument list.
llvm-svn: 288144
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes polly generate a CFG which is closer to what we want
in LLVM IR, with a loop preheader for the original loop. This is
just a cleanup, but it exposes some fragile assumptions.
I'm not completely happy with the changes related to expandCodeFor;
RTCBB->getTerminator() is basically a random insertion point which
happens to work due to the way we generate runtime checks. I'm not
sure what the right answer looks like, though.
Differential Revision: https://reviews.llvm.org/D26053
llvm-svn: 285864
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Iterating over SeenBlocks which is a SmallPtrSet results in non-determinism in codegen
Reviewers: jdoerfert, zinob, grosser
Tags: #polly
Differential Revision: https://reviews.llvm.org/D25778
llvm-svn: 284622
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Under some conditions MK_Value read accessed where converted to MK_ExitPHI read
accessed. This is unexpected because MK_ExitPHI read accesses are implicit after
the scop execution. This behaviour was introduced in r265261, which fixed a
failed assertion/crash in CodeGen.
Instead, we fix this failure in CodeGen itself. createExitPHINodeMerges(),
despite its name, also handles accesses of kind MK_Value, only to skip them
because they access values that are usually not PHI nodes in the SCoP region's
exit block. Except in the situation observed in r265261.
Do not convert value accessed to ExitPHI accesses and do not handle
value accesses like ExitPHI accessed in CodeGen anymore.
llvm-svn: 284023
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
generateScalarLoad() and generateScalarStore() are used for explicit (MK_Array)
memory accesses, therefore the method names were misleading. The names also
were similar to generateScalarLoads() and generateScalarStores() (plural forms)
which indeed handle scalar accesses. Presumbly, they were originally named to
contrast VectorBlockGenerator::generateLoad().
Rename the two methods to generateArrayLoad(),
respectively generateArrayStore().
llvm-svn: 282861
|
|
|
|
|
|
|
| |
The code generator always adds unconditional LoadInst and StoreInst, hence the
MemoryAccess must be defined over all statement instances.
llvm-svn: 282853
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the fourth patch to apply the BLIS matmul optimization pattern on matmul
kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf).
BLIS implements gemm as three nested loops around a macro-kernel, plus two
packing routines. The macro-kernel is implemented in terms of two additional
loops around a micro-kernel. The micro-kernel is a loop around a rank-1
(i.e., outer product) update. In this change we perform copying to created
arrays, which is the last step to implement the packing transformation.
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: https://reviews.llvm.org/D23260
llvm-svn: 281441
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change the code around setNewAccessRelation to allow to use a an existing array
element for memory instead of an ad-hoc alloca. This facility will be used for
DeLICM/DeGVN to convert scalar dependencies into regular ones.
The changes necessary include:
- Make the code generator use the implicit locations instead of the alloca ones.
- A test case
- Make the JScop importer accept changes of scalar accesses for that test case.
- Adapt the MemoryAccess interface to the fact that the MemoryKind can change.
They are named (get|is)OriginalXXX() to get the status of the memory access
before any change by setNewAccessRelation() (some properties such as
getIncoming() do not change even if the kind is changed and are still
required). To get the modified properties, there is (get|is)LatestXXX(). The
old accessors without Original|Latest become synonyms of the
(get|is)OriginalXXX() to not make functional changes in unrelated code.
Differential Revision: https://reviews.llvm.org/D23962
llvm-svn: 280408
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We already invalidated a couple of critical values earlier on, but we now
invalidate all instructions contained in a scop after the scop has been code
generated. This is necessary as later scops may otherwise obtain SCEV
expressions that reference values in the earlier scop that before dominated
the later scop, but which had been moved into the conditional branch and
consequently do not dominate the later scop any more. If these very values are
then used during code generation of the later scop, we generate used that are
dominated by the values they use.
This fixes: http://llvm.org/PR28984
llvm-svn: 279047
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In case some code -- not guarded by control flow -- would be emitted directly in
the start block, it may happen that this code would use uninitalized scalar
values if the scalar initialization is only emitted at the end of the start
block. This is not a problem today in normal Polly, as all statements are
emitted in their own basic blocks, but Polly-ACC emits host-to-device copy
statements into the start block.
Additional Polly-ACC test coverage will be added in subsequent changes that
improve the handling of PHI nodes in Polly-ACC.
llvm-svn: 278124
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After having generated the code for a ScopStmt, we run a simple dead-code
elimination that drops all instructions that are known to be and remain unused.
Until this change, we only considered instructions for dead-code elimination, if
they have a corresponding instruction in the original BB that belongs to
ScopStmt. However, when generating code we do not only copy code from the BB
belonging to a ScopStmt, but also generate code for operands referenced from BB.
After this change, we now also considers code for dead code elimination, which
does not have a corresponding instruction in BB.
This fixes a bug in Polly-ACC where such dead-code referenced CPU code from
within a GPU kernel, which is possible as we do not guarantee that all variables
that are used in known-dead-code are moved to the GPU.
llvm-svn: 278103
|
|
|
|
|
|
|
|
|
|
|
| |
The map is iterated over when generating the values escaping the SCoP. The
indeterministic iteration order of DenseMap causes the output IR to change at
every compilation, adding noise to comparisons.
Replace DenseMap by a MapVector to ensure the same iteration order at every
compilation.
llvm-svn: 277832
|
|
|
|
| |
llvm-svn: 277698
|
|
|
|
| |
llvm-svn: 277569
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
reference these arrays from access expressions
Extend the jscop interface to allow the user to export arrays. It is required
that already existing arrays of the list of arrays correspond to arrays
of the SCoP. Each array that is appended to the list will be newly created.
Furthermore, we allow the user to modify access expressions to reference
any array in case it has the same element type.
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: https://reviews.llvm.org/D22828
llvm-svn: 277263
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This ensures that no trivially dead code is generated. This is not only cleaner,
but also avoids troubles in case code is generated in a separate function and
some of this dead code contains references to values that are not available.
This issue may happen, in case the memory access functions have been updated
and old getelementptr instructions remain in the code. With normal Polly,
a test case is difficult to draft, but the upcoming GPU code generation can
possibly trigger such problems. We will later extend this dead-code elimination
to region and vector statements.
llvm-svn: 276263
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
API-wise `apply` is a somewhat unidiomatic one-off function, and
removing the only(?) use in polly will let me remove it from SCEV's
exposed interface.
Reviewers: jdoerfert, Meinersbur, grosser
Subscribers: grosser, mcrosier, pollydev
Differential Revision: http://reviews.llvm.org/D20779
llvm-svn: 271177
|
|
|
|
|
|
|
| |
Found by clang static analyzer (http://llvm.org/reports/scan-build/)
and Visual Studio.
llvm-svn: 270432
|
|
|
|
| |
llvm-svn: 270429
|
|
|
|
|
|
|
| |
This allows to use the SCoP directly for various queries,
thus to hide the underlying region more often.
llvm-svn: 270426
|
|
|
|
| |
llvm-svn: 270424
|
|
|
|
| |
llvm-svn: 270421
|
|
|
|
| |
llvm-svn: 270411
|
|
|
|
| |
llvm-svn: 267874
|
|
|
|
|
|
|
| |
This also removes a duplicated line of code in the region generator
that caused a SPEC benchmark to fail with the new SCoPs.
llvm-svn: 267404
|
|
|
|
|
|
|
|
|
|
| |
If a non-affine region PHI is generated we should not move the insert
point prior to the synthezised value in the same block as we might
split that block at the insert point later on. Only if the incoming
value should be placed in a different block we should change the
insertion point.
llvm-svn: 265132
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Value merging is only necessary for scalars when they are used outside
of the scop. While an array's base pointer can be used after the scop,
it gets an extra ScopArrayInfo of type MK_Value. We used to generate
phi's for both of them, where one was assuming the reault of the other
phi would be the original value, because it has already been replaced by
the previous phi. This resulted in IR that the current IR verifier
allows, but is probably illegal.
This reduces the number of LNT test-suite fails with
-polly-position=before-vectorizer -polly-process-unprofitable
from 16 to 10.
Also see llvm.org/PR26718.
llvm-svn: 262629
|
|
|
|
|
|
|
|
|
|
|
| |
Polly recognizes affine loops that ScalarEvolution does not, in
particular those with loop conditions that depend on hoisted invariant
loads. Check for SCEVAddRec dependencies on such loops and do not
consider their exit values as synthesizable because SCEVExpander would
generate them as expressions that depend on the original induction
variables. These are not available in generated code.
llvm-svn: 262404
|
|
|
|
| |
llvm-svn: 261876
|
|
|
|
|
|
|
|
|
| |
The generated dedicated subregion exit block was assumed to have the same
dominance relation as the original exit block. This is incorrect if the exit
block receives other edges than only from the subregion, which results in that
e.g. the subregion's entry block does not dominate the exit block.
llvm-svn: 261865
|
|
|
|
|
|
| |
This replaces an ungly inline ternary operator pattern.
llvm-svn: 261792
|
|
|
|
|
|
|
|
| |
This is also be caught by the function verifier, but disconnected from
the place that produced it. Catch it already at creation to be able to
reason more directly about the cause.
llvm-svn: 261790
|
|
|
|
| |
llvm-svn: 261473
|
|
|
|
|
|
| |
This patch was extracted from http://reviews.llvm.org/D13611.
llvm-svn: 260958
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows code such as:
void multiple_types(char *Short, char *Float, char *Double) {
for (long i = 0; i < 100; i++) {
Short[i] = *(short *)&Short[2 * i];
Float[i] = *(float *)&Float[4 * i];
Double[i] = *(double *)&Double[8 * i];
}
}
To model such code we use as canonical element type of the modeled array the
smallest element type of all original array accesses, if type allocation sizes
are multiples of each other. Otherwise, we use a newly created iN type, where N
is the gcd of the allocation size of the types used in the accesses to this
array. Accesses with types larger as the canonical element type are modeled as
multiple accesses with the smaller type.
For example the second load access is modeled as:
{ Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }
To support code-generating these memory accesses, we introduce a new method
getAccessAddressFunction that assigns each statement instance a single memory
location, the address we load from/store to. Currently we obtain this address by
taking the lexmin of the access function. We may consider keeping track of the
memory location more explicitly in the future.
We currently do _not_ handle multi-dimensional arrays and also keep the
restriction of not supporting accesses where the offset expression is not a
multiple of the access element type size. This patch adds tests that ensure
we correctly invalidate a scop in case these accesses are found. Both types of
accesses can be handled using the very same model, but are left to be added in
the future.
We also move the initialization of the scop-context into the constructor to
ensure it is already available when invalidating the scop.
Finally, we add this as a new item to the 2.9 release notes
Reviewers: jdoerfert, Meinersbur
Differential Revision: http://reviews.llvm.org/D16878
llvm-svn: 259784
|
|
|
|
|
|
| |
This reverts commit (@259587). It needs some further discussions.
llvm-svn: 259629
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We support now code such as:
void multiple_types(char *Short, char *Float, char *Double) {
for (long i = 0; i < 100; i++) {
Short[i] = *(short *)&Short[2 * i];
Float[i] = *(float *)&Float[4 * i];
Double[i] = *(double *)&Double[8 * i];
}
}
To support such code we use as element type of the modeled array the smallest
element type of all original array accesses. Accesses with larger types are
modeled as multiple accesses with the smaller type.
For example the second load access is modeled as:
{ Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }
To support jscop-rewritable memory accesses we need each statement instance to
only be assigned a single memory location, which will be the address at which
we load the value. Currently we obtain this address by taking the lexmin of
the access function. We may consider keeping track of the memory location more
explicitly in the future.
llvm-svn: 259587
|
|
|
|
| |
llvm-svn: 259504
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MemAccInst wraps the common members of LoadInst and StoreInst. Also use
of this class in:
- ScopInfo::buildMemoryAccess
- BlockGenerator::generateLocationAccessed
- ScopInfo::addArrayAccess
- Scop::buildAliasGroups
- Replace every use of polly::getPointerOperand
Reviewers: jdoerfert, grosser
Differential Revision: http://reviews.llvm.org/D16530
llvm-svn: 258947
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ensure that there is at most one phi write access per PHINode and
ScopStmt. In particular, this would be possible for non-affine
subregions with multiple exiting blocks. We replace multiple MAY_WRITE
accesses by one MUST_WRITE access. The written value is constructed
using a PHINode of all exiting blocks. The interpretation of the PHI
WRITE's "accessed value" changed from the incoming value to the PHI like
for PHI READs since there is no unique incoming value.
Because region simplification shuffles around PHI nodes -- particularly
with exit node PHIs -- the PHINodes at analysis time does not always
exist anymore in the code generation pass. We instead remember the
incoming block/value pair in the MemoryAccess.
Differential Revision: http://reviews.llvm.org/D15681
llvm-svn: 258809
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Both functions implement the same functionality, with the difference that
getNewScalarValue assumes that globals and out-of-scop scalars can be directly
reused without loading them from their corresponding stack slot. This is correct
for sequential code generation, but causes issues with outlining code e.g. for
OpenMP code generation. getNewValue handles such cases correctly.
Hence, we can replace getNewScalarValue with getNewValue. This is not only more
future proof, but also eliminates a bunch of code.
The only functionality that was available in getNewScalarValue that is lost
is the on-demand creation of scalar values. However, this is not necessary any
more as scalars are always loaded at the beginning of each basic block and will
consequently always be available when scalar stores are generated. As this was
not the case in older versions of Polly, it seems the on-demand loading is just
some older code that has not yet been removed.
Finally, generateScalarLoads also generated loads for values that are loop
invariant, available in GlobalMap and which are preferred over the ones loaded
in generateScalarLoads. Hence, we can just skip the code generation of such
scalar values, avoiding the generation of dead code.
Differential Revision: http://reviews.llvm.org/D16522
llvm-svn: 258799
|
|
|
|
| |
llvm-svn: 258660
|
|
|
|
| |
llvm-svn: 256269
|
|
|
|
| |
llvm-svn: 256267
|
|
|
|
| |
llvm-svn: 256266
|