| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ISL can conclude additional conditions on parameters from restrictions
on loop variables. Such conditions persist when leaving the loop and the
loop variable is projected out. This results in a narrower domain for
exiting the loop than entering it and is logically impossible for
non-infinite loops.
We fix this by not adding a lower bound i>=0 when constructing BB
domains, but defer it to when also the upper bound it computed, which
was done redundantly even before this patch.
This reduces the number of LNT fails with -polly-process-unprofitable
-polly-position=before-vectorizer from 8 to 6.
llvm-svn: 264118
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Value merging is only necessary for scalars when they are used outside
of the scop. While an array's base pointer can be used after the scop,
it gets an extra ScopArrayInfo of type MK_Value. We used to generate
phi's for both of them, where one was assuming the reault of the other
phi would be the original value, because it has already been replaced by
the previous phi. This resulted in IR that the current IR verifier
allows, but is probably illegal.
This reduces the number of LNT test-suite fails with
-polly-position=before-vectorizer -polly-process-unprofitable
from 16 to 10.
Also see llvm.org/PR26718.
llvm-svn: 262629
|
|
|
|
|
|
|
|
|
|
|
| |
Polly recognizes affine loops that ScalarEvolution does not, in
particular those with loop conditions that depend on hoisted invariant
loads. Check for SCEVAddRec dependencies on such loops and do not
consider their exit values as synthesizable because SCEVExpander would
generate them as expressions that depend on the original induction
variables. These are not available in generated code.
llvm-svn: 262404
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to speed up compile time and to avoid random timeouts we now
separately track assumptions and restrictions. In this context
assumptions describe parameter valuations we need and restrictions
describe parameter valuations we do not allow. During AST generation
we create a runtime check for both, whereas the one for the
restrictions is negated before a conjunction is build.
Except the In-Bounds assumptions we currently only track restrictions.
Differential Revision: http://reviews.llvm.org/D17247
llvm-svn: 262328
|
|
|
|
|
|
| |
This cures the symptoms we see in h264 of SPEC2006 but not the cause.
llvm-svn: 262327
|
|
|
|
|
|
|
|
|
| |
The generated dedicated subregion exit block was assumed to have the same
dominance relation as the original exit block. This is incorrect if the exit
block receives other edges than only from the subregion, which results in that
e.g. the subregion's entry block does not dominate the exit block.
llvm-svn: 261865
|
|
|
|
|
|
|
|
|
|
| |
The test style guide defines that opt should get its input from stdin.
(instead by file argument to avoid that the file name appears in its
output)
CHECK-FORCED is not recognized by FileCheck; remove it.
llvm-svn: 261786
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use 'mark' nodes annotate a SIMD loop during ScheduleTransformation and skip
parallelism checks.
The buildbot shows the following compile/execution time changes:
Compile time:
Improvements Δ Previous Current σ
…/gesummv -6.06% 0.2640 0.2480 0.0055
…/gemver -4.46% 0.4480 0.4280 0.0044
…/covariance -4.31% 0.8360 0.8000 0.0065
…/adi -3.23% 0.9920 0.9600 0.0065
…/doitgen -2.53% 0.9480 0.9240 0.0090
…/3mm -2.33% 1.0320 1.0080 0.0087
Execution time:
Regressions Δ Previous Current σ
…/viterbi 1.70% 5.1840 5.2720 0.0074
…/smallpt 1.06% 12.4920 12.6240 0.0040
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: http://reviews.llvm.org/D14491
llvm-svn: 261620
|
|
|
|
| |
llvm-svn: 261501
|
|
|
|
| |
llvm-svn: 261488
|
|
|
|
|
|
|
|
|
|
|
|
| |
To support non-aligned accesses we introduce a virtual element size
for arrays that divides each access function used for this array. The
adjustment of the access function based on the element size of the
array was therefore moved after this virtual element size was
determined, thus after all accesses have been created.
Differential Revision: http://reviews.llvm.org/D17246
llvm-svn: 261226
|
|
|
|
|
|
|
|
|
|
| |
A load can only be invariant if its base pointer is invariant too. To
this end, we check if the base pointer is defined inside the region or
outside. In the former case we recursively check if we can (and
therefore will) hoist the base pointer too. Only if that happends we
can hoist the load.
llvm-svn: 260886
|
|
|
|
|
|
|
|
|
| |
This reverts commit 98efa006c96ac981c00d2e386ec1102bce9f549a.
The fix was broken since we do not use AA in the ScopDetection anymore to
check for invariant accesses.
llvm-svn: 260884
|
|
|
|
|
|
|
|
|
|
| |
Before this patch it could happen that we did not hoist a load that
was a base pointer of another load even though AA already declared the
first one as invariant (during ScopDetection). If this case arises we
will now skipt the "can be overwriten" check because in this case the
over-approximating nature causes us to generate broken code.
llvm-svn: 260862
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
So far we separated constant factors from multiplications, however,
only when they are at the outermost level of a parameter SCEV. Now,
we also separate constant factors from the parameter SCEV if the
outermost expression is a SCEVAddRecExpr. With the changes to the
SCEVAffinator we can now improve the extractConstantFactor(...)
function at will without worrying about any other code part. Thus,
if needed we can implement a more comprehensive
extractConstantFactor(...) function that will traverse the SCEV
instead of looking only at the outermost level.
Four test cases were affected. One did not change much and the other
three were simplified.
llvm-svn: 260859
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We now distinguish invariant loads to the same memory location if they
have different types. This will cause us to pre-load an invariant
location once for each type that is used to access it. However, we can
thereby avoid invalid casting, especially if an array is accessed
though different typed/sized invariant loads.
This basically reverts the changes in r260023 but keeps the test
cases.
llvm-svn: 260045
|
|
|
|
|
|
|
|
| |
We also disable this feature by default, as there are still some issues in
combination with invariant load hoisting that slipped through my initial
testing.
llvm-svn: 260025
|
|
|
|
|
|
|
|
|
|
| |
Invariant load hoisting of memory accesses with non-canonical element
types lacks support for equivalence classes that contain elements of
different width/size. This support should be added, but to get our buildbots
back to green, we disable load hoisting for memory accesses with non-canonical
element size for now.
llvm-svn: 260023
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Always use access-instruction pointer type to load the invariant values.
Otherwise mismatches between ScopArrayInfo element type and memory access
element type will result in invalid casts. These type mismatches are after
r259784 a lot more common and also arise with types of different size, which
have not been handled before.
Interestingly, this change actually simplifies the code, as we now have only
one code path that is always taken, rather then a standard code path for the
common case and a "fixup" code path that replaces the standard code path in
case of mismatching types.
llvm-svn: 260009
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previously implemented approach is to follow value definitions and
create write accesses ("push defs") while searching for uses. This
requires the same relatively validity- and requirement conditions to be
replicated at multiple locations (PHI instructions, other instructions,
uses by PHIs).
We replace this by iterating over the uses in a SCoP ("pull in
requirements"), and add writes only when at least one read has been
added. It turns out to be simpler code because each use is only iterated
over once and writes are added for the first access that reads it. We
need another iteration to identify escaping values (uses not in the
SCoP), which also makes the difference between such accesses more
obvious. As a side-effect, the order of scalar MemoryAccess can change.
Differential Revision: http://reviews.llvm.org/D15706
llvm-svn: 259987
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows code such as:
void multiple_types(char *Short, char *Float, char *Double) {
for (long i = 0; i < 100; i++) {
Short[i] = *(short *)&Short[2 * i];
Float[i] = *(float *)&Float[4 * i];
Double[i] = *(double *)&Double[8 * i];
}
}
To model such code we use as canonical element type of the modeled array the
smallest element type of all original array accesses, if type allocation sizes
are multiples of each other. Otherwise, we use a newly created iN type, where N
is the gcd of the allocation size of the types used in the accesses to this
array. Accesses with types larger as the canonical element type are modeled as
multiple accesses with the smaller type.
For example the second load access is modeled as:
{ Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }
To support code-generating these memory accesses, we introduce a new method
getAccessAddressFunction that assigns each statement instance a single memory
location, the address we load from/store to. Currently we obtain this address by
taking the lexmin of the access function. We may consider keeping track of the
memory location more explicitly in the future.
We currently do _not_ handle multi-dimensional arrays and also keep the
restriction of not supporting accesses where the offset expression is not a
multiple of the access element type size. This patch adds tests that ensure
we correctly invalidate a scop in case these accesses are found. Both types of
accesses can be handled using the very same model, but are left to be added in
the future.
We also move the initialization of the scop-context into the constructor to
ensure it is already available when invalidating the scop.
Finally, we add this as a new item to the 2.9 release notes
Reviewers: jdoerfert, Meinersbur
Differential Revision: http://reviews.llvm.org/D16878
llvm-svn: 259784
|
|
|
|
| |
llvm-svn: 259737
|
|
|
|
| |
llvm-svn: 259693
|
|
|
|
|
|
| |
This reverts commit (@259587). It needs some further discussions.
llvm-svn: 259629
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We support now code such as:
void multiple_types(char *Short, char *Float, char *Double) {
for (long i = 0; i < 100; i++) {
Short[i] = *(short *)&Short[2 * i];
Float[i] = *(float *)&Float[4 * i];
Double[i] = *(double *)&Double[8 * i];
}
}
To support such code we use as element type of the modeled array the smallest
element type of all original array accesses. Accesses with larger types are
modeled as multiple accesses with the smaller type.
For example the second load access is modeled as:
{ Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }
To support jscop-rewritable memory accesses we need each statement instance to
only be assigned a single memory location, which will be the address at which
we load the value. Currently we obtain this address by taking the lexmin of
the access function. We may consider keeping track of the memory location more
explicitly in the future.
llvm-svn: 259587
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before adding a MK_Value READ MemoryAccess, check whether the read is
necessary or synthesizable. Synthesizable values are later generated by
the SCEVExpander and therefore do not need to be transferred
explicitly. This can happen because the check for synthesizability has
presumbly been forgotten in the case where a phi's incoming value has
been defined in a different statement.
Differential Revision: http://reviews.llvm.org/D15687
llvm-svn: 258998
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ensure that there is at most one phi write access per PHINode and
ScopStmt. In particular, this would be possible for non-affine
subregions with multiple exiting blocks. We replace multiple MAY_WRITE
accesses by one MUST_WRITE access. The written value is constructed
using a PHINode of all exiting blocks. The interpretation of the PHI
WRITE's "accessed value" changed from the incoming value to the PHI like
for PHI READs since there is no unique incoming value.
Because region simplification shuffles around PHI nodes -- particularly
with exit node PHIs -- the PHINodes at analysis time does not always
exist anymore in the code generation pass. We instead remember the
incoming block/value pair in the MemoryAccess.
Differential Revision: http://reviews.llvm.org/D15681
llvm-svn: 258809
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Both functions implement the same functionality, with the difference that
getNewScalarValue assumes that globals and out-of-scop scalars can be directly
reused without loading them from their corresponding stack slot. This is correct
for sequential code generation, but causes issues with outlining code e.g. for
OpenMP code generation. getNewValue handles such cases correctly.
Hence, we can replace getNewScalarValue with getNewValue. This is not only more
future proof, but also eliminates a bunch of code.
The only functionality that was available in getNewScalarValue that is lost
is the on-demand creation of scalar values. However, this is not necessary any
more as scalars are always loaded at the beginning of each basic block and will
consequently always be available when scalar stores are generated. As this was
not the case in older versions of Polly, it seems the on-demand loading is just
some older code that has not yet been removed.
Finally, generateScalarLoads also generated loads for values that are loop
invariant, available in GlobalMap and which are preferred over the ones loaded
in generateScalarLoads. Hence, we can just skip the code generation of such
scalar values, avoiding the generation of dead code.
Differential Revision: http://reviews.llvm.org/D16522
llvm-svn: 258799
|
|
|
|
| |
llvm-svn: 258662
|
|
|
|
|
|
|
|
|
|
|
| |
In Polly, after hoisting loop invariant loads outside loop, the alignment
information for hoisted loads are missing, this patch restore them.
Contributed-by: Lawrence Hu <lawrence@codeaurora.org>
Differential Revision: http://reviews.llvm.org/D16160
llvm-svn: 258105
|
|
|
|
| |
llvm-svn: 257898
|
|
|
|
|
|
|
|
| |
When generating scalar loads/stores separately the vector code has not been
updated. This commit adds code to generate scalar loads for vector code as well
as code to assert in case scalar stores are encountered within a vector loop.
llvm-svn: 255714
|
|
|
|
|
|
|
|
|
|
| |
When rewriting the access functions of load/store statements, we are only
interested in the actual array memory location. The current code just took
the very first memory access, which could be a scalar or an array access. As
a result, we failed to update access functions even though this was requested
via .jscop.
llvm-svn: 255713
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When introducing separate control flow for the original and optimized code we
introduce now a special 'ExitingBlock':
\ /
EnteringBB
|
SplitBlock---------\
_____|_____ |
/ EntryBB \ StartBlock
| (region) | |
\_ExitingBB_/ ExitingBlock
| |
MergeBlock---------/
|
ExitBB
/ \
This 'ExitingBlock' contains code such as the final_reloads for scalars, which
previously were just added to whichever statement/loop_exit/branch-merge block
had been generated last. Having an explicit basic block makes it easier to
find these constructs when looking at the CFG.
llvm-svn: 255107
|
|
|
|
| |
llvm-svn: 255106
|
|
|
|
|
|
|
| |
This is the default since a long time. Setting it again does not add value
in any of these test cases.
llvm-svn: 253800
|
|
|
|
| |
llvm-svn: 252942
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
IVs of loops for which the loop header is in the subregion, but not the entire
loop may be incremented outside of the subregion and can consequently not be
kept private to the subregion. Instead, they need to and are modeled as virtual
loops in the iteration domains. As this is the case, generating new subregion
induction variables for such loops is not needed and indeed wrong as they would
hide the virtual induction variables modeled in the scop.
This fixes a miscompile in MultiSource/Benchmarks/Ptrdist/bc and
MultiSource/Benchmarks/nbench/. Thanks Michael and Johannes for their
investiagations and helpful observations regarding this bug.
llvm-svn: 252860
|
|
|
|
|
|
|
| |
Check if a value that is referenced by a parameter is dead and do not
generate code for the parameter in such a case.
llvm-svn: 252813
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Especially for structs, the SAI object of a base pointer does not
describe all the types that the user might expect when he loads from
that base pointer. While we will still cast integers and pointers we
will now reload the value with the correct type if floating point and
non-floating point values are involved. However, there are now TODOs
where we use bitcasts instead of a proper conversion or reloading.
This fixes bug 25479.
llvm-svn: 252706
|
|
|
|
|
|
|
|
|
|
|
| |
We now create all invariant equivalence classes for required invariant loads
instead of creating them on-demand. This way we can check if a parameter
references an invariant load that is actually not executed and was therefor
not materialized. If that happens the parameter is not materialized either.
This fixes bug 25469.
llvm-svn: 252701
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Scalar reloads in the generated entering block were not recognized as
dominating the subregions locks when there were multiple entering
nodes. This resulted in values defined in there not being copied.
As a fix, we unconditionally add the BBMap of the generated entering
node to the generated entry. This fixes part of llvm.org/PR25439.
This reverts 252449 and reapplies r252445. Its test was failing
indeterministically due to r252375 which was reverted in r252522.
llvm-svn: 252540
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The dominance of the generated non-affine subregion block was based on
the scop's merge block, therefore resulted in an invalid DominanceTree.
It resulted in some values as assumed to be unusable in the actual
generated exit block.
We detect the case that the exit block has been moved and decide
dominance using the BB at the original exit. If we create another exit
node, that exit nodes is dominated by the one generated from where the
original exit resides. This fixes llvm.org/PR25438 and part of
llvm.org/PR25439.
llvm-svn: 252526
|
|
|
|
|
|
|
|
| |
It introduced indeterminism as it was iterating over an address-indexed
hashtable. The corresponding bug PR25438 will be fixed in a successive
commit.
llvm-svn: 252522
|
|
|
|
| |
llvm-svn: 252451
|
|
|
|
|
|
|
|
|
|
|
| |
dominating"
This reverts commit 9775824b265e574fc541e975d64d3e270243b59d due to a
failing unit test.
Please check and correct the unit test and commit again.
llvm-svn: 252449
|
|
|
|
|
|
|
|
|
|
|
| |
Scalar reloads in the generated entering block were not recognized as
dominating the subregions locks when there were multiple entering
nodes. This resulted in values defined in there not being copied.
As a fix, we unconditionally add the BBMap of the generated entering
node to the generated entry. This fixes part of llvm.org/PR25439.
llvm-svn: 252445
|
|
|
|
| |
llvm-svn: 252437
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we bail out early we make the partially build new code path
practically dead, though it was not unreachable. To remove dominance
problems we now make it not only dead but also prevent the control
flow to join with the original code path, thus allow to use original
values after the SCoP without any PHI nodes.
This fixes bug 25447.
llvm-svn: 252420
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While the program cannot cause a dependence cycle between invariant
loads, additional constraints (e.g., to ensure finite loops) can
introduce them. It is hard to detect them in the SCoP description,
thus we will only check for them at code generation time. If such a
recursion is detected we will bail out the code generation and place a
"false" runtime check to guarantee the original code is used.
This fixes bug 25443.
llvm-svn: 252412
|