| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
llvm-svn: 328005
|
|
|
|
|
|
| |
This reverts commit r326643. Fix didn't really fix anything.
llvm-svn: 326656
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During codegen, Polly attempts to clear all loops from ScalarEvolution
and LoopInfo, and it does so one block at a time. This causes undefined
behaviour, since this way a loop header might be removed from a loop
before the entire loop is erased, causing ScalarEvolution to run into an
error.
Instead, just delete the entire loop atomically. This fixes currently
failing testcases.
llvm-svn: 326643
|
|
|
|
|
|
|
|
|
|
|
| |
As part of this cleanup a couple of unnecessary isl::manage(obj.copy()) pattern
are eliminated as well.
We checked for all potential cleanups by scanning for:
"grep -R isl::manage\( lib/ | grep copy"
llvm-svn: 325558
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Memory transfer instructions take two pointers. It is not defined to
which of those a noalias annotation applies. To ensure correctness,
do not add noalias annotations to memcpy/memmove instructions anymore.
The caused a miscompile with test-suite's MultiSource/Applications/obsequi.
Since r321138, the MemCpyOpt pass would remove memcpy/memmove calls if
known to copy uninitialized memory. In that case, it was initialized
by another memcpy, but the annotation for the target pointer said
it would not alias. The annotation was actually meant for the source
pointer, which was was an alloca and could not alias with the target
pointer.
llvm-svn: 321371
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Isl does not allow generating isl_ast_expr from an isl_pw_aff that has an
empty domain (i.e. has no pieces). We already detected the case if the
isl_pw_aff comes with an empty domain.
isl_ast_build also considers the domain empty if it is disjoint with the
parameter context (e.g. parameters values that we exclude by runtime
versioning).
Intersect the access relation domain with the parameter context to
also detect such practically empty access domains. The effective
pointer used in the generated code is unimportand because it will never
be executed.
This fixes llvm.org/PR35362
llvm-svn: 318806
|
|
|
|
|
|
|
| |
polly-check-format has been failing since at least r318517,
due to more than one cause.
llvm-svn: 318795
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Most changes are mechanical, but in one place I changed the program semantics
by fixing a likely bug:
In `Scop::hasFeasibleRuntimeContext()`, I'm now explicitely handling the
error-case. Before, when the call to `addNonEmptyDomainConstraints()`
returned a null set, this (probably) accidentally worked because
isl_bool_error converts to true. I'm checking for nullptr now.
Reviewers: grosser, Meinersbur, bollu
Reviewed By: Meinersbur
Subscribers: nemanjai, kbarton, pollydev, llvm-commits
Differential Revision: https://reviews.llvm.org/D39971
llvm-svn: 318632
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: grosser, bollu
Reviewed By: grosser
Subscribers: nemanjai, kbarton, llvm-commits
Tags: #polly
Differential Revision: https://reviews.llvm.org/D39916
llvm-svn: 317922
|
|
|
|
|
|
|
|
| |
When collecting base pointers that need to be made available in parallel
subfunctions, use the base pointer associated with the latest
ScopArrayInfo, instead of the original one.
llvm-svn: 316983
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When GPUNodeBuilder creates loops inside the kernel, it dispatches to
IslNodeBuilder. This however is surprisingly dangerous, since it accesses the
AST Node's user through the wrong type. This patch fixes this problem by
overriding createFor correctly.
This fixes PR35010.
Reviewers: grosser, bollu, Meinersbur
Reviewed By: Meinersbur
Subscribers: Meinersbur, nemanjai, pollydev, llvm-commits, kbarton
Differential Revision: https://reviews.llvm.org/D39364
llvm-svn: 316872
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We make sure that the final reload of an invariant scalar memory access uses the
same stack slot into which the invariant memory access was stored originally.
Earlier, this was broken as we introduce a new stack slot aside of the preload
stack slot, which remained uninitialized and caused our escaping loads to
contain garbage. This happened due to us clearing the pre-populated values
in EscapeMap after kernel code generation. We address this issue by preserving
the original host values and restoring them after kernel code generation.
EscapeMap is not expected to be used during kernel code generation, hence we
clear it during kernel generation to make sure that any unintended uses are
noticed.
llvm-svn: 314894
|
|
|
|
|
|
|
|
|
|
|
| |
This matches the behavior we already have in lib/Codegen/CodeGeneration.cpp and
makes sure that we fall back to the original code. It seems when invariant load
hoisting was introduced to the GPGPU backend we missed to reset the RTC flag,
such that kernels where invariant load hoisting failed executed the 'optimized'
SCoP, which however is set to a simple 'unreachable'. Unsurprisingly, this
results in hard to debug issues that are a lot of fun to debug.
llvm-svn: 314624
|
|
|
|
|
|
| |
r314375 privatized Loop's constructor and replaced it with an Allocator.
llvm-svn: 314412
|
|
|
|
|
|
|
|
|
|
| |
Such RTCs may introduce integer wrapping intrinsics with more than 64 bit,
which are translated to library calls on AOSP that are not part of the
runtime and will consequently cause linker errors.
Thanks to Eli Friedman for reporting this issue and reducing the test case.
llvm-svn: 314065
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since -polly-codegen reports itself to preserve DependenceInfo and IslAstInfo,
we might get those analysis that were computed by a different ScopInfo for a
different Scop structure. This would be unfortunate because DependenceInfo and
IslAstInfo hold references to resources allocated by
ScopInfo/ScopBuilder/Scop (e.g. isl_id). If -polly-codegen and
DependenceInfo/IslAstInfo do not agree on which Scop to use, unpredictable
things can happen.
When the ScopInfo/Scop object is freed, there is a high probability that the
new ScopInfo/Scop object will be created at the same heap position with the
same address. Comparing whether the Scop or ScopInfo address is the expected
therefore is unreliable.
Instead, we compare the address of the isl_ctx object. Both, DependenceInfo
and IslAstInfo must hold a reference to the isl_ctx object to ensure it is
not freed before the destruction of those analyses which might happen after
the destruction of the Scop/ScopInfo they refer to. Hence, the isl_ctx
will not be freed and its address not reused as long there is a
DependenceInfo or IslAstInfo around.
This fixes llvm.org/PR34441
llvm-svn: 313842
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update CodegenCleanup using the function-level passes added by
populatePassManager that run between EP_EarlyAsPossible and
EP_VectorizerStart in -O3.
The changes in particular are:
- Added pass create arguments, e.g. ExpensiveCombines for InstCombine.
- Remove reroll pass. The option -reroll-loops is disabled by default.
- Add passes run with UnitAtATime, which is the default.
- Add instances of LibCallsShrinkWrap, TailCallElimination, SCCP
(sparse conditional constant propagation), Float2Int
that did not run before.
- Add instances of GVN as in the default pipeline.
Notes:
- GVNHoist, GVNSink, NewGVN are still disabled in the -O3 pipeline.
- The optimization level and other optimization parameters are not
accessible outside of PassManagerBuilder, hence we cannot add passes
depending on these.
Differential Revision: https://reviews.llvm.org/D37571
llvm-svn: 312875
|
|
|
|
|
|
|
|
| |
The type of NewValue might change due to ScalarEvolution
looking though bitcasts. The synthesized NewValue therefore
becomes the type before the bitcast.
llvm-svn: 312718
|
|
|
|
|
|
|
| |
It's weird at first glance that we do this, so I wrote up some
documentation on why we need to perform this process.
llvm-svn: 312715
|
|
|
|
|
|
|
| |
It seems NewGVN still has some problems: llvm.org/PR34452, we will switch back
after they have been resolved.
llvm-svn: 312480
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In certain situations, the context in the isl_ast_build could result for the
min/max locations of our alias sets to become empty, which would cause an
internal error in isl, which is then unable to derive a value for these
expressions. Check these conditions before code generating expressions and
instead assume that alias check succeeded. This is valid, as the corresponding
memory accesses will not be executed under any valid context.
This fixed llvm.org/PR34432. Thanks to Qirun Zhang for reporting.
llvm-svn: 312455
|
|
|
|
| |
llvm-svn: 312452
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
accesses.
In Polly, we specifically add a paramter to represent the outermost dimension
size of fortran arrays. We do this because this information is statically
available from the fortran metadata generated by dragonegg.
However, we were only materializing these parameters (meaning, creating an
llvm::Value to back the isl_id) from *memory accesses*. This is wrong,
we should materialize parameters from *scop array info*.
It is wrong because if there is a case where we detect 2 fortran arrays,
but only one of them is accessed, we may not materialize the other array's
dimensions at all.
This is incorrect. We fix this by looping over all
`polly::ScopArrayInfo` in a scop, rather that just all `polly::MemoryAccess`.
Differential Revision: https://reviews.llvm.org/D37379
llvm-svn: 312350
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, GVN can be necessary to eliminate redundant instructions in case
of, for instance, GEMM and float type. This patch makes GVN be run during
the cleanup.
Reviewed-by: Tobias Grosser <tobias@grosser.es>,
Michael Kruse <llvm@meinersbur.de>
Differential Revision: https://reviews.llvm.org/D37340
llvm-svn: 312307
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is useful when we face certain intrinsics such as `llvm.exp.*`
which cannot be lowered by the NVPTX backend while other intrinsics can.
So, we would need to keep blacklists of intrinsics that cannot be
handled by the NVPTX backend. It is much simpler to try and promote
all intrinsics to libdevice versions.
This patch makes function/intrinsic very uniform, and will always try to use
a libdevice version if it exists.
Differential Revision: https://reviews.llvm.org/D37056
llvm-svn: 312239
|
|
|
|
|
|
|
|
|
| |
The adds code generation support for the previous commit.
This patch has been re-applied, after the memory issue in the previous patch
has been fixed.
llvm-svn: 312211
|
|
|
|
|
|
| |
This reverts commit r312129. It caused some memory issues.
llvm-svn: 312208
|
|
|
|
|
|
| |
The adds code generation support for the previous commit.
llvm-svn: 312129
|
|
|
|
|
|
| |
This possibly helps to avoid run-time check failures in the COSMO kernels.
llvm-svn: 311920
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Properly require and preserve the OptimizationRemarkEmitter for use in
ScopPass. Previously one had to get the ORE from ScopDetection because
CodeGeneration did not mark it as preserved. It would need to be
recomputed which results in the legacy PM to throw away all previous
SCoP analysis.
This also changes the implementation of ScopPass::getAnalysisUsage to
not unconditionally preserve all passes, but only those needed to be
preserved by any SCoP pass (at least when using the legacy PM). This
allows invalidating DependenceInfo (and IslAstInfo) in case the pass
would cause them to change (e.g. OpTree, DeLICM, MaximalArrayExpansion)
JSONImporter should also invalidate the DependenceInfo. In this patch
it marks DependenceInfo as preserved anyway because some regression
tests depend on it.
Differential Revision: https://reviews.llvm.org/D37010
llvm-svn: 311888
|
|
|
|
|
|
| |
other minor fixes (NFC).
llvm-svn: 311704
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Whether a partial write is tautological/unsatisfiable not only
depends on the access domain, but also on the domain covered
by its node in the AST.
In the example below, there are two instances of Stmt_cond_false. It may have a partial write access that is not executed in instance Stmt_cond_false(0).
for (int c0 = 0; c0 < tmp5; c0 += 1) {
Stmt_for_body344(c0);
if (tmp5 >= c0 + 2)
Stmt_cond_false(c0);
Stmt_cond_end(c0);
}
if (tmp5 <= 0) {
Stmt_for_body344(0);
Stmt_cond_false(0);
Stmt_cond_end(0);
}
Isl cannot derive a subscript for an array element that is never accessed.
This caused an error in that no subscript expression has been generated
in IslNodeBuilder::createNewAccesses, but BlockGenerator expected one
to exist because there is an execution of that write, just not in that
ast node.
Fixed by instead of determining whether the access domain is empty,
inspect whether isl generated a constant "false" ast expression in
the current ast node.
This should fix a compiler crash of the aosp buildbot.
llvm-svn: 311663
|
|
|
|
|
|
|
|
|
|
|
|
| |
functions in a kernel.
This is a stylistic change to make the function a little more readable.
Also add a debug print to show what instruction contains a use of a
function we don't understand in the kernel.
Differential Revision: https://reviews.llvm.org/D37058
llvm-svn: 311648
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add statistics about
- Which optimizations are applied
- Number of loops in Scops at various stages
- Number of scalar/singleton writes at various stages representative
for scalar false dependencies
- Number of parallel loops
These will be useful to find regressions due to moving Polly further
down of LLVM's pass pipeline.
Differential Revision: https://reviews.llvm.org/D37049
llvm-svn: 311553
|
|
|
|
|
|
|
|
|
|
|
| |
MSVC warns about comparison between a signed and unsigned integer.
The rules of C(++) define that an unsigned comparison has to be
carried-out in this case. This is unlikely to be intended.
Fix by assigning the loop's upper bound to a signed integer first.
This also avoids repeated evaluation of the invariant upper bound.
llvm-svn: 311548
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
There is no need to emit alias metadata for scalars, as basicaa will easily
distinguish them from arrays. This reduces the size of the metadata we generate.
This is especially useful after we moved to -polly-position=before-vectorizer,
where a lot more scalar dependences are introduced, which increased the size of
the alias analysis metadata and made us commonly reach the limits after which
we do not emit alias metadata that have been introduced to prevent quadratic
growth of this alias metadata.
This improves 2mm performance from 1.5 seconds to 0.17 seconds.
Reviewers: Meinersbur, bollu, singam-sanjay
Reviewed By: Meinersbur
Subscribers: pollydev, llvm-commits
Tags: #polly
Differential Revision: https://reviews.llvm.org/D37028
llvm-svn: 311498
|
|
|
|
| |
llvm-svn: 311477
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, in case of GEMM and the pattern matching based optimizations, we
use only the SLP Vectorizer out of two LLVM vectorizers. Since the Loop
Vectorizer can get in the way of optimal code generation, we disable the Loop
Vectorizer for the innermost loop using mark nodes and emitting the
corresponding metadata.
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: https://reviews.llvm.org/D36928
llvm-svn: 311473
|
|
|
|
| |
llvm-svn: 311440
|
|
|
|
| |
llvm-svn: 311439
|
|
|
|
| |
llvm-svn: 311361
|
|
|
|
| |
llvm-svn: 311360
|
|
|
|
| |
llvm-svn: 311331
|
|
|
|
|
|
|
|
|
|
| |
This feature was not enabled for `PPCGCodeGeneration`. Now that this is
enabled, we can benchmark Scops that have been optimised with
`-polly-codegen-ppcg` with the `-polly-codegen-perf-monitoring` option.
Differential Revision: https://reviews.llvm.org/D36934
llvm-svn: 311328
|
|
|
|
|
|
| |
These intrinsics are used in COSMO.
llvm-svn: 311324
|
|
|
|
|
|
| |
These two functions are used in COSMO
llvm-svn: 311322
|
|
|
|
|
|
|
|
|
| |
We still see some issues with parameter space mismatches. Revert this to get
a clean baseline. We will recommit after these issues have been resolved.
This reverts commit 0e360a14194f722ded7aa2bc9d4be2ed2efeeb49.
llvm-svn: 311268
|
|
|
|
|
|
|
|
| |
Instead of using Twines and temporary expressions, we do string manipulation
through a std::string. This resolves a memory corruption issue, which likely
was caused by twines loosing their underlying string too soon.
llvm-svn: 311264
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
not the constantexpr itself.
- We should iterate over `I`, which is `Cur` expanded out to an
instruction, and not `Cur` itself.
- This is a bugfix.
Differential Revision: https://reviews.llvm.org/D36923
llvm-svn: 311261
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This information is necessary for PPCG to perform correct life range reordering.
With these changes applied we can live-range reorder some of the important
kernels in COSMO.
We also update and rename one test case, which previously could not be optimized
and now is optimized thanks to live-range reordering. To preserve test coverage
we add a new test case scalar-writes-in-scop-requires-abort.ll, which exercises
our automatic abort in case of scalar writes in the kernel.
Reviewers: Meinersbur, bollu, singam-sanjay
Subscribers: nemanjai, pollydev, llvm-commits, kbarton
Tags: #polly
Differential Revision: https://reviews.llvm.org/D36929
llvm-svn: 311259
|