| Commit message (Collapse) | Author | Age | Files | Lines |
| |
This allows us to export the results from transformations such as DeLICM.
llvm-svn: 307641
| |
When providing the option "-polly-ast-print-accesses" Polly also prints the
memory accesses that are generated:
#pragma known-parallel
for (int c0 = 0; c0 <= 1023; c0 += 4)
#pragma simd
for (int c1 = c0; c1 <= c0 + 3; c1 += 1)
Stmt_for_body(
/* read */ &MemRef_B[0]
/* write */ MemRef_A[c1]
);
This makes writing and debugging memory layout transformations easier.
Based on a patch contributed by Thomas Lang (ETH Zurich)
llvm-svn: 307579
| |
Summary:
Since r306667, propagateInvalidStmtDomains gets a reference to an
InvalidDomainMap. As part of the branch leading to return false, the respective
domain is freed. It is, however, not removed from the InvalidDomainMap, leaking
a pointer to a freed object, which results in a use-after-free. Fix this by
removing the domain from the map before returning.
We tried to derive a test case that reliably fails, but did not succeed in
producing one. Hence, for now the failures in our LNT bots must be sufficient
to keep this issue tested.
Reviewers: grosser, Meinersbur, bollu
Subscribers: bollu, nandini12396, pollydev, llvm-commits
Differential Revision: https://reviews.llvm.org/D34971
llvm-svn: 307499
| |
local to the scop.
- By definition, we can pass something as a `kill` to PPCG if we know
that no data can flow across a kill.
- This is useful for more complex examples where we have scalars that
are local to a scop.
- If the local is only used within a scop, we are free to kill it.
Differential Revision: https://reviews.llvm.org/D35045
llvm-svn: 307260
| |
Summary:
Provide more context to the name of a GPU kernel by prefixing its name with the host function that calls it. For example, the first kernel called by `gemm` would be `FUNC_gemm_KERNEL_0`.
Kernels currently follow the "kernel_#" (# = 0,1,2,3,...) nomenclature. This patch makes it easier to map host caller and device callee, especially when there are many kernels produced by Polly-ACC.
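The naming scheme described above can be sketched as a small helper. This is only an illustration of the `FUNC_<host>_KERNEL_<n>` pattern quoted in the summary; the function name `kernel_name` and its parameters are hypothetical and are not part of Polly.

```python
def kernel_name(host_function: str, kernel_index: int) -> str:
    """Build a kernel name prefixed with the host function that launches it.

    Hypothetical helper: it only mirrors the pattern quoted in the summary.
    """
    return f"FUNC_{host_function}_KERNEL_{kernel_index}"


# The first three kernels launched from a host function called `gemm`.
names = [kernel_name("gemm", i) for i in range(3)]
print(names)
```

With the older "kernel_#" scheme, all three names would have been indistinguishable from kernels launched by any other host function.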
Reviewers: grosser, Meinersbur, bollu, philip.pfaffe, kbarton!
Reviewed By: grosser
Subscribers: nemanjai, pollydev
Tags: #polly
Differential Revision: https://reviews.llvm.org/D33985
llvm-svn: 307173
| |
llvm-svn: 307164
| |
Polly did not use PPCG's live range reordering feature. Teach
PPCGCodeGeneration to use this.
Documentation on this is sparse, so much of the code is conservative.
We currently kill all phi nodes in a Scop by appending them to the
must_kill map we pass to PPCG. I do not have a proof of correctness,
but it seems to be intuitively correct.
We also do not handle `array_order`, which, quoting PPCG, is:
PPCG/gpu.h: "Order dependences on non-scalars."
It seems to consist of RAW dependences between arrays. We need to
pass this information for more complex privatization cases.
Differential Revision: https://reviews.llvm.org/D34941
llvm-svn: 307163
| |
Summary: This is a general maintenance update
Reviewers: grosser
Subscribers: srhines, fedor.sergeev, pollydev, llvm-commits
Contributed-by: Maximilian Falkenstein <falkensm@student.ethz.ch>
Differential Revision: https://reviews.llvm.org/D34903
llvm-svn: 307090
| |
Summary:
Introduce a "hybrid" `-polly-target` option to optimise code for either the GPU or CPU.
When this target is selected, PPCGCodeGeneration will attempt first to optimise a Scop. If the Scop isn't modified, it is then sent to the passes that form the CPU pipeline, i.e. IslScheduleOptimizerPass, IslAstInfoWrapperPass and CodeGeneration.
In case the Scop is modified, it is marked to be skipped by the subsequent CPU optimisation passes.
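The control flow of the hybrid target can be modelled roughly as follows. This is a toy sketch, not Polly code: the `Scop` class, the `gpu_mappable` flag, and the `log` list are assumptions made for illustration; only the pass names come from the summary above.

```python
from dataclasses import dataclass, field

# Pass names taken from the summary; everything else is hypothetical.
CPU_PIPELINE = ["IslScheduleOptimizerPass", "IslAstInfoWrapperPass", "CodeGeneration"]


@dataclass
class Scop:
    name: str
    gpu_mappable: bool              # assumption: stands in for "the GPU path succeeded"
    modified: bool = False
    log: list = field(default_factory=list)


def ppcg_code_generation(scop: Scop) -> None:
    """Try the GPU path first; mark the Scop as modified on success."""
    if scop.gpu_mappable:
        scop.modified = True
        scop.log.append("PPCGCodeGeneration")


def run_hybrid(scop: Scop) -> list:
    """GPU first; fall back to the CPU pipeline only if nothing was changed."""
    ppcg_code_generation(scop)
    if scop.modified:
        return scop.log             # CPU passes skip a Scop that was already handled
    scop.log.extend(CPU_PIPELINE)
    return scop.log


print(run_hybrid(Scop("offloaded", gpu_mappable=True)))
print(run_hybrid(Scop("fallback", gpu_mappable=False)))
```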
Reviewers: grosser, Meinersbur, bollu
Reviewed By: grosser
Subscribers: kbarton, nemanjai, pollydev
Tags: #polly
Differential Revision: https://reviews.llvm.org/D34054
llvm-svn: 306863
| |
llvm-svn: 306791
| |
ScopStmts were being used in the computation of the Domain of the SCoPs
in ScopInfo. Once statements are split, there will not be a 1-to-1
correspondence between Stmts and Basic blocks. Thus this patch avoids
the use of getStmtFor() by creating a map of BB to InvalidDomain and
using it to compute the domain of the statements.
Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in>
Differential Revision: https://reviews.llvm.org/D33942
llvm-svn: 306667
| |
llvm-svn: 306657
| |
Summary:
The NVPTX backend is now initialised within Polly. A language front-end need not be modified to initialise the backend, just for Polly.
Reviewers: Meinersbur, grosser
Reviewed By: Meinersbur
Subscribers: vchuravy, mgorny
Tags: #polly
Differential Revision: https://reviews.llvm.org/D31859
llvm-svn: 306649
| |
This patch implements the option of allocating new arrays created by
Polly on the heap instead of the stack. To enable this option, a key named
'allocation' must be written in the imported JSON file with the value
'heap'.
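As a rough illustration of how such a key could be read, the sketch below parses a small JSON fragment and decides the placement of each array. Only the key 'allocation' and the value 'heap' come from this commit message; the other field names ("name", "sizes", "type"), the array names, and the helper `is_on_heap` are assumptions made for the example.

```python
import json

# Hypothetical array entries; only "allocation": "heap" is taken from the text.
ARRAYS_JSON = """
[
  {"name": "D", "sizes": ["270", "8"], "type": "double", "allocation": "heap"},
  {"name": "E", "sizes": ["270", "8"], "type": "double"}
]
"""


def is_on_heap(array_entry: dict) -> bool:
    """An array goes on the heap only if it explicitly asks for it."""
    return array_entry.get("allocation") == "heap"


arrays = json.loads(ARRAYS_JSON)
placement = {a["name"]: ("heap" if is_on_heap(a) else "stack") for a in arrays}
print(placement)
```

In this sketch an entry without the key falls back to stack allocation, which matches the description of heap allocation as an opt-in.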
We need this feature because, in a later iteration, we will implement
maximal static expansion, which needs a way to allocate arrays on the
heap. The expansion is very costly in terms of memory, so allocating on
the stack is not worth considering.
The malloc and the free are added at polly.start and polly.exiting,
respectively, so that there is no use-after-free (for instance, when a
Scop is inside a loop) and so that all memory allocated with malloc is
freed once it is no longer needed.
We also add:
- A boolean member IsOnHeap in the class ScopArrayInfo, which records
whether the array is allocated on the heap.
- A new branch in the method allocateNewArrays in the IslNodeBuilder for
the case of heap allocation. allocateNewArrays now takes a BBPair
containing polly.start and polly.exiting, and adds the malloc and free
calls to polly.start and polly.exiting, respectively.
- As IntPtrTy for the malloc call, we use the one from the DataLayout.
To do that, we have modified:
- createScopArrayInfo and getOrCreateScopArrayInfo so that they return
a non-const SAI, in order to be able to call setIsOnHeap in the
JSONImporter.
- executeScopConditionally so that it returns both the start block and
the end block of the scop, because we need these two blocks to add
the malloc and the free calls at the right position.
Differential Revision: https://reviews.llvm.org/D33688
llvm-svn: 306540
| |
llvm-svn: 306479
| |
Before we would 'guess' the correct location for the MergeBlock
that got introduced when executing a Scop conditionally. This
implicitly depends on the situation that at this point during
CodeGen there will be nothing between polly.start and polly.exiting.
With this commit we explicitly state that we want the block that
directly follows polly.exiting.
llvm-svn: 306398
| |
- This is useful for debugging GPU code.
llvm-svn: 306290
| |
- In D33414, if any function call was found within a kernel, we would bail out.
- This is an over-approximation. This patch changes this by allowing the
`llvm.sqrt.*` family of intrinsics.
- This introduces an additional step when creating a separate llvm::Module
for a kernel (GPUModule). We now copy function declarations from the
original module to new module.
- We also populate IslNodeBuilder::ValueMap so that it replaces function
references to the old module with references to the new module
(GPUModule).
Differential Revision: https://reviews.llvm.org/D34145
llvm-svn: 306284
| |
This commit returns both the start and the exit block that are created
by executeScopConditionally.
In a future commit we will make use of the exit block. Before we would
have to use the implicit property that there won't be any code generated
between polly.start and polly.exiting at the time of use to find the
correct block ('polly.exiting').
All usage locations are semantically unchanged.
llvm-svn: 306283
| |
The condition that disallowed code generation in PPCGCodeGeneration with
invariant loads is not required. I haven't been able to construct a
counterexample where this generates invalid code.
Differential Revision: https://reviews.llvm.org/D34604
llvm-svn: 306245
| |
This reduces the compilation time of one reduced test case from Android from
16 seconds to 100 milliseconds (we bail out), without negatively impacting any
other test case we currently have.
We still occasionally saw compilation timeouts on the AOSP buildbot. Hopefully,
those will go away with this change.
llvm-svn: 306235
| |
During the construction of MemoryAccesses in ScopBuilder, BasicBlocks
were used in function parameters, assuming that the ScopStmt can be
directly derived from it. This won't be true anymore once we split
BasicBlocks into multiple ScopStmt. As a preparation for such a change
in the future, we instead pass the ScopStmt and avoid the use of
getStmtFor().
There are two occasions where a kind of mapping from BasicBlock to
ScopStmt is still required.
1. Get the statement representing the incoming block of a `PHINode`
using `getLastStmtOf`.
2. One statement is required to write a scalar to be readable by those
which need it. This is most often the statement which contains its
definition, which we get using `getStmtFor(Instruction*)`.
Differential Revision: https://reviews.llvm.org/D34369
llvm-svn: 306132
| |
llvm-svn: 306088
| |
This allows us to bail out not only when the lexmin/max computation is too
expensive, but also when the cumulative cost across an alias group is
too expensive. This is an improvement of r303404, which did not seem to
be sufficient to keep the Android Buildbot quiet.
llvm-svn: 306087
| |
llvm-svn: 306086
| |
r303971 added an assertion that SCEV addition involving an AddRec
and a SCEVUnknown must involve a dominance relation: either the
SCEVUnknown value dominates the AddRec's loop, or the AddRec's
loop header dominates the SCEVUnknown. This is generally fine
for most usage of SCEV because it isn't possible to write an
expression in IR which would violate it, but it's a bit inconvenient
here for polly.
To solve the issue, just avoid creating a SCEV expression which
triggers the assertion.
I'm not really happy with this solution, but I don't have any better
ideas.
Fixes https://bugs.llvm.org/show_bug.cgi?id=33464.
Differential Revision: https://reviews.llvm.org/D34259
llvm-svn: 305864
| |
llvm::Loop::getNumBlocks returns an unsigned int, not a long.
llvm-svn: 305717
| |
llvm-svn: 305709
| |
isolateAndUnrollMatMulInnerLoops to C++
llvm-svn: 305676
| |
Ensure that all array base pointers are assigned before generating
aliasing metadata by allocating new arrays beforehand.
Before this patch, getBasePtr() returned nullptr for new arrays because
the arrays were created at a later point. Nullptr did not match to any
array after the created array base pointers have been assigned and when
the loads/stores are generated.
llvm-svn: 305675
| |
r304074 introduced a patch that accepts results from side-effect-free
functions in SCEV modeling. That change causes rejection of cases where the
call happens outside the SCoP. This patch checks if the call is
outside the Region and treats the results as a parameter (SCEVType::PARAM)
to the SCoP instead of returning SCEVType::INVALID.
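The decision described above can be sketched as a small classifier. This is a toy model under stated assumptions: the enum only lists the two `SCEVType` values named in this message, blocks are represented as plain strings, and the treatment of calls inside the region is my reading of r304074 rather than something this message spells out.

```python
from enum import Enum


class SCEVType(Enum):
    # Only the two values mentioned in the commit message are modelled here.
    PARAM = "PARAM"
    INVALID = "INVALID"


def classify_call(call_block: str, region_blocks: set, side_effect_free: bool) -> SCEVType:
    """Classify the result of a call for SCEV modeling (hypothetical helper)."""
    if call_block not in region_blocks:
        # The call runs before the SCoP, so its result is fixed while the
        # SCoP executes and can be treated as a parameter.
        return SCEVType.PARAM
    # Assumption: inside the region, only side-effect-free calls are accepted.
    return SCEVType.PARAM if side_effect_free else SCEVType.INVALID


region = {"for.body", "for.inc"}
print(classify_call("entry", region, side_effect_free=False))
print(classify_call("for.body", region, side_effect_free=False))
```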
Patch by Sameer Abu Asal.
llvm-svn: 305423
| |
In `PPCGCodeGeneration`, we try to take the references of every `Value`
that is used within a Scop to offload to the kernel. This occurs in
`GPUNodeBuilder::createLaunchParameters`.
This breaks if one of the values is a function pointer, since one of
these cases will trigger:
1. We try to take the references of an intrinsic function, and this
breaks at `verifyModule`, since it is illegal to take the reference of
an intrinsic.
2. We manage to take the reference to a function, but this fails at
`verifyModule` since the function will not be present in the module that
is created in the kernel.
3. Even if `verifyModule` succeeds (which should not occur), we would
then try to call a *host function* from the *device*, which is
illegal runtime behaviour.
So, we disable this entire range of possibilities by simply not allowing
function references within a `Scop` which corresponds to a kernel.
However, note that this is too conservative. We *can* allow intrinsics
within kernels if the backend can lower the intrinsic correctly. For
example, an intrinsic like `llvm.powi.*` can actually be lowered by the `NVPTX`
backend.
We will now gradually whitelist intrinsics which are known to be safe.
Differential Revision: https://reviews.llvm.org/D33414
llvm-svn: 305185
| |
Contributed by: Singapuram Sanjay
Differential Revision: https://reviews.llvm.org/D34079
llvm-svn: 305183
| |
The isl/mat.h functionality was incomplete (we returned 'void *' instead of
'isl::mat') and is likely not needed.
*.insert_partial_schedule was until now not exported in the bindings, but will
be needed in the next step.
llvm-svn: 305161
| |
- This is useful to run optimisations on only certain functions.
Differential Revision: https://reviews.llvm.org/D33990
llvm-svn: 305060
| |
llvm-svn: 304974
| |
llvm-svn: 304841
| |
In importArrays instead of silently ignoring the file.
llvm-svn: 304817
| |
Iterate through memory accesses in execution order (first all implicit reads,
then explicit accesses, then implicit writes).
In the test case this caused an implicit load to be handled as if it was loaded
after the write. That is, the value being written before it is available.
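The ordering rule can be sketched as a stable sort over three ranks. The `MemoryAccess` class and its two flags below are hypothetical stand-ins for illustration, not Polly's data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryAccess:
    name: str
    is_read: bool
    is_implicit: bool   # assumption: marks accesses without an explicit load/store


def execution_rank(access: MemoryAccess) -> int:
    """0: implicit reads, 1: explicit accesses, 2: implicit writes."""
    if not access.is_implicit:
        return 1
    return 0 if access.is_read else 2


def in_execution_order(accesses: list) -> list:
    # sorted() is stable, so explicit accesses keep their original relative order.
    return sorted(accesses, key=execution_rank)


accesses = [
    MemoryAccess("implicit_write", is_read=False, is_implicit=True),
    MemoryAccess("store_A", is_read=False, is_implicit=False),
    MemoryAccess("implicit_read", is_read=True, is_implicit=True),
    MemoryAccess("load_B", is_read=True, is_implicit=False),
]
print([a.name for a in in_execution_order(accesses)])
```

Processing the implicit read first is what prevents it from being handled as if it happened after the write.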
This fixes llvm.org/PR33323
llvm-svn: 304810
| |
Summary:
The RegionGenerator traditionally kept a BlockMap that mapped from original
basic blocks to newly generated basic blocks. With the introduction of partial
writes such a 1:1 mapping is not possible any more, as a single basic block
can be code generated into multiple basic blocks. Hence, depending on the use
case we need to either use the first basic block or the last basic block.
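One way to picture the replacement for the single BlockMap is a pair of maps, one for the first generated block and one for the last. The sketch below is a toy model: the names `start_map`, `end_map`, `record_generated_blocks`, and the block names are hypothetical, and the comments on which use case needs which map are my assumption rather than something stated in this summary.

```python
def record_generated_blocks(start_map: dict, end_map: dict,
                            original: str, generated: list) -> None:
    """Remember the first and last block generated for one original block."""
    if not generated:
        raise ValueError("at least one block must be generated")
    start_map[original] = generated[0]    # e.g. where incoming edges should land
    end_map[original] = generated[-1]     # e.g. where outgoing edges should start


start_map, end_map = {}, {}

# Without partial writes, one original block yields one generated block.
record_generated_blocks(start_map, end_map, "bb1", ["polly.stmt.bb1"])

# With a partial write, the same original block may be split into several blocks.
record_generated_blocks(
    start_map, end_map, "bb2",
    ["polly.stmt.bb2", "polly.stmt.bb2.partial", "polly.stmt.bb2.cont"],
)
print(start_map["bb2"], end_map["bb2"])
```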
This is intended to address the last four cases of incorrect code generation
in our AOSP buildbot and hopefully should turn it green.
Reviewers: Meinersbur, bollu, gareevroman, efriedma, huihuiz, sebpop, simbuerg
Reviewed By: Meinersbur
Subscribers: pollydev, llvm-commits
Tags: #polly
Differential Revision: https://reviews.llvm.org/D33767
llvm-svn: 304808
| |
Fix compiler warning:
polly/lib/CodeGen/PerfMonitor.cpp:81:2: warning: extra ‘;’ [-Wpedantic]
};
^
llvm-svn: 304802
| |
This is a regular maintenance update
llvm-svn: 304686
| |
- Add a counter that is incremented once on exit from a scop.
- Test cases got split into two: one to test the cycles, and another one
to test trip counts.
- Sample output:
```name=sample-output.txt
scop function, entry block name, exit block name, total time, trip count
warmup, %entry.split, %polly.merge_new_and_old, 5180, 1
f, %entry.split, %polly.merge_new_and_old, 409944, 500
g, %entry.split, %polly.merge_new_and_old, 1226, 1
```
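- A minimal sketch of how a tool might consume this output, using only the Python standard library. The sample text is copied from above; the helper name `time_per_trip` is hypothetical, and dividing total time by trip count is just one possible use of the new column.

```python
import csv
import io

SAMPLE = """\
scop function, entry block name, exit block name, total time, trip count
warmup, %entry.split, %polly.merge_new_and_old, 5180, 1
f, %entry.split, %polly.merge_new_and_old, 409944, 500
g, %entry.split, %polly.merge_new_and_old, 1226, 1
"""


def time_per_trip(text: str) -> dict:
    """Map each scop function to its total time divided by its trip count."""
    # skipinitialspace drops the blank that follows each comma in the output.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return {
        row["scop function"]: int(row["total time"]) / int(row["trip count"])
        for row in reader
    }


for function, average in time_per_trip(SAMPLE).items():
    print(f"{function}: {average:.3f}")
```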
Differential Revision: https://reviews.llvm.org/D33822
llvm-svn: 304543
| |
This ensures that tools can parse performance information which Polly
generates easily.
- Sample output:
```name=out.csv
scop function, entry block name, exit block name, total time
warmup, %entry.split, %polly.merge_new_and_old, 1960
f, %entry.split, %polly.merge_new_and_old, 1238
g, %entry.split, %polly.merge_new_and_old, 1218
```
- Example code to parse output:
```lang=python, name=example-parse.py
import asciitable
import sys
table = asciitable.read('out.csv', delimiter=',')
asciitable.write(table, sys.stdout, delimiter=',')
```
llvm-svn: 304533
| |
We should bail out if performance monitoring is not supported, since
we would have no information to print per-scop, and `FinalStartBB`,
`ReturnFromFinal` would be `nullptr`.
Assert that these are not `nullptr` if performance monitoring is supported.
llvm-svn: 304529
| |
Previously, we would generate one performance counter for all scops.
Now, we generate both the old information, as well as a per-scop
performance counter to generate finer grained information.
This patch needed a way to generate a unique name for a `Scop`.
The start region, end region, and function name combined provide a
unique `Scop` name. So, `Scop` has a new public API to provide its start
and end region names.
Differential Revision: https://reviews.llvm.org/D33723
llvm-svn: 304528
| |
Once statements no longer contain all instructions of a BasicBlock,
the block generator needs to go through the explicit list of
instructions each statement contains.
Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in>
Differential Revision: https://reviews.llvm.org/D33653
llvm-svn: 304502
| |
Ignored intrinsics are skipped at code generation and therefore do not
need to be part of the instruction list.
Specifically, llvm.lifetime.* intrinsics are removed before code
generation; referencing them would cause a use-after-free error.
Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in>
Differential Revision: https://reviews.llvm.org/D33768
llvm-svn: 304483
| |
This is useful for debugging miscompiles and extracting testcases
for crashes. See http://llvm.org/docs/OptBisect.html .
Differential Revision: https://reviews.llvm.org/D33752
llvm-svn: 304480
| |
Suggested-by: Michael Kruse <llvm@meinersbur.de>
llvm-svn: 304410
|