| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
llvm-svn: 250776
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MachineBlockPlacement pass.
Currently, in the MachineBlockPlacement pass, the loop is rotated so that the best exit is the last BB in the loop chain, to maximize the fall-through from the loop to the outside. With profile data, we can determine the cost in terms of missed fall-through opportunities when rotating a loop chain and select the best rotation. Basically, there are three kinds of cost to consider for each rotation:
1. The possibly missed fall-through edge (if it exists) from a BB outside the loop to the loop header.
2. The possibly missed fall-through edges (if they exist) from the loop exits to BBs outside the loop.
3. The missed fall-through edge (if it exists) from the last BB to the first BB in the loop chain.
Therefore, the cost for a given rotation is the sum of costs listed above. We select the best rotation with the smallest cost. This is only for PGO mode when we have more precise edge frequencies.
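As a rough illustration of the cost sum described above, here is a minimal sketch over a simplified model of a loop chain. It is not the code from the patch: `LoopChainModel`, `rotationCost`, and `bestRotation` are hypothetical names, and the model assumes that only the last block of the rotated chain can fall through to a block outside the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of a loop chain; these names are
// illustrative and do not come from MachineBlockPlacement.
struct LoopChainModel {
  std::size_t HeaderIdx;                // chain position of the loop header
  std::uint64_t HeaderEntryFreq;        // frequency of outside -> header edges
  std::vector<std::uint64_t> ExitFreq;  // per block: frequency of its loop-exit edges
  // EdgeFreq[I][J]: frequency of the edge from chain block I to chain block J
  // (0 if there is no such edge).
  std::vector<std::vector<std::uint64_t>> EdgeFreq;
};

// Cost of rotating the chain so that block `First` comes first.
std::uint64_t rotationCost(const LoopChainModel &M, std::size_t First) {
  const std::size_t N = M.ExitFreq.size();
  assert(N > 0 && First < N && "rotation must name a block in the chain");
  const std::size_t Last = (First + N - 1) % N;

  std::uint64_t Cost = 0;
  // 1. The outside -> header edge cannot fall through unless the header is first.
  if (First != M.HeaderIdx)
    Cost += M.HeaderEntryFreq;
  // 2. Assumption: only the last block's exit edges can fall through.
  for (std::size_t I = 0; I < N; ++I)
    if (I != Last)
      Cost += M.ExitFreq[I];
  // 3. The edge from the last block back to the first block needs a jump.
  Cost += M.EdgeFreq[Last][First];
  return Cost;
}

// Picks the rotation with the smallest cost (the lowest index wins ties).
std::size_t bestRotation(const LoopChainModel &M) {
  std::size_t Best = 0;
  for (std::size_t First = 1; First < M.ExitFreq.size(); ++First)
    if (rotationCost(M, First) < rotationCost(M, Best))
      Best = First;
  return Best;
}
```

In this model, a chain whose hottest exit sits in the middle tends to be rotated so that the exiting block ends the chain, as long as the extra cost of no longer starting with the header is smaller.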
Differential revision: http://reviews.llvm.org/D10717
llvm-svn: 250754
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was originally checked in at r250527, but reverted at r250570 because of PR25222.
There were at least 2 problems:
1. The cost check was checking for an instruction with an exact cost of TCC_Expensive;
that should have been >=.
2. The cause of the clang stage 1 failures was illegally sinking 'call' instructions;
we can't sink instructions that may have side effects / are not safe to execute speculatively.
Fixed those conditions in sinkSelectOperand() and added test cases.
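The two fixed conditions can be sketched roughly as follows. This is not the body of sinkSelectOperand(): `OperandInfo` and `shouldSinkSelectOperand` are hypothetical, and the enum only mirrors the TTI cost constants for illustration.

```cpp
// Mirrors the TargetTransformInfo cost constants; the values are repeated
// here only so the sketch is self-contained.
enum TargetCostConstants { TCC_Free = 0, TCC_Basic = 1, TCC_Expensive = 4 };

// Hypothetical summary of a select operand.
struct OperandInfo {
  unsigned Cost;         // cost reported by the cost model
  bool SafeToSpeculate;  // false for calls and anything with side effects
};

// Returns true if the operand is a candidate for sinking.
bool shouldSinkSelectOperand(const OperandInfo &Op) {
  // Problem 2: never sink something that is not safe to execute speculatively.
  if (!Op.SafeToSpeculate)
    return false;
  // Problem 1: compare with >=, not ==, so costs above TCC_Expensive qualify.
  return Op.Cost >= static_cast<unsigned>(TCC_Expensive);
}
```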
Original commit message:
This is a follow-up to the discussion in D12882.
Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands
are expensive (as defined by the TTI cost model) because that may expose further optimizations.
However, we would then like a later pass like CodeGenPrepare to undo that transformation if the
target would likely benefit from not speculatively executing an expensive op (this patch).
Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its
select-formation behavior that changed with r248439.
Differential Revision: http://reviews.llvm.org/D13297
llvm-svn: 250743
|
|
|
|
|
|
| |
It looks like an extra negation snuck in as a part of restoring it.
llvm-svn: 250726
|
|
|
|
|
|
|
| |
While technically this is untested dead code, it has out-of-tree users.
This reverts a part of r250434.
llvm-svn: 250717
|
|
|
|
|
|
|
|
|
| |
This reverts commit r250596.
Reverted for now as the commit triggers assert in the AMDGPU target
pending investigation.
llvm-svn: 250713
|
|
|
|
|
|
|
|
|
|
| |
Originally I planned to use the same interface for masked gather/scatter and set isConsecutive to "false" in this case.
Now I'm implementing masked gather/scatter and see that the interface is inconvenient. I want to add interfaces isLegalMaskedGather() / isLegalMaskedScatter() instead of using the "Consecutive" parameter in the existing interfaces.
Differential Revision: http://reviews.llvm.org/D13850
llvm-svn: 250686
|
|
|
|
| |
llvm-svn: 250653
|
|
|
|
| |
llvm-svn: 250651
|
|
|
|
|
|
| |
Minor fix to D13665 found during post-commit review.
llvm-svn: 250616
|
|
|
|
|
|
| |
Also do some cleanups and comment improvements.
llvm-svn: 250598
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This property was already used in the code path when no liveness
intervals are present. Unfortunately, the code path that uses liveness
intervals tried to query a cached live interval for an allocatable
physreg; those are usually not computed, so a conservative default was
used.
This doesn't affect any of the lit testcases. This is a precursor to
upcoming changes which should be NFC, but without this patch this tidbit
wouldn't be NFC.
llvm-svn: 250596
|
|
|
|
|
|
|
|
|
|
|
| |
This should not change behaviour because as far as I can see all code
reading the pressure changes has no effect if the PressureInc is 0.
Removing these entries however does avoid unnecessary computation, and
results in a more stable debug output. I want the stable debug output to
check that some upcoming changes are indeed NFC and identical even at
the debug output level.
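A minimal sketch of dropping zero-increment entries, assuming a simplified stand-in for the real data structure (`PressureChangeModel` and `dropZeroChanges` are hypothetical names):

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for one pressure change entry.
struct PressureChangeModel {
  unsigned PSetID;  // pressure set affected
  int Inc;          // pressure increment; 0 means "no effect"
};

// Removes entries whose increment is 0. std::remove_if keeps the relative
// order of the surviving entries, so an existing ordering is preserved.
void dropZeroChanges(std::vector<PressureChangeModel> &Diff) {
  Diff.erase(std::remove_if(Diff.begin(), Diff.end(),
                            [](const PressureChangeModel &C) {
                              return C.Inc == 0;
                            }),
             Diff.end());
}
```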
llvm-svn: 250595
|
|
|
|
|
|
|
| |
It is too easy to accidentally violate the ordering requirements when
modifying the PressureDiff entries through iterators.
llvm-svn: 250590
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Some shared code for handling eh.exceptionpointer and eh.exceptioncode
needs to not share the part that truncates to 32 bits, which is intended
just for exception codes.
Reviewers: rnk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13747
llvm-svn: 250588
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our previous value of "16 + 8 + MaxCallFrameSize" for ParentFrameOffset
is incorrect when CSRs are involved. We were supposed to have a test
case to catch this, but it wasn't very rigorous.
The main effect here is that calling _CxxThrowException inside a
catchpad doesn't immediately crash on MOVAPS when you have an odd number
of CSRs.
llvm-svn: 250583
|
|
|
|
| |
llvm-svn: 250579
|
|
|
|
|
|
| |
Breaks clang selfhost, see PR25222. This reverts commits r250527 and r250528.
llvm-svn: 250570
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We now use the block for the catchpad itself, rather than its normal
successor, as the funclet entry.
Putting the normal successor in the map leads downstream funclet
membership computations to erroneous results.
Reviewers: majnemer, rnk
Subscribers: rnk, llvm-commits
Differential Revision: http://reviews.llvm.org/D13798
llvm-svn: 250552
|
|
|
|
|
|
| |
No functionality change is intended.
llvm-svn: 250545
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When a cleanup's cleanupendpad or cleanupret targets a catchendpad, stop
trying to propagate the cleanup's parent's color to the catchendpad, since
what's needed is the cleanup's grandparent's color and the catchendpad
will get that color from the catchpad linkage already. We already had
this exclusion for invokes, but were missing it for
cleanupendpad/cleanupret.
Also add a missing line that tags cleanupendpads' states in the
EHPadStateMap, without which lowering invokes that target cleanupendpads
which unwind to other handlers (and so don't have the -1 state) will fail.
This fixes the reduced IR repro in PR25163.
Reviewers: majnemer, andrew.w.kaylor, rnk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13797
llvm-svn: 250534
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands
are expensive (as defined by the TTI cost model) because that may expose further optimizations.
However, we would then like a later pass like CodeGenPrepare to undo that transformation if the
target would likely benefit from not speculatively executing an expensive op (this patch).
Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its
select-formation behavior that changed with r248439.
Differential Revision: http://reviews.llvm.org/D13297
llvm-svn: 250527
|
|
|
|
|
|
| |
Breaks the hexagon buildbot.
llvm-svn: 250461
|
|
|
|
|
|
|
| |
When building with modules the forward-declared inner class
DebugLocStream::ListBuilder causes clang to fall over.
llvm-svn: 250459
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Android libc provides a fixed TLS slot for the unsafe stack pointer,
and this change implements direct access to that slot on AArch64 via
__builtin_thread_pointer() + offset.
This change also moves more code into TargetLowering and its
target-specific subclasses to get rid of target-specific codegen
in SafeStackPass.
This change does not touch the ARM backend because ARM lowers
__builtin_thread_pointer as aeabi_read_tp, which is not available
on Android.
llvm-svn: 250456
|
|
|
|
|
|
| |
Carefully selected parts without deleting graph stuff and dumping methods.
llvm-svn: 250434
|
|
|
|
|
|
| |
I left all (dead) print and dump methods in place.
llvm-svn: 250433
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Caching SDLoc(N), instead of recreating it in every single
function call, keeps the code denser and allows unwrapping long lines.
Reviewers: sunfish, atrick, sdmitrouk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13726
llvm-svn: 250305
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The two implementations had more code in common than not.
Reviewers: sunfish, MatzeB, sdmitrouk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13724
llvm-svn: 250302
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Emit the handler and clause locations immediately after the standard
xdata.
Clauses are emitted in the same order and format used to communicate them
to the CLR Execution Engine.
Add a lit test to verify correct table generation on a small but
interesting example function.
Reviewers: majnemer, andrew.w.kaylor, rnk
Subscribers: pgavlin, AndyAyers, llvm-commits
Differential Revision: http://reviews.llvm.org/D13451
llvm-svn: 250219
|
|
|
|
| |
llvm-svn: 250214
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add an iterator that can walk across blocks and which visits the state
transitions rather than state ranges, with explicit transitions to -1
indicating the presence of top-level calls that may throw and cause the
current function to unwind to caller. This will simplify code that needs
to identify nested try regions.
Refactor SEH and C++EH table generation to use the new
InvokeStateChangeIterator, and remove the InvokeLabelIterator they were
using.
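To illustrate the difference between visiting state ranges and visiting state transitions, here is a small sketch over a flat list of states. It is not the InvokeStateChangeIterator itself: `StateChange` and `collectStateChanges` are hypothetical, and the input is assumed to hold one EH state per potentially-throwing call, in layout order.

```cpp
#include <cstddef>
#include <vector>

// One transition: at call number Index the state changes from Old to New.
struct StateChange {
  std::size_t Index;
  int OldState;
  int NewState;
};

// States[I] is the EH state in effect at the I-th potentially-throwing call;
// -1 marks a top-level call that unwinds to the caller. The walk starts in
// state -1 and records a transition only where the state actually changes.
std::vector<StateChange> collectStateChanges(const std::vector<int> &States) {
  std::vector<StateChange> Changes;
  int Current = -1;
  for (std::size_t I = 0; I < States.size(); ++I) {
    if (States[I] != Current) {
      Changes.push_back(StateChange{I, Current, States[I]});
      Current = States[I];
    }
  }
  return Changes;
}
```

With transitions made explicit, including those back to -1, a consumer can find the boundaries of nested try regions without reconstructing them from ranges.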
Reviewers: majnemer, andrew.w.kaylor, rnk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13623
llvm-svn: 250179
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The comment says this was stopped because it was unlikely to be
profitable. This is not true if you want to combine vector loads
with multiple components.
For a simple case that looks like
t0 = load t0 ...
t1 = load t0 ...
t2 = load t0 ...
t3 = load t0 ...
t4 = store t0:1, t0:1
t5 = store t4, t1:0
t6 = store t5, t2:0
t7 = store t6, t3:0
We want to get all of these stores onto a chain
that is a TokenFactor of these N loads. This mostly
solves the AMDGPU merge-stores.ll regressions
with -combiner-alias-analysis for merging vector
stores of vector loads.
llvm-svn: 250138
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This basic combine was surprisingly missing.
AMDGPU legalizes many operations in terms of 32-bit vector components,
so not doing this results in many extra copies and subregister extracts
that need to be cleaned up later.
InstCombine already does this for the hasOneUse case. The target hook
is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn
from a vector materialize repeated immediate instruction to a constant
vector load with more scalar copies from it.
llvm-svn: 250129
|
|
|
|
|
|
|
|
|
|
| |
statement.
When lowering an invoke statement, all unwind destinations are directly added as successors of the call site block, and the weights of those new edges are not assigned properly: the default weight of 16 is used for those edges. This patch calculates the proper edge weights for those edges when collecting all unwind destinations.
Differential revision: http://reviews.llvm.org/D13354
llvm-svn: 250119
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have a number of functions that implement constant folding of vectors (unary and binary ops) in near identical manners (and the differences don't appear to be critical).
This patch introduces a common implementation (SelectionDAG::FoldConstantVectorArithmetic) and calls this in both the unary and binary op cases.
After this initial patch I intend to begin enabling vector constant folding for a wider number of opcodes in SelectionDAG::getNode().
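The idea of one shared lane-by-lane folder for both the unary and binary cases can be sketched as below. This is only an illustration on plain integers, not the SelectionDAG implementation: `ConstVec`, `ScalarFolder`, and this free-function `foldConstantVectorArithmetic` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using ConstVec = std::vector<std::int64_t>;
using ScalarFolder =
    std::function<std::int64_t(const std::vector<std::int64_t> &)>;

// Folds an operation lane by lane over one (unary) or two (binary) constant
// vector operands. Returns false, leaving Result untouched, if there are no
// operands or their lane counts differ.
bool foldConstantVectorArithmetic(const std::vector<ConstVec> &Ops,
                                  const ScalarFolder &FoldLane,
                                  ConstVec &Result) {
  if (Ops.empty())
    return false;
  const std::size_t NumLanes = Ops.front().size();
  for (const ConstVec &V : Ops)
    if (V.size() != NumLanes)
      return false;

  ConstVec Folded(NumLanes);
  std::vector<std::int64_t> LaneArgs(Ops.size());
  for (std::size_t Lane = 0; Lane < NumLanes; ++Lane) {
    // Gather this lane's scalar from every operand, then fold the scalars.
    for (std::size_t Op = 0; Op < Ops.size(); ++Op)
      LaneArgs[Op] = Ops[Op][Lane];
    Folded[Lane] = FoldLane(LaneArgs);
  }
  Result = std::move(Folded);
  return true;
}
```

Because the operand count is just the size of `Ops`, the unary and binary callers differ only in the scalar folder they pass in.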
Differential Revision: http://reviews.llvm.org/D13665
llvm-svn: 250118
|
|
|
|
|
|
|
| |
No tests fail with this enabled so I assume it was an accident
that it isn't enabled now.
llvm-svn: 250070
|
|
|
|
|
|
|
|
| |
This was a minor bug in r249492. Calling PrepareEHLandingPad on a
non-landingpad was a no-op, but it attempted to get the generic pointer
register class, which apparently doesn't exist for some targets.
llvm-svn: 250068
|
|
|
|
|
|
|
| |
CatchObjRecoverIdx was used for the old scheme; it is no longer
relevant.
llvm-svn: 250065
|
|
|
|
|
|
|
|
|
|
| |
On targets where f32 is not legal, we have to look through a BITCAST SDNode to
find the register that an argument is stored in when emitting debug info, or we
will not be able to emit a DW_AT_location for it.
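A rough sketch of "looking through" a bitcast to find the underlying register, on a toy node type rather than SDNode (`NodeModel` and `findArgumentRegister` are hypothetical, and register 0 is assumed to mean "none"):

```cpp
// Toy stand-in for a DAG node; not the SelectionDAG API.
struct NodeModel {
  enum Kind { CopyFromReg, BitCast, Other } Opcode;
  unsigned Reg;              // meaningful only for CopyFromReg
  const NodeModel *Operand;  // meaningful only for BitCast
};

// Returns the register that feeds N, skipping over any chain of bitcasts.
// Returns 0 when no register copy is found.
unsigned findArgumentRegister(const NodeModel *N) {
  while (N && N->Opcode == NodeModel::BitCast)
    N = N->Operand;
  if (N && N->Opcode == NodeModel::CopyFromReg)
    return N->Reg;
  return 0;
}
```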
Differential Revision: http://reviews.llvm.org/D13005
llvm-svn: 250056
|
|
|
|
|
|
|
|
| |
Enabled constant canonicalization for all constants.
Improved combining of constant vectors.
llvm-svn: 249993
|
|
|
|
|
|
|
|
| |
Enable constant folding for vector splats as well as scalars.
Enable constant canonicalization for all scalar and vector constants.
llvm-svn: 249978
|
|
|
|
|
|
| |
wineh-parent is dead, so is ValueOrMBB.
llvm-svn: 249920
|
|
|
|
|
|
|
|
|
|
|
| |
The new implementation works at least as well as the old implementation
did.
Also delete the associated preparation tests. They don't exercise
interesting corner cases of the new implementation. All the codegen
tests of the EH tables have already been ported.
llvm-svn: 249918
|
|
|
|
|
|
|
|
|
| |
Also fix a buglet where SEH tables had ranges that spanned funclets.
The remaining tests using the old landingpad IR are preparation tests,
and will be deleted along with the old preparation.
llvm-svn: 249917
|
|
|
|
|
|
|
| |
Finish removing implicit ilist iterator conversions from LLVMCodeGen.
I'm sure there are lots more of these in lib/CodeGen/*/.
llvm-svn: 249915
|
|
|
|
|
|
|
| |
We got them right for the old IR, but not with funclets. Port the old
test to the new IR and fix the code.
llvm-svn: 249906
|
|
|
|
| |
llvm-svn: 249903
|
|
|
|
| |
llvm-svn: 249901
|
|
|
|
|
|
|
|
| |
This wasn't very observable in execution tests, because usually there is
an invoke in the catchpad that unwinds to the catchendpad but never
actually throws.
llvm-svn: 249898
|