| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Lowering funclets is a no-op, so we can just go ahead and lower the
deopt state.
llvm-svn: 264078
|
|
|
|
| |
llvm-svn: 264076
|
|
|
|
| |
llvm-svn: 264072
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions.
We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT.
Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718).
Reapplied with a fix for PR26953 (missing vector widening legalization).
Differential Revision: http://reviews.llvm.org/D17932
llvm-svn: 264062
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Reviewers: vkalintiris
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D18147
llvm-svn: 264057
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Also renamed li_simm7 to li16_imm since it's not a simm7 and has an unusual
encoding (it's a uimm7 except that 0x7f represents -1).
Reviewers: vkalintiris
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D18145
llvm-svn: 264056
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We can't check the error message for this one because there's another lw/sw
available that covers a larger range. We therefore check the transition
between the two sizes.
Reviewers: vkalintiris
Subscribers: llvm-commits, dsanders
Differential Revision: http://reviews.llvm.org/D18144
llvm-svn: 264054
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Reviewers: vkalintiris
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D18143
llvm-svn: 264053
|
|
|
|
|
|
|
|
|
|
|
|
| |
immediates in MSA copy/insert.
Reviewers: vkalintiris
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D18142
llvm-svn: 264052
|
|
|
|
|
|
|
|
|
|
|
| |
It's a bug fix.
For rerolled loops SE trip count remains unchanged. It leads to incorrect work of the next passes.
My patch just resets SE info for rerolled loop forcing SE to re-evaluate it next time it requested.
I also added a verifier call in the exisitng test to be sure no invalid SE data remain. Without my fix this test would fail with -verify-scev.
Differential Revision: http://reviews.llvm.org/D18316
llvm-svn: 264051
|
|
|
|
|
|
|
|
|
|
|
|
| |
(e.g. "/")
Adding support for section names with special characters in them (e.g. "/").
GCC successfully compiles such section names.
This also fixes PR24520.
Differential Revision: http://reviews.llvm.org/D15678
llvm-svn: 264038
|
|
|
|
|
|
|
| |
This is more coherent with usual containers.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 264026
|
|
|
|
| |
llvm-svn: 264024
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
After this change, deopt operand bundles can be lowered directly by
SelectionDAG into STATEPOINT instructions (which are then lowered to a
call or sequence of nop, with an associated __llvm_stackmaps entry0.
This obviates the need to round-trip deoptimization state through
gc.statepoint via RewriteStatepointsForGC.
Reviewers: reames, atrick, majnemer, JosephTremoulet, pgavlin
Subscribers: sanjoy, mcrosier, majnemer, llvm-commits
Differential Revision: http://reviews.llvm.org/D18257
llvm-svn: 264015
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Without tree pruning clang has 2,667,552 points.
Wiht only dominators pruning: 1,515,586.
With both dominators & predominators pruning: 1,340,534.
Resubmit of r262103.
Differential Revision: http://reviews.llvm.org/D18341
llvm-svn: 264003
|
|
|
|
|
|
|
| |
Fixes Valgrind errors on the test cases that were reported as failing
by buildbots.
llvm-svn: 264000
|
|
|
|
|
|
|
|
|
|
| |
instruction constant folding
As noted in PR18355, this patch makes it clear that all cases with undef operands have been handled before further constant folding is attempted.
Differential Revision: http://reviews.llvm.org/D18305
llvm-svn: 263994
|
|
|
|
|
|
| |
Thanks to chapuni for catching this.
llvm-svn: 263993
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we have a BB with only MemoryDefs, live-in calculations will ignore
it. This means we get results like this:
define void @foo(i8* %p) {
; 1 = MemoryDef(liveOnEntry)
store i8 0, i8* %p
br i1 undef, label %if.then, label %if.end
if.then:
; 2 = MemoryDef(1)
store i8 1, i8* %p
br label %if.end
if.end:
; 3 = MemoryDef(1)
store i8 2, i8* %p
ret void
}
...When there should be a MemoryPhi in the `if.end` BB.
This patch fixes that behavior.
llvm-svn: 263991
|
|
|
|
|
|
|
| |
I meant to add these before committing r263982 as per the review,
but I forgot to squash.
llvm-svn: 263983
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Whole quad mode is already enabled for pixel shaders that compute
derivatives, but it must be suspended for instructions that cause a
shader to have side effects (i.e. stores and atomics).
This pass addresses the issue by storing the real (initial) live mask
in a register, masking EXEC before instructions that require exact
execution and (re-)enabling WQM where required.
This pass is run before register coalescing so that we can use
machine SSA for analysis.
The changes in this patch expose a problem with the second machine
scheduling pass: target independent instructions like COPY implicitly
use EXEC when they operate on VGPRs, but this fact is not encoded in
the MIR. This can lead to miscompilation because instructions are
moved past changes to EXEC.
This patch fixes the problem by adding use-implicit operands to
target independent instructions. Some general codegen passes are
relaxed to work with such implicit use operands.
Reviewers: arsenm, tstellarAMD, mareko
Subscribers: MatzeB, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18162
llvm-svn: 263982
|
|
|
|
| |
llvm-svn: 263981
|
|
|
|
| |
llvm-svn: 263980
|
|
|
|
|
|
|
| |
- R10 and R11 are not reserved registers.
- Check for reserved registers when finding unused caller-saved registers.
llvm-svn: 263977
|
|
|
|
| |
llvm-svn: 263976
|
|
|
|
|
|
|
| |
This changes the debug output, but still retains its usefulness.
Differential Revision: http://reviews.llvm.org/D18324
llvm-svn: 263975
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When control flow is implemented using the exec mask, the compiler will
insert branch instructions to skip over the masked section when exec is
zero if the section contains more than a certain number of instructions.
The previous code would only count instructions in successor blocks,
and this patch modifies the code to start counting instructions in all
blocks between the start and end of the branch.
Reviewers: nhaehnle, arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18282
llvm-svn: 263969
|
|
|
|
| |
llvm-svn: 263965
|
|
|
|
|
|
|
| |
These instructions do not have the same negative base
address problem that DS instructions do on SI.
llvm-svn: 263964
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This introduces a custom lowering for ISD::SETCCE (introduced in r253572)
that allows us to emit a short code sequence for 64-bit compares.
Before:
push {r7, lr}
cmp r0, r2
mov.w r0, #0
mov.w r12, #0
it hs
movhs r0, #1
cmp r1, r3
it ge
movge.w r12, #1
it eq
moveq r12, r0
cmp.w r12, #0
bne .LBB1_2
@ BB#1: @ %bb1
bl f
pop {r7, pc}
.LBB1_2: @ %bb2
bl g
pop {r7, pc}
After:
push {r7, lr}
subs r0, r0, r2
sbcs.w r0, r1, r3
bge .LBB1_2
@ BB#1: @ %bb1
bl f
pop {r7, pc}
.LBB1_2: @ %bb2
bl g
pop {r7, pc}
Saves around 80KB in Chromium's libchrome.so.
Some notes on this patch:
- I don't much like the ARMISD::BRCOND and ARMISD::CMOV combines I
introduced (nothing else needs them). However, they are necessary in
order to avoid poor codegen, and they seem similar to existing combines
in other backends (e.g. X86 combines (brcond (cmp (setcc Compare))) to
(brcond Compare)).
- No support for Thumb-1. This is in principle possible, but we'd need
to implement ARMISD::SUBE for Thumb-1.
Differential Revision: http://reviews.llvm.org/D15256
llvm-svn: 263962
|
|
|
|
|
|
|
|
| |
Adding Cortex-A32 as an available target in the ARM backend.
Patch by Sam Parker.
llvm-svn: 263956
|
|
|
|
| |
llvm-svn: 263950
|
|
|
|
| |
llvm-svn: 263948
|
|
|
|
| |
llvm-svn: 263945
|
|
|
|
| |
llvm-svn: 263942
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
replaceCongruentIVs can break LCSSA when trying to replace IV increments
since it tries to replace all uses of a phi node with another phi node
while both of the phi nodes are not necessarily in the processed loop.
This will cause an assert in IndVars.
To fix this, we add a check to make sure that the replacement maintains
LCSSA.
Reviewers: sanjoy
Subscribers: mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D18266
llvm-svn: 263941
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
while processing AND SDNodes
Summary:
extract_vector_elt can cause an implicit any_ext if the types don't
match. When processing the following pattern:
(and (extract_vector_elt (load ([non_ext|any_ext|zero_ext] V))), c)
DAGCombine was ignoring the possible extend, and sometimes removing
the AND even though it was required to maintain some of the bits
in the result to 0, resulting in a miscompile.
This change fixes the issue by limiting the transformation only to
cases where the extract_vector_elt doesn't perform the implicit
extend.
Reviewers: t.p.northover, jmolloy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18247
llvm-svn: 263935
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The old address space inference pass (NVPTXFavorNonGenericAddrSpaces) is unable
to convert the address space of a pointer induction variable. This patch adds a
new pass called NVPTXInferAddressSpaces that overcomes that limitation using a
fixed-point data-flow analysis (see the file header comments for details).
The new pass is experimental and not enabled by default. Users can turn
it on by setting the -nvptx-use-infer-addrspace flag of llc.
Reviewers: jholewinski, tra, jlebar
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D17965
llvm-svn: 263916
|
|
|
|
|
|
|
|
| |
computeZeroableShuffleElements
Based on feedback for D14261
llvm-svn: 263911
|
|
|
|
|
|
|
|
| |
Improve computeZeroableShuffleElements to be able to peek through bitcasts to extract zero/undef values from BUILD_VECTOR nodes of different element sizes to the shuffle mask.
Differential Revision: http://reviews.llvm.org/D14261
llvm-svn: 263906
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D18211
llvm-svn: 263898
|
|
|
|
| |
llvm-svn: 263892
|
|
|
|
| |
llvm-svn: 263889
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Also expose getters and setters in the C API, so that the change can be tested.
Reviewers: nhaehnle, axw, joker.eph
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18260
From: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
llvm-svn: 263886
|
|
|
|
|
|
| |
Convert a loop to use a range based style loop. NFC.
llvm-svn: 263884
|
|
|
|
|
|
|
|
|
|
| |
The sinpi/cospi can be replaced with sincospi to remove unnecessary
computations. However, we need to make sure that the calls are within
the same function!
This fixes PR26993.
llvm-svn: 263875
|
|
|
|
|
|
|
|
|
| |
CatchSwitches are not splittable, we cannot insert casts, etc. before
them.
This fixes PR26992.
llvm-svn: 263874
|
|
|
|
|
|
|
| |
Following r263866, on D. Blaikie suggestion.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263869
|
|
|
|
|
|
|
|
|
|
|
| |
MDString are uniqued in the Context on creation, hashing the
pointer is less expensive than hashing the String itself.
Reviewers: dexonsmith
Differential Revision: http://reviews.llvm.org/D16560
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263867
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch changes the computation of the hash key for DISubprogram to
be computed on a small subset of the fields. The hash is computed a
lot faster, but there might be more collision in the table.
However by carefully selecting the fields, colisions should be rare.
Using `opt` to load the IR for FastISelEmitter.cpp.o, with this patch:
- DISubprogram::getImpl() goes from 28ms to 15ms.
- DICompositeType::getImpl() goes from 6ms to 2ms
- DIDerivedType::getImpl() goes from 18 to 12ms
Reviewers: dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16571
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263866
|