| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
turned into a branch
This is part of solving PR27344:
https://llvm.org/bugs/show_bug.cgi?id=27344
CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this
same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the
IR further with a select in place.
For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly
predictable branch. Even the most limited HW branch predictors will be correct on this branch almost
all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all
the times the branch was predicted correctly.
As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's
MispredictPenalty. Or we could just let targets override the default by implementing the hook with that
and other target-specific options. Note that trying to statically determine mispredict rates for
close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. Ie,
50/50 taken/not-taken might still be 100% predictable.
Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable()
branch weight default values are 4 and 64. A proposal to change that is in D19435.
Differential Revision: http://reviews.llvm.org/D19488
llvm-svn: 267572
|
| |
|
|
| |
llvm-svn: 267571
|
| |
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D19445
Reviewed By: David Majnemer
llvm-svn: 267564
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D19235
llvm-svn: 267563
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gcc, \ escapes every character in response files. It is true that this makes
it harder to mention Windows files in rsp files, but not doing this means clang
disagrees with gcc, and also disagrees with the shell (on non-Windows) which
rsp file quoting is supposed to match. clang isn't free to choose what to do
here.
In general, the idea for response files is to take bits of your command line
and write them to a file unchanged, and have things work the same way. Since
the command line would've been interpreted by the shell, things in the rsp file
need to be subject to the same shell quoting rules.
People who want to put Windows-style paths in their response files either need
to do any of:
* escape their backslashes
* or use clang-cl which uses cl.exe/cmd.exe quoting rules
* pass --rsp-quoting=windows to clang to tell it to use
cl.exe/cmd.exe quoting rules for response files.
Fixes PR27464.
http://reviews.llvm.org/D19417
llvm-svn: 267556
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support for SDWA instructions for VOP1 and VOP2 encoding.
Not done yet:
- converters for support optional operands and modifiers
- VOPC
- sext() modifier
- intrinsics
- VOP2b (see vop_dpp.s)
- V_MAC_F32 (see vop_dpp.s)
Differential Revision: http://reviews.llvm.org/D19360
llvm-svn: 267553
|
| |
|
|
|
|
|
|
| |
Handle MachineBasicBlock as a memory displacement operand in the LEA optimization pass.
Differential Revision: http://reviews.llvm.org/D19409
llvm-svn: 267551
|
| |
|
|
| |
llvm-svn: 267549
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D19304
llvm-svn: 267546
|
| |
|
|
|
|
|
|
| |
This fixes PR22248 on sparc.
Differential Revision: http://reviews.llvm.org/D19386
llvm-svn: 267545
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D19387
llvm-svn: 267544
|
| |
|
|
|
|
|
|
|
| |
If the linker specifically requested for a linkonce to be preserved,
we need to make sure we won't drop it even if all the uses in the
current module disappear.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267543
|
| |
|
|
|
|
| |
This reverts commit r267488, as it broke some ARM buildbots.
llvm-svn: 267541
|
| |
|
|
|
|
|
|
|
|
|
| |
tail-call issue
print-stack-trace.cc test failure of compiler-rt has been fixed by
r266869 (http://reviews.llvm.org/D19148), so reenable sibling call
optimization on ppc64
Reviewers: nemanjai kbarton
llvm-svn: 267527
|
| |
|
|
|
|
| |
The default is legal, which results in 'Cannot select' errors.
llvm-svn: 267522
|
| |
|
|
|
|
| |
The default is Legal, which results in 'Cannot select' errors.
llvm-svn: 267521
|
| |
|
|
|
|
| |
The default is legal, which results in 'Cannot select' errors.
llvm-svn: 267520
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Instead of using maximum IR weight as the basic block weight, this patch uses the voting algorithm to find the most likely weight for the basic block. This can effectively avoid the cases when some IRs are annotated incorrectly due to code motion of the profiled binary.
This patch also updates propagate.ll unittest to include discriminator in the input file so that it is testing something meaningful.
Reviewers: davidxl, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19301
llvm-svn: 267519
|
| |
|
|
|
|
|
|
| |
When SimplifyCFG merges identical instructions from both sides of a diamond, it
can preserve !llvm.mem.parallel_loop_access (as it does with most of the other
metadata). There's no real data or control dependency change in this case.
llvm-svn: 267515
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
marked parallel loops
I really thought we were doing this already, but we were not. Given this input:
void Test(int *res, int *c, int *d, int *p) {
for (int i = 0; i < 16; i++)
res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
}
we did not vectorize the loop. Even with "assume_safety" the check that we
don't if-convert conditionally-executed loads (to protect against
data-dependent deferenceability) was not elided.
One subtlety: As implemented, it will still prefer to use a masked-load
instrinsic (given target support) over the speculated load. The choice here
seems architecture specific; the best option depends on how expensive the
masked load is compared to a regular load. Ideally, using the masked load still
reduces unnecessary memory traffic, and so should be preferred. If we'd rather
do it the other way, flipping the order of the checks is easy.
The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also
implies that if conversion is okay.
Differential Revision: http://reviews.llvm.org/D19512
llvm-svn: 267514
|
| |
|
|
| |
llvm-svn: 267511
|
| |
|
|
|
|
|
|
|
| |
We would report that the function changed despite creating no new
allocas or performing any promotion.
This fixes PR27316.
llvm-svn: 267507
|
| |
|
|
| |
llvm-svn: 267506
|
| |
|
|
|
|
|
| |
Suggested in the review of D19488:
http://reviews.llvm.org/D19488
llvm-svn: 267504
|
| |
|
|
|
|
|
|
|
|
|
| |
Summary:
We don't use MinLatency any more since r184032.
Reviewers: atrick, hfinkel, mcrosier
Differential Revision: http://reviews.llvm.org/D19474
llvm-svn: 267502
|
| |
|
|
| |
llvm-svn: 267499
|
| |
|
|
|
|
|
|
|
|
| |
Pass all of the state we need around as arguments, so that these
functions are easier to reuse. There is one part of this that is
unusual: we pass around a functor to look up a DomTree for a function.
This will be a necessary abstraction when we try to use this code in
both the legacy and the new pass manager.
llvm-svn: 267498
|
| |
|
|
|
|
|
|
|
| |
Kill-flags, which computeRegisterLiveness uses, are not reliable.
LivePhysRegs is.
Differential Revision: http://reviews.llvm.org/D19472
llvm-svn: 267495
|
| |
|
|
|
|
|
| |
While we're here, fix the comment and variable names to make it
clear that these are raw weights, not percentages.
llvm-svn: 267491
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The SparcV8 fneg and fabs instructions interestingly come only in a
single-float variant. Since the sign bit is always the topmost bit no
matter what size float it is, you simply operate on the high
subregister, as if it were a single float.
However, the layout of double-floats in the float registers is reversed
on little-endian CPUs, so that the high bits are in the second
subregister, rather than the first.
Thus, this expansion must check the endianness to use the correct
subregister.
llvm-svn: 267489
|
| |
|
|
|
|
| |
Otherwise the linker has no idea what should be resolved.
llvm-svn: 267488
|
| |
|
|
| |
llvm-svn: 267487
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D19450
llvm-svn: 267485
|
| |
|
|
| |
llvm-svn: 267484
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch is what was the "instcombine" portion of D14185, with an additional
test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll).
The patch causes instcombine to replace sequences of extractelement-insertvalue-store
that act essentially like a bitcast followed by a store.
Differential review: http://reviews.llvm.org/D14260
llvm-svn: 267482
|
| |
|
|
|
|
| |
happens [NFC]
llvm-svn: 267481
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D19449
llvm-svn: 267480
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D19394
llvm-svn: 267479
|
| |
|
|
| |
llvm-svn: 267476
|
| |
|
|
|
|
|
| |
Commit r267457 made a lot of type-substitutions threw off code formatting and
alignment. This patch should tidy those changes up.
llvm-svn: 267475
|
| |
|
|
|
|
|
| |
The linker needs to know that the symbols are thread-local to do its job
properly.
llvm-svn: 267473
|
| |
|
|
|
|
|
|
| |
Add a typedef for the std::map<GlobalValue::GUID, GlobalValueSummary *>
map that is passed around to identify summaries for values defined in a
particular module. This shortens up declarations in a variety of places.
llvm-svn: 267471
|
| |
|
|
| |
llvm-svn: 267469
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
log2(Mask) is smaller than 32, we must use the 32-bit variant because the 64-bit
variant cannot encode it. Therefore, set the subreg part accordingly.
[AArch64] Fix optimizeCondBranch logic.
The opcode for the optimized branch does not depend on the size
of the activate bits in the AND masks, but the AND opcode itself.
Indeed, we need to use a X or W variant based on the AND variant
not based on whether the mask fits into the related variant.
Otherwise, we may end up using the W variant of the optimized branch
for 64-bit register inputs!
This fixes the last make check verifier issues for AArch64: PR27479.
llvm-svn: 267465
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The expression is redundant on both side of operator |.
detected by : http://reviews.llvm.org/D19451
Reviewers: rnk, majnemer
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D19459
llvm-svn: 267458
|
| |
|
|
|
|
|
|
|
|
| |
This replaces use of std::error_code and ErrorOr in the ORC RPC support library
with Error and Expected. This required updating the OrcRemoteTarget API, Client,
and server code, as well as updating the Orc C API.
This patch also fixes several instances where Errors were dropped.
llvm-svn: 267457
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Use the operand for how long to wait. This is somewhat
distasteful, since it would be better to just emit s_nop
with the right argument in the first place. This would require
changing TII::insertNoop to emit N operands, which would be easy.
Slightly more problematic is the post-RA scheduler and hazard recognizer
represent nops as a single null node, and would require inventing
another way of representing N nops.
llvm-svn: 267456
|
| |
|
|
| |
llvm-svn: 267455
|
| |
|
|
| |
llvm-svn: 267452
|
| |
|
|
| |
llvm-svn: 267451
|