| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 261038
|
| |
|
|
| |
llvm-svn: 261037
|
| |
|
|
|
|
|
| |
move ctor
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 261036
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16940
llvm-svn: 261033
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D17307
llvm-svn: 261032
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Loading IR with debug info improves MDString::get() from 19ms to 10ms.
Reviewers: dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16597
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 261030
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
On the contrary to Full LTO, ThinLTO can afford to shift compile time
from the frontend to the linker: both phases are parallel (even if
it is not totally "free": projects like clang are reusing product
from the "compile phase" for multiple link, think about
libLLVMSupport reused for opt, llc, etc.).
This pipeline is based on the proposal in D13443 for full LTO. We
didn't move forward on this proposal because the LTO link was far too
long after that. We believe that we can afford it with ThinLTO.
The ThinLTO pipeline integrates in the regular O2/O3 flow:
- The compile phase perform the inliner with a somehow lighter
function simplification. (TODO: tune the inliner thresholds here)
This is intendend to simplify the IR and get rid of obvious things
like linkonce_odr that will be inlined.
- The link phase will run the pipeline from the start, extended with
some specific passes that leverage the augmented knowledge we have
during LTO. Especially after the inliner is done, a sequence of
globalDCE/globalOpt is performed, followed by another run of the
"function simplification" passes. It is not clear if this part
of the pipeline will stay as is, as the split model of ThinLTO
does not allow the same benefit as FullLTO without added tricks.
The measurements on the public test suite as well as on our internal
suite show an overall net improvement. The binary size for the clang
executable is reduced by 5%. We're still tuning it with the bringup
of ThinLTO and it will evolve, but this should provide a good starting
point.
Reviewers: tejohnson
Differential Revision: http://reviews.llvm.org/D17115
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 261029
|
| |
|
|
|
|
|
|
|
|
| |
"addFunctionSimplificationPasses()" (NFC)
It is intended to contains the passes run over a function after the
inliner is done with a function and before it moves to its callers.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 261028
|
| |
|
|
| |
llvm-svn: 261027
|
| |
|
|
| |
llvm-svn: 261026
|
| |
|
|
| |
llvm-svn: 261025
|
| |
|
|
|
|
| |
I suspect this is what let PR26110 lie dormant for so long.
llvm-svn: 261024
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, we sometimes miscompile this vector pattern:
(c ? -v : v)
We lower it to (because "c" is <4 x i1>, lowered as a vector mask):
(~c & v) | (c & -v)
When we have SSSE3, we incorrectly lower that to PSIGN, which does:
(c < 0 ? -v : c > 0 ? v : 0)
in other words, when c is either all-ones or all-zero:
(c ? -v : 0)
While this is an old bug, it rarely triggers because the PSIGN combine
is too sensitive to operand order. This will be improved separately.
Note that the PSIGN tests are also incorrect. Consider:
%b.lobit = ashr <4 x i32> %b, <i32 31, i32 31, i32 31, i32 31>
%sub = sub nsw <4 x i32> zeroinitializer, %a
%0 = xor <4 x i32> %b.lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
%1 = and <4 x i32> %a, %0
%2 = and <4 x i32> %b.lobit, %sub
%cond = or <4 x i32> %1, %2
ret <4 x i32> %cond
if %b is zero:
%b.lobit = <4 x i32> zeroinitializer
%sub = sub nsw <4 x i32> zeroinitializer, %a
%0 = <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
%1 = <4 x i32> %a
%2 = <4 x i32> zeroinitializer
%cond = or <4 x i32> %a, zeroinitializer
ret <4 x i32> %a
whereas we currently generate:
psignd %xmm1, %xmm0
retq
which returns 0, as %xmm1 is 0.
Instead, use a pure logic sequence, as described in:
https://graphics.stanford.edu/~seander/bithacks.html#ConditionalNegate
Fixes PR26110.
Differential Revision: http://reviews.llvm.org/D17181
llvm-svn: 261023
|
| |
|
|
|
|
|
|
|
|
| |
We're going to stop generating PSIGN, so calling a test "psign"
isn't ideal. Instead, call these tests what they really are:
variable blends using logic.
Also add a test to exhibit a case we're currently missing in
the PSIGN combine.
llvm-svn: 261022
|
| |
|
|
| |
llvm-svn: 261021
|
| |
|
|
|
|
| |
This makes it IMO more readable and reduces indentation.
llvm-svn: 261020
|
| |
|
|
|
|
|
|
| |
When emitting the source filename, the encoding of the string
was checked against the name instead of the filename.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 261019
|
| |
|
|
|
|
| |
These were fixed with r260978
llvm-svn: 261017
|
| |
|
|
|
|
|
|
|
|
| |
This apparently comes up when the register allocator decides that a
variable will become undef along a certain path.
Also improve the error message we emit when we can't map from LLVM
register number to CV register number.
llvm-svn: 261016
|
| |
|
|
|
|
|
|
|
|
| |
The register stackifier currently checks for intervening stores (and
loads that may alias them) but doesn't account for the fact that the
instruction being moved may affect intervening loads.
Differential Revision: http://reviews.llvm.org/D17298
llvm-svn: 261014
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
I thought -Xlinker -mllvm -Xlinker -stats worked at some point but maybe
it never did.
For clang, I believe that stats are printed from cc1_main. This patch
also prints them for LTO, specifically right after codegen happens.
I only looked at the C API for LTO briefly to see if this is a good
place. Probably there are still cases where this wouldn't be printed
but it seems to be working for the common case. I also experimented
putting this in the LTOCodeGenerator destructor but that didn't trigger
for me because ld64 does not destroy the LTOCodeGenerator.
Reviewers: dexonsmith, joker.eph
Subscribers: rafael, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D17302
llvm-svn: 261013
|
| |
|
|
|
|
|
| |
Eventually we should find a way to describe constant variables, but it
is not obvious how to do this at the moment.
llvm-svn: 261010
|
| |
|
|
|
|
|
|
|
|
|
|
| |
23-bit 4-byte aligned relocation to be a valid instruction encoding.
The usual way to get a 32-bit relocation is to use a constant extender which doubles the size of the instruction, 4 bytes to 8 bytes.
Another way is to put a .word32 and mix code and data within a function. The disadvantage is it's not a valid instruction encoding and jumping over it causes prefetch stalls inside the hardware.
This relocation packs a 23-bit value in to an "r0 = add(rX, #a)" instruction by overwriting the source register bits. Since r0 is the return value register, if this instruction is placed after a function call which return void, r0 will be filled with an undefined value, the prefetch won't be confused, and the callee can access the constant value by way of the link register.
llvm-svn: 261006
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change will add a pass to remove unnecessary zero copies in target blocks
of cbz/cbnz instructions. E.g., the copy instruction in the code below can be
removed because the cbz jumps to BB1 when x0 is zero :
BB0:
cbz x0, .BB1
BB1:
mov x0, xzr
Jun
Reviewers: gberry, jmolloy, HaoLiu, MatzeB, mcrosier
Subscribers: mcrosier, mssimpso, haicheng, bmakam, llvm-commits, aemerson, rengolin
Differential Revision: http://reviews.llvm.org/D16203
llvm-svn: 261004
|
| |
|
|
|
|
|
|
|
| |
Original message:
Get rid of the ifdefs in TargetLowering.
Introduce a new API used only by GlobalISel: CallLowering.
This API will contain target hooks dedicated to call lowering.
llvm-svn: 260998
|
| |
|
|
|
|
|
| |
It was already the one "destroying" the source module, now the API
reflects that.
llvm-svn: 260989
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
CopyToReg nodes don't support FrameIndex operands. Other targets select
the FI to some LEA-like instruction, but since we don't have that, we
need to insert some kind of instruction that can take an FI operand and
produces a value usable by CopyToReg (i.e. in a vreg). So insert a dummy
copy_local between Op and its FI operand. This results in a redundant
copy which we should optimize away later (maybe in the post-FI-lowering
peephole pass).
Differential Revision: http://reviews.llvm.org/D17213
llvm-svn: 260987
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This change renames output operand for VOP instructions from dst to vdst. This is needed to enable decoding named operands for disassembler.
Reviewers: vpykhtin, tstellarAMD, arsenm
Subscribers: arsenm, llvm-commits, nhaustov
Projects: #llvm-amdgpu-spb
Differential Revision: http://reviews.llvm.org/D16920
llvm-svn: 260986
|
| |
|
|
|
|
| |
The root issue appears to be a confusion around what makeNoWrapRegion actually does. It seems likely we need two versions of this function with slightly different semantics.
llvm-svn: 260981
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16877
llvm-svn: 260979
|
| |
|
|
|
|
|
|
|
|
| |
WebAssembly doesn't require full RPO; topological sorting is sufficient and
can preserve more of the MachineBlockPlacement ordering. Unfortunately, this
still depends a lot on heuristics, because while we use the
MachineBlockPlacement ordering as a guide, we can't use it in places where
it isn't topologically ordered. This area will require further attention.
llvm-svn: 260978
|
| |
|
|
|
|
| |
overflow. However, the other bitfield requires a signed value (it supports negative offsets), so it is slightly better to retain a signed 1-bit bitfield and use -1 instead of 1. Silences an MSVC warning.
llvm-svn: 260973
|
| |
|
|
|
|
|
| |
http://lab.llvm.org:8011/builders/lldb-x86-windows-msvc2015/builds/15436/steps/build/logs/stdio
http://bb.pgr.jp/builders/msbuild-llvmclang-x64-msc18-DA/builds/961/steps/build_llvm/logs/stdio
llvm-svn: 260972
|
| |
|
|
|
|
|
|
| |
This avoids some complications updating LiveIntervals to be aware of the new
register lifetimes, because we can just compute new intervals from scratch
rather than describe how the old ones have been changed.
llvm-svn: 260971
|
| |
|
|
|
|
|
|
|
|
| |
Original commit message:
[readobj] Dump DT_JMPREL relocations when outputting dynamic relocations.
The bits of r260488 it depends on have been committed.
llvm-svn: 260970
|
| |
|
|
| |
llvm-svn: 260968
|
| |
|
|
|
|
|
|
|
| |
This requires making an error message a bit more generic, but that seems
a reasonable tradeoff.
Extracted from r260488 but simplified a bit.
llvm-svn: 260967
|
| |
|
|
|
|
|
|
| |
This reduces indentation in preparation to adding a bit more code to it.
Extracted from r260488.
llvm-svn: 260963
|
| |
|
|
|
|
|
|
|
|
|
| |
Original messages:
Revert "[readobj] Handle ELF files with no section table or with no program headers."
Revert "[readobj] Dump DT_JMPREL relocations when outputting dynamic relocations."
r260489 depends on r260488 and among other issues r260488 deleted error
handling code.
llvm-svn: 260962
|
| |
|
|
|
|
|
|
|
|
| |
Add a missing check for a type of address displacement operand of the load/store instruction being a candidate for LEA substitution.
Ref: https://llvm.org/bugs/show_bug.cgi?id=26575
Differential Revision: http://reviews.llvm.org/D17261
llvm-svn: 260959
|
| |
|
|
|
|
|
|
| |
Once a pointer is turned into a reference it cannot be nullptr, clang
rightfully warns about this assert being a tautology. Put the assert
before the reference is created.
llvm-svn: 260949
|
| |
|
|
|
|
| |
C API test
llvm-svn: 260947
|
| |
|
|
| |
llvm-svn: 260943
|
| |
|
|
| |
llvm-svn: 260942
|
| |
|
|
| |
llvm-svn: 260941
|
| |
|
|
| |
llvm-svn: 260940
|
| |
|
|
| |
llvm-svn: 260939
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Extending findExistingExpansion can use existing value in ExprValueMap.
This patch gives 0.3~0.5% performance improvements on
benchmarks(test-suite, spec2000, spec2006, commercial benchmark)
Reviewers: mzolotukhin, sanjoy, zzheng
Differential Revision: http://reviews.llvm.org/D15559
llvm-svn: 260938
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This consist in variosu addition to the C API:
LLVMTargetDataRef LLVMGetModuleDataLayout(LLVMModuleRef M);
void LLVMSetModuleDataLayout(LLVMModuleRef M, LLVMTargetDataRef DL);
LLVMTargetDataRef LLVMCreateTargetMachineData(LLVMTargetMachineRef T);
Reviewers: joker.eph, Wallbraker, echristo
Subscribers: axw
Differential Revision: http://reviews.llvm.org/D17255
llvm-svn: 260936
|
| |
|
|
| |
llvm-svn: 260935
|