| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
| |
available. This gives the register allocator more registers to work with.
llvm-svn: 290049
|
| |
|
|
|
|
| |
encoded logic instructions to get more registers to use.
llvm-svn: 290048
|
| |
|
|
|
|
|
|
| |
It is still breaking Chrome. http://llvm.org/PR31361
This reverts commit r290026.
llvm-svn: 290047
|
| |
|
|
|
|
| |
See also test/CodeGen/MIR/README
llvm-svn: 290032
|
| |
|
|
|
|
|
|
| |
This reverts r289696, which caused TSan perf regression.
See PR31382.
llvm-svn: 290030
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Re-apply r288561: Liveness tracking should be correct now after r290014.
Previously this pass was using up to 5% compile time in some cases which
is a bit much for what it is doing. The pass featured a full blown
data-flow analysis which in the default configuration was restricted to a
single block.
This rewrites the pass under the assumption that we only ever work on a
single block. This is done in a single pass maintaining a state machine
per general purpose register to catch LOH patterns.
Differential Revision: https://reviews.llvm.org/D27329
llvm-svn: 290026
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D27863
llvm-svn: 290017
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D27559
llvm-svn: 290014
|
| |
|
|
|
|
| |
Feedback on r289468 from Adrian Prantl.
llvm-svn: 290012
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Mach-O command line flag like "-arch armv7m" does not match the
arch name part of its llvm Triple which is "thumbv7m-apple-darwin”.
I think the best way to fix this is to have
llvm::object::MachOObjectFile::getArchTriple() optionally return the
name of the Mach-O arch flag that would be used with -arch that
matches the CPUType and CPUSubType. Then change
llvm::object::MachOUniversalBinary::ObjectForArch::getArchTypeName()
to use that and change it to getArchFlagName() as the type name is
really part of the Triple and the -arch flag name is a Mach-O thing
for a specific Triple with a specific Mcpu value.
rdar://29663637
llvm-svn: 290001
|
| |
|
|
|
|
|
| |
The original patch was broken due to some undefined behavior
as well as warnings that were triggering -Werror.
llvm-svn: 290000
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When reading the metadata bitcode, create a type declaration when
possible for composite types when we are importing. Doing this in
the bitcode reader saves memory. Also it works naturally in the case
when the type ODR map contains a definition for the same composite type
because it was used in the importing module (buildODRType will
automatically use the existing definition and not create a type
declaration).
For Chromium built with -g2, this reduces the aggregate size of the
generated native object files by 66% (from 31G to 10G). It reduced
the time through the ThinLTO link and backend phases by about 20% on
my machine.
Reviewers: mehdi_amini, dblaikie, aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D27775
llvm-svn: 289993
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D27830
llvm-svn: 289992
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is recommit of r287553 after fixing the invalid loop info after eliminating an empty block and unit test failures in AVR and WebAssembly :
Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targetting.
Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl
Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D22696
llvm-svn: 289988
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 289920 (again).
I forgot to implement a Bitcode upgrade for the case where a DIGlobalVariable
has not DIExpression. Unfortunately it is not possible to safely upgrade
these variables without adding a flag to the bitcode record indicating which
version they are.
My plan of record is to roll the planned follow-up patch that adds a
unit: field to DIGlobalVariable into this patch before recomitting.
This way we only need one Bitcode upgrade for both changes (with a
version flag in the bitcode record to safely distinguish the record
formats).
Sorry for the churn!
llvm-svn: 289982
|
| |
|
|
|
|
|
| |
This reverts commit r289978, which is failing due to some rebase/merge
issues.
llvm-svn: 289981
|
| |
|
|
|
|
|
|
|
| |
This is the 3rd of 3 patches to get reading and writing of
CodeView symbol and type records to use a single codepath.
Differential Revision: https://reviews.llvm.org/D26427
llvm-svn: 289978
|
| |
|
|
|
|
|
|
| |
This ensures backward compatibility on bitcode loading.
Differential Revision: https://reviews.llvm.org/D27839
llvm-svn: 289977
|
| |
|
|
|
|
|
|
| |
This patch reapplies r289863. The original patch was reverted because it
exposed a bug causing the loop vectorizer to crash in the Python runtime on
PPC. The underlying issue was fixed with r289958.
llvm-svn: 289975
|
| |
|
|
|
|
|
|
|
|
|
|
| |
`dropUnknownNonDebugMetadata` takes a list of "known" metadata IDs. The
only reason it worked at all is that `getMetadataID` returns something
unrelated -- it returns the subclass ID of the receiver (which is used
in `dyn_cast` etc.). That does not numerically match
`LLVMContext::MD_invariant_group` and ends up dropping `invariant_group`
along with every other metadata that does not numerically match
`LLVMContext::MD_invariant_group`.
llvm-svn: 289973
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, there are substantial problems forming vld1_dup even if the
VDUP survives legalization. The lack of an actual node
leads to terrible results: not only can we not form post-increment vld1_dup
instructions, but we form scalar pre-increment and post-increment
loads which force the loaded value into a GPR. This patch fixes that
by combining the vdup+load into an ARMISD node before DAGCombine
messes it up.
Also includes a crash fix for vld2_dup (see testcase @vld2dupi8_postinc_variable).
Recommiting with fix to avoid forming vld1dup if the type of the load
doesn't match the type of the vdup (see
https://llvm.org/bugs/show_bug.cgi?id=31404).
Differential Revision: https://reviews.llvm.org/D27694
llvm-svn: 289972
|
| |
|
|
| |
llvm-svn: 289967
|
| |
|
|
|
|
|
|
|
| |
Reverting because this breaks lld's gdb_index support - it's probably
double counting the abbrev relocation offset.
This reverts commit r289954.
llvm-svn: 289961
|
| |
|
|
|
|
| |
This reverts commit r289951.
llvm-svn: 289960
|
| |
|
|
| |
llvm-svn: 289959
|
| |
|
|
|
|
|
|
|
|
| |
After r288909, instructions feeding predicated instructions may be scalarized
if profitable. Since these instructions will remain scalar, we shouldn't
attempt to type-shrink them. We should only truncate vector types to their
minimal bit widths. This bug was exposed by enabling the vectorization of loops
containing conditional stores by default.
llvm-svn: 289958
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: ThinLTO needs to invoke SampleProfileLoader pass during link time in order to annotate profile correctly after module importing.
Reviewers: davidxl, mehdi_amini, tejohnson
Subscribers: pcc, davide, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D27790
llvm-svn: 289957
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-C), COND) (PR31367)
atomic_load_add returns the value before addition, but sets EFLAGS based on the
result of the addition. That means it's setting the flags based on effectively
subtracting C from the value at x, which is also what the outer cmp does.
This targets a pattern that occurs frequently with reference counting pointers:
void decrement(long volatile *ptr) {
if (_InterlockedDecrement(ptr) == 0)
release();
}
Clang would previously compile it (for 32-bit at -Os) as:
00000000 <?decrement@@YAXPCJ@Z>:
0: 8b 44 24 04 mov 0x4(%esp),%eax
4: 31 c9 xor %ecx,%ecx
6: 49 dec %ecx
7: f0 0f c1 08 lock xadd %ecx,(%eax)
b: 83 f9 01 cmp $0x1,%ecx
e: 0f 84 00 00 00 00 je 14 <?decrement@@YAXPCJ@Z+0x14>
14: c3 ret
and with this patch it becomes:
00000000 <?decrement@@YAXPCJ@Z>:
0: 8b 44 24 04 mov 0x4(%esp),%eax
4: f0 ff 08 lock decl (%eax)
7: 0f 84 00 00 00 00 je d <?decrement@@YAXPCJ@Z+0xd>
d: c3 ret
(Equivalent variants with _InterlockedExchangeAdd, std::atomic<>'s fetch_add
or pre-decrement operator generate the same code.)
Differential Revision: https://reviews.llvm.org/D27781
llvm-svn: 289955
|
| |
|
|
|
|
|
|
|
|
| |
Input can be produced by ld -r, for example (a normal LLVM workflow
never hits this - LLVM only ever produces a single abbrev table in an
object (shared by multiple CUs), so the reloc's always 0, and when it's
linked together the relocation's resolved so it doesn't need to be
handled)
llvm-svn: 289954
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is recommit of r287553 after fixing the invalid loop info after eliminating an empty block:
Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targetting.
Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl
Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D22696
llvm-svn: 289951
|
| |
|
|
|
|
|
|
|
|
|
|
| |
instructions
This is the 512-bit counterpart to the 128-bit transform checked in here:
https://reviews.llvm.org/rL289837
This patch is based on the draft by @sroland (Roland Scheidegger) that is attached to PR27885:
https://llvm.org/bugs/show_bug.cgi?id=27885
llvm-svn: 289946
|
| |
|
|
|
|
| |
v16i32 to VSHUFPS (PR27885)
llvm-svn: 289945
|
| |
|
|
| |
llvm-svn: 289944
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Add the minimal support necessary to select a function that returns the sum of
two i32 values.
This includes some support for argument/return lowering of i32 values through
registers, as well as the handling of copy and add instructions throughout the
GlobalISel pipeline.
Differential Revision: https://reviews.llvm.org/D26677
llvm-svn: 289940
|
| |
|
|
|
|
| |
We already do the same thing in shuffle lowering; but don't do it if we have SSE41 (PBLEND) instead.
llvm-svn: 289937
|
| |
|
|
| |
llvm-svn: 289936
|
| |
|
|
|
|
|
|
|
|
| |
stores by default
This uncovers a crasher in the loop vectorizer on PPC when building the
Python runtime. I'll send the testcase to the review thread for the
original commit.
llvm-svn: 289934
|
| |
|
|
| |
llvm-svn: 289931
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
tail-call eligibility)
This patch appears to result in trampolines in vtables being miscompiled
when they in turn tail call a method.
I've posted some preliminary details about the failure on the thread for
this commit and talked to Hal. He was comfortable going ahead and
reverting until we sort out what is wrong.
llvm-svn: 289928
|
| |
|
|
|
|
|
|
|
|
|
|
| |
One more attempt to re-commit the patch r285355, which I had to revert in r285362, because some tests were failing (the reason is because the size of the line_table varied depending on the full file name).
In the past the compiler always emitted .debug_line version 2, though some opcodes from DWARF 3 (e.g. DW_LNS_set_prologue_end, DW_LNS_set_epilogue_begin or DW_LNS_set_isa) and from DWARF 4 could be emitted by the compiler.
This patch changes version information of .debug_line to exactly match the DWARF version. For .debug_line version 4, a new field maximum_operations_per_instruction is emitted.
Differential Revision: https://reviews.llvm.org/D16697
llvm-svn: 289925
|
| |
|
|
| |
llvm-svn: 289923
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements PR31013 by introducing a
DIGlobalVariableExpression that holds a pair of DIGlobalVariable and
DIExpression.
Currently, DIGlobalVariables holds a DIExpression. This is not the
best way to model this:
(1) The DIGlobalVariable should describe the source level variable,
not how to get to its location.
(2) It makes it unsafe/hard to update the expressions when we call
replaceExpression on the DIGLobalVariable.
(3) It makes it impossible to represent a global variable that is in
more than one location (e.g., a variable with multiple
DW_OP_LLVM_fragment-s). We also moved away from attaching the
DIExpression to DILocalVariable for the same reasons.
This reapplies r289902 with additional testcase upgrades.
<rdar://problem/29250149>
https://llvm.org/bugs/show_bug.cgi?id=31013
Differential Revision: https://reviews.llvm.org/D26769
llvm-svn: 289920
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
idiom.
r289538: Match load by bytes idiom and fold it into a single load
r289540: Fix a buildbot failure introduced by r289538
r289545: Use more detailed assertion messages in the code ...
r289646: Add a couple of assertions to the load combine code ...
This DAG combine has a bad crash in it that is quite hard to trigger
sadly -- it relies on sneaking code with UB through the SDAG build and
into this particular combine. I've responded to the original commit with
a test case that reproduces it.
However, the code also has other problems that will require substantial
changes to address and so I'm going ahead and reverting it for now. This
should unblock us and perhaps others that are hitting the crash in the
wild and will let a fresh patch with updated approach come in cleanly
afterward.
Sorry for any trouble or disruption!
llvm-svn: 289916
|
| |
|
|
|
|
| |
This reverts commit 289902 while investigating bot berakage.
llvm-svn: 289906
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements PR31013 by introducing a
DIGlobalVariableExpression that holds a pair of DIGlobalVariable and
DIExpression.
Currently, DIGlobalVariables holds a DIExpression. This is not the
best way to model this:
(1) The DIGlobalVariable should describe the source level variable,
not how to get to its location.
(2) It makes it unsafe/hard to update the expressions when we call
replaceExpression on the DIGLobalVariable.
(3) It makes it impossible to represent a global variable that is in
more than one location (e.g., a variable with multiple
DW_OP_LLVM_fragment-s). We also moved away from attaching the
DIExpression to DILocalVariable for the same reasons.
<rdar://problem/29250149>
https://llvm.org/bugs/show_bug.cgi?id=31013
Differential Revision: https://reviews.llvm.org/D26769
llvm-svn: 289902
|
| |
|
|
|
|
|
|
|
| |
Removing sensitivity to scheduling (by using CHECK-DAG instead of CHECK) and
some other minor corrections.
In preparation to commit Power9 processor model.
llvm-svn: 289900
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass prepares a module containing type metadata for ThinLTO by splitting
it into regular and thin LTO parts if possible, and writing both parts to
a multi-module bitcode file. Modules that do not contain type metadata are
written unmodified as a single module.
All globals with type metadata are added to the regular LTO module, and
the rest are added to the thin LTO module.
Differential Revision: https://reviews.llvm.org/D27324
llvm-svn: 289899
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We were reinvoking exportGlobalInModule numerous times redundantly.
No need to re-export globals referenced by a global that was already
imported from its module. This resulted in a large speedup in the thin
link for a big application, particularly when importing aggressiveness
was cranked up.
Reviewers: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D27687
llvm-svn: 289896
|
| |
|
|
| |
llvm-svn: 289895
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D14590
llvm-svn: 289894
|