| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Several fixes to the handling of bitcode files without function summary
sections so that they are skipped during ThinLTO processing in llvm-lto
and the gold plugin when appropriate instead of aborting.
1 Don't assert when trying to add a FunctionInfo that doesn't have
a summary attached.
2 Skip FunctionInfo structures that don't have attached function summary
sections when trying to create the combined function summary.
3 In both llvm-lto and gold-plugin, check whether a bitcode file has
a function summary section before trying to parse the index, and skip
the bitcode file if it does not.
4 Fix hasFunctionSummaryInMemBuffer in BitcodeReader, which had a bug
where we returned to early while looking for the summary section.
Also added llvm-lto and gold-plugin based tests for cases where we
don't have function summaries in the bitcode file. I verified that
either the first couple fixes described above are enough to avoid the
crashes, or fixes 1,3,4. But have combined them all here for added
robustness.
Reviewers: joker.eph
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D14903
llvm-svn: 253796
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
additional offset.
MachineInstrBuilder::addDisp can already add an immediate or global address MO with an adjusted offset, this patch adds support for constant pool indices as well.
All remaining MO types still assert - there are a number of other types that could support adjusted offsets but I have no test cases at this time.
Required to fix a regression in D13988 found by Mikael Holmén during stress testing (test case attached).
Differential Revision: http://reviews.llvm.org/D14867
llvm-svn: 253795
|
| |
|
|
|
|
| |
cases exist
llvm-svn: 253784
|
| |
|
|
| |
llvm-svn: 253783
|
| |
|
|
|
|
| |
Tidied up triple and regenerate tests using update_llc_test_checks.py
llvm-svn: 253782
|
| |
|
|
|
|
| |
Tidied up triple and regenerate tests using update_llc_test_checks.py
llvm-svn: 253781
|
| |
|
|
|
|
| |
Tidied up triple and regenerate tests using update_llc_test_checks.py
llvm-svn: 253780
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
When MergeConsecutiveStores() combines two loads and two stores into
wider loads and stores, the chain users of both of the original loads
must be transfered to the new load, because it may be that a chain
user only depends on one of the loads.
New test case: test/CodeGen/SystemZ/dag-combine-01.ll
Reviewed by James Y Knight.
Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=25310#c6
llvm-svn: 253779
|
| |
|
|
|
|
| |
Tidied up triple and regenerate tests using update_llc_test_checks.py
llvm-svn: 253778
|
| |
|
|
| |
llvm-svn: 253777
|
| |
|
|
|
|
|
|
|
|
|
|
| |
It turns out we have a number of places that just grab the first type attached to a register class for various reasons. This is fine unless for some reason that type isn't legal on the current target, such as for SSE1 which doesn't support v16i8/v8i16/v4i32/v2i64 - all of which were included before 4f32 in the class.
Given that this is such a rare situation I've just re-ordered the types and placed the float types first.
Fix for PR16133
Differential Revision: http://reviews.llvm.org/D14787
llvm-svn: 253773
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add a -preserve-modules option to llvm-link that simulates LTO
clients that don't destroy modules as they are linked. This enables
reproduction of a recent bug introduced by a metadata linking change
that was only caught when the modules weren't destroyed before
writing bitcode (LTO on Windows).
See http://llvm.org/viewvc/llvm-project?view=revision&revision=253170
for more details on the original bug and the fix.
Confirmed the new test added here reproduces the failure using the new
option when I suppress the fix.
Reviewers: pcc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14818
llvm-svn: 253740
|
| |
|
|
| |
llvm-svn: 253730
|
| |
|
|
|
|
| |
Not all zero vectors are ConstantDataVector's.
llvm-svn: 253723
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add and instructions immediately after loads that only have their low
bits used, assuming that the (and (load x) c) will be matched as a
extload and the ands/truncs fed by the extload will be removed by isel.
Reviewers: mcrosier, qcolombet, ab
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14584
llvm-svn: 253722
|
| |
|
|
|
|
|
|
| |
The included test only checks for a compiler crash for now. Several people are
facing this issue, so we first resolve the crash, and will increase shrinkwrap's
coverage later in a follow-up patch.
llvm-svn: 253718
|
| |
|
|
| |
llvm-svn: 253717
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
coverage.
If a function was originally inlined but not actually hot at runtime,
its samples will not be counted inside the parent function. This throws
off the coverage calculation because it expects to find more used
records than it should.
Fixed by ignoring functions that will not be inlined into the parent.
Currently, this is inlined functions with 0 samples. In subsequent
patches, I'll change this to mean "cold" functions.
llvm-svn: 253716
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This change merges adjacent zero stores into a wider single store.
For example :
strh wzr, [x0]
strh wzr, [x0, #2]
becomes
str wzr, [x0]
This will fix PR25410.
llvm-svn: 253711
|
| |
|
|
|
|
|
|
|
|
| |
incorrect, as the chosen representative of the weak symbol may not live
with the code in question. Always indirect the access through the TOC
instead.
Patch by Kyle Butt!
llvm-svn: 253708
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Several (but not all) of the labels that are checked for in this test case
are checked as strings instead of labels. This can cause an apparent test
case failure if they are tested in an appropriately named directory.
For example, one of them that fails:
define zeroext i32 @test2(i32 %A.u, i32 %B.u) {
; A8: test2
; A8: uxtab r0, r0, r1
Output that causes it to fail:
. . .
.file "/home/seurer/llvm/llvm-test2/test/CodeGen/Thumb2/thumb2-uxt_rot.ll"
. . .
.globl test2
.align 1
.type test2,%function
.code 16 @ @test2
.thumb_func
test2:
.fnstart
The "A8: test2" matches on the directory name instead of the label.
llvm-svn: 253702
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This follows D14577 to treat ARMv6-J as an alias for ARMv6,
instead of an architecture in its own right.
The functional change is that the default CPU when targeting ARMv6-J
changes from arm1136j-s to arm1136jf-s, which is currently used as
the default CPU for ARMv6; both are, in fact, ARMv6-J CPUs.
The J-bit (Jazelle support) is irrelevant to LLVM, and it doesn't
affect code generation, attributes, optimizations, or anything else,
apart from selecting the default CPU.
Reviewers: rengolin, logan, compnerd
Subscribers: aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D14755
llvm-svn: 253675
|
| |
|
|
|
|
|
|
|
| |
While debugging some sampling coverage problems, I found this useful:
When applying samples from a profile, it helps to also know what line
offset and discriminator the sample belongs to. This makes it easy to
correlate against the input profile.
llvm-svn: 253670
|
| |
|
|
|
|
|
|
| |
Terrifyingly, one of them is a mishandling of floating point vectors
in Constant::isZero(). How exactly this issue survived this long
is beyond me.
llvm-svn: 253655
|
| |
|
|
|
|
|
|
| |
MULEU_S.PH.QBL, MULEU_S.PH.QBR, MULQ_RS.PH, MULQ_RS.W, MULQ_S.PH and MULQ_S.W instructions
Differential Revision: http://reviews.llvm.org/D14280
llvm-svn: 253651
|
| |
|
|
|
|
| |
the spec.
llvm-svn: 253642
|
| |
|
|
|
|
|
|
|
|
|
| |
The nuw constraint will not be satisfied unless <expr> == 0.
This bug has been around since r102234 (in 2010!), but was uncovered by
r251052, which introduced more aggressive optimization of nuw scev expressions.
Differential Revision: http://reviews.llvm.org/D14850
llvm-svn: 253627
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This introduces two new options:
- "llvm-lto -save-merged-module -o outfile" dumps the LTO Module to
outfile.merged.bc prior to CodeGen and after LTO optimizations have been run.
- "llvm-lto -filetype=asm -o outfile" makes llvm-lto emit assembly instead of
object code in outfile.
Both are intended for use in lit tests.
llvm-svn: 253624
|
| |
|
|
|
|
|
|
|
|
| |
Now that the register allocator knows about the barriers on funclet
entry and exit, testing has shown that this is unnecessary.
We still demote PHIs on unsplittable blocks due to the differences
between the IR CFG and the Machine CFG.
llvm-svn: 253619
|
| |
|
|
|
|
|
|
| |
start index.
Found during stress testing.
llvm-svn: 253611
|
| |
|
|
| |
llvm-svn: 253609
|
| |
|
|
| |
llvm-svn: 253602
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: The new algorithm is more efficient (O(n), n is number of basic blocks). And it is guaranteed to cover all cases of multiple BB mapped to same line.
Reviewers: dblaikie, davidxl, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14738
llvm-svn: 253594
|
| |
|
|
|
|
| |
We currently bail out of global localization if the global has non-instruction users. However, often these can be simple bitcasts or constant-GEPs, which we can easily turn into instructions before localizing. Be a bit more aggressive.
llvm-svn: 253584
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change extends r251438 to handle more narrow load promotions
including byte type, unscaled, and signed. For example, this change will
convert :
ldursh w1, [x0, #-2]
ldurh w2, [x0, #-4]
into
ldur w2, [x0, #-4]
asr w1, w2, #16
and w2, w2, #0xffff
llvm-svn: 253577
|
| |
|
|
| |
llvm-svn: 253574
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is another step towards allowing SimplifyCFG to speculate harder, but then have
CGP clean things up if the target doesn't like it.
Previous patches in this series:
http://reviews.llvm.org/D12882
http://reviews.llvm.org/D13297
D13297 should catch most expensive ops, but speculation of cttz/ctlz requires special
handling because of weirdness in the intrinsic definition for handling a zero input
(that definition can probably be blamed on x86).
For example, if we have the usual speculated-by-select expensive op pattern like this:
%tobool = icmp eq i64 %A, 0
%0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true) ; is_zero_undef == true
%cond = select i1 %tobool, i64 64, i64 %0
ret i64 %cond
There's an instcombine that will turn it into:
%0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 false) ; is_zero_undef == false
This CGP patch is looking for that case and despeculating it back into:
entry:
%tobool = icmp eq i64 %A, 0
br i1 %tobool, label %cond.end, label %cond.true
cond.true:
%0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true) ; is_zero_undef == true
br label %cond.end
cond.end:
%cond = phi i64 [ %0, %cond.true ], [ 64, %entry ]
ret i64 %cond
This unfortunately may lead to poorer codegen (see the changes in the existing x86 test),
but if we increase speculation in SimplifyCFG (the next step in this patch series), then
we should avoid those kinds of cases in the first place.
The need for this patch was originally mentioned here:
http://reviews.llvm.org/D7506
with follow-up here:
http://reviews.llvm.org/D7554
Differential Revision: http://reviews.llvm.org/D14630
llvm-svn: 253573
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In particular, this makes the code for 64-bit compares on 32-bit targets
much more efficient.
Example:
define i32 @test_slt(i64 %a, i64 %b) {
entry:
%cmp = icmp slt i64 %a, %b
br i1 %cmp, label %bb1, label %bb2
bb1:
ret i32 1
bb2:
ret i32 2
}
Before this patch:
test_slt:
movl 4(%esp), %eax
movl 8(%esp), %ecx
cmpl 12(%esp), %eax
setae %al
cmpl 16(%esp), %ecx
setge %cl
je .LBB2_2
movb %cl, %al
.LBB2_2:
testb %al, %al
jne .LBB2_4
movl $1, %eax
retl
.LBB2_4:
movl $2, %eax
retl
After this patch:
test_slt:
movl 4(%esp), %eax
movl 8(%esp), %ecx
cmpl 12(%esp), %eax
sbbl 16(%esp), %ecx
jge .LBB1_2
movl $1, %eax
retl
.LBB1_2:
movl $2, %eax
retl
Differential Revision: http://reviews.llvm.org/D14496
llvm-svn: 253572
|
| |
|
|
|
|
|
|
| |
When dumping function samples or writing them out as text format, it
helps if the samples are emitted sorted by source location. The sorting
of the maps is a bit slow, so we only do it on demand.
llvm-svn: 253568
|
| |
|
|
|
| |
Author: obucina
llvm-svn: 253567
|
| |
|
|
|
|
|
|
|
| |
Copying one mask register to another under BW should be done with kmovq instruction, otherwise we can loose some bits.
Copying 8 bits under DQ may be done with kmovb.
Differential Revision: http://reviews.llvm.org/D14812
llvm-svn: 253563
|
| |
|
|
| |
llvm-svn: 253562
|
| |
|
|
|
|
|
|
|
|
| |
The lowering patterns for X86ISD::VZEXT_MOVL for 128-bit to 256-bit vectors were just copying the lower xmm instead of actually masking off the first scalar using a blend.
Fix for PR25320.
Differential Revision: http://reviews.llvm.org/D14151
llvm-svn: 253561
|
| |
|
|
|
|
|
| |
Make X86AsmBackend generate smarter nops instead of a bunch of 0x90 for code alignment for CPUs which don't support long nop instructions.
Differential Revision: http://reviews.llvm.org/D14178
llvm-svn: 253557
|
| |
|
|
|
|
|
|
|
|
|
|
| |
command line
This provides a way to force a function to have certain attributes from the command line. This can be useful when debugging or doing workload exploration, where manually editing IR is tedious or not possible (due to build systems etc).
The syntax is -force-attribute=function_name:attribute_name
All function attributes are parsed except alignstack as it requires an argument.
llvm-svn: 253550
|
| |
|
|
|
|
|
|
| |
instructions.
Differential Revision: http://reviews.llvm.org/D14702
llvm-svn: 253548
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D14771
llvm-svn: 253547
|
| |
|
|
|
|
|
|
| |
vmovapd.s, vmovaps.s, vmovdqa32.s, vmovdqa64.s, vmovdqu16.s, vmovdqu32.s, vmovdqu64.s, vmovdqu8.s, vmovupd.s, vmovups.s
Differential Revision: http://reviews.llvm.org/D14768
llvm-svn: 253546
|
| |
|
|
|
|
|
|
|
|
| |
The masked intrinsics support all integer and floating point data types. I added the pointer type to this list.
Added tests for CodeGen and for Loop Vectorizer.
Updated the Language Reference.
Differential Revision: http://reviews.llvm.org/D14150
llvm-svn: 253544
|
| |
|
|
|
|
|
|
|
|
| |
This reverts commit r253511.
This likely broke the bots in
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787
llvm-svn: 253543
|