| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow-up from the previous discussion on the thread:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151019/307763.html
The LibLTO lto_get_error_message() API reads error messages from a std::string
sLastErrorString. Instead of passing this string around as an argument, this
patch creates a diagnostic handler and then sends this handler to the
constructor of LTOCodeGenerator.
Differential Revision: http://reviews.llvm.org/D14313
llvm-svn: 252791
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Don't fold
(zext (and (load x), cst)) -> (and (zextload x), (zext cst))
if
(and (load x) cst)
will match as a zextload already and has additional users.
For example, the following IR:
%load = load i32, i32* %ptr, align 8
%load16 = and i32 %load, 65535
%load64 = zext i32 %load16 to i64
store i32 %load16, i32* %dst1, align 4
store i64 %load64, i64* %dst2, align 8
used to produce the following aarch64 code:
ldr w8, [x0]
and w9, w8, #0xffff
and x8, x8, #0xffff
str w9, [x1]
str x8, [x2]
but with this change produces the following aarch64 code:
ldrh w8, [x0]
str w8, [x1]
str x8, [x2]
Reviewers: resistor, mcrosier
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14340
llvm-svn: 252789
|
|
|
|
| |
llvm-svn: 252786
|
|
|
|
| |
llvm-svn: 252783
|
|
|
|
| |
llvm-svn: 252782
|
|
|
|
|
|
| |
Just a tiny piece of index dumping - the header in this instance.
llvm-svn: 252781
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Other personalities don't use this special frame slot.
Reviewers: majnemer, andrew.w.kaylor, rnk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14580
llvm-svn: 252778
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds DWARF values for the Delphi language and Borland C++
language extensions.
Reviewed by: dblaikie
Subscribers: llvm-commits, majnemer
Differential Revision: http://reviews.llvm.org/D14522
llvm-svn: 252776
|
|
|
|
| |
llvm-svn: 252775
|
|
|
|
| |
llvm-svn: 252770
|
|
|
|
| |
llvm-svn: 252769
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Inlined callsites need to be emitted in debug info so that sample profile can be annotated to the correct inlined instance.
Reviewers: dnovillo, dblaikie
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D14511
llvm-svn: 252768
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The discriminators pass relied on the presence of llvm.dbg.cu to decide
whether to add discriminators, but this fails in the case where debug
info is only enabled partially when -fprofile-sample-use is active.
The reason llvm.dbg.cu is not present in these cases is to prevent
codegen from emitting debug info (as it is only used for the sample
profile pass).
This changes the discriminators pass to also emit discriminators even
when debug info is not being emitted.
llvm-svn: 252763
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D13671
llvm-svn: 252760
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MIPS32 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any MIPS32
implementation.
The net result of allowing this speculation for the regression tests in this patch is
that we get this code:
ctlz:
jr $ra
clz $2, $4
cttz:
addiu $1, $4, -1
not $2, $4
and $1, $2, $1
clz $1, $1
addiu $2, $zero, 32
jr $ra
subu $2, $2, $1
Instead of:
ctlz:
beqz $4, $BB0_2
addiu $2, $zero, 32
clz $2, $4
$BB0_2:
jr $ra
nop
cttz:
beqz $4, $BB1_2
addiu $2, $zero, 32
addiu $1, $4, -1
not $2, $4
and $1, $2, $1
clz $1, $1
addiu $2, $zero, 32
subu $2, $2, $1
$BB1_2:
jr $ra
nop
See D14469 for the larger motivation.
Differential Revision: http://reviews.llvm.org/D14500
llvm-svn: 252755
|
|
|
|
|
|
|
|
| |
I missed the side-effects of ParseBFI in my previous attempt (r252748).
Thanks dblaikie for the suggestion of adding a void use of the unused
variable instead.
llvm-svn: 252751
|
|
|
|
| |
llvm-svn: 252748
|
|
|
|
|
|
|
|
|
| |
instruction.
Differential Revision: http://reviews.llvm.org/D13316
Fixes PR25003
llvm-svn: 252743
|
|
|
|
|
|
| |
If we have a chain of BFIs, we may be able to combine several together into one merged BFI. We can do this if the "from" bits from one BFI OR'd with the "from" bits from the other BFI form a contiguous range, and the same with the "to" bits.
llvm-svn: 252740
|
|
|
|
|
|
|
|
|
|
|
| |
Measurements primarily on AArch64 have shown this feature does not
significantly effect compile-time. The are no significant perf changes in LNT,
but for AArch64 at least, there are wins in third party benchmarks.
As discussed on llvm-dev, we're going to try turning this on by default and see
how other targets react to the change.
llvm-svn: 252733
|
|
|
|
| |
llvm-svn: 252732
|
|
|
|
|
|
| |
expression"; NFC.
llvm-svn: 252728
|
|
|
|
|
|
|
|
|
| |
If possible and profitable, replace lea %reg, 1(%reg) and lea %reg, -1(%reg) with inc %reg and dec %reg respectively.
Patch by: anton.nadolsky@intel.com
Differential Revision: http://reviews.llvm.org/D14059
llvm-svn: 252722
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D14242
llvm-svn: 252719
|
|
|
|
|
|
| |
introduced with SSE or SSE2.
llvm-svn: 252709
|
|
|
|
| |
llvm-svn: 252708
|
|
|
|
| |
llvm-svn: 252687
|
|
|
|
|
|
|
| |
Follow-up to r235963: this matches other assemblers and is less
unexpected (e.g. PR23227).
llvm-svn: 252681
|
|
|
|
|
|
| |
This way we can not only add but also remove read undef flags.
llvm-svn: 252678
|
|
|
|
| |
llvm-svn: 252677
|
|
|
|
|
|
|
|
| |
Right now isTruePredicate is only ever called with Pred == ICMP_SLE or
ICMP_ULE, and the ICMP_SLT and ICMP_ULT cases are dead. This change
removes the untested dead code so that the function is not misleading.
llvm-svn: 252676
|
|
|
|
| |
llvm-svn: 252675
|
|
|
|
| |
llvm-svn: 252674
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change teaches isImpliedCondition to prove things like
(A | 15) < L ==> (A | 14) < L
if the low 4 bits of A are known to be zero.
Depends on D14391
Reviewers: majnemer, reames, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14392
llvm-svn: 252673
|
|
|
|
|
|
|
|
| |
This change would add functionality if isImpliedCondition worked on
vector types; but since it bail out on vector predicates this change is
an NFC.
llvm-svn: 252672
|
|
|
|
|
|
|
| |
This makes it slightly easier to handle classes with and without
subregister uniformly.
llvm-svn: 252671
|
|
|
|
|
|
|
| |
Inserting it before the target block could be bad, we might already have
a fallthrough edge to it.
llvm-svn: 252670
|
|
|
|
| |
llvm-svn: 252658
|
|
|
|
| |
llvm-svn: 252656
|
|
|
|
| |
llvm-svn: 252654
|
|
|
|
| |
llvm-svn: 252653
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a pass for doing PowerPC peephole optimizations at the
MI level while the code is still in SSA form. This allows for easy
modifications to the instructions while depending on a subsequent pass
of DCE. Both passes are very fast due to the characteristics of SSA.
At this time, the only peepholes added are for cleaning up various
redundancies involving the XXPERMDI instruction. However, I would
expect this will be a useful place to add more peepholes for
inefficiencies generated during instruction selection. The pass is
placed after VSX swap optimization, as it is best to let that pass
remove unnecessary swaps before performing any remaining clean-ups.
The utility of these clean-ups are demonstrated by changes to four
existing test cases, all of which now have tighter expected code
generation. I've also added Eric Schweiz's bugpoint-reduced test from
PR25157, for which we now generate tight code. One other test started
failing for me, and I've fixed it
(test/Transforms/PlaceSafepoints/finite-loops.ll) as well; this is not
related to my changes, and I'm not sure why it works before and not
after. The problem is that the CHECK-NOT: of "statepoint" from test1
fails because of the "statepoint" in test2, and so forth. Adding a
CHECK-LABEL in between keeps the different occurrences of that string
properly scoped.
llvm-svn: 252651
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The module linker lazy links some "discardable if unused" global
values (e.g. linkonce), materializing and linking them only
if they are referenced in the module. If a comdat group contains a
linkonce member that is not referenced, however, it would not be
materialized and linked, leading to an incomplete comdat group.
If there are other object files not part of the same LTO link that also
define and use that comdat group, the linker may select the incomplete
group leading to link time unsats.
To solve this, whenever a global value body is linked, make sure we
materialize any other members of the same comdat group that are not yet
materialized. This ensures they are in the lazy link list and get linked
as well.
Added new test and adjusted old test to remove parts that didn't
make sense with fix.
Reviewers: rafael
Subscribers: dexonsmith, davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D14516
llvm-svn: 252647
|
|
|
|
|
|
|
|
|
| |
This was an omission in the patch that landed initial support for
operand bundles. So far we haven't hit this, but we will once the
inliner is able to inline calls to functions that contain calls with
operand bundles.
llvm-svn: 252645
|
|
|
|
|
|
|
| |
No code uses this functionality yet. This change just exposes
information / structure that was already present.
llvm-svn: 252644
|
|
|
|
| |
llvm-svn: 252643
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ARM V6T2 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any ARM V6T2
implementation.
The net result of allowing this speculation for the regression tests in this patch is
that we get this code:
ctlz:
clz r0, r0
bx lr
cttz:
rbit r0, r0
clz r0, r0
bx lr
Instead of:
ctlz:
cmp r0, #0
moveq r0, #32
clzne r0, r0
bx lr
cttz:
cmp r0, #0
moveq r0, #32
rbitne r0, r0
clzne r0, r0
bx lr
This will help solve a general speculation/despeculation problem noted in PR24818:
https://llvm.org/bugs/show_bug.cgi?id=24818
Differential Revision: http://reviews.llvm.org/D14469
llvm-svn: 252639
|
|
|
|
|
|
|
|
|
|
|
| |
This allows avoiding the default Expand behavior which
introduces stack usage. Bitcast the scalar and replace
the missing elements with undef.
This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.
llvm-svn: 252632
|
|
|
|
|
|
|
| |
This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.
llvm-svn: 252631
|
|
|
|
|
|
|
|
|
|
| |
This is for AMDGPU to implement v2i64 extract as extract of
half of a v4i32.
This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.
llvm-svn: 252630
|