| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
This fixes PR1769.
llvm-svn: 46408
|
| |
|
|
| |
llvm-svn: 46406
|
| |
|
|
|
|
|
| |
a "nop" instruction so that we don't have the function's label associated
with something that it's not supposed to be associated with.
llvm-svn: 46394
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
void bork() {
int *address = 0;
*address = 0;
}
It's compiled into LLVM code that looks like this:
define void @bork() noreturn nounwind {
entry:
unreachable
}
This is bad on some platforms (like PPC) because it will generate the label for
the function but no body. The label could end up being associated with some
non-code related stuff, like a section. This places a "trap" instruction if the
SimplifyCFG pass removed all code from the function leaving only one
"unreachable" instruction.
llvm-svn: 46387
|
| |
|
|
|
|
| |
for the purpose of removing end-of-function stores.
llvm-svn: 46351
|
| |
|
|
| |
llvm-svn: 46247
|
| |
|
|
|
|
| |
a smaller bitwidth.
llvm-svn: 46244
|
| |
|
|
|
|
| |
Fixes PR1935.
llvm-svn: 46203
|
| |
|
|
|
|
| |
to complain on x86-64 (gcc 4.1). Use ~0U instead.
llvm-svn: 46197
|
| |
|
|
|
|
|
|
|
| |
drop attributes on varargs call arguments. Also, it could generate
invalid IR if the transformed call already had the 'nest' attribute
somewhere (this can never happen for code coming from llvm-gcc,
but it's a theoretical possibility). Fix both problems.
llvm-svn: 45973
|
| |
|
|
|
|
|
|
|
|
| |
a load/store of i64. The later prevents promotion/scalarrepl of the
source and dest in many cases.
This fixes the 300% performance regression of the byval stuff on
stepanov_v1p2.
llvm-svn: 45945
|
| |
|
|
|
|
| |
method, no functionality change.
llvm-svn: 45944
|
| |
|
|
|
|
|
| |
greater than memcpy alignment, and if we lower to load/store, use the best
alignment info we have.
llvm-svn: 45943
|
| |
|
|
|
|
| |
make memmove->memcpy conversion a bit simpler.
llvm-svn: 45942
|
| |
|
|
|
|
|
| |
realize that ne & sgt was a signed comparison (it was only
looking at whether the left compare was signed).
llvm-svn: 45937
|
| |
|
|
|
|
|
| |
if this becomes a varargs call then deal correctly with any
parameter attributes on the newly vararg call arguments.
llvm-svn: 45931
|
| |
|
|
|
|
| |
arithmetic.
llvm-svn: 45745
|
| |
|
|
|
|
| |
incompatibility.
llvm-svn: 45704
|
| |
|
|
| |
llvm-svn: 45675
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ShadowStackCollector, which additionally has reduced overhead with
no sacrifice in portability.
Considering a function @fun with 8 loop-local roots,
ShadowStackCollector introduces the following overhead
(x86):
; shadowstack prologue
movl L_llvm_gc_root_chain$non_lazy_ptr, %eax
movl (%eax), %ecx
movl $___gc_fun, 20(%esp)
movl $0, 24(%esp)
movl $0, 28(%esp)
movl $0, 32(%esp)
movl $0, 36(%esp)
movl $0, 40(%esp)
movl $0, 44(%esp)
movl $0, 48(%esp)
movl $0, 52(%esp)
movl %ecx, 16(%esp)
leal 16(%esp), %ecx
movl %ecx, (%eax)
; shadowstack loop overhead
(none)
; shadowstack epilogue
movl 48(%esp), %edx
movl %edx, (%ecx)
; shadowstack metadata
.align 3
___gc_fun: # __gc_fun
.long 8
.space 4
In comparison to LowerGC:
; lowergc prologue
movl L_llvm_gc_root_chain$non_lazy_ptr, %eax
movl (%eax), %ecx
movl %ecx, 48(%esp)
movl $8, 52(%esp)
movl $0, 60(%esp)
movl $0, 56(%esp)
movl $0, 68(%esp)
movl $0, 64(%esp)
movl $0, 76(%esp)
movl $0, 72(%esp)
movl $0, 84(%esp)
movl $0, 80(%esp)
movl $0, 92(%esp)
movl $0, 88(%esp)
movl $0, 100(%esp)
movl $0, 96(%esp)
movl $0, 108(%esp)
movl $0, 104(%esp)
movl $0, 116(%esp)
movl $0, 112(%esp)
; lowergc loop overhead
leal 44(%esp), %eax
movl %eax, 56(%esp)
leal 40(%esp), %eax
movl %eax, 64(%esp)
leal 36(%esp), %eax
movl %eax, 72(%esp)
leal 32(%esp), %eax
movl %eax, 80(%esp)
leal 28(%esp), %eax
movl %eax, 88(%esp)
leal 24(%esp), %eax
movl %eax, 96(%esp)
leal 20(%esp), %eax
movl %eax, 104(%esp)
leal 16(%esp), %eax
movl %eax, 112(%esp)
; lowergc epilogue
movl 48(%esp), %edx
movl %edx, (%ecx)
; lowergc metadata
(none)
llvm-svn: 45670
|
| |
|
|
|
|
|
|
|
| |
direct calls bails out unless caller and callee have essentially
equivalent parameter attributes. This is illogical - the callee's
attributes should be of no relevance here. Rework the logic, which
incidentally fixes a crash when removed arguments have attributes.
llvm-svn: 45658
|
| |
|
|
|
|
|
|
|
|
|
| |
a direct call with cast parameters and cast return
value (if any), instcombine was prepared to cast any
non-void return value into any other, whether castable
or not. Add a new predicate for testing whether casting
is valid, and check it both for the return value and
(as a cleanup) for the parameters.
llvm-svn: 45657
|
| |
|
|
| |
llvm-svn: 45613
|
| |
|
|
|
|
|
| |
things that are not equality comparisons, for example:
(2147479553+4096)-2147479553 < 0 != (2147479553+4096) < 2147479553
llvm-svn: 45612
|
| |
|
|
| |
llvm-svn: 45594
|
| |
|
|
| |
llvm-svn: 45418
|
| |
|
|
| |
llvm-svn: 45415
|
| |
|
|
|
|
| |
should probably be a target-specific predicate based on address space. That way for targets where this isn't applicable the predicate can be optimized away.
llvm-svn: 45403
|
| |
|
|
|
|
| |
pointing out my stupid mistakes when writing this patch. :-)
llvm-svn: 45384
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
define i32 @main() {
entry:
%z = alloca i32 ; <i32*> [#uses=2]
store i32 0, i32* %z
%tmp = load i32* %z ; <i32> [#uses=1]
%sub = sub i32 %tmp, 1 ; <i32> [#uses=1]
%cmp = icmp ult i32 %sub, 0 ; <i1> [#uses=1]
%retval = select i1 %cmp, i32 1, i32 0 ; <i32> [#uses=1]
ret i32 %retval
}
into ret 1, instead of ret 0.
Christopher, please investigate.
llvm-svn: 45383
|
| |
|
|
|
|
|
|
|
|
| |
it is only a partial fix. This change is noise for most programs, but
speeds up Shootout-C++/matrix by 20%, Ptrdist/ks by 24%, smg2000 by 8%,
hexxagon by 9%, bzip2 by 9% (not sure I trust this), ackerman by 13%, etc.
OTOH, it slows down Shootout/fib2 by 40% (I'll update PR1877 with this info).
llvm-svn: 45354
|
| |
|
|
|
|
|
| |
When specified, don't split backedges of single-bb loops.
This helps address PR1877
llvm-svn: 45344
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
us to compile:
#include <math.h>
int t1(double d) { return signbit(d); }
into:
_t1:
movd %xmm0, %rax
shrq $63, %rax
ret
instead of:
_t1:
movd %xmm0, %rax
shrq $32, %rax
shrl $31, %eax
ret
on x86-64.
llvm-svn: 45311
|
| |
|
|
|
|
|
|
|
| |
(icmp slt (sub A B) 1) -> (icmp sle A B)
icmp sgt (sub A B) -1) -> (icmp sge A B)
and add testcase.
llvm-svn: 45256
|
| |
|
|
|
|
| |
uses are addresses. This trades a constant multiply for one fewer iv.
llvm-svn: 45251
|
| |
|
|
|
|
| |
has a single use, and generalize it to not require N to be a constant.
llvm-svn: 45250
|
| |
|
|
| |
llvm-svn: 45230
|
| |
|
|
|
|
|
|
|
| |
calls 'nounwind'. It is important for correct C++
exception handling that nounwind markings do not get
lost, so this transformation is actually needed for
correctness.
llvm-svn: 45218
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for this case on X86
from
_foo:
movl $99, %ecx
movl 4(%esp), %eax
subl %eax, %ecx
xorl %edx, %edx
testl %ecx, %ecx
cmovs %edx, %eax
ret
to
_foo:
xorl %ecx, %ecx
movl 4(%esp), %eax
cmpl $99, %eax
cmovg %ecx, %eax
ret
llvm-svn: 45173
|
| |
|
|
| |
llvm-svn: 45170
|
| |
|
|
|
|
| |
recent submission.
llvm-svn: 45169
|
| |
|
|
|
|
| |
doesNotThrow.
llvm-svn: 45160
|
| |
|
|
| |
llvm-svn: 45159
|
| |
|
|
|
|
|
|
| |
eliminate subtractions. This code is often produced by the SMAX expansion in SCEV.
This implements test/Transforms/InstCombine/2007-12-18-AddSelCmpSub.ll
llvm-svn: 45158
|
| |
|
|
| |
llvm-svn: 45100
|
| |
|
|
|
|
| |
passed the erased element.
llvm-svn: 45099
|
| |
|
|
|
|
| |
of PointerType::get() has become PointerType::getUnqual(), which returns a pointer in the generic address space. The new prototype of PointerType::get() requires both a type and an address space.
llvm-svn: 45082
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
calls. Remove special casing of inline asm from the
inliner. There is a potential problem: the verifier
rejects invokes of inline asm (not sure why). If an
asm call is not marked "nounwind" in some .ll, and
instcombine is not run, but the inliner is run, then
an illegal module will be created. This is bad but
I'm not sure what the best approach is. I'm tempted
to remove the check in the verifier...
llvm-svn: 45073
|
| |
|
|
| |
llvm-svn: 44997
|
| |
|
|
| |
llvm-svn: 44981
|