| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
| |
cleanups.
llvm-svn: 112197
|
|
|
|
|
|
| |
encodable as a 16-bit wide instruction.
llvm-svn: 112195
|
|
|
|
| |
llvm-svn: 112191
|
|
|
|
| |
fix: add a flag to MapValue and friends which indicates whether
any module-level mappings are being made. In the common case of
inlining, no module-level mappings are needed, so MapValue doesn't
need to examine non-function-local metadata, which can be very
expensive in the case of a large module with really deep metadata
(e.g. a large C++ program compiled with -g).
This flag is a little awkward; perhaps eventually it can be moved
into the ClonedCodeInfo class.
llvm-svn: 112190
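The early-exit idea described in this message can be sketched roughly as follows. This is a simplified Python model, not LLVM's code: the `Metadata` class, the `map_value` function, and the `module_level_changes` parameter name are hypothetical stand-ins, and cyclic metadata is not handled.

```python
class Metadata:
    """Toy metadata node: a list of operands plus a function-local bit."""
    def __init__(self, operands, function_local=False):
        self.operands = operands
        self.function_local = function_local


def map_value(value, value_map, module_level_changes):
    """Return the mapped version of `value`, caching results in `value_map`."""
    if value in value_map:
        return value_map[value]
    if isinstance(value, Metadata):
        # Key shortcut: if no module-level mappings are being made,
        # non-function-local metadata cannot change, so do not walk it.
        if not value.function_local and not module_level_changes:
            value_map[value] = value
            return value
        new_ops = [map_value(op, value_map, module_level_changes)
                   for op in value.operands]
        if all(n is o for n, o in zip(new_ops, value.operands)):
            mapped = value  # nothing underneath changed, reuse the node
        else:
            mapped = Metadata(new_ops, value.function_local)
        value_map[value] = mapped
        return mapped
    # Anything else (plain strings stand in for constants here) maps to itself.
    return value


# A three-level chain of non-function-local metadata.
leaf = Metadata(["x"])
middle = Metadata([leaf])
root = Metadata([middle])

# Inlining-style call: the flag is off, so the chain is never traversed.
inline_map = {}
inline_result = map_value(root, inline_map, False)

# Module-remapping-style call: the flag is on, so "x" -> "y" is applied.
module_map = {"x": "y"}
module_result = map_value(root, module_map, True)
```

With the flag off, only the root is touched regardless of how deep the metadata is, which is where the savings for large `-g` modules would come from in this model.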
|
|
|
|
|
|
| |
characters > 127.
llvm-svn: 112189
|
|
|
|
|
|
| |
consistent with compare in corner cases.
llvm-svn: 112185
|
|
|
|
| |
comparison with 0. These two pieces of code should give identical results:
    rsbs r1, r1, 0
    cmp r0, r1
    mov r0, #0
    it ls
    mov r0, #1
and:
    cmn r0, r1
    mov r0, #0
    it ls
    mov r0, #1
However, the CMN gives the *opposite* result when r1 is 0. This is because the
carry flag is set in the CMP case but not in the CMN case. In short, the CMP
instruction adds r0, the (logical) NOT of 0, and the carry bit, and the carry is
computed from that sum before it is truncated (because the "carry bit" parameter
to AddWithCarry is defined as 1 in this case, the carry flag will always be set
when r0 >= 0). The CMN instruction doesn't perform a NOT of 0, so there is never
a "carry" when this AddWithCarry is performed (because the "carry bit" parameter
to AddWithCarry is defined as 0).
The AddWithCarry in the CMP case seems to be relying upon the identity:
    ~x + 1 = -x
However, when x is 0 and unsigned, this doesn't hold:
    x = 0
    ~x = 0xFFFF FFFF
    ~x + 1 = 0x1 0000 0000
    (-x = 0) != (0x1 0000 0000 = ~x + 1)
Therefore, we should disable *all* versions of CMN, especially when comparing
against zero, until we can limit when the CMN instruction is used (when we know
that the RHS is not 0) or when we have a hardware fix for this.
(See the ARM docs for the "AddWithCarry" pseudo-code.)
This is related to <rdar://problem/7569620>.
llvm-svn: 112176
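The flag difference described above can be reproduced with a small model of the AddWithCarry pseudo-code. This is a sketch in Python under stated assumptions: 32-bit registers, CMP modeled as AddWithCarry(Rn, NOT(Rm), 1), CMN as AddWithCarry(Rn, Rm, 0), and LS taken as "C clear or Z set". The helper names are hypothetical and only the C and Z flags are modeled.

```python
MASK = 0xFFFFFFFF  # 32-bit register width


def add_with_carry(x, y, carry_in):
    """Return (result, carry_out) for a 32-bit add with carry-in."""
    unsigned_sum = x + y + carry_in
    result = unsigned_sum & MASK
    carry_out = int(unsigned_sum != result)  # carry if the sum was truncated
    return result, carry_out


def negate(x):
    """Model 'rsbs r1, r1, 0', i.e. r1 = 0 - r1 in 32 bits."""
    return (-x) & MASK


def ls_after_cmp(rn, rm):
    """LS condition (C clear or Z set) after 'cmp rn, rm'."""
    result, carry = add_with_carry(rn, ~rm & MASK, 1)
    return carry == 0 or result == 0


def ls_after_cmn(rn, rm):
    """LS condition (C clear or Z set) after 'cmn rn, rm'."""
    result, carry = add_with_carry(rn, rm, 0)
    return carry == 0 or result == 0


# r1 != 0: both sequences agree.
agree_small = ls_after_cmp(5, negate(3)) == ls_after_cmn(5, 3)
agree_wrap = ls_after_cmp(0xFFFFFFFE, negate(3)) == ls_after_cmn(0xFFFFFFFE, 3)

# r1 == 0 and r0 != 0: CMP always sets the carry, CMN never does.
cmp_zero = ls_after_cmp(5, negate(0))
cmn_zero = ls_after_cmn(5, 0)
```

In this model the two sequences only diverge when r1 is 0 and r0 is nonzero, which matches the corner case the message calls out.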
|
|
|
|
|
|
| |
lack sse2.
llvm-svn: 112175
|
|
|
|
| |
llvm-svn: 112171
|
|
|
|
| |
llvm-svn: 112170
|
|
|
|
|
|
| |
apparently try to support.
llvm-svn: 112168
|
|
|
|
|
|
|
| |
except ...", it is causing *massive* performance regressions when building Clang
with itself (-O3 -g).
llvm-svn: 112158
|
|
|
|
|
|
| |
individual ...", which depends on r111922, which I am reverting.
llvm-svn: 112157
|
|
|
|
| |
llvm-svn: 112155
|
|
|
|
| |
llvm-svn: 112131
|
|
|
|
| |
llvm-svn: 112130
|
|
|
|
| |
llvm-svn: 112128
|
|
|
|
| |
llvm-svn: 112127
|
|
|
|
|
|
| |
and was over-complicated, and replacing it with a simple implementation.
llvm-svn: 112120
|
|
|
|
|
|
|
| |
a VLD result was not used (Radar 8355607). It should also fix pr7988, but
I haven't verified that yet.
llvm-svn: 112118
|
|
|
|
| |
llvm-svn: 112109
|
|
|
|
| |
with the VST4 instructions. Until after register allocation, we want to
represent sets of adjacent registers by a single super-register. These
VST4 pseudo instructions have a single QQ or QQQQ source register operand.
They get expanded to the real VST4 instructions with 4 separate D register
operands. Once this conversion is complete, we'll be able to remove the
NEONPreAllocPass and avoid some fragile and hacky code elsewhere.
llvm-svn: 112108
|
|
|
|
| |
llvm-svn: 112104
|
|
|
|
| |
expanding: e.g. <2 x float> -> <4 x float> instead of -> 2 floats. This
affects two places in the code: handling cross-block values and handling
function returns and arguments. Since vectors are already widened by
LegalizeTypes, this gives us much better code and unblocks x86-64 ABI
and SPU ABI work.
For example, this (which is a silly example of a cross-block value):
    define <4 x float> @test2(<4 x float> %A) nounwind {
      %B = shufflevector <4 x float> %A, <4 x float> undef, <2 x i32> <i32 0, i32 1>
      %C = fadd <2 x float> %B, %B
      br label %BB
    BB:
      %D = fadd <2 x float> %C, %C
      %E = shufflevector <2 x float> %D, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
      ret <4 x float> %E
    }
Now compiles into:
    _test2: ## @test2
    ## BB#0:
        addps %xmm0, %xmm0
        addps %xmm0, %xmm0
        ret
Previously it compiled into:
    _test2: ## @test2
    ## BB#0:
        addps %xmm0, %xmm0
        pshufd $1, %xmm0, %xmm1
        ## kill: XMM0<def> XMM0<kill> XMM0<def>
        insertps $0, %xmm0, %xmm0
        insertps $16, %xmm1, %xmm0
        addps %xmm0, %xmm0
        ret
This implements rdar://8230384
llvm-svn: 112101
|
|
|
|
|
|
| |
instructions, not when remapping modules.
llvm-svn: 112091
|
|
|
|
| |
llvm-svn: 112090
|
|
|
|
|
|
| |
clang -O3.
llvm-svn: 112089
|
|
|
|
| |
llvm-svn: 112086
|
|
|
|
| |
llvm-svn: 112085
|
|
|
|
| |
llvm-svn: 112084
|
|
|
|
| |
llvm-svn: 112083
|
|
|
|
|
|
|
| |
from MDValueList between each function, now that the bitcode
writer is reusing the index space for function-local metadata.
llvm-svn: 112082
|
|
|
|
| |
llvm-svn: 112081
|
|
|
|
| |
llvm-svn: 112080
|
|
|
|
| |
llvm-svn: 112079
|
|
|
|
| |
llvm-svn: 112076
|
|
|
|
|
|
|
| |
When doing copy/paste/modify, it's apparently rather important to remember
the 'modify' bit...
llvm-svn: 112075
|
|
|
|
| |
llvm-svn: 112073
|
|
|
|
|
|
| |
directly folded into a constant by FE.
llvm-svn: 112072
|
|
|
|
| |
llvm-svn: 112060
|
|
|
|
|
|
| |
function-specific state.
llvm-svn: 112058
|
|
|
|
| |
llvm-svn: 112056
|
|
|
|
| |
llvm-svn: 112055
|
|
|
|
|
|
|
|
|
| |
comparison that would overflow.
- The other under/overflow cases can't actually happen because the immediates
which would trigger them are legal (so we don't enter this code), but
adjusted the style to make it clear the transform is always valid.
llvm-svn: 112053
|
|
|
|
| |
llvm-svn: 112039
|
|
|
|
|
|
|
|
| |
Mark _alloca call as clobbering EFLAGS, otherwise some DCE might remove
other flags-clobbering stuff (e.g. cmp instructions) occurring after
_alloca call.
llvm-svn: 112034
|
|
|
|
|
|
| |
Fix some todos. No functional change.
llvm-svn: 112031
|
|
|
|
| |
llvm-svn: 112020
|
|
|
|
|
|
| |
isel behavior for now, so we can pass all vector shuffle tests.
llvm-svn: 112017
|
|
|
|
|
|
|
|
|
| |
you try to load it. Thus, any load in the default address space that completes
implies that the base value that it GEP'd from was not null.
llvm-svn: 112015
|