We always select the 8-bit size and let the assembler backend relax to the larger size.
llvm-svn: 225243
long form.
The assembler backend will relax to the long form if necessary. This removes a swap from long form to short form in the MCInstLowering code. Selecting the long form used to be required by the old JIT.
llvm-svn: 225242
llvm-svn: 225233
llvm-svn: 225232
llvm-svn: 225231
I've filed http://llvm.org/PR22100 to track this issue.
llvm-svn: 225228
llvm-svn: 225227
This reverts commit r225213. It's failing on multiple buildbots [1][2].
[1]: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/22032
[2]: http://lab.llvm.org:8080/green/view/Clang/job/clang-stage1-cmake-RA-incremental_check/2357/
llvm-svn: 225222
We no longer generate horrible code for the stated function:
void f(signed char *a, _Bool b, _Bool c) {
signed char t = 0;
if (b) t = *a;
if (c) *a = t;
}
for which we now generate:
.L.f:
andi. 5, 5, 1
cmpldi 1, 4, 0
li 5, 0
beq 1, .LBB0_2
lbz 5, 0(3)
.LBB0_2: # %if.end
bclr 4, 1, 0
stb 5, 0(3)
blr
so we don't need the README.txt entry.
llvm-svn: 225217
Removed the local isSequential predicate and used the standard helper isSequentialOrUndefInRange instead.
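For reference, an editorial sketch of what the shared helper checks (signature approximated from its use in X86ISelLowering.cpp; this is an illustration, not the commit's diff):
#include "llvm/ADT/ArrayRef.h"
// Elements [Pos, Pos+Size) of a shuffle mask must be undef (negative) or
// form the consecutive sequence Low, Low+1, Low+2, ...
static bool isSequentialOrUndefInRange(llvm::ArrayRef<int> Mask, unsigned Pos,
                                       unsigned Size, int Low) {
  for (unsigned I = Pos, E = Pos + Size; I != E; ++I, ++Low)
    if (Mask[I] >= 0 && Mask[I] != Low)
      return false;
  return true;
}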
llvm-svn: 225216
We now produce the desired code as noted in the README.txt file (no spurious
'or' instruction). Remove the README entry and improve the regression test.
llvm-svn: 225214
llvm-svn: 225213
This entry has been rendered irrelevant now that we have proper CR bit
tracking.
llvm-svn: 225211
memop instructions. Removing old defs without bits and updating references.
llvm-svn: 225210
We now produce the desired code as noted in the README.txt file. Remove the
README entry and add a regression test.
llvm-svn: 225209
We now produce the desired code as noted in the README.txt file. Remove the
README entry and add a regression test.
llvm-svn: 225205
Consider this function from our README.txt file:
int foo(int a, int b) { return (a < b) << 4; }
We now explicitly track CR bits by default, so the comment in the README.txt
about not really having a SETCC is no longer accurate, but we did generate this
somewhat silly code:
cmpw 0, 3, 4
li 3, 0
li 12, 1
isel 3, 12, 3, 0
sldi 3, 3, 4
blr
which generates the zext as a select between 0 and 1, and then shifts the
result by a constant amount. Here we preprocess the DAG in order to fold the
results of operations on an extension of an i1 value into the SELECT_I[48]
pseudo instruction when the resulting constant can be materialized using one
instruction (just like the 0 and 1). This was not implemented as a DAGCombine
because the resulting code would have been anti-canonical and would depend on
replacing chained user nodes, which does not fit well into the lowering
paradigm. Now we generate:
cmpw 0, 3, 4
li 3, 0
li 12, 16
isel 3, 12, 3, 0
blr
which is less silly.
llvm-svn: 225203
llvm-svn: 225202
accumulating shifts.
llvm-svn: 225201
encoding bits.
llvm-svn: 225199
llvm-svn: 225198
llvm-svn: 225197
The 64-bit semantics of cntlzw are not special: the 32-bit leading-zero count is
stored as a 64-bit value in the range [0,32]. As a result, it is always zero
extended, and it can be added to the PPCISelDAGToDAG peephole optimization as a
frontier instruction for the removal of unnecessary zero extensions.
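For illustration (an editorial sketch, not part of the commit), C-level code where the peephole can now drop the redundant zero extension:
#include <cstdint>
// The cast to uint64_t is a zero extension of a value that cntlzw already
// produces in [0,32], so no separate clearing instruction is needed.
uint64_t clzZext(uint32_t X) {
  unsigned Count = X ? __builtin_clz(X) : 32; // cntlzw defines clz(0) == 32
  return static_cast<uint64_t>(Count);
}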
llvm-svn: 225192
lhbrx and lwbrx not only load their data with byte swapping, but also clear the
upper 32 bits (at least). As a result, they can be added to the PPCISelDAGToDAG
peephole optimization as frontier instructions for the removal of unnecessary
zero extensions.
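As an editorial example (assumed source, not from the commit), a byte-swapping load whose explicit widening should now be recognized as redundant:
#include <cstdint>
// A 32-bit byte-reversed load maps to lwbrx, which already clears the upper
// 32 bits, so the widening cast needs no extra instruction.
uint64_t loadSwapped(const uint32_t *P) {
  return static_cast<uint64_t>(__builtin_bswap32(*P));
}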
llvm-svn: 225189
llvm-svn: 225188
We used to generate code similar to:
umov.b w8, v0[2]
strb w8, [x0, x1]
because the STR*ro* patterns were preferred to ST1*.
Instead, we can avoid going through GPRs, and generate:
add x8, x0, x1
st1.b { v0 }[2], [x8]
This patch increases the ST1* AddedComplexity to achieve that.
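An editorial sketch of source that produces this pattern (intrinsic-based example chosen by the editor, not taken from the commit):
#include <arm_neon.h>
#include <stdint.h>
// Storing a single vector lane to a register-offset address: with the higher
// ST1* AddedComplexity this should now prefer st1.b over umov + strb.
void storeLane2(uint8_t *Base, long Offset, uint8x16_t V) {
  Base[Offset] = vgetq_lane_u8(V, 2);
}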
rdar://16372710
Differential Revision: http://reviews.llvm.org/D6202
llvm-svn: 225183
the subregister.
For 0-lane stores, we used to generate code similar to:
fmov w8, s0
str w8, [x0, x1, lsl #2]
instead of:
str s0, [x0, x1, lsl #2]
To correct that: for store lane 0 patterns, directly match to STR <subreg>0.
Byte-sized instructions don't have the special case for a 0 index,
because FPR8s are defined to have untyped content.
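An editorial sketch of the lane-0 case (assumed source, not from the commit):
#include <arm_neon.h>
// Extracting lane 0 and storing it: the s-subregister can be stored directly,
// so this should now become "str s0, [x0, x1, lsl #2]" with no fmov to a GPR.
void storeLane0(float *Base, long Offset, float32x4_t V) {
  Base[Offset] = vgetq_lane_f32(V, 0);
}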
rdar://16372710
Differential Revision: http://reviews.llvm.org/D6772
llvm-svn: 225181
This patch lowers patterns such as:
fsub v0.4s, v0.4s, v1.4s
fabs v0.4s, v0.4s
to
fabd v0.4s, v0.4s, v1.4s
on AArch64.
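For illustration (editorial example using NEON intrinsics, not from the commit), source that produces the fsub + fabs pattern being folded:
#include <arm_neon.h>
// |a - b| per float lane: previously fsub + fabs, now a single fabd.
float32x4_t absDiff(float32x4_t A, float32x4_t B) {
  return vabsq_f32(vsubq_f32(A, B));
}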
Review: http://reviews.llvm.org/D6791
llvm-svn: 225169
Tag_compatibility takes two arguments, but before this patch it would
erroneously accept just one; it now produces an error in that case.
Change-Id: I530f918587620d0d5dfebf639944d6083871ef7d
llvm-svn: 225167
Claim conformance to version 2.09 of the ARM ABI.
This build attribute must be emitted first amongst the build attributes when
written to an object file. This is to simplify conformance detection by
consumers.
Change-Id: If9eddcfc416bc9ad6e5cc8cdcb05d0031af7657e
llvm-svn: 225166
This patch lowers patterns such as:
sub v0.4s, v0.4s, v1.4s
abs v0.4s, v0.4s
to
sabd v0.4s, v0.4s, v1.4s
on AArch64.
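An analogous editorial example for the integer case (assumed source, not from the commit):
#include <arm_neon.h>
// |a - b| per signed 32-bit lane: previously sub + abs, now a single sabd.
int32x4_t absDiffS32(int32x4_t A, int32x4_t B) {
  return vabsq_s32(vsubq_s32(A, B));
}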
Review: http://reviews.llvm.org/D6781
llvm-svn: 225165
into the assert.
llvm-svn: 225160
dec instructions. Remove the 32-bit mode only versions that existed for the disassembler. Move the patterns out of the instructions so they can still be qualified with predicates.
llvm-svn: 225157
incrementing. NFC
llvm-svn: 225156
assignment as the beginning of the function. NFC.
llvm-svn: 225155
since the code is in a comment. Can't figure out what the body of the 'if' was supposed to be anyway.
llvm-svn: 225154
tablegen instruction info. No functional change intended.
llvm-svn: 225153
for the move to/from absolute memory instructions.
llvm-svn: 225152
PPC has an instruction for ctlz with defined zero behavior, and our lowering of
cttz (provided by DAGCombine) is also efficient and branchless, so speculating
these makes sense.
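For illustration (an editorial example, not from the commit), the kind of guarded count that benefits once ctlz is cheap to speculate:
#include <cstdint>
// The guard exists only because __builtin_clz is undefined at zero; cntlzw
// itself returns 32 for zero, so speculating it and selecting is branchless.
int leadingZeros(uint32_t X) {
  return X == 0 ? 32 : __builtin_clz(X);
}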
llvm-svn: 225150
r225135 added the ability to materialize i64 constants using rotations in order
to reduce the instruction count. Sometimes we can use a rotation only with some
extra masking, so that we take advantage of the fact that generating a bunch of
extra higher-order 1 bits is easy using li/lis.
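A worked editorial example (constant and instruction sequence chosen here for illustration, not taken from the commit): li materializes 0xFFFFFFFFFFFF8765 via sign extension, and a single rldicl both rotates and clears the unwanted high-order ones.
#include <cstdint>
// li 3, -30875         -> 0xFFFFFFFFFFFF8765 (sign extension supplies the ones)
// rldicl 3, 3, 16, 32  -> rotate left 16, then clear the leftmost 32 bits
constexpr uint64_t Imm = 0xFFFFFFFFFFFF8765ULL;
constexpr uint64_t Rot = (Imm << 16) | (Imm >> 48);
static_assert((Rot & 0xFFFFFFFFULL) == 0x8765FFFFULL,
              "rotate-and-mask forms 0x8765FFFF");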
llvm-svn: 225147
Materializing full 64-bit constants on PPC64 can be expensive, requiring up to
5 instructions depending on the locations of the non-zero bits. Sometimes
materializing a rotated constant, and then applying the inverse rotation, requires
fewer instructions than the direct method. If so, do that instead.
In r225132, I added support for forming constants using bit inversion. In
effect, this reverts that commit and replaces it with rotation support. The bit
inversion is useful for turning constants that are mostly ones into ones that
are mostly zeros (thus enabling a more-efficient shift-based materialization),
but the same effect can be obtained by using negative constants and a rotate,
and that is at least as efficient, if not more so.
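A worked editorial example of the rotation idea (the constant is chosen here for illustration and does not come from the commit): 0xF000000000000001 is just the 5-bit value 31 rotated left by 60, so a li plus one rotate replaces the longer sequence that assembles the high and low words separately.
#include <cstdint>
// li 3, 31            -> 0x000000000000001F
// rldicl 3, 3, 60, 0  -> pure rotation left by 60 (mask begin 0 keeps all bits)
constexpr uint64_t rotl64(uint64_t V, unsigned N) {
  return (V << N) | (V >> (64 - N));
}
static_assert(rotl64(31, 60) == 0xF000000000000001ULL,
              "a rotated small immediate");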
llvm-svn: 225135
Materializing full 64-bit constants on PPC64 can be expensive, requiring up to
5 instructions depending on the locations of the non-zero bits. Sometimes
materializing the bit-inverted constant, and then flipping the bits back, requires
fewer instructions than the direct method. If so, do that instead.
llvm-svn: 225132
Weak externals are resolved statically, so we can actually generate the tail
call on PE/COFF targets without breaking the requirements. It is questionable
whether we want to propagate the current behaviour for MachO as the requirements
are part of the ARM ELF specifications, and it seems that prior to SVN
r215890 we would have emitted the call as a tail call. For now, be conservative and only
permit it on PE/COFF where the call will always be fully resolved.
llvm-svn: 225119
The existing code provided for specifying a global loop alignment preference.
However, the preferred loop alignment might depend on the loop itself. For
recent POWER cores, loops between 5 and 8 instructions should have 32-byte
alignment (while the others are better with 16-byte alignment) so that the
entire loop will fit in one i-cache line.
To support this, getPrefLoopAlignment has been made virtual, and can be
provided with an optional MachineLoop* so the target can inspect the loop
before answering the query. The default behavior, as before, is to return the
value set with setPrefLoopAlignment. MachineBlockPlacement now queries the
target for each loop instead of only once per function. There should be no
functional change for other targets.
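A standalone editorial sketch of the policy this enables (not LLVM's actual interface; the instruction-count threshold and alignments come from the description above):
// Loops of 5-8 instructions get 32-byte alignment so the whole loop fits in
// one i-cache line on recent POWER cores; everything else keeps 16 bytes.
unsigned preferredLoopAlignment(unsigned NumInstructions) {
  if (NumInstructions >= 5 && NumInstructions <= 8)
    return 32;
  return 16;
}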
llvm-svn: 225117
Most modern PowerPC cores prefer that functions and loops start on
16-byte-aligned boundaries (*), so instruct block placement, etc. to make this
happen. The branch selector has also been adjusted to account for the extra
nops that might now be inserted before loop headers.
(*) Some cores actually prefer other alignments for small loops, but that will
be addressed in a follow-up commit.
llvm-svn: 225115
AsmParsers.
Make sure they all have llvm_unreachable on the default path out of the switch. Remove unnecessary "default: break". Remove a 'return' after unreachable. Fix some indentation.
llvm-svn: 225114
Newer POWER cores, and the A2, support the cmpb instruction. This instruction
compares its operands, treating each of the 8 bytes in the GPRs separately,
returning a 'mask' result of 0 (for false) or -1 (for true) in each byte.
Code generation support is added, in the form of a PPCISelDAGToDAG
DAG-preprocessing routine that recognizes patterns close to what the
instruction computes (either exactly, or related by a constant masking
operation), and generates the cmpb instruction (along with any necessary
constant masking operation). This can be expanded if use cases arise.
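For reference, an editorial sketch of the byte-wise comparison cmpb performs (the semantics described above, not the DAG-preprocessing code itself):
#include <cstdint>
// Each of the 8 byte lanes that compares equal yields 0xFF in that lane,
// otherwise 0x00.
uint64_t cmpb(uint64_t A, uint64_t B) {
  uint64_t Result = 0;
  for (unsigned I = 0; I < 8; ++I) {
    uint64_t Mask = 0xFFULL << (8 * I);
    if ((A & Mask) == (B & Mask))
      Result |= Mask;
  }
  return Result;
}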
llvm-svn: 225106
is REX.W and AdSize prefix are both present.
llvm-svn: 225099
sign extended immediate.
llvm-svn: 225098
llvm-svn: 225080