If a constant is unnamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by moving the global's
storage into the constant pool. For example, instead of:

  ldr r0, .CPI0
  bl printf
  bx lr
.CPI0: &format_string
format_string: .asciz "hello, world!\n"

we can emit:

  adr r0, .CPI0
  bl printf
  bx lr
.CPI0: .asciz "hello, world!\n"

This can give significant code size savings when many small strings are used in
one function (4 bytes saved per string).
llvm-svn: 281484
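As a rough sketch of the eligibility test this transformation needs (my own illustration against present-day LLVM API names such as hasGlobalUnnamedAddr, not code from the patch; the real pass also has to look through constant-expression users, which this conservatively rejects):

  #include "llvm/IR/Function.h"
  #include "llvm/IR/GlobalVariable.h"
  #include "llvm/IR/Instruction.h"
  using namespace llvm;

  // Return the single function that references GV, or nullptr if GV is used
  // from more than one function or from a non-instruction user.
  static const Function *getSingleUsingFunction(const GlobalVariable &GV) {
    const Function *OnlyF = nullptr;
    for (const User *U : GV.users()) {
      const auto *I = dyn_cast<Instruction>(U);
      if (!I)
        return nullptr;          // conservatively reject ConstantExpr users
      const Function *F = I->getFunction();
      if (OnlyF && OnlyF != F)
        return nullptr;          // referenced from two different functions
      OnlyF = F;
    }
    return OnlyF;
  }

  // A global can be promoted into the constant pool when its address is not
  // significant (unnamed_addr) and only one function uses it.
  static bool isPromotionCandidate(const GlobalVariable &GV) {
    return GV.isConstant() && GV.hasInitializer() &&
           GV.hasGlobalUnnamedAddr() && getSingleUsingFunction(GV) != nullptr;
  }
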
llvm-svn: 281480

This patch corresponds to review:
https://reviews.llvm.org/D24021
In the initial implementation of this instruction, I forgot to account for
variable indices. This patch fixes PR30189 and should probably be merged into
3.9.1 (I'll open a bug according to the new instructions).
llvm-svn: 281479

generic shuffle
Shuffle lowering will correctly lower to MOVSS/MOVSD/PBLEND, improving commutation opportunities
llvm-svn: 281471

This reverts commit r281323. It caused Chromium test failures and a self-host
failure.
llvm-svn: 281451

llvm-svn: 281448

to copy the new isAdd field in the tablegen data structure.
llvm-svn: 281447

Lowering was wrong - the X86ISD::SETCC node should return an i8 type.
llvm-svn: 281446

with AVX512VL.
Differential Revision: http://reviews.llvm.org/D24547
llvm-svn: 281445

and not needed since 32-bit integer to double is always exact.
llvm-svn: 281442

llvm-svn: 281407

Summary: When expanding mul in type legalization, make sure the type for the
shift amount can actually fit the value. This fixes PR30354
(https://llvm.org/bugs/show_bug.cgi?id=30354).
Reviewers: hfinkel, majnemer, RKSimon
Subscribers: RKSimon, llvm-commits
Differential Revision: https://reviews.llvm.org/D24478
llvm-svn: 281403
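As a standalone illustration of the constraint (my own example with hypothetical widths, not code from the patch): the shift-amount type must have enough bits for the largest shift the expansion can create, which grows with the width of the multiply being legalized.

  #include <cstdint>
  #include <cstdio>

  // Number of bits needed to represent a shift amount of MaxShift.
  static unsigned bitsNeededFor(uint64_t MaxShift) {
    unsigned Bits = 1;
    while ((MaxShift >> Bits) != 0)
      ++Bits;
    return Bits;
  }

  int main() {
    // A shift amount as large as 512 (hypothetical, roughly the operand width
    // of a very wide multiply) needs a 10-bit shift-amount type; an 8-bit
    // shift-amount type (maximum value 255) cannot hold it.
    std::printf("%u\n", bitsNeededFor(512)); // prints 10
  }
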
This allows us to, in some cases, create a vector_shuffle out of a build_vector
when the inputs to the build are extract_elements from two different vectors,
at least one of which is wider than the output (e.g. an <8 x i16> being
constructed out of elements from a <16 x i16> and an <8 x i16>).
Differential Revision: https://reviews.llvm.org/D24491
llvm-svn: 281402
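A conceptual, standalone sketch of the mask numbering involved (my own example, not the DAGCombiner code): the second source's lanes are numbered after the first source's, and the narrower input can be thought of as widened with undef lanes so both shuffle operands end up with the same type.

  #include <array>
  #include <cstdio>

  // One extracted element feeding the build_vector: which source vector it
  // comes from and which lane of that source.
  struct ExtractedElt {
    int Source; // 0 = the <16 x i16> input, 1 = the <8 x i16> input
    int Lane;
  };

  int main() {
    constexpr int Src0Width = 16; // lanes 0..15 belong to source 0
    // A hypothetical <8 x i16> build_vector made of extract_elements.
    std::array<ExtractedElt, 8> Build = {
        {{0, 3}, {0, 7}, {1, 0}, {1, 1}, {0, 12}, {1, 5}, {0, 0}, {1, 7}}};

    int Mask[8];
    for (int i = 0; i < 8; ++i)
      Mask[i] = Build[i].Source == 0 ? Build[i].Lane : Src0Width + Build[i].Lane;

    for (int M : Mask)
      std::printf("%d ", M); // prints: 3 7 16 17 12 21 0 23
    std::printf("\n");
  }
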
- Expand SELECT_CC and BR_CC for vector types.
- Implement TLI::isShuffleMaskLegal.
llvm-svn: 281397

Clean up/change the code that checks for possible tailcall conventions to
look the same as the one in the X86 target. This makes the distinction
between calling conventions that can guarantee tailcalls and the ones
that may tailcall more obvious.
- Add Swift to the mayTailCall list.
- PreserveMost seemed to be incorrectly part of the guaranteed tail call
  list; move it to the mayTailCall list.
llvm-svn: 281376
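A rough sketch of the split described above, mirroring the X86 naming; the exact set of conventions in each list here is my assumption rather than something taken from the patch:

  #include "llvm/IR/CallingConv.h"
  using namespace llvm;

  // Conventions that are defined to always permit tail-call optimization.
  static bool canGuaranteeTCO(CallingConv::ID CC) {
    return CC == CallingConv::Fast || CC == CallingConv::GHC ||
           CC == CallingConv::HiPE;
  }

  // Conventions that we are merely allowed to tail-call when the call site
  // and ABI happen to permit it.
  static bool mayTailCallThisCC(CallingConv::ID CC) {
    switch (CC) {
    case CallingConv::C:
    case CallingConv::Swift:        // added to the may-tail-call list
    case CallingConv::PreserveMost: // moved here from the guaranteed list
      return true;
    default:
      return canGuaranteeTCO(CC);
    }
  }
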
llvm-svn: 281369

test
To avoid an assertion, we must ensure that the inner shift constant is within
range before calling ConstantSDNode::getZExtValue(). We already know that the
outer shift constant is in range.
Follow-up to D23007
llvm-svn: 281362

llvm-svn: 281359

trunc16i16_16i8 is currently commented out due to PR25684
llvm-svn: 281356

Fix a failure to detect out-of-range shift constants leading to an assert in
ConstantSDNode::getZExtValue().
Follow-up to D23007
llvm-svn: 281354
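A sketch of the kind of guard this implies, assuming SelectionDAG types (this is not the actual change): only call getZExtValue() after proving the constant is an in-range shift amount, so the assertion on over-wide constants cannot fire.

  #include "llvm/CodeGen/SelectionDAGNodes.h"
  using namespace llvm;

  // Returns true and sets AmtOut when ShAmt is a constant, in-range shift
  // amount for a value that is BitWidth bits wide; otherwise returns false.
  static bool getInRangeShiftAmount(SDValue ShAmt, unsigned BitWidth,
                                    uint64_t &AmtOut) {
    auto *C = dyn_cast<ConstantSDNode>(ShAmt);
    if (!C || !C->getAPIntValue().ult(BitWidth))
      return false;               // non-constant, out of range, or huge APInt
    AmtOut = C->getZExtValue();   // now known to fit
    return true;
  }
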
Added x86_64 tests
llvm-svn: 281341

llvm-svn: 281339

This reverts commit r281314. Speculatively revert as it's possible this caused linker errors: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/19656
llvm-svn: 281327

Before, only Thumb functions were marked as ".code 16". These
".code x" directives are effective until the next directive of the same
kind is encountered. Therefore, in code with interleaved ARM and
Thumb functions, it was possible to declare a function as ARM and
end up with a Thumb function after assembly. A test has been added.
An existing test has also been fixed to take this change into
account.
Reviewers: aschwaighofer, t.p.northover, jmolloy, rengolin
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D24337
llvm-svn: 281324

For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more
efficient instruction selection if the bitmask is one consecutive sequence of
set bits (32 - clz(bm) - ctz(bm) == popcount(bm)):
1) If the bitmask touches the LSB, then we can remove all the upper bits and
   set the flags by doing one LSLS.
2) If the bitmask touches the MSB, then we can remove all the lower bits and
   set the flags with one LSRS.
3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit
   into the sign bit with one LSLS and change the condition query from NE/EQ to
   MI/PL (we could also implement this by shifting into the carry bit and
   branching on BCC/BCS).
4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower
   zero bits of the mask.
Cases 1-3 require only one 16-bit instruction and can elide the CMP. Case 4
requires two 16-bit instructions but can still elide the CMP and doesn't
require materializing a complex immediate, so it is also a win.
llvm-svn: 281323
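A standalone sketch of the classification above in plain bit arithmetic (not the instruction-selection code itself), following the numbering of the four cases:

  #include <cstdint>

  // A mask is one contiguous run of set bits iff
  // 32 - clz(bm) - ctz(bm) == popcount(bm).
  static bool isContiguousMask(uint32_t BM) {
    if (BM == 0)
      return false; // clz/ctz are undefined for zero
    return 32 - __builtin_clz(BM) - __builtin_ctz(BM) == __builtin_popcount(BM);
  }

  enum class MaskLowering {
    Lsls,         // case 1: mask touches the LSB, one LSLS sets the flags
    Lsrs,         // case 2: mask touches the MSB, one LSRS sets the flags
    SignBitShift, // case 3: single set bit, shift it into the sign bit (MI/PL)
    ShiftPair,    // case 4: LSLS then LSRS to strip zero bits on both ends
    NotContiguous // the mask is not one consecutive run of set bits
  };

  static MaskLowering classify(uint32_t BM) {
    if (!isContiguousMask(BM))
      return MaskLowering::NotContiguous;
    if (BM & 1u)
      return MaskLowering::Lsls;
    if (BM & 0x80000000u)
      return MaskLowering::Lsrs;
    if (__builtin_popcount(BM) == 1)
      return MaskLowering::SignBitShift;
    return MaskLowering::ShiftPair;
  }
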
If a constant is unnamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by moving the global's
storage into the constant pool. For example, instead of:

  ldr r0, .CPI0
  bl printf
  bx lr
.CPI0: &format_string
format_string: .asciz "hello, world!\n"

we can emit:

  adr r0, .CPI0
  bl printf
  bx lr
.CPI0: .asciz "hello, world!\n"

This can give significant code size savings when many small strings are used in
one function (4 bytes saved per string).
llvm-svn: 281314

r281285.
Reviewers: bkramer, ddcc, dschuff, sunfish
Subscribers: jfb, llvm-commits, dschuff
Differential Revision: https://reviews.llvm.org/D24497
llvm-svn: 281312

Differential Revision: https://reviews.llvm.org/D23764
llvm-svn: 281308

hwloop regression tests. These tests pass locally; will be investigating
where these differences come from.
llvm-svn: 281306

descriptions now tag add instructions, and the Hexagon backend is using this to
identify loop induction statements.
Patch by Sam Parker and Sjoerd Meijer.
Differential Revision: https://reviews.llvm.org/D23601
llvm-svn: 281304

Optimized (truncate (assertzext x) to i1) and anyext i1 to i8/16/32.
Optimizing these patterns is one more step towards i1 optimization on AVX-512.
Differential Revision: https://reviews.llvm.org/D24456
llvm-svn: 281302

We currently return 4 for stackmaps and patchpoints, which is very optimistic
and can in rare cases cause the branch relaxation pass to fail to relax certain
branches.
This patch causes getInstSizeInBytes to return a pessimistic estimate of the
size as the number of bytes requested in the stackmap/patchpoint. In the future,
we could provide a more accurate estimate by sharing some of the logic in
AArch64::LowerSTACKMAP/PATCHPOINT.
Fixes part of https://llvm.org/bugs/show_bug.cgi?id=28750
Differential Revision: https://reviews.llvm.org/D24073
llvm-svn: 281301

vector shuffles. They were removed from clang previously but accidentally left in the backend.
llvm-svn: 281300

This patch reverses the edge from DIGlobalVariable to GlobalVariable.
This will allow us to more easily preserve debug info metadata when
manipulating global variables.
Fixes PR30362. A program for upgrading test cases is attached to that
bug.
Differential Revision: http://reviews.llvm.org/D20147
llvm-svn: 281284

That confuses e.g. machine basic block placement, which then doesn't
realize that control can fall through a block that ends with a conditional
tail call. Instead, isBranch=1 should be set.
Also, mark EFLAGS as used by these instructions.
llvm-svn: 281281

llvm-svn: 281263

Summary: If consecutive select instructions are lowered separately in CGP, it
will introduce redundant condition checks and branches that cannot be removed
by later optimization phases. This patch lowers all consecutive select
instructions at the same time to avoid inefficient code, as demonstrated in
https://llvm.org/bugs/show_bug.cgi?id=29095.
Reviewers: davidxl
Subscribers: vsk, llvm-commits
Differential Revision: https://reviews.llvm.org/D24147
llvm-svn: 281252
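For illustration only (my own example, not taken from the patch), this is the kind of source pattern involved: consecutive selects that share a condition, where lowering each select to its own compare-and-branch duplicates the check.

  // Two consecutive selects on the same condition. Lowered separately, each
  // becomes its own conditional branch on `cond`; lowered as a group, one
  // branch can produce both values.
  int pickPair(bool cond, int a, int b, int c, int d) {
    int x = cond ? a : b; // select #1
    int y = cond ? c : d; // select #2, same condition
    return x + y;
  }
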
llvm-svn: 281246

llvm-svn: 281244

Reviewers: arsenm, tstellarAMD, mareko
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D22198
llvm-svn: 281230

This reverts commit r281213. It made a bot go bang: http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-full/builds/14625
llvm-svn: 281228

We're not supposed to have duplicate live-ins.
llvm-svn: 281224

r280832 added 32-bit support for emitting conditional tail-calls, but
dropped imp-used parameter registers. This went unnoticed until
r281113, which added 64-bit support, as this is only exposed with
parameter passing via registers.
Don't drop the imp-used parameters.
llvm-svn: 281223
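A sketch of what keeping the imp-used parameters means at the MachineInstr level (hypothetical helper, assuming the usual MachineInstrBuilder API): the argument registers get attached to the tail-call instruction as implicit uses so they are known to be live across it.

  #include "llvm/ADT/ArrayRef.h"
  #include "llvm/CodeGen/MachineInstrBuilder.h"
  using namespace llvm;

  // Attach each parameter register to the (conditional) tail-call instruction
  // as an implicit use, so liveness and later passes still see them.
  static void addParamRegsAsImplicitUses(MachineInstrBuilder &MIB,
                                         ArrayRef<unsigned> ParamRegs) {
    for (unsigned Reg : ParamRegs)
      MIB.addReg(Reg, RegState::Implicit);
  }
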
llvm-svn: 281218

For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more
efficient instruction selection if the bitmask is one consecutive sequence of
set bits (32 - clz(bm) - ctz(bm) == popcount(bm)):
1) If the bitmask touches the LSB, then we can remove all the upper bits and
   set the flags by doing one LSLS.
2) If the bitmask touches the MSB, then we can remove all the lower bits and
   set the flags with one LSRS.
3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit
   into the sign bit with one LSLS and change the condition query from NE/EQ to
   MI/PL (we could also implement this by shifting into the carry bit and
   branching on BCC/BCS).
4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower
   zero bits of the mask.
Cases 1-3 require only one 16-bit instruction and can elide the CMP. Case 4
requires two 16-bit instructions but can still elide the CMP and doesn't
require materializing a complex immediate, so it is also a win.
llvm-svn: 281215

If a constant is unnamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by moving the global's
storage into the constant pool. For example, instead of:

  ldr r0, .CPI0
  bl printf
  bx lr
.CPI0: &format_string
format_string: .asciz "hello, world!\n"

we can emit:

  adr r0, .CPI0
  bl printf
  bx lr
.CPI0: .asciz "hello, world!\n"

This can give significant code size savings when many small strings are used in
one function (4 bytes saved per string).
llvm-svn: 281213

Summary:
This test was not testing the intrinsics. A function like this:

  define %v4f32 @test_v4f32.floor(%v4f32 %a) {
    ...
    %1 = call %v4f32 @llvm.floor.v4f32(%v4f32 %a)
    ...
  }

is transformed into the following assembly:

  _test_v4f32.floor:      @ @test_v4f32.floor
    ...
    bl _floorf
    ...

In each function tested, there are two CHECK lines: one that checks for the
label and another for the intrinsic that should be used inside the function (in
our case, "floor"). However, although the first CHECK was matching the label,
the second was not matching the intrinsic, but rather the second "floor" in the
same line as the label. This is fixed by making the first CHECK match the
entire line.
Reviewers: jmolloy, rengolin
Subscribers: rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D24398
llvm-svn: 281211

llvm-svn: 281207

Unlike SDag, we use a separate G_GEP instruction (much simplified, only taking
a single byte offset) to preserve the pointer type information through
selection.
llvm-svn: 281205

Some generic instructions have multiple types. While in theory these can always
be discovered by inspecting the single definition of each generic vreg, in
practice those definitions won't always be local and traipsing through a big
function to find them will not be fun.
So this changes MIRPrinter to print out the type of uses as well as defs, if
they're known to be different or not known to be the same.
On the parsing side, we're a little more flexible: provided each register is
given a type in at least one place it's mentioned (and all types are
consistent) we accept the MIR. This doesn't introduce ambiguity but makes
writing tests manually a bit less painful.
llvm-svn: 281204