| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 216781
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If a variadic function body contains a musttail call, then we copy all
of the remaining register parameters into virtual registers in the
function prologue. We track the virtual registers through the function
body, and add them as additional registers to pass to the call. Because
this is all done in virtual registers, the register allocator usually
gives us good code. If the function does a call, however, it will have
to spill and reload all argument registers (ew).
Forwarding regparms on x86_32 is not implemented because most compilers
don't support varargs in 32-bit with regparms.
Reviewers: majnemer
Subscribers: aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D5060
llvm-svn: 216780
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've rejected these kinds of functions since r28405 in 2006 because
it's impossible to lower the return of a callee cleanup varargs
function. However there are lots of legal ways to leave such a function
without returning, such as aborting. Today we can leave a function with
a musttail call to another function with the correct prototype, and
everything works out.
I'm removing the verifier check declaring that a normal return from such
a function is UB.
Reviewed By: nlewycky
Differential Revision: http://reviews.llvm.org/D5059
llvm-svn: 216779
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
values
This patch checks for DAG patterns that are an add or a sub followed by a
compare on 16 and 8 bit inputs. Since AArch64 does not support those types
natively they are legalized into 32 bit values, which means that mask operations
are inserted into the DAG to emulate overflow behaviour. In many cases those
masks do not change the result of the processing and just introduce a dependent
operation, often in the middle of a hot loop.
This patch detects the relevent DAG patterns and then tests to see if the
transforms are equivalent with and without the mask, removing the mask if
possible. The exact mechanism of this patch was discusses in
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-July/074444.html
There is a reasonably good chance there are missed oppurtunities due to similiar
(but not identical) DAG patterns that could be funneled into this test, adding
them should be simple if we see test cases.
Tests included.
rdar://13754426
llvm-svn: 216776
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new solution is to not use this lowering if there are any dynamic
allocas in the current function. We know up front if there are dynamic
allocas, but we don't know if we'll need to create stack temporaries
with large alignment during lowering. Conservatively assume that we will
need such temporaries.
Reviewed By: hans
Differential Revision: http://reviews.llvm.org/D5128
llvm-svn: 216775
|
| |
|
|
|
|
|
|
| |
Even loads/stores that have a stronger ordering than monotonic can be safe.
The rule is no release-acquire pair on the path from the QueryInst, assuming that
the QueryInst is not atomic itself.
llvm-svn: 216771
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Mostly renaming the (not very explicit) variables Tmp0, .. Tmp4, and grouping
related statements together, along with a few lines of comments for the
surprising parts.
No functional change intended.
Test Plan: make check-all
Reviewers: jfb
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5088
llvm-svn: 216768
|
| |
|
|
|
|
| |
Followup to r215086.
llvm-svn: 216757
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we select a trunc instruction we don't emit any code if the type is already
i32 or smaller. This is because the instruction that uses the truncated value
will deal with it.
This behavior can incorrectly transfer a kill flag, which was meant for the
result of the truncate, onto the source register.
%2 = trunc i32 %1 to i16
... = ... %2 -> ... = ... vreg1 <kill>
... = ... %1 ... = ... vreg1
This commit fixes this by emitting a COPY instruction, so that the result and
source register are distinct virtual registers.
This fixes rdar://problem/18178188.
llvm-svn: 216750
|
| |
|
|
| |
llvm-svn: 216741
|
| |
|
|
| |
llvm-svn: 216736
|
| |
|
|
|
|
|
| |
We can use a negate source modifier to match
this for fsub.
llvm-svn: 216735
|
| |
|
|
|
|
|
| |
A bug in r216725 meant we tried to discover the type of a SETCC before
confirming the node actually was a SETCC.
llvm-svn: 216734
|
| |
|
|
| |
llvm-svn: 216732
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Instead of specifying the alignment as metadata which may be destroyed by
transformation passes, make the alignment the second argument to ldu/ldg
intrinsic calls.
Test Plan:
ldu-ldg.ll
ldu-i8.ll
ldu-reg-plus-offset.ll
Reviewers: eliben, meheff, jholewinski
Reviewed By: meheff, jholewinski
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D5093
llvm-svn: 216731
|
| |
|
|
|
|
|
|
|
|
|
| |
In an llvm-stress generated test, we were trying to create a v0iN type and
asserting when that failed. This case could probably be handled by the
function, but not without added complexity and the situation it arises in is
sufficiently odd that there's probably no benefit anyway.
Should fix PR20775.
llvm-svn: 216725
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
accumulator register
and forget about the previously used accumulator.
Coming up with a simple testcase is not easy, as this highly depends on
what the register allocator is doing: this issue showed up while working
with the PBQP allocator, which produced a different allocation scheme.
A testcase would need to come up with chain starting in D[0-7], then
moving to D[8-15], followed by a call to a function whose regmask
clobbers the starting accumulator in D[0-7], then another use of the chain.
Fixed some formatting, added some invariant checks while there.
llvm-svn: 216721
|
| |
|
|
| |
llvm-svn: 216719
|
| |
|
|
|
|
|
|
|
|
| |
Added new types to Legalizer.
Fixed getSetCCResultType function
Added lowering tests.
Reviewed by Elena Demikhovsky.
llvm-svn: 216717
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The code in SelectionDAG::getMemset for some reason assumes the value passed to
memset is an i32. This breaks the generated code for targets that only have
registers smaller than 32 bits because the value might get split into multiple
registers by the calling convention. See the test for the MSP430 target included
in the patch for an example.
This patch ensures that nothing is assumed about the type of the value. Instead,
the type is taken from the selected overload of the llvm.memset intrinsic.
llvm-svn: 216716
|
| |
|
|
|
|
|
| |
1) Add some missing bitcast patterns for v8f16.
2) Add type promotion for operand of ld/st operations.
llvm-svn: 216706
|
| |
|
|
|
|
| |
Reviewed by: Chandlerc
llvm-svn: 216704
|
| |
|
|
|
|
| |
Code reviewed by Chandlerc
llvm-svn: 216703
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
block.
This fix checks first if the instruction to be folded (e.g. sign-/zero-extend,
or shift) is in the same machine basic block as the instruction we are folding
into.
Not doing so can result in incorrect code, because the value might not be
live-out of the basic block, where the value is defined.
This fixes rdar://problem/18169495.
llvm-svn: 216700
|
| |
|
|
|
|
|
| |
This reverts commit r216523 and r216598; people have reported
regressions.
llvm-svn: 216698
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Don't promote byval pointer arguments when when their size in bits is
not equal to their alloc size in bits. This can happen for x86_fp80,
where the size in bits is 80 but the alloca size in bits in 128.
Promoting these types can break passing unions of x86_fp80s and other
types.
Patch by Thomas Jablin!
Reviewed By: rnk
Differential Revision: http://reviews.llvm.org/D5057
llvm-svn: 216693
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The AArch64 target lowering for [zs]ext of vectors is set up to handle
input simple types and expects the generic SDag path to do something reasonable
with anything that's not a simple type. The code, however, was only
checking that the result type was a simple type and assuming that
implied that the source type would also be a simple type. That's not a
valid assumption, as operations like "zext <1 x i1> %0 to <1 x i32>"
demonstrate. The fix is to simply explicitly validate the source type
as well as the result type.
PR20791
llvm-svn: 216689
|
| |
|
|
|
|
| |
can be refactored. NFC.
llvm-svn: 216688
|
| |
|
|
|
|
|
|
|
|
|
| |
On MachO, putting a symbol that doesn't start with a 'L' or 'l' in one of the
__TEXT,__literal* sections prevents the linker from merging the context of the
section.
Since private GVs are the ones the get mangled to start with 'L' or 'l', we now
only put those on the __TEXT,__literal* sections.
llvm-svn: 216682
|
| |
|
|
| |
llvm-svn: 216681
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove a block of code from LowerSIGN_EXTEND_INREG() that was added with:
http://llvm.org/viewvc/llvm-project?view=revision&revision=177421
And caused:
http://llvm.org/bugs/show_bug.cgi?id=20472 (more analysis here)
http://llvm.org/bugs/show_bug.cgi?id=18054
The testcases confirm that we (1) don't remove a zext op that is necessary and (2) generate
a pmovz instead of punpck if SSE4.1 is available. Although pmovz is 1 byte longer, it allows
folding of the load, and so saves 3 bytes overall.
Differential Revision: http://reviews.llvm.org/D4909
llvm-svn: 216679
|
| |
|
|
|
|
|
|
|
| |
SHUFFLE_VECTOR
was marked custom. The target independent DAG combine has no way to know if
the shuffles it is introducing are ones that the target could support or not.
llvm-svn: 216678
|
| |
|
|
|
|
| |
beginning of the comment."
llvm-svn: 216674
|
| |
|
|
|
|
| |
functional change.
llvm-svn: 216673
|
| |
|
|
|
|
|
| |
Completes what was started in r216611 and r216623.
Used const refs instead of pointers; not sure if one is preferable to the other.
llvm-svn: 216672
|
| |
|
|
|
|
|
|
| |
Reviewers: adasgupt, jverma, sidneym
Differential Revision: http://reviews.llvm.org/D5025
llvm-svn: 216667
|
| |
|
|
| |
llvm-svn: 216666
|
| |
|
|
| |
llvm-svn: 216660
|
| |
|
|
|
|
|
|
| |
InstSimplify already handles icmp (X+Y), X (and things like it)
appropriately. The first thing that InstCombine does is run
InstSimplify on the instruction.
llvm-svn: 216659
|
| |
|
|
|
|
|
|
|
|
| |
vector instruction.
For a detailed description of the problem see the comment in the test file.
The problematic moveBefore() calls are not required anymore because the new
scheduling algorithm ensures a correct ordering anyway.
llvm-svn: 216656
|
| |
|
|
| |
llvm-svn: 216651
|
| |
|
|
| |
llvm-svn: 216650
|
| |
|
|
|
|
| |
More work on http://llvm.org/PR20640
llvm-svn: 216648
|
| |
|
|
|
|
|
| |
I've decided not to commit a test, it takes 2.5 seconds to run on my an
incredibly strong machine.
llvm-svn: 216647
|
| |
|
|
|
|
| |
clang-format, no functionality changed.
llvm-svn: 216646
|
| |
|
|
|
|
|
|
|
|
|
| |
a single early exit.
And factor the subsequent cast<> from all but one block into a single
variable.
No functionality changed.
llvm-svn: 216645
|
| |
|
|
|
|
|
|
|
|
| |
functionality changed.
Separating this into two functions wasn't helping. There was a decent
amount of boilerplate duplicated, and some subsequent refactorings here
will pull even more common code out.
llvm-svn: 216644
|
| |
|
|
|
|
|
|
| |
Several combines involving icmp (shl C2, %X) C1 can be simplified
without introducing any new instructions. Move them to InstSimplify;
while we are at it, make them more powerful.
llvm-svn: 216642
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The included test case would fail, because the MI PHI node would have two
operands from the same predecessor.
This problem occurs when a switch instruction couldn't be selected. This happens
always, because there is no default switch support for FastISel to begin with.
The problem was that FastISel would first add the operand to the PHI nodes and
then fall-back to SelectionDAG, which would then in turn add the same operands
to the PHI nodes again.
This fix removes these duplicate PHI node operands by reseting the
PHINodesToUpdate to its original state before FastISel tried to select the
instruction.
This fixes <rdar://problem/18155224>.
llvm-svn: 216640
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently instructions are folded very aggressively for AArch64 into the memory
operation, which can lead to the use of killed operands:
%vreg1<def> = ADDXri %vreg0<kill>, 2
%vreg2<def> = LDRBBui %vreg0, 2
... = ... %vreg1 ...
This usually happens when the result is also used by another non-memory
instruction in the same basic block, or any instruction in another basic block.
This fix teaches hasTrivialKill to not only check the LLVM IR that the value has
a single use, but also to check if the register that represents that value has
already been used. This can happen when the instruction with the use was folded
into another instruction (in this particular case a load instruction).
This fixes rdar://problem/18142857.
llvm-svn: 216634
|