| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: arsenm, wdng, nhaehnle, llvm-commits
Differential Revision: https://reviews.llvm.org/D24405
llvm-svn: 281080
|
| |
|
|
|
|
| |
assembly as this is not the desired behaviour for end-users. Small change to a unit test to implement this without requiring the inline assembly.
llvm-svn: 281047
|
| |
|
|
|
|
| |
So model that directly in TTI::getIntImmCost().
llvm-svn: 281044
|
| |
|
|
|
|
| |
Tell TargetTransformInfo about this so ConstantHoisting is informed.
llvm-svn: 281043
|
| |
|
|
|
|
| |
Fixes issue with rL280927 identified by Mikael Holmén
llvm-svn: 281042
|
| |
|
|
|
|
| |
The CMPZ #0 disappears during peepholing, leaving just a tADDi3, tADDi8 or t2ADDri. This avoids having to materialize the expensive negative constant in Thumb-1, and allows a shrinking from a 32-bit CMN to a 16-bit ADDS in Thumb-2.
llvm-svn: 281040
|
| |
|
|
|
|
|
|
| |
These instructions were only necessary when type information was stored in the
MachineInstr (because only generic MachineInstrs possessed a type). Now that
it's in MachineRegisterInfo, COPY and PHI work fine.
llvm-svn: 281037
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want each register to have a canonical type, which means the best place to
store this is in MachineRegisterInfo rather than on every MachineInstr that
happens to use or define that register.
Most changes following from this are pretty simple (you need an MRI anyway if
you're going to be doing any transformations, so just check the type there).
But legalization doesn't really want to check redundant operands (when, for
example, a G_ADD only ever has one type) so I've made use of MCInstrDesc's
operand type field to encode these constraints and limit legalization's work.
As an added bonus, more validation is possible, both in MachineVerifier and
MachineIRBuilder (coming soon).
llvm-svn: 281035
|
| |
|
|
|
|
|
| |
This reverts commit r281022. Mips buildbot broke, due to unhandled register
class FCC.
llvm-svn: 281033
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Also removed duplicate code from AMDGPUTargetAsmStreamer.
This change only change how amd_kernel_code_t is parsed and printed. No variable names are changed.
Reviewers: vpykhtin, tstellarAMD
Subscribers: arsenm, wdng, nhaehnle
Differential Revision: https://reviews.llvm.org/D24296
llvm-svn: 281028
|
| |
|
|
|
|
|
|
| |
This avoids us doing a completely unneeded "cmp r0, #0" after a flag-setting instruction if we only care about the Z or C flags.
Add LSL/LSR to the whitelist while we're here and add testing. This code could really do with a spring clean.
llvm-svn: 281027
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As part of this effort, remove MipsFCmp nodes and use tablegen
patterns rather than custom lowering through C++.
Unexpectedly, this improves codesize for microMIPS as previous floating
point setcc expansions would materialize 0 and 1 into GPRs before using
the relevant mov[tf].[sd] instruction. Now $zero is used directly.
Reviewers: dsanders, vkalintiris, zoran.jovanovic
Differential Review: https://reviews.llvm.org/D23118
llvm-svn: 281022
|
| |
|
|
|
|
| |
processors added.
llvm-svn: 281021
|
| |
|
|
|
|
| |
commutable.
llvm-svn: 281013
|
| |
|
|
|
|
|
|
| |
show opportunities where we can commute to fold loads.
Commutes will be added in a followup commit.
llvm-svn: 281012
|
| |
|
|
|
|
|
| |
This adds more tests for shuffles where the output width does not match
the input width and/or the output is generated from more than two inputs.
llvm-svn: 281005
|
| |
|
|
|
|
|
|
|
|
| |
The REX prefix should be used on indirect jmps, but not direct ones.
For direct jumps, the unwinder looks at the offset to determine if
it's inside the current function.
Differential Revision: https://reviews.llvm.org/D24359
llvm-svn: 281003
|
| |
|
|
| |
llvm-svn: 280987
|
| |
|
|
|
|
| |
Recent change exposed this issue, breaking the Hexagon buildbots.
llvm-svn: 280973
|
| |
|
|
| |
llvm-svn: 280972
|
| |
|
|
| |
llvm-svn: 280970
|
| |
|
|
|
|
|
|
|
|
| |
And associated commits, as they broke the Thumb bots.
This reverts commit r280935.
This reverts commit r280891.
This reverts commit r280888.
llvm-svn: 280967
|
| |
|
|
|
|
| |
This bloats codesize - all of the non-leaf nodes are extra code.
llvm-svn: 280932
|
| |
|
|
|
|
|
| |
So model the cost of materializing the constant operand C as the minimum of
C and ~C.
llvm-svn: 280929
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Materializing something like "-3" can be done as 2 instructions:
MOV r0, #3
MVN r0, r0
This has a cost of 2, not 3. It looks like we were already trying to detect this pattern in TII::getIntImmCost(), but were taking the complement of the zero-extended value instead of the sign-extended value which is unlikely to ever produce a number < 256.
There were no tests failing after changing this... :/
llvm-svn: 280928
|
| |
|
|
|
|
|
|
|
|
|
|
| |
SimplifyDemandedBits
Add the ability to computeKnownBits and SimplifyDemandedBits to extract the known zero/one bits from BUILD_VECTOR, returning the known bits that are shared by every vector element.
This is an initial step towards determining the sign bits of a vector (PR29079).
Differential Revision: https://reviews.llvm.org/D24253
llvm-svn: 280927
|
| |
|
|
|
|
|
|
| |
Allow AND combines to use a vector splatted constant as well as a constant scalar.
Preliminary part of D24253.
llvm-svn: 280926
|
| |
|
|
|
|
|
|
|
|
| |
This reverts commit r280808.
It is possible that this change results in an infinite loop. This
is causing timeouts in some tests on ARM, and a Chromebook bot is
failing.
llvm-svn: 280918
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CGP tail-duplicates rets into blocks that end with a call that feed the ret.
This puts the call in tail position, potentially allowing the DAG builder to
lower it as a tail call. To avoid tail duplication in cases where we won't
form the tail call, CGP tried to predict whether this is going to be possible,
and avoids doing it when lowering as a tail call will definitely fail.
However, it was being too conservative by always throwing away calls to
functions with a signext/zeroext attribute on the return type.
Instead, we can use the same logic the builder uses to determine whether the
attributes work out.
Differential Revision: https://reviews.llvm.org/D24315
llvm-svn: 280894
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This is a port of XRay to ARM 32-bit, without Thumb support yet. The XRay instrumentation support is moving up to AsmPrinter.
This is one of 3 commits to different repositories of XRay ARM port. The other 2 are:
1. https://reviews.llvm.org/D23932 (Clang test)
2. https://reviews.llvm.org/D23933 (compiler-rt)
Differential Revision: https://reviews.llvm.org/D23931
llvm-svn: 280888
|
| |
|
|
|
|
|
|
|
|
|
| |
https://llvm.org/bugs/show_bug.cgi?id=29058.
While node legalization we tried to legalize its operands.
If an operand node is replaced during legalization the user node may be destroyed.
Differential Revision: https://reviews.llvm.org/D24244
llvm-svn: 280862
|
| |
|
|
|
|
|
|
| |
Shadow uses need to be analyzed together, since each individual shadow
will only have a partial reaching def. All shadows together may cover
a given register ref, while each individual shadow may not.
llvm-svn: 280855
|
| |
|
|
|
|
|
|
| |
meaningful, NFC.
Add PR number and comment in pr30298.ll to explain what is testing.
llvm-svn: 280843
|
| |
|
|
|
|
|
|
|
| |
The patch is to fix PR30298, which is caused by rL272694. The solution is to
bail out if the target has no SSE2.
Differential Revision: https://reviews.llvm.org/D24288
llvm-svn: 280837
|
| |
|
|
| |
llvm-svn: 280835
|
| |
|
|
|
|
|
| |
The original commit was too aggressive about marking LibCalls as AAPCS. The
libcalls contain libc/libm/libunwind calls which are not AAPCS, but C.
llvm-svn: 280833
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When branching to a block that immediately tail calls, it is possible to fold
the call directly into the branch if the call is direct and there is no stack
adjustment, saving one byte.
Example:
define void @f(i32 %x, i32 %y) {
entry:
%p = icmp eq i32 %x, %y
br i1 %p, label %bb1, label %bb2
bb1:
tail call void @foo()
ret void
bb2:
tail call void @bar()
ret void
}
before:
f:
movl 4(%esp), %eax
cmpl 8(%esp), %eax
jne .LBB0_2
jmp foo
.LBB0_2:
jmp bar
after:
f:
movl 4(%esp), %eax
cmpl 8(%esp), %eax
jne bar
.LBB0_1:
jmp foo
I don't expect any significant size savings from this (on a Clang bootstrap I
saw 288 bytes), but it does make the code a little tighter.
This patch only does 32-bit, but 64-bit would work similarly.
Differential Revision: https://reviews.llvm.org/D24108
llvm-svn: 280832
|
| |
|
|
|
|
|
|
| |
OpenCL kernels have hidden kernel arguments for global offset and printf buffer. For consistency, these hidden argument should be included in the runtime metadata. Also updated kernel argument kind metadata.
Differential Revision: https://reviews.llvm.org/D23424
llvm-svn: 280829
|
| |
|
|
| |
llvm-svn: 280816
|
| |
|
|
|
|
| |
Part of the yak shaving for D24253
llvm-svn: 280813
|
| |
|
|
|
|
| |
Part of the yak shaving for D24253
llvm-svn: 280810
|
| |
|
|
|
|
|
|
| |
(and (or x, C), D) -> D if (C & D) == D
Part of the yak shaving for D24253
llvm-svn: 280809
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This saves a library call to __aeabi_uidivmod. However, the
processor must feature hardware division in order to benefit from
the transformation.
Reviewers: scott-0, jmolloy, compnerd, rengolin
Subscribers: t.p.northover, compnerd, aemerson, rengolin, samparker, llvm-commits
Differential Revision: https://reviews.llvm.org/D24133
llvm-svn: 280808
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The o32 ABI doesn't not support the TImode helpers. For the time being,
disable just the shift libcalls as they break recursive builds on MIPS.
Reviewers: sdardis
Subscribers: llvm-commits, sdardis
Differential Revision: https://reviews.llvm.org/D24259
llvm-svn: 280798
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When folding an addi into a memory access that can take an immediate offset, we
were implicitly assuming that the existing offset was zero. This was incorrect.
If we're dealing with an addi with a plain constant, we can add it to the
existing offset (assuming that doesn't overflow the immediate, etc.), but if we
have anything else (i.e. something that will become a relocation expression),
we'll go back to requiring the existing immediate offset to be zero (because we
don't know what the requirements on that relocation expression might be - e.g.
maybe it is paired with some addis in some relevant way).
On the other hand, when dealing with a plain addi with a regular constant
immediate, the alignment restrictions (from the TOC base pointer, etc.) are
irrelevant.
I've added the test case from PR30280, which demonstrated the bug, but also
demonstrates a missed optimization opportunity (i.e. we don't need the memory
accesses at all).
Fixes PR30280.
llvm-svn: 280789
|
| |
|
|
|
|
|
|
|
|
|
| |
The previous commit (r280368 - https://reviews.llvm.org/D23313) does not cover AVX-512F, KNL set.
FNEG(x) operation is lowered to (bitcast (vpxor (bitcast x), (bitcast constfp(0x80000000))).
It happens because FP XOR is not supported for 512-bit data types on KNL and we use integer XOR instead.
I added pattern match for integer XOR.
Differential Revision: https://reviews.llvm.org/D24221
llvm-svn: 280785
|
| |
|
|
|
|
| |
findCommutedOpIndices. The default implementation doesn't skip the mask input or the preserved input.
llvm-svn: 280781
|
| |
|
|
|
|
|
| |
This reverts SVN r280683. Revert until I figure out why this is breaking lli
tests.
llvm-svn: 280778
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I might have called this "r246507, the sequel". It fixes the same issue, as the
issue has cropped up in a few more places. The underlying problem is that
isSetCCEquivalent can pick up select_cc nodes with a result type that is not
legal for a setcc node to have, and if we use that type to create new setcc
nodes, nothing fixes that (and so we've violated the contract that the
infrastructure has with the backend regarding setcc node types).
Fixes PR30276.
For convenience, here's the commit message from r246507, which explains the
problem is greater detail:
[DAGCombine] Fixup SETCC legality checking
SETCC is one of those special node types for which operation actions (legality,
etc.) is keyed off of an operand type, not the node's value type. This makes
sense because the value type of a legal SETCC node is determined by its
operands' value type (via the TLI function getSetCCResultType). When the
SDAGBuilder creates SETCC nodes, it either creates them with an MVT::i1 value
type, or directly with the value type provided by TLI.getSetCCResultType.
The first problem being fixed here is that DAGCombine had several places
querying TLI.isOperationLegal on SETCC, but providing the return of
getSetCCResultType, instead of the operand type directly. This does not mean
what the author thought, and "luckily", most in-tree targets have SETCC with
Custom lowering, instead of marking them Legal, so these checks return false
anyway.
The second problem being fixed here is that two of the DAGCombines could create
SETCC nodes with arbitrary (integer) value types; specifically, those that
would simplify:
(setcc a, b, op1) and|or (setcc a, b, op2) -> setcc a, b, op3
(which is possible for some combinations of (op1, op2))
If the operands of the and|or node are actual setcc nodes, then this is not an
issue (because the and|or must share the same type), but, the relevant code in
DAGCombiner::visitANDLike and DAGCombiner::visitORLike actually calls
DAGCombiner::isSetCCEquivalent on each operand, and that function will
recognise setcc-like select_cc nodes with other return types. And, thus, when
creating new SETCC nodes, we need to be careful to respect the value-type
constraint. This is even true before type legalization, because it is quite
possible for the SELECT_CC node to have a legal type that does not happen to
match the corresponding TLI.getSetCCResultType type.
To be explicit, there is nothing that later fixes the value types of SETCC
nodes (if the type is legal, but does not happen to match
TLI.getSetCCResultType). Creating SETCCs with an MVT::i1 value type seems to
work only because, either MVT::i1 is not legal, or it is what
TLI.getSetCCResultType returns if it is legal. Fixing that is a larger change,
however. For the time being, restrict the relevant transformations to produce
only SETCC nodes with a value type matching TLI.getSetCCResultType (or MVT::i1
prior to type legalization).
Fixes PR24636.
llvm-svn: 280767
|
| |
|
|
|
|
| |
- Add missing test
llvm-svn: 280749
|