| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D22049
llvm-svn: 274852
|
| |
|
|
|
|
|
|
|
|
| |
Support for the macro fusion of simple ALU ops with branches for the Vulcan sub-target.
Patch by Meador Inge <meadori@gmail.com>
Differential Revision: http://reviews.llvm.org/D22042
llvm-svn: 274837
|
| |
|
|
|
|
| |
Until we have a better way to extract constants through bitcasted build vectors (and how to handle undefs of partial lanes etc.) at least accept build vectors that are all zeroes.
llvm-svn: 274833
|
| |
|
|
|
|
|
|
| |
intrinsics.
I'm not sure if clang ever used these builtin names or not.
llvm-svn: 274827
|
| |
|
|
| |
llvm-svn: 274818
|
| |
|
|
|
|
|
| |
Also this will be more precise since it will check
exec_lo/exec_hi writes.
llvm-svn: 274817
|
| |
|
|
|
|
|
|
|
|
| |
Windows on ARM uses a pure thumb-2 environment. This means that it can select a
high register when doing a __builtin_longjmp. We would use a tLDRi which would
truncate the register to a low register. Use a t2LDRi12 to get the full
register file access. Tweak the code to just load into PC, as that is an
interworking branch on all supported cores anyways.
llvm-svn: 274815
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
* Similiar to the ARM backend yse the peephole optimizer to generate more conditional ALU operations;
* Add predicated type with default always true to RR instructions in LanaiInstrInfo.td;
* Move LanaiSetflagAluCombiner into optimizeCompare;
* The ASM parser can currently only handle explicitly specified CC, so specify ".t" (true) where needed in the ASM test;
* Remove unused MachineOperand flags;
Reviewers: eliben
Subscribers: aemerson
Differential Revision: http://reviews.llvm.org/D22072
llvm-svn: 274807
|
| |
|
|
|
|
|
|
|
|
|
| |
xorl + setcc is generally the preferred sequence due to the partial register
stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller.
This fixes PR28146.
The original commit tried inserting an 8bit-subreg into a GR32 (not GR32_ABCD)
which was not appreciated by fast regalloc on 32-bit.
llvm-svn: 274802
|
| |
|
|
|
|
|
|
|
|
| |
The commit reinstates r273279, which was informally approved.
Original Review: http://reviews.llvm.org/D21414
This reverts commit ca632c91aaa7cafc50942f890c49f727a046ace1.
llvm-svn: 274790
|
| |
|
|
| |
llvm-svn: 274771
|
| |
|
|
|
|
|
|
|
|
|
|
| |
- Rename the ptx.read.* intrinsics to nvvm.read.ptx.sreg.* - some but
not all of these registers were already accessible via the nvvm
name.
- Rename ptx.bar.sync nvvm.bar.sync, to match nvvm.bar0.
There's a fair amount of code motion here, but it's all very
mechanical.
llvm-svn: 274769
|
| |
|
|
|
|
|
|
| |
alignment"
This reverts commit r273279 as the change was not properly approved.
llvm-svn: 274768
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
A regression showed up in node.js when handling conditional calls.
Fix the regression by recognizing external symbols as a possible
operand type in CallJG.
Reviewers: koriakin
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D22054
llvm-svn: 274761
|
| |
|
|
|
|
| |
Differential revision: http://reviews.llvm.org/D22041
llvm-svn: 274756
|
| |
|
|
|
|
| |
Fixes pr28452.
llvm-svn: 274754
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow-up for r273544.
The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods.
This commit also removes a command line flag that isn't used in any of the tests:
check-vmlx-hazards. It can be replaced easily with the mattr mechanism, since
this is now a subtarget feature.
There is still some work left regarding FeatureExpandMLx. In the past MLx
expansion was enabled for subtargets with hasVFP2(), until r129775 [1] switched
from that to isCortexA9, without too much justification.
In spite of that, the code performing MLx expansion still contains calls to
isSwift/isLikeA9, although the results of those are pretty clear given that
we're only enabling it for the A9.
We should try to enable it for all targets that have FeatureHasVMLxHazards, as
it seems to be closely related to that behaviour, and if that is possible try to
clean up the MLx expansion pass from all calls to isWhatever. This will require
some performance testing, so it will be done in another patch.
[1] http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20110418/119725.html
Differential Revision: http://reviews.llvm.org/D21798
llvm-svn: 274742
|
| |
|
|
|
|
| |
ourselves via a call through the DAG.
llvm-svn: 274721
|
| |
|
|
| |
llvm-svn: 274720
|
| |
|
|
| |
llvm-svn: 274717
|
| |
|
|
| |
llvm-svn: 274716
|
| |
|
|
| |
llvm-svn: 274715
|
| |
|
|
| |
llvm-svn: 274714
|
| |
|
|
|
|
| |
called by it.
llvm-svn: 274711
|
| |
|
|
|
|
| |
and remove the argument.
llvm-svn: 274710
|
| |
|
|
| |
llvm-svn: 274709
|
| |
|
|
| |
llvm-svn: 274704
|
| |
|
|
| |
llvm-svn: 274702
|
| |
|
|
|
|
|
|
|
|
|
| |
xorl + setcc is generally the preferred sequence due to the partial register
stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller.
This fixes PR28146.
Differential Revision: http://reviews.llvm.org/D21774
llvm-svn: 274692
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On CPUs with the zero cycle zeroing feature enabled "movi v.2d" should
be used to zero a vector register. This was previously done at
instruction selection time, however the register coalescer sometimes
widened multiple vregs to the Q width because of that leading to extra
spills. This patch leaves the decision on how to zero a register to the
AsmPrinter phase where it doesn't affect register allocation anymore.
This patch also sets isAsCheapAsAMove=1 on FMOVS0, FMOVD0.
This fixes http://llvm.org/PR27454, rdar://25866262
Differential Revision: http://reviews.llvm.org/D21826
llvm-svn: 274686
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
findScratchNonCalleeSaveRegister() just needs a simple liveness
analysis, use LivePhysRegs for that as it is simpler and does not depend
on the kill flags.
This commit adds a convenience function available() to LivePhysRegs:
This function returns true if the given register is not reserved and
neither the register nor any of its aliases are alive.
Differential Revision: http://reviews.llvm.org/D21865
llvm-svn: 274685
|
| |
|
|
|
|
| |
This adds it only for movl mov@GOT(%reg), %reg.
llvm-svn: 274678
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: tra
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D22068
llvm-svn: 274674
|
| |
|
|
|
|
|
| |
Everywhere where cuda.syncthreads or __syncthreads is used, use the
properly namespaced nvvm.barrier0 instead.
llvm-svn: 274664
|
| |
|
|
|
|
|
| |
The intrinsics here use nvvm, but the builtins and tablegen variable
names were using ptx. Stick to the modern names here.
llvm-svn: 274662
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This is "cvtdq2ps" which does not appear to be particularly slow on any CPU
according to Agner's tables. Choosing "5" as a cost here as suggested in:
https://llvm.org/bugs/show_bug.cgi?id=21356
...but it seems very conservative given that the instruction is fully pipelined,
and I think these costs are supposed to model throughput.
Note that related costs are also most likely too high, but this fixes PR21356
and partly fixes PR28434.
llvm-svn: 274658
|
| |
|
|
|
|
|
| |
Cast cost tables are now sorted, for each cast type, lexicographically on
[source base type, source vector width, dest base type, base vector width].
llvm-svn: 274653
|
| |
|
|
|
|
|
|
|
|
| |
On SystemZ, shift and rotate instructions only use the bottom 6 bits of the shift/rotate amount.
Therefore, if the amount is ANDed with an immediate mask that has all of the bottom 6 bits set, we
can remove the AND operation entirely.
Differential Revision: http://reviews.llvm.org/D21854
llvm-svn: 274650
|
| |
|
|
|
|
|
|
| |
We were checking for 2 insertions (which is caught earlier in the pattern matching loop) instead of the case where we have no insertions.
Turns out this code never fires as we always try to lower to insertps after trying to lower to blendps, which would catch these cases - I'm about to make some changes to support combining to insertps which could cause this to fire so I don't want to remove it.
llvm-svn: 274648
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a problem in VSXSwapRemoval where it is incorrectly removing permute instructions.
In this case, the permute is feeding both a vector store and also a non-store instruction. In this case, the permute cannot be removed.
The fix is to simply look at all the uses of the vector register defined by the permute and ensure that all the uses are vector store instructions.
This problem was reported in PR 27735 (https://llvm.org/bugs/show_bug.cgi?id=27735).
Test case based on the original problem reported.
Phabricator Review: http://reviews.llvm.org/D21802
llvm-svn: 274645
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The cost model should not assume vector casts get completely scalarized, since
on targets that have vector support, the common case is a partial split up to
the legal vector size. So, when a vector cast gets split, the resulting casts
end up legal and cheap.
Instead of pessimistically assuming scalarization, base TTI can use the costs
the concrete TTI provides for the split vector, plus a fudge factor to account
for the cost of the split itself. This fudge factor is currently 1 by default,
except on AMDGPU where inserts and extracts are considered free.
Differential Revision: http://reviews.llvm.org/D21251
llvm-svn: 274642
|
| |
|
|
| |
llvm-svn: 274636
|
| |
|
|
|
|
|
| |
The prev commit failed on compilation.
A minor change in one pattern in lib/Target/X86/X86InstrAVX512.td fixes the failure.
llvm-svn: 274626
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow-up for r273544.
The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods.
This commit also removes two command-line flags that weren't used in any of the
tests: widen-vmovs and swift-partial-update-clearance. The former may be easily
replaced with the mattr mechanism, but the latter may not (as it is a subtarget
property, and not a proper feature).
Differential Revision: http://reviews.llvm.org/D21797
llvm-svn: 274620
|
| |
|
|
|
|
|
|
|
|
|
| |
This is a follow-up for r273544 and r273853.
The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods.
This commit also marks them as obsolete.
Differential Revision: http://reviews.llvm.org/D21796
llvm-svn: 274616
|
| |
|
|
| |
llvm-svn: 274615
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch removes redundant kmov instructions (not all, we still have a lot of work here) and redundant "and" instructions after "setcc".
I use "AssertZero" marker between X86ISD::SETCC node and "truncate" to eliminate extra "and $1" instruction.
I also changed zext, aext and trunc patterns in the .td file. It allows to remove extra "kmov" instruictions.
This patch fixes https://llvm.org/bugs/show_bug.cgi?id=28173.
Fast ISEL mode is not supported correctly for AVX-512. ICMP/FCMP scalar instruction should return result in k-reg. It will be fixed in one of the next patches. I redirected handling of "cmp" to the DAG builder mode. (The code looks worse in one specific test case, but without this fix the new patch fails).
Differential revision: http://reviews.llvm.org/D21956
llvm-svn: 274613
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Since "AMDGPU: Fix verifier errors in SILowerControlFlow", the logic that
ensures that a non-void-returning shader falls off the end of the last
basic block was effectively disabled, since SI_RETURN is now used.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96731
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: http://reviews.llvm.org/D21975
llvm-svn: 274612
|
| |
|
|
|
|
|
|
|
| |
I think the Ops filled out by Regex::match contain pointers into the temporary
std::string returned by StringRef::upper. Its lifetime is extended by the call
to match, but only until the end of that call (not to the uses of Ops later
on).
llvm-svn: 274586
|
| |
|
|
| |
llvm-svn: 274583
|