| Commit message (Collapse) | Author | Age | Files | Lines |
|
Reviewers: arsenm
Differential Revision: http://reviews.llvm.org/D22025
llvm-svn: 297243
|
object that knows how to generate it.
Summary:
This will allow future patches to inspect the details of the LLT. The implementation is now split between
the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns.
Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem.
The problem with the previous commit appears to have been that TableGen was including CodeGen/LowLevelType.h instead of Support/LowLevelTypeImpl.h.
Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar
Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30046
llvm-svn: 297241
|
llvm-svn: 297240
|
llvm-svn: 297239
|
Differential Revision: https://reviews.llvm.org/D30672
llvm-svn: 297198
|
Broadcom Vulcan is now Cavium ThunderX2T99.
LLVM Bugzilla: http://bugs.llvm.org/show_bug.cgi?id=32113
Minor fixes for the alignments of loops and functions for
ThunderX T81/T83/T88 (better performance).
Patch was tested with SpecCPU2006.
Patch by Stefan Teleman
Differential Revision: https://reviews.llvm.org/D30510
llvm-svn: 297190
|
More module problems. This time it only showed up in the stage 2 compile of
clang-x86_64-linux-selfhost-modules-2 but not the stage 1 compile.
Somehow, this change causes the build to need Attributes.gen before it's been
generated.
llvm-svn: 297188
|
Summary:
Loop alignment can significantly change the
performance of short loops.
To be able to evaluate the impact of loop alignment this change
introduces the new option x86-experimental-pref-loop-alignment.
The alignment will be 2^Value bytes, the default value is 4.
Patch by Serguei Katkov!
Reviewers: craig.topper
Reviewed By: craig.topper
Subscribers: sanjoy, llvm-commits
Differential Revision: https://reviews.llvm.org/D30391
llvm-svn: 297178
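The "2^Value bytes" rule above can be sketched as follows. This is only an illustration: `alignmentInBytes` and `alignUp` are hypothetical helper names, not functions from the patch, and the address used in the usage note is made up.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts the log2 value given to the option into an
// alignment in bytes (the default value 4 gives 16 bytes).
constexpr std::uint64_t alignmentInBytes(unsigned Log2Value) {
  return std::uint64_t{1} << Log2Value;
}

// Hypothetical helper: rounds an address up to the next multiple of a
// power-of-two alignment.
constexpr std::uint64_t alignUp(std::uint64_t Addr, std::uint64_t Align) {
  return (Addr + Align - 1) & ~(Align - 1);
}
```

With the default value of 4, a loop header that would otherwise start at 0x1003 is padded to 0x1010 in this model.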
|
knows how to generate it.
Summary:
This will allow future patches to inspect the details of the LLT. The implementation is now split between
the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns.
Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem.
Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar
Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30046
llvm-svn: 297177
|
The check for LSL #0 in an IT block was checking if operand 4 was zero, but
operand 4 is the condition code operand so it was actually checking for LSLEQ.
Fix this by checking operand 3, which really is the immediate operand, and add
some tests.
Differential Revision: https://reviews.llvm.org/D30692
llvm-svn: 297142
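A minimal sketch of the off-by-one operand index described above. The operand layout, the `OperandIndex` names and the condition-code encoding are assumptions chosen for illustration; they are not copied from the ARM backend.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical flattened operand list for a Thumb "LSL Rd, Rm, #imm"
// instruction. The slot order is an assumption made for this sketch.
enum OperandIndex { Rd = 0, CcOut = 1, Rm = 2, ShiftImm = 3, CondCode = 4 };

constexpr std::int64_t CondEQ = 0; // assumed encoding: EQ is 0
constexpr std::int64_t CondNE = 1; // assumed encoding: NE is 1

using Operands = std::array<std::int64_t, 5>;

// Buggy form: reads the condition-code slot, so it fires for every LSLEQ
// and never looks at the shift amount.
bool isLslZeroBuggy(const Operands &Ops) { return Ops[CondCode] == 0; }

// Fixed form: reads the immediate slot.
bool isLslZeroFixed(const Operands &Ops) { return Ops[ShiftImm] == 0; }
```

In this model an `LSLEQ` with shift amount 2 is wrongly reported as `LSL #0` by the buggy check, while an `LSLNE` with shift amount 0 is missed.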
|
llvm-svn: 297141
|
other""
The original patch r296865 was reverted because it broke the chromium builds
for Android (https://bugs.llvm.org/show_bug.cgi?id=32134). This patch reapplies
r296865 with a fix to make sure it doesn't cause the build regression.
The problem was that intrinsic selection on int_arm_get_fpscr was failing in
ISel. The code to manually select this intrinsic still treated it as the
version with no side-effects (INTRINSIC_WO_CHAIN), which is wrong because it
doesn't match the definition in the tablegen code, which says it does have
side-effects. I've fixed this by updating the intrinsic type to
INTRINSIC_W_CHAIN (has side-effects). I've also added a test for this based on
Hans's original reproducer.
Differential Revision: https://reviews.llvm.org/D30645
llvm-svn: 297137
|
Since the BB-vectorizer can produce vectors of, for example, 3 elements,
this check is needed.
Review: Ulrich Weigand
llvm-svn: 297136
|
isn't live.
Summary: Previously, it had always been materialized as a push/pop sequence.
Reviewers: labrinea, jroelofs
Reviewed By: jroelofs
Subscribers: llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D30648
llvm-svn: 297134
|
compressing tables.
The X86EvexToVex machine instruction pass compresses EVEX-encoded instructions by replacing them with their identical VEX-encoded instructions when possible.
It uses two large, manually maintained tables that map EVEX instructions to their identical VEX forms.
This TableGen backend replaces the manual tables by generating them automatically.
Differential Revision: https://reviews.llvm.org/D30451
llvm-svn: 297127
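As a rough picture of what such a mapping table looks like to its consumer, here is a sketch that assumes a table sorted by EVEX opcode and searched with a binary search. The struct, the opcode numbers and `lookupVexOpcode` are all hypothetical. The commit message does not describe the layout of the generated tables or how the pass searches them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>

// Hypothetical table entry: one EVEX opcode and its identical VEX opcode.
struct EvexToVexEntry {
  std::uint16_t EvexOpcode;
  std::uint16_t VexOpcode;
};

// Made-up opcode numbers, kept sorted by EvexOpcode so that a binary search
// works. A generated table could guarantee this ordering automatically.
static const EvexToVexEntry Table[] = {{100, 10}, {205, 27}, {310, 42}};

// Returns the VEX opcode, or 0 when the EVEX opcode has no identical form.
std::uint16_t lookupVexOpcode(std::uint16_t EvexOpcode) {
  const EvexToVexEntry *End = std::end(Table);
  const EvexToVexEntry *It = std::lower_bound(
      std::begin(Table), End, EvexOpcode,
      [](const EvexToVexEntry &E, std::uint16_t Key) {
        return E.EvexOpcode < Key;
      });
  return (It != End && It->EvexOpcode == EvexOpcode) ? It->VexOpcode : 0;
}
```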
|
The evex2vex pass defines two tables which map EVEX instructions to their identical VEX forms when possible. This adds all missing entries.
Differential Revision: https://reviews.llvm.org/D30501
llvm-svn: 297126
|
stack frame size"
This reverts commit r296771.
We found some widespread test failures internally. I'm working on a
testcase. Politely revert the patch in the meantime. :)
llvm-svn: 297124
|
It breaks line tables because the patch is not complete; a complete one is being worked on at the moment.
This reverts commit r294031.
llvm-svn: 297118
|
A bit more painful than G_INSERT because it was more widely used, but this
should simplify the handling of extract operations in most locations.
llvm-svn: 297100
|
Fixed the asan bot failure which led to the last commit of the outliner being reverted.
The change is in lib/CodeGen/MachineOutliner.cpp in the SuffixTree's constructor. LeafVector
is no longer initialized using reserve but just a standard constructor.
llvm-svn: 297081
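The difference between the two initialization styles can be shown in isolation. This sketch uses a plain `std::vector<int>` and hypothetical function names. It does not reproduce the SuffixTree code, and the commit message does not say exactly which access the asan bot flagged.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// reserve() only allocates capacity. size() stays 0, so indexing the result
// with operator[] would be an out-of-bounds access.
std::vector<int> makeWithReserve(std::size_t N) {
  std::vector<int> Leaves;
  Leaves.reserve(N);
  return Leaves;
}

// The sized constructor creates N value-initialized elements, so every index
// below N is valid immediately.
std::vector<int> makeWithConstructor(std::size_t N) {
  return std::vector<int>(N);
}
```

Writing to `Leaves[I]` after only calling `reserve` is the kind of access a sanitizer build can report, even though it often appears to work in a normal build.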
|
This patch extends the current functionality of the AArch64 redundant copy
elimination pass to handle CMN instructions as well as shifted
immediates.
Differential Revision: https://reviews.llvm.org/D30576
llvm-svn: 297078
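The reasoning behind handling CMN and shifted immediates can be sketched as a small calculation. It relies on the architectural facts that CMP sets the Z flag when the register equals the immediate, that CMN sets it when register plus immediate is zero, and that the immediate may be shifted left. `CompareKind` and `knownValueOnEqual` are hypothetical names, not code from the pass.

```cpp
#include <cassert>
#include <cstdint>

enum class CompareKind { Cmp, Cmn };

// Value the compared register is known to hold on the "equal" edge of a
// following b.eq. Imm is the encoded immediate, Shift its left-shift amount.
std::int64_t knownValueOnEqual(CompareKind Kind, std::uint32_t Imm,
                               unsigned Shift) {
  const std::int64_t Value = static_cast<std::int64_t>(Imm) << Shift;
  // CMP: Reg - Value == 0, so Reg == Value.
  // CMN: Reg + Value == 0, so Reg == -Value.
  return Kind == CompareKind::Cmp ? Value : -Value;
}
```

For example, after `cmn x0, #1` followed by `b.eq`, the taken block can assume x0 holds -1, so a later materialization of -1 into x0 is redundant in this model.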
|
also exit early on kill instead of redefinition.
Differential Revision: https://reviews.llvm.org/D30230
llvm-svn: 297060
|
Use the store size of the argument type, which will be a byte-sized
quantity, rather than dividing the size in bits by 8.
Fixes PR32136 and re-enables copy elision from i64 arguments.
Reverts the workaround from r296950.
llvm-svn: 297045
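A short sketch of why the store size differs from "bits divided by 8" for types narrower than a byte. The function names are hypothetical; the rounding-up formula is the usual definition of a store size in bytes.

```cpp
#include <cassert>

// Truncating division: an i1 argument yields 0 bytes.
constexpr unsigned bitsDividedBy8(unsigned SizeInBits) {
  return SizeInBits / 8;
}

// Store size rounds up to whole bytes: an i1 argument yields 1 byte.
constexpr unsigned storeSizeInBytes(unsigned SizeInBits) {
  return (SizeInBits + 7) / 8;
}
```

For byte-multiple types such as i64 both forms agree (8 bytes), so the two formulas differ only for types whose bit width is not a multiple of 8.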
|
Merge the tail block into the loop in cases where the main loop body
exits early, subject to profitability constraints. This will coalesce
the loop body into fewer blocks.
For example:
  loop:                          loop:
    // loop body                   // loop body
    if (...) jump exit    -->      // more body
  more:                            if (...) jump exit
    // more body                   jump loop
    jump loop
llvm-svn: 297033
|
The code in updateDeadFlags removed unnecessary <dead> flags, but there
can be cases where such a flag is not set, and yet a register has become
dead. For example, if a mux with identical inputs is replaced with a COPY,
the predicate register may no longer be used after that.
llvm-svn: 297032
|
llvm-svn: 297031
|
Fixes a crash caused by r296811 by truncating the input of the STBRX node
when the bswap is wider than i32.
Fixes https://bugs.llvm.org/show_bug.cgi?id=32140
Differential Revision: https://reviews.llvm.org/D30615
llvm-svn: 297001
|
X86ISelLowering.cpp:26506:36: error: enumeral mismatch in conditional
expression: 'llvm::X86ISD::NodeType' vs 'llvm::ISD::NodeType'
[-Werror=enum-compare]
llvm-svn: 296986
|
As described on PR31712, we miss a variety of legalization combines because we lower these to X86ISD::VSEXT/VZEXT despite them having the same functionality. This patch makes 128-bit (SSE41) SIGN/ZERO_EXTEND_VECTOR_IN_REG ops legal, adds the necessary tablegen plumbing and uses a helper 'getExtendInVec' to decide when to use SIGN/ZERO_EXTEND_VECTOR_IN_REG or VSEXT/VZEXT.
We're missing a couple of shuffle combines that will be added in a future patch for review.
Later patches can then support the AVX2 cases as a mixture of SIGN/ZERO_EXTEND and SIGN/ZERO_EXTEND_VECTOR_IN_REG, and then finally deal with the AVX512 cases.
Differential Revision: https://reviews.llvm.org/D30549
llvm-svn: 296985
|
The larger goal is to move the ADC/SBB transforms currently in
combineX86SetCC() to combineAddOrSubToADCOrSBB() because we're
creating ADC/SBB in lots of places where we shouldn't.
This was intended to be an NFC change, but avx-512 has something
strange going on. It doesn't seem like any of the affected tests
should really be using SET+TEST or ADC; a simple ADD could replace
several instructions. But that's another bug...
llvm-svn: 296978
|
select Cond, C +/- 1, C --> add(ext Cond), C -- with a target hook.
This is part of the ongoing process to obsolete D24480. The motivation is to
canonicalize to select IR in InstCombine whenever possible, so we need to have a way to
undo that easily in codegen.
PowerPC is an obvious winner for this kind of transform because it has fast and complete
bit-twiddling abilities but generally lousy conditional execution perf (although this might
have changed in recent implementations).
x86 also sees some wins, but the effect is limited because these transforms already mostly
exist in its target-specific combineSelectOfTwoConstants(). The fact that we see any x86
changes just shows that that code is a mess of special-case holes. We may be able to remove
some of that logic now.
My guess is that other targets will want to enable this hook for most cases. The likely
follow-ups would be to add value type and/or the constants themselves as parameters for the
hook. As the tests in select_const.ll show, we can transform any select-of-constants to
math/logic, but the general transform for any 2 constants needs one more instruction
(multiply or 'and').
ARM is one target that I think may not want this for most cases. I see infinite loops there
because it wants to use selects to enable conditionally executed instructions.
Differential Revision: https://reviews.llvm.org/D30537
llvm-svn: 296977
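The identities behind this transform can be checked with ordinary integer arithmetic. The sketch below models i32 values as `std::uint32_t` so that wraparound is well defined. All function names are hypothetical and the target hook itself is not modeled.

```cpp
#include <cassert>
#include <cstdint>

using U32 = std::uint32_t;

// select Cond, C + 1, C
U32 selectPlusOne(bool Cond, U32 C) { return Cond ? C + 1u : C; }
// add (zext Cond), C : zext of an i1 is 0 or 1
U32 addZext(bool Cond, U32 C) { return static_cast<U32>(Cond) + C; }

// select Cond, C - 1, C
U32 selectMinusOne(bool Cond, U32 C) { return Cond ? C - 1u : C; }
// add (sext Cond), C : sext of an i1 is 0 or all-ones (-1)
U32 addSext(bool Cond, U32 C) { return (0u - static_cast<U32>(Cond)) + C; }

// General case for any two constants A and B: the mask form needs one extra
// 'and' compared with the +/- 1 cases above.
U32 maskedSelect(bool Cond, U32 A, U32 B) {
  const U32 Mask = 0u - static_cast<U32>(Cond); // 0 or 0xFFFFFFFF
  return B + (Mask & (A - B));
}
```

The last function illustrates the remark about the general two-constant transform: it is still branch-free, but costs one more operation than the adjacent-constant cases.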
|
Long ago (2010 according to svn blame), combineShuffle probably needed to prevent the accidental creation of illegal i64 types, but there don't appear to be any combines that can cause this any more, as they all have their own legality checks.
Differential Revision: https://reviews.llvm.org/D30213
llvm-svn: 296966
|
This fixes cases where i1 types were not properly legalized yet and led
to the creation of 0-sized stack slots.
This fixes http://llvm.org/PR32136
llvm-svn: 296950
|
llvm-svn: 296933
|
It's much easier to reason about single-value inserts and no-one was actually
using the variadic variants before.
llvm-svn: 296923
|
The comments were wrong, and this is not an obvious transform.
This hopefully makes it clearer that we're missing the commuted
patterns for adds. It's less clear that this is actually a good
transform for all micro-arch.
This is prep work for trying to clean up the current adc/sbb
codegen because it's definitely not happening optimally.
llvm-svn: 296918
|
llvm-svn: 296901
|
This is producing SBB where it is obviously not necessary, so it needs to be limited.
llvm-svn: 296894
|
llvm-svn: 296875
|
creation
llvm-svn: 296874
|
Added code to check constant bus restrictions for VOP formats (only one SGPR value or literal-constant may be used by the instruction).
Note that the same checks are performed by SIInstrInfo::verifyInstruction (used by lowering code).
Added LIT tests.
llvm-svn: 296873
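A simplified model of the constant bus check, for illustration only. It assumes that repeated uses of the same SGPR count once and that each literal operand counts separately. The operand representation and `satisfiesConstantBusLimit` are hypothetical and do not mirror the AMDGPU assembler code.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

enum class OperandKind { VGPR, SGPR, Literal };

// Hypothetical operand: a register number for VGPR/SGPR, a value for Literal.
struct Operand {
  OperandKind Kind;
  unsigned RegOrValue;
};

// True when the operands use at most one constant bus slot in this model:
// either one distinct SGPR or one literal constant, but not both.
bool satisfiesConstantBusLimit(const std::vector<Operand> &Ops) {
  std::set<unsigned> DistinctSGPRs;
  std::size_t Literals = 0;
  for (const Operand &Op : Ops) {
    if (Op.Kind == OperandKind::SGPR)
      DistinctSGPRs.insert(Op.RegOrValue);
    else if (Op.Kind == OperandKind::Literal)
      ++Literals;
  }
  return DistinctSGPRs.size() + Literals <= 1;
}
```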
|
This patch causes compile times for some patterns to explode. I have
a (large, unreduced) test case that slows down by more than 20x and
several test cases slow down by 2x. I'm sending some of the test cases
directly to Nirav and following up with more details in the review log,
but this should unblock anyone else hitting this.
llvm-svn: 296862
|
VZEROUPPER should not be issued on Knights Landing (KNL), but on Skylake-avx512 it should be.
Differential Revision: https://reviews.llvm.org/D29874
llvm-svn: 296859
|
This is a cleanup/rewrite of the parseSysAlias function. It was not using the
tablegen instruction descriptions, but was “manually” matching the mnemonics
and recreating the operands whereas all this information is already in
tablegen; all this code has been replaced with lookupXYZByName tablegen
calls.
Differential Revision: https://reviews.llvm.org/D30491
llvm-svn: 296857
|
Summary: [GlobalISel][X86] Add support for f32/f64 and vector types in RegisterBank and InstructionSelector.
Reviewers: delena, zvi
Reviewed By: zvi
Subscribers: dberris, rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30533
llvm-svn: 296856
|
llvm-svn: 296842
|
Specifically, pick the opcode with the correct branch prediction, i.e.
jump:t or jump:nt.
llvm-svn: 296821
|
In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last
operation which we want to merge. If we break out of the loop because
an operation has the wrong offset, we shouldn't use that operation
as LastOp.
This patch fixes some cases where we would move stores to the wrong
insert point.
Re-commit with a fix to increment NumMove in the right place.
Differential Revision: https://reviews.llvm.org/D30124
llvm-svn: 296815
|
This patch fixes pr32063.
Current code in PPCTargetLowering::PerformDAGCombine can transform
bswap
store
into a single PPCISD::STBRX instruction, but it doesn't consider the case where the operand size of the bswap is larger than the store size. When that occurs, we need two modifications:
1. For the last operand of PPCISD::STBRX, we should not use DAG.getValueType(N->getOperand(1).getValueType()); instead we should use cast<StoreSDNode>(N)->getMemoryVT().
2. Before PPCISD::STBRX, we need to shift the original operand of the bswap to the right.
Differential Revision: https://reviews.llvm.org/D30362
llvm-svn: 296811
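Why the right shift is needed can be checked with plain integers. The sketch assumes a 64-bit bswap whose result is truncated to a 32-bit store. The function names are hypothetical and model only the arithmetic, not the DAG combine.

```cpp
#include <cassert>
#include <cstdint>

std::uint32_t byteSwap32(std::uint32_t V) {
  return (V >> 24) | ((V >> 8) & 0x0000FF00u) | ((V << 8) & 0x00FF0000u) |
         (V << 24);
}

std::uint64_t byteSwap64(std::uint64_t V) {
  const std::uint64_t SwappedLow = byteSwap32(static_cast<std::uint32_t>(V));
  const std::uint64_t SwappedHigh =
      byteSwap32(static_cast<std::uint32_t>(V >> 32));
  return (SwappedLow << 32) | SwappedHigh;
}

// What the original code asks for: swap all 64 bits, then store the low 32.
std::uint32_t truncOfSwap(std::uint64_t V) {
  return static_cast<std::uint32_t>(byteSwap64(V));
}

// Byte-reversed 32-bit store of the operand shifted right by (64 - 32) bits.
std::uint32_t swapOfShifted(std::uint64_t V) {
  return byteSwap32(static_cast<std::uint32_t>(V >> 32));
}

// Byte-reversed 32-bit store of the unshifted operand (the wrong result).
std::uint32_t swapOfLowHalf(std::uint64_t V) {
  return byteSwap32(static_cast<std::uint32_t>(V));
}
```

For the input 0x0102030405060708 the first two functions both give 0x04030201, while the unshifted form gives 0x08070605.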
|
This patch extends the current functionality of the AArch64 redundant copy
elimination pass to handle non-zero cases such as:
BB#0:
  cmp x0, #1
  b.eq .LBB0_1
.LBB0_1:
  orr x0, xzr, #0x1  ; <-- redundant copy; x0 known to hold #1.
Differential Revision: https://reviews.llvm.org/D29344
llvm-svn: 296809
|