| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of always using addu to adjust the stack pointer when the
size out is of the range of an addiu instruction, use subu so that
a smaller constant can be generated.
This can give savings of ~3 instructions whenever a function has a
a stack frame whose size is out of range of an addiu instruction.
This change may break some naive stack unwinders.
Partially resolves PR/26291.
Thanks to David Chisnall for reporting the issue.
Reviewers: dsanders, vkalintiris
Differential Review: http://reviews.llvm.org/D21321
llvm-svn: 272666
|
| |
|
|
|
|
|
|
| |
We can only generate immediates up to #510 with a MOV+ADD, not #511, because there's no such instruction as add #256.
Found by Oliver Stannard and csmith!
llvm-svn: 272665
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
PR27458 highlights that the MIPS backend does not have well formed
MIR for atomic operations (among other errors).
This patch adds expands and corrects the LL/SC descriptions and uses
for MIPS(64).
Reviewers: dsanders, vkalintiris
Differential Review: http://reviews.llvm.org/D19719
llvm-svn: 272655
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements the N32 case where -mno-shared is in effect. The case
where -mshared is in effect will be added later since doing that now requires
additional changes to how we handle %hi(%neg(%gp_rel(foo))) expressions to
emit the three relocations as three relocations (currently only one of the
three would be emitted) which then requires further changes to our MCFixup
handling.
While we could fix both cases together, fixing the -mno-shared case allows us
to fix the ELFCLASS bug (where N32 incorrectly uses ELFCLASS64 instead of
ELFCLASS32) in a way that allows cpsetup.s to check for a correct output instead
of another incorrect output.
Reviewers: sdardis
Subscribers: dsanders, llvm-commits, sdardis
Differential Revision: http://reviews.llvm.org/D21131
llvm-svn: 272652
|
| |
|
|
|
|
| |
using MOVNTSD/MOVNTSS
llvm-svn: 272651
|
| |
|
|
|
|
|
|
|
|
|
| |
Itineraries for some pre MIPSR6 and EVA instructions. Some pseudo expanded
instructions are marked as having no scheduling info.
Reviewers: dsanders, vkalintiris
Differential Review: http://reviews.llvm.org/D20418
llvm-svn: 272648
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: sdardis
Subscribers: dsanders, sdardis, llvm-commits
Differential Revision: http://reviews.llvm.org/D21063
llvm-svn: 272647
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The machine verifier reports 'Explicit operand marked as def' when it is
manually specified even though it agrees with the operand info.
Reviewers: sdardis
Subscribers: dsanders, sdardis, llvm-commits
Differential Revision: http://reviews.llvm.org/D21065
llvm-svn: 272646
|
| |
|
|
|
|
| |
bit masks. This results in a smaller encoding.
llvm-svn: 272627
|
| |
|
|
|
|
| |
when KMOVB is not available. This has better behavior with respect to partial register stalls since it won't need to preserve the upper 16-bits of the GPR.
llvm-svn: 272626
|
| |
|
|
|
|
| |
KMOVW/KMOVB without going through an EXTRACT_SUBREG and a MOVZX.
llvm-svn: 272625
|
| |
|
|
|
|
|
|
| |
so it is the same as the MCExternalSymbolizer.
rdar://17349181
llvm-svn: 272588
|
| |
|
|
|
|
|
| |
The need for these intrinsics has been obviated by r272564 which
reimplements their functionality using generic IR.
llvm-svn: 272566
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Mesa and other users must set this to enable coalescing:
- STRIDE = 0
- SWIZZLE_ENABLE = 1
This makes one particular compute shader 8x faster.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D21136
llvm-svn: 272556
|
| |
|
|
|
|
|
| |
The condition reg of the cndmask_b64 expansion can't be killed by
the first one, and the implicit super register implicit def is needed.
llvm-svn: 272554
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This enables use of the 'R' and 'T' memory constraints for inline ASM
operands on SystemZ, which allow an index register as well as an
immediate displacement. This patch includes corresponding documentation
and test case updates.
As with the last patch of this kind, I moved the 'm' constraint to the
most general case, which is now 'T' (base + 20-bit signed displacement +
index register).
Author: colpell
Differential Revision: http://reviews.llvm.org/D21239
llvm-svn: 272547
|
| |
|
|
|
|
|
| |
to go in as soon as llvm patch has gone in because
tests will start breaking in Clang.
llvm-svn: 272546
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
MRRC/MRRC2 instruction writes to two registers. The
intrinsic definition returns a single uint64_t to
represent the write, this is a compact way of
representing a write to two 32 bit registers,
the alternative might have been two return a
struct of 2 uint32_t's but this isn't as nice.
Differential Revision:
llvm-svn: 272544
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The "-Werror=enum-compare" shows that the statement is using two different enums:
enumeral mismatch in conditional expression: 'llvm::X86ISD::NodeType' vs 'llvm::ISD::NodeType'
A follow-up fix on D21235.
Reviewers: klimek
Subscribers: spatel, cfe-commits
Differential Revision: http://reviews.llvm.org/D21278
llvm-svn: 272539
|
| |
|
|
|
|
| |
autoupgrade them to selects and shufflevector.
llvm-svn: 272527
|
| |
|
|
|
|
| |
No functionality change intended.
llvm-svn: 272516
|
| |
|
|
|
|
| |
Or replace with llvm::function_ref if it's never stored. NFC intended.
llvm-svn: 272513
|
| |
|
|
|
|
|
|
| |
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.
llvm-svn: 272512
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SSE1 (PR28044)
This patch is intended to solve:
https://llvm.org/bugs/show_bug.cgi?id=28044
By changing the definition of X86ISD::CMPP to use float types, we allow it to be created
and pass legalization for an SSE1-only target where v4i32 is not legal.
The motivational trail for this change includes:
https://llvm.org/bugs/show_bug.cgi?id=28001
and eventually makes this trigger:
http://reviews.llvm.org/D21190
Ie, after this step, we should be free to have Clang generate FP compare IR instead of x86
intrinsics for SSE C packed compare intrinsics. (We can auto-upgrade and remove the LLVM
sse.cmp intrinsics as a follow-up step.) Once we're generating vector IR instead of x86
intrinsics, a big pile of generic optimizations can trigger.
Differential Revision: http://reviews.llvm.org/D21235
llvm-svn: 272511
|
| |
|
|
|
|
| |
shufflevector.
llvm-svn: 272510
|
| |
|
|
|
|
| |
added auto-upgrade code to turn them into shufflevectors and selects.
llvm-svn: 272497
|
| |
|
|
|
|
| |
To account for the fast PSHUFB implementation now available
llvm-svn: 272484
|
| |
|
|
|
|
| |
PSHUFB can speed up BITREVERSE of byte vectors by performing LUT on the low/high nibbles separately and ORing the results. Wider integer vector types are already BSWAP'd beforehand so also make use of this approach.
llvm-svn: 272477
|
| |
|
|
| |
llvm-svn: 272476
|
| |
|
|
| |
llvm-svn: 272473
|
| |
|
|
|
|
| |
Ensure that PALIGNR/PSLLDQ/PSRLDQ are byte vectors so that they can be correctly decoded for target shuffle combining
llvm-svn: 272471
|
| |
|
|
|
|
| |
These are byte shift instructions and it will make shuffle combining a lot more straightforward if we can assume a vXi8 vector of bytes so decoded shuffle masks match the return type's number of elements
llvm-svn: 272468
|
| |
|
|
|
|
|
|
| |
generation
Now matches other shuffles
llvm-svn: 272464
|
| |
|
|
|
|
| |
Hopefully this time it actually works and stays away.
llvm-svn: 272463
|
| |
|
|
| |
llvm-svn: 272459
|
| |
|
|
|
|
| |
allows us to create 512-bit PSHUFLW/PSHUFHW.
llvm-svn: 272450
|
| |
|
|
|
|
| |
v64i8 shuffles. If we get this far the types are already legal which means BWI must be enabled.
llvm-svn: 272449
|
| |
|
|
|
|
| |
(This is split out from was D21115)
llvm-svn: 272435
|
| |
|
|
| |
llvm-svn: 272430
|
| |
|
|
| |
llvm-svn: 272429
|
| |
|
|
|
|
|
| |
The vector cases don't change because we already have folds in X86ISelLowering
to look through and remove bitcasts.
llvm-svn: 272427
|
| |
|
|
| |
llvm-svn: 272422
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Support and generate Compare and Traps like CRT, CIT, etc.
Support Trap as legal DAG opcodes and generate "j .+2" for them by default.
Add support for Conditional Traps and use the If Converter to convert them into
the corresponding compare and trap opcodes.
Differential Revision: http://reviews.llvm.org/D21155
llvm-svn: 272419
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We need to set the fixup type to FK_Data_4 for the
SCRATCH_RSRC_DWORD[01] symbols, since these require absolute
relocations, and fixup_si_rodata is for relative relocations.
Reviewers: arsenm, kzhuravl
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: http://reviews.llvm.org/D21153
llvm-svn: 272417
|
| |
|
|
|
|
|
|
|
| |
The costs are somewhat hand-wavy, but should be much closer to the truth
than what we get from BasicTTI.
Differential Revision: http://reviews.llvm.org/D21156
llvm-svn: 272406
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D21203
llvm-svn: 272400
|
| |
|
|
| |
llvm-svn: 272399
|
| |
|
|
| |
llvm-svn: 272393
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in AMDGPUOperand.
Summary:
sext() modifier is supported in SDWA instructions only for integer operands. Spec is unclear should integer operands support abs and neg modifiers with sext - for now they are not supported.
Renamed InputModsWithNoDefault to FloatInputMods. Added SextInputMods for operands that support sext() modifier.
Added AMDGPUOperand::Modifier struct to handle register and immediate modifiers.
Code cleaning in AMDGPUOperand class: organize method in groups (render-, predicate-methods...).
Reviewers: vpykhtin, artem.tamazov, tstellarAMD
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D20968
llvm-svn: 272384
|
| |
|
|
| |
llvm-svn: 272380
|