| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The added testcase, which triggered this, was derived from a shader-db case
via bugpoint. A separate question is why scalar branching wasn't used.
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19208
llvm-svn: 266825
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
A shader stored the live mask (initial exec mask) in an SGPR which was then
spilled during register allocation. The allocator quite reasonably
optimized turned the spill into
v_writelane_b32 %vgpr, exec_lo, N
v_writelane_b32 %vgpr, exec_hi, N+1
at the beginning of the shader, confusing the SGPR accounting.
No test case, because si-sgpr-spill.ll together with an upcoming patch for
WQM handling exhibits the problem.
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19199
llvm-svn: 266824
|
|
|
|
|
|
| |
Also, disable zero- and size-extend optimizations for now.
llvm-svn: 266821
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Both AArch64 and ARM support llvm.<arch>.thread.pointer intrinsics that
just return the thread pointer. I have a pending patch that does the same
for SystemZ (D19054), and there are many more targets that could benefit
from one.
This patch merges the ARM and AArch64 intrinsics into a single target
independent one that will also be used by subsequent targets.
Differential Revision: http://reviews.llvm.org/D19098
llvm-svn: 266818
|
|
|
|
| |
llvm-svn: 266811
|
|
|
|
| |
llvm-svn: 266809
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this change, ideally IR pass can always generate llvm.stackguard
call to get the stack guard; but for now there are still IR form stack
guard customizations around (see getIRStackGuard()). Future SSP
customization should go through LOAD_STACK_GUARD.
There is a behavior change: stack guard values are not CSEed anymore,
since we should never reuse the value in case that it has been spilled (and
corrupted). See ssp-guard-spill.ll. This also cause the change of stack
size and codegen in X86 and AArch64 test cases.
Ideally we'd like to know if the guard created in llvm.stackprotector() gets
spilled or not. If the value is spilled, discard the value and reload
stack guard; otherwise reuse the value. This can be done by teaching
register allocator to know how to rematerialize LOAD_STACK_GUARD and
force a rematerialization (which seems hard), or check for spilling in
expandPostRAPseudo. It only makes sense when the stack guard is a global
variable, which requires more instructions to load. Anyway, this seems to go out
of the scope of the current patch.
llvm-svn: 266806
|
|
|
|
|
|
|
|
| |
* Add lowering for SETCCE i32.
* Add test to check lowering of i64 compares uses SETCCE expansion (outside of EQ and NE).
* Fix select.ll test and immediate form selection for RI operations.
llvm-svn: 266802
|
|
|
|
|
|
|
|
| |
- Elide trivial contructor and desctructor
- Move implementation out of an unnecessary explicit llvm namespace
scope
llvm-svn: 266794
|
|
|
|
|
|
|
| |
Instead of having a conditional assert inside EmitNops, refactor so that
the caller can have the assert instead.
llvm-svn: 266793
|
|
|
|
|
|
| |
Patch by Sirish Pande.
llvm-svn: 266792
|
|
|
|
|
|
|
|
|
| |
in preparation for enabling the outgoing parameter store-to-push optimization
for 64-bit targets.
Differential Revision: http://reviews.llvm.org/D19222
llvm-svn: 266774
|
|
|
|
|
|
|
|
|
|
| |
shuffles
Using VPERMQ/VPERMPD allows memory folding of the (repeated) input where VINSERTI128/VINSERTF128 can not.
Differential Revision: http://reviews.llvm.org/D19228
llvm-svn: 266728
|
|
|
|
|
|
|
| |
PatchableFunction requires AllVRegsAllocated that these targets don't
provide.
llvm-svn: 266720
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The `"patchable-function"` attribute can be used by an LLVM client to
influence LLVM's code generation in ways that makes the generated code
easily patchable at runtime (for instance, to redirect control).
Right now only one patchability scheme is supported,
`"prologue-short-redirect"`, but this can be expanded in the future.
Reviewers: joker.eph, rnk, echristo, dberris
Subscribers: joker.eph, echristo, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D19046
llvm-svn: 266715
|
|
|
|
| |
llvm-svn: 266701
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fast register-allocator cannot cope with inter-block dependencies without
spilling. This is fine for ldrex/strex loops coming from atomicrmw instructions
where any value produced within a block is dead by the end, but not for
cmpxchg. So we lower a cmpxchg at -O0 via a pseudo-inst that gets expanded
after regalloc.
Fortunately this is at -O0 so we don't have to care about performance. This
simplifies the various axes of expansion considerably: we assume a strong
seq_cst operation and ensure ordering via the always-present DMB instructions
rather than v8 acquire/release instructions.
Should fix the 32-bit part of PR25526.
llvm-svn: 266679
|
|
|
|
|
|
| |
There's currently no raw_ostream &operator<<(SimpleValueType); provided by LLVM. It could be added by refactoring utils/TableGen/CodeGenTarget.cpp:getEnumName, but that's much more work than fixing the build.
llvm-svn: 266627
|
|
|
|
|
|
|
|
|
|
|
| |
Also,
- Skip pass if machine module does not have debug info
- Minor comment changes
- Added test
Differential Revision: http://reviews.llvm.org/D19079
llvm-svn: 266626
|
|
|
|
|
|
|
|
| |
Order should match the sp3 syntax, where destination (simm16 denoting the hwreg) is coming first.
Differential Revision: http://reviews.llvm.org/D19161
llvm-svn: 266617
|
|
|
|
|
|
| |
being called is a static function, so there's no need for an instance variable. NFC.
llvm-svn: 266616
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
noreorder' regions.
Summary:
When clang is given -save-temps or -via-file-asm, any inline assembly in
the source is parsed twice. Once by the compiler, and again by the
assembler. We must take care to ensure that this doesn't lead to
double-filling delay slots.
Reviewers: sdardis, vkalintiris
Subscribers: dsanders, sdardis, llvm-commits
Differential Revision: http://reviews.llvm.org/D19166
llvm-svn: 266608
|
|
|
|
|
|
| |
lib/Target/WebAssembly/InstPrinter/WebAssemblyInstPrinter.h
llvm-svn: 266606
|
|
|
|
| |
llvm-svn: 266603
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This will allows us to eliminate some magic numbers from the offset operand of
branch instructions in favour of symbols and makes it possible to avoid
double-filling delay slots when clang is given -save-temps.
parseDirectiveCpRestore() is calling isIntegratedAssemblerRequired() for the
moment since correctly pushing the generation of these instructions into the
ELF target streamer is tricky enough to warrant a separate patch.
Reviewers: sdardis, vkalintiris
Subscribers: dsanders, llvm-commits, sdardis
Differential Revision: http://reviews.llvm.org/D19164
llvm-svn: 266602
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Removed some unused headers, replaced some headers with forward class declarations.
Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'
Patch by Eugene Kosov <claprix@yandex.ru>
Differential Revision: http://reviews.llvm.org/D19219
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
|
|
|
|
|
|
| |
rather than just looping over all vector types and conditinally matching them. NFC
llvm-svn: 266577
|
|
|
|
|
|
|
|
|
|
| |
from TargetLoweringBase and probably other places.
This required changing several places to print VT enums as strings instead of raw ints since the proper method to use to print became ambiguous. This is probably an improvement anyway.
This also appears to save ~8K from an x86 self host build of llc.
llvm-svn: 266562
|
|
|
|
| |
llvm-svn: 266559
|
|
|
|
|
|
|
|
|
| |
no functional change.
ExtraLoad and WrapperKind are been used only if (OpFlags == X86II::MO_GOTPCREL).
Differential Revision: http://reviews.llvm.org/D18942
llvm-svn: 266557
|
|
|
|
|
|
| |
are enabled.
llvm-svn: 266554
|
|
|
|
| |
llvm-svn: 266534
|
|
|
|
|
|
|
|
| |
support
Added additional test that peeks through bitcast to v16i8 mask
llvm-svn: 266533
|
|
|
|
|
|
|
|
|
|
|
| |
This resolves more frame indexes early and folds
the immediate offsets into the scratch mubuf instructions.
This cleans up a lot of the mess that's currently emitted,
such as emitting add 0s and repeatedly initializing the same
register to 0 when spilling.
llvm-svn: 266508
|
|
|
|
| |
llvm-svn: 266506
|
|
|
|
|
|
|
| |
There are still a couple more inside the MIPS target. I opted for a single
commit in order to avoid spamming the list.
llvm-svn: 266472
|
|
|
|
| |
llvm-svn: 266471
|
|
|
|
|
|
|
|
|
|
| |
Use the tryAddingSymbolicOperand callback to attempt to present immediate
values in symbolic form when disassembling. This is currently only used
for PC-relative immediates (which are most likely to be symbolic in the
SystemZ ISA). Add new DecodeMethod types to allow distinguishing between
branch and non-branch instructions.
llvm-svn: 266469
|
|
|
|
|
|
|
|
|
|
|
|
| |
Divisions by a constant can be converted into multiplies which are usually
cheaper, but this isn't possible if the constant gets separated (particularly
in loops). Fix this by telling ConstantHoisting that the immediate in a DIV is
cheap.
I considered making the check generic, but neither AArch64 (strangely) nor x86
showed any benefit on the tests I had.
llvm-svn: 266464
|
|
|
|
|
|
|
|
|
| |
This improves AA in the MI schduler when reason about paired instructions.
Phabricator Revision: http://reviews.llvm.org/D17098
PR26358
llvm-svn: 266462
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Without MMOs, the callee-save load/store instructions were treated as
volatile by the MI post-RA scheduler and AArch64LoadStoreOptimizer.
Reviewers: t.p.northover, mcrosier
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D17661
llvm-svn: 266439
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[PPC] Previously when casting generic loads to LXV2DX/ST instructions we
would leave the original load return type in place allowing for an
assertion failure when we merge two equivalent LXV2DX nodes with
different types.
This fixes PR27350.
Reviewers: nemanjai
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19133
llvm-svn: 266438
|
|
|
|
|
|
|
|
|
|
|
|
| |
Perform store clustering just like load clustering. This change add
StoreClusterMutation in machine-scheduler. To control StoreClusterMutation,
added enableClusterStores() in TargetInstrInfo.h. This is enabled only on
AArch64 for now.
This change also add support for unscaled stores which were not handled in
getMemOpBaseRegImmOfs().
llvm-svn: 266437
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In the added test-case, the atomic instruction feeds into a non-machine
CopyToReg node which hasn't been selected yet, so guard against
non-machine opcodes here.
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19043
llvm-svn: 266433
|
|
|
|
| |
llvm-svn: 266414
|
|
|
|
|
|
| |
to promoted and set the type in one call. Use it so save code in X86.
llvm-svn: 266413
|
|
|
|
|
|
| |
explicitly.
llvm-svn: 266412
|
|
|
|
|
|
| |
setOperationAction that only varied in Legal/Custom. Use the ternary operator on that argument instead. NFC
llvm-svn: 266410
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Calls on NVPTX are unusually expensive (for one thing, lots of state
needs to be saved to memory, which is slow), so make the inlininer much
more aggressive.
Reviewers: chandlerc
Subscribers: jholewinski, llvm-commits, tra
Differential Revision: http://reviews.llvm.org/D18561
llvm-svn: 266406
|
|
|
|
| |
llvm-svn: 266385
|