path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* AMDGPU: Guard VOPC instructions against incorrect commute | Nicolai Haehnle | 2016-04-19 | 1 | -3/+3
  Summary: The added testcase, which triggered this, was derived from a shader-db case via bugpoint. A separate question is why scalar branching wasn't used.
  Reviewers: arsenm, tstellarAMD
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D19208
  llvm-svn: 266825
* AMDGPU/SI: SGPR accounting in getSIProgramInfo must ignore exec_lo/hi | Nicolai Haehnle | 2016-04-19 | 1 | -0/+2
  Summary: A shader stored the live mask (initial exec mask) in an SGPR which was then spilled during register allocation. The allocator quite reasonably turned the spill into
    v_writelane_b32 %vgpr, exec_lo, N
    v_writelane_b32 %vgpr, exec_hi, N+1
  at the beginning of the shader, confusing the SGPR accounting.
  No test case, because si-sgpr-spill.ll together with an upcoming patch for WQM handling exhibits the problem.
  Reviewers: arsenm, tstellarAMD
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D19199
  llvm-svn: 266824
* [Hexagon] Fix operand swapping in HexagonPeephole | Krzysztof Parzyszek | 2016-04-19 | 1 | -2/+4
  Also, disable zero- and sign-extend optimizations for now.
  llvm-svn: 266821
* [AArch64] [ARM] Make a target-independent llvm.thread.pointer intrinsic. | Marcin Koscielnicki | 2016-04-19 | 2 | -4/+4
  Both AArch64 and ARM support llvm.<arch>.thread.pointer intrinsics that just return the thread pointer. I have a pending patch that does the same for SystemZ (D19054), and there are many more targets that could benefit from one.
  This patch merges the ARM and AArch64 intrinsics into a single target independent one that will also be used by subsequent targets.
  Differential Revision: http://reviews.llvm.org/D19098
  llvm-svn: 266818
* [Hexagon] Fix printing the address operand of S2_storerinewabs | Krzysztof Parzyszek | 2016-04-19 | 1 | -9/+8
  llvm-svn: 266811
* [PPC, SSP] Support PowerPC Linux stack protection. | Tim Shen | 2016-04-19 | 7 | -8/+37
  llvm-svn: 266809
* [SSP, 2/2] Create llvm.stackguard() intrinsic and lower it to LOAD_STACK_GUARD | Tim Shen | 2016-04-19 | 2 | -4/+4
  With this change, ideally an IR pass can always generate an llvm.stackguard call to get the stack guard; but for now there are still IR-form stack guard customizations around (see getIRStackGuard()). Future SSP customization should go through LOAD_STACK_GUARD.
  There is a behavior change: stack guard values are not CSEed anymore, since we should never reuse the value in case it has been spilled (and corrupted). See ssp-guard-spill.ll. This also changes the stack size and codegen in X86 and AArch64 test cases.
  Ideally we'd like to know if the guard created in llvm.stackprotector() gets spilled or not. If the value is spilled, discard the value and reload the stack guard; otherwise reuse the value. This can be done by teaching the register allocator how to rematerialize LOAD_STACK_GUARD and forcing a rematerialization (which seems hard), or by checking for spilling in expandPostRAPseudo. It only makes sense when the stack guard is a global variable, which requires more instructions to load. Anyway, this seems to go out of the scope of the current patch.
  llvm-svn: 266806
* [lanai] Add lowering for SETCCE i32. | Jacques Pienaar | 2016-04-19 | 3 | -34/+68
  - Add lowering for SETCCE i32.
  - Add test to check that lowering of i64 compares uses SETCCE expansion (outside of EQ and NE).
  - Fix select.ll test and immediate form selection for RI operations.
  llvm-svn: 266802
* [X86] Simplify StackMapShadowTracker; NFC | Sanjoy Das | 2016-04-19 | 2 | -38/+27
  - Elide trivial constructor and destructor
  - Move implementation out of an unnecessary explicit llvm namespace scope
  llvm-svn: 266794
* [X86MCInstLower] Clean up EmitNops; NFC | Sanjoy Das | 2016-04-19 | 1 | -51/+68
  Instead of having a conditional assert inside EmitNops, refactor so that the caller can have the assert instead.
  llvm-svn: 266793
* [Hexagon] Implement branch relaxation | Krzysztof Parzyszek | 2016-04-19 | 3 | -0/+215
  Patch by Sirish Pande.
  llvm-svn: 266792
* Preliminary changes for fixing PR27241. Generalized/restructured some things | David L Kreitzer | 2016-04-19 | 1 | -19/+37
  in preparation for enabling the outgoing parameter store-to-push optimization for 64-bit targets.
  Differential Revision: http://reviews.llvm.org/D19222
  llvm-svn: 266774
* [X86][AVX2] Prefer VPERMQ/VPERMPD over VINSERTI128/VINSERTF128 for unary shuffles | Simon Pilgrim | 2016-04-19 | 1 | -4/+10
  Using VPERMQ/VPERMPD allows memory folding of the (repeated) input where VINSERTI128/VINSERTF128 cannot.
  Differential Revision: http://reviews.llvm.org/D19228
  llvm-svn: 266728
* Disable the PatchableFunction pass for NVPTX & Wasm | Sanjoy Das | 2016-04-19 | 2 | -0/+2
  PatchableFunction requires AllVRegsAllocated that these targets don't provide.
  llvm-svn: 266720
* Introduce a "patchable-function" function attribute | Sanjoy Das | 2016-04-19 | 3 | -18/+63
  Summary: The `"patchable-function"` attribute can be used by an LLVM client to influence LLVM's code generation in ways that make the generated code easily patchable at runtime (for instance, to redirect control). Right now only one patchability scheme is supported, `"prologue-short-redirect"`, but this can be expanded in the future.
  Reviewers: joker.eph, rnk, echristo, dberris
  Subscribers: joker.eph, echristo, mcrosier, llvm-commits
  Differential Revision: http://reviews.llvm.org/D19046
  llvm-svn: 266715
* [lanai] Set boolean contents to ZeroOrOneBooleanContent. | Jacques Pienaar | 2016-04-19 | 1 | -0/+3
  llvm-svn: 266701
* ARM: use a pseudo-instruction for cmpxchg at -O0. | Tim Northover | 2016-04-18 | 4 | -5/+385
  The fast register-allocator cannot cope with inter-block dependencies without spilling. This is fine for ldrex/strex loops coming from atomicrmw instructions where any value produced within a block is dead by the end, but not for cmpxchg. So we lower a cmpxchg at -O0 via a pseudo-inst that gets expanded after regalloc.
  Fortunately this is at -O0 so we don't have to care about performance. This simplifies the various axes of expansion considerably: we assume a strong seq_cst operation and ensure ordering via the always-present DMB instructions rather than v8 acquire/release instructions.
  Should fix the 32-bit part of PR25526.
  llvm-svn: 266679
* Lanai: fix debug build | JF Bastien | 2016-04-18 | 1 | -1/+1
  There's currently no raw_ostream &operator<<(SimpleValueType); provided by LLVM. It could be added by refactoring utils/TableGen/CodeGenTarget.cpp:getEnumName, but that's much more work than fixing the build.
  llvm-svn: 266627
* [AMDGPU] Add insert nops pass based on subtarget features instead of cl::opt | Konstantin Zhuravlyov | 2016-04-18 | 5 | -14/+43
  Also,
  - Skip pass if machine module does not have debug info
  - Minor comment changes
  - Added test
  Differential Revision: http://reviews.llvm.org/D19079
  llvm-svn: 266626
* [AMDGPU][llvm-mc] s_setreg* - Fix order of operands | Artem Tamazov | 2016-04-18 | 1 | -2/+2
  Order should match the sp3 syntax, where destination (simm16 denoting the hwreg) is coming first.
  Differential Revision: http://reviews.llvm.org/D19161
  llvm-svn: 266617
* Silence some "initialized but unused" warnings from MSVC -- the function being called is a static function, so there's no need for an instance variable. NFC. | Aaron Ballman | 2016-04-18 | 1 | -13/+2
  llvm-svn: 266616
* [mips][ias] Prevent double-filling of delay slots by generating '.set noreorder' regions. | Daniel Sanders | 2016-04-18 | 1 | -1/+8
  Summary: When clang is given -save-temps or -via-file-asm, any inline assembly in the source is parsed twice. Once by the compiler, and again by the assembler. We must take care to ensure that this doesn't lead to double-filling delay slots.
  Reviewers: sdardis, vkalintiris
  Subscribers: dsanders, sdardis, llvm-commits
  Differential Revision: http://reviews.llvm.org/D19166
  llvm-svn: 266608
* Include SmallVector.h header in lib/Target/WebAssembly/InstPrinter/WebAssemblyInstPrinter.h | Eric Liu | 2016-04-18 | 1 | -0/+1
  llvm-svn: 266606
* [ARM] AArch32 v8 NEON is still not IEEE-754 compliant | Renato Golin | 2016-04-18 | 1 | -1/+4
  llvm-svn: 266603
* [mips][ias] Stream macro expansions to output instead of buffering them. NFC. | Daniel Sanders | 2016-04-18 | 3 | -353/+336
  Summary: This will allow us to eliminate some magic numbers from the offset operand of branch instructions in favour of symbols and makes it possible to avoid double-filling delay slots when clang is given -save-temps.
  parseDirectiveCpRestore() is calling isIntegratedAssemblerRequired() for the moment since correctly pushing the generation of these instructions into the ELF target streamer is tricky enough to warrant a separate patch.
  Reviewers: sdardis, vkalintiris
  Subscribers: dsanders, llvm-commits, sdardis
  Differential Revision: http://reviews.llvm.org/D19164
  llvm-svn: 266602
* [NFC] Header cleanup | Mehdi Amini | 2016-04-18 | 102 | -170/+44
  Removed some unused headers, replaced some headers with forward class declarations.
  Found using simple scripts like this one:
    clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'
  Patch by Eugene Kosov <claprix@yandex.ru>
  Differential Revision: http://reviews.llvm.org/D19219
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 266595
* [X86] Be explicit about calls to setOperationAction for AVX2 and AVX512 rather than just looping over all vector types and conditionally matching them. NFC | Craig Topper | 2016-04-17 | 1 | -45/+42
  llvm-svn: 266577
* Declare MVT::SimpleValueType as an int8_t sized enum. This removes 400 bytes from TargetLoweringBase and probably other places. | Craig Topper | 2016-04-17 | 3 | -3/+3
  This required changing several places to print VT enums as strings instead of raw ints since the proper method to use to print became ambiguous. This is probably an improvement anyway.
  This also appears to save ~8K from an x86 self host build of llc.
  llvm-svn: 266562
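The size saving described in the entry above comes from giving the enum a fixed one-byte underlying type. A minimal sketch of that mechanism follows; the enum and struct names are hypothetical stand-ins, not LLVM's real definitions, and the table size of 128 is chosen only for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only; not LLVM's MVT definitions.

// No fixed underlying type: the compiler picks one (commonly int).
enum WideValueType { W_i1, W_i8, W_i32, W_LAST };

// Fixed underlying type: every value occupies exactly one byte.
enum NarrowValueType : int8_t { N_i1, N_i8, N_i32, N_LAST };

// A lookup table indexed per value type shrinks along with the enum.
struct NarrowTable {
  NarrowValueType Entries[128];
};

// int8_t is a character type, so streaming it directly can print a
// character or be ambiguous. Converting explicitly avoids that.
constexpr int toInt(NarrowValueType VT) { return static_cast<int>(VT); }

static_assert(sizeof(NarrowValueType) == 1, "one byte per value");
static_assert(sizeof(NarrowTable) == 128, "table is 128 bytes");
```

The size of a table built from `WideValueType` is implementation-defined, so it is deliberately not asserted here.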
* [X86] Added TODO comment for target shuffle mask decoding of bitcasted masks | Simon Pilgrim | 2016-04-17 | 1 | -0/+1
  llvm-svn: 266559
* [X86] Remove unneeded variables | Asaf Badouh | 2016-04-17 | 1 | -12/+8
  No functional change. ExtraLoad and WrapperKind are used only if (OpFlags == X86II::MO_GOTPCREL).
  Differential Revision: http://reviews.llvm.org/D18942
  llvm-svn: 266557
* [AVX512] ISD::MUL v2i64/v4i64 should only be legal if DQI and VLX features are enabled. | Craig Topper | 2016-04-17 | 1 | -2/+4
  llvm-svn: 266554
* [X86] Use ternary operator to reduce code slightly. NFC | Craig Topper | 2016-04-16 | 1 | -8/+3
  llvm-svn: 266534
* [X86][XOP] Added VPPERM constant mask decoding and target shuffle combining support | Simon Pilgrim | 2016-04-16 | 3 | -2/+64
  Added additional test that peeks through bitcast to v16i8 mask
  llvm-svn: 266533
* AMDGPU: Enable LocalStackSlotAllocation pass | Matt Arsenault | 2016-04-16 | 2 | -0/+159
  This resolves more frame indexes early and folds the immediate offsets into the scratch mubuf instructions. This cleans up a lot of the mess that's currently emitted, such as emitting add 0s and repeatedly initializing the same register to 0 when spilling.
  llvm-svn: 266508
* AMDGPU: Use s_addk_i32 / s_mulk_i32 | Matt Arsenault | 2016-04-16 | 1 | -12/+45
  llvm-svn: 266506
* [mips] More range-based for loops. NFC. | Vasileios Kalintiris | 2016-04-15 | 3 | -10/+9
  There are still a couple more inside the MIPS target. I opted for a single commit in order to avoid spamming the list.
  llvm-svn: 266472
* [mips] Use range-based for loops and simplify slightly the code. NFC. | Vasileios Kalintiris | 2016-04-15 | 1 | -9/+13
  llvm-svn: 266471
* [SystemZ] Call tryAddingSymbolicOperand in the disassembler | Ulrich Weigand | 2016-04-15 | 2 | -11/+52
  Use the tryAddingSymbolicOperand callback to attempt to present immediate values in symbolic form when disassembling. This is currently only used for PC-relative immediates (which are most likely to be symbolic in the SystemZ ISA).
  Add new DecodeMethod types to allow distinguishing between branch and non-branch instructions.
  llvm-svn: 266469
* ARM: don't try to hoist constant RHS out of a division. | Tim Northover | 2016-04-15 | 2 | -3/+16
  Divisions by a constant can be converted into multiplies which are usually cheaper, but this isn't possible if the constant gets separated (particularly in loops). Fix this by telling ConstantHoisting that the immediate in a DIV is cheap.
  I considered making the check generic, but neither AArch64 (strangely) nor x86 showed any benefit on the tests I had.
  llvm-svn: 266464
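The conversion mentioned in the entry above only works while the divisor is visible as a constant at the division, which is why hoisting it is harmful. A minimal worked sketch, assuming unsigned 32-bit division by 10: the function name `div10` is hypothetical, and this shows the general multiply-and-shift idea rather than the exact sequence the ARM backend emits.

```cpp
#include <cstdint>

// Unsigned 32-bit division by the constant 10, done as a widening
// multiply followed by a shift. The multiplier 0xCCCCCCCD is
// ceil(2^35 / 10), which is exact for every 32-bit input.
constexpr uint32_t div10(uint32_t X) {
  return static_cast<uint32_t>(
      (static_cast<uint64_t>(X) * 0xCCCCCCCDull) >> 35);
}

// Compile-time spot checks against ordinary division.
static_assert(div10(0) == 0 / 10, "");
static_assert(div10(99) == 99 / 10, "");
static_assert(div10(4294967295u) == 4294967295u / 10, "");
```

If the constant 10 were first moved into a register in another block, the compiler would see a division by an unknown value and could no longer pick the multiplier.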
* [AArch64] Add load/store pair instructions to getMemOpBaseRegImmOfsWidth(). | Chad Rosier | 2016-04-15 | 1 | -5/+46
  This improves AA in the MI scheduler when reasoning about paired instructions.
  Phabricator Revision: http://reviews.llvm.org/D17098
  PR26358
  llvm-svn: 266462
* [AArch64] Add MMOs to callee-save load/store instructions. | Geoff Berry | 2016-04-15 | 1 | -2/+15
  Summary: Without MMOs, the callee-save load/store instructions were treated as volatile by the MI post-RA scheduler and AArch64LoadStoreOptimizer.
  Reviewers: t.p.northover, mcrosier
  Subscribers: aemerson, rengolin, mcrosier, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17661
  llvm-svn: 266439
* Fix typing on generated LXV2DX/STXV2DX instructions | Nirav Dave | 2016-04-15 | 1 | -5/+23
  [PPC] Previously when casting generic loads to LXV2DX/ST instructions we would leave the original load return type in place allowing for an assertion failure when we merge two equivalent LXV2DX nodes with different types.
  This fixes PR27350.
  Reviewers: nemanjai
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D19133
  llvm-svn: 266438
* [MachineScheduler] Add support for store clustering | Jun Bum Lim | 2016-04-15 | 4 | -10/+22
  Perform store clustering just like load clustering. This change adds StoreClusterMutation in machine-scheduler. To control StoreClusterMutation, added enableClusterStores() in TargetInstrInfo.h. This is enabled only on AArch64 for now.
  This change also adds support for unscaled stores which were not handled in getMemOpBaseRegImmOfs().
  llvm-svn: 266437
* AMDGPU/SI: Fix regression with no-return atomics | Nicolai Haehnle | 2016-04-15 | 1 | -0/+1
  Summary: In the added test-case, the atomic instruction feeds into a non-machine CopyToReg node which hasn't been selected yet, so guard against non-machine opcodes here.
  Reviewers: arsenm, tstellarAMD
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D19043
  llvm-svn: 266433
* Use MVT instead of EVT to remove a bunch of unnecessary calls to getSimpleVT. | Craig Topper | 2016-04-15 | 4 | -61/+58
  llvm-svn: 266414
* Add a setOperationPromotedToType convenience method that sets an operation to promoted and sets the type in one call. Use it to save code in X86. | Craig Topper | 2016-04-15 | 1 | -36/+18
  llvm-svn: 266413
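The shape of such a convenience method can be sketched with a toy model. Everything below is a hypothetical stand-in: the `Toy*` types, the map-based storage, and the exact signatures are assumptions made for illustration and are not taken from LLVM's TargetLoweringBase.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Toy stand-ins for illustration; NOT LLVM's real types or API.
enum ToyOp { OP_AND, OP_OR, OP_XOR };
enum ToyVT { VT_v16i8, VT_v8i16, VT_v2i64 };
enum ToyAction { Legal, Promote, Custom };

class ToyLowering {
  std::map<std::pair<ToyOp, ToyVT>, ToyAction> Actions;
  std::map<std::pair<ToyOp, ToyVT>, ToyVT> PromotedTo;

public:
  void setOperationAction(ToyOp Op, ToyVT VT, ToyAction A) {
    Actions[std::make_pair(Op, VT)] = A;
  }
  void addPromotedToType(ToyOp Op, ToyVT VT, ToyVT Dest) {
    PromotedTo[std::make_pair(Op, VT)] = Dest;
  }
  // Convenience wrapper: one call instead of the two calls above.
  void setOperationPromotedToType(ToyOp Op, ToyVT VT, ToyVT Dest) {
    setOperationAction(Op, VT, Promote);
    addPromotedToType(Op, VT, Dest);
  }
  // Pairs that were never registered default to Legal in this toy model.
  ToyAction getAction(ToyOp Op, ToyVT VT) const {
    auto It = Actions.find(std::make_pair(Op, VT));
    return It == Actions.end() ? Legal : It->second;
  }
  // Returns VT itself when no promotion was registered.
  ToyVT getPromotedType(ToyOp Op, ToyVT VT) const {
    auto It = PromotedTo.find(std::make_pair(Op, VT));
    return It == PromotedTo.end() ? VT : It->second;
  }
};

// Usage: a single call records both the action and the destination type.
inline void usageExample() {
  ToyLowering TL;
  TL.setOperationPromotedToType(OP_AND, VT_v16i8, VT_v2i64);
  assert(TL.getAction(OP_AND, VT_v16i8) == Promote);
  assert(TL.getPromotedType(OP_AND, VT_v16i8) == VT_v2i64);
}
```

Folding the two steps into one call removes the chance of registering a Promote action without also naming the type to promote to, which is presumably where the saved lines in the X86 backend come from.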
* [X86] AND, OR, and XOR of vectors are always legal; no need to set them legal explicitly. | Craig Topper | 2016-04-15 | 1 | -5/+0
  llvm-svn: 266412
* [X86] Combine an if and else block that had the same set of calls to setOperationAction that only varied in Legal/Custom. Use the ternary operator on that argument instead. NFC | Craig Topper | 2016-04-15 | 1 | -47/+24
  llvm-svn: 266410
* [NVPTX] Set NVPTXTTI::getInliningThresholdMultiplier to 5. | Justin Lebar | 2016-04-15 | 1 | -0/+4
  Summary: Calls on NVPTX are unusually expensive (for one thing, lots of state needs to be saved to memory, which is slow), so make the inliner much more aggressive.
  Reviewers: chandlerc
  Subscribers: jholewinski, llvm-commits, tra
  Differential Revision: http://reviews.llvm.org/D18561
  llvm-svn: 266406
* AMDGPU: Remove custom load/store scalarization | Matt Arsenault | 2016-04-14 | 4 | -87/+7
  llvm-svn: 266385