summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* [X86]: Improve Liveness checking for X86FixupBWInsts.cppKevin B. Smith2016-06-151-39/+97
| | | | | | Differential Revision: http://reviews.llvm.org/D21085 llvm-svn: 272797
* [mips] Eliminate unused code for addrRegReg complex pattern. NFC.Vasileios Kalintiris2016-06-155-29/+0
| | | | | | | | | | Reviewers: dsanders, sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21381 llvm-svn: 272794
* Preserve DebugInfo when replacing values in DAGCombinerNirav Dave2016-06-152-4/+2
| | | | | | | | | | | | | | | | | | | [DAG] Previously debug values would transfer debuginfo for the selected start node for a replacement which allows for debug to be dropped. Push debug value transfer to occur with node/value replacement in SelectionDAG, remove now extraneous transfers of debug values. This refixes PR9817 which was being incompletely checked in the testsuite. Reviewers: jyknight Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D21037 llvm-svn: 272792
* Reverting r272778 because there's an assertionRanjeet Singh2016-06-153-64/+13
| | | | | | failure when running the test CodeGen/ARM/intrinsics-coprocessor.ll llvm-svn: 272791
* [AMDGPU] Fix few coding style issues. NFC.Valery Pykhtin2016-06-152-23/+23
| | | | llvm-svn: 272785
* [ARM] Add support for mrrc/mrrc2 intrinsics.Ranjeet Singh2016-06-153-13/+64
| | | | | | Differential Revision: http://reviews.llvm.org/D21178 llvm-svn: 272778
* [mips] Replace AdditionalRequires<[IsGP64bit]> with GPR_64. NFC.Daniel Sanders2016-06-151-8/+4
| | | | | | | | | | | | Summary: Also fixed one case where HasMips64 was being used instead of IsGP64bit. Reviewers: sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: http://reviews.llvm.org/D21028 llvm-svn: 272771
* [mips] clang-format Mips16ISelDAGToDAG.{cpp,h}Daniel Sanders2016-06-152-52/+52
| | | | llvm-svn: 272768
* [mips][msa] Fix register/register-class mismatches in emitINSERT_DF_VIDX().Daniel Sanders2016-06-151-3/+6
| | | | | | | | | | Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21068 llvm-svn: 272765
* [mips][microMIPS] Add CodeGen support for AND*, OR16, OR*, XOR*, NOT16 and ↵Zlatko Buljan2016-06-154-132/+197
| | | | | | | | NOR instructions Differential Revision: http://reviews.llvm.org/D16719 llvm-svn: 272764
* [AVX512] Fix BLENDM lowering patterns. Operands should be swapped to match ↵Igor Breger2016-06-151-6/+9
| | | | | | | | | | SELECT behavior. Use BLENDM instead of masked move instruction. Differential Revision: http://reviews.llvm.org/D21001 llvm-svn: 272763
* Push a dependent computation into the assert that uses it; NFCSanjoy Das2016-06-151-5/+3
| | | | | | | | | ... instead of explicitly conditioning on NDEBUG. Also use an easier to read conditional expression. (Addresses post-commit review from David Blaikie.) llvm-svn: 272762
* AMDGPU: Fix MUBUF offset bugs affecting llvm.amdgcn.buffer.* intrinsicsNicolai Haehnle2016-06-151-13/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This fixes two related bugs. First, the generic optimization passes unfortunately generate negative constant offsets but the hardware treats SOffset as an unsigned value. Second, there is a hardware bug on SI and CI, where address clamping in MUBUF instructions does not work correctly when SOffset is larger than the buffer size. This patch works around this bug by never using SOffset. An alternative workaround would be to do the clamping manually when SOffset is too large, but generating the required code sequence during instruction selection would be rather involved, and in any case the resulting code would probably be worse. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96360 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21326 llvm-svn: 272761
* Fix unused variable warning; NFCSanjoy Das2016-06-151-0/+2
| | | | | | | TailCallReturnAddrDelta is used only in an assert, so put it under defined(NDEBUG). llvm-svn: 272760
* Don't force SP-relative addressing for statepointsSanjoy Das2016-06-152-51/+57
| | | | | | | | | | | | | | | | | | | Summary: ... when the offset is not statically known. Prioritize addresses relative to the stack pointer in the stackmap, but fallback gracefully to other modes of addressing if the offset to the stack pointer is not a known constant. Patch by Oscar Blumberg! Reviewers: sanjoy Subscribers: llvm-commits, majnemer, rnk, sanjoy, thanm Differential Revision: http://reviews.llvm.org/D21259 llvm-svn: 272756
* AMDGPU/SI: Correctly encode constant expressionsTom Stellard2016-06-151-9/+23
| | | | | | | | | | | | | | | Summary: We we have an MCConstantExpr, we can encode it directly into the instruction instead of emitting fixups. Reviewers: artem.tamazov, vpykhtin, SamWot, nhaustov, arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21236 Change-Id: I88b3edf288d48e65c5d705fc4850d281f8e36948 llvm-svn: 272750
* AMDGPU/AsmParser: Add support for parsing symbol operandsTom Stellard2016-06-151-2/+46
| | | | | | | | | | | | | | Summary: We can now reference symbols directly in operands, like this: s_mov_b32 s0, global Reviewers: artem.tamazov, vpykhtin, SamWot, nhaustov Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21038 llvm-svn: 272748
* Remove the ScalarReplAggregates passDavid Majnemer2016-06-152-2/+2
| | | | | | | | | | Nearly all the changes to this pass have been done while maintaining and updating other parts of LLVM. LLVM has had another pass, SROA, which has superseded ScalarReplAggregates for quite some time. Differential Revision: http://reviews.llvm.org/D21316 llvm-svn: 272737
* AMDGPU: Run pointer optimization passesMatt Arsenault2016-06-151-7/+46
| | | | llvm-svn: 272736
* IR: Introduce local_unnamed_addr attribute.Peter Collingbourne2016-06-142-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a local_unnamed_addr attribute is attached to a global, the address is known to be insignificant within the module. It is distinct from the existing unnamed_addr attribute in that it only describes a local property of the module rather than a global property of the symbol. This attribute is intended to be used by the code generator and LTO to allow the linker to decide whether the global needs to be in the symbol table. It is possible to exclude a global from the symbol table if three things are true: - This attribute is present on every instance of the global (which means that the normal rule that the global must have a unique address can be broken without being observable by the program by performing comparisons against the global's address) - The global has linkonce_odr linkage (which means that each linkage unit must have its own copy of the global if it requires one, and the copy in each linkage unit must be the same) - It is a constant or a function (which means that the program cannot observe that the unique-address rule has been broken by writing to the global) Although this attribute could in principle be computed from the module contents, LTO clients (i.e. linkers) will normally need to be able to compute this property as part of symbol resolution, and it would be inefficient to materialize every module just to compute it. See: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html for earlier discussion. Part of the fix for PR27553. Differential Revision: http://reviews.llvm.org/D20348 llvm-svn: 272709
* AMDGPU/SI: Refactor fixup handling for constant addrspace variablesTom Stellard2016-06-1413-47/+76
| | | | | | | | | | | | | | | | | | | | | | Summary: We now use a standard fixup type applying the pc-relative address of constant address space variables, and we have the GlobalAddress lowering code add the required 4 byte offset to the global address rather than doing it as part of the fixup. This refactoring will make it easier to use the same code for global address space variables and also simplifies the code. Re-commit this after fixing a bug where we were trying to use a reference to a Triple object that had already been destroyed. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21154 llvm-svn: 272705
* [X86] Reduce the width of multiplification when its operands are extended ↵Wei Mi2016-06-141-3/+208
| | | | | | | | | | | | | | | | from i8 or i16 For <N x i32> type mul, pmuludq will be used for targets without SSE41, which often introduces many extra pack and unpack instructions in vectorized loop body because pmuludq generates <N/2 x i64> type value. However when the operands of <N x i32> mul are extended from smaller size values like i8 and i16, the type of mul may be shrunk to use pmullw + pmulhw/pmulhuw instead of pmuludq, which generates better code. For targets with SSE41, pmulld is supported so no shrinking is needed. Differential Revision: http://reviews.llvm.org/D20931 llvm-svn: 272694
* Revert "AMDGPU/SI: Refactor fixup handling for constant addrspace variables"Tom Stellard2016-06-1413-73/+50
| | | | | | This reverts commit r272675. llvm-svn: 272677
* AMDGPU/SI: Refactor fixup handling for constant addrspace variablesTom Stellard2016-06-1413-50/+73
| | | | | | | | | | | | | | | | | | | Summary: We now use a standard fixup type applying the pc-relative address of constant address space variables, and we have the GlobalAddress lowering code add the required 4 byte offset to the global address rather than doing it as part of the fixup. This refactoring will make it easier to use the same code for global address space variables and also simplifies the code. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21154 llvm-svn: 272675
* [AMDGPU][llvm-mc] Predefined symbols to access -mcpu from the assembly ↵Artem Tamazov2016-06-141-2/+15
| | | | | | | | | | | | source (.option.machine_version...) The feature allows for conditional assembly etc. TODO: make those symbols read-only. Test added. Differential Revision: http://reviews.llvm.org/D21238 llvm-svn: 272673
* [mips] Optimize stack pointer adjustments.Simon Dardis2016-06-143-4/+17
| | | | | | | | | | | | | | | | | | | | | Instead of always using addu to adjust the stack pointer when the size out is of the range of an addiu instruction, use subu so that a smaller constant can be generated. This can give savings of ~3 instructions whenever a function has a a stack frame whose size is out of range of an addiu instruction. This change may break some naive stack unwinders. Partially resolves PR/26291. Thanks to David Chisnall for reporting the issue. Reviewers: dsanders, vkalintiris Differential Review: http://reviews.llvm.org/D21321 llvm-svn: 272666
* [Thumb] Fix off-by-one error in r272007James Molloy2016-06-142-6/+6
| | | | | | | | We can only generate immediates up to #510 with a MOV+ADD, not #511, because there's no such instruction as add #256. Found by Oliver Stannard and csmith! llvm-svn: 272665
* [mips][atomics] Fix atomic instruction descriptions and uses.Simon Dardis2016-06-148-22/+107
| | | | | | | | | | | | | | PR27458 highlights that the MIPS backend does not have well formed MIR for atomic operations (among other errors). This patch adds expands and corrects the LL/SC descriptions and uses for MIPS(64). Reviewers: dsanders, vkalintiris Differential Review: http://reviews.llvm.org/D19719 llvm-svn: 272655
* [mips][ias] Implement one N32 case (of two) for .cpsetup.Daniel Sanders2016-06-141-28/+27
| | | | | | | | | | | | | | | | | | | | | | This patch implements the N32 case where -mno-shared is in effect. The case where -mshared is in effect will be added later since doing that now requires additional changes to how we handle %hi(%neg(%gp_rel(foo))) expressions to emit the three relocations as three relocations (currently only one of the three would be emitted) which then requires further changes to our MCFixup handling. While we could fix both cases together, fixing the -mno-shared case allows us to fix the ELFCLASS bug (where N32 incorrectly uses ELFCLASS64 instead of ELFCLASS32) in a way that allows cpsetup.s to check for a correct output instead of another incorrect output. Reviewers: sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: http://reviews.llvm.org/D21131 llvm-svn: 272652
* [X86][SSE4A] Added patterns for nontemporal stores of scalar float/doubles ↵Simon Pilgrim2016-06-141-2/+12
| | | | | | using MOVNTSD/MOVNTSS llvm-svn: 272651
* [mips] MIPS32/64 itinerariesSimon Dardis2016-06-147-125/+300
| | | | | | | | | | | Itineraries for some pre MIPSR6 and EVA instructions. Some pseudo expanded instructions are marked as having no scheduling info. Reviewers: dsanders, vkalintiris Differential Review: http://reviews.llvm.org/D20418 llvm-svn: 272648
* [mips][dsp] Fix use without def on DSPCtrl registers read by rddsp intrinsic.Daniel Sanders2016-06-141-1/+2
| | | | | | | | | | Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21063 llvm-svn: 272647
* [mips][msa] copyPhysReg() should not set RegState::Define on result of CTCMSA.Daniel Sanders2016-06-141-2/+5
| | | | | | | | | | | | | | Summary: The machine verifier reports 'Explicit operand marked as def' when it is manually specified even though it agrees with the operand info. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21065 llvm-svn: 272646
* [AVX512] Use AND32ri8 instead of AND32ri when anding with 1 to create single ↵Craig Topper2016-06-141-9/+9
| | | | | | bit masks. This results in a smaller encoding. llvm-svn: 272627
* [AVX512] Use MOVZX32 instead of MOVZ16 for loading single v8/v4/v2/v1 masks ↵Craig Topper2016-06-141-4/+4
| | | | | | when KMOVB is not available. This has better behavior with respect to partial register stalls since it won't need to preserve the upper 16-bits of the GPR. llvm-svn: 272626
* [AVX512] Add patterns for zero-extending a mask that use the def of ↵Craig Topper2016-06-141-0/+6
| | | | | | KMOVW/KMOVB without going through an EXTRACT_SUBREG and a MOVZX. llvm-svn: 272625
* Update the AArch64ExternalSymbolizer to print literal strings as escaped stringsKevin Enderby2016-06-131-3/+5
| | | | | | | | so it is the same as the MCExternalSymbolizer. rdar://17349181 llvm-svn: 272588
* [X86] Remove llvm.x86.bit.scan.{forward,reverse}.32David Majnemer2016-06-131-2/+0
| | | | | | | The need for these intrinsics has been obviated by r272564 which reimplements their functionality using generic IR. llvm-svn: 272566
* AMDGPU/SI: Set INDEX_STRIDE for scratch coalescingMarek Olsak2016-06-132-3/+6
| | | | | | | | | | | | | | | | | Summary: Mesa and other users must set this to enable coalescing: - STRIDE = 0 - SWIZZLE_ENABLE = 1 This makes one particular compute shader 8x faster. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D21136 llvm-svn: 272556
* AMDGPU: Fix post-RA verifier errors with trackLivenessAfterRegAllocMatt Arsenault2016-06-131-14/+16
| | | | | | | The condition reg of the cndmask_b64 expansion can't be killed by the first one, and the implicit super register implicit def is needed. llvm-svn: 272554
* [SystemZ] Enable index register memory constraints for inline ASMUlrich Weigand2016-06-132-27/+25
| | | | | | | | | | | | | | | | This enables use of the 'R' and 'T' memory constraints for inline ASM operands on SystemZ, which allow an index register as well as an immediate displacement. This patch includes corresponding documentation and test case updates. As with the last patch of this kind, I moved the 'm' constraint to the most general case, which is now 'T' (base + 20-bit signed displacement + index register). Author: colpell Differential Revision: http://reviews.llvm.org/D21239 llvm-svn: 272547
* [ARM] Reverting r272544 because clang patch needsRanjeet Singh2016-06-133-64/+13
| | | | | | | to go in as soon as llvm patch has gone in because tests will start breaking in Clang. llvm-svn: 272546
* [ARM] Add mrrc/mrrc2 co-processor intrinsicsRanjeet Singh2016-06-133-13/+64
| | | | | | | | | | | | | MRRC/MRRC2 instruction writes to two registers. The intrinsic definition returns a single uint64_t to represent the write, this is a compact way of representing a write to two 32 bit registers, the alternative might have been two return a struct of 2 uint32_t's but this isn't as nice. Differential Revision: llvm-svn: 272544
* Fix an enumeral mismatch warning.Haojian Wu2016-06-131-2/+4
| | | | | | | | | | | | | | | | | Summary: The "-Werror=enum-compare" shows that the statement is using two different enums: enumeral mismatch in conditional expression: 'llvm::X86ISD::NodeType' vs 'llvm::ISD::NodeType' A follow-up fix on D21235. Reviewers: klimek Subscribers: spatel, cfe-commits Differential Revision: http://reviews.llvm.org/D21278 llvm-svn: 272539
* [AVX512] Remove maksed pshufd, pshuflw, and phufhw intrinsics and ↵Craig Topper2016-06-131-18/+0
| | | | | | autoupgrade them to selects and shufflevector. llvm-svn: 272527
* Run clang-tidy's performance-unnecessary-copy-initialization over LLVM.Benjamin Kramer2016-06-1210-11/+11
| | | | | | No functionality change intended. llvm-svn: 272516
* Move instances of std::function.Benjamin Kramer2016-06-125-15/+15
| | | | | | Or replace with llvm::function_ref if it's never stored. NFC intended. llvm-svn: 272513
* Pass DebugLoc and SDLoc by const ref.Benjamin Kramer2016-06-12123-1702/+1485
| | | | | | | | This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512
* [x86, SSE] change patterns for CMPP to float types to allow matching with ↵Sanjay Patel2016-06-123-23/+53
| | | | | | | | | | | | | | | | | | | | | | | | | SSE1 (PR28044) This patch is intended to solve: https://llvm.org/bugs/show_bug.cgi?id=28044 By changing the definition of X86ISD::CMPP to use float types, we allow it to be created and pass legalization for an SSE1-only target where v4i32 is not legal. The motivational trail for this change includes: https://llvm.org/bugs/show_bug.cgi?id=28001 and eventually makes this trigger: http://reviews.llvm.org/D21190 Ie, after this step, we should be free to have Clang generate FP compare IR instead of x86 intrinsics for SSE C packed compare intrinsics. (We can auto-upgrade and remove the LLVM sse.cmp intrinsics as a follow-up step.) Once we're generating vector IR instead of x86 intrinsics, a big pile of generic optimizations can trigger. Differential Revision: http://reviews.llvm.org/D21235 llvm-svn: 272511
* [X86] Remove sse2 pshufd/pshuflw/pshufhw intrinsics and upgrade them to ↵Craig Topper2016-06-121-3/+0
| | | | | | shufflevector. llvm-svn: 272510
OpenPOWER on IntegriCloud