path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* Use SDValue helper instead of explicitly going via SDValue::getNode(). NFCI | Simon Pilgrim | 2016-11-25 | 1 | -5/+5
  llvm-svn: 287940
* [AVX-512] Add support for changing VSHUFF64x2 to VSHUFF32x4 when it's feeding a vselect with 32-bit element size. | Craig Topper | 2016-11-25 | 1 | -9/+25
  Summary: Shuffle lowering may have widened the element size of an i32 shuffle to i64 before selecting X86ISD::SHUF128. If this shuffle was used by a vselect, this can prevent us from selecting masked operations. This patch detects this and changes the element size to match the vselect.
  I don't handle changing integer to floating point or vice versa, as it's not clear whether it's better to push such a bitcast to the inputs of the shuffle or to the user of the vselect. So I'm ignoring that case for now.
  Reviewers: delena, zvi, RKSimon
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D27087
  llvm-svn: 287939
* [AVX-512] Add VPERMT2* and VPERMI2* instructions to load folding tables. | Craig Topper | 2016-11-25 | 1 | -0/+32
  llvm-svn: 287937
* Revert "AMDGPU: Implement SGPR spilling with scalar stores" | Marek Olsak | 2016-11-25 | 3 | -153/+10
  This reverts commit 4404d0d6e354e80dd7f8f0a0e12d8ad809cf007e.
  llvm-svn: 287936
* Revert "AMDGPU: Fix MMO when splitting spill" | Marek Olsak | 2016-11-25 | 2 | -79/+47
  This reverts commit 79d4f8b8b1ce430c3d5dac4fc72a9eebaed24fe1.
  llvm-svn: 287935
* Revert "AMDGPU: Fix adding extra implicit def of register" | Marek Olsak | 2016-11-25 | 1 | -25/+14
  This reverts commit e834ce5976567575621901fb967b8018b9916d71.
  llvm-svn: 287934
* Revert "AMDGPU: Fix not setting kill flag on temp reg when spilling" | Marek Olsak | 2016-11-25 | 1 | -1/+1
  This reverts commit 057bbbe4ae170247ba37f08f2e70ef185267d1bb.
  llvm-svn: 287933
* Revert "AMDGPU: Make m0 unallocatable" | Marek Olsak | 2016-11-25 | 6 | -23/+16
  This reverts commit 124ad83dae04514f943902446520c859adee0e96.
  llvm-svn: 287932
* Revert "AMDGPU: Remove m0 spilling code" | Marek Olsak | 2016-11-25 | 1 | -3/+37
  This reverts commit f18de36554eb22416f8ba58e094e0272523a4301.
  llvm-svn: 287931
* Revert "AMDGPU: Preserve m0 value when spilling" | Marek Olsak | 2016-11-25 | 1 | -34/+5
  This reverts commit a5a179ffd94fd4136df461ec76fb30f04afa87ce.
  llvm-svn: 287930
* [Loop Unswitch] Patch to selectively unswitch only the reachable branch instructions. | Abhilash Bhandari | 2016-11-25 | 1 | -1/+36
  Summary: The iterative algorithm for Loop Unswitching may render some of the branches unreachable in the unswitched loops. Given the exponential nature of the algorithm, this is quite an overhead. This patch fixes this problem by selectively unswitching only those branches within a loop that are reachable from the loop header.
  Reviewers: Michael Zolothukin, Anna Thomas, Weiming Zhao.
  Subscribers: llvm-commits.
  Differential Revision: http://reviews.llvm.org/D26299
  llvm-svn: 287925
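The reachability filter described in this commit can be pictured as a breadth-first walk from the loop header. The following is only a minimal sketch, not the LoopUnswitch code: `Cfg` and `reachableFromHeader` are hypothetical names, and blocks are plain integer indices rather than LLVM `BasicBlock`s.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical CFG: succs[b] lists the successor block indices of block b.
using Cfg = std::vector<std::vector<std::size_t>>;

// Returns one flag per block: true if the block is reachable from `header`.
std::vector<bool> reachableFromHeader(const Cfg &succs, std::size_t header) {
  assert(header < succs.size() && "header must be a valid block index");
  std::vector<bool> seen(succs.size(), false);
  std::queue<std::size_t> work;
  seen[header] = true;
  work.push(header);
  while (!work.empty()) {
    std::size_t block = work.front();
    work.pop();
    for (std::size_t next : succs[block]) {
      if (!seen[next]) {
        seen[next] = true;
        work.push(next);
      }
    }
  }
  return seen;
}
```

Under this sketch, a branch would be considered as an unswitch candidate only if the block containing it is flagged reachable.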
* [mips] Correct jal expansion for local symbols in .local directives. | Simon Dardis | 2016-11-25 | 1 | -1/+2
  This patch corrects the behaviour of code such as:
      .local foo
      jal foo
    foo:
  to use the correct jal expansion when writing ELF files.
  Patch by: Daniel Sanders
  Reviewers: zoran.jovanovic, seanbruno, vkalintiris
  Differential Revision: https://reviews.llvm.org/D24722
  llvm-svn: 287918
* [X86] Invert an 'if' and early out to fix a weird indentation. NFCI | Craig Topper | 2016-11-25 | 1 | -1/+2
  llvm-svn: 287909
* [X86] Size a SmallVector to the worst case mask size for a 512-bit shuffle. NFCI | Craig Topper | 2016-11-25 | 1 | -1/+1
  llvm-svn: 287908
* [DAGCombine] Teach DAG combine that if both inputs of a vselect are the same, then the condition doesn't matter and the vselect can be removed. | Craig Topper | 2016-11-24 | 1 | -0/+4
  Selects with scalar condition already handle this correctly.
  llvm-svn: 287904
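The reasoning behind this combine can be shown with a scalar model of a vector select. This is a sketch under stated assumptions: `vselect` and `vselectFolded` are hypothetical helpers on `std::vector`, and value equality stands in for "both operands are the same DAG node".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element-wise select: lane i takes a[i] when mask[i] is true, else b[i].
std::vector<int> vselect(const std::vector<bool> &mask,
                         const std::vector<int> &a,
                         const std::vector<int> &b) {
  assert(mask.size() == a.size() && a.size() == b.size());
  std::vector<int> out(a.size());
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = mask[i] ? a[i] : b[i];
  return out;
}

// The simplification: with identical inputs every lane picks the same value
// whichever way the mask goes, so the select can be skipped entirely.
std::vector<int> vselectFolded(const std::vector<bool> &mask,
                               const std::vector<int> &a,
                               const std::vector<int> &b) {
  if (a == b)
    return a;
  return vselect(mask, a, b);
}
```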
* Test commit access. | Serge Rogatch | 2016-11-24 | 1 | -0/+1
  llvm-svn: 287898
* Fix unused variable warning | Simon Pilgrim | 2016-11-24 | 1 | -1/+0
  llvm-svn: 287889
* [X86] Don't round trip a unique_ptr through a raw pointer for assignment. | Benjamin Kramer | 2016-11-24 | 1 | -1/+1
  No functional change.
  llvm-svn: 287888
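The log does not show the line that was changed, so the following is only a generic sketch of the pattern named in the subject. `Widget`, `assignViaRawPointer`, and `assignViaMove` are hypothetical names chosen for illustration.

```cpp
#include <memory>
#include <utility>

struct Widget { int value; };  // hypothetical payload type

// Round trip through a raw pointer: ownership is handed over by hand.
void assignViaRawPointer(std::unique_ptr<Widget> &dst,
                         std::unique_ptr<Widget> &src) {
  dst.reset(src.release());
}

// Direct move assignment: same end state, no raw pointer in between.
void assignViaMove(std::unique_ptr<Widget> &dst,
                   std::unique_ptr<Widget> &src) {
  dst = std::move(src);
}
```

Both functions leave `src` empty and `dst` owning the object; the move form simply keeps ownership inside `unique_ptr` for the whole transfer.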
* [X86][SSE] Improve UINT_TO_FP v2i32 -> v2f64 | Simon Pilgrim | 2016-11-24 | 1 | -8/+38
  Vectorize UINT_TO_FP v2i32 -> v2f64 instead of scalarization (albeit still on the SIMD unit).
  The codegen matches that generated by legalization (and is in fact used by AVX for UINT_TO_FP v4i32 -> v4f64), but has to be done in the x86 backend to account for legalization via 4i32.
  Differential Revision: https://reviews.llvm.org/D26938
  llvm-svn: 287886
* [X86][AVX512] Add support for v2i64 fptosi/fptoui/sitofp/uitofp on AVX512DQ-only targets | Simon Pilgrim | 2016-11-24 | 3 | -5/+29
  Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances
  llvm-svn: 287882
* [X86][AVX512DQVL] Add awareness of vcvtqq2ps and vcvtuqq2ps implicit zeroing of upper 64-bits of xmm result | Simon Pilgrim | 2016-11-24 | 1 | -0/+11
  llvm-svn: 287878
* [X86][AVX512DQVL] Add support for v2i64 -> v2f32 SINT_TO_FP/UINT_TO_FP lowering | Simon Pilgrim | 2016-11-24 | 1 | -4/+22
  llvm-svn: 287877
* [x86] Fixing PR28755 by precomputing the address used in CMPXCHG8B | Nikolai Bozhenov | 2016-11-24 | 3 | -1/+63
  The bug arises during register allocation on i686 for the CMPXCHG8B instruction when a base pointer is needed. CMPXCHG8B needs 4 implicit registers (EAX, EBX, ECX, EDX) and a memory address, plus ESI is reserved as the base pointer. With such constraints, the register allocator can only succeed when the addressing mode of the instruction requires a single register. If that is not the case, we emit an additional LEA instruction to compute the address.
  It fixes PR28755.
  Patch by Alexander Ivchenko <alexander.ivchenko@intel.com>
  Differential Revision: https://reviews.llvm.org/D25088
  llvm-svn: 287875
* [x86] Minor refactoring of X86TargetLowering::EmitInstrWithCustomInserter | Nikolai Bozhenov | 2016-11-24 | 1 | -10/+6
  Move the definitions of three variables out of the switch.
  Patch by Alexander Ivchenko <alexander.ivchenko@intel.com>
  Differential Revision: https://reviews.llvm.org/D25192
  llvm-svn: 287874
* [x86] Rewrite getAddressFromInstr helper function | Nikolai Bozhenov | 2016-11-24 | 1 | -17/+18
  - It does not modify the input instruction
  - Second operand of any address is always an Index Register, make sure we actually check for that, instead of a check for an immediate value
  Patch by Alexander Ivchenko <alexander.ivchenko@intel.com>
  Differential Revision: https://reviews.llvm.org/D24938
  llvm-svn: 287873
* [X86] Generalize CVTTPD2DQ/CVTTPD2UDQ and CVTDQ2PD/CVTUDQ2PD opcodes. NFCI | Simon Pilgrim | 2016-11-24 | 6 | -58/+54
  Replace the CVTTPD2DQ/CVTTPD2UDQ and CVTDQ2PD/CVTUDQ2PD opcodes with general versions.
  This is an initial step towards similar FP_TO_SINT/FP_TO_UINT and SINT_TO_FP/UINT_TO_FP lowering to AVX512 CVTTPS2QQ/CVTTPS2UQQ and CVTQQ2PS/CVTUQQ2PS with illegal types.
  Differential Revision: https://reviews.llvm.org/D27072
  llvm-svn: 287870
* Object: Add IRObjectFile::getTargetTriple(). | Peter Collingbourne | 2016-11-24 | 1 | -0/+2
  This lets us remove a use of IRObjectFile::getModule() in llvm-nm.
  Differential Revision: https://reviews.llvm.org/D27074
  llvm-svn: 287846
* Object: Simplify the IRObjectFile symbol iterator implementation. | Peter Collingbourne | 2016-11-24 | 1 | -89/+25
  Change the IRObjectFile symbol iterator to be a pointer into a vector of PointerUnions representing either IR symbols or asm symbols.
  This change is in preparation for a future change for supporting multiple modules in an IRObjectFile. Although it causes an increase in memory consumption, we can deal with that issue separately by introducing a bitcode symbol table.
  Differential Revision: https://reviews.llvm.org/D26928
  llvm-svn: 287845
* AMDGPU: Preserve m0 value when spilling | Matt Arsenault | 2016-11-24 | 1 | -5/+34
  llvm-svn: 287844
* TRI: Add hook to pass scavenger during frame elimination | Matt Arsenault | 2016-11-24 | 3 | -5/+23
  The scavenger was not passed if requiresFrameIndexScavenging was enabled. I need to be able to test for the availability of an unallocatable register here, so I can't create a virtual register for it.
  It might be better to just always use the scavenger and stop creating virtual registers.
  llvm-svn: 287843
* AMDGPU: Remove m0 spilling code | Matt Arsenault | 2016-11-24 | 1 | -37/+3
  Since m0 isn't allocatable it should never be spilled anymore.
  llvm-svn: 287842
* AMDGPU: Make m0 unallocatable | Matt Arsenault | 2016-11-24 | 6 | -16/+23
  m0 may need to be written for spill code, so we don't want general code uses relying on the value stored in it.
  This introduces a few code quality regressions where copies from m0 are not coalesced into copies of a copy of m0.
  llvm-svn: 287841
* [lib/LTO] Rename a few instances of Lto to LTO. | Davide Italiano | 2016-11-24 | 1 | -6/+6
  llvm-svn: 287840
* Rely on a single DWARF version instead of having two copies | Greg Clayton | 2016-11-23 | 6 | -16/+21
  This patch makes AsmPrinter less reliant on DwarfDebug by relying on the DWARF version in the AsmPrinter's MCStreamer's MCContext. This allows us to remove the redundant DWARF version from DwarfDebug. It also lets us change code that used to access the AsmPrinter's DwarfDebug just to get to the DWARF version by changing the DWARF version accessor on AsmPrinter so that it grabs the version from its MCStreamer's MCContext.
  Differential Revision: https://reviews.llvm.org/D27032
  llvm-svn: 287839
* [DebugInfo] Fix some Clang-tidy modernize-use-default and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2016-11-23 | 18 | -125/+166
  Per Zachary Turner's and Mehdi Amini's suggestion to make only post-commit reviews.
  llvm-svn: 287838
* [X86][SSE] Add awareness of (v)cvtpd2dq and vcvtpd2udq implicit zeroing of upper 64-bits of xmm result | Simon Pilgrim | 2016-11-23 | 2 | -15/+30
  We've already added the equivalent for (v)cvttpd2dq (rL284459) and vcvttpd2udq
  llvm-svn: 287835
* [SelectionDAG] Early-out in TargetLowering::expandMUL (NFC) | Nicolai Haehnle | 2016-11-23 | 1 | -77/+80
  Summary: Reduce indentation level; preparation for D24956.
  Reviewers: efriedma
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D27063
  llvm-svn: 287831
* AMDGPU: Cleanup immediate folding code | Matt Arsenault | 2016-11-23 | 1 | -64/+62
  Move code down to use, reorder to avoid hard-to-follow immediate folding logic.
  llvm-svn: 287818
* AMDGPU: Fix debug printing | Matt Arsenault | 2016-11-23 | 1 | -1/+1
  The uint8_t was printed as a char which didn't really work.
  llvm-svn: 287817
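The pitfall named in this commit is easy to reproduce with standard streams. This sketch uses `std::ostringstream` rather than LLVM's own stream classes, and it assumes `std::uint8_t` is an alias for `unsigned char`, as it is on mainstream platforms. `streamRaw` and `streamAsNumber` are hypothetical helper names.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Streams the byte as-is: the stream treats it as a character.
std::string streamRaw(std::uint8_t v) {
  std::ostringstream os;
  os << v;
  return os.str();
}

// Widens the byte first, so the stream formats it as a number.
std::string streamAsNumber(std::uint8_t v) {
  std::ostringstream os;
  os << static_cast<unsigned>(v);
  return os.str();
}
```

With the value 65, `streamAsNumber` yields "65", while `streamRaw` yields the single character "A" under the aliasing assumption above.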
* AMDGPU: Fix not setting kill flag on temp reg when spilling | Matt Arsenault | 2016-11-23 | 1 | -1/+1
  llvm-svn: 287808
* AMDGPU: Fix adding extra implicit def of register | Matt Arsenault | 2016-11-23 | 1 | -14/+25
  In the scalar case, there's no reason to add an additional def of the same register.
  llvm-svn: 287807
* AMDGPU: Fix MMO when splitting spill | Matt Arsenault | 2016-11-23 | 2 | -47/+79
  The size and offset were wrong. The size of the object was being used for the size of the access, when here it is really being split into 4-byte accesses. The underlying object size is set in the MachinePointerInfo, which also didn't have the offset set.
  llvm-svn: 287806
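The size and offset bookkeeping described in this commit can be illustrated with a small sketch. `Access` is a hypothetical stand-in for a memory operand, not the LLVM `MachineMemOperand` class, and the sketch assumes the spilled object is a whole number of 4-byte dwords.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of one memory access, in bytes.
struct Access {
  unsigned offset;  // byte offset into the underlying object
  unsigned size;    // size of this access, not of the whole object
};

// Splits an object into 4-byte accesses, each with its own offset.
std::vector<Access> splitIntoDwordAccesses(unsigned objectSize) {
  assert(objectSize % 4 == 0 && "sketch assumes a whole number of dwords");
  std::vector<Access> parts;
  for (unsigned offset = 0; offset < objectSize; offset += 4)
    parts.push_back(Access{offset, 4});
  return parts;
}
```

For a 16-byte object this produces four accesses of size 4 at offsets 0, 4, 8, and 12, rather than four accesses that each claim the full 16 bytes.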
* [LoopUnroll] Move code to exit early. NFC. | Haicheng Wu | 2016-11-23 | 1 | -10/+8
  Just to save some compilation time.
  Differential Revision: https://reviews.llvm.org/D26784
  llvm-svn: 287800
* Revert "[Triple] Add Facebook vendor" | Daniel Berlin | 2016-11-23 | 1 | -2/+0
  This reverts commit r287684.
  Objections on the review thread had not been addressed prior to commit. I asked the committer to revert, but I expect they are gone for the US holiday or something.
  llvm-svn: 287798
* [X86] Allow folding of stack reloads when loading a subreg of the spilled reg | Michael Kuperstein | 2016-11-23 | 4 | -7/+53
  We did not support subregs in InlineSpiller::foldMemoryOperand() because targets may not deal with them correctly.
  This adds a target hook to let the spiller know that a target can handle subregs, and actually enables it for x86 for the case of stack slot reloads.
  This fixes PR30832.
  Differential Revision: https://reviews.llvm.org/D26521
  llvm-svn: 287792
* [PM] Change the static object whose address is used to uniquely identify analyses to have a common type which is enforced rather than using a char object and a `void *` type when used as an identifier. | Chandler Carruth | 2016-11-23 | 34 | -40/+42
  This has a number of advantages. First, it at least helps with some of the confusion raised in Justin Lebar's code review of why `void *` was being used everywhere by having a stronger type that connects to documentation about this.
  However, perhaps more importantly, it addresses a serious issue where the alignment of these pointer-like identifiers was unknown. This made it hard to use them in pointer-like data structures. We were already dodging this in dangerous ways to create the "all analyses" entry. In a subsequent patch I attempted to use these with TinyPtrVector and things fell apart in a very bad way.
  And it isn't just a compile time or type system issue. Worse than that, the actual alignment of these pointer-like opaque identifiers wasn't guaranteed to be a useful alignment as they were just characters.
  This change introduces a type to use as the "key" object whose address forms the opaque identifier. This both forces the objects to have proper alignment, and provides type checking that we get it right everywhere. It also makes the types somewhat less mysterious than `void *`.
  We could go one step further and introduce a truly opaque pointer-like type to return from the `ID()` static function rather than returning `AnalysisKey *`, but that didn't seem to be a clear win so this is just the initial change to get to a reliably typed and aligned object serving as a key for all the analyses.
  Thanks to Richard Smith and Justin Lebar for helping pick plausible names and avoid making this refactoring many times. =] And thanks to Sean for the super fast review!
  While here, I've tried to move away from the "PassID" nomenclature entirely as it wasn't really helping and is overloaded with old pass manager constructs. Now we have IDs for analyses, and key objects whose address can be used as IDs. Where possible and clear I've shortened this to just "ID". In a few places I kept "AnalysisID" to make it clear what was being identified.
  Differential Revision: https://reviews.llvm.org/D27031
  llvm-svn: 287783
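The idea of a typed, aligned key object whose address serves as the identifier can be sketched as follows. The `AnalysisKey` name and the `ID()` function returning `AnalysisKey *` come from the commit message; the 8-byte alignment, the `FooAnalysis` and `BarAnalysis` types, and `hasFreeLowBits` are assumptions made for illustration only.

```cpp
#include <cstdint>

// Empty key type; the alignment value is an assumption of this sketch.
struct alignas(8) AnalysisKey {};
static_assert(alignof(AnalysisKey) == 8, "key objects are 8-byte aligned");

// Hypothetical analyses: each owns one static key, and its address is the ID.
struct FooAnalysis {
  static AnalysisKey *ID() { return &Key; }

private:
  static AnalysisKey Key;
};
AnalysisKey FooAnalysis::Key;

struct BarAnalysis {
  static AnalysisKey *ID() { return &Key; }

private:
  static AnalysisKey Key;
};
AnalysisKey BarAnalysis::Key;

// With a known alignment, the low bits of every ID are zero, which is what
// pointer-like containers rely on when they pack tag bits into a pointer.
bool hasFreeLowBits(const AnalysisKey *id) {
  return reinterpret_cast<std::uintptr_t>(id) % alignof(AnalysisKey) == 0;
}
```

A plain `char` object used as the key gives no such alignment guarantee, and a `void *` identifier gives no type checking, which are the two problems the commit message describes.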
* [LoadStoreVectorizer] Enable vectorization of stores in the presence of an aliasing load | Alina Sbirlea | 2016-11-23 | 1 | -3/+25
  Summary: The "getVectorizablePrefix" method would give up if it found an aliasing load for a store chain. In practice, the aliasing load can be treated as a memory barrier and all stores that precede it are a valid vectorizable prefix.
  Issue found by volkan in D26962. Testcase is a pruned version of the one in the original patch.
  Reviewers: jlebar, arsenm, tstellarAMD
  Subscribers: mzolotukhin, wdng, nhaehnle, anna, volkan, llvm-commits
  Differential Revision: https://reviews.llvm.org/D27008
  llvm-svn: 287781
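The "aliasing load as a barrier" idea can be sketched on a simplified model. This is not the `getVectorizablePrefix` implementation: the instruction sequence is assumed to be pre-classified into the hypothetical `MemOp` tags below, whereas the real pass works on LLVM instructions and alias queries.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical classification of each instruction relative to a store chain.
enum class MemOp { ChainStore, AliasingLoad, Unrelated };

// Length of the leading run that precedes the first aliasing load. Everything
// in that run stays eligible; the load acts as a barrier for what follows.
std::size_t vectorizablePrefixLength(const std::vector<MemOp> &ops) {
  std::size_t n = 0;
  while (n < ops.size() && ops[n] != MemOp::AliasingLoad)
    ++n;
  return n;
}
```

In this model, giving up entirely would correspond to returning 0 whenever an aliasing load is present; returning the prefix length instead keeps the stores before the load available for vectorization.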
* [DAG] Improve loads-from-store forwarding to handle TokenFactor | Nirav Dave | 2016-11-23 | 1 | -2/+13
  Forward store values to matching loads down through token factors. Factored from D14834.
  Reviewers: jyknight, hfinkel
  Subscribers: hfinkel, nemanjai, llvm-commits
  Differential Revision: https://reviews.llvm.org/D26080
  llvm-svn: 287773
* [DAGCombiner] Fix infinite loop in vector mul/shl combining | John Brawn | 2016-11-23 | 2 | -18/+15
  We have the following DAGCombiner transformations:
    (mul (shl X, c1), c2) -> (mul X, c2 << c1)
    (mul (shl X, C), Y) -> (shl (mul X, Y), C)
    (shl (mul x, c1), c2) -> (mul x, c1 << c2)
  Usually the constant shift is optimised by SelectionDAG::getNode when it is constructed, by SelectionDAG::FoldConstantArithmetic, but when we're dealing with vectors and one of those vector constants contains an undef element FoldConstantArithmetic does not fold and we enter an infinite loop.
  Fix this by making FoldConstantArithmetic use getNode to decide how to fold each vector element, the same as FoldConstantVectorArithmetic does, and rather than adding the constant shift to the work list instead only apply the transformation if it's already been folded into a constant, as if it's not we're going to loop endlessly. Additionally add missing NoOpaques to one of those transformations, which I noticed when writing the tests for this.
  Differential Revision: https://reviews.llvm.org/D26605
  llvm-svn: 287766
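The first and third rewrites listed in this commit rest on an identity of fixed-width wraparound arithmetic: shifting left by c is multiplication by 2^c. The scalar sketch below checks that identity on 32-bit unsigned values. It does not model DAG nodes, vectors, or undef elements, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// (mul (shl X, c1), c2)
inline std::uint32_t mulOfShl(std::uint32_t x, unsigned c1, std::uint32_t c2) {
  assert(c1 < 32 && "shift amount must be less than the bit width");
  return (x << c1) * c2;
}

// (mul X, c2 << c1): the shift is applied to the constant instead.
inline std::uint32_t mulByShiftedConst(std::uint32_t x, unsigned c1,
                                       std::uint32_t c2) {
  assert(c1 < 32 && "shift amount must be less than the bit width");
  return x * (c2 << c1);
}

// (shl (mul x, c1), c2), which rewrites to (mul x, c1 << c2).
inline std::uint32_t shlOfMul(std::uint32_t x, std::uint32_t c1, unsigned c2) {
  assert(c2 < 32 && "shift amount must be less than the bit width");
  return (x * c1) << c2;
}
```

The rewrites only make progress when `c2 << c1` actually folds to a constant. As the message explains, when it does not fold, the combiner can keep bouncing between the forms, which is why the fix applies the transformation only once the shifted constant has already been folded.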
* [PowerPC] Remove InstAlias definitions that cause incorrect assembly | Nemanja Ivanovic | 2016-11-23 | 1 | -11/+15
  In rL283190, I added some InstAlias definitions to generate extended mnemonics for some uses of the XXPERMDI instruction. However, when the assembler matches these extended mnemonics, it matches the new instruction in situations where it should match the old one. This patch removes these definitions and instead provides these mnemonics by defining additional instructions that are isCodeGenOnly.
  Fixes PR31127.
  llvm-svn: 287765