path: root/llvm/lib
Commit message  Author  Age  Files  Lines
* parseArch() supports more variations of arch names for PowerPC buildsKelvin Li2016-01-191-4/+4
| | | | llvm-svn: 258103
* Add a change accidentally left out from r258100Tobias Edler von Koch2016-01-181-0/+0
| | | | | | Also remove an executable bit introduced by r258083. llvm-svn: 258101
* [LTO] Restore original linkage of externals prior to splittingTobias Edler von Koch2016-01-181-1/+41
| | | | | | | | | | | | | | | | | | | | | | | Summary: This is a companion patch for http://reviews.llvm.org/D16124. Internalized symbols increase the size of strongly-connected components in SCC-based module splitting and thus reduce the amount of parallelism. This patch records the original linkage of non-local symbols prior to internalization and then restores it just before splitting/CodeGen. This is also useful for cases where the linker requires symbols to remain external, for instance, so they can be placed according to linker script rules. It's currently under its own flag (-restore-globals) but should eventually share a common flag with D16124. Reviewers: joker.eph, pcc Subscribers: slarin, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D16229 llvm-svn: 258100
* Fixed MSVC warning that not all control paths return a value.Simon Pilgrim2016-01-181-0/+1
| | | | llvm-svn: 258099
* AMDGPU: Reduce 64-bit SRAsMatt Arsenault2016-01-182-0/+62
| | | | llvm-svn: 258096
* AMDGPU: Split 64-bit and of constant upMatt Arsenault2016-01-183-2/+70
| | | | | | | | | | This breaks the tests that were meant for testing 64-bit inline immediates, so move those to shl where they won't be broken up. This should be repeated for the other related bit ops. llvm-svn: 258095
* [AArch64] Remove unused arguments. NFC.Chad Rosier2016-01-181-7/+7
| | | | | | AFAICT, these have been unused since the initial backend import. llvm-svn: 258093
* AMDGPU: Generalize shl combineMatt Arsenault2016-01-181-8/+14
| | | | | | | Reduce 64-bit shl with constant > 32. We already special cased this for the == 32 case, but this also works for any >= 32 constant. llvm-svn: 258092
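The arithmetic behind this combine can be illustrated outside of LLVM. The following is a minimal sketch, not the AMDGPU implementation: it assumes a 64-bit value held as two 32-bit halves, and the helper name `shl64ViaShl32` is hypothetical. For a constant shift amount of 32 or more, only the low half of the input contributes, so a single 32-bit shift is enough.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): models a 64-bit left shift by a
// constant c in [32, 63] using only one 32-bit shift.
inline uint64_t shl64ViaShl32(uint64_t x, unsigned c) {
  assert(c >= 32 && c < 64 && "sketch only covers amounts in [32, 63]");
  uint32_t lo = static_cast<uint32_t>(x);  // the high half of x is shifted out
  uint32_t hi = lo << (c - 32);            // 32-bit shift by c - 32, in [0, 31]
  // Rebuild the 64-bit value: new high half is `hi`, new low half is zero.
  // The shift below only models placing `hi` in the upper register.
  return static_cast<uint64_t>(hi) << 32;
}
```

For example, `shl64ViaShl32(x, 40)` agrees with `x << 40` for any `x`; the c == 32 case degenerates to moving the low half into the high half.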
* AMDGPU: Reduce 64-bit lshr by constant to 32-bitMatt Arsenault2016-01-182-0/+45
| | | | | | 64-bit shifts are very slow on some subtargets. llvm-svn: 258090
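The commit message does not spell out when the reduction applies. As a sketch under the assumption that it mirrors the shl combine (constant shift amount of at least 32), the logical right shift needs only the high half of the input. The helper name `lshr64ViaLshr32` is hypothetical and this is not the AMDGPU code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): models a 64-bit logical right shift
// by a constant c in [32, 63] using only one 32-bit shift.
inline uint64_t lshr64ViaLshr32(uint64_t x, unsigned c) {
  assert(c >= 32 && c < 64 && "sketch only covers amounts in [32, 63]");
  uint32_t hi = static_cast<uint32_t>(x >> 32);  // the low half is shifted out
  uint32_t lo = hi >> (c - 32);                  // 32-bit shift by c - 32
  // New low half is `lo`; the new high half is zero.
  return static_cast<uint64_t>(lo);
}
```

`lshr64ViaLshr32(x, 40)` agrees with `x >> 40` for unsigned `x`, which is why a slow 64-bit shift can be replaced by a 32-bit one in this range.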
* [LAA] Include function name in debug outputAdam Nemet2016-01-181-3/+4
| | | | llvm-svn: 258088
* AMDGPU: Add subtarget feature for instruction ratesMatt Arsenault2016-01-184-9/+23
| | | | llvm-svn: 258085
* Fixed MSVC Win64 warning of implicit conversion of 32-bit shift to 64-bits.Simon Pilgrim2016-01-181-1/+1
| | | | llvm-svn: 258084
* Add to the split module utility an SCC based method which allows not to ↵Sergei Larin2016-01-182-21/+191
| | | | | | | | | | | | | | | | | | globalize any local variables. Summary: Currently llvm::SplitModule as the first step globalizes all local objects, which might not be desirable in some scenarios. This change adds a new flag to llvm::SplitModule that uses SCC approach to search for a balanced partition without the need to externalize symbols. Such partition might not be possible or fully balanced for a given number of partitions, and is a function of the module properties (global/local dependencies within the module). Joint development Tobias Edler von Koch (tobias@codeaurora.org) and Sergei Larin (slarin@codeaurora.org) Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D16124 llvm-svn: 258083
* [X86][AVX2] Broadcast subvectorsSimon Pilgrim2016-01-181-3/+21
| | | | | | AVX2 can only broadcast from the zeroth element of a vector, but if the broadcastable element is the zeroth element of a 128-bit subvector, it's advantageous to extract the subvector and broadcast from that, avoiding the load of shuffle mask data that VPERMPS/VPERMD would need. The only exception is when the source type is 4f64 or 4i64, which can use the immediate shuffle VPERMPD/VPERMQ directly. Differential Revision: http://reviews.llvm.org/D16050 llvm-svn: 258081
* [Hexagon] Recognize more copy-equivalents in RDF optimizationsKrzysztof Parzyszek2016-01-181-14/+59
| | | | llvm-svn: 258076
* [RDF] Improvements to copy propagationKrzysztof Parzyszek2016-01-182-72/+145
| | | | | | | - Allow any instruction to define equality between registers. - Keep the DFG updated. llvm-svn: 258075
* [RDF] Improve compile-time performance of dead code eliminationKrzysztof Parzyszek2016-01-182-12/+42
| | | | llvm-svn: 258074
* [RDF] Allow unlinking ref nodes from data-flow chains onlyKrzysztof Parzyszek2016-01-183-14/+23
| | | | llvm-svn: 258073
* [TableGen] Use FoldingSets instead of DenseMaps to unique UnOpInit, ↵Craig Topper2016-01-181-39/+79
| | | | | | BinOpInit and TernOpInit. This removes the memory needed to store the key for the DenseMap. NFC llvm-svn: 258071
* [TableGen] Fix an assert I missed in r258063.Craig Topper2016-01-181-1/+1
| | | | llvm-svn: 258068
* TargetLowering: Improve handling of (setcc ([sz]ext x) 0, cc) in SimplifySetCCTom Stellard2016-01-181-0/+49
| Summary:
| When SimplifySetCC sees a setcc node that compares the result of a value extension operation with a constant, it tries to simplify the setcc node by eliminating the extension and shrinking the constant. If shrinking the inputs to setcc is deemed not desirable by the target (e.g. the target does not want a setcc comparing i1 values), then it is still possible to optimize this sequence in some cases.
|
| This patch adds the following combines to SimplifySetCC when shrinking setcc inputs is not desirable:
|
|   (setcc ([sz]ext (setcc x, y, cc)), 0, setne) -> (setcc (x, y, cc))
|   (setcc ([sz]ext (setcc x, y, cc)), 0, seteq) -> (setcc (x, y, !cc))
|
| There are no tests for this yet, but once AMDGPU correctly implements TargetLowering::isTypeDesirableForOp(), this new combine will be exercised by the existing CodeGen/AMDGPU/setcc-opt.ll test.
|
| Reviewers: resistor, arsenm
| Subscribers: jroelofs, arsenm, llvm-commits
| Differential Revision: http://reviews.llvm.org/D15034
| llvm-svn: 258067
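Why the two setcc combines are sound can be checked with a small standalone model. This is a sketch only: it does not use the SelectionDAG API, and the names `Cond`, `invert`, `setcc`, `zext`, `sext` and `combinesHold` are all hypothetical stand-ins, with a reduced set of signed condition codes. The point it shows is that an extended boolean is nonzero exactly when the inner comparison is true, so comparing it against 0 either reproduces the inner comparison (setne) or its inverse (seteq).

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Hypothetical, reduced set of condition codes for the model.
enum class Cond { EQ, NE, LT, GE, GT, LE };

// Logical inverse of a condition code (the "!cc" in the commit message).
inline Cond invert(Cond cc) {
  switch (cc) {
  case Cond::EQ: return Cond::NE;
  case Cond::NE: return Cond::EQ;
  case Cond::LT: return Cond::GE;
  case Cond::GE: return Cond::LT;
  case Cond::GT: return Cond::LE;
  case Cond::LE: return Cond::GT;
  }
  assert(false && "unknown condition code");
  return cc;
}

// Model of (setcc x, y, cc) producing an i1.
inline bool setcc(int32_t x, int32_t y, Cond cc) {
  switch (cc) {
  case Cond::EQ: return x == y;
  case Cond::NE: return x != y;
  case Cond::LT: return x < y;
  case Cond::GE: return x >= y;
  case Cond::GT: return x > y;
  case Cond::LE: return x <= y;
  }
  assert(false && "unknown condition code");
  return false;
}

inline int32_t zext(bool b) { return b ? 1 : 0; }   // zero-extend i1
inline int32_t sext(bool b) { return b ? -1 : 0; }  // sign-extend i1

// Checks both combines over a small range of inputs and all modeled codes.
inline bool combinesHold() {
  for (Cond cc : {Cond::EQ, Cond::NE, Cond::LT, Cond::GE, Cond::GT, Cond::LE})
    for (int32_t x = -2; x <= 2; ++x)
      for (int32_t y = -2; y <= 2; ++y) {
        bool inner = setcc(x, y, cc);
        for (int32_t ext : {zext(inner), sext(inner)}) {
          if ((ext != 0) != setcc(x, y, cc)) return false;          // setne
          if ((ext == 0) != setcc(x, y, invert(cc))) return false;  // seteq
        }
      }
  return true;
}
```

The model covers both zero and sign extension, matching the `[sz]ext` pattern in the commit message.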
* [TableGen] Merge the SuperClass Record and SMRange vector into a single ↵Craig Topper2016-01-183-19/+18
| | | | | | vector. This removes the state needed to manage the extra vector thus reducing the size of the Record class. NFC llvm-svn: 258065
* [TableGen] Allocate the Init pointer array for BitsInit/ListInit after the ↵Craig Topper2016-01-181-8/+14
| | | | | | BitsInit/ListInit object itself. Saves a bit of memory. NFC llvm-svn: 258063
* combine clauses with same output ; NFCISanjay Patel2016-01-181-8/+3
| | | | llvm-svn: 258062
* use m_OneUse ; NFCISanjay Patel2016-01-181-4/+2
| | | | llvm-svn: 258059
* fix variable names, typos ; NFCSanjay Patel2016-01-181-36/+36
| | | | llvm-svn: 258058
* fix typo; NFCSanjay Patel2016-01-181-1/+1
| | | | llvm-svn: 258057
* AVX512: Masked store intrinsic implementation.Igor Breger2016-01-183-29/+61
| | | | | | Implemented intrinsics for the following store instructions: VMOVDQU8/16/32/64, VMOVDQA32/64, VMOVAPS/PD, VMOVUPS/PD. Differential Revision: http://reviews.llvm.org/D16271 llvm-svn: 258047
* Added Cannonlake processor to X86 TargetElena Demikhovsky2016-01-181-1/+37
| | | | | | Differential Revision: http://reviews.llvm.org/D16289 llvm-svn: 258046
* AVX512 : Change v8i1 bitconvert GR8 pattern, remove unnecessary movzbl ↵Igor Breger2016-01-181-1/+1
| instruction.
|
| Code example, previous implementation:
|   movzbl %dil, %eax
|   kmovw %eax, %k0
| New code:
|   kmovw %edi, %k0
|
| Differential Revision: http://reviews.llvm.org/D16287
| llvm-svn: 258045
* [ARM] Operands for PKHTB alias should be swappedOliver Stannard2016-01-182-6/+6
| | | | | | | | | When the shift immediate is zero, PKHTB is an alias for PKHBT, but the order of the input operands needs to be swapped. Differential Revision: http://reviews.llvm.org/D16288 llvm-svn: 258044
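The operand swap can be seen from the unshifted semantics of the two instructions. The following sketch models only the zero-shift forms as plain functions on 32-bit values; the function names are illustrative and this is not the ARM backend code. PKHBT takes its bottom halfword from the first source and its top halfword from the second, while PKHTB takes its top halfword from the first source and its bottom halfword from the second.

```cpp
#include <cassert>
#include <cstdint>

// Model of "PKHBT Rd, Rn, Rm" with no shift:
// Rd[15:0] = Rn[15:0], Rd[31:16] = Rm[31:16].
inline uint32_t pkhbt(uint32_t rn, uint32_t rm) {
  return (rn & 0x0000FFFFu) | (rm & 0xFFFF0000u);
}

// Model of "PKHTB Rd, Rn, Rm" with no shift:
// Rd[31:16] = Rn[31:16], Rd[15:0] = Rm[15:0].
inline uint32_t pkhtb(uint32_t rn, uint32_t rm) {
  return (rn & 0xFFFF0000u) | (rm & 0x0000FFFFu);
}

// The alias only holds when the source operands are swapped.
inline void checkAlias(uint32_t rn, uint32_t rm) {
  assert(pkhtb(rn, rm) == pkhbt(rm, rn));
}
```

With `rn = 0xAAAA1111` and `rm = 0xBBBB2222`, `pkhtb(rn, rm)` and `pkhbt(rm, rn)` both give `0xAAAA2222`, whereas `pkhbt(rn, rm)` gives `0xBBBB1111`.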
* [AVX512] adding AVXVBMI feature flagMichael Zuckerman2016-01-181-1/+1
| | | | | | | Fixing a typo (avx515 → avx512). Reviewed over the shoulder by asaf. Differential Revision: http://reviews.llvm.org/D16190 llvm-svn: 258041
* [Coverage] move a local var to be BinaryCoverageReader's memberXinliang David Li2016-01-181-8/+11
| | | | | | The symtab is logically referenced beyond the call to the create method. This change makes sure its lifetime matches that of the reader. llvm-svn: 258036
* Remove extra whitespace. NFC.Junmo Park2016-01-181-2/+2
| | | | llvm-svn: 258035
* Revert assert added in rL258028 as the alloca and OtherPtr types may differ ↵Eduard Burtescu2016-01-181-1/+0
| | | | | | in address space. llvm-svn: 258029
* [opaque pointer types] Alloca: use getAllocatedType() instead of ↵Eduard Burtescu2016-01-185-17/+13
| | | | | | | | | | | | getType()->getPointerElementType(). Reviewers: mjacob Subscribers: llvm-commits, dblaikie Differential Revision: http://reviews.llvm.org/D16272 llvm-svn: 258028
* fix variable names; NFCSanjay Patel2016-01-171-16/+16
| | | | llvm-svn: 258027
* fix typos; NFCSanjay Patel2016-01-171-17/+16
| | | | llvm-svn: 258026
* [opaque pointer types] [breaking-change] [NFC] SimplifyGEPInst: take the ↵Manuel Jacob2016-01-173-7/+9
| | | | | | | | | | | | | | source element type of the GEP as an argument. Patch by Eduard Burtescu. Reviewers: dblaikie, mjacob Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16281 llvm-svn: 258024
* [opaque pointer types] [NFC] CallSite: use getFunctionType() instead of ↵Manuel Jacob2016-01-174-19/+9
| | | | | | | | | | | | | | going through PointerType::getElementType. Patch by Eduard Burtescu. Reviewers: dblaikie, mjacob Subscribers: dsanders, llvm-commits, dblaikie Differential Revision: http://reviews.llvm.org/D16273 llvm-svn: 258023
* [NFC] Remove one dead PointerType::getElementType() call.Manuel Jacob2016-01-171-2/+0
| | | | | | | | | | | | Reviewers: dblaikie, mjacob Subscribers: llvm-commits, dblaikie Patch by Eduard Burtescu. Differential Revision: http://reviews.llvm.org/D16274 llvm-svn: 258022
* [IndVars] Fix PR25576Sanjoy Das2016-01-171-23/+6
| `LCSSASafePhiForRAUW` as computed was incorrect -- in cases like these (this exact example does not actually trigger the bug):
|
|   define i32 @f(i32 %n, i1* %c) {
|   entry:
|     br label %outer.loop
|
|   outer.loop:
|     br label %inner.loop
|
|   inner.loop:
|     %iv = phi i32 [ 0, %outer.loop ], [ %iv.inc, %inner.loop ]
|     %iv.inc = add nuw nsw i32 %iv, 1
|     %tc = udiv i32 %n, 13
|     %be.cond = icmp ult i32 %iv, %tc
|     br i1 %be.cond, label %inner.loop, label %inner.exit
|
|   inner.exit:
|     %iv.lcssa = phi i32 [ %iv, %inner.loop ]
|     %outer.be.cond = load volatile i1, i1* %c
|     br i1 %outer.be.cond, label %outer.loop, label %leave
|
|   leave:
|     %iv.lcssa.lcssa = phi i32 [ %iv.lcssa, %inner.exit ]
|     ret i32 %iv.lcssa.lcssa
|   }
|
| `LCSSASafePhiForRAUW` is true for `%iv.lcssa` when re-rewriting the exit value of `%iv` for `%inner.loop` to `%tc` (this can happen due to `SCEVExpander::findExistingExpansion`), but the RAUW breaks LCSSA.
|
| To fix this, instead of computing `SafePhi` with special logic, decide the safety of RAUW directly via `replacementPreservesLCSSAForm`.
|
| llvm-svn: 258016
* [IndVars] Use emplace_back; NFCSanjoy Das2016-01-171-4/+3
| | | | llvm-svn: 258015
* [AVX512] adding AVXVBMI feature flagMichael Zuckerman2016-01-175-1/+12
| | | | | | | | The feature flag is for the VPERMB, VPERMI2B, VPERMT2B and VPMULTISHIFTQB instructions. More about the instructions can be found in: https://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf Differential Revision: http://reviews.llvm.org/D16190 llvm-svn: 258012
* Fix buildbot failure introduced by 258010. Remove local variables that became unused.Artur Pilipenko2016-01-172-7/+0
| | | | llvm-svn: 258011
* Push isDereferenceableAndAlignedPointer down into isSafeToLoadUnconditionallyArtur Pilipenko2016-01-173-16/+9
| | | | | | | | Reviewed By: reames Differential Revision: http://reviews.llvm.org/D16226 llvm-svn: 258010
* AVX512: Use MemIntrinsicSDNode to implement load/store intrinsic.Igor Breger2016-01-171-60/+76
| | | | | | Differential Revision: http://reviews.llvm.org/D16184 llvm-svn: 258009
* [AVX512] Adding VPERMW/D/Q VPERMPS/D Intrinsics Michael Zuckerman2016-01-171-0/+11
| | | | | | Differential Revision: http://reviews.llvm.org/D16189 llvm-svn: 258008
* [AVX512] Adding VPERMQ VPERMPD Intrinsics Michael Zuckerman2016-01-171-0/+4
| | | | | | Differential Revision: http://reviews.llvm.org/D16194 llvm-svn: 258006
* [X86][AVX] Enable extraction of upper 128-bit subvectors for 'half undef' ↵Simon Pilgrim2016-01-161-11/+28
| | | | | | | | | | shuffle lowering Added support for the extraction of the upper 128-bit subvectors for lower/upper half undef shuffles if it would reduce the number of extractions/insertions or avoid loads of AVX2 permps/permd shuffle masks. Minor follow up to D15477. llvm-svn: 258000