path: root/llvm
Columns: commit message, author, age, files changed, lines removed/added
...
* Provide a test case for r259798Nemanja Ivanovic2016-02-041-0/+10
| | | | llvm-svn: 259835
* [AArch64] Bound the number of instructions we scan when searching for updates.Chad Rosier2016-02-041-14/+26
| | | | | | | This only impacts the creation of pre-/post-index instructions. The bound was set high enough such that it did not change code generation for SPEC200X. llvm-svn: 259828
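The idea of bounding a scan can be sketched in a few lines of C++. This is a generic illustration, not the AArch64 load/store optimizer code: `findWithinLimit` and its parameters are hypothetical names, and the real pass walks machine instructions rather than a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: search forward from `start`, but give up after
// examining `limit` elements. Returns the index of the first match, or -1
// if no match was found within the limit.
template <typename T, typename Pred>
std::ptrdiff_t findWithinLimit(const std::vector<T> &items, std::size_t start,
                               std::size_t limit, Pred pred) {
  assert(start <= items.size() && "start must be a valid position");
  std::size_t examined = 0;
  for (std::size_t i = start; i < items.size() && examined < limit;
       ++i, ++examined) {
    if (pred(items[i]))
      return static_cast<std::ptrdiff_t>(i);
  }
  return -1; // Either the end was reached or the scan limit was hit.
}
```

A generous limit leaves the result unchanged for typical inputs while capping the worst-case cost, which matches the trade-off the commit message describes.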
* [docs] Fix typo in YamlIO.rstVedant Kumar2016-02-041-2/+3
| | | | | | Patch by Mario Lang! llvm-svn: 259825
* Install cmake files to lib/cmake/llvmNiels Ole Salscheider2016-02-042-5/+5
| | | | | | | | | | | This is the right location for platform-specific files. On some distributions (e.g. Exherbo), a package can be installed for several architectures in parallel, but the architecture-independent files are shared. Therefore, we must not install architecture-dependent files (like the CMake config and export files) to share/. llvm-svn: 259821
* [X86][SSE] Select domain for 32/64-bit partial loads for EltsFromConsecutiveLoadsSimon Pilgrim2016-02-049-103/+118
| | | | | | | | | | Choose between MOVD/MOVSS and MOVQ/MOVSD depending on the target vector type. This has a lot fewer test changes than trying to add this to X86InstrInfo::setExecutionDomain. llvm-svn: 259816
* Fix a regression for r259736.Wei Mi2016-02-041-3/+10
| | | | | | | | | When SCEV expansion tries to reuse an existing value, it must ensure that using the Value at the InsertPt will not break LCSSA. The fix adds a check that InsertPt is either inside the candidate Value's parent loop, or the candidate Value's parent loop is nullptr. llvm-svn: 259815
* Fix format in commentXinliang David Li2016-02-041-6/+4
| | | | llvm-svn: 259814
* [PGO] Add interfaces to annotate instr with VP dataXinliang David Li2016-02-043-7/+167
| | | | | | | Add interfaces to do value profile data IR annotation and read. Needed by both FE and IR based PGO. llvm-svn: 259813
* [AArch64] Improve load/store optimizer to handle LDUR + LDR (take 3).Chad Rosier2016-02-042-21/+181
| | | | | | | | | | | | | | | This patch allows the mixing of scaled and unscaled load/stores to form load/store pairs. PR24465 http://reviews.llvm.org/D12116 Many thanks to Ahmed and Michael for fixes and code review. This is a reapplication of r246769 and r259790. The tramp3d failure was caused by an incorrect refactoring in the patch. Specifically, we weren't always properly clearing the SExtIdx flag. llvm-svn: 259812
* [SCEV] Add boolean accessors for NSW, NUW and NW; NFCSanjoy Das2016-02-042-14/+26
| | | | llvm-svn: 259809
* Correctly handle {Always,Never}StepIntoLineDavid Majnemer2016-02-042-8/+10
| | | | llvm-svn: 259806
* Add support for S_DEFRANGE and S_DEFRANGE_SUBFIELDDavid Majnemer2016-02-042-2/+42
| | | | llvm-svn: 259805
* Make the dumper's output for variable ranges easier to readDavid Majnemer2016-02-041-24/+14
| | | | llvm-svn: 259804
* use 'auto' for iterators; NFCISanjay Patel2016-02-041-9/+3
| | | | llvm-svn: 259802
* [AArch64] Multiply extended 32-bit ints with `[U|S]MADDL'Silviu Baranga2016-02-042-0/+92
During instruction selection, the AArch64 backend can recognise the following pattern and generate an [U|S]MADDL instruction, i.e. a multiply of two 32-bit operands with a 64-bit result:

  (mul (sext i32), (sext i32))

However, when one of the operands is constant, the sign extension gets folded into the constant in SelectionDAG::getNode(). This means that the instruction selection sees this:

  (mul (sext i32), i64)

...which doesn't match the pattern. Sign-extension and 64-bit multiply instructions are generated, which are slower than one 32-bit multiply.

Add a pattern to match this and generate the correct instruction, for both signed and unsigned multiplies.

Patch by Chris Diamand!
llvm-svn: 259800
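For orientation, the source-level shape behind these patterns can be sketched in C++. This is only an illustration of the arithmetic, not the TableGen pattern from the patch. The function names are hypothetical, and the range checks reflect an assumption about when a folded 64-bit constant can still be treated as an extended 32-bit operand.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Widening multiply-add: both 32-bit operands are extended to 64 bits
// before the multiply, i.e. the (mul (sext i32), (sext i32)) shape.
std::int64_t mulAddWide(std::int32_t a, std::int32_t b, std::int64_t acc) {
  return acc + static_cast<std::int64_t>(a) * static_cast<std::int64_t>(b);
}

// Assumption: a folded i64 constant can stand in for a sign-extended i32
// operand only if it fits in the signed 32-bit range.
bool fitsSigned32(std::int64_t c) {
  return c >= std::numeric_limits<std::int32_t>::min() &&
         c <= std::numeric_limits<std::int32_t>::max();
}

// Same idea for the unsigned (zero-extended) variant.
bool fitsUnsigned32(std::int64_t c) {
  return c >= 0 &&
         c <= static_cast<std::int64_t>(
                  std::numeric_limits<std::uint32_t>::max());
}
```

A call such as `mulAddWide(x, 1000, acc)` is the constant-operand case the commit message describes: after folding, the compiler sees a 64-bit constant rather than an extended 32-bit value.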
* The canonical way to XFAIL a test for all targets is XFAIL: *, not XFAIL:Benjamin Kramer2016-02-043-3/+3
| | | | | | | | Fix the lit bug that enabled this "feature" (empty triple is substring of all possible target triples) and change the two outliers to use the documented * syntax. llvm-svn: 259799
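The substring behaviour behind this bug is easy to reproduce. The sketch below is C++ rather than lit's own Python, and both helpers are hypothetical. It only shows why an empty pattern matches every triple under plain substring matching, and one possible guard. It does not reproduce the actual lit fix.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of lit: plain substring matching of an
// XFAIL entry against a target triple.
bool xfailMatchesNaive(const std::string &entry, const std::string &triple) {
  // find("") returns 0 for any string, so an empty entry matches everything.
  return triple.find(entry) != std::string::npos;
}

// Guarded variant (an assumption about a reasonable policy): "*" is the
// only entry that matches all targets, and an empty entry matches nothing.
bool xfailMatchesStrict(const std::string &entry, const std::string &triple) {
  if (entry == "*")
    return true;
  if (entry.empty())
    return false;
  return triple.find(entry) != std::string::npos;
}
```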
* Enable the %s modifier in inline asm template stringNemanja Ivanovic2016-02-041-0/+5
| | | | | | | | | | | This patch corresponds to review: http://reviews.llvm.org/D16847 There are some files in glibc that use the output operand modifier even though it was deprecated in GCC. This patch just adds support for it to prevent issues with such files. llvm-svn: 259798
* [PPC] Move PPC test to a PPC-specific dirRenato Golin2016-02-041-0/+0
| | | | llvm-svn: 259797
* [X86][SSE] Add general 32-bit LOAD + VZEXT_MOVL support to EltsFromConsecutiveLoadsSimon Pilgrim2016-02-045-150/+75
| | | | | | | | | | This patch adds support for consecutive (load/undef elements) 32-bit loads, followed by trailing undef/zero elements to be combined to a single MOVD load. Differential Revision: http://reviews.llvm.org/D16729 llvm-svn: 259796
* Revert "[AArch64] Improve load/store optimizer to handle LDUR + LDR."Chad Rosier2016-02-042-182/+22
| | | | | | This reverts commit r259790. tramp3d-v4 is still having problems. llvm-svn: 259795
* [X86][SSE] Added i686 target tests to make sure we are correctly loading consecutive entries as 64-bit integersSimon Pilgrim2016-02-043-0/+504
| | | | | | llvm-svn: 259794
* AVX-512: Fixed a bug in FMA instruction selection on KNLElena Demikhovsky2016-02-044-15/+16
| | | | | | | | The FMA instruction was selected from AVX2 set instead of AVX-512 Differential Revision: http://reviews.llvm.org/D16884 llvm-svn: 259792
* [Power PC] softening long double typePetar Jovanovic2016-02-044-26/+260
| | | | | | | | | | | This patch implements softening of long double type (ppcf128) on ppc32 architecture and enables operations for this type for soft float. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D15811 llvm-svn: 259791
* [AArch64] Improve load/store optimizer to handle LDUR + LDR.Chad Rosier2016-02-042-22/+182
| | | | | | | | | | | | | | This patch allows the mixing of scaled and unscaled load/stores to form load/store pairs. PR24465 http://reviews.llvm.org/D12116 Many thanks to Ahmed and Michael for fixes and code review. This is a reapplication of r246769, which was reverted in r246782 due to a test-suite failure. I'm unable to reproduce the issue at this time. llvm-svn: 259790
* [AVX512] add vfmadd132ss and vfmadd132sd IntrinsicMichael Zuckerman2016-02-045-11/+265
| | | | | | Differential Revision: http://reviews.llvm.org/D16589 llvm-svn: 259789
* [X86] Add AVX512 vector zext testsSimon Pilgrim2016-02-041-0/+122
| | | | llvm-svn: 259786
* [ScheduleDagInstrs] Improved commentsJonas Paulsson2016-02-041-9/+9
| | | | llvm-svn: 259783
* [X86] Moved SEXT -> SIGN_EXTEND_VECTOR_INREG combine into helper. NFC.Simon Pilgrim2016-02-041-60/+84
| | | | llvm-svn: 259771
* [X86] Use hash table in LEA optimization pass.Andrey Turetskiy2016-02-041-150/+247
| | | | | | | | Use hash table (key is a memory operand) to store found LEA instructions to reduce compile time. Differential Revision: http://reviews.llvm.org/D16404 llvm-svn: 259770
* cmake: Add a flag to enable LTOJustin Bogner2016-02-042-0/+11
| | | | | | | This adds -DLLVM_ENABLE_LTO, rather than forcing people to manually add -flto to the various _FLAGS variables. llvm-svn: 259766
* [Support] Use range-based for loop. NFCCraig Topper2016-02-041-3/+1
| | | | llvm-svn: 259763
* [Support] Use hexdigit instead of manually coding the same thing. NFCCraig Topper2016-02-041-2/+2
| | | | llvm-svn: 259762
* [PGO] Profile interface cleanupXinliang David Li2016-02-044-25/+38
| | | | | | | - Remove unused valuemapper parameter - add totalcount optional parameter llvm-svn: 259756
* [NVPTX] Disable performance optimizations when OptLevel==NoneJingyue Wu2016-02-042-21/+48
| | | | | | | | | | Reviewers: jholewinski, tra, eliben Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D16874 llvm-svn: 259749
* Test case for PR 26381Nemanja Ivanovic2016-02-041-0/+8
| | | | llvm-svn: 259740
* [SCEV] Try to reuse existing value during SCEV expansionWei Mi2016-02-0412-30/+207
Current SCEV expansion expands a SCEV as a sequence of operations and does not reuse values that already exist. This introduces redundant computation that may not be cleaned up thoroughly by later optimizations. This patch introduces an ExprValueMap, which is a map from a SCEV to the set of equal values with the same SCEV. When a SCEV is expanded, the set of values is checked and reused whenever possible before generating a sequence of operations.

The original commit triggered regressions in Polly tests. The regressions exposed two problems, which have been fixed in the current version.

1. Polly generates a new function based on the old one. To generate an instruction for the new function, it builds a SCEV for the old instruction, applies some transformation to that SCEV, then expands the transformed SCEV and inserts the expanded value into the new function. Because SCEV expansion may reuse a value cached in ExprValueMap, a value from the old function may be inserted into the new function, which is wrong. SCEVExpander::expand has logic to check that the cached value to be used dominates the insertion point. However, in the above case the check always passes, because the insertion point is in a new function, which is unreachable from the old function, and DominatorTreeBase::dominates treats an unreachable node as dominated by any other node. The fix is to add a check that the cached value used in expansion is in the same function as the insertion point instruction.

2. When the SCEV is of scConstant type, expanding it directly is cheaper than reusing a normal cached value. Although the cached value set in ExprValueMap contains a Constant value, it is not easy to find, because the cached value set is not sorted by potential cost. The existing reuse logic in SCEVExpander::expand simply chooses the first legal element from the cached value set. The fix is to skip the reuse logic when the SCEV is of scConstant type and simply expand it.

Differential Revision: http://reviews.llvm.org/D12090
llvm-svn: 259736
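The two fixes can be modelled with a toy reuse policy in C++. This is a sketch under stated assumptions, not SCEVExpander code. `CachedValue`, `ExprKind` and `pickReusable` are hypothetical names, functions are reduced to integer ids, and the dominance check mentioned in the commit message is left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a cached IR value: a name plus the id of the function
// that contains it.
struct CachedValue {
  std::string name;
  int functionId;
};

enum class ExprKind { Constant, Other };

// Returns a cached value that may be reused at an insertion point inside
// function `insertFunctionId`, or nullptr if the expression should be
// expanded from scratch.
const CachedValue *pickReusable(ExprKind kind,
                                const std::vector<CachedValue> &cached,
                                int insertFunctionId) {
  // Fix 2: constants are cheaper to expand directly, so skip reuse.
  if (kind == ExprKind::Constant)
    return nullptr;
  for (const CachedValue &v : cached) {
    // Fix 1: never reuse a value that lives in a different function.
    if (v.functionId == insertFunctionId)
      return &v; // First legal element wins, as in the description above.
  }
  return nullptr;
}
```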
* Fix undefined behavior when compiling in C++14 mode (with sized deletionRichard Smith2016-02-041-0/+8
| | | | | | | enabled): ensure that we do not invoke the sized deallocator for MemoryBuffer subclasses that have tail-allocated data. llvm-svn: 259735
* [codeview] Don't attempt a cross-section label diffReid Kleckner2016-02-042-5/+82
| | | | | | | | This only comes up when we're trying to find the next .cv_loc label. Fixes PR26467 llvm-svn: 259733
* [libFuzzer] hot fix a testKostya Serebryany2016-02-041-1/+1
| | | | llvm-svn: 259732
* [libFuzzer] don't write the test unit when a leak is detected (since we don't know which unit causes the leak)Kostya Serebryany2016-02-044-0/+16
| | | | | | llvm-svn: 259731
* [SimplifyCFG] Fix for "endless" loop after dead code removal (Alternative toGerolf Hoflehner2016-02-032-2/+107
D16251)

Summary:
This is a simpler fix to the problem than the dominator approach in http://reviews.llvm.org/D16251. It adds only values into the gather() while loop that have not been seen before.

The actual endless loop is in the constant compare gather() routine in Utils/SimplifyCFG.cpp. The same value ret.0.off0.i is pushed back into the queue:

  %.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i

Here is what happens at the IR level:

  for.cond.i:                                       ; preds = %if.end6.i, %if.end.i54
    %ix.0.i = phi i32 [ 0, %if.end.i54 ], [ %inc.i55, %if.end6.i ]
    %ret.0.off0.i = phi i1 [false, %if.end.i54], [%.ret.0.off0.i, %if.end6.i] <<<
    %cmp2.i = icmp ult i32 %ix.0.i, %11
    br i1 %cmp2.i, label %for.body.i, label %LBJ_TmpSimpleNeedExt.exit

  if.end6.i:                                        ; preds = %for.body.i
    %cmp10.i = icmp ugt i32 %conv.i, %add9.i
    %.ret.0.off0.i = or i1 %ret.0.off0.i, %cmp10.i <<<

When if.end.i54 gets eliminated, the definition of ret.0.off0.i is removed. The result is the expression

  %.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i

(Note the first ‘or’ operand is now %.ret.0.off0.i, and *NOT* %ret.0.off0.i.) Now there is a use of .ret.0.off0.i before its definition, which triggers the “endless” loop in gather():

  while(!DFT.empty()) {
    V = DFT.pop_back_val(); // V is .ret.0.off0.i
    if (Instruction *I = dyn_cast<Instruction>(V)) {
      // If it is a || (or && depending on isEQ), process the operands.
      if (I->getOpcode() == (isEQ ? Instruction::Or : Instruction::And)) {
        DFT.push_back(I->getOperand(1)); // This is now .ret.0.off0.i also
        DFT.push_back(I->getOperand(0));
        continue; // “endless loop” for .ret.0.off0.i
      }

Reviewers: reames, ahatanak
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16839
llvm-svn: 259730
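The effect of tracking already-seen values can be shown on a toy worklist in C++. This is a minimal sketch of the general technique, not the code in SimplifyCFG.cpp. `Node` and `gatherLeaves` are hypothetical, and a standard `unordered_set` stands in for whatever set type the patch uses.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Toy stand-in for an IR value. Only 'or' nodes have operands.
struct Node {
  bool isOr;
  Node *op0;
  Node *op1;
};

// Walks the or-tree rooted at `root` and counts the non-'or' values reached.
// Operands are pushed only the first time they are seen, so a value that
// uses itself as an operand cannot keep the loop running forever.
std::size_t gatherLeaves(Node *root) {
  std::vector<Node *> worklist{root};
  std::unordered_set<Node *> visited{root};
  std::size_t leaves = 0;
  while (!worklist.empty()) {
    Node *v = worklist.back();
    worklist.pop_back();
    if (v->isOr) {
      assert(v->op0 && v->op1 && "'or' nodes need two operands");
      // insert().second is true only for values not seen before.
      if (visited.insert(v->op1).second)
        worklist.push_back(v->op1);
      if (visited.insert(v->op0).second)
        worklist.push_back(v->op0);
      continue;
    }
    ++leaves;
  }
  return leaves;
}
```

With a node whose first operand is itself, mirroring the `%.ret.0.off0.i` case above, the walk visits the other operand once and then terminates.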
* [InstrProfiling] Fix a comment (NFC)Vedant Kumar2016-02-031-1/+1
| | | | llvm-svn: 259727
* Unify the target opcode enum in TargetOpcodes.h and the FixedInstrs array inDavid L Kreitzer2016-02-033-142/+158
| | | | | | | | CodeGenTarget.cpp to avoid the ordering dependence. NFCI. Differential Revision: http://reviews.llvm.org/D16826 llvm-svn: 259726
* Minor code cleanups. NFC.Junmo Park2016-02-031-18/+18
| | | | llvm-svn: 259725
* Print the OffsetStart field's relocationDavid Majnemer2016-02-031-8/+15
| | | | llvm-svn: 259723
* rangify; NFCISanjay Patel2016-02-031-159/+129
| | | | llvm-svn: 259722
* clean up; NFCSanjay Patel2016-02-031-15/+13
| | | | llvm-svn: 259720
* [llvm-readobj] Add support for dumping S_DEFRANGE symbolsDavid Majnemer2016-02-032-2/+138
| | | | llvm-svn: 259719
* Replace static const int with enum to fix obnoxious linker errors about a missing definitionReid Kleckner2016-02-031-1/+1
| | | | | | llvm-svn: 259712
* [unittests] Move TargetRegistry test from Support to MCReid Kleckner2016-02-033-3/+6
| | | | | | | This removes the dependency from SupportTests to all of the LLVM backends, and makes it link faster. llvm-svn: 259705