summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [InstCombine] reduce even more unsigned saturated add with 'not' opSanjay Patel2019-02-181-17/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: uaddsat, -1 fval %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %notx, %y %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a Name: uaddsat, -1 fval + ult %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %y, %notx %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/nTp llvm-svn: 354276
* [MCA] Correctly update register definitions in the PRF after move elimination.Andrea Di Biagio2019-02-181-14/+9
| | | | | | | | | | This patch fixes a bug where register writes performed by optimizable register moves were sometimes wrongly treated like partial register updates. Before this patch, llvm-mca wrongly predicted a 1.50 IPC for test reg-move-elimination-6.s (added by this patch). With this patch, llvm-mca correctly updates the register defintions in the PRF, and the IPC for that test is now correctly reported as 2. llvm-svn: 354271
* [MCA] Slightly refactor method writeStartEvent in WriteState and ReadState. NFCIAndrea Di Biagio2019-02-183-13/+13
| | | | | | | This is another change in preparation for PR37494. No functional change intended. llvm-svn: 354261
* Revert r354244 "[DAGCombiner] Eliminate dead stores to stack."Clement Courbet2019-02-185-209/+45
| | | | | | Breaks some bots. llvm-svn: 354245
* [DAGCombiner] Eliminate dead stores to stack.Clement Courbet2019-02-185-45/+209
| | | | | | | | | | | | | | | Summary: A store to an object whose lifetime is about to end can be removed. See PR40550 for motivation. Reviewers: niravd Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D57541 llvm-svn: 354244
* [MC] Make SubtargetFeatureKV only store one FeatureBitset and use an ↵Craig Topper2019-02-181-11/+11
| | | | | | | | | | | | 'unsigned' to hold the value. This class is used for two difference tablegen generated tables. For one of the tables the Value FeatureBitset only has one bit set. For the other usage the Implies field was unused. This patch changes the Value field to just be an unsigned. For the usage that put a real vector in bitset, we now use the previously unused Implies field and leave the Value field unused instead. This is good for a 16K reduction in the size of llc on my local build with all targets enabled. llvm-svn: 354243
* [LLVM-C] Add bindings to create enumeratorsRobert Widmann2019-02-171-0/+8
| | | | | | | | | | | | | | | | Summary: The C API don't have the bindings to create enumerators, needed to create an enumeration. Reviewers: whitequark, CodaFi, harlanhaskins, deadalnix Reviewed By: whitequark, CodaFi, harlanhaskins Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58323 llvm-svn: 354237
* [X86] In FP_TO_INTHelper, when moving data from SSE register to X87 register ↵Craig Topper2019-02-171-13/+9
| | | | | | | | | | file via the stack, use the same stack slot we use for the integer conversion. No need for a separate stack slot. The lifetimes don't overlap. Also fix the MachinePointerInfo for the final load after the integer conversion to indicate it came from the stack slot. llvm-svn: 354234
* [NFC] Teach getInnermostLoopFor walk up the loop treesMax Kazantsev2019-02-171-6/+10
| | | | | | | This should be NFC in current use case of this method, but it will help to use it for solving more compex tasks in follow-up patches. llvm-svn: 354227
* [SelectionDAG] Extract [US]MULO expansion into TL method; NFCNikita Popov2019-02-172-148/+124
| | | | | | | | | | | | In preparation for supporting vector expansion. Add an isPostTypeLegalization flag to makeLibCall(), because this expansion relies on the legalized form using MERGE_VALUES. Drop the corresponding variant of ExpandLibCall, which is no longer used. Differential Revision: https://reviews.llvm.org/D58006 llvm-svn: 354226
* [InstCombine] reduce more unsigned saturated add with 'not' opSanjay Patel2019-02-171-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: not op %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %notx, %y %r = select i1 %c, i32 -1, i32 %a => %a = add i32 %x, %y %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: not op ugt %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %y, %notx %r = select i1 %c, i32 -1, i32 %a => %a = add i32 %x, %y %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/niom (The matching here is still incomplete.) llvm-svn: 354224
* [InstCombine] reduce unsigned saturated add with 'not' opSanjay Patel2019-02-171-11/+28
| | | | | | | | | | | | | We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 (The matching here is incomplete. Trying to take minimal steps to make sure we don't induce infinite looping from existing canonicalizations of the 'select'.) llvm-svn: 354221
* [NFC] Fix name and clarifying comment for factored-out functionMax Kazantsev2019-02-171-4/+5
| | | | llvm-svn: 354220
* [NFC] Factor out a function for future reuseMax Kazantsev2019-02-171-8/+15
| | | | llvm-svn: 354218
* [LLVMSupport]: Remove a severely outdated README.Kristina Brooks2019-02-171-43/+0
| | | | | | | | | | | | | | | | | | | The LLVM Support library implementation has resided in //llvm/lib/Support for a significant amount of time now, with documentation having been updated with all references to the "System library" being replaced with "Support library". Since this file mirrors already existing documentation available for Support library, includes dead links to documentation and still refers to it as "System library", having it there is confusing and updating it has very little point as it duplicates information in documentation, except documentation is a lot more up to date while this file has not been maintained. Up to date documentation concerning this can be found here: http://llvm.org/docs/SupportLibrary.html llvm-svn: 354209
* [X86] When type legalizing the result of a i64 fp_to_uint on 32-bit targets. ↵Craig Topper2019-02-161-27/+10
| | | | | | | | | | Generate all of the ops as i64 and let them be legalized. No need to manually split everything. We can let the type legalizer work for us. The test change seems to be caused by some DAG ordering issue that was previously circumventing a one use check in LowerSELECT where FP selects are turned into blends if the setcc has one use. But it was running after an integer select and the same setcc had been legalized to cmov and X86SISD::CMP. This dropped the use count of the setcc, but wasn't what was intended. llvm-svn: 354197
* [X86] Don't prevent load folding for cvtsi2ss/cvtsi2sd based on ↵Craig Topper2019-02-161-39/+48
| | | | | | | | | | hasPartialRegUpdate. Preventing the load fold won't fix the partial register update since the input we can fold is a GPR. So it will do nothing to prevent a false dependency on an XMM register. llvm-svn: 354193
* [EarlyCSE & MSSA] Cap the clobbering calls in EarlyCSE.Alina Sbirlea2019-02-151-2/+13
| | | | | | | | | | | | | | | | | | | Summary: Unlimitted number of calls to getClobberingAccess can lead to high compile times in pathological cases. Limitting getClobberingAccess to a fairly high number. Can be adjusted based on users/need. Note: this is the only user of MemorySSA currently enabled by default. The same handling exists in LICM (disabled atm). As MemorySSA gains more users, this logic of capping will need to move inside MemorySSA. Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D58248 llvm-svn: 354182
* [X86] Don't set exception mask bits when modifying FPCW to change rounding ↵Craig Topper2019-02-151-14/+24
| | | | | | | | | | | | mode for fp->int conversion When we need to do an fp->int conversion using x87 instructions, we need to temporarily change the rounding mode to 0b11 and perform a store. To do this we save the old value of the fpcw to the stack, then set the fpcw to 0xc7f, do the store, then restore fpcw. But the 0xc7f value forces the exception mask bits 1. While this is what they would be in the default FP environment, as we move to support changing the FP environments, we shouldn't make this assumption. This patch changes the code to explicitly OR 0xc00 with the old value so that only the rounding mode is changed. Unfortunately, this requires two stack temporaries instead of one. One to hold the old value and one to hold the new value. Without two stack temporaries we would need an additional GPR. We already need one to do the OR operation in. This is similar to what gcc and icc do for this operation. Though they are both better at reusing the stack temporaries when there are multiple truncates in a function(or at least in a basic block) Differential Revision: https://reviews.llvm.org/D57788 llvm-svn: 354178
* [InstCombine] Address a couple stylistic issues pointed out by reviewer [NFC]Philip Reames2019-02-151-6/+6
| | | | | | Better addressing comments from https://reviews.llvm.org/D58290. llvm-svn: 354171
* [InstCombine] Convert atomicrmws to xchg or store where legalPhilip Reames2019-02-151-15/+59
| | | | | | | | | | Implement two more transforms of atomicrmw: 1) We can convert an atomicrmw which produces a known value in memory into an xchg instead. 2) We can convert an atomicrmw xchg w/o users into a store for some orderings. Differential Revision: https://reviews.llvm.org/D58290 llvm-svn: 354170
* [X86] Fix LowerAsmOutputForConstraint.Nirav Dave2019-02-154-11/+9
| | | | | | | | | | | | | | | | | Summary: Update Flag when generating cc output. Fixes PR40737. Reviewers: rnk, nickdesaulniers, craig.topper, spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58283 llvm-svn: 354163
* [X86] Move all the SSE legality checks out of FP_TO_INTHelper and up to ↵Craig Topper2019-02-151-22/+14
| | | | | | | | | | LowerFP_TO_INT. NFCI These checks aren't needed on the call to FP_TO_INTHelper from the type legalizer for splitting i64. We always want to use X87 FIST/FISTT to memory there. Moving up the SSE checks will allow this routine to focus on what it cares about and makes its return semantics cleaner. llvm-svn: 354161
* Recommit "[SystemZ] Do not emit VEXTEND or VROUND nodes without vector support."Jonas Paulsson2019-02-151-0/+8
| | | | | | | | | | | It seems there were some problem with using a .mir test. For some reason doing '-stop-before=codegenprepare' and then '-start-before=codegenprepare' on the output .mir file results in the NoVRegs Property after instruction selection. Recommitting the same test as an .ll file instead. llvm-svn: 354160
* [CodeExtractor] Do not lift lifetime.end markers for region inputsVedant Kumar2019-02-151-13/+7
| | | | | | | | | | | | | | | | | | | | | | | If a lifetime.end marker occurs along one path through the extraction region, but not another, then it's still incorrect to lift the marker, because there is some path through the extracted function which would ordinarily not reach the marker. If the call to the extracted function is in a loop, unrolling can cause inputs to the function to become optimized out as undef after the first iteration. To prevent incorrect stack slot merging in the calling function, it should be sufficient to lift lifetime.start markers for region inputs. I've tested this theory out by doing a stage2 check-all with randomized splitting enabled. This is a follow-up to r353973, and there's additional context for this change in https://reviews.llvm.org/D57834. rdar://47896986 Differential Revision: https://reviews.llvm.org/D58253 llvm-svn: 354159
* [HotColdSplit] Schedule splitting late to fix perf regressionVedant Kumar2019-02-152-19/+27
| | | | | | | | | | | | | | | With or without PGO data applied, splitting early in the pipeline (either before the inliner or shortly after it) regresses performance across SPEC variants. The cause appears to be that splitting hides context for subsequent optimizations. Schedule splitting late again, in effect reversing r352080, which scheduled the splitting pass early for code size benefits (documented in https://reviews.llvm.org/D57082). Differential Revision: https://reviews.llvm.org/D58258 llvm-svn: 354158
* [MCA] Improved code comment. NFCAndrea Di Biagio2019-02-151-1/+2
| | | | llvm-svn: 354154
* Fix 80-column limit in SimplifyDemandedBits/SimplifyDemandedVectorElts. NFCI.Simon Pilgrim2019-02-151-70/+78
| | | | llvm-svn: 354152
* [MCA][LSUnit] Return the ID of the dependent memory operation from methodAndrea Di Biagio2019-02-152-12/+16
| | | | | | | | isReady(). NFCI This is yet another change in preparation for a fix for PR37494. llvm-svn: 354150
* [InstCombine] fix crash while trying to narrow a binop of shuffles (PR40734)Sanjay Patel2019-02-151-1/+2
| | | | | | https://bugs.llvm.org/show_bug.cgi?id=40734 llvm-svn: 354144
* GlobalISel: Fix inadequate verification of g_build_vectorMatt Arsenault2019-02-151-6/+11
| | | | | | | | | | | | | Testing based on the total size of the elements failed to catch a few invalid scenarios, so explicitly check the number of elements/operands and types. This failed to catch situations like <4 x s16> = G_BUILD_VECTOR s32, s32 since the total size added up. This also would fail to catch an implicit conversion between pointers and scalars. llvm-svn: 354139
* [MergeICmps] Make base ordering really deterministic.Clement Courbet2019-02-151-87/+107
| | | | | | | | | | | | | | | | | | Summary: The idea is that we now manipulate bases through a `unsigned BaseID` based on order of appearance in the comparison chain rather than through the `Value*`. Fixes 40714. Reviewers: gchatelet Subscribers: mgrang, jfb, jdoerfert, llvm-commits, hans Tags: #llvm Differential Revision: https://reviews.llvm.org/D58274 llvm-svn: 354131
* [MergeICmps][NFC] Improve doc.Clement Courbet2019-02-151-5/+28
| | | | llvm-svn: 354128
* [NFCI] Factor out block removal from stack of nested loopsMax Kazantsev2019-02-151-6/+14
| | | | llvm-svn: 354124
* Fix "field 'DFS' will be initialized after field 'DTU'" warning. NFCI.Simon Pilgrim2019-02-151-1/+1
| | | | llvm-svn: 354123
* [BPI] Look through bitcasts in calcZeroHeuristicSam Parker2019-02-151-1/+7
| | | | | | | | | | | | Constant hoisting may have hidden a constant behind a bitcast so that it isn't folded into its users. However, this prevents BPI from calculating some of its heuristics that are based upon constant values. So, I've added a simple helper function to look through these casts. Differential Revision: https://reviews.llvm.org/D58166 llvm-svn: 354119
* [NFC] Promote DFS to field for further useMax Kazantsev2019-02-151-2/+2
| | | | llvm-svn: 354118
* [X86][AVX] lowerShuffleAsLanePermuteAndPermute - fully populate the lane ↵Simon Pilgrim2019-02-151-2/+11
| | | | | | | | | | | | | | shuffle mask (PR40730) As detailed on PR40730, we are not correctly filling in the lane shuffle mask (D53148/rL344446) - we fill in for the correct src lane but don't add it to the correct mask element, so any reference to the correct element is likely to see an UNDEF mask index. This allows constant folding to propagate UNDEFs prior to the lane mask being (correctly) lowered to vperm2f128. This patch fixes the issue by fully populating the lane shuffle mask - this is more than is necessary (if we only filled in the required mask elements we might be able to match other shuffle instructions - broadcasts etc.), but its the most cautious approach as this needs to be cherrypicked into the 8.0.0 release branch. Differential Revision: https://reviews.llvm.org/D58237 llvm-svn: 354117
* [ARM GlobalISel] Style fix. NFCIDiana Picus2019-02-151-1/+5
| | | | | | | | Add the opcode for ADDrr / t2ADDrr to the Opcode cache, as we did for all other opcodes where the handling is otherwise the same between arm mode and thumb2. llvm-svn: 354115
* [ARM GlobalISel] Support branches for Thumb2Diana Picus2019-02-152-9/+17
| | | | | | Just like arm mode, but with different opcodes. llvm-svn: 354113
* [RISCV] Add assembler support for LA pseudo-instructionAlex Bradbury2019-02-152-18/+76
| | | | | | | | | | This patch also introduces the emitAuipcInstPair helper, which is then used for both emitLoadAddress and emitLoadLocalAddress. Differential Revision: https://reviews.llvm.org/D55325 Patch by James Clarke. llvm-svn: 354111
* [RISCV] Support assembling %got_pcrel_hi operatorAlex Bradbury2019-02-158-8/+32
| | | | | | | Differential Revision: https://reviews.llvm.org/D55279 Patch by James Clarke. llvm-svn: 354110
* [ARM CGP] Fix ConvertTruncsSam Parker2019-02-151-8/+17
| | | | | | | | | | | | | ConvertTruncs is used to replace a trunc for an AND mask, however this function wasn't working as expected. By performing the change later, we can create a wide type integer mask instead of a narrow -1 value, which could then be simply removed (incorrectly). Because we now perform this action later, it's necessary to cache the trunc type before we perform the promotion. Differential Revision: https://reviews.llvm.org/D57686 llvm-svn: 354108
* [NFC] Tweak SplitBlockAndInsertIfThen to use existing ThenBlockMax Kazantsev2019-02-151-8/+16
| | | | llvm-svn: 354107
* X86: Replace isSafeToClobberEFLAGS implementationMatt Arsenault2019-02-152-86/+5
| | | | | | Also use modifiesRegister instead of looping over operands. llvm-svn: 354098
* Revert "[SystemZ] Do not emit VEXTEND or VROUND nodes without vector support."Francis Visoiu Mistrih2019-02-151-8/+0
| | | | | | | | | This reverts commit aa0b77d3395dc6ab91647138139c1a15a3aa088d. This fails to pass the machine verifier: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive/13579/ llvm-svn: 354096
* [GISel][NFC]: Add methods to speed up insertion into GISelWorklistAditya Nandakumar2019-02-152-3/+6
| | | | | | | | | | | | | https://reviews.llvm.org/D58073 Speed up insertion during the initial populating phase into the GISelWorkList by deferring repeatedly resizing the DenseMap. This results in ~10% improvement in the combiner passes, and ~3% speedup in the Legalizer. reviewed by: aemerson. llvm-svn: 354093
* AMDGPU: Set ABI version to 1 for code object v3Konstantin Zhuravlyov2019-02-143-10/+20
| | | | | | Differential Revision: https://reviews.llvm.org/D57811 llvm-svn: 354085
* [symbolizer] Avoid collecting symbols belonging to invalid sections.Matt Davis2019-02-141-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: llvm-symbolizer would originally report symbols that belonged to an invalid object file section. Specifically the case where: `*Symbol.getSection() == ObjFile.section_end()` This patch prevents the Symbolizer from collecting symbols that belong to invalid sections. The test (from PR40591) introduces a case where two symbols have address 0, one symbol is defined, 'foo', and the other is not defined, 'bar'. This patch will cause the Symbolizer to keep 'foo' and ignore 'bar'. As a side note, the logic for adding symbols to the Symbolizer's store (`SymbolizableObjectFile::addSymbol`) replaces symbols with the same <address, size> pair. At some point that logic should be revisited as in the aforementioned case, 'bar' was overwriting 'foo' in the Symbolizer's store, and 'foo' was forgotten. This fixes PR40591 Reviewers: jhenderson, rupprecht Reviewed By: rupprecht Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58146 llvm-svn: 354083
* Revert "[INLINER] allow inlining of address taken blocks"Nick Desaulniers2019-02-142-10/+2
| | | | | | This reverts commit 19e95fe61182945b7b68ad15348f144fb996633f. llvm-svn: 354082
OpenPOWER on IntegriCloud