path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
...
* IPRA: Allow target to enable IPRA by default | Matt Arsenault | 2017-08-14 | 1 | -0/+10
    llvm-svn: 310876
* IPRA: Run RegUsageInfoPropagate much later | Matt Arsenault | 2017-08-14 | 1 | -3/+3
    This was running immediately after isel, before isel pseudos were even expanded, which is really unreasonable. Move this to before pre-regalloc passes in case some other pre-regalloc pass wants to use the updated regmask info.

    Fixes one of the reasons IPRA doesn't do anything on AMDGPU currently. Tests will be included with a future patch after a few more are fixed.

    llvm-svn: 310875
* [DAGCombine] Do not try to deduplicate commutative operations if both operands are the same. | Amaury Sechet | 2017-08-14 | 1 | -3/+3
    Summary: It creates useless work, as the commuted node is the same as the node we are working on in that case.

    Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D33840
    llvm-svn: 310832
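A minimal sketch of the idea in Python, not the DAGCombiner code itself: when a combiner looks for an existing node with swapped operands, the lookup is pointless if both operands are identical. The `find_commuted` helper and the tuple-keyed node table are hypothetical names used only for illustration.

```python
def find_commuted(nodes, opcode, lhs, rhs):
    """Return an existing node equal to (opcode, rhs, lhs), or None.

    `nodes` is a hypothetical CSE table keyed by (opcode, lhs, rhs).
    """
    if lhs == rhs:
        # Swapping identical operands yields the same key, so the lookup
        # could only find the node we are already working on.
        return None
    return nodes.get((opcode, rhs, lhs))


# Toy node table: one ordinary add and one add of a value with itself.
table = {("add", "a", "b"): "n1", ("add", "x", "x"): "n2"}
```

The early return is the whole point of the sketch: it skips a lookup whose only possible hit is the current node.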
* [SelectionDAG] combine vextract (v1iX extract_subvector(vNiX, Idx)) into vextract(vNiX,Idx) when creating vextract with getNode(). | Elad Cohen | 2017-08-14 | 1 | -0/+9
    This case appeared in AVX512 after fixing pr33349 in r310552.

    Differential revision: https://reviews.llvm.org/D36571
    llvm-svn: 310828
* MachineInstr: Reason locally about some memory objects before going to AA. | Balaram Makam | 2017-08-14 | 1 | -17/+42
    This addresses a FIXME in MachineInstr::mayAlias.

    llvm-svn: 310825
* Revert "[DAGCombiner] Extending pattern detection for vector shuffle (REAPPLIED)" | Elad Cohen | 2017-08-14 | 1 | -47/+2
    This reverts commit r310782.

    llvm-svn: 310822
* [X86][ARM][TargetLowering] Add SrcVT to isExtractSubvectorCheap | Craig Topper | 2017-08-13 | 1 | -1/+1
    Summary: Without the SrcVT it's hard to know what is really being asked for. For example, if your target has 128-, 256-, and 512-bit vectors, maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not.

    For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors.

    Reviewers: RKSimon, zvi, efriedma
    Reviewed By: RKSimon
    Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits
    Differential Revision: https://reviews.llvm.org/D36649
    llvm-svn: 310793
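To illustrate why the source type matters, here is a toy cost hook in Python. It is an assumption-laden sketch, not the x86 or ARM implementation: the policy (only an aligned half of the source is cheap) and the function signature are hypothetical.

```python
def is_extract_subvector_cheap(res_bits, src_bits, index_bits):
    """Toy cost hook that looks at both the result and the source width.

    Hypothetical policy: an extract is cheap only when it is aligned to
    the result width and takes exactly half of the source vector.
    """
    aligned = index_bits % res_bits == 0
    return aligned and src_bits == 2 * res_bits
```

Without `src_bits`, the hook could not tell the 128-from-256 case apart from the 128-from-512 case, which is the ambiguity the commit describes.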
* [DAGCombiner] Extending pattern detection for vector shuffle (REAPPLIED) | Simon Pilgrim | 2017-08-12 | 1 | -2/+47
    If all the operands of a BUILD_VECTOR extract elements from the same vector, then split the vector efficiently based on the maximum vector access index.

    Reapplied with fix to only work with simple value types.

    Committed on behalf of @jbhateja (Jatin Bhateja)

    Differential Revision: https://reviews.llvm.org/D35788
    llvm-svn: 310782
* [x86] use more shift or LEA for select-of-constants (2nd try) | Sanjay Patel | 2017-08-11 | 1 | -1/+1
    The previous rev (r310208) failed to account for overflow when subtracting the constants to see if they're suitable for shift/lea. This version adds a check for that, and more tests were added in r310490.

    We can convert any select-of-constants to math ops: http://rise4fun.com/Alive/d7d

    For this patch, I'm enhancing an existing x86 transform that uses fake multiplies (they always become shl/lea) to avoid cmov or branching. The current code misses cases where we have a negative constant and a positive constant, so this is just trying to plug that hole.

    The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start with a select in IR, create a select DAG node, convert it into a sext, convert it back into a select, and then lower it to sext machine code.

    Some notes about the test diffs:
    1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR.
    2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. We could avoid the push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a post-DAG problem though.
    3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if that's a regression, but those would always be nearly equivalent.
    4. pr22338.ll and sext-i1.ll - These tests have undef operands, so we don't actually care about these diffs.
    5. sbb.ll - This shows a win for what is likely a common case: choose -1 or 0.
    6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again.
    7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops.

    Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass.

    Differential Revision: https://reviews.llvm.org/D35340
    llvm-svn: 310717
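The arithmetic behind "select-of-constants as math ops" can be sketched in Python as follows. This only shows the identity select(c, T, F) = F + c * (T - F) plus a guard for the subtraction overflowing a signed type. The helper names and the 32-bit default are assumptions, and the real transform's shift/LEA suitability checks are not modelled.

```python
def fits_signed(value, bits):
    """True if `value` is representable as a signed integer of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def select_as_math(cond, true_c, false_c, bits=32):
    """Compute select(cond, true_c, false_c) as false_c + cond * diff.

    Returns None when the difference of the constants does not fit in the
    signed type, standing in for "reject the transform".
    """
    diff = true_c - false_c
    if not fits_signed(diff, bits):
        return None
    return false_c + int(cond) * diff
```

The negative/positive pair mentioned in the message (choose -1 or 1) goes through this path with a difference of -2.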
* Improve handling of insert_subvector of bitcast values | Nirav Dave | 2017-08-11 | 1 | -0/+35
    Fix insert_subvector / extract_subvector merges of bitcast values.

    Reviewers: efriedma, craig.topper, RKSimon
    Subscribers: RKSimon, llvm-commits
    Differential Revision: https://reviews.llvm.org/D34571
    llvm-svn: 310711
* [X86][DAG] Switch X86 Target to post-legalized store merge | Nirav Dave | 2017-08-11 | 1 | -1/+2
    Move store merge to happen after intrinsic lowering to allow lowered stores to be merged.

    Some regressions in MergeConsecutiveStores due to missing insert_subvector are addressed in a follow-up patch.

    Reviewers: craig.topper, efriedma, RKSimon
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D34559
    llvm-svn: 310710
* [DAGCombiner] Remove shuffle support from simplifyShuffleMask | Simon Pilgrim | 2017-08-11 | 1 | -2/+0
    rL310372 enabled simplifyShuffleMask to support undef shuffle mask inputs, but it's causing hangs. Removing support until I can triage the problem.

    llvm-svn: 310699
* [IfConversion] Maintain the CFG when predicating/merging blocks in IfConvert* | Mikael Holmen | 2017-08-11 | 1 | -38/+21
    Summary: This fixes PR32721 in IfConvertTriangle and possible similar problems in IfConvertSimple, IfConvertDiamond and IfConvertForkedDiamond.

    In PR32721 we had a triangle

        EBB
        | \
        |  |
        | TBB
        |  /
        FBB

    where FBB didn't have any successors at all since it ended with an unconditional return. Then TBB and FBB were merged into EBB, but EBB would still keep its successors, and the use of analyzeBranch and CorrectExtraCFGEdges wouldn't help to remove them since the return instruction is not analyzable (at least not on ARM).

    The edge updating code and branch probability updating code are now pushed into MergeBlocks(), which allows us to share the same update logic between more callsites. This lets us remove several dependencies on analyzeBranch and completely eliminate RemoveExtraEdges.

    One thing that showed up with this patch was that IfConversion sometimes left a successor with 0% probability even if there was no branch or fallthrough to the successor.

    One such example is from the test case ifcvt_bad_zero_prob_succ.mir. The indirect branch tBRIND can only jump to bb.1, but without the patch we got:

        bb.0:
          successors: %bb.1(0x80000000)

        bb.1:
          successors: %bb.1(0x80000000), %bb.2(0x00000000)
          tBRIND %r1, 1, %cpsr
          B %bb.1

        bb.2:

    There is no way to jump from bb.1 to bb.2, but still there is a 0% edge from bb.1 to bb.2.

    With the patch applied we instead get the expected:

        bb.0:
          successors: %bb.1(0x80000000)

        bb.1:
          successors: %bb.1(0x80000000)
          tBRIND %r1, 1, %cpsr
          B %bb.1

    Since bb.2 had no predecessor at all, it was removed.

    Several testcases had to be updated due to this since the removed successor made the "Branch Probability Basic Block Placement" pass sometimes place blocks in a different order.

    Finally added a couple of new test cases:

    * PR32721_ifcvt_triangle_unanalyzable.mir: Regression test for the original problem described in PR 32721.
    * ifcvt_triangleWoCvtToNextEdge.mir: Regression test for the problem that caused a revert of my first attempt to solve PR 32721.
    * ifcvt_simple_bad_zero_prob_succ.mir: Test case showing the problem where a wrong successor with 0% probability was previously left.
    * ifcvt_[diamond|forked_diamond|simple]_unanalyzable.mir: Very simple test cases for the simple and (forked) diamond cases involving unanalyzable branches that can be nice to have as a base if wanting to write more complicated tests.

    Reviewers: iteratee, MatzeB, grosser, kparzysz
    Reviewed By: kparzysz
    Subscribers: kbarton, davide, aemerson, nemanjai, javed.absar, kristof.beyls, llvm-commits
    Differential Revision: https://reviews.llvm.org/D34099
    llvm-svn: 310697
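The successor bookkeeping in the triangle case can be sketched in Python. This is a simplified model, not LLVM's MergeBlocks(): the CFG is a plain dict of successor lists, branch probabilities are ignored, and `merge_blocks` is a hypothetical helper.

```python
def merge_blocks(succs, into, frm):
    """Merge block `frm` into `into`, updating successor lists in place.

    The edge into -> frm disappears, and `into` inherits the successors
    of `frm` (without duplicates). Nothing else is kept around.
    """
    merged = [s for s in succs[into] if s != frm]
    for s in succs[frm]:
        if s not in merged:
            merged.append(s)
    succs[into] = merged


# The triangle from PR32721: FBB ends in a return, so it has no successors.
cfg = {"EBB": ["TBB", "FBB"], "TBB": ["FBB"], "FBB": []}
merge_blocks(cfg, "EBB", "TBB")  # predicate TBB into EBB
merge_blocks(cfg, "EBB", "FBB")  # then merge FBB into EBB
```

After both merges `cfg["EBB"]` is empty, which is the state the commit says the old code failed to reach because EBB kept its stale successors.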
* Revert "[DAG] Cleanup unused nodes after store merge. NFCI." | Nirav Dave | 2017-08-10 | 1 | -11/+1
    This reverts commit r310648, which causes an unexpected assertion failure.

    llvm-svn: 310659
* [DAG] Relax type restriction for store merge | Nirav Dave | 2017-08-10 | 1 | -24/+64
    Summary: Allow stores of bitcastable types to be merged by peeking through BITCAST nodes and recasting stored values constant and vector extract nodes as necessary.

    Reviewers: jyknight, hfinkel, efriedma, RKSimon, spatel
    Reviewed By: RKSimon
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D34569
    llvm-svn: 310655
* [DAG] Cleanup unused nodes after store merge. NFCI. | Nirav Dave | 2017-08-10 | 1 | -1/+11
    llvm-svn: 310648
* Make .file directive to have basename only | Taewook Oh | 2017-08-10 | 1 | -1/+3
    Summary: Currently LLVM puts the directory along with the filename in the .file directive, but this behavior doesn't match gcc. There is no clear description of which one is right (https://sourceware.org/binutils/docs/as/File.html#File), but one document (https://sourceware.org/gdb/current/onlinedocs/stabs/ELF-Linker-Relocation.html) suggests that the STT_FILE symbol in an ELF file is expected to have the basename only, which should be the same string as in the .file directive according to https://docs.oracle.com/cd/E26502_01/html/E28388/eoiyg.html.

    This also badly affects build systems that use hashing, as the directory info could differ from developer to developer even when they're working on the same file.

    Reviewers: pcc, mehdi_amini
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D36018
    llvm-svn: 310642
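A small Python sketch of the effect on build hashing: emitting only the basename makes the directive independent of where each developer checked out the tree. The `file_directive` helper and the example paths are hypothetical, and string escaping is not modelled.

```python
import posixpath


def file_directive(source_path):
    """Format an assembler .file directive using only the basename."""
    return '\t.file\t"{}"'.format(posixpath.basename(source_path))


# Two developers with different checkout locations get identical output.
alice = file_directive("/home/alice/src/foo.c")
bob = file_directive("/work/bob/llvm/foo.c")
```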
* Add "Restored" flag to CalleeSavedInfo | Krzysztof Parzyszek | 2017-08-10 | 2 | -3/+4
    The liveness-tracking code assumes that the registers that were saved in the function's prolog are live outside of the function. Specifically, that registers that were saved are also live-on-exit from the function. This isn't always the case, as illustrated by the LR register on ARM.

    Differential Revision: https://reviews.llvm.org/D36160
    llvm-svn: 310619
* [DAG] Rewrite expression. NFC. | Nirav Dave | 2017-08-10 | 1 | -2/+2
    llvm-svn: 310608
* [X86] Keep dependencies when constructing loads in combineStore | Nirav Dave | 2017-08-10 | 1 | -5/+6
    Summary: Preserve chain dependencies between old and new loads constructed to prevent loads from reordering below later stores.

    Fixes PR34088.

    Reviewers: craig.topper, spatel, RKSimon, efriedma
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D36528
    llvm-svn: 310604
* [SelectionDAG] Allow constant folding for implicitly truncating BUILD_VECTOR nodes. | Guy Blank | 2017-08-10 | 1 | -2/+16
    In FoldConstantArithmetic, handle BUILD_VECTOR nodes that do implicit truncation on the elements. This is similar to what is done in FoldConstantVectorArithmetic.

    Differential Revision: https://reviews.llvm.org/D36506
    llvm-svn: 310593
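Why implicit truncation matters for folding can be shown with a toy model in Python. This is an illustrative assumption about the general problem, not a transcription of FoldConstantArithmetic: operands wider than the element type are masked down to the element width before the operation is applied, and the helper names are hypothetical.

```python
def fold_truncating_build_vectors(op, lhs_ops, rhs_ops, elt_bits):
    """Fold `op` elementwise over two constant vectors.

    Each operand may be wider than the element type, so it is truncated
    to `elt_bits` first, and the result is truncated again afterwards.
    """
    mask = (1 << elt_bits) - 1
    return [op(a & mask, b & mask) & mask for a, b in zip(lhs_ops, rhs_ops)]


def lshr(a, b):
    """Logical shift right on non-negative Python integers."""
    return a >> b


folded = fold_truncating_build_vectors(lshr, [0x1FF, 0x10], [1, 4], 8)
naive = [lshr(0x1FF, 1) & 0xFF]  # shifts the untruncated operand
```

For the first element the truncating fold gives 0x7F while the naive fold gives 0xFF, so ignoring the implicit truncation would produce a wrong constant.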
* [SelectionDAG] When scalarizing vselect, don't assert on a legal cond operand. | Elad Cohen | 2017-08-10 | 1 | -1/+15
    When scalarizing the result of a vselect, the legalizer currently expects to already have scalarized the operands. While this is true for the true/false operands (which have the same type as the result), it is not the case for the condition operand. On X86 AVX512, v1i1 is legal - this leads to operations such as '< N x type> vselect < N x i1> < N x type> < N x type>' where < N x type > is illegal to hit an assertion during the scalarization.

    The handling is similar to r205625.

    This also exposes the fact that (v1i1 extract_subvector) should be legal and selectable on AVX512 - We do this by custom lowering to vector_extract_elt. This still leaves us in some cases with redundant dag nodes which will be combined in a separate soon to come patch.

    This fixes pr33349.

    Differential revision: https://reviews.llvm.org/D36511
    llvm-svn: 310552
* Reduce variable scope by moving declaration into if clause | David Blaikie | 2017-08-09 | 1 | -8/+8
    llvm-svn: 310506
* [DAG] Explicitly cleanup merged load values during store merge. NFCI. | Nirav Dave | 2017-08-09 | 1 | -2/+8
    llvm-svn: 310474
* [ImplicitNullCheck] Fix the bug when dependent instruction accesses memory | Serguei Katkov | 2017-08-09 | 1 | -1/+3
    It is possible that a dependent instruction may access memory. In this case we must reject the optimization because the memory change will be visible in the null handler basic block. So we would execute an instruction which we must not execute if the check fails.

    Reviewers: sanjoy, reames
    Reviewed By: sanjoy
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D36392
    llvm-svn: 310443
* [codeview] Emit nested enums and typedefs from classes | Reid Kleckner | 2017-08-08 | 1 | -4/+6
    Previously we limited ourselves to only emitting nested classes, but we need other kinds of types as well.

    This fixes the Visual Studio STL visualizers, so that users can visualize std::string and other objects.

    llvm-svn: 310410
* [DAG] Introduce peekThroughBitcast function. NFCI. | Nirav Dave | 2017-08-08 | 1 | -23/+14
    llvm-svn: 310405
* [DAG] Update comments. NFC. | Nirav Dave | 2017-08-08 | 1 | -8/+9
    llvm-svn: 310404
* [DAGCombiner] simplifyShuffleMask - handle UNDEF inputs from shuffles as well as BUILD_VECTOR | Simon Pilgrim | 2017-08-08 | 1 | -11/+10
    Minor extension to D36393

    llvm-svn: 310372
* [DAGCombiner] Simplify shuffle mask index if the referenced input element is UNDEF | Simon Pilgrim | 2017-08-08 | 1 | -0/+36
    Fixes one of the cases in PR34041.

    Differential Revision: https://reviews.llvm.org/D36393
    llvm-svn: 310344
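The mask simplification can be sketched in Python under simple assumptions: both shuffle inputs are known element lists (as with BUILD_VECTOR), an undefined element is modelled as `None`, and an undefined mask entry as -1. `simplify_shuffle_mask` here is a hypothetical stand-in, not the DAGCombiner routine.

```python
UNDEF = None  # stand-in for an undefined vector element


def simplify_shuffle_mask(lhs, rhs, mask):
    """Replace mask indices that reference an UNDEF element with -1.

    Mask indices in [0, n) select from `lhs`, and indices in [n, 2n)
    select from `rhs`, where n is the number of elements per input.
    """
    n = len(lhs)
    out = []
    for m in mask:
        if m < 0:
            out.append(-1)  # already undefined
            continue
        elt = lhs[m] if m < n else rhs[m - n]
        out.append(-1 if elt is UNDEF else m)
    return out
```

Marking such lanes as undefined gives later combines more freedom, since any value is acceptable there.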
* [x86] revert r310208 to investigate test-suite failures (PR34105 / PR34097) | Sanjay Patel | 2017-08-07 | 1 | -1/+1
    llvm-svn: 310264
* [DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector. | Nirav Dave | 2017-08-07 | 1 | -12/+35
    Relanding after case to insert explicit truncation as necessary.

    Allow SCALAR_TO_VECTOR of EXTRACT_VECTOR_ELT to reduce to EXTRACT_SUBVECTOR of vector shuffle when output is smaller. Marginally improves vector shuffle computations.

    Reviewers: efriedma, RKSimon, spatel
    Subscribers: javed.absar, llvm-commits
    Differential Revision: https://reviews.llvm.org/D35566
    llvm-svn: 310256
* [SelectionDAG] reset NewNodesMustHaveLegalTypes flag between basic blocks | Guy Blank | 2017-08-07 | 1 | -0/+3
    The NewNodesMustHaveLegalTypes flag is set to false at the beginning of CodeGenAndEmitDAG, and set to true after legalizing types. But before calling CodeGenAndEmitDAG we build the DAG for the basic block. So for the first basic block NewNodesMustHaveLegalTypes would be 'false' during the SDAG building, and for all other basic blocks it would be 'true'.

    This patch sets the flag to false before SDAG building each basic block.

    Differential Revision: https://reviews.llvm.org/D33435
    llvm-svn: 310239
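The ordering problem can be replayed with a toy model in Python. `ToyDAG` and `select_blocks` are hypothetical names; the sketch only tracks what value the flag has while each block's DAG is being built.

```python
class ToyDAG:
    """Holds the one piece of state this sketch cares about."""

    def __init__(self):
        self.new_nodes_must_have_legal_types = False


def select_blocks(dag, blocks, reset_per_block):
    """Return the flag value observed during DAG building for each block."""
    seen = []
    for _ in blocks:
        if reset_per_block:
            # The fix: clear the flag before building each block's DAG.
            dag.new_nodes_must_have_legal_types = False
        # DAG building for the block observes the flag here.
        seen.append(dag.new_nodes_must_have_legal_types)
        # CodeGenAndEmitDAG: false at the start, true after type legalization.
        dag.new_nodes_must_have_legal_types = False
        dag.new_nodes_must_have_legal_types = True
    return seen


before = select_blocks(ToyDAG(), ["bb0", "bb1", "bb2"], reset_per_block=False)
after = select_blocks(ToyDAG(), ["bb0", "bb1", "bb2"], reset_per_block=True)
```

Without the reset only the first block is built with the flag cleared. With the reset every block sees the same state.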
* [x86] use more shift or LEA for select-of-constants | Sanjay Patel | 2017-08-06 | 1 | -1/+1
    We can convert any select-of-constants to math ops: http://rise4fun.com/Alive/d7d

    For this patch, I'm enhancing an existing x86 transform that uses fake multiplies (they always become shl/lea) to avoid cmov or branching. The current code misses cases where we have a negative constant and a positive constant, so this is just trying to plug that hole.

    The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start with a select in IR, create a select DAG node, convert it into a sext, convert it back into a select, and then lower it to sext machine code.

    Some notes about the test diffs:
    1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR.
    2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. I think we could avoid the push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a post-DAG problem though.
    3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if that's a regression, but I think those would always be nearly equivalent.
    4. pr22338.ll and sext-i1.ll - These tests have undef operands, so I don't think we actually care about these diffs.
    5. sbb.ll - This shows a win for what I think is a common case: choose -1 or 0.
    6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again.
    7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops.

    Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass.

    Differential Revision: https://reviews.llvm.org/D35340
    llvm-svn: 310208
* IPRA: Don't crash on null getCallPreservedMask | Matt Arsenault | 2017-08-05 | 1 | -3/+5
    Kernels aren't callable, so they don't have a call preserved mask.

    llvm-svn: 310172
* BlockPlacement: add a flag to force cold block outlining w/o a profile. | Kyle Butt | 2017-08-04 | 1 | -1/+6
    NFC.

    llvm-svn: 310129
* Revert r310058, it caused PR34073. | Nico Weber | 2017-08-04 | 1 | -47/+2
    llvm-svn: 310118
* [GlobalISel] Remove a stale comment in CMake. | Quentin Colombet | 2017-08-04 | 1 | -3/+0
    Thanks to Diana Picus <diana.picus@linaro.org> for noticing.

    NFC

    llvm-svn: 310114
* [MachineOperand] Add ChangeToTargetIndex method. NFC | Marcello Maggioni | 2017-08-04 | 1 | -0/+13
    Differential Revision: https://reviews.llvm.org/D36301
    llvm-svn: 310083
* [DAGCombiner] Extending pattern detection for vector shuffle. | Simon Pilgrim | 2017-08-04 | 1 | -2/+47
    If all the operands of a BUILD_VECTOR extract elements from the same vector, then split the vector efficiently based on the maximum vector access index.

    Committed on behalf of @jbhateja (Jatin Bhateja)

    Differential Revision: https://reviews.llvm.org/D35788
    llvm-svn: 310058
* Fix typo. | Eric Christopher | 2017-08-03 | 1 | -1/+1
    llvm-svn: 309997
* DAG: Provide access to Pass instance from SelectionDAG | Matt Arsenault | 2017-08-03 | 2 | -2/+4
    This allows accessing an analysis pass during lowering.

    llvm-svn: 309991
* [GlobalISel] Make GlobalISel a non-optional library. | Quentin Colombet | 2017-08-03 | 2 | -34/+13
    With this change, the GlobalISel library always gets built. In particular, it is no longer possible to opt GlobalISel out of the build using the LLVM_BUILD_GLOBAL_ISEL variable.

    llvm-svn: 309990
* [DAG] Allow merging of stores of vector loads | Nirav Dave | 2017-08-03 | 1 | -6/+0
    Remove the restriction disallowing merging of stores of vector loads into a larger store of a larger vector load.

    Reviewers: RKSimon, efriedma, spatel
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D36158
    llvm-svn: 309951
* [LiveDebugVariables] Use lexical scope to trim debug value live intervals | Robert Lougher | 2017-08-03 | 1 | -7/+90
    The debug value live intervals computed by Live Debug Variables may extend beyond the range of the debug location's lexical scope. In this case, splitting of an interval can result in an interval outside of the scope being created, causing extra unnecessary DBG_VALUEs to be emitted.

    To prevent this, trim the intervals to the lexical scope.

    This resolves PR33730.

    Reviewers: aprantl
    Differential Revision: https://reviews.llvm.org/D35953
    llvm-svn: 309933
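Trimming an interval to a lexical scope amounts to an interval intersection. The Python sketch below assumes half-open `(start, end)` ranges over abstract instruction indexes. `trim_to_scope` is a hypothetical helper and does not reflect the data structures used in LiveDebugVariables.

```python
def trim_to_scope(interval, scope_ranges):
    """Clip a half-open (start, end) interval to the given scope ranges.

    Returns the pieces of `interval` that lie inside the scope, in the
    order of `scope_ranges`. Pieces outside the scope are dropped.
    """
    start, end = interval
    pieces = []
    for s, e in scope_ranges:
        lo, hi = max(start, s), min(end, e)
        if lo < hi:
            pieces.append((lo, hi))
    return pieces
```

A debug value interval that lies entirely outside the scope trims to an empty list, so no extra DBG_VALUE would be emitted for it in this model.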
* [SelectionDAG] Resolve PR33978. | Simon Dardis | 2017-08-03 | 1 | -4/+2
    rL306209 taught SelectionDAG how to add the dereferenceable flag when expanding memcpy and memmove. The fix however contained a nit where the offset + size was constructed as an APInt of PointerSize rather than PointerSizeInBits. This led to isDereferenceableAndAlignedPointer() getting truncated values, or values which would be sign extended within that function, leading to incorrect results.

    Thanks to Alex Crichton for reporting the issue!

    This resolves PR33978.

    Reviewers: inouehrs
    Differential Revision: https://reviews.llvm.org/D36236
    llvm-svn: 309930
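The bytes-versus-bits mix-up can be illustrated with fixed-width arithmetic in Python. The `apint` and `sext` helpers are hypothetical stand-ins for a fixed-width integer type, and the 64-bit pointer size is an assumption made for the example.

```python
def apint(value, width):
    """Wrap `value` to an unsigned integer of `width` bits."""
    return value & ((1 << width) - 1)


def sext(value, width):
    """Interpret a `width`-bit unsigned value as a signed integer."""
    sign = 1 << (width - 1)
    return (value ^ sign) - sign


POINTER_SIZE = 8           # pointer size in bytes (assumed 64-bit target)
POINTER_SIZE_IN_BITS = 64  # pointer size in bits

# An offset + size of 300 bytes does not fit in an 8-bit integer.
wrong = apint(300, POINTER_SIZE)
right = apint(300, POINTER_SIZE_IN_BITS)
```

Using the byte count as a bit width silently wraps 300 to 44, and a value such as 200 would even turn negative once sign extended.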
* [RegisterCoalescer] Add wrapper for Erasing Instructions | Sameer AbuAsal | 2017-08-03 | 1 | -14/+16
    Summary: To delete an instruction the coalescer needs to call eraseFromParent() on the MachineInstr, insert it in the ErasedInstrs list and update the Live Ranges structure. This patch re-factors the code to do all that in one function. This will also fix cases where previous code wasn't inserting deleted instructions in the ErasedList.

    Reviewers: qcolombet, kparzysz
    Reviewed By: qcolombet
    Subscribers: MatzeB, llvm-commits, qcolombet
    Differential Revision: https://reviews.llvm.org/D36204
    llvm-svn: 309915
* Delete Default and JITDefault code models | Rafael Espindola | 2017-08-03 | 1 | -1/+0
    IMHO it is an antipattern to have an enum value that is Default.

    At any given piece of code it is not clear if we have to handle Default or if it has already been mapped to a concrete value. In this case in particular, only the target can do the mapping and it is nice to make sure it is always done.

    This deletes the two default enum values of CodeModel and uses an explicit Optional<CodeModel> when it is possible that it is unspecified.

    llvm-svn: 309911
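The design choice can be mirrored in Python with `Optional`. This is an analogy to the C++ change, not LLVM code: the `effective_code_model` function and the way the target default is passed in are assumptions made for illustration.

```python
from enum import Enum
from typing import Optional


class CodeModel(Enum):
    """Concrete code models only. There is deliberately no DEFAULT member."""
    SMALL = "small"
    KERNEL = "kernel"
    MEDIUM = "medium"
    LARGE = "large"


def effective_code_model(requested: Optional[CodeModel],
                         target_default: CodeModel) -> CodeModel:
    """Map an unspecified request to the target's choice exactly once."""
    return target_default if requested is None else requested
```

Past this function every caller holds a concrete `CodeModel`, so no later code has to wonder whether a "default" value still needs handling.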
* [StackColoring] Update AliasAnalysis information in stack coloring pass (part 2) | Hiroshi Inoue | 2017-08-02 | 1 | -4/+4
    This patch is an update to the first patch (https://reviews.llvm.org/rL309651) based on the post-commit comments.

    The stack coloring pass needs to maintain AliasAnalysis information when merging stack slots of different types. Actually, there is a FIXME comment in StackColoring.cpp:

        // FIXME: In order to enable the use of TBAA when using AA in CodeGen,
        // we'll also need to update the TBAA nodes in MMOs with values
        // derived from the merged allocas.

    But TBAA has already been enabled in CodeGen without fixing this pass. The incorrect TBAA metadata results in recent failures in the bootstrap test on ppc64le (PR33928) by allowing unsafe instruction scheduling. Although we observed the problem on ppc64le, this is a platform-neutral issue.

    This patch makes the stack coloring pass maintain AliasAnalysis information when merging multiple stack slots.

    This patch fixes PR33928.

    llvm-svn: 309849
* Assert that the offset of a DBG_VALUE is always 0. (NFC) | Adrian Prantl | 2017-08-02 | 2 | -4/+8
    llvm-svn: 309834