summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
...
* [DAGCombiner] Match non-uniform constant vectors using predicates.Simon Pilgrim2017-07-201-28/+81
| | | | | | | | | | | | Most combines currently recognise scalar and splat-vector constants, but not non-uniform vector constants. This patch introduces a matching mechanism that uses predicates to check against BUILD_VECTOR of ConstantSDNode, as well as scalar ConstantSDNode cases. I've changed a couple of predicates to demonstrate - the combine-shl changes add currently unsupported cases, while the MatchRotate replaces an existing mechanism. Differential Revision: https://reviews.llvm.org/D35492 llvm-svn: 308598
* {DAGCombine] Convert (Val & Mask) == Mask to Mask.isSubsetof(Val). NFCI.Simon Pilgrim2017-07-191-1/+1
| | | | llvm-svn: 308460
* [DAG] Improve Aliasing of operations to static allocaNirav Dave2017-07-181-6/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Re-recommiting after landing DAG extension-crash fix. Recommiting after adding check to avoid miscomputing alias information on addresses of the same base but different subindices. Memory accesses offset from frame indices may alias, e.g., we may merge write from function arguments passed on the stack when they are contiguous. As a result, when checking aliasing, we consider the underlying frame index's offset from the stack pointer. Static allocs are realized as stack objects in SelectionDAG, but its offset is not set until post-DAG causing DAGCombiner's alias check to consider access to static allocas to frequently alias. Modify isAlias to consider access between static allocas and access from other frame objects to be considered aliasing. Many test changes are included here. Most are fixes for tests which indirectly relied on our aliasing ability and needed to be modified to preserve their original intent. The remaining tests have minor improvements due to relaxed ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll which has a minor degradation dispite though the pre-legalized DAG is improved. Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand Reviewed By: rnk Subscribers: sdardis, nemanjai, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33345 llvm-svn: 308350
* [DAG] Reverse node replacement in extension operation. NFCI.Nirav Dave2017-07-181-12/+20
| | | | | | | | Reorder replacements to be user first in preparation for multi-level folding to premptively avoid inadvertantly deleting later nodes from sharing found from replacement. llvm-svn: 308348
* [DAG] Avoid deleting nodes before combining them.Nirav Dave2017-07-181-7/+26
| | | | | | | | | | | | | | | | | | When replacing a node and it's operand, replacing the operand node may cause the deletion of the original node leading to an assertion failure. Case around these replacements to avoid this without relying on inspecting the DELETED_NODE opcode in various extend dagcombiner cases. Fixes PR32515. Reviewers: dbabokin, RKSimon, davide, chandlerc Subscribers: chandlerc, llvm-commits Differential Revision: https://reviews.llvm.org/D34095 llvm-svn: 308330
* [DAG] Allow base element type of store merge type to also be a vector.Nirav Dave2017-07-181-1/+6
| | | | | | Correctly calculate merged vector size if MemVT is already a vector. llvm-svn: 308312
* [DAGCombine] Fix issue with out of bound constant rotation (PR33828)Simon Pilgrim2017-07-181-1/+10
| | | | | | Take the modulo of rotations by a constant greater than or equal to the bit-width llvm-svn: 308302
* Revert r308025 due to uncovering a crash in SelectionDAG. This is filedChandler Carruth2017-07-181-16/+6
| | | | | | | | | with a minimal test case in http://llvm.org/PR33833. Original commit message: Improve Aliasing of operations to static alloca llvm-svn: 308271
* [DAGCombiner] Recognise vector rotations with non-splat constantsAndrew Zhogin2017-07-161-13/+21
| | | | | | | | Fixes PR33691. Differential revision: https://reviews.llvm.org/D35381 llvm-svn: 308150
* Strip trailing whitespace. NFCISimon Pilgrim2017-07-151-1/+1
| | | | llvm-svn: 308108
* Improve Aliasing of operations to static allocaNirav Dave2017-07-141-6/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recommiting after adding check to avoid miscomputing alias information on addresses of the same base but different subindices. Memory accesses offset from frame indices may alias, e.g., we may merge write from function arguments passed on the stack when they are contiguous. As a result, when checking aliasing, we consider the underlying frame index's offset from the stack pointer. Static allocs are realized as stack objects in SelectionDAG, but its offset is not set until post-DAG causing DAGCombiner's alias check to consider access to static allocas to frequently alias. Modify isAlias to consider access between static allocas and access from other frame objects to be considered aliasing. Many test changes are included here. Most are fixes for tests which indirectly relied on our aliasing ability and needed to be modified to preserve their original intent. The remaining tests have minor improvements due to relaxed ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll which has a minor degradation dispite though the pre-legalized DAG is improved. Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand Reviewed By: rnk Subscribers: sdardis, nemanjai, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33345 llvm-svn: 308025
* Reland "[mips] Fix multiprecision arithmetic."Simon Dardis2017-07-131-4/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | For multiprecision arithmetic on MIPS, rather than using ISD::ADDE / ISD::ADDC, get SelectionDAG to break down the operation into ISD::ADDs and ISD::SETCCs. For MIPS, only the DSP ASE has a carry flag, so in the general case it is not useful to directly support ISD::{ADDE, ADDC, SUBE, SUBC} nodes. Also improve the generation code in such cases for targets with TargetLoweringBase::ZeroOrOneBooleanContent by directly using the result of the comparison node rather than using it in selects. Similarly for ISD::SUBE / ISD::SUBC. Address optimization breakage by moving the generation of MIPS specific integer multiply-accumulate nodes to before legalization. This revolves PR32713 and PR33424. Thanks to Simonas Kazlauskas and Pirama Arumuga Nainar for reporting the issue! Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D33494 The previous version of this patch was too aggressive in producing fused integer multiple-addition instructions. llvm-svn: 307906
* [DAGCombiner] Fix issue with rotate combines asserting if the constant value ↵Simon Pilgrim2017-07-131-15/+18
| | | | | | types differ from the result type. llvm-svn: 307900
* Use isNullConstantOrNullSplatConstant helper. NFCI.Simon Pilgrim2017-07-131-3/+2
| | | | llvm-svn: 307895
* [TargetLowering] Add hook for adding target MMO flags when doing ISel.Geoff Berry2017-07-131-0/+2
| | | | | | | | | | | | | Summary: Add TargetLowering hook getMMOFlags() to add target specific MMO flags to load/store instructions created by ISel. Reviewers: bogner, hfinkel, qcolombet, MatzeB Subscribers: mcrosier, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D34962 llvm-svn: 307879
* Add element atomic memset intrinsicDaniel Neilson2017-07-121-0/+39
| | | | | | | | | | | | | | Summary: Continuing the work from https://reviews.llvm.org/D33240, this change introduces an element unordered-atomic memset intrinsic. This intrinsic is essentially memset with the implementation requirement that all stores used for the assignment are done with unordered-atomic stores of a given element size. Reviewers: eli.friedman, reames, mkazantsev, skatkov Reviewed By: reames Subscribers: jfb, dschuff, sbc100, jgravelle-google, aheejin, efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D34885 llvm-svn: 307854
* Add element atomic memmove intrinsicDaniel Neilson2017-07-121-0/+38
| | | | | | | | | | | | | | Summary: Continuing the work from https://reviews.llvm.org/D33240, this change introduces an element unordered-atomic memmove intrinsic. This intrinsic is essentially memmove with the implementation requirement that all loads/stores used for the copy are done with unordered-atomic loads/stores of a given element size. Reviewers: eli.friedman, reames, mkazantsev, skatkov Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34884 llvm-svn: 307796
* Enhance synchscope representationKonstantin Zhuravlyov2017-07-112-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance. This change extends existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system). The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this replaces *singlethread* keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope. Implementation details: - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context; - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings; - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode. Differential Revision: https://reviews.llvm.org/D21723 llvm-svn: 307722
* Revert "[DAG] Improve Aliasing of operations to static alloca"Matthias Braun2017-07-101-14/+6
| | | | | | | | | Reverting as it breaks tramp3d-v4 in the llvm test-suite. I added some comments to https://reviews.llvm.org/D33345 about it. This reverts commit r307546. llvm-svn: 307589
* Add DAG argument to canMergeStoresTo NFC.Nirav Dave2017-07-101-7/+9
| | | | llvm-svn: 307583
* [DAG] Improve Aliasing of operations to static allocaNirav Dave2017-07-101-6/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Memory accesses offset from frame indices may alias, e.g., we may merge write from function arguments passed on the stack when they are contiguous. As a result, when checking aliasing, we consider the underlying frame index's offset from the stack pointer. Static allocs are realized as stack objects in SelectionDAG, but its offset is not set until post-DAG causing DAGCombiner's alias check to consider access to static allocas to frequently alias. Modify isAlias to consider access between static allocas and access from other frame objects to be considered aliasing. Many test changes are included here. Most are fixes for tests which indirectly relied on our aliasing ability and needed to be modified to preserve their original intent. The remaining tests have minor improvements due to relaxed ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll which has a minor degradation dispite though the pre-legalized DAG is improved. Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand Reviewed By: rnk Subscribers: sdardis, nemanjai, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33345 llvm-svn: 307546
* fix typos in comments and error messages; NFCHiroshi Inoue2017-07-102-2/+2
| | | | llvm-svn: 307533
* [X86] Relax an assertion when legalizing vector types.Davide Italiano2017-07-091-0/+4
| | | | | | | | | | | | | | WidenVSELECTAndMask can fold (and it folds in this case) so we get a BUILD_VECTOR of constants as mask. convertMask() seems to work fine when the input is a vector of constants, and we still need to call it to extend/add elements at the end. but the current code just asserts on anything but a SETCC or AND/OR/XOR of 2xSETCC. This change was discussed briefly with Simon Pilgrim, who also suggests we might consider dropping this assertion in the future. Fixes PR33715. llvm-svn: 307508
* Handle ConstantExpr correctly in SelectionDAGBuilderSimon Pilgrim2017-07-092-8/+18
| | | | | | | | | | | | This change fixes a bug in SelectionDAGBuilder::visitInsertValue and SelectionDAGBuilder::visitExtractValue where constant expressions (InsertValueConstantExpr and ExtractValueConstantExpr) would be treated as non-constant instructions (InsertValueInst and ExtractValueInst). This bug resulted in an incorrect memory access, which manifested as an assertion failure in SDValue::SDValue. Fixes PR#33094. Submitted on behalf of @Praetonus (Benoit Vey) Differential Revision: https://reviews.llvm.org/D34538 llvm-svn: 307502
* [FastISel] fix a fallback diagnostic.Igor Breger2017-07-091-1/+2
| | | | | | | | | | | | | | Summary: FastISel was marked as failed in case instruction selection succeeded. Reviewers: qcolombet, zvi, rovka, ab Reviewed By: zvi Subscribers: javed.absar, ab, qcolombet, bogner, llvm-commits Differential Revision: https://reviews.llvm.org/D34438 llvm-svn: 307489
* [DAGCombiner] use local variable to shorten code; NFCISanjay Patel2017-07-071-36/+31
| | | | llvm-svn: 307429
* Fix libcall expansion creating DAG nodes with invalid type post type ↵Vadim Chugunov2017-07-052-12/+25
| | | | | | | | | | | | legalization. If we are lowering a libcall after legalization, we'll split the return type into a pair of legal values. Patch by Jatin Bhateja and Eli Friedman. Differential Revision: https://reviews.llvm.org/D34240 llvm-svn: 307207
* {DAGCombiner] Fold (rot x, 0) -> xSimon Pilgrim2017-07-051-0/+4
| | | | llvm-svn: 307184
* [DAGCombiner] visitRotate patch to optimize pair of ROTR/ROTL instructions ↵Andrew Zhogin2017-07-051-0/+19
| | | | | | | | | | into one with combined shift operand. For two ROTR operations with shifts C1, C2; combined shift operand will be (C1 + C2) % bitsize. Differential revision: https://reviews.llvm.org/D12833 llvm-svn: 307179
* Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffsetNirav Dave2017-07-052-45/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | Relanding after rewriting undef.ll test to avoid host-dependant endianness. As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using generic checks. Also, propagate missing local handling from there to BaseIndexOffset checks. Tests of note: * test/CodeGen/X86/build-vector* - Improved. * test/CodeGen/BPF/undef.ll - Improved store alignment allows an additional store merge * test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a case we already do not handle well. Here, the DAG is improved, but scheduling causes a code size degradation. Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D34472 llvm-svn: 307114
* fix trivial typos in comments; NFCHiroshi Inoue2017-07-041-1/+1
| | | | llvm-svn: 307094
* [DAGCombiner] Intermediate variables in visitRotate promoted to the ↵Andrew Zhogin2017-07-041-6/+9
| | | | | | function's begin. NFC precommit for D12833. llvm-svn: 307091
* [FastISel][SelectionDAG]Teach fastISel about GC intrinsicsAnna Thomas2017-07-041-1/+5
| | | | | | | | | | | | | | | | | | | | | | | Summary: We are crashing in LLC at O0 when gc intrinsics are present in the block. The reason being FastISel performs basic block ISel by modifying GC.relocates to be the first instruction in the block. This can cause us to visit the GC relocate before it's corresponding GC.statepoint is visited, which is incorrect. When we lower the statepoint, we record the base and derived pointers, along with the gc.relocates. After this we can visit the gc.relocate. This patch avoids fastISel from incorrectly creating the block with gc.relocate as the first instruction. Reviewers: qcolombet, skatkov, qikon, reames Reviewed by: skatkov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34421 llvm-svn: 307084
* [DAG] Fixed predicate for determining when two frame indicesNirav Dave2017-07-041-5/+5
| | | | | | addresses are comparable. NFCI. llvm-svn: 307055
* [legalize-types] Clean up softening machinery.Anton Yartsev2017-07-044-42/+89
| | | | | | | | The patch makes SoftenFloatResult/Operand logic just the same as all other legalization routines have: SoftenFloatResult() now fills the SoftenFloats map and SoftenFloatOperand() perform all needed replacements. This prevents softening mashinery from leaving stale entries in SoftenFloats map (that resulted in errors during the legalize type checking) and clarifies softening. The patch replaces https://reviews.llvm.org/D29265. Differential Revision: https://reviews.llvm.org/D31946 llvm-svn: 307053
* DAGCombine: Combine BUILD_VECTOR to TRUNCATEZvi Rackover2017-07-031-0/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add a combine for creating a truncate to replace a build_vector composed of extracts with indices that form a stride-2^N series. Example: v8i32 V = ... v4i32 build_vector((extract_elt V, 0), (extract_elt V, 2), (extract_elt V, 4), (extract_elt V, 6)) --> v4i32 truncate (bitcast V to v4i64) Related discussion in llvm-dev about canonicalizing shuffles to truncates in LLVM IR: http://lists.llvm.org/pipermail/llvm-dev/2017-January/108936.html. Reviewers: spatel, RKSimon, efriedma, igorb, craig.topper, wolfgangp, delena Reviewed By: delena Subscribers: guyblank, delena, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D34077 llvm-svn: 307036
* [SelectionDAGBuilder] Use EVT::getVectorVT instead of MVT::getVectorVT to ↵Craig Topper2017-07-011-1/+1
| | | | | | prevent a crash if the type isn't a simple VT. llvm-svn: 306950
* Revert "[DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset"Nirav Dave2017-06-302-20/+45
| | | | | | | This reverts commit r306819 which appears be exposing underlying issues in a stage1 ppc64be build llvm-svn: 306820
* [DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffsetNirav Dave2017-06-302-45/+20
| | | | | | | | | | | | | | | | | | | | | | | | As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using generic checks. Also, propagate missing local handling from there to BaseIndexOffset checks. Tests of note: * test/CodeGen/X86/build-vector* - Improved. * test/CodeGen/BPF/undef.ll - Improved store alignment allows an additional store merge * test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a case we already do not handle well. Here, the DAG is improved, but scheduling causes a code size degradation. Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D34472 llvm-svn: 306819
* Revert "[mips] Fix multiprecision arithmetic."Simon Dardis2017-06-291-17/+4
| | | | | | | This reverts commit r305389. This broke chromium builds, so reverting while I investigate further. llvm-svn: 306741
* [DAG] Fold FrameIndex offset into BaseIndexOffset analysis. NFCI.Nirav Dave2017-06-292-21/+35
| | | | | | | | | | | Relanding after restricting equalBaseIndex to not erroneuosly consider a FrameIndices stemming from alloca from being comparable as its offset is set post-selectionDAG. Pull FrameIndex comparision reasoning from DAGCombiner::isAlias to general BaseIndexOffset. llvm-svn: 306688
* Fold fneg and fabs like multiplicationsStanislav Mekhanoshin2017-06-281-0/+46
| | | | | | | | | | | Given no NaNs and no signed zeroes it folds: (fmul X, (select (fcmp X > 0.0), -1.0, 1.0)) -> (fneg (fabs X)) (fmul X, (select (fcmp X > 0.0), 1.0, -1.0)) -> (fabs X) Differential Revision: https://reviews.llvm.org/D34579 llvm-svn: 306592
* Revert "[DAG] Fold FrameIndex offset into BaseIndexOffset analysis. NFCI."Nirav Dave2017-06-282-24/+27
| | | | | | This reverts commit r306498 which appears to cause a compilrt-rt test failures llvm-svn: 306501
* Allow to truncate left shift with non-constant shift amountStanislav Mekhanoshin2017-06-281-10/+12
| | | | | | | | | | | That is pretty common for clang to produce code like (shl %x, (and %amt, 31)). In this situation we can still perform trunc (shl) into shl (trunc) conversion given the known value range of shift amount. Differential Revision: https://reviews.llvm.org/D34723 llvm-svn: 306499
* [DAG] Fold FrameIndex offset into BaseIndexOffset analysis. NFCI.Nirav Dave2017-06-282-27/+24
| | | | | | | Pull FrameIndex comparision reasoning from DAGCombiner::isAlias to general BaseIndexOffset. llvm-svn: 306498
* [SelectionDAG] set dereferenceable flag in MergeConsecutiveStores to fix ↵Hiroshi Inoue2017-06-271-2/+12
| | | | | | | | | | | | assetion failure When SelectionDAG merges consecutive stores and loads in MergeConsecutiveStores, it does not set dereferenceable flag for a created load instruction. This results in an assertion failure if SelectionDAG commonizes this load instruction with other load instructions, as well as it may miss optimization opportunities. This patch sat dereferenceable flag for the newly created load instruction if all the load instructions to be merged are dereferenceable. Differential Revision: https://reviews.llvm.org/D34679 llvm-svn: 306404
* DAGCombine: Make sure we only eliminate trunc/extend when the scales of ↵Wolfgang Pieb2017-06-261-5/+9
| | | | | | | | | | | | truncation and extension match. This fixes PR33368. Reviewer: rksimon Differential Revision: https://reviews.llvm.org/D34069 llvm-svn: 306345
* AVX-512: Fixed a crash during legalization of <3 x i8> typeElena Demikhovsky2017-06-251-2/+1
| | | | | | | | | The compiler fails with assertion during legalization of SETCC for <3 x i8> operands. The result is extended to <4 x i8> and then truncated <4 x i1>. It does not happen on AVX2, because the final result of SETCC is <4 x i32>. Differential Revision: https://reviews.llvm.org/D34503 llvm-svn: 306242
* [SelectionDAG] set dereferenceable flag when expanding memcpy/memmoveHiroshi Inoue2017-06-241-8/+25
| | | | | | | | | | When SelectionDAG expands memcpy (or memmove) call into a sequence of load and store instructions, it disregards dereferenceable flag even the source pointer is known to be dereferenceable. This results in an assertion failure if SelectionDAG commonizes a load instruction generated for memcpy with another load instruction for the source pointer. This patch makes SelectionDAG to set the dereferenceable flag for the load instructions properly to avoid the assertion failure. Differential Revision: https://reviews.llvm.org/D34467 llvm-svn: 306209
* [DAG] Add Target Store Merge pass ordering functionNirav Dave2017-06-221-1/+2
| | | | | | | Allow targets to specify if they should merge stores before or after legalization. llvm-svn: 306006
OpenPOWER on IntegriCloud