path: root/llvm/test/CodeGen
Commit message | Author | Age | Files | Lines
* [SDAG] Fix Stale SDNode usage in visitAND | Nirav Dave | 2017-03-28 | 1 | -0/+31
    Reorder CombineTo Calls to prevent potential use of deleted node. Fixes PR32372.
    Reviewers: jnspaulsson, RKSimon, uweigand, jonpa
    Reviewed By: jonpa
    Subscribers: jonpa, llvm-commits
    Differential Revision: https://reviews.llvm.org/D31346
    llvm-svn: 298920
* [x86] add AVX2 run to show 256-bit opportunity; NFC | Sanjay Patel | 2017-03-28 | 1 | -17/+17
    llvm-svn: 298918
* [GlobalISel][X86] support G_FRAME_INDEX instruction selection. | Igor Breger | 2017-03-28 | 3 | -49/+99
    Summary: G_LOAD/G_STORE, add alternative RegisterBank mapping. For G_LOAD, Fast and Greedy mode choose the same RegisterBank mapping (GprRegBank) for the G_GLOAD + G_FADD, so the cross register bank copy GprRegBank->VecRegBank can't be eliminated.
    Reviewers: zvi, rovka, qcolombet, ab
    Reviewed By: zvi
    Subscribers: llvm-commits, dberris, kristof.beyls, eladcohen, guyblank
    Differential Revision: https://reviews.llvm.org/D30979
    llvm-svn: 298907
* [ARM] Mark flaky test unsupported until we find the cause | Renato Golin | 2017-03-27 | 1 | -1/+1
    llvm-svn: 298887
* Improve machine schedulers for in-order processors | Javed Absar | 2017-03-27 | 1 | -0/+86
    This patch enables schedulers to specify instructions that cannot be issued with any other instructions. It also fixes BeginGroup/EndGroup.
    Reviewed by: Andrew Trick
    Differential Revision: https://reviews.llvm.org/D30744
    llvm-svn: 298885
* [GlobalISel][AArch64] Fold FI into LDR/STR ui addressing mode. | Ahmed Bougacha | 2017-03-27 | 2 | -0/+66
    A majority of loads and stores at O0 access an alloca. It's trivial to fold the G_FRAME_INDEX into the instruction; do it.
    llvm-svn: 298864
* [GlobalISel][AArch64] Fold G_GEP into LDR/STR ui addressing mode. | Ahmed Bougacha | 2017-03-27 | 2 | -0/+468
    We're not to the point of supporting the load/store patterns yet (because they extensively use PatFrags). But in the meantime, we can implement some of the simplest addressing modes.
    llvm-svn: 298863
* [GlobalISel][AArch64] Select store of zero to WZR/XZR. | Ahmed Bougacha | 2017-03-27 | 1 | -0/+56
    These occur very frequently, and are quite trivial to catch.
    llvm-svn: 298862
* [GlobalISel][AArch64] Select CBZ. | Ahmed Bougacha | 2017-03-27 | 1 | -0/+108
    CBZ/CBNZ represent a substantial portion of all conditional branches. Look through G_ICMP to select them. We can't use tablegen yet because the existing patterns match an AArch64ISD node.
    llvm-svn: 298856
* [GlobalISel][AArch64] Use proper constant types in test. NFC. | Ahmed Bougacha | 2017-03-27 | 1 | -2/+2
    llvm-svn: 298854
* [AArch64] Mark mrs of TPIDR_EL0 (thread pointer) as not having side effects. | Chad Rosier | 2017-03-27 | 1 | -0/+60
    Among other things, this allows Machine LICM to hoist a costly 'mrs' instruction from within a loop.
    Differential Revision: http://reviews.llvm.org/D31151
    llvm-svn: 298851
* [X86][AVX2] bugzilla bug 21281 Performance regression in vector interleave in AVX2 | Gadi Haber | 2017-03-27 | 2 | -56/+34
    This is a patch for an on-going bugzilla bug 21281 on the generated X86 code for a matrix transpose8x8 subroutine which requires vector interleaving. The generated code in AVX2 is currently non-optimal and requires 60 instructions as opposed to only 40 instructions generated for AVX1. The patch includes a fix for the AVX2 case where vector unpack instructions use fewer operations than the vector blend operations available in AVX2. In this case using vector unpack instructions is more efficient.
    Reviewers: zvi delena igorb craig.topper guyblank eladcohen m_zuckerman aymanmus RKSimon
    llvm-svn: 298840
* [X86][SSE] Add computeKnownBitsForTargetNode support for (V)PSLL/(V)PSRL instructions | Simon Pilgrim | 2017-03-26 | 1 | -1/+0
    llvm-svn: 298806
* [X86][AVX512F] Fix reg class for VMOVSSZrr/VMOVSSZrrk and VMOVSDZrr/VMOVSDZrrk | Simon Pilgrim | 2017-03-26 | 1 | -6/+6
    Fixed -verify-machineinstrs errors in fast-isel-select-sse.ll (one of many in PR27481).
    The VMOVSSZrr/VMOVSSZrrk and VMOVSDZrr/VMOVSDZrrk instructions were assuming both source registers were V128X when the second is actually supposed to be FR32X/FR64X.
    Differential Revision: https://reviews.llvm.org/D31200
    llvm-svn: 298805
* Regenerate test | Simon Pilgrim | 2017-03-26 | 1 | -1/+1
    llvm-svn: 298803
* Regenerate test | Simon Pilgrim | 2017-03-26 | 1 | -7/+7
    The CHECK-DAG aren't necessary and get in the way of automated checks
    llvm-svn: 298802
* Regenerate tests to remove duplicated checks | Simon Pilgrim | 2017-03-26 | 1 | -241/+118
    llvm-svn: 298801
* [GlobalISel][X86] support G_FRAME_INDEX instruction selection. | Igor Breger | 2017-03-26 | 2 | -0/+66
    Summary: Support G_FRAME_INDEX instruction selection.
    Reviewers: zvi, rovka, ab, qcolombet
    Reviewed By: ab
    Subscribers: llvm-commits, dberris, kristof.beyls, eladcohen, guyblank
    Differential Revision: https://reviews.llvm.org/D30980
    llvm-svn: 298800
* [X86][SSE] Combine (VSRLI (VSRAI X, Y), (NumSignBits-1)) -> (VSRLI X, (NumSignBits-1)) | Simon Pilgrim | 2017-03-25 | 1 | -1/+0
    Part 3 of 3.
    Differential Revision: https://reviews.llvm.org/D31347
    llvm-svn: 298782
* [X86][SSE] Added ComputeNumSignBitsForTargetNode support for (V)PSRAI | Simon Pilgrim | 2017-03-25 | 1 | -2/+2
    Part 2 of 3.
    Differential Revision: https://reviews.llvm.org/D31347
    llvm-svn: 298780
* [x86] use PMOVMSK to replace memcmp libcalls for 16-byte equality | Sanjay Patel | 2017-03-25 | 1 | -11/+9
    This is the payoff for D31156 - if a target has efficient comparison instructions for vector-sized equality, we can replace memcmp calls with inline code that is both smaller and faster.
    Differential Revision: https://reviews.llvm.org/D31290
    llvm-svn: 298775
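The idea in the memcmp entry above (r298775) can be modeled in scalar form. The following is an illustrative Python sketch, not the code LLVM emits: a byte-wise equality compare yields 0xFF or 0x00 per byte (as PCMPEQB does), the high bit of each byte is gathered into a 16-bit mask (as PMOVMSKB does), and the two blocks are equal exactly when that mask is 0xFFFF. The function names `byte_equal_mask` and `equal16` are hypothetical.

```python
def byte_equal_mask(a: bytes, b: bytes) -> int:
    """Model PCMPEQB followed by PMOVMSKB on two 16-byte blocks."""
    if len(a) != 16 or len(b) != 16:
        raise ValueError("this sketch models exactly one 16-byte vector")
    mask = 0
    for i in range(16):
        # PCMPEQB: each byte lane becomes 0xFF if equal, 0x00 otherwise.
        lane = 0xFF if a[i] == b[i] else 0x00
        # PMOVMSKB: collect the most significant bit of each lane.
        mask |= (lane >> 7) << i
    return mask


def equal16(a: bytes, b: bytes) -> bool:
    """Equality test that needs no memcmp call: all 16 mask bits must be set."""
    return byte_equal_mask(a, b) == 0xFFFF
```

Because only equality is decided, the sketch says nothing about the ordering result that a general memcmp call returns.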
* [X86][SSE] Add extra computeNumSignBits test case for D31311. | Simon Pilgrim | 2017-03-25 | 1 | -0/+47
    llvm-svn: 298774
* [AMDGPU] Switch data layout by triple environment amdgiz | Yaxun Liu | 2017-03-25 | 2 | -0/+22
    Switch data layout by target triple environment amdgiz and amdgizcl, indicating use of an address space mapping in which the generic address space is 0.
    amdgiz is for non-OpenCL environments where the generic address space is 0.
    amdgizcl is for OpenCL environments where the generic address space is 0.
    Differential Revision: https://reviews.llvm.org/D31211
    llvm-svn: 298758
* [ARM] Fix mixup between Lo and Hi in SMLALBB formation. | Eli Friedman | 2017-03-25 | 1 | -84/+84
    llvm-svn: 298752
* [x86] add 32-bit RUN for better memcmp coverage; NFC | Sanjay Patel | 2017-03-24 | 1 | -102/+244
    llvm-svn: 298744
* AMDGPU: Fix annotating loops with nested loop conditions | Matt Arsenault | 2017-03-24 | 1 | -0/+269
    If the branch condition for a loop was a phi which itself was fed from a phi from a loop, it isn't safe to try to delete the phi until after the loop is handled.
    llvm-svn: 298737
* AMDGPU: Implement f16 fround | Matt Arsenault | 2017-03-24 | 2 | -21/+74
    llvm-svn: 298730
* AMDGPU: Unify divergent function exits. | Matt Arsenault | 2017-03-24 | 6 | -44/+952
    StructurizeCFG can't handle cases with multiple returns creating regions with multiple exits. Create a copy of UnifyFunctionExitNodes that skips exit nodes with uniform branch sources and unifies only the remaining exit nodes.
    llvm-svn: 298729
* [AMDGPU] Fold V_CNDMASK with identical source operands | Stanislav Mekhanoshin | 2017-03-24 | 1 | -0/+34
    Such instructions sometimes appear after lowering and folding.
    Differential Revision: https://reviews.llvm.org/D31318
    llvm-svn: 298723
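The V_CNDMASK fold in the entry above (r298723) rests on a simple observation: a conditional select whose two value operands are identical yields that value regardless of the condition. A minimal Python sketch of that observation follows; the `CndMask` record and `fold_cndmask` helper are hypothetical and do not mirror LLVM's MachineInstr API.

```python
from typing import NamedTuple, Optional


class CndMask(NamedTuple):
    """Toy stand-in for a conditional-select instruction."""
    src0: str  # first value operand
    src1: str  # second value operand
    cond: str  # condition that picks between src0 and src1


def fold_cndmask(inst: CndMask) -> Optional[str]:
    """Return the operand to forward if the select is redundant, else None."""
    if inst.src0 == inst.src1:
        # Both arms agree, so the condition cannot affect the result.
        return inst.src0
    return None
```

A caller would replace uses of the select's result with the returned operand and leave the instruction untouched when `None` comes back.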
* [AMDGPU] Rename Kind to ValueKind in metadata to be consistent | Konstantin Zhuravlyov | 2017-03-24 | 1 | -169/+169
    llvm-svn: 298722
* [AMDGPU] Add AMDGPUAliasAnalysis to opt pipeline | Stanislav Mekhanoshin | 2017-03-24 | 1 | -0/+8
    Previously it was added only to the BE.
    Differential Revision: https://reviews.llvm.org/D31323
    llvm-svn: 298721
* [X86][SSE] Add ashr + mask test cases. | Simon Pilgrim | 2017-03-24 | 1 | -0/+26
    Test cases showing cases where we're missing an opportunity to lshr a value with an extended sign to avoid loading a mask
    llvm-svn: 298716
* Fix trellis layout to avoid misidentifying a triangle. | Dehao Chen | 2017-03-23 | 1 | -0/+48
    Summary: For the following CFG:
      A->B
      B->C
      A->C
    If there is another edge B->D, then ABC should not be considered a triangle.
    Reviewers: davidxl, iteratee
    Reviewed By: iteratee
    Subscribers: nemanjai, llvm-commits
    Differential Revision: https://reviews.llvm.org/D31310
    llvm-svn: 298661
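The condition described in the trellis-layout entry above (r298661) can be written as a small predicate. This Python sketch only encodes what the commit message states: A has edges to B and C, B has an edge to C, and B has no other successor. The dictionary-based CFG and the name `is_triangle` are assumptions for illustration; the actual check in LLVM's block placement may involve further conditions.

```python
from typing import Dict, List


def is_triangle(succs: Dict[str, List[str]], a: str, b: str, c: str) -> bool:
    """True if A->B, A->C, B->C exist and C is the only successor of B."""
    a_succs = set(succs.get(a, ()))
    b_succs = set(succs.get(b, ()))
    # An extra edge such as B->D makes b_succs differ from {c}.
    return b in a_succs and c in a_succs and b_succs == {c}


# The CFG from the commit message, and the same CFG with the extra edge B->D.
triangle_cfg = {"A": ["B", "C"], "B": ["C"], "C": []}
not_triangle_cfg = {"A": ["B", "C"], "B": ["C", "D"], "C": [], "D": []}
```

With these inputs, `is_triangle(triangle_cfg, "A", "B", "C")` holds while the variant with B->D is rejected.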
* [Hexagon] Avoid infinite loops in HexagonLoopIdiomRecognition | Krzysztof Parzyszek | 2017-03-23 | 1 | -0/+83
    - Avoid explosive growth of the simplification queue by not queuing expressions that are already in it.
    - Add an iteration counter and abort after a sufficiently large number of iterations (assuming that it's a symptom of an infinite loop).
    llvm-svn: 298655
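The two safeguards listed in the Hexagon entry above (r298655) are general worklist techniques. The sketch below shows them in Python under stated assumptions: `simplify_all`, `IterationLimitExceeded`, and the default limit of 1000 are hypothetical, and raising an exception merely stands in for whatever "abort" means in the actual pass.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, List


class IterationLimitExceeded(RuntimeError):
    """Raised when the worklist runs suspiciously long."""


def simplify_all(roots: Iterable[Hashable],
                 simplify: Callable[[Hashable], Iterable[Hashable]],
                 max_iterations: int = 1000) -> List[Hashable]:
    queue = deque()
    queued = set()  # mirrors the queue contents for O(1) membership tests

    def enqueue(expr: Hashable) -> None:
        # Safeguard 1: never queue an expression that is already waiting.
        if expr not in queued:
            queued.add(expr)
            queue.append(expr)

    for root in roots:
        enqueue(root)

    processed = []
    iterations = 0
    while queue:
        # Safeguard 2: treat a very long run as a likely infinite loop.
        iterations += 1
        if iterations > max_iterations:
            raise IterationLimitExceeded(f"gave up after {max_iterations} iterations")
        expr = queue.popleft()
        queued.discard(expr)
        for follow_up in simplify(expr):
            enqueue(follow_up)
        processed.append(expr)
    return processed
```

The `queued` set tracks only pending items, so an expression can be re-queued after it has been processed; the iteration cap is what bounds that case.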
* [X86] Fix Stale SDNode use in X86ISelDAGtoDAG | Nirav Dave | 2017-03-23 | 1 | -0/+126
    Summary: Fixes pr32329.
    Reviewers: spatel, craig.topper
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D31286
    llvm-svn: 298633
* [ARM] Fix computeKnownBits for ARMISD::CMOV | Pirama Arumuga Nainar | 2017-03-23 | 1 | -0/+19
    Summary: The true and false operands for the CMOV are operands 0 and 1. ARMISelLowering.cpp::computeKnownBits was looking at operands 1 and 2 instead. This can cause CMOV instructions to be incorrectly folded into BFI if the value set by the CMOV is another CMOV, whose known bits are computed incorrectly.
    This patch fixes the issue and adds a test case.
    Reviewers: kristof.beyls, jmolloy
    Subscribers: llvm-commits, aemerson, srhines, rengolin
    Differential Revision: https://reviews.llvm.org/D31265
    llvm-svn: 298624
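To see why the operand indices in the CMOV entry above (r298624) matter, recall that a bit of a select's result is known only if it is known, with the same value, in both value operands. The Python sketch below models that intersection; `KnownBits`, `known_bits_of_constant`, and `known_bits_of_cmov` are hypothetical names, and the operand list layout (values at indices 0 and 1, something else at index 2) follows the commit message rather than LLVM's data structures.

```python
from typing import List, NamedTuple


class KnownBits(NamedTuple):
    zero: int  # mask of bits known to be 0
    one: int   # mask of bits known to be 1


def known_bits_of_constant(value: int, width: int = 32) -> KnownBits:
    """Every bit of a constant is known."""
    mask = (1 << width) - 1
    return KnownBits(zero=~value & mask, one=value & mask)


def known_bits_of_cmov(operands: List[KnownBits]) -> KnownBits:
    """Intersect the known bits of the two value operands (indices 0 and 1)."""
    first, second = operands[0], operands[1]
    return KnownBits(zero=first.zero & second.zero, one=first.one & second.one)
```

Reading indices 1 and 2 instead would mix a non-value operand into the intersection and could report bits as known that either select arm may change.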
* [X86][SSE] Extract elements from narrower shuffle masks. | Simon Pilgrim | 2017-03-23 | 1 | -5/+2
    Add support for widening narrow shuffle masks so we can directly extract from the relevant input vector of the shuffle.
    llvm-svn: 298616
* [PPC] Add generated tests for all atomic operations | Tim Shen | 2017-03-23 | 1 | -0/+9546
    Summary: Add tests for all atomic operations for powerpc64le, so that all changes can be easily examined.
    Reviewers: kbarton, hfinkel, echristo
    Subscribers: mehdi_amini, nemanjai, llvm-commits
    Differential Revision: https://reviews.llvm.org/D31285
    llvm-svn: 298614
* [x86] add memcmp tests, remove run | Sanjay Patel | 2017-03-23 | 1 | -142/+140
    Add tests for vector lengths that could be handled without a libcall. There's an explicit test for 'nobuiltin', so there's not much value in a separate run that checks that same behavior over and over again.
    llvm-svn: 298611
* [GlobalISel][X86] Support G_STORE/G_LOAD operation | Igor Breger | 2017-03-23 | 5 | -0/+1102
    Summary:
    1. Support pointer type as function argument and return value
    2. G_STORE/G_LOAD - set legal action for i8/i16/i32/i64/f32/f64/vec128
    3. RegisterBank - support typeless operations like G_STORE/G_LOAD, for scalar use GPR bank.
    4. Support instruction selection for G_LOAD/G_STORE
    Reviewers: zvi, rovka, ab, qcolombet
    Reviewed By: rovka
    Subscribers: llvm-commits, dberris, kristof.beyls, eladcohen, guyblank
    Differential Revision: https://reviews.llvm.org/D30973
    llvm-svn: 298609
* [SDAG] Fix zeroExtend assertion error | Nirav Dave | 2017-03-23 | 1 | -0/+32
    Move CombineTo preventing deleted node from being returned in visitZERO_EXTEND. Fixes PR32284.
    Reviewers: RKSimon, bogner
    Reviewed By: RKSimon
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D31254
    llvm-svn: 298604
* [X86][SSE] Add computeNumSignBits test for sitofp of (extended) i64 extracted element | Simon Pilgrim | 2017-03-23 | 1 | -0/+28
    llvm-svn: 298592
* [X86][TD][vpmovm2 ] New TD pattern for the vpmovm2 instruction | Michael Zuckerman | 2017-03-23 | 4 | -54/+1720
    Up until now, the vpmovm2 instruction described its destination operand size by the source operand size. This patch adds a new pattern for the vpmovm2 instruction. The node describes a new expansion of the destination (from {128|256} to 512).
    Differential Revision: https://reviews.llvm.org/D30654
    llvm-svn: 298586
* Reapply r298417 "[ARM] Recommit the glueless lowering of addc/adde in Thumb1" | Artyom Skrobov | 2017-03-22 | 1 | -8/+132
    The UB in t2_so_imm_neg conversion has been addressed under D31242 / r298512
    This reverts commit r298482.
    llvm-svn: 298562
* [AMDGPU] Do not emit isa info as code object metadata | Konstantin Zhuravlyov | 2017-03-22 | 4 | -60/+3
    - It was decided to expose this information through other means (rocr)
    Differential Revision: https://reviews.llvm.org/D30970
    llvm-svn: 298560
* [AMDGPU] Emit kernel debug properties as code object metadata | Konstantin Zhuravlyov | 2017-03-22 | 1 | -0/+67
    Differential Revision: https://reviews.llvm.org/D30969
    llvm-svn: 298558
* [AMDGPU] Emit kernel code properties as code object metadata | Konstantin Zhuravlyov | 2017-03-22 | 2 | -3/+35
    - These are not required for low level runtime
    Differential Revision: https://reviews.llvm.org/D29949
    llvm-svn: 298556
* [x86] improve tests, add tests, auto-generate checks; NFC | Sanjay Patel | 2017-03-22 | 1 | -107/+168
    llvm-svn: 298553
* [AMDGPU] Restructure code object metadata creation | Konstantin Zhuravlyov | 2017-03-22 | 8 | -426/+1346
    - Rename runtime metadata -> code object metadata
    - Make metadata not flow
    - Switch enums to use ScalarEnumerationTraits
    - Cleanup and move AMDGPUCodeObjectMetadata.h to AMDGPU/MCTargetDesc
    - Introduce in-memory representation for attributes
    - Code object metadata streamer
    - Create metadata for isa and printf during EmitStartOfAsmFile
    - Create metadata for kernel during EmitFunctionBodyStart
    - Finalize and emit metadata to .note during EmitEndOfAsmFile
    - Other minor improvements/bug fixes
    Differential Revision: https://reviews.llvm.org/D29948
    llvm-svn: 298552
* [ARM] t2_so_imm_neg had a subtle bug in the conversion, and could trigger UB by negating (int)-2147483648. | Artyom Skrobov | 2017-03-22 | 1 | -0/+9
    By pure luck, none of the pre-existing tests triggered this; so I'm adding one.
    Summary: Thanks to Vitaly Buka for helping catch this.
    Reviewers: rengolin, jmolloy, efriedma, vitalybuka
    Subscribers: llvm-commits, aemerson
    Differential Revision: https://reviews.llvm.org/D31242
    llvm-svn: 298512
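The hazard named in the t2_so_imm_neg entry above (r298512) is that the negation of the most negative 32-bit signed value is not representable in 32 bits, which is undefined behavior for a signed `int` in C++. The Python sketch below only illustrates the arithmetic; it does not show how D31242 fixed the conversion, and the helper names are hypothetical. Negating modulo 2**32 is shown as one well-defined alternative.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def negation_fits_int32(value: int) -> bool:
    """True if -value is representable as a 32-bit signed integer."""
    if not INT32_MIN <= value <= INT32_MAX:
        raise ValueError("value is not a 32-bit signed integer")
    return INT32_MIN <= -value <= INT32_MAX


def negate_as_uint32(value: int) -> int:
    """Negate modulo 2**32, which is well defined for every input."""
    return (-value) & 0xFFFFFFFF
```

Only `-2147483648` fails the first check; every other 32-bit signed value negates safely.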