path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* [GlobalISel][AArch64] Select CBZ. | Ahmed Bougacha | 2017-03-27 | 2 | -3/+53
  CBZ/CBNZ represent a substantial portion of all conditional branches. Look through G_ICMP to select them. We can't use tablegen yet because the existing patterns match an AArch64ISD node.
  llvm-svn: 298856
* [AMDGPU][MC] Fix for Bug 28207 + LIT tests | Dmitry Preobrazhensky | 2017-03-27 | 5 | -17/+95
  Enabled clamp and omod for v_cvt_* opcodes whose src0 is of an integer type.
  Reviewers: vpykhtin, arsenm
  Differential Revision: https://reviews.llvm.org/D31327
  llvm-svn: 298852
* [AArch64] Mark mrs of TPIDR_EL0 (thread pointer) as not having side effects. | Chad Rosier | 2017-03-27 | 2 | -2/+12
  Among other things, this allows Machine LICM to hoist a costly 'mrs' instruction from within a loop.
  Differential Revision: http://reviews.llvm.org/D31151
  llvm-svn: 298851
* [AMDGPU] Get address space mapping by target triple environment | Yaxun Liu | 2017-03-27 | 39 | -290/+446
  As we introduced the target triple environments amdgiz and amdgizcl, the address space values are no longer enums. We have to decide the value by target triple.
  The basic idea is to use struct AMDGPUAS to represent address space values. For address space values that do not depend on the target triple, use static const members, so that they don't occupy extra memory space and are equivalent to compile-time constants.
  Since the struct is lightweight and cheap, it can be created on the fly at the point of usage. Or it can be added as a member to a pass and created at the beginning of the run* function.
  Differential Revision: https://reviews.llvm.org/D31284
  llvm-svn: 298846
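The struct layout described in this message can be sketched roughly as follows. This is only an illustration of the idea, not the backend's actual AMDGPUAS definition: the struct name `AddrSpaceMap`, the `GenericIsZero` flag, the helper `isFlatPointer`, and all numeric values are assumptions chosen for the example.

```cpp
// Illustrative sketch only: the names and numeric values below are
// placeholders, not the definitions used in the AMDGPU backend.
struct AddrSpaceMap {
  // Values that do not depend on the triple: compile-time constants
  // that take no per-object storage.
  static constexpr unsigned Global = 1;
  static constexpr unsigned Local = 3;

  // Values that depend on the triple environment: ordinary members.
  unsigned Flat;
  unsigned Private;

  // GenericIsZero stands in for an environment such as amdgiz/amdgizcl,
  // where the generic (flat) address space is 0.
  explicit constexpr AddrSpaceMap(bool GenericIsZero)
      : Flat(GenericIsZero ? 0u : 4u), Private(GenericIsZero ? 5u : 0u) {}
};

// The struct is small, so it can be built at the point of use.
inline bool isFlatPointer(unsigned AddrSpace, bool GenericIsZero) {
  return AddrSpace == AddrSpaceMap(GenericIsZero).Flat;
}
```

Because the triple-independent values are static, an object of this type carries only the two triple-dependent fields, which is what makes creating it on the fly cheap.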
* [X86][AVX2] bugzilla bug 21281 Performance regression in vector interleave in AVX2 | Gadi Haber | 2017-03-27 | 1 | -0/+33
  This is a patch for the ongoing bugzilla bug 21281 on the generated X86 code for a matrix transpose8x8 subroutine which requires vector interleaving. The generated code in AVX2 is currently non-optimal and requires 60 instructions as opposed to only 40 instructions generated for AVX1.
  The patch includes a fix for the AVX2 case where vector unpack instructions use fewer operations than the vector blend operations available in AVX2. In this case using vector unpack instructions is more efficient.
  Reviewers: zvi delena igorb craig.topper guyblank eladcohen m_zuckerman aymanmus RKSimon
  llvm-svn: 298840
* [Target] Remove some code probably copy/pasted from another backend. | Davide Italiano | 2017-03-26 | 1 | -4/+0
  llvm-svn: 298825
* [MachineScheduler] Reference the correct header. | Davide Italiano | 2017-03-26 | 1 | -1/+1
  llvm-svn: 298823
* [X86][SSE] Add computeKnownBitsForTargetNode support for (V)PSLL/(V)PSRL instructions | Simon Pilgrim | 2017-03-26 | 1 | -1/+26
  llvm-svn: 298806
* [X86][AVX512F] Fix reg class for VMOVSSZrr/VMOVSSZrrk and VMOVSDZrr/VMOVSDZrrk | Simon Pilgrim | 2017-03-26 | 1 | -11/+10
  Fixed -verify-machineinstrs errors in fast-isel-select-sse.ll (one of many in PR27481).
  The VMOVSSZrr/VMOVSSZrrk and VMOVSDZrr/VMOVSDZrrk instructions were assuming both source registers were V128X when the second is actually supposed to be FR32X/FR64X.
  Differential Revision: https://reviews.llvm.org/D31200
  llvm-svn: 298805
* [GlobalISel][X86] support G_FRAME_INDEX instruction selection. | Igor Breger | 2017-03-26 | 5 | -5/+44
  Summary: Support G_FRAME_INDEX instruction selection.
  Reviewers: zvi, rovka, ab, qcolombet
  Reviewed By: ab
  Subscribers: llvm-commits, dberris, kristof.beyls, eladcohen, guyblank
  Differential Revision: https://reviews.llvm.org/D30980
  llvm-svn: 298800
* [X86] Pull out repeated ScalarValueSizeInBits code. NFCI. | Simon Pilgrim | 2017-03-25 | 1 | -6/+4
  llvm-svn: 298783
* [X86][SSE] Combine (VSRLI (VSRAI X, Y), (NumSignBits-1)) -> (VSRLI X, (NumSignBits-1)) | Simon Pilgrim | 2017-03-25 | 1 | -1/+9
  Part 3 of 3.
  Differential Revision: https://reviews.llvm.org/D31347
  llvm-svn: 298782
* [X86][SSE] Added ComputeNumSignBitsForTargetNode support for (V)PSRAI | Simon Pilgrim | 2017-03-25 | 1 | -0/+9
  Part 2 of 3.
  Differential Revision: https://reviews.llvm.org/D31347
  llvm-svn: 298780
* [X86][SSE] Generalised CMP+AND1 combine to ZERO/ALLBITS+MASK | Simon Pilgrim | 2017-03-25 | 1 | -26/+22
  Patch to generalize combinePCMPAnd1 (for handling SETCC + ZEXT cases) to work for any input that has zero/all bits set masked with an 'all low bits' mask.
  Replaced the implicit assumption of shift availability with a call to SupportedVectorShiftWithImm.
  Part 1 of 3.
  Differential Revision: https://reviews.llvm.org/D31347
  llvm-svn: 298779
* [x86] use PMOVMSK to replace memcmp libcalls for 16-byte equality | Sanjay Patel | 2017-03-25 | 2 | -0/+19
  This is the payoff for D31156 - if a target has efficient comparison instructions for vector-sized equality, we can replace memcmp calls with inline code that is both smaller and faster.
  Differential Revision: https://reviews.llvm.org/D31290
  llvm-svn: 298775
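At source level, the inline replacement for a 16-byte equality memcmp might look like the sketch below. It is a hand-written illustration using SSE2 intrinsics, not the code the backend emits; the function name `equal16` is hypothetical, and the portable fallback is included only so the sketch builds on non-SSE2 targets.

```cpp
#include <cstring>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Returns true when the 16 bytes at A and B are identical.
inline bool equal16(const void *A, const void *B) {
#if defined(__SSE2__)
  __m128i VA = _mm_loadu_si128(static_cast<const __m128i *>(A));
  __m128i VB = _mm_loadu_si128(static_cast<const __m128i *>(B));
  // PCMPEQB sets each byte lane to 0xFF where the inputs match;
  // PMOVMSKB gathers the top bit of every lane into a 16-bit mask.
  return _mm_movemask_epi8(_mm_cmpeq_epi8(VA, VB)) == 0xFFFF;
#else
  return std::memcmp(A, B, 16) == 0; // portable fallback
#endif
}
```

The mask equals 0xFFFF only when all sixteen byte lanes compared equal, so a single compare against that constant answers the equality question without a library call.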
* [AArch64] Refine Falkor Machine Model - Part1 | Balaram Makam | 2017-03-25 | 3 | -88/+422
  llvm-svn: 298768
* [AMDGPU] Switch data layout by triple environment amdgiz | Yaxun Liu | 2017-03-25 | 1 | -1/+6
  Switch data layout by the target triple environments amdgiz and amdgizcl, which indicate the use of an address space mapping in which the generic address space is 0.
  amdgiz is for non-OpenCL environments where the generic address space is 0.
  amdgizcl is for OpenCL environments where the generic address space is 0.
  Differential Revision: https://reviews.llvm.org/D31211
  llvm-svn: 298758
* [ARM] Fix mixup between Lo and Hi in SMLALBB formation. | Eli Friedman | 2017-03-25 | 1 | -4/+4
  llvm-svn: 298752
* [Outliner] Revert r298734. | Jessica Paquette | 2017-03-24 | 1 | -1/+1
  When I tested r298734, I thought that red zones were enabled by default like in X86. Since red zones are behind a flag on AArch64, the testing wasn't valid.
  llvm-svn: 298747
* AMDGPU: Fix annotating loops with nested loop conditions | Matt Arsenault | 2017-03-24 | 1 | -9/+21
  If the branch condition for a loop was a phi which itself was fed from a phi from a loop, it isn't safe to try to delete the phi until after the loop is handled.
  llvm-svn: 298737
* [Outliner] Remove no red zone requirement for AArch64 | Jessica Paquette | 2017-03-24 | 1 | -1/+1
  AArch64 doesn't require -mno-red-zone; stack fixups are sufficient here. This was unnecessarily copied over from the X86 target. (You can now outline with red zones! Yay!)
  Removing the requirement passes all Single/MultiSource tests.
  llvm-svn: 298734
* AMDGPU: Implement f16 fround | Matt Arsenault | 2017-03-24 | 3 | -14/+20
  llvm-svn: 298730
* AMDGPU: Unify divergent function exits. | Matt Arsenault | 2017-03-24 | 7 | -15/+254
  StructurizeCFG can't handle cases with multiple returns creating regions with multiple exits. Create a copy of UnifyFunctionExitNodes that unifies exit nodes but skips those with uniform branch sources.
  llvm-svn: 298729
* TTI: Split IsSimple in MemIntrinsicInfo | Matt Arsenault | 2017-03-24 | 1 | -4/+0
  All this did before was assert in EarlyCSE.
  llvm-svn: 298724
* [AMDGPU] Fold V_CNDMASK with identical source operands | Stanislav Mekhanoshin | 2017-03-24 | 1 | -0/+29
  Such instructions sometimes appear after lowering and folding.
  Differential Revision: https://reviews.llvm.org/D31318
  llvm-svn: 298723
* [AMDGPU] Rename Kind to ValueKind in metadata to be consistent | Konstantin Zhuravlyov | 2017-03-24 | 2 | -2/+2
  llvm-svn: 298722
* [AMDGPU] Add AMDGPUAliasAnalysis to opt pipeline | Stanislav Mekhanoshin | 2017-03-24 | 1 | -1/+24
  Previously it was added only to the backend.
  Differential Revision: https://reviews.llvm.org/D31323
  llvm-svn: 298721
* [AMDGPU] Don't enforce constexpr, there are still old standard libraries around that don't have a constexpr std::pair. | Benjamin Kramer | 2017-03-24 | 1 | -4/+4
  llvm-svn: 298719
* [AMDGPU] Remove double map lookups in SI scheduler | Valery Pykhtin | 2017-03-24 | 1 | -25/+8
  Patch by Axel Davy (axel.davy@normalesup.org)
  Differential revision: https://reviews.llvm.org/D30382
  llvm-svn: 298718
* [AMDGPU] Fix SGPR usage count in SI scheduler | Valery Pykhtin | 2017-03-24 | 1 | -2/+2
  Patch by Axel Davy (axel.davy@normalesup.org)
  Differential revision: https://reviews.llvm.org/D30149
  llvm-svn: 298710
* [AMDGPU] Add a new line after a debug message | Valery Pykhtin | 2017-03-24 | 1 | -0/+1
  Patch by Axel Davy (axel.davy@normalesup.org)
  Differential revision: https://reviews.llvm.org/D30146
  llvm-svn: 298708
* [X86][SSE] Generalised lowerTruncate by PACKSS to work with any 'zero/all bits' result, not just comparisons. | Simon Pilgrim | 2017-03-24 | 1 | -17/+19
  Added vector compare opcodes to X86TargetLowering::ComputeNumSignBitsForTargetNode.
  Covered by existing tests added for D22814.
  llvm-svn: 298704
* Don't build up std::vectors with constant sizes when an array suffices. | Benjamin Kramer | 2017-03-24 | 1 | -2/+6
  NFC.
  llvm-svn: 298701
* [AVR] Fix build after r298178 | Meador Inge | 2017-03-24 | 1 | -9/+9
  r298178 capitalized the fields in `ArgListEntry`. All the official targets were updated accordingly, but as an experimental target AVR was missed.
  llvm-svn: 298677
* [Hexagon] Avoid infinite loops in HexagonLoopIdiomRecognition | Krzysztof Parzyszek | 2017-03-23 | 1 | -13/+29
  - Avoid explosive growth of the simplification queue by not queuing expressions that are already in it.
  - Add an iteration counter and abort after a sufficiently large number of iterations (assuming that it's a symptom of an infinite loop).
  llvm-svn: 298655
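The two safeguards listed in this message can be sketched generically as a duplicate-free worklist plus an iteration cap. The names `UniqueQueue` and `drain`, and the idea of passing the limit as a parameter, are assumptions for illustration and do not reflect the actual HexagonLoopIdiomRecognition code.

```cpp
#include <cstddef>
#include <deque>
#include <set>

// Hypothetical worklist: an item is queued only if it is not already pending.
template <typename T> class UniqueQueue {
  std::deque<T> Queue;
  std::set<T> Pending;

public:
  // Returns false (and queues nothing) when V is already in the queue.
  bool push(const T &V) {
    if (!Pending.insert(V).second)
      return false;
    Queue.push_back(V);
    return true;
  }
  bool empty() const { return Queue.empty(); }
  std::size_t size() const { return Queue.size(); }
  T pop() {
    T V = Queue.front();
    Queue.pop_front();
    Pending.erase(V);
    return V;
  }
};

// Calls Step(item, queue) until the queue drains. Step may push follow-up
// work. Returns false if MaxIters items were processed and work remains,
// which is treated as a symptom of an infinite loop.
template <typename T, typename StepFn>
bool drain(UniqueQueue<T> &Q, StepFn Step, unsigned MaxIters) {
  unsigned Iters = 0;
  while (!Q.empty()) {
    if (Iters++ >= MaxIters)
      return false;
    Step(Q.pop(), Q);
  }
  return true;
}
```

The set keeps the queue from growing with repeated entries, while the counter bounds the total work even when each step keeps producing something new.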
* Kill some trailing whitespace to make some new changes a bit easier. | Eric Christopher | 2017-03-23 | 1 | -12/+12
  llvm-svn: 298637
* [X86] Fix Stale SDNode use in X86ISelDAGtoDAG | Nirav Dave | 2017-03-23 | 1 | -2/+2
  Summary: Fixes pr32329.
  Reviewers: spatel, craig.topper
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D31286
  llvm-svn: 298633
* Remove the subtarget argument from LowerFP_TO_INT since there's one stored on X86TargetLowering. | Eric Christopher | 2017-03-23 | 2 | -9/+4
  llvm-svn: 298628
* Remove unused X86Subtarget argument from getOnesVector. | Eric Christopher | 2017-03-23 | 1 | -7/+6
  llvm-svn: 298627
* [ARM] Fix computeKnownBits for ARMISD::CMOV | Pirama Arumuga Nainar | 2017-03-23 | 1 | -2/+2
  Summary: The true and false operands for the CMOV are operands 0 and 1. ARMISelLowering.cpp::computeKnownBits was looking at operands 1 and 2 instead. This can cause CMOV instructions to be incorrectly folded into BFI if the value set by the CMOV is another CMOV, whose known bits are computed incorrectly.
  This patch fixes the issue and adds a test case.
  Reviewers: kristof.beyls, jmolloy
  Subscribers: llvm-commits, aemerson, srhines, rengolin
  Differential Revision: https://reviews.llvm.org/D31265
  llvm-svn: 298624
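To see why the operand indices matter, consider how known bits of a conditional move are usually derived: a bit is known only if both candidate values agree on it. The sketch below is a simplified stand-alone model; `KnownBits32`, `knownBitsOfSelect`, and `knownBitsOfCMov` are hypothetical names, not LLVM's API.

```cpp
#include <cstdint>

// Simplified model of known-bits information for a 32-bit value:
// a set bit in Zero means "known to be 0", in One "known to be 1".
struct KnownBits32 {
  uint32_t Zero;
  uint32_t One;
};

// A select/CMOV yields one of two values, so only bits known identically
// in both inputs remain known in the result.
inline KnownBits32 knownBitsOfSelect(KnownBits32 A, KnownBits32 B) {
  return {A.Zero & B.Zero, A.One & B.One};
}

// Per the commit message, the two value operands are operands 0 and 1.
// Reading operands 1 and 2 would mix in an unrelated operand.
inline KnownBits32 knownBitsOfCMov(const KnownBits32 *Ops) {
  return knownBitsOfSelect(Ops[0], Ops[1]);
}
```

If the intersection is taken over the wrong pair of operands, the result can claim bits are known when they are not, which is the kind of error that lets a later combine fold the instruction incorrectly.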
* [X86][SSE] Extract elements from narrower shuffle masks. | Simon Pilgrim | 2017-03-23 | 1 | -14/+21
  Add support for widening narrow shuffle masks so we can directly extract from the relevant input vector of the shuffle.
  llvm-svn: 298616
* [GlobalISel][X86] Support G_STORE/G_LOAD operation | Igor Breger | 2017-03-23 | 7 | -74/+254
  Summary:
  1. Support pointer type as function argument and return value
  2. G_STORE/G_LOAD - set legal action for i8/i16/i32/i64/f32/f64/vec128
  3. RegisterBank - support typeless operations like G_STORE/G_LOAD, for scalar use GPR bank.
  4. Support instruction selection for G_LOAD/G_STORE
  Reviewers: zvi, rovka, ab, qcolombet
  Reviewed By: rovka
  Subscribers: llvm-commits, dberris, kristof.beyls, eladcohen, guyblank
  Differential Revision: https://reviews.llvm.org/D30973
  llvm-svn: 298609
* X86FixupBWInsts: Minor cleanup. NFC | Zvi Rackover | 2017-03-23 | 1 | -35/+14
  Summary: Clean up some remnants of code from when the X86FixupBWInsts pass did both forward liveness analysis and backward liveness analysis.
  Reviewers: MatzeB, myatsina, DavidKreitzer
  Reviewed By: MatzeB
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D31264
  llvm-svn: 298599
* [Mips] Emit the correct DINS variant | Strahinja Petrovic | 2017-03-23 | 1 | -8/+11
  This patch fixes emission of the correct variant of the DINS instruction.
  Differential Revision: https://reviews.llvm.org/D30988
  llvm-svn: 298596
* [X86][SSE] Tidyup canWidenShuffleElements. NFCI. | Simon Pilgrim | 2017-03-23 | 1 | -12/+13
  Pull out mask elements at the start, allowing us to make the widening pattern matching more readable.
  llvm-svn: 298594
* [Mips] Fix for decoding DINS instruction - disassembler | Strahinja Petrovic | 2017-03-23 | 1 | -1/+8
  This patch fixes decoding of size and position for DINSM and DINSU instructions.
  Differential Revision: https://reviews.llvm.org/D31072
  llvm-svn: 298593
* [GlobalISel][X86] clang-format. NFC | Igor Breger | 2017-03-23 | 5 | -21/+18
  llvm-svn: 298590
* [X86][TD][vpmovm2 ] New TD pattern for the vpmovm2 instruction | Michael Zuckerman | 2017-03-23 | 2 | -22/+24
  Up until now, the vpmovm2 instruction described its destination operand size by the source operand size. This patch adds a new pattern for the vpmovm2 instruction. The node describes the new expansion of the destination (from {128|256} to 512).
  Differential Revision: https://reviews.llvm.org/D30654
  llvm-svn: 298586
* [ARM] Reduce code duplication by factoring out in a lambda. NFCI. | Davide Italiano | 2017-03-23 | 1 | -16/+11
  llvm-svn: 298572
* [AArch64] Drive-by cleanup, make this code shorter. NFCI. | Davide Italiano | 2017-03-22 | 1 | -3/+1
  llvm-svn: 298563