summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* [X86][SSE] Generalise target shuffle combine of shuffles using variable masksSimon Pilgrim2016-07-111-13/+21
| | | | | | At present the only shuffle with a variable mask we recognise is PSHUFB, which influences if its worth the cost of mask creation/loading of a combined target shuffle with a variable mask. This change sets up the infrastructure to support other shuffles in the future but has no effect yet. llvm-svn: 275059
* [AMDGPU][llvm-mc] Quickfix for r272748 to enable labels in branch instructions.Artem Tamazov2016-07-111-0/+2
| | | | | | | | | | Fixes issue mentioned at: https://github.com/RadeonOpenCompute/LLVM-AMDGPU-Assembler-Extra/issues/13. Lit tests added. Differential Revision: http://reviews.llvm.org/D22133 llvm-svn: 275054
* [mips][microMIPS] Implement LDC1, SDC1, LDC2, SDC2, LWC1, SWC1, LWC2 and ↵Zlatko Buljan2016-07-1115-58/+324
| | | | | | | | SWC2 instructions and add CodeGen support Differential Revision: http://reviews.llvm.org/D18824 llvm-svn: 275050
* AVX-512: DAG lowering for scalar MIN/MAX commutable opsElena Demikhovsky2016-07-111-3/+36
| | | | | | | | DAG lowering was missing for the scalar FMINC, FMAXC nodes. The nodes are generated only in the "unsafe-fp-math" mode. Added tests. llvm-svn: 275048
* [AVX512] Add support for 512-bit ANDN now that all ones build vectors ↵Craig Topper2016-07-111-1/+2
| | | | | | survive long enough to allow the matching. llvm-svn: 275046
* [AVX512] Use vpternlog with an immediate of 0xff to create 512-bit all one ↵Craig Topper2016-07-113-5/+19
| | | | | | vectors. llvm-svn: 275045
* [X86] Add the AVX512 SET0 pseudos to foldMemoryOperandImpl since they are ↵Craig Topper2016-07-112-3/+14
| | | | | | | | marked for CanFoldAsLoad. I don't really know how to test this. llvm-svn: 275044
* [X86][SSE] Relax type assertions for matchVectorShuffleAsInsertPSSimon Pilgrim2016-07-101-2/+4
| | | | | | Calls to matchVectorShuffleAsInsertPS only need to ensure the inputs are 128-bit vectors. Only lowerVectorShuffleAsInsertPS needs to ensure that they are v4f32. llvm-svn: 275028
* AMDGPU/R600: Add implicitarg.ptr intrinsicJan Vesely2016-07-103-6/+12
| | | | | | Differential Revision: http://reviews.llvm.org/D21622 llvm-svn: 275024
* [X86][SSE] Add support for target shuffle combining to PSHUFLW/PSHUFHWSimon Pilgrim2016-07-101-3/+48
| | | | llvm-svn: 275022
* [SystemZ] Utilize Test Data Class instructions.Marcin Koscielnicki2016-07-106-4/+432
| | | | | | | | | | | This adds a new SystemZ-specific intrinsic, llvm.s390.tdc.f(32|64|128), which maps straight to the test data class instructions. A new IR pass is added to recognize instructions that can be converted to TDC and perform the necessary replacements. Differential Revision: http://reviews.llvm.org/D21949 llvm-svn: 275016
* Give helper classes/functions internal linkage. NFC.Benjamin Kramer2016-07-101-9/+14
| | | | llvm-svn: 275014
* [AVX512] Add support for lowering to 512-bit SHUFPS.Craig Topper2016-07-101-4/+8
| | | | llvm-svn: 275011
* [X86][SSE] Add support for target shuffle combining to INSERTPSSimon Pilgrim2016-07-091-10/+48
| | | | llvm-svn: 274990
* [X86][SSE] Use scaleShuffleMask helper. NFCI.Simon Pilgrim2016-07-091-11/+2
| | | | llvm-svn: 274988
* [lanai] Treat .t as optional in assembly parser for RR operands and add ↵Jacques Pienaar2016-07-092-9/+38
| | | | | | predicate operand to ShiftRR llvm-svn: 274980
* AMDGPU: Move R600 only pieces into R600 classesMatt Arsenault2016-07-0911-109/+81
| | | | llvm-svn: 274979
* Revert "AMDGPU: Remove unused control flow intrinsic"Matt Arsenault2016-07-097-0/+32
| | | | llvm-svn: 274978
* AMDGPU: Prune AMDGPUAsmParser in libdeps.NAKAMURA Takumi2016-07-091-1/+1
| | | | llvm-svn: 274970
* AMDGPU: Fix fdiv lowering when f32 denormals supportedMatt Arsenault2016-07-091-5/+3
| | | | | | Also fix test not actually using function labels. llvm-svn: 274969
* [X86] Remove and autoupgrade 512-bit non-temporal store intrinsics.Craig Topper2016-07-092-25/+1
| | | | llvm-svn: 274966
* AMDGPU: Improve offset folding for register indexingMatt Arsenault2016-07-095-45/+102
| | | | llvm-svn: 274954
* AMDGPU: Simplify isSchedulingBoundaryMatt Arsenault2016-07-091-5/+4
| | | | llvm-svn: 274953
* Lanai: Avoid implicit iterator conversions, NFCDuncan P. N. Exon Smith2016-07-081-6/+6
| | | | | | | Avoid implicit conversions from MachineInstrBundleIterator to MachineInstr* in the Lanai backend. llvm-svn: 274942
* AMDGPU: Remove unused control flow intrinsicMatt Arsenault2016-07-087-32/+0
| | | | llvm-svn: 274939
* MSP430: Avoid implicit iterator conversions, NFCDuncan P. N. Exon Smith2016-07-084-32/+31
| | | | | | | | Avoid implicit conversions from MachineInstrBundleIIterator to MachineInstr* in the MSP430 backend by preferring MachineInstr& over MachineInstr* when a pointer isn't nullable. llvm-svn: 274933
* NVPTX: Avoid implicit iterator conversions, NFCDuncan P. N. Exon Smith2016-07-083-22/+21
| | | | | | | | | | | | | | | | | Avoid implicit conversions from MachineInstrBundleIterator to MachineInstr* in the NVPTX backend, mainly by preferring MachineInstr& over MachineInstr* when a pointer isn't nullable and using range-based for loops. There was one piece of questionable code in NVPTXInstrInfo::AnalyzeBranch, where a condition checked a pointer converted from an iterator for nullptr. Since this case is impossible (moreover, the code above guarantees that the iterator is valid), I removed the check when I changed the pointer to a reference. Despite that case, there should be no functionality change here. llvm-svn: 274931
* AArch64: Avoid implicit iterator conversions, NFCDuncan P. N. Exon Smith2016-07-088-222/+217
| | | | | | | | Avoid implicit conversions from MachineInstrBundleInstr to MachineInstr* in the AArch64 backend, mainly by preferring MachineInstr& over MachineInstr* when a pointer isn't nullable. llvm-svn: 274924
* ARM: Remove implicit iterator conversions, NFCDuncan P. N. Exon Smith2016-07-088-135/+124
| | | | | | | | | | | Remove remaining implicit conversions from MachineInstrBundleIterator to MachineInstr* from the ARM backend. In most cases, I made them less attractive by preferring MachineInstr& or using a ranged-based for loop. Once all the backends are fixed I'll make the operator explicit so that this doesn't bitrot back. llvm-svn: 274920
* Sparc: Avoid implicit iterator conversions, NFCDuncan P. N. Exon Smith2016-07-081-3/+3
| | | | | | | Remove the only implicit conversions from MachineInstrBundleIterator to MachineInstr* in the Sparc backend. llvm-svn: 274913
* WebAssembly: Avoid implicit iterator conversions, NFCDuncan P. N. Exon Smith2016-07-083-33/+30
| | | | | | | | Avoid implicit conversions from MachineInstrBundleIterator to MachineInstr* in the WebAssembly backend by preferring MachineInstr& over MachineInstr*. llvm-svn: 274912
* [X86][AVX2] Add support for target shuffle combining to VPERMPD/VPERMQSimon Pilgrim2016-07-081-2/+19
| | | | llvm-svn: 274908
* AMDGPU: Remove implicit iterator conversions, NFCDuncan P. N. Exon Smith2016-07-089-172/+176
| | | | | | | | | | | Remove remaining implicit conversions from MachineInstrBundleIterator to MachineInstr* from the AMDGPU backend. In most cases, I made them less attractive by preferring MachineInstr& or using a ranged-based for loop. Once all the backends are fixed I'll make the operator explicit so that this doesn't bitrot back. llvm-svn: 274906
* AMDGPU: Make infinite loop clear, NFCDuncan P. N. Exon Smith2016-07-081-1/+2
| | | | | | | | | | | | | | | Change a while loop that was checking for nullptr on an iterator-to-pointer conversion to an infinite for loop. Now it's clear that the condition doesn't terminate. The only change in behaviour is if an invalid iterator (holding nullptr) was passed into AMDGPUCFGStructurizer::reversePredicateSetter. There are only two callers, and they both dereference the iterator before sending it in, so rather than adding an early return to avoid the loop I've just asserted (using a static_cast, to avoid an implicit conversion to pointer). llvm-svn: 274902
* Target: Avoid getFirstTerminator() => pointer, NFCDuncan P. N. Exon Smith2016-07-083-10/+10
| | | | | | | | | | | | | | Stop using an implicit conversion from the return of MachineBasicBlock::getFirstTerminator to MachineInstr*. In two cases, directly dereference to a MachineInstr& since later code assumes it's valid. In a third case, change to an iterator since later code checks against MachineBasicBlock::end. Although the fix for the third case avoids undefined behaviour, I expect this doesn't cause a functionality change in practice (since the basic block already has a terminator). llvm-svn: 274898
* AMDGPU: Minor adjustment to r274817Matt Arsenault2016-07-081-1/+1
| | | | | | | | | | The commit message is inaccurate, modifiesRegister will check for partial defs of exec. We currently don't ever emit partial defs of exec, so it doesn't really matter. llvm-svn: 274886
* [SystemZ] Add support for the .word directive.Zhan Jun Liau2016-07-081-0/+3
| | | | | | | | | | | | | Summary: Branch off the work to add support for the .word directive, using addAliasForDirective. Reviewers: koriakin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22142 llvm-svn: 274878
* [SystemZ] Add support for missing instructionsZhan Jun Liau2016-07-082-4/+93
| | | | | | | | | | | | | | | | | Summary: Add support to allow clang integrated assembler to recognize some missing instructions, for openssl. Instructions are: LM, LMH, LMY, STM, STMH, STMY, ICM, ICMH, ICMY, SLA, SLAK, TML, TMH, EX, EXRL. Reviewers: uweigand Subscribers: koriakin, llvm-commits Differential Revision: http://reviews.llvm.org/D22050 llvm-svn: 274869
* [Sparc] Leon errata fix passes.Chris Dewhurst2016-07-088-173/+869
| | | | | | | | | | | | Errata fixes for various errata in different versions of the Leon variants of the Sparc 32 bit processor. The nature of the errata are listed in the comments preceding the errata fix passes. Relevant unit tests are implemented for each of these. Note: Running clang-format has changed a few other lines too, unrelated to the implemented errata fixes. These have been left in as this keeps the code formatting consistent. Differential Revision: http://reviews.llvm.org/D21960 llvm-svn: 274856
* [AMDGPU] fix ds_swizzle_b32 opcode for VI (bz 28371)Valery Pykhtin2016-07-082-1/+21
| | | | | | Differential Revision: http://reviews.llvm.org/D22049 llvm-svn: 274852
* [AArch64] Macro fusion of simple ALU ops with branches for Broadcom's VulcanPankaj Gode2016-07-081-0/+1
| | | | | | | | | | Support for the macro fusion of simple ALU ops with branches for the Vulcan sub-target. Patch by Meador Inge <meadori@gmail.com> Differential Revision: http://reviews.llvm.org/D22042 llvm-svn: 274837
* [X86][SSE] Accept any shuffle mask that is all zeroesSimon Pilgrim2016-07-081-0/+7
| | | | | | Until we have a better way to extract constants through bitcasted build vectors (and how to handle undefs of partial lanes etc.) at least accept build vectors that are all zeroes. llvm-svn: 274833
* [AVX512] Remove and autoupgrade a duplicate set of 512-bit masked shift ↵Craig Topper2016-07-082-14/+1
| | | | | | | | intrinsics. I'm not sure if clang ever used these builtin names or not. llvm-svn: 274827
* AMDGPU: Move si_mask_branch register operand to be a useMatt Arsenault2016-07-083-6/+8
| | | | llvm-svn: 274818
* AMDGPU: Cleanup. Use definesRegister instead of manual loopMatt Arsenault2016-07-081-6/+2
| | | | | | | Also this will be more precise since it will check exec_lo/exec_hi writes. llvm-svn: 274817
* ARM: support high registers in __builtin_longjmp on WoASaleem Abdulrasool2016-07-082-4/+34
| | | | | | | | | | Windows on ARM uses a pure thumb-2 environment. This means that it can select a high register when doing a __builtin_longjmp. We would use a tLDRi which would truncate the register to a low register. Use a t2LDRi12 to get the full register file access. Tweak the code to just load into PC, as that is an interworking branch on all supported cores anyways. llvm-svn: 274815
* [lanai] Use peephole optimizer to generate more conditional ALU operations.Jacques Pienaar2016-07-0715-364/+707
| | | | | | | | | | | | | | | | | Summary: * Similiar to the ARM backend yse the peephole optimizer to generate more conditional ALU operations; * Add predicated type with default always true to RR instructions in LanaiInstrInfo.td; * Move LanaiSetflagAluCombiner into optimizeCompare; * The ASM parser can currently only handle explicitly specified CC, so specify ".t" (true) where needed in the ASM test; * Remove unused MachineOperand flags; Reviewers: eliben Subscribers: aemerson Differential Revision: http://reviews.llvm.org/D22072 llvm-svn: 274807
* Recommit r274692 - [X86] Transform setcc + movzbl into xorl + setccMichael Kuperstein2016-07-074-1/+192
| | | | | | | | | | | xorl + setcc is generally the preferred sequence due to the partial register stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller. This fixes PR28146. The original commit tried inserting an 8bit-subreg into a GR32 (not GR32_ABCD) which was not appreciated by fast regalloc on 32-bit. llvm-svn: 274802
* [AArch64] Change the preferred alignment for char and short to word alignment.Chad Rosier2016-07-071-2/+2
| | | | | | | | | | The commit reinstates r273279, which was informally approved. Original Review: http://reviews.llvm.org/D21414 This reverts commit ca632c91aaa7cafc50942f890c49f727a046ace1. llvm-svn: 274790
* Revert r274692 to check whether this is what breaks windows selfhost.Michael Kuperstein2016-07-074-189/+1
| | | | llvm-svn: 274771
OpenPOWER on IntegriCloud