summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* Removing several -Wunused-but-set-variable warnings; NFC intended.Aaron Ballman2015-07-131-26/+0
| | | | llvm-svn: 242028
* AVX-512: Added all AVX-512 forms of Vector Convert for Float/Double/Int/Long ↵Elena Demikhovsky2015-07-134-152/+436
| | | | | | | | | | | | types. In this patch I have only encoding. Intrinsics and DAG lowering will be in the next patch. I temporary removed the old intrinsics test (just to split this patch). Half types are not covered here. Differential Revision: http://reviews.llvm.org/D11134 llvm-svn: 242023
* [ARM] Add support for nest attribute using r12Renato Golin2015-07-121-0/+3
| | | | | | | | | | | | | | | | Register r12 ('ip') is used by GCC for this purpose and hence is used here. As discussed on the GCC mailing list, the register choice is an ABI issue and so choosing the same register as GCC means __builtin_call_with_static_chain is compatible. A similar patch has just gone in the AArch64 backend, so this is just the ARM counterpart, following the same discussion. Patch by Stephen Cross. llvm-svn: 241996
* [X86][SSE] (V)PMINSB is commutable.Simon Pilgrim2015-07-121-3/+0
| | | | | | (V)PMINSB is no different to the other (V)PMIN/(V)PMAX B/D/W instructions - it is fully commutable. llvm-svn: 241994
* Trim trailing whitespaces. NFC.Simon Pilgrim2015-07-121-3/+3
| | | | llvm-svn: 241990
* [X86][SSE] Vectorized v4i32 non-uniform shifts.Simon Pilgrim2015-07-122-23/+70
| | | | | | | | | | While the v4i32 shl operation is already vectorized using a cvttps2dq/pmulld pattern, the lshr/ashr opeations are still scalarized. This patch adds vectorization support for non-uniform v4i32 shift operations - it splats constant shift amounts to allow them to use the immediate sse shift instructions, or extracts/zero-extends non-constant shift amounts. The individual results are then blended together. Differential Revision: http://reviews.llvm.org/D11063 llvm-svn: 241989
* [PowerPC] Make use of the TargetRecip systemHal Finkel2015-07-123-15/+47
| | | | | | | | | | r238842 added the TargetRecip system for controlling use of reciprocal estimates for sqrt and division using a set of parameters that can be set by the frontend. Clang now supports a sophisticated -mrecip option, and this will allow that option to effectively control the relevant code-generation functionality of the PPC backend. llvm-svn: 241985
* [PowerPC] Support the nest parameter attributeHal Finkel2015-07-123-16/+51
| | | | | | | | | | | | | This adds support for the 'nest' attribute, which allows the static chain register to be set for functions calls under non-Darwin PPC/PPC64 targets. r11 is the chain register (which the PPC64 ELF ABI calls the "environment pointer"). For indirect calls under PPC64 ELFv1, this would normally be loaded from the function descriptor, but providing an explicit 'nest' parameter will override that process and use the value provided. This allows __builtin_call_with_static_chain to work as expected on PowerPC. llvm-svn: 241984
* MC: Only allow changing feature bits in MCSubtargetInfoDuncan P. N. Exon Smith2015-07-101-1/+1
| | | | | | | | | | | | | | | | | Disallow all mutation of `MCSubtargetInfo` expect the feature bits. Besides deleting the assignment operators -- which were dead "code" -- this restricts `InitMCProcessorInfo()` to subclass initialization sequences, and exposes a new more limited function called `setDefaultFeatures()` for use by the ARMAsmParser `.cpu` directive. There's a small functional change here: ARMAsmParser used to adjust `MCSubtargetInfo::CPUSchedModel` as a side effect of calling `InitMCProcessorInfo()`, but I've removed that suspicious behaviour. Since the AsmParser shouldn't be doing any scheduling, there shouldn't be any observable change... llvm-svn: 241961
* AMDGPU: Fix chains for memory ops dependent on argument loadsMatt Arsenault2015-07-101-4/+19
| | | | | | | | | | | | | | | Most loads and stores are derived from pointers derived from a kernel argument load inserted during argument lowering. This was just using the EntryToken chain for the argument loads, and any users of these loads were also on the EntryToken chain. Return the chain of the lowered argument load so that dependent loads end up on the correct chain. No test since I'm not aware of any case where this actually broke. llvm-svn: 241960
* MC: Remove MCSubtargetInfo() default constructorDuncan P. N. Exon Smith2015-07-1014-41/+21
| | | | | | | | | | | | | | | | | | | | | Force all creators of `MCSubtargetInfo` to immediately initialize it, merging the default constructor and the initializer into an initializing constructor. Besides cleaning up the code a little, this makes it clear that the initializer is never called again later. Out-of-tree backends need a trivial change: instead of calling: auto *X = new MCSubtargetInfo(); InitXYZMCSubtargetInfo(X, ...); return X; they should call: return createXYZMCSubtargetInfoImpl(...); There's no real functionality change here. llvm-svn: 241957
* MC: Remove MCSubtargetInfo::InitCPUSched()Duncan P. N. Exon Smith2015-07-102-5/+1
| | | | | | | | | | | | Remove all calls to `MCSubtargetInfo::InitCPUSched()` and merge its body into the only relevant caller, `MCSubtargetInfo::InitMCProcessorInfo()`. We were only calling the former after explicitly calling the latter with the same CPU; it's confusing to have both methods exposed. Besides a minor (surely unmeasurable) speedup in ARM and X86 from avoiding running the logic twice, no functionality change. llvm-svn: 241956
* AMDGPU: Use requested chain when lowering argumentsMatt Arsenault2015-07-101-1/+1
| | | | | | | No test since I'm not aware of any case where this will end up being a different chain. llvm-svn: 241954
* ARM: Use SpecificBumpPtrAllocator to fix leak introduced in r241920Matthias Braun2015-07-101-3/+3
| | | | llvm-svn: 241951
* Fix AArch64 prologue for empty frame with dynamic allocas.Evgeniy Stepanov2015-07-101-8/+4
| | | | | | | | Fixes PR23804: assertion failure in emitPrologue in the case of a function with an empty frame and a dynamic alloca that needs stack realignment. This is a typical case for AddressSanitizer. llvm-svn: 241943
* [TTI] BasicTTIImpl assumes no vector registersJingyue Wu2015-07-102-8/+0
| | | | | | | | | | | | | | | | | | | | | | | Summary: Following the discussion on r241884, it's more reasonable to assume that a target has no vector registers by default instead of letting every such target overrides getNumberOfRegisters. Therefore, this patch modifies BasicTTIImpl::getNumberOfRegisters to return 0 when Vector is true, and partially reverts r241884 which modifies NVPTXTTIImpl::getNumberOfRegisters. It also fixes a performance bug in LoopVectorizer. Even if a target has no vector registers, vectorization may still help ILP. So, we need both checks to be false before disabling loop vectorization all together. Reviewers: hfinkel Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11108 llvm-svn: 241942
* ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common codeMatthias Braun2015-07-101-166/+186
| | | | | | | | | | | | This commit factors out common code from MergeBaseUpdateLoadStore() and MergeBaseUpdateLSMultiple() and introduces a new function MergeBaseUpdateLSDouble() which merges adds/subs preceding/following a strd/ldrd instruction into an strd/ldrd instruction with writeback where possible. Differential Revision: http://reviews.llvm.org/D10676 llvm-svn: 241928
* ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2Matthias Braun2015-07-101-29/+96
| | | | | | Differential Revision: http://reviews.llvm.org/D10623 llvm-svn: 241926
* WebAssembly: basic instructions todo, and basic register info.JF Bastien2015-07-1016-19/+331
| | | | | | | | | | | | | | Summary: This code is based on AArch64 for modern backend good practice, and NVPTX for virtual ISA concerns. Reviewers: sunfish Subscribers: aemerson, llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11070 llvm-svn: 241923
* Target RegisterInfo: devirtualize TargetFrameLoweringJF Bastien2015-07-108-61/+50
| | | | | | | | | | | | | Summary: The target frame lowering's concrete type is always known in RegisterInfo, yet it's only sometimes devirtualized through a static_cast. This change adds an auto-generated static function <Target>GenRegisterInfo::getFrameLowering(const MachineFunction &MF) which does this devirtualization, and uses this function in all targets which can. This change was suggested by sunfish in D11070 for WebAssembly, I figure that I may as well improve the other targets while I'm here. Subscribers: sunfish, ted, llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11093 llvm-svn: 241921
* ARMLoadStoreOptimizer: Rewrite LDM/STM matching logic.Matthias Braun2015-07-101-551/+481
| | | | | | | | | | | | | | | | | | | | | This improves the logic in several ways and is a preparation for followup patches: - First perform an analysis and create a list of merge candidates, then transform. This simplifies the code in that you have don't have to care to much anymore that you may be holding iterators to MachineInstrs that get removed. - Analyze/Transform basic blocks in reverse order. This allows to use LivePhysRegs to find free registers instead of the RegisterScavenger. The RegisterScavenger will become less precise in the future as it relies on the deprecated kill-flags. - Return the newly created node in MergeOps so there's no need to look around in the schedule to find it. - Rename some MBBI iterators to InsertBefore to make their role clear. - General code cleanup. Differential Revision: http://reviews.llvm.org/D10140 llvm-svn: 241920
* Actually support volatile memcpys in NVPTX loweringEli Bendersky2015-07-101-8/+10
| | | | | | Differential Revision: http://reviews.llvm.org/D11091 llvm-svn: 241914
* NFC. Added a blank line for consistency.Nemanja Ivanovic2015-07-101-0/+1
| | | | llvm-svn: 241913
* Add missing builtins to the PPC back end for ABI compliance (vol. 3)Nemanja Ivanovic2015-07-101-0/+2
| | | | | | | | | This patch corresponds to review: http://reviews.llvm.org/D10973 Back end portion of the third round of additions to altivec.h. llvm-svn: 241900
* [NVPTX] declare no vector registersJingyue Wu2015-07-102-0/+8
| | | | | | | | | | | | | | | | | Summary: Without this patch, LoopVectorizer in certain cases (see loop-vectorize.ll) produces code with complex control flow which hurts later optimizations. Since NVPTX doesn't have vector registers in LLVM's sense (NVPTXTTI::getRegisterBitWidth(true) == 32), we for now declare no vector registers to effectively disable loop vectorization. Reviewers: jholewinski Subscribers: jingyue, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11089 llvm-svn: 241884
* [WinEH] Make sure LSDA tables are 4 byte alignedReid Kleckner2015-07-101-2/+4
| | | | | | | | | | Apparently this is important, otherwise _except_handler3 assumes that the registration node is corrupted and ignores it. Also fix a bug in WinEHPrepare where we would insert code after a terminator instruction. llvm-svn: 241877
* Replace index-loops by range-based loopsEli Bendersky2015-07-091-6/+3
| | | | | | NFC llvm-svn: 241875
* [x86] enable machine combiner reassociations for scalar double-precision ↵Sanjay Patel2015-07-091-1/+3
| | | | | | multiplies llvm-svn: 241873
* [x86] enable machine combiner reassociations for scalar double-precision addsSanjay Patel2015-07-091-0/+2
| | | | llvm-svn: 241871
* [WinEH] Give up on using CSRs across 32-bit invokes for nowReid Kleckner2015-07-091-2/+17
| | | | | | | | | | | The runtime does not restore CSRs when transferring control back to the function handling the exception. According to the experts on IRC, LLVM's register allocator has no way to model register clobbers that only happen on one edge of the CFG. For now, don't worry about trying to use the meager three CSRs available on 32-bit X86 and just say that such invokes preserve nothing. llvm-svn: 241865
* AMDGPU: Add helper function for implicit parameter offsets.Tom Stellard2015-07-094-4/+28
| | | | | | Patch by: Zoltan Gilian llvm-svn: 241861
* Unbreak WebAssembly buildJF Bastien2015-07-094-26/+6
| | | | | | | | | | | | Summary: D11021 and D11045 didn't update the WebAssembly target's code. It's still experimental so all tests passed. Reviewers: sunfish, joker.eph, echristo Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11084 llvm-svn: 241859
* AMDGPU/R600: Return correct chain when lowering loadsMatt Arsenault2015-07-091-8/+2
| | | | | | The other LowerLOAD should be returning the correct chain. llvm-svn: 241839
* Allow {e,r}bp as the target of {read,write}_register.Pat Gavlin2015-07-0910-15/+42
| | | | | | | | | | This patch allows the read_register and write_register intrinsics to read/write the RBP/EBP registers on X86 iff the targeted register is the frame pointer for the containing function. Differential Revision: http://reviews.llvm.org/D10977 llvm-svn: 241827
* AMDGPU/SI: The SIShrinkInstructions pass should only fold immediates with ↵Tom Stellard2015-07-091-1/+1
| | | | | | | | | one use This is convered by existing testcases and will be exposed by a future commit. llvm-svn: 241817
* AMDGPU/SI: Fix crash on physical registers in SIInstrInfo::isOperandLegal()Tom Stellard2015-07-091-1/+4
| | | | | | | No test case for this. I ran into it while working on some improvements to SIShrinkInstructions.cpp. llvm-svn: 241816
* [Hexagon] Add missing preamble to a source fileKrzysztof Parzyszek2015-07-091-0/+9
| | | | llvm-svn: 241813
* Re-instate the EVT parameter to getScalarShiftAmountTy() for OOT userMehdi Amini2015-07-0914-15/+17
| | | | | | | A documentation for this function would be nice by the way. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241807
* Reapply fixed r241790: Fix shift legalization and lowering for big constants.Pawel Bylica2015-07-091-1/+1
| | | | | | | | | | | | Summary: If shift amount is a constant value > 64 bit it is handled incorrectly during type legalization and X86 lowering. This patch the type of shift amount argument in function DAGTypeLegalizer::ExpandShiftByConstant from unsigned to APInt. Reviewers: nadav, majnemer, sanjoy, RKSimon Subscribers: RKSimon, llvm-commits Differential Revision: http://reviews.llvm.org/D10767 llvm-svn: 241806
* [Hexagon] Add support for atomic RMW operationsKrzysztof Parzyszek2015-07-093-1/+59
| | | | llvm-svn: 241804
* [AArch64] Select SBFIZ or UBFIZ instead of left + right shiftsArnaud A. de Grandmaison2015-07-091-20/+16
| | | | | | And rename LSB to Immr / MSB to Imms to match the ARM ARM terminology. llvm-svn: 241803
* [ARM] Thumb1 3 to 2 operand convertion for commutative operationsScott Douglass2015-07-091-3/+21
| | | | | | Differential Revision: http://reviews.llvm.org/D11057 llvm-svn: 241802
* [ARM] Don't be overzealous converting Thumb1 3 to 2 operandsScott Douglass2015-07-091-0/+5
| | | | | | Differential Revision: http://reviews.llvm.org/D11056 llvm-svn: 241801
* [ARM] Add Thumb2 ADD with PC narrowing from 3 operand to 2Scott Douglass2015-07-091-2/+12
| | | | | | Differential Revision: http://reviews.llvm.org/D11055 llvm-svn: 241800
* [ARM] Refactor converting Thumb1 from 3 to 2 operand (nfc)Scott Douglass2015-07-091-42/+45
| | | | | | | | Also adds some test cases. Differential Revision: http://reviews.llvm.org/D11054 llvm-svn: 241799
* Add support for nest attribute to AArch64 backendRenato Golin2015-07-091-0/+5
| | | | | | | | | | | | | | | The nest attribute is currently supported on the x86 (32-bit) and x86-64 backends, but not on ARM (32-bit) or AArch64. This patch adds support for nest to the AArch64 backend. Register x18 is used by GCC for this purpose and hence is used here. As discussed on the GCC mailing list the register choice is an ABI issue and so choosing the same register as GCC means __builtin_call_with_static_chain is compatible. Patch by Stephen Cross. llvm-svn: 241794
* Revert r241790: Fix shift legalization and lowering for big constants.Pawel Bylica2015-07-091-1/+1
| | | | llvm-svn: 241792
* Fix shift legalization and lowering for big constants.Pawel Bylica2015-07-091-1/+1
| | | | | | | | | | | | Summary: If shift amount is a constant value > 64 bit it is handled incorrectly during type legalization and X86 lowering. This patch the type of shift amount argument in function DAGTypeLegalizer::ExpandShiftByConstant from unsigned to APInt. Reviewers: nadav, majnemer, sanjoy, RKSimon Subscribers: RKSimon, llvm-commits Differential Revision: http://reviews.llvm.org/D10767 llvm-svn: 241790
* Remove getDataLayout() from TargetSelectionDAGInfo (had no users)Mehdi Amini2015-07-0940-294/+27
| | | | | | | | | | | | | | | | | | Summary: Remove empty subclass in the process. This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren, ted Differential Revision: http://reviews.llvm.org/D11045 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241780
* Remove getDataLayout() from TargetLoweringMehdi Amini2015-07-0914-111/+119
| | | | | | | | | | | | | | | | Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11042 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241779
OpenPOWER on IntegriCloud