path: root/llvm/lib/Target
Commit message  Author  Age  Files  Lines
* AMDGPU: Delete dead codeMatt Arsenault2016-07-254-41/+0
| | | | llvm-svn: 276675
* [MC] Provide an MCTargetOptions to implementors of MCAsmBackendCtorTy, NFCJoel Jones2016-07-2520-44/+93
| | | | | | | | | | | | | | | Some targets, notably AArch64 for ILP32, have different relocation encodings based upon the ABI. This is an enabling change, so a future patch can use the ABIName from MCTargetOptions to choose which relocations to use. Tested using check-llvm. The corresponding change to clang is in: http://reviews.llvm.org/D16538 Patch by: Joel Jones Differential Revision: https://reviews.llvm.org/D16213 llvm-svn: 276654
* AVX-512: Fixed [US]INT_TO_FP selection for i1 vectors.Elena Demikhovsky2016-07-251-0/+21
| | | | | | | | It failed with assertion before this patch. Differential Revision: https://reviews.llvm.org/D22735 llvm-svn: 276648
* [Hexagon] Add target feature to generate long callsKrzysztof Parzyszek2016-07-256-29/+73
| | | | llvm-svn: 276638
* [ARM] Improve longMAC codegen testSam Parker2016-07-252-1/+6
| | | | | | | | Added thumb targets and dataflow checks to the longMAC test. Differential Revision: https://reviews.llvm.org/D22684 llvm-svn: 276629
* [mips] Optimize materialization of i64 constantsSimon Dardis2016-07-255-13/+49
| | | | | | | | | | | | | | | | | Avoid MipsAnalyzeImmediate usage if the constant fits in a 32-bit integer. This allows us to generate the same instructions for the materialization of the same constants regardless of the width of their type. Patch by: Vasileios Kalintiris Contributions by: Simon Dardis Reviewers: Daniel Sanders Differential Review: https://reviews.llvm.org/D21689 llvm-svn: 276628
* [ARM] Small refactor of Thumb2 SMLA instsSam Parker2016-07-251-12/+8
| | | | | | | | | Follow up to r276624. Changes bits 22-20 to be parameters to instruction class. Differential Revision: https://reviews.llvm.org/D22562 llvm-svn: 276626
* [ARM] Enable ISel of SMMLS for ARM and Thumb2Sam Parker2016-07-252-5/+34
| | | | | | | | Use ISelDAGToDAG to recognise the SMMLS instruction pattern. Differential Revision: https://reviews.llvm.org/D22562 llvm-svn: 276624
* [AVX512] Add load folding support for the unmasked forms of the FMA ↵Craig Topper2016-07-251-0/+144
| | | | | | instructions. llvm-svn: 276615
* [AVX512] Add some additional patterns so that we can fold broadcast loads in ↵Craig Topper2016-07-251-45/+76
| | | | | | the first argument of an FMADD/FMSUB/FNMADD/FNMSUB/FMADDSUB/FMSUBADD node. Also add patterns to support all combinations of the broadcast input and the preserved input for masked versions. llvm-svn: 276614
* [AVX512] Cleanup FMA operand order in patterns to match the VEX versions and ↵Craig Topper2016-07-251-19/+19
| | | | | | to really be 213, 231, and 132. llvm-svn: 276613
* [X86] Add 'FeatureSlowSHLD' to cpu 'bdver4'Simon Pilgrim2016-07-241-0/+1
| | | | | | As with all AMD CPUs, excavator has poor SHLD/SHRD performance. Also added bdver3 to the test as it was missing. llvm-svn: 276569
* [X86] Make the FMA3 instruction names consistent between VEX and EVEX ↵Craig Topper2016-07-244-407/+383
| | | | | | | | encoded versions. This places the 132/213/231 form number in front of the SS/SD/PS/PD. Move the Y for 256-bit versions to be after the PS/PD. Change the AVX512 scalar forms to include a Z in their name. This new format should be consistent with the general naming of instructions. llvm-svn: 276559
* [X86] Replace CodeGenOnly VPSRAVW/D/Q_Int instructions with patterns since ↵Craig Topper2016-07-244-8/+87
| | | | | | the operand types exactly match the normal VPSRAVW/D/Q instructions. llvm-svn: 276555
* [X86] Fix typo in comment.Craig Topper2016-07-231-1/+1
| | | | llvm-svn: 276528
* Fix a GCC error due to this member name also being a type name. ThisChandler Carruth2016-07-231-3/+3
| | | | | | | should fix the build with GCC 4.9 at least. Not sure if this is the right name or fix, but I've followed up on the original commit. llvm-svn: 276522
* [AVX512] Implement commuting support for EVEX encoded FMA3 instructions.Craig Topper2016-07-231-218/+186
| | | | llvm-svn: 276521
* [X86] Make one of the FMA3 commuting methods static. Remove a call to isFMA3 ↵Craig Topper2016-07-232-212/+212
| | | | | | just to get the IsIntrisic flag, instead get it during the first call and pass it along. NFC llvm-svn: 276520
* [X86] Fix switch statement indentation per coding standards.Craig Topper2016-07-231-136/+136
| | | | llvm-svn: 276519
* AMDGPU: Delete dead codeMatt Arsenault2016-07-232-97/+0
| | | | | | This has been dead since r269479 llvm-svn: 276518
* Revert "[AMDGPU] Emit read-only data to .rodata for hsa"Tom Stellard2016-07-221-2/+1
| | | | | | | | | | | | This reverts commit r276298. Data stored in .rodata can have a negative offset from .text, but we don't support negative values in relocations yet. This caused a regression in one of the amp conformance tests: 5_Data_Cont/5_2_a_v/5_2_3_m/Assignment/Test.02.01 llvm-svn: 276498
* Fix include case. NFC.George Burgess IV2016-07-221-1/+1
| | | | llvm-svn: 276465
* GlobalISel: implement legalization pass, with just one transformation.Tim Northover2016-07-227-0/+84
| | | | | | | | | This adds the actual MachineLegalizeHelper to do the work and a trivial pass wrapper that legalizes all instructions in a MachineFunction. Currently the only transformation supported is splitting up a vector G_ADD into one acting on smaller vectors. llvm-svn: 276461
* [Hexagon] Make HexagonCodeGen depend on ScalarKrzysztof Parzyszek2016-07-221-12/+13
| | | | | | Hexagon backend uses LoopDataPrefetch pass that is defined in Scalar. llvm-svn: 276441
* AMDGPU: Fix groupstaticsize for large LDSMatt Arsenault2016-07-221-3/+3
| | | | | | | | | The size can exceed s_movk_i32's limit, and we don't want to use it this early since it inhibits optimizations. This should probably be merged to the release branch. llvm-svn: 276438
* AMDGPU: Add HSA dispatch id intrinsicMatt Arsenault2016-07-225-8/+31
| | | | llvm-svn: 276437
* AMDGPU: Delete more dead codeMatt Arsenault2016-07-2210-182/+15
| | | | | | | Remove dead code from r600 intrinsic removal. Remove unset members, rename StackSize to be less ambiguous. llvm-svn: 276436
* AMDGPU: Fix i1 fp_to_intMatt Arsenault2016-07-224-7/+34
| | | | | | | R600's i1 fp_to_uint selected but was incorrect according to what instcombine constant folds to. llvm-svn: 276435
* AMDGPU: Don't reinvent transferSuccessorsAndUpdatePHIsMatt Arsenault2016-07-221-26/+2
| | | | llvm-svn: 276434
* [RDF] Make the graph construction/use less expensiveKrzysztof Parzyszek2016-07-222-7/+23
| | | | | | | | | - FuncNode::findBlock traverses the function every time. Avoid using it, and keep a cache of block addresses in DataFlowGraph instead. - The operator[] in the map of definition stacks was very slow. Replace the map with unordered_map. llvm-svn: 276429
* [Hexagon] Use loop data prefetch on HexagonKrzysztof Parzyszek2016-07-225-0/+29
| | | | llvm-svn: 276422
* [X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 ↵Simon Pilgrim2016-07-223-5/+58
| | | | | | | | | | | | | | | | (reapplied) As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector. This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match. We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts). Reapplied with fix for PR28657 - removed intrinsic definitions (clang companion patch to be submitted shortly). Differential Revision: https://reviews.llvm.org/D22460 llvm-svn: 276416
* Revert "[X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128"Benjamin Kramer2016-07-223-58/+5
| | | | | | | | It caused PR28657. This reverts commit r276281. llvm-svn: 276405
* This refactoring of ARM machine block size computation creates two utilitySjoerd Meijer2016-07-225-123/+193
| | | | | | | | | functions so that the size computation is available not only in ConstantIslands but in other passes as well. Differential Revision: https://reviews.llvm.org/D22640 llvm-svn: 276399
* [mips][microMIPS] Implement SLT, SLTI, SLTIU, SLTU microMIPS32r6 instructionsHrvoje Varga2016-07-226-68/+102
| | | | | | Differential Revision: https://reviews.llvm.org/D19906 llvm-svn: 276397
* [AVX512] Add ExeDomain to vector extend and truncate instructions.Craig Topper2016-07-221-2/+5
| | | | llvm-svn: 276394
* [AVX512] Add initial support for the Execution Domain fixing pass to change ↵Craig Topper2016-07-222-2/+57
| | | | | | some EVEX instructions. llvm-svn: 276393
* [AVX512] Fix the ExeDomain for some packed fp instructions.Craig Topper2016-07-221-5/+19
| | | | llvm-svn: 276392
* [AVX512] Add load folding for some AVX512VL logic and arithmetic instructions.Craig Topper2016-07-221-0/+36
| | | | llvm-svn: 276391
* [AVX512] Update X86InstrInfo::foldMemoryOperandCustom to handle the EVEX ↵Craig Topper2016-07-221-4/+8
| | | | | | encoded instructions too. llvm-svn: 276390
* [AArch64] Cleanup sign extend in genAlternativeCodeSequenceDavid Majnemer2016-07-211-3/+3
| | | | | | | | Use the machinery in MathExtras instead of rolling it by hand. This fixes PR28624. llvm-svn: 276366
* [Sparc]: Fix bug in LowerSTORE due to r275592Douglas Katzman2016-07-211-1/+1
| | | | llvm-svn: 276362
* [X86] Do not use AND8ri8 in AVX512 patternMichael Kuperstein2016-07-211-1/+1
| | | | | | | This variant is (as documented in the TD) for disassembler use only, and should not be used in patterns - it is longer, and is broken on 64-bit. llvm-svn: 276347
* [AArch64][Inline-Asm] Return the 32-bit floating point register classAkira Hatanaka2016-07-211-1/+1
| | | | | | | | | | | | | | | when constraint "w" is used on a 32-bit operand. This enables compiling the following code, which used to error out in the backend: void foo1(int a) { asm volatile ("sqxtn h0, %s0\n" : : "w"(a):); } Fixes PR28633. llvm-svn: 276344
* [AMDGPU] Emit read-only data to .rodata for hsaKonstantin Zhuravlyov2016-07-211-1/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D22538 llvm-svn: 276298
* AMDGPU/SI: Add support for R_AMDGPU_ABS32Konstantin Zhuravlyov2016-07-211-0/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D21646 llvm-svn: 276294
* [AArch64] Load/store opt: Don't count transient instructions towards search ↵Geoff Berry2016-07-211-15/+14
| | | | | | | | | | | | | | | | | | limits. Summary: This change also changes findMatchingInsn and findMatchingUpdateInsnForward to take DBG_VALUE opcodes into account when tracking register defs and uses, which could potentially inhibit these optimizations in the presence of debug information. Reviewers: mcrosier Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D22582 llvm-svn: 276293
* [X86][SSE] Allow folding of store/zext with PEXTRW of 0'th elementSimon Pilgrim2016-07-211-6/+15
| | | | | | | | | | | | Under normal circumstances we prefer the higher performance MOVD to extract the 0'th element of a v8i16 vector instead of PEXTRW. But as detailed on PR27265, this prevents the SSE41 implementation of PEXTRW from folding the store of the 0'th element. Additionally it prevents us from making use of the fact that the (SSE2) reg-reg version of PEXTRW implicitly zero-extends the i16 element to the i32/i64 destination register. This patch only preferentially lowers to MOVD if we will not be zero-extending the extracted i16, nor prevent a store from being folded (on SSE41). Fix for PR27265. Differential Revision: https://reviews.llvm.org/D22509 llvm-svn: 276289
* [X86][SSE] Pull out duplicate EXTRW lowering code. NFCI.Simon Pilgrim2016-07-211-26/+16
| | | | | | As requested on D22509, I've pulled out the v8i16 extraction lowering as the SSE41 and pre-SSE41 implementations are effectively the same. llvm-svn: 276285
* [X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128Simon Pilgrim2016-07-213-5/+58
| | | | | | | | | | | | As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector. This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match. We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts). Differential Revision: https://reviews.llvm.org/D22460 llvm-svn: 276281