path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* Revert r223709, "[PowerPC]Activate FeatureVSX for the Power target", to unbreak bots. | NAKAMURA Takumi | 2014-12-09 | 1 | -3/+5
    CodeGen/PowerPC/vsx-p8.ll was failing.

        '+power8-vector' is not a recognized feature for this target (ignoring feature)
        llvm/test/CodeGen/PowerPC/vsx-p8.ll:33:14: error: expected string not found in input
        ; CHECK-REG: lxvw4x 34, 0, 3
                     ^
        <stdin>:50:2: note: scanning from here
         .align 3
         ^
        <stdin>:61:2: note: possible intended match here
         lvx 3, 0, 3
         ^

    llvm-svn: 223729
* R600/SI: Set MayStore = 0 on MUBUF loads | Tom Stellard | 2014-12-09 | 1 | -1/+1
    llvm-svn: 223722
* R600/SI: Move setting of the lds bit to the base MUBUF class | Tom Stellard | 2014-12-09 | 1 | -6/+9
    llvm-svn: 223721
* [Hexagon] Removing old def versions and replacing usages with versions that have encodings. | Colin LeMahieu | 2014-12-08 | 5 | -182/+40
    llvm-svn: 223720
* [Hexagon] Adding any8, all8, and/or/xor/andn/orn/not predicate register forms, mask, and vitpack instructions and patterns. | Colin LeMahieu | 2014-12-08 | 1 | -0/+84
    llvm-svn: 223710
* [PowerPC]Activate FeatureVSX for the Power target | Bill Seurer | 2014-12-08 | 1 | -5/+3
    This change activates FeatureVSX for Power 7 and Power 8 in PPC.td.

    http://reviews.llvm.org/D6570

    llvm-svn: 223709
* [PowerPC] Don't use a non-allocatable register to implement the 'cc' alias | Hal Finkel | 2014-12-08 | 2 | -9/+6
    GCC accepts 'cc' as an alias for 'cr0', and we need to do the same when
    processing inline asm constraints. This had previously been implemented
    using a non-allocatable register, named 'cc', that was listed as an alias
    of 'cr0', but the infrastructure does not seem to support this properly
    (neither the register allocator nor the scheduler properly accounts for
    the alias). Instead, we can just process this as a naming alias inside of
    the inline asm constraint-processing code, so we'll do that instead.

    There are two regression tests, one where the post-RA scheduler did the
    wrong thing with the non-allocatable alias, and one where the register
    allocator did the wrong thing. Fixes PR21742.

    llvm-svn: 223708
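The "naming alias" approach described in this commit can be sketched outside of LLVM as a plain string rewrite. This is only an illustration under assumptions: `resolveConstraintRegister` is a hypothetical helper, not the function the patch touches, and the brace-wrapped constraint spelling is assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not LLVM API): map an inline-asm register constraint
// such as "{cc}" to the name of the register the backend actually models.
// The result is returned in lower case.
std::string resolveConstraintRegister(std::string Constraint) {
  // Compare case-insensitively by lower-casing the whole constraint first.
  std::transform(Constraint.begin(), Constraint.end(), Constraint.begin(),
                 [](unsigned char C) {
                   return static_cast<char>(std::tolower(C));
                 });
  // Treat 'cc' purely as another spelling of 'cr0'. No separate register
  // named 'cc' has to exist, so nothing downstream needs to track an alias.
  if (Constraint == "{cc}")
    return "{cr0}";
  return Constraint;
}
```

Because the alias is resolved while the constraint is parsed, later stages in this sketch only ever see `{cr0}`.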
* [Hexagon] Adding xtype doubleword add, sub, and, or, xor and patterns. | Colin LeMahieu | 2014-12-08 | 1 | -46/+50
    llvm-svn: 223702
* [Hexagon] Adding xtype doubleword comparisons. Removing unused multiclass. | Colin LeMahieu | 2014-12-08 | 2 | -32/+50
    llvm-svn: 223701
* [Hexagon] Adding xtype parity, min, minu, max, maxu instructions. | Colin LeMahieu | 2014-12-08 | 4 | -0/+110
    llvm-svn: 223693
* [Hexagon] Adding xtype halfword add/sub ll/hl/lh/hh/sat/<<16 instructions. | Colin LeMahieu | 2014-12-08 | 1 | -2/+102
    llvm-svn: 223692
* R600/SI: Move continue after checking s_mov_b32. | Matt Arsenault | 2014-12-08 | 1 | -3/+3
    There's nothing else to bother trying to shrink these.

    llvm-svn: 223686
* [Hexagon] Adding add/sub with saturation. Removing unused def. Cleaning up shift patterns. | Colin LeMahieu | 2014-12-08 | 2 | -16/+22
    llvm-svn: 223680
* [CompactUnwind] Fix register encoding logic | Bruno Cardoso Lopes | 2014-12-08 | 1 | -1/+1
    Fix a bug in the compact unwind encoding logic that would try to encode
    more callee-saved registers than it should, leading to an early bail-out
    in the encoding logic and unnecessary use of DWARF frame mode.

    Also remove no-compact-unwind.ll, which was testing the wrong thing based
    on this bug, and move it to valid 'compact unwind' tests. Added a few
    more tests too.

    llvm-svn: 223676
* AArch64: treat HFAs containing "half" types as blocks too. | Tim Northover | 2014-12-08 | 1 | -0/+5
    llvm-svn: 223669
* [X86] Improved tablegen patterns for matching TZCNT/LZCNT. | Andrea Di Biagio | 2014-12-08 | 1 | -24/+29
    Teach ISel how to match a TZCNT/LZCNT from a conditional move if the
    condition code is X86_COND_NE. Existing tablegen patterns only matched
    TZCNT/LZCNT from an X86cond with condition code equal to X86_COND_E. To
    avoid introducing extra rules, I added an 'ImmLeaf' definition that
    checks if the condition code is COND_E or COND_NE.

    llvm-svn: 223668
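Why both condition codes can fold to the same instruction is easier to see with a scalar model. The sketch below assumes the well-known TZCNT behaviour of returning the operand width for a zero input. The function names are hypothetical, and it models the idea rather than the tablegen patterns themselves.

```cpp
#include <cassert>
#include <cstdint>

// Reference model of TZCNT on a 32-bit operand: counts trailing zero bits
// and yields the operand width (32) when the input is zero.
uint32_t tzcnt32(uint32_t X) {
  uint32_t N = 0;
  while (N < 32 && ((X >> N) & 1u) == 0)
    ++N;
  return N;
}

// Trailing-zero count that is only defined for non-zero inputs in this model.
uint32_t cttzNonZero(uint32_t X) {
  assert(X != 0 && "result is undefined for zero in this model");
  uint32_t N = 0;
  while (((X >> N) & 1u) == 0)
    ++N;
  return N;
}

// Select guarded by "equal to zero" (the COND_E shape).
uint32_t selectOnEqual(uint32_t X) { return X == 0 ? 32u : cttzNonZero(X); }

// Same select with the condition inverted and the arms swapped
// (the COND_NE shape).
uint32_t selectOnNotEqual(uint32_t X) { return X != 0 ? cttzNonZero(X) : 32u; }
```

Both select shapes compute the same value as `tzcnt32`, which is why a single check accepting either condition code is enough.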
* [Hexagon] Adding combine reg, reg with predicated forms. | Colin LeMahieu | 2014-12-08 | 1 | -0/+7
    llvm-svn: 223667
* [Hexagon] Adding packhl instruction. | Colin LeMahieu | 2014-12-08 | 1 | -0/+6
    llvm-svn: 223664
* [mips] Add Mips-specific CCIf's for accessing the MipsCCState. NFC. | Daniel Sanders | 2014-12-08 | 1 | -13/+28
    Reviewers: vmedic

    Reviewed By: vmedic

    Subscribers: llvm-commits

    Differential Revision: http://reviews.llvm.org/D6213

    llvm-svn: 223662
* [X86] Improved lowering of packed v8i16 vector shifts by non-constant count. | Andrea Di Biagio | 2014-12-08 | 1 | -10/+20
    Before this patch, the backend sub-optimally expanded the non-constant
    shift count of a v8i16 shift into a sequence of two 'movd' plus 'movzwl'.

    With this patch the backend checks if the target features sse4.1. If so,
    then it lets the shuffle legalizer deal with the expansion of the shift
    amount.

    Example:
    ;;
    define <8 x i16> @test(<8 x i16> %A, <8 x i16> %B) {
      %shamt = shufflevector <8 x i16> %B, <8 x i16> undef, <8 x i32> zeroinitializer
      %shl = shl <8 x i16> %A, %shamt
      ret <8 x i16> %shl
    }
    ;;

    Before (with -mattr=+avx):
        vmovd %xmm1, %eax
        movzwl %ax, %eax
        vmovd %eax, %xmm1
        vpsllw %xmm1, %xmm0, %xmm0
        retq

    Now:
        vpxor %xmm2, %xmm2, %xmm2
        vpblendw $1, %xmm1, %xmm2, %xmm1
        vpsllw %xmm1, %xmm0, %xmm0
        retq

    llvm-svn: 223660
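The "Now" sequence works because blending lane 0 of the count vector into an all-zero vector leaves the 16-bit count zero-extended in the low 64 bits, which is the part the shift instruction reads. A rough lane-level model follows; the helper names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V8i16 = std::array<uint16_t, 8>;

// Model of a word blend: bit I of Mask set selects lane I from B,
// otherwise lane I comes from A.
V8i16 blendWords(const V8i16 &A, const V8i16 &B, uint8_t Mask) {
  V8i16 R{};
  for (int I = 0; I < 8; ++I)
    R[I] = ((Mask >> I) & 1) ? B[I] : A[I];
  return R;
}

// Low 64 bits of the vector, with lane 0 as the least significant word.
uint64_t low64(const V8i16 &V) {
  uint64_t R = 0;
  for (int I = 3; I >= 0; --I)
    R = (R << 16) | V[I];
  return R;
}

// Build the shift-count operand: lane 0 of Count, zero everywhere else.
V8i16 makeShiftCount(const V8i16 &Count) {
  const V8i16 Zero{};
  return blendWords(Zero, Count, 0x01);
}
```

Without the blend, lanes 1 to 3 of the original vector would leak into the 64-bit count, which is why the value cannot be used directly.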
* X86 intrinsics moved from X86ISelLowering.cpp to X86IntrinsicsInfo.h | Elena Demikhovsky | 2014-12-08 | 2 | -133/+48
    X86ISelLowering.cpp has a long switch for intrinsics. I moved a part of
    this long switch to the new intrinsics table in X86IntrinsicsInfo.h.

    No functional changes, just code and compile time optimization.

    llvm-svn: 223641
* R600/SI: Disable VMEM and SMEM clauses by breaking them with S_NOP | Marek Olsak | 2014-12-07 | 1 | -8/+46
    This is only a workaround.

    llvm-svn: 223615
* R600/SI: Set 20-bit immediate byte offset for SMRD on VI | Marek Olsak | 2014-12-07 | 6 | -20/+85
    llvm-svn: 223614
* R600/SI: Update instruction conversions for VI | Marek Olsak | 2014-12-07 | 3 | -1/+48
    There are 3 changes:
    - Convert 32-bit S_LSHL/LSHR/ASHR to their V_*REV variants for VI
    - Lower RSQ_CLAMP for VI
    - Don't generate MIN/MAX_LEGACY on VI

    llvm-svn: 223604
* R600/SI: Add VI instructions | Marek Olsak | 2014-12-07 | 12 | -651/+1439
    llvm-svn: 223603
* R600/SI: Add SCC Defs/Uses to SOP1 and SOP2 opcodes | Marek Olsak | 2014-12-07 | 1 | -28/+49
    llvm-svn: 223602
* Make the DenseMap bucket type configurable and use a smaller bucket for DenseSet. | Benjamin Kramer | 2014-12-06 | 1 | -1/+1
    DenseSet used to be implemented as DenseMap<Key, char>, which usually
    doubled the memory footprint of the map. Now we use a compressed set so
    the second element uses no memory at all. This required some surgery on
    DenseMap as all accesses to the bucket now have to go through methods;
    this should have no impact on the behavior of DenseMap though. The new
    default bucket type for DenseMap is a slightly extended std::pair as we
    expose it through DenseMap's iterator and don't want to break any
    existing users.

    llvm-svn: 223588
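The footprint argument can be checked with a small stand-alone comparison. The types below are illustrative stand-ins, not the actual LLVM bucket classes; `KeyOnlyBucket` and `PairBucket` are hypothetical names.

```cpp
#include <cassert>
#include <utility>

// Hypothetical key-only bucket: stores the key and nothing else. All access
// goes through getFirst(), mirroring the "accesses go through methods" idea.
template <typename KeyT> struct KeyOnlyBucket {
  KeyT Key;
  KeyT &getFirst() { return Key; }
  const KeyT &getFirst() const { return Key; }
};

// Old-style set bucket: a map bucket whose value is a dummy char. Padding
// after the char typically rounds the size up to twice the key size.
template <typename KeyT> using PairBucket = std::pair<KeyT, char>;
```

On a typical 64-bit target `sizeof(PairBucket<void *>)` is 16 while `sizeof(KeyOnlyBucket<void *>)` is 8, which matches the "usually doubled" wording above. The exact numbers depend on the ABI.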
* R600/SI: Restore PrivateGlobalPrefix to the default ELF value of ".L" | Tom Stellard | 2014-12-06 | 1 | -1/+0
    This was changed in r223323.

    llvm-svn: 223579
* [X86] Refactor PMOV[SZ]Xrm to add missing AVX2 patterns. | Ahmed Bougacha | 2014-12-06 | 2 | -531/+226
    Most patterns will go away once the extload legalization changes land.

    Differential Revision: http://reviews.llvm.org/D6125

    llvm-svn: 223567
* AArch64: use explicit MVT::i64 when creating EXTRACT_SUBVECTOR nodes. | Tim Northover | 2014-12-06 | 1 | -10/+12
    All our patterns use MVT::i64, but the ISelLowering nodes were
    inconsistent in their choice.

    No functional change.

    llvm-svn: 223551
* [X86] Cleanup FCOPYSIGN lowering. NFC intended. | Ahmed Bougacha | 2014-12-05 | 1 | -29/+15
    llvm-svn: 223542
* [Hexagon] Relocating logical instructions and templates later in the td file. | Colin LeMahieu | 2014-12-05 | 1 | -116/+115
    llvm-svn: 223523
* [Hexagon] Adding sub/and/or reg, imm forms | Colin LeMahieu | 2014-12-05 | 1 | -29/+56
    llvm-svn: 223522
* Optimize merging of scalar loads for 32-byte vectors [X86, AVX] | Sanjay Patel | 2014-12-05 | 1 | -1/+6
    Fix the poor codegen seen in PR21710
    ( http://llvm.org/bugs/show_bug.cgi?id=21710 ). Before we crack 32-byte
    build vectors into smaller chunks (and then subsequently glue them back
    together), we should look for the easy case where we can just load all
    elements in a single op.

    An example of the codegen change is:

    From:
        vmovss 16(%rdi), %xmm1
        vmovups (%rdi), %xmm0
        vinsertps $16, 20(%rdi), %xmm1, %xmm1
        vinsertps $32, 24(%rdi), %xmm1, %xmm1
        vinsertps $48, 28(%rdi), %xmm1, %xmm1
        vinsertf128 $1, %xmm1, %ymm0, %ymm0
        retq

    To:
        vmovups (%rdi), %ymm0
        retq

    Differential Revision: http://reviews.llvm.org/D6536

    llvm-svn: 223518
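The "easy case" amounts to checking that every element of the build vector is loaded from consecutive addresses off the same base. The sketch below shows that check in isolation. `ElementLoad` and `canMergeIntoOneLoad` are hypothetical names, and the real lowering code considers more than this (for example the properties of each load), which is not modelled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one scalar element load: Base identifies the
// pointer the load is relative to, Offset is the byte offset from it.
struct ElementLoad {
  int Base;
  int64_t Offset;
};

// Returns true when element I is loaded from Offset(0) + I * EltBytes off
// the same base, i.e. the whole vector could come from a single wide load.
bool canMergeIntoOneLoad(const std::vector<ElementLoad> &Elts,
                         int64_t EltBytes) {
  if (Elts.empty())
    return false;
  for (std::size_t I = 1; I < Elts.size(); ++I) {
    if (Elts[I].Base != Elts[0].Base)
      return false;
    if (Elts[I].Offset != Elts[0].Offset + static_cast<int64_t>(I) * EltBytes)
      return false;
  }
  return true;
}
```

For the example above, eight 4-byte elements at offsets 0 through 28 from `%rdi` pass the check, so one 32-byte load suffices.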
* [Hexagon] Updating mux_ir/ri/ii/rr with encoding bits | Colin LeMahieu | 2014-12-05 | 4 | -46/+78
    llvm-svn: 223515
* Use 32-bit ebp for NaCl64 in a limited case: llvm.frameaddress. | Jan Wen Voung | 2014-12-05 | 4 | -4/+14
    Summary:
    Follow up to [x32] "Use ebp/esp as frame and stack pointer":
    http://reviews.llvm.org/D4617

    In that earlier patch, NaCl64 was made to always use rbp. That's needed
    for most cases because rbp should hold a full 64-bit address within the
    NaCl sandbox so that load/stores off of rbp don't require sandbox
    adjustment (zeroing the top 32-bits, then filling those by adding r15).

    However, llvm.frameaddress returns a pointer and pointers are 32-bit for
    NaCl64. In this case, use ebp instead, which will make the register copy
    type check. A similar mechanism may be needed for llvm.eh.return, but is
    not added in this change.

    Test Plan: test/CodeGen/X86/frameaddr.ll

    Reviewers: dschuff, nadav

    Subscribers: jfb, llvm-commits

    Differential Revision: http://reviews.llvm.org/D6514

    llvm-svn: 223510
* [PowerPC]Add VSX loads/stores to fastisel for PPC target | Bill Seurer | 2014-12-05 | 1 | -4/+36
    This patch adds VSX floating point loads and stores to fastisel. Along
    with the change to tablegen (D6220), VSX instructions are now fully
    supported in fastisel.

    http://reviews.llvm.org/D6274

    llvm-svn: 223507
* [Hexagon] Adding tfrih/l instructions. | Colin LeMahieu | 2014-12-05 | 1 | -0/+22
    llvm-svn: 223506
* [X86] Improved lowering of packed vector shifts to vpsllq/vpsrlq. | Andrea Di Biagio | 2014-12-05 | 1 | -10/+17
    SSE2/AVX non-constant packed shift instructions only use the lower 64
    bits of the shift count.

    This patch teaches function 'getTargetVShiftNode' how to deal with shifts
    where the shift count node is of type MVT::i64.

    Before this patch, function 'getTargetVShiftNode' only knew how to deal
    with shift count nodes of type MVT::i32. This forced the backend to
    wrongly truncate the shift count to MVT::i32, and then zero-extend it
    back to MVT::i64.

    llvm-svn: 223505
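The effect of that truncation shows up in a one-lane model of the shift. The sketch assumes the documented behaviour of the packed quadword shifts, where a count above 63 clears the lane. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Model of one 64-bit lane of a packed logical left shift whose count is a
// full 64-bit value: counts above 63 clear the lane.
uint64_t shiftLaneLeft(uint64_t Lane, uint64_t Count) {
  return Count > 63 ? 0 : Lane << Count;
}

// Same shift, but with the count first truncated to 32 bits and then
// zero-extended back to 64 bits.
uint64_t shiftLaneLeftTruncated(uint64_t Lane, uint64_t Count) {
  uint64_t Narrowed = static_cast<uint32_t>(Count);
  return shiftLaneLeft(Lane, Narrowed);
}
```

With a count of 0x100000001 the full-width shift clears the lane, while the truncated version sees a count of 1 and shifts by a single bit.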
* [Hexagon] Adding add reg, imm form with encoding bits and test. | Colin LeMahieu | 2014-12-05 | 1 | -42/+80
    llvm-svn: 223504
* [Hexagon] Adding DoubleRegs decoder. Moving C2_mux and A2_nop. Adding combine imm-imm form. | Colin LeMahieu | 2014-12-05 | 3 | -10/+60
    llvm-svn: 223494
* [Hexagon] [NFC] Rearranging patterns and mux instruction. | Colin LeMahieu | 2014-12-05 | 1 | -38/+38
    llvm-svn: 223488
* [Hexagon] [NFC] Rearranging def order. | Colin LeMahieu | 2014-12-05 | 1 | -28/+27
    llvm-svn: 223487
* [Hexagon] Adding combine reg-reg forms. | Colin LeMahieu | 2014-12-05 | 1 | -1/+14
    llvm-svn: 223485
* [Hexagon] Marking several instructions as isCodeGenOnly=0 and adding direct disassembly tests for many instructions. | Colin LeMahieu | 2014-12-05 | 1 | -2/+3
    llvm-svn: 223482
* [X86] Avoid introducing extra shuffles when lowering packed vector shifts. | Andrea Di Biagio | 2014-12-05 | 1 | -15/+12
    When lowering a vector shift node, the backend checks if the shift count
    is a shuffle with a splat mask. If so, then it introduces an extra dag
    node to extract the splat value from the shuffle. The splat value is then
    used to generate a shift count of a target specific shift.

    However, if we know that the shift count is a splat shuffle, we can use
    the splat index 'I' to extract the I-th element from the first shuffle
    operand. The advantage is that the splat shuffle may become dead since we
    no longer use it.

    Example:
    ;;
    define <4 x i32> @example(<4 x i32> %a, <4 x i32> %b) {
      %c = shufflevector <4 x i32> %b, <4 x i32> undef, <4 x i32> zeroinitializer
      %shl = shl <4 x i32> %a, %c
      ret <4 x i32> %shl
    }
    ;;

    Before this patch, llc generated the following code (-mattr=+avx):
        vpshufd $0, %xmm1, %xmm1   # xmm1 = xmm1[0,0,0,0]
        vpxor %xmm2, %xmm2
        vpblendw $3, %xmm1, %xmm2, %xmm1   # xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
        vpslld %xmm1, %xmm0, %xmm0
        retq

    With this patch, the redundant splat operation is removed from the code.
        vpxor %xmm2, %xmm2
        vpblendw $3, %xmm1, %xmm2, %xmm1   # xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
        vpslld %xmm1, %xmm0, %xmm0
        retq

    llvm-svn: 223461
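The equivalence behind this change is that reading any lane of a splat shuffle yields the same value as reading lane 'I' of the shuffle's first operand. A minimal array-based model, with hypothetical helper names, assuming a single-input shuffle:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V4i32 = std::array<uint32_t, 4>;

// Single-input shuffle: result lane I takes Src[Mask[I]].
V4i32 shuffle(const V4i32 &Src, const std::array<int, 4> &Mask) {
  V4i32 R{};
  for (std::size_t I = 0; I < 4; ++I)
    R[I] = Src[static_cast<std::size_t>(Mask[I])];
  return R;
}

// Old approach: materialize the splat, then read lane 0 of the result.
uint32_t countViaSplat(const V4i32 &B, int SplatIndex) {
  std::array<int, 4> Mask;
  Mask.fill(SplatIndex);
  return shuffle(B, Mask)[0];
}

// New approach: read lane SplatIndex of the shuffle input directly, so the
// shuffle itself is no longer needed for the shift count.
uint32_t countDirect(const V4i32 &B, int SplatIndex) {
  return B[static_cast<std::size_t>(SplatIndex)];
}
```

Since both functions return the same value for every valid index, the shuffle has no remaining user in this model, which corresponds to the removed `vpshufd` in the example.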
* Add missing FP build attribute tests. | Charlie Turner | 2014-12-05 | 1 | -0/+2
    The test file test/CodeGen/ARM/build-attributes.ll was missing several
    floating-point build attribute tests. The intention of this commit is
    that for each CPU / architecture currently tested, there are now tests
    that make sure the following attributes are sufficiently checked,

    * Tag_ABI_FP_rounding
    * Tag_ABI_FP_denormal
    * Tag_ABI_FP_exceptions
    * Tag_ABI_FP_user_exceptions
    * Tag_ABI_FP_number_model

    Also in this commit, the -unsafe-fp-math flag has been augmented with the
    full suite of flags Clang sends to LLVM when you pass -ffast-math to
    Clang. That is, `-unsafe-fp-math' has been changed to
    `-enable-unsafe-fp-math -disable-fp-elim -enable-no-infs-fp-math
    -enable-no-nans-fp-math -fp-contract=fast'

    Change-Id: I35d766076bcbbf09021021c0a534bf8bf9a32dfc

    llvm-svn: 223454
* Rename the x86 isTargetMacho to isTargetMachO for uniformity. | Eric Christopher | 2014-12-05 | 4 | -8/+8
    llvm-svn: 223421
* Both of these subtargets have functions that check whether or not the target is mach-o. Use them. | Eric Christopher | 2014-12-05 | 2 | -3/+2
    llvm-svn: 223420
* [X86] Delete dead code in fcopysign lowering. NFC. | Ahmed Bougacha | 2014-12-04 | 1 | -11/+0
    r32900 introduced custom lowering for fcopysign, with two checks to
    change the magnitude value's type if it's larger/smaller than the sign
    value's type. r32932 replaced that code for the smaller case. r43205 did
    the same for the larger case, but left the old code, now dead.

    llvm-svn: 223415