path: root/llvm/lib/Target
Commit message / Author / Age / Files / Lines
* [PowerPC] Fix invalid lxvdsx optimization (PR25157)Bill Schmidt2015-10-141-0/+2
| | | | | | | | | | | | PR25157 identifies a bug where a load plus a vector shuffle is incorrectly converted into an LXVDSX instruction. That optimization is only valid if the load is of a doubleword, and in the noted case, it was not. This corrects that problem. Joint patch with Eric Schweitz, who provided the bugpoint-reduced test case. llvm-svn: 250324
* [x86][FastISel] Teach how to select nontemporal stores.Andrea Di Biagio2015-10-141-16/+34
| | | | | | | | | | | | | | | | | | | | | | This patch teaches x86 fast-isel how to select nontemporal stores. On x86, we can use MOVNTI for nontemporal stores of doublewords/quadwords. Instructions (V)MOVNTPS/PD/DQ can be used for SSE2/AVX aligned nontemporal vector stores. Before this patch, fast-isel always selected 'movd/movq' instead of 'movnti' for doubleword/quadword nontemporal stores. In the case of nontemporal stores of aligned vectors, fast-isel always selected movaps/movapd/movdqa instead of movntps/movntpd/movntdq. With this patch, if we use SSE2/AVX intrinsics for nontemporal stores we now always get the expected (V)MOVNT instructions. The lack of fast-isel support for nontemporal stores was spotted when analyzing the -O0 codegen for nontemporal stores. Differential Revision: http://reviews.llvm.org/D13698 llvm-svn: 250285
* [X86] Add XSAVE feature flags to their various processors.Craig Topper2015-10-141-3/+23
| | | | llvm-svn: 250268
* [WebAssembly] Remove a TODO comment which is no longer needed. NFC.Dan Gohman2015-10-131-7/+0
| | | | llvm-svn: 250233
* AMDGPU: Remove implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith2015-10-139-18/+17
| | | | | | | | | | | | | | | | | | One of the changes in lib/Target/AMDGPU/AMDGPUMCInstLower.cpp was a new one. Previously, bundle iterators and single-instruction iterators could be compared to each other (comparing on underlying pointers). I changed a comparison from using `MBB->end()` to using `MBB->instr_end()`, since both end iterators should point at the same place anyway. I don't think the implicit conversion between the two iterator types is a good idea since it's fairly easy to accidentally compare to the wrong thing (they aren't always end iterators). Otherwise I would have just added the conversion. Even with that, there should be no functionality change here. llvm-svn: 250218
* AArch64: Remove implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith2015-10-136-15/+12
| | | | llvm-svn: 250216
* [AArch64] Check the size of the vector before accessing its elements.Akira Hatanaka2015-10-131-1/+1
| | | | | | | | This fixes an assert in AArch64AsmParser::MatchAndEmitInstruction. rdar://problem/23081753 llvm-svn: 250207
* function names should start with a lower case letter; NFCSanjay Patel2015-10-134-116/+116
| | | | llvm-svn: 250174
* don't repeat function/class/variable names in comments; NFCSanjay Patel2015-10-131-60/+46
| | | | llvm-svn: 250162
* Test commitChristof Douma2015-10-131-1/+0
| | | | llvm-svn: 250154
* Fix line-ending issue. NFC.Michael Kuperstein2015-10-131-2/+2
| | | | llvm-svn: 250151
* [X86] Mark the AAD and AAM aliases as not valid in 64-bit mode.Craig Topper2015-10-131-2/+2
| | | | llvm-svn: 250148
* [X86] Change all the i8imm operands in XOP instructions to u8imm so the ↵Craig Topper2015-10-131-10/+10
| | | | | | parser will check the size. llvm-svn: 250147
* x86: preserve flags when folding atomic operationsJF Bastien2015-10-131-6/+8
| | | | | | | | | | | | | | | | | | | | | Summary: D4796 taught LLVM to fold some atomic integer operations into a single instruction. The pattern was unaware that the instructions clobbered flags. This patch adds the missing EFLAGS definition. Floating point operations don't set flags, so the subsequent fadd optimization is correct. The same applies for surrounding load/store optimizations. Reviewers: rsmith, rtrieu Subscribers: llvm-commits, reames, morisset Differential Revision: http://reviews.llvm.org/D13680 llvm-svn: 250135
* AMDGPU: Refactor isVGPRToSGPRCopyMatt Arsenault2015-10-131-19/+48
| | | | | | | It should now correctly handle physical registers and make it easier to identify the other direction. llvm-svn: 250132
* DAGCombiner: Combine extract_vector_elt from build_vectorMatt Arsenault2015-10-122-0/+13
| | | | | | | | | | | | | | This basic combine was surprisingly missing. AMDGPU legalizes many operations in terms of 32-bit vector components, so not doing this results in many extra copies and subregister extracts that need to be cleaned up later. InstCombine already does this for the hasOneUse case. The target hook is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn from a vector materialize repeated immediate instruction to a constant vector load with more scalar copies from it. llvm-svn: 250129
* Make Win64 localescape offsets FP relative instead of SP relativeReid Kleckner2015-10-121-8/+2
| | | | | | | | | We made them SP relative back in March (r233137) because that's the value the runtime passes to EH functions. With the new cleanuppad IR, funclets adjust their frame argument from SP to FP, so our offsets should now be FP-relative. llvm-svn: 250088
* [x86] Fix wrong lowering of vsetcc nodes (PR25080).Andrea Di Biagio2015-10-121-0/+29
|
| Function LowerVSETCC (in X86ISelLowering.cpp) worked under the wrong
| assumption that for non-AVX512 targets, the source type and destination type
| of a type-legalized setcc node were always the same type.
|
| This assumption was unfortunately incorrect; the type legalizer is not always
| able to promote the return type of a setcc to the same type as the first
| operand of a setcc.
|
| In the case of a vsetcc node, the legalizer firstly checks if the first input
| operand has a legal type. If so, then it promotes the return type of the
| vsetcc to that same type. Otherwise, the return type is promoted to the 'next
| legal type', which, for vectors of MVT::i1 is always a 128-bit integer vector
| type.
|
| Example (-mattr=+avx):
|
|   %0 = trunc <8 x i32> %a to <8 x i23>
|   %1 = icmp eq <8 x i23> %0, zeroinitializer
|
| The initial selection dag for the code above is:
|
|   v8i1 = setcc t5, t7, seteq:ch
|     t5: v8i23 = truncate t2
|       t2: v8i32,ch = CopyFromReg t0, Register:v8i32 %vreg1
|     t7: v8i32 = build_vector of all zeroes.
|
| The type legalizer would firstly check if 't5' has a legal type. If so, then
| it would reuse that same type to promote the return type of the setcc node.
| Unfortunately 't5' is of illegal type v8i23, and therefore it cannot be used
| to promote the return type of the setcc node. Consequently, the setcc return
| type is promoted to v8i16. Later on, 't5' is promoted to v8i32 thus leading to
| the following dag node:
|
|   v8i16 = setcc t32, t25, seteq:ch
|
| where t32 and t25 are now values of type v8i32.
|
| Before this patch, function LowerVSETCC would have wrongly expanded the setcc
| to a single X86ISD::PCMPEQ. Surprisingly, ISel was still able to match an
| instruction. In our case, ISel would have matched a VPCMPEQWrr:
|
|   t37: v8i16 = X86ISD::VPCMPEQWrr t36, t25
|
| However, t36 and t25 are both VR256, while the result type is instead of
| class VR128. This inconsistency ended up causing the insertion of COPY
| instructions like this:
|
|   %vreg7<def> = COPY %vreg3; VR128:%vreg7 VR256:%vreg3
|
| Which is an invalid full copy (not a sub register copy). Eventually, the
| backend would have hit an UNREACHABLE "Cannot emit physreg copy instruction"
| in the attempt to expand the malformed pseudo COPY instructions.
|
| This patch fixes the problem adding the missing logic in LowerVSETCC to handle
| the corner case of a setcc with 128-bit return type and 256-bit operand type.
|
| This problem was originally reported by Dimitry as PR25080. It has been latent
| for a very long time. I have added the minimal reproducible from that bugzilla
| as test setcc-lowering.ll.
|
| Differential Revision: http://reviews.llvm.org/D13660
|
| llvm-svn: 250085
* combine predicates; NFCISanjay Patel2015-10-121-4/+1
| | | | llvm-svn: 250075
* AMDGPU: Register some more passes so -print-before worksMatt Arsenault2015-10-121-0/+2
| | | | llvm-svn: 250071
* fix typos; NFCSanjay Patel2015-10-121-3/+2
| | | | llvm-svn: 250059
* [mips][micromips] Initial support for micromips DSP instructions and ↵Zoran Jovanovic2015-10-1210-6/+85
| | | | | | | | addu.qb implementation Differential Revision: http://reviews.llvm.org/D12798 llvm-svn: 250058
* [mips][FastISel] Clang-format switch statement. NFC.Vasileios Kalintiris2015-10-121-10/+10
| | | | llvm-svn: 250053
* fix capitalization; NFCSanjay Patel2015-10-121-2/+2
| | | | llvm-svn: 250049
* [mips][ias] Implement macro expansion when bcc has an immediate where a ↵Daniel Sanders2015-10-122-2/+126
| | | | | | | | | | | | | | register belongs. Summary: Fixes PR24915. Reviewers: vkalintiris Subscribers: emaste, seanbruno, llvm-commits Differential Revision: http://reviews.llvm.org/D13533 llvm-svn: 250042
* [mips] Clean up most macro expansions to use the emit*() functions.Daniel Sanders2015-10-121-287/+163
| | | | | | | | | | Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13591 llvm-svn: 250040
* [mips] Handle undef when extracting subregs from FP64 registers.Daniel Sanders2015-10-121-4/+12
| | | | | | | | | | | | | | | Summary: This removes unnecessary instructions when extracting from an undefined register and also fixes a crash for O32 when passing undef to a double argument held in integer registers. Reviewers: vkalintiris Subscribers: llvm-commits, zoran.jovanovic, petarj Differential Revision: http://reviews.llvm.org/D13467 llvm-svn: 250039
* [ARM] Mark Swift MISched model as incompleteJames Molloy2015-10-121-0/+1
| | | | | | | | | | | The Swift Machine Scheduler Model is incomplete. There are instructions missing which can trigger the "incomplete machine model" abort. This was observed when a downstream SchedMachineModel was added to the ARM target. Patch by Christof Douma! llvm-svn: 250033
* [X86] Add XSAVE intrinsic familyAmjad Aboud2015-10-125-23/+79
| | | | | | | | | | | | Add intrinsics for the XSAVE instructions (XSAVE/XSAVE64/XRSTOR/XRSTOR64) XSAVEOPT instructions (XSAVEOPT/XSAVEOPT64) XSAVEC instructions (XSAVEC/XSAVEC64) XSAVES instructions (XSAVES/XSAVES64/XRSTORS/XRSTORS64) Differential Revision: http://reviews.llvm.org/D13012 llvm-svn: 250029
* [x86] PR24562: fix incorrect folding of PSHUFB nodes with a mask where all ↵Andrea Di Biagio2015-10-121-3/+15
| indices have the most significant bit set.
|
| This patch fixes a problem in function 'combineX86ShuffleChain' that causes a
| chain of shuffles to be wrongly folded away when the combined shuffle mask has
| only one element.
|
| We may end up with a combined shuffle mask of one element as a result of
| multiple calls to function 'canWidenShuffleElements()'. Function
| canWidenShuffleElements attempts to simplify a shuffle mask by widening the
| size of the elements being shuffled. For every pair of shuffle indices,
| function canWidenShuffleElements checks if indices refer to adjacent elements.
| If all pairs refer to "adjacent" elements then the shuffle mask is safely
| widened. As a consequence of widening, we end up with a new shuffle mask which
| is half the size of the original shuffle mask.
|
| The byte shuffle (pshufb) from test pr24562.ll has a mask of all
| SM_SentinelZero indices. Function canWidenShuffleElements would combine each
| pair of SM_SentinelZero indices into a single SM_SentinelZero index. So, in a
| logarithmic number of steps (4 in this case), the pshufb mask is simplified to
| a mask with only one index which is equal to SM_SentinelZero.
|
| Before this patch, function combineX86ShuffleChain wrongly assumed that a mask
| of size one is always equivalent to an identity mask. So, the entire shuffle
| chain was just folded away as the combined shuffle mask was treated as a no-op
| mask.
|
| With this patch we now check if the only element of a combined shuffle mask is
| SM_SentinelZero. In that case, we propagate a zero vector.
|
| Differential Revision: http://reviews.llvm.org/D13364
|
| llvm-svn: 250027
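The widening described in the message above can be sketched roughly as follows. This is a simplified standalone model, not the LLVM implementation: `kSentinelZero`, `widenMask`, `widenFully` and `isIdentitySingleton` are hypothetical names, the sentinel value is an assumption, and undef indices are ignored entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the SM_SentinelZero marker ("this lane is zero").
// The value used by LLVM may differ.
constexpr int kSentinelZero = -2;

// Try to halve a shuffle mask by merging each pair of neighbouring indices.
// A pair merges when both entries are the zero sentinel, or when they select
// an even-aligned pair of adjacent source elements. Returns false if any pair
// cannot be merged (or if the mask has an odd number of entries).
bool widenMask(const std::vector<int> &Mask, std::vector<int> &Out) {
  if (Mask.size() % 2 != 0)
    return false;
  std::vector<int> Widened;
  for (std::size_t I = 0; I < Mask.size(); I += 2) {
    int Lo = Mask[I], Hi = Mask[I + 1];
    if (Lo == kSentinelZero && Hi == kSentinelZero)
      Widened.push_back(kSentinelZero);
    else if (Lo >= 0 && Lo % 2 == 0 && Hi == Lo + 1)
      Widened.push_back(Lo / 2);
    else
      return false;
  }
  Out = Widened;
  return true;
}

// Widen repeatedly until one element is left or widening fails.
// Returns the number of successful widening steps.
int widenFully(std::vector<int> &Mask) {
  int Steps = 0;
  std::vector<int> Next;
  while (Mask.size() > 1 && widenMask(Mask, Next)) {
    Mask = Next;
    ++Steps;
  }
  return Steps;
}

// A one-element mask is a no-op only if it actually selects element 0;
// a lone zero sentinel means "produce a zero vector" instead.
bool isIdentitySingleton(const std::vector<int> &Mask) {
  return Mask.size() == 1 && Mask[0] == 0;
}
```

In this model a 16-entry mask of zero sentinels collapses to a single sentinel after four steps, and `isIdentitySingleton` rejects it, which is the distinction the fix introduces.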
* Test commitZlatko Buljan2015-10-121-1/+0
| | | | llvm-svn: 250026
* [X86] Use u8imm for the immediate type for all shift and rotate ↵Craig Topper2015-10-121-70/+70
| | | | | | instructions. This way the assembler will perform range checking. Believe this matches gas behavior. llvm-svn: 250016
* [X86] Add support to assembler and MCInst lowering to use the other vmovq ↵Craig Topper2015-10-122-24/+28
| | | | | | %xmmX, %xmmX encoding if it would be a shorter VEX encoding. llvm-svn: 250014
* [X86] Cleanup formatting a bit. NFCCraig Topper2015-10-121-14/+14
| | | | llvm-svn: 250013
* [X86] Change the immediate for IN/OUT instructions to u8imm so the assembly ↵Craig Topper2015-10-122-12/+12
| | | | | | parser will check the size. llvm-svn: 250012
* [X86] Add some instruction aliases to get the assembly parser table to favor ↵Craig Topper2015-10-122-63/+31
| | | | | | | | arithmetic instructions with 8-bit immediates over the forms that implicitly use the ax/eax/rax. This allows us to remove the explicit code for working around the existing priority. llvm-svn: 250011
* [X86] Fix CMP and TEST with al/ax/eax/rax to not mark EFLAGS as a use or ↵Craig Topper2015-10-111-27/+34
| | | | | | al/ax/eax/rax as a def. Probably doesn't have a functional effect since these aren't used in isel. llvm-svn: 249994
* [X86] Remove special validation for INT immediate operand from AsmParser. ↵Craig Topper2015-10-112-24/+1
| | | | | | | | Instead mark its operand type as u8imm which will cause it to fail to match. This is more consistent with other instruction behavior. This also fixes a bug where negative immediates below -128 were not being reported as errors. llvm-svn: 249989
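As an illustration of the kind of range check involved, here is a minimal sketch. It assumes an 8-bit immediate operand is accepted when the value fits in 8 bits as either an unsigned or a signed quantity; that acceptance rule and the helper names `fitsUnsigned`, `fitsSigned` and `isValidImm8` are assumptions made for this example, not the parser's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: does V fit in an N-bit unsigned field?
template <unsigned N> bool fitsUnsigned(int64_t V) {
  static_assert(N > 0 && N < 63, "width out of range for this sketch");
  return V >= 0 && V < (int64_t(1) << N);
}

// Hypothetical helper: does V fit in an N-bit two's complement field?
template <unsigned N> bool fitsSigned(int64_t V) {
  static_assert(N > 0 && N < 63, "width out of range for this sketch");
  return V >= -(int64_t(1) << (N - 1)) && V < (int64_t(1) << (N - 1));
}

// Assumed acceptance rule for an 8-bit immediate: anything in [-128, 255].
bool isValidImm8(int64_t V) { return fitsUnsigned<8>(V) || fitsSigned<8>(V); }
```

Under this rule -128 and 255 are accepted while -129 and 256 are rejected, so an immediate below -128 is diagnosed like any other out-of-range value.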
* [X86] Simplify immediate range checking code.Craig Topper2015-10-112-18/+13
| | | | llvm-svn: 249979
* [X86][XOP] Added support for the lowering of 128-bit vector integer ↵Simon Pilgrim2015-10-115-12/+62
| | | | | | | | comparisons to XOP PCOM/PCOMU instructions. The XOP vector integer comparisons can deal with all signed/unsigned comparison cases directly and can be easily commuted as well (D7646). llvm-svn: 249976
* Change isUIntN/isIntN calls with constant N to use the template version. NFCCraig Topper2015-10-102-13/+13
| | | | llvm-svn: 249952
* [SystemZ] Fixes in the backend I/R.Jonas Paulsson2015-10-102-3/+5
| | | | | | | | | | | | | | | expandPostRAPseudo(): STX -> 2 * STD: The first STD should not have the kill flag set for the address. SystemZElimCompare: BRC -> BRCT conversion: Don't forget to remove the CC<use,kill> operand. Needed to make SystemZ/asm-17.ll pass with -verify-machineinstrs, which now runs with this flag. Reviewed by Ulrich Weigand. llvm-svn: 249945
* Use range-based for loops. NFC.Craig Topper2015-10-101-17/+15
| | | | llvm-svn: 249941
* Use emplace_back instead of a constructor call and push_back. NFCCraig Topper2015-10-101-22/+18
| | | | llvm-svn: 249940
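For readers unfamiliar with the two cleanups above (range-based for loops and emplace_back), a small generic sketch follows. The `Feature` type and the function names are hypothetical and unrelated to the code the commits touched.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type used only for this illustration.
struct Feature {
  std::string Name;
  bool Enabled;
  Feature(std::string N, bool E) : Name(std::move(N)), Enabled(E) {}
};

// Old style: construct a temporary, then move it into the vector.
std::vector<Feature> buildWithPushBack() {
  std::vector<Feature> V;
  V.push_back(Feature("xsave", true));
  return V;
}

// New style: forward the constructor arguments and build the element in place.
std::vector<Feature> buildWithEmplaceBack() {
  std::vector<Feature> V;
  V.emplace_back("xsave", true);
  return V;
}

// Range-based for loop instead of an explicit iterator or index loop.
int countEnabled(const std::vector<Feature> &Features) {
  int Count = 0;
  for (const Feature &F : Features)
    if (F.Enabled)
      ++Count;
  return Count;
}
```

Both builders produce the same contents; the emplace_back form simply avoids spelling out the temporary.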
* [WinEH] Remove more dead codeDavid Majnemer2015-10-102-21/+8
| | | | | | wineh-parent is dead, so is ValueOrMBB. llvm-svn: 249920
* [WinEH] Delete the old landingpad implementation of Windows EHReid Kleckner2015-10-091-89/+16
| | | | | | | | | | | The new implementation works at least as well as the old implementation did. Also delete the associated preparation tests. They don't exercise interesting corner cases of the new implementation. All the codegen tests of the EH tables have already been ported. llvm-svn: 249918
* [WinEH] Insert the catchpad return before CSR restorationDavid Majnemer2015-10-091-18/+21
| | | | | | | | x64 catchpads use rax to inform the unwinder where control should go next. However, we must initialize rax before the epilogue sequence so as to not perturb the unwinder. llvm-svn: 249910
* Fix assert when emitting llvm.pow.f86.James Y Knight2015-10-091-5/+4
| | | | | | | | | | | This occurred due to introducing the invalid i64 type after type legalization had already finished, in an attempt to work around bitcast f64 -> v2i32 not doing constant folding. The *right* thing is to actually fix bitcast, but that has other complications. So, for now, just get rid of the broken workaround, and check in a test-case showing that it doesn't crash, with TODOs for emitting proper code. llvm-svn: 249908
* Fix assert in X86 backend.James Y Knight2015-10-091-8/+8
| | | | | | | | | | | | | | When running combine on an extract_vector_elt, it wants to look through a bitcast to check if the argument to the bitcast was itself an extract_vector_elt with particular operands. However, it called getOperand() on the argument to the bitcast *before* checking that the opcode was EXTRACT_VECTOR_ELT, assert-failing if there were zero operands for the actual opcode. Fix, and add trivial test. llvm-svn: 249891
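The ordering problem described above can be shown with a toy node type. This is only a sketch: `Node`, `Opcode` and `peekThroughBitcastToExtract` are hypothetical stand-ins and do not mirror the SelectionDAG classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Opcode { Constant, Bitcast, ExtractVectorElt };

// Toy DAG node: an opcode plus a list of operand pointers.
struct Node {
  Opcode Op;
  std::vector<const Node *> Operands;

  // Asserts on an out-of-range index, like the failure in the message above.
  const Node *getOperand(std::size_t I) const {
    assert(I < Operands.size() && "operand index out of range");
    return Operands[I];
  }
};

// If N is a bitcast of an extract-element node, return the vector being
// extracted from; otherwise return nullptr. The opcode of the bitcast source
// is checked BEFORE any of its operands are read, so a source with zero
// operands (such as a constant) never reaches getOperand().
const Node *peekThroughBitcastToExtract(const Node &N) {
  if (N.Op != Opcode::Bitcast)
    return nullptr;
  const Node *Src = N.getOperand(0);
  if (Src->Op != Opcode::ExtractVectorElt)
    return nullptr;
  return Src->getOperand(0);
}
```

Swapping the opcode check and the `Src->getOperand(0)` call would trip the assertion for a bitcast whose source has no operands.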
* [WebAssembly] Rename floating-point operators to match their spec names.Dan Gohman2015-10-091-6/+6
| | | | llvm-svn: 249859