path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* WebAssembly: update syntax | JF Bastien | 2015-10-16 | 1 | -31/+59
    Summary:
    Follow the same syntax as for the spec repo. Both have evolved slightly
    independently and need to converge again.

    This, along with wasmate changes, allows me to do the following:

      echo "int add(int a, int b) { return a + b; }" > add.c
      ./out/bin/clang -O2 -S --target=wasm32-unknown-unknown add.c -o add.wack
      ./experimental/prototype-wasmate/wasmate.py add.wack > add.wast
      ./sexpr-wasm-prototype/out/sexpr-wasm add.wast -o add.wasm
      ./sexpr-wasm-prototype/third_party/v8-native-prototype/v8/v8/out/Release/d8 -e "print(WASM.instantiateModule(readbuffer('add.wasm'), {print:print}).add(42, 1337));"

    As you'd expect, the d8 shell prints out the right value.

    Reviewers: sunfish
    Subscribers: jfb, llvm-commits, dschuff
    Differential Revision: http://reviews.llvm.org/D13712
    llvm-svn: 250480
* Revert "[safestack] Fast access to the unsafe stack pointer on AArch64/Android." | Evgeniy Stepanov | 2015-10-15 | 4 | -31/+9
    Breaks the hexagon buildbot.
    llvm-svn: 250461
* [safestack] Fast access to the unsafe stack pointer on AArch64/Android. | Evgeniy Stepanov | 2015-10-15 | 4 | -9/+31
    Android libc provides a fixed TLS slot for the unsafe stack pointer,
    and this change implements direct access to that slot on AArch64 via
    __builtin_thread_pointer() + offset.

    This change also moves more code into TargetLowering and its
    target-specific subclasses to get rid of target-specific codegen in
    SafeStackPass.

    This change does not touch the ARM backend because ARM lowers
    __builtin_thread_pointer as __aeabi_read_tp, which is not available on
    Android.
    llvm-svn: 250456
* x86: preserve flags when folding atomic operations | JF Bastien | 2015-10-15 | 1 | -15/+20
    D4796 taught LLVM to fold some atomic integer operations into a single
    instruction. The pattern was unaware that the instructions clobbered
    flags. I fixed some of this issue in D13680 but had missed INC/DEC.
    This patch adds the missing EFLAGS definition.
    llvm-svn: 250438
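
For illustration, the source pattern this kind of fold applies to is a relaxed atomic load, an integer operation, and a relaxed store back to the same location. The following is a minimal C++ sketch of that pattern, not code from the patch; the names `counter` and `bump` are hypothetical. Whether a given compiler actually emits a single memory-destination INC here depends on the compiler version and flags.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical counter, for illustration only.
std::atomic<int> counter{0};

// Relaxed load + add 1 + relaxed store to the same address. Per the commit
// message, x86 codegen may fold such a sequence into one instruction
// (INC/DEC in the case fixed here), and that instruction writes EFLAGS.
// Note that this is not an atomic read-modify-write.
int bump(int a, int b) {
  bool less = a < b;  // comparison whose result is consumed after the store
  counter.store(counter.load(std::memory_order_relaxed) + 1,
                std::memory_order_relaxed);
  return less ? 1 : 2;
}
```

If the folded instruction were not modeled as defining EFLAGS, the compiler could wrongly assume that flags set by the comparison are still live after the store, which is the class of bug the commit describes.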
* x86 FP atomic codegen: don't drop globals, stack | JF Bastien | 2015-10-15 | 1 | -5/+10
    Summary:
    x86 codegen is clever about generating good code for relaxed
    floating-point operations, but it was being silly when globals and
    immediates were involved, forgetting where the global was and
    loading/storing from/to the wrong place. The same applied to hard-coded
    address immediates.

    Don't let it forget about the displacement.

    This fixes https://llvm.org/bugs/show_bug.cgi?id=25171

    A very similar bug when doing floating-point atomics to the stack is
    also fixed by this patch.

    This fixes https://llvm.org/bugs/show_bug.cgi?id=25144

    Reviewers: pete
    Subscribers: llvm-commits, majnemer, rsmith
    Differential Revision: http://reviews.llvm.org/D13749
    llvm-svn: 250429
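
As a rough illustration of the two cases named above (a global and a stack slot), here is a minimal C++ sketch of relaxed floating-point load/add/store sequences. It is not taken from the patch or its tests; `g_value`, `add_to_global`, and `add_on_stack` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical global, for illustration only.
std::atomic<double> g_value{1.5};

// Relaxed load, floating-point add, relaxed store back to the same global.
// The address of g_value is the displacement that must not be dropped.
void add_to_global(double delta) {
  g_value.store(g_value.load(std::memory_order_relaxed) + delta,
                std::memory_order_relaxed);
}

// The same pattern on a stack-allocated atomic.
double add_on_stack(double start, double delta) {
  std::atomic<double> local{start};
  local.store(local.load(std::memory_order_relaxed) + delta,
              std::memory_order_relaxed);
  return local.load(std::memory_order_relaxed);
}
```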
* [mips][ias] Implement ulh macro. | Daniel Sanders | 2015-10-15 | 2 | -7/+13
    Summary:
    This macro is needed to prevent
    test/CodeGen/Mips/2008-08-01-AsmInline.ll from failing after the
    integrated assembler is enabled by default.

    Reviewers: vkalintiris
    Subscribers: llvm-commits, dsanders
    Differential Revision: http://reviews.llvm.org/D13654
    llvm-svn: 250414
* [NVPTX] Remove dead code. | Benjamin Kramer | 2015-10-15 | 9 | -222/+0
    I left helpers that look useful for debugging alone. NFC.
    llvm-svn: 250410
* [mips][mips16] MIPS16 is not a CPU/Architecture but is an ASE. | Daniel Sanders | 2015-10-15 | 2 | -2/+0
    Summary:
    The -mcpu=mips16 option caused the Integrated Assembler to crash
    because it couldn't figure out the architecture revision number to
    write to the .MIPS.abiflags section. This CPU definition has been
    removed because, like microMIPS, MIPS16 is an ASE to a base
    architecture.

    Reviewers: vkalintiris
    Subscribers: rkotler, llvm-commits, dsanders
    Differential Revision: http://reviews.llvm.org/D13656
    llvm-svn: 250407
* [X86] Rip out orphaned method declarations and other dead code. NFC. | Benjamin Kramer | 2015-10-15 | 7 | -53/+0
    llvm-svn: 250406
* AVX512: Implemented DAG lowering for shuff64x2/shufi64x2 instructions (shuffle packed values at 128-bit granularity) | Igor Breger | 2015-10-15 | 4 | -1/+133
    Differential Revision: http://reviews.llvm.org/D13648
    llvm-svn: 250400
* AVX512: Implemented encoding and intrinsics for vpternlogd/q. | Igor Breger | 2015-10-15 | 5 | -4/+98
    Differential Revision: http://reviews.llvm.org/D13768
    llvm-svn: 250396
* AVX-512: Fixed a bug in shuffle lowering in 32-bit mode | Elena Demikhovsky | 2015-10-15 | 1 | -6/+35
    AVX-512 shuffle lowering fails in 32-bit mode since we create a vector
    of 64-bit constants. In 32-bit mode, I split the 8 x 64-bit constant
    vector into a 16 x 32-bit one.
    Differential Revision: http://reviews.llvm.org/D13644
    llvm-svn: 250390
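
The splitting step described above can be sketched as follows. This is a simplified, standalone model and not the lowering code from the patch; the helper name `splitTo32` is hypothetical, and the low-half-first order is an assumption based on x86 being little-endian.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Split each 64-bit constant into two 32-bit halves, low half first,
// so that N x 64-bit elements become 2N x 32-bit elements
// (8 x 64-bit becomes 16 x 32-bit).
std::vector<uint32_t> splitTo32(const std::vector<uint64_t> &wide) {
  std::vector<uint32_t> narrow;
  narrow.reserve(wide.size() * 2);
  for (uint64_t v : wide) {
    narrow.push_back(static_cast<uint32_t>(v & 0xFFFFFFFFu));  // low 32 bits
    narrow.push_back(static_cast<uint32_t>(v >> 32));          // high 32 bits
  }
  return narrow;
}
```

The resulting vector occupies the same bytes in memory as the original on a little-endian target, which is why the narrower constants can stand in for the wide ones.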
* Don't pretend AMDGPU backend knows how to custom-lower UDIVREM for vector types; it can't | Artyom Skrobov | 2015-10-15 | 1 | -1/+1
    Reviewers: arsenm, jvesely, tstellarAMD
    Subscribers: arsenm, llvm-commits
    Differential Revision: http://reviews.llvm.org/D13734
    llvm-svn: 250384
* [mips][microMIPS] Implement DPA.W.PH, DPAQ_S.W.PH, DPAQ_SA.L.W, DPAQX_S.W.PH, DPAQX_SA.W.PH, DPAU.H.QBL, DPAU.H.QBR and DPAX.W.PH instructions | Zlatko Buljan | 2015-10-15 | 4 | -12/+51
    Differential Revision: http://reviews.llvm.org/D13376
    llvm-svn: 250382
* [mips][microMIPS] Implement BREAK16, LI16, MOVE16, SDBBP16, SUBU16 and XOR16 instructions | Hrvoje Varga | 2015-10-15 | 3 | -13/+68
    Differential Revision: http://reviews.llvm.org/D11292#inline-103143
    llvm-svn: 250381
* [mips][microMIPS] Implement LLE and SCE instructions | Hrvoje Varga | 2015-10-15 | 3 | -0/+37
    Differential Revision: http://reviews.llvm.org/D11630
    llvm-svn: 250379
* [mips][microMIPS] Implement LWLE, LWRE, SWLE and SWRE instructions | Hrvoje Varga | 2015-10-15 | 2 | -0/+26
    Differential Revision: http://reviews.llvm.org/D11631
    llvm-svn: 250377
* Test commit. | Hrvoje Varga | 2015-10-15 | 1 | -1/+0
    llvm-svn: 250367
* Add XSAVE/XSAVEOPT to KNL processor. | Craig Topper | 2015-10-15 | 1 | -0/+2
    llvm-svn: 250362
* [ARM] Make sure we do not dereference the end iterator when accessing debug information. | Quentin Colombet | 2015-10-15 | 1 | -2/+2
    Although the problem was always here, it would only be exposed when
    shrink-wrapping is enabled.
    rdar://problem/23110493
    llvm-svn: 250352
* [PowerPC] Fix invalid lxvdsx optimization (PR25157) | Bill Schmidt | 2015-10-14 | 1 | -0/+2
    PR25157 identifies a bug where a load plus a vector shuffle is
    incorrectly converted into an LXVDSX instruction. That optimization is
    only valid if the load is of a doubleword, and in the noted case, it
    was not. This corrects that problem.

    Joint patch with Eric Schweitz, who provided the bugpoint-reduced test
    case.
    llvm-svn: 250324
* [x86][FastISel] Teach how to select nontemporal stores. | Andrea Di Biagio | 2015-10-14 | 1 | -16/+34
    This patch teaches x86 fast-isel how to select nontemporal stores.

    On x86, we can use MOVNTI for nontemporal stores of
    doublewords/quadwords. Instructions (V)MOVNTPS/PD/DQ can be used for
    SSE2/AVX aligned nontemporal vector stores.

    Before this patch, fast-isel always selected 'movd/movq' instead of
    'movnti' for doubleword/quadword nontemporal stores. In the case of
    nontemporal stores of aligned vectors, fast-isel always selected
    movaps/movapd/movdqa instead of movntps/movntpd/movntdq.

    With this patch, if we use SSE2/AVX intrinsics for nontemporal stores
    we now always get the expected (V)MOVNT instructions. The lack of
    fast-isel support for nontemporal stores was spotted when analyzing the
    -O0 codegen for nontemporal stores.

    Differential Revision: http://reviews.llvm.org/D13698
    llvm-svn: 250285
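
To make the intrinsic side of this concrete, here is a minimal C++ sketch of nontemporal stores written with the SSE2 intrinsics `_mm_stream_si32` and `_mm_stream_pd`. It is not taken from the patch or its tests; the wrapper names are hypothetical, and the plain-store fallback for non-SSE2 targets is only there so the sketch compiles everywhere. Which instruction a compiler selects for these calls depends on the compiler version and optimization level.

```cpp
#include <cassert>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics; also pulls in _mm_sfence
#endif

// Doubleword nontemporal store (the MOVNTI case in the commit message).
void store_nt_i32(int *dst, int value) {
#if defined(__SSE2__)
  _mm_stream_si32(dst, value);
  _mm_sfence();  // order the nontemporal store before later accesses
#else
  *dst = value;  // fallback: ordinary store on non-SSE2 targets
#endif
}

// Aligned vector nontemporal store of two doubles (the MOVNTPD case).
// dst must be 16-byte aligned.
void store_nt_2xf64(double *dst, double lo, double hi) {
#if defined(__SSE2__)
  _mm_stream_pd(dst, _mm_set_pd(hi, lo));  // _mm_set_pd takes the high lane first
  _mm_sfence();
#else
  dst[0] = lo;
  dst[1] = hi;
#endif
}
```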
* [X86] Add XSAVE feature flags to their various processors. | Craig Topper | 2015-10-14 | 1 | -3/+23
    llvm-svn: 250268
* [WebAssembly] Remove a TODO comment which is no longer needed. NFC. | Dan Gohman | 2015-10-13 | 1 | -7/+0
    llvm-svn: 250233
* AMDGPU: Remove implicit ilist iterator conversions, NFC | Duncan P. N. Exon Smith | 2015-10-13 | 9 | -18/+17
    One of the changes in lib/Target/AMDGPU/AMDGPUMCInstLower.cpp was a new
    one. Previously, bundle iterators and single-instruction iterators
    could be compared to each other (comparing on underlying pointers). I
    changed a comparison from using `MBB->end()` to using
    `MBB->instr_end()`, since both end iterators should point at the same
    place anyway.

    I don't think the implicit conversion between the two iterator types is
    a good idea since it's fairly easy to accidentally compare to the wrong
    thing (they aren't always end iterators). Otherwise I would have just
    added the conversion.

    Even with that, there should be no functionality change here.
    llvm-svn: 250218
* AArch64: Remove implicit ilist iterator conversions, NFC | Duncan P. N. Exon Smith | 2015-10-13 | 6 | -15/+12
    llvm-svn: 250216
* [AArch64] Check the size of the vector before accessing its elements. | Akira Hatanaka | 2015-10-13 | 1 | -1/+1
    This fixes an assert in AArch64AsmParser::MatchAndEmitInstruction.
    rdar://problem/23081753
    llvm-svn: 250207
* function names should start with a lower case letter; NFC | Sanjay Patel | 2015-10-13 | 4 | -116/+116
    llvm-svn: 250174
* don't repeat function/class/variable names in comments; NFC | Sanjay Patel | 2015-10-13 | 1 | -60/+46
    llvm-svn: 250162
* Test commit | Christof Douma | 2015-10-13 | 1 | -1/+0
    llvm-svn: 250154
* Fix line-ending issue. NFC. | Michael Kuperstein | 2015-10-13 | 1 | -2/+2
    llvm-svn: 250151
* [X86] Mark the AAD and AAM aliases as not valid in 64-bit mode. | Craig Topper | 2015-10-13 | 1 | -2/+2
    llvm-svn: 250148
* [X86] Change all the i8imm operands in XOP instructions to u8imm so the parser will check the size. | Craig Topper | 2015-10-13 | 1 | -10/+10
    llvm-svn: 250147
* x86: preserve flags when folding atomic operations | JF Bastien | 2015-10-13 | 1 | -6/+8
    Summary:
    D4796 taught LLVM to fold some atomic integer operations into a single
    instruction. The pattern was unaware that the instructions clobbered
    flags.

    This patch adds the missing EFLAGS definition.

    Floating point operations don't set flags, the subsequent fadd
    optimization is therefore correct. The same applies for surrounding
    load/store optimizations.

    Reviewers: rsmith, rtrieu
    Subscribers: llvm-commits, reames, morisset
    Differential Revision: http://reviews.llvm.org/D13680
    llvm-svn: 250135
* AMDGPU: Refactor isVGPRToSGPRCopy | Matt Arsenault | 2015-10-13 | 1 | -19/+48
    It should now correctly handle physical registers and make it easier to
    identify the other direction.
    llvm-svn: 250132
* DAGCombiner: Combine extract_vector_elt from build_vector | Matt Arsenault | 2015-10-12 | 2 | -0/+13
    This basic combine was surprisingly missing. AMDGPU legalizes many
    operations in terms of 32-bit vector components, so not doing this
    results in many extra copies and subregister extracts that need to be
    cleaned up later.

    InstCombine already does this for the hasOneUse case. The target hook
    is there to fix a handful of tests which would otherwise break (e.g.
    ARM/vmov.ll), turning from a vector materialize-repeated-immediate
    instruction into a constant vector load with more scalar copies from
    it.
    llvm-svn: 250129
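
The combine itself is simple to state: extracting element i from a vector that was just built from scalars yields the i-th scalar directly. The following is a toy C++ model of that rewrite, not LLVM's SelectionDAG code; the `Node` type and all helper names are hypothetical, and the real combine also has to deal with types, the use-count condition, and the target hook mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy expression node; this is not LLVM's SDNode.
struct Node {
  enum Kind { Scalar, BuildVector, ExtractElt };
  Kind kind = Scalar;
  int value = 0;                                // payload for Scalar
  std::vector<std::shared_ptr<Node>> operands;  // elements, or the source vector
  std::size_t index = 0;                        // constant index for ExtractElt
};
using NodePtr = std::shared_ptr<Node>;

NodePtr scalar(int v) {
  auto n = std::make_shared<Node>();
  n->kind = Node::Scalar;
  n->value = v;
  return n;
}

NodePtr buildVector(std::vector<NodePtr> elts) {
  auto n = std::make_shared<Node>();
  n->kind = Node::BuildVector;
  n->operands = std::move(elts);
  return n;
}

NodePtr extractElt(NodePtr vec, std::size_t idx) {
  auto n = std::make_shared<Node>();
  n->kind = Node::ExtractElt;
  n->operands.push_back(std::move(vec));
  n->index = idx;
  return n;
}

// extract_vector_elt (build_vector a, b, c, ...), i  ->  operand i.
// Any other node is returned unchanged.
NodePtr combineExtract(const NodePtr &n) {
  if (n->kind != Node::ExtractElt)
    return n;
  const NodePtr &vec = n->operands[0];
  if (vec->kind == Node::BuildVector && n->index < vec->operands.size())
    return vec->operands[n->index];
  return n;
}
```

After the rewrite, the build_vector node may have no remaining users, which is how the extra copies and subregister extracts mentioned in the commit message go away.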
* Make Win64 localescape offsets FP relative instead of SP relative | Reid Kleckner | 2015-10-12 | 1 | -8/+2
    We made them SP relative back in March (r233137) because that's the
    value the runtime passes to EH functions. With the new cleanuppad IR,
    funclets adjust their frame argument from SP to FP, so our offsets
    should now be FP-relative.
    llvm-svn: 250088
* [x86] Fix wrong lowering of vsetcc nodes (PR25080). | Andrea Di Biagio | 2015-10-12 | 1 | -0/+29
    Function LowerVSETCC (in X86ISelLowering.cpp) worked under the wrong
    assumption that for non-AVX512 targets, the source type and destination
    type of a type-legalized setcc node were always the same type.

    This assumption was unfortunately incorrect; the type legalizer is not
    always able to promote the return type of a setcc to the same type as
    the first operand of a setcc.

    In the case of a vsetcc node, the legalizer firstly checks if the first
    input operand has a legal type. If so, then it promotes the return type
    of the vsetcc to that same type. Otherwise, the return type is promoted
    to the 'next legal type', which, for vectors of MVT::i1 is always a
    128-bit integer vector type.

    Example (-mattr=+avx):

      %0 = trunc <8 x i32> %a to <8 x i23>
      %1 = icmp eq <8 x i23> %0, zeroinitializer

    The initial selection dag for the code above is:

      v8i1 = setcc t5, t7, seteq:ch
        t5: v8i23 = truncate t2
          t2: v8i32,ch = CopyFromReg t0, Register:v8i32 %vreg1
        t7: v8i32 = build_vector of all zeroes.

    The type legalizer would firstly check if 't5' has a legal type. If so,
    then it would reuse that same type to promote the return type of the
    setcc node. Unfortunately 't5' is of illegal type v8i23, and therefore
    it cannot be used to promote the return type of the setcc node.
    Consequently, the setcc return type is promoted to v8i16. Later on,
    't5' is promoted to v8i32 thus leading to the following dag node:

      v8i16 = setcc t32, t25, seteq:ch

    where t32 and t25 are now values of type v8i32.

    Before this patch, function LowerVSETCC would have wrongly expanded the
    setcc to a single X86ISD::PCMPEQ. Surprisingly, ISel was still able to
    match an instruction. In our case, ISel would have matched a
    VPCMPEQWrr:

      t37: v8i16 = X86ISD::VPCMPEQWrr t36, t25

    However, t36 and t25 are both VR256, while the result type is instead
    of class VR128. This inconsistency ended up causing the insertion of
    COPY instructions like this:

      %vreg7<def> = COPY %vreg3; VR128:%vreg7 VR256:%vreg3

    Which is an invalid full copy (not a sub register copy). Eventually,
    the backend would have hit an UNREACHABLE "Cannot emit physreg copy
    instruction" in the attempt to expand the malformed pseudo COPY
    instructions.

    This patch fixes the problem adding the missing logic in LowerVSETCC to
    handle the corner case of a setcc with 128-bit return type and 256-bit
    operand type.

    This problem was originally reported by Dimitry as PR25080. It has been
    latent for a very long time. I have added the minimal reproducible from
    that bugzilla as test setcc-lowering.ll.

    Differential Revision: http://reviews.llvm.org/D13660
    llvm-svn: 250085
* combine predicates; NFCI | Sanjay Patel | 2015-10-12 | 1 | -4/+1
    llvm-svn: 250075
* AMDGPU: Register some more passes so -print-before works | Matt Arsenault | 2015-10-12 | 1 | -0/+2
    llvm-svn: 250071
* fix typos; NFC | Sanjay Patel | 2015-10-12 | 1 | -3/+2
    llvm-svn: 250059
* [mips][micromips] Initial support for micromips DSP instructions and addu.qb implementation | Zoran Jovanovic | 2015-10-12 | 10 | -6/+85
    Differential Revision: http://reviews.llvm.org/D12798
    llvm-svn: 250058
* [mips][FastISel] Clang-format switch statement. NFC. | Vasileios Kalintiris | 2015-10-12 | 1 | -10/+10
    llvm-svn: 250053
* fix capitalization; NFC | Sanjay Patel | 2015-10-12 | 1 | -2/+2
    llvm-svn: 250049
* [mips][ias] Implement macro expansion when bcc has an immediate where a register belongs. | Daniel Sanders | 2015-10-12 | 2 | -2/+126
    Summary: Fixes PR24915.
    Reviewers: vkalintiris
    Subscribers: emaste, seanbruno, llvm-commits
    Differential Revision: http://reviews.llvm.org/D13533
    llvm-svn: 250042
* [mips] Clean up most macro expansions to use the emit*() functions. | Daniel Sanders | 2015-10-12 | 1 | -287/+163
    Reviewers: vkalintiris
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D13591
    llvm-svn: 250040
* [mips] Handle undef when extracting subregs from FP64 registers. | Daniel Sanders | 2015-10-12 | 1 | -4/+12
    Summary:
    This removes unnecessary instructions when extracting from an undefined
    register and also fixes a crash for O32 when passing undef to a double
    argument held in integer registers.

    Reviewers: vkalintiris
    Subscribers: llvm-commits, zoran.jovanovic, petarj
    Differential Revision: http://reviews.llvm.org/D13467
    llvm-svn: 250039
* [ARM] Mark Swift MISched model as incomplete | James Molloy | 2015-10-12 | 1 | -0/+1
    The Swift Machine Scheduler Model is incomplete. There are instructions
    missing which can trigger the "incomplete machine model" abort. This
    was observed when a downstream SchedMachineModel was added to the ARM
    target.

    Patch by Christof Douma!
    llvm-svn: 250033
* [X86] Add XSAVE intrinsic family | Amjad Aboud | 2015-10-12 | 5 | -23/+79
    Add intrinsics for the
      XSAVE instructions (XSAVE/XSAVE64/XRSTOR/XRSTOR64)
      XSAVEOPT instructions (XSAVEOPT/XSAVEOPT64)
      XSAVEC instructions (XSAVEC/XSAVEC64)
      XSAVES instructions (XSAVES/XSAVES64/XRSTORS/XRSTORS64)
    Differential Revision: http://reviews.llvm.org/D13012
    llvm-svn: 250029
* [x86] PR24562: fix incorrect folding of PSHUFB nodes with a mask where all indices have the most significant bit set. | Andrea Di Biagio | 2015-10-12 | 1 | -3/+15
    This patch fixes a problem in function 'combineX86ShuffleChain' that
    causes a chain of shuffles to be wrongly folded away when the combined
    shuffle mask has only one element.

    We may end up with a combined shuffle mask of one element as a result
    of multiple calls to function 'canWidenShuffleElements()'. Function
    canWidenShuffleElements attempts to simplify a shuffle mask by widening
    the size of the elements being shuffled. For every pair of shuffle
    indices, function canWidenShuffleElements checks if indices refer to
    adjacent elements. If all pairs refer to "adjacent" elements then the
    shuffle mask is safely widened. As a consequence of widening, we end up
    with a new shuffle mask which is half the size of the original shuffle
    mask.

    The byte shuffle (pshufb) from test pr24562.ll has a mask of all
    SM_SentinelZero indices. Function canWidenShuffleElements would combine
    each pair of SM_SentinelZero indices into a single SM_SentinelZero
    index. So, in a logarithmic number of steps (4 in this case), the
    pshufb mask is simplified to a mask with only one index which is equal
    to SM_SentinelZero.

    Before this patch, function combineX86ShuffleChain wrongly assumed that
    a mask of size one is always equivalent to an identity mask. So, the
    entire shuffle chain was just folded away as the combined shuffle mask
    was treated as a no-op mask.

    With this patch we now check if the only element of a combined shuffle
    mask is SM_SentinelZero. In that case, we propagate a zero vector.

    Differential Revision: http://reviews.llvm.org/D13364
    llvm-svn: 250027
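
The widening argument above can be reproduced with a small standalone model. The C++ sketch below is not LLVM's canWidenShuffleElements or combineX86ShuffleChain; the function names, the simplified pairing rules, and the sentinel values (-1 for undef, -2 for zero) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sentinel mask values for this sketch (assumed values, for illustration).
constexpr int SentinelUndef = -1;
constexpr int SentinelZero = -2;

// Try to halve the mask by merging each pair of indices into one wider
// index. Returns false if some pair cannot be merged. Simplified: only
// zero/zero, undef/undef, and "even index followed by its successor"
// pairs are accepted.
bool widenMask(const std::vector<int> &mask, std::vector<int> &widened) {
  if (mask.size() < 2 || mask.size() % 2 != 0)
    return false;
  widened.clear();
  for (std::size_t i = 0; i < mask.size(); i += 2) {
    int lo = mask[i], hi = mask[i + 1];
    if (lo == SentinelZero && hi == SentinelZero)
      widened.push_back(SentinelZero);
    else if (lo == SentinelUndef && hi == SentinelUndef)
      widened.push_back(SentinelUndef);
    else if (lo >= 0 && lo % 2 == 0 && hi == lo + 1)
      widened.push_back(lo / 2);
    else
      return false;
  }
  return true;
}

// Widen repeatedly until no further step succeeds. Returns the number of
// successful steps and leaves the final mask in `mask`.
int widenFully(std::vector<int> &mask) {
  int steps = 0;
  std::vector<int> next;
  while (widenMask(mask, next)) {
    mask = next;
    ++steps;
  }
  return steps;
}

// The check the commit message describes: a one-element mask is an
// identity only if that element is a real index, not a zero sentinel.
bool isSingleElementIdentity(const std::vector<int> &mask) {
  return mask.size() == 1 && mask[0] == 0;
}
```

In this model, a 16-element mask of all zero sentinels shrinks 16, 8, 4, 2, 1 in four steps and ends as a single zero sentinel, so it must produce a zero vector rather than be treated as a no-op.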