summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* [MIPS GlobalISel] remove superfluous #includes (NFC)Petar Jovanovic2018-04-125-14/+1
| | | | | | | Remove superfluous #includes. Minor code style change in MipsCallLowering::lowerFormalArguments(). llvm-svn: 329926
* [AArch64] Move AFI->setRedZone(false) to top of emitPrologueJessica Paquette2018-04-121-1/+5
| | | | | | | | | | | | | AFI->setRedZone(false) was put in the wrong place before, and so it only fired on functions that didn't have stack frames. This moves that to the top of emitPrologue to make sure that every function without a redzone has it set correctly. This also adds a function representing one of the early exit cases (GHC calling convention) to the MachineOutliner noredzone test to ensure that we can outline from functions like these, where we never use a redzone. llvm-svn: 329922
* [mips] Correct the predicates of the load/store (double)word for coprocessor 3.Simon Dardis2018-04-121-4/+6
| | | | llvm-svn: 329913
* [X86] Remove AES/CLMUL/CRC32/LDDQU/MOVNT/POPCNT/SHA schedule itineraries ↵Simon Pilgrim2018-04-124-100/+70
| | | | | | (PR37093) llvm-svn: 329912
* [AArch64][AsmParser] Unify 'addVectorListOperands' functions.Sander de Smalen2018-04-122-38/+34
| | | | | | | | | | | | | | | | | | | | Summary: Merged 'addVectorList64Operands' and 'addVectorList128Operands' into a generic 'addVectorListOperands', which can be easily extended to work for SVE vectors. This is patch [4/6] in a series to add assembler/disassembler support for SVE's contiguous ST1 (scalar+imm) instructions. Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro Reviewed By: rengolin Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45430 llvm-svn: 329909
* [X86] Remove remaining system/special schedule itineraries (PR37093)Simon Pilgrim2018-04-125-540/+381
| | | | llvm-svn: 329906
* [mips] Correct the predicates for special nops, tlb ctrl instrs, software ↵Simon Dardis2018-04-122-25/+34
| | | | | | | | | | breakpoint and prefx. Reviewers: atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D44436 llvm-svn: 329905
* [X86] Remove system/control schedule itineraries (PR37093)Simon Pilgrim2018-04-128-527/+356
| | | | llvm-svn: 329903
* [AArch64][AsmParser] Make parse function for VectorLists generic to other ↵Sander de Smalen2018-04-122-74/+100
| | | | | | | | | | | | | | | | | | | | | | | | vector types. Summary: Added 'RegisterKind' to the VectorListOp structure, so that this operand type can be reused for SVE vector lists in a later patch. It also refactors the 'tryParseVectorList' function so it can be used directly in the ParserMethod of an operand. The parsing can now parse multiple kinds of vectors and recover if there is no match. This is patch [3/6] in a series to add assembler/disassembler support for SVE's contiguous ST1 (scalar+imm) instructions. Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro Reviewed By: rengolin Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45429 llvm-svn: 329900
* [RISCV] Change function alignment to 4 bytes, and 2 bytes for RVCShiva Chen2018-04-121-2/+3
| | | | | | | | | | | | | | | | | | Summary: According RISC-V ELF psABI specification, base RV32 and RV64 ISAs only allow 32-bit instruction alignment, but instruction allow to be aligned to 16-bit boundaries for C-extension. So we just align to 4 bytes and 2 bytes for C-extension is enough. Reviewers: asb, apazos Differential Revision: https://reviews.llvm.org/D45560 Patch by Kito Cheng. llvm-svn: 329899
* [X86] Remove CMOV/SETCC schedule itineraries (PR37093)Simon Pilgrim2018-04-122-26/+13
| | | | llvm-svn: 329898
* [X86] Remove MMX/3DNow schedule itineraries (PR37093)Simon Pilgrim2018-04-124-387/+226
| | | | llvm-svn: 329896
* [X86] Remove X87 schedule itineraries (PR37093)Simon Pilgrim2018-04-123-172/+101
| | | | | | First of a number of commits to remove x86 schedule itineraries entirely - approved off-line with @craig.topper llvm-svn: 329893
* [SystemZ] Use ResourceCycles=30 for FPd unit (NFC).Jonas Paulsson2018-04-122-22/+4
| | | | | | | This is better than listing FPd 30 times :-) Review: Ulrich Weigand llvm-svn: 329887
* [SystemZ] Remove FullInstRWOverlapCheck from SchedMachineModels.Jonas Paulsson2018-04-124-18/+10
| | | | | | | | This is NFC, even though it caught just a few cases of overlapping regular expressions. Review: Ulrich Weigand llvm-svn: 329886
* [HexagonMachineScheduler] Remove local (copied) getWeakLeft().Jonas Paulsson2018-04-121-4/+0
| | | | | | | Since the common code getWeakLeft() is now available, there should not be a local copy of this function in target. llvm-svn: 329885
* [MachineScheduler] NFC refactoringJonas Paulsson2018-04-121-26/+32
| | | | | | | | | | | | | | | | This patch makes tryCandidate() virtual and some utility functions like tryLess(), tryGreater(), ... externally available (used to be static). This makes it possible for a target to derive a new MachineSchedStrategy from GenericScheduler and reuse most parts. It was necessary to wrap functions with the same names in AMDGPU/SIMachineScheduler in a local namespace. Review: Andy Trick, Florian Hahn https://reviews.llvm.org/D43329 llvm-svn: 329884
* [RISCV] Codegen support for RV32D floating point comparison operationsAlex Bradbury2018-04-123-11/+37
| | | | | | | | Also add double-prevoius-failure.ll which captures a test case that at one point triggered a compiler crash, while developing calling convention support for f64 on RV32D with soft-float ABI. llvm-svn: 329877
* [RISCV] Codegen support for RV32D floating point conversion operationsAlex Bradbury2018-04-122-0/+15
| | | | | | | This also includes support and a test for truncating stores, which are now possible thanks to the fpround pattern. llvm-svn: 329876
* [RISCV] Add codegen support for RV32D floating point arithmetic operationsAlex Bradbury2018-04-122-1/+33
| | | | llvm-svn: 329874
* [RISCV] Codegen support for RV32D floating point load/store, fadd.d, calling ↵Alex Bradbury2018-04-126-22/+327
| | | | | | | | | | | | | | | | conv fadd.d is required in order to force floating point registers to be used in test code, as parameters are passed in integer registers in the soft float ABI. Much of this patch is concerned with support for passing f64 on RV32D with a soft-float ABI. Similar to Mips, introduce pseudoinstructions to build an f64 out of a pair of i32 and to split an f64 to a pair of i32. BUILD_PAIR and EXTRACT_ELEMENT can't be used, as a BITCAST to i64 would be necessary, but i64 is not a legal type. llvm-svn: 329871
* Test commit accessYan Luo2018-04-121-1/+1
| | | | llvm-svn: 329870
* [X86] Remove unused itinerary argument from FMA3/FMA4/XOP instructions. NFCI.Simon Pilgrim2018-04-111-20/+20
| | | | llvm-svn: 329862
* [LLVM-C] Add LLVMGetHostCPU{Name,Features}.whitequark2018-04-111-0/+16
| | | | | | | | | | | | | | Without these functions it's hard to create a TargetMachine for Orc JIT that creates efficient native code. It's not sufficient to just expose LLVMGetHostCPUName(), because for some CPUs there's fewer features actually available than the CPU name indicates (e.g. AVX might be missing on some CPUs identified as Skylake). Differential Revision: https://reviews.llvm.org/D44861 llvm-svn: 329856
* [PowerPC] Fix condition for 64-bit rotate when replacing r+r instr with r+iNemanja Ivanovic2018-04-111-1/+2
| | | | | | | | | | This patch fixes https://bugs.llvm.org/show_bug.cgi?id=37039 The condition only covers one of the two 64-bit rotate instructions. This just adds the second (RLDICLo). Patch by Josh Stone. llvm-svn: 329852
* bpf: signal error instead of silent drop for certain invalid asm insnYonghong Song2018-04-111-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, an invalid asm insn, either in an asm file or in an inline asm format, might be silently dropped. This patch fixed two places where this may happen by signaling the error so user knows what goes wrong. The following is an example to demonstrate error messages: -bash-4.2$ cat t.c int test(void *ctx) { #if defined(NO_ERROR) asm volatile("r0 = *(u16 *)skb[%0]" : : "i"(2)); #elif defined(ERROR_1) asm volatile("r20 = *(u16 *)skb[%0]" : : "i"(2)); #elif defined(ERROR_2) asm volatile("r0 = *(u16 *)(r1 + ?)" : :); #endif return 0; } -bash-4.2$ cat run.sh for macro in NO_ERROR ERROR_1 ERROR_2; do echo "===== compile for macro" $macro clang -D${macro} -O2 -target bpf -emit-llvm -S t.c echo "==llc==" llc -march=bpf -filetype=obj t.ll done -bash-4.2$ ./run.sh ===== compile for macro NO_ERROR ==llc== ===== compile for macro ERROR_1 ==llc== <inline asm>:1:2: error: invalid register/token name r20 = *(u16 *)skb[2] ^ note: !srcloc = 135 ===== compile for macro ERROR_2 ==llc== <inline asm>:1:21: error: unexpected token r0 = *(u16 *)(r1 + ?) ^ note: !srcloc = 210 -bash-4.2$ Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 329849
* [X86] Describe wbnoinvd instructionGabor Buella2018-04-115-0/+15
| | | | | | | | | | | | | | | Similar to the wbinvd instruction, except this one does not invalidate caches. Ring 0 only. The encoding matches a wbinvd instruction with an F3 prefix. Reviewers: craig.topper, zvi, ashlykov Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D43816 llvm-svn: 329847
* [X86][Atom] Convert Atom scheduler model to SchedRW (PR32431)Simon Pilgrim2018-04-111-533/+785
| | | | | | | | | | | | | | | | Atom is the only x86 target that still uses schedule itineraries, if we can remove this then we can begin the work on removing x86 itineraries. I've also found that it will help with PR36550. I've focussed on matching the existing model as closely as possible (relying on the schedule tests), PR36895 indicated a lot of these were incorrect but we can just as easily fix these after this patch as before. Hopefully we can get llvm-exegesis to help here, There are a few instructions that rely on itinerary scheduling (mainly push/pop/return) of multiple resource stages, but I don't think any of these are show stoppers. There are also a few codegen changes that seem related to the post-ra scheduler acting a little differently, I haven't tracked these down but they don't seem critical. NOTE: I don't have access to any Atom hardware, so this hasn't been tested in the wild. Differential Revision: https://reviews.llvm.org/D45486 llvm-svn: 329837
* [X86] Generalize X86PadShortFunction to work with TargetSchedModelSimon Pilgrim2018-04-111-14/+10
| | | | | | | | | | Pre-commit for D45486, don't rely on itinerary scheduler model to determine latencies for padding, use the generic TargetSchedModel::computeInstrLatency call. Also, replace hard coded (atom specific) 2*uop creation per padding cycle with a version based on the scheduler model's issue width. Differential Revision: https://reviews.llvm.org/D45486 llvm-svn: 329834
* [NVPTX] Removed 'satom' feature which is no longer used.Artem Belevich2018-04-112-11/+4
| | | | | | Differential Revision: https://reviews.llvm.org/D45061 llvm-svn: 329830
* [NVPTX, CUDA] Improved feature constraints on NVPTX target builtins.Artem Belevich2018-04-111-1/+1
| | | | | | | | | | When NVPTX TARGET_BUILTIN specifies sm_XX or ptxYY as required feature, consider those features available if we're compiling for GPU >= sm_XX or have enabled PTX version >= ptxYY. Differential Revision: https://reviews.llvm.org/D45061 llvm-svn: 329829
* [AMDGPU] Ensure there are enough registers for wave dispatchTim Renouf2018-04-111-0/+13
| | | | | | | | | | | | | | | | | Summary: This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to allow for registers set up in wave dispatch, even if those registers are not used in the shader. Re-landed after noticing that the buildbot failure from 329808 seemed to be unrelated. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45503 Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771 llvm-svn: 329826
* [MIPS GlobalISel] Select add i32, i32Petar Jovanovic2018-04-1111-13/+406
| | | | | | | | | | | | | Add the minimal support necessary to lower a function that returns the sum of two i32 values. Support argument/return lowering of i32 values through registers only. Add tablegen for regbankselect and instructionselect. Patch by Petar Avramovic. Differential Revision: https://reviews.llvm.org/D44304 llvm-svn: 329819
* [AMDGPU] Fix lowering enqueue_kernelYaxun Liu2018-04-111-20/+30
| | | | | | | | | | | | | | | | | | Two issues were fixed: runtime has difficulty to allocate memory for an external symbol of a kernel and set the address of the external symbol, therefore make the runtime handle of an enqueued kernel an ordinary global variable. Runtime only needs to store the address of the loaded kernel to the handle and has verified that this approach works. handle the situation where __enqueue_kernel* gets inlined therefore the enqueued kernel may be used through a constant expr instead of an instruction. Differential Revision: https://reviews.llvm.org/D45187 llvm-svn: 329815
* Revert "[AMDGPU] Ensure there are enough registers for wave dispatch"Tim Renouf2018-04-111-13/+0
| | | | | | | | | This reverts 329808. That change caused a report of a failure in test/CodeGen/MIR/AMDGPU/mir-canon-multi.mir that I didn't see. I suspect it is an expensive-check-only error. Change-Id: I8133f26f15e7d5ec2b09c687c12cd70e918461b0 llvm-svn: 329811
* [AArch64][AsmParser] Split index parsing from vector list.Sander de Smalen2018-04-111-27/+23
| | | | | | | | | | | | | | | | | | | | Summary: Place parsing of a vector index into a separate function to reduce duplication, since the code is duplicated in both the parsing of a Neon vector register operand and a Neon vector list. This is patch [2/6] in a series to add assembler/disassembler support for SVE's contiguous ST1 (scalar+imm) instructions. Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro Reviewed By: rengolin Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45428 llvm-svn: 329809
* [AMDGPU] Ensure there are enough registers for wave dispatchTim Renouf2018-04-111-0/+13
| | | | | | | | | | | | | | Summary: This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to allow for registers set up in wave dispatch, even if those registers are not used in the shader. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45503 Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771 llvm-svn: 329808
* [X86] Add variable shuffle schedule classesSimon Pilgrim2018-04-1113-119/+55
| | | | | | | | | | | | | | Split variable index shuffles from immediate index shuffles WriteFVarShuffle - variable 'in-lane' shuffles (VPERMILPS/VPERMIL2PS etc.) WriteVarShuffle - variable 'in-lane' shuffles (PSHUFB/VPPERM etc.) WriteFVarShuffle256 - variable 'cross-lane' shuffles (VPERMPS etc.) WriteVarShuffle256 - variable 'cross-lane' shuffles (VPERMD etc.) Differential Revision: https://reviews.llvm.org/D45404 llvm-svn: 329806
* [AMDGPU][MC][GFX9] Added v_screen_partition_4se_b32Dmitry Preobrazhensky2018-04-111-4/+30
| | | | | | | | | See bug 36845: https://bugs.llvm.org/show_bug.cgi?id=36845 Differential Revision: https://reviews.llvm.org/D45443 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329801
* [AArch64] Fix regression after r329691Francis Visoiu Mistrih2018-04-111-1/+1
| | | | | | | | | | | | In r329691, we would choose FP even if the offset wouldn't fit, just because the offset is smaller than the one from BP. This made many accesses through FP need to scavenge a register, which resulted in slower and bigger code for no good reason. This patch now always picks the offset that fits first, even if FP is preferred. llvm-svn: 329797
* [ARM] FP16 VSEL codegenSjoerd Meijer2018-04-111-4/+10
| | | | | | | | | | | | | This is a follow up of rL327695 to instruction select more variants of VSELGT and VSELGE, for which it is necessary to custom lower SELECT. More work is required in this area, which will be addressed soon: - more variants need to be regression tested, but this depends on the next point. - first LowerConstantFP need to be adjusted for fp16 values. Differential Revision: https://reviews.llvm.org/D45205 llvm-svn: 329788
* [AArch64][AsmParser] Unify code for parsing Neon/SVE vectors.Sander de Smalen2018-04-112-147/+161
| | | | | | | | | | | | | | | | | | | | | | Summary: Merged 'tryMatchVectorRegister' (specific to Neon) and 'tryParseSVERegister' into a single 'tryParseVectorRegister' function, and created a generic 'parseVectorKind()' function that returns the #Elements and ElementWidth of a vector suffix. This reduces the duplication of this functionality between two the vector implementations. This is patch [1/6] in a series to add assembler/disassembler support for SVE's contiguous ST1 (scalar+imm) instructions. Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro Reviewed By: fhahn Subscribers: tschuett, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D45427 llvm-svn: 329782
* [X86] Remove 128/256-bit masked pmaddubsw and pmaddwd intrinsics. Replace ↵Craig Topper2018-04-111-12/+4
| | | | | | | | 512-bit masked intrinsic with unmasked intrinsic and a select. The 128/256-bit versions were no longer used by clang. It uses the legacy SSE/AVX2 version and a select. The 512-bit was changed to the same for consistency. llvm-svn: 329774
* [X86] In X86FlagsCopyLowering, when rewriting a memory setcc we need to emit ↵Craig Topper2018-04-111-3/+22
| | | | | | | | | | an explicit MOV8mr instruction. Previously the code only knew how to handle setcc to a register. This should fix a crash in the chromium build. llvm-svn: 329771
* GOTPCREL references must always use RIP.Sriraman Tallam2018-04-102-3/+9
| | | | | | | | With -fno-plt, global value references can use GOTPCREL and RIP must be used. Differential Revision: https://reviews.llvm.org/D45460 llvm-svn: 329765
* AMDGPU: enable 128-bit for local addr space under an optionMarek Olsak2018-04-105-12/+17
| | | | | | | | | | | | | | | | | | | Author: Samuel Pitoiset ds_read_b128 and ds_write_b128 have been recently enabled under the amdgpu-ds128 option because the performance benefit is unclear. Though, using 128-bit loads/stores for the local address space appears to introduce regressions in tessellation shaders. Not sure what is broken, but as ds_read_b128/ds_write_b128 are not enabled by default, just introduce a global option and enable 128-bit only if requested (until it's fixed/used correctly). v2: - fix regressions in merge-stores.ll and multiple_tails.ll Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 llvm-svn: 329764
* [AArch64][Falkor] Fix bug in Falkor HWPF collision avoidance pass.Geoff Berry2018-04-101-0/+18
| | | | | | | | | | | | | | | | Summary: When inserting MOVs to avoid Falkor HWPF collisions, the non-base register operand of load instructions (e.g. a register offset) was not being considered live, so it could potentially have been used as a scratch register, clobbering the actual offset value. Reviewers: mcrosier Subscribers: rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45502 llvm-svn: 329761
* Recommit r329716 "Add missing nullptr check before getSection() to ↵Jessica Paquette2018-04-101-0/+9
| | | | | | | | | | | | | AArch64MachObjectWriter::recordRelocation" This commit fixes the bot failures that were coming up before with r329716. The fix was to move the check for "isInSection()" inside of the if condition and emit the error there instead of waiting to get past the unreachable statement. This should work in debug and release builds now. llvm-svn: 329746
* [AArch64] Fix isel failure when BUILD_PAIR nodes are left over.Amara Emerson2018-04-101-0/+2
| | | | | | rdar://39175175 llvm-svn: 329743
* [X86] Split up -march=icelake to -client & -serverGabor Buella2018-04-102-6/+16
| | | | | | | | | | Reviewers: craig.topper, zvi, echristo Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45055 llvm-svn: 329742
OpenPOWER on IntegriCloud