summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* [SystemZ] Add decimal floating-point instructionsUlrich Weigand2017-05-309-1/+666
| | | | | | | | | This adds assembler / disassembler support for the decimal floating-point instructions. Since LLVM does not yet have support for decimal float types, these cannot be used for codegen at this point. llvm-svn: 304203
* [SystemZ] Add hexadecimal floating-point instructionsUlrich Weigand2017-05-307-18/+604
| | | | | | | | This adds assembler / disassembler support for the hexadecimal floating-point instructions. Since the Linux ABI does not use any hex float data types, these are not useful for codegen. llvm-svn: 304202
* [mips] Expansion of LI.S and LI.DZoran Jovanovic2017-05-303-39/+398
| | | | | | | | | Author: smaksimovic Reviewers: dsanders sdardis Introduces LI.S and LI.D pseudo instructions with floating point operands. Differential Revision: https://reviews.llvm.org/D14390 llvm-svn: 304198
* Fix PR33031: correct the estimate of maximum offset for instructions ↵Kristof Beyls2017-05-301-5/+30
| | | | | | spilling/filling the stack. llvm-svn: 304196
* [SystemZ] Improve buildVector() in SystemZISelLowering.cpp.Jonas Paulsson2017-05-291-19/+41
| | | | | | | | Use VLREP when inserting one or more loads into a vector. This is more efficient than to first load and then use a VLVGP. Review: Ulrich Weigand llvm-svn: 304152
* [Nios2] Target registrationNikolai Bozhenov2017-05-2917-0/+571
| | | | | | | | | | | | | Reviewers: craig.topper, hfinkel, joerg, lattner, zvi Reviewed By: craig.topper Subscribers: oren_ben_simhon, igorb, belickim, tvvikram, mgorny, llvm-commits, pavel.v.chupin, DavidKreitzer Differential Revision: https://reviews.llvm.org/D32669 Patch by AndreiGrischenko <andrei.l.grischenko@intel.com> llvm-svn: 304144
* [ARM] GlobalISel: Extract helper. NFCI.Diana Picus2017-05-291-29/+34
| | | | | | | | Create a helper to deal with the common code for merging incoming values together after they've been split during call lowering. There's likely more stuff that can be commoned up here, but we'll leave that for later. llvm-svn: 304143
* [ARM] GlobalISel: Support array returnsDiana Picus2017-05-291-3/+25
| | | | | | | These are a bit rare in practice, but they don't require anything special compared to array parameters, so support them as well. llvm-svn: 304137
* [PPC] Fix assertion failure during binary encoding with -mcpu=pwr9Hiroshi Inoue2017-05-292-4/+18
| | | | | | | | | | | | | Summary clang -c -mcpu=pwr9 test/CodeGen/PowerPC/build-vector-tests.ll causes an assertion failure during the binary encoding. The failure occurs when a D-form load instruction takes two register operands instead of a register + an immediate. This patch fixes the problem and also adds an assertion to catch this failure earlier before the binary encoding (i.e. during lit test). The fix is from Nemanja Ivanovic @nemanjai. Differential Revision: https://reviews.llvm.org/D33482 llvm-svn: 304133
* [ARM] GlobalISel: Support array parameters/argumentsDiana Picus2017-05-292-14/+71
| | | | | | | | | Clang coerces structs into arrays, so it's a good idea to support them. Most of the support boils down to getting the splitToValueTypes helper to actually split types. We then use G_INSERT/G_EXTRACT to deal with the parts. llvm-svn: 304132
* Resubmit "[X86] Adding new LLVM TableGen backend that generates the X86 ↵Zachary Turner2017-05-292-3398/+3
| | | | | | | | | | | | | backend memory folding tables." This was reverted due to buildbot breakages and I was not familiar with this code to investigate it. But while trying to get a useful backtrace for the author, it turns out the fix was very obvious. Resubmitting this patch as is, and will submit the fix in a followup so that the fix is not hidden in the larger CL. llvm-svn: 304122
* Revert "[X86] Adding new LLVM TableGen backend that generates the X86 ↵Zachary Turner2017-05-292-3/+3398
| | | | | | | | | | | | | backend memory folding tables." This reverts commit 28cb1003507f287726f43c771024a1dc102c45fe as well as all subsequent followups. llvm-tblgen currently segfaults with this change, and it seems it has been broken on the bots all day with no fixes in preparation. See, for example: http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/ llvm-svn: 304121
* [AVR] Remove SREG from CPI's Uses; authored by Florian ZeitzDylan McKay2017-05-291-1/+0
| | | | | | | | | | | | | | Summary: CPI does not read the status register, but only writes it. Reviewers: dylanmckay Reviewed By: dylanmckay Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33223 llvm-svn: 304116
* [AArch64][Falkor] Combine sched details files into one. NFC.Geoff Berry2017-05-282-514/+503
| | | | llvm-svn: 304109
* [AArch64][Falkor] Fix some sched details.Geoff Berry2017-05-284-294/+461
| | | | | | | | | | | | | | | | | | | | - Remove all uses of base sched model entries and set them all to Unsupported so all the opcodes are described in AArch64SchedFalkorDetails.td. - Remove entries for unsupported half-float opcodes. - Remove entries for unsupported LSE extension opcodes. - Add entry for MOVbaseTLS (and set Sched in base td file entry to WriteSys) and a few other pseudo ops. - Fix a few FP load/store with reg offset entries to use the LSLfast predicates. - Add Q size BIF/BIT/BSL entries. - Fix swapped Q/D sized CLS/CLZ/CNT/RBIT entires. - Fix pre/post increment address register latency (this operand is always dest 0). - Fix swapped FCVTHD/FCVTHS/FCVTDH/FCVTDS entries. - Fix XYZ resource over usage on LD[1-4] opcodes. llvm-svn: 304108
* [X86] Adding new LLVM TableGen backend that generates the X86 backend memory ↵Ayman Musa2017-05-282-3398/+3
| | | | | | | | | | | folding tables. X86 backend holds huge tables in order to map between the register and memory forms of each instruction. This TableGen Backend automatically generated all these tables with the appropriate flags for each entry. Differential Revision: https://reviews.llvm.org/D32684 llvm-svn: 304088
* [X86] Adding FoldGenRegForm helper field (for memory folding tables tableGen ↵Ayman Musa2017-05-288-89/+175
| | | | | | | | | | | | | | | | | | | | | | backend) to X86Inst class and set its value for the relevant instructions. Some register-register instructions can be encoded in 2 different ways, this happens when 2 register operands can be folded (separately). For example if we look at the MOV8rr and MOV8rr_REV, both instructions perform exactly the same operation, but are encoded differently. Here is the relevant information about these instructions from Intel's 64-ia-32-architectures-software-developer-manual: Opcode Instruction Op/En 64-Bit Mode Compat/Leg Mode Description 8A /r MOV r8,r/m8 RM Valid Valid Move r/m8 to r8. 88 /r MOV r/m8,r8 MR Valid Valid Move r8 to r/m8. Here we can see that in order to enable the folding of the output and input registers, we had to define 2 "encodings", and as a result we got 2 move 8-bit register-register instructions. In the X86 backend, we define both of these instructions, usually one has a regular name (MOV8rr) while the other has "_REV" suffix (MOV8rr_REV), must be marked with isCodeGenOnly flag and is not emitted from CodeGen. Automatically generating the memory folding tables relies on matching encodings of instructions, but in these cases where we want to map both memory forms of the mov 8-bit (MOV8rm & MOV8mr) to MOV8rr (not to MOV8rr_REV) we have to somehow point from the MOV8rr_REV to the "regular" appropriate instruction which in this case is MOV8rr. This field enable this "pointing" mechanism - which is used in the TableGen backend for generating memory folding tables. Differential Revision: https://reviews.llvm.org/D32683 llvm-svn: 304087
* AArch64/PEI: Do not add reserved regs to liveinsMatthias Braun2017-05-271-2/+5
| | | | | | | We do not track liveness for reserved registers. It is unnecessary to add them to block livein lists. llvm-svn: 304059
* [AArch64][GlobalISel] Add the Localizer pass for the O0 pipelineQuentin Colombet2017-05-271-1/+9
| | | | | | | This should fix most of the issue we have right now with constants being spilled all over the place. llvm-svn: 304052
* AArch64: Fix cmpxchg O0 expansionMatthias Braun2017-05-261-58/+61
| | | | | | | | | | | | | | - Rewrite livein calculation to use the computeLiveIns() helper function. This is slightly less efficient but easier to reason about and doesn't unnecessarily add pristine and reserved registers[1] - Zero the status register at the beginning of the loop to make sure it has a defined value. - Remove kill flags of values that need to stay alive throughout the loop. [1] An upcoming commit of mine will tighten the MachineVerifier to catch these. llvm-svn: 304048
* [bpf] disallow global_addr+off foldingAlexei Starovoitov2017-05-262-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | Wrong assembly code is generated for a simple program with clang. If clang only produces IR and llc is used for IR lowering and optimization, correct assembly code is generated. The main reason is that clang feeds default Reloc::Static to llvm and llc feeds no RelocMode to llvm, where for llc case, BPF backend picks up Reloc::PIC_ mode. This leads different IR lowering behavior and clang permits global_addr+off folding while llc doesn't. This patch introduces isOffsetFoldingLegal function into BPF backend and the function always return false. This will make clang and llc behave the same for the lowering. Bug https://bugs.llvm.org//show_bug.cgi?id=33183 has more detailed explanation. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 304043
* [Mips] Placate GCC's -Wmisleading-indentation. NFCI.Davide Italiano2017-05-261-17/+17
| | | | llvm-svn: 304041
* LivePhysRegs: Rework constructor + documentation; NFCMatthias Braun2017-05-2610-22/+22
| | | | | | | - Take reference instead of pointer to a TRI that cannot be nullptr. - Improve documentation comments. llvm-svn: 304038
* [Hexagon] Cleanup of unused function isCalleeSaveReg (NFC)Sumanth Gundapaneni2017-05-262-6/+0
| | | | llvm-svn: 304034
* Resubmit r303859 with test fixed.Konstantin Zhuravlyov2017-05-261-1/+3
| | | | | | | | | | [AMDGPU] add intrinsic for s_getpc Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc. Patch by Tim Corringham llvm-svn: 304031
* Make helper functions static. NFC.Benjamin Kramer2017-05-263-3/+7
| | | | llvm-svn: 304029
* [AMDGPU][MC][GFX9] Corrected encoding of flat_scratch* for SDWA opcodesDmitry Preobrazhensky2017-05-261-1/+1
| | | | | | | | | | See bug 33171: https://bugs.llvm.org/show_bug.cgi?id=33171 Reviewers: Sam Kolton Differential Revision: https://reviews.llvm.org/D33553 llvm-svn: 304015
* AMDGPU/GlobalISel: Mark 32-bit float constants as legalTom Stellard2017-05-262-0/+6
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33212 llvm-svn: 304003
* [AMDGPU] SDWA: add disassembler support for GFX9Sam Kolton2017-05-265-31/+113
| | | | | | | | | | | | Summary: Added decoder methods and tests Reviewers: vpykhtin, artem.tamazov, dp Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D33545 llvm-svn: 303999
* [ARM] Fix lowering of misaligned memcpy/memsetJohn Brawn2017-05-261-6/+0
| | | | | | | | | | | | | | | | | Currently getOptimalMemOpType returns i32 for large enough sizes without checking for alignment, leading to poor code generation when misaligned accesses aren't permitted as we generate a word store then later split it up into byte stores. This means we inadvertantly go over the MaxStoresPerMemcpy limit and for memset we splat the memset value into a word then immediately split it up again. Fix this by leaving it up to FindOptimalMemOpLowering to figure out which type to use, but also fix a bug there where it wasn't correctly checking if misaligned memory accesses are allowed. Differential Revision: https://reviews.llvm.org/D33442 llvm-svn: 303990
* The fix for PR22004: X86AsmParser.cpp asserts: OperandStack.size() > 1 && ↵Andrew V. Tischenko2017-05-261-2/+3
| | | | | | "Too few operands." llvm-svn: 303985
* Fix signedness of constant. NFC.Nirav Dave2017-05-261-5/+5
| | | | llvm-svn: 303980
* [PPC] Add text for assert.Tim Shen2017-05-251-1/+1
| | | | llvm-svn: 303940
* [PPC] Fix atomics lowering in DAG lowering.Tim Shen2017-05-251-1/+3
| | | | | | | | | | | I forgot to forward the chain, causing some missing instruction dependencies. The test crashes the compiler without this patch. Inspired by the test case, D33519 also tries to remove the extra sync. Differential Revision: https://reviews.llvm.org/D33573 llvm-svn: 303931
* PPC: Correct Size for GETtlsADDRKyle Butt2017-05-251-1/+3
| | | | | | | | | PPC::GETtlsADDR is lowered to a branch and a nop, by the assembly printer. Its size was incorrectly marked as 4, correct it to 8. The incorrect size can cause incorrect branch relaxation in PPCBranchSelector under the right conditions. llvm-svn: 303904
* Revert r303859, CodeGen/AMDGPU/llvm.amdgcn.s.getpc.ll fails on bots.Nico Weber2017-05-251-3/+1
| | | | llvm-svn: 303902
* [AArch64]: add 'a' inline asm operand modifier.Manoj Gupta2017-05-251-1/+4
| | | | | | | | | | | | | | | | Summary: This is used in the Linux kernel, and effectively just means "print an address". This brings back r193593. Reviewed by: Renato Golin Reviewers: t.p.northover, rengolin, richard.barton.arm, kristof.beyls Subscribers: aemerson, javed.absar, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D33558 llvm-svn: 303901
* [AMDGPU] add intrinsic for s_getpcTim Corringham2017-05-251-1/+3
| | | | | | | | | | | | | | Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D32862 llvm-svn: 303859
* [X86] Adding vpopcntd and vpopcntq instructionsOren Ben Simhon2017-05-257-0/+59
| | | | | | | | | AVX512_VPOPCNTDQ is a new feature set that was published by Intel. The patch represents the LLVM side of the addition of two new intrinsic based instructions (vpopcntd and vpopcntq). Differential Revision: https://reviews.llvm.org/D33169 llvm-svn: 303858
* [PowerPC] Fix a performance bug for PPC::XXSLDWI.Tony Jiang2017-05-244-4/+96
| | | | | | | | There are some VectorShuffle Nodes in SDAG which can be selected to XXSLDWI instruction, this patch recognizes them and does the selection to improve the PPC performance. llvm-svn: 303822
* [AArch64] Prevent nested ADDs from address calc in splitStoreSplat. NFCNirav Dave2017-05-241-2/+12
| | | | | | In preparation for late-stage store merging. llvm-svn: 303800
* P9: D-form vector load/store. Differential Revision: ↵Zaara Syeda2017-05-241-18/+34
| | | | | | https://reviews.llvm.org/D33248 llvm-svn: 303780
* Revert r291254: [AArch64] Reduce vector insert/extract cost for FalkorMatthew Simpson2017-05-241-1/+0
| | | | | | | The default vector insert/extract cost is more profitable on Falkor than the reduced cost. llvm-svn: 303771
* [AMDGPU] Prevent too large store merges in AMDGPU Subtargets. NFCI.Nirav Dave2017-05-244-0/+24
| | | | | | | | | Various address spaces on the SI and R600 subtargets have stricter limits on memory access size that other address spaces. Use canMergeStoresTo predicate to prevent the DAGCombiner from creating these stores as they will be split up during legalization. llvm-svn: 303767
* [MSP430] Fix PR33050: Don't use ADD16ri to lower FrameIndex.Vadzim Dambrouski2017-05-243-3/+8
| | | | | | | | | Use ADDframe pseudo instruction instead. This will fix machine verifier error, and will help to fix PR32146. Differential Revision: https://reviews.llvm.org/D33452 llvm-svn: 303758
* Revert "AMDGPU: Fold CI-specific complex SMRD patterns into existing complex ↵Marek Olsak2017-05-244-18/+51
| | | | | | | | | | | patterns" This reverts commit e065977c4b5f68ab845400b256f6a3822b1325fa. It doesn't work. S_LOAD_DWORD_IMM_ci and friends aren't selected by any of the patterns, so it was putting 32-bit literals into the 8-bit field. llvm-svn: 303754
* [Hexagon] Fix comment in HexagonPacketizer::runOnMachineFunctionKrzysztof Parzyszek2017-05-241-2/+2
| | | | | | | | Patch by Wei-Ren Chen. Differential Revision: https://reviews.llvm.org/D33439 llvm-svn: 303745
* [LoopVectorizer] Let target prefer scalar addressing computations.Jonas Paulsson2017-05-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | The loop vectorizer usually vectorizes any instruction it can and then extracts the elements for a scalarized use. On SystemZ, all elements containing addresses must be extracted into address registers (GRs). Since this extraction is not free, it is better to have the address in a suitable register to begin with. By forcing address arithmetic instructions and loads of addresses to be scalar after vectorization, two benefits result: * No need to extract the register * LSR optimizations trigger (LSR isn't handling vector addresses currently) Benchmarking show improvements on SystemZ with this new behaviour. Any other target could try this by returning false in the new hook prefersVectorizedAddressing(). Review: Renato Golin, Elena Demikhovsky, Ulrich Weigand https://reviews.llvm.org/D32422 llvm-svn: 303744
* [SystemZ] Fix register modelling in expandLoadStackGuard()Jonas Paulsson2017-05-241-16/+14
| | | | | | | | EXPENSIVE_CHECKS found this bug (https://bugs.llvm.org/show_bug.cgi?id=33047), which this patch fixes. The EAR instruction defines a GR32, not a GR64. Review: Ulrich Weigand llvm-svn: 303743
* Strip trailing whitespace. NFCI.Simon Pilgrim2017-05-241-2/+2
| | | | llvm-svn: 303736
OpenPOWER on IntegriCloud