path: root/llvm/lib/Target
Commit message (Author, Age, Files, Lines)
...
* [AArch64][GlobalISel] Add isel support for global values in the large code model. (Amara Emerson, 2018-01-18, 1 file, -0/+34)
  Fixes PR35958.
  Differential Revision: https://reviews.llvm.org/D42175
  llvm-svn: 322878
* [RISCV] Fixed setting predicates for compressed instructions. (Ana Pazos, 2018-01-18, 1 file, -36/+38)
  Summary: Fixed setting predicates for compressed instructions. Some instructions were being generated with only the C extension enabled, without proper checks for the other required extensions such as F and D, or for 32-bit and 64-bit targets.
  Affected instructions: C_FLD, C_FLW, C_LD, C_FSD, C_FSW, C_SD, C_JAL, C_ADDIW, C_SUBW, C_ADDW, C_FLDSP, C_FLWSP, C_LDSP, C_FSDSP, C_FSWSP, C_SDSP
  Reviewers: asb, shiva0217
  Reviewed By: asb
  Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, llvm-commits
  Differential Revision: https://reviews.llvm.org/D42132
  llvm-svn: 322876
* [RISCV] Codegen support for the standard RV32M instruction set extension (Alex Bradbury, 2018-01-18, 2 files, -8/+25)
  llvm-svn: 322843
* [RISCV] Implement frame pointer elimination (Alex Bradbury, 2018-01-18, 2 files, -23/+24)
  llvm-svn: 322839
* [X86] Use vmovdqu64/vmovdqa64 for unmasked integer vector stores for consistency with loads. (Craig Topper, 2018-01-18, 2 files, -27/+33)
  Previously we used 64 for vXi64 stores and 32 for everything else. This change uses 64 for everything, just like we do for loads.
  llvm-svn: 322820
* [X86] Remove isel patterns for using unmasked vmovdqa32/vmovdqu32 for integer vector loads. (Craig Topper, 2018-01-18, 1 file, -9/+10)
  These patterns were just looking for a vXi64 bitcasted to vXi32, but there is no advantage to using vmovdqa32 over vmovdqa64.
  llvm-svn: 322819
* [WebAssembly] Remove duplicated RTLIB names (Derek Schuff, 2018-01-18, 1 file, -976/+394)
  Remove the tight coupling between llvm/CodeGen/RuntimeLibcalls.def and the table of supported signatures for wasm. This will allow adding new libcalls without changing wasm's signature table.
  Also, some cleanup:
  Use ManagedStatics instead of const tables to avoid memory/binary bloat.
  Use a StringMap instead of a linear search for name lookup.
  Differential Revision: https://reviews.llvm.org/D35592
  llvm-svn: 322802
* [CodeGen] Hoist common AsmPrinter code out of X86, ARM, and AArch64 (Reid Kleckner, 2018-01-17, 3 files, -67/+0)
  Every known PE COFF target emits /EXPORT: linker flags into a .drectve section. The AsmPrinter should handle this.
  While we're at it, use global_values() and emit each export flag with its own .ascii directive. This should make the .s file output more readable.
  llvm-svn: 322788
* Add a TargetOption to enable/disable GlobalISel (Volkan Keles, 2018-01-17, 1 file, -6/+4)
  Summary: This patch adds a new target option in order to control GlobalISel. This will allow users to enable/disable GlobalISel prior to the backend by calling `TargetMachine::setGlobalISel(bool Enable)`.
  No test case as there is already a test to check GlobalISel command line options. See: CodeGen/AArch64/GlobalISel/gisel-commandline-option.ll.
  Reviewers: qcolombet, aemerson, ab, dsanders
  Reviewed By: qcolombet
  Subscribers: rovka, javed.absar, kristof.beyls, llvm-commits
  Differential Revision: https://reviews.llvm.org/D42137
  llvm-svn: 322773
* Add support for emitting libcalls for x86_fp80 -> fp128 and vice-versa (Benjamin Kramer, 2018-01-17, 1 file, -0/+4)
  compiler_rt doesn't provide them (yet), but libgcc does. PR34076.
  llvm-svn: 322772
* Revert [PowerPC] This reverts commit rL322721 (Zaara Syeda, 2018-01-17, 7 files, -113/+6)
  Failing build bots. Revert the commit now.
  llvm-svn: 322748
* [GISel] Make constrainSelectedInstRegOperands() available to the legalizer. NFC (Aditya Nandakumar, 2018-01-17, 2 files, -4/+5)
  https://reviews.llvm.org/D42149
  llvm-svn: 322743
* Use a got to access a hidden weak undefined on MachO. (Rafael Espindola, 2018-01-17, 1 file, -3/+1)
  Trying to link

    __attribute__((weak, visibility("hidden"))) extern int foo;
    int *main(void) { return &foo; }

  on OS X fails with

    ld: 32-bit RIP relative reference out of range (-4294971318 max is +/-2GB): from _main (0x100000FAB) to _foo@0x00001000 (0x00000000) in '_main' from test.o for architecture x86_64

  The problem is that 0 cannot be computed as a fixed difference from %rip. Exactly the same issue exists on ELF and we can use the same solution.
  llvm-svn: 322739
* [ARM] Optimize {s,u}mul.with.overflow. (Joel Galenson, 2018-01-17, 1 file, -5/+32)
  This extends my previous patches to also optimize overflow-checked multiplies during SelectionDAG.
  Differential revision: https://reviews.llvm.org/D40922
  llvm-svn: 322738
* [ARM] Optimize {s,u}{add,sub}.with.overflow. (Joel Galenson, 2018-01-17, 2 files, -23/+76)
  The ARM backend contains code that tries to optimize compares by replacing them with an existing instruction that sets the flags the same way. This allows it to replace a "cmp" with an "adds", generalizing the code that replaces "cmp" with "sub". It also heuristically disables sinking of instructions that could potentially be used to replace compares (currently only if they're next to each other).
  Differential revision: https://reviews.llvm.org/D38378
  llvm-svn: 322737
* [X86][BTVER2] Reduce instregex usage (PR35955) (Simon Pilgrim, 2018-01-17, 1 file, -25/+29)
  Most are just replaced with instrs lists, but a few regexps have been further generalized to match more instructions with a single pattern.
  llvm-svn: 322734
* [X86] Teach LowerBUILD_VECTOR to recognize pair-wise splats of 32-bit elements and use a 64-bit broadcast (Craig Topper, 2018-01-17, 1 file, -0/+26)
  If we are splatting pairs of 32-bit elements, we can use a 64-bit broadcast to get the job done.
  We could probably do this with other sizes too, for example four 16-bit elements. Or we could broadcast pairs of 16-bit elements using a 32-bit element broadcast. But I've left that as a future improvement.
  I've also restricted this to AVX2 only because we can only broadcast loads under AVX.
  Differential Revision: https://reviews.llvm.org/D42086
  llvm-svn: 322730
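  As a rough illustration of the check described in the commit above (not the actual LowerBUILD_VECTOR code), the sketch below tests whether a list of 32-bit elements repeats a single (even, odd) pair and, if so, computes the 64-bit value a broadcast would replicate. The function name and the little-endian lane layout are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not LLVM code. Returns true if `elts` is a repetition
// of one (even, odd) pair of 32-bit values and stores the equivalent 64-bit
// broadcast value in `wide`. Assumes little-endian lane order, so the even
// element occupies the low 32 bits of the 64-bit value.
bool matchPairwiseSplat32(const std::vector<uint32_t> &elts, uint64_t &wide) {
  // Need at least two full pairs for the pattern to be a splat of pairs.
  if (elts.size() < 4 || elts.size() % 2 != 0)
    return false;
  // Every element must equal the element at the same parity in the first pair.
  for (std::size_t i = 2; i < elts.size(); ++i)
    if (elts[i] != elts[i % 2])
      return false;
  wide = (static_cast<uint64_t>(elts[1]) << 32) | elts[0];
  return true;
}
```

  For the input {1, 2, 1, 2} the sketch reports a match with the 64-bit value 0x0000000200000001; an input such as {1, 2, 1, 3} is rejected.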
* [X86] When legalizing (v64i1 select i8, v64i1, v64i1) make sure not to introduce bitcasts to i64 in 32-bit mode (Craig Topper, 2018-01-17, 1 file, -0/+12)
  We legalize selects of masks with scalar conditions using a bitcast to an integer type. But if we are in 32-bit mode we can't convert v64i1 to i64. So instead split the v64i1 to v32i1 and concat it back together. Each half will then be legalized by bitcasting to i32, which is fine.
  The test case is a little indirect. If we have the v64i1 select in IR it will get legalized by legalize vector ops, which has a run of type legalization after it. That type legalization run is able to fix this i64 bitcast. So in order to avoid that we need a build_vector of a splat, which legalize vector ops will ignore. Legalize DAG will then turn that into a select via LowerBUILD_VECTORvXi1. And the select will get legalized. In this case there is no type legalizer run to clean up the bitcast.
  This fixes pr35972.
  llvm-svn: 322724
* [PowerPC] Add handling for ColdCC calling convention and a pass to mark candidates with coldcc attribute. (Zaara Syeda, 2018-01-17, 7 files, -6/+113)
  This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions.
  Differential Revision: https://reviews.llvm.org/D38413
  llvm-svn: 322721
* [ARC] Add missing condition codes. (Tatyana Krasnukha, 2018-01-17, 3 files, -0/+10)
  Summary: Added VS and VC, required for disassembling.
  Reviewers: petecoup
  Reviewed By: petecoup
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D42172
  llvm-svn: 322718
* [SystemZ] Handle BRCTH branches correctly in SystemZLongBranch.cpp. (Jonas Paulsson, 2018-01-17, 1 file, -1/+1)
  BRCTH is capable of a long branch which needs to be recognized during branch relaxation. This is done by checking for ExtraRelaxSize == 0.
  Review: Ulrich Weigand
  llvm-svn: 322688
* AMDGPU: Error in SIAnnotateControlFlow instead of assert (Matt Arsenault, 2018-01-17, 1 file, -1/+5)
  This assert typically happens if an unstructured CFG is passed to the pass. This can happen if the pass is run independently without the structurizer.
  llvm-svn: 322685
* [ARM GlobalISel] Rename local variable. NFC (Diana Picus, 2018-01-17, 1 file, -2/+2)
  llvm-svn: 322667
* [AArch64] Fix incorrect LD1 of 16-bit FP vectors in big endian (Pablo Barrio, 2018-01-17, 1 file, -18/+8)
  Summary: Loading a vector of 4 half-precision FP sometimes results in an LD1 of 2 single-precision FP + a reversal. This results in an incorrect byte swap due to the conversion from little endian to big endian.
  In order to generate the correct byte swap, it is easier to generate the correct LD1 of 4 half-precision FP, thus avoiding the subsequent reversal.
  Reviewers: craig.topper, jmolloy, olista01
  Reviewed By: olista01
  Subscribers: efriedma, samparker, SjoerdMeijer, rogfer01, aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits
  Differential Revision: https://reviews.llvm.org/D41863
  llvm-svn: 322663
* [RISCV] Allow RISCVAsmBackend::writeNopData to generate c.nop when supported (Alex Bradbury, 2018-01-17, 1 file, -8/+18)
  When the compressed instruction set is enabled, the 16-bit c.nop can be generated if necessary.
  Differential Revision: https://reviews.llvm.org/D41221
  Patch by Shiva Chen.
  llvm-svn: 322658
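  The padding rule described in the commit above can be sketched roughly as follows. This is a standalone illustration, not the code from RISCVAsmBackend: the function name, the byte-vector output, and the boolean flag for the C extension are assumptions made for the example. The encodings used are the standard ones: the 4-byte NOP is addi x0, x0, 0 (0x00000013) and the 2-byte c.nop is 0x0001, both stored little-endian.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the backend hook. Appends `count` bytes of NOP
// padding to `out` and returns false, leaving `out` untouched, if `count`
// cannot be covered with the NOP sizes that are available.
bool writeNopBytes(std::vector<uint8_t> &out, uint64_t count, bool hasStdExtC) {
  // Without the C extension only 4-byte NOPs exist; with it, 2-byte padding
  // granularity becomes possible.
  const uint64_t minNopLen = hasStdExtC ? 2 : 4;
  if (count % minNopLen != 0)
    return false;
  // Cover as much as possible with the 4-byte NOP: addi x0, x0, 0.
  for (uint64_t i = 0; i < count / 4; ++i) {
    out.push_back(0x13);
    out.push_back(0x00);
    out.push_back(0x00);
    out.push_back(0x00);
  }
  // A 2-byte remainder is only reachable when hasStdExtC is true: c.nop.
  if (count % 4 == 2) {
    out.push_back(0x01);
    out.push_back(0x00);
  }
  return true;
}
```

  With the C extension enabled, a request for 6 bytes yields one 4-byte NOP followed by one c.nop; without it, the same request is rejected.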
* [ARM GlobalISel] Map G_FPEXT and G_FPTRUNC to FPR (Diana Picus, 2018-01-17, 1 file, -0/+18)
  llvm-svn: 322657
* [AMDGPU] add LDS f32 intrinsics (Daniil Fukalov, 2018-01-17, 7 files, -10/+78)
  Added llvm.amdgcn.atomic.{add|min|max}.f32 intrinsics to allow generating the ds_{add|min|max}[_rtn]_f32 instructions needed for OpenCL float atomics in LDS.
  Reviewed by: arsenm
  Differential Revision: https://reviews.llvm.org/D37985
  llvm-svn: 322656
* [AMDGPU][MC][GFX9] Enable inline constants for SDWA operands (Dmitry Preobrazhensky, 2018-01-17, 5 files, -65/+124)
  See bug 35771: https://bugs.llvm.org/show_bug.cgi?id=35771
  Differential Revision: https://reviews.llvm.org/D42058
  Reviewers: vpykhtin, artem.tamazov, arsenm
  llvm-svn: 322655
* [ARM GlobalISel] Legalize G_FPEXT and G_FPTRUNC (Diana Picus, 2018-01-17, 1 file, -0/+12)
  Mark G_FPEXT and G_FPTRUNC as legal or libcall, depending on hardware support, but only for conversions between float and double.
  Also add the necessary boilerplate so that the LegalizerHelper can introduce the required libcalls. This also works only for float and double, but isn't too difficult to extend when the need arises.
  llvm-svn: 322651
* [X86] Don't mutate shuffle arguments after early-out for AVX512 (Benjamin Kramer, 2018-01-17, 1 file, -18/+22)
  The match* functions have the annoying behavior of modifying their inputs. Save and restore the inputs, just in case the early out for AVX512 is hit. This is still not great and it's only a matter of time until this kind of bug happens again, but I couldn't come up with a better pattern without rewriting significant chunks of this code.
  Fixes PR35977.
  llvm-svn: 322644
* [X86] Constify DebugLoc parameters. No functionality change. (Benjamin Kramer, 2018-01-17, 1 file, -13/+10)
  llvm-svn: 322643
* Allow usage of X86-prefixes as separate instrs. (Andrew V. Tischenko, 2018-01-17, 1 file, -0/+7)
  Differential Revision: https://reviews.llvm.org/D42102
  llvm-svn: 322623
* [X86] In LowerBUILD_VECTOR, rename ExtVT to EltVT so it makes sense. (Craig Topper, 2018-01-17, 1 file, -7/+7)
  llvm-svn: 322616
* [X86] Remove duplicate lines from scheduler models. NFC (Craig Topper, 2018-01-17, 4 files, -8/+0)
  llvm-svn: 322615
* [X86][BTVER2] Fix scheduling of VCMPSD/VCMPSS instructions (Simon Pilgrim, 2018-01-16, 1 file, -2/+2)
  For some reason they don't have a trailing i like the packed equivalents.
  llvm-svn: 322600
* [X86][BTVER2] Use instrs instead of instregex for low match counts (PR35955) (Simon Pilgrim, 2018-01-16, 1 file, -30/+22)
  llvm-svn: 322598
* [X86][BTVER2] Use instrs instead of instregex for single use matches (PR35955) (Simon Pilgrim, 2018-01-16, 1 file, -36/+33)
  llvm-svn: 322597
* [PPC] Add a new register XER aliased to CARRY (Guozhi Wei, 2018-01-16, 1 file, -2/+6)
  When "xer" is specified as a clobbered register in inline assembler, clang accepts it, but llvm simply ignores it when lowering to machine instructions. This may cause problems later in the scheduler.
  This patch adds a new register XER aliased to CARRY, and adds it to register class CARRYRC. Now PPCTargetLowering::getRegForInlineAsmConstraint can return the correct register number for the inline asm constraint "{xer}", and the scheduler behaves correctly.
  Differential Revision: https://reviews.llvm.org/D41967
  llvm-svn: 322591
* [GlobalISel][TableGen] Add support for SDNodeXForm (Volkan Keles, 2018-01-16, 2 files, -0/+14)
  Summary: This patch adds CustomRenderer which renders the matched operands to the specified instruction.
  Targets can enable the matching of SDNodeXForm by adding a definition that inherits from GICustomOperandRenderer and GISDNodeXFormEquiv as follows.

    def gi_imm8 : GICustomOperandRenderer<"renderImm8">,
                  GISDNodeXFormEquiv<imm8_xform>;

  Custom renderer functions should be of the form:

    void render(MachineInstrBuilder &MIB, const MachineInstr &I);

  Reviewers: dsanders, ab, rovka
  Reviewed By: dsanders
  Subscribers: kristof.beyls, javed.absar, llvm-commits, mgrang, qcolombet
  Differential Revision: https://reviews.llvm.org/D42012
  llvm-svn: 322582
* [X86][MMX] Accept UNDEF upper bits for MOVD GR32->MMX (Simon Pilgrim, 2018-01-16, 1 file, -3/+4)
  llvm-svn: 322574
* [X86][MMX] Improve MMX constant generation (Simon Pilgrim, 2018-01-16, 1 file, -3/+12)
  Extend the MMX zero code to take any constant with zero'd upper 32-bits
  llvm-svn: 322553
* [BPF] Mark pseudo insn patterns as isCodeGenOnly (Yonghong Song, 2018-01-16, 1 file, -2/+2)
  These pseudos are not supposed to be visible to the user. This patch reduces the auto-generated instruction matcher. For example, the following words are removed from the keyword list of the LLVM BPF assembler.

    - MCK__35_, // '#'
    - MCK__COLON_, // ':'
    - MCK__63_, // '?'
    - MCK_ADJCALLSTACKDOWN, // 'ADJCALLSTACKDOWN'
    - MCK_ADJCALLSTACKUP, // 'ADJCALLSTACKUP'
    - MCK_PSEUDO, // 'PSEUDO'
    - MCK_Select, // 'Select'

  Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
  Acked-by: Yonghong Song <yhs@fb.com>
  Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
  llvm-svn: 322535
* [BPF] Teach DAG2DAG AND elimination about load intrinsics (Yonghong Song, 2018-01-16, 1 file, -7/+31)
  As commented on the existing code:

    // The Reg operand should be a virtual register, which is defined
    // outside the current basic block. DAG combiner has done a pretty
    // good job in removing truncating inside a single basic block.

  However, when the Reg operand comes from bpf_load_[byte | half | word] intrinsics, the generic optimizer doesn't understand their results are zero extended, so these single basic block elimination opportunities were missed.
  Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
  Acked-by: Yonghong Song <yhs@fb.com>
  Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
  llvm-svn: 322534
* [X86] Revisit the fix I made years ago to make 'xchgl %eax, %eax' not encode using the 0x90 encoding in 64-bit mode. (Craig Topper, 2018-01-16, 2 files, -16/+8)
  Prior to this we had a separate instruction and register class that excluded eax to prevent matching the instruction that would encode with 0x90.
  This patch changes this to just use an InstAlias to force xchgl %eax, %eax to use the XCHG32rr instruction in 64-bit mode. This gets rid of the separate instruction and register class.
  llvm-svn: 322532
* [X86] Make 'xchgq %rax, %rax' an alias for the 0x90 nop encoding to match gas. (Craig Topper, 2018-01-16, 1 file, -0/+4)
  Previously we encoded it as 0x48 0x90.
  llvm-svn: 322531
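  The two xchg commits above can be summarized in a small table-like sketch. This is an illustration only, not the LLVM encoder: the function and enum names are hypothetical. It relies on two x86 facts: in 64-bit mode a 32-bit register write zero-extends into the full 64-bit register, so 'xchgl %eax, %eax' is not a true no-op there and cannot share the 0x90 NOP byte; and the register-register form of XCHG (opcode 0x87 with a ModRM byte) encodes eax, eax as ModRM 0xC0.

```cpp
#include <cstdint>
#include <vector>

enum class CpuMode { Bits32, Bits64 };

// Hypothetical illustration of the byte sequences discussed in the two
// commits for "exchange the accumulator with itself".
std::vector<uint8_t> encodeXchgAccWithSelf(bool is64BitOperand, CpuMode mode) {
  // xchgq %rax, %rax (only meaningful in 64-bit mode): plain 0x90, matching
  // gas. Before the change it was emitted as 0x48 0x90.
  if (is64BitOperand)
    return {0x90};
  // xchgl %eax, %eax in 64-bit mode: use the ModRM form (0x87 /r) so the
  // implicit zero-extension of the upper half of rax is preserved.
  if (mode == CpuMode::Bits64)
    return {0x87, 0xC0};
  // xchgl %eax, %eax in 32-bit mode: the short one-byte form is 0x90.
  return {0x90};
}
```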
* Avoid Wparentheses warning. (Simon Pilgrim, 2018-01-15, 1 file, -2/+2)
  llvm-svn: 322526
* [X86][MMX] Add support for MMX zero vector creation (Simon Pilgrim, 2018-01-15, 3 files, -1/+26)
  As mentioned on PR35869 (and as came up recently on D41517), we don't create an MMX zero register via PXOR but instead perform a spill to stack from an XMM zero register.
  This patch adds support for direct MMX zero vector creation and should make it easier to add better constant vector creation in the future as well.
  Differential Revision: https://reviews.llvm.org/D41908
  llvm-svn: 322525
* [X86][SSE] Add custom execution domain fixing for BLENDPD/BLENDPS/PBLENDD/PBLENDW (PR34873) (Simon Pilgrim, 2018-01-15, 2 files, -3/+192)
  Add support for custom execution domain fixing and implement support for BLENDPD/BLENDPS/PBLENDD/PBLENDW.
  Differential Revision: https://reviews.llvm.org/D42042
  llvm-svn: 322524
* [X86] Use MVT::getVectorVT instead of EVT::getVectorVT when splitting 256/512 bit build_vectors. NFC (Craig Topper, 2018-01-15, 1 file, -2/+2)
  We must be creating a legal type here which means it can be an MVT.
  llvm-svn: 322512
* [X86] Generalize some code in LowerBUILD_VECTOR. NFC (Craig Topper, 2018-01-15, 1 file, -4/+10)
  llvm-svn: 322511