path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* [PATCH] [AArch64] Add new target feature to fuse conditional selectEvandro Menezes2018-02-233-22/+73
| | | | | | | | | This feature enables the fusion of the comparison and the conditional select instructions together. Differential revision: https://reviews.llvm.org/D42392 llvm-svn: 325939
* Fix compiler warning introduced in r325931. NFC.Geoff Berry2018-02-231-3/+2
| | | | llvm-svn: 325938
* [X86] Custom split v32i16/v64i8 bitcasts when AVX512F is available, but BWI is not.Craig Topper2018-02-231-1/+20
| | | | | | | | | | The test changes you can see are related to the changes in ReplaceNodeResults, though shuffle-vs-trunc-512.ll does have a test that exercises the code in LowerBITCAST. It looks like the test output didn't change because DAG combining is able to clean up the resulting type legalization. Adding the custom hook just makes type legalization work less hard. Differential Revision: https://reviews.llvm.org/D43447 llvm-svn: 325933
* [MachineOperand][Target] MachineOperand::isRenamable semantics changesGeoff Berry2018-02-2319-27/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add a target option AllowRegisterRenaming that is used to opt in to post-register-allocation renaming of registers. This is set to 0 by default, which causes the hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq fields of all opcodes to be set to 1, causing MachineOperand::isRenamable to always return false. Set the AllowRegisterRenaming flag to 1 for all in-tree targets that have lit tests that were affected by enabling COPY forwarding in MachineCopyPropagation (AArch64, AMDGPU, ARM, Hexagon, Mips, PowerPC, RISCV, Sparc, SystemZ and X86). Add some more comments describing the semantics of the MachineOperand::isRenamable function and how it is set and maintained. Change isRenamable to check the operand's opcode hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq bit directly instead of relying on it being consistently reflected in the IsRenamable bit setting. Clear the IsRenamable bit when changing an operand's register value. Remove target code that was clearing the IsRenamable bit when changing registers/opcodes now that this is done conservatively by default. Change setting of hasExtraSrcRegAllocReq in AMDGPU target to be done in one place covering all opcodes that have constant pipe read limit restrictions. Reviewers: qcolombet, MatzeB Subscribers: aemerson, arsenm, jyknight, mcrosier, sdardis, nhaehnle, javed.absar, tpr, arichardson, kristof.beyls, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, escha, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D43042 llvm-svn: 325931
* [PowerPC] Code cleanup. Remove instructions that were withdrawn from Power 9.Stefan Pintilie2018-02-231-21/+0
| | | | | | | | | | | | | | The following set of instructions was originally planned to be added for Power 9 and so code was added to support them. However, a decision was made later on to withdraw support for these instructions in the hardware. xscmpnedp xvcmpnesp xvcmpnedp This patch removes support for the instructions that were not added. Differential Revision: https://reviews.llvm.org/D43641 llvm-svn: 325918
* [mips] finish removal of unused fields in MipsInstructionSelectorPetar Jovanovic2018-02-231-2/+2
| | | | | | r325916 failed to remove the calls in the constructor. llvm-svn: 325917
* [mips] remove unused fields in MipsInstructionSelectorPetar Jovanovic2018-02-231-3/+0
| | | | | | Unused fields cause a build break if -Werror,-Wunused-private-field is passed. llvm-svn: 325916
* Support for the mno-stack-arg-probe flagHans Wennborg2018-02-234-10/+25
| | | | | | | | | | | | Adds support for this flag. There is also another piece for clang (separate review). More info: https://bugs.llvm.org/show_bug.cgi?id=36221 By Ruslan Nikolaev! Differential Revision: https://reviews.llvm.org/D43107 llvm-svn: 325900
* [DAGCombine] Ensure that (brcond (setcc ...)) is handled in a canonical manner.Amaury Sechet2018-02-231-0/+2
| | | | | | | | | | | | | | | Summary: There are transformations that change setcc into other constructs, and transforms that try to reconstruct a setcc from the brcond condition. Depending on the order in which these transforms are done, the end result differs. Most of the time, it is preferable to get a setcc as a brcond argument (and this is why brcond tries to recreate the setcc in the first place), so we ensure this is done every time by also doing it at the setcc level when the only user is a brcond. Reviewers: spatel, hfinkel, niravd, craig.topper Subscribers: nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D41235 llvm-svn: 325892
* [MIPS GlobalISel] Adding GlobalISelPetar Jovanovic2018-02-2313-0/+355
| | | | | | | | | | | Add GlobalISel infrastructure up to the point where we can select a ret void. Patch by Petar Avramovic. Differential Revision: https://reviews.llvm.org/D43583 llvm-svn: 325888
* AMDGPU: Track physreg uses in SILoadStoreOptimizerNicolai Haehnle2018-02-231-32/+32
| | | | | | | | | | | | | | | | Summary: This handles def-after-use of physregs, and allows us to merge loads and stores even across some physreg defs (typically M0 defs). Change-Id: I076484b2bda27c2cf46013c845a0380c5b89b67b Reviewers: arsenm, mareko, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D42647 llvm-svn: 325882
* [Mips] Return true in enableMultipleCopyHints().Jonas Paulsson2018-02-231-0/+2
| | | | | | | | | | Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Simon Dardis llvm-svn: 325870
* [WebAssembly] Add first class symbol table to wasm objectsSam Clegg2018-02-232-6/+13
| | | | | | | | | | | | | | | | | | | | This is a combination of two patches by Nicholas Wilson: 1. https://reviews.llvm.org/D41954 2. https://reviews.llvm.org/D42495 Along with a few local modifications: - One change I made was to add the UNDEFINED bit to the binary format to avoid the extra byte used when writing data symbols. Although this bit is redundant for other symbol types (i.e. undefined can be implied if a function or global is a wasm import), I prefer to be explicit and consistent and not have derived flags. - Some field renaming. - Some reverting of unrelated minor changes. - No test output differences. Differential Revision: https://reviews.llvm.org/D43147 llvm-svn: 325860
* Revert r325128 ("[X86] Reduce Store Forward Block issues in HW").Richard Smith2018-02-234-585/+0
| | | | | | This is causing miscompiles in some situations. See the llvm-commits thread for the commit for details. llvm-svn: 325852
* [X86] Turn setne X, signedmax into setgt signedmax, X in LowerVSETCC to avoid an invertCraig Topper2018-02-231-3/+7
| | | | | | | | | | We won't be able to fold the constant pool load, but it's still better than materializing ones and xoring for the invert if we used PCMPEQ. This will fix another regression from D42948. llvm-svn: 325845
* [AArch64] Refactor macro fusion (NFC)Evandro Menezes2018-02-231-151/+202
| | | | | | | | | Move checks for each fusion case into separate functions for better legibility and maintainability. Differential revision: https://reviews.llvm.org/D43649 llvm-svn: 325844
* Fix grammar. NFC.Rafael Espindola2018-02-221-1/+1
| | | | | | Thanks to Eric Christopher for noticing. llvm-svn: 325842
* [X86] Turn setne X, signedmin into setgt X, signedmin in LowerVSETCC to avoid an invertCraig Topper2018-02-221-0/+9
| | | | | | | | | | This will fix one of the regressions from D42948. Differential Revision: https://reviews.llvm.org/D43531 llvm-svn: 325840
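The rewrite in this commit, and the matching signedmax one in r325845 earlier in this log, both rest on a simple identity for signed integers: nothing is smaller than the signed minimum and nothing is larger than the signed maximum, so "not equal" to either extreme can be expressed as a single signed greater-than. The following is a minimal Python sketch that checks the identity exhaustively for 8-bit lanes. It is only a model of the arithmetic, not LLVM code, and the helper names `setne` and `setgt` are illustrative stand-ins for the DAG condition codes.

```python
# Model one 8-bit signed vector lane; the names below are illustrative only.
BITS = 8
SIGNED_MIN = -(1 << (BITS - 1))      # -128
SIGNED_MAX = (1 << (BITS - 1)) - 1   # 127
lanes = range(SIGNED_MIN, SIGNED_MAX + 1)


def setne(x, y):
    """Lane-wise 'not equal' comparison."""
    return x != y


def setgt(x, y):
    """Lane-wise signed 'greater than' comparison."""
    return x > y


# setne X, signedmin  ==  setgt X, signedmin
min_rewrite_holds = all(
    setne(x, SIGNED_MIN) == setgt(x, SIGNED_MIN) for x in lanes
)

# setne X, signedmax  ==  setgt signedmax, X
max_rewrite_holds = all(
    setne(x, SIGNED_MAX) == setgt(SIGNED_MAX, x) for x in lanes
)

print(min_rewrite_holds, max_rewrite_holds)
```

Both checks hold for every lane value, which is why the lowering can use a greater-than compare directly instead of an equality compare followed by an invert.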
* Fix the build of the wasm backend.Benjamin Kramer2018-02-221-2/+2
| | | | | | | toString conflicts with llvm::toString here. Yay for overly generic function names. llvm-svn: 325833
* [X86] Make the subus special case in LowerVSETCC self containedCraig Topper2018-02-221-39/+49
| | | | | | | | | | Previously this code overrode the flags and opcode used by the later code in LowerVSETCC. This makes the code difficult to read and follow. This patch moves all the SUBUS code into its own function and makes it responsible for creating its own SDNodes on success. Differential Revision: https://reviews.llvm.org/D43530 llvm-svn: 325827
* AMDGPU: Stop using .NAME in .td filesNicolai Haehnle2018-02-221-6/+6
| | | | | | | | | | | | | | | | | | | Summary: .NAME is a bit of an odd duck, in that we should really treat it like a template argument, but we currently don't, and so when and where NAME is initialized and how is pretty inconsistent. Best to just avoid using it as a field of already instantiated records, and use cast to string instead. Change-Id: I5a0c202401cede3d5c3827ab9c7858ea48b29108 Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D43551 llvm-svn: 325794
* [RISCV] Implement c.lui immediate operand constraintShiva Chen2018-02-223-10/+39
| | | | | | | | | | | | Implement c.lui immediate constraint to [1, 31] and [0xfffe0, 0xfffff]. The RISC-V ISA describes the constraint as [1, 63], with that value being loaded in to bits 17-12 of the destination register and sign extended from bit 17. Therefore, this 6-bit immediate can represent values in the ranges [1, 31] and [0xfffe0, 0xfffff]. Differential Revision: https://reviews.llvm.org/D42834 llvm-svn: 325792
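The two ranges quoted in this commit message follow from sign-extending the 6-bit immediate to the 20-bit upper-immediate field. The following Python sketch reproduces that arithmetic under the assumptions stated in the message (a nonzero 6-bit immediate, sign-extended from its top bit). The function names are hypothetical and are not taken from the RISC-V backend.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def clui_upper_immediate(imm6):
    """Return the 20-bit upper immediate produced by a nonzero 6-bit c.lui field."""
    if not 1 <= imm6 <= 63:
        raise ValueError("c.lui immediate field must be in [1, 63]")
    return sign_extend(imm6, 6) & 0xFFFFF


def is_valid_clui_operand(imm20):
    """Range check matching the constraint described in the commit message."""
    return 1 <= imm20 <= 31 or 0xFFFE0 <= imm20 <= 0xFFFFF


# Enumerate every encodable field value and collect the resulting operands.
reachable = sorted(clui_upper_immediate(i) for i in range(1, 64))
print(hex(reachable[0]), hex(reachable[30]), hex(reachable[31]), hex(reachable[-1]))
```

Field values 1 to 31 map to themselves, while 32 to 63 become negative after sign extension and land in [0xfffe0, 0xfffff] once masked to 20 bits. That is why the assembler operand constraint is written as two disjoint ranges.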
* [mips] Generate memory dependencies for byVal argumentsStefan Maksimovic2018-02-221-1/+6
| | | | | | | | | | | | | | | | | | | | | There were no memory dependencies made between stores generated when lowering formal arguments and loads generated when call lowering byVal arguments, which made the Post-RA scheduler place a load before a matching store. Make the fixed object stored to mutable so that the load instructions can have their memory dependencies added. Set the frame object as isAliased, which clears the underlying objects vector in ScheduleDAGInstrs::buildSchedGraph(). This results in the addition of all stores as dependencies for loads. This problem appeared when passing a byVal parameter coupled with a fastcc function call. Differential Revision: https://reviews.llvm.org/D37515 llvm-svn: 325782
* [RISCV][NFC] Make logic in RISCVMCCodeEmitter::getImmOpValue more defensiveAlex Bradbury2018-02-221-5/+13
| | | | | | | | | | | As pointed out by @sabuasal in a comment on D23568, the logic in RISCVMCCodeEmitter::getImmOpValue could be more defensive. Although with the current instruction definitions `VK_RISCV_LO` is always used with either an I- or S-format instruction, this may not be the case in the future. Add a check to ensure we will get an assertion in debug builds if that changes. llvm-svn: 325775
* Recommit: [ARM] f16 constant pool fixSjoerd Meijer2018-02-221-4/+2
| | | | | | | This recommits r325754; the modified and failing test case actually didn't need any modifications. llvm-svn: 325765
* [ARM] Fix issue with large xor constants.David Green2018-02-221-5/+2
| | | | | | | | | | Fixup to rL325573 for large xor constants. Thanks to Eli Friedman for the catch. Differential revision: https://reviews.llvm.org/D43549 llvm-svn: 325761
* Revert r325754 and r325755 (f16 literal pool) because buildbots were unhappy.Sjoerd Meijer2018-02-221-2/+4
| | | | llvm-svn: 325756
* [ARM] f16 constant pool fixSjoerd Meijer2018-02-221-4/+2
| | | | | | | | | | | This is a follow up of r325012, that allowed half types in constant pools. Proper alignment was enforced when a big basic block was split up, but not when a CPE was placed before/after a block; the successor block had the wrong alignment. Differential Revision: https://reviews.llvm.org/D43580 llvm-svn: 325754
* [NFC] fix trivial typos in commentsHiroshi Inoue2018-02-224-4/+4
| | | | | | "a a" -> "a" llvm-svn: 325752
* [PowerPC] Do not produce invalid CTR loop with an FRemNemanja Ivanovic2018-02-221-1/+4
| | | | | | | | | | | An FRem instruction inside a loop should prevent the loop from being converted into a CTR loop since this is not an operation that is legal on any PPC subtarget. This will always be a call to a library function which means the loop will be invalid if this instruction is in the body. Fixes PR36292. llvm-svn: 325739
* [X86][MMX] Generalize MMX_MOVD64rr combines to accept v4i16/v8i8 build vectors as well as v2i32Simon Pilgrim2018-02-211-7/+17
| | | | | | | | Also handle both cases where the lower 32-bits of the MMX is undef or zero extended. llvm-svn: 325736
* bpf: disable DwarfUsesRelocationsAcrossSectionsYonghong Song2018-02-211-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pahole does not work with BPF backend properly: -bash-4.2$ cat test.c struct test_t { int a; int b; }; int test(struct test_t *s) { return s->a; } -bash-4.2$ clang -g -O2 -target bpf -c test.c -bash-4.2$ pahole test.o struct clang version 7.0.0 (trunk 325446) (llvm/trunk 325464) { clang version 7.0.0 (trunk 325446) (llvm/trunk 325464) clang version 7.0.0 (trunk 325446) (llvm/trunk 325464); /* 0 4 */ clang version 7.0.0 (trunk 325446) (llvm/trunk 325464) clang version 7.0.0 (trunk 325446) (llvm/trunk 325464); /* 4 4 */ /* size: 8, cachelines: 1, members: 2 */ /* last cacheline: 8 bytes */ }; -bash-4.2$ The reason is that BPF backend is not yet implemented in elfutils backend https://github.com/threatstack/elfutils/tree/master/backends and pahole depends on elfutils for dwarf parsing and resolving relocation. More specifically, the unsupported relocation in .debug_info for type/member name against symbol table caused the incorrect result above. The following is the raw .rel.debug_info for the above example, Hex dump of section '.rel.debug_info': 0x00000000 06000000 00000000 0a000000 0b000000 ................ 0x00000010 0c000000 00000000 0a000000 01000000 ................ 0x00000020 12000000 00000000 0a000000 02000000 ................ 0x00000030 16000000 00000000 0a000000 0e000000 ................ 0x00000040 1a000000 00000000 0a000000 03000000 ................ ----------------- -------- -------- reloc location type symtab index Hex dump of section '.debug_info': 0x00000000 7b000000 04000000 00000801 00000000 {............... 0x00000010 0c000000 00000000 00000000 00000000 ................ 0x00000020 00000000 00001000 00000200 00000000 ................ 
Based on "type", the proper value will be extracted from the symbol table and filled in .debug_info so that later on .debug_info can be properly resolved against debug strings. There are two ways to fix this problem. One is to fix elfutils by adding BPF support, which is desirable. This could take a long time and won't work with already deployed pahole. As a short-term workaround, we can disable DWARF cross-section relocations, which specifically avoids debug_info and symbol table cross relocation. This should help any DWARF-related tool which has not implemented BPF-specific relocations yet. Now .rel.debug_info does not have any relocation for the symbol table, and .debug_info itself contains the necessary relocation information. Hex dump of section '.debug_info': 0x00000000 7b000000 04000000 00000801 00000000 {............... 0x00000010 0c003700 00000000 00003e00 00000000 ..7.......>..... 0x00000020 00000000 00001000 00000200 00000000 ................ Location 0xc has 0, 0x12 has 0x37, and 0x1a has 0x3e in place, which will be used in relocation resolution. Here, the values 0, 0x37 and 0x3e are offsets in the .debug_str section. Please note the difference between the two .debug_info dumps above. With the fix, pahole works properly with the BPF backend: -bash-4.2$ clang -O2 -g -target bpf -c test.c -bash-4.2$ pahole test.o struct test_t { int a; /* 0 4 */ int b; /* 4 4 */ /* size: 8, cachelines: 1, members: 2 */ /* last cacheline: 8 bytes */ }; Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 325735
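The hex dumps in the commit message above can be decoded by hand, and a short script makes the layout easier to follow. The following Python sketch assumes the little-endian 16-byte layout that the message's own column annotation describes (an 8-byte relocation location, then a 4-byte type and a 4-byte symbol-table index). The byte strings are copied from the dumps; the function names are illustrative.

```python
import struct

# Rows copied from the '.rel.debug_info' dump in the commit message.
REL_ROWS = [
    "06000000 00000000 0a000000 0b000000",
    "0c000000 00000000 0a000000 01000000",
    "12000000 00000000 0a000000 02000000",
    "16000000 00000000 0a000000 0e000000",
    "1a000000 00000000 0a000000 03000000",
]

# Rows copied from the second '.debug_info' dump (after the fix).
DEBUG_INFO_ROWS = [
    "7b000000 04000000 00000801 00000000",
    "0c003700 00000000 00003e00 00000000",
    "00000000 00001000 00000200 00000000",
]


def row_bytes(row):
    """Convert one space-separated hex-dump row into raw bytes."""
    return bytes.fromhex(row.replace(" ", ""))


def parse_rel_entries(rows):
    """Decode each 16-byte row as (location, type, symbol-table index)."""
    return [struct.unpack("<QII", row_bytes(row)) for row in rows]


def read_u32(data, offset):
    """Read a little-endian 32-bit value stored in place at `offset`."""
    return struct.unpack_from("<I", data, offset)[0]


relocs = parse_rel_entries(REL_ROWS)
debug_info = b"".join(row_bytes(row) for row in DEBUG_INFO_ROWS)
in_place = {off: read_u32(debug_info, off) for off in (0xC, 0x12, 0x1A)}

for location, rel_type, sym_index in relocs:
    print(f"location={location:#x} type={rel_type:#x} symtab index={sym_index:#x}")
print({hex(k): hex(v) for k, v in in_place.items()})
```

The decoded locations 0xc, 0x12 and 0x1a are the same offsets at which the second dump stores the .debug_str offsets 0, 0x37 and 0x3e directly, so a tool such as pahole can resolve them without consulting the symbol table.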
* [Hexagon] Add TargetRegisterInfo::getPointerRegClass() overrideTobias Edler von Koch2018-02-212-0/+9
| | | | llvm-svn: 325731
* [Hexagon] Return true in enableMultipleCopyHints().Jonas Paulsson2018-02-211-0/+2
| | | | | | | | | | Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Krzysztof Parzyszek llvm-svn: 325697
* [X86] LowerBITCAST - pull out repeated calls to getOperand(0). NFCI.Simon Pilgrim2018-02-211-8/+7
| | | | llvm-svn: 325695
* [Sparc] Include __tls_get_addr in symbol table for TLS calls to itJonas Devlieghere2018-02-211-2/+14
| | | | | | | | | | | | | | | | | | Global Dynamic and Local Dynamic call relocations only implicitly reference __tls_get_addr; there is no connection in the ELF file between the relocations and the symbol other than the specification for the relocations' semantics. However, it still needs to be in the symbol table despite the lack of explicit references to the symbol table entry, since it needs to be bound at link time for these relocations, otherwise any objects will fail to link. For details, see https://sourceware.org/bugzilla/show_bug.cgi?id=22832. Patch by: James Clarke (jrtc27) Differential revision: https://reviews.llvm.org/D43271 llvm-svn: 325688
* AMDGPU: Do not combine loads/store across physreg defsNicolai Haehnle2018-02-211-1/+19
| | | | | | | | | | | | | | | | | | | Summary: Since this pass operates on machine SSA form, this should only really affect M0 in practice. Fixes various piglit variable-indexing/vs-varying-array-mat4-index-* Change-Id: Ib2a1dc3a8d7b08225a8da49a86f533faa0986aa8 Fixes: r317751 ("AMDGPU: Merge S_BUFFER_LOAD_DWORD_IMM into x2, x4") Reviewers: arsenm, mareko, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D40343 llvm-svn: 325677
* [AMDGPU][MC] Added lds support for MUBUF instructionsDmitry Preobrazhensky2018-02-214-54/+168
| | | | | | | | | See bug 28234: https://bugs.llvm.org/show_bug.cgi?id=28234 Differential Revision: https://reviews.llvm.org/D43472 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 325676
* [X86] Disable CLWB for Cannon LakeCraig Topper2018-02-211-1/+7
| | | | | | | | | | | | | Cannon Lake does not support CLWB, therefore it does not include all features listed under SKX anymore. Instead, enumerate all SKX features with the exception of CLWB. Patch by Gabor Buella Differential Revision: https://reviews.llvm.org/D43380 llvm-svn: 325654
* [mips] Spectre variant two mitigation for MIPSR2Simon Dardis2018-02-2114-38/+206
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch provides mitigation for CVE-2017-5715, Spectre variant two, which affects the P5600 and P6600. It implements the LLVM part of -mindirect-jump=hazard. It is _not_ enabled by default for the P5600. The mitigation strategy suggested by MIPS for these processors is to use hazard barrier instructions. 'jalr.hb' and 'jr.hb' are hazard barrier variants of the 'jalr' and 'jr' instructions respectively. These instructions impede the execution of the instruction stream until architecturally defined hazards (changes to the instruction stream, privileged registers which may affect execution) are cleared. These instructions in MIPS' designs are not speculated past. These instructions are used with the attribute +use-indirect-jump-hazard when branching indirectly and for indirect function calls. These instructions are defined by the MIPS32R2 ISA, so this mitigation method is not compatible with processors which implement an earlier revision of the MIPS ISA. Performance benchmarking of this option with -fpic and lld using -z hazardplt shows an overall time increase of roughly 10% for the LLVM testsuite. Certain benchmarks such as methcall show a substantially larger increase in time due to their nature. Reviewers: atanasyan, zoran.jovanovic Differential Revision: https://reviews.llvm.org/D43486 llvm-svn: 325653
* Revert "[AMDGPU] Increased vector length for global/constant loads."Konstantin Zhuravlyov2018-02-202-34/+2
| | | | | | | | | | https://reviews.llvm.org/rL325518 It breaks following OpenCL conformance tests: - Basic - parameter_types - Basic - vload_private llvm-svn: 325643
* [AArch64] Refactor instructions using SIMD immediatesEvandro Menezes2018-02-201-368/+281
| | | | | | | | | | | Get rid of icky goto loops and make the code easier to maintain. Otherwise, NFC. Restore r324903 and fix PR36369. Differential revision: https://reviews.llvm.org/D43364 llvm-svn: 325621
* [ARM] Lower BR_CC for f16Sjoerd Meijer2018-02-201-2/+1
| | | | | | | | This case wasn't handled yet. Differential Revision: https://reviews.llvm.org/D43508 llvm-svn: 325616
* [Hexagon] Handle *Low8 register classes in early if-conversionKrzysztof Parzyszek2018-02-201-0/+2
| | | | llvm-svn: 325606
* [X86] Correct SHRUNKBLEND creation to work correctly when there are multiple uses of the condition.Craig Topper2018-02-201-31/+23
| | | | | | | | | | | | | | SimplifyDemandedBits forces the demanded mask to all 1s if the node has multiple uses, unless the AssumeSingleUse flag is set. So previously we were only really likely to simplify something if the condition had a single use. And on the off chance we did simplify with multiple uses, the demanded mask being used was all ones, so there was no reason to create a shrunkblend. This patch now checks that the condition is only used by selects first, and then sets the AssumeSingleUse flag for the simplification. Then we convert the selects to shrunkblend, and finally replace the condition. Differential Revision: https://reviews.llvm.org/D43446 llvm-svn: 325604
* [X86] Promote 16-bit cmovs to 32-bitsCraig Topper2018-02-201-3/+54
| | | | | | | | | | This allows us to avoid an opsize prefix. And forcing some move immediates to i32 avoids a length changing prefix on those instructions. This mostly replaces the existing combine we had for zext/sext+cmov of constants. I left in a case for sign extending a 32 bit cmov of constants to 64 bits. Differential Revision: https://reviews.llvm.org/D43327 llvm-svn: 325601
* [mips] Correct the definition of cvt.d.wSimon Dardis2018-02-201-3/+2
| | | | | | | | An upcoming patch, D41434, changes the ordering of the matcher table for assembly. This patch corrects the definition of the normal MIPS cvt.d.w not to be available in microMIPS. llvm-svn: 325589
* [PowerPC] Reduce stack frame for fastcc functions by only allocating parameter save area when neededLei Huang2018-02-201-2/+11
| | | | | | | | | | | | Current implementation always allocates the parameter save area conservatively for fastcc functions. There is no reason to allocate the parameter save area if all the parameters can be passed via registers. Differential Revision: https://reviews.llvm.org/D42602 llvm-svn: 325581
* [Hexagon] Fix alignment calculation of stack objects in Hexagon bit trackerKrzysztof Parzyszek2018-02-203-6/+6
| | | | llvm-svn: 325580
* [ARM] Mark -1 as cheap in xor's for thumb1David Green2018-02-201-0/+7
| | | | | | | | | | We can always convert xor %a, -1 into MVN, even in thumb 1 where the -1 would not otherwise be considered a cheap constant. This prevents the -1's from being pulled out into constants and potentially hoisted. Differential Revision: https://reviews.llvm.org/D43451 llvm-svn: 325573
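The claim that `xor %a, -1` can always become MVN rests on the identity that XOR with an all-ones value is a bitwise NOT. The following Python sketch checks that identity on 32-bit values. It models only the arithmetic, and the function names are illustrative rather than taken from the ARM backend.

```python
MASK32 = 0xFFFFFFFF  # model a 32-bit register


def xor_with_minus_one(a):
    """Compute a ^ -1, truncated to 32 bits."""
    return (a ^ -1) & MASK32


def mvn(a):
    """Model MVN (move NOT): the bitwise complement, truncated to 32 bits."""
    return ~a & MASK32


samples = [0, 1, 0x7F, 0xFF, 0x1234, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
identity_holds = all(xor_with_minus_one(a) == mvn(a) for a in samples)
print(identity_holds)
```

Because the -1 operand disappears into the choice of instruction, it never has to be materialized in a register, which is the reason the commit treats it as a cheap constant.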