summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* [AArch64][Falkor] Refine load/store increment latencies.Geoff Berry2017-06-191-164/+242
| | | | | | Also fix LDXP & LDAXP write latency to avoid similar assert as PR33491 and PR33512. llvm-svn: 305750
* AMDGPU: Cleanup CreateLiveInRegisterMatt Arsenault2017-06-195-34/+45
| | | | llvm-svn: 305748
* Revert r305382, it caused PR33513.Nico Weber2017-06-191-6/+6
| | | | llvm-svn: 305735
* Revert r304824 "Fix PR23384 (part 3 of 3)"Hans Wennborg2017-06-192-13/+0
| | | | | | | | | | | | | | | | | This seems to be interacting badly with ASan somehow, causing false reports of heap-buffer overflows: PR33514. > Summary: > The patch makes instruction count the highest priority for > LSR solution for X86 (previously registers had highest priority). > > Reviewers: qcolombet > > Differential Revision: http://reviews.llvm.org/D30562 > > From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 305720
* [AArch64] Fix order of checks in shouldScheduleAdjacent.Florian Hahn2017-06-191-2/+2
| | | | | | | We need to check the opcode of FirstMI before accessing the operands. This caused a buildbot failure during bootstrapping on AArch64. llvm-svn: 305694
* AMDGPU/GlobalISel: Mark G_BITCAST s32 <--> <2 x s16> legalTom Stellard2017-06-191-0/+7
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D34129 llvm-svn: 305692
* [GlobalISel][X86] Fold FI/G_GEP into LDR/STR instruction addressing mode.Igor Breger2017-06-191-4/+42
| | | | | | | | | | | | | | Summary: Implement some of the simplest addressing modes.It should help to test ABI. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D33888 llvm-svn: 305691
* Recommit rL305677: [CodeGen] Add generic MacroFusion passFlorian Hahn2017-06-194-253/+53
| | | | | | | | | | | | | Use llvm::make_unique to avoid ambiguity with MSVC. This patch adds a generic MacroFusion pass, that is used on X86 and AArch64, which both define target-specific shouldScheduleAdjacent functions. This generic pass should make it easier for other targets to implement macro fusion and I intend to add macro fusion for ARM shortly. Differential Revision: https://reviews.llvm.org/D34144 llvm-svn: 305690
* [ARM] GlobalISel: Support G_ICMP for s8 and s16Diana Picus2017-06-191-0/+2
| | | | | | Widen to s32 (like all other binary ops). llvm-svn: 305683
* Revert r305677 [CodeGen] Add generic MacroFusion pass.Florian Hahn2017-06-194-53/+253
| | | | | | This causes Windows buildbot failures do an ambiguous call. llvm-svn: 305681
* [CodeGen] Add generic MacroFusion pass.Florian Hahn2017-06-194-253/+53
| | | | | | | | | | | | | | | | | | Summary: This patch adds a generic MacroFusion pass, that is used on X86 and AArch64, which both define target-specific shouldScheduleAdjacent functions. This generic pass should make it easier for other targets to implement macro fusion and I intend to add macro fusion for ARM shortly. Reviewers: craig.topper, evandro, t.p.northover, atrick, MatzeB Reviewed By: MatzeB Subscribers: atrick, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34144 llvm-svn: 305677
* [ARM] GlobalISel: Support G_ICMP for i32 and pointersDiana Picus2017-06-193-0/+119
| | | | | | | | | | | | | | Add support throughout the pipeline: - mark as legal for s32 and pointers - map to GPRs - lower to a sequence of instructions, which moves 0 or 1 into the result register based on the flags set by a CMPrr We have copied from FastISel a helper function which maps CmpInst predicates into ARMCC codes. Ideally, we should be able to move it somewhere that both FastISel and GlobalISel can use. llvm-svn: 305672
* Rework logic and comment out the default relocation models for PPC.Eric Christopher2017-06-171-10/+13
| | | | llvm-svn: 305630
* Turn a large if block into a smaller early return for clarity.Eric Christopher2017-06-171-11/+10
| | | | llvm-svn: 305629
* Remove the old and unused PPC32 and PPC64TargetMachine classes.Eric Christopher2017-06-172-47/+4
| | | | llvm-svn: 305628
* Remove unused forward declaration.Eric Christopher2017-06-171-1/+0
| | | | llvm-svn: 305627
* Tidy up some calls to getRegister for readability.Eric Christopher2017-06-171-5/+6
| | | | llvm-svn: 305626
* [PPC] Remove isBarrier from CFENCE8's definition.Tim Shen2017-06-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is my misunderstanding on isBarrier. It's not for memory barriers, but for other control flow purposes. lwsync doesn't have it either. This fixes a simple crash with -verify-machineinstrs like below: define void @Foo() { entry: %tmp = load atomic i64, i64* undef acquire, align 8 unreachable } I deliberately don't want to check in the test, since there is little chance to regress on such a mistake. Such a test adds noise to the code base. I plan to check in first, since it fixes a crash, and the fix is obvious. Reviewers: kbarton, echristo Subscribers: sanjoy, nemanjai, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D34314 llvm-svn: 305624
* [WebAssembly] Use __stack_pointer global when writing wasm binarySam Clegg2017-06-167-65/+28
| | | | | | | | | | | | | | | | | | This ensures that symbolic relocations are generated for stack pointer manipulations. These relocations are of type R_WEBASSEMBLY_GLOBAL_INDEX_LEB. This change also adds support for reading relocations of this type in WasmObjectFile.cpp. Since its a globally imported symbol this does mean that the get_global/set_global instruction won't be valid until the objects are linked that global used in no longer an imported global. Differential Revision: https://reviews.llvm.org/D34172 llvm-svn: 305616
* bpf: fix a strict-aliasing issueYonghong Song2017-06-161-11/+19
| | | | | | | | | | | | | | | | | | | | | | Davide Italiano reported the following issue if llvm is compiled with gcc -Wstrict-aliasing -Werror: ..... lib/Target/BPF/CMakeFiles/LLVMBPFCodeGen.dir/BPFISelDAGToDAG.cpp.o ../lib/Target/BPF/BPFISelDAGToDAG.cpp: In member function ‘virtual void {anonymous}::BPFDAGToDAGISel::PreprocessISelDAG()’: ../lib/Target/BPF/BPFISelDAGToDAG.cpp:264:26: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing] val = *(uint16_t *)new_val; ..... The error is caused by my previous commit (revision 305560). This patch fixed the issue by introducing an union to avoid type casting. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 305608
* bpf: avoid load from read-only sectionsYonghong Song2017-06-161-7/+233
| | | | | | | | | | | | | | | | | | | | | | | | | | | If users tried to have a structure decl/init code like below struct test_t t = { .memeber1 = 45 }; It is very likely that compiler will generate a readonly section to hold up the init values for variable t. Later load of t members, e.g., t.member1 will result in a read from readonly section. BPF program cannot handle relocation. This will force users to write: struct test_t t = {}; t.member1 = 45; This is just inconvenient and unintuitive. This patch addresses this issue by implementing BPF PreprocessISelDAG. For any load from a global constant structure or an global array of constant struct, it attempts to translate it into a constant directly. The traversal of the constant struct and other constant data structures are similar to where the assembler emits read-only sections. Four different unit test cases are also added to cover different scenarios. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 305560
* bpf: set missing types in insn tablegen fileYonghong Song2017-06-161-7/+7
| | | | | | | | | | | | o This is discovered during my study of 32-bit subregister support. o This is no impact on current functionality since we only support 64-bit registers. o Searching the web, looks like the issue has been discovered before, so fix it now. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 305559
* Revert "[mips][microMIPS] Extending size reduction pass with ADDIUSP and ↵Simon Dardis2017-06-161-97/+12
| | | | | | | | | ADDIUR1SP" This reverts commit r305455. This commit was reported as breaking one of the sanitizer buildbots. Reverting until lab.llvm.org comes back online. llvm-svn: 305557
* [Hexagon] Don't kill live registers when creating mux out of tfrKrzysztof Parzyszek2017-06-161-1/+31
| | | | | | | | | The second part of r305300: when placing the mux at the later location, make sure that it won't use any register that was killed between the two original instructions. Remove any such kills and transfer them to the mux. llvm-svn: 305553
* [AMDGPU] Testing commit access only, no real changeAlfred Huang2017-06-151-1/+1
| | | | llvm-svn: 305523
* DivergencyAnalysis patch for reviewAlexander Timofeev2017-06-153-1/+15
| | | | llvm-svn: 305494
* [MachineLICM] Hoist TOC-based address instructionsLei Huang2017-06-152-0/+15
| | | | | | | | | | | | | | | | | | Add condition for MachineLICM to safely hoist instructions that utilize non constant registers that are reserved. On PPC, global variable access is done through the table of contents (TOC) which is always in register X2. The ABI reserves this register in any functions that have calls or access global variables. A call through a function pointer involves saving, changing and restoring this register around the call and thus MachineLICM does not consider it to be invariant. We can however guarantee the register is preserved across the call and thus is invariant. Differential Revision: https://reviews.llvm.org/D33562 llvm-svn: 305490
* [PowerPC] fix potential verification errors on CFENCE8Hiroshi Inoue2017-06-151-1/+1
| | | | | | | | This patch fixes a potential verification error (64-bit register operands for cmpw) with -verify-machineinstrs. Differential Revision: https://reviews.llvm.org/D34208 llvm-svn: 305479
* [mips] Fix documentation of member variable. NFCI.Simon Dardis2017-06-151-1/+1
| | | | llvm-svn: 305478
* Remove trailing whitespace. NFCI.Simon Pilgrim2017-06-151-1/+1
| | | | llvm-svn: 305476
* [X86][AVX2] Fix issue in lowerV8I16GeneralSingleInputVectorShuffle that was ↵Simon Pilgrim2017-06-151-3/+4
| | | | | | | | | | assuming v8i16 vectors We can use this with v16i16/v32i16 as well. Found during fuzz testing. llvm-svn: 305472
* [AArch64] Add indexed check to splitStores. NFC.Nirav Dave2017-06-151-1/+1
| | | | | | | Add explicit check for unhandled cases in preparation for delaying splitStores to post-legalization. llvm-svn: 305471
* Revert r305465: [X86][AVX512] Improve lowering of AVX512 compare intrinsics ↵Simon Pilgrim2017-06-152-715/+25
| | | | | | | | (remove redundant shift left+right instructions). This is causing windows buildbot failures llvm-svn: 305470
* [X86][AVX512] Improve lowering of AVX512 compare intrinsics (remove ↵Ayman Musa2017-06-152-25/+715
| | | | | | | | | | | | | | | redundant shift left+right instructions). AVX512 compare instructions return v*i1 types. In cases where the number of elements in the returned value are less than 8, clang adds zeroes to get a mask of v8i1 type. Later on it's replaced with CONCAT_VECTORS, which then is lowered to many DAG nodes including insert/extract element and shift right/left nodes. The fact that AVX512 compare instructions put the result in a k register and zeroes all its upper bits allows us to remove the extra nodes simply by copying the result to the required register class. When lowering, identify these cases and transform them into an INSERT_SUBVECTOR node (marked legal), then catch this pattern in instructions selection phase and transform it into one avx512 cmp instruction. Differential Revision: https://reviews.llvm.org/D33188 llvm-svn: 305465
* [ARM] GlobalISel: Add support for i32 moduloDiana Picus2017-06-151-0/+45
| | | | | | | | | | | | | | | | | | Add support for modulo for targets that have hardware division and for those that don't. When hardware division is not available, we have to choose the correct libcall to use. This is generally straightforward, except for AEABI. The AEABI variant is trickier than the other libcalls because it returns { quotient, remainder }, instead of just one value like the other libcalls that we've seen so far. Therefore, we need to use custom lowering for it. However, we don't want to have too much special code, so we refactor the target-independent code in the legalizer by adding a helper for replacing an instruction with a libcall. This helper is used by the legalizer itself when dealing with simple calls, and also by the custom ARM legalization for the more complicated AEABI divmod calls. llvm-svn: 305459
* [ARM] GlobalISel: Lower only homogeneous struct argsDiana Picus2017-06-151-31/+24
| | | | | | | | | | | | | Lowering mixed struct args, params and returns used G_INSERT, which is a bit more convoluted to support through the entire pipeline. Since they don't occur that often in practice, it's probably wiser to leave them out until later. Meanwhile, we can lower homogeneous structs using G_MERGE_VALUES, which has good support in the legalizer. These occur e.g. as the return of __aeabi_idivmod, so it's nice to be able to support them. llvm-svn: 305458
* [AArch64] Enable FeatureFuseAES for the generic processor model.Florian Hahn2017-06-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Summary: Scheduling AESE/AESMC and AESD/AESIMC instruction pairs back-to-back gives a double digit speedup on benchmarks using those instructions on Cortex-A processors. In GCC, this optimization is part of the generic processor model as well. This change should not have a major performance impact on processors that do not optimize AES instruction pairs, although I only had access to Cortex-A processors for benchmarking. Reviewers: rengolin, kristof.beyls, javed.absar, evandro, silviu.baranga, MatzeB, mcrosier, joelkevinjones, joel_k_jones, bmakam, t.p.northover Reviewed By: evandro Subscribers: sbaranga, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D33836 llvm-svn: 305457
* [mips][microMIPS] Extending size reduction pass with ADDIUSP and ADDIUR1SPZoran Jovanovic2017-06-151-12/+97
| | | | | | | | | | | | Author: milena.vujosevic.janicic Reviewers: sdardis The patch extends size reduction pass for MicroMIPS. The following instructions are examined and transformed, if possible: ADDIU instruction is transformed into 16-bit instruction ADDIUSP ADDIU instruction is transformed into 16-bit instruction ADDIUR1SP Differential Revision: https://reviews.llvm.org/D33887 llvm-svn: 305455
* [x86] avoid unnecessary shuffle mask math in combineX86ShufflesRecursively()Sanjay Patel2017-06-141-6/+7
| | | | | | | | | | | | This is a follow-up to https://reviews.llvm.org/D34174 / https://reviews.llvm.org/rL305398. We mentioned replacing the multiplies with shifts, but the real win seems to be in bypassing the extra ops in the common case when the RootRatio and OpRatio are one. This gives us another 1-2% overall win for the test in PR32037: https://bugs.llvm.org/show_bug.cgi?id=32037 llvm-svn: 305414
* Test commit - NFC.Lei Huang2017-06-141-1/+1
| | | | | | Modified a comment to confirm commit access functionality. llvm-svn: 305402
* [x86] replace div/rem with shift/mask for better shuffle combining perfSanjay Patel2017-06-141-13/+31
| | | | | | | | | | | | We know that shuffle masks are power-of-2 sizes, but there's no way (?) for LLVM to know that, so hack combineX86ShufflesRecursively() to be much faster by replacing div/rem with shift/mask. This makes the motivating compile-time bug in PR32037 ( https://bugs.llvm.org/show_bug.cgi?id=32037 ) about 9% faster overall. Differential Revision: https://reviews.llvm.org/D34174 llvm-svn: 305398
* Revert "[ARM] Support constant pools in data when generating execute-only code."Alexandros Lamprineas2017-06-143-43/+15
| | | | | | | | | | | This reverts commit 3a204faa093c681a1e96c5e0622f50649b761ee0. I've upset a buildbot which runs the address sanitizer: ERROR: AddressSanitizer: stack-use-after-scope lib/Target/ARM/ARMISelLowering.cpp:2690 That Twine variable is used illegally. llvm-svn: 305390
* [mips] Fix multiprecision arithmetic.Simon Dardis2017-06-144-226/+187
| | | | | | | | | | | | | | | | | | | | | | | | | | For multiprecision arithmetic on MIPS, rather than using ISD::ADDE / ISD::ADDC, get SelectionDAG to break down the operation into ISD::ADDs and ISD::SETCCs. For MIPS, only the DSP ASE has a carry flag, so in the general case it is not useful to directly support ISD::{ADDE, ADDC, SUBE, SUBC} nodes. Also improve the generation code in such cases for targets with TargetLoweringBase::ZeroOrOneBooleanContent by directly using the result of the comparison node rather than using it in selects. Similarly for ISD::SUBE / ISD::SUBC. Address optimization breakage by moving the generation of MIPS specific integer multiply-accumulate nodes to before legalization. This revolves PR32713 and PR33424. Thanks to Simonas Kazlauskas and Pirama Arumuga Nainar for reporting the issue! Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D33494 llvm-svn: 305389
* [ARM] Support constant pools in data when generating execute-only code.Alexandros Lamprineas2017-06-143-15/+43
| | | | | | | | | | | | | | The ARM backend asserts against constant pool lowering when it generates execute-only code in order to prevent the generation of constant pools in the text section. It appears that target independent optimizations might generate DAG nodes that represent constant pools. By lowering such nodes as global addresses we don't violate the semantics of execute-only code and also it is guaranteed that execute-only behaves correct with the position-independent addressing modes that support execute-only code. Differential Revision: https://reviews.llvm.org/D33773 llvm-svn: 305387
* [mips] Fix machine verifier errors in the long branch passSimon Dardis2017-06-141-6/+6
| | | | | | | | | | | | | | | This patch fixes two systemic machine verifier errors in the long branch pass. The first is the incorrect basic block successors and the second was the incorrect construction of several jump instructions. This partially resolves PR27458 and the associated PR32146. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D33378 llvm-svn: 305382
* Revert r304907 as it is causing some failures that I cannot reproduce.Nemanja Ivanovic2017-06-141-26/+0
| | | | | | Reverting this until a test case can be provided to aid the investigation. llvm-svn: 305372
* [AMDGPU] Remove now dead defaultOffsetS13(). NFCI.Davide Italiano2017-06-131-5/+0
| | | | | | Fixes the GCC7 build with -Werror. llvm-svn: 305329
* [WebAssembly] Cleanup WebAssemblyWasmObjectWriterSam Clegg2017-06-132-16/+13
| | | | | | Differential Revision: https://reviews.llvm.org/D34131 llvm-svn: 305316
* [AArch64][Falkor] Fix sched details for FDIV, FSQRT, SDIV, UDIVGeoff Berry2017-06-131-11/+50
| | | | llvm-svn: 305310
* Test commit - NFC.Kit Barton2017-06-131-1/+1
| | | | | | Modified a comment to confirm commit access functionality. llvm-svn: 305309
OpenPOWER on IntegriCloud