path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* [x86] simplify code in combineExtractSubvector; NFC | Sanjay Patel | 2019-02-22 | 1 | -19/+21
  Only the 1st fold is attempted pre-legalization, but it requires legal (simple) types too, so we don't need an EVT in any of the code. llvm-svn: 354674
* [mips][micromips] fix filling delay slots for PseudoIndirectBranch_MM | Petar Jovanovic | 2019-02-22 | 1 | -0/+1
  Filling a delay slot in 32bit jump instructions with a 16bit instruction can cause issues. According to the documentation such an operation is unpredictable. This patch adds opcode Mips::PseudoIndirectBranch_MM alongside Mips::PseudoIndirectBranch and other instructions that are expanded to jr instruction and do not allow a 16bit instruction in their delay slots. Patch by Mirko Brkusanin. Differential Revision: https://reviews.llvm.org/D58507 llvm-svn: 354672
* [ARM] Add some missing thumb1 opcodes to enable peephole optimisation of CMPs | David Green | 2019-02-22 | 1 | -11/+54
  This adds a number of missing Thumb1 opcodes so that the peephole optimiser can remove redundant CMP instructions. Reapplying this after the first attempt broke non-thumb1 code as the t2ADDri instruction can be used with frame indices. In thumb1 we use tADDframe. Differential Revision: https://reviews.llvm.org/D57833 llvm-svn: 354667
* [ARM GlobalISel] Support floating point for Thumb2 | Diana Picus | 2019-02-22 | 1 | -29/+29
  This is exactly the same as arm mode, so for the instruction selector tests we just extract them to a new file and run with the same checks for both arm and thumb mode. For the legalizer we need to update the tests for soft float a bit, but only because BL and tBL are slightly different. We could be pedantic and check that we get a well-formed BL for arm mode and a tBL for thumb, but for the purposes of the legalizer test it's sufficient to just skip over the predicate operands in the checks. Also note that we have the pedantic checks in the divmod test, so we're covered. llvm-svn: 354665
* [WebAssembly] Remove getBottom function from CFGStackify (NFC) | Heejin Ahn | 2019-02-22 | 1 | -27/+4
  Summary: This removes the `getBottom` function and the bookkeeping map of <begin marker instruction, bottom BB>. Reviewers: dschuff Subscribers: sunfish, sbc100, jgravelle-google, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58319 llvm-svn: 354657
* [X86] Add a DAG combine for (aext_vector_inreg (aext_vector_inreg X)) -> (aext_vector_inreg X) to fix a regression from my previous commit. | Craig Topper | 2019-02-22 | 1 | -0/+21
  Type legalization is causing two nodes to be created here, but we can use a single node to extend from v8i16 to v2i64. llvm-svn: 354648
* [X86] Fix some copy/paste mistakes that caused a VR128 to be used as the address of a load in an isel pattern | Craig Topper | 2019-02-22 | 1 | -4/+4
  This was introduced in r354511. Fixes PR40811. llvm-svn: 354640
* AMDGPU: Remove debugger related subtarget features | Matt Arsenault | 2019-02-21 | 17 | -334/+13
  As far as I know these aren't needed anymore. llvm-svn: 354634
* [X86] Remove hasSideEffects=1 from the X87 pseudos with folded load. | Craig Topper | 2019-02-21 | 1 | -2/+4
  This was done in r321424 to prevent scheduling from reordering things. But now that we model FPCW as a dependency, I don't think the same scheduling we were trying to prevent can occur. llvm-svn: 354628
* AMDGPU/NFC: Cleanup subtarget predicates | Konstantin Zhuravlyov | 2019-02-21 | 14 | -138/+137
  Differential Revision: https://reviews.llvm.org/D58522 llvm-svn: 354620
* [x86] vectorize more cast ops in lowering to avoid register file transfers | Sanjay Patel | 2019-02-21 | 1 | -8/+21
  This is a follow-up to D56864. If we're extracting from a non-zero index before casting to FP, then shuffle the vector and optionally narrow the vector before doing the cast:
    cast (extelt V, C) --> extelt (cast (extract_subv (shuffle V, [C...]))), 0
  This might be enough to close PR39974: https://bugs.llvm.org/show_bug.cgi?id=39974 Differential Revision: https://reviews.llvm.org/D58197 llvm-svn: 354619
* Re-land "[AArch64][GlobalISel] Implement partial support for G_SHUFFLE_VECTOR"" | Amara Emerson | 2019-02-21 | 2 | -0/+152
  Thanks to Richard Trieu for pointing out that the failures were due to a use-after-free of an ArrayRef. llvm-svn: 354616
* [Hexagon] Use misaligned load instead of trap0(#0) for __builtin_trap | Krzysztof Parzyszek | 2019-02-21 | 3 | -1/+38
  The trap instruction is intercepted by various runtime environments, and instead of a crash it creates confusion. This reapplies r354606 with a fix. llvm-svn: 354611
* Revert r354606, it breaks asan tests | Krzysztof Parzyszek | 2019-02-21 | 3 | -38/+1
  llvm-svn: 354609
* [Hexagon] Use misaligned load instead of trap0(#0) for __builtin_trap | Krzysztof Parzyszek | 2019-02-21 | 3 | -1/+38
  The trap instruction is intercepted by various runtime environments, and instead of a crash it creates confusion. llvm-svn: 354606
* [AMDGPU] remove unused AssemblerPredicates | Mark Searles | 2019-02-21 | 1 | -5/+1
  An internal build is hitting asserts complaining about too many subtarget features:
    llvm/utils/TableGen/Types.cpp:42: const char* llvm::getMinimalTypeForEnumBitfield(uint64_t): Assertion `MaxIndex <= 64 && "Too many bits"' failed.
    llvm/utils/TableGen/AsmMatcherEmitter.cpp:1476: void {anonymous}::AsmMatcherInfo::buildInfo(): Assertion `SubtargetFeatures.size() <= 64 && "Too many subtarget features!"' failed.
  The short-term solution is to remove a few unused AssemblerPredicates to get under the limit. The long-term solution seems to be to revisit these asserts. E.g., rather than hardcoded '64', use the standard sized std::bitset like the other places that track subtarget features.
  Differential Revision: https://reviews.llvm.org/D58516 llvm-svn: 354604
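The long-term direction mentioned in this message (a std::bitset in place of a hard 64-bit limit) can be illustrated with a small standalone sketch. It is not TableGen or LLVM code: the `FeatureSet` class, the `kNumFeatures` constant, and the value 96 are hypothetical names and numbers chosen only to show a feature set that is not capped at 64 entries.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical feature count; any value above 64 makes the point.
constexpr std::size_t kNumFeatures = 96;

// A feature set backed by std::bitset rather than a single uint64_t,
// so the number of trackable features is not limited to 64.
class FeatureSet {
public:
  // Mark the feature at Index as enabled (throws std::out_of_range if
  // Index >= kNumFeatures).
  void enable(std::size_t Index) { Bits.set(Index); }

  // Query whether the feature at Index is enabled.
  bool has(std::size_t Index) const { return Bits.test(Index); }

  // Number of enabled features.
  std::size_t count() const { return Bits.count(); }

  // True if every feature enabled in Required is also enabled here.
  bool satisfies(const FeatureSet &Required) const {
    return (Bits & Required.Bits) == Required.Bits;
  }

private:
  std::bitset<kNumFeatures> Bits;
};
```

With this shape, enabling index 70 works the same way as enabling index 3, which a single 64-bit mask cannot represent.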
* AMDGPU/GlobalISel: Make phis legal | Matt Arsenault | 2019-02-21 | 1 | -0/+13
  llvm-svn: 354592
* [X86] Fix copy-paste error in @ccz flag. | Nirav Dave | 2019-02-21 | 1 | -1/+1
  @ccz operand should be equivalent to @cce. llvm-svn: 354588
* AMDGPU/GlobalISel: Fix bit count ops for non-power-of-2 types | Matt Arsenault | 2019-02-21 | 1 | -1/+3
  llvm-svn: 354587
* [RISCV][NFC] IsEligibleForTailCallOptimization -> isEligibleForTailCallOptimization | Alex Bradbury | 2019-02-21 | 2 | -9/+8
  Also clang-format the modified hunks. llvm-svn: 354584
* [RISCV] Add implied zero offset load/store alias patterns | Alex Bradbury | 2019-02-21 | 4 | -0/+81
  Allow load/store instructions with implied zero offset for compatibility with GNU assembler. Differential Revision: https://reviews.llvm.org/D57141 Patch by James Clarke. llvm-svn: 354581
* [ARM GlobalISel] Support G_FRAME_INDEX for Thumb2 | Diana Picus | 2019-02-21 | 2 | -2/+5
  Same as arm mode. llvm-svn: 354579
* [X86][SSE] combineX86ShufflesRecursively - moved to generic op input index lookup. NFCI. | Simon Pilgrim | 2019-02-21 | 1 | -7/+3
  We currently bail if the target shuffle decodes to more than 2 input vectors, this change alters the input index to work for any number of inputs for when we drop that requirement. llvm-svn: 354575
* Revert 354564: [ARM] Add some missing thumb1 opcodes to enable peephole optimisation of CMPs | David Green | 2019-02-21 | 1 | -54/+12
  I believe it's causing bootstrap failures for A32 code. I'll take a look at what's wrong. llvm-svn: 354569
* [AArch64] Print instruction before atomic semantic annotations | David Spickett | 2019-02-21 | 1 | -5/+6
  Commit r353303 added annotations when acquire semantics were dropped from an instruction. printAnnotation was called before printInstruction. So if you didn't set a separate comment output stream you got <comment><instr> instead of <instr><comment> as expected. To fix this move the new printAnnotation to after the instruction is printed. Differential Revision: https://reviews.llvm.org/D58059 llvm-svn: 354565
* [ARM] Add some missing thumb1 opcodes to enable peephole optimisation of CMPs | David Green | 2019-02-21 | 1 | -12/+54
  This adds a number of missing Thumb1 opcodes so that the peephole optimiser can remove redundant CMP instructions. Differential Revision: https://reviews.llvm.org/D57833 llvm-svn: 354564
* [ARM] Negative constants mishandled in ARM CGP | Sam Parker | 2019-02-21 | 1 | -5/+5
  During type promotion, sometimes we convert an add with a negative constant into a sub with a positive constant. The loop that performs this transformation has two issues:
  - it iterates over a set, causing non-determinism.
  - it breaks, instead of continuing, when it finds the first non-negative operand.
  Differential Revision: https://reviews.llvm.org/D58452 llvm-svn: 354557
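The two loop issues listed in this message can be shown on a simplified model. The sketch below is not the ARM CodeGenPrepare code: `Opcode`, `Inst`, and `flipNegativeAdds` are hypothetical stand-ins, and each instruction is assumed to carry a single immediate operand. It iterates over a vector (fixed order) and uses `continue` so that a non-negative operand does not stop the scan.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Simplified stand-in for an IR instruction: an opcode plus one immediate.
enum class Opcode { Add, Sub };

struct Inst {
  Opcode Op;
  std::int64_t Imm;
};

// Rewrite `add x, -C` as `sub x, C` for every candidate in Insts.
void flipNegativeAdds(std::vector<Inst> &Insts) {
  // A vector is visited in insertion order, so the result is deterministic,
  // unlike iteration over a pointer-keyed set.
  for (Inst &I : Insts) {
    // Skip non-candidates but keep scanning. A `break` here would leave
    // every later negative add untouched.
    if (I.Op != Opcode::Add || I.Imm >= 0)
      continue;
    // Negating the minimum value would overflow, so leave it alone.
    if (I.Imm == std::numeric_limits<std::int64_t>::min())
      continue;
    I.Op = Opcode::Sub;
    I.Imm = -I.Imm;
  }
}
```

Given the sequence `add -4`, `add 3`, `add -7`, both negative adds are rewritten; with `break` in place of the first `continue`, the scan would stop at `add 3` and miss `add -7`.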
* [WebAssembly] Default to something reasonable in WebAssemblyAddMissingPrototypes | Sam Clegg | 2019-02-21 | 1 | -7/+13
  Previously if we couldn't derive a prototype for a "no-prototype" function from C we would leave it as is: `void foo(...)`. With this change we instead give it an empty signature and remove the "no-prototype" attribute. This fixes the current wasm waterfall test failure. Tags: #llvm Differential Revision: https://reviews.llvm.org/D58488 llvm-svn: 354544
* [AMDGPU] fix commuted case of sub combine | Stanislav Mekhanoshin | 2019-02-21 | 1 | -5/+1
  Differential Revision: https://reviews.llvm.org/D58481 llvm-svn: 354543
* Revert "[AArch64][GlobalISel] Implement partial support for G_SHUFFLE_VECTOR" | Amara Emerson | 2019-02-21 | 2 | -153/+0
  This reverts r354521 because it broke the bots, but passes on Darwin somehow. llvm-svn: 354532
* [WebAssembly] Don't error on conflicting uses of prototype-less functions | Sam Clegg | 2019-02-20 | 1 | -6/+8
  When we can't determine with certainty the signature of a function import, we pick the first signature we find rather than erroring out. The resulting program might not do what is expected since we might pick the wrong signature. However, since it is undefined behavior in C to use the same function with different signatures, this seems better than refusing to compile such programs. Fixes PR40472 Differential Revision: https://reviews.llvm.org/D58304 llvm-svn: 354523
* [AArch64][GlobalISel] Implement partial support for G_SHUFFLE_VECTOR | Amara Emerson | 2019-02-20 | 2 | -0/+153
  This change makes some basic type combinations for G_SHUFFLE_VECTOR legal, and implements them with a very pessimistic TBL2 instruction in the selector. For TBL2, support is also needed to generate constant pool entries and load from them in order to materialize the mask register. Currently supports <2 x s64> and <4 x s32> result types. Differential Revision: https://reviews.llvm.org/D58466 llvm-svn: 354521
* AMDGPU/GlobalISel: Move SMRD selection logic to TableGen | Tom Stellard | 2019-02-20 | 4 | -128/+136
  Reviewers: arsenm Reviewed By: arsenm Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52922 llvm-svn: 354516
* [SDAG] Support vector UMULO/SMULO | Nikita Popov | 2019-02-20 | 1 | -0/+2
  Second part of https://bugs.llvm.org/show_bug.cgi?id=40442. This adds an extra UnrollVectorOverflowOp() method to SDAG, because the general UnrollOverflowOp() method can't deal with multiple results. Additionally we need to expand UMULO/SMULO during vector op legalization, as it may result in unrolling, which may need additional type legalization. Differential Revision: https://reviews.llvm.org/D57997 llvm-svn: 354513
* [X86] Add more load folding patterns for blend instructions as a follow up to r354363. | Craig Topper | 2019-02-20 | 1 | -6/+66
  This avoids depending on the peephole pass to do load folding. Also adds some load folding for some insert_subvector patterns that use blend. All of this was found by temporarily adding TB_NO_FORWARD to the blend immediate entries in the load folding tables. I've added -disable-peephole to some of the affected tests from that experiment to ensure we're testing isel patterns. llvm-svn: 354511
* [X86][SSE] combineX86ShufflesRecursively - begin generalizing the number of shuffle inputs. NFCI. | Simon Pilgrim | 2019-02-20 | 1 | -11/+7
  We currently bail if the target shuffle decodes to more than 2 input vectors, this is some initial cleanup that still has the limit but generalizes the opindices to an array that will be necessary when we drop the limit. llvm-svn: 354489
* GlobalISel: Fix fewerElementsVector for ctlz with different result type | Matt Arsenault | 2019-02-20 | 1 | -2/+2
  Also complete the set of related operations. llvm-svn: 354480
* GlobalISel: Implement moreElementsVector for g_insert results | Matt Arsenault | 2019-02-20 | 1 | -14/+24
  llvm-svn: 354477
* [Hexagon] Split vector pairs for ISD::SIGN_EXTEND and ISD::ZERO_EXTEND | Krzysztof Parzyszek | 2019-02-20 | 2 | -0/+7
  llvm-svn: 354473
* [MIPS MSA] Avoid some DAG combines for vector shifts | Petar Avramovic | 2019-02-20 | 2 | -0/+9
  The DAG combiner combines two shifts into a shift plus an AND with a bitmask. Avoid such combines for vectors, since leaving the two vector shifts as they are produces better end results. Differential Revision: https://reviews.llvm.org/D58225 llvm-svn: 354461
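The combine this message refers to rests on an arithmetic identity that can be checked on scalars. The sketch below is only a 32-bit scalar analogue, not DAG combiner or MSA code; the function names `twoShifts` and `shiftAndMask` are hypothetical, and the identity is shown under the assumption `C1 <= C2 < 32`.

```cpp
#include <cstdint>

// Two-shift form: logical shift right by C1, then shift left by C2.
constexpr std::uint32_t twoShifts(std::uint32_t X, unsigned C1, unsigned C2) {
  return (X >> C1) << C2;
}

// Combined form: one shift by the difference, then an AND with a mask that
// clears the low C2 bits. Assumes C1 <= C2 < 32.
constexpr std::uint32_t shiftAndMask(std::uint32_t X, unsigned C1,
                                     unsigned C2) {
  return (X << (C2 - C1)) & (~std::uint32_t{0} << C2);
}

// Compile-time spot check for equal shift amounts: the low 4 bits are cleared.
static_assert(twoShifts(0xDEADBEEFu, 4, 4) == 0xDEADBEE0u,
              "shifting right then left by 4 clears the low nibble");
```

Both forms compute the same value; which one is cheaper depends on the target, which is the point of the commit: for MSA vectors the two-shift form is kept.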
* [X86] Remove FeatureSlowIncDec from Sandy Bridge and later Intel Core CPUs | Craig Topper | 2019-02-20 | 1 | -1/+0
  Summary: Inc and Dec were at one point slow on Intel CPUs due to their tendency to cause partial flag stalls on P6 derived CPU cores. This is because these instructions are defined to preserve the carry flag. This partial flag stall issue persisted until Sandy Bridge when flag merging was changed to be handled as a data dependency instead of as a stall until retirement. Sandy Bridge and later CPUs rename the C flag separately from OSPAZ so there is no flag merge needed on INC/DEC to preserve the C flag. Given these improvements I don't know why INC/DEC was ever considered slow on Sandy Bridge. If anything they should have been disabled on the earlier CPUs instead. Note after this patch, INC/DEC are still considered slow on Silvermont, Goldmont, Knights Landing and our generic "x86-64" CPU. Reviewers: spatel, RKSimon, chandlerc Reviewed By: chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D58412 llvm-svn: 354436
* Temporarily Revert "[X86][SLP] Enable SLP vectorization for 128-bit horizontal X86 instructions (add, sub)" | Eric Christopher | 2019-02-20 | 2 | -8/+0
  As this has broken the lto bootstrap build for 3 days and is showing a significant regression on the Dither_benchmark results (from the LLVM benchmark suite) -- specifically, on the BENCHMARK_FLOYD_DITHER_128, BENCHMARK_FLOYD_DITHER_256, and BENCHMARK_FLOYD_DITHER_512; the others are unchanged. These have regressed by about 28% on Skylake, 34% on Haswell, and over 40% on Sandybridge. This reverts commit r353923. llvm-svn: 354434
* [RISCV] Implement pseudo instructions for load/store from a symbol address. | Kito Cheng | 2019-02-20 | 5 | -14/+137
  Summary: These pseudo-instructions let load/store instructions load from or store to a symbol; they always use PC-relative addressing to generate the symbol address. Reviewers: asb, apazos, rogfer01, jrtc27 Differential Revision: https://reviews.llvm.org/D50496 llvm-svn: 354430
* [PowerPC] exploit P9 instruction maddld. | Chen Zheng | 2019-02-20 | 2 | -4/+12
  Differential Revision: https://reviews.llvm.org/D58364 llvm-svn: 354427
* [WebAssembly] Refactor atomic operation definitions (NFC) | Heejin Ahn | 2019-02-20 | 1 | -205/+226
  Summary:
  - Make `ATOMIC_I`, `ATOMIC_NRI`, `AtomicLoad`, `AtomicStore` classes and make other operations inherit from them
  - Factor the common opcode prefix '0xfe' out from the opcodes into the common class
  - Reorder instructions in the order of increasing opcodes
  Reviewers: tlively Subscribers: dschuff, sbc100, jgravelle-google, sunfish, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58338 llvm-svn: 354421
* [WebAssembly] Fix load/store name detection for atomic instructions | Heejin Ahn | 2019-02-20 | 2 | -7/+6
  Summary: Fixed a bug in the routine in AsmParser that determines whether the current instruction is a load or a store. Atomic instructions' prefixes are not `atomic_` but `atomic.`, and all atomic instructions are also memory instructions. Also fixed the printing format of atomic instructions to match other memory instructions and added encoding tests for atomic instructions. Reviewers: aardappel, tlively Subscribers: dschuff, sbc100, jgravelle-google, sunfish, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58337 llvm-svn: 354419
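The name-detection rule described in this message can be sketched as a standalone string check. This is not the AsmParser routine itself: `isAtomicName` and `isMemoryInstrName` are hypothetical helpers, and matching on the substrings `.load` and `.store` is an assumption made for illustration. The sketch only encodes the two facts stated above: the atomic marker is `atomic.` (not `atomic_`), and every atomic instruction counts as a memory instruction.

```cpp
#include <string>

// Hypothetical helper: true if the mnemonic carries the `atomic.` marker.
bool isAtomicName(const std::string &Name) {
  return Name.find("atomic.") != std::string::npos;
}

// Hypothetical helper: true if the mnemonic names a memory instruction.
// Atomics always qualify; otherwise look for a load or store component.
bool isMemoryInstrName(const std::string &Name) {
  if (isAtomicName(Name))
    return true;
  return Name.find(".load") != std::string::npos ||
         Name.find(".store") != std::string::npos;
}
```

Under this rule `i32.atomic.rmw.add` is classified as a memory instruction even though its name contains neither `load` nor `store`, while a check for `atomic_` would never match it.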
* [WebAssembly] Fixed disassembler not knowing about OPERAND_EVENT | Wouter van Oortmerssen | 2019-02-20 | 1 | -0/+1
  Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58414 llvm-svn: 354416
* [WebAssembly] Update MC for bulk memory | Thomas Lively | 2019-02-19 | 1 | -0/+1
  Summary: Rename MemoryIndex to InitFlags and implement logic for determining data segment layout in ObjectYAML and MC. Also adds a "passive" flag for the .section assembler directive although this cannot be assembled yet because the assembler does not support data sections. Reviewers: sbc100, aardappel, aheejin, dschuff Subscribers: jgravelle-google, hiraditya, sunfish, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57938 llvm-svn: 354397
* [X86] Mark FP32_TO_INT16_IN_MEM/FP32_TO_INT32_IN_MEM/FP32_TO_INT64_IN_MEM as clobbering EFLAGS to prevent mis-scheduling during conversion from SelectionDAG to MIR. | Craig Topper | 2019-02-19 | 1 | -1/+3
  After r354178, these instructions expand to a sequence that uses an OR instruction. That OR clobbers EFLAGS so we need to state that to avoid accidentally using the clobbered flags. Our tests show the bug, but I didn't notice because the SETcc instructions didn't move after r354178 since it used to be safe to do the fp->int conversion first. We should probably convert this whole sequence to SelectionDAG instead of a custom inserter to avoid mistakes like this. Fixes PR40779 llvm-svn: 354395
* PowerPC: Fix typos in comments | Jinsong Ji | 2019-02-19 | 1 | -2/+2
  llvm-svn: 354382