summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* [X86] Optimize v2i32/v2f32 scatters.Craig Topper2018-01-112-30/+58
| | | | | | | | If the index is v2i64 we can use the scatter instruction that has v4i32/v4f32 data register, v2i64 index, and v2i1 mask. Similar was already done for gather. Implement custom widening for v2i32 data to remove the code that reverses type legalization during lowering. llvm-svn: 322254
* Revert "AArch64: Fix emergency spillslot being out of reach for large ↵Matthias Braun2018-01-106-52/+11
| | | | | | | | | | | | callframes" Revert for now as the testcase is hitting a pre-existing verifier error that manifest as a failure when expensive checks are enabled (or -verify-machineinstrs) is used. This reverts commit r322200. llvm-svn: 322231
* [X86] Move HasNOPL to a subtarget feature bit. Plumb MCSubtargetInfo through ↵Craig Topper2018-01-104-57/+79
| | | | | | | | | | the MCAsmBackend constructor After D41349, we can no get a MCSubtargetInfo into the MCAsmBackend constructor. This allows us to get NOPL from a subtarget feature rather than a CPU name blacklist. Differential Revision: https://reviews.llvm.org/D41721 llvm-svn: 322227
* [RISCV] Implement support for the BranchRelaxation passAlex Bradbury2018-01-105-9/+133
| | | | | | | | | Branch relaxation is needed to support branch displacements that overflow the instruction's immediate field. Differential Revision: https://reviews.llvm.org/D40830 llvm-svn: 322224
* [RISCV] Implement branch analysisAlex Bradbury2018-01-102-0/+182
| | | | | | | | | This is a prerequisite for the branch relaxation pass, and allows a number of optimisation passes (e.g. BranchFolding and MachineBlockPlacement) to work. Differential Revision: https://reviews.llvm.org/D40808 llvm-svn: 322222
* [RISCV] Add support for llvm.{frameaddress,returnaddress} intrinsicsAlex Bradbury2018-01-102-0/+59
| | | | llvm-svn: 322218
* [RISCV] Add basic support for inline asm constraintsAlex Bradbury2018-01-104-0/+96
| | | | llvm-svn: 322217
* [RISCV] Support stack frames and offsets up to 32-bitsAlex Bradbury2018-01-105-11/+79
| | | | | | Differential Revision: https://reviews.llvm.org/D40807 llvm-svn: 322216
* [RISCV] Support for varargsAlex Bradbury2018-01-104-24/+183
| | | | | | | | | | | | Includes support for expanding va_copy. Also adds support for using 'aligned' registers when necessary for vararg calls, and ensure the frame pointer always points to the bottom of the vararg spill region. This is necessary to ensure that the saved return address and stack pointer are always available at fixed known offsets of the frame pointer. Differential Revision: https://reviews.llvm.org/D40805 llvm-svn: 322215
* [SelectionDAG][X86] Explicitly store the scale in the gather/scatter ISD nodesCraig Topper2018-01-103-16/+20
| | | | | | | | | | Currently we infer the scale at isel time by analyzing whether the base is a constant 0 or not. If it is we assume scale is 1, else we take it from the element size of the pass thru or stored value. This seems a little weird and I think it makes more sense to make it explicit in the DAG rather than doing tricky things in the backend. Most of this patch is just making sure we copy the scale around everywhere. Differential Revision: https://reviews.llvm.org/D40055 llvm-svn: 322210
* [MachineOutliner] Outline ADRPsJessica Paquette2018-01-101-0/+6
| | | | | | | | | ADRP instructions weren't being outlined because they're PC-relative and thus fail the LR checks. This patch adds a special case for ADRPs to getOutliningType to make sure that ADRPs can be outlined and updates the MIR test. llvm-svn: 322207
* AArch64: Fix emergency spillslot being out of reach for large callframesMatthias Braun2018-01-106-11/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | Large callframes (calls with several hundreds or thousands or parameters) could lead to situations in which the emergency spillslot is out of range to be addressed relative to the stack pointer. This commit forces the use of a frame pointer in the presence of large callframes. This commit does several things: - Compute max callframe size at the end of instruction selection. - Add mirFileLoaded target callback. Use it to compute the max callframe size after loading a .mir file when the size wasn't specified in the file. - Let TargetFrameLowering::hasFP() return true if there exists a callframe > 255 bytes. - Always place the emergency spillslot close to FP if we have a frame pointer. - Note that `useFPForScavengingIndex()` would previously return false when a base pointer was available leading to the emergency spillslot getting allocated late (that's the whole effect of this callback). Which made no sense to me so I took this case out: Even though the emergency spillslot is technically not referenced by FP in this case we still want it allocated early. Differential Revision: https://reviews.llvm.org/D40876 llvm-svn: 322200
* [X86][MMX] Pull out common MMX VT test. NFCI.Simon Pilgrim2018-01-101-28/+27
| | | | llvm-svn: 322195
* [AMDGPU][MC][GFX8][GFX9] Added XNACK_MASK supportDmitry Preobrazhensky2018-01-108-5/+54
| | | | | | | | | See bug 35764: https://bugs.llvm.org/show_bug.cgi?id=35764 Differential Revision: https://reviews.llvm.org/D41614 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 322189
* [AArch64][SVE] Asm: Add support for (mov|dup) of scalarSander de Smalen2018-01-102-0/+37
| | | | | | | | | | | | | | Summary: This patch adds support for 'dup' (Scalar -> SVE) and its corresponding 'mov' alias. Reviewers: fhahn, rengolin, evandro, echristo Reviewed By: fhahn Subscribers: aemerson, javed.absar, tschuett, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D41822 llvm-svn: 322172
* [ARM GlobalISel] Map G_FNEG to the FPR bankDiana Picus2018-01-101-1/+2
| | | | llvm-svn: 322169
* [ARM GlobalISel] Legalize G_FNEG for s32 and s64Diana Picus2018-01-101-1/+2
| | | | | | | | | | | | | For hard float, it is legal. For soft float, we need to lower to 0 - x first, and then we can use the libcall for G_FSUB. This is undoing some of the canonicalization performed by the IRTranslator (which introduces G_FNEG when it sees a 0 - x). Ideally, that canonicalization would be performed by a pre-legalizer pass that would allow targets to opt out of this behaviour rather than dance around it in the legalizer. llvm-svn: 322168
* [TableGen][AsmMatcherEmitter] Generate assembler checks for tied operandsSander de Smalen2018-01-101-0/+3
| | | | | | | | | | | | | | | | | | | | Summary: This extends TableGen's AsmMatcherEmitter with code that generates a table with tied-operand constraints. The constraints are checked when parsing the instruction. If an operand is not equal to its tied operand, the assembler will give an error. Patch [2/3] in a series to add operand constraint checks for SVE's predicated ADD/SUB. Reviewers: olista01, rengolin, mcrosier, fhahn, craig.topper, evandro, echristo Reviewed By: fhahn Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D41446 llvm-svn: 322166
* Temporarily revertJonas Paulsson2018-01-101-25/+15
| | | | | | | | "[SystemZ] Check for legality before doing LOAD AND TEST transformations." , due to test failures. llvm-svn: 322165
* [ARM GlobalISel] Legalize s32/s64 G_FCONSTANTDiana Picus2018-01-101-3/+14
| | | | | | | | Legal for hard float. Change to G_CONSTANT for soft float (but preserve the binary representation). llvm-svn: 322164
* [ARM GlobalISel] Legalize G_CONSTANT for scalars > 32 bitsDiana Picus2018-01-101-3/+4
| | | | | | Make G_CONSTANT narrow for any scalars larger than 32 bits. llvm-svn: 322162
* [SystemZ] Check for legality before doing LOAD AND TEST transformations.Jonas Paulsson2018-01-101-15/+25
| | | | | | | | | | Since a load and test instruction treat its operands as signed, it can only replace a logical compare for EQ/NE uses. Review: Ulrich Weigand https://bugs.llvm.org/show_bug.cgi?id=35662 llvm-svn: 322161
* [PowerPC] Manually schedule the prologue and epilogueStefan Pintilie2018-01-091-6/+62
| | | | | | | | | | | | | | | | | | | | | This patch makes the following changes to the schedule of instructions in the prologue and epilogue. The stack pointer update is moved down in the prologue so that the callee saves do not have to wait for the update to happen. Saving the lr is moved down in the prologue to hide the latency of the mflr. The stack pointer is moved up in the epilogue so that restoring of the lr can happen sooner. The mtlr is moved up in the epilogue so that it is away form the blr at the end of the epilogue. The latency of the mtlr can now be hidden by the loads of the callee saved registers. This commit is almost identical to this one: r322036 except that two warnings that broke build bots have been fixed. The revision number is D41737 as before. llvm-svn: 322124
* [AMDGPU] Fixed incorrect uniform branch conditionTim Renouf2018-01-091-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: I had a case where multiple nested uniform ifs resulted in code that did v_cmp comparisons, combining the results with s_and_b64, s_or_b64 and s_xor_b64 and using the resulting mask in s_cbranch_vccnz, without first ensuring that bits for inactive lanes were clear. There was already code for inserting an "s_and_b64 vcc, exec, vcc" to clear bits for inactive lanes in the case that the branch is instruction selected as s_cbranch_scc1 and is then changed to s_cbranch_vccnz in SIFixSGPRCopies. I have added the same code into SILowerControlFlow for the case that the branch is instruction selected as s_cbranch_vccnz. This de-optimizes the code in some cases where the s_and is not needed, because vcc is the result of a v_cmp, or multiple v_cmp instructions combined by s_and/s_or. We should add a pass to re-optimize those cases. Reviewers: arsenm, kzhuravl Subscribers: wdng, yaxunl, t-tye, llvm-commits, dstuttard, timcorringham, nhaehnle Differential Revision: https://reviews.llvm.org/D41292 llvm-svn: 322119
* [COST]Fix PR35865: Fix cost model evaluation for shuffle on X86.Alexey Bataev2018-01-091-1/+2
| | | | | | | | | | | | | | Summary: If the vector type is transformed to non-vector single type, the compile may crash trying to get vector information about non-vector type. Reviewers: RKSimon, spatel, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41862 llvm-svn: 322106
* [WebAssembly] Update libcall signature listsDerek Schuff2018-01-091-0/+60
| | | | | | New signatures added in r322087. A fix for this tight coupling is forthcoming. llvm-svn: 322105
* [X86] Add a DAG combine to combine (sext (setcc)) with VLXCraig Topper2018-01-091-0/+42
| | | | | | | | | | | | Normally target independent DAG combine would do this combine based on getSetCCResultType, but with VLX getSetCCResultType returns a vXi1 type preventing the DAG combining from kicking in. But doing this combine can allow us to remove the explicit sign extend that would otherwise be emitted. This patch adds a target specific DAG combine to combine the sext+setcc when the result type is the same size as the input to the setcc. I've restricted this to FP compares and things that can be represented with PCMPEQ and PCMPGT since we don't have full integer compare support on the older ISAs. Differential Revision: https://reviews.llvm.org/D41850 llvm-svn: 322101
* [CodeGen] Don't print "pred:" and "opt:" in -debug outputFrancis Visoiu Mistrih2018-01-091-3/+3
| | | | | | | | | | In -debug output we print "pred:" whenever a MachineOperand is a predicate operand in the instruction descriptor, and "opt:" whenever a MachineOperand is an optional def in the instruction descriptor. Differential Revision: https://reviews.llvm.org/D41870 llvm-svn: 322096
* Recommit r322073: [AArch64][SVE] Asm: Add predicated ADD/SUB instructionsSander de Smalen2018-01-094-4/+38
| | | | | | | | | Fixed issue that was found on sanitizer-x86_64-linux-fast. I changed the result type of 'Parser.getTok().getString().lower()' in AArch64AsmParser::tryParseSVEPredicateVector() from 'StringRef' to 'auto', since StringRef::lower() returns a std::string. llvm-svn: 322092
* Reverted r322073 because of AddressSanitizer failure onSander de Smalen2018-01-093-37/+3
| | | | | | sanitizer-x86_64-linux-fast builder. llvm-svn: 322077
* [AArch64][SVE] Asm: Add predicated ADD/SUB instructionsSander de Smalen2018-01-093-3/+37
| | | | | | | | | | | | | | | | | Summary: Add the predicated ADD/SUB instructions and corresponding tests. Patch [3/3] in a series to add predicated ADD/SUB instructions for SVE. Reviewers: rengolin, mcrosier, evandro, fhahn, echristo Reviewed By: fhahn Subscribers: aemerson, javed.absar, tschuett, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D41443 llvm-svn: 322073
* [AArch64][SVE] Asm: Add parsing of merging/zeroing suffix for SVE predicate ↵Sander de Smalen2018-01-092-0/+32
| | | | | | | | | | | | | | | | | | | vector operands Summary: Parsing of the '/m' (merging) or '/z' (zeroing) suffix of a predicate operand. Patch [2/3] in a series to add predicated ADD/SUB instructions for SVE. Reviewers: rengolin, mcrosier, evandro, fhahn, echristo, MatzeB, t.p.northover Reviewed By: fhahn Subscribers: t.p.northover, MatzeB, aemerson, javed.absar, tschuett, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D41442 llvm-svn: 322070
* [Nios2] Arithmetic instructions for R1 and R2 ISA.Nikolai Bozhenov2018-01-095-11/+138
| | | | | | | | | | | | | | Summary: This commit enables some of the arithmetic instructions for Nios2 ISA (for both R1 and R2 revisions), implements facilities required to emit those instructions and provides LIT tests for added instructions. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D41236 Author: belickim <mateusz.belicki@intel.com> llvm-svn: 322069
* Instrument Control Flow For Indirect Branch TrackingOren Ben Simhon2018-01-095-1/+176
| | | | | | | | | | | | | CET (Control-Flow Enforcement Technology) introduces a new mechanism called IBT (Indirect Branch Tracking). According to IBT, each Indirect branch should land on dedicated ENDBR instruction (End Branch). The new pass adds ENDBR instructions for every indirect jmp/call (including jumps using jump tables / switches). For more information, please see the following: https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf Differential Revision: https://reviews.llvm.org/D40482 Change-Id: Icb754489faf483a95248f96982a4e8b1009eb709 llvm-svn: 322062
* [X86] Allow more cmpps/pd immediate encodings to be commuted during isel.Craig Topper2018-01-091-2/+2
| | | | | | The code that checks the immediate wasn't masking to the lower 3-bits like the code in X86InstrInfo.cpp that's used by the peephole pass does. llvm-svn: 322060
* [PowerPC] Can not assume an intrinsic argument is a simple type.Sean Fertile2018-01-091-6/+7
| | | | | | | | | | | | | The CTRLoop pass performs checks on the argument of certain libcalls/intrinsics, and assumes the arguments must be of a simple type. This isn't always the case though. For example if we unroll and vectorize a loop we may end up with vectors larger then the largest legal type, along with intrinsics that operate on those wider types. This happened in the ffmpeg build, where we unrolled a loop and ended up with a sqrt intrinsic that operated on V16f64, triggering an assertion. Differential Revision: https://reviews.llvm.org/D41758 llvm-svn: 322055
* Remove unused function HvxSelector::zerous.Eric Christopher2018-01-091-20/+0
| | | | llvm-svn: 322053
* Revert "[PowerPC] Manually schedule the prologue and epilogue"Stefan Pintilie2018-01-091-65/+6
| | | | | | | | [PowerPC] This reverts commit r322036. Failing build bots. Revert the commit now. llvm-svn: 322051
* [X86] Remove llvm.x86.avx512.cvt*2mask.* intrinsics and autoupgrade to (icmp ↵Craig Topper2018-01-092-26/+1
| | | | | | | | slt X, 0) I had to drop fast-isel-abort from a test because we can't fast isel some of the mask stuff. When we used intrinsics we implicitly fell back to SelectionDAG for the intrinsic call without triggering the abort error. But with native IR that doesn't happen the same way. llvm-svn: 322050
* [X86] Remove unnecessary isel pattern that is a combination of two other ↵Craig Topper2018-01-091-2/+0
| | | | | | | | | | | | | | | patterns. The pattern was this def : Pat<(i32 (zext (i8 (bitconvert (v8i1 VK8:$src))))), (MOVZX32rr8 (EXTRACT_SUBREG (i32 (COPY_TO_REGCLASS VK8:$src, GR32)), sub_8bit))>, Requires<[NoDQI]>; but if you just let (i32 (zext X)) match byte itself you'll get MOVZX32rr8. And if you let (i8 (bitconvert (v8i1 VK8:$src))) match by itself you'll get (EXTRACT_SUBREG (i32 (COPY_TO_REGCLASS VK8:$src, GR32)), sub_8bit). So we can just let isel do the two patterns naturally. llvm-svn: 322049
* [MachineOutliner] AArch64: Handle instrs that use SP and will never need fixupsJessica Paquette2018-01-094-15/+93
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit does two things. Firstly, it adds a collection of flags which can be passed along to the target to encode information about the MBB that an instruction lives in to the outliner. Second, it adds some of those flags to the AArch64 outliner in order to add more stack instructions to the list of legal instructions that are handled by the outliner. The two flags added check if - There are calls in the MachineBasicBlock containing the instruction - The link register is available in the entire block If the link register is available and there are no calls, then a stack instruction can always be outlined without fixups, regardless of what it is, since in this case, the outliner will never modify the stack to create a call or outlined frame. The motivation for doing this was checking which instructions are most often missed by the outliner. Instructions like, say %sp<def> = ADDXri %sp, 32, 0; flags: FrameDestroy are very common, but cannot be outlined in the case that the outliner might modify the stack. This commit allows us to outline instructions like this. llvm-svn: 322048
* [PowerPC] Manually schedule the prologue and epilogueStefan Pintilie2018-01-081-6/+65
| | | | | | | | | | | | | | | | | | This patch makes the following changes to the schedule of instructions in the prologue and epilogue. The stack pointer update is moved down in the prologue so that the callee saves do not have to wait for the update to happen. Saving the lr is moved down in the prologue to hide the latency of the mflr. The stack pointer is moved up in the epilogue so that restoring of the lr can happen sooner. The mtlr is moved up in the epilogue so that it is away form the blr at the end of the epilogue. The latency of the mtlr can now be hidden by the loads of the callee saved registers. Differential Revision: https://reviews.llvm.org/D41737 llvm-svn: 322036
* [mips] Remove duplicated R6 EVA instructionsAleksandar Beserminji2018-01-084-208/+0
| | | | | | | | This patch removes duplicated EVA instructions in R6. Differential Revision: https://reviews.llvm.org/D41769 llvm-svn: 322007
* [ARM] Fix PR35379 - incorrect unwind information when compiling with -OzMomchil Velikov2018-01-082-4/+20
| | | | | | | | | | The patch makes the unwind information not mention registers, which were pushed solely for the purpose of saving stack adjustment instructions. Differential revision: https://reviews.llvm.org/D41300 Fixes https://bugs.llvm.org/show_bug.cgi?id=35379 llvm-svn: 321996
* [SystemZ] Comment fix in SystemZElimCompare.cppJonas Paulsson2018-01-081-5/+2
| | | | | | | NFC Review: Ulrich Weigand llvm-svn: 321990
* [ARM] Fix PR35481Momchil Velikov2018-01-081-5/+14
| | | | | | | | | | | | This patch allows `r7` to be used, regardless of its use as a frame pointer, as a temporary register when popping `lr`, and also falls back to using a high temporary register if, for some reason, we weren't able to find a suitable low one. Differential revision: https://reviews.llvm.org/D40961 Fixes https://bugs.llvm.org/show_bug.cgi?id=35481 llvm-svn: 321989
* [X86] Remove side-effects from determineCalleeSavesFrancis Visoiu Mistrih2018-01-081-28/+27
| | | | | | | | | | | | | | (Target)FrameLowering::determineCalleeSaves can be called multiple times. I don't think it should have side-effects as creating stack objects and setting global MachineFunctionInfo state as it is doing today (in other back-ends as well). This moves the creation of stack objects from determineCalleeSaves to assignCalleeSavedSpillSlots. Differential Revision: https://reviews.llvm.org/D41703 llvm-svn: 321987
* [X86] Replace CVT2MASK ISD opcode with PCMPGTM compared to zero.Craig Topper2018-01-085-24/+22
| | | | | | CVT2MASK is just checking the sign bit which can be represented with a comparison with zero. llvm-svn: 321985
* [X86] Add patterns to allow 512-bit BWI compare instructions to be used for ↵Craig Topper2018-01-082-6/+27
| | | | | | 128/256-bit compares when VLX is not available. llvm-svn: 321984
* [X86] Simplify some code in lower1BitVectorShuffle by relying on getNode's ↵Craig Topper2018-01-071-15/+2
| | | | | | ability to constant fold vector SIGN_EXTEND. llvm-svn: 321979
OpenPOWER on IntegriCloud