summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* Lowering x86 adds/addus/subs/subus intrinsics (llvm part)Alexander Ivchenko2018-04-192-40/+89
| | | | | | | | | | | | | This is the patch that lowers x86 intrinsics to native IR in order to enable optimizations. The patch also includes folding of previously missing saturation patterns so that IR emits the same machine instructions as the intrinsics. Patch by tkrupa Differential Revision: https://reviews.llvm.org/D44785 llvm-svn: 330322
* [X86][FMA] Remove FMA reg-reg InstRW scheduler overrides.Simon Pilgrim2018-04-194-25/+1
| | | | | | These are all already handled identically by WriteFMA. llvm-svn: 330319
* [X86][BtVer2] Remove 128-bit F16C InstRW overrides.Simon Pilgrim2018-04-191-10/+0
| | | | | | These are already handled identically by WriteCvtF2F. llvm-svn: 330318
* [mips] Guard some macro expansions properlySimon Dardis2018-04-193-20/+24
| | | | | | | | Reviewers: atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D45565 llvm-svn: 330315
* [AArch64][AsmParser] NFC: Cleanup parsing of scalar registers.Sander de Smalen2018-04-191-77/+69
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Renamed tryParseRegister to tryParseScalarRegister, which now returns an OperandMatchResultTy. - Moved matching of certain aliases into matchRegisterNameAlias. - Changed type of most 'Reg' variables to 'unsigned'. This is patch [1/4] in a series to add assembler/disassembler support for SVE's contiguous LD1 (scalar+scalar) instructions: - Patch [1/4]: https://reviews.llvm.org/D45687 - Patch [2/4]: https://reviews.llvm.org/D45688 - Patch [3/4]: https://reviews.llvm.org/D45689 - Patch [4/4]: https://reviews.llvm.org/D45690 Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro, samparker Reviewed By: samparker Subscribers: samparker, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D45687 llvm-svn: 330311
* [X86] Scrub scheduling information for MUL/IMUL on Intel CPUs.Craig Topper2018-04-195-35/+81
| | | | | | This removes a bunch of unnecessary InstRW overrides. It also cleans up the missing information from the Sandy Bridge model. Other fixes to other models. llvm-svn: 330308
* Fix data race in X86FloatingPoint.cpp ASSERT_SORTEDBob Haarman2018-04-181-7/+8
| | | | | | | | | | | | | | | | | | | | | Summary: ASSERT_SORTED checks if a table is sorted, and uses a boolean to prevent the check from being run again if it was earlier determined that the table is in fact sorted. Unsynchronized reads and writes of that boolean triggered ThreadSanitizer's data race detection. This change rewrites the code to use std::atomic<bool> instead. Fixes PR36922. Reviewers: rnk Reviewed By: rnk Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D45742 llvm-svn: 330301
* [X86] Correct the Defs, Uses, hasSideEffects, mayLoad, mayStore for XCHG and ↵Craig Topper2018-04-183-35/+52
| | | | | | | | XADD instructions. I don't think we emit any of these from codegen except for using XCHG16ar as 2 byte NOP. llvm-svn: 330298
* [NVPTX, CUDA] Added support for m8n32k16 and m32n8k16 variants of wmma ↵Artem Belevich2018-04-184-17/+85
| | | | | | | | | | instructions. The new instructions were added added for sm_70+ GPUs in CUDA-9.1. Differential Revision: https://reviews.llvm.org/D45068 llvm-svn: 330296
* [RISCV] Introduce pattern for materialising immediates with 0 for lower 12 bitsAlex Bradbury2018-04-181-2/+3
| | | | | | | These immediates can be materialised with just an lui, rather than an lui+addi pair. llvm-svn: 330293
* [X86] Fix the Uses/Defs,mayLoad,mayStore,hasSideEffects flags for the ↵Craig Topper2018-04-181-6/+13
| | | | | | | | CMPXCHG instructions. The compiler only emits the locked version of these which use different instruction definitions. The versions fixed here are only used by the assembler/disassembler. llvm-svn: 330287
* Revert "[RISCV] implement li pseudo instruction"Alex Bradbury2018-04-189-266/+49
| | | | | | | | | Reverts rL330224, while issues with the C extension and missed common subexpression elimination opportunities are addressed. Neither of these issues are visible in current RISC-V backend unit tests, which clearly need expanding. llvm-svn: 330281
* [Power9]Legalize and emit code for converting Unsigned HWord/Char to ↵Lei Huang2018-04-181-0/+8
| | | | | | | | | | | | | | | | Quad-Precision Legalize and emit code for converting unsigned HWord/Char to QP: xscvsdqp xscvudqp Only covering patterns for unsigned forms cause we don't have part-word sign-extending integer loads into VSX registers. Differential Revision: https://reviews.llvm.org/D45494 llvm-svn: 330278
* [AArch64] Add isel pattern for v8i8->v2f32 NVCASTs.Amara Emerson2018-04-181-0/+1
| | | | | | rdar://39454635 llvm-svn: 330276
* [Power9]Legalize and emit code for converting (Un)Signed Word to Quad-PrecisionLei Huang2018-04-181-1/+10
| | | | | | | | | | | Legalize and emit code for converting (Un)Signed Word to quad-precision via: xscvsdqp xscvudqp Differential Revision: https://reviews.llvm.org/D45389 llvm-svn: 330273
* [DEBUG] Initial adaptation of NVPTX target for debug info emission.Alexey Bataev2018-04-1811-273/+207
| | | | | | | | | | | | | | | Summary: Patch adds initial emission of the debug info for NVPTX target. Currently, only .file and .loc directives are emitted, everything else is commented out to not break the compilation of Cuda. Reviewers: echristo, jlebar, tra, jholewinski Subscribers: mgorny, aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D41827 llvm-svn: 330271
* [x86] Switch EFLAGS copy lowering to use reg-reg form of testing forChandler Carruth2018-04-181-1/+1
| | | | | | | | | | | | | | | | a zero register. Previously I tried this and saw LLVM unable to transform this to fold with memory operands such as spill slot rematerialization. However, it clearly works as shown in this patch. We turn these into `cmpb $0, <mem>` when useful for folding a memory operand without issue. This form has no disadvantage compared to `testb $-1, <mem>`. So overall, this is likely no worse and may be slightly smaller in some cases due to the `testb %reg, %reg` form. Differential Revision: https://reviews.llvm.org/D45475 llvm-svn: 330269
* [x86] Fix PR37100 by teaching the EFLAGS copy lowering to rewrite usesChandler Carruth2018-04-181-82/+125
| | | | | | | | | | | | | | | | | | | | | | across basic blocks in the limited cases where it is very straight forward to do so. This will also be useful for other places where we do some limited EFLAGS propagation across CFG edges and need to handle copy rewrites afterward. I think this is rapidly approaching the maximum we can and should be doing here. Everything else begins to require either heroic analysis to prove how to do PHI insertion manually, or somehow managing arbitrary PHI-ing of EFLAGS with general PHI insertion. Neither of these seem at all promising so if those cases come up, we'll almost certainly need to rewrite the parts of LLVM that produce those patterns. We do now require dominator trees in order to reliably diagnose patterns that would require PHI nodes. This is a bit unfortunate but it seems better than the completely mysterious crash we would get otherwise. Differential Revision: https://reviews.llvm.org/D45673 llvm-svn: 330264
* [AMDGPU] Fix issues for backend divergence trackingDavid Stuttard2018-04-181-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: A change to use divergence analysis in the AMDGPU backend was getting formal arguments incorrect (not tagged as divergent) unless they were VGPR0, VGPR1 or VGPR2 For graphics shaders it is possible to have more than these passed in as VGPR Modified the checking code to check for any VGPR registers passed in as formal arguments. Also, some intrinsics that are sources of divergence may have been lowered during instruction selection and are missed on subsequent calls to isSDNodeSourceOfDivergence - added the relevant AMDGPUISD checks as well. Finally, the FunctionLoweringInfo tracks virtual registers that are live across basic block boundaries. This is used to check for divergence of CopyFromRegister registers using the DivergenceAnalysis analysis. For multiple blocks the lazily evaluated inverted map VirtReg2Value was not cleared when the ValueMap map was. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45372 Change-Id: I112f3bd6dfe0f62e63ce9b43b893982778e4bee3 llvm-svn: 330257
* [X86][Broadwell] Remove some unnecessary InstRW overrides and add some FIXMEs.Craig Topper2018-04-181-43/+8
| | | | llvm-svn: 330241
* [X86] Give CMOV 2 cycle latency on SLM.Craig Topper2018-04-181-1/+1
| | | | llvm-svn: 330239
* [X86] Don't crash on bad operand modifiers in inline assemblyCraig Topper2018-04-181-0/+6
| | | | | | | | | | | | | | Summary: Previously if a modifer was placed on a non-GPR register class we would hit an assert or crash. Reviewers: echristo Reviewed By: echristo Subscribers: eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D45751 llvm-svn: 330238
* [AMDGPU] Enabled v2.16 literals for VOP3PStanislav Mekhanoshin2018-04-172-8/+19
| | | | | | | | Literal encoding needs op_sel_hi to select low 16 bit in this case. Differential Revision: https://reviews.llvm.org/D45745 llvm-svn: 330230
* [RISCV] implement li pseudo instructionAlex Bradbury2018-04-179-49/+266
| | | | | | | | | | | | | | The implementation follows the MIPS backend and expands the pseudo instruction directly during asm parsing. As the result, only real MC instructions are emitted to the MCStreamer. Additionally, PseudoLI instructions are emitted during codegen. The actual expansion to real instructions is performed during MI to MC lowering and is similar to the expansion performed by the GNU Assembler. Differential Revision: https://reviews.llvm.org/D41949 Patch by Mario Werner. llvm-svn: 330224
* [XRay] Typed event logging intrinsicKeith Wyss2018-04-173-352/+912
| | | | | | | | | | | | | | | | | | | | | Summary: Add an LLVM intrinsic for type discriminated event logging with XRay. Similar to the existing intrinsic for custom events, but also accepts a type tag argument to allow plugins to be aware of different types and semantically interpret logged events they know about without choking on those they don't. Relies on a symbol defined in compiler-rt patch D43668. I may wait to submit before I can see demo everything working together including a still to come clang patch. Reviewers: dberris, pelikan, eizan, rSerge, timshen Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45633 llvm-svn: 330219
* [WebAssembly] Add an assertion for an invalid CFGHeejin Ahn2018-04-171-0/+3
| | | | | | | | | | | | | | | Summary: It was not easy to provide a test case for D45648 (rL330079) because the bug didn't manifest itself in the set of currently valid IRs. Added an assertion to check this faster, thanks to @dblaikie's suggestion. Reviewers: dblaikie Subscribers: jfb, dschuff, sbc100, jgravelle-google, llvm-commits, dblaikie Differential Revision: https://reviews.llvm.org/D45711 llvm-svn: 330217
* [WebAssembly] Teach fast-isel to gracefully recover from illegal return types.Dan Gohman2018-04-171-2/+6
| | | | | | Fixes PR36564. llvm-svn: 330215
* [X86] Add separate scheduling class for PSADBW instruction.Craig Topper2018-04-1713-22/+18
| | | | llvm-svn: 330204
* [X86] Remove unnecessary InstRW overrides. Add somes FIXMEs/TODOs.Craig Topper2018-04-175-98/+17
| | | | llvm-svn: 330203
* [Hexagon] Do not merge initializers for stack and non-stack expressionsKrzysztof Parzyszek2018-04-171-2/+42
| | | | | | | | | Stack addressing needs addressing modes that provide an offset field immediately following the frame index. An initializer from a non-stack addressing could force the stack address to use a form that does not provide an offset field. llvm-svn: 330191
* [X86] Add FP comparison scheduler classesSimon Pilgrim2018-04-1713-315/+105
| | | | | | | | Split VCMP/VMAX/VMIN instructions off to WriteFCmp and VCOMIS instructions off to WriteFCom instead of assuming they match WriteFAdd Differential Revision: https://reviews.llvm.org/D45656 llvm-svn: 330179
* [RISCV] Fix assert message operatorMandeep Singh Grang2018-04-161-1/+1
| | | | | | | | | | | | | | | | Summary: Specifying assert message with an || operator makes the compiler interpret it as a bool. Changed it to &&. Reviewers: asb, apazos Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D45660 llvm-svn: 330148
* [Hexagon] Turn off flag enabling auto-vectorizationKrzysztof Parzyszek2018-04-161-1/+1
| | | | | | It was turned on for testing and was accidentally left on in the commit. llvm-svn: 330139
* [NFC] Move verificaiton check for f128 conversion into LowerINT_TO_FP()Lei Huang2018-04-161-24/+14
| | | | | | | Move veriication check for legal conversions to f128 into LowerINT_TO_FP() and fix some indentations to match other sections of the code for readability. llvm-svn: 330138
* [AMDGPU][MC][VI][GFX9] Added support of SDWA/DPP for v_cndmask_b32Dmitry Preobrazhensky2018-04-162-1/+27
| | | | | | | | | See bug 36356: https://bugs.llvm.org/show_bug.cgi?id=36356 Differential Revision: https://reviews.llvm.org/D45446 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 330123
* [AArch64][SVE] Asm: Support for structured LD4 (scalar+imm) load instructions.Sander de Smalen2018-04-166-3/+68
| | | | | | | | | | | | Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro Reviewed By: rengolin Subscribers: tschuett, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D45624 llvm-svn: 330120
* [AArch64][SVE] Asm: Support for structured LD3 (scalar+imm) load instructions.Sander de Smalen2018-04-166-3/+69
| | | | | | | | | | | | Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro Reviewed By: rengolin Subscribers: tschuett, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45623 llvm-svn: 330116
* [mips] Restrict certain trap instructions for micromipsr6Stefan Maksimovic2018-04-161-8/+14
| | | | | | | | | Instructions removed from micromipsr6: teqi, tgei, tgeiu, tlti, tltiu, tnei Differential Revision: https://reviews.llvm.org/D45318 llvm-svn: 330114
* [X86] Introduce archs: goldmont-plus & tremontGabor Buella2018-04-162-13/+43
| | | | | | | | | | | | | | | Using Goldmont's cost tables for these two upcoming atom archs. Reviewers: craig.topper Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45612 llvm-svn: 330109
* [AArch64][SVE] Asm: Support for structured LD2 (scalar+imm) load instructions.Sander de Smalen2018-04-166-4/+102
| | | | | | | | | | | | Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro Reviewed By: rengolin Subscribers: tschuett, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45622 llvm-svn: 330108
* [X86] Use uint32_t instead of unsigned in GetLo32XForm for readability. NFCCraig Topper2018-04-151-1/+1
| | | | | | GetLo8XForm right next to it uses uint8_t so uint32_t is consistent. llvm-svn: 330104
* [X86][MMX] Set PAVG/PHADD/PMIN/PMAX/PSIGN instructions to use same scheduler ↵Simon Pilgrim2018-04-141-12/+12
| | | | | | classes as SSE/AVX llvm-svn: 330085
* [NFC] fix trivial typos in document and commentsHiroshi Inoue2018-04-142-2/+2
| | | | | | "not not" -> "not" etc llvm-svn: 330083
* [WebAssembly] Fix a bug in MachineBasicBlock::findDebugLoc() callHeejin Ahn2018-04-141-3/+4
| | | | | | | | | | | | | | Summary: InsertPos is within the bacic block `Header`, so `findDebugLoc()` should be called on not `MBB` but `Header` instead. Reviewers: yurydelendik Subscribers: jfb, dschuff, aprantl, sbc100, jgravelle-google, sunfish, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D45648 llvm-svn: 330079
* [X86] Add the bizarro movsww and movzww mnemonics for the disassembler.Craig Topper2018-04-131-0/+20
| | | | | | | | The destination size of the movzx/movsx instruction is controlled by the normal operand size mechanisms. Only the input type is fixed. This means that a 0x66 prefix on the encoding for zext/sext 16->32 should really produce a 16->16 instruction. Functionally this is equivalent to a GR16->GR16 move since bits 16 and above will be preserved. So nothing is actually extended. llvm-svn: 330078
* MachO: trap unreachable instructionsTim Northover2018-04-133-1/+8
| | | | | | | Debugability is more important than saving 4 bytes to let us to fall through to nonense. llvm-svn: 330073
* [Hexagon] Initial instruction cost model for auto-vectorizationKrzysztof Parzyszek2018-04-132-98/+195
| | | | llvm-svn: 330065
* Revert r329956, "AArch64: Introduce a DAG combine for folding offsets into ↵Peter Collingbourne2018-04-132-68/+15
| | | | | | | | | | addresses." Caused a hang and eventually an assertion failure in LTO builds of 7zip-benchmark on aarch64 iOS targets. http://green.lab.llvm.org/green/job/lnt-ctmark-aarch64-O3-flto/2024/ llvm-svn: 330063
* [LV] Introduce TTI::getMinimumVFKrzysztof Parzyszek2018-04-132-0/+5
| | | | | | | | | | | The function getMinimumVF(ElemWidth) will return the minimum VF for a vector with elements of size ElemWidth bits. This value will only apply to targets for which TTI::shouldMaximizeVectorBandwidth returns true. The value of 0 indicates that there is no minimum VF. Differential Revision: https://reviews.llvm.org/D45271 llvm-svn: 330062
* [Power9] Add the TLS store instructions to the Power 9 modelStefan Pintilie2018-04-132-2/+2
| | | | | | | | | | | The Power 9 scheduler model should now include the TLS instructions. We can now, once again, mark the model as complete. From now on, if instructions are added to Power 9 but are not added to the model the build should produce an error. Hopefully that will alert the developer who is adding new instructions that they should also be added to the scheulder model. llvm-svn: 330060
OpenPOWER on IntegriCloud