path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* Enable AVX512_BF16 instructions, which are supported for BFLOAT16 in Cooper Lake | Luo, Yuanke | 2019-05-06 | 9 | -0/+213
  Summary:
  1. Enable infrastructure of AVX512_BF16, which is supported for BFLOAT16 in Cooper Lake;
  2. Enable VCVTNE2PS2BF16, VCVTNEPS2BF16 and VDPBF16PS instructions, which are Vector Neural Network Instructions supporting BFLOAT16 inputs and conversion instructions from IEEE single precision.

  VCVTNE2PS2BF16: Convert Two Packed Single Data to One Packed BF16 Data.
  VCVTNEPS2BF16: Convert Packed Single Data to Packed BF16 Data.
  VDPBF16PS: Dot Product of BF16 Pairs Accumulated into Packed Single Precision.

  For more details about the BF16 ISA, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference

  Author: LiuTianle
  Reviewers: craig.topper, smaslov, LuoYuanke, wxiao3, annita.zhang, RKSimon, spatel
  Reviewed By: craig.topper
  Subscribers: kristina, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D60550
  llvm-svn: 360017
* DWARF v5: fix directory index in the line table | Fangrui Song | 2019-05-06 | 1 | -12/+16
  Summary:
  Prior to DWARF v5, a directory index of 0 represents DW_AT_comp_dir. In DWARF v5, the index starts with 0 and Entry.DirIdx is the index into Prologue.IncludeDirectories.

  Reviewed By: labath
  Differential Revision: https://reviews.llvm.org/D61253
  llvm-svn: 360015
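The indexing difference described in this commit can be sketched as follows. This is a toy model, not the code touched by the patch: `resolveDirectory`, its parameters, and the plain `std::vector` standing in for the include-directory list are all hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: map a line-table directory index to a directory name.
// IncludeDirs models the include-directory list of the line table prologue;
// CompDir models the DW_AT_comp_dir value of the compile unit.
std::string resolveDirectory(uint16_t DwarfVersion, uint64_t DirIdx,
                             const std::vector<std::string> &IncludeDirs,
                             const std::string &CompDir) {
  if (DwarfVersion >= 5) {
    // DWARF v5: DirIdx indexes the list directly, starting at 0.
    return DirIdx < IncludeDirs.size() ? IncludeDirs[DirIdx] : std::string();
  }
  // Before v5: index 0 means the compilation directory, and the list itself
  // is addressed starting at 1.
  if (DirIdx == 0)
    return CompDir;
  return DirIdx <= IncludeDirs.size() ? IncludeDirs[DirIdx - 1] : std::string();
}
```

Out-of-range indices return an empty string here purely to keep the sketch total; how a real reader reports such an index is a separate design decision.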
* [DebugInfo] GlobalOpt DW_OP_deref_size instead of DW_OP_deref. | Markus Lavin | 2019-05-06 | 1 | -2/+6
  The optimization pass lib/Transforms/IPO/GlobalOpt.cpp needs to insert DW_OP_deref_size instead of DW_OP_deref to be compatible with big-endian targets, for the same reasons as in D59687.

  Differential Revision: https://reviews.llvm.org/D60611
  llvm-svn: 360013
* [SelectionDAG] Replace llvm_unreachable at the end of getCopyFromParts with a report_fatal_error. | Craig Topper | 2019-05-06 | 1 | -1/+1
  Based on PR41748, not all cases are handled in this function.

  llvm_unreachable is treated as an optimization hint that can prune code paths in a release build. This causes weird behavior when PR41748 is encountered on a release build. It appears to generate an fp_round instruction from the floating point code.

  Making this a report_fatal_error prevents incorrect optimization of the code and will instead generate a message to file a bug report.

  llvm-svn: 360008
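The trade-off described in this commit can be illustrated with a small stand-alone sketch. Everything here is hypothetical: `PartKind`, `widthForKind` and `reportFatal` are invented for the example, and `reportFatal` throws only so that the behaviour is easy to observe, whereas a real fatal-error routine would print the message and terminate the process.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

enum class PartKind { Integer, Float, Vector, Unknown };

// Hypothetical stand-in for a fatal-error reporter. It throws so the sketch is
// easy to exercise; a real compiler would print the message and terminate.
[[noreturn]] void reportFatal(const std::string &Msg) {
  throw std::runtime_error(Msg);
}

// Handles only some kinds. The fall-through path is a diagnosed error in every
// build mode, rather than a "cannot happen" hint the optimizer may exploit.
int widthForKind(PartKind K) {
  switch (K) {
  case PartKind::Integer:
    return 32;
  case PartKind::Float:
    return 64;
  case PartKind::Vector:
    return 128;
  default:
    break;
  }
  reportFatal("unhandled part kind");
}
```

If the final call were an "unreachable" marker instead, a release build would be free to assume the default path never runs and fold it into one of the handled cases, which matches the kind of surprising output the commit message describes.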
* [X86] Pull out repeated Subtarget feature tests. NFCI. | Simon Pilgrim | 2019-05-05 | 1 | -12/+11
  Avoids a scan-build "uninitialized value" warning in X86FastISel::X86SelectFPExtOrFPTrunc

  llvm-svn: 360001
* [TTI][X86] Make getAddressComputationCost cost value const. NFCI. | Simon Pilgrim | 2019-05-05 | 1 | -1/+1
  llvm-svn: 359999
* [NFC] BasicBlock: generalize replaceSuccessorsPhiUsesWith(), take Old bb | Roman Lebedev | 2019-05-05 | 1 | -7/+10
  Thus it does not assume that the old basic block is the basic block for which we are looking at successors.

  Not reviewed, but seems rather trivial, in line with the rest of previous few patches.

  llvm-svn: 359997
* [NFC] BasicBlock: refactor changePhiUses() out of replacePhiUsesWith(), use it | Roman Lebedev | 2019-05-05 | 4 | -37/+28
  Summary:
  It is a common thing to loop over every `PHINode` in some `BasicBlock` and change old `BasicBlock` incoming block to a new `BasicBlock` incoming block. `replaceSuccessorsPhiUsesWith()` already had code to do that, it just wasn't a function. So outline it into a new function, and use it.

  Reviewers: chandlerc, craig.topper, spatel, danielcdh
  Reviewed By: craig.topper
  Subscribers: llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D61013
  llvm-svn: 359996
* [NFC] PHINode: introduce replaceIncomingBlockWith() function, use it | Roman Lebedev | 2019-05-05 | 4 | -38/+9
  Summary:
  There is `PHINode::getBasicBlockIndex()`, `PHINode::setIncomingBlock()` and `PHINode::getNumOperands()`, but no function to replace every specified `BasicBlock*` predecessor with some other specified `BasicBlock*`.

  Clearly, there are a lot of places that could use that functionality.

  Reviewers: chandlerc, craig.topper, spatel, danielcdh
  Reviewed By: craig.topper
  Subscribers: llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D61011
  llvm-svn: 359995
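The operation this commit describes, replacing every matching incoming block of a PHI node, can be modelled with a toy structure. `Block` and `ToyPhi` are hypothetical stand-ins written for this sketch; only the method name is taken from the commit subject, and the body is an assumption about its general shape rather than a copy of the LLVM implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Block {
  int Id;
};

// Toy stand-in for a PHI node: one incoming block per operand.
struct ToyPhi {
  std::vector<Block *> IncomingBlocks;

  // Replace every occurrence of Old among the incoming blocks with New.
  void replaceIncomingBlockWith(Block *Old, Block *New) {
    assert(New && "replacement block must not be null");
    for (std::size_t I = 0, E = IncomingBlocks.size(); I != E; ++I)
      if (IncomingBlocks[I] == Old)
        IncomingBlocks[I] = New;
  }
};
```

Having one named helper keeps call sites from repeating the index loop by hand, which is the motivation the summary gives.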
* [NFC] Instruction: introduce replaceSuccessorWith() function, use it | Roman Lebedev | 2019-05-05 | 2 | -3/+8
  Summary:
  There is `Instruction::getNumSuccessors()`, `Instruction::getSuccessor()` and `Instruction::setSuccessor()`, but no function to replace every specified `BasicBlock*` successor with some other specified `BasicBlock*`.

  I've found one place where it should clearly be used.

  Reviewers: chandlerc, craig.topper, spatel, danielcdh
  Reviewed By: craig.topper
  Subscribers: llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D61010
  llvm-svn: 359994
* [NFC][Utils] deleteDeadLoop(): add an assert that exit block has some non-PHI instruction | Roman Lebedev | 2019-05-05 | 1 | -3/+7
  Summary:
  If `deleteDeadLoop()` is called on a loop that has a "bad" exit block, e.g. one that has no terminator instruction, `DIBuilder::insertDbgValueIntrinsic()` will be told to insert the Dbg Value Intrinsic after `nullptr` (since there is no first non-PHI instruction), which will cause it to not insert those instructions into any basic block. The instructions will be parent-less, and the IR verifier will complain.

  The root cause is rather obvious to track down when that happens, so let's just assert it never happens.

  Reviewers: sanjoy, davide, vsk
  Reviewed By: vsk
  Subscribers: llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D61008
  llvm-svn: 359993
* Move getOpcode() call into if statement. NFCI. | Simon Pilgrim | 2019-05-05 | 1 | -2/+1
  Avoids a cppcheck "Local variable name shadows outer variable" warning.

  llvm-svn: 359991
* [SLPVectorizer] Prefer pre-increments. NFCI. | Simon Pilgrim | 2019-05-05 | 1 | -3/+3
  llvm-svn: 359989
* [LLParser] Remove unused variable after r359987. NFC | Craig Topper | 2019-05-05 | 1 | -1/+0
  llvm-svn: 359988
* [LLParser] Remove unnecessary error check making sure NUW/NSW flags aren't set on a non-integer operation. | Craig Topper | 2019-05-05 | 1 | -6/+0
  Summary:
  This check appears to be a leftover from when add/sub/mul could be either integer or fp. The NSW/NUW flags are only set for add/sub/mul/shl earlier. And we check that those operations only have integer types just below this. So it seems unnecessary to explicitly error for NUW/NSW being used on an add/sub/mul with the wrong type, since that would error later anyway.

  Reviewers: spatel, dblaikie, jyknight, arsenm
  Reviewed By: spatel
  Subscribers: wdng, llvm-commits, hiraditya
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D61562
  llvm-svn: 359987
* [LLParser] Simplify type checking in ParseArithmetic and ParseUnaryOp. | Craig Topper | 2019-05-05 | 2 | -37/+18
  Summary:
  These methods previously took a 0, 1, or 2 to indicate what types were allowed, but the 0 encoding, which meant both fp and integer types, has been unused for years. It's a leftover from when add/sub/mul used to be shared between int and fp.

  Simplify it by changing it to just a bool to distinguish int and fp.

  Reviewers: spatel, dblaikie, jyknight, arsenm
  Reviewed By: spatel
  Subscribers: wdng, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D61561
  llvm-svn: 359986
* [Constants] Simplify type checking switch in ConstantExpr::get. | Craig Topper | 2019-05-05 | 1 | -26/+6
  Summary:
  Remove duplicate checks that both operands have the same type. This is checked before the switch.

  Use 'integer' or 'floating-point' instead of 'arithmetic' type. I think this might be a leftover from the days when floating point and integer operations shared the same opcodes.

  Reviewers: spatel, RKSimon, dblaikie
  Reviewed By: RKSimon
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D61558
  llvm-svn: 359985
* [MCA] Notify event listeners when instructions transition to the Pending state. NFCI | Andrea Di Biagio | 2019-05-05 | 2 | -8/+35
  llvm-svn: 359983
* Add FNeg IR constant folding support | Cameron McInally | 2019-05-05 | 4 | -5/+63
  llvm-svn: 359982
* [X86] Make X86RegisterInfo(const Triple &TT) constructor explicit. | Simon Pilgrim | 2019-05-05 | 1 | -1/+1
  Fixes cppcheck warning.

  llvm-svn: 359981
* [X86] Fix some cppcheck "Local variable name shadows outer variable" warnings. NFCI. | Simon Pilgrim | 2019-05-05 | 1 | -44/+42
  llvm-svn: 359976
* [SLPVectorizer] Make getSpillCost() const. NFCI. | Simon Pilgrim | 2019-05-05 | 1 | -2/+9
  Ideally getTreeCost() should be const as well but non-const Type creation would need to be addressed first.

  llvm-svn: 359975
* [SelectionDAG] Use any_of/all_of where possible. NFCI. | Simon Pilgrim | 2019-05-05 | 1 | -14/+4
  llvm-svn: 359974
* Move Value *RHSCIOp def into the scope where it's actually used. NFCI. | Simon Pilgrim | 2019-05-05 | 1 | -2/+1
  llvm-svn: 359973
* [CodeGenPrepare] limit overflow intrinsic matching to a single basic block (2nd try) | Sanjay Patel | 2019-05-04 | 1 | -28/+21
  This is a subset of the original commit from rL359879, which was reverted because it could crash when using the 'RemovedInstructions' structure that enables delayed deletion of dead instructions. The motivating compile-time win does not require that change though. We should get most of that win from this change alone.

  Using/updating a dominator tree to match math overflow patterns may be very expensive in compile-time (because of the way CGP uses a DT), so just handle the single-block case.

  See post-commit thread for rL354298 for more details:
  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html

  Differential Revision: https://reviews.llvm.org/D61075
  llvm-svn: 359969
* [AMDGPU] Fixed asan error after D61536 | Stanislav Mekhanoshin | 2019-05-04 | 1 | -1/+1
  llvm-svn: 359963
* [AMDGPU] gfx1010 hazard recognizer | Stanislav Mekhanoshin | 2019-05-04 | 2 | -3/+268
  Differential Revision: https://reviews.llvm.org/D61536
  llvm-svn: 359961
* [AMDGPU] gfx1010: use fmac instructions | Stanislav Mekhanoshin | 2019-05-04 | 4 | -39/+105
  Differential Revision: https://reviews.llvm.org/D61527
  llvm-svn: 359959
* [JITLink] Add two useful Section operations: find by name, get address range. | Lang Hames | 2019-05-04 | 1 | -9/+2
  These operations were already used in eh-frame registration, and are likely to be used in other runtime registrations, so this commit moves them into a header where they can be re-used.

  llvm-svn: 359950
* [AArch64][GlobalISel] Use fcsel instead of csel for G_SELECT on FPRs | Jessica Paquette | 2019-05-03 | 2 | -24/+75
  This saves us some unnecessary copies.

  If the inputs to a G_SELECT are floating point, we should use fcsel rather than csel.

  Changes here are...
  - Teach selectCopy about s1-to-s1 copies across register banks.
  - Teach AArch64RegisterBankInfo about G_SELECT in general.
  - Teach the instruction selector about the FCSEL instructions.

  Also add two tests:
  - select-select.mir to show that we get the expected FCSEL
  - regbank-select.mir (unfortunately named) to show the register banks on G_SELECT are properly preserved

  And update fast-isel-select.ll to show that we do the same thing as other instruction selectors in these cases.

  llvm-svn: 359940
* [AMDGPU] gfx1010 wait count insertion | Stanislav Mekhanoshin | 2019-05-03 | 1 | -56/+144
  Differential Revision: https://reviews.llvm.org/D61534
  llvm-svn: 359938
* [AMDGPU] gfx1010 s_code_end generation | Stanislav Mekhanoshin | 2019-05-03 | 4 | -2/+45
  Also add some missing metadata in the streamer.

  Differential Revision: https://reviews.llvm.org/D61531
  llvm-svn: 359937
* [AMDGPU] gfx1010 loop alignment | Stanislav Mekhanoshin | 2019-05-03 | 2 | -0/+78
  Differential Revision: https://reviews.llvm.org/D61529
  llvm-svn: 359935
* [COFF, ARM64] Fix ABI implementation of struct returns | Mandeep Singh Grang | 2019-05-03 | 4 | -2/+80
  Summary:
  Refer to the ABI doc at: https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=vs-2019#return-values

  Related clang patch: D60349

  Reviewers: rnk, efriedma, TomTan, ssijaric
  Reviewed By: rnk, efriedma
  Subscribers: mstorsjo, javed.absar, kristof.beyls, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D60348
  llvm-svn: 359934
* Reapply r359906, "RegAllocFast: Add heuristic to detect values not live-out of a block" | Matt Arsenault | 2019-05-03 | 1 | -4/+41
  This reverts commit r359912.

  This should pass now, since the clang test was made less fragile in r359918.

  llvm-svn: 359919
* [CommandLine] Enable Grouping for short options by default. Part 4 of 5 | Don Hinton | 2019-05-03 | 1 | -0/+2
  Summary:
  This change enables `cl::Grouping` for short options -- options with names of a single character. This is consistent with GNU getopt behavior.

  Reviewers: rnk, MaskRay
  Reviewed By: MaskRay
  Subscribers: thopre, cfe-commits, MaskRay, rupprecht, hiraditya, llvm-commits
  Tags: #llvm, #clang
  Differential Revision: https://reviews.llvm.org/D61270
  llvm-svn: 359917
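For readers unfamiliar with getopt-style grouping, the idea can be sketched as below. This is a simplified model and not how the CommandLine library implements `cl::Grouping`: `expandGroupedShortOptions` and the `ShortOptions` set are hypothetical, and the sketch ignores options that take values.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: split "-abc" into {"-a", "-b", "-c"} when every letter
// is a registered single-character option; otherwise return the argument as is.
std::vector<std::string>
expandGroupedShortOptions(const std::string &Arg,
                          const std::set<char> &ShortOptions) {
  // Only single-dash arguments with two or more letters can be a group.
  if (Arg.size() < 3 || Arg[0] != '-' || Arg[1] == '-')
    return {Arg};
  // Refuse to split if any letter is not a known short option.
  for (std::size_t I = 1; I < Arg.size(); ++I)
    if (!ShortOptions.count(Arg[I]))
      return {Arg};
  std::vector<std::string> Result;
  for (std::size_t I = 1; I < Arg.size(); ++I)
    Result.push_back(std::string("-") + Arg[I]);
  return Result;
}
```

With short options `a`, `b` and `c` registered, `-abc` is treated like `-a -b -c`, while `--abc` and `-abx` are passed through untouched.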
* [DAGCombine] Remove repeated variables. NFCI. | Simon Pilgrim | 2019-05-03 | 1 | -8/+3
  llvm-svn: 359915
* Revert r359906, "RegAllocFast: Add heuristic to detect values not live-out of a block" | Nico Weber | 2019-05-03 | 1 | -41/+4
  Makes clang/test/Misc/backend-stack-frame-diagnostics-fallback.cpp fail.

  llvm-svn: 359912
* [TargetLowering] SimplifySetCC - remove repeated variable. NFCI. | Simon Pilgrim | 2019-05-03 | 1 | -2/+1
  Also reduce scope of Temp variable.

  llvm-svn: 359911
* [CommandLine] Change help output to prefix long options with `--` instead of `-`. NFC. Part 3 of 5 | Don Hinton | 2019-05-03 | 1 | -34/+63
  Summary:
  By default, `parseCommandLineOptions()` will accept either a `-` or `--` prefix for long options -- options with names longer than a single character.

  While this change does not affect behavior, it will be helpful with a subsequent change that requires long options use the `--` prefix.

  Reviewers: rnk, thopre
  Reviewed By: thopre
  Subscribers: thopre, cfe-commits, hiraditya, llvm-commits
  Tags: #llvm, #clang
  Differential Revision: https://reviews.llvm.org/D61269
  llvm-svn: 359909
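The rule for the help output stated above, `--` for names longer than one character and `-` otherwise, can be written as a tiny sketch. The function names `optionPrefix` and `formatOptionForHelp` are hypothetical and are not taken from the CommandLine library.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: long options (more than one character) get "--",
// single-character options get "-".
std::string optionPrefix(const std::string &Name) {
  return Name.size() > 1 ? "--" : "-";
}

// Hypothetical helper: the option as it would be spelled in help text.
std::string formatOptionForHelp(const std::string &Name) {
  return optionPrefix(Name) + Name;
}
```

Only the printed spelling changes under this rule; whether the parser still accepts a single dash for long names is a separate matter.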
* Revert "[CodeGenPrepare] limit overflow intrinsic matching to a single basic block" | Evgeniy Stepanov | 2019-05-03 | 1 | -42/+47
  This reverts commit r359879, which introduced a compiler crash.

  llvm-svn: 359908
* RegAllocFast: Add heuristic to detect values not live-out of a block | Matt Arsenault | 2019-05-03 | 1 | -4/+41
  Add an improved/new heuristic to catch more cases when values are not live out of a basic block.

  Patch by Matthias Braun

  llvm-svn: 359906
* [hexagon] change AsmParser assertion to error | Brian Cain | 2019-05-03 | 1 | -10/+10
  For immediates that can't be evaluated in assembler-mapped instructions, we should return 'invalid operand' instead of asserting.

  llvm-svn: 359905
* [X86] Allow assembly parser to accept x/y/z suffixes on non-memory vfpclassps/pd and on memory forms in intel syntax | Craig Topper | 2019-05-03 | 1 | -5/+26
  The x/y/z suffix is needed to disambiguate the memory form in at&t syntax since no xmm/ymm/zmm register is mentioned. But we should also allow it for the register and broadcast forms, where it's not needed, for consistency. This matches gas.

  The printing code will still only use the suffix for the memory form where it is needed.

  llvm-svn: 359903
* [X86] LowerToHorizontalOp - Tidyup calls to getHopForBuildVector. NFCI. | Simon Pilgrim | 2019-05-03 | 1 | -15/+7
  Merge the if() tests for the various HADD/SUB + Subtarget tests

  llvm-svn: 359901
* [SelectionDAG] CreateTopologicalOrder - don't use iterator | Simon Pilgrim | 2019-05-03 | 1 | -10/+6
  We shouldn't use an iterator to loop across a std::vector when the same loop is adding elements to that std::vector

  Found by cppcheck

  llvm-svn: 359900
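The hazard named in this commit is that `push_back` may reallocate a `std::vector`, which invalidates iterators into it. A minimal sketch of the index-based alternative follows; `expandWorklist` and the `Children` adjacency table are hypothetical and unrelated to the actual SelectionDAG code, and the input is assumed to be acyclic so the loop terminates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy worklist expansion: visit every element, appending the children of each
// visited element to the same vector. Indexing stays valid across push_back;
// an iterator would be invalidated whenever the vector reallocates.
std::vector<int> expandWorklist(std::vector<int> Order,
                                const std::vector<std::vector<int>> &Children) {
  // Order.size() is re-read on every pass, so appended elements are visited.
  for (std::size_t I = 0; I != Order.size(); ++I) {
    int Node = Order[I]; // copy the value before push_back may reallocate
    for (int Child : Children[Node])
      Order.push_back(Child);
  }
  return Order;
}
```

Copying `Order[I]` into a local before appending matters for the same reason: a reference into the vector would also dangle after reallocation.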
* AMDGPU: Select VOP3 form of sub | Matt Arsenault | 2019-05-03 | 1 | -2/+2
  The VOP3 form should always be the preferred selection form to be shrunk later.

  The r600 sub test needs to be split out because it asserts on the arguments in the new test during the calling convention lowering.

  llvm-svn: 359899
* AMDGPU: Support shrinking add with FI in SIFoldOperands | Matt Arsenault | 2019-05-03 | 1 | -35/+37
  Avoids test regression in a future patch

  llvm-svn: 359898
* AMDGPU: Remove redundant patterns for shifts | Matt Arsenault | 2019-05-03 | 1 | -9/+4
  llvm-svn: 359895
* AMDGPU: Remove redundant patterns for sub | Matt Arsenault | 2019-05-03 | 1 | -4/+0
  There were 2 patterns for sub, one selecting to sub and one to subrev. Only one of these will succeed, so remove the reversed one.

  llvm-svn: 359894