summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [MCA] Fix a spelling mistake in a comment. NFCGreg Bedwell2019-10-271-1/+1
|
* [X86] Only look up boolean reduction cost tables if the reduction is not ↵Craig Topper2019-10-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | pairwise. Summary: We don't pattern match pairwise shuffles in SelectionDAG. So we should only return the optimized costs if its not a pairwise shuffle. I think SLP vectorizer gives priority to non pairwise shuffle if the cost is the same. And the look up for reduction intrinsics passes false for the pairwise flag. So this probably has no real effect today. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69083
* [APInt] Introduce APIntOps::GetMostSignificantDifferentBit()Roman Lebedev2019-10-261-0/+8
| | | | | | | | | | | | | | | | | | Summary: Compare two values, and if they are different, return the position of the most significant bit that is different in the values. Needed for D69387. Reviewers: nikic, spatel, sanjoy, RKSimon Reviewed By: nikic Subscribers: xbolva00, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69439
* [X86] Prefer KORTEST on Knights Landing or later for memcmp()David Zarzycki2019-10-264-17/+60
| | | | | | | | | | | PTEST and especially the MOVMSK instructions are slow on Knights Landing or later. As a bonus, this patch increases instruction parallelism by emitting: KORTEST(PCMPNEQ(a, b), PCMPNEQ(c, d)) == 0 Instead of: KORTEST(AND(PCMPEQ(a, b), PCMPEQ(c, d))) == ~0 https://reviews.llvm.org/D69157
* [ObjectYAML] - Do not use auto. NFC.Georgii Rymar2019-10-261-1/+1
| | | | | | Using 'auto' when the type is not obvious is undesired. (it is just a test commit actually)
* [AMDGPU] Fix Vreg_1 PHI lowering in SILowerI1Copies.cdevadas2019-10-261-90/+89
| | | | | | | | | | | | | | | There is a minor flaw in the implementation of function lowerPhis. This function replaces values of regclass Vreg_1 (boolean values) involved in PHIs into an SGPR. Currently it iterates over the MBBs and performs an inplace lowering of PHIs and fails to lower any incoming value that itself is another PHI of Vreg_1 regclass. The failure occurs only when the MBB where the incoming PHI value belongs is not visited/lowered yet. To fix this problem, collect all Vreg_1 PHIs upfront and then perform the lowering. Differential Revision: https://reviews.llvm.org/D69182
* [X86][GISel] Fix typo in comment. NFCCraig Topper2019-10-261-1/+1
|
* Add Record::getValueAsOptionalDef().John McCall2019-10-251-0/+15
| | | | | | Using `?` as an optional marker is very useful in Clang's AST-node emitters because otherwise we need a separate class just to encode the presence or absence of a base node reference.
* [SDAG] fold extract_vector_elt with undef indexSanjay Patel2019-10-251-2/+2
| | | | | | | | | | | This makes the DAG behavior consistent with IR's extractelement after: rGb32e4664a715 https://bugs.llvm.org/show_bug.cgi?id=42689 I've tried to maintain test intent for WebAssembly. The AMDGPU test is trying to test for crashing or other bad behavior, but I'm not sure if that's possible after this change.
* [AMDGPU] Enable SGPR copy foldingStanislav Mekhanoshin2019-10-252-14/+11
| | | | | | | | | | | | | That used to fail in the last testcase function because after %0:sreg_64.sub0 was folded into %3:sreg_32_xm0_xexec COPY, it was further folded into S_STORE_DWORD_IMM. Its legal effective subreg class is SReg_32 while instruction expects more restricted SReg_32_XM0_EXEC. However, SIInstrInfo::isLegalRegOperand() passed the legality check and it was caught in the verifier. Borrowed code from the verifier to check for RC legality. Differential Revision: https://reviews.llvm.org/D69445
* [BPF] fix a CO-RE issue with -mattr=+alu32Yonghong Song2019-10-251-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ilya Leoshkevich (<iii@linux.ibm.com>) reported an issue that with -mattr=+alu32 CO-RE has a segfault in BPF MISimplifyPatchable pass. The pattern will be transformed by MISimplifyPatchable pass looks like below: r5 = ld_imm64 @"b:0:0$0:0" r2 = ldw r5, 0 ... r2 ... // use r2 The pass will remove the intermediate 'ldw' instruction and replacing all r2 with r5 likes below: r5 = ld_imm64 @"b:0:0$0:0" ... r5 ... // use r5 Later, the ld_imm64 insn will be replaced with r5 = <patched immediate> for field relocation purpose. With -mattr=+alu32, the input code may become r5 = ld_imm64 @"b:0:0$0:0" w2 = ldw32 r5, 0 ... w2 ... // use w2 Replacing "w2" with "r5" is incorrect and will trigger compiler internal errors. To fix the problem, if the register class of ldw* dest register is sub_32, we just replace the original ldw* register with: w2 = w5 Directly replacing all uses of w2 with in-place constructed w5 for the use operand seems not working in all cases. The latest kernel will have -mattr=+alu32 on by default, so added this flag to all CORE tests. Tested with latest kernel bpf-next branch as well with this patch. Differential Revision: https://reviews.llvm.org/D69438
* Revert "[ARM] Uses "Sun Style" syntax for section switching"Jian Cai2019-10-251-4/+0
| | | | This reverts commit 03de2f84fc4acf06c719cd007b5459c9d4d0a20c.
* [AMDGPU] Fixed asan failure in SIFoldOperandsStanislav Mekhanoshin2019-10-251-3/+4
| | | | | | | Both tryFoldOMod() and tryFoldClamp() remove original instruction, so the check MI.modifiesRegister() may use a deleted MI. Differential Revision: https://reviews.llvm.org/D69448
* GlobalISel: Implement widenScalar for G_INSERT_VECTOR_ELTMatt Arsenault2019-10-251-0/+25
|
* [Alignment][NFC] Convert AllocaInst to MaybeAlignGuillaume Chatelet2019-10-257-37/+37
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Reviewed By: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69301
* [ARM] Uses "Sun Style" syntax for section switchingJian Cai2019-10-251-0/+4
| | | | | | | | | | | | | | | | Summary: Support "Sun Style" syntax for section switching ("#alloc,#write" etc). https://bugs.llvm.org/show_bug.cgi?id=43759 Reviewers: peter.smith, eli.friedman, kristof.beyls, t.p.northover Reviewed By: peter.smith Subscribers: MaskRay, llozano, manojgupta, nickdesaulniers, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69296
* AMDGPU/GlobalISel: Handle flat/global G_ATOMIC_CMPXCHGMatt Arsenault2019-10-259-76/+97
| | | | | | | | Custom lower this to a target instruction with the merge operands. I think it might be better to directly select this and emit a REG_SEQUENCE, but this would be more work since it would require splitting the tablegen patterns for these cases from the other atomics.
* AMDGPU: Fix the broken dominator tree when creating waterfall loop for ↵Changpeng Fang2019-10-251-2/+2
| | | | | | | | | | | | | resource descriptor Summary: In loadSRsrcFromVGPR, if MBB is the same as Succ, Remiander is not the immediate dominator of Succ. Reviewer: arsenm Differential Revision: https://reviews.llvm.org/D69358
* Revert "Add an instruction marker field to the ExtraInfo in MachineInstrs."Amy Huang2019-10-257-114/+117
| | | | | Reverting commit b85b4e5a6f8579c137fecb59a4d75d7bfb111f79 due to some buildbot failures/ out of memory errors.
* [LLD][ThinLTO] Handle GUID collision in import global processingTeresa Johnson2019-10-251-5/+11
| | | | | | | | | | | | | | | | | | | | | | Summary: If there are a GUID collision between two globals checking the summarylist from the import index to make assumption can be dangerous. Do not assume that a GlobalValue that has a GlobalVarSummary actually is a GlobalVariable as it can be another GlobalValue with the same GUID that the summary is connected to. Patch by Joel Klinghed (the_jk@opera.com) Reviewers: evgeny777, tejohnson Reviewed By: tejohnson Subscribers: tejohnson, dblaikie, MaskRay, mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67322
* [Alignment][NFC] getMemoryOpCost uses MaybeAlignGuillaume Chatelet2019-10-2515-57/+69
| | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69307
* [AMDGPU] Fold AGPR reg_sequence initializersStanislav Mekhanoshin2019-10-251-22/+131
| | | | Differential Revision: https://reviews.llvm.org/D69413
* [AMDGPU] Disallow dpp combining for dpp instructions without Src2 operand ↵vpykhtin2019-10-251-1/+2
| | | | | | (when Src2 is required) Differential revision: https://reviews.llvm.org/D69430
* [X86] Add a check for SSE2 to the top of combineReductionToHorizontal.Craig Topper2019-10-251-0/+4
| | | | Without this, we can create a PSADBW node that isn't legal.
* [DAGCombiner] widen zext of popcount based on target supportSanjay Patel2019-10-251-0/+12
| | | | | | | | | | | | | | | | zext (ctpop X) --> ctpop (zext X) This is a prerequisite step for canonicalizing in the other direction (narrow the popcount) in IR - PR43688: https://bugs.llvm.org/show_bug.cgi?id=43688 I'm not sure if any other targets are affected, but I found a missing fold for PPC, so added tests based on that. The reason we widen all the way to 64-bit in these tests is because the initial DAG looks something like this: t5: i8 = ctpop t4 t6: i32 = zero_extend t5 <-- created based on IR, but unused node? t7: i64 = zero_extend t5 Differential Revision: https://reviews.llvm.org/D69127
* AMDGPU/GlobalISel: Legalize FDIV16Austin Kerbow2019-10-253-2/+44
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69347
* Add an instruction marker field to the ExtraInfo in MachineInstrs.Amy Huang2019-10-257-117/+114
| | | | | | | | | | | | | | | | | | Summary: Add instruction marker to MachineInstr ExtraInfo. This does almost the same thing as Pre/PostInstrSymbols, except that it doesn't create a label until printing instructions. This allows for labels to be put around instructions that are deleted/duplicated somewhere. Also undo the workaround in r375137. Reviewers: rnk Subscribers: MatzeB, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69136
* [SLP] adjust code comment; NFCSanjay Patel2019-10-251-1/+1
| | | | (check commit access)
* [APInt] Add saturating left-shift opsRoman Lebedev2019-10-251-0/+19
| | | | | | | | | | | | | | | | Summary: There are `*_ov()` functions already, so at least for consistency it may be good to also have saturating variants. These may or may not be needed for `ConstantRange`'s `shlWithNoWrap()` Reviewers: spatel, nikic Reviewed By: nikic Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69398
* [APInt] Add saturating multiply opsRoman Lebedev2019-10-251-0/+21
| | | | | | | | | | | | | | | | Summary: There are `*_ov()` functions already, so at least for consistency it may be good to also have saturating variants. These may or may not be needed for `ConstantRange`'s `mulWithNoWrap()` Reviewers: spatel, nikic Reviewed By: nikic Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69397
* [CodeGen][SelectionDAG] Fix tiny bug in ExpandIntRes_UADDSUBOItay Bookstein2019-10-251-9/+22
| | | | | | | | | | | | | | | | | Summary: Ternary expression checks for ISD::ADD instead of ISD::UADDO inside DAGTypeLegalizer::ExpandIntRes_UADDSUBO. This means the ternary expression will evaluate to ISD::SUBCARRY for both ISD::UADDO and ISD::USUBO nodes. Targets are likely to implement both, so impact will be very limited in practice. Reviewers: bogner, lebedev.ri Reviewed By: lebedev.ri Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68123
* [RISCV] Add support for half-precision floatsLuís Marques2019-10-251-1/+6
| | | | | | | | | Complete fp16 support by ensuring that load extension / truncate store operations are properly expanded. Reviewers: asb, lenary Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D69246
* [MIPS GlobalISel] Select MSA vector generic and builtin fsqrtPetar Avramovic2019-10-252-7/+17
| | | | | | | | | | | | | selectImpl is able to select G_FSQRT when we set bank for vector operands to fprb. Add detailed tests. Note: G_FSQRT is generated from llvm-ir intrinsics llvm.sqrt.*, and at the moment MIPS is not able to generate this intrinsic for vector type (some targets generate vector llvm.sqrt.* from calls to a builtin function). __builtin_msa_fsqrt_<format> will be transformed into G_FSQRT in legalizeIntrinsic and selected in the same way. Differential Revision: https://reviews.llvm.org/D69376
* [yaml2obj, obj2yaml] - Add support for SHT_NOTE sections.georgerim2019-10-252-8/+102
| | | | | | | | | | | | | | | | | | | | | SHT_NOTE is the section that consists of namesz, descsz, type, name + padding, desc + padding data. This patch teaches yaml2obj, obj2yaml to dump and parse them. This patch implements the section how it is described here: https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-18048.html Which says: "For 64–bit objects and 32–bit objects, each entry is an array of 4-byte words in the format of the target processor" The official specification is different http://www.sco.com/developers/gabi/latest/ch5.pheader.html#note_section And says: "n 64-bit objects (files with e_ident[EI_CLASS] equal to ELFCLASS64), each entry is an array of 8-byte words in the format of the target processor. In 32-bit objects (files with e_ident[EI_CLASS] equal to ELFCLASS32), each entry is an array of 4-byte words in the format of the target processor" Since LLVM uses the first, 32-bit way, this patch follows it. Differential revision: https://reviews.llvm.org/D68983
* Fix a variable typo in LiveDebugValues [NFC]David Stenberg2019-10-251-2/+2
|
* [PowerPC] [Peephole] fold frame offset by using index form to save add.czhengsz2019-10-254-0/+246
| | | | | | | | | | | | | | | | renamable $x6 = ADDI8 $x1, -80 ;;; 0 is replaced with -80 renamable $x6 = ADD8 killed renamable $x6, renamable $x5 STW killed renamable $r3, 4, killed renamable $x6 :: (store 4 into %ir.14, !tbaa !2) After PEI there is a peephole opt opportunity to combine above -80 in ADDI8 with 4 in the STW to eliminate unnecessary ADD8. Expected result: renamable $x6 = ADDI8 $x1, -76 STWX killed renamable $r3, renamable $x5, killed renamable $x6 :: (store 4 into %ir.6, !tbaa !2) Reviewed by: stefanp Differential Revision: https://reviews.llvm.org/D66329
* [LiveDebugValues] Small code clean up; NFCDjordje Todorovic2019-10-251-18/+11
|
* [X86][GISel] Remove unneeded custom selection code for handling shifts.Craig Topper2019-10-241-78/+0
|
* [SCEV] Expose and use maximum constant exit counts for individual loop exitsPhilip Reames2019-10-242-6/+18
| | | | | | | | We were already going to all of the trouble of computing maximum constant exit counts for each loop exit, we might as well expose them through the API. The change in IndVars is mostly to demonstrate that the wired up code works, but it als very slightly strengthens the transform. The strengthened case is rather narrow though: it requires one exactly analyzeable exit, one imprecisely analyzeable exit (with the upper bound less than the precise one), and one unanalyzeable exit. I coudn't construct a reasonably stable test case. This does increase the memory usage of the BackedgeTakenCount by a factor of 2 in the worst case. I also noticed the loop in IndVars is O(#Exits ^ 2). This doesn't change with this patch. A future patch will cache this result inside of SCEV to avoid requering.
* Fix Clang -Wcovered-switch-default warning by moving llvm_unreachable ↵David Blaikie2019-10-241-5/+2
| | | | default to after the switch
* [SCEV] Start reworking backedge taken count APIs to unify max handling [NFC]Philip Reames2019-10-241-13/+21
| | | | This is a first step in figuring out a proper API for maximum (non constant) exit counts. This may evolve a bit as we get experience with the API needs; suggestions very welcome. This patch just tried to provide a framework that we can later add maximum too in a clean and obvious way.
* Always flush pending errors in MCAsmParserJoerg Sonnenberger2019-10-251-4/+3
| | | | This has become visible with the --fatal-warnings support.
* Test commit access via gitPhilip Reames2019-10-241-1/+0
|
* Try harder to fix GCC 5.3 buildHans Wennborg2019-10-241-1/+1
| | | | | | | | | | (This time verified locally.) It was failing with: llvm/lib/MC/XCOFFObjectWriter.cpp:168:56: error: array must be initialized with a brace-enclosed initializer std::array<Section *const, 2> Sections = {&Text, &BSS}; ^
* Fix cppcheck shadow variable warning. NFCI.Simon Pilgrim2019-10-241-6/+6
|
* Follow up on D69112, fix build break for skipping field initializationjasonliu2019-10-241-2/+2
| | | | | Clang emit warning for skipping field initialization. Add {} to fix it. This is a patch that fixes issue introduced in https://reviews.llvm.org/D69112
* Fix MSVC "switch statement contains 'default' but no 'case' labels" warning. ↵Simon Pilgrim2019-10-241-7/+4
| | | | NFCI.
* Revert "Disable exit-on-SIGPIPE in lldb"Vedant Kumar2019-10-242-16/+1
| | | | | | | This reverts commit 32ce14e55e5a99dd99c3b4fd4bd0ccaaf2948c30. In post-commit review, Pavel pointed out that there's a simpler way to ignore SIGPIPE in lldb that doesn't rely on llvm's handlers.
* [ObjC][ARC] Check whether the return and parameter types of the old andAkira Hatanaka2019-10-241-1/+22
| | | | | | | | | | | | new functions are compatible before upgrading a function call to an intrinsic call. Sometimes users insert calls to ARC runtime functions that are not compatible with the corresponding intrinsic functions (for example, 'i8* @objc_storeStrong' instead of 'void @objc_storeStrong'). Don't upgrade those calls. rdar://problem/56447127
* [AMDGPU] Fix mfma scheduling crashStanislav Mekhanoshin2019-10-241-1/+6
| | | | | | | An SUnit can be neither intruction not SDNode. It is all null if represents a nop. Fixed a crash on using SU->getInstr(). Differential Revision: https://reviews.llvm.org/D69395
OpenPOWER on IntegriCloud