path: root/llvm/lib
Commit message (Author, Age, Files, Lines)
* [ARM][Thumb2InstrInfo] Fix default `0` opcode when rewriting frame indices (David Tellenbach, 2019-10-28, 1 file, -9/+3)
| | | | | | | | | | | | | | | | | | | | The static functions `positiveOffsetOpcode`, `negativeOffsetOpcode` and `immediateOffsetOpcode` (lib/Target/ARM/Thumb2InstrInfo.cpp) can currently return `0` as the default opcode, which is meaningless in this situation. This patch replaces this default value with llvm_unreachable. Reviewers: t.p.northover, tellenbach Reviewed By: tellenbach Subscribers: tellenbach, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69432 Patch By: Lorenzo Casalino <lorenzo.casalino93@gmail.com>
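The idea of replacing a meaningless default return value with an "unreachable" marker can be sketched in standalone C++. This is only an illustration: the opcode names and values below are hypothetical, and `unreachableSketch` is a stand-in for `llvm_unreachable`, not the real macro.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical opcode values for illustration; not the real ARM opcode enum.
enum Opcode : unsigned { LDRi12 = 1, LDRi8 = 2, STRi12 = 3, STRi8 = 4 };

// Stand-in for llvm_unreachable: report the message and abort.
[[noreturn]] inline void unreachableSketch(const char *Msg) {
  std::fprintf(stderr, "UNREACHABLE executed: %s\n", Msg);
  std::abort();
}

// Maps an opcode to its negative-offset form. An unknown opcode is a
// programming error, so it aborts instead of silently returning 0.
inline unsigned negativeOffsetOpcodeSketch(unsigned Opc) {
  switch (Opc) {
  case LDRi12:
    return LDRi8;
  case STRi12:
    return STRi8;
  case LDRi8:
  case STRi8:
    return Opc; // already in negative-offset form
  default:
    unreachableSketch("unexpected opcode");
  }
}
```

With this shape, a caller can never receive `0` and mistake it for a valid opcode; an unhandled case fails loudly at the point of the mistake.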
* Convert files added in d157a9bc8ba1 to unix line endings. (Nico Weber, 2019-10-28, 3 files, -434/+434)
| | | | | | Ran: git show --diff-filter=A --stat d157a9bc8ba1 | grep '|' | \ awk '{ print $1 }' | xargs dos2unix
* [ConstantFold] Fold extractelement of getelementptr (Jay Foad, 2019-10-28, 1 file, -6/+29)
| | | | | | | | | | | | | | Summary: Getelementptr has vector type if any of its operands are vectors (the scalar operands being implicitly broadcast to all vector elements). Extractelement applied to a vector getelementptr can be folded by applying the extractelement in turn to all of the vector operands. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69379
* [X86] Add a DAG combine to turn (and (bitcast (vXi1 (concat_vectors (vYi1 setcc), undef,))), C) into (bitcast (vXi1 (concat_vectors (vYi1 setcc), zero,))) (Craig Topper, 2019-10-28, 1 file, -0/+68)
| | | | | | | | | | The legalization of v2i1->i2 or v4i1->i4 bitcasts followed by a setcc can create an and after the bitcast. If we're lucky enough that the input to the bitcast is a concat_vectors where the first operand is a setcc that can natively zero all the upper bits of a k-register, then we should replace the other operands of the concat_vectors with zero in order to remove the AND. With the AND removed we might be able to use a kortest on the result. Differential Revision: https://reviews.llvm.org/D69205
* Reland [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize (Take 2) (Sander de Smalen, 2019-10-28, 7 files, -8/+81)
| | | | | Fixed up test/DebugInfo/MIR/Mips/live-debug-values-reg-copy.mir that broke r375425.
* [LV] Interleaving should not exceed estimated loop trip count. (Craig Topper, 2019-10-28, 1 file, -12/+12)
| | | | | | | | | | Currently we may interleave by more than the estimated trip count coming from the profile or the computed maximum trip count. The solution is to use the "best known" trip count instead of the exact one in interleaving analysis. Patch by Evgeniy Brevnov. Differential Revision: https://reviews.llvm.org/D67948
* [utils] InlineFunction: fix for debug info affecting optimizations (Bjorn Pettersson, 2019-10-28, 1 file, -0/+1)
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Debug info affects output from "opt -inline": InlineFunction could not handle llvm.dbg.value when it exists between alloca instructions. The problem was that the first alloca in a sequence of allocas was handled differently from the subsequent alloca instructions. Now all static alloca instructions are treated the same (being removed if they have no uses), so it does not matter if there are dbg instructions (or any other instructions) in between. Fix the issue: https://bugs.llvm.org/show_bug.cgi?id=43291k Patch by: yechunliang (Chris Ye) Reviewers: bjope, jmorse, vsk, probinson, jdoerfert, mtrofin, aprantl, fhahn Reviewed By: bjope Subscribers: uabelho, ormris, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68633
* AMDGPU: Avoid overwriting saved PC (Austin Kerbow, 2019-10-28, 1 file, -6/+20)
| | | | | | | | | | | | | | | | Summary: An outstanding load with the same destination sgpr as a call could cause the PC to be updated with a junk value on return. Reviewers: arsenm, rampitec Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69474
* [AIX] Refactor AIX Call Lowering to use CCState. NFCI. (Sean Fertile, 2019-10-28, 1 file, -94/+120)
| | | | | | | | | | | This patch reworks the AIX call lowering to use CCState. Some defensive errors are added in this patch to protect from emitting bad code for calling convention logic that has not been implemented by design. The use of CCState follows the precedent of other targets and enables the reuse of calling convention logic in LowerFormalArguments, which will be rewritten to also use CCState in a later patch. Patch by Chris Bowler. Differential Revision: https://reviews.llvm.org/D69101
* [InstCombine] Extra combine for uadd_sat (David Green, 2019-10-28, 1 file, -0/+7)
| | | | | | | This is an extra fold for a canonical form of uadd_sat, as shown in D68651. It essentially selects uadd from an add and a select. Differential Revision: https://reviews.llvm.org/D69244
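As a rough illustration of how an unsigned saturating add can be expressed as "an add and a select", here is a standalone C++ sketch. It shows one common form only and is not claimed to be the exact pattern matched in D69244; the function names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// add + select form: compute the wrapping sum, then select the maximum
// value when the addition overflowed (the wrapped sum is below an operand).
inline uint32_t uaddSatSelect(uint32_t A, uint32_t B) {
  uint32_t Sum = A + B;              // wraps modulo 2^32
  return Sum < A ? UINT32_MAX : Sum; // select(overflow, -1, sum)
}

// Reference implementation using a wider type, for comparison.
inline uint32_t uaddSatWide(uint32_t A, uint32_t B) {
  uint64_t Wide = uint64_t(A) + uint64_t(B);
  return Wide > UINT32_MAX ? UINT32_MAX : uint32_t(Wide);
}
```

Recognizing such a pattern lets the compiler replace the compare and select with a single saturating-add operation on targets that have one.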
* Add Windows Control Flow Guard checks (/guard:cf). (Andrew Paverd, 2019-10-28, 41 files, -24/+657)
| | | | | | | | | | | | | | | | | Summary: A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on indirect function calls, using either the check mechanism (X86, ARM, AArch64) or the dispatch mechanism (X86-64). The check mechanism requires a new calling convention for the supported targets. The dispatch mechanism adds the target as an operand bundle, which is processed by SelectionDAG. Another pass (CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as required by /guard:cf. This feature is enabled using the `cfguard` CC1 option. Reviewers: thakis, rnk, theraven, pcc Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D65761
* [AArch64] Fix unannotated fall-through between switch labels (Jinsong Ji, 2019-10-28, 1 file, -0/+1)
| | | | | | This is breaking buildbot with -Werror,-Wimplicit-fallthrough on. eg: http://lab.llvm.org:8011/builders/ppc64le-lld-multistage-test/builds/6881
* [DebugInfo] MachineSink: find more DBG_VALUEs to sink (Jeremy Morse, 2019-10-28, 1 file, -15/+86)
| | | | | | | | | | | | | | | In the Pre-RA machine sinker, previously we were relying on all DBG_VALUEs being immediately after the instruction that defined their operands. This isn't a valid assumption, as a variable location change doesn't necessarily correspond to where the value is computed. In this patch, we collect DBG_VALUEs that might need sinking as we walk through a block, and sink all of them if their defining instruction is sunk. This patch adds some copy propagation too, so that if we sink a copy inst, the now non-dominated paths can use the copy source for the variable location. Differential Revision: https://reviews.llvm.org/D58386
* [DAGCombiner] widen any_ext of popcount based on target support (Sanjay Patel, 2019-10-28, 1 file, -11/+28)
| | | | | | | | | This enhances D69127 (rGe6c145e0548e3b3de6eab27e44e1504387cf6b53) to handle the looser "any_extend" cast in addition to zext. This is a prerequisite step for canonicalizing in the other direction (narrow the popcount) in IR - PR43688: https://bugs.llvm.org/show_bug.cgi?id=43688
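The widening relies on a simple identity: counting the set bits of a narrow value and then extending the result gives the same answer as zero-extending the value first and counting in the wider type. A minimal standalone sketch of that identity, with hypothetical helper names and a plain loop standing in for a hardware popcount:

```cpp
#include <cassert>
#include <cstdint>

// Portable 32-bit population count (clears the lowest set bit each step).
inline unsigned popcount32(uint32_t X) {
  unsigned N = 0;
  while (X) {
    X &= X - 1;
    ++N;
  }
  return N;
}

// Narrow form: count bits of a 16-bit value, then extend the result.
inline uint32_t popcountThenExtend(uint16_t X) {
  unsigned N = 0;
  for (unsigned I = 0; I < 16; ++I)
    N += (X >> I) & 1u;
  return uint32_t(N);
}

// Wide form: zero-extend the value first, then count in 32 bits.
inline uint32_t extendThenPopcount(uint16_t X) {
  return popcount32(uint32_t(X)); // the zero-extended upper bits add nothing
}
```

Because the upper bits of the extended input are zero, the wide count is a valid result for both the zext and the looser any_extend case.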
* [CVP] prevent propagating poison when substituting edge values into a phi (PR43802) (Sanjay Patel, 2019-10-28, 1 file, -1/+8)
| | | | | | | | | | | | | | | | | This phi simplification transform was added with: D45448 However as shown in PR43802: https://bugs.llvm.org/show_bug.cgi?id=43802 ...we must be careful not to propagate poison when we do the substitution. There might be some more complicated analysis possible to retain the overflow flag, but it should always be safe and easy to drop flags (we have similar behavior in instcombine and other passes). Differential Revision: https://reviews.llvm.org/D69442
* [DebugInfo] MachineSink: Insert undef DBG_VALUEs when sinking instructions (Jeremy Morse, 2019-10-28, 1 file, -3/+50)
| | | | | | | | | | | When we sink DBG_VALUEs between blocks, we simply move the DBG_VALUE instruction to below the sunk instruction. However, we should also mark the variable as being undef at the original location, to terminate any earlier variable location. This patch does that -- plus, if the instruction being sunk is a copy, it attempts to propagate the copy through the DBG_VALUE, replacing the destination with the source. Differential Revision: https://reviews.llvm.org/D58238
* [AMDGPU][MC][GFX10] Added v_interp_[p1/p2/mov]_f32_e64 (Dmitry Preobrazhensky, 2019-10-28, 1 file, -2/+6)
| | | | | | | | See https://bugs.llvm.org/show_bug.cgi?id=43747 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D69348
* [Codegen][ARM] Add float softening for cbrt (David Green, 2019-10-28, 2 files, -0/+29)
| | | | | | | We would previously have no soft-float softening for cbrt, so could hit a crash failing to select. This fills in what appears to be missing. Differential Revision: https://reviews.llvm.org/D69345
* minor doc typo fix / testing github commit (Rafael Stahl, 2019-10-28, 1 file, -1/+1)
|
* [ARM][AArch64] Implement __cls, __clsl and __clsll intrinsics from ACLE (vhscampos, 2019-10-28, 2 files, -0/+45)
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Writing support for three ACLE functions: unsigned int __cls(uint32_t x), unsigned int __clsl(unsigned long x), unsigned int __clsll(uint64_t x). CLS stands for "Count number of leading sign bits". In AArch64, these intrinsics can be translated into the 'cls' instruction directly. In AArch32, on the other hand, this functionality is achieved by implementing it in terms of clz (count number of leading zeros). Reviewers: compnerd Reviewed By: compnerd Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D69250
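To make "CLS in terms of CLZ" concrete, here is a standalone C++ sketch of one well-known identity for the 32-bit case. It is an assumption for illustration and not necessarily the exact sequence the patch emits; `clz32` and `cls32` are hypothetical names, and the loop-based `clz32` stands in for a hardware instruction.

```cpp
#include <cassert>
#include <cstdint>

// Portable count-leading-zeros for a non-zero 32-bit value.
inline unsigned clz32(uint32_t X) {
  assert(X != 0 && "clz32 sketch expects a non-zero input");
  unsigned N = 0;
  while (!(X & 0x80000000u)) {
    X <<= 1;
    ++N;
  }
  return N;
}

// Count leading sign bits: the number of bits after the sign bit that
// are equal to it. Result is in the range [0, 31].
inline unsigned cls32(uint32_t X) {
  // All-ones if the sign bit is set, zero otherwise.
  uint32_t SignMask = (X >> 31) ? 0xFFFFFFFFu : 0u;
  // After the XOR, bits equal to the sign bit become zeros.
  uint32_t Folded = X ^ SignMask;
  // Shift out the sign position itself; OR-ing 1 keeps the input non-zero.
  return clz32((Folded << 1) | 1u);
}
```

For example, `cls32(0)` and `cls32(0xFFFFFFFF)` both give 31, while a value whose top two bits differ gives 0.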
* [AArch64][SVE] Implement masked load intrinsics (Kerry McLaughlin, 2019-10-28, 10 files, -11/+150)
| | | | | | | | | | | | | | | | Summary: Adds support for codegen of masked loads, with non-extending, zero-extending and sign-extending variants. Reviewers: huntergr, rovka, greened, dmgreen Reviewed By: dmgreen Subscribers: dmgreen, samparker, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68877
* [RISCV] Lower llvm.trap and llvm.debugtrap (Sam Elliott, 2019-10-28, 2 files, -0/+13)
| | | | | | | | | | | | | | | | | | | | Summary: Until this commit, these have lowered to a call to abort(). `llvm.trap()` now lowers to `unimp`, which should trap on all systems. `llvm.debugtrap()` now lowers to `ebreak`, which is exactly what this instruction is for. Reviewers: asb, luismarques Reviewed By: asb Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69390
* [X86] Fix 48/96 byte memcmp code gen (David Zarzycki, 2019-10-28, 1 file, -2/+21)
| | | | | | | Detect scalar ISD::ZERO_EXTEND generated by memcmp lowering and convert it to ISD::INSERT_SUBVECTOR. https://reviews.llvm.org/D69464
* [X86] Use 64-bit version of source register in LowerPATCHABLE_EVENT_CALL and LowerPATCHABLE_TYPED_EVENT_CALL (Craig Topper, 2019-10-27, 1 file, -6/+12)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The PATCHABLE_EVENT_CALL uses i32 in the intrinsic. This results in the register allocator picking a 32-bit register. We need to use the 64-bit register when forming the MOV64rr instructions. Otherwise we print illegal assembly in the text output. I think prior to this it was impossible for SrcReg to be equal to DstReg so the NOP code was not reachable. While there use Register instead of unsigned. Also add a FIXME for what looks like a bug. Reviewers: dberris Reviewed By: dberris Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69365
* Use isConvergent helper instead of directly checking attribute (Matt Arsenault, 2019-10-27, 2 files, -2/+2)
|
* PM: silence `-Wpessimizing-move` from GCC 9.2.1 (NFC) (Saleem Abdulrasool, 2019-10-27, 1 file, -1/+1)
| | | | Remove the explicit move to enable NRVO.
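For readers unfamiliar with the warning, the sketch below shows the general shape of such a fix in standalone C++. The function name is hypothetical and the code is not taken from the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returning a local by name lets the compiler apply NRVO (named return
// value optimization) and construct the result directly in the caller.
// Writing `return std::move(Names);` instead would force a move
// construction and block that elision, which is what
// -Wpessimizing-move reports.
inline std::vector<std::string> buildNames() {
  std::vector<std::string> Names;
  Names.push_back("alpha");
  Names.push_back("beta");
  return Names; // no std::move: eligible for NRVO
}
```

Even when the compiler does not elide the copy, a plain `return Names;` still treats the local as an rvalue, so the explicit move never helps here.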
* [SDAG] fold insert_vector_elt with undef index (Sanjay Patel, 2019-10-27, 2 files, -4/+9)
| | | | | | | | | | | | Similar to: rG4c47617627fb This makes the DAG behavior consistent with IR's insertelement. https://bugs.llvm.org/show_bug.cgi?id=42689 I've tried to maintain test intent for AArch64 and WebAssembly by replacing undef index operands with something else.
* [LegalizeTypes] When promoting BITREVERSE/BSWAP don't take the shift amount into account when determining the shift amount VT. (Craig Topper, 2019-10-27, 1 file, -10/+9)
| | | | | | | | | If the target's preferred shift amount VT can't hold any shift amount for the promoted VT, we should use i32. The specific shift amount shouldn't matter. The type will be adjusted later when the shift itself is type legalized. This avoids an assert in getNode. Fixes PR43820.
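For context on where the shift comes from: promoting a narrow byte swap to a wider type typically means swapping in the wide type and then shifting right by the difference in bit widths. A standalone C++ sketch of that idea for a 16-bit value promoted to 32 bits, with hypothetical helper names and no claim to mirror the legalizer code:

```cpp
#include <cassert>
#include <cstdint>

// Portable 32-bit byte swap.
inline uint32_t bswap32(uint32_t X) {
  return (X >> 24) | ((X >> 8) & 0x0000FF00u) | ((X << 8) & 0x00FF0000u) |
         (X << 24);
}

// 16-bit byte swap done in a promoted 32-bit type: swap the wide value,
// then shift right by the width difference to bring the bytes back down.
inline uint16_t bswap16ViaPromotion(uint16_t X) {
  const unsigned ShiftAmt = 32 - 16; // difference between the two widths
  return uint16_t(bswap32(uint32_t(X)) >> ShiftAmt);
}
```

The shift amount here is always the fixed width difference, which is why the type chosen to hold it only needs to be large enough for the promoted type.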
* [TargetLowering] Add getBooleanContents contents check to "SETCC (SETCC), [0|1], [EQ|NE] -> SETCC" combine. (Craig Topper, 2019-10-27, 1 file, -2/+5)
| | | | | | | | | | | | | | This combine is only valid if the inner setcc produces a 0/1 result or the inner type is MVT::i1. I haven't seen this cause any issues, just happened to notice it while reviewing combines in this function. While there also fix another call to use the value type from the SDValue for the operand instead of calling SDNode::getValueType(0). Though it's likely the use is result 0, it's not guaranteed.
* [MCA] Fix a spelling mistake in a comment. NFC (Greg Bedwell, 2019-10-27, 1 file, -1/+1)
|
* [X86] Only look up boolean reduction cost tables if the reduction is not pairwise. (Craig Topper, 2019-10-26, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | | | | | | | Summary: We don't pattern match pairwise shuffles in SelectionDAG. So we should only return the optimized costs if it's not a pairwise shuffle. I think SLP vectorizer gives priority to non pairwise shuffle if the cost is the same. And the look up for reduction intrinsics passes false for the pairwise flag. So this probably has no real effect today. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69083
* [APInt] Introduce APIntOps::GetMostSignificantDifferentBit() (Roman Lebedev, 2019-10-26, 1 file, -0/+8)
| | | | | | | | | | | | | | | | | | Summary: Compare two values, and if they are different, return the position of the most significant bit that is different in the values. Needed for D69387. Reviewers: nikic, spatel, sanjoy, RKSimon Reviewed By: nikic Subscribers: xbolva00, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69439
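The described operation can be sketched on plain 64-bit integers: XOR the two values and locate the highest set bit of the result. This is an illustrative stand-in using `uint64_t` and `std::optional`, not the APInt-based implementation; the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Returns the position (0 = least significant) of the most significant
// bit in which A and B differ, or std::nullopt if the values are equal.
inline std::optional<unsigned> mostSignificantDifferentBit(uint64_t A,
                                                           uint64_t B) {
  uint64_t Diff = A ^ B; // set bits mark the positions that differ
  if (Diff == 0)
    return std::nullopt;
  unsigned Pos = 0;
  while (Diff >>= 1) // walk up to the highest set bit
    ++Pos;
  return Pos;
}
```

Returning an empty optional for equal inputs keeps "no difference" distinct from "differs at bit 0".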
* [X86] Prefer KORTEST on Knights Landing or later for memcmp() (David Zarzycki, 2019-10-26, 4 files, -17/+60)
| | | | | | | | | | | PTEST and especially the MOVMSK instructions are slow on Knights Landing or later. As a bonus, this patch increases instruction parallelism by emitting: KORTEST(PCMPNEQ(a, b), PCMPNEQ(c, d)) == 0 Instead of: KORTEST(AND(PCMPEQ(a, b), PCMPEQ(c, d))) == ~0 https://reviews.llvm.org/D69157
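The two forms quoted above are equivalent by De Morgan's law: "no lane differs in either comparison" is the same as "every lane is equal in both comparisons". The following scalar C++ model uses 8-lane byte vectors and 8-bit masks; the lane count and helper names are assumptions for illustration, and no vector intrinsics are involved.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Vec8 = std::array<uint8_t, 8>;

// Builds a vector with every lane set to V.
inline Vec8 splat(uint8_t V) {
  Vec8 R;
  R.fill(V);
  return R;
}

// One mask bit per lane, set where the lanes differ (models PCMPNEQ).
inline uint8_t cmpNeqMask(const Vec8 &A, const Vec8 &B) {
  uint8_t M = 0;
  for (unsigned I = 0; I < 8; ++I)
    if (A[I] != B[I])
      M |= uint8_t(1u << I);
  return M;
}

// One mask bit per lane, set where the lanes are equal (models PCMPEQ).
inline uint8_t cmpEqMask(const Vec8 &A, const Vec8 &B) {
  return uint8_t(~cmpNeqMask(A, B));
}

// New form: OR of the "not equal" masks must be zero. The two compares
// are independent, and the OR models the test of both masks together.
inline bool equalOrForm(const Vec8 &A, const Vec8 &B, const Vec8 &C,
                        const Vec8 &D) {
  return (cmpNeqMask(A, B) | cmpNeqMask(C, D)) == 0;
}

// Old form: AND of the "equal" masks must be all ones.
inline bool equalAndForm(const Vec8 &A, const Vec8 &B, const Vec8 &C,
                         const Vec8 &D) {
  return (cmpEqMask(A, B) & cmpEqMask(C, D)) == 0xFF;
}
```

Both predicates agree on every input; the first form avoids the extra AND on the mask values before the final test.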
* [ObjectYAML] - Do not use auto. NFC. (Georgii Rymar, 2019-10-26, 1 file, -1/+1)
| | | | | | Using 'auto' when the type is not obvious is undesired. (it is just a test commit actually)
* [AMDGPU] Fix Vreg_1 PHI lowering in SILowerI1Copies. (cdevadas, 2019-10-26, 1 file, -90/+89)
| | | | | | | | | | | | | There is a minor flaw in the implementation of the function lowerPhis. This function lowers values of regclass Vreg_1 (boolean values) involved in PHIs into an SGPR. Currently it iterates over the MBBs and performs an in-place lowering of PHIs, and fails to lower any incoming value that itself is another PHI of Vreg_1 regclass. The failure occurs only when the MBB to which the incoming PHI value belongs is not visited/lowered yet. To fix this problem, collect all Vreg_1 PHIs upfront and then perform the lowering. Differential Revision: https://reviews.llvm.org/D69182
* [X86][GISel] Fix typo in comment. NFC (Craig Topper, 2019-10-26, 1 file, -1/+1)
|
* Add Record::getValueAsOptionalDef(). (John McCall, 2019-10-25, 1 file, -0/+15)
| | | | | | Using `?` as an optional marker is very useful in Clang's AST-node emitters because otherwise we need a separate class just to encode the presence or absence of a base node reference.
* [SDAG] fold extract_vector_elt with undef index (Sanjay Patel, 2019-10-25, 1 file, -2/+2)
| | | | | | | | | | | This makes the DAG behavior consistent with IR's extractelement after: rGb32e4664a715 https://bugs.llvm.org/show_bug.cgi?id=42689 I've tried to maintain test intent for WebAssembly. The AMDGPU test is trying to test for crashing or other bad behavior, but I'm not sure if that's possible after this change.
* [AMDGPU] Enable SGPR copy folding (Stanislav Mekhanoshin, 2019-10-25, 2 files, -14/+11)
| | | | | | | | | | | | | That used to fail in the last testcase function because after %0:sreg_64.sub0 was folded into %3:sreg_32_xm0_xexec COPY, it was further folded into S_STORE_DWORD_IMM. Its legal effective subreg class is SReg_32 while instruction expects more restricted SReg_32_XM0_EXEC. However, SIInstrInfo::isLegalRegOperand() passed the legality check and it was caught in the verifier. Borrowed code from the verifier to check for RC legality. Differential Revision: https://reviews.llvm.org/D69445
* [BPF] fix a CO-RE issue with -mattr=+alu32 (Yonghong Song, 2019-10-25, 1 file, -5/+10)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ilya Leoshkevich (<iii@linux.ibm.com>) reported an issue that with -mattr=+alu32 CO-RE has a segfault in the BPF MISimplifyPatchable pass. The pattern transformed by the MISimplifyPatchable pass looks like this: r5 = ld_imm64 @"b:0:0$0:0" r2 = ldw r5, 0 ... r2 ... // use r2 The pass will remove the intermediate 'ldw' instruction and replace all r2 with r5, as below: r5 = ld_imm64 @"b:0:0$0:0" ... r5 ... // use r5 Later, the ld_imm64 insn will be replaced with r5 = <patched immediate> for field relocation purposes. With -mattr=+alu32, the input code may become r5 = ld_imm64 @"b:0:0$0:0" w2 = ldw32 r5, 0 ... w2 ... // use w2 Replacing "w2" with "r5" is incorrect and will trigger compiler internal errors. To fix the problem, if the register class of the ldw* dest register is sub_32, we just replace the original ldw* register with: w2 = w5 Directly replacing all uses of w2 with an in-place constructed w5 for the use operand does not seem to work in all cases. The latest kernel will have -mattr=+alu32 on by default, so this flag was added to all CORE tests. Tested with the latest kernel bpf-next branch as well with this patch. Differential Revision: https://reviews.llvm.org/D69438
* Revert "[ARM] Uses "Sun Style" syntax for section switching" (Jian Cai, 2019-10-25, 1 file, -4/+0)
| | | | This reverts commit 03de2f84fc4acf06c719cd007b5459c9d4d0a20c.
* [AMDGPU] Fixed asan failure in SIFoldOperands (Stanislav Mekhanoshin, 2019-10-25, 1 file, -3/+4)
| | | | | | | Both tryFoldOMod() and tryFoldClamp() remove original instruction, so the check MI.modifiesRegister() may use a deleted MI. Differential Revision: https://reviews.llvm.org/D69448
* GlobalISel: Implement widenScalar for G_INSERT_VECTOR_ELT (Matt Arsenault, 2019-10-25, 1 file, -0/+25)
|
* [Alignment][NFC] Convert AllocaInst to MaybeAlign (Guillaume Chatelet, 2019-10-25, 7 files, -37/+37)
| | | | | | | | | | | | | | | Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Reviewed By: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69301
* [ARM] Uses "Sun Style" syntax for section switching (Jian Cai, 2019-10-25, 1 file, -0/+4)
| | | | | | | | | | | | | | | | Summary: Support "Sun Style" syntax for section switching ("#alloc,#write" etc). https://bugs.llvm.org/show_bug.cgi?id=43759 Reviewers: peter.smith, eli.friedman, kristof.beyls, t.p.northover Reviewed By: peter.smith Subscribers: MaskRay, llozano, manojgupta, nickdesaulniers, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69296
* AMDGPU/GlobalISel: Handle flat/global G_ATOMIC_CMPXCHG (Matt Arsenault, 2019-10-25, 9 files, -76/+97)
| | | | | | | | Custom lower this to a target instruction with the merge operands. I think it might be better to directly select this and emit a REG_SEQUENCE, but this would be more work since it would require splitting the tablegen patterns for these cases from the other atomics.
* AMDGPU: Fix the broken dominator tree when creating waterfall loop for resource descriptor (Changpeng Fang, 2019-10-25, 1 file, -2/+2)
| | | | | | | | | | | Summary: In loadSRsrcFromVGPR, if MBB is the same as Succ, Remainder is not the immediate dominator of Succ. Reviewer: arsenm Differential Revision: https://reviews.llvm.org/D69358
* Revert "Add an instruction marker field to the ExtraInfo in MachineInstrs." (Amy Huang, 2019-10-25, 7 files, -114/+117)
| | | | | Reverting commit b85b4e5a6f8579c137fecb59a4d75d7bfb111f79 due to some buildbot failures/ out of memory errors.
* [LLD][ThinLTO] Handle GUID collision in import global processing (Teresa Johnson, 2019-10-25, 1 file, -5/+11)
| | | | | | | | | | | | | | | | | | | | | | Summary: If there is a GUID collision between two globals, checking the summary list from the import index to make assumptions can be dangerous. Do not assume that a GlobalValue that has a GlobalVarSummary actually is a GlobalVariable, as the summary can be connected to another GlobalValue with the same GUID. Patch by Joel Klinghed (the_jk@opera.com) Reviewers: evgeny777, tejohnson Reviewed By: tejohnson Subscribers: tejohnson, dblaikie, MaskRay, mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67322
* [Alignment][NFC] getMemoryOpCost uses MaybeAlign (Guillaume Chatelet, 2019-10-25, 15 files, -57/+69)
| | | | | | | | | | | | | Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69307