summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* [BreakFalseDeps] fix typos/grammar in documentation comment; NFCSanjay Patel2019-09-101-4/+3
| | | | llvm-svn: 371516
* [Alignment][NFC] Use llvm::Align for TargetLowering::getPrefLoopAlignmentGuillaume Chatelet2019-09-101-4/+4
| | | | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Reviewed By: courbet Subscribers: wuzish, arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67386 llvm-svn: 371511
* [AMDGPU]: PHI Elimination hooks added for custom COPY insertion.Alexander Timofeev2019-09-101-14/+14
| | | | | | | | Reviewers: rampitec, vpykhtin Differential Revision: https://reviews.llvm.org/D67101 llvm-svn: 371508
* Revert "Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of ↵Dmitri Gribenko2019-09-104-0/+886
| | | | | | | | | CodeGen into opt pipeline."" This reverts commit r371502, it broke tests (clang/test/CodeGenCXX/auto-var-init.cpp). llvm-svn: 371507
* Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into ↵Clement Courbet2019-09-104-886/+0
| | | | | | | | opt pipeline." With a fix for sanitizer breakage (see explanation in D60318). llvm-svn: 371502
* [Alignment] Use Align for TargetLowering::MinStackArgumentAlignmentGuillaume Chatelet2019-09-101-7/+6
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67288 llvm-svn: 371498
* [LegalizeTypes] Teach SoftenFloatOp_SELECT_CC to handle operand 2 or 3 being ↵Craig Topper2019-09-102-3/+15
| | | | | | | | | | | softened. This can only happen on X86 when fp128 is a legal type, but we go through softening to generate libcalls. This causes fp128 to be softened to fp128 instead of an integer type. This can be removed if D67128 lands. llvm-svn: 371493
* [GlobalISel]: Fix a bug where we could dereference NoneAditya Nandakumar2019-09-091-0/+2
| | | | | | | getConstantVRegVal returns None when dealing with constants > 64 bits. Don't assume we always have a value in GISelKnownBits. llvm-svn: 371465
* Introduce infrastructure for an incremental port of SelectionDAG atomic ↵Philip Reames2019-09-091-4/+29
| | | | | | | | | | | | | | | | | | | | | | | | load/store handling This is the first patch in a large sequence. The eventual goal is to have unordered atomic loads and stores - and possibly ordered atomics as well - handled through the normal ISEL codepaths for loads and stores. Today, there handled w/instances of AtomicSDNodes. The result of which is that all transforms need to be duplicated to work for unordered atomics. The benefit of the current design is that it's harder to introduce a silent miscompile by adding an transform which forgets about atomicity. See the thread on llvm-dev titled "FYI: proposed changes to atomic load/store in SelectionDAG" for further context. Note that this patch is NFC unless the experimental flag is set. The basic strategy I plan on taking is: introduce infrastructure and a flag for testing (this patch) Audit uses of isVolatile, and apply isAtomic conservatively* piecemeal conservative* update generic code and x86 backedge code in individual reviews w/tests for cases which didn't check volatile, but can be found with inspection flip the flag at the end (with minimal diffs) Work through todo list identified in (2) and (3) exposing performance ops (*) The "conservative" bit here is aimed at minimizing the number of diffs involved in (4). Ideally, there'd be none. In practice, getting it down to something reviewable by a human is the actual goal. Note that there are (currently) no paths which produce LoadSDNode or StoreSDNode with atomic MMOs, so we don't need to worry about preserving any behaviour there. We've taken a very similar strategy twice before with success - once at IR level, and once at the MI level (post ISEL). Differential Revision: https://reviews.llvm.org/D66309 llvm-svn: 371441
* [IfConversion] Correctly handle cases where analyzeBranch fails.Eli Friedman2019-09-091-0/+6
| | | | | | | | | | | | | | | | If analyzeBranch fails, on some targets, the out parameters point to some blocks in the function. But we can't use that information, so make sure to clear it out. (In some places in IfConversion, we assume that any block with a TrueBB is analyzable.) The change to the testcase makes it trigger a bug on builds without this fix: IfConvertDiamond tries to perform a followup "merge" operation, which isn't legal, and we somehow end up with a branch to a deleted MBB. I'm not sure how this doesn't crash the compiler. Differential Revision: https://reviews.llvm.org/D67306 llvm-svn: 371434
* [SelectionDAG] Remove ISD::FP_ROUND_INREGCraig Topper2019-09-096-59/+1
| | | | | | | | | | | | I don't think anything in tree creates this node. So all of this code appears to be dead. Code coverage agrees http://lab.llvm.org:8080/coverage/coverage-reports/llvm/coverage/Users/buildslave/jenkins/workspace/clang-stage2-coverage-R/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp.html Differential Revision: https://reviews.llvm.org/D67312 llvm-svn: 371431
* Revert "[MachineCopyPropagation] Remove redundant copies after TailDup via ↵Dmitri Gribenko2019-09-091-65/+0
| | | | | | | | | machine-cp" This reverts commit 371359. I'm suspecting a miscompile, I posted a reproducer to https://reviews.llvm.org/D65267. llvm-svn: 371421
* [DFAPacketizer] Reapply: Track resources for packetized instructionsJames Molloy2019-09-091-11/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reapply with fix to reduce resources required by the compiler - use unsigned[2] instead of std::pair. This causes clang and gcc to compile the generated file multiple times faster, and hopefully will reduce the resource requirements on Visual Studio also. This fix is a little ugly but it's clearly the same issue the previous author of DFAPacketizer faced (the previous tables use unsigned[2] rather uglily too). This patch allows the DFAPacketizer to be queried after a packet is formed to work out which resources were allocated to the packetized instructions. This is particularly important for targets that do their own bundle packing - it's not sufficient to know simply that instructions can share a packet; which slots are used is also required for encoding. This extends the emitter to emit a side-table containing resource usage diffs for each state transition. The packetizer maintains a set of all possible resource states in its current state. After packetization is complete, all remaining resource states are possible packetization strategies. The sidetable is only ~500K for Hexagon, but the extra tracking is disabled by default (most uses of the packetizer like MachinePipeliner don't care and don't need the extra maintained state). Differential Revision: https://reviews.llvm.org/D66936 llvm-svn: 371399
* Revert rL371198 from llvm/trunk: [DFAPacketizer] Track resources for ↵Simon Pilgrim2019-09-091-54/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | packetized instructions This patch allows the DFAPacketizer to be queried after a packet is formed to work out which resources were allocated to the packetized instructions. This is particularly important for targets that do their own bundle packing - it's not sufficient to know simply that instructions can share a packet; which slots are used is also required for encoding. This extends the emitter to emit a side-table containing resource usage diffs for each state transition. The packetizer maintains a set of all possible resource states in its current state. After packetization is complete, all remaining resource states are possible packetization strategies. The sidetable is only ~500K for Hexagon, but the extra tracking is disabled by default (most uses of the packetizer like MachinePipeliner don't care and don't need the extra maintained state). Differential Revision: https://reviews.llvm.org/D66936 ........ Reverted as this is causing "compiler out of heap space" errors on MSVC 2017/19 NDEBUG builds llvm-svn: 371393
* GlobalISel: fix unused warnings in release builds.Tim Northover2019-09-091-0/+4
| | | | llvm-svn: 371385
* GlobalISel: add combiner to form indexed loads.Tim Northover2019-09-092-4/+216
| | | | | | | | | | | Loosely based on DAGCombiner version, but this part is slightly simpler in GlobalIsel because all address calculation is performed by G_GEP. That makes the inc/dec distinction moot so there's just pre/post to think about. No targets can handle it yet so testing is via a special flag that overrides target hooks. llvm-svn: 371384
* [MachineCopyPropagation] Remove redundant copies after TailDup via machine-cpKai Luo2019-09-091-0/+65
| | | | | | | | | | | | | | | | | | | | | | | Summary: After tailduplication, we have redundant copies. We can remove these copies in machine-cp if it's safe to, i.e. ``` $reg0 = OP ... ... <<< No read or clobber of $reg0 and $reg1 $reg1 = COPY $reg0 <<< $reg0 is killed ... <RET> ``` will be transformed to ``` $reg1 = OP ... ... <RET> ``` Differential Revision: https://reviews.llvm.org/D65267 llvm-svn: 371359
* [DAGCombiner][X86][ARM] Teach visitMULO to fold multiplies with 0 to 0 and ↵Craig Topper2019-09-081-3/+19
| | | | | | | | | no carry. I modified the ARM test to use two inputs instead of 0 so the test hopefully still tests what was intended. llvm-svn: 371344
* [DebugInfo][X86] Describe call site values for zero-valued immsDavid Stenberg2019-09-081-12/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add zero-materializing XORs to X86's describeLoadedValue() hook in order to produce call site values. I have had to change the defs logic in collectCallSiteParameters() a bit to be able to describe the XORs. The XORs implicitly define $eflags, which would cause them to never be considered, due to a guard condition that I->getNumDefs() is one. I have changed that condition so that we now only consider instructions where a forwarded register overlaps with the instruction's single explicit define. We still need to collect the implicit defines of other forwarded registers to remove them from the work list. I'm not sure how to move towards supporting instructions with multiple explicit defines, cases where forwarded register are implicitly defined, and/or cases where an instruction produces values for multiple forwarded registers. Perhaps the describeLoadedValue() hook should take a register argument, and we then leave it up to the hook to describe the loaded value in that register? I have not yet encountered a situation where that would be necessary though. Reviewers: aprantl, vsk, djtodoro, NikolaPrica Reviewed By: vsk Subscribers: ychen, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D67225 llvm-svn: 371333
* [NFC] Make the describeLoadedValue() hook return machine operand objectsDavid Stenberg2019-09-082-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This changes the ParamLoadedValue pair which the describeLoadedValue() hook returns so that MachineOperand objects are returned instead of pointers. When describing call site values we may need to describe operands which are not part of the instruction. One such example is zero-materializing XORs on x86, which I have implemented support for in a child revision. Instead of having to return a pointer to an operand stored somewhere outside the instruction, start returning objects directly instead, as that simplifies the code. The MachineOperand class only holds POD members, and on x86-64 it is 32 bytes large. That combined with copy elision means that the overhead of returning a machine operand object from the hook does not become very large. I benchmarked this on a 8-thread i7-8650U machine with 32 GB RAM. The benchmark consisted of building a clang 8.0 binary configured with: -DCMAKE_BUILD_TYPE=RelWithDebInfo \ -DLLVM_TARGETS_TO_BUILD=X86 \ -DLLVM_USE_SANITIZER=Address \ -DCMAKE_CXX_FLAGS="-Xclang -femit-debug-entry-values -stdlib=libc++" The average wall clock time increased by 4 seconds, from 62:05 to 62:09, which is an 0.1% increase. Reviewers: aprantl, vsk, djtodoro, NikolaPrica Reviewed By: vsk Subscribers: hiraditya, ychen, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D67261 llvm-svn: 371332
* [CodeGen] Handle SMULFIXSAT with scale zero in ↵Bjorn Pettersson2019-09-071-10/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TargetLowering::expandFixedPointMul Summary: Normally TargetLowering::expandFixedPointMul would handle SMULFIXSAT with scale zero by using an SMULO to compute the product and determine if saturation is needed (if overflow happened). But if SMULO isn't custom/legal it falls through and uses the same technique, using MULHS/SMUL_LOHI, as used for non-zero scales. Problem was that when checking for overflow (handling saturation) when not using MULO we did not expect to find a zero scale. So we ended up in an assertion when doing APInt::getLowBitsSet(VTSize, Scale - 1) This patch fixes the problem by adding a new special case for how saturation is computed when scale is zero. Reviewers: RKSimon, bevinh, leonardchan, spatel Reviewed By: RKSimon Subscribers: wuzish, nemanjai, hiraditya, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67071 llvm-svn: 371309
* [Intrinsic] Add the llvm.umul.fix.sat intrinsicBjorn Pettersson2019-09-079-46/+149
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add an intrinsic that takes 2 unsigned integers with the scale of them provided as the third argument and performs fixed point multiplication on them. The result is saturated and clamped between the largest and smallest representable values of the first 2 operands. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Patch by: leonardchan, bjope Reviewers: RKSimon, craig.topper, bevinh, leonardchan, lebedev.ri, spatel Reviewed By: leonardchan Subscribers: ychen, wuzish, nemanjai, MaskRay, jsji, jdoerfert, Ka-Ka, hiraditya, rjmccall, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57836 llvm-svn: 371308
* [DwarfExpression] Disallow some rewrites to avoid undefined behaviorBjorn Pettersson2019-09-072-8/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The value operand in DW_OP_plus_uconst/DW_OP_constu value can be large (it uses uint64_t as representation internally in LLVM). This means that in the uint64_t to int conversions, previously done by DwarfExpression::addMachineRegExpression, could lose information. Also, the negation done in "-Offset" was undefined behavior in case Offset was exactly INT_MIN. To avoid the above problems, we now avoid transformation like [Reg, DW_OP_plus_uconst, Offset] --> [DW_OP_breg, Offset] and [Reg, DW_OP_constu, Offset, DW_OP_plus] --> [DW_OP_breg, Offset] when Offset > INT_MAX. And we avoid to transform [Reg, DW_OP_constu, Offset, DW_OP_minus] --> [DW_OP_breg,-Offset] when Offset > INT_MAX+1. The patch also adjusts DwarfCompileUnit::constructVariableDIEImpl to make sure that "DW_OP_constu, Offset, DW_OP_minus" is used instead of "DW_OP_plus_uconst, Offset" when creating DIExpressions with negative frame index offsets. Notice that this might just be the tip of the iceberg. There are lots of fishy handling related to these constants. I think both DIExpression::appendOffset and DIExpression::extractIfOffset may trigger undefined behavior for certain values. Reviewers: sdesmalen, rnk, JDevlieghere Reviewed By: JDevlieghere Subscribers: jholewinski, aprantl, hiraditya, ychen, uabelho, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D67263 llvm-svn: 371304
* Change TargetLibraryInfo analysis passes to always require FunctionTeresa Johnson2019-09-075-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is the first change to enable the TLI to be built per-function so that -fno-builtin* handling can be migrated to use function attributes. See discussion on D61634 for background. This is an enabler for fixing handling of these options for LTO, for example. This change should not affect behavior, as the provided function is not yet used to build a specifically per-function TLI, but rather enables that migration. Most of the changes were very mechanical, e.g. passing a Function to the legacy analysis pass's getTLI interface, or in Module level cases, adding a callback. This is similar to the way the per-function TTI analysis works. There was one place where we were looking for builtins but not in the context of a specific function. See FindCXAAtExit in lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround could provide the wrong behavior in some corner cases. Suggestions welcome. Reviewers: chandlerc, hfinkel Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66428 llvm-svn: 371284
* GlobalISel: Add G_FMAD instructionMatt Arsenault2019-09-061-0/+2
| | | | llvm-svn: 371254
* [DFAPacketizer] Track resources for packetized instructionsJames Molloy2019-09-061-11/+54
| | | | | | | | | | | | | | | | | | | | | | This patch allows the DFAPacketizer to be queried after a packet is formed to work out which resources were allocated to the packetized instructions. This is particularly important for targets that do their own bundle packing - it's not sufficient to know simply that instructions can share a packet; which slots are used is also required for encoding. This extends the emitter to emit a side-table containing resource usage diffs for each state transition. The packetizer maintains a set of all possible resource states in its current state. After packetization is complete, all remaining resource states are possible packetization strategies. The sidetable is only ~500K for Hexagon, but the extra tracking is disabled by default (most uses of the packetizer like MachinePipeliner don't care and don't need the extra maintained state). Differential Revision: https://reviews.llvm.org/D66936 llvm-svn: 371198
* [DebugInfo] LiveDebugValues: explicitly terminate overwritten stack locationsJeremy Morse2019-09-061-12/+54
| | | | | | | | | | | | | | | | | | If a stack spill location is overwritten by another spill instruction, any variable locations pointing at that slot should be terminated. We cannot rely on spills always being restored to registers or variable locations being moved by a DBG_VALUE: the register allocator is entitled to spill a value and then forget about it when it goes out of liveness. To address this, scan for memory writes to spill locations, even those we don't consider to be normal "spills". isSpillInstruction and isLocationSpill distinguish the two now. After identifying spill overwrites, terminate the open range, and insert a $noreg DBG_VALUE for that variable. Differential Revision: https://reviews.llvm.org/D66941 llvm-svn: 371193
* [CodeGen] Do the Simple Early Return in block-placement pass to optimize the ↵Kang Zhang2019-09-061-0/+46
| | | | | | | | | | | | | | | | | | blocks Summary: Fix a bug of not update the jump table and recommit it again. In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun. But the `early-ret` pass is before `block-placement`, we don't want to run it again. This patch is to do the simple early return to optimize the blocks at the last of `block-placement`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 371177
* [MIR] MIRNamer pass for improving MIR test authoring experience.Puyan Lotfi2019-09-053-0/+79
| | | | | | | | | | | This patch reuses the MIR vreg renamer from the MIRCanonicalizerPass to cleanup names of vregs in a MIR file for MIR test authors. I found it useful when writing a regression test for a globalisel failure I encountered recently and thought it might be useful for other folks as well. Differential Revision: https://reviews.llvm.org/D67209 llvm-svn: 371121
* [globalisel][knownbits] Account for missing type constraintsDaniel Sanders2019-09-051-2/+15
| | | | | | | | | | | | | Now that we look through copies, it's possible to visit registers that have a register class constraint but not a type constraint. Avoid looking through copies when this occurs as the SrcReg won't be able to determine it's bit width or any known bits. Along the same lines, if the initial query is on a register that doesn't have a type constraint then the result is a default-constructed KnownBits, that is, a 1-bit fully-unknown value. llvm-svn: 371116
* Recommit "[AArch64][GlobalISel] Teach AArch64CallLowering to handle basic ↵Jessica Paquette2019-09-051-1/+3
| | | | | | | | | | | | | | | | | | sibling calls" Recommit basic sibling call lowering (https://reviews.llvm.org/D67189) The issue was that if you have a return type other than void, call lowering will emit COPYs to get the return value after the call. Disallow sibling calls other than ones that return void for now. Also proactively disable swifterror tail calls for now, since there's a similar issue with COPYs there. Update call-translator-tail-call.ll to include test cases for each of these things. llvm-svn: 371114
* [IfConversion] Fix diamond conversion with unanalyzable branches.Eli Friedman2019-09-051-2/+8
| | | | | | | | | | | | | | | | | | The code was incorrectly counting the number of identical instructions, and therefore tried to predicate an instruction which should not have been predicated. This could have various effects: a compiler crash, an assembler failure, a miscompile, or just generating an extra, unnecessary instruction. Instead of depending on TargetInstrInfo::removeBranch, which only works on analyzable branches, just remove all branch instructions. Fixes https://bugs.llvm.org/show_bug.cgi?id=43121 and https://bugs.llvm.org/show_bug.cgi?id=41121 . Differential Revision: https://reviews.llvm.org/D67203 llvm-svn: 371111
* [Alignment][NFC] Change internal representation of TargetLowering.hGuillaume Chatelet2019-09-051-4/+0
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67226 llvm-svn: 371082
* Revert rL370996 from llvm/trunk: [AArch64][GlobalISel] Teach ↵Simon Pilgrim2019-09-051-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | AArch64CallLowering to handle basic sibling calls This adds support for basic sibling call lowering in AArch64. The intent here is to only handle tail calls which do not change the ABI (hence, sibling calls.) At this point, it is very restricted. It does not handle - Vararg calls. - Calls with outgoing arguments. - Calls whose calling conventions differ from the caller's calling convention. - Tail/sibling calls with BTI enabled. This patch adds - `AArch64CallLowering::isEligibleForTailCallOptimization`, which is equivalent to the same function in AArch64ISelLowering.cpp (albeit with the restrictions above.) - `mayTailCallThisCC` and `canGuaranteeTCO`, which are identical to those in AArch64ISelLowering.cpp. - `getCallOpcode`, which is exactly what it sounds like. Tail/sibling calls are lowered by checking if they pass target-independent tail call positioning checks, and checking if they satisfy `isEligibleForTailCallOptimization`. If they do, then a tail call instruction is emitted instead of a normal call. If we have a sibling call (which is always the case in this patch), then we do not emit any stack adjustment operations. When we go to lower a return, we check if we've already emitted a tail call. If so, then we skip the return lowering. For testing, this patch - Adds call-translator-tail-call.ll to test which tail calls we currently lower, which ones we don't, and which ones we shouldn't. - Updates branch-target-enforcement-indirect-calls.ll to show that we fall back as expected. Differential Revision: https://reviews.llvm.org/D67189 ........ This fails on EXPENSIVE_CHECKS builds due to a -verify-machineinstrs test failure in CodeGen/AArch64/dllimport.ll llvm-svn: 371051
* [LLVM][Alignment] Make functions using log of alignment explicitGuillaume Chatelet2019-09-0511-44/+47
| | | | | | | | | | | | | | | | | | | | | Summary: This patch renames functions that takes or returns alignment as log2, this patch will help with the transition to llvm::Align. The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment. A few renames uncovered dubious assignments: - `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were using deal with log2(align). This patch fixes it and updates the documentation. - `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments, internally these values are interpreted as log2(align). This patch updates the documentation, - `MachineFunctionexposes` exposes `align-all-functions` also interpreted as power of two alignment, internally this value is interpreted as log2(align). This patch updates the documentation, Reviewers: lattner, thegameg, courbet Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet Tags: #llvm Differential Revision: https://reviews.llvm.org/D65945 llvm-svn: 371045
* [AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling callsJessica Paquette2019-09-041-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for basic sibling call lowering in AArch64. The intent here is to only handle tail calls which do not change the ABI (hence, sibling calls.) At this point, it is very restricted. It does not handle - Vararg calls. - Calls with outgoing arguments. - Calls whose calling conventions differ from the caller's calling convention. - Tail/sibling calls with BTI enabled. This patch adds - `AArch64CallLowering::isEligibleForTailCallOptimization`, which is equivalent to the same function in AArch64ISelLowering.cpp (albeit with the restrictions above.) - `mayTailCallThisCC` and `canGuaranteeTCO`, which are identical to those in AArch64ISelLowering.cpp. - `getCallOpcode`, which is exactly what it sounds like. Tail/sibling calls are lowered by checking if they pass target-independent tail call positioning checks, and checking if they satisfy `isEligibleForTailCallOptimization`. If they do, then a tail call instruction is emitted instead of a normal call. If we have a sibling call (which is always the case in this patch), then we do not emit any stack adjustment operations. When we go to lower a return, we check if we've already emitted a tail call. If so, then we skip the return lowering. For testing, this patch - Adds call-translator-tail-call.ll to test which tail calls we currently lower, which ones we don't, and which ones we shouldn't. - Updates branch-target-enforcement-indirect-calls.ll to show that we fall back as expected. Differential Revision: https://reviews.llvm.org/D67189 llvm-svn: 370996
* [mir-canon][NFC] Move MIR vreg renaming code to separate file for better reuse.Puyan Lotfi2019-09-044-337/+436
| | | | | | | | | | | | | Moving MIRCanonicalizerPass vreg renaming code to MIRVRegNamerUtils so that it can be reused in another pass (ie planing to write a standalone mir-namer pass). I'm going to write a mir-namer pass so that next time someone has to author a test in MIR, they can use it to cleanup the naming and make it more readable by having the numbered vregs swapped out with named vregs. Differential Revision: https://reviews.llvm.org/D67114 llvm-svn: 370985
* GlobalISel: Add basic legalization for G_BITREVERSEMatt Arsenault2019-09-041-0/+19
| | | | llvm-svn: 370979
* [globalisel] Support trivial COPY in GISelKnownBitsDaniel Sanders2019-09-041-0/+13
| | | | | | | | | | | | | | Summary: Allow GISelKnownBits to look through the trivial case of TargetOpcode::COPY Reviewers: aditya_nandakumar Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67131 llvm-svn: 370955
* Update CodeGen to use hasMetadata as appropriate [NFC]Philip Reames2019-09-042-12/+11
| | | | | | My intial grepping for rL370933 missed a directory worth of cases. llvm-svn: 370942
* GlobalISel: Add G_BITREVERSEMatt Arsenault2019-09-041-0/+2
| | | | | | This is the first failing pattern for AMDGPU and is trivial to handle. llvm-svn: 370927
* [ModuloSchedule] Fix no-asserts buildJames Molloy2019-09-041-5/+8
| | | | | | Apologies, due to a git SNAFU this fix (dump doesn't exist and silence unused variables) stayed in my index rather than applying to rL370893. llvm-svn: 370894
* [ModuloSchedule] Introduce PeelingModuloScheduleExpanderJames Molloy2019-09-042-6/+478
| | | | | | | | | | | | | | | This is the beginnings of a reimplementation of ModuloScheduleExpander. It works by generating a single-block correct pipelined kernel and then peeling out the prolog and epilogs. This patch implements kernel generation as well as a validator that will confirm the number of phis added is the same as the ModuloScheduleExpander. Prolog and epilog peeling will come in a different patch. Differential Revision: https://reviews.llvm.org/D67081 llvm-svn: 370893
* [DebugInfo] LiveDebugValues: locations with different exprs should not be mergedJeremy Morse2019-09-041-7/+17
| | | | | | | | | | | | | | | | | When comparing variable locations, LiveDebugValues currently considers only the machine location, ignoring any DIExpression applied to it. This is a problem because that DIExpression can do pretty much anything to the machine location, for example dereferencing it. This patch adds DIExpressions to that comparison; now variables based on the same register/memory-location but with different expressions will compare differently, and be dropped if we attempt to merge them between blocks. This reduces variable coverage-range a little, but only because we were producing broken locations. Differential Revision: https://reviews.llvm.org/D66942 llvm-svn: 370877
* [LiveDebugValues][NFC] Silence an unused variable warningJeremy Morse2019-09-041-0/+1
| | | | | | | | On release builds, 'MI' isn't used by anything (it's already inserted into a block by BuildMI), while on non-release builds it's used by a LLVM_DEBUG statement. Mark as explicitly used to avoid the warning. llvm-svn: 370870
* [GlobalISel] Fix G_SEXT narrowScalar to bail out of unsupported type ↵Amara Emerson2019-09-041-3/+7
| | | | | | | | | | | combination. Similar to the issue with G_ZEXT that was fixed earlier, this is a quick to fall back if the source type is not exactly half of the dest type. Fixes the clang-cmake-aarch64-lld bot build. llvm-svn: 370847
* [AArch64][GlobalISel] Legalize 128 bit divisions to libcalls.Amara Emerson2019-09-031-4/+22
| | | | | | | | | Now that we have the infrastructure to support s128 types as parameters we can expand these to libcalls. Differential Revision: https://reviews.llvm.org/D66185 llvm-svn: 370823
* [GlobalISel][CallLowering] Add support for splitting types according to ↵Amara Emerson2019-09-033-37/+157
| | | | | | | | | | | | | | calling conventions. On AArch64, s128 types have to be split into s64 GPRs when passed as arguments. This change adds the generic support in call lowering for dealing with multiple registers, for incoming and outgoing args. Support for splitting for return types not yet implemented. Differential Revision: https://reviews.llvm.org/D66180 llvm-svn: 370822
* [CodeGen] Use FSHR in DAGTypeLegalizer::ExpandIntRes_MULFIXBjorn Pettersson2019-09-031-49/+19
| | | | | | | | | | | | | | | | | | | | | | | Summary: Simplify the right shift of the intermediate result (given in four parts) by using funnel shift. There are some impact on lit tests, but that seems to be related to register allocation differences due to how FSHR is expanded on X86 (giving a slightly different operand order for the OR operations compared to the old code). Reviewers: leonardchan, RKSimon, spatel, lebedev.ri Reviewed By: RKSimon Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, s.egerton, pzheng, bevinh, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67036 llvm-svn: 370813
* [MachinePipeliner] Add a way to unit-test the schedule emitterJames Molloy2019-09-033-0/+128
| | | | | | | | | | | | | | | | | | Emitting a schedule is really hard. There are lots of corner cases to take care of; in fact, of the 60+ SWP-specific testcases in the Hexagon backend most of those are testing codegen rather than the schedule creation itself. One issue is that to test an emission corner case we must craft an input such that the generated schedule uses that corner case; sometimes this is very hard and convolutes testcases. Other times it is impossible but we want to test it anyway. This patch adds a simple test pass that will consume a module containing a loop and generate pipelined code from it. We use post-instr-symbols as a way to annotate instructions with the stage and cycle that we want to schedule them at. We also provide a flag that causes the MachinePipeliner to generate these annotations instead of actually emitting code; this allows us to generate an input testcase with: llc < %s -stop-after=pipeliner -pipeliner-annotate-for-testing -o test.mir And run the emission in isolation with: llc < test.mir -run-pass=modulo-schedule-test llvm-svn: 370705
OpenPOWER on IntegriCloud