summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* [PowerPC] Support fp128 libcallsFangrui Song2019-07-151-0/+28
| | | | | | | | | | | | | On PowerPC, IEEE 754 quadruple-precision libcall names use "kf" instead of "tf". In libgcc, libgcc/config/rs6000/float128-sed converts TF names to KF names. This patch implements its 24 substitution rules. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D64282 llvm-svn: 366039
* [DebugInfo] Add column info for inline sitesJonas Devlieghere2019-07-121-0/+1
| | | | | | | | | | | | The column field is missing for all inline sites, currently it's always zero. This changes populates DW_AT_call_column field for inline sites. Test case modified to cover this change. Patch by: Wenlei He Differential revision: https://reviews.llvm.org/D64033 llvm-svn: 365945
* Delete dead storesFangrui Song2019-07-124-15/+3
| | | | llvm-svn: 365903
* [DAGCombine] narrowExtractedVectorBinOp - wrap subvector extraction in ↵Simon Pilgrim2019-07-121-9/+11
| | | | | | | | helper. NFCI. First step towards supporting 'free' subvector extractions other than concat_vectors. llvm-svn: 365896
* Revert "[DwarfDebug] Dump call site debug info"Djordje Todorovic2019-07-1210-393/+43
| | | | | | | | A build failure was found on the SystemZ platform. This reverts commit 9e7e73578e54cd22b3c7af4b54274d743b6607cc. llvm-svn: 365886
* [MachinePipeliner] Fix order for nodes with Anti dependence in same cycleJinsong Ji2019-07-121-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Problem exposed in PowerPC functional testing. We did not consider Anti dependence for nodes in same cycle, so we may end up generating bad machine code. eg: the reduced test won't verify. *** Bad machine code: Using an undefined physical register *** - function: lame_encode_buffer_interleaved - basic block: %bb.4 (0x4bde4e12928) - instruction: %29:gprc = ADDZE %27:gprc, implicit-def dead $carry, implicit $carry - operand 3: implicit $carry Reviewers: bcahoon, kparzysz, hfinkel Subscribers: MaskRay, wuzish, nemanjai, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64192 llvm-svn: 365859
* [DAGCombine] narrowInsertExtractVectorBinOp - add CONCAT_VECTORS supportSimon Pilgrim2019-07-111-4/+14
| | | | | | | | | | | | | | We already split extract_subvector(binop(insert_subvector(v,x),insert_subvector(w,y))) -> binop(x,y). This patch adds support for extract_subvector(binop(concat_vectors(),concat_vectors())) cases as well. In particular this means we don't have to wait for X86 lowering to convert concat_vectors to insert_subvector chains, which helps avoid some cases where demandedelts/combine calls occur too late to split large vector ops. The fast-isel-store.ll load folding regression is annoying but I don't think is that critical. Differential Revision: https://reviews.llvm.org/D63653 llvm-svn: 365785
* RegUsageInfoCollector: Skip calling conventions I missed beforeMatt Arsenault2019-07-111-0/+3
| | | | llvm-svn: 365784
* GlobalISel: Use RegisterMatt Arsenault2019-07-111-5/+5
| | | | llvm-svn: 365780
* OpaquePtr: switch to GlobalValue::getValueType in a few places. NFC.Tim Northover2019-07-111-2/+2
| | | | llvm-svn: 365770
* OpaquePtr: use byval accessor instead of inspecting pointer type. NFC.Tim Northover2019-07-111-3/+2
| | | | | | | The accessor can deal with both "byval(ty)" and "ty* byval" forms seamlessly. llvm-svn: 365769
* [SDAG] commute setcc operands to match a subtractSanjay Patel2019-07-101-0/+11
| | | | | | | | | | | | | | | | | | | If we have: R = sub X, Y P = cmp Y, X ...then flipping the operands in the compare instruction can allow using a subtract that sets compare flags. Motivated by diffs in D58875 - not sure if this changes anything there, but this seems like a good thing independent of that. There's a more involved version of this transform already in IR (in instcombine although that seems misplaced to me) - see "swapMayExposeCSEOpportunities()". Differential Revision: https://reviews.llvm.org/D63958 llvm-svn: 365711
* [AArch64][GlobalISel] Optimize compare and branch cases with G_INTTOPTR and ↵Amara Emerson2019-07-101-0/+3
| | | | | | | | | | | | | | | | | | | | unknown values. Since we have distinct types for pointers and scalars, G_INTTOPTRs can sometimes obstruct attempts to find constant source values. These usually come about when try to do some kind of null pointer check. Teaching getConstantVRegValWithLookThrough about this operation allows the CBZ/CBNZ optimization to catch more cases. This change also improves the case where we can't find a constant source at all. Previously we would emit a cmp, cset and tbnz for that. Now we try to just emit a cmp and conditional branch, saving an instruction. The cumulative code size improvement of this change plus D64354 is 5.5% geomean on arm64 CTMark -O0. Differential Revision: https://reviews.llvm.org/D64377 llvm-svn: 365690
* Move three folds for FADD, FSUB and FMUL in the DAG combiner away from ↵Michael Berg2019-07-101-4/+4
| | | | | | | | | | | | | | | | Unsafe to more aligned checks that reflect context Summary: Unsafe does not map well alone for each of these three cases as it is missing NoNan context when accessed directly with clang. I have migrated the fold guards to reflect the expectations of handing nan and zero contexts directly (NoNan, NSZ) and some tests with it. Unsafe does include NSZ, however there is already precedent for using the target option directly to reflect that context. Reviewers: spatel, wristow, hfinkel, craig.topper, arsenm Reviewed By: arsenm Subscribers: michele.scandale, wdng, javed.absar Differential Revision: https://reviews.llvm.org/D64450 llvm-svn: 365679
* [TargetLowering] support BlockAddress as "i" inline asm constraintNick Desaulniers2019-07-101-0/+7
| | | | | | | | | | | | | | | | | | | | Summary: This allows passing address of labels to inline assembly "i" input constraints. Fixes pr/42502. Reviewers: ostannard Reviewed By: ostannard Subscribers: void, echristo, nathanchance, ostannard, javed.absar, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D64167 llvm-svn: 365664
* GlobalISel: Legalization for G_FMINNUM/G_FMAXNUMMatt Arsenault2019-07-102-0/+71
| | | | llvm-svn: 365658
* GlobalISel: Define the full family of FP min/max instructionsMatt Arsenault2019-07-101-0/+8
| | | | llvm-svn: 365657
* [DAGCombine] visitINSERT_SUBVECTOR - use uint64_t subvector index. NFCI.Simon Pilgrim2019-07-101-1/+1
| | | | | | Keep the uint64_t type from getZExtValue() to stop truncation/extension overflow warnings in MSVC in subvector index math. llvm-svn: 365621
* Fix const/non-const lambda return type warning. NFCI.Simon Pilgrim2019-07-101-1/+1
| | | | llvm-svn: 365613
* GlobalISel: Implement lower for G_FCOPYSIGNMatt Arsenault2019-07-091-0/+50
| | | | | | | | | In SelectionDAG AMDGPU treated these as legal, but this was mostly because the bitcasts required for FP types were painful. Theoretically the bitpattern should eventually match to bfi, so don't bother trying to get the patterns to import. llvm-svn: 365583
* GlobalISel: Combine unmerge of merge with intermediate castMatt Arsenault2019-07-091-3/+9
| | | | | | | This eliminates some illegal intermediate vectors when operations are scalarized. llvm-svn: 365566
* [X86][AMDGPU][DAGCombiner] Move call to allowsMemoryAccess into ↵Craig Topper2019-07-091-15/+8
| | | | | | | | | | | | | | | | isLoadBitCastBeneficial/isStoreBitCastBeneficial to allow X86 to bypass it Basically the problem is that X86 doesn't set the Fast flag from allowsMemoryAccess on certain CPUs due to slow unaligned memory subtarget features. This prevents bitcasts from being folded into loads and stores. But all vector loads and stores of the same width are the same cost on X86. This patch merges the allowsMemoryAccess call into isLoadBitCastBeneficial to allow X86 to skip it. Differential Revision: https://reviews.llvm.org/D64295 llvm-svn: 365549
* Revert "[HardwareLoops] NFC - move hardware loop checking code to ↵Jinsong Ji2019-07-091-12/+33
| | | | | | | | isHardwareLoopProfitable()" This reverts commit d95557306585404893d610784edb3e32f1bfce18. llvm-svn: 365520
* [AArch64][GlobalISel] Optimize conditional branches followed by ↵Amara Emerson2019-07-091-0/+62
| | | | | | | | | | | | | | unconditional branches If we have an icmp->brcond->br sequence where the brcond just branches to the next block jumping over the br, while the br takes the false edge, then we can modify the conditional branch to jump to the br's target while inverting the condition of the incoming icmp. This means we can eliminate the br as an unconditional branch to the fallthrough block. Differential Revision: https://reviews.llvm.org/D64354 llvm-svn: 365510
* [DAGCombine] LoadedSlice - keep getOffsetFromBase() uint64_t offset. NFCI.Simon Pilgrim2019-07-091-1/+1
| | | | | | Keep the uint64_t type from getOffsetFromBase() to stop truncation/extension overflow warnings in MSVC in alignment math. llvm-svn: 365504
* [HardwareLoops] NFC - move hardware loop checking code to ↵Chen Zheng2019-07-091-33/+12
| | | | | | | | isHardwareLoopProfitable() Differential Revision: https://reviews.llvm.org/D64197 llvm-svn: 365497
* [MIPS GlobalISel] Register bank select for G_PHI. Select i64 phiPetar Avramovic2019-07-091-0/+28
| | | | | | | | | | | | | | | Select gprb or fprb when def/use register operand of G_PHI is used/defined by either: copy to/from physical register or instruction with only one mapping available for that use/def operand. Integer s64 phi is handled with narrowScalar when mapping is applied, produced artifacts are combined away. Manually set gprb to all register operands of instructions created during narrowScalar. Differential Revision: https://reviews.llvm.org/D64351 llvm-svn: 365494
* [CodeGen] AccelTable - remove non-constexpr (MSVC) Atom defsSimon Pilgrim2019-07-091-20/+0
| | | | | | Now that we've dropped VS2015 support (D64326) we can enable the constexpr variables on MSVC builds as VS2017+ correctly handles them llvm-svn: 365477
* [NFC][AsmPrinter] Fix the formatting for the rL365467Djordje Todorovic2019-07-092-23/+21
| | | | | | | In addition, fix the build failure for the 'unused' variable. The variable was used inside the 'LLVM_DEBUG()'. llvm-svn: 365469
* OpaquePtr: add Type parameter to Loads analysis API.Tim Northover2019-07-091-2/+4
| | | | | | | | | | | | | This makes the functions in Loads.h require a type to be specified independently of the pointer Value so that when pointers have no structure other than address-space, it can still do its job. Most callers had an obvious memory operation handy to provide this type, but a SROA and ArgumentPromotion were doing more complicated analysis. They get updated to merge the properties of the various instructions they were considering. llvm-svn: 365468
* [DwarfDebug] Dump call site debug infoDjordje Todorovic2019-07-0910-38/+390
| | | | | | | | | | | | | | | | | | | Dump the DWARF information about call sites and call site parameters into debug info sections. The patch also provides an interface for the interpretation of instructions that could load values of a call site parameters in order to generate DWARF about the call site parameters. ([13/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D60716 llvm-svn: 365467
* [SelectionDAG] Simplify some calls to getSetCCResultType. NFCBjorn Pettersson2019-07-093-8/+4
| | | | | | | | DAGTypeLegalizer and SelectionDAGLegalize has helper functions wrapping the call to TLI.getSetCCResultType(...). Use those helpers in more places. llvm-svn: 365456
* [LegalizeTypes] Fix saturation bug for smul.fix.satBjorn Pettersson2019-07-091-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Make sure we use SETGE instead of SETGT when checking if the sign bit is zero at SMULFIXSAT expansion. The faulty expansion occured when doing "expand" of SMULFIXSAT and the scale was exactly matching the size of the smaller type. For example doing i64 Z = SMULFIXSAT X, Y, 32 and expanding X/Y/Z into using two i32 values. The problem was that we sometimes did not saturate to min when overflowing. Here is an example using Q3.4 numbers: Consider that we are multiplying X and Y. X = 0x80 (-8.0 as Q3.4) Y = 0x20 (2.0 as Q3.4) To avoid loss of precision we do a widening multiplication, getting a 16 bit result Z = 0xF000 (-16.0 as Q7.8) To detect negative overflow we should check if the five most significant bits in Z are less than -1. Assume that we name the 4 most significant bits as HH and the next 4 bits as HL. Then we can do the check by examining if (HH < -1) or (HH == -1 && "sign bit in HL is zero"). The fault was that we have been doing the check as (HH < -1) or (HH == -1 && HL > 0) instead of (HH < -1) or (HH == -1 && HL >= 0). In our example HH is -1 and HL is 0, so the old code did not trigger saturation and simply truncated the result to 0x00 (0.0). With the bugfix we instead detect that we should saturate to min, and the result will be set to 0x80 (-8.0). Reviewers: leonardchan, bevinh Reviewed By: leonardchan Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64331 llvm-svn: 365455
* Fixing @llvm.memcpy not honoring volatile.Guillaume Chatelet2019-07-091-19/+24
| | | | | | | | | | | | | | | | This is explicitly not addressing target-specific code, or calls to memcpy. Summary: https://bugs.llvm.org/show_bug.cgi?id=42254 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63215 llvm-svn: 365449
* Revert r364515 and r364524Jeremy Morse2019-07-091-85/+2
| | | | | | | Jordan reports on llvm-commits a performance regression with r364515, backing the patch out while it's investigated. llvm-svn: 365448
* Reland "[LiveDebugValues] Emit the debug entry values"Djordje Todorovic2019-07-093-18/+127
| | | | | | | | | | | | | | | | Emit replacements for clobbered parameters location if the parameter has unmodified value throughout the funciton. This is basic scenario where we can use the debug entry values. ([12/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D58042 llvm-svn: 365444
* [MachinePipeliner] Fix Phi refers to Phi in same stage in 1st epilogueJinsong Ji2019-07-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This is exposed by functional testing on PowerPC. In some pipelined loops, Phi refer to phi did not get value defined by the Phi, hence getting wrong value later. As the comment mentioned, we should "use the value defined by the Phi, unless we're generating the firstepilog and the Phi refers to a Phi in a different stage.", so Phi refering to same stage Phi should use the value defined by the Phi here. Reviewers: bcahoon, hfinkel Reviewed By: hfinkel Subscribers: MaskRay, wuzish, nemanjai, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64035 llvm-svn: 365428
* Changing CodeView debug info type record representation in assembly files to ↵Nilanjana Basu2019-07-091-13/+52
| | | | | | make it more human-readable & editable & fixing bug introduced in r364987 llvm-svn: 365417
* Standardize on MSVC behavior for triples with no environmentReid Kleckner2019-07-083-7/+5
| | | | | | | | | | | | | | | | | | | | | Summary: This makes it so that IR files using triples without an environment work out of the box, without normalizing them. Typically, the MSVC behavior is more desirable. For example, it tends to enable things like constant merging, use of associative comdats, etc. Addresses PR42491 Reviewers: compnerd Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64109 llvm-svn: 365387
* RegUsageInfoCollector: Don't iterate all regs for every reg classMatt Arsenault2019-07-081-31/+6
| | | | | | | | | | This is extremly slow on AMDGPU, which has a lot of physical register and a lot of register classes. determineCalleeSaves, via MachineRegisterInfo::isPhysRegUsed already added all of the super registers to the saved set. llvm-svn: 365370
* GlobalISel: Convert some build functions to using SrcOp/DstOpMatt Arsenault2019-07-081-43/+54
| | | | llvm-svn: 365343
* GlobalISel: widenScalar for G_BUILD_VECTORMatt Arsenault2019-07-081-0/+19
| | | | llvm-svn: 365320
* [TargetLowering] SimplifyDemandedBits - just call computeKnownBits for ↵Simon Pilgrim2019-07-081-23/+3
| | | | | | | | | | BUILD_VECTOR cases. Don't do this locally, computeKnownBits does this better (and can handle non-constant cases as well). A next step would be to actually simplify non-constant elements - building on what we already do in SimplifyDemandedVectorElts. llvm-svn: 365309
* [CodeGen] Add larger vector types for i32 and f32David Majnemer2019-07-071-1/+25
| | | | | | | | | | Some out of tree backend require larger vector type. Since maintaining the changes out of tree is difficult due to the many manual changes needed when adding a new type we are adding it even if no backend currently use it. Differential Revision: https://reviews.llvm.org/D64141 Patch by Thomas Raoux! llvm-svn: 365274
* [DAGCombine] convertBuildVecZextToZext - remove duplicate getOpcode() call. ↵Simon Pilgrim2019-07-061-1/+1
| | | | | | NFCI. llvm-svn: 365269
* [RegisterCoalescer] Fix an overzealous assertQuentin Colombet2019-07-061-1/+8
| | | | | | | | | | Although removeCopyByCommutingDef deals with full copies, it is still possible to copy undef lanes and thus, we wouldn't have any a value number for these lanes. This fixes PR40215. llvm-svn: 365256
* RegUsageInfoCollector: Skip AMDGPU entry point functionsMatt Arsenault2019-07-051-6/+42
| | | | | | | | | | | | | | I'm not sure if it's worth it or not to add a hook to disable the pass for an arbitrary function. This pass is taking up to 5% of compile time in tiny programs by iterating through all of the physical registers in every register class. This pass should be rewritten in terms of regunits. For now, skip doing anything for entry point functions. The vast majority of functions in the real world aren't callable, so just not running this will give the majority of the benefit. llvm-svn: 365255
* [CodeGen] Enhance `MachineInstrSpan` to allow the end of MBB to be used.Michael Liao2019-07-051-3/+3
| | | | | | | | | | | | | | | | Summary: - Explicitly specify the parent MBB to allow the end iterator to be used. Reviewers: aprantl, MatzeB, craig.topper, qcolombet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64261 llvm-svn: 365240
* ScheduleDAG: Fix incorrectly killing registers in bundlesMatt Arsenault2019-07-051-6/+5
| | | | | | | | | | When looking for uses/defs to add kill flags, the iterator was double incremented, skipping the first instruction in the bundle. The use register in the first bundle instruction was then incorrectly killed. The "First" instruction should be the BUNDLE itself as the proper reverse iterator endpoint. llvm-svn: 365216
* Revert r365198 as this accidentally commited something thatRobert Lougher2019-07-051-7/+2
| | | | | | should not have been added. llvm-svn: 365199
OpenPOWER on IntegriCloud