path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
* Pull out VectorNumElements value. NFC.Nirav Dave2017-08-011-13/+9
| | | | llvm-svn: 309719
* Revert "[DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector."Nirav Dave2017-08-011-26/+11
| | | | | | | This reverts commit r309680 which appears to be raising an assertion in the test-suite. llvm-svn: 309717
* [CGP] use narrower types in memcmp expansion when possibleSanjay Patel2017-08-011-1/+6
| | | | | | | | This only affects very small memcmp on x86 for now, but it will become more important if we allow vector-sized load and compares. llvm-svn: 309711
* [DAG] Convert extload check to equivalent type check. NFC.Nirav Dave2017-08-011-5/+10
| | | | | | Replace check with check that consuming store has the same type. llvm-svn: 309708
* [DAG] Move extload check in store merge. NFC.Nirav Dave2017-08-011-5/+3
| | | | | | Move candidate check from later check to initial candidate check. llvm-svn: 309698
* [X86] Fix a crash in FEntryInserter Pass.Manoj Gupta2017-08-011-3/+1
| | | | | | | | | | | | | | | | | | | | | Summary: FEntryInserter pass unconditionally derefs the first Instruction in the first Basic Block. The pass crashes when the first BasicBlock is empty. Fix the crash by not dereferencing the basic Block iterator. This fixes an issue observed when building Linux kernel 4.4 with clang. Fixes PR33971. Reviewers: hfinkel, niravd, dblaikie Reviewed By: niravd Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D35979 llvm-svn: 309694
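A minimal sketch of the idea behind this fix, using a standard container as a stand-in for a basic block. The `Block` alias and `insertAtEntry` function are hypothetical names for illustration, not LLVM API. The point is that inserting at the `begin()` iterator is valid for an empty block, whereas dereferencing `begin()` is not.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for a basic block: an ordered list of instructions.
using Block = std::list<std::string>;

// Insert an instruction at the start of the block. The iterator returned by
// begin() is only used as a position and is never dereferenced, so this is
// safe even when the block is empty.
void insertAtEntry(Block &B, const std::string &Inst) {
  B.insert(B.begin(), Inst);
  assert(B.front() == Inst && "new instruction must be first in the block");
}
```

Writing `*B.begin()` before the insert would be undefined behavior for an empty block, which is the class of crash the commit message describes.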
* DebugInfo: Update flag description that'd been copypasted from anotherDavid Blaikie2017-08-011-1/+1
| | | | | | Post-commit review feedback from Paul Robinson on r309630. Thanks Paul! llvm-svn: 309685
* [DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector.Nirav Dave2017-08-011-11/+26
| | | | | | | | | | | | | | | Summary: Allow SCALAR_TO_VECTOR of EXTRACT_VECTOR_ELT to reduce to EXTRACT_SUBVECTOR of vector shuffle when output is smaller. Marginally improves vector shuffle computations. Reviewers: efriedma, RKSimon, spatel Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D35566 llvm-svn: 309680
* Support itineraries in TargetSubtargetInfo::getSchedInfoStr - Now if the ↵Andrew V. Tischenko2017-08-011-3/+10
| | | | | | | | given instr does not have a sched model then we try to calculate the latency/throughput with the help of itineraries. Differential Revision https://reviews.llvm.org/D35997 llvm-svn: 309666
* [StackColoring] Update AliasAnalysis information in stack coloring passHiroshi Inoue2017-08-013-68/+67
| | | | | | | | | | | | | | | The stack coloring pass needs to maintain AliasAnalysis information when merging stack slots of different types. There is in fact a FIXME comment in StackColoring.cpp: // FIXME: In order to enable the use of TBAA when using AA in CodeGen, // we'll also need to update the TBAA nodes in MMOs with values // derived from the merged allocas. However, TBAA has already been enabled in CodeGen without fixing this pass. The incorrect TBAA metadata caused recent failures in the bootstrap test on ppc64le (PR33928) by allowing unsafe instruction scheduling. Although we observed the problem on ppc64le, this is a platform-neutral issue. This patch makes the stack coloring pass maintain AliasAnalysis information when merging multiple stack slots. llvm-svn: 309651
* [ScheduleDAG] Don't schedule node with physical register interferenceEli Friedman2017-08-011-25/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | https://reviews.llvm.org/D31536 didn't really solve the problem it was trying to solve; it got rid of the assertion failure, but we were still scheduling the DAG incorrectly (mixing together instructions from different calls), leading to a MachineVerifier failure. In order to schedule the DAG correctly, we have to make sure we don't schedule a node which should be blocked by an interference. Fix ScheduleDAGRRList::PickNodeToScheduleBottomUp so it doesn't pick a node like that. The added call to FindAvailableNode() is the key change here; this makes sure we don't try to schedule a call while we're in the middle of scheduling a different call. I'm not sure this is the right approach; in particular, I'm not sure how to prove we don't end up with an infinite loop of repeatedly backtracking. This also reverts the code change from D31536. It doesn't do anything useful: we should never schedule an ADJCALLSTACKDOWN unless we've already scheduled the corresponding ADJCALLSTACKUP. Differential Revision: https://reviews.llvm.org/D33818 llvm-svn: 309642
* DebugInfo: Put range base specifier entry functionality behind a flagDavid Blaikie2017-07-311-4/+9
| | | | | | | | Chromium's gold build seems to have trouble with this (gold produces errors) - not sure if it's gold that's not coping with the valid representation, or a bug in the implementation in LLVM, etc. llvm-svn: 309630
* [codeview] Ignore DBG_VALUEs when choosing a BB start source locReid Kleckner2017-07-311-0/+2
| | | | | | | | | | When the first instruction of a basic block has no location (consider a LEA materializing the address of an alloca for a call), we want to start the line table for the block with the first valid source location in the block. We need to ignore DBG_VALUE instructions during this scan to get decent line tables. llvm-svn: 309628
* [TargetPassConfig] Feature generic options to setup start/stop-after/beforeQuentin Colombet2017-07-312-21/+107
| | | | | | | | | | | | | | This patch refactors the code used in llc such that all the users of the addPassesToEmitFile API have access to a homogeneous way of handling start/stop-after/before options right out of the box. In particular, just invoking addPassesToEmitFile will set the proper pipeline without additional effort (modulo parsing a .mir file if the start-before/after options are used). NFC. Differential Revision: https://reviews.llvm.org/D30913 llvm-svn: 309599
* [CGP] use subtract or subtract-of-cmps for result of memcmp expansionSanjay Patel2017-07-311-6/+18
| | | | | | | | | | | | | | | | | | | | | | As noted in the code comment, transforming this in the other direction might require a separate transform here in CGP given the block-at-a-time DAG constraint. Besides that theoretical motivation, there are 2 practical motivations for the subtract-of-cmps form: 1. The codegen for both x86 and PPC is better for this IR (though PPC could be better still). There is discussion about canonicalizing IR to the select form ( http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html ), so we probably need to add DAG transforms for those patterns anyway, but this improves the memcmp output without waiting for that step. 2. If we allow vector-sized chunks for the load and compare, x86 is better prepared to convert that to optimal code when using subtract-of-cmps, so another prerequisite patch is avoided if we choose to enable that. Differential Revision: https://reviews.llvm.org/D34904 llvm-svn: 309597
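The two result forms named in this commit can be sketched in plain C++ as follows. This is an illustration of the arithmetic only, not the IR that CodeGenPrepare emits. The function names are hypothetical, and the sketch assumes the loaded values have already been arranged so that integer comparison agrees with byte-wise comparison.

```cpp
#include <cstdint>

// Subtract-of-compares form: produces -1, 0, or 1 without a branch or select.
int subtractOfCmps(uint32_t A, uint32_t B) {
  return static_cast<int>(A > B) - static_cast<int>(A < B);
}

// Plain subtract form: usable when both inputs are narrow enough that the
// widened difference cannot overflow, for example 16-bit loads extended to int.
// Only the sign of the result matters to a memcmp caller.
int plainSubtract(uint16_t A, uint16_t B) {
  return static_cast<int>(A) - static_cast<int>(B);
}
```

Both forms return a value whose sign matches the ordering of the inputs, which is all that memcmp guarantees.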
* [GISel]: Support Widening G_ICMP's destination operand.Aditya Nandakumar2017-07-311-9/+45
| | | | | | | | | Updated AArch64 to widen destination to s32. https://reviews.llvm.org/D35737 Reviewed by Tim llvm-svn: 309579
* Extend ifndef to printDebugLoc.Florian Hahn2017-07-311-1/+1
| | | | | | GCC7 did not warn about that, but Clang does. llvm-svn: 309573
* Extend ifdefs to more unused helper functions.Florian Hahn2017-07-311-1/+1
| | | | | | This fixes a buildbot failure with -Werror introduced by r309553 llvm-svn: 309572
* [SelectionDAG][mips] Fix PR33883Simon Dardis2017-07-311-15/+24
| | | | | | | | | | | | | | | PR33883 shows that calls to intrinsic functions should not have their vector arguments or returns subject to ABI changes required by the target. This resolves PR33883. Thanks to Alex Crichton for reporting the issue! Reviewers: zoran.jovanovic, atanasyan Differential Revision: https://reviews.llvm.org/D35765 llvm-svn: 309561
* Guard print() functions only used by dump() functions.Florian Hahn2017-07-314-2/+6
| | | | | | | | | | | | | | | | | Summary: Since r293359, most dump() functions are only defined when `!defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)` holds. print() functions only used by dump() functions are now unused in release builds, generating lots of warnings. This patch only defines some print() functions if they are used. Reviewers: MatzeB Reviewed By: MatzeB Subscribers: arsenm, mzolotukhin, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D35949 llvm-svn: 309553
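The guard pattern described here can be sketched as below. The preprocessor condition is the one quoted in the commit message. The `Node` type and `describe` helper are hypothetical, and `dump()` returns a string here (rather than writing to a debug stream) only to keep the sketch self-contained.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical type with a print() helper that exists only to serve dump().
struct Node {
  int Id = 0;

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
  // Both functions sit under the same guard, so a release build never sees
  // an unused print() and emits no unused-function warning.
  void print(std::ostream &OS) const { OS << "Node#" << Id; }
  std::string dump() const {
    std::ostringstream OS;
    print(OS);
    return OS.str();
  }
#endif
};

// Callers must use the same guard when they reach for dump().
std::string describe(const Node &N) {
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
  return N.dump();
#else
  (void)N;
  return std::string();
#endif
}
```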
* DebugInfo: Fix r309526, ensure resetting base address selection entries are usedDavid Blaikie2017-07-311-0/+6
| | | | | | | Missed the resetting base address selections when going from a base address version to zero base address for non-base-addressed entries. llvm-svn: 309529
* DebugInfo: Use base address selection entries in debug_ranges to reduce ↵David Blaikie2017-07-301-10/+35
| | | | | | | | | | | | | | | | | | | | relocations (from comments in the test) Group ranges in a range list that apply to the same section and use a base address selection entry to reduce the number of relocations to one reloc per section per range list. DWARF5 debug_rnglist will be more efficient than this in terms of relocations, but it's still better than one reloc per entry in a range list. This is an object/executable size tradeoff - shrinking objects, but growing the linked executable. In one large binary tested, total object size (not just debug info) shrank by 16%, entirely relocation entries. Linked executable grew by 4%. This was with compressed debug info in the objects, uncompressed in the linked executable. Without compression in the objects, the win would be smaller (the growth of debug_ranges itself would be more significant). llvm-svn: 309526
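A rough sketch of the grouping described above, assuming 64-bit addresses and the pre-DWARF5 range list encoding, in which a pair whose first word is all ones selects a new base address and a pair of zeros ends the list. The `Range`, `Entry`, `buildRangeList`, and `countRelocations` names are hypothetical, and picking the lowest start offset in each section as the base is a simplification made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

struct Range { unsigned Section; uint64_t Begin, End; };  // section-relative
struct Entry { uint64_t First, Second; bool NeedsReloc; };

constexpr uint64_t BaseSelector = ~0ULL;  // marks a base address selection entry

std::vector<Entry> buildRangeList(const std::vector<Range> &Ranges) {
  // Group the ranges by the section they belong to.
  std::map<unsigned, std::vector<Range>> BySection;
  for (const Range &R : Ranges)
    BySection[R.Section].push_back(R);

  std::vector<Entry> Out;
  for (const auto &KV : BySection) {
    const std::vector<Range> &Group = KV.second;
    uint64_t Base = Group.front().Begin;
    for (const Range &R : Group)
      if (R.Begin < Base)
        Base = R.Begin;
    // One relocated entry per section sets the base address...
    Out.push_back({BaseSelector, Base, true});
    // ...and every range after it is a plain offset pair with no relocation.
    for (const Range &R : Group)
      Out.push_back({R.Begin - Base, R.End - Base, false});
  }
  Out.push_back({0, 0, false});  // end-of-list marker
  return Out;
}

std::size_t countRelocations(const std::vector<Entry> &List) {
  std::size_t N = 0;
  for (const Entry &E : List)
    N += E.NeedsReloc ? 1 : 0;
  return N;
}
```

With three ranges spread over two sections, this produces two relocated entries where a list of absolute address pairs would relocate every entry.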
* [SelectionDAG][X86] CombineBT - more aggressively determine demanded bitsSimon Pilgrim2017-07-291-0/+12
| | | | | | | | | | | | This patch is in 2 parts: 1 - replace combineBT's use of SimplifyDemandedBits (hasOneUse only) with SelectionDAG::GetDemandedBits to more aggressively determine the lower bits used by BT. 2 - update SelectionDAG::GetDemandedBits to support ANY_EXTEND - if the demanded bits are only in the non-extended portion, then peek through and demand from the source value and then ANY_EXTEND that if we found a match. Differential Revision: https://reviews.llvm.org/D35896 llvm-svn: 309486
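The ANY_EXTEND condition in part 2 amounts to a simple mask test, sketched here on plain integers. The function name is hypothetical and the sketch covers only source widths of 1 to 63 bits; the real code works on arbitrary-width values.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when every demanded bit lies in the low SrcBits bits, that is,
// in the portion of the value that came from the narrower source. In that
// case the extended bits are never read and the source can be used directly.
bool demandedOnlyInSource(uint64_t DemandedMask, unsigned SrcBits) {
  assert(SrcBits > 0 && SrcBits < 64 && "sketch handles 1..63 source bits");
  return (DemandedMask >> SrcBits) == 0;
}
```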
* [MachineOutliner] NFC: Change IsTailCall to a call class + frame classJessica Paquette2017-07-291-37/+50
| | | | | | | | | | | | | | | | | | | | | This commit - Removes IsTailCall and replaces it with a target-defined unsigned - Refactors getOutliningCallOverhead and getOutliningFrameOverhead so that they don't use IsTailCall - Adds a call class + frame class classification to OutlinedFunction and Candidate respectively This accomplishes a couple things. Firstly, we don't need the notion of *tail call* in the general outlining algorithm. Secondly, we now can have different "outlining classes" for each candidate within a set of candidates. This will make it easy to add new ways to outline sequences for certain targets and dynamically choose an appropriate cost model for a sequence depending on the context that that sequence lives in. Ultimately, this should get us closer to being able to do something like, say avoid saving the link register when outlining AArch64 instructions. llvm-svn: 309475
* Remove the unused offset field from LiveDebugValues (NFC)Adrian Prantl2017-07-281-16/+3
| | | | | | | Followup to r309426. rdar://problem/33580047 llvm-svn: 309455
* Remove the unused offset field from LiveDebugVariables (NFC)Adrian Prantl2017-07-281-17/+14
| | | | | | | Followup to r309426. rdar://problem/33580047 llvm-svn: 309451
* Remove the unused offset from DBG_VALUE (NFC)Adrian Prantl2017-07-286-21/+20
| | | | | | | Followup to r309426. rdar://problem/33580047 llvm-svn: 309450
* Remove the unused DBG_VALUE offset parameter from GlobalISel (NFC)Adrian Prantl2017-07-282-9/+7
| | | | | | | Followup to r309426. rdar://problem/33580047 llvm-svn: 309449
* Remove the unused DBG_VALUE offset parameter from RegAllocFast (NFC)Adrian Prantl2017-07-281-2/+4
| | | | | | | Followup to r309426. rdar://problem/33580047 llvm-svn: 309446
* Remove the unused dbg.value offset from SelectionDAG (NFC)Adrian Prantl2017-07-285-63/+43
| | | | | | | Followup to r309426. rdar://problem/33580047 llvm-svn: 309436
* Remove the obsolete offset parameter from @llvm.dbg.valueAdrian Prantl2017-07-283-17/+12
| | | | | | | | | | | | There is no situation where this rarely-used argument cannot be substituted with a DIExpression and removing it allows us to simplify the DWARF backend. Note that this patch does not yet remove any of the newly dead code. rdar://problem/33580047 Differential Revision: https://reviews.llvm.org/D35951 llvm-svn: 309426
* Fix conditional tail call branch folding when both edges are the sameReid Kleckner2017-07-281-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The conditional tail call logic did the wrong thing when both destinations of a conditional branch were the same: BB#1: derived from LLVM BB %entry Live Ins: %EFLAGS Predecessors according to CFG: BB#0 JE_1 <BB#5>, %EFLAGS<imp-use,kill> JMP_1 <BB#5> BB#5: derived from LLVM BB %sw.epilog Predecessors according to CFG: BB#1 TCRETURNdi64 <ga:@mergeable_conditional_tailcall>, 0, ... We would fold the JE_1 to a TCRETURNdi64cc, and then remove our BB#5 successor. Then BB#5 would be deleted as it had no predecessors, leaving a dangling "JMP_1 <BB#5>" reference behind to cause assertions later. This patch checks that both conditional branch destinations are different before doing the transform. The standard branch folding logic is able to remove both the JMP_1 and the JE_1, and for my test case we end up forming a better conditional tail call later. Fixes PR33980 llvm-svn: 309422
* [MachineOutliner] NFC: Comment tidyingJessica Paquette2017-07-281-23/+1
| | | | | | | The comment describing the suffix tree contained some out-of-date pruning notes. Also fixed some typos. llvm-svn: 309365
* [MachineOutliner] NFC: Split up getOutliningBenefitJessica Paquette2017-07-281-21/+62
| | | | | | | | | | | | | | | | | | | | | This is some more cleanup in preparation for some actual functional changes. This splits getOutliningBenefit into two cost functions: getOutliningCallOverhead and getOutliningFrameOverhead. These functions return the number of instructions that would be required to call a specific function and the number of instructions that would be required to construct a frame for a specific function. The actual outlining benefit logic is moved into the outliner, which calls these functions. The goal of refactoring getOutliningBenefit is to: - Get us closer to getting rid of the IsTailCall flag - Further split up "target-specific" things and "general algorithm" things llvm-svn: 309356
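One plausible way to combine the two overheads into a benefit figure is sketched below. The commit message does not give the formula, so the arithmetic here is an assumption for illustration, and `OutliningCosts` and `outliningBenefit` are hypothetical names. All quantities are instruction counts.

```cpp
#include <cstddef>

// Hypothetical cost inputs, all measured in instructions.
struct OutliningCosts {
  std::size_t SequenceLength;  // length of the repeated sequence
  std::size_t Occurrences;     // how many times it appears
  std::size_t CallOverhead;    // cost to call the outlined function
  std::size_t FrameOverhead;   // cost to build the outlined function's frame
};

// Assumed model: compare the size of keeping every copy inline with the size
// of one outlined body plus a call at each site. Returns 0 when outlining
// would not shrink the code.
std::size_t outliningBenefit(const OutliningCosts &C) {
  std::size_t NotOutlined = C.SequenceLength * C.Occurrences;
  std::size_t Outlined =
      C.CallOverhead * C.Occurrences + C.SequenceLength + C.FrameOverhead;
  return NotOutlined > Outlined ? NotOutlined - Outlined : 0;
}
```

Keeping the two overheads as separate inputs mirrors the split described above: a target supplies the numbers, and the general algorithm decides what to do with them.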
* DebugInfo: Consider a CU containing only local imported entities to be 'empty'David Blaikie2017-07-281-11/+14
| | | | | | | | | | | | | | | | | | | | | | | This can come up in ThinLTO & wastes space & makes degenerate IR. As per the added FIXME, ultimately, local imported entities should hang off the function and that way the imported entity list on the CU can be tested for emptiness like all the other CU lists. (function-attached local imported entities are probably also the best path forward for fixing how imported entities are handled both in cross-module use (currently, while ThinLTO preserves the imported entities, they would not get used at the imported inlined location - only in the abstract origin that appears in the partial CU created by the import (which isn't emitted under Fission due to cross-CU limitations there)) and to reduce the number of points where imported entities are emitted (they're currently emitted into every inlined instance, concrete instance, and abstract origin - they should only go in the abstract origin if there is one, otherwise in the concrete instance - but this requires lots of delayed handling and wiring up, same as abstract variables & subprograms)) llvm-svn: 309354
* [MachineOutliner] Cleanup: move findCandidates out of suffix treeJessica Paquette2017-07-271-204/+166
| | | | | | | | | | | | Doing some cleanup in preparation for some functional changes. This commit moves findCandidates out of the suffix tree and into the MachineOutliner class. This is much easier to follow, and removes the burden of candidate choice from the suffix tree. It also adds a couple FIXMEs and simplifies building outlined function names. llvm-svn: 309334
* [SelectionDAG] Improve DAGTypeLegalizer::convertMask assertion (PR33960)Simon Pilgrim2017-07-271-12/+9
| | | | | | Improve DAGTypeLegalizer::convertMask's isSETCCorConvertedSETCC assertion to properly check for any mixture of SETCC or BUILD_VECTOR of constants, or a logical mask op of them. llvm-svn: 309302
* [OptRemark] Allow streaming of 64-bit integersAdam Nemet2017-07-271-1/+1
| | | | llvm-svn: 309293
* [SelectionDAG] Avoid repeated calls to getNumOperands in for loop. NFCI.Simon Pilgrim2017-07-271-1/+1
| | | | llvm-svn: 309283
* remove redundant checkAdrian Prantl2017-07-271-1/+1
| | | | llvm-svn: 309280
* [SelectionDAG] Tidyup mask creation. NFCI.Simon Pilgrim2017-07-271-6/+3
| | | | | | Assign all concat elements to UNDEF and then just replace the first element, instead of copying everything individually. llvm-svn: 309277
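The tidy-up described here follows a common fill-then-patch pattern, sketched below with integers standing in for DAG values. `Undef` and `buildConcatOperands` are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int Undef = -1;  // hypothetical marker for an undefined operand

// Build the operand list in two steps: fill every slot with Undef, then
// overwrite the first slot. This replaces a loop that set each slot one by one.
std::vector<int> buildConcatOperands(int First, std::size_t NumOperands) {
  assert(NumOperands > 0 && "need at least one operand slot");
  std::vector<int> Ops(NumOperands, Undef);
  Ops[0] = First;
  return Ops;
}
```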
* DebugInfo: Ensure imported entities at the top level of an inlined function ↵David Blaikie2017-07-273-17/+18
| | | | | | | | | | | | | | | | | | | | | | don't cause degenerate concrete definitions Local imported entities at the top level of a subprogram were being handled differently from those in nested scopes - that different handling would cause pseudo concrete out-of-line definitions to be created (but without any of their attributes, nor an abstract_origin) in the case where there was no real concrete definition. These local imported entities also only appeared in the concrete definition, whereas imported entities in nested scopes appear in all cases (abstract, concrete, and inlined). This change at least makes the top-level case handled the same as the others - though there's a FIXME to improve this to /only/ emit them into the abstract origin (though this requires more plumbing - like the abstract subprogram and variable handling that must defer population until the end of the unit to discover if there is an abstract origin, or only a standalone concrete definition). llvm-svn: 309237
* Change CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. ↵Peter Collingbourne2017-07-261-3/+2
| | | | | | | | NFCI. This was a use-after-free waiting to happen. llvm-svn: 309159
* This patch returns proper value to indicate the case when instruction ↵Andrew V. Tischenko2017-07-261-20/+24
| | | | | | | | throughput can't be calculated. Differential revision https://reviews.llvm.org/D35831 llvm-svn: 309156
* Do a better job at emitting prefabricated skeleton CUs.Adrian Prantl2017-07-261-4/+14
| | | | | | | | | | | | | | | | | | | | This is a better fix than r308708 for the problem introduced in r304020. It restores the skeleton CU testcases modified by that commit to their original form and most importantly ensures that frontend-generated skeleton CUs (such as used to point to Clang modules) come after the regular CUs. This broke for DICompileUnit nodes that don't have any immediate children because they are now constructed lazily instead of the order in which they are listed in !llvm.dbg.cu. After this commit we still don't guarantee that order, but we do guarantee that empty skeletons come last. Shipping versions of LLDB are very sensitive to the ordering of CUs. I'll track a fix for LLDB to be more permissive separately. This fixes a test failure in the LLDB testsuite. rdar://problem/33357252 llvm-svn: 309154
* DAGCombiner: Extend reduceBuildVecToTrunc to handle non-zero offsetZvi Rackover2017-07-261-12/+32
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Adding support for combining power2-strided build_vector's where the first build_vector's operand is extracted from a non-zero index. Example: v4i32 build_vector((extract_elt V, 1), (extract_elt V, 3), (extract_elt V, 5), (extract_elt V, 7)) --> v4i32 truncate (bitcast (shuffle<1,u,3,u,5,u,7,u> V, u) to v4i64) Reviewers: delena, RKSimon, guyblank Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35700 llvm-svn: 309108
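The shuffle mask in the example above can be derived mechanically from the stride and the starting index. The sketch below does that on plain integers, with -1 standing for an undefined lane (written `u` in the commit message). `buildStridedMask` is a hypothetical helper, not the DAGCombiner code.

```cpp
#include <cassert>
#include <vector>

// Build a shuffle mask that moves source elements Offset, Offset + Stride,
// Offset + 2 * Stride, ... into lanes 0, Stride, 2 * Stride, ... and leaves
// every other lane undefined (-1). A later bitcast to wider elements and a
// truncate then keep one lane out of each group of Stride lanes.
std::vector<int> buildStridedMask(unsigned NumSrcElts, unsigned Stride,
                                  unsigned Offset) {
  assert(Stride > 0 && Offset < Stride && "offset must fall within one stride");
  std::vector<int> Mask(NumSrcElts, -1);
  for (unsigned I = 0; I + Offset < NumSrcElts; I += Stride)
    Mask[I] = static_cast<int>(I + Offset);
  return Mask;
}
```

For an 8-element source with stride 2 and offset 1, this yields the mask `<1,u,3,u,5,u,7,u>` from the example; with offset 0 it reduces to the zero-offset case.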
* Debug Info: Support fragmented variables in the MMI side tableAdrian Prantl2017-07-251-1/+6
| | | | | | This reapplies commit r309034 with a bugfix+test for inlined variables. llvm-svn: 309057
* Revert "Debug Info: Support fragmented variables in the MMI side table"Adrian Prantl2017-07-251-6/+1
| | | | | | This reverts commit r309034 because of a sanitizer issue. llvm-svn: 309035
* Debug Info: Support fragmented variables in the MMI side tableAdrian Prantl2017-07-251-1/+6
| | | | | | <rdar://problem/17816343> llvm-svn: 309034
* [DAG] Move DAGCombiner::GetDemandedBits to SelectionDAGSimon Pilgrim2017-07-252-62/+58
| | | | | | | | This patch moves the DAGCombiner::GetDemandedBits function to SelectionDAG::GetDemandedBits as a first step towards making it easier for targets to get to the source of any demanded bits without the limitations of SimplifyDemandedBits. Differential Revision: https://reviews.llvm.org/D35841 llvm-svn: 308983