| Commit message (Collapse) | Author | Age | Files | Lines | 
| ... |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Reduced version of D26357 - based on the discussion on llvm-dev about canonicalization of UMIN/UMAX/SMIN/SMAX as well as ABS I've reduced that patch to just the ABS ISD node (with x86/sse support) to improve basic combines and lowering.
ARM/AArch64, Hexagon, PowerPC and NVPTX all have similar instructions allowing us to make this a generic opcode and move away from the hard coded tablegen patterns which makes it tricky to match more complex patterns.
At the moment this patch doesn't attempt legalization as we only create an ABS node if its legal/custom.
Differential Revision: https://reviews.llvm.org/D29639
llvm-svn: 297780
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
This is the backend counterpart to:
https://reviews.llvm.org/rL297390
https://reviews.llvm.org/rL297409
and follow-up to:
https://reviews.llvm.org/rL297384
It surprised me that we need to duplicate the check in FoldConstantArithmetic and FoldConstantVectorArithmetic, 
but one or the other doesn't catch all of the test cases. There is an existing code comment about merging those 
someday.
Differential Revision: https://reviews.llvm.org/D30826
llvm-svn: 297762
 | 
| | 
| 
| 
|  | 
llvm-svn: 297729
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Create nodes for smulwb and smulwt and move their selection from
DAGToDAG to DAG combine. smlawb and smlawt can then be selected
using tablegen. Added some helper functions to detect shift patterns
as well as a wrapper around SimplifyDemandBits. Added a couple of
extra tests.
Differential Revision: https://reviews.llvm.org/D30708
llvm-svn: 297716
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Each Calling convention (CC) defines a static list of registers that should be preserved by a callee function. All other registers should be saved by the caller.
Some CCs use additional condition: If the register is used for passing/returning arguments – the caller needs to save it - even if it is part of the Callee Saved Registers (CSR) list.
The current LLVM implementation doesn’t support it. It will save a register if it is part of the static CSR list and will not care if the register is passed/returned by the callee.
The solution is to dynamically allocate the CSR lists (Only for these CCs). The lists will be updated with actual registers that should be saved by the callee.
Since we need the allocated lists to live as long as the function exists, the list should reside inside the Machine Register Info (MRI) which is a property of the Machine Function and managed by it (and has the same life span).
The lists should be saved in the MRI and populated upon LowerCall and LowerFormalArguments.
The patch will also assist to implement future no_caller_saved_regsiters attribute intended for interrupt handler CC.
Differential Revision: https://reviews.llvm.org/D28566
llvm-svn: 297715
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
When checking if chain node is foldable, make sure the intermediate nodes have a single use across all results not just the result that was used to reach the chain node.
This recovers a test case that was severely broken by r296476, my making sure we don't create ADD/ADC that loads and stores when there is also a flag dependency.
llvm-svn: 297698
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
enabled.
    Recommiting with compiler time improvements
    Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.
    * Simplify Consecutive Merge Store Candidate Search
    Now that address aliasing is much less conservative, push through
    simplified store merging search and chain alias analysis which only
    checks for parallel stores through the chain subgraph. This is cleaner
    as the separation of non-interfering loads/stores from the
    store-merging logic.
    When merging stores search up the chain through a single load, and
    finds all possible stores by looking down from through a load and a
    TokenFactor to all stores visited.
    This improves the quality of the output SelectionDAG and the output
    Codegen (save perhaps for some ARM cases where we correctly constructs
    wider loads, but then promotes them to float operations which appear
    but requires more expensive constant generation).
    Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)
    Additional Minor Changes:
      1. Finishes removing unused AliasLoad code
      2. Unifies the chain aggregation in the merged stores across code
         paths
      3. Re-add the Store node to the worklist after calling
         SimplifyDemandedBits.
      4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
         arbitrary, but seems sufficient to not cause regressions in
         tests.
      5. Remove Chain dependencies of Memory operations on CopyfromReg
         nodes as these are captured by data dependence
      6. Forward loads-store values through tokenfactors containing
          {CopyToReg,CopyFromReg} Values.
      7. Peephole to convert buildvector of extract_vector_elt to
         extract_subvector if possible (see
         CodeGen/AArch64/store-merge.ll)
      8. Store merging for the ARM target is restricted to 32-bit as
         some in some contexts invalid 64-bit operations are being
         generated. This can be removed once appropriate checks are
         added.
    This finishes the change Matt Arsenault started in r246307 and
    jyknight's original patch.
    Many tests required some changes as memory operations are now
    reorderable, improving load-store forwarding. One test in
    particular is worth noting:
      CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
      forwarding converts a load-store pair into a parallel store and
      a memory-realized bitcast of the same value. However, because we
      lose the sharing of the explicit and implicit store values we
      must create another local store. A similar transformation
      happens before SelectionDAG as well.
    Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle
llvm-svn: 297695
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
This reverts commit r242302. External type refs of this form were
never used by any LLVM frontend so this is effectively dead code.
(They were introduced to support clang module debug info, but in the
end we came up with a better design that doesn't use this feature at
all.)
rdar://problem/25897929
Differential Revision: https://reviews.llvm.org/D30917
llvm-svn: 297684
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
The previous algorithm for RegUsageInfoCollector had pretty bad
performance on architectures with a lot of registers that alias
a lot one another, because we potentially iterate for every register
over all the aliasing registers. This costs even more if the function
is small and doesn't define a lot of registers.
This patch changes the algorithm to one that while iterating over
all the registers it will iterate over the aliasing registers only
if the register itself is defined.
This should be faster based on the assumption that only a subset
of the whole LLVM registers set is actually defined in the function.
Differential Revision: https://reviews.llvm.org/D30880
llvm-svn: 297673
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, javed.absar, ab
Reviewed By: qcolombet, dsanders, ab
Subscribers: dberris, rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30216
llvm-svn: 297670
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
This commit adds tail call support to the MachineOutliner pass. This allows
the outliner to insert jumps rather than calls in areas where tail calling is
possible. Outlined tail calls include the return or terminator of the basic
block being outlined from.
Tail call support allows the outliner to take returns and terminators into
consideration while finding candidates to outline. It also allows the outliner
to save more instructions. For example, in the X86-64 outliner, a tail called
outlined function saves one instruction since no return has to be inserted.
llvm-svn: 297653
 | 
| | 
| 
| 
|  | 
llvm-svn: 297560
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary: As per title.
Reviewers: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30836
llvm-svn: 297559
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
We don't need to check whether the fallback path is enabled to return
false. Just do that all the time on error cases, the caller knows (or
at least should know!) how to handle the failing case.
llvm-svn: 297535
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
The problem can occur in presence of subregs. If we are swapping two
instructions defining different subregs of the same register we will
get a new liveout from a block. We need to preserve value number for
block's liveout for successor block's livein to match.
Differential Revision: https://reviews.llvm.org/D30558
llvm-svn: 297534
 | 
| | 
| 
| 
|  | 
llvm-svn: 297529
 | 
| | 
| 
| 
| 
| 
|  | 
'!A || (A && B)' is equivalent to '!A || B'
llvm-svn: 297527
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary: No test case as none of the in-tree targets with GlobalISel support has this condition.
Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, ab
Reviewed By: qcolombet
Subscribers: dberris, rovka, kristof.beyls, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D30786
llvm-svn: 297512
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, javed.absar
Reviewed By: qcolombet
Subscribers: dberris, rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30259
llvm-svn: 297509
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, javed.absar
Reviewed By: qcolombet
Subscribers: dberris, rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30761
llvm-svn: 297495
 | 
| | 
| 
| 
|  | 
llvm-svn: 297492
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary:
We don’t actually use LegalizerInfo in Legalizer pass, it’s just passed
as an argument.
In order to check if an instruction is legal or not, we need to get LegalizerInfo
by calling `MI.getParent()->getParent()->getSubtarget().getLegalizerInfo()`.
Instead, make LegalizerInfo accessible in LegalizerHelper.
Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, kristof.beyls
Reviewed By: qcolombet
Subscribers: dberris, llvm-commits, rovka
Differential Revision: https://reviews.llvm.org/D30838
llvm-svn: 297491
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
and SUBC.
Summary:
Depends on D30379
This improves the state of things for the sub class of operation.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30436
llvm-svn: 297482
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary: As per title. This is extracted from D29872 and I threw SADDO in.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30379
llvm-svn: 297479
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
We currently have to insert bits via a temporary variable of the same size as the target with various shift/mask stages, resulting in further temporary variables, all of which require the allocation of memory for large APInts (MaskSizeInBits > 64).
This is another of the compile time issues identified in PR32037 (see also D30265).
This patch adds the APInt::insertBits() helper method which avoids the temporary memory allocation and masks/inserts the raw bits directly into the target.
Differential Revision: https://reviews.llvm.org/D30780
llvm-svn: 297458
 | 
| | 
| 
| 
| 
| 
|  | 
ImmutableCallSite abstracts away CallInst and InvokeInst. Use it!
llvm-svn: 297426
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
We unintentionally stopped falling back in r293670.
While there, change an unusual construct.
llvm-svn: 297425
 | 
| | 
| 
| 
| 
| 
|  | 
They're used for nefarious purposes by ObjC.
llvm-svn: 297422
 | 
| | 
| 
| 
| 
| 
|  | 
Differential Revision: https://reviews.llvm.org/D30598
llvm-svn: 297421
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary: This essentially does the same transform as for ADC.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30417
llvm-svn: 297416
 | 
| | 
| 
| 
| 
| 
| 
|  | 
Amongst other things (I expect) this is necessary to ensure decent backtraces
when an "unreachable" is involved.
llvm-svn: 297413
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
The good reason to do this is that static allocas are pretty simple to handle
(especially at -O0) and avoiding tracking DBG_VALUEs throughout the pipeline
should give some kind of performance benefit.
The bad reason is that the debug pipeline is an unholy mess of implicit
contracts, where determining whether "DBG_VALUE %reg, imm" actually implies a
load or not involves the services of at least 3 soothsayers and the sacrifice
of at least one chicken.  And it still gets it wrong if the variable is at SP
directly.
llvm-svn: 297410
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary: This essentially does the same transform as for SUBC.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30437
llvm-svn: 297404
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
As discussed in the review thread for rL297026, this is actually 2 changes that 
would independently fix all of the test cases in the patch:
1. Return undef in FoldConstantArithmetic for div/rem by 0.
2. Move basic undef simplifications for div/rem (simplifyDivRem()) before 
   foldBinopIntoSelect() as a matter of efficiency.
I will handle the case of vectors with any zero element as a follow-up. That change
is the DAG sibling for D30665 + adding a check of vector elements to FoldConstantVectorArithmetic().
I'm deleting the test for PR30693 because it does not test for the actual bug any more
(dangers of using bugpoint).
Differential Revision:
https://reviews.llvm.org/D30741
llvm-svn: 297384
 | 
| | 
| 
| 
| 
| 
| 
|  | 
With this, it shows up as an attribute in YAML and non-printable characters
are properly removed by GlobalValue::getRealLinkageName.
llvm-svn: 297362
 | 
| | 
| 
| 
|  | 
llvm-svn: 297354
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
pointer and reference types
Differential Revision: https://reviews.llvm.org/D29670
llvm-svn: 297320
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
This commit changes the BumpPtrAllocator for suffix tree nodes to a SpecificBumpPtrAllocator.
Before, node construction was leaking memory because of the DenseMap in SuffixTreeNodes.
Changing this to a SpecificBumpPtrAllocator allows this memory to properly be released.
llvm-svn: 297319
 | 
| | 
| 
| 
| 
| 
| 
|  | 
It makes sense to only do them once in IRTranslator rather than making everyone
deal with them.
llvm-svn: 297304
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary: rL297171 introduced G_FNEG for floating-point negation instruction and IRTranslator started to translate `FSUB -0.0, X` to `FNEG X`. This patch adds a default action for G_FNEG to avoid breaking existing targets.
Reviewers: qcolombet, ab, kristof.beyls, t.p.northover, aditya_nandakumar, dsanders
Reviewed By: qcolombet
Subscribers: dberris, rovka, llvm-commits
Differential Revision: https://reviews.llvm.org/D30721
llvm-svn: 297301
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
|  | 
This helps in cases involving bitfields where an AND is exposed by
legalization.
Differential Revision: https://reviews.llvm.org/D30472
llvm-svn: 297249
 | 
| | 
| 
| 
| 
| 
|  | 
Differential Revision: https://reviews.llvm.org/D29672
llvm-svn: 297247
 | 
| | 
| 
| 
| 
| 
|  | 
It was added between my build+test and my commit.
llvm-svn: 297244
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
object that knows how to generate it.
Summary:
This will allow future patches to inspect the details of the LLT. The implementation is now split between
the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns.
Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem.
The problem with the previous commit appears to have been that TableGen was including CodeGen/LowLevelType.h instead of Support/LowLevelTypeImpl.h.
Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar
Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30046
llvm-svn: 297241
 | 
| | 
| 
| 
|  | 
llvm-svn: 297237
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
We were calculating incorrect extract/insert offsets by trying to be too
tricksy with min/max. It's clearer to just split the logic up into "register
starts before this segment" vs "after".
llvm-svn: 297226
 | 
| | 
| 
| 
|  | 
llvm-svn: 297222
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
|  | 
Some intrinsics take metadata parameters.  These all need custom
handling of some form, and cannot possibly be lowered generically to
G_INTRINSIC calls with vreg operands.
Reject them, instead of hitting an assert later in getOrCreateVReg.
llvm-svn: 297209
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
When we translate a no-op (same type) bitcast, we try to be clever and
only emit a COPY if we already assigned a vreg to the defined value.
However, when we didn't, we tried to assign to a reference into the
ValToVReg DenseMap, even though the RHS of the assignment
(getOrCreateVReg) could potentially grow that DenseMap, invalidating the
reference.
Avoid that by getting the source vreg first.
I audited the rest of the translator; this is the only tricky case.
The test is quite unwieldy, as the problem is caused by the DenseMap
growing, which happens after the 47th mapped value.
llvm-svn: 297208
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
For vector operands, the `select` instruction supports both vector and
non-vector conditions.  The MIR builder had an overly restrictive
assertion, that only accepted vector conditions for vector selects
(in effect implementing ISD::VSELECT).
Make it possible to express the full range of G_SELECTs.
llvm-svn: 297207
 |