| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
llvm-svn: 297755
|
|
|
|
|
|
| |
This fixes llvm.org/PR32265.
llvm-svn: 297745
|
|
|
|
| |
llvm-svn: 297742
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch refactors the PHisToFix loop as follows:
- The loop itself now resides in its own method.
- The new method iterates on scalar-loop's header; the PHIsToFix map formerly
propagated as an output parameter and filled during phi widening is removed.
- The code handling reductions is moved into its own method, similar to the
existing fixFirstOrderRecurrence().
Differential Revision: https://reviews.llvm.org/D30755
llvm-svn: 297740
|
|
|
|
|
|
|
|
|
|
|
| |
This instruction was missing from the list of opcodes that we check, so we were
hitting an llvm_unreachable in ARMMCCodeEmitter.cpp for the ARM MOVT
instruction, rather than the diagnostic that is emitted for the other MOVW/MOVT
instructions.
Differential revision: https://reviews.llvm.org/D30936
llvm-svn: 297739
|
|
|
|
|
|
|
|
|
|
|
|
| |
ARMBaseInstrInfo::isProfitableToIfCvt() [NFC]
Reviewers: congh, rengolin
Subscribers: aemerson, llvm-commits
Differential Revision: https://reviews.llvm.org/D30934
llvm-svn: 297738
|
|
|
|
|
|
|
|
|
|
|
| |
Refactoring Cost Model's selectVectorizationFactor() so that it handles only the
selection of the best VF from a pre-computed range of candidate VF's, extracting
early-exit criteria and the computation of a MaxVF upper-bound to other methods,
all driven by a newly introduced LoopVectorizationPlanner.
Differential Revision: https://reviews.llvm.org/D30653
llvm-svn: 297737
|
|
|
|
| |
llvm-svn: 297729
|
|
|
|
| |
llvm-svn: 297726
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If it is possible for the RHS of a shift operation to be greater than or equal
to the bit-width, then the result might be undef, and we can't report any known
bits.
In some cases, this was allowing a transformation in instcombine which widened
an undef value from i1 to i32, increasing the range of values that a function
could return.
Differential revision: https://reviews.llvm.org/D30781
llvm-svn: 297724
|
|
|
|
|
|
|
|
|
|
|
|
| |
Create nodes for smulwb and smulwt and move their selection from
DAGToDAG to DAG combine. smlawb and smlawt can then be selected
using tablegen. Added some helper functions to detect shift patterns
as well as a wrapper around SimplifyDemandBits. Added a couple of
extra tests.
Differential Revision: https://reviews.llvm.org/D30708
llvm-svn: 297716
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each Calling convention (CC) defines a static list of registers that should be preserved by a callee function. All other registers should be saved by the caller.
Some CCs use additional condition: If the register is used for passing/returning arguments – the caller needs to save it - even if it is part of the Callee Saved Registers (CSR) list.
The current LLVM implementation doesn’t support it. It will save a register if it is part of the static CSR list and will not care if the register is passed/returned by the callee.
The solution is to dynamically allocate the CSR lists (Only for these CCs). The lists will be updated with actual registers that should be saved by the callee.
Since we need the allocated lists to live as long as the function exists, the list should reside inside the Machine Register Info (MRI) which is a property of the Machine Function and managed by it (and has the same life span).
The lists should be saved in the MRI and populated upon LowerCall and LowerFormalArguments.
The patch will also assist to implement future no_caller_saved_regsiters attribute intended for interrupt handler CC.
Differential Revision: https://reviews.llvm.org/D28566
llvm-svn: 297715
|
|
|
|
|
|
| |
extract_subvector/insert_subvector index.
llvm-svn: 297707
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
getIntrinsicInstrCost() used to only compute scalarization cost based on types.
This patch improves this so that the actual arguments are checked when they are
available, in order to handle only unique non-constant operands.
Tests updates:
Analysis/CostModel/X86/arith-fp.ll
Transforms/LoopVectorize/AArch64/interleaved_cost.ll
Transforms/LoopVectorize/ARM/interleaved_cost.ll
The improvement in getOperandsScalarizationOverhead() to differentiate on
constants made it necessary to update the interleaved_cost.ll tests even
though they do not relate to intrinsics.
Review: Hal Finkel
https://reviews.llvm.org/D29540
llvm-svn: 297705
|
|
|
|
|
|
| |
VK1 register into a AH/BH/CH/DH register.
llvm-svn: 297704
|
|
|
|
|
|
|
|
| |
When checking if chain node is foldable, make sure the intermediate nodes have a single use across all results not just the result that was used to reach the chain node.
This recovers a test case that was severely broken by r296476, my making sure we don't create ADD/ADC that loads and stores when there is also a flag dependency.
llvm-svn: 297698
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
enabled.
Recommiting with compiler time improvements
Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.
* Simplify Consecutive Merge Store Candidate Search
Now that address aliasing is much less conservative, push through
simplified store merging search and chain alias analysis which only
checks for parallel stores through the chain subgraph. This is cleaner
as the separation of non-interfering loads/stores from the
store-merging logic.
When merging stores search up the chain through a single load, and
finds all possible stores by looking down from through a load and a
TokenFactor to all stores visited.
This improves the quality of the output SelectionDAG and the output
Codegen (save perhaps for some ARM cases where we correctly constructs
wider loads, but then promotes them to float operations which appear
but requires more expensive constant generation).
Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)
Additional Minor Changes:
1. Finishes removing unused AliasLoad code
2. Unifies the chain aggregation in the merged stores across code
paths
3. Re-add the Store node to the worklist after calling
SimplifyDemandedBits.
4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
arbitrary, but seems sufficient to not cause regressions in
tests.
5. Remove Chain dependencies of Memory operations on CopyfromReg
nodes as these are captured by data dependence
6. Forward loads-store values through tokenfactors containing
{CopyToReg,CopyFromReg} Values.
7. Peephole to convert buildvector of extract_vector_elt to
extract_subvector if possible (see
CodeGen/AArch64/store-merge.ll)
8. Store merging for the ARM target is restricted to 32-bit as
some in some contexts invalid 64-bit operations are being
generated. This can be removed once appropriate checks are
added.
This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.
Many tests required some changes as memory operations are now
reorderable, improving load-store forwarding. One test in
particular is worth noting:
CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
forwarding converts a load-store pair into a parallel store and
a memory-realized bitcast of the same value. However, because we
lose the sharing of the explicit and implicit store values we
must create another local store. A similar transformation
happens before SelectionDAG as well.
Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle
llvm-svn: 297695
|
|
|
|
| |
llvm-svn: 297692
|
|
|
|
| |
llvm-svn: 297690
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r242302. External type refs of this form were
never used by any LLVM frontend so this is effectively dead code.
(They were introduced to support clang module debug info, but in the
end we came up with a better design that doesn't use this feature at
all.)
rdar://problem/25897929
Differential Revision: https://reviews.llvm.org/D30917
llvm-svn: 297684
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This simple optimization has been split out of https://reviews.llvm.org/D30400
Reviewers: efriedma, jmolloy
Subscribers: llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D30829
llvm-svn: 297682
|
|
|
|
|
|
|
|
|
|
| |
Previously, it created a temporary directory and then failed when
FileOutputBuffer tried to rename that file to the destination file
(which is actually a directory name).
Differential Revision: https://reviews.llvm.org/D30912
llvm-svn: 297679
|
|
|
|
|
|
|
|
| |
AH/BH/CH/DH with fastisel.
Fixes PR32256. Still planning to do an audit for other possible cases.
llvm-svn: 297678
|
|
|
|
|
|
|
|
|
| |
DW_AT_specification in a single chain
In a recent refactoring (r291959) this regressed to only following one
or the other, not both, in a single chain.
llvm-svn: 297676
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous algorithm for RegUsageInfoCollector had pretty bad
performance on architectures with a lot of registers that alias
a lot one another, because we potentially iterate for every register
over all the aliasing registers. This costs even more if the function
is small and doesn't define a lot of registers.
This patch changes the algorithm to one that while iterating over
all the registers it will iterate over the aliasing registers only
if the register itself is defined.
This should be faster based on the assumption that only a subset
of the whole LLVM registers set is actually defined in the function.
Differential Revision: https://reviews.llvm.org/D30880
llvm-svn: 297673
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, javed.absar, ab
Reviewed By: qcolombet, dsanders, ab
Subscribers: dberris, rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D30216
llvm-svn: 297670
|
|
|
|
|
|
|
|
|
|
|
|
| |
rL230225 made the assumption that only the lower 32-bits of an MMX register load is used as a shift value, when in fact the whole 64-bits are reloaded and treated as a i64 to determine the shift value.
This patch reverts rL230225 to ensure that the whole 64-bits of memory are folded and ensures that the upper 32-bit are zero'd for cases where the shift value has come from a scalar source.
Found during fuzz testing.
Differential Revision: https://reviews.llvm.org/D30833
llvm-svn: 297667
|
|
|
|
|
|
| |
I am leaving the code in clang which filters mxcsr from the clobber list because that is still technically correct and will be useful again when the MXCSR register is reintroduced.
llvm-svn: 297664
|
|
|
|
| |
llvm-svn: 297662
|
|
|
|
|
|
| |
The issues was just a missing REQUIRES in the test.
llvm-svn: 297661
|
|
|
|
| |
llvm-svn: 297658
|
|
|
|
|
|
|
| |
This reverts commit r297624.
It was failing on the bots.
llvm-svn: 297657
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds tail call support to the MachineOutliner pass. This allows
the outliner to insert jumps rather than calls in areas where tail calling is
possible. Outlined tail calls include the return or terminator of the basic
block being outlined from.
Tail call support allows the outliner to take returns and terminators into
consideration while finding candidates to outline. It also allows the outliner
to save more instructions. For example, in the X86-64 outliner, a tail called
outlined function saves one instruction since no return has to be inserted.
llvm-svn: 297653
|
|
|
|
|
|
|
|
| |
source optimizations to break execution dependencies.
For AVX-512 we force the input to zero if the input is undef or the mask is all ones to break an execution dependency. This patch brings the same behavior to AVX2.
llvm-svn: 297652
|
|
|
|
|
|
|
|
| |
We were already forcing undef inputs to become a zero vector, this now catches an all ones mask too.
Ideally we'd use undef and let execution dep fix handle picking the best register/clearance for the undef, but I don't think it can handle the early clobber today.
llvm-svn: 297651
|
|
|
|
|
|
|
| |
The typical use is a library vote function which
compares to 0. Fold the user condition into the intrinsic.
llvm-svn: 297650
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D30738
llvm-svn: 297649
|
|
|
|
|
|
|
|
| |
and use have it use SmallVectorImpl.
There is nothing specific about allocas in this function.
llvm-svn: 297643
|
|
|
|
|
|
|
|
| |
Previously we could round-trip type records from PDB -> Yaml ->
PDB, but for symbols we could only go from PDB -> Yaml. This
completes the round-tripping for symbols as well.
llvm-svn: 297625
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If raw_fd_ostream is constructed with the path of "-", it claims
ownership of the stdout file descriptor. This means that it closes
stdout when it is destroyed. If there are multiple users of
raw_fd_ostream wrapped around stdout, then a crash can occur because
of operations on a closed stream.
An example of this would be running something like "clang -S -o - -MD
-MF - test.cpp". Alternatively, using outs() (which creates a local
version of raw_fd_stream to stdout) anywhere combined with such a
stream usage would cause the crash.
The fix duplicates the stdout file descriptor when used within
raw_fd_ostream, so that only that particular descriptor is closed when
the stream is destroyed.
Patch by James Henderson!
llvm-svn: 297624
|
|
|
|
|
|
|
| |
We used to hit an unreachable in getRegBankFromRegClass when dealing with the
stack pointer. This commit adds support for the GPRsp reg class.
llvm-svn: 297621
|
|
|
|
|
|
| |
http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/49970
llvm-svn: 297618
|
|
|
|
|
|
|
|
| |
sys::fs::permissions to set them.
Patch by James Henderson.
llvm-svn: 297617
|
|
|
|
| |
llvm-svn: 297611
|
|
|
|
|
|
|
|
| |
This commit is a follow-up on r297580. It fixes the FIXME added temporarily
by that commit to keep the removal of Unroller's specialized version of
scalarizeInstruction() an NFC. See https://reviews.llvm.org/D30715 for details.
llvm-svn: 297610
|
|
|
|
|
|
|
|
|
| |
Loop over the ARM decode tables; this is a clean-up to reduce some code
duplication.
Differential Revision: https://reviews.llvm.org/D30814
llvm-svn: 297608
|
|
|
|
|
|
| |
they can be correctly matched by EVEX2VEX table generation.
llvm-svn: 297601
|
|
|
|
| |
llvm-svn: 297600
|
|
|
|
| |
llvm-svn: 297599
|
|
|
|
| |
llvm-svn: 297594
|