| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 258103
|
| |
|
|
|
|
| |
Also remove an executable bit introduced by r258083.
llvm-svn: 258101
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is a companion patch for http://reviews.llvm.org/D16124.
Internalized symbols increase the size of strongly-connected components in
SCC-based module splitting and thus reduce the amount of parallelism. This
patch records the original linkage of non-local symbols prior to
internalization and then restores it just before splitting/CodeGen. This is
also useful for cases where the linker requires symbols to remain external, for
instance, so they can be placed according to linker script rules.
It's currently under its own flag (-restore-globals) but should eventually
share a common flag with D16124.
Reviewers: joker.eph, pcc
Subscribers: slarin, llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D16229
llvm-svn: 258100
|
| |
|
|
| |
llvm-svn: 258099
|
| |
|
|
| |
llvm-svn: 258096
|
| |
|
|
|
|
|
|
|
|
| |
This breaks the tests that were meant for testing
64-bit inline immediates, so move those to shl where
they won't be broken up.
This should be repeated for the other related bit ops.
llvm-svn: 258095
|
| |
|
|
|
|
| |
AFAICT, these have been unused since the initial backend import.
llvm-svn: 258093
|
| |
|
|
|
|
|
| |
Reduce 64-bit shl with constant > 32. We already special cased
this for the == 32 case, but this also works for any >= 32 constant.
llvm-svn: 258092
|
| |
|
|
|
|
| |
64-bit shifts are very slow on some subtargets.
llvm-svn: 258090
|
| |
|
|
| |
llvm-svn: 258088
|
| |
|
|
| |
llvm-svn: 258085
|
| |
|
|
| |
llvm-svn: 258084
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
globalize any local variables.
Summary:
Currently llvm::SplitModule as the first step globalizes all local objects, which might not be desirable in some scenarios.
This change adds a new flag to llvm::SplitModule that uses SCC approach to search for a balanced partition without the need to externalize symbols.
Such partition might not be possible or fully balanced for a given number of partitions, and is a function of the module properties (global/local dependencies within the module).
Joint development Tobias Edler von Koch (tobias@codeaurora.org) and Sergei Larin (slarin@codeaurora.org)
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D16124
llvm-svn: 258083
|
| |
|
|
|
|
|
|
| |
AVX2 can only broadcast from the zero'th element of a vector, but if the broadcastable element is the zero'th element of a 128-bit subvector its advantageous to extract the subvector, broadcast from that and avoid the loading of shuffle mask data that would be needed for VPERMPS/VPERMD. The only exception being when the source type is 4f64 or 4i64 which can directly use the immediate shuffle VPERMPD/VPERMQ directly.
Differential Revision: http://reviews.llvm.org/D16050
llvm-svn: 258081
|
| |
|
|
| |
llvm-svn: 258076
|
| |
|
|
|
|
|
| |
- Allow any instruction to define equality between registers.
- Keep the DFG updated.
llvm-svn: 258075
|
| |
|
|
| |
llvm-svn: 258074
|
| |
|
|
| |
llvm-svn: 258073
|
| |
|
|
|
|
| |
BinOpInit and TernOpInit. This remove the memory needed to store the key for the DenseMap. NFC
llvm-svn: 258071
|
| |
|
|
| |
llvm-svn: 258068
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When SimplifySetCC sees a setcc node that compares the result of a
value extension operation with a constant, it tries to simplify the
setcc node by eliminating the extension and shrinking the constant.
If shrinking the inputs to setcc is deemed not desirable by the target
(e.g. the target does not want a setcc comparing i1 values), then it
is still possible to optimize this sequence in some cases.
This patch adds the following combines to SimplifySetCC when shrinking setcc
inputs is not desirable:
(setcc ([sz]ext (setcc x, y, cc)), 0, setne) -> (setcc (x, y, cc))
(setcc ([sz]ext (setcc x, y, cc)), 0, seteq) -> (setcc (x, Y, !cc))
There are no tests for this yet, but once AMDGPU correctly implements
TargetLowering::isTypeDesirableForOp(), this new combine will be
exercised by the existing CodeGen/AMDGPU/setcc-opt.ll test.
Reviewers: resistor, arsenm
Subscribers: jroelofs, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15034
llvm-svn: 258067
|
| |
|
|
|
|
| |
vector. This removes the state needed to manage the extra vector thus reducing the size of the Record class. NFC
llvm-svn: 258065
|
| |
|
|
|
|
| |
BitsInit/ListInit object itself. Saves a bit of memory. NFC
llvm-svn: 258063
|
| |
|
|
| |
llvm-svn: 258062
|
| |
|
|
| |
llvm-svn: 258059
|
| |
|
|
| |
llvm-svn: 258058
|
| |
|
|
| |
llvm-svn: 258057
|
| |
|
|
|
|
|
|
| |
Implemented intrinsic for the follow instructions (store) : VMOVDQU8/16/32/64, VMOVDQA32/64, VMOVAPS/PD, VMOVUPS/PD.
Differential Revision: http://reviews.llvm.org/D16271
llvm-svn: 258047
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16289
llvm-svn: 258046
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
instruction.
code example , previous implementation.
movzbl %dil, %eax
kmovw %eax, %k0
new code
kmovw %edi, %k0
Differential Revision: http://reviews.llvm.org/D16287
llvm-svn: 258045
|
| |
|
|
|
|
|
|
|
| |
When the shift immediate is zero, PKHTB is an alias for PKHBT, but the order of
the input operands needs to be swapped.
Differential Revision: http://reviews.llvm.org/D16288
llvm-svn: 258044
|
| |
|
|
|
|
|
|
|
| |
Fixing wrong typo (avx515) → (avx512)
Review over the shoulder by asaf .
Differential Revision: http://reviews.llvm.org/D16190
llvm-svn: 258041
|
| |
|
|
|
|
|
|
| |
The symtab is logically referenced beyond the call to the create
method. This changes makes sure its lifetime matches that of
the reader.
llvm-svn: 258036
|
| |
|
|
| |
llvm-svn: 258035
|
| |
|
|
|
|
| |
in address space.
llvm-svn: 258029
|
| |
|
|
|
|
|
|
|
|
|
|
| |
getType()->getPointerElementType().
Reviewers: mjacob
Subscribers: llvm-commits, dblaikie
Differential Revision: http://reviews.llvm.org/D16272
llvm-svn: 258028
|
| |
|
|
| |
llvm-svn: 258027
|
| |
|
|
| |
llvm-svn: 258026
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
source element type of the GEP as an argument.
Patch by Eduard Burtescu.
Reviewers: dblaikie, mjacob
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16281
llvm-svn: 258024
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
going through PointerType::getElementType.
Patch by Eduard Burtescu.
Reviewers: dblaikie, mjacob
Subscribers: dsanders, llvm-commits, dblaikie
Differential Revision: http://reviews.llvm.org/D16273
llvm-svn: 258023
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: dblaikie, mjacob
Subscribers: llvm-commits, dblaikie
Patch by Eduard Burtescu.
Differential Revision: http://reviews.llvm.org/D16274
llvm-svn: 258022
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`LCSSASafePhiForRAUW` as computed was incorrect -- in cases like
these (this exact example does not actually trigger the bug):
define i32 @f(i32 %n, i1* %c) {
entry:
br label %outer.loop
outer.loop:
br label %inner.loop
inner.loop:
%iv = phi i32 [ 0, %outer.loop ], [ %iv.inc, %inner.loop ]
%iv.inc = add nuw nsw i32 %iv, 1
%tc = udiv i32 %n, 13
%be.cond = icmp ult i32 %iv, %tc
br i1 %be.cond, label %inner.loop, label %inner.exit
inner.exit:
%iv.lcssa = phi i32 [ %iv, %inner.loop ]
%outer.be.cond = load volatile i1, i1* %c
br i1 %outer.be.cond, label %outer.loop, label %leave
leave:
%iv.lcssa.lcssa = phi i32 [ %iv.lcssa, %inner.exit ]
ret i32 %iv.lcssa.lcssa
}
`LCSSASafePhiForRAUW` is true for `%iv.lcssa` when re-rewriting the exit
value of `%iv` for `%inner.loop` to `%tc` (this can happen due to
`SCEVExpander::findExistingExpansion`), but the RAUW breaks LCSSA.
To fix this, instead of computing `SafePhi` with special logic, decide
the safety of RAUW directly via `replacementPreservesLCSSAForm`.
llvm-svn: 258016
|
| |
|
|
| |
llvm-svn: 258015
|
| |
|
|
|
|
|
|
|
|
| |
The feature flag is for VPERMB,VPERMI2B,VPERMT2B and VPMULTISHIFTQB instructions.
More about the instruction can be found in:
hattps://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf
Differential Revision: http://reviews.llvm.org/D16190
llvm-svn: 258012
|
| |
|
|
| |
llvm-svn: 258011
|
| |
|
|
|
|
|
|
| |
Reviewed By: reames
Differential Revision: http://reviews.llvm.org/D16226
llvm-svn: 258010
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16184
llvm-svn: 258009
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16189
llvm-svn: 258008
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16194
llvm-svn: 258006
|
| |
|
|
|
|
|
|
|
|
| |
shuffle lowering
Added support for the extraction of the upper 128-bit subvectors for lower/upper half undef shuffles if it would reduce the number of extractions/insertions or avoid loads of AVX2 permps/permd shuffle masks.
Minor follow up to D15477.
llvm-svn: 258000
|