| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 353710
|
| |
|
|
|
|
|
|
|
|
|
| |
ResourceManager::checkAvailability(). NFCI
In case of bottlenecks caused by pipeline pressure, we want to be able to
correctly report the set of problematic pipelines. This is a first step towards
adding support for bottleneck hints in llvm-mca (see PR37494). No functional
change intended.
llvm-svn: 353706
|
| |
|
|
| |
llvm-svn: 353704
|
| |
|
|
|
|
|
|
|
|
| |
This commit fixes the DPP sequence in the atomic optimizer (which was
previously missing the row_shr:3 step), and works around a read_register
exec bug by using a ballot instead.
Differential Revision: https://reviews.llvm.org/D57737
llvm-svn: 353703
|
| |
|
|
|
|
|
|
|
| |
This reverts commit r353610.
It causes a miscompile visible in macro expansion in a bootstrapped clang.
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190211/626590.html
llvm-svn: 353699
|
| |
|
|
|
|
|
|
|
|
|
| |
The v8m.base ISA contains movw, which can operate on an unsigned
16-bit value. Add the pattern that converts an add with a negative
value, that could fit into 16-bits when negated, into a sub with that
positive value.
Differential Revision: https://reviews.llvm.org/D57942
llvm-svn: 353692
|
| |
|
|
|
|
| |
Related revisions: https://reviews.llvm.org/D55444, https://reviews.llvm.org/D55314
llvm-svn: 353691
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The whole design of generating LDMs/STMs is fragile and unreliable: it depends on
rescheduling here in the LoadStoreOptimizer that isn't register pressure aware
and regalloc that isn't aware of generating LDMs/STMs.
This patch adds a (hidden) option to control the total number of instructions that
can be re-ordered. I appreciate this looks only a tiny bit better than a hard-coded
constant, but at least it allows more easy experimentation with different values
for now. Ideally we calculate this reorder limit based on some heuristics, and take
register pressure into account. I might be looking into that next.
Differential Revision: https://reviews.llvm.org/D57954
llvm-svn: 353678
|
| |
|
|
|
|
| |
instruction base class rather than the `CallSite` wrapper.
llvm-svn: 353676
|
| |
|
|
|
|
| |
`CallBase` and simpler APIs therein.
llvm-svn: 353673
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D57955
llvm-svn: 353670
|
| |
|
|
| |
llvm-svn: 353667
|
| |
|
|
|
|
| |
new one.
llvm-svn: 353665
|
| |
|
|
|
|
|
|
| |
interface and implementation.
Port code with: `cast<CallBase>(CS.getInstruction())`.
llvm-svn: 353662
|
| |
|
|
|
|
|
|
|
| |
`CallBase`.
Users have been updated. You can see how to update any out-of-tree
usages: pass `cast<CallBase>(CS.getInstruction())`.
llvm-svn: 353661
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`CallBase` class rather than `CallSite` wrappers.
I pushed this change down through most of the statepoint infrastructure,
completely removing the use of CallSite where I could reasonably do so.
I ended up making a couple of cut-points: generic call handling
(instcombine, TLI, SDAG). As soon as it hit truly generic handling with
users outside the immediate code, I simply transitioned into or out of
a `CallSite` to make this a reasonable sized chunk.
Differential Revision: https://reviews.llvm.org/D56122
llvm-svn: 353660
|
| |
|
|
| |
llvm-svn: 353659
|
| |
|
|
|
|
| |
Minor refactor to simplify some incoming patches to improve broadcast loads.
llvm-svn: 353655
|
| |
|
|
|
|
|
|
|
|
|
| |
Now that we have vector support for [US](ADD|SUB)O we no longer
need to scalarize when expanding [US](ADD|SUB)SAT.
This matches what the cost model already does.
Differential Revision: https://reviews.llvm.org/D57348
llvm-svn: 353651
|
| |
|
|
|
|
| |
No change in default behaviour (AllowUndefs = false)
llvm-svn: 353646
|
| |
|
|
|
|
|
|
| |
Now that we have SimplifyDemandedBits support for funnel shifts (rL353539), we need to simplify funnel shifts back to bitshifts in cases where either argument has been folded to undef/zero.
Differential Revision: https://reviews.llvm.org/D58009
llvm-svn: 353645
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
256-bit horizontal math ops are an x86 monstrosity (and thankfully have
not been extended to 512-bit AFAIK).
The two 128-bit halves operate on separate halves of the inputs. So if we
don't demand anything in the upper half of the result, we can extract the
low halves of the inputs, do the math, and then insert that result into a
256-bit output.
All of the extract/insert is free (ymm<-->xmm), so we're left with a
narrower (cheaper) version of the original op.
In the affected tests based on:
https://bugs.llvm.org/show_bug.cgi?id=33758
https://bugs.llvm.org/show_bug.cgi?id=38971
...we see that the h-op narrowing can result in further narrowing of other
math via existing generic transforms.
I originally drafted this patch as an exact pattern match starting from
extract_vector_elt, but I thought we might see diffs starting from
extract_subvector too, so I changed it to a more general demanded elements
solution. There are no extra existing regression test improvements from
that switch though, so we could go back.
Differential Revision: https://reviews.llvm.org/D57841
llvm-svn: 353641
|
| |
|
|
|
|
|
|
|
|
| |
SimplifySetCC still has much room for improvement, but this should
fix the remaining problem examples from:
https://bugs.llvm.org/show_bug.cgi?id=40657
The initial fix for this problem was rL353615.
llvm-svn: 353639
|
| |
|
|
|
|
| |
isInstructionTriviallyDead also performs the use_empty() check.
llvm-svn: 353637
|
| |
|
|
|
|
|
|
| |
Predicates'. NFCI
We don't have any assembler predicates for vector ISAs so this isn't necessary. It just adds extra lines and identation.
llvm-svn: 353631
|
| |
|
|
| |
llvm-svn: 353630
|
| |
|
|
|
|
|
|
| |
As discussed on D57389, this is a first step towards moving the SHLD/SHRD matching code to DAGCombiner using FSHL/FSHR instead.
There's a bit of work to do before I can do that, so this just folds to FSHL/FSHR in the existing code (handling the different SHRD/FSHR argument ordering), which fixes the issue we had with i16 shift amounts not being correctly masked.
llvm-svn: 353626
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D57952
llvm-svn: 353620
|
| |
|
|
|
|
|
|
|
| |
There's effectively no difference for the cases with variables.
We just trade a sub for an add on those. But the case with a
subtract from constant would require an extra move instruction
on x86, so this looks like a reasonable generic combine.
llvm-svn: 353619
|
| |
|
|
| |
llvm-svn: 353615
|
| |
|
|
|
|
|
|
| |
This reverts commit r353611.
Triggers an assertion during the libcall expansion on ARM.
llvm-svn: 353612
|
| |
|
|
|
|
|
|
|
| |
In preparation for supporting vector expansion.
Also drop a variant of ExpandLibCall, of which the MULO expansions
were the only user.
llvm-svn: 353611
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
D42042 introduced the ability for the ExecutionDomainFixPass to more easily change between BLENDPD/BLENDPS/PBLENDW as the domains required.
With this ability, we can avoid most bitcasts/scaling in the DAG that was occurring with X86ISD::BLENDI lowering/combining, blend with the vXi32/vXi64 vectors directly and use isel patterns to lower to the float vector equivalent vectors.
This helps the shuffle combining and SimplifyDemandedVectorElts be more aggressive as we lose track of fewer UNDEF elements than when we go up/down through bitcasts.
I've introduced a basic blend(bitcast(x),bitcast(y)) -> bitcast(blend(x,y)) fold, there are more generalizations I can do there (e.g. widening/scaling and handling the tricky v16i16 repeated mask case).
The vector-reduce-smin/smax regressions will be fixed in a future improvement to SimplifyDemandedBits to peek through bitcasts and support X86ISD::BLENDV.
Differential Revision: https://reviews.llvm.org/D57888
llvm-svn: 353610
|
| |
|
|
|
|
|
|
|
|
|
| |
The second and the last place it seems.
Error was:
[ 4%] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/Error.cpp.o
/Users/buildslave/as-bldslv9_new/lld-x86_64-darwin13/llvm.src/lib/ObjectYAML/ELFYAML.cpp:993:15: error: unused variable 'Object' [-Werror,-Wunused-variable]
const auto *Object = static_cast<ELFYAML::Object *>(IO.getContext());
llvm-svn: 353609
|
| |
|
|
|
|
|
|
|
|
| |
Error was:
[ 4%] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/DAGDeltaAlgorithm.cpp.o
/Users/buildslave/as-bldslv9_new/lld-x86_64-darwin13/llvm.src/lib/ObjectYAML/ELFYAML.cpp:666:15: error: unused variable 'Object' [-Werror,-Wunused-variable]
const auto *Object = static_cast<ELFYAML::Object *>(IO.getContext());
(http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/29920)
llvm-svn: 353608
|
| |
|
|
|
|
|
|
|
| |
This teaches the tools to parse and dump
the .dynamic section and its dynamic tags.
Differential revision: https://reviews.llvm.org/D57691
llvm-svn: 353606
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
cxxDtorIsEmpty checks callers recursively to determine if the
__cxa_atexit-registered function is empty, and eliminates the
__cxa_atexit call accordingly.
This recursive check is unnecessary as redundant instructions and
function calls can be removed by early-cse and inliner. In addition,
cxxDtorIsEmpty does not mark visited function and it may visit a
function exponential times (multiplication principle).
llvm-svn: 353603
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Take care of some missing clean-ups that belong with r249548 and some
other copy/paste that had happened. In particular, the destructors are
no longer vtable anchors after r249548; and `setSectionName` in
`MCSectionWasm` is private and unused since r313058 culled its only
caller. The destructors are now implicitly defined, and the unused
function is removed.
Reviewers: nemanjai, jasonliu, grosbach
Reviewed By: nemanjai
Subscribers: sbc100, aheejin, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57182
llvm-svn: 353597
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For some specific cases with bitcast A->B->A with intervening PHI nodes InstCombiner::optimizeBitCastFromPhi transformation creates extra PHI nodes, which are actually a copy of already created PHI or in another words, they are redundant. These extra PHI nodes could lead to extra move instructions generated after DeSSA transformation. This happens when several conditions are met
- SROA kicks in and creates new alloca;
- there is a simple assignment L = R, which falls under 'canonicalize loads' done by combineLoadToOperationType (this transformation is by default). Exactly this transformation is the reason of bitcasts generated;
- the alloca is then used in A->B->A + PHI chain;
- there is a loop unrolling.
As a result optimizeBitCastFromPhi creates as many of PHI nodes for each new SROA alloca as loop unrolling factor is. These new extra PHI nodes are redundant actually except of one and should not be created. Moreover the idea of optimizeBitCastFromPhi is to get rid of the cast (when possible) but that doesn't happen in these conditions.
The proposed fix is to do the cast replacement for the whole calculated/accumulated PHI closure not for one cast only, which is an argument to the optimizeBitCastFromPhi. These will help to accomplish several things: 1) avoid extra PHI nodes generated as all casts which may trigger optimizeBitCastFromPhi transformation will be replaced, 3) bitcasts will be replaced, and 3) create more opportunities to remove dead code, which appears after the replacement.
A new test case shows that it's possible to get rid of all bitcasts completely and get quite good code reduction.
Author: Igor Tsimbalist <igor.v.tsimbalist@intel.com>
Reviewed By: Carrot
Differential Revision: https://reviews.llvm.org/D57053
llvm-svn: 353595
|
| |
|
|
|
|
|
|
| |
and commit a1853e834c65751f92521f7481b15cf0365e796b.
They broke arm and aarch64
llvm-svn: 353590
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D57971
llvm-svn: 353587
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: vsk, fhahn, davidxl
Reviewed By: vsk
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57957
llvm-svn: 353582
|
| |
|
|
|
|
|
|
| |
CSEMIRBuilder"
With a fix after r353563 that adds some more opcodes.
llvm-svn: 353579
|
| |
|
|
|
|
|
|
|
|
|
|
| |
CSEMIRBuilder"
This reverts commit r353553.
This breaks CodeGen/AArch64/GlobalISel/legalize-ext-csedebug-output.mir:
http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/57963/console
llvm-svn: 353575
|
| |
|
|
|
|
| |
These instructions can generate a stack overflow exception so technically they read the stack overflow exception mask bit.
llvm-svn: 353564
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch accompanies the RFC posted here:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html
This patch adds a new CallBr IR instruction to support asm-goto
inline assembly like gcc as used by the linux kernel. This
instruction is both a call instruction and a terminator
instruction with multiple successors. Only inline assembly
usage is supported today.
This also adds a new INLINEASM_BR opcode to SelectionDAG and
MachineIR to represent an INLINEASM block that is also
considered a terminator instruction.
There will likely be more bug fixes and optimizations to follow
this, but we felt it had reached a point where we would like to
switch to an incremental development model.
Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii
Differential Revision: https://reviews.llvm.org/D53765
llvm-svn: 353563
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When CodeExtractor saves the result of InvokeInst at the first insertion
point of the 'normal destination' basic block, this block can be omitted
in the outlined region, so store is placed outside of the function. The
suggested solution is to process saving outputs after creating exit
stubs for new function, and stores will be placed in that blocks before
return in this case.
Patch by Sergei Kachkov!
Fixes llvm.org/PR40455.
Differential Revision: https://reviews.llvm.org/D57919
llvm-svn: 353562
|
| |
|
|
|
|
|
|
|
|
|
| |
Inline compatability is determined from the individual feature
bits. These are just sets of the separate features, but will always be
treated as incompatible unless they are specifically ignored.
Defining the ISA version number here in tablegen would be nice, but it
turns out this wasn't actually used.
llvm-svn: 353558
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The sqrt case is faster and we already do this for the case where
the exponent is 0.25. This adds the 0.75 case which is also not
sensitive to signed zeros.
Patch by Whitney Tsang (Whitney)
Differential revision: https://reviews.llvm.org/D57434
llvm-svn: 353557
|
| |
|
|
|
|
|
|
|
|
| |
https://reviews.llvm.org/D57932
Add some logging + tests to make sure CSEInfo prints debug output.
reviewed by: arsenm
llvm-svn: 353553
|