| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes a hardware bug that makes a branch offset of 0x3f unsafe.
This replaces the 32 bit branch with offset 0x3f to a 64 bit
instruction that includes the same 32 bit branch and the encoding
for a s_nop 0 to follow. The relaxer than modifies the offsets
accordingly.
Change-Id: I10b7aed99d651f8159401b01bb421f105fa6288e
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63494
llvm-svn: 364451
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implements a small enhancement to https://reviews.llvm.org/D55506
Specifically, while we were able to match strict FP nodes for
floating-point extend operations with a register as source, this
did not work for operations with memory as source.
That is because from regular operations, this is represented as
a combined "extload" node (which is a variant of a load SD node);
but there is no equivalent using a strict FP operation.
However, it turns out that even in the absence of an extload
node, we can still just match the operations explicitly, e.g.
(strict_fpextend (f32 (load node:$ptr))
This patch implements that method to match the LDEB/LXEB/LXDB
SystemZ instructions even when the extend uses a strict-FP node.
llvm-svn: 364450
|
| |
|
|
| |
llvm-svn: 364449
|
| |
|
|
|
|
| |
I was using an iterator that was equal to the end of a collection.
llvm-svn: 364447
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Since the WebAssembly SIMD shift instructions take i32 operands, we
truncate the i64 operand to <2 x i64> shifts during ISel. When the i64
operand is sign extended from i32, this CL makes it so the sign
extension is dropped instead of a wrap instruction added.
Reviewers: dschuff, aheejin
Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63615
llvm-svn: 364446
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Implements direct and indirect tail calls enabled by the 'tail-call'
feature in both DAG ISel and FastISel. Updates existing call tests and
adds new tests including a binary encoding test.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62877
llvm-svn: 364445
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D63795
llvm-svn: 364444
|
| |
|
|
| |
llvm-svn: 364441
|
| |
|
|
|
|
|
| |
- The newly added GSYM misses LLVMBuild.txt. Add a barely one to pass
the build.
llvm-svn: 364440
|
| |
|
|
|
|
| |
Split jump tables into individual lines and fix spacing.
llvm-svn: 364436
|
| |
|
|
|
|
| |
Allows narrowInsertExtractVectorBinOp to reduce vector size instead of the more restricted SimplifyDemandedVectorEltsForTargetNode
llvm-svn: 364434
|
| |
|
|
|
|
| |
Allows narrowInsertExtractVectorBinOp to reduce vector size
llvm-svn: 364432
|
| |
|
|
|
|
| |
Allows narrowInsertExtractVectorBinOp to reduce vector size
llvm-svn: 364431
|
| |
|
|
|
|
| |
We have (almost) no target opcodes that have scalar/vector equivalents - for now assume we can't scalarize them (we can add exceptions if we need to).
llvm-svn: 364429
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The full GSYM patch started with: https://reviews.llvm.org/D53379
In that patch we wanted to split up getting GSYM into the LLVM code base so we are not committing too much code at once.
This is a first in a series of patches where I only add the foundation classes along with complete unit tests. They provide the foundation for encoding and decoding a GSYM file.
File entries are defined in llvm::gsym::FileEntry. This class splits the file up into a directory and filename represented by uniqued string table offsets. This allows all files that are referred to in a GSYM file to be encoded as 1 based indexes into a global file table in the GSYM file.
Function information in stored in llvm::gsym::FunctionInfo. This object represents a contiguous address range that has a name and range with an optional line table and inline call stack information.
Line table entries are defined in llvm::gsym::LineEntry. They store only address, file and line information to keep the line tables simple and allows the information to be efficiently encoded in a subsequent patch.
Inline information is defined in llvm::gsym::InlineInfo. These structs store the name of the inline function, along with one or more address ranges, and the file and line that called this function. They also contain any child inline information.
There are also utility classes for address ranges in llvm::gsym::AddressRange, and string table support in llvm::gsym::StringTable which are simple classes.
The unit tests test all the APIs on these simple classes so they will be ready for the next patches where we will create GSYM files and parse GSYM files.
Differential Revision: https://reviews.llvm.org/D63104
llvm-svn: 364427
|
| |
|
|
| |
llvm-svn: 364426
|
| |
|
|
|
|
|
| |
This should the same, but MRI does allow dynamically changing the CSR
set, although currently not used.
llvm-svn: 364425
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Doing better separation of Cost and Threshold.
Cost counts the abstract complexity of live instructions, while Threshold is an upper bound of complexity that inlining is comfortable to pay.
There are two parts:
- huge 15K last-call-to-static bonus is no longer subtracted from Cost
but rather is now added to Threshold.
That makes much more sense, as the cost of inlining (Cost) is not changed by the fact
that internal function is called once. It only changes the likelyhood of this inlining
being profitable (Threshold).
- bonus for calls proved-to-be-inlinable into callee is no longer subtracted from Cost
but added to Threshold instead.
While calculations are somewhat different, overall InlineResult should stay the same since Cost >= Threshold compares the same.
Reviewers: eraman, greened, chandlerc, yrouban, apilipenko
Reviewed By: apilipenko
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60740
llvm-svn: 364422
|
| |
|
|
|
|
| |
lambdas by value
llvm-svn: 364420
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The one thing of note here is that the 'bitwidth' constant (32/64) was previously pessimistic.
Given `x & (-1 >> (C - z))`, we were taking `C` to be `bitwidth(x)`, but in reality
we want `(-1 >> (C - z))` pattern to mean "low z bits must be all-ones".
And for that, `C` should be `bitwidth(-1 >> (C - z))`, i.e. of the shift operation itself.
Last pattern D does not seem to exhibit any of these truncation issues.
Although it has the opposite problem - if we extract low bits (no shift) from i64,
and then truncate to i32, then we fail to shrink this 64-bit extraction into 32-bit extraction.
Reviewers: RKSimon, craig.topper, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62806
llvm-svn: 364419
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
(Not so) boringly identical to pattern a (D62786)
Not yet sure how do deal with the last pattern c.
Reviewers: RKSimon, craig.topper, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62793
llvm-svn: 364418
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Finally tying up loose ends here.
The problem is quite simple:
If we have pattern `(x >> start) & (1 << nbits) - 1`,
and then truncate the result, that truncation will be propagated upwards,
into the `and`. And that isn't currently handled.
I'm only fixing pattern `a` here,
the same fix will be needed for patterns `b`/`c` too.
I *think* this isn't missing any extra legality checks,
since we only look past truncations. Similary, i don't think
we can get any other truncation there other than i64->i32.
Reviewers: craig.topper, RKSimon, spatel
Reviewed By: craig.topper
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62786
llvm-svn: 364417
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
opt pipeline."
Breaks sanitizers:
libFuzzer :: cxxstring.test
libFuzzer :: memcmp.test
libFuzzer :: recommended-dictionary.test
libFuzzer :: strcmp.test
libFuzzer :: value-profile-mem.test
libFuzzer :: value-profile-strcmp.test
llvm-svn: 364416
|
| |
|
|
|
|
| |
to HarewareLoopInfo.
llvm-svn: 364415
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was failing with:
/b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Target/X86/X86ISelLowering.cpp:18772:66:
error: call of overloaded 'makeArrayRef(<brace-enclosed initializer list>)' is ambiguous
scaleShuffleMask<int>(Scale, makeArrayRef<int>({ 0, 2, 1, 3 }), Mask);
^
/b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Target/X86/X86ISelLowering.cpp:18772:66: note: candidates are:
In file included from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/CodeGen/MachineFunction.h:20:0,
from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/CodeGen/CallingConvLower.h:19,
from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Target/X86/X86ISelLowering.h:17,
from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Target/X86/X86ISelLowering.cpp:14:
/b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/ADT/ArrayRef.h:480:15:
note: llvm::ArrayRef<T> llvm::makeArrayRef(const std::vector<_RealType>&) [with T = int]
ArrayRef<T> makeArrayRef(const std::vector<T> &Vec) {
^
/b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/ADT/ArrayRef.h:485:37:
note: llvm::ArrayRef<T> llvm::makeArrayRef(const llvm::ArrayRef<T>&) [with T = int]
template <typename T> ArrayRef<T> makeArrayRef(const ArrayRef<T> &Vec) {
^
llvm-svn: 364414
|
| |
|
|
|
|
|
|
|
| |
This allows later passes (in particular InstCombine) to optimize more
cases.
One that's important to us is `memcmp(p, q, constant) < 0` and memcmp(p, q, constant) > 0.
llvm-svn: 364412
|
| |
|
|
|
|
|
|
| |
extract_subvector(bitcast()) support
Ideally this needs to be a generic combine in DAGCombiner::visitEXTRACT_SUBVECTOR but there's some nasty regressions in aarch64 due to neon shuffles not handling bitcasts at all.....
llvm-svn: 364407
|
| |
|
|
|
|
|
|
| |
extract_subvector(bitcast()) support
We support 'big to little' (e.g. extract_subvector(v16i8 bitcast(v2i64))) but not 'little to big' cases (e.g. extract_subvector(v2i64 bitcast(v16i8)))
llvm-svn: 364405
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The getFixupKindContainerSizeBytes function returns the size of the
instruction containing a given fixup. Currently fixup_arm_pcrel_9 is
not handled in this function, this causes an assertion failure in
the debug build and incorrect codegen in the release build.
This patch fixes the problem.
Reviewers: ostannard, simon_tatham
Reviewed By: ostannard
Subscribers: javed.absar, kristof.beyls, hiraditya, pbarrio, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63778
llvm-svn: 364404
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds the PseudoCALLReg instruction which allows using an
explicit register operand as the destination for the return address.
GCC can successfully parse this form of the call instruction, which
would be used for calls to functions which do not use ra as the return
address register, such as the __riscv_save libcalls. This patch forms
the first part of an implementation of -msave-restore for RISC-V.
Differential Revision: https://reviews.llvm.org/D62685
llvm-svn: 364403
|
| |
|
|
|
|
|
|
| |
truncateVectorWithPACK is often used in conjunction with ComputeNumSignBits which struggles when peeking through bitcasts.
This fix tries to avoid bitcast(shuffle(bitcast())) patterns in the 256-bit 64-bit sublane shuffles so we can still see through at least until lowering when the shuffles will need to be bitcasted to widen the shuffle type.
llvm-svn: 364401
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch generalizes the UnrollLoop utility to support loops that exit
from the header instead of the latch. Usually, LoopRotate would take care
of must of those cases, but in some cases (e.g. -Oz), LoopRotate does
not kick in.
Codesize impact looks relatively neutral on ARM64 with -Oz + LTO.
Program master patch diff
External/S.../CFP2006/447.dealII/447.dealII 629060.00 627676.00 -0.2%
External/SPEC/CINT2000/176.gcc/176.gcc 1245916.00 1244932.00 -0.1%
MultiSourc...Prolangs-C/simulator/simulator 86100.00 86156.00 0.1%
MultiSourc...arks/Rodinia/backprop/backprop 66212.00 66252.00 0.1%
MultiSourc...chmarks/Prolangs-C++/life/life 67276.00 67312.00 0.1%
MultiSourc...s/Prolangs-C/compiler/compiler 69824.00 69788.00 -0.1%
MultiSourc...Prolangs-C/assembler/assembler 86672.00 86696.00 0.0%
Reviewers: efriedma, vsk, paquette
Reviewed By: paquette
Differential Revision: https://reviews.llvm.org/D61962
llvm-svn: 364398
|
| |
|
|
|
|
| |
to isHardwareLoopProfitable()
llvm-svn: 364397
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: gchatelet, echristo, spatel, atdt
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63769
llvm-svn: 364384
|
| |
|
|
|
|
|
|
|
| |
PPCMIPeephole::emitRLDICWhenLoweringJumpTables should return a bool
value to indicate optimization is conducted or not.
Differential Revision: https://reviews.llvm.org/D63801
llvm-svn: 364383
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
// fold (sext (select cond, c1, c2)) -> (select cond, sext c1, sext c2)
// fold (zext (select cond, c1, c2)) -> (select cond, zext c1, zext c2)
// fold (aext (select cond, c1, c2)) -> (select cond, sext c1, sext c2)
Sign extend the operands if it is any_extend, to keep the signess of the operands that, the other combine rule would apply. The any_extend is handled as zero extend for constants. i.e.
t1: i8 = select t0, Constant:i8<-1>, Constant:i8<0>
t2: i64 = any_extend t1
-->
t3: i64 = select t0, Constant:i64<-1>, Constant:i64<0>
-->
t4: i64 = sign_extend_inreg t3
Differential Revision: https://reviews.llvm.org/D63318
llvm-svn: 364382
|
| |
|
|
| |
llvm-svn: 364376
|
| |
|
|
|
|
|
|
|
|
| |
This was just an omission in the back end. We have had the instructions for both
single and double precision for a few HW generations, but never got around to
legalizing these.
Differential revision: https://reviews.llvm.org/D63634
llvm-svn: 364373
|
| |
|
|
| |
llvm-svn: 364372
|
| |
|
|
|
|
|
|
|
|
|
| |
The weak alias should have the characteristics set to
`IMAGE_EXTERN_WEAK_SEARCH_ALIAS` to indicate that the weak external here
is a symbol alias and that the symbol is aliased to a locally defined
symbol. We were previously setting the characteristics to
`IMAGE_EXTERN_WEAK_SEARCH_LIBRARY` which indicates that the symbol
should be looked for in the libraries.
llvm-svn: 364370
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The list of relocations with addend in lld was missing `R_WASM_MEMORY_ADDR_REL_SLEB`,
causing `wasm-ld` to generate corrupted output. This fixes that problem and while
we're at it pulls the list of such relocations into the Wasm.h header, to avoid
duplicating it in multiple places.
Reviewers: sbc100
Differential Revision: https://reviews.llvm.org/D63696
llvm-svn: 364367
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
`catch_all` is from the first version of EH proposal and now has been
removed. There were no tests covering this, and thus no tests to remove
or fix.
Reviewers: aardappel
Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63737
llvm-svn: 364360
|
| |
|
|
|
|
|
| |
This verifier check is failing for us while doing ThinLTO on Chrome for
x86, see https://crbug.com/978218, and this helps to debug the problem.
llvm-svn: 364357
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
When we calculate MII, we use two loops, one with iterator R++ to
check whether we can reserve the resource, then --R to move back
the iterator to do reservation.
This is risky, as R++, --R may not point to the same element at all.
The can cause wrong MII.
Differential Revision: https://reviews.llvm.org/D63536
llvm-svn: 364353
|
| |
|
|
|
|
| |
The same oddity was apparently copy-pasted between multiple targets.
llvm-svn: 364349
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-> icmp eq/ne (and %x, (lshr -C1, C2)), 0.
Simplify 'shl' inequality test into 'and' equality test.
This pattern happens in the middle-end while simplifying bitfield access,
Exposed in https://reviews.llvm.org/D63505
https://rise4fun.com/Alive/6uz
Reviewers: lebedev.ri, efriedma
Reviewed By: lebedev.ri
Subscribers: spatel, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63675
llvm-svn: 364348
|
| |
|
|
| |
llvm-svn: 364346
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Original patch https://reviews.llvm.org/D63659 from
Steven Perron <stevenperron@google.com>
The pass AMDGPUUnifyDivergentExitNodes does not update the phi nodes in
the successors of blocks that is splits. This is fixed by calling
BasicBlock::splitBasicBlock to split the block instead of doing it
manually. This does extra work because a new conditional branch is
created in BB which is immediately replaced, but I think the simplicity
is worth it. It also helps make the code more future proof in case other
things need to be updated.
llvm-svn: 364342
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This follows up the transform from rL363956 to use the ctpop intrinsic when checking for power-of-2-or-zero.
This is matching the isPowerOf2() patterns used in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314
But there's at least 1 instcombine follow-up needed to match the alternate form:
(v & (v - 1)) == 0;
We should have all of the backend expansions handled with:
rL364319
(x86-specific changes still needed for optimal code based on subtarget)
And the larger patterns to exclude zero as a power-of-2 are joining with this change after:
rL364153 ( D63660 )
rL364246
Differential Revision: https://reviews.llvm.org/D63777
llvm-svn: 364341
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D63780
llvm-svn: 364339
|