| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Don't check for unsupported targets for every instruction.
llvm-svn: 361406
|
|
|
|
|
|
|
|
|
|
|
|
| |
Keep it optional in cases this is ever needed in some global
context. Currently it's only used for getting an upper bound inline
asm code size.
For AMDGPU, gfx10 increases the maximum instruction size to
20-bytes. This avoids penalizing older subtargets when estimating code
size, and making some annoying branch relaxation test adjustments.
llvm-svn: 361405
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This extends Krzysztof Parzyszek's X86-specific solution
(https://reviews.llvm.org/D60208) to the generic code pointed out by
James Y Knight.
Reviewers: kparzysz, craig.topper, nickdesaulniers
Subscribers: efriedma, sdardis, nemanjai, javed.absar, eraman, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, srhines, void, nickdesaulniers, jyknight
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60224
llvm-svn: 361404
|
|
|
|
| |
llvm-svn: 361403
|
|
|
|
|
|
|
|
| |
If a template parameter refers to a pointer to member, but the mangling
of that was a string literal instead of a real symbol, llvm-undname used
to crash instead of rejecting the input.
llvm-svn: 361402
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a minimal start to correcting a problem most directly discussed in PR38086:
https://bugs.llvm.org/show_bug.cgi?id=38086
We have been hacking around a limitation for FP select patterns by using the
fast-math-flags on the condition of the select rather than the select itself.
This patch just allows FMF to appear with the 'select' opcode. No changes are
needed to "FPMathOperator" because it already includes select-of-FP because
that definition is based on the (return) value type.
Once we have this ability, we can start correcting and adding IR transforms
to use the FMF on a 'select' instruction. The instcombine and vectorizer test
diffs only show that the IRBuilder change is behaving as expected by applying
an FMF guard value to 'select'.
For reference:
rL241901 - allowed FMF with fcmp
rL255555 - allowed FMF with FP calls
Differential Revision: https://reviews.llvm.org/D61917
llvm-svn: 361401
|
|
|
|
|
|
|
|
| |
non-zero loop cost. NFCI.
The input LoopCost value can be zero, but if so it should be recalculated with the current VF. After that it should always be non-zero.
llvm-svn: 361387
|
|
|
|
|
|
|
|
|
|
| |
See bug 41361: https://bugs.llvm.org/show_bug.cgi?id=41361
Reviewers: artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D61012
llvm-svn: 361386
|
|
|
|
|
|
| |
Fixes scan-build warning.
llvm-svn: 361375
|
|
|
|
| |
llvm-svn: 361371
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the tiny code model is requested for a target machine that does not
support this, we get an error message (which is nice) but also this diagnostic
and request to submit a bug report:
fatal error: error in backend: Target does not support the tiny CodeModel
[Inferior 2 (process 31509) exited with code 0106]
clang-9: error: clang frontend command failed with exit code 70 (use -v to see invocation)
(gdb) clang version 9.0.0 (http://llvm.org/git/clang.git 29994b0c63a40f9c97c664170244a7bba5ecc15e) (http://llvm.org/git/llvm.git 95606fdf91c2d63a931e865f4b78b2e9828ddc74)
Target: arm-arm-none-eabi
Thread model: posix
clang-9: note: diagnostic msg: PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
clang-9: note: diagnostic msg:
********************
PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-9: note: diagnostic msg: /tmp/tiny-dfe1a2.c
clang-9: note: diagnostic msg: /tmp/tiny-dfe1a2.sh
clang-9: note: diagnostic msg:
But this is not a bug, this is a feature. :-) Not only is this not a bug, this
is also pretty confusing. This patch causes just to print the fatal error and
not the diagnostic:
fatal error: error in backend: Target does not support the tiny CodeModel
Differential Revision: https://reviews.llvm.org/D62236
llvm-svn: 361370
|
|
|
|
|
|
|
|
|
|
|
| |
This adds proper handling of the NONAME-keyword, which makes llvm-dlltool
generate an import using the ordinal instead of the name.
Patch by by Jannik Vogel, test added by Stefan Schmidt.
Differential Revision: https://reviews.llvm.org/D62175
llvm-svn: 361367
|
|
|
|
| |
llvm-svn: 361366
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the second part of the commit fixing PR38917 (hoisting
partitially redundant machine instruction). Most of PRE (partitial
redundancy elimination) and CSE work is done on LLVM IR, but some of
redundancy arises during DAG legalization. Machine CSE is not enough
to deal with it. This simple PRE implementation works a little bit
intricately: it passes before CSE, looking for partitial redundancy
and transforming it to fully redundancy, anticipating that the next
CSE step will eliminate this created redundancy. If CSE doesn't
eliminate this, than created instruction will remain dead and eliminated
later by Remove Dead Machine Instructions pass.
The third part of the commit is supposed to refactor MachineCSE,
to make it more clear and to merge MachinePRE with MachineCSE,
so one need no rely on further Remove Dead pass to clear instrs
not eliminated by CSE.
First step: https://reviews.llvm.org/D54839
Fixes llvm.org/PR38917
llvm-svn: 361356
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For big-endian powerpc64, the default ABI is ELFv1. OpenPower ABI ELFv2 is supported when -mabi=elfv2 is specified. FreeBSD support for PowerPC64 ELFv2 ABI with LLVM is in progress[1]. This patch adds an alternative way to specify ELFv2 ABI on target triple [2].
The following results are expected:
ELFv1 when using:
-target powerpc64-unknown-freebsd12.0
-target powerpc64-unknown-freebsd12.0 -mabi=elfv1
-target powerpc64-unknown-freebsd12.0-elfv1
ELFv2 when using:
-target powerpc64-unknown-freebsd12.0 -mabi=elfv2
-target powerpc64-unknown-freebsd12.0-elfv2
[1] https://wiki.freebsd.org/powerpc/llvm-elfv2
[2] https://clang.llvm.org/docs/CrossCompilation.html
Patch by Alfredo Dal'Ava Júnior!
Differential Revision: https://reviews.llvm.org/D61950
llvm-svn: 361355
|
|
|
|
|
|
|
|
|
| |
The Armv8.2-A crypto extensions all defaulted to true, but should default to
false, like all the other extensions.
Differential Revision: https://reviews.llvm.org/D62180
llvm-svn: 361354
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix for https://bugs.llvm.org/show_bug.cgi?id=41971. Make the
combineVectorSizedSetCCEquality() transform more conservative by
checking that the bitcast to the vector type will be cheap/free
for both operands. I'm considering it cheap if it's a constant,
a load or already a vector. I've dropped the explicit check for
f128 because it should fall out naturally (in the cases where
it'd be detrimental).
Differential Revision: https://reviews.llvm.org/D62220
llvm-svn: 361352
|
|
|
|
| |
llvm-svn: 361347
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D62173
llvm-svn: 361346
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CET-IBT enabled
Return-twice functions will indirectly jump after the caller's position.
So when CET-IBT is enable, we should make sure these is endbr*
instructions follow these Return-twice function caller. Like GCC does.
Patch by Xiang Zhang (xiangzhangllvm)
Differential Revision: https://reviews.llvm.org/D61881
llvm-svn: 361342
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This should be a valid exception to the general rule of not creating new shuffle masks in IR...
because we already do it. :)
Also, DAG combining/legalization will undo this by widening the shuffle back out if needed.
Explanation for how we already do this: SLP or vector source can create chains of insert/extract
as shown in 1 of the examples from PR16739:
https://godbolt.org/z/NlK7rA
https://bugs.llvm.org/show_bug.cgi?id=16739
And we expect instcombine or DAGCombine to clean that up by creating relatively simple shuffles.
Differential Revision: https://reviews.llvm.org/D62024
llvm-svn: 361338
|
|
|
|
| |
llvm-svn: 361333
|
|
|
|
|
|
|
| |
There should probably be nonconvergent versions, but my guess is it
doesn't matter in practice.
llvm-svn: 361331
|
|
|
|
| |
llvm-svn: 361330
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
r360889 added new llround builtin functions. This patch adds their
signatures for the WebAssembly backend.
It also adds wasm32 support to utils/update_llc_test_checks.py, since
that's the script other targets are using for their testcases for this
feature.
Differential Revision: https://reviews.llvm.org/D62207
llvm-svn: 361327
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Register coalescer fails for the test in the patch with the assertion in
JoinVals::ConflictResolution `DefMI != nullptr'. It attempts to join
live intervals for two adjacent instructions and erase the copy:
%2:vreg_256 = COPY %1
%3:vreg_256 = COPY killed %1
The LI needs to be adjusted to kill subrange for the erased instruction
and extend the subrange of the original def. That was done for the main
interval only but not for the subrange. As a result subrange had a VNI
pointing to the erased slot resulting in the above failure.
Differential Revision: https://reviews.llvm.org/D62162
llvm-svn: 361293
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add an intrinsic that takes 2 signed integers with the scale of them provided
as the third argument and performs fixed point multiplication on them. The
result is saturated and clamped between the largest and smallest representable
values of the first 2 operands.
This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.
Differential Revision: https://reviews.llvm.org/D55720
llvm-svn: 361289
|
|
|
|
|
|
| |
We were trying to ZERO_EXTEND from an i8 X86ISD::SETCC to i8 again.
llvm-svn: 361288
|
|
|
|
|
|
|
|
|
|
|
|
| |
DAGCombiner simplifies this more liberally as:
// If inserting an UNDEF, just return the original vector.
if (N1.isUndef())
return N0;
So there's no way to make this visible in output AFAIK, but
doing this at node creation time should be slightly more efficient.
llvm-svn: 361287
|
|
|
|
|
|
|
|
|
| |
getNode() squashes concatenation of undefs via FoldCONCAT_VECTORS():
// Concat of UNDEFs is UNDEF.
if (llvm::all_of(Ops, [](SDValue Op) { return Op.isUndef(); }))
return DAG.getUNDEF(VT);
llvm-svn: 361284
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Because the sort order was not strongly stable on the RHS, whether the
chain could merge would depend on the order of the blocks in the Phi.
EXPENSIVE_CHECKS would shuffle the blocks before sorting, resulting in
non-deterministic merging.
Reviewers: gchatelet
Subscribers: hiraditya, llvm-commits, RKSimon
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62193
llvm-svn: 361281
|
|
|
|
|
|
| |
Fixes PACKSS-PSHUFB shuffle regressions mentioned on D61692
llvm-svn: 361270
|
|
|
|
|
|
|
|
|
|
|
| |
There are no FP callers of DAGCombiner::reassociateOps() currently,
but we can add a fast-math check to make sure this API is not being
misused.
This was noted as a potential risk (and that risk might increase) with:
D62191
llvm-svn: 361268
|
|
|
|
|
|
| |
Broke some bots.
llvm-svn: 361263
|
|
|
|
|
|
|
|
| |
And handle for self-move. This is required so that llvm::sort can work
with EXPENSIVE_CHECKS, as it will do a random shuffle of the input
which can result in self-moves.
llvm-svn: 361257
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In most cases, the topological ordering does not get changed in
ScheduleDAGInstrs. We can compute the ordering on demand, similar to
D60125.
This drastically cuts down the number of times we need to compute the
topological ordering, e.g. for SPEC2006, SPEC2k and MultiSource, we get
the following stats for -O3 -flto on X86 (showing the top reductions,
with small absolute values filtered). The smallest reduction is -50%.
Slightly positive impact on compile-time (-0.1 % geomean speedup for
test-suite + SPEC & co, with -O1 on X86)
Tests: 243
Metric: pre-RA-sched.NumTopoInits
Program base patch diff
test-suite...ngs-C/fixoutput/fixoutput.test 115.00 3.00 -97.4%
test-suite...ks/Prolangs-C/cdecl/cdecl.test 957.00 26.00 -97.3%
test-suite...math/automotive-basicmath.test 107.00 3.00 -97.2%
test-suite...rolangs-C++/deriv2/deriv2.test 144.00 6.00 -95.8%
test-suite...lowfish/security-blowfish.test 410.00 18.00 -95.6%
test-suite...frame_layout/frame_layout.test 441.00 23.00 -94.8%
test-suite...rolangs-C++/employ/employ.test 159.00 11.00 -93.1%
test-suite...s/Ptrdist/anagram/anagram.test 157.00 11.00 -93.0%
test-suite...s-C/unix-smail/unix-smail.test 829.00 59.00 -92.9%
test-suite...chmarks/Olden/power/power.test 154.00 11.00 -92.9%
test-suite...T95/147.vortex/147.vortex.test 19876.00 1434.00 -92.8%
test-suite...000/255.vortex/255.vortex.test 19881.00 1435.00 -92.8%
test-suite...ce/Applications/Burg/burg.test 2203.00 168.00 -92.4%
test-suite...urce/Applications/hbd/hbd.test 1067.00 85.00 -92.0%
test-suite...ternal/HMMER/hmmcalibrate.test 3145.00 251.00 -92.0%
test-suite.../Applications/spiff/spiff.test 1037.00 84.00 -91.9%
test-suite...SPEC/CINT95/130.li/130.li.test 5913.00 487.00 -91.8%
test-suite.../CINT95/134.perl/134.perl.test 12532.00 1041.00 -91.7%
test-suite...ce/Benchmarks/Olden/bh/bh.test 220.00 19.00 -91.4%
test-suite :: External/Nurbs/nurbs.test 2304.00 206.00 -91.1%
test-suite...arks/VersaBench/dbms/dbms.test 773.00 75.00 -90.3%
test-suite...ce/Applications/siod/siod.test 9043.00 878.00 -90.3%
test-suite...pplications/treecc/treecc.test 4510.00 438.00 -90.3%
test-suite...T2006/456.hmmer/456.hmmer.test 7093.00 697.00 -90.2%
test-suite...s-C/Pathfinder/PathFinder.test 882.00 87.00 -90.1%
test-suite.../CINT2000/176.gcc/176.gcc.test 64978.00 6721.00 -89.7%
test-suite...cations/hexxagon/hexxagon.test 657.00 69.00 -89.5%
test-suite...fice-ispell/office-ispell.test 2712.00 285.00 -89.5%
test-suite.../CINT2006/403.gcc/403.gcc.test 139613.00 14992.00 -89.3%
test-suite...lications/ClamAV/clamscan.test 25880.00 2785.00 -89.2%
Reviewers: MatzeB, atrick, efriedma, niravd
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D60839
llvm-svn: 361253
|
|
|
|
|
|
|
|
|
|
|
| |
This provides the correct file path for the original source, rather
than the preprocessed source.
Part of the fix for PR41839.
Differential Revision: https://reviews.llvm.org/D62074
llvm-svn: 361248
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit rr360902. It caused an assertion failure in
lib/IR/DebugInfoMetadata.cpp: Assertion `(OffsetInBits + SizeInBits <=
FragmentSizeInBits) && "new fragment outside of original fragment"'
failed.
PR41931.
llvm-svn: 361246
|
|
|
|
|
|
|
|
|
|
| |
This option provides only the base filename, not a full relative path.
Part of the fix for PR41839.
Differential Revision: https://reviews.llvm.org/D62071
llvm-svn: 361245
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: In preparation for D60318 .
Reviewers: gchatelet, efriedma
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62068
llvm-svn: 361239
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On PowerPC64 ELFv2 ABI, functions may have 2 entry points: global and local.
The local entry point location of a function is stored in the st_other field of the symbol, as an offset relative to the global entry point.
In order to make symbol assignments (e.g. .equ/.set) work properly with this, PPCTargetELFStreamer already copies the local entry bits from the source symbol to the destination one, on emitAssignment(). The problem is that this copy is performed only at the assignment location, where the source symbol may not yet have processed the .localentry directive, that sets the local entry. This may cause the destination symbol to end up with wrong local entry information. Other symbol info is not affected by this because, in this case, the destination symbol value is actually a symbol reference.
This change keeps track of these assignments, and update all needed st_other fields when finish() is called.
Patch by Leandro Lupori!
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D56586
llvm-svn: 361237
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some checks in isShuffleMaskLegal expect an even number of elements,
e.g. isTRN_v_undef_Mask or isUZP_v_undef_Mask, otherwise they access
invalid elements and crash. This patch adds checks to the impacted
functions.
Fixes PR41951
Reviewers: t.p.northover, dmgreen, samparker
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D60690
llvm-svn: 361235
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Patch adds support for the following instructions:
* URECPE, URSQRTE, SQABS, SQNEG
The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D62129
llvm-svn: 361230
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Patch adds support for the following instructions:
ADDP, SMAXP, UMAXP, SMINP, UMINP
The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D62128
llvm-svn: 361229
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PrepareConstants step converts add/sub with 'negative' immediates to
sub/add with a 'positive' imm to make promotion more simple. nuw
already states that the add shouldn't cause an unsigned wrap, so
it shouldn't need any tweaking. Plus, we also don't allow a sub with
a 'negative' immediate to be safe wrap, so this functionality has
been removed. The PrepareConstants step now just handles the add
instructions that we've determined would be safe if they wrap around
zero.
Differential Revision: https://reviews.llvm.org/D62057
llvm-svn: 361227
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
convention endianess
Summary:
The endianess used in the calling convention does not always match the
endianess of the target on all architectures, namely AVR.
When an argument is too large to be legalised by the architecture and is
split for the ABI, a new hook TargetLoweringInfo::shouldSplitFunctionArgumentsAsLittleEndian
is queried to find the endianess that function arguments must be laid
out in.
This approach was recommended by Eli Friedman.
Originally reported in https://github.com/avr-rust/rust/issues/129.
Patch by Carl Peto.
Reviewers: bogner, t.p.northover, RKSimon, niravd, efriedma
Reviewed By: efriedma
Subscribers: JDevlieghere, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62003
llvm-svn: 361222
|
|
|
|
| |
llvm-svn: 361218
|
|
|
|
|
|
|
|
| |
Patch by Praveen Velliengiri. Thanks Praveen!
Differential Revision: https://reviews.llvm.org/D62139
llvm-svn: 361215
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
scan-build flagged a potential use-after-move in debug builds. It's not
safe that a moved from value contains anything but garbage. Manually
DRY up these repeated expressions.
Reviewers: lhames
Reviewed By: lhames
Subscribers: hiraditya, llvm-commits, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62112
llvm-svn: 361203
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unfortunately the way SIInsertSkips works is backwards, and is
required for correctness. r338235 added handling of some special cases
where skipping is mandatory to avoid side effects if no lanes are
active. It conservatively handled asm correctly, but the same logic
needs to apply to calls.
Usually the call sequence code is larger than the skip threshold,
although the way the count is computed is really broken, so I'm not
sure if anything was likely to really hit this.
llvm-svn: 361202
|