| Commit message | Author | Age | Files | Lines |
| |
|
|
|
|
| |
This reverts commit r336623
llvm-svn: 336675
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An explicit untied use is not sufficient to maintain liveness of a
register redefined in a predicated instruction. For example
%1 = COPY %0
...
%1 = A2_paddif %2, %1, 1
could become
$r1 = COPY $r0
...
$r1 = A2_paddif $p0, $r1, 1
and later
$r1 = COPY $r0 ;; this is not really dead!
...
$r1 = A2_paddif $p0, $r0, 1
llvm-svn: 336662
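
The hazard described above can be reproduced with a toy model. This is a minimal Python sketch, not LLVM code: the `run` helper and the register dictionary are hypothetical, and it assumes `A2_paddif` performs its add only when the predicate is false and otherwise leaves the destination unchanged.

```python
def run(r0, p0, keep_copy):
    """Toy model of the final sequence from the commit message.

    r1 starts out holding an unrelated value (-1) so that a missing
    definition becomes visible in the result.
    """
    regs = {"r0": r0, "r1": -1, "p0": p0}
    if keep_copy:
        regs["r1"] = regs["r0"]        # $r1 = COPY $r0
    if not regs["p0"]:                 # $r1 = A2_paddif $p0, $r0, 1
        regs["r1"] = regs["r0"] + 1    # executes only when p0 is false (assumed)
    return regs["r1"]
```

When the predicated add does not execute, `r1` must still hold the copied value, so deleting the COPY as dead changes the result: `run(5, True, True)` yields 5, while `run(5, True, False)` yields the stale -1.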
|
| |
|
|
|
|
|
|
|
| |
This patch adds fast-isel tests for the IR patterns produced for truncation
intrinsics in rC336643.
Differential Revision: https://reviews.llvm.org/D48822
llvm-svn: 336645
|
| |
|
|
|
|
|
|
|
|
| |
Now that rL336250 has landed, we should prefer 2 immediate shifts + a shuffle blend over performing a multiply. Despite the increase in instructions, this is quicker (especially for slow v4i32 multiplies) and avoids loads and constant pool usage. It does, however, increase register pressure. The code size will go up a little, but by less than what we save on the constant pool data.
This patch also adds support for v16i16 to the BLEND(SHIFT(v,c1),SHIFT(v,c2)) combine, and also prevents blending on pre-SSE41 shifts if it would introduce extra blend masks/constant pool usage.
Differential Revision: https://reviews.llvm.org/D48936
llvm-svn: 336642
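
The equivalence behind this change can be sketched on plain lists. The following Python model is illustrative only: the function names are hypothetical, lanes are treated as 32-bit, and it assumes the per-lane shift amounts take exactly two distinct constant values.

```python
MASK32 = 0xFFFFFFFF

def shl_by_multiply(v, amounts):
    """Per-lane left shift expressed as a multiply by powers of two.
    In real code the multiplier vector would come from the constant pool."""
    return [(x * (1 << c)) & MASK32 for x, c in zip(v, amounts)]

def shl_by_shift_blend(v, amounts):
    """Same result using two uniform immediate shifts and a blend."""
    distinct = sorted(set(amounts))
    if len(distinct) != 2:
        raise ValueError("sketch assumes exactly two distinct shift amounts")
    c1, c2 = distinct
    s1 = [(x << c1) & MASK32 for x in v]   # SHIFT(v, c1)
    s2 = [(x << c2) & MASK32 for x in v]   # SHIFT(v, c2)
    # BLEND: pick each lane from whichever shifted vector matches its amount.
    return [a if c == c1 else b for a, b, c in zip(s1, s2, amounts)]
```

Both functions return `[2, 16, 6, 8]` for `v = [1, 2, 3, 0x80000001]` with amounts `[1, 3, 1, 3]`; the second form needs two temporaries, which is the register pressure cost the message mentions.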
|
| |
|
|
|
|
| |
and 64 bit checks together. NFC
llvm-svn: 336641
|
| |
|
|
|
|
|
|
| |
floating point load bitcasted to integer.
DAG combine wouldn't let a floating point load bitcasted to integer exist. It would just be an integer load.
llvm-svn: 336626
|
| |
|
|
|
|
| |
instructions due to bad isel pattern.
llvm-svn: 336625
|
| |
|
|
|
|
|
| |
This reverts commit r336587, it was causing test failures on the
sanitizer bots.
llvm-svn: 336623
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This adds support for binary atomic read-modify-write instructions:
add, sub, and, or, xor, and xchg.
This does not yet support translations of some of LLVM IR atomicrmw
instructions (nand, max, min, umax, and umin) that do not have a direct
counterpart in wasm instructions.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D49088
llvm-svn: 336615
|
| |
|
|
| |
llvm-svn: 336602
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added __float128 support for a number of rounding operations:
trunc
rint
nearbyint
round
floor
ceil
Differential Revision: https://reviews.llvm.org/D48415
llvm-svn: 336601
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Changed variable/function names to be more consistent
- Improved comments in test files
- Added more tests
- Fixed a few typos
- Misc. cosmetic changes
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D49087
llvm-svn: 336598
|
| |
|
|
|
|
|
|
|
| |
Add support for this builtin:
double __builtin_truncf128_round_to_odd(__float128)
Differential Revision: https://reviews.llvm.org/D48483
llvm-svn: 336595
|
| |
|
|
|
|
|
|
|
| |
When updating a tied operand pair, ensure that only that pair
is updated.
Differential Revision: https://reviews.llvm.org/D49052
llvm-svn: 336593
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch adds support for the atomicrmw instructions and the strong
cmpxchg instruction to the IRTranslator.
I've left out weak cmpxchg because LangRef.rst isn't entirely clear on what
difference it makes to the backend. As far as I can tell from the code, it
only matters to AtomicExpandPass which is run at the LLVM-IR level.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar, volkan, javed.absar
Reviewed By: qcolombet
Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D40092
llvm-svn: 336589
|
| |
|
|
|
|
|
|
|
|
| |
These won't work for the foreseeable future. These aren't allowed
from OpenCL, but IPO optimizations can make them appear.
Also directly set the attributes on functions, regardless
of the linkage rather than cloning functions like before.
llvm-svn: 336587
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This adds a reverse transform for the instcombine canonicalizations
that were added in D47980, D47981.
As discussed later, that was worse at least for the code size,
and potentially for the performance, too.
https://rise4fun.com/Alive/Zmpl
Reviewers: craig.topper, RKSimon, spatel
Reviewed By: spatel
Subscribers: reames, llvm-commits
Differential Revision: https://reviews.llvm.org/D48768
llvm-svn: 336585
|
| |
|
|
|
|
|
|
|
|
|
|
| |
GCC has builtins for these round to odd instructions:
__float128 __builtin_sqrtf128_round_to_odd (__float128)
__float128 __builtin_{add,sub,mul,div}f128_round_to_odd (__float128, __float128)
__float128 __builtin_fmaf128_round_to_odd (__float128, __float128, __float128)
Differential Revision: https://reviews.llvm.org/D47550
llvm-svn: 336578
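
The commit message does not say why round-to-odd is useful. One common motivation is avoiding double-rounding errors when a wide result is later narrowed. The Python sketch below illustrates this on non-negative integers rounded to multiples of a power of two; it is a fixed-point analogy with hypothetical helper names, not the float128 instructions themselves.

```python
def round_nearest_even(n, k):
    """Round non-negative integer n to a multiple of 2**k (k >= 1), ties to even."""
    q, rem = n >> k, n & ((1 << k) - 1)
    half = 1 << (k - 1)
    if rem > half or (rem == half and q & 1):
        q += 1
    return q << k

def round_to_odd(n, k):
    """Truncate n to a multiple of 2**k; if any bits were lost, force the
    lowest kept bit to 1 so the inexactness stays visible to a later rounding."""
    q, rem = n >> k, n & ((1 << k) - 1)
    if rem:
        q |= 1
    return q << k
```

For `n = 5`, rounding directly to a multiple of 8 gives 8. Rounding to nearest twice (first to a multiple of 2, then to a multiple of 8) gives 0, because the first step lands exactly on a tie. Using `round_to_odd` for the first step gives 8 again, and in this model the two-step result matches the direct one whenever the intermediate step keeps at least two extra bits.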
|
| |
|
|
|
|
|
|
| |
expected type before creating the new FMA node.
Previously, we were creating malformed SDNodes, but nothing caught it because the type constraints prevented isel from noticing.
llvm-svn: 336566
|
| |
|
|
|
|
| |
Let the update script merge 32/64 tests where possible
llvm-svn: 336565
|
| |
|
|
|
|
|
|
| |
Added handling for the select f128.
Differential Revision: https://reviews.llvm.org/D48294
llvm-svn: 336548
|
| |
|
|
| |
llvm-svn: 336514
|
| |
|
|
|
|
|
|
| |
Pre-AVX512 (which can perform a quick extend/shift/truncate), extending to 2 v8i16 for the PMULLW and then truncating is more performant than relying on the generic PBLENDVB vXi8 shift path and uses a similar amount of mask constant pool data.
Differential Revision: https://reviews.llvm.org/D48963
llvm-svn: 336513
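
The extend/multiply/truncate idea can be sketched per lane. This Python model is an assumption-laden illustration: it assumes the shifts in question are left shifts by per-lane constants (the message does not say so), the function name is hypothetical, and the split into two v8i16 halves is not modeled.

```python
def shl_i8_lanes_via_i16_mul(v, amounts):
    """Left-shift each 8-bit lane by its own constant amount (0..7) using a
    16-bit multiply, mirroring extend -> PMULLW -> truncate."""
    out = []
    for x, c in zip(v, amounts):
        wide = x & 0xFF                      # zero-extend the i8 lane to i16
        prod = (wide * (1 << c)) & 0xFFFF    # PMULLW keeps the low 16 bits
        out.append(prod & 0xFF)              # truncate back to i8
    return out
```

For every 8-bit value and every shift amount from 0 to 7 this matches `(x << c) & 0xFF`, since the bits discarded by the truncation are exactly the ones an 8-bit shift would drop.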
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows us to handle masking in a very similar way to the default rounding version that uses llvm.fma.
I had to add new rounding mode CodeGenOnly instructions to support isel when we can't find a movss from which to grab the upper bits for the b_Int instruction.
Fast-isel tests have been updated to match new clang codegen.
We are currently having trouble folding fneg into the new intrinsic. I'm going to correct that in a follow up patch to keep the size of this one down.
A future patch will also remove the old intrinsics.
llvm-svn: 336506
|
| |
|
|
|
|
| |
tests to match clang test cases.
llvm-svn: 336505
|
| |
|
|
| |
llvm-svn: 336496
|
| |
|
|
|
|
|
|
| |
As discussed on PR37989, this patch adds EXTRACT_SUBVECTOR handling to TargetLowering::SimplifyDemandedVectorElts and calls it from DAGCombiner::visitEXTRACT_SUBVECTOR.
Differential Revision: https://reviews.llvm.org/D48825
llvm-svn: 336490
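
The core of demanded-elements handling for a subvector extract is a lane remapping. The following Python sketch shows that idea with bitmasks (bit i stands for lane i); the function name and signature are hypothetical and do not reflect the actual `SimplifyDemandedVectorElts` interface.

```python
def demanded_src_elts(demanded_sub, idx, num_src_elts):
    """Map the demanded lanes of extract_subvector(src, idx) back onto the
    lanes of src. Lane j of the result corresponds to lane idx + j of src."""
    return (demanded_sub << idx) & ((1 << num_src_elts) - 1)
```

For example, if lanes 0 and 2 of a 4-lane subvector extracted at index 4 from an 8-lane source are demanded, `demanded_src_elts(0b0101, 4, 8)` returns `0b01010000`: only source lanes 4 and 6 matter, so the remaining source lanes are free to be simplified.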
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D48934
llvm-svn: 336484
|
| |
|
|
|
|
| |
This should bring the bots back to green state.
llvm-svn: 336482
|
| |
|
|
| |
llvm-svn: 336453
|
| |
|
|
|
|
|
|
|
| |
Added statistics for the number of SMLAD instructions created, and
also renamed the pass to -arm-parallel-dsp.
Differential Revision: https://reviews.llvm.org/D48971
llvm-svn: 336441
|
| |
|
|
|
|
|
|
| |
http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/50537/testReport/junit/LLVM/CodeGen_AArch64/FoldRedundantShiftedMasking_ll/
This removes the function label comments that were causing this error.
llvm-svn: 336440
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
D48278
Allow redundant shift masks to be reduced.
For example:
x1 = x & 0xAB00
x2 = (x >> 8) & 0xAB
can be reduced to:
x1 = x & 0xAB00
x2 = x1 >> 8
It only allows folding when the masks and shift values are constants.
llvm-svn: 336426
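
The example above can be checked exhaustively for 16-bit inputs. This Python sketch is only an illustration: `can_fold` encodes one sufficient condition for a logical right shift (the second mask equals the first mask shifted right), which may differ from the conditions the actual combine uses.

```python
def original(x):
    x1 = x & 0xAB00
    x2 = (x >> 8) & 0xAB
    return x1, x2

def folded(x):
    x1 = x & 0xAB00
    x2 = x1 >> 8          # reuse the already-masked value
    return x1, x2

def can_fold(mask1, shift, mask2):
    """(x & mask1) >> shift == (x >> shift) & mask2 holds for every
    non-negative x when mask2 is mask1 shifted right by the same amount."""
    return (mask1 >> shift) == mask2
```

Here `can_fold(0xAB00, 8, 0xAB)` is true, and `original(x) == folded(x)` for every `x` in `range(1 << 16)`.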
|
| |
|
|
|
|
|
|
| |
The intrinsics can be implemented with a f32/f64 llvm.fma intrinsic and an insert into a zero vector.
There are a couple of regressions here due to SelectionDAG not being able to pull an fneg through an extract_vector_elt. I'm not super worried about this though, as InstCombine should be able to do it before we get to SelectionDAG.
llvm-svn: 336416
|
| |
|
|
|
|
|
|
|
|
| |
unmasked 512-bit intrinsics with rounding mode.
This upgrades all of the intrinsics to use fneg instructions to convert fma into fmsub/fnmsub/fnmadd/fmsubadd, and uses a select instruction for masking.
This matches how clang uses the intrinsics these days.
llvm-svn: 336409
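
The negation identities behind this upgrade can be sketched per lane. The Python below is a scalar model with hypothetical names: `fma` is emulated with exact rational arithmetic and a single final rounding (finite inputs only, signed zeros not modeled), `select` stands in for the masking select, and the lane-alternating fmsubadd form is left out.

```python
from fractions import Fraction

def fma(a, b, c):
    """a*b + c computed exactly, then rounded once to double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

def fmsub(a, b, c):
    return fma(a, b, -c)      # a*b - c: negate the addend

def fnmadd(a, b, c):
    return fma(-a, b, c)      # -(a*b) + c: negate one multiplicand

def fnmsub(a, b, c):
    return fma(-a, b, -c)     # -(a*b) - c: negate both

def select(mask_bit, result, passthru):
    """Masking: keep the computed lane when the mask bit is set."""
    return result if mask_bit else passthru
```

With `a, b, c = 2.0, 3.0, 1.0` these give 5.0, -5.0, and -7.0 respectively, and `select(0, fmsub(a, b, c), 9.0)` returns the pass-through value 9.0.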
|
| |
|
|
|
|
|
|
|
|
| |
and autoupgrading.
-Split cases that call 2 intrinsics in the same case.
-Remove testing mask3 and maskz intrinsics with an all ones mask. These won't be interesting after the upgrade.
-Restore test cases for some intrinsics that are marked for deletion, but haven't been deleted yet.
llvm-svn: 336408
|
| |
|
|
|
|
|
|
| |
Power 9 does not have a hardware instruction for frem but we can call fmodf128.
Differential Revision: https://reviews.llvm.org/D48552
llvm-svn: 336406
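
The semantics being mapped to the library call can be seen with the C-style `fmod` that Python exposes. This sketch uses double precision rather than float128, and the `frem` wrapper name is only for illustration.

```python
import math

def frem(x, y):
    """Remainder with the sign of the dividend, as C fmod computes it."""
    return math.fmod(x, y)
```

Note the difference from Python's `%` operator, whose result takes the sign of the divisor: `frem(-7.0, 3.0)` is `-1.0`, while `-7.0 % 3.0` is `2.0`.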
|
| |
|
|
| |
llvm-svn: 336404
|
| |
|
|
|
|
|
|
| |
'llvm.fma'. Add upgrade tests for all.
Still need to remove the AVX512 masked versions.
llvm-svn: 336383
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D48954
llvm-svn: 336376
|
| |
|
|
|
|
| |
Also use verify-machineinstrs.
llvm-svn: 336374
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Avoid using allocateKernArg / AssignFn. We do not want any
of the type splitting properties of normal calling convention
lowering.
For now at least this exists alongside the IR argument lowering
pass. This is necessary to handle struct padding correctly while
some arguments are still skipped by the IR argument lowering
pass.
llvm-svn: 336373
|
| |
|
|
|
|
|
|
|
|
|
| |
instructions
Map the following instructions to the proper float128 lib calls:
pow[i], exp[2], log[2|10], sin, cos, fmin, fmax
Differential Revision: https://reviews.llvm.org/D48544
llvm-svn: 336361
|
| |
|
|
|
|
| |
Now that D45806 has landed we can start trying to avoid scalarizing srem by constant - these tests demonstrate some example cases.
llvm-svn: 336360
|
| |
|
|
| |
llvm-svn: 336348
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Wait states are not properly being inserted after buffer_store for v_interp instructions.
Add VALU to V_INTERP instructions so that the GCNHazardRecognizer can
check and insert the appropriate wait states when needed.
Differential Revision: https://reviews.llvm.org/D48772
Change-Id: Id540c9b074fc69b5c1de6b182276aa089c74aa64
llvm-svn: 336339
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The typos there are intentional; they were explicitly introduced to disable these cases in r280285.
Reviewers: bkramer
Reviewed By: bkramer
Subscribers: dschuff, sbc100, jgravelle-google, aheejin, llvm-commits
Differential Revision: https://reviews.llvm.org/D48962
llvm-svn: 336336
|
| |
|
|
|
|
| |
We want to compare shifts with repeated vs non-repeated v8i16 shuffle masks (for PBLENDW ymm)
llvm-svn: 336333
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Similar to PR/25526, fast-regalloc introduces spills at the end of basic
blocks. When this occurs in between an ll and sc, the stores can cause the
atomic sequence to fail.
This patch fixes the issue by introducing more pseudos to represent atomic
operations and moving their lowering to after the expansion of postRA
pseudos.
This version addresses issues with the initial implementation and covers
all atomic operations.
This resolves PR/32020.
Thanks to James Cowgill for reporting the issue!
Patch By: Simon Dardis
Differential Revision: https://reviews.llvm.org/D31287
llvm-svn: 336328
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Resolves:
Unsupported ARM Neon intrinsics in Target-specific DAG combine
function for VLDDUP
https://bugs.llvm.org/show_bug.cgi?id=38031
Related diff: D48439
Differential Revision: https://reviews.llvm.org/D48920
llvm-svn: 336325
|