| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
inputs being bitcasted from floating point types.
There's really no reason to do this we should just let isel pick the integer version and let the execution dependency fixing pass take care of moving to FP if necessary.
It's not very reliable to look for bitcasts at the edges of patterns. If for some reason one input was bitcasted and the other wasn't, or if one was a v4f32 bitcast and one was a v2f64 bitcast, we would have fallen back to the integer pattern anyway.
llvm-svn: 311138
|
| |
|
|
|
|
|
|
| |
If a struct would end up half in GPRs and half on SP the ABI says it should
actually go entirely on the stack. We were getting this wrong in GlobalISel
before, causing compatibility issues.
llvm-svn: 311137
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is step towards separating the GCN and R600 tablegen'd code.
This is a little awkward for now, because the R600 functions won't have the
MCSubtargetInfo parameter, so we need to have AMDMGPUInstPrinter
delegate to R600InstPrinter, but once the tablegen'd code is split,
we will be able to drop the delegation and use R600InstPrinter directly.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D36444
llvm-svn: 311128
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
AVX512DQI is enabled.
There's no reason to switch instructions with and without DQI. It just creates extra isel patterns and test divergences.
There is however value in enabling the masked version of the instructions with DQI.
This required introducing some new multiclasses to enabling this splitting.
Differential Revision: https://reviews.llvm.org/D36661
llvm-svn: 311091
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Just like the FIXME says, there is no alignment requirement for MMX.
Reviewers: RKSimon, zvi, igorb
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36815
llvm-svn: 311090
|
| |
|
|
|
|
|
| |
Authored by aivchenk
Differential Revision: https://reviews.llvm.org/D35685
llvm-svn: 311082
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Support the case where an operand of a pattern is also the whole of the
result pattern. In this case the original result and all its uses must be
replaced by the operand. However, register class restrictions can require
a COPY. This patch handles both cases by always emitting the copy and
leaving it for the register allocator to optimize.
The previous commit failed on Windows machines due to a flaw in the sort
predicate which allowed both A < B < C and B == C to be satisfied
simultaneously. The cause of this was some sloppiness in the priority order of
G_CONSTANT instructions compared to other instructions. These had equal priority
because it makes no difference, however there were operands had higher priority
than G_CONSTANT but lower priority than any other instruction. As a result, a
priority order between G_CONSTANT and other instructions must be enforced to
ensure the predicate defines a strict weak order.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36084
llvm-svn: 311076
|
| |
|
|
|
|
| |
TII needs to be wrapped with #ifndef NDEBUG to silece compiler warnings.
llvm-svn: 311075
|
| |
|
|
|
|
|
| |
SystemZHazardRecognizer::TII is only used for debug output, so it needs
also to be wrapped with #ifndef NDEBUG.
llvm-svn: 311074
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The idea of this patch is to continue the scheduler state over an MBB boundary
in the case where the successor block has only one predecessor. This means
that the scheduler will continue in the successor block (after emitting any
branch instructions) with e.g. maintained processor resource counters.
Benchmarks have been confirmed to benefit from this.
The algorithm in MachineScheduler.cpp that extracts scheduling regions of an
MBB has been extended so that the strategy may optionally reverse the order
of processing the regions themselves. This is controlled by a new method
doMBBSchedRegionsTopDown(), which defaults to false.
Handling the top-most region of an MBB first also means that a top-down
scheduler can continue the scheduler state across any scheduling boundary
between to regions inside MBB.
Review: Ulrich Weigand, Matthias Braun, Andy Trick.
https://reviews.llvm.org/D35053
llvm-svn: 311072
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
When lowering a VLA, we emit a __chstk call. However, this call can
internally clobber CPSR. We did not mark this register as an ImpDef,
which could potentially allow a comparison to be hoisted above the call
to `__chkstk`. In such a case, the CPSR could be clobbered, and the
check invalidated. When the support was initially added, it seemed that
the call would take care of preventing CPSR from being clobbered, but
this is not the case. Mark the register as clobbered to fix a possible
state corruption.
llvm-svn: 311061
|
| |
|
|
|
|
| |
swapped them.
llvm-svn: 311060
|
| |
|
|
|
|
|
|
| |
We used to have a separate multiclass for AVX2 and SSE/AVX. Now we have one multiclass and pass the relevant differences.
We were also missing load patterns, though we had them for the AVX-512 version.
llvm-svn: 311059
|
| |
|
|
| |
llvm-svn: 311058
|
| |
|
|
| |
llvm-svn: 311055
|
| |
|
|
|
|
| |
array. NFC
llvm-svn: 311054
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Mark LoopDataPrefetch and AArch64FalkorHWPFFix passes as preserving
ScalarEvolution since they do not alter loop structure and should not
alter any SCEV values (though LoopDataPrefetch may introduce new
instructions that won't have cached SCEV values yet).
This can result in slight code differences, mainly w.r.t. nsw/nuw flags
on SCEVs, since these are computed somewhat lazily when a zext/sext
instruction is encountered. As a result, passes after the modified
passes may see SCEVs with more nsw/nuw flags present.
Reviewers: sanjoy, anemet
Subscribers: aemerson, rengolin, mzolotukhin, javed.absar, kristof.beyls, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D36716
llvm-svn: 311032
|
| |
|
|
| |
llvm-svn: 311019
|
| |
|
|
| |
llvm-svn: 311017
|
| |
|
|
|
|
|
|
|
|
|
|
| |
v_div_fixup_f16
This change implements features postponed in https://reviews.llvm.org/D35424 because of a dependency on https://reviews.llvm.org/D36322
Reviewers: SamWot, artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D36694
llvm-svn: 311011
|
| |
|
|
|
|
|
|
|
|
| |
See Bug 34152: https://bugs.llvm.org//show_bug.cgi?id=34152
Reviewers: SamWot, artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D36674
llvm-svn: 311006
|
| |
|
|
|
|
| |
VPPERM/VPERMIL2PD/VPERMIL2PS all provide more effective 2-input shuffles than regular AVX instructions
llvm-svn: 311005
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
.sdata, .sbss
If a variable has an explicit section such as .sdata or .sbss, it is placed
in that section and accessed in a gp relative manner. This overrides the global
-G setting.
Otherwise if a variable has a explicit section attached to it, such as '.rodata'
or '.mysection', it is not placed in the small data section. This also overrides
the global -G setting.
Reviewers: atanasyan, nitesh.jain
Differential Revision: https://reviews.llvm.org/D36616
llvm-svn: 311001
|
| |
|
|
|
|
|
|
|
|
|
| |
- Set the default runtime unroll count to 4 and use the newly added
UnrollRemainder option.
- Create loop cost and force unroll for a cost less than 12.
- Disable unrolling on Thumb1 only targets.
Differential Revision: https://reviews.llvm.org/D36134
llvm-svn: 310997
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D36585
llvm-svn: 310987
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r310425, thus reapplying r310335 with a fix for link
issue of the AArch64 unittests on Linux bots when BUILD_SHARED_LIBS is ON.
Original commit message:
[GlobalISel] Remove the GISelAccessor API.
Its sole purpose was to avoid spreading around ifdefs related to
building global-isel. Since r309990, GlobalISel is not optional anymore,
thus, we can get rid of this mechanism all together.
NFC.
----
The fix for the link issue consists in adding the GlobalISel library in
the list of dependencies for the AArch64 unittests. This dependency
comes from the use of AArch64Subtarget that needs to know how
to destruct the GISel related APIs when being detroyed.
Thanks to Bill Seurer and Ahmed Bougacha for helping me reproducing and
understand the problem.
llvm-svn: 310969
|
| |
|
|
|
|
|
|
|
|
| |
As expected, this failed on the windows bots but the instrumentation showed
something interesting. The ADD8ri and INC8r rules are never directly compared
on the windows machines. That implies that the issue lies in transitivity of
the Compare predicate. I believe I've already verified that but maybe I missed
something.
llvm-svn: 310922
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zero-instruction emission.
Summary:
Support the case where an operand of a pattern is also the whole of the
result pattern. In this case the original result and all its uses must be
replaced by the operand. However, register class restrictions can require
a COPY. This patch handles both cases by always emitting the copy and
leaving it for the register allocator to optimize.
The previous commit failed on the windows bots and this one is likely to fail
on those same bots. However, the added instrumentation should reveal a particular
isHigherPriorityThan() evaluation which I'm expecting to expose that
these machines are weighing priority of two rules differently from the
non-windows machines.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36084
llvm-svn: 310919
|
| |
|
|
|
|
|
|
|
| |
With the addition of RISCVInstPrinter, it is now possible to test the basic
operation of the RISCV MC layer.
Differential Revision: https://reviews.llvm.org/D23564
llvm-svn: 310917
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is modeled on the implementation for x86 which stores the command line
option in a 'StackAlignOverride' field in MipsSubtarget and then uses this
to compute a 'stackAlignment' value in
MipsSubtarget::initializeSubtargetDependencies.
The stackAlignment() method in MipsSubTarget is renamed to getStackAlignment()
and returns the computed 'stackAlignment'.
Reviewers: sdardis
Reviewed By: sdardis
Subscribers: llvm-commits, arichardson
Differential Revision: https://reviews.llvm.org/D35874
llvm-svn: 310891
|
| |
|
|
| |
llvm-svn: 310876
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add codegen for VSX word extract conversion from signed/unsigned to single/double
precision.
For UINT_TO_FP:
Extract word unsigned and convert to float was implemented in https://reviews.llvm.org/D20239.
Here we will add the missing extract integer and conversion to double. This
utilizes the new P9 instruction xxextractuw to extracting an integer element
when the result will be converted to double thereby saving 2 direct moves
(VSR <-> GPR).
For SINT_TO_FP:
We will implement the following sequence which will also reduce the number of
instructions by saving 2 direct moves.
v4i32->f32:
xxspltw
xvcvsxwsp
xscvspdpn
v4i32->f64:
xxspltw
xvcvsxwdp
Differential Revision: https://reviews.llvm.org/D35859
llvm-svn: 310866
|
| |
|
|
|
|
|
| |
This reverts r310834. It didn't pacify the buildbot, FileCheck is still
crashing.
llvm-svn: 310854
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ref the post-commit thread for r310770:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170807/478507.html
The motivating cases as 'C' source examples can look like this:
unsigned char rotate_right_8(unsigned char v, int shift) {
// shift &= 7;
v = ( v >> shift ) | ( v << ( 8 - shift ) );
return v;
}
https://godbolt.org/g/K6rc1A
Notice that the source doesn't contain UB-safe masked shift amounts, but instcombine created those
in order to produce narrow rotate patterns. This should be the last step needed to resolve PR34046:
https://bugs.llvm.org/show_bug.cgi?id=34046
Differential Revision: https://reviews.llvm.org/D36644
llvm-svn: 310849
|
| |
|
|
|
|
|
|
| |
According to the X86ISelLowering.h, UMUL results are low, high, and flags. But this place was treating result 1 or 2 as flags.
Differential Revision: https://reviews.llvm.org/D36654
llvm-svn: 310846
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The flag result is an i32 type. But its only really used for connectivity. I don't think anything even assumes a particular format. We don't ever do any real operations on it. So known bits don't help us optimize anything.
My main motivation is that the UMUL behavior is actually wrong. I was going to fix this in D36654, but then realized there was just no reason for it to be here.
Reviewers: RKSimon, zvi, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36657
llvm-svn: 310845
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AVX512_maskable_common multiclass
Summary: This looks to have been disconnected about 3 years ago in r219358.
Reviewers: gadi.haber, RKSimon, zvi
Reviewed By: gadi.haber
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36658
llvm-svn: 310844
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
isel load/store code.
Summary:
I don't think we need this code anymore. It only existed because i1 used to be legal.
There's probably more unneeded code in fast isel still.
Reviewers: guyblank, zvi
Reviewed By: guyblank
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36652
llvm-svn: 310843
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adjusts the tests to hopfully pacify the llvm-clang-x86_64-expensive-checks-win
buildbot.
Unlike many other instructions, these instructions have aliases which
take coprocessor registers, gpr register, accumulator (and dsp accumulator)
registers, floating point registers, floating point control registers and
coprocessor 2 data and control operands.
For the moment, these aliases are treated as pseudo instructions which are
expanded into the underlying instruction. As a result, disassembling these
instructions shows the underlying instruction and not the alias.
Reviewers: slthakur, atanasyan
Differential Revision: https://reviews.llvm.org/D35253
llvm-svn: 310834
|
| |
|
|
|
|
|
|
|
|
|
|
| |
An unused function warning was raised in
https://bugs.llvm.org/show_bug.cgi?id=34178.
The offending function, in AArch64MCCodeEmitter.cpp, was committed by
me last week.
Differential Revision: https://reviews.llvm.org/D36665
llvm-svn: 310823
|
| |
|
|
| |
llvm-svn: 310813
|
| |
|
|
| |
llvm-svn: 310812
|
| |
|
|
| |
llvm-svn: 310811
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
introduce a miscompile bug.
There appears to be a bug where the generated code to extract the sign
bit doesn't work correctly for 32-bit inputs. I've replied to the
original commit pointing out the problem. I think I see by inspection
(and reading the manual for PPC) how to fix this, but I can't be 100%
confident and I also don't know what the best way to test this is.
Currently it seems nearly impossible to get the backend to hit this code
path, but the patch autohr is likely in a better position to craft such
test cases than I am, and based on where the bug is it should be easily
done.
Original commit message for r310346:
"""
[PowerPC] Eliminate compares - add i32 sext/zext handling for SETLE/SETGE
Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it
adds the handling for the special case where RHS == 0.
Differential Revision: https://reviews.llvm.org/D34048
"""
llvm-svn: 310809
|
| |
|
|
|
|
| |
The comment about why we couldn't use avx512_maskable appears to have been incorrect.
llvm-svn: 310808
|
| |
|
|
|
|
|
|
|
| |
Modernise the code with range-loops etc
Reviewed by: @fhahn, @rovka
Differential Revision: https://reviews.llvm.org/D36502
llvm-svn: 310807
|
| |
|
|
| |
llvm-svn: 310801
|
| |
|
|
| |
llvm-svn: 310799
|
| |
|
|
|
|
|
|
|
|
|
|
| |
environments
This allows using semicolons for bundling up more than one
statement per line. This is used within the mingw-w64 project in some
assembly files that contain code for multiple architectures.
Differential Revision: https://reviews.llvm.org/D36366
llvm-svn: 310797
|
| |
|
|
|
|
|
|
|
|
|
|
| |
answers for extracting 128-bits from a 512-bit vector and for mask registers.
Previously it would not return true for extracting either of the upper quarters of a 512-bit registers.
For mask registers we support extracting anything from index 0. And otherwise we only support extracting the upper half of a register.
Differential Revision: https://reviews.llvm.org/D36638
llvm-svn: 310794
|