| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
| |
When lowering a VLA, we emit a __chstk call. However, this call can
internally clobber CPSR. We did not mark this register as an ImpDef,
which could potentially allow a comparison to be hoisted above the call
to `__chkstk`. In such a case, the CPSR could be clobbered, and the
check invalidated. When the support was initially added, it seemed that
the call would take care of preventing CPSR from being clobbered, but
this is not the case. Mark the register as clobbered to fix a possible
state corruption.
llvm-svn: 311061
|
| |
|
|
|
|
| |
swapped them.
llvm-svn: 311060
|
| |
|
|
|
|
|
|
| |
We used to have a separate multiclass for AVX2 and SSE/AVX. Now we have one multiclass and pass the relevant differences.
We were also missing load patterns, though we had them for the AVX-512 version.
llvm-svn: 311059
|
| |
|
|
| |
llvm-svn: 311058
|
| |
|
|
| |
llvm-svn: 311055
|
| |
|
|
|
|
| |
array. NFC
llvm-svn: 311054
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Mark LoopDataPrefetch and AArch64FalkorHWPFFix passes as preserving
ScalarEvolution since they do not alter loop structure and should not
alter any SCEV values (though LoopDataPrefetch may introduce new
instructions that won't have cached SCEV values yet).
This can result in slight code differences, mainly w.r.t. nsw/nuw flags
on SCEVs, since these are computed somewhat lazily when a zext/sext
instruction is encountered. As a result, passes after the modified
passes may see SCEVs with more nsw/nuw flags present.
Reviewers: sanjoy, anemet
Subscribers: aemerson, rengolin, mzolotukhin, javed.absar, kristof.beyls, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D36716
llvm-svn: 311032
|
| |
|
|
| |
llvm-svn: 311019
|
| |
|
|
| |
llvm-svn: 311017
|
| |
|
|
|
|
|
|
|
|
|
|
| |
v_div_fixup_f16
This change implements features postponed in https://reviews.llvm.org/D35424 because of a dependency on https://reviews.llvm.org/D36322
Reviewers: SamWot, artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D36694
llvm-svn: 311011
|
| |
|
|
|
|
|
|
|
|
| |
See Bug 34152: https://bugs.llvm.org//show_bug.cgi?id=34152
Reviewers: SamWot, artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D36674
llvm-svn: 311006
|
| |
|
|
|
|
| |
VPPERM/VPERMIL2PD/VPERMIL2PS all provide more effective 2-input shuffles than regular AVX instructions
llvm-svn: 311005
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
.sdata, .sbss
If a variable has an explicit section such as .sdata or .sbss, it is placed
in that section and accessed in a gp relative manner. This overrides the global
-G setting.
Otherwise if a variable has a explicit section attached to it, such as '.rodata'
or '.mysection', it is not placed in the small data section. This also overrides
the global -G setting.
Reviewers: atanasyan, nitesh.jain
Differential Revision: https://reviews.llvm.org/D36616
llvm-svn: 311001
|
| |
|
|
|
|
|
|
|
|
|
| |
- Set the default runtime unroll count to 4 and use the newly added
UnrollRemainder option.
- Create loop cost and force unroll for a cost less than 12.
- Disable unrolling on Thumb1 only targets.
Differential Revision: https://reviews.llvm.org/D36134
llvm-svn: 310997
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D36585
llvm-svn: 310987
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r310425, thus reapplying r310335 with a fix for link
issue of the AArch64 unittests on Linux bots when BUILD_SHARED_LIBS is ON.
Original commit message:
[GlobalISel] Remove the GISelAccessor API.
Its sole purpose was to avoid spreading around ifdefs related to
building global-isel. Since r309990, GlobalISel is not optional anymore,
thus, we can get rid of this mechanism all together.
NFC.
----
The fix for the link issue consists in adding the GlobalISel library in
the list of dependencies for the AArch64 unittests. This dependency
comes from the use of AArch64Subtarget that needs to know how
to destruct the GISel related APIs when being detroyed.
Thanks to Bill Seurer and Ahmed Bougacha for helping me reproducing and
understand the problem.
llvm-svn: 310969
|
| |
|
|
|
|
|
|
|
|
| |
As expected, this failed on the windows bots but the instrumentation showed
something interesting. The ADD8ri and INC8r rules are never directly compared
on the windows machines. That implies that the issue lies in transitivity of
the Compare predicate. I believe I've already verified that but maybe I missed
something.
llvm-svn: 310922
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zero-instruction emission.
Summary:
Support the case where an operand of a pattern is also the whole of the
result pattern. In this case the original result and all its uses must be
replaced by the operand. However, register class restrictions can require
a COPY. This patch handles both cases by always emitting the copy and
leaving it for the register allocator to optimize.
The previous commit failed on the windows bots and this one is likely to fail
on those same bots. However, the added instrumentation should reveal a particular
isHigherPriorityThan() evaluation which I'm expecting to expose that
these machines are weighing priority of two rules differently from the
non-windows machines.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36084
llvm-svn: 310919
|
| |
|
|
|
|
|
|
|
| |
With the addition of RISCVInstPrinter, it is now possible to test the basic
operation of the RISCV MC layer.
Differential Revision: https://reviews.llvm.org/D23564
llvm-svn: 310917
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is modeled on the implementation for x86 which stores the command line
option in a 'StackAlignOverride' field in MipsSubtarget and then uses this
to compute a 'stackAlignment' value in
MipsSubtarget::initializeSubtargetDependencies.
The stackAlignment() method in MipsSubTarget is renamed to getStackAlignment()
and returns the computed 'stackAlignment'.
Reviewers: sdardis
Reviewed By: sdardis
Subscribers: llvm-commits, arichardson
Differential Revision: https://reviews.llvm.org/D35874
llvm-svn: 310891
|
| |
|
|
| |
llvm-svn: 310876
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add codegen for VSX word extract conversion from signed/unsigned to single/double
precision.
For UINT_TO_FP:
Extract word unsigned and convert to float was implemented in https://reviews.llvm.org/D20239.
Here we will add the missing extract integer and conversion to double. This
utilizes the new P9 instruction xxextractuw to extracting an integer element
when the result will be converted to double thereby saving 2 direct moves
(VSR <-> GPR).
For SINT_TO_FP:
We will implement the following sequence which will also reduce the number of
instructions by saving 2 direct moves.
v4i32->f32:
xxspltw
xvcvsxwsp
xscvspdpn
v4i32->f64:
xxspltw
xvcvsxwdp
Differential Revision: https://reviews.llvm.org/D35859
llvm-svn: 310866
|
| |
|
|
|
|
|
| |
This reverts r310834. It didn't pacify the buildbot, FileCheck is still
crashing.
llvm-svn: 310854
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ref the post-commit thread for r310770:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170807/478507.html
The motivating cases as 'C' source examples can look like this:
unsigned char rotate_right_8(unsigned char v, int shift) {
// shift &= 7;
v = ( v >> shift ) | ( v << ( 8 - shift ) );
return v;
}
https://godbolt.org/g/K6rc1A
Notice that the source doesn't contain UB-safe masked shift amounts, but instcombine created those
in order to produce narrow rotate patterns. This should be the last step needed to resolve PR34046:
https://bugs.llvm.org/show_bug.cgi?id=34046
Differential Revision: https://reviews.llvm.org/D36644
llvm-svn: 310849
|
| |
|
|
|
|
|
|
| |
According to the X86ISelLowering.h, UMUL results are low, high, and flags. But this place was treating result 1 or 2 as flags.
Differential Revision: https://reviews.llvm.org/D36654
llvm-svn: 310846
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The flag result is an i32 type. But its only really used for connectivity. I don't think anything even assumes a particular format. We don't ever do any real operations on it. So known bits don't help us optimize anything.
My main motivation is that the UMUL behavior is actually wrong. I was going to fix this in D36654, but then realized there was just no reason for it to be here.
Reviewers: RKSimon, zvi, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36657
llvm-svn: 310845
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AVX512_maskable_common multiclass
Summary: This looks to have been disconnected about 3 years ago in r219358.
Reviewers: gadi.haber, RKSimon, zvi
Reviewed By: gadi.haber
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36658
llvm-svn: 310844
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
isel load/store code.
Summary:
I don't think we need this code anymore. It only existed because i1 used to be legal.
There's probably more unneeded code in fast isel still.
Reviewers: guyblank, zvi
Reviewed By: guyblank
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36652
llvm-svn: 310843
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adjusts the tests to hopfully pacify the llvm-clang-x86_64-expensive-checks-win
buildbot.
Unlike many other instructions, these instructions have aliases which
take coprocessor registers, gpr register, accumulator (and dsp accumulator)
registers, floating point registers, floating point control registers and
coprocessor 2 data and control operands.
For the moment, these aliases are treated as pseudo instructions which are
expanded into the underlying instruction. As a result, disassembling these
instructions shows the underlying instruction and not the alias.
Reviewers: slthakur, atanasyan
Differential Revision: https://reviews.llvm.org/D35253
llvm-svn: 310834
|
| |
|
|
|
|
|
|
|
|
|
|
| |
An unused function warning was raised in
https://bugs.llvm.org/show_bug.cgi?id=34178.
The offending function, in AArch64MCCodeEmitter.cpp, was committed by
me last week.
Differential Revision: https://reviews.llvm.org/D36665
llvm-svn: 310823
|
| |
|
|
| |
llvm-svn: 310813
|
| |
|
|
| |
llvm-svn: 310812
|
| |
|
|
| |
llvm-svn: 310811
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
introduce a miscompile bug.
There appears to be a bug where the generated code to extract the sign
bit doesn't work correctly for 32-bit inputs. I've replied to the
original commit pointing out the problem. I think I see by inspection
(and reading the manual for PPC) how to fix this, but I can't be 100%
confident and I also don't know what the best way to test this is.
Currently it seems nearly impossible to get the backend to hit this code
path, but the patch autohr is likely in a better position to craft such
test cases than I am, and based on where the bug is it should be easily
done.
Original commit message for r310346:
"""
[PowerPC] Eliminate compares - add i32 sext/zext handling for SETLE/SETGE
Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it
adds the handling for the special case where RHS == 0.
Differential Revision: https://reviews.llvm.org/D34048
"""
llvm-svn: 310809
|
| |
|
|
|
|
| |
The comment about why we couldn't use avx512_maskable appears to have been incorrect.
llvm-svn: 310808
|
| |
|
|
|
|
|
|
|
| |
Modernise the code with range-loops etc
Reviewed by: @fhahn, @rovka
Differential Revision: https://reviews.llvm.org/D36502
llvm-svn: 310807
|
| |
|
|
| |
llvm-svn: 310801
|
| |
|
|
| |
llvm-svn: 310799
|
| |
|
|
|
|
|
|
|
|
|
|
| |
environments
This allows using semicolons for bundling up more than one
statement per line. This is used within the mingw-w64 project in some
assembly files that contain code for multiple architectures.
Differential Revision: https://reviews.llvm.org/D36366
llvm-svn: 310797
|
| |
|
|
|
|
|
|
|
|
|
|
| |
answers for extracting 128-bits from a 512-bit vector and for mask registers.
Previously it would not return true for extracting either of the upper quarters of a 512-bit registers.
For mask registers we support extracting anything from index 0. And otherwise we only support extracting the upper half of a register.
Differential Revision: https://reviews.llvm.org/D36638
llvm-svn: 310794
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Without the SrcVT its hard to know what is really being asked for. For example if your target has 128, 256, and 512 bit vectors. Maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not.
For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors.
Reviewers: RKSimon, zvi, efriedma
Reviewed By: RKSimon
Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D36649
llvm-svn: 310793
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
information
This is a continuation patch for commit r307529 which completely replaces the scheduling information for the SandyBridge architecture target by modifying the file X86SchedSandyBridge.td located under the X86 Target (see also https://reviews.llvm.org/D35019).
In this patch we added the scheduling information of additional SNB instructions that were missing from the patch commit r307529, fixed the scheduling of several resource groups that include only port0 instead of port05 (i.e., port0 OR port5) and fixed several incorrect instructions' scheduling in the r307529 commit.
The patch also includes the X87 instructions which were missing in previous patch commit r307529 as reported in bugzilla bug 34080.
Reviewers: zvi, RKSimon, chandlerc, igorb, m_zuckerman, craig.topper, aymanmus, dim
Differential Revision: https://reviews.llvm.org/D36388
llvm-svn: 310792
|
| |
|
|
|
|
|
|
|
| |
K0 isn't expected as a write-mask, so provide a detailed error here, instead of the more generic one (invalid op for insn)
Conforms with gas
Differential Revision: https://reviews.llvm.org/D36570
llvm-svn: 310789
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Add an X86 combine for TESTM when one of the operands is a BUILD_VECTOR(0,0,...).
TESTM op0, BUILD_VECTOR(0,0,...) -> BUILD_VECTOR(0,0,...)
TESTM BUILD_VECTOR(0,0,...), op1 -> BUILD_VECTOR(0,0,...)
Differential Revision:
https://reviews.llvm.org/D36536
llvm-svn: 310787
|
| |
|
|
|
|
| |
The combines here shouldn't be done for mask vectors, but it wasn't clear anything was preventing that.
llvm-svn: 310786
|
| |
|
|
| |
llvm-svn: 310785
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
correct type so we don't crash if we use a memory instruction
Summary:
Previously we were creating the flag result with MVT::Other which is interpretted as a Chain node. If we used a memory form of the instruction we would end up with a copyToReg that consumed the chain result of the adcx instruction instead of the flag result.
Pretty sure we should be using MVT::i32 here, that's what we do other places we create these node types.
We should probably consider this for 5.0 as well.
Reviewers: RKSimon, zvi, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36645
llvm-svn: 310784
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
isThumb returns true for Thumb triples (little and big endian), isARM
returns true for ARM triples (little and big endian).
There are a few more checks using arm/thumb that are not covered by
those functions, e.g. that the architecture is either ARM or Thumb
(little endian) or ARM/Thumb little endian only.
Reviewers: javed.absar, rengolin, kristof.beyls, t.p.northover
Reviewed By: rengolin
Subscribers: llvm-commits, aemerson
Differential Revision: https://reviews.llvm.org/D34682
llvm-svn: 310781
|
| |
|
|
|
|
|
|
|
|
|
| |
`external_weak` global
An `external_weak` global may be intended to resolve as a null pointer if it's
not defined, so it doesn't make sense to use a copy relocation for it.
Differential Revision: https://reviews.llvm.org/D36604
llvm-svn: 310773
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The stack alignment depends on the ABI (16 bytes for N32 and N64 and 8
bytes for O32), not the CPU type.
Reviewers: sdardis
Reviewed By: sdardis
Subscribers: atanasyan, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D36326
llvm-svn: 310768
|