| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
| |
llvm-svn: 319240
|
| |
|
|
| |
llvm-svn: 319239
|
| |
|
|
|
|
|
|
| |
legal when zero extending from vXi8/vXi16.
The UINT_TO_FP is immediately converted to SINT_TO_FP when the node is re-evaluated because we'll detect that the sign bit is zero.
llvm-svn: 319234
|
| |
|
|
|
|
| |
We have a DAG combine that uses a zero extend that should prevent this from ever occurring now.
llvm-svn: 319233
|
| |
| |
ordering
Summary:
This fixes failures in the following tests uncovered by D39245:
LLVM :: CodeGen/Hexagon/args.ll
LLVM :: CodeGen/Hexagon/constp-extract.ll
LLVM :: CodeGen/Hexagon/expand-condsets-basic.ll
LLVM :: CodeGen/Hexagon/gp-rel.ll
LLVM :: CodeGen/Hexagon/packetize_cond_inst.ll
LLVM :: CodeGen/Hexagon/simple_addend.ll
LLVM :: CodeGen/Hexagon/swp-stages4.ll
LLVM :: CodeGen/Hexagon/swp-vmult.ll
LLVM :: CodeGen/Hexagon/swp-vsum.ll
LLVM :: MC/Hexagon/align.s
LLVM :: MC/Hexagon/asmMap.s
LLVM :: MC/Hexagon/dis-duplex-p0.s
LLVM :: MC/Hexagon/double-vector-producer.s
LLVM :: MC/Hexagon/inst_select.ll
LLVM :: MC/Hexagon/instructions/j.s
Reviewers: colinl, kparzysz, adasgupt, slarin
Reviewed By: kparzysz
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D40227
llvm-svn: 319223
|
| |
|
|
|
|
|
|
| |
Allow fastcc callees to be tail-called from ccc callers.
Differential Revision: https://reviews.llvm.org/D40355
llvm-svn: 319218
|
| |
| |
them legal
The IRTranslator cannot generate these instructions at the moment so there's no
issue with not having implemented ISel for them yet. D40092 will add
G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMICRMW_* to the IRTranslator and a
further patch will add support for lowering G_ATOMIC_CMPXCHG_WITH_SUCCESS into
G_ATOMIC_CMPXCHG with an external success check via the `Lower` action.
G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMIC_CMPXCHG are kept separate so that
SelectionDAG rules can be imported while still supporting targets that prefer to
custom lower the original LLVM-IR-like operation.
llvm-svn: 319216
|
| |
|
|
|
|
|
|
| |
Update multi-classes to take the scheduling OpndItins instead of hard coding it.
Will be reused in the AVX512 equivalents.
llvm-svn: 319209
|
| |
|
|
|
|
|
|
|
|
| |
i8 or i16 and need to zero extend it, make sure we use a vXi32 type of the full vector width.
Previously, this was hardcoded to v4i32, but if the input type is 256 bits we need to use v8i32.
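The type selection described here amounts to dividing the input vector's width by the 32-bit element size. A minimal sketch of that arithmetic, using a hypothetical helper `zextVectorTypeName` (an illustration only, not LLVM's MVT API):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of LLVM: returns the name of the vXi32 type
// that spans the whole input vector, e.g. "v4i32" for a 128-bit input.
std::string zextVectorTypeName(unsigned inputWidthBits) {
  // The width must be a non-zero multiple of the 32-bit element size.
  assert(inputWidthBits != 0 && inputWidthBits % 32 == 0);
  unsigned numElts = inputWidthBits / 32;
  return "v" + std::to_string(numElts) + "i32";
}
```

With this sketch, a 128-bit input maps to v4i32 and a 256-bit input maps to v8i32, which is the case the hardcoded type missed.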
Fixes PR35443
llvm-svn: 319208
|
| |
|
|
| |
llvm-svn: 319204
|
| |
|
|
|
|
| |
We don't need scheduling info for pseudos
llvm-svn: 319197
|
| |
|
|
|
|
|
|
| |
This was requested by tools.
Differential Revision: https://reviews.llvm.org/D40321
llvm-svn: 319192
|
| |
|
|
|
|
|
|
|
|
|
| |
As part of the unification of the debug format and the MIR format,
always print registers as lowercase.
* Only debug printing is affected. It now follows MIR.
Differential Revision: https://reviews.llvm.org/D40417
llvm-svn: 319187
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Generalize FixFunctionBitcasts to handle varargs functions. This in
particular fixes the case where clang bitcasts away a varargs when
calling a K&R-style function.
This avoids interacting with tricky ABI details because it operates
at the LLVM IR level before varargs ABI details are exposed.
This fixes PR35385.
llvm-svn: 319186
|
| |
|
|
|
|
| |
Looking through Agner, FTST is very similar to generic float compare behaviour, so I've added them to the existing IIC_FCOMI (WriteFAdd) tags.
llvm-svn: 319184
|
| |
|
|
|
|
|
| |
Atom's FABS/FCHS/FSQRT latencies taken from Agner.
Note: I just added FSIN and FCOS to the existing IIC_FSINCOS itinerary, which is actually a more costly instruction.
llvm-svn: 319175
|
| |
| |
This is needed for cases when the memory access is not as big as the width of
the data type. For instance, storing i1 (1 bit) would be done in a byte (8
bits).
Using 'BitSize >> 3' (or '/ 8') would e.g. give the memory access of an i1 a
size of 0, which for instance makes alias analysis return NoAlias even when
it shouldn't.
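The difference between the two ways of computing the size can be sketched as follows. The function names are illustrative assumptions and are not taken from the patch:

```cpp
#include <cassert>
#include <cstdint>

// Truncating conversion: any type narrower than 8 bits ends up with size 0.
constexpr uint64_t truncatedByteSize(uint64_t bitSize) { return bitSize >> 3; }

// Rounding up: every started byte counts, so an i1 occupies one byte.
constexpr uint64_t roundedUpByteSize(uint64_t bitSize) {
  return (bitSize + 7) / 8;
}

static_assert(truncatedByteSize(1) == 0, "i1 would get a zero-sized access");
static_assert(roundedUpByteSize(1) == 1, "i1 is stored in one byte");
```

For bit sizes that are multiples of 8 both functions agree; they differ only for types such as i1 or i17 whose width is not a whole number of bytes.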
There are no tests as this was done as a follow-up to the bugfix for the case
where this was discovered (r318824). This handles more similar cases.
Review: Björn Petterson
https://reviews.llvm.org/D40339
llvm-svn: 319173
|
| |
|
|
|
|
|
|
|
|
|
| |
LLVM Coding Standards:
Function names should be verb phrases (as they represent actions), and
command-like function should be imperative. The name should be camel
case, and start with a lower case letter (e.g. openFile() or isFoo()).
Differential Revision: https://reviews.llvm.org/D40416
llvm-svn: 319168
|
| |
|
|
|
|
| |
femms/prefetch/prefetchw
llvm-svn: 319167
|
| |
| |
Summary:
The entire algorithm operates per basic-block, so for cache locality
it should be better to re-optimize a basic-block immediately rather than
in a separate loop.
I don't have performance measurements.
Change-Id: I85106570bd623c4ff277faaa50ee43258e1ddcc5
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D40344
llvm-svn: 319156
|
| |
| |
Summary:
The PeepholeOptimizer pass calls this function solely based on checking
DefMI->isMoveImmediate(), which only checks the MoveImm bit of the
instruction description. So it's up to FoldImmediate itself to properly
check that DefMI *actually* moves from an immediate.
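The distinction between the opcode-level flag and the actual operand kind can be illustrated with a small model. All types and names below (`OperandModel`, `InstrModel`, `tryGetFoldableImm`) are hypothetical stand-ins for this sketch, not LLVM classes:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified operand type for illustration only.
struct OperandModel {
  bool isImm;
  int64_t value;
  int64_t getImm() const {
    // Models an accessor that asserts when the operand is not an immediate.
    assert(isImm && "not an immediate operand");
    return value;
  }
};

// Hypothetical instruction model: the flag is a static property of the
// opcode, while the source operand is what this instruction really holds.
struct InstrModel {
  bool moveImmBit;
  OperandModel src;
  bool isMoveImmediate() const { return moveImmBit; }
};

// Writes the immediate to Out and returns true only if Def really moves an
// immediate; the flag alone is not treated as sufficient.
bool tryGetFoldableImm(const InstrModel &Def, int64_t &Out) {
  if (!Def.isMoveImmediate())
    return false;
  if (!Def.src.isImm) // flag is set, but the operand is of another kind
    return false;
  Out = Def.src.getImm();
  return true;
}
```

In this model, skipping the second check would reach `getImm()` on a non-immediate operand and trip the assertion.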
I don't have a separate test case for this, but the next patch introduces
a test case which happens to crash without this change.
This error is caught by the assertion in MachineOperand::getImm().
Change-Id: I88e7cdbcf54d75e1a296822e6fe5f9a5f095bbf8
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D40342
llvm-svn: 319155
|
| |
|
|
|
|
|
|
|
| |
Fast-isel routines need to bail out in the case that fast-isel
fails on the operands.
This fixes https://bugs.llvm.org/show_bug.cgi?id=35064
llvm-svn: 319144
|
| |
|
|
| |
llvm-svn: 319143
|
| |
|
|
|
|
| |
under AVX512.
llvm-svn: 319136
|
| |
|
|
|
|
|
|
|
|
| |
https://llvm.org/PR32578
I simplified and converted the reproducer into a lit test.
Patch by Vedant Kumar!
llvm-svn: 319130
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This adds code to protect WebAssembly's `trunc_s` family of opcodes
from values outside their domain. Even though such conversions have
full undefined behavior in C/C++, LLVM IR's `fptosi` and `fptoui` do
not, and only return undef.
This also implements the proposed non-trapping float-to-int conversion
feature and uses that instead when available.
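The two strategies can be sketched in plain C++ for the f64 to i32 case. This is an illustration under assumptions, not the code the backend emits: the function names are hypothetical, and the fallback value in the guarded version is an arbitrary choice, since the IR result is undefined for out-of-range inputs.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Guarded conversion: only perform the (potentially trapping) truncation
// when the input is inside the i32 domain.
int32_t guardedTruncS32(double x) {
  // fabs(x) < 2^31 is false for NaN, infinities and out-of-range values.
  if (std::fabs(x) < 2147483648.0)
    return static_cast<int32_t>(x);
  // Arbitrary fallback chosen for this sketch.
  return std::numeric_limits<int32_t>::min();
}

// Saturating conversion in the style of the non-trapping proposal:
// NaN becomes 0 and out-of-range values clamp to the nearest limit.
int32_t saturatingTruncS32(double x) {
  if (std::isnan(x))
    return 0;
  if (x <= -2147483648.0)
    return std::numeric_limits<int32_t>::min();
  if (x >= 2147483648.0)
    return std::numeric_limits<int32_t>::max();
  return static_cast<int32_t>(x);
}
```

Both functions agree for in-range inputs; they differ only in what they return for NaN and for values outside the i32 range.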
llvm-svn: 319128
|
| |
|
|
|
|
|
|
| |
block. NFCI
These lines all exist identically either under SSE2, AVX2 or AVX512. Given that VLX implies all of those, these aren't providing anything new.
llvm-svn: 319124
|
| |
|
|
|
|
| |
These same calls exist a few lines down.
llvm-svn: 319122
|
| |
|
|
|
|
|
|
| |
looking at larger than 512-bit vectors.
Which VTs are considered simple is determined by the superset of the legal types of all targets in LLVM. If we're looking at VTs that are going to be split down to 512 bits, we should allow any VT, not just simple ones, since the simple list changes over time as new targets are added.
llvm-svn: 319110
|
| |
|
|
|
|
| |
We don't do this for narrow vectors under AVX or SSE features. We also don't set them to Expand like we do for many vector ops, and neither does TargetLoweringBase.cpp. This leads me to believe these default to Legal.
llvm-svn: 319103
|
| |
|
|
|
|
|
|
|
|
|
| |
on arg rather than result
This should fix PR31455:
https://bugs.llvm.org/show_bug.cgi?id=31455
Differential Revision: https://reviews.llvm.org/D28314
llvm-svn: 319094
|
| |
|
|
|
|
|
|
|
|
| |
This patch adds a peephole optimization to remove any redundant TOC save
instructions added as part of the call sequence for indirect calls. It removes
any TOC saves within a function that are dominated by another TOC save.
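The dominance criterion can be sketched on a simplified model of a function. The `FunctionModel` type and `countRedundantTocSaves` are hypothetical names for this illustration; the real pass works on machine instructions and LLVM's dominator tree.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: block i has immediate dominator idom[i] (the entry
// block has idom == -1) and contains numSaves[i] TOC save instructions.
struct FunctionModel {
  std::vector<int> idom;
  std::vector<int> numSaves;
};

// Counts the saves that are dominated by another save and could be removed.
int countRedundantTocSaves(const FunctionModel &F) {
  assert(F.idom.size() == F.numSaves.size());
  int redundant = 0;
  for (std::size_t b = 0; b < F.idom.size(); ++b) {
    if (F.numSaves[b] == 0)
      continue;
    // Walk up the dominator tree looking for a block that already saves.
    bool dominated = false;
    for (int d = F.idom[b]; d != -1; d = F.idom[d]) {
      if (F.numSaves[d] > 0) {
        dominated = true;
        break;
      }
    }
    // Within one block, the first save dominates all later ones.
    redundant += dominated ? F.numSaves[b] : F.numSaves[b] - 1;
  }
  return redundant;
}
```

In a diamond-shaped control flow graph, a save in one branch does not dominate a save in the join block, so neither is counted; a save in the entry block dominates every other save in the function.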
Differential Revision: https://reviews.llvm.org/D39736
llvm-svn: 319087
|
| |
|
|
|
|
| |
I don't believe our current lowering/combining would ever produce such a node. We only produce integer typed pshufds.
llvm-svn: 319068
|
| |
|
|
|
|
| |
I don't have a good test case for this at the moment. I was playing around with a change in legalizing and triggered this code to produce a PSHUFD with sse1 only.
llvm-svn: 319066
|
| |
|
|
|
|
|
|
| |
SSE_PACK/SSE_PMADD schedule classes
llvm-svn: 319065
|
| |
|
|
| |
llvm-svn: 319064
|
| |
|
|
|
|
|
|
|
|
| |
512 bits long when AVX512 is enabled.
Similar for vXi16/vXi8 with BWI.
Any vector larger than 512 bits will be split to 512 bits during legalization. Without this, however, we will fold sexts with them before that happens, which makes it difficult to recover and leads to scalarization.
llvm-svn: 319059
|
| |
|
|
|
|
| |
itineraries
llvm-svn: 319054
|
| |
|
|
|
|
|
|
|
| |
See bug 35433: https://bugs.llvm.org/show_bug.cgi?id=35433
Differential Revision: https://reviews.llvm.org/D40493
Reviewers: artem.tamazov, SamWot, arsenm
llvm-svn: 319050
|
| |
|
|
|
|
|
|
|
|
| |
This patch extends rL307174 to avoid using the Power9 vector extract with
variable index instructions when extracting word element 1. For such cases,
the existing selection of MFVSRWZ provides a better sequence.
Differential Revision: https://reviews.llvm.org/D38287
llvm-svn: 319049
|
| |
|
|
| |
llvm-svn: 319045
|
| |
| |
Summary:
Now that store-merge only generates type-safe stores, do a second
pass just before instruction selection to allow lowered intrinsics to
be merged as well.
Reviewers: jyknight, hfinkel, RKSimon, efriedma, rnk, jmolloy
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33675
llvm-svn: 319036
|
| |
|
|
| |
llvm-svn: 319031
|
| |
| |
Make the print format consistent with other assembler instructions.
Use a tab character instead of a space in the asm strings of the Ext and Ins
instructions.
Remove the space around the tab character for JALRC and replace the space with
a tab in JRC.
Patch by Milos Stojanovic.
Differential Revision: https://reviews.llvm.org/D38144
llvm-svn: 319030
|
| |
| |
AMDGPU backend errors with "unsupported call to function" upon
encountering a call to llvm.log{,10}.{f16,f32} intrinsics. This patch
adds custom lowering to avoid that error on both R600 and SI.
Reviewers: arsenm, jvesely
Subscribers: tstellar
Differential Revision: https://reviews.llvm.org/D29942
llvm-svn: 319025
|
| |
|
|
|
|
|
|
|
|
| |
As mentioned on PR17367, many instructions are missing scheduling tags, preventing us from setting 'CompleteModel = 1' for better instruction analysis. This patch deals with FMA/FMA4, which is one of the bigger offenders (along with AVX512 in general).
Annoyingly, all scheduler models need to define WriteFMA (now that it's actually used), even for older targets without FMA/FMA4 support, but that is an existing problem shared by other schedule classes.
Differential Revision: https://reviews.llvm.org/D40351
llvm-svn: 319016
|
| |
| |
The commit https://reviews.llvm.org/rL318143 incorrectly computes the offset to
restore LR from.
The number of tPOP operands is 2 (condition) + 2 (implicit def and use of SP) +
count of the popped registers. We need to load LR from just past the last
register, hence the correct offset should be either getNumOperands() - 4 or
getNumExplicitOperands() - 2 (multiplied by 4).
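The operand arithmetic can be checked with a small model. `PopModel` and `lrOffsetInBytes` are hypothetical stand-ins for this sketch; only the two accessor names mirror the ones mentioned in the message.

```cpp
#include <cassert>

// Hypothetical stand-in for a tPOP instruction; only operand counts matter.
struct PopModel {
  unsigned numPoppedRegs;
  // Two condition operands plus the popped registers.
  unsigned getNumExplicitOperands() const { return 2 + numPoppedRegs; }
  // Explicit operands plus the implicit def and use of SP.
  unsigned getNumOperands() const { return getNumExplicitOperands() + 2; }
};

// Byte offset just past the last popped register, where LR is loaded from.
unsigned lrOffsetInBytes(const PopModel &Pop) {
  unsigned regs = Pop.getNumOperands() - 4;
  // Both formulas from the message yield the same register count.
  assert(regs == Pop.getNumExplicitOperands() - 2);
  return regs * 4; // each popped register occupies 4 bytes
}
```

For a pop of three registers the model has 7 operands in total, 5 of them explicit, and the resulting offset is 12 bytes.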
Differential revision: https://reviews.llvm.org/D40305
llvm-svn: 319014
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D39846
llvm-svn: 319013
|
| |
|
|
|
|
| |
The check is actually unnecessary since AVX512VBMI implies AVX512BW which is the other part of the assert.
llvm-svn: 319006
|
| |
|
|
| |
llvm-svn: 319005
|