| Commit message | Author | Age | Files | Lines |
|
This prevents functions accessing varargs from being inlined if they
have the alwaysinline attribute.
Reviewers: efriedma, rnk, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D42556
llvm-svn: 323619
|
This patch moves the DJB hash to support. This is consistent with other
hashing algorithms living there. The hash is used by the DWARF
accelerator tables. We're doing this now because the hashing function is
needed by dsymutil and we don't want to link against libBinaryFormat.
Differential revision: https://reviews.llvm.org/D42594
llvm-svn: 323616
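For readers unfamiliar with it, the DJB hash is small enough to sketch in full. The version below is a standalone illustration, assuming the classic Bernstein recurrence H = H * 33 + C seeded with 5381. The name `djbHashSketch` is hypothetical, and this is not the code that was moved.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Standalone sketch of the DJB (Bernstein) hash; not LLVM's implementation.
// Assumption: seed 5381 and the recurrence H = H * 33 + C over unsigned bytes.
inline std::uint32_t djbHashSketch(const std::string &Buffer,
                                   std::uint32_t H = 5381) {
  for (unsigned char C : Buffer)
    H = (H << 5) + H + C; // (H << 5) + H is H * 33, wrapping modulo 2^32.
  return H;
}
```

An empty string hashes to the seed, and each extra byte folds in with one shift and two adds, which is why the function is cheap enough to share between libraries.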
|
checking the width of a ConstantSDNode before calling getConstantOperandVal.
llvm-svn: 323614
|
eq/ne with immallzeros.
llvm-svn: 323612
|
We can widen the mask and extract it back down.
llvm-svn: 323610
|
zero vector.
We can use the same input for both operands to get a free compare with zero.
We already use this trick in a couple places where we explicitly create PTESTM with the same input twice. This generalizes it.
I'm hoping to remove the ISD opcodes and move this to isel patterns like we do for scalar cmp/test.
llvm-svn: 323605
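The same-input trick can be modelled in scalar code. The sketch below assumes a TESTM-style operation that sets mask bit I when (A[I] & B[I]) is non-zero. `testMaskSketch` and `isNonZeroMask` are hypothetical helpers for illustration, not X86 backend APIs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<std::uint32_t, 4>;

// Software model of a per-lane test: mask bit I is set when (A[I] & B[I]) != 0.
inline std::uint8_t testMaskSketch(const Vec4 &A, const Vec4 &B) {
  std::uint8_t Mask = 0;
  for (std::size_t I = 0; I < A.size(); ++I)
    if ((A[I] & B[I]) != 0)
      Mask |= static_cast<std::uint8_t>(1u << I);
  return Mask;
}

// Since X & X == X, passing the same input twice compares each lane with zero
// without needing a separate zero vector.
inline std::uint8_t isNonZeroMask(const Vec4 &X) {
  return testMaskSketch(X, X);
}
```

For the input {0, 5, 0, 7}, lanes 1 and 3 are non-zero, so the mask has bits 1 and 3 set.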
|
pattern match the immediate value during isel.
Legalization is still biased to turn LT compares into GT by swapping operands to avoid needing extra isel patterns to commute.
I'm hoping to remove TESTM/TESTNM next and this should simplify that by making EQ/NE more similar.
llvm-svn: 323604
|
If broadcasting from another shuffle, attempt to simplify it.
We can probably generalize this a lot more (embedding in combineX86ShufflesRecursively), but BROADCAST is one of the more troublesome as it accepts inputs of different sizes to the result.
llvm-svn: 323602
|
dest alignments
Summary:
This change is step two in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments.
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use MemIntrinsicInst::[get|set]Alignment() to use
getDestAlignment() and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
llvm-svn: 323597
|
vXi1 vectors into logic ops.
This transform was already being done for setcc of scalar i1. This extends it to vectors.
llvm-svn: 323585
|
SETEQ/SETNE correctly for vector types.
The code was using getValueSizeInBits and combining with the result of a call to DAG.ComputeNumSignBits. But for vector types getValueSizeInBits returns the width of the full vector while ComputeNumSignBits is going to give a number no larger than the width of a single element. So we should be using getScalarValueSizeInBits to get the element width.
llvm-svn: 323583
|
for G_FCONSTANT.
We weren't converting the immediate ConstantFP during legalization, which caused
the wrong bit patterns to be emitted for half type FP constants.
Fixes PR36106.
llvm-svn: 323582
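To see why an unconverted constant gives the wrong encoding, compare the bit patterns of the same value in single and half precision. The sketch below is a deliberately limited converter: it assumes `float` is IEEE-754 binary32 and only handles values that are exactly representable as normal half values. `floatBits` and `floatToHalfBitsSketch` are hypothetical names, and this is not the legalizer's code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns the raw binary32 encoding of F (assumes a 32-bit IEEE-754 float).
inline std::uint32_t floatBits(float F) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return Bits;
}

// Limited sketch: converts a float that is exactly representable as a normal
// half (binary16) value into its 16-bit encoding. No rounding and no handling
// of zero, subnormals, infinities, or NaNs.
inline std::uint16_t floatToHalfBitsSketch(float F) {
  const std::uint32_t Bits = floatBits(F);
  const std::uint32_t Sign = (Bits >> 16) & 0x8000u;
  const std::int32_t Exp =
      static_cast<std::int32_t>((Bits >> 23) & 0xFFu) - 127 + 15; // rebias
  const std::uint32_t Mant = Bits & 0x7FFFFFu;
  assert(Exp > 0 && Exp < 31 && (Mant & 0x1FFFu) == 0 &&
         "sketch only supports exactly representable normal values");
  return static_cast<std::uint16_t>(
      Sign | (static_cast<std::uint32_t>(Exp) << 10) | (Mant >> 13));
}
```

The value 1.0 is 0x3F800000 as a float but 0x3C00 as a half, so emitting any part of the float encoding where a half is expected produces a different number.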
|
as shuffle."
This reverts commit r323530 to fix possible problems in user code.
llvm-svn: 323581
|
This reverts commit r323533 to fix possible problems in user code.
llvm-svn: 323580
|
This fixes a think-o in r323574.
llvm-svn: 323576
|
When there are no uses of profiling intrinsics in a module, and there's
no coverage data to lower, InstrProfiling has no work to do.
llvm-svn: 323574
|
Previously we had to materialize all 1s in a register using vpternlog or pcmpeq and then xor with that. By using vpternlog directly we can do it in one operation.
This is implemented using isel patterns, but we should maybe consider creating a generalized vpternlog combiner.
llvm-svn: 323572
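A software model makes the single-operation NOT easier to see. The sketch below assumes the documented vpternlog semantics, where the three input bits at each position index an 8-bit truth table with the first operand as the most significant index bit. The immediate 0x0F is an assumption: the commit message does not say which immediate the patterns use, and any table with bit 0 set and bit 7 clear works when all three operands are the same. `ternlogSketch` and `notViaTernlogSketch` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Software model of a 32-bit ternary-logic operation: for every bit position,
// the three input bits form an index (A is the most significant bit) into the
// 8-entry truth table Imm.
inline std::uint32_t ternlogSketch(std::uint32_t A, std::uint32_t B,
                                   std::uint32_t C, std::uint8_t Imm) {
  std::uint32_t Result = 0;
  for (unsigned I = 0; I < 32; ++I) {
    const unsigned Idx =
        (((A >> I) & 1u) << 2) | (((B >> I) & 1u) << 1) | ((C >> I) & 1u);
    Result |= static_cast<std::uint32_t>((Imm >> Idx) & 1u) << I;
  }
  return Result;
}

// NOT in a single operation: with X in all three slots the index is 0 or 7,
// so any table with bit 0 set and bit 7 clear inverts X. 0x0F is one choice.
inline std::uint32_t notViaTernlogSketch(std::uint32_t X) {
  return ternlogSketch(X, X, X, 0x0F);
}
```

With the table 0xFF the same model yields all ones regardless of the inputs, which corresponds to the old approach of materializing all 1s first and then xoring.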
|
A cast from A to B is eliminable if its result is cast to C, and if
the pair of casts could just be expressed as a single cast. E.g., here,
%c1 is eliminable:
  %c1 = zext i16 %A to i32
  %c2 = sext i32 %c1 to i64
InstCombine optimizes away eliminable casts. This patch teaches it to
insert a dbg.value intrinsic pointing to the final result, so that local
variables pointing to the eliminable result are preserved.
Differential Revision: https://reviews.llvm.org/D42566
llvm-svn: 323570
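The zext/sext pair in the message can be checked exhaustively in plain C++. This is only a sketch of why the pair collapses to a single zext from i16 to i64: after the zext, the sign bit of the i32 value is always clear, so the sext behaves like a zext. The function names are hypothetical and nothing here touches InstCombine or dbg.value.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the two-cast sequence: zext i16 -> i32, then sext i32 -> i64.
inline std::int64_t viaTwoCasts(std::uint16_t A) {
  const std::uint32_t C1 = A;                                  // zext i16 to i32
  const std::int32_t C1Signed = static_cast<std::int32_t>(C1); // same bits, signed
  return static_cast<std::int64_t>(C1Signed);                  // sext i32 to i64
}

// The single equivalent cast: zext i16 -> i64.
inline std::int64_t viaOneCast(std::uint16_t A) {
  return static_cast<std::int64_t>(A);
}

// Checks every 16-bit input; returns true when both forms always agree.
inline bool castsAgreeForAllInputs() {
  for (std::uint32_t V = 0; V <= 0xFFFFu; ++V)
    if (viaTwoCasts(static_cast<std::uint16_t>(V)) !=
        viaOneCast(static_cast<std::uint16_t>(V)))
      return false;
  return true;
}
```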
|
llvm-svn: 323569
|
llvm-svn: 323568
|
A correctly aligned address may happen to be separated into a variable
part and a constant part, where the constant part does not match the
alignment needed in a load/store that uses this address. Such a constant
cannot be used as an immediate offset in an indexed instruction.
When lowering a global address, make sure that if there is an offset
folded into the global, the offset is valid for all uses in load/store
instructions.
llvm-svn: 323562
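The validity condition can be sketched as a simple predicate. This assumes that "valid" means the folded offset is a multiple of the alignment required by every load/store that uses the address; the real check in the backend may involve more than this. `isOffsetValidForAllUses` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical check: a constant offset can be folded as an immediate only if
// it is a multiple of the alignment required by every load/store that uses it.
inline bool isOffsetValidForAllUses(std::int64_t Offset,
                                    const std::vector<std::uint32_t> &UseAligns) {
  for (std::uint32_t Align : UseAligns) {
    assert(Align != 0 && (Align & (Align - 1)) == 0 &&
           "alignment must be a power of two");
    if (Offset % static_cast<std::int64_t>(Align) != 0)
      return false;
  }
  return true;
}
```

An offset of 6 is fine when the only user is a 2-byte access, but not when a 4-byte access also uses the same address.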
|
llvm-svn: 323561
|
One common source of blocks with no successors is calls to noreturn
functions; we want to preserve pristine registers in case they throw an
exception.
The whole pristine register thing is messy (we should really prefer to
explicitly model registers), but this fills a hole in the model for now.
Fixes https://bugs.llvm.org/show_bug.cgi?id=36073.
Differential Revision: https://reviews.llvm.org/D42509
llvm-svn: 323559
|
llvm-svn: 323558
|
X86ISelLowering.cpp:34130:5: error: return type 'llvm::SDValue' must
match previous return type 'const llvm::SDValue' when lambda expression
has unspecified explicit return type
llvm-svn: 323557
|
For VLX targets getSetccResultType returns vXi1, which prevents the target independent DAG combine from doing this transform itself.
llvm-svn: 323555
|
|
Similar to the existing support for X86ISD::VTRUNCUS.
Differential Revision: https://reviews.llvm.org/D42544
llvm-svn: 323553
|
blank lines are printed during isel process to make things more sensibly grouped.
Previously some targets printed their own message at the start of Select to indicate what they were selecting. For the targets that didn't, it means there was no print of the root node before any custom handling in the target executed. So if the target did something custom and never called SelectNodeCommon, no print would be made. For the targets that did print a message in Select, if they didn't custom handle a node SelectNodeCommon would reprint the root node before walking the isel table.
It seems better to just print the message before the call to Select so all targets behave the same. And then remove the root node printing from SelectNodeCommon and just leave a message that says we're starting the table search.
There were also some oddities in blank line behavior, usually due to a \n after a call to SelectionDAGNode::dump, which already inserted a new line.
llvm-svn: 323551
|
gcc recently fixed this bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83546
llvm-svn: 323550
|
llvm-svn: 323548
|
Summary: This is the producer side for DWARF v5 string offsets tables. The reader/consumer
side was committed with r321295. All compile and type units in a module share a
contribution to the string offsets table. Indirect strings use the strx{1,2,3,4} index forms.
Reviewers: dblaikie, aprantl, JDevlieghere
Differential Revision: https://reviews.llvm.org/D42021
llvm-svn: 323546
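The indirection behind the strx forms can be sketched with two containers. This is a simplified model under stated assumptions: one string holds the NUL-terminated strings (standing in for .debug_str), one vector holds their offsets (standing in for the string offsets table), and the table header is omitted. All names are hypothetical and none of this is the DwarfDebug producer code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct StringTablesSketch {
  std::string StrSection;             // models .debug_str: NUL-terminated strings
  std::vector<std::uint32_t> Offsets; // models the string offsets table entries
};

// Appends each string to the string section and records where it starts.
inline StringTablesSketch
buildTablesSketch(const std::vector<std::string> &Strings) {
  StringTablesSketch T;
  for (const std::string &S : Strings) {
    T.Offsets.push_back(static_cast<std::uint32_t>(T.StrSection.size()));
    T.StrSection += S;
    T.StrSection.push_back('\0');
  }
  return T;
}

// Resolves a strx-style index: index -> offset -> string.
inline std::string lookupByIndexSketch(const StringTablesSketch &T,
                                       std::size_t Index) {
  assert(Index < T.Offsets.size() && "index out of range");
  return std::string(T.StrSection.c_str() + T.Offsets[Index]);
}
```

A small index is enough to name a string, which is what lets the strx1 through strx4 forms trade a full section offset for one to four bytes.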
|
llvm-svn: 323545
|
v4i32/v4f32
Extension to D42431, adding support for v4i32/v4f32 as well as v2i64/v2f64 now that D42308 has landed
llvm-svn: 323542
|
We currently coalesce v4i32 extracts from all 4 elements to 2 v2i64 extracts + shifts/sign-extends.
This seems to have been added back in the days when we tended to spill vectors and reload scalars, or ended up with repeated shuffles moving everything down to the 0th index. I don't think either of these is likely these days as we have better EXTRACT_VECTOR_ELT and VECTOR_SHUFFLE handling, and the existing code tends to make it very difficult for various vector and load combines.
Differential Revision: https://reviews.llvm.org/D42308
llvm-svn: 323541
|
As mentioned in D42258, we don't need this any more
llvm-svn: 323540
|
Indexed outputs are additions / subtractions and can be interpreted as such.
llvm-svn: 323539
|
See bug 36000: https://bugs.llvm.org/show_bug.cgi?id=36000
Differential Revision: https://reviews.llvm.org/D42483
Reviewers: vpykhtin, artem.tamazov, arsenm
llvm-svn: 323538
|
Summary: This fixes an assertion when building the FreeBSD MIPS64 kernel.
Reviewers: atanasyan, sdardis, emaste
Reviewed By: sdardis
Subscribers: krytarowski, llvm-commits
Differential Revision: https://reviews.llvm.org/D42571
llvm-svn: 323536
|
range
From OSS Fuzz Test Case #5688
llvm-svn: 323535
|
See bug 35998: https://bugs.llvm.org/show_bug.cgi?id=35998
Differential Revision: https://reviews.llvm.org/D42469
Reviewers: vpykhtin, artem.tamazov, arsenm
llvm-svn: 323534
|
llvm-svn: 323533
|
Summary:
If the same value is going to be vectorized several times in the same
tree entry, this entry is considered to be a gather entry and cost of
this gather is counted as the cost of InsertElementInstrs for each gathered
value. But we can consider these elements as ShuffleInstr with
SK_PermuteSingle shuffle kind.
Reviewers: spatel, RKSimon, mkuper, hfinkel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38697
llvm-svn: 323530
|
See bug 35988: https://bugs.llvm.org/show_bug.cgi?id=35988
Differential Revision: https://reviews.llvm.org/D42186
Reviewers: vpykhtin, artem.tamazov, arsenm
llvm-svn: 323527
|
llvm-svn: 323526
|
Add support for printing / parsing the addrspace of a MachineMemOperand.
Fixes PR35970.
Differential Revision: https://reviews.llvm.org/D42502
llvm-svn: 323521
|
- using qualified pointer addrspace in intrinsics class to avoid .f32 mangling
- changed too common atomic mangling to ds
- added missing intrinsics to AMDGPUTTIImpl::getTgtMemIntrinsic
Reviewed by: b-sumner
Differential Revision: https://reviews.llvm.org/D42383
llvm-svn: 323516
|
Fix infinite loop when recording conditions by correctly marking basic
blocks as visited.
Fixes https://bugs.llvm.org/show_bug.cgi?id=36105
llvm-svn: 323515
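The general shape of such a fix is a worklist walk that marks blocks as visited when they are queued. The sketch below is a generic illustration over a CFG given as successor lists, assuming all successor indices are in range. `visitReachableSketch` is a hypothetical name and this is not the code from the affected pass.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Worklist traversal over a CFG given as successor lists. Blocks are marked
// visited when queued, so a cycle cannot re-queue them forever.
inline std::vector<std::size_t>
visitReachableSketch(const std::vector<std::vector<std::size_t>> &Succs,
                     std::size_t Entry) {
  assert(Entry < Succs.size() && "entry block out of range");
  std::vector<bool> Visited(Succs.size(), false);
  std::vector<std::size_t> Worklist;
  std::vector<std::size_t> Order;
  Visited[Entry] = true;
  Worklist.push_back(Entry);
  while (!Worklist.empty()) {
    const std::size_t BB = Worklist.back();
    Worklist.pop_back();
    Order.push_back(BB);
    for (std::size_t S : Succs[BB]) {
      if (!Visited[S]) {
        Visited[S] = true; // Without this, the 0 <-> 1 cycle would never end.
        Worklist.push_back(S);
      }
    }
  }
  return Order;
}
```

On a graph where block 0 branches to 1 and block 1 branches back to 0 and on to 2, the walk terminates after visiting each block exactly once.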
|
load instruction
The function `Thumb1InstrInfo::loadRegFromStackSlot` accepts only the `tGPR`
register class. The function serves to emit a `tLDRspi` instruction and
certainly any subset of the `tGPR` register class is a valid destination of the
load.
Differential revision: https://reviews.llvm.org/D42535
llvm-svn: 323514
|
This is the groundwork for Armv8.2-A FP16 code generation.
Clang passes and returns _Float16 values as floats, together with the required
bitconverts and truncs etc. to implement correct AAPCS behaviour, see D42318.
We will implement half-precision argument passing/returning lowering in the ARM
backend soon, but for now this means that this:
  _Float16 sub(_Float16 a, _Float16 b) {
    return a + b;
  }
gets lowered to this:
  define float @sub(float %a.coerce, float %b.coerce) {
  entry:
    %0 = bitcast float %a.coerce to i32
    %tmp.0.extract.trunc = trunc i32 %0 to i16
    %1 = bitcast i16 %tmp.0.extract.trunc to half
    <SNIP>
    %add = fadd half %1, %3
    <SNIP>
  }
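The bitcast/trunc sequence in this IR amounts to reading the low 16 bits of the coerced float. A minimal sketch of that bit manipulation follows, assuming a 32-bit IEEE-754 `float`. The caller-side helper and its zero-filled upper bits are assumptions for illustration, since the IR above only shows the callee side, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");

// Caller side (assumption): place the 16 half bits in the low half of a
// float-sized container, zero-filling the upper 16 bits.
inline float coerceHalfBitsToFloat(std::uint16_t HalfBits) {
  const std::uint32_t Wide = HalfBits;
  float F;
  std::memcpy(&F, &Wide, sizeof(F));
  return F;
}

// Callee side, mirroring the IR: bitcast float -> i32, then trunc i32 -> i16.
// The final "bitcast i16 to half" does not change the bit pattern.
inline std::uint16_t halfBitsFromCoercedFloat(float Coerced) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &Coerced, sizeof(Bits));
  return static_cast<std::uint16_t>(Bits);
}
```

The float here is only a container for the bits; no floating-point conversion of the value takes place on either side.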
When FullFP16 is *not* supported, we don't make f16 a legal type, and we get
legalization for "free", i.e. nothing changes and everything works as before.
And also f16 argument passing/returning is handled.
When FullFP16 is supported, we do make f16 a legal type, and have 2 places that
we need to patch up: f16 argument passing and returning, which involves minor
tweaks to avoid unnecessary code generation for some bitcasts.
As a "demonstrator" that this works for the different FP16, FullFP16, softfp
modes, etc., I've added match rules to the VSUB instruction description showing
that we can codegen this instruction from IR, but more importantly, also to
some conversion instructions. These conversions were causing issues before in
the FP16 and FullFP16 cases.
I've also added match rules to the VLDRH and VSTRH descriptions, so that we can
actually compile the entire half-precision sub code example above. This showed
that these loads and stores had the wrong addressing mode specified: AddrMode5
instead of AddrMode5FP16, which turned out not to be implemented at all, so that
has also been added.
This is the minimal patch that shows all the different moving parts. In patch
2/3 I will add some efficient lowering of bitcasts, and in 3/3 I will add the
remaining Armv8.2-A FP16 instruction descriptions.
Thanks to Sam Parker and Oliver Stannard for their help and reviews!
Differential Revision: https://reviews.llvm.org/D38315
llvm-svn: 323512
|
"in in" -> "in", "on on" -> "on" etc.
llvm-svn: 323508
|