|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| 
| 
| 
| 
| | Patch by Reshabh Sharma!
Differential Revision: https://reviews.llvm.org/D43861
llvm-svn: 326352 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch implements Chandler's idea [0] for supporting languages that
require support for infinite loops with side effects, such as Rust, providing
part of a solution to bug 965 [1].
Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual
effect, but which appears to optimization passes to have obscure side effects,
such that they don't optimize away loops containing it. It also teaches
several optimization passes to ignore this intrinsic, so that it doesn't
significantly impact optimization in most cases.
As discussed on llvm-dev [2], this patch is the first of two major parts.
The second part, to change LLVM's semantics to have defined behavior
on infinite loops by default, with a function attribute for opting into
potential-undefined-behavior, will be implemented and posted for review in
a separate patch.
[0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
[1] https://bugs.llvm.org/show_bug.cgi?id=965
[2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html
Differential Revision: https://reviews.llvm.org/D38336
llvm-svn: 317729 | 
| | 
| 
| 
| 
| 
| 
| 
| | isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI
Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne.
llvm-svn: 307292 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | - This change allows targets to opt-in to using them instead of the log2
  shufflevector algorithm.
- The SLP and Loop vectorizers have the common code to do shuffle reductions
  factored out into LoopUtils, and now have a unified interface for generating
  reductions regardless of the preference of the target. LoopUtils now uses TTI
  to determine what kind of reductions the target wants to handle.
- For CodeGen, basic legalization support is added.
Differential Revision: https://reviews.llvm.org/D30086
llvm-svn: 302514 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch moves some helper functions related to interleaved access
vectorization out of LoopVectorize.cpp and into VectorUtils.cpp. We would like
to use these functions in a follow-on patch that improves interleaved load and
store lowering in (ARM/AArch64)ISelLowering.cpp. One of the functions was
already duplicated there and has been removed.
Differential Revision: https://reviews.llvm.org/D29398
llvm-svn: 293788 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | type.
Instead, expose whether the current type is an array or a struct, if an array
what the upper bound is, and if a struct the struct type itself. This is
in preparation for a later change which will make PointerType derive from
Type rather than SequentialType.
Differential Revision: https://reviews.llvm.org/D26594
llvm-svn: 288458 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
This will let e.g. the load/store vectorizer propagate this metadata
appropriately.
Reviewers: arsenm
Subscribers: tra, jholewinski, hfinkel, mzolotukhin
Differential Revision: https://reviews.llvm.org/D23479
llvm-svn: 281153 | 
| | 
| 
| 
| 
| 
| 
| 
| | This will be re-used by the LoadStoreVectorizer.
Fix handling of range metadata and testcase by Justin Lebar.
llvm-svn: 274281 | 
| | 
| 
| 
| 
| 
| | Allows XOP to vectorize BITREVERSE - other targets will follow as their costmodels improve.
llvm-svn: 271803 | 
| | 
| 
| 
| 
| 
| 
| 
| | This was a fairly simple patch but on closer inspection was seriously flawed and caused PR27690.
This reverts commit r268921.
llvm-svn: 269051 | 
| | 
| 
| 
| 
| 
| | When deciding if a vector calculation can be done in a smaller bitwidth, use sign bit information from ValueTracking to add more information and allow more truncations.
llvm-svn: 268921 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The functionality contained within getIntrinsicIDForCall is two-fold: it
checks if a CallInst's callee is a vectorizable intrinsic.  If it isn't
an intrinsic, it attempts to map the call's target to a suitable
intrinsic.
Move the mapping functionality into getIntrinsicForCallSite and rename
getIntrinsicIDForCall to getVectorIntrinsicIDForCall while
reimplementing it in terms of getIntrinsicForCallSite.
llvm-svn: 266801 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Remove an ad-hoc transform in InstCombine and replace it with more
general machinery (ValueTracking, InstructionSimplify and VectorUtils).
This fixes PR27332.
llvm-svn: 266175 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | To quote the langref "Unlike sqrt in libm, however, llvm.sqrt has
undefined behavior for negative numbers other than -0.0 (which allows
for better optimization, because there is no need to worry about errno
being set). llvm.sqrt(-0.0) is defined to return -0.0 like IEEE sqrt."
This means that it's unsafe to replace sqrt with llvm.sqrt unless the
call is annotated with nnan.
Thanks to Hal Finkel for pointing this out!
llvm-svn: 265521 | 
| | 
| 
| 
| 
| 
| 
| | We didn't realize that we could transform the libcall into a vectorized
intrinsic.
llvm-svn: 265493 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | We already try not to truncate PHIs in computeMinimalBitwidths. LoopVectorize can't handle it and we really don't need to, because both induction and reduction PHIs are truncated by other means.
However, we weren't bailing out in all the places we should have, and we ended up by returning a PHI to be truncated, which has caused PR27018.
This fixes PR17018.
llvm-svn: 264852 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | get{Source,Result}ElementType.
Summary:
GEPOperator: provide getResultElementType alongside getSourceElementType.
This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has.
GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.
Reviewers: mjacob, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16275
llvm-svn: 258145 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Reviewers: dblaikie, mjacob
Subscribers: llvm-commits, dblaikie
Patch by Eduard Burtescu.
Differential Revision: http://reviews.llvm.org/D16274
llvm-svn: 258022 | 
| | 
| 
| 
| | llvm-svn: 255921 | 
| | 
| 
| 
| | llvm-svn: 254409 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Cost calculation for vector GEP failed with due to invalid cast to GEP index operand.
The bug is fixed, added a test.
http://reviews.llvm.org/D14976
llvm-svn: 254408 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The order in which instructions are truncated in truncateToMinimalBitwidths
effects code generation. Switch to a map with a determinisic order, since the
iteration order over a DenseMap is not defined.
This code is not hot, so the difference in container performance isn't
interesting.
Many thanks to David Blaikie for making me aware of MapVector!
Fixes PR25490.
Differential Revision: http://reviews.llvm.org/D14981
llvm-svn: 254179 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | Implemented as many of Michael's suggestions as were possible:
  * clang-format the added code while it is still fresh.
  * tried to change Value* to Instruction* in many places in computeMinimumValueSizes - unfortunately there are several places where Constants need to be handled so this wasn't possible.
  * Reduce the pass list on loop-vectorization-factors.ll.
  * Fix a bug where we were querying MinBWs for I->getOperand(0) but using MinBWs[I].
llvm-svn: 252469 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | C semantics force sub-int-sized values (e.g. i8, i16) to be promoted to int
type (e.g. i32) whenever arithmetic is performed on them.
For targets with native i8 or i16 operations, usually InstCombine can shrink
the arithmetic type down again. However InstCombine refuses to create illegal
types, so for targets without i8 or i16 registers, the lengthening and
shrinking remains.
Most SIMD ISAs (e.g. NEON) however support vectors of i8 or i16 even when
their scalar equivalents do not, so during vectorization it is important to
remove these lengthens and truncates when deciding the profitability of
vectorization.
The algorithm this uses starts at truncs and icmps, trawling their use-def
chains until they terminate or instructions outside the loop are found (or
unsafe instructions like inttoptr casts are found). If the use-def chains
starting from different root instructions (truncs/icmps) meet, they are
unioned. The demanded bits of each node in the graph are ORed together to form
an overall mask of the demanded bits in the entire graph. The minimum bitwidth
that graph can be truncated to is the bitwidth minus the number of leading
zeroes in the overall mask.
The intention is that this algorithm should "first do no harm", so it will
never insert extra cast instructions. This is why the use-def graphs are
unioned, so that subgraphs with different minimum bitwidths do not need casts
inserted between them.
This algorithm works hard to reduce compile time impact. DemandedBits are only
queried if there are extends of illegal types and if a truncate to an illegal
type is seen. In the general case, this results in a simple linear scan of the
instructions in the loop.
No non-noise compile time impact was seen on a clang bootstrap build.
llvm-svn: 250032 | 
| | 
| 
| 
| 
| 
| | Differential Revision:	http://reviews.llvm.org/D12478
llvm-svn: 246381 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | *getSplatValue(Value *Val);""
This reverts commit r246379. It seems that the commit was not the culprit,
and the bot will be investigated for instability.
llvm-svn: 246380 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | *getSplatValue(Value *Val);"
This reverts commit r246371, as it cause a rather obscure bug in AArch64
test-suite paq8p (time outs, seg-faults). I'll investigate it before
reapplying.
llvm-svn: 246379 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Value *getSplatValue(Value *Val);
It complements the CreateVectorSplat(), which creates 2 instructions - insertelement and shuffle with all-zero mask.
The new function recognizes the pattern - insertelement+shuffle and returns the splat value (or nullptr).
It also returns a splat value form ConstantDataVector, for completeness.
Differential Revision:	http://reviews.llvm.org/D11124
llvm-svn: 246371 | 
| | 
| 
| 
| 
| 
| 
| | It isn't always possible to get a value from getAggregateElement.
This fixes PR24488.
llvm-svn: 245365 | 
| | 
| 
| 
| 
| 
| 
| | No funcitonal change is intended, this just makes the file look more
like the rest of LLVM.
llvm-svn: 245364 | 
| | 
| 
| 
| 
| 
| | This was already done in most places a while ago. This just fixes the ones that crept in over time.
llvm-svn: 243842 | 
| | 
| 
| 
| | llvm-svn: 242008 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The following functions are moved from the LoopVectorizer to VectorUtils:
  - getGEPInductionOperand
  - stripGetElementPtr
  - getUniqueCastUse
  - getStrideFromPointer
These used to be static functions in LoopVectorize, but will also be used by
the upcoming loop versioning LICM transformation.
Patch by Ashutosh Nema!
llvm-svn: 241980 | 
|  | llvm-svn: 240804 |