| Commit message | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
If a range has a lower bound of 0, add an AssertZext from the
nearest floor power of two.
This allows operations with some workitem intrinsics with known
maximum ranges to use fast 24-bit multiplies.
llvm-svn: 260109
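A minimal sketch of the idea in r260109, assuming the range is half-open, [lo, hi), as in LLVM !range metadata. It reads the change as asserting the smallest bit width that covers every value in the range. The helper names and the 24-bit check are hypothetical and not taken from the patch.

```python
def assert_zext_bits(lo: int, hi: int):
    """Bit width that could be asserted for the half-open range [lo, hi).

    Returns None when the range does not start at 0 (or is empty), since
    the commit only handles a lower bound of 0.
    """
    if lo != 0 or hi <= lo:
        return None
    # Largest value in the range is hi - 1; at least one bit is asserted.
    return max((hi - 1).bit_length(), 1)


def fits_mul24(lo: int, hi: int) -> bool:
    """Hypothetical check: can a value in [lo, hi) feed a 24-bit multiply?"""
    bits = assert_zext_bits(lo, hi)
    return bits is not None and bits <= 24
```

For a workitem id known to lie in [0, 1024), this gives 10 asserted bits, well inside the 24-bit limit.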
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16983
llvm-svn: 260101
|
| |
|
|
|
|
| |
sanitizer bots.
llvm-svn: 260087
|
| |
|
|
|
|
|
|
|
|
|
|
| |
We shouldn't assert when there are no memchecks, since we
can have SCEV checks. There is already an assert covering
the case where there are no SCEV checks or memchecks.
This also changes the LAA pointer wrapping versioning test
to use the loop versioning pass (this was how I managed to
trigger the assert in the loop versioning pass).
llvm-svn: 260086
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pointer detection
Summary:
This change adds no-wrap SCEV predicates with:
- support for runtime checking
- support for expression rewriting:
sext({x,+,y}) -> {sext(x),+,sext(y)}
zext({x,+,y}) -> {zext(x),+,sext(y)}
Note that we are sign extending the increment of the SCEV, even for
the zext case. This is needed to cover the fairly common case where y would
be a (small) negative integer. In order to do this, this change adds two new
flags: nusw and nssw that are applicable to AddRecExprs and permit the
transformations above.
We also change isStridedPtr in LAA to be able to make use of
these predicates. With this feature we should now always be able to
work around overflow issues in the dependence analysis.
Reviewers: mzolotukhin, sanjoy, anemet
Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel
Differential Revision: http://reviews.llvm.org/D15412
llvm-svn: 260085
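To see why r260085 sign-extends the increment even in the zext case, the sketch below models an add recurrence {x,+,y} with modular arithmetic. The 8-bit and 16-bit widths, the helper names and the sample values are assumptions chosen for illustration; this is not SCEV code.

```python
def wrap(value: int, bits: int) -> int:
    """Reduce value to an unsigned integer of the given width."""
    return value & ((1 << bits) - 1)


def zext(value: int, from_bits: int) -> int:
    """Zero-extend: the unsigned value is unchanged in the wider type."""
    return wrap(value, from_bits)


def sext(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend from from_bits to to_bits, returned as unsigned."""
    value = wrap(value, from_bits)
    if value >> (from_bits - 1):
        value -= 1 << from_bits
    return wrap(value, to_bits)


def addrec(start: int, step: int, i: int, bits: int) -> int:
    """Value of {start,+,step} at iteration i in a type of the given width."""
    return wrap(start + step * i, bits)


NARROW, WIDE = 8, 16
x, y = 10, wrap(-1, NARROW)  # step of -1, stored as 0xFF in 8 bits


def original(i: int) -> int:
    """zext({x,+,y}) evaluated at iteration i."""
    return zext(addrec(x, y, i, NARROW), NARROW)


def rewritten(i: int) -> int:
    """{zext(x),+,sext(y)} evaluated at iteration i."""
    return addrec(zext(x, NARROW), sext(y, NARROW, WIDE), i, WIDE)


def naive(i: int) -> int:
    """{zext(x),+,zext(y)}: what zero-extending the step would give."""
    return addrec(zext(x, NARROW), zext(y, NARROW), i, WIDE)
```

With these values, original(i) and rewritten(i) agree for i from 0 to 10, while naive(1) is already 265 instead of 9. At i = 11 the narrow recurrence wraps and the two forms diverge, which is the situation the runtime-checked predicate has to rule out.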
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As discussed in https://github.com/google/sanitizers/issues/398, the current
implementation of global poisoning can cause CHECK failures or false positives
when instrumented and non-instrumented code are mixed, because ASan poisons
innocent globals from the non-sanitized binary/library. We can use private
aliases to avoid such errors. In addition, to preserve ODR violation detection,
we introduce a new __odr_asan_gen_XXX symbol for each instrumented global that
indicates whether this global has already been registered. To detect an ODR
violation at runtime, we only need to check the value of the indicator and
report an error if it isn't equal to zero.
Differential Revision: http://reviews.llvm.org/D15642
llvm-svn: 260075
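The indicator check described in r260075 can be modelled roughly as follows. This is a toy model, not the ASan runtime: the dictionary stands in for the per-global indicator symbols, and the function and exception names are hypothetical.

```python
class OdrViolation(Exception):
    """Raised when a global is registered a second time."""


def register_global(indicators: dict, symbol: str, module: str) -> None:
    """Register an instrumented global, checking its indicator first.

    indicators maps a global's name to its indicator value, where a
    missing entry or 0 means "not registered yet".
    """
    if indicators.get(symbol, 0) != 0:
        raise OdrViolation(f"{symbol} registered again by {module}")
    indicators[symbol] = 1
```

Registering the same symbol from a second module finds a non-zero indicator and reports the violation; globals from non-instrumented code never touch an indicator.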
|
| |
|
|
| |
llvm-svn: 260069
|
| |
|
|
|
|
| |
Investigating.
llvm-svn: 260064
|
| |
|
|
|
|
|
|
|
|
|
|
| |
combineX86ShufflesRecursively only supports unary shuffles, so it was missing the opportunity to combine binary shuffles with a zero / undef second input.
This patch resolves target shuffle inputs, converting the shuffle mask elements to SM_SentinelUndef/SM_SentinelZero where possible. It then resolves the updated mask to check if we have created a faux unary shuffle.
Additionally, we now attempt to recursively call combineX86ShufflesRecursively for all input operands (we used to just recurse for unary integer shuffles and unary unpacks); it safely returns early if it's not a target shuffle.
Differential Revision: http://reviews.llvm.org/D16683
llvm-svn: 260063
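A small sketch of the mask resolution described in r260063, assuming a two-input shuffle whose mask indexes input 0 with 0..N-1 and input 1 with N..2N-1. The sentinel values, the 'undef'/'zero'/'value' labels and the function names are placeholders for illustration, not the X86 backend's code.

```python
SM_SENTINEL_UNDEF = -1  # assumed placeholder value
SM_SENTINEL_ZERO = -2   # assumed placeholder value


def resolve_mask(mask, num_elts, input_kinds):
    """Replace mask entries that read an undef or all-zero input with sentinels.

    input_kinds[k] is 'undef', 'zero' or 'value' for operand k.
    """
    resolved = []
    for m in mask:
        if m < 0:  # already a sentinel
            resolved.append(m)
            continue
        kind = input_kinds[m // num_elts]
        if kind == "undef":
            resolved.append(SM_SENTINEL_UNDEF)
        elif kind == "zero":
            resolved.append(SM_SENTINEL_ZERO)
        else:
            resolved.append(m)
    return resolved


def faux_unary_input(resolved, num_elts):
    """Return the single operand still referenced, or None if not unary."""
    used = {m // num_elts for m in resolved if m >= 0}
    return used.pop() if len(used) == 1 else None
```

For example, the 4-element interleave mask [0, 4, 1, 5] with an all-zero second input resolves to [0, SM_SENTINEL_ZERO, 1, SM_SENTINEL_ZERO], which only reads input 0 and can be treated as unary.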
|
| |
|
|
| |
llvm-svn: 260061
|
| |
|
|
| |
llvm-svn: 260055
|
| |
|
|
| |
llvm-svn: 260034
|
| |
|
|
|
|
|
|
| |
rounding mode
Differential Revision: http://reviews.llvm.org/D16629
llvm-svn: 260033
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16813
llvm-svn: 260024
|
| |
|
|
| |
llvm-svn: 260010
|
| |
|
|
|
|
| |
As raised in PR26491, we don't make use of these instructions at the moment.
llvm-svn: 260008
|
| |
|
|
| |
llvm-svn: 260007
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is the attribute purpose-made for e.g. __syncthreads. It appears
that NoDuplicate may not be sufficient to prevent Sink from touching a
call to __syncthreads.
Reviewers: jingyue, hfinkel
Subscribers: llvm-commits, jholewinski, jhen, rnk, tra, majnemer
Differential Revision: http://reviews.llvm.org/D16941
llvm-svn: 260005
|
| |
|
|
|
|
| |
Let AVX512 targets share the same CHECKs.
llvm-svn: 260000
|
| |
|
|
| |
llvm-svn: 259999
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds the linkage type to both the per-module and combined function
summaries, which subsumes the current islocal bit. This will eventually
be used to optimize linkage types based on global summary-based
analysis.
Reviewers: joker.eph
Subscribers: joker.eph, davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D16943
llvm-svn: 259993
|
| |
|
|
| |
llvm-svn: 259992
|
| |
|
|
|
|
|
|
| |
If we are already loading a single 32-bit float/integer then just reuse it.
Fix for regression in D16729
llvm-svn: 259991
|
| |
|
|
|
|
| |
Earlier they were failing under a no-assert build.
llvm-svn: 259989
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When alias analysis is uncertain about the aliasing between any two accesses,
it will return MayAlias. This uncertainty from alias analysis restricts LICM
from proceeding further. In cases where alias analysis is uncertain we might
use loop versioning as an alternative.
Loop Versioning will create a version of the loop with aggressive aliasing
assumptions in addition to the original with conservative (default) aliasing
assumptions. The version of the loop making aggressive aliasing assumptions
will have all the memory accesses marked as no-alias. These two versions of
the loop will be preceded by a memory runtime check. This runtime check
consists of bounds checks for all unique memory accessed in the loop, and it
ensures the lack of memory aliasing. The result of the runtime check
determines which of the loop versions is executed: if the runtime check
detects any memory aliasing, then the original loop is executed. Otherwise,
the version with aggressive aliasing assumptions is used.
The pass is off by default and can be enabled with command line option
-enable-loop-versioning-licm.
Reviewers: hfinkel, anemet, chatur01, reames
Subscribers: MatzeB, grosser, joker.eph, sanjoy, javed.absar, sbaranga,
llvm-commits
Differential Revision: http://reviews.llvm.org/D9151
llvm-svn: 259986
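The versioning scheme described in r259986 can be sketched on a flat memory model. Everything here is an illustrative assumption: memory is a Python list, addresses are indices, and the loop is mem[dst+i] = mem[src+i] * mem[scale_addr]. It is not the pass's implementation.

```python
def ranges_overlap(a_start, a_len, b_start, b_len):
    """Bounds check: do [a_start, a_start+a_len) and [b_start, b_start+b_len) overlap?"""
    return a_start < b_start + b_len and b_start < a_start + a_len


def scale_loop(mem, dst, src, scale_addr, n):
    """Run the loop, choosing a version with a runtime alias check.

    Returns the name of the version that was executed.
    """
    stores_alias_loads = (
        ranges_overlap(dst, n, src, n) or ranges_overlap(dst, n, scale_addr, 1)
    )
    if not stores_alias_loads:
        # Version with no-alias assumptions: the invariant load is hoisted.
        scale = mem[scale_addr]
        for i in range(n):
            mem[dst + i] = mem[src + i] * scale
        return "versioned"
    # Original loop: the load stays inside because a store may change it.
    for i in range(n):
        mem[dst + i] = mem[src + i] * mem[scale_addr]
    return "original"
```

When the destination range overlaps scale_addr, hoisting the load would change the result, so the check sends execution to the original loop.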
|
| |
|
|
|
|
| |
This is almost feature complete - just missing tu_index merging now.
llvm-svn: 259971
|
| |
|
|
| |
llvm-svn: 259954
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The current situation isn't great, because the amount of padding
required is determined by the inverse order of the first encountered
use. We should eventually somehow sort these to minimize wasted space.
Another problem is that the alignment of kernel arguments isn't
respected. The group_segment_alignment is always emitted as
the default 16, and typed arguments with higher alignments
or an explicitly set alignment are also ignored.
llvm-svn: 259912
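As a generic illustration of the padding concern in r259912, the sketch below lays out objects in a given order and counts the alignment padding. The (size, alignment) pairs and the descending-alignment sort are assumptions for the example; the commit does not say which ordering would eventually be used.

```python
def layout(objects):
    """Place (size, align) objects in order; return (total_size, padding)."""
    offset = 0
    padding = 0
    for size, align in objects:
        aligned = (offset + align - 1) // align * align  # round up to alignment
        padding += aligned - offset
        offset = aligned + size
    return offset, padding


def layout_sorted(objects):
    """Same layout after sorting by descending alignment."""
    return layout(sorted(objects, key=lambda obj: obj[1], reverse=True))
```

With objects [(1, 1), (8, 8), (2, 2), (4, 4)], the given order needs 24 bytes with 9 bytes of padding, while the sorted order needs 15 bytes with none.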
|
| |
|
|
|
|
|
| |
Also switch to internal linkage, and include the name of the function in
the name.
llvm-svn: 259911
|
| |
|
|
| |
llvm-svn: 259896
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
LiveRangeEdit::eliminateDeadDef is used to remove dead define instructions
after rematerialization. To remove a VNI for a vreg from its LiveInterval,
LiveIntervals::removeVRegDefAt is used. However, after all non-PHI VNIs are
removed, PHI VNIs are still left in the LiveInterval. Such unused vregs will
be kept in RegsToSpill[] at the end of InlineSpiller::reMaterializeAll, and
the spiller will allocate stack slots for them.
The fix is to get rid of an unused reg by checking whether it has a non-dbg
reference instead of whether it has a non-empty interval.
llvm-svn: 259895
|
| |
|
|
| |
llvm-svn: 259893
|
| |
|
|
| |
llvm-svn: 259888
|
| |
|
|
|
|
| |
This reverts commit r259812 as it broke AArch64 self-hosting.
llvm-svn: 259881
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CodeView, like most other debug formats, represents the live range of a
variable so that debuggers can print it out.
It uses a variety of records to represent how a particular variable
might be available (in a register, in a frame pointer, etc.) along with
a set of ranges where this debug information is relevant.
However, the format only allows us to use ranges which are limited to a
maximum of 0xF000 in size. This means that we need to split our debug
information into chunks of 0xF000.
Because the layout of code is not known until *very* late, we must use a
new fragment to record the information we need until we can know
*exactly* what the range is.
llvm-svn: 259868
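The chunking constraint in r259868 amounts to splitting one range into pieces of at most 0xF000. A minimal sketch, assuming a range is a (start, length) pair whose final size is already known; in the actual change the layout is only known late, which is why a fragment is needed. The function name is hypothetical.

```python
MAX_RANGE = 0xF000  # largest range size the format allows, per the commit


def split_range(start: int, length: int):
    """Split (start, length) into consecutive chunks of at most MAX_RANGE."""
    chunks = []
    while length > 0:
        size = min(length, MAX_RANGE)
        chunks.append((start, size))
        start += size
        length -= size
    return chunks
```

A range of 0x1F000 bytes starting at 0x1000 becomes three chunks: two of 0xF000 and a final one of 0x1000.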
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: This diff increases the tested surface of the C API.
Reviewers: bogner, chandlerc, echristo, dblaikie, joker.eph, Wallbraker
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16910
llvm-svn: 259863
|
| |
|
|
|
|
| |
This was requested in the review of D16300.
llvm-svn: 259861
|
| |
|
|
| |
llvm-svn: 259860
|
| |
|
|
|
|
|
| |
We don't currently have many tests that deal with operations on multiple
local MemoryLocations. This new test helps out a bit in that regard.
llvm-svn: 259854
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This adds an echo test case in C. The support is limited right now, but full support would be too much to review at once.
The echo test case simply takes a module as input and tries to output the exact same module. This checks that both the reading and writing APIs work as expected.
I want to improve this test over time to support more and more of the API, in order to improve coverage (coverage is quite poor right now).
Test Plan: Run the test.
Reviewers: chandlerc, bogner
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10725
llvm-svn: 259844
|
| |
|
|
|
|
|
|
|
| |
Use the load immediate only when the immediate (whether signed or unsigned)
can fit in a 16-bit signed field. Namely, from -32768 to 32767 for signed and
0 to 65535 for unsigned. This patch also ensures that we sign-extend under the
right conditions.
llvm-svn: 259840
|
| |
|
|
| |
llvm-svn: 259835
|
| |
|
|
|
|
|
|
|
|
| |
EltsFromConsecutiveLoads
Choose between MOVD/MOVSS and MOVQ/MOVSD depending on the target vector type.
This requires far fewer test changes than trying to add this to X86InstrInfo::setExecutionDomain.
llvm-svn: 259816
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch allows the mixing of scaled and unscaled load/stores to form
load/store pairs.
PR24465
http://reviews.llvm.org/D12116
Many thanks to Ahmed and Michael for fixes and code review.
This is a reapplication of r246769 and r259790. The tramp3d failure was caused
by an incorrect refactoring in the patch. Specifically, we weren't always
properly clearing the SExtIdx flag.
llvm-svn: 259812
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During instruction selection, the AArch64 backend can recognise the
following pattern and generate an [U|S]MADDL instruction, i.e. a
multiply of two 32-bit operands with a 64-bit result:
(mul (sext i32), (sext i32))
However, when one of the operands is constant, the sign extension
gets folded into the constant in SelectionDAG::getNode(). This means
that the instruction selection sees this:
(mul (sext i32), i64)
...which doesn't match the pattern. Sign-extension and 64-bit
multiply instructions are generated, which are slower than one 32-bit
multiply.
Add a pattern to match this and generate the correct instruction, for
both signed and unsigned multiplies.
Patch by Chris Diamand!
llvm-svn: 259800
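The condition behind the new pattern in r259800 can be sketched as a range check on the constant: if the i64 constant is what a 32-bit value would extend to, the multiply can still be treated as 32 x 32 -> 64. The function names and the "MUL64" fallback label are hypothetical, and constants are modelled as signed Python integers.

```python
def fits_sext_i32(c: int) -> bool:
    """True if c is the sign extension of some 32-bit value."""
    return -(1 << 31) <= c < (1 << 31)


def fits_zext_i32(c: int) -> bool:
    """True if c is the zero extension of some 32-bit value."""
    return 0 <= c < (1 << 32)


def select_mul(ext: str, constant: int) -> str:
    """Pick an instruction for (mul (ext i32), constant).

    ext is 'sext' or 'zext', the extension applied to the non-constant operand.
    """
    if ext == "sext" and fits_sext_i32(constant):
        return "SMADDL"
    if ext == "zext" and fits_zext_i32(constant):
        return "UMADDL"
    return "MUL64"  # fall back to extension plus a 64-bit multiply
```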
|
| |
|
|
|
|
|
|
| |
Fix the lit bug that enabled this "feature" (empty triple is substring
of all possible target triples) and change the two outliers to use the
documented * syntax.
llvm-svn: 259799
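The substring problem mentioned in r259799 is easy to reproduce in plain Python. The two matcher functions below are hypothetical stand-ins, not lit's code; they only show why an empty pattern matched every triple and how an explicit "*" can replace that behaviour.

```python
def matches_buggy(pattern: str, target_triple: str) -> bool:
    """Substring match: an empty pattern is a substring of every triple."""
    return pattern in target_triple


def matches_fixed(pattern: str, target_triple: str) -> bool:
    """Treat '*' as 'all targets' and reject empty patterns."""
    if pattern == "*":
        return True
    return bool(pattern) and pattern in target_triple
```

With the buggy matcher, matches_buggy("", "x86_64-unknown-linux-gnu") is True, so an empty entry silently applied to all targets.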
|
| |
|
|
| |
llvm-svn: 259797
|
| |
|
|
|
|
|
|
|
|
| |
EltsFromConsecutiveLoads
This patch adds support for combining consecutive 32-bit loads (load/undef elements), followed by trailing undef/zero elements, into a single MOVD load.
Differential Revision: http://reviews.llvm.org/D16729
llvm-svn: 259796
|
| |
|
|
|
|
| |
This reverts commit r259790. tramp3d-v4 is still having problems.
llvm-svn: 259795
|
| |
|
|
|
|
| |
consecutive entries as 64-bit integers
llvm-svn: 259794
|