| Commit message | Author | Age | Files | Lines |
| |
|
|
|
|
|
| |
not excluding ourselves when checking if any equivalent stores
exist.
llvm-svn: 291421
|
| |
|
|
|
|
| |
A future patch will convert it back to BLENDM if it's beneficial to register allocation.
llvm-svn: 291419
|
| |
|
|
|
|
|
|
| |
vselects of all ones and all zeros.
Previously we emitted a VPTERNLOG and a separate masked move.
llvm-svn: 291415
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
By using stripPointerCasts we can get to the root
value and then walk down the bitcast graph
Reviewers: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28181
llvm-svn: 291405
|
| |
|
|
|
|
| |
of zeroes/ones when handling sign extends of i1 without VLX.
llvm-svn: 291402
|
| |
|
|
|
|
|
|
| |
test.
This is preparation for improving a case with avx512dq.
llvm-svn: 291401
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I noticed this problem as part of the ongoing attempt to canonicalize min/max ops in IR.
The debug output shows nodes like this:
t4: i32 = xor t2, Constant:i32<-1>
t21: i8 = setcc t4, Constant:i32<0>, setlt:ch
t14: i32 = select t21, t4, Constant:i32<-1>
And because the select is holding onto the t4 (xor) node while EmitTest creates a new
x86-specific xor node, the lowering results in:
t4: i32 = xor t2, Constant:i32<-1>
t25: i32,i32 = X86ISD::XOR t2, Constant:i32<-1>
t28: i32,glue = X86ISD::CMOV Constant:i32<-1>, t4, Constant:i8<15>, t25:1
Differential Revision: https://reviews.llvm.org/D28374
llvm-svn: 291392
|
| |
|
|
|
|
|
|
|
|
| |
The 'fast' costs should only work for shifts by uniform constants (uniform non-constant are lowered using the slow default implementation).
Logical shifts were not taking into account that we must mask the psrlw result, so the costs needed to be doubled.
Added missing AVX2/AVX512BW costs as well.
llvm-svn: 291391
|
| |
|
|
|
|
| |
XOP was prematurely matching, doubling the cost of ashr/lshr uniform shifts.
llvm-svn: 291390
|
| |
|
|
|
|
|
| |
This allows the use of the 'read_register' intrinsics used by clang's
named register globals features.
llvm-svn: 291375
|
| |
|
|
|
|
| |
SSE41 provides pmulld which allows the simpler pslld/paddd/cvttps2dq/pmulld pattern than SSE2's use of pmuludq.
llvm-svn: 291372
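The instruction sequence named above reads like the usual trick of turning a per-lane left shift into a multiply by 2^amt, where the power of two is built through the float exponent field. The following is a minimal scalar sketch of one 32-bit lane under that assumption; shl_via_mul is a hypothetical name, and the mapping of each step to an instruction is an interpretation, not taken from the patch.

```python
import struct

MASK32 = 0xFFFFFFFF
ONE_F32_BITS = 0x3F800000  # bit pattern of 1.0f


def shl_via_mul(x, amt):
    """Model one i32 lane of shl x, amt (0 <= amt <= 31) as a multiply."""
    # pslld $23 + paddd: place amt in the exponent field, giving the float 2^amt.
    exp_bits = ((amt << 23) + ONE_F32_BITS) & MASK32
    as_float = struct.unpack("<f", struct.pack("<I", exp_bits))[0]
    # cvttps2dq: convert the float back to an integer scale factor.
    scale = int(as_float) & MASK32
    # pmulld: the low 32 bits of the product are the shifted value.
    return (x * scale) & MASK32
```

Without pmulld, the final multiply has to be assembled from pmuludq on the even and odd lanes, which is why the SSE2 pattern is longer.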
|
| |
|
|
|
|
|
|
| |
redundant with masked move instructions.
We should probably teach the two address instruction pass to turn masked moves into BLENDM when it's beneficial to the register allocator.
llvm-svn: 291371
|
| |
|
|
| |
llvm-svn: 291370
|
| |
|
|
|
|
| |
I'm not too sure how to get isel to select even all of the unmasked forms, but at least we have a consistent set now.
llvm-svn: 291368
|
| |
|
|
| |
llvm-svn: 291366
|
| |
|
|
|
|
| |
We were matching against general vector shift costs before the uniform splat costs
llvm-svn: 291365
|
| |
|
|
|
|
| |
Fixed missing checks for tests that used a '-' in the name, which was messing with update_llc_test_checks.py
llvm-svn: 291363
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The issue happens with:
%0 = ....., !tbaa !0
%1 = ....., !tbaa !1
With !0 that references !1.
In this case when loading !0 we generate a temporary for the
operand !1. We now flush it immediately and trigger the load of
!1 before moving on. If we don't we get the temporary when
attaching to %1. This is usually not an issue except that we
eagerly try to update TBAA MDNodes, which is obviously not possible
if we only have a temporary.
Differential Revision: https://reviews.llvm.org/D28423
llvm-svn: 291362
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Don't print a line multiple times, each for different inlining contexts, if
nothing happened in any context. This prevents situations like this:
[[
> main:
65 | if ((i * ni + j) % 20 == 0) fprintf
> print_array:
65 | if ((i * ni + j) % 20 == 0) fprintf
]]
which could happen if different optimizations were missed in different inlining
contexts.
llvm-svn: 291361
|
| |
|
|
|
|
|
|
| |
In fabs(x * x), it is not generally safe to assume x * x is positive if x is a NaN.
This is also less general than it could be, so this will be replaced
with a transformation on the intrinsic.
llvm-svn: 291359
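A small sketch of why the NaN case matters: a NaN carries a sign bit, fabs always clears it, and a plain multiply makes no such promise. The helper names make_nan and sign_bit_set are hypothetical, and the sign of the product itself is left unchecked because it is platform dependent.

```python
import math
import struct


def make_nan(sign_bit):
    """Build a quiet NaN with the requested sign bit (1 = set)."""
    bits = (sign_bit << 63) | 0x7FF8000000000000
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def sign_bit_set(value):
    """copysign reads the sign bit even when the value is a NaN."""
    return math.copysign(1.0, value) < 0


neg_nan = make_nan(1)
product = neg_nan * neg_nan   # still a NaN; its sign bit is not guaranteed clear
cleared = math.fabs(product)  # fabs always clears the sign bit
```

Dropping the fabs is therefore observable whenever the product is a NaN with its sign bit set.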
|
| |
|
|
| |
llvm-svn: 291354
|
| |
|
|
|
|
|
| |
congruence classes for stores, and then keep them up to date. Add
testcases.
llvm-svn: 291351
|
| |
|
|
|
|
| |
v64i8 shuffles (PR31470)
llvm-svn: 291347
|
| |
|
|
|
|
|
|
| |
Gracefully leave code that performs function-pointer bitcasts implying
non-trivial pointer conversions alone, rather than aborting, since it's
just undefined behavior.
llvm-svn: 291326
|
| |
|
|
|
|
|
|
|
|
|
|
| |
WebAssembly requires caller and callee signatures to match exactly. In LLVM,
there are a variety of circumstances where signatures may be mismatched in
practice, and one can bitcast a function address to another type to call it
as that type. This patch adds a pass which replaces bitcasted function
addresses with wrappers to replace the bitcasts.
This doesn't catch everything, but it does match many common cases.
llvm-svn: 291315
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: LLVM's non-standard notion of phi nodes means we can't both try to substitute for undef in phi nodes *and* use phi nodes as leaders all the time. This changes NewGVN to use the same semantics as SimplifyPHINode to decide which phi nodes are equivalent.
Reviewers: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28312
llvm-svn: 291308
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
r285871 introduced an assert that was overly aggressive in the case
of a same-named local in different same-named files (in different
directories), where the source name and therefore the GUID ended up
the same because the files were compiled in their own directory without
any leading path. Change the handling in the promotion logic to get
the summary for the version in that module.
This also exposed an issue where we are not always importing the
right copy, which is a performance, not a correctness, issue (because
the renaming is based on the module hash which must be different,
see the bug report for details). I will fix that as a follow-on.
Fixes PR31561.
Reviewers: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28411
llvm-svn: 291304
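To illustrate the collision described above, here is a simplified model in which a local symbol's GUID is a hash of its source file name and symbol name. The function guid and the exact string layout are assumptions made for illustration and are not claimed to match LLVM's implementation.

```python
import hashlib


def guid(symbol_name, source_file_name):
    """Hypothetical GUID: low 64 bits of an MD5 over 'file:symbol'."""
    ident = f"{source_file_name}:{symbol_name}"
    return int.from_bytes(hashlib.md5(ident.encode()).digest()[:8], "little")


# Compiled from the parent directory: the paths differ, so the GUIDs differ.
with_path = (guid("helper", "dir1/util.c"), guid("helper", "dir2/util.c"))

# Compiled inside each directory: both modules record just "util.c",
# so two distinct locals end up with the same GUID.
without_path = (guid("helper", "util.c"), guid("helper", "util.c"))
```

In this model the two locals can only be told apart by which module they came from, which is what looking up the summary for the version in that module does.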
|
| |
|
|
|
|
| |
We know that udiv %V, C can be optimized away to 0 if %V is ult C.
llvm-svn: 291296
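The fold can be checked exhaustively on a narrow type. This sketch models i8 unsigned division; udiv and udiv_fold_holds are hypothetical names, not part of the patch.

```python
BITS = 8
MASK = (1 << BITS) - 1


def udiv(v, c):
    """Unsigned division on BITS-wide values (c must be non-zero)."""
    return ((v & MASK) // (c & MASK)) & MASK


def udiv_fold_holds():
    """True if udiv V, C == 0 for every pair with V ult C."""
    return all(udiv(v, c) == 0 for c in range(1, MASK + 1) for v in range(c))
```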
|
| |
|
|
| |
llvm-svn: 291292
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Modify the --system-libs option in llvm-config to print system libs only
when using static linking. The system libraries are irrelevant when
linking to a shared library since the library has appropriate library
dependencies embedded.
Modify the --system-libs test appropriately to force static linking, and
disable it if static libs are not available (i.e. BUILD_SHARED_LIBS is
enabled).
Differential Revision: https://reviews.llvm.org/D27805
llvm-svn: 291285
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Canonicalize all CMake booleans to 0/1 before passing them to lit, to
ensure that the Python side handles all of them consistently
and correctly. 0/1 is a safe choice of values that trigger the same
boolean interpretation in CMake, Python and C++.
Furthermore, using them without quotes improves the chance Python will
explicitly fail when an incorrect value (such as ON/OFF, TRUE/FALSE,
YES/NO) is accidentally passed, rather than silently misinterpreting
the value.
This replaces many different pieces of logic spread around lit site files,
attempting to partially reproduce the boolean logic used in CMake
and usually silently failing when an uncommon value was used instead.
In fact, some of them were never working correctly since different
values were assigned in CMake and checked in Python.
The alternative solution could be to create a common parser for CMake
booleans in lit and use it consistently throughout the site files.
However, it does not seem like the best idea to create a redundant
implementation of the same logic and have to follow upstream if it ever
is extended to handle more values.
Differential Revision: https://reviews.llvm.org/D28294
llvm-svn: 291284
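A rough sketch of the failure modes described above, modelling what Python sees after CMake substitutes a value into a generated site file. The helpers load_site_value and fails_loudly are hypothetical and only mimic evaluating the right-hand side of a line such as `config.some_flag = <value>`.

```python
def load_site_value(substituted):
    """Evaluate the substituted right-hand side of a site-config assignment."""
    return eval(substituted, {"__builtins__": {}})


def fails_loudly(substituted):
    """True if Python rejects the value instead of guessing a meaning."""
    try:
        load_site_value(substituted)
    except NameError:
        return True
    return False


unquoted_one = bool(load_site_value("1"))     # canonical true
unquoted_zero = bool(load_site_value("0"))    # canonical false
quoted_off = bool(load_site_value('"OFF"'))   # non-empty string: silently true
```

An unquoted ON or TRUE is an undefined Python name and raises immediately, whereas a quoted "OFF" is a non-empty string that quietly evaluates as true.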
|
| |
|
|
|
|
|
|
|
| |
Remove config.test_examples from lit.site.cfg and the relevant
ENABLE_EXAMPLES definition from CMake. It is not used anywhere.
Differential Revision: https://reviews.llvm.org/D28283
llvm-svn: 291283
|
| |
|
|
|
|
| |
We know that urem %V, C can be optimized away to %V if %V is ult C.
llvm-svn: 291282
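As a quick sanity check of the identity, this sketch verifies it for every i8 pair; urem and urem_fold_holds are hypothetical names used only for illustration.

```python
BITS = 8
MASK = (1 << BITS) - 1


def urem(v, c):
    """Unsigned remainder on BITS-wide values (c must be non-zero)."""
    return ((v & MASK) % (c & MASK)) & MASK


def urem_fold_holds():
    """True if urem V, C == V for every pair with V ult C."""
    return all(urem(v, c) == v for c in range(1, MASK + 1) for v in range(c))
```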
|
| |
|
|
|
|
|
|
|
|
|
| |
This is fixing a bug where Loop Vectorization is widening a load but
with a lower alignment. Hoisting the load without propagating the alignment
will allow inst-combine to later deduce a higher alignment than what the pointer
actually has.
Differential Revision: https://reviews.llvm.org/D28408
llvm-svn: 291281
|
| |
|
|
|
|
|
|
| |
This will make transition to SCRATCH_MEMORY easier
Differential Revision: https://reviews.llvm.org/D24746
llvm-svn: 291279
|
| |
|
|
|
|
| |
Made no sense for them to be different and caused useless diffs in assembly remarks.
llvm-svn: 291274
|
| |
|
|
| |
llvm-svn: 291269
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Re-apply r288561: This time with a fix where the ADDs that are part of a
3 instruction LOH would not invalidate the "LastAdrp" state. This fixes
http://llvm.org/PR31361
Previously this pass was using up to 5% compile time in some cases which
is a bit much for what it is doing. The pass featured a full blown
data-flow analysis which in the default configuration was restricted to a
single block.
This rewrites the pass under the assumption that we only ever work on a
single block. This is done in a single pass maintaining a state machine
per general purpose register to catch LOH patterns.
Differential Revision: https://reviews.llvm.org/D27329
This reverts commit 9e6cedb0a4f14364d6511597a9160305e7d34493.
llvm-svn: 291266
|
| |
|
|
| |
llvm-svn: 291265
|
| |
|
|
|
|
|
|
| |
As discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/108749.html
...we should be able to better optimize this pattern.
llvm-svn: 291262
|
| |
|
|
|
|
|
|
|
|
| |
LICM in
order to avoid jumpy line tables. Calls are left alone because they may be inlined.
Differential Revision: https://reviews.llvm.org/D28390
llvm-svn: 291258
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D28403
llvm-svn: 291254
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D27732
llvm-svn: 291245
|
| |
|
|
|
|
| |
The EVEX -> VEX fix means that AVX/AVX512 code is more likely the same now.
llvm-svn: 291242
|
| |
|
|
|
|
| |
The EVEX -> VEX fix means that AVX/AVX512 code is more likely the same now.
llvm-svn: 291241
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Previously we only supported constant-masked loads and stores.
Reviewers: kcc, RKSimon, pgousseau, gbedwell, vitalybuka
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28370
llvm-svn: 291238
|
| |
|
|
|
|
| |
Set the costs on the lowest target that supports the type.
llvm-svn: 291229
|
| |
|
|
|
|
| |
Added a test demonstrating bug in AVX512 division costs
llvm-svn: 291228
|
| |
|
|
|
|
| |
It is a common convention that our internal test runner depends upon.
llvm-svn: 291227
|
| |
|
|
| |
llvm-svn: 291214
|