| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes a bunch of assert-on-invalid-bitcode regressions after 315483.
Expected<> calls assertIsChecked() in its dtor, and operator bool() only calls
setChecked() if there's no error. So for functions that don't return an error
itself, the Expected<> version needs explicit code to disarm the error that the
ErrorOr<> code didn't need.
https://reviews.llvm.org/D39437
llvm-svn: 317010
|
|
|
|
|
|
|
| |
COFF comdats require symbol table entries, which means the comdat leader
cannot have private linkage.
llvm-svn: 317009
|
|
|
|
|
|
|
|
| |
computeKnownBits/ComputeNumSignBits
Mainly a perf improvements as most combines will have occurred before we lower to these instructions
llvm-svn: 317005
|
|
|
|
|
|
| |
No functionality change intended.
llvm-svn: 317003
|
|
|
|
|
|
|
|
|
|
| |
If a select instruction tests the returned flag of a cmpxchg instruction and
selects between the returned value of the cmpxchg instruction and its compare
operand, the result of the select will always be equal to its false value.
Differential Revision: https://reviews.llvm.org/D39383
llvm-svn: 316994
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The optimisation remarks for loop unrolling with an unrolled remainder looks something like:
test.c:7:18: remark: completely unrolled loop with 3 iterations [-Rpass=loop-unroll]
C[i] += A[i*N+j];
^
test.c:6:9: remark: unrolled loop by a factor of 4 with run-time trip count [-Rpass=loop-unroll]
for(int j = 0; j < N; j++)
^
This removes the first of the two messages.
Differential revision: https://reviews.llvm.org/D38725
llvm-svn: 316986
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
extract subvector of vXi1 from vYi1 is poorly supported by LLVM and most of the time end with an assertion.
This patch fixes this issue by adding new patterns to the TD file.
Reviewers:
1. guyblank
2. igorb
3. zvi
4. ayman
5. craig.topper
Differential Revision: https://reviews.llvm.org/D39292
Change-Id: Ideb4d7e946c8d40cfce2920891f2d89fe64c58f8
llvm-svn: 316981
|
|
|
|
|
|
|
| |
The address can be presented as a bitcast of baseReg.
In this case it is still trivial but OriginalValue != baseReg.
llvm-svn: 316980
|
|
|
|
|
|
|
|
|
|
| |
Rename `Offset`, `Scale`, `Length` into `Begin`, `Step`, `End` respectively
to make naming of similar entities for Ranges and Range Checks more
consistent.
Differential Revision: https://reviews.llvm.org/D39414
llvm-svn: 316979
|
|
|
|
|
|
|
|
|
|
| |
enabled. Use 128-bit VLX instruction when VLX is enabled.
Unfortunately, this weakens our ability to do domain fixing when AVX512DQ is not enabled, but it is consistent with our 256-bit behavior.
Maybe we should add custom handling to domain fixing to allow EVEX integer XOR/AND/OR/ANDN to switch to VEX encoded fp instructions if the high registers aren't being used?
llvm-svn: 316978
|
|
|
|
| |
llvm-svn: 316977
|
|
|
|
|
|
|
|
| |
As noted in the nice block comment, the previous code didn't actually handle multi-entry loops correctly, it just assumed SCEV didn't analyze such loops. Given SCEV has comments to the contrary, that seems a bit suspect. More importantly, the pass actually requires loopsimplify form which ensures a loop-preheader is available. Remove the excessive generaility and shorten the code greatly.
Note that we do successfully analyze many multi-entry loops, but we do so by converting them to single entry loops. See the added test case.
llvm-svn: 316976
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pass control flow to successors"
This patch fixes the miscompile that happens when PRE hoists loads across guards and
other instructions that don't always pass control flow to their successors. PRE is now prohibited
to hoist across such instructions because there is no guarantee that the load standing after such
instruction is still valid before such instruction. For example, a load from under a guard may be
invalid before the guard in the following case:
int array[LEN];
...
guard(0 <= index && index < LEN);
use(array[index]);
Differential Revision: https://reviews.llvm.org/D37460
llvm-svn: 316975
|
|
|
|
|
|
|
|
| |
Previously, the code returned early from the *function* when it couldn't find a free expansion, it should be returning from the *transform*. I don't have a test case, noticed this via inspection.
As a follow up, I'm going to revisit the logic in the extract function. I think that essentially the whole helper routine can be replaced with SCEVExpander, but I wanted to do that in a series of separate commits.
llvm-svn: 316974
|
|
|
|
| |
llvm-svn: 316973
|
|
|
|
|
|
| |
These files shouldn't have been submitted in 316967
llvm-svn: 316968
|
|
|
|
|
|
|
|
| |
Issue found by llvm-isel-fuzzer on OSS fuzz, https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3725
If anyone actually cares about > 64 bit arithmetic, there's a lot more to do in this area. There's a bunch of obviously wrong code in the same function. I don't have the time to fix all of them and am just using this to understand what the workflow for fixing fuzzer cases might look like.
llvm-svn: 316967
|
|
|
|
| |
llvm-svn: 316964
|
|
|
|
|
|
| |
We don't need to extend/truncate the Known structure before calling computeKnownBits - it will reset at the start of the function.
llvm-svn: 316962
|
|
|
|
| |
llvm-svn: 316960
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
InferAddressSpaces assumes the pointee type of addrspacecast
is the same as the operand, which is not always true and causes
invalid IR.
This bug cause build failure in HCC.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D39432
llvm-svn: 316957
|
|
|
|
| |
llvm-svn: 316955
|
|
|
|
|
|
|
|
| |
It's not guaranteed. There's a bug open to sort them in predecessor
order, but it won't happen anytime soon. In the meanwhile, passes
will have to do an O(#preds) scan. Such is life.
llvm-svn: 316953
|
|
|
|
|
|
|
|
|
|
| |
Revert r316478.
A test case has failed.
Will recommit this change once we find and fix the failure.
This reverts commit 7c330fabaedaba3d02c58bc3cc1198896c895f34.
llvm-svn: 316952
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
intrinsic. (NFC)
Summary:
For reference, see: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116589.html
This patch fleshes out the instruction class hierarchy with respect to atomic and
non-atomic memory intrinsics. With this change, the relevant part of the class
hierarchy becomes:
IntrinsicInst
-> MemIntrinsicBase (methods-only class)
-> MemIntrinsic (non-atomic intrinsics)
-> MemSetInst
-> MemTransferInst
-> MemCpyInst
-> MemMoveInst
-> AtomicMemIntrinsic (atomic intrinsics)
-> AtomicMemSetInst
-> AtomicMemTransferInst
-> AtomicMemCpyInst
-> AtomicMemMoveInst
-> AnyMemIntrinsic (both atomicities)
-> AnyMemSetInst
-> AnyMemTransferInst
-> AnyMemCpyInst
-> AnyMemMoveInst
This involves some class renaming:
ElementUnorderedAtomicMemCpyInst -> AtomicMemCpyInst
ElementUnorderedAtomicMemMoveInst -> AtomicMemMoveInst
ElementUnorderedAtomicMemSetInst -> AtomicMemSetInst
A script for doing this renaming in downstream trees is included below.
An example of where the Any* classes should be used in LLVM is when reasoning
about the effects of an instruction (ex: aliasing).
---
Script for renaming AtomicMem* classes:
PREFIXES="[<,([:space:]]"
CLASSES="MemIntrinsic|MemTransferInst|MemSetInst|MemMoveInst|MemCpyInst"
SUFFIXES="[;)>,[:space:]]"
REGEX="(${PREFIXES})ElementUnorderedAtomic(${CLASSES})(${SUFFIXES})"
REGEX2="visitElementUnorderedAtomic(${CLASSES})"
FILES=$( grep -E "(${REGEX}|${REGEX2})" -r . | tr ':' ' ' | awk '{print $1}' | sort | uniq )
SED_SCRIPT="s~${REGEX}~\1Atomic\2\3~g"
SED_SCRIPT2="s~${REGEX2}~visitAtomic\1~g"
for f in $FILES; do
echo "Processing: $f"
sed -i ".bak" -E "${SED_SCRIPT};${SED_SCRIPT2};${EA_SED_SCRIPT};${EA_SED_SCRIPT2}" $f
done
Reviewers: sanjoy, deadalnix, apilipenko, anna, skatkov, mkazantsev
Reviewed By: sanjoy
Subscribers: hfinkel, jholewinski, arsenm, sdardis, nhaehnle, JDevlieghere, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D38419
llvm-svn: 316950
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This fixes failure in Transforms/GVNHoist/hoist.ll uncovered by D39245.
Reviewers: hiraditya, spop, dberlin
Reviewed By: dberlin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D39410
llvm-svn: 316949
|
|
|
|
| |
llvm-svn: 316947
|
|
|
|
| |
llvm-svn: 316944
|
|
|
|
| |
llvm-svn: 316933
|
|
|
|
| |
llvm-svn: 316927
|
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D38312
Change-Id: I71c8605a8e4c98013ef25289694afc5cfd46bb0b
llvm-svn: 316921
|
|
|
|
| |
llvm-svn: 316920
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The two 32-bit words were swapped. Update a test omitted in reverted r316270.
Reviewers: jtony, aaron.ballman
Subscribers: nemanjai, kbarton
Differential Revision: https://reviews.llvm.org/D39163
llvm-svn: 316916
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
flag is being used.
Summary:
INC/DEC don't update the carry flag so we need to make sure we don't try to use it.
This patch introduces new X86ISD opcodes for locked INC/DEC. Teaches lowerAtomicArithWithLOCK to emit these nodes if INC/DEC is not slow or the function is being optimized for size. An additional flag is added that allows the INC/DEC to be disabled if the caller determines that the carry flag is being requested.
The test_sub_1_cmp_1_setcc_ugt test is currently showing this bug. The other test case changes are recovering cases that were regressed in r316860.
This should fully fix PR35068 finishing the fix started in r316860.
Reviewers: RKSimon, zvi, spatel
Reviewed By: zvi
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D39411
llvm-svn: 316913
|
|
|
|
|
|
| |
This shouldn't be needed anymore since i1 isn't a legal type.
llvm-svn: 316912
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Identifies kernels which performs device side kernel enqueues and emit
metadata for the associated hidden kernel arguments. Such kernels are
marked with calls-enqueue-kernel function attribute by
AMDGPUOpenCLEnqueueKernelLowering pass and later on
hidden kernel arguments metadata HiddenDefaultQueue and
HiddenCompletionAction are emitted for them.
Differential Revision: https://reviews.llvm.org/D39255
llvm-svn: 316907
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Targets that want to support memcmp expansions now return the list of
supported load sizes.
- Expansion codegen does not assume that all power-of-two load sizes
smaller than the max load size are valid. For examples, this is not the
case for x86(32bit)+sse2.
Fixes PR34887.
llvm-svn: 316905
|
|
|
|
| |
llvm-svn: 316904
|
|
|
|
|
|
|
|
| |
Adding support for VSUB.
Reviewed by: @rovka
Differential Revision: https://reviews.llvm.org/D39261
llvm-svn: 316902
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D38626
llvm-svn: 316898
|
|
|
|
|
|
|
| |
This reverts commit r316890.
Change-Id: I683cceee9848ef309b452293086b1f26a941950d
llvm-svn: 316894
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This version of the patch includes a fix addressing a stage2 LTO buildbot
failure and addressed some additional nits.
Original commit message:
This updates the SCCP solver to use of the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.
For the following function, f() can be optimized to ret i32 2 with
this change
source_filename = "sccp.c"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
; Function Attrs: norecurse nounwind readnone uwtable
define i32 @main() local_unnamed_addr #0 {
entry:
%call = tail call fastcc i32 @f(i32 1)
%call1 = tail call fastcc i32 @f(i32 47)
%add3 = add nsw i32 %call, %call1
ret i32 %add3
}
; Function Attrs: noinline norecurse nounwind readnone uwtable
define internal fastcc i32 @f(i32 %x) unnamed_addr #1 {
entry:
%c1 = icmp sle i32 %x, 100
%cmp = icmp sgt i32 %x, 300
%. = select i1 %cmp, i32 1, i32 2
ret i32 %.
}
attributes #1 = { noinline }
Reviewers: davide, sanjoy, efriedma, dberlin
Reviewed By: davide, dberlin
Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits
Differential Revision: https://reviews.llvm.org/D36656
llvm-svn: 316891
|
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D38312
Change-Id: I6551fb13879e098aed74de410e29815cf37d9ab5
llvm-svn: 316890
|
|
|
|
| |
llvm-svn: 316889
|
|
|
|
| |
llvm-svn: 316888
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This version of the patch includes a fix addressing a stage2 LTO buildbot
failure and addressed some additional nits.
Original commit message:
This updates the SCCP solver to use of the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.
For the following function, f() can be optimized to ret i32 2 with
this change
source_filename = "sccp.c"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
; Function Attrs: norecurse nounwind readnone uwtable
define i32 @main() local_unnamed_addr #0 {
entry:
%call = tail call fastcc i32 @f(i32 1)
%call1 = tail call fastcc i32 @f(i32 47)
%add3 = add nsw i32 %call, %call1
ret i32 %add3
}
; Function Attrs: noinline norecurse nounwind readnone uwtable
define internal fastcc i32 @f(i32 %x) unnamed_addr #1 {
entry:
%c1 = icmp sle i32 %x, 100
%cmp = icmp sgt i32 %x, 300
%. = select i1 %cmp, i32 1, i32 2
ret i32 %.
}
attributes #1 = { noinline }
Reviewers: davide, sanjoy, efriedma, dberlin
Reviewed By: davide, dberlin
Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits
Differential Revision: https://reviews.llvm.org/D36656
llvm-svn: 316887
|
|
|
|
|
|
|
|
| |
It is done to uniformly handle instructions removal.
Differential Revision: https://reviews.llvm.org/D39369
llvm-svn: 316884
|
|
|
|
|
|
|
|
| |
foldMemoryOperandImpl methods together without partial/undef register handling in the middle. NFC
I have a future patch that wants to make use of the one of the partial functions in one of the earlier memory folding methods and the current ordering prevents that.
llvm-svn: 316883
|
|
|
|
| |
llvm-svn: 316882
|
|
|
|
|
|
| |
patch. NFC
llvm-svn: 316881
|