| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
| |
This reverts commit r349674. It causes a failure in
test-suite enc-3des.execution_time.
llvm-svn: 349684
|
| |
|
|
| |
llvm-svn: 349681
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This (mostly) fixes https://bugs.llvm.org/show_bug.cgi?id=39771.
BDCE currently detects instructions that don't have any demanded bits
and replaces their uses with zero. However, if an instruction has
multiple uses, then some of the uses may be dead (have no demanded bits)
even though the instruction itself is still live. This patch extends
DemandedBits/BDCE to detect such uses and replace them with zero.
While this will not immediately render any instructions dead, it may
lead to simplifications (in the motivating case, by converting a rotate
into a simple shift), break dependencies, etc.
The implementation tries to strike a balance between analysis power and
complexity/memory usage. Originally I wanted to track demanded bits on
a per-use level, but ultimately we're only really interested in whether
a use is entirely dead or not. I'm using an extra set to track which uses
are dead. However, as initially all uses are dead, I'm not storing uses
those user is also dead. This case is checked separately instead.
The test case has a couple of cases that are not simplified yet. In
particular, we're only looking at uses of instructions right now. I think
it would make sense to also extend this to arguments. Furthermore
DemandedBits doesn't yet know some of the tricks that InstCombine does
for the demanded bits or bitwise or/and/xor in combination with known
bits information.
Differential Revision: https://reviews.llvm.org/D55563
llvm-svn: 349674
|
| |
|
|
|
|
| |
This reduces indentation and makes it obvious this function always returns something.
llvm-svn: 349671
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is to address post-commit feedback from Paul Robinson on r348954.
The original commit misinterprets count and upper bound as the same thing (I thought I saw GCC producing an upper bound the same as Clang's count, but GCC correctly produces an upper bound that's one less than the count (in C, that is, where arrays are zero indexed)).
I want to preserve the C-like output for the common case, so in the absence of a lower bound the count (or one greater than the upper bound) is rendered between []. In the trickier cases, where a lower bound is specified, a half-open range is used (eg: lower bound 1, count 2 would be "[1, 3)" and an unknown parts use a '?' (eg: "[1, ?)" or "[?, 7)" or "[?, ? + 3)").
Reviewers: aprantl, probinson, JDevlieghere
Differential Revision: https://reviews.llvm.org/D55721
llvm-svn: 349670
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The LTO/ThinLTO driver currently creates invalid bitcode by setting
symbols marked dllimport as dso_local. The compiler often has access
to the definition (often dllexport) and the declaration (often
dllimport) of an object at link-time, leading to a conflicting
declaration. This patch resolves the inconsistency by removing the
dllimport attribute.
Reviewers: tejohnson, pcc, rnk, echristo
Reviewed By: rnk
Subscribers: dmikulin, wristow, mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, dang, llvm-commits
Differential Revision: https://reviews.llvm.org/D55627
llvm-svn: 349667
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This adds a G_FCEIL generic instruction and uses it in AArch64. This adds
selection for floating point ceil where it has a supported, dedicated
instruction. Other cases aren't handled here.
It updates the relevant gisel tests and adds a select-ceil test. It also adds a
check to arm64-vcvt.ll which ensures that we don't fall back when we run into
one of the relevant cases.
llvm-svn: 349664
|
| |
|
|
|
|
|
|
|
|
|
|
| |
processing
The (cmp (and X, Y) 0) pattern is greedy and ends up forming a TESTrr and consuming the and when it might be better to use one of the BMI/TBM like BLSR or BLSI.
This patch moves removes the pattern from isel and adds a post processing check to combine TESTrr+ANDrr into just a TESTrr. With this patch we are able to select the BMI/TBM instructions, but we'll also emit a TESTrr when the result is compared to 0. In many cases the peephole pass will be able to use optimizeCompareInstr to remove the TEST, but its probably not perfect.
Differential Revision: https://reviews.llvm.org/D55870
llvm-svn: 349661
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes https://bugs.llvm.org/show_bug.cgi?id=38743
The function removeRedundantBlockingStores is supposed to remove any blocking stores contained in each other in lockingStoresDispSizeMap.
But it currently looks only at the previous one, which will miss some cases that result in assert.
This patch refine the function to check all previous layouts until find the uncontained one. So all redundant stores will be removed.
Patch by Pengfei Wang
Differential Revision: https://reviews.llvm.org/D55642
llvm-svn: 349660
|
| |
|
|
| |
llvm-svn: 349652
|
| |
|
|
|
|
| |
Fix typos.
llvm-svn: 349644
|
| |
|
|
| |
llvm-svn: 349641
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements BTF (BPF Type Format).
The BTF is the debug info format for BPF, introduced
in the below linux patch:
https://github.com/torvalds/linux/commit/69b693f0aefa0ed521e8bd02260523b5ae446ad7#diff-06fb1c8825f653d7e539058b72c83332
and further extended several times, e.g.,
https://www.spinics.net/lists/netdev/msg534640.html
https://www.spinics.net/lists/netdev/msg538464.html
https://www.spinics.net/lists/netdev/msg540246.html
The main advantage of implementing in LLVM is:
. better integration/deployment as no extra tools are needed.
. bpf JIT based compilation (like bcc, bpftrace, etc.) can get
BTF without much extra effort.
. BTF line_info needs selective source codes, which can be
easily retrieved when inside the compiler.
This patch implemented BTF generation by registering a BPF
specific DebugHandler in BPFAsmPrinter.
Signed-off-by: Yonghong Song <yhs@fb.com>
Differential Revision: https://reviews.llvm.org/D55752
llvm-svn: 349640
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Import libraries as created by llvm-dlltool always use the same archive
member name for every object file (namely, the DLL library name). Ensure
that long names are not repeatedly stored in the string table.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D55860
llvm-svn: 349637
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
generic intrinsics (llvm)
Now that we use the generic ISD opcodes, we can use the generic intrinsics directly as well. This fixes the poor fast-isel codegen by not expanding to an easily broken IR code sequence.
I'm intending to deal with the signed saturation equivalents as well.
Clang counterpart: https://reviews.llvm.org/D55879
Differential Revision: https://reviews.llvm.org/D55855
llvm-svn: 349630
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
(part 2 of 2)
Now that SimplifyDemandedBits/SimplifyDemandedVectorElts is simplifying vector elements, we're seeing more constant BUILD_VECTOR containing undefs.
This patch provides opt-in support for UNDEF elements in matchBinaryPredicate, passing NULL instead of the result ConstantSDNode* argument.
I've updated the (or (and X, c1), c2) -> (and (or X, c2), c1|c2) fold to demonstrate its use, which I believe is safe for undef cases.
Differential Revision: https://reviews.llvm.org/D55822
llvm-svn: 349629
|
| |
|
|
|
|
|
|
|
|
|
|
| |
(part 1 of 2)
Now that SimplifyDemandedBits/SimplifyDemandedVectorElts is simplifying vector elements, we're seeing more constant BUILD_VECTOR containing undefs.
This patch provides opt-in support for UNDEF elements in matchBinaryPredicate, passing NULL instead of the result ConstantSDNode* argument.
Differential Revision: https://reviews.llvm.org/D55822
llvm-svn: 349628
|
| |
|
|
|
|
|
|
|
|
|
|
| |
As described on PR40091, we have several places where zext (and zext_vector_inreg) fold an undef input into an undef output. For zero extensions this is incorrect as the output should guarantee to least have the new upper bits set to zero.
SimplifyDemandedVectorElts is the worst offender (and its the most likely to cause new undefs to appear) but DAGCombiner's tryToFoldExtendOfConstant has a similar issue.
Thanks to @dmgreen for catching this.
Differential Revision: https://reviews.llvm.org/D55883
llvm-svn: 349625
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
attempt 2
This relands r330742:
"""
Let TableGen write output only if it changed, instead of doing so in cmake.
Removes one subprocess and one temp file from the build for each tablegen
invocation.
No intended behavior change.
"""
In particular, if you see rebuilds after this change that you didn't see
before this change, that's unintended and it's fine to revert this change
again (but let me know).
r330742 got reverted because some people reported that llvm-tblgen ran on every
build after it. This could happen if the depfile output got deleted without
deleting the main .inc output. To fix, make TableGen always write the depfile,
but keep writing the main .inc output only if it has changed. This matches what
we did in cmake before.
Differential Revision: https://reviews.llvm.org/D55842
llvm-svn: 349624
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Using HI here makes no logical sense, since the dword is only
32 bits to begin with.
Current Mesa master does not look at the relocation type at all,
so this change is fine. Future Mesa will rely on this, however.
Change-Id: I91085707834c4ac0370926602b93c94b90e44cb1
Reviewers: arsenm, rampitec, mareko
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D55369
llvm-svn: 349620
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Now that SimplifyDemandedBits/SimplifyDemandedVectorElts are simplifying vector elements, we're seeing more constant BUILD_VECTOR containing UNDEFs.
This patch provides opt-in handling of UNDEF elements in matchUnaryPredicate, passing NULL instead of the ConstantSDNode* argument.
I've updated SelectionDAG::simplifyShift to demonstrate its use.
Differential Revision: https://reviews.llvm.org/D55819
llvm-svn: 349616
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fix an issue where VGPR/SGPR bounds are not properly extended when brackets are merged.
This manifests as missing waitcnt insertions when multiple brackets are forwarded to a successor block and the first forward has lower VGPR/SGPR bounds.
Irreducible loop test has been extended based on a CTS failure detected for GFX9.
Reviewers: nhaehnle
Reviewed By: nhaehnle
Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D55602
llvm-svn: 349611
|
| |
|
|
|
|
|
|
|
|
|
| |
All we have to do is mark it as legal.
This allows us to select a lot of new patterns handled by TableGen. This
patch adds tests for them and splits up the existing test file for
binary operators into 2 files, one for arithmetic ops and one for
logical ones.
llvm-svn: 349610
|
| |
|
|
| |
llvm-svn: 349608
|
| |
|
|
|
|
|
|
|
| |
This is an initial implementation of no-op passthrough copying of COFF
with objcopy.
Differential Revision: https://reviews.llvm.org/D54939
llvm-svn: 349605
|
| |
|
|
|
|
|
|
|
|
|
|
| |
For type v4i32/v8ii16/v16i8, do following transforms:
(vselect (setcc a, b, setugt), (sub a, b), (sub b, a)) -> (vabsd a, b)
(vselect (setcc a, b, setuge), (sub a, b), (sub b, a)) -> (vabsd a, b)
(vselect (setcc a, b, setult), (sub b, a), (sub a, b)) -> (vabsd a, b)
(vselect (setcc a, b, setule), (sub b, a), (sub a, b)) -> (vabsd a, b)
Differential Revision: https://reviews.llvm.org/D55812
llvm-svn: 349599
|
| |
|
|
| |
llvm-svn: 349569
|
| |
|
|
| |
llvm-svn: 349568
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch moved the following files in lib/CodeGen/AsmPrinter/
AsmPrinterHandler.h
DbgEntityHistoryCalculator.h
DebugHandlerBase.h
to include/llvm/CodeGen directory.
Such a change will enable Target to extend DebugHandlerBase
and emit Target specific debug info sections.
Signed-off-by: Yonghong Song <yhs@fb.com>
Differential Revision: https://reviews.llvm.org/D55755
llvm-svn: 349564
|
| |
|
|
|
|
|
|
|
|
| |
weak_external in some cases
Clang uses weak linkage for objc runtime functions when they are not available on the platform.
The intrinsic has this linkage so we just need to pass that on to the runtime call.
llvm-svn: 349559
|
| |
|
|
|
|
|
|
| |
For performance reasons, clang set nonlazybind on these functions. Now that we
are using intrinsics instead of runtime calls, we should set this attribute when
creating the runtime functions.
llvm-svn: 349558
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a VectorizationSafetyStatus enum, which will be extended
in a follow up patch to distinguish between 'safe with runtime checks'
and 'known unsafe' dependences.
Reviewers: anemet, anna, Ayal, hsaito
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D54892
llvm-svn: 349556
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
unnamed_addr is still useful for detecting of ODR violations on vtables
Still unnamed_addr with lld and --icf=safe or --icf=all can trigger false
reports which can be avoided with --icf=none or by using private aliases
with -fsanitize-address-use-odr-indicator
Reviewers: eugenis
Reviewed By: eugenis
Subscribers: kubamracek, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D55799
llvm-svn: 349555
|
| |
|
|
|
|
|
|
|
|
| |
instead of SDAG.
SelectionDAG currently changes these intrinsics to function calls, but that won't work
for other ISel's. Also we want to eventually support nonlazybind and weak linkage coming
from the front-end which we can't do in SelectionDAG.
llvm-svn: 349552
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D55670
llvm-svn: 349549
|
| |
|
|
|
|
|
|
|
|
| |
over them
Looks like there are valid reasons why we need to allow bitcasts in llvm.asan.globals, see discussion at https://github.com/apple/swift-llvm/pull/133. Let's look through bitcasts when iterating over entries in the llvm.asan.globals list.
Differential Revision: https://reviews.llvm.org/D55794
llvm-svn: 349544
|
| |
|
|
|
|
| |
Dump the resources masks as hexadecimal.
llvm-svn: 349536
|
| |
|
|
|
|
|
|
|
|
|
| |
We're moving ARC optimisation and ARC emission in clang away from runtime methods
and towards intrinsics. This is the part which actually uses the intrinsics in the ARC
optimizer when both analyzing the existing calls and emitting new ones.
Differential Revision: https://reviews.llvm.org/D55348
Reviewers: ahatanak
llvm-svn: 349534
|
| |
|
|
|
|
|
|
| |
We already had BSF here as part of __builtin_ffs improvements and I was just wondering yesterday whether we should have BSR there.
This addresses one issue from PR40090.
llvm-svn: 349531
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Checking whether a number has a certain number of trailing / leading
zeros means checking whether it is of the form XXXX1000 / 0001XXXX,
which can be done with an and+icmp.
Related to https://bugs.llvm.org/show_bug.cgi?id=28668. As a next
step, this can be extended to non-equality predicates.
Differential Revision: https://reviews.llvm.org/D55745
llvm-svn: 349530
|
| |
|
|
|
|
|
|
|
|
| |
processBaseWithConstOffset().
Summary: 32bit operand sizes are guaranteed by the opcode check AMDGPU::V_ADD_I32_e64 and
AMDGPU::V_ADDC_U32_e64. Therefore, we don't any additional operand size-check-assert.
Author: FarhanaAleen
llvm-svn: 349529
|
| |
|
|
|
|
| |
Post commit review/bug reported by Pavel Labath - thanks!
llvm-svn: 349528
|
| |
|
|
|
|
| |
We can use the result fetched a few lines above.
llvm-svn: 349527
|
| |
|
|
|
|
|
|
| |
Let type legalization and op legalization deal with it.
Now that we've switched to target independent nodes we can rely on generic infrastructure to do the legalization for us.
llvm-svn: 349526
|
| |
|
|
|
|
|
|
|
| |
As the FIXME indicates, this has the potential to go
overboard. So I'm not sure if it's even worth keeping
this vs. iteratively doing simple matches, but we might
as well clean it up.
llvm-svn: 349523
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Migrate the X86 backend from X86ISD opcodes ADDS and SUBS to generic
ISD opcodes SADDSAT and SSUBSAT. This also improves scodegen for
@llvm.sadd.sat() and @llvm.ssub.sat() intrinsics.
This is a followup to D55787 and part of PR40056.
Differential Revision: https://reviews.llvm.org/D55833
llvm-svn: 349520
|
| |
|
|
|
|
|
|
|
|
|
|
| |
InstCombine seems to canonicalize or PSUB patter into a max with the cosntant and an add with an inverse of the constant.
This patch recognizes this pattern and turns it into PSUBUS. Future work could improve undef element handling.
Fixes some of PR40053
Differential Revision: https://reviews.llvm.org/D55780
llvm-svn: 349519
|
| |
|
|
| |
llvm-svn: 349518
|
| |
|
|
|
|
| |
Differential revision: https://reviews.llvm.org/D55723
llvm-svn: 349516
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This the initial code change to facilitate managing FMF flags from Instructions to MI wrt Intrinsics in Global Isel. Eventually the GlobalObserver interface will be added as well, where FMF additions can be tracked for the builder and CSE.
Reviewers: aditya_nandakumar, bogner
Reviewed By: bogner
Subscribers: rovka, kristof.beyls, javed.absar
Differential Revision: https://reviews.llvm.org/D55668
llvm-svn: 349514
|