summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [PGO][PGSO] Fix -DBUILD_SHARED_LIBS=on builds after ↵Fangrui Song2019-10-311-0/+22
| | | | | | | | | | | D69580/llvmorg-10-init-8797-g0d987e411ac Move TargetLoweringBase::isSuitableForJumpTable from llvm/CodeGen/TargetLowering.h to .cpp, to avoid the undefined reference from all LLVM${Target}ISelLowering.cpp. Another fix is to add a dependency on TransformUtils to all lib/Target/$Target/LLVMBuild.txt, but that is too disruptive.
* [X86] Remove FSIN/FCOS isel patterns and the pseudo instructions that they ↵Craig Topper2019-10-313-13/+3
| | | | | | | selected for the FP stackifier. We always expand these to libcalls so get rid of the last vestiges of using the instructions.
* [AArch64] Update for ExynosEvandro Menezes2019-10-313-2/+15
| | | | Fix the costs of `add` and `orr` with an immediate operand.
* [PGO][PGSO] TargetLowering/TargetTransformationInfo/SwitchLoweringUtils part.Hiroshi Yamauchi2019-10-315-9/+13
| | | | | | | | | | | | | | | | Summary: (Split of off D67120) TargetLowering/TargetTransformationInfo/SwitchLoweringUtils changes for profile guided size optimization. Reviewers: davidxl Subscribers: eraman, hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69580
* [Attributor] Really use the executed-contextJohannes Doerfert2019-10-311-14/+29
| | | | | | | | | | | | | | | | | | Before we did not follow casts and geps when we looked at the users of a pointer in the pointers must-be-executed-context. This caused us to fail to determine if it was accessed for sure. With this change we follow such users now. The above extension exposed problems in getKnownNonNullAndDerefBytesForUse which did not always check what the base pointer was. We also did not handle negative offsets as conservative as we have to without explicit loop handling. Finally, we should not derive a huge number if we access a pointer that was traversed backwards first. The problems exposed by this functional change are already tested in the existing test cases as is the functional change. Differential Revision: https://reviews.llvm.org/D69647
* [SLP] Vectorize jumbled stores.Alexey Bataev2019-10-311-20/+96
| | | | | | | | | | | | | Summary: Patch adds support for vectorization of the jumbled stores. The value operands are vectorized and then shuffled in the right order before store. Reviewers: RKSimon, spatel, hfinkel, mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43339
* [Attributor] Make AANonNull perform context sensitive queriesJohannes Doerfert2019-10-311-21/+6
| | | | | | | | | | | | | | | | | Summary: In order to get context sensitivity from isKnownNonZero we need to provide a context instruction *and* a dominator tree. The latter is passed now to which actually allows to remove some initialization code. Tests taken from PR43833. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69595
* [ValueTracking] Allow context-sensitive nullness check for non-pointersJohannes Doerfert2019-10-312-7/+12
| | | | | | | | | | | | | | | Same as D60846 but with a fix for the problem encountered there which was a missing context adjustment in the handling of PHI nodes. The test that caused D60846 to be reverted was added in e15ab8f277c7. Reviewers: nikic, nlopes, mkazantsev,spatel, dlrobertson, uabelho, hakzsam Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69571
* Revert rG0e252ae19ff8d99a59d64442c38eeafa5825d441 : [X86] Enable YMM memcmp ↵Simon Pilgrim2019-10-311-2/+3
| | | | | | | | with AVX1 Breaks build bots Differential Revision: https://reviews.llvm.org/D69658
* [X86] Enable YMM memcmp with AVX1David Zarzycki2019-10-311-3/+2
| | | | | | | | Update TargetTransformInfo to allow AVX1 to use YMM registers for memcmp. This is a follow up to D68632 which enabled XOR compares which made this possible. https://reviews.llvm.org/D69658
* Revert rG57ee0435bd47f23f3939f402914c231b4f65ca5e - [TII] Use optional ↵Simon Pilgrim2019-10-3112-78/+108
| | | | | | destination and source pair as a return value; NFC This is breaking MSVC builds: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20375
* [IPCP] Bail on extractvalue's with more than 1 index.Craig Topper2019-10-311-1/+1
| | | | | | | | | | | | | | The replacement code only looks at the first index of the extractvalue. If there are additional indices we'll end up doing a bad replacement. This only happens if the function returns a nested struct. Not sure if clang ever generates such code. The original report came from ispc. Fixes PR43857 Differential Revision: https://reviews.llvm.org/D69656
* [AArch64] Select saturating Neon instructionsDavid Green2019-10-313-1/+31
| | | | | | | | | | | | | | | | | | | | This adds some extra patterns to select AArch64 Neon SQADD, UQADD, SQSUB and UQSUB from the existing target independent sadd_sat, uadd_sat, ssub_sat and usub_sat nodes. It does not attempt to replace the existing int_aarch64_neon_uqadd intrinsic nodes as they are apparently used for both scalar and vector, and need to be legal on scalar types for some of the patterns to work. The int_aarch64_neon_uqadd on scalar would move the two integers into floating point registers, perform a Neon uqadd and move the value back. I don't believe this is good idea for uadd_sat to do the same as the scalar alternative is simpler (an adds with a csinv). For signed it may be smaller, but I'm not sure about it being better. So this just adds some extra patterns for the existing vector instructions, matching on the _sat nodes. Differential Revision: https://reviews.llvm.org/D69374
* [FIX] Make LSan happy by *not* leaking memoryJohannes Doerfert2019-10-311-4/+13
| | | | | | | I left a memory leak in a printer pass which made LSan sad so I remove the memory leak now to make LSan happy. Reported and tested by vlad.tsyrklevich.
* [InstCombine] simplify fcmp+select canonicalization; NFCISanjay Patel2019-10-311-20/+4
| | | | | We had 2 blocks of code that are nearly identical. Existing regression tests should cover both of the patterns.
* Fix missing memcpy, memmove and memset tail callsSanne Wouda2019-10-311-1/+18
| | | | | | | | | | | | | | | | | | | Summary: If a wrapper around one of the mem* stdlib functions bitcasts the returned pointer value before returning it (e.g. to a wchar_t*), LLVM does not emit a tail call. Add a check for this scenario so that we emit a tail call. Reviewers: wmi, mkuper, ramred01, dmgreen Reviewed By: wmi, dmgreen Subscribers: hiraditya, sanwou01, javed.absar, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59078
* DAG: Add new control for ISD::FMAD formationMatt Arsenault2019-10-313-2/+18
| | | | | | | | | For AMDGPU this depends on whether denormals are enabled in the default FP mode for the function. Currently this is treated as a subtarget feature, so FMAD is selectively legal based on that. I want to move this out of the subtarget features so this can be controlled with a denormal mode attribute. Additionally, this will allow folding based on a future ftz fast math flag.
* AMDGPU: Simplify getAddressSpace callsMatt Arsenault2019-10-314-11/+12
| | | | | These can be directly taken from the GlobalValue instead of going through the type.
* [TII] Use optional destination and source pair as a return value; NFCDjordje Todorovic2019-10-3112-108/+78
| | | | | | | | | | Refactor usage of isCopyInstrImpl, isCopyInstr and isAddImmediate methods to return optional machine operand pair of destination and source registers. Patch by Nikola Prica Differential Revision: https://reviews.llvm.org/D69622
* [InstCombine] Canonicalize uadd.with.overflow to uadd.satDavid Green2019-10-311-0/+32
| | | | | | | | | | | This adds some patterns to transform uadd.with.overflow to uadd.sat (with usub.with.overflow to usub.sat too). The patterns selects from UINTMAX (or 0 for subs) depending on whether the operation overflowed. Signed patterns are a little more involved (they can wrap in two directions), but can be added here in a followup patch too. Differential Revision: https://reviews.llvm.org/D69245
* Revert "[DebugInfo] MachineSink: Insert undef DBG_VALUEs when sinking ↵Jeremy Morse2019-10-311-50/+3
| | | | | | | | | instructions" This reverts commit ee50590e1684c197bc4336984795e48bf53c7a4e. PR43855 reports a performance regression from this commit, which I'll look into.
* Revert "[DebugInfo] MachineSink: find more DBG_VALUEs to sink"Jeremy Morse2019-10-311-86/+15
| | | | | | | This reverts commit f5e1b718a675a4449b71423f04d38e1e93045105. PR43855 reports a performance regression with commit ee50590e. This commit depends on the faulty one, so has to come out too.
* [X86][SSE] Convert computeZeroableShuffleElements to emit KnownUndef and ↵Simon Pilgrim2019-10-311-23/+35
| | | | KnownZero
* [LICM] Invalidate SCEV upon instruction hoistingSerguei Katkov2019-10-311-14/+20
| | | | | | | | | | | Since SCEV can cache information about location of an instruction, it should be invalidated when the instruction is moved. There should be similar bug in code sinking part of LICM, it will be fixed in a follow-up change. Patch Author: Daniil Suchkov Reviewers: asbirlea, mkazantsev, reames Reviewed By: asbirlea Subscribers: hiraditya, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D69370
* minidump: Add an "arm64" constantPavel Labath2019-10-311-0/+1
| | | | | | This is the "official" constant for arm64. We also have another constant for arm64 (called BP_ARM64), which was used by breakpad while there was no official constant for arm64 available.
* [cfi] Add flag to always generate .debug_frameDavid Candler2019-10-3111-23/+24
| | | | | | | | | This adds a flag to LLVM and clang to always generate a .debug_frame section, even if other debug information is not being generated. In situations where .eh_frame would normally be emitted, both .debug_frame and .eh_frame will be used. Differential Revision: https://reviews.llvm.org/D67216
* [yaml2obj/obj2yaml] - Add support for SHT_GNU_HASH section.georgerim2019-10-312-0/+115
| | | | | | | This adds parsing and dumping support for GNU hash sections. They are described nicely here: https://blogs.oracle.com/solaris/gnu-hash-elf-sections-v2 Differential revision: https://reviews.llvm.org/D69399
* Revert "[SLP] Vectorize jumbled stores."Haojian Wu2019-10-311-91/+16
| | | | | | This reverts commit 21d498c9c0f32dcab5bc89ac593aa813b533b43a. This commit causes some crashes on some targets.
* [MustExecute] Silence clang warning about unused captured 'this'Mikael Holmen2019-10-311-2/+2
| | | | | | | | | | | | New code introduced in fe799c97fa caused clang to complain with ../lib/Analysis/MustExecute.cpp:360:34: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture] GetterTy<LoopInfo> LIGetter = [this](const Function &F) { ^~~~ ../lib/Analysis/MustExecute.cpp:365:44: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture] GetterTy<PostDominatorTree> PDTGetter = [this](const Function &F) { ^~~~ 2 errors generated.
* [Attributor][NFCI] Improve the usage of IntegerStatesJohannes Doerfert2019-10-311-12/+8
| | | | | Setting the upper bound directly in the state can be beneficial and simplifies the logic. This also exposed more copy&paste type errors.
* [Attributor] Make liveness "edge-based"Johannes Doerfert2019-10-311-126/+186
| | | | | | | | | | | | | | | | Summary: If control is transferred to a successor is the key question when it comes to liveness. The new implementation puts that question in the focus and thereby providing a clean way to assume certain CFG edges are dead or instructions will not transfer control. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69605
* [Attributor] Liveness for valuesJohannes Doerfert2019-10-311-31/+321
| | | | | | | | | | | | | | | | | | | | | | | Summary: This patch introduces liveness (AAIsDead) for all positions, thus for all kinds of values. For now, we say an instruction is dead if it would be removed assuming all users are dead. A call site return is different as we just look at the users. If all call site returns have been eliminated, the return values can return undef instead of their original value, eliminating uses. We try to recursively delete dead instructions now and we introduce a simple check interface for use-traversal. This is the idea tried out in D68626 but implemented in the right way. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68925
* [Attributor][NFC] Do not delete dead blocks but "clear" themJohannes Doerfert2019-10-311-3/+6
| | | | | | | Deleting blocks will require us to deal with dead edges, e.g., `br i1 false, label %live, label %dead` explicitly. For now we just clear the blocks and move on. This will be revisited once we actually fold branches.
* [MustExecute] Forward iterate over conditional branchesJohannes Doerfert2019-10-311-1/+187
| | | | | | | | | | | | | | | | | | | Summary: If a conditional branch is encountered we can try to find a join block where the execution is known to continue. This means finding a suitable block, e.g., the immediate post dominator of the conditional branch, and proofing control will always reach that block. This patch implements different techniques that work with and without provided analysis. Reviewers: uenoku, sstefan1, hfinkel Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68933
* [Attributor] Add "free"-based heap2stack deductionJohannes Doerfert2019-10-301-18/+43
| | | | | | | | | | | | | | | Summary: If there is a unique free of the allocated that has to be reached from the malloc, we can apply the heap-2-stack transformation even if the pointer escapes. Reviewers: hfinkel, sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68958
* [AArch64][SVE] Add patterns for some integer vector instructionsEhsan Amiri2019-10-303-40/+85
| | | | | | | | | | | | Add pattern matching for SVE vector instructions: -- add, sub, and, or, xor instructions -- sqadd, uqadd, sqsub, uqsub target-independent intrinsics -- bic intrinsics -- predicated add, sub, subr intrinsics Patch Review: https://reviews.llvm.org/D69128 Patch authored by: dancgr (Danilo Carvalho Grael)
* [Attributor][NFC] Eagerly mark attributes as fixed.Johannes Doerfert2019-10-301-4/+14
| | | | | | If an attribute did not query any optimistic (=non-fixed) information to justify its state, we know the attribute state will not change anymore. Thus, we can indicate an optimistic fixpoint.
* [Attributor][NFC] Do not record dependences on fixed attributesJohannes Doerfert2019-10-301-0/+6
| | | | | Since fixed values cannot change, we do not need to wait for it to happen, we will never notify the dependent attribute anyway.
* [Attributor][NFC] Simplify the IRPosition interfaceJohannes Doerfert2019-10-301-4/+6
| | | | | | | We pretended IRPosition came either as mutable or immutable objects while they are basically always immutable, with a single (existing) unfortunate exceptions. This patch cleans up the uses to deal with the immutable version.
* [GISel][CombinerHelper] Combine shuffle_vector scalar to build_vectorQuentin Colombet2019-10-301-2/+2
| | | | | | | | | | Teach the combiner helper how to replace shuffle_vector of scalars into build_vector. I am not particularly happy about having to add this combine, but we currently get those from <1 x iN> from the IR. Bonus: This fixes an assert in the shuffle_vector combines since before this patch, we were expecting vector types.
* [ThinLTO/WPD] Fix index-based WPD for available_externally vtablesTeresa Johnson2019-10-301-8/+26
| | | | | | | | | | | | | | | | | | | | | | Summary: Clang does not add type metadata to available_externally vtables. When choosing a summary to look at for virtual function definitions, make sure we skip summaries for any available externally vtables as they will not describe any virtual function functions, which are only summarized in the presence of type metadata on the vtable def. Simply look for the corresponding strong def's summary. Also add handling for same-named local vtables with the same GUID because of same-named files without enough distinguishing path. In that case we return a conservative result with no devirtualization. Reviewers: pcc, davidxl, evgeny777 Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69452
* [JITLink] Move block ownership from LinkGraph to Section.Lang Hames2019-10-301-4/+0
| | | | This enables easy iteration over blocks in a specific section.
* Revert "[CodeView] Add option to disable inline line tables."Amy Huang2019-10-301-30/+11
| | | | | | because it breaks compiler-rt tests. This reverts commit 6d03890384517919a3ba7fe4c35535425f278f89.
* [CodeView] Add option to disable inline line tables.Amy Huang2019-10-301-11/+30
| | | | | | | | | | | | | | | | | Summary: This adds a clang option to disable inline line tables. When it is used, the inliner uses the call site as the location of the inlined function instead of marking it as an inline location with the function location. See https://bugs.llvm.org/show_bug.cgi?id=42344 Reviewers: rnk Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D67723
* [InstCombine] keep assumption before sinking callstyker2019-10-311-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: in the following C code the branch is not removed by clang in O3. ``` int f1(char* p) { int i1 = __builtin_strlen(p); if (!p) return -1; return i1; } ``` The issue is that the call to strlen is sunk to the following block by instcombine. In its new place the call to strlen doesn't dominate the use in the icmp anymore so value tracking can't see that p cannot be null. This patch resolves the issue by inserting an assumption at the place of the call before sinking a call when that call can be used to prove an argument to be nonnull. This resolves this issue at O3. Reviewers: majnemer, xbolva00, fhahn, jdoerfert, spatel, efriedma Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69477
* Fix -Wsign-compare warning with clang-clReid Kleckner2019-10-301-2/+2
| | | | | | off_t apparently is just "long" on Win64, which is 32-bits, and therefore not long enough to compare with UINT32_MAX. Use auto to follow the surrounding code. uint64_t would also be fine.
* [X86] Model MXCSR for all SSE instructionsCraig Topper2019-10-304-51/+97
| | | | | | | | | | | | | | | This patch adds MXCSR as a reserved physical register and models its use by X86 SSE instructions. It also adds flag "mayRaiseFPException" for the instructions that possibly can raise FP exception according to the architecture definition. Following what SystemZ and other targets does, only the current rounding modes and the IEEE exception masks are modeled. *Changes* of the MXCSR due to exceptions are not modeled. Patch by Pengfei Wang Differential Revision: https://reviews.llvm.org/D68121
* AMDGPU: Disallow spill folding with m0 copiesMatt Arsenault2019-10-302-0/+42
| | | | | | | | | | readlane and writelane instructions are not allowed to use m0 as the data operand, so spilling them is tricky and would require an intermediate SGPR to spill it. Constrain the virtual register class in this caes to disallow the inline spiller from folding the m0 operand directly into the spill instruction. I copied this hack from AArch64 which has the same problem for $sp.
* AMDGPU: Don't fold S_NOPs with implicit operandsMatt Arsenault2019-10-301-1/+3
|
* RegAllocFast: Use RegisterMatt Arsenault2019-10-301-69/+69
|
OpenPOWER on IntegriCloud