path: root/llvm
Commit message | Author | Age | Files | Lines
* Cast variable to void to resolve unused variable warning in non-asserts builds. | Richard Trieu | 2015-12-15 | 1 | -0/+1
  llvm-svn: 255704
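The idiom named in the subject line can be sketched as follows. This is a minimal illustration, not the code from the commit; the function `computeChecked` and the variable `IsEven` are hypothetical names chosen for the example.

```cpp
#include <cassert>

// Hypothetical helper, not taken from the commit.
int computeChecked(int Input) {
  int Doubled = Input * 2;
  bool IsEven = (Doubled % 2 == 0);
  // When NDEBUG is defined, assert() expands to nothing, so IsEven would
  // otherwise be reported as unused in non-asserts builds.
  assert(IsEven && "doubling must give an even number");
  (void)IsEven; // counts as a use in every build mode
  return Doubled;
}
```

The cast has no runtime effect; it only silences the warning when the assert is compiled out.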
* Fix "Not having LAHF/SAHF" assert. | Hans Wennborg | 2015-12-15 | 1 | -1/+2
  It wants to assert that the subtarget is 64-bit, not the register.
  llvm-svn: 255703
* AMDGPU/SI: Set the code object work group segment size when targeting HSA | Tom Stellard | 2015-12-15 | 2 | -0/+6
  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15493
  llvm-svn: 255702
* [llvm-objdump/MachODump] Shrink code a little bit. NFC. | Davide Italiano | 2015-12-15 | 1 | -4/+1
  llvm-svn: 255701
* [x86] inline calls to fmaxf / llvm.maxnum.f32 using maxss (PR24475) | Sanjay Patel | 2015-12-15 | 2 | -98/+228
  This patch improves on the suggested codegen from PR24475:
  https://llvm.org/bugs/show_bug.cgi?id=24475
  but only for the fmaxf() case to start, so we can sort out any bugs before extending to fmin, f64, and vectors.
  The fmax / maxnum definitions provide us flexibility for signed zeros, so the only thing we have to worry about in this replacement sequence is NaN handling.
  Note 1: It may be better to implement this as lowerFMAXNUM(), but that exposes a problem: SelectionDAGBuilder::visitSelect() transforms compare/select instructions into FMAXNUM nodes if we declare FMAXNUM legal or custom. Perhaps that should be checking for NaN inputs or global unsafe-math before transforming? As it stands, that bypasses a big set of optimizations that the x86 backend already has in PerformSELECTCombine().
  Note 2: The v2f32 test reveals another bug; the vector is extended to v4f32, so we have completely unnecessary operations happening on undef elements of the vector.
  Differential Revision: http://reviews.llvm.org/D15294
  llvm-svn: 255700
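The NaN concern mentioned above can be illustrated with a small scalar model. This is a sketch under stated assumptions, not the instruction sequence the patch emits: `maxssModel` is a hypothetical stand-in for the x86 MAXSS rule (when the comparison is false, including when either input is NaN, the second operand is returned), and `fmaxfViaMaxss` shows one way to recover fmaxf() behavior on top of it.

```cpp
#include <cmath>

// Hypothetical model of MAXSS: the second operand wins whenever the
// comparison is false, which includes every case involving a NaN.
float maxssModel(float First, float Second) {
  return First > Second ? First : Second;
}

// fmaxf() must return the non-NaN operand when exactly one input is NaN.
float fmaxfViaMaxss(float A, float B) {
  float M = maxssModel(B, A);   // yields A whenever B is NaN
  return std::isnan(A) ? B : M; // if A is NaN, fall back to B
}
```

Signed zeros are deliberately not handled specially here, in line with the flexibility the commit message describes.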
* [Sparc] Tweak r255668: Use llvm_unreachable. | James Y Knight | 2015-12-15 | 1 | -1/+1
  llvm-svn: 255698
* Cross-DSO control flow integrity (LLVM part). | Evgeniy Stepanov | 2015-12-15 | 7 | -0/+263
  An LTO pass that generates a __cfi_check() function that validates a call based on a hash of the call-site-known type and the target pointer.
  llvm-svn: 255693
* AMDGPU/SI: Set the code object's private segment size when targeting HSA. | Tom Stellard | 2015-12-15 | 3 | -1/+9
  Summary: I'm not sure how things worked before without this.
  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15492
  llvm-svn: 255692
* [LoopVectorizer] Refine loop vectorizer's register usage calculator by ignoring specific instructions. | Cong Hou | 2015-12-15 | 3 | -32/+178
  (This is the third attempt to check in this patch; the first two are r255454 and r255460. The previously failing test file reg-usage.ll is now moved to the test/Transform/LoopVectorize/X86 directory with target datalayout and target triple indicated.)
  LoopVectorizationCostModel::calculateRegisterUsage() is used to estimate the register usage for specific VFs. However, it takes into account many instructions that won't be vectorized, such as induction variables, GetElementPtr instructions, etc. This makes the loop vectorizer too conservative when choosing VF. In this patch, the induction variables that won't be vectorized plus GetElementPtr instructions will be added to the ValuesToIgnore set so that their register usage won't be considered any more.
  Differential revision: http://reviews.llvm.org/D15177
  llvm-svn: 255691
* AMDGPU/SI: Emit constant variables in the .hsatext section when targeting HSA | Tom Stellard | 2015-12-15 | 4 | -21/+15
  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15426
  llvm-svn: 255689
* Wrap include of <future> in some warning suppression pragmas | Reid Kleckner | 2015-12-15 | 1 | -1/+11
  Eventually we may need to sink this include to the .cpp file or something to support LLVM_ENABLE_THREADS=OFF, but this solves my immediate problem of fixing the build.
  llvm-svn: 255682
* [WebAssembly] Implement instruction selection for constant offsets in addresses. | Dan Gohman | 2015-12-15 | 7 | -28/+574
  Add instruction patterns for matching load and store instructions with constant offsets in addresses. The code is fairly redundant due to the need to replicate everything between imm, tglobaladdr, and texternalsym, but this appears to be common tablegen practice. The main alternative appears to be to introduce matching functions with C++ code, but sticking with purely generated matchers seems better for now.
  Also note that this doesn't yet support offsets from getelementptr, which will be the most common case; that will depend on a change in target-independent code in order to set the NoUnsignedWrap flag, which I'll submit separately. Until then, the testcase uses ptrtoint+add+inttoptr with a nuw on the add.
  Also implement isLegalAddressingMode with an approximation of this.
  Differential Revision: http://reviews.llvm.org/D15538
  llvm-svn: 255681
* Initialize all bytes in vp data (msan error) | Xinliang David Li | 2015-12-15 | 1 | -4/+5
  llvm-svn: 255680
* Add support for the .debug_macro section of the forthcoming DWARF 5 spec. | Eric Christopher | 2015-12-15 | 1 | -0/+20
  Patch by B. Sivachandra Reddy!
  llvm-svn: 255679
* Fix clang-cl self-host with MSVC 2013 STL std::bind implementation | Reid Kleckner | 2015-12-15 | 1 | -1/+6
  llvm-svn: 255678
* [WinEH] Remove unused intrinsic llvm.x86.seh.restoreframe | Reid Kleckner | 2015-12-15 | 4 | -74/+5
  We can clean this up now that we have the X86 CATCHRET instruction to restore the FP, SP, and BP.
  llvm-svn: 255677
* [WinEH] Use operand bundles to describe call sites | David Majnemer | 2015-12-15 | 36 | -250/+303
  SimplifyCFG allows tail merging with code which terminates in unreachable which, in turn, makes it possible for an invoke to end up in a funclet which it was not originally part of.
  Using operand bundles on invokes allows us to determine whether or not an invoke was part of a funclet in the source program.
  Furthermore, it allows us to unambiguously answer questions about the legality of inlining into call sites which the personality may have trouble with.
  Differential Revision: http://reviews.llvm.org/D15517
  llvm-svn: 255674
* Test cleanup -- remove duplicate run lines | Xinliang David Li | 2015-12-15 | 1 | -4/+0
  llvm-svn: 255673
* AMDGPU/SI: Select constant loads with non-uniform addresses to MUBUF instructions | Tom Stellard | 2015-12-15 | 9 | -43/+178
  Summary: We were previously selecting all constant loads to SMRD instructions and legalizing the SMRDs with non-uniform addresses during the SIFixSGPRCopies pass.
  This new solution is simpler and also generates much better code, because the instruction selector is able to take advantage of all the MUBUF addressing modes that our legalization pass wasn't able to.
  We also no longer need to generate v_add_* instructions when we have a uniform pointer and a non-uniform offset, as this is now folded into the MUBUF instruction during instruction selection.
  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15425
  llvm-svn: 255672
* LLVM tutorial: fix broken links/anchors | Alex Denisov | 2015-12-15 | 17 | -52/+52
  llvm-svn: 255671
* Coverage code refactoring /NFC | Xinliang David Li | 2015-12-15 | 3 | -13/+18
  llvm-svn: 255670
* LPM: Stop threading `Pass *` through all of the loop utility APIs. NFC | Justin Bogner | 2015-12-15 | 18 | -175/+164
  A large number of loop utility functions take a `Pass *` and reach into it to find out which analyses to preserve. There are a number of problems with this:
  - The APIs have access to pretty well any Pass state they want, so it's hard to tell what they may or may not do.
  - Other APIs have copied these and pass around a `Pass *` even though they don't even use it. Some of these just hand a nullptr to the API since the callers don't even have a pass available.
  - Passes in the new pass manager don't work like the current ones, so the APIs can't be used as is there.
  Instead, we should explicitly thread the analysis results that we actually care about through these APIs. This is both simpler and more reusable.
  llvm-svn: 255669
* [Sparc] Fix handling of double incoming arguments on sparc little-endian. | James Y Knight | 2015-12-15 | 2 | -9/+204
  On SparcV8, doubles get passed in two 32-bit integer registers. The call code was already handling endianness correctly, but the incoming argument code was not -- it got the two halves in opposite order.
  Also remove some dead code in LowerFormalArguments_32 to handle less-than-32bit values, which can't actually happen.
  Finally, add some test cases for the 32-bit calling convention, cribbed from the 64abi.ll test, and run for both big and little-endian.
  llvm-svn: 255668
* [Docs] Fix Unexpected indentation errors. | Akira Hatanaka | 2015-12-15 | 1 | -0/+2
  llvm-svn: 255665
* [X86] MOVPC32r should only emit CFI adjustments when needed | Michael Kuperstein | 2015-12-15 | 2 | -4/+35
  We only want to emit CFI adjustments when actually using DWARF. This fixes PR25828.
  Differential Revision: http://reviews.llvm.org/D15522
  llvm-svn: 255664
* AMDGPU/SI: Implement AMDGPUTargetTransformInfo::isSourceOfDivergence() | Tom Stellard | 2015-12-15 | 2 | -0/+78
  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15476
  llvm-svn: 255661
* [SimplifyCFG] allow speculation of exactly one expensive instruction (PR24818) | Sanjay Patel | 2015-12-15 | 3 | -47/+42
  This is the last general step to allow more IR-level speculation with a safety harness in place in CodeGenPrepare.
  The intent is to restore the behavior enabled by:
  http://reviews.llvm.org/rL228826
  but prevent bad performance such as:
  https://llvm.org/bugs/show_bug.cgi?id=24818
  Earlier patches in this sequence:
  D12882 (disable SimplifyCFG speculation for expensive instructions)
  D13297 (have CGP despeculate expensive ops)
  D14630 (have CGP despeculate special versions of cttz/ctlz)
  As shown in the test cases, we only have two instructions currently affected: ctz for some x86 and fdiv generally.
  Allowing exactly one expensive instruction is a bit of a hack, but it lines up with what is currently implemented in CGP. If we make the despeculation more general in CGP, we can make the speculation here more liberal.
  A follow-up patch will adjust the cost for sqrt and possibly other typically expensive math intrinsics (currently everything is cheap by default). GPU targets would likely want to override those expensive default costs (just as they probably should already override the cost of div/rem) because just about any math is cheaper than control-flow on those targets.
  Differential Revision: http://reviews.llvm.org/D15213
  llvm-svn: 255660
* [llvm-profdata] Add support for weighted merge of profile data (2nd try) | Nathan Slingerland | 2015-12-15 | 14 | -55/+369
  Summary: This change adds support for specifying a weight when merging profile data with the llvm-profdata tool. Weights are specified by using the --weighted-input=<weight>,<filename> option. Input files not specified with this option (normal positional list after options) are given a default weight of 1.
  Adding support for arbitrary weighting of input profile data allows for relative importance to be placed on the input data from multiple training runs.
  Both sampled and instrumented profiles are supported.
  Reviewers: davidxl, dnovillo, bogner, silvas
  Subscribers: silvas, davidxl, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15306
  llvm-svn: 255659
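The idea of a weighted merge can be sketched as scaling each input's counters by its weight before summing. This is only an illustration: `WeightedProfile` and `mergeWeighted` are hypothetical names, and the sketch assumes plain multiplication and addition, which may differ from what llvm-profdata actually does (for example around overflow handling).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one input file: a weight plus its raw counters.
struct WeightedProfile {
  uint64_t Weight;
  std::vector<uint64_t> Counts;
};

// Scale every counter by its input's weight, then sum per counter slot.
std::vector<uint64_t> mergeWeighted(const std::vector<WeightedProfile> &Inputs) {
  std::vector<uint64_t> Merged;
  for (const WeightedProfile &P : Inputs) {
    if (Merged.empty())
      Merged.resize(P.Counts.size(), 0);
    assert(Merged.size() == P.Counts.size() &&
           "all inputs must describe the same counters");
    for (std::size_t I = 0; I < P.Counts.size(); ++I)
      Merged[I] += P.Weight * P.Counts[I];
  }
  return Merged;
}
```

An input with the default weight of 1 contributes its counters unchanged, so an unweighted merge is the special case where every weight is 1.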
* AMDGPU: mark ldexp LibCalls as unavailable | Nicolai Hahnle | 2015-12-15 | 2 | -10/+16
  Summary: The LibCallSimplifier will turn llvm.exp2.* intrinsics into ldexp* libcalls which do not make sense with the AMDGPU backend.
  In the long run, we'll want an llvm.ldexp.* intrinsic to properly make use of this optimization, but this works around the problem for now.
  See also: http://reviews.llvm.org/D14327 (suggested llvm.ldexp.* implementation)
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92709
  Reviewers: arsenm, tstellarAMD
  Differential Revision: http://reviews.llvm.org/D14990
  llvm-svn: 255658
* AMDGPU/SI: Fix bitcast between v2f32 and f64 | Tom Stellard | 2015-12-15 | 1 | -0/+4
  The radeonsi fp64 support can hit these now that some redundant bitcasts are folded.
  Patch by: Michel Dänzer
  Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
  llvm-svn: 255657
* [X86] Smaller code for materializing 32-bit 1 and -1 constants | Hans Wennborg | 2015-12-15 | 3 | -5/+159
  "movl $-1, %eax" is 5 bytes, "xorl %eax, %eax; decl %eax" is 3 bytes. This commit makes LLVM use the latter when optimizing for size.
  Differential Revision: http://reviews.llvm.org/D14971
  llvm-svn: 255656
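The byte counts quoted above can be checked against the instruction encodings. The sketch below assumes 32-bit mode, where `decl %eax` has the one-byte encoding 0x48; the array names are hypothetical and the bytes are written out by hand rather than taken from the commit.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed 32-bit-mode encodings, listed by hand for illustration.
// movl $-1, %eax  ->  B8 FF FF FF FF
constexpr std::array<uint8_t, 5> MovMinusOne = {{0xB8, 0xFF, 0xFF, 0xFF, 0xFF}};
// xorl %eax, %eax ->  31 C0 ; decl %eax -> 48
constexpr std::array<uint8_t, 3> XorThenDec = {{0x31, 0xC0, 0x48}};

// Number of bytes saved by preferring the xor/dec pair.
constexpr std::size_t bytesSaved() {
  return MovMinusOne.size() - XorThenDec.size();
}

static_assert(bytesSaved() == 2, "the xor/dec pair is two bytes shorter");
```

The saving comes from dropping the 4-byte immediate, at the cost of executing two instructions instead of one, which is why the commit limits the change to size-optimized code.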
* WebAssembly: update expected torture test failures | JF Bastien | 2015-12-15 | 1 | -33/+0
  We now have 252 expected failures.
  llvm-svn: 255654
* [Hexagon] Preprocess mapped instructions before lowering to MC | Krzysztof Parzyszek | 2015-12-15 | 7 | -9/+397
  llvm-svn: 255653
* AMDGPU/SI: Add llvm.amdgcn.mbcnt.* intrinsics | Tom Stellard | 2015-12-15 | 3 | -2/+34
  Summary: These are meant to be used instead of the llvm.SI.tid intrinsic which will be deprecated at some point.
  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15475
  llvm-svn: 255652
* AMDGPU/SI: Add llvm.amdgcn.v.interp.p[12] intrinsics | Tom Stellard | 2015-12-15 | 3 | -0/+58
  Summary: These are meant to be used instead of the llvm.SI.fs.interp intrinsic which will be deprecated at some point.
  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15474
  llvm-svn: 255651
* AMDGPU/SI: Add getShaderType() function to Utils/ | Tom Stellard | 2015-12-15 | 5 | -17/+26
  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15424
  llvm-svn: 255650
* Bitcasts between FP and INT values using direct moves | Nemanja Ivanovic | 2015-12-15 | 5 | -6/+160
  This patch corresponds to review: http://reviews.llvm.org/D15286
  This patch was meant to land in revision 255246, but I accidentally uploaded the patch that corresponds to http://reviews.llvm.org/D15372 in that revision. Therefore, this patch is the actual Bitcasts using direct moves patch, whereas http://reviews.llvm.org/rL255246 actually corresponds to http://reviews.llvm.org/D15372.
  llvm-svn: 255649
* [x86] adding PKU feature flag | Asaf Badouh | 2015-12-15 | 5 | -0/+11
  The feature flag is essential for the RDPKRU and WRPKRU instructions. More about the instructions can be found in the SDM rev 56, vol 2, from http://www.intel.com/sdm
  Differential Revision: http://reviews.llvm.org/D15491
  llvm-svn: 255644
* Do not try to use i8 and i16 versions of FP_TO_U/SINT soft float library calls | Michael Kuperstein | 2015-12-15 | 5 | -47/+63
  It appears that neither compiler-rt nor the gnu soft-float libraries actually implement these conversions. Instead of emitting calls to library functions that don't exist, handle it similarly to the way we handle i8 -> float and i16 -> float conversions: call the i32 library function, and adjust the type.
  Differential Revision: http://reviews.llvm.org/D15151
  llvm-svn: 255643
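The "call the i32 library function, and adjust the type" approach can be pictured in source-level terms as below. This is a rough analogy rather than the legalizer code: `softFloatToU32` is a hypothetical stand-in for the 32-bit soft-float routine, implemented here with an ordinary cast so the sketch is self-contained.

```cpp
#include <cstdint>

// Hypothetical stand-in for the 32-bit float-to-unsigned library routine.
uint32_t softFloatToU32(float Value) { return static_cast<uint32_t>(Value); }

// No dedicated 8-bit routine is assumed to exist: convert at 32 bits and
// then narrow the result.
uint8_t floatToU8(float Value) {
  return static_cast<uint8_t>(softFloatToU32(Value));
}

// Same pattern for the 16-bit case.
uint16_t floatToU16(float Value) {
  return static_cast<uint16_t>(softFloatToU32(Value));
}
```

For inputs that fit the narrow type, the narrowing step leaves the value unchanged, so only the wider routine ever needs to exist in the runtime library.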
* Define a feature for __float128 support in the PPC back end | Nemanja Ivanovic | 2015-12-15 | 3 | -0/+7
  This patch corresponds to review: http://reviews.llvm.org/D15117
  In preparation for supporting IEEE Quad precision floating point, this patch simply defines a feature to specify the target supports this. For now, nothing is done with the target feature, we just don't want warnings from the Clang FE when a user specifies -mfloat128. Calling convention and other related work will add to this patch in the near future.
  llvm-svn: 255642
* Improve the successor list update in TailDuplication.cpp. | Cong Hou | 2015-12-15 | 2 | -7/+7
  This patch improves a temporary fix in r255530 so that we can normalize the successor list without triggering assertion failures in the tail duplication pass.
  llvm-svn: 255638
* InstCombineLoadStoreAlloca.cpp: Avoid instantiating Twine. | NAKAMURA Takumi | 2015-12-15 | 1 | -4/+9
  llvm-svn: 255637
* [PassManagerBuilder] Add a few more scalar optimization passes | James Molloy | 2015-12-15 | 1 | -0/+13
  This patch does two things:
  1. mem2reg is now run immediately after globalopt. Now that globalopt can localize variables more aggressively, it makes sense to lower them to SSA form earlier rather than later so they can benefit from the full set of optimization passes.
  2. More scalar optimizations are run after the loop optimizations in LTO mode. The loop optimizations (especially indvars) can clean up scalar code sufficiently to make it worthwhile running more scalar passes. I've particularly added SCCP here as it isn't run anywhere else in the LTO pass pipeline.
  Mem2reg is super cheap and shouldn't affect compilation time at all. The rest of the added passes are in the LTO pipeline only, so they don't affect the vast majority of compilations, just the link step.
  llvm-svn: 255634
* Mark ThreadPool unittests as unsupported on PowerPC64 | Mehdi Amini | 2015-12-15 | 1 | -0/+4
  Bots are crashing unexpectedly, see: https://llvm.org/bugs/show_bug.cgi?id=25829
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 255633
* ThreadPool unittest: add a rough mechanism to mark UNSUPPORTED on a given platform | Mehdi Amini | 2015-12-15 | 1 | -5/+53
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 255632
* Type legalizer for masked gather and scatter intrinsics. | Elena Demikhovsky | 2015-12-15 | 9 | -221/+2085
  Full type legalizer that works with vectors of all lengths, from 2 to 16 (i32, i64, float, double).
  This intrinsic, for example
  void @llvm.masked.scatter.v2f32(<2 x float>%data , <2 x float*>%ptrs , i32 align , <2 x i1>%mask )
  requires type widening for data and type promotion for mask.
  Differential Revision: http://reviews.llvm.org/D13633
  llvm-svn: 255629
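The scalar meaning of a masked scatter, and why widening the data vector is harmless, can be sketched as follows. `maskedScatter` and `scatterWidenedToFour` are hypothetical helpers, and the widening shown (padding to four lanes with the extra mask bits cleared) is an assumption about how such lanes stay inert, not a description of the legalizer's implementation.

```cpp
#include <array>
#include <cstddef>

// Per-lane semantics: store Data[Lane] through Ptrs[Lane] only where the
// corresponding mask bit is set.
template <typename T, std::size_t N>
void maskedScatter(const std::array<T, N> &Data,
                   const std::array<T *, N> &Ptrs,
                   const std::array<bool, N> &Mask) {
  for (std::size_t Lane = 0; Lane < N; ++Lane)
    if (Mask[Lane])
      *Ptrs[Lane] = Data[Lane];
}

// Widen a 2-lane scatter to 4 lanes. The added lanes are masked off, so
// their placeholder data and null pointers are never touched.
template <typename T>
void scatterWidenedToFour(const std::array<T, 2> &Data,
                          const std::array<T *, 2> &Ptrs,
                          const std::array<bool, 2> &Mask) {
  std::array<T, 4> WideData = {{Data[0], Data[1], T(), T()}};
  std::array<T *, 4> WidePtrs = {{Ptrs[0], Ptrs[1], nullptr, nullptr}};
  std::array<bool, 4> WideMask = {{Mask[0], Mask[1], false, false}};
  maskedScatter(WideData, WidePtrs, WideMask);
}
```

Both forms write exactly the same memory locations, which is the property a widening step has to preserve.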
* [IR] Add classof for GetElementPtrConstantExpr, CompareConstantExpr, InsertValueConstantExpr, and ExtractValueConstantExpr. | Craig Topper | 2015-12-15 | 2 | -2/+30
  All but CompareConstantExpr were being used in casts that were erroneously using ConstantExpr::classof due to inheritance.
  While there, use cast<CompareConstantExpr> to simplify code slightly.
  I believe in one place we were always casting to ExtractValueConstantExpr when we were trying to choose between ExtractValueConstantExpr and InsertValueConstantExpr because of this. But since they have identical layouts this didn't cause any observable problems.
  llvm-svn: 255624
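The pitfall described above, a subclass silently inheriting its base's classof, can be reproduced with a toy hierarchy. All names here (`Expr`, `InsertExpr`, `ExtractExprNoClassof`, `isaExpr`) are hypothetical and only mimic the shape of LLVM-style RTTI; they are not the classes touched by the commit.

```cpp
#include <cassert>

// Toy base class with a kind tag, in the style of LLVM's hand-rolled RTTI.
struct Expr {
  enum Kind { K_Insert, K_Extract };
  explicit Expr(Kind K) : TheKind(K) {}
  Kind getKind() const { return TheKind; }
  // The base accepts every Expr.
  static bool classof(const Expr *) { return true; }

private:
  Kind TheKind;
};

// Defines its own classof, so type checks are precise.
struct InsertExpr : Expr {
  InsertExpr() : Expr(K_Insert) {}
  static bool classof(const Expr *E) { return E->getKind() == K_Insert; }
};

// Defines no classof, so name lookup finds Expr::classof and every Expr
// appears to be an ExtractExprNoClassof.
struct ExtractExprNoClassof : Expr {
  ExtractExprNoClassof() : Expr(K_Extract) {}
};

// Minimal isa<>-like query built on classof.
template <typename To> bool isaExpr(const Expr *E) { return To::classof(E); }
```

With this setup, `isaExpr<ExtractExprNoClassof>` returns true even for an `InsertExpr`, so a checked cast built on it would never fire; giving the subclass its own classof restores the check.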
* Use CmpInst::Predicate instead of 'unsigned short' in some places. NFC | Craig Topper | 2015-12-15 | 5 | -43/+46
  llvm-svn: 255623
* Fix MSVC build with LLVM_ENABLE_THREADS=OFF | Mehdi Amini | 2015-12-15 | 1 | -1/+10
  Follow-up to the ThreadPool implementation.
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 255621
* LoopUtils: Remove defaults for arguments that are always specified. NFC | Justin Bogner | 2015-12-15 | 1 | -3/+3
  llvm-svn: 255620