summaryrefslogtreecommitdiffstats
path: root/llvm/include
Commit message (Collapse)AuthorAgeFilesLines
...
* [MachineScheduler]Add support for store clusteringJun Bum Lim2016-04-151-3/+5
| | | | | | | | | | | | Perform store clustering just like load clustering. This change add StoreClusterMutation in machine-scheduler. To control StoreClusterMutation, added enableClusterStores() in TargetInstrInfo.h. This is enabled only on AArch64 for now. This change also add support for unscaled stores which were not handled in getMemOpBaseRegImmOfs(). llvm-svn: 266437
* Add a setOperationPromotedToType convenience method that sets an operation ↵Craig Topper2016-04-151-0/+7
| | | | | | to promoted and set the type in one call. Use it so save code in X86. llvm-svn: 266413
* Revert "[LTO] Add a new splitCodeGen() API which takes a TargetMachineFactory."Davide Italiano2016-04-151-15/+0
| | | | | | This reverts commits r266390 and r266396 as they broke some bots. llvm-svn: 266408
* [TTI] Add getInliningThresholdMultiplier.Justin Lebar2016-04-153-0/+16
| | | | | | | | | | | | | | | | Summary: InlineCost's threshold is multiplied by this value. This lets us adjust the inlining threshold up or down on a per-target basis. For example, we might want to increase the threshold on targets where calls are unusually expensive. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18560 llvm-svn: 266405
* [Speculation] Add a SpeculativeExecution mode where the pass does nothing ↵Justin Lebar2016-04-152-0/+5
| | | | | | | | | | | | | | | | unless TTI::hasBranchDivergence() is true. Summary: This lets us add this pass to the IR pass manager unconditionally; it will simply not do anything on targets without branch divergence. Reviewers: tra Subscribers: llvm-commits, jingyue, rnk, chandlerc Differential Revision: http://reviews.llvm.org/D18625 llvm-svn: 266398
* [ParallelCG] Attempt to placate MSVC.Davide Italiano2016-04-151-0/+2
| | | | llvm-svn: 266396
* Option parser: class for consuming a joined arg in addition to all remaining ↵Hans Wennborg2016-04-152-0/+5
| | | | | | args llvm-svn: 266394
* [LTO] Add a new splitCodeGen() API which takes a TargetMachineFactory.Davide Italiano2016-04-151-0/+13
| | | | | | | | | This will be used in lld to avoid creating TargetMachine in two different places. See D18999 for a more detailed discussion. Differential Revision: http://reviews.llvm.org/D19139 llvm-svn: 266390
* [AliasSetTracker] Correctly handle changing the size of an entryMichael Kuperstein2016-04-141-5/+11
| | | | | | | | | | | | | If the size of an AST entry changes, we also need to make sure we perform necessary alias set merges, as the new size may overlap pointers in other sets. We happen to run into this with memset, because memset allows an entry for a i8* pointer to have a decidedly non-i8 size. This fixes PR27262. Differential Revision: http://reviews.llvm.org/D18939 llvm-svn: 266381
* Nuke getGlobalContext() from LLVM (but the C API)Mehdi Amini2016-04-141-4/+0
| | | | | | | | | | The only use for getGlobalContext() is in the C API. Let's just move the static global here and nuke the C++ API. Differential Revision: http://reviews.llvm.org/D19094 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266380
* Remove every uses of getGlobalContext() in LLVM (but the C API)Mehdi Amini2016-04-141-2/+3
| | | | | | | | | | | At the same time, fixes InstructionsTest::CastInst unittest: yes you can leave the IR in an invalid state and exit when you don't destroy the context (like the global one), no longer now. This is the first part of http://reviews.llvm.org/D19094 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266379
* [ScheduleDAGInstrs] Re-factor for based on review feedback. NFC.Geoff Berry2016-04-141-2/+7
| | | | | | | | | | | | | | Summary: Re-factor some code to improve clarity and style based on review comments from http://reviews.llvm.org/D18093. Reviewers: MatzeB, mcrosier Subscribers: MatzeB, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19128 llvm-svn: 266372
* [ARM] Adding IEEE-754 SIMD detection to loop vectorizerRenato Golin2016-04-142-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Some SIMD implementations are not IEEE-754 compliant, for example ARM's NEON. This patch teaches the loop vectorizer to only allow transformations of loops that either contain no floating-point operations or have enough allowance flags supporting lack of precision (ex. -ffast-math, Darwin). For that, the target description now has a method which tells us if the vectorizer is allowed to handle FP math without falling into unsafe representations, plus a check on every FP instruction in the candidate loop to check for the safety flags. This commit makes LLVM behave like GCC with respect to ARM NEON support, but it stops short of fixing the underlying problem: sub-normals. Neither GCC nor LLVM have a flag for allowing sub-normal operations. Before this patch, GCC only allows it using unsafe-math flags and LLVM allows it by default with no way to turn it off (short of not using NEON at all). As a first step, we push this change to make it safe and in sync with GCC. The second step is to discuss a new sub-normal's flag on both communitues and come up with a common solution. The third step is to improve the FastMath flags in LLVM to encode sub-normals and use those flags to restrict NEON FP. Fixes PR16275. llvm-svn: 266363
* Sink DI metadata usage out of MachineInstr.h and MachineInstrBuilder.hReid Kleckner2016-04-142-46/+13
| | | | | | | | | | | MachineInstr.h and MachineInstrBuilder.h are very popular headers, widely included across all LLVM backends. It turns out that there only a handful of TUs that actually care about DI operands on MachineInstrs. After this change, touching DebugInfoMetadata.h and rebuilding llc only needs 112 actions instead of 542. llvm-svn: 266351
* [GlobalISel] Move GISelAccessor class into public headersTom Stellard2016-04-141-0/+33
| | | | | | | | | | Reviewers: qcolombet Subscribers: joker.eph, vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D19120 llvm-svn: 266348
* [DivergenceAnalysis] Treat PHI with incoming undef as constantNicolai Haehnle2016-04-141-0/+5
| | | | | | | | | | | | | | | | | | | | | | | Summary: If a PHI has an incoming undef, we can pretend that it is equal to one non-undef, non-self incoming value. This is particularly relevant in combination with the StructurizeCFG pass, which introduces PHI nodes with undefs. Previously, this lead to branch conditions that were uniform before StructurizeCFG to become non-uniform afterwards, which confused the SIAnnotateControlFlow pass. This fixes a crash when Mesa radeonsi compiles a shader from dEQP-GLES3.functional.shaders.switch.switch_in_for_loop_dynamic_vertex Reviewers: arsenm, tstellarAMD, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19013 llvm-svn: 266347
* [GlobalISel] Coding style and whitespace fixesTom Stellard2016-04-141-3/+3
| | | | | | | | | | Reviewers: qcolombet Subscribers: joker.eph, llvm-commits, vkalintiris Differential Revision: http://reviews.llvm.org/D19119 llvm-svn: 266342
* AMDGPU: allow specifying a workgroup size that needs to fit in a compute unitTom Stellard2016-04-141-0/+7
| | | | | | | | | | | | | | | | | | | Summary: For GL_ARB_compute_shader we need to support workgroup sizes of at least 1024. However, if we want to allow large workgroup sizes, we may need to use less registers, as we have to run more waves per SIMD. This patch adds an attribute to specify the maximum work group size the compiled program needs to support. It defaults, to 256, as that has no wave restrictions. Reducing the number of registers available is done similarly to how the registers were reserved for chips with the sgpr init bug. Reviewers: mareko, arsenm, tstellarAMD, nhaehnle Subscribers: FireBurn, kerberizer, llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18340 Patch By: Bas Nieuwenhuizen llvm-svn: 266337
* [SCEV][LAA] Add tests for SCEV expression transformations performed during LAASilviu Baranga2016-04-141-0/+3
| | | | | | | | | | | | | | | | | | | | Summary: Add a print method to Predicated Scalar Evolution which prints all interesting transformations done by PSE. Loop Access Analysis will now print this as part of the analysis output. We now use this to check the exact expression transformations that were done by PSE in LAA. The additional checking also acts as white-box testing for the getAsAddRec method. Reviewers: anemet, sanjoy Subscribers: sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18792 llvm-svn: 266334
* Revert "Support arbitrary addrspace pointers in masked load/store intrinsics"Adam Nemet2016-04-142-5/+4
| | | | | | | | This reverts commit r266086. It breaks the LTO build of gcc in SPEC2000. llvm-svn: 266282
* [CodeGen] Teach LLVM how to lower @llvm.{min,max}num to {MIN,MAX}NANDavid Majnemer2016-04-143-56/+75
| | | | | | | | | | | | | | | The behavior of {MIN,MAX}NAN differs from that of {MIN,MAX}NUM when only one of the inputs is NaN: -NUM will return the non-NaN argument while -NAN would return NaN. It is desirable to lower to @llvm.{min,max}num to -NAN if they don't have a native instruction for -NUM. Notably, ARMv7 NEON's vmin has the -NAN semantics. N.B. Of course, it is only safe to do this if the intrinsic call is marked nnan. llvm-svn: 266279
* AMDGPU: Implement canonicalizeMatt Arsenault2016-04-142-0/+4
| | | | | | Also add generic DAG node for it. llvm-svn: 266272
* TargetLowering: Factor out common code for tail call eligibility checking; NFCMatthias Braun2016-04-141-0/+10
| | | | llvm-svn: 266270
* Revert "Add LLVMGetAttrKindIDInContext in the C API in order to facilitate ↵Amaury Sechet2016-04-131-3/+0
| | | | | | | | migration away from LLVMAttribute" This reverts commit 0bcfd95c268bcb180a525e1837e84475df8acdc7. llvm-svn: 266259
* Add LLVMGetAttrKindIDInContext in the C API in order to facilitate migration ↵Amaury Sechet2016-04-131-0/+3
| | | | | | | | | | | | | | away from LLVMAttribute Summary: LLVMAttribute has outlived its utility and is becoming a problem for C API users that what to use all the LLVM attributes. In order to help moving away from LLVMAttribute in a smooth manner, this diff introduce LLVMGetAttrKindIDInContext, which can be used instead of the enum values. Reviewers: Wallbraker, whitequark, joker.eph, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18749 llvm-svn: 266257
* [IR] Optimize memory usage of Metadata on MSVCReid Kleckner2016-04-131-1/+2
| | | | | | | | An unsigned 2 bit bitfield takes 4 bytes in MSVC. Instead of a bitfield, just use an unsigned char. We can go back to a bitfield when someone implements the TODO of exposing and reusing the remaining 6 bits. llvm-svn: 266256
* [DebugInfo] Optimize memory layout of DISubprogram.Davide Italiano2016-04-131-8/+21
| | | | | | | | | | | | | | | A DISubprogram on x86_64 was 48 bytes. During an LTO build we end up allocating *a lot* of these (see Duncan's numbers on llvm-dev and/or my numbers in the review link). This change reduces the size to 40 bytes, with a nice effect on peak memory usage when LTO'ing clang. There are more classes in the hierarchy which can be compacted so more patches will come. DISubprogram was the biggest offender in my profiling, anyway. Differential Revision: http://reviews.llvm.org/D18918 llvm-svn: 266241
* Revert "Make aliases explicit in the summary"Mehdi Amini2016-04-132-32/+2
| | | | | | | | | Inadvertently commited... This reverts commit e618ec93786d99df2ddf280ad2d5e02f5516cecf. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266215
* Make aliases explicit in the summaryMehdi Amini2016-04-132-2/+32
| | | | | | | | | | | | | | | | Summary: To be able to work accurately on the reference graph when taking decision about internalizing, promoting, renaming, etc. We need to have the alias information explicit. Reviewers: tejohnson Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18836 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266214
* Revert inadvertently modified comment in r266131Mehdi Amini2016-04-131-1/+1
| | | | | From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266210
* Simplify strlen to a subtraction for certain cases.David L Kreitzer2016-04-131-0/+5
| | | | | | | | Patch by Li Huang (li1.huang@intel.com) Differential Revision: http://reviews.llvm.org/D18230 llvm-svn: 266200
* Calculate __builtin_object_size when pointer depends on a conditionPetar Jovanovic2016-04-131-2/+12
| | | | | | | | | | | | | | | | This patch fixes calculating of builtin_object_size if it depends on a condition. Before this patch compiler did not know how to calculate the object size when it finds a condition that cannot be eliminated. This patch enables calculating of builtin_object_size even in case when condition cannot be eliminated by choosing minimum or maximum value as a result from condition. Choosing minimum or maximum value from condition is based on the second argument of __builtin_object_size function. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D18438 llvm-svn: 266193
* [InstCombine] We folded an fcmp to an i1 instead of a vector of i1David Majnemer2016-04-132-3/+6
| | | | | | | | | Remove an ad-hoc transform in InstCombine and replace it with more general machinery (ValueTracking, InstructionSimplify and VectorUtils). This fixes PR27332. llvm-svn: 266175
* Refactor the InternalizePass into a helper class, and expose it through a ↵Mehdi Amini2016-04-131-0/+27
| | | | | | | | | | public free function (NFC) There is really no reason to require to instanciate a pass manager to internalize. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266167
* Refactor Internalization pass to use as a callback instead of a StringSet (NFC)Mehdi Amini2016-04-131-5/+6
| | | | | | | | | | | | This will save a bunch of copies / initialization of intermediate datastructure, and (hopefully) simplify the code. This also abstract the symbol preservation mechanism outside of the Internalization pass into the client code, which is not forced to keep a map of strings for instance (ThinLTO will prefere hashes). From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266163
* Recommit r265547, and r265610,r265639,r265657 on top of it, plusWei Mi2016-04-131-9/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | two fixes with one about error verify-regalloc reported, and another about live range update of phi after rematerialization. r265547: Replace analyzeSiblingValues with new algorithm to fix its compile time issue. The patch is to solve PR17409 and its duplicates. analyzeSiblingValues is a N x N complexity algorithm where N is the number of siblings generated by reg splitting. Although it causes siginificant compile time issue when N is large, it is also important for performance since it removes redundent spills and enables rematerialization. To solve the compile time issue, the patch removes analyzeSiblingValues and replaces it with lower cost alternatives containing two parts. The first part creates a new spill hoisting method in postOptimization of register allocation. It does spill hoisting at once after all the spills are generated instead of inside every instance of selectOrSplit. The second part queries the define expr of the original register for rematerializaiton and keep it always available during register allocation even if it is already dead. It deletes those dead instructions only in postOptimization. With the two parts in the patch, it can remove analyzeSiblingValues without sacrificing performance. Patches on top of r265547: r265610 "Fix the compare-clang diff error introduced by r265547." r265639 "Fix the sanitizer bootstrap error in r265547." r265657 "InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]" Differential Revision: http://reviews.llvm.org/D15302 Differential Revision: http://reviews.llvm.org/D18934 Differential Revision: http://reviews.llvm.org/D18935 Differential Revision: http://reviews.llvm.org/D18936 llvm-svn: 266162
* Add a pass to name anonymous/nameless functionMehdi Amini2016-04-123-0/+9
| | | | | | | | | | | | | | | | Summary: For correct handling of alias to nameless function, we need to be able to refer them through a GUID in the summary. Here we name them using a hash of the non-private global names in the module. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18883 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266132
* Move summary creation out of llvm-as into optMehdi Amini2016-04-121-1/+1
| | | | | | | | | | | | | | | | | | Summary: Let keep llvm-as "dumb": it converts textual IR to bitcode. This commit removes the dependency from llvm-as to libLLVMAnalysis. We'll add back summary in llvm-as if we get to a textual representation for it at some point. In the meantime, opt seems like a better place for that. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19032 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266131
* AMDGPU: add llvm.amdgcn.buffer.load/store intrinsicsNicolai Haehnle2016-04-121-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: They correspond to BUFFER_LOAD/STORE_DWORD[_X2,X3,X4] and mostly behave like llvm.amdgcn.buffer.load/store.format. They will be used by Mesa for SSBO and atomic counters at least when robust buffer access behavior is desired. (These instructions perform no format conversion and do buffer range checking per component.) As a side effect of sharing patterns with llvm.amdgcn.buffer.store.format, it has become trivial to add support for the f32 and v2f32 variants of that intrinsic, so the patch does so. Also DAG-ify (and fix) some tests that I noticed intermittent failures in while developing this patch. Some tests were (temporarily) adjusted for the required mayLoad/hasSideEffects changes to the BUFFER_STORE_DWORD* instructions. See also http://reviews.llvm.org/D18291. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18292 llvm-svn: 266126
* [ThinLTO] Only compute imports for current module in FunctionImport passTeresa Johnson2016-04-122-0/+14
| | | | | | | | | | | | | | | | | | | | | Summary: The function import pass was computing all the imports for all the modules in the index, and only using the imports for the current module. Change this to instead compute only for the given module. This means that the exports list can't be populated, but they weren't being used anyway. Longer term, the linker can collect all the imports and export lists and serialize them out for consumption by the distributed backend processes which use this pass. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18945 llvm-svn: 266125
* Add __atomic_* lowering to AtomicExpandPass.James Y Knight2016-04-122-1/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (Recommit of r266002, with r266011, r266016, and not accidentally including an extra unused/uninitialized element in LibcallRoutineNames) AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266115
* APInt: Add overload of isMaskMatt Arsenault2016-04-121-0/+7
| | | | | | | This mimics the version in MathExtras.h which isn't testing for a specific mask size. llvm-svn: 266101
* Introduce an GCRelocateInst class [NFC]Philip Reames2016-04-121-9/+32
| | | | | | Previously, we were using isGCRelocate predicates. Using a subclass of IntrinsicInst is far more idiomatic. The refactoring also enables a couple of minor simplifications and code sharing. llvm-svn: 266098
* Support arbitrary addrspace pointers in masked load/store intrinsicsArtur Pilipenko2016-04-122-4/+5
| | | | | | | | | | | | | | This is a resubmittion of 263158 change. This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 llvm-svn: 266086
* AMDGPU: Add atomic_inc + atomic_dec intrinsicsMatt Arsenault2016-04-121-0/+11
| | | | | | | These are different than atomicrmw add 1 because they have an additional input value to clamp the result. llvm-svn: 266074
* This reverts commit r266002, r266011 and r266016.Rafael Espindola2016-04-122-91/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | They broke the msan bot. Original message: Add __atomic_* lowering to AtomicExpandPass. AtomicExpandPass can now lower atomic load, atomic store, atomicrmw,and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266062
* Refactor the Internalize stage of libLTO in a separate file (NFC)Mehdi Amini2016-04-121-4/+0
| | | | | | | | | | | | | | | | | This is intended to be shared by the ThinLTOCodeGenerator. Note that there is a change in the way the verifier is run, previously it was ran as a Pass on the merged module during internalization. While now the verifier is called explicitely on the merged module outside of the internalize "pass pipeline". What remains strange in the API is the fact that `DisableVerify` in the API does not disable this initial verifier. Differential Revision: http://reviews.llvm.org/D19000 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266047
* Use StringSet instead of StringMap where it makes sense to in ↵Mehdi Amini2016-04-121-4/+3
| | | | | | | LTOCodeGenerator (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266046
* TargetLowering: Add missing doxygen group end.Matthias Braun2016-04-121-0/+2
| | | | | | | The missing end was also confusing the '{', '}' matching heuristics in vim. llvm-svn: 266036
* Add the allocsize attribute to LLVM.George Burgess IV2016-04-123-4/+40
| | | | | | | | | | | | | | | | `allocsize` is a function attribute that allows users to request that LLVM treat arbitrary functions as allocation functions. This patch makes LLVM accept the `allocsize` attribute, and makes `@llvm.objectsize` recognize said attribute. The review for this was split into two patches for ease of reviewing: D18974 and D14933. As promised on the revisions, I'm landing both patches as a single commit. Differential Revision: http://reviews.llvm.org/D14933 llvm-svn: 266032
OpenPOWER on IntegriCloud