summaryrefslogtreecommitdiffstats
path: root/llvm/include
Commit message (Collapse)AuthorAgeFilesLines
* [SCEV] Prove no-overflow via constant rangesSanjoy Das2016-03-031-0/+3
| | | | | | | Exploit ScalarEvolution::getRange's newly acquired smartness (since r262438) by using that to infer nsw and nuw when possible. llvm-svn: 262639
* [ConstantRange] Generalize makeGuaranteedNoWrapRegion to work on rangesSanjoy Das2016-03-031-10/+12
| | | | | | | This will be used in a later patch to ScalarEvolution. Right now only the unit tests exercise the newly added code. llvm-svn: 262637
* Infrastructure for PGO enhancements in inlinerEaswaran Raman2016-03-033-7/+65
| | | | | | | | | | | | This patch provides the following infrastructure for PGO enhancements in inliner: Enable the use of block level profile information in inliner Incremental update of block frequency information during inlining Update the function entry counts of callees when they get inlined into callers. Differential Revision: http://reviews.llvm.org/D16381 llvm-svn: 262636
* Use LineLocation instead of CallsiteLocation to index callsite profile.Dehao Chen2016-03-032-30/+21
| | | | | | | | | | | | Summary: With discriminator, LineLocation can uniquely identify a callsite without the need to specifying callee name. Remove Callee function name from the key, and put it in the value (FunctionSamples). Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17827 llvm-svn: 262634
* [LLVM][AVX512] PSRLWI Chnage imm8 to intMichael Zuckerman2016-03-031-3/+3
| | | | | | Differential Revision: http://reviews.llvm.org/D17753 llvm-svn: 262592
* TTI: Fix not using overload of getIntrinsicInstrCostMatt Arsenault2016-03-031-1/+1
| | | | | | | This was always calling the generic version, so the target custom implementation was never called. llvm-svn: 262585
* TargetSchedule: Allow explicit Unsupported markers in InstRWMatthias Braun2016-03-031-0/+2
| | | | llvm-svn: 262549
* [X86] Don't give catch objects a displacement of zeroDavid Majnemer2016-03-031-0/+4
| | | | | | | | | | | | | | | | | | Catch objects with a displacement of zero do not initialize a catch object. The displacement is relative to %rsp at the end of the function's prologue for x86_64 targets. If we place an object at the top-of-stack, we will end up wit a displacement of zero resulting in our catch object remaining uninitialized. Address this by creating our catch objects as fixed objects. We will ensure that the UnwindHelp object is created after the catch objects so that no catch object will have a displacement of zero. Differential Revision: http://reviews.llvm.org/D17823 llvm-svn: 262546
* SelectionDAG: Use correctly sized allocation functions for SDNodesJustin Bogner2016-03-021-0/+6
| | | | | | | | | | | | | | | | The placement new calls here were all calling the allocation function in RecyclingAllocator/Recycler for SDNode, instead of the function for the specific subclass we were constructing. Since this particular allocator always overallocates it more or less worked, but would hide what we're actually doing from any memory tools. Also, if you tried to change this allocator so something like a BumpPtrAllocator or MallocAllocator, the compiler would crash horribly all the time. Part of llvm.org/PR26808. llvm-svn: 262500
* [AA] Hoist the logic to reformulate various AA queries in terms of otherChandler Carruth2016-03-029-175/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | parts of the AA interface out of the base class of every single AA result object. Because this logic reformulates the query in terms of some other aspect of the API, it would easily cause O(n^2) query patterns in alias analysis. These could in turn be magnified further based on the number of call arguments, and then further based on the number of AA queries made for a particular call. This ended up causing problems for Rust that were actually noticable enough to get a bug (PR26564) and probably other places as well. When originally re-working the AA infrastructure, the desire was to regularize the pattern of refinement without losing any generality. While I think it was successful, that is clearly proving to be too costly. And the cost is needless: we gain no actual improvement for this generality of making a direct query to tbaa actually be able to re-use some other alias analysis's refinement logic for one of the other APIs, or some such. In short, this is entirely wasted work. To the extent possible, delegation to other API surfaces should be done at the aggregation layer so that we can avoid re-walking the aggregation. In fact, this significantly simplifies the logic as we no longer need to smuggle the aggregation layer into each alias analysis (or the TargetLibraryInfo into each alias analysis just so we can form argument memory locations!). However, we also have some delegation logic inside of BasicAA and some of it even makes sense. When the delegation logic is baking in specific knowledge of aliasing properties of the LLVM IR, as opposed to simply reformulating the query to utilize a different alias analysis interface entry point, it makes a lot of sense to restrict that logic to a different layer such as BasicAA. So one aspect of the delegation that was in every AA base class is that when we don't have operand bundles, we re-use function AA results as a fallback for callsite alias results. This relies on the IR properties of calls and functions w.r.t. aliasing, and so seems a better fit to BasicAA. I've lifted the logic up to that point where it seems to be a natural fit. This still does a bit of redundant work (we query function attributes twice, once via the callsite and once via the function AA query) but it is *exactly* twice here, no more. The end result is that all of the delegation logic is hoisted out of the base class and into either the aggregation layer when it is a pure retargeting to a different API surface, or into BasicAA when it relies on the IR's aliasing properties. This should fix the quadratic query pattern reported in PR26564, although I don't have a stand-alone test case to reproduce it. It also seems general goodness. Now the numerous AAs that don't need target library info don't carry it around and depend on it. I think I can even rip out the general access to the aggregation layer and only expose that in BasicAA as it is the only place where we re-query in that manner. However, this is a non-trivial change to the AA infrastructure so I want to get some additional eyes on this before it lands. Sadly, it can't wait long because we should really cherry pick this into 3.8 if we're going to go this route. Differential Revision: http://reviews.llvm.org/D17329 llvm-svn: 262490
* [LLVM][AVX512]PSRAWI Change imm8 to int.Michael Zuckerman2016-03-021-9/+9
| | | | | | Differential Revision: http://reviews.llvm.org/D17705 llvm-svn: 262480
* [SCEV] Make getRange smarter around selectsSanjoy Das2016-03-021-0/+7
| | | | | | | | | | | | Have ScalarEvolution::getRange re-consider cases like "{C?A:B,+,C?P:Q}" by factoring out "C" and computing RangeOf{A,+,P} union RangeOf({B,+,Q}) instead. The latter can be easier to compute precisely in cases like "{C?0:N,+,C?1:-1}" N is the backedge taken count of the loop; since in such cases the latter form simplifies to [0,N+1) union [0,N+1). llvm-svn: 262438
* [SCEV] Extract out a getRangeForAffineAR; NFCSanjoy Das2016-03-021-0/+6
| | | | | | Pure code-motion change. Will be used later in making getRange more clever. llvm-svn: 262437
* Perform InstructioinCombiningPass before SampleProfile pass.Dehao Chen2016-03-011-0/+17
| | | | | | | | | | | | Summary: SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17742 llvm-svn: 262419
* [lanai] Add ELF enum value and relocations.Jacques Pienaar2016-03-014-0/+48
| | | | | | | | | | Add ELF enum value and relocations for Lanai backed. General Lanai backend discussion on llvm-dev thread "[RFC] Lanai backend" (http://lists.llvm.org/pipermail/llvm-dev/2016-February/095118.html). Differential Revision: http://reviews.llvm.org/D17008 llvm-svn: 262394
* Fix some warnings a bit harder/differentDavid Blaikie2016-03-011-3/+3
| | | | | | | This is an alternate fix to 262378 and a fix to a pessimizing-move warning. llvm-svn: 262390
* TableGen: Check scheduling models for completenessMatthias Braun2016-03-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | TableGen checks at compiletime that for scheduling models with "CompleteModel = 1" one of the following holds: - Is marked with the hasNoSchedulingInfo flag - The instruction is a subclass of Sched - There are InstRW definitions in the scheduling model Typical steps necessary to complete a model: - Ensure all pseudo instructions that are expanded before machine scheduling (usually everything handled with EmitYYY() functions in XXXTargetLowering). - If a CPU does not support some instructions mark the corresponding resource unsupported: "WriteRes<WriteXXX, []> { let Unsupported = 1; }". - Add missing scheduling information. Differential Revision: http://reviews.llvm.org/D17747 llvm-svn: 262384
* TableGen: Add hasNoSchedulingInfo to instructionsMatthias Braun2016-03-011-1/+8
| | | | | | | | | | | | | This introduces a new flag that indicates that a specific instruction will never be present when the MachineScheduler runs and therefore needs no scheduling information. This is in preparation for an upcoming commit which checks completeness of a scheduling model when tablegen runs. Differential Revision: http://reviews.llvm.org/D17728 llvm-svn: 262383
* Fix -Wnon-virtual-dtor warningsReid Kleckner2016-03-011-2/+3
| | | | llvm-svn: 262378
* Fix an issue where fast math flags were dropped during scalarization.Owen Anderson2016-03-011-0/+9
| | | | | | | Most portions of InstCombine properly propagate fast math flags, but apparently the vector scalarization section was overlooked. llvm-svn: 262376
* [SCEV] Minor cleanup: rename method, C++11'ify; NFCSanjoy Das2016-03-011-3/+2
| | | | llvm-svn: 262374
* [NVPTX] Use different, convergent MIs for convergent calls.Justin Lebar2016-03-011-4/+11
| | | | | | | | | | | | | | | | | | | | | | | Summary: Calls sometimes need to be convergent. This is already handled at the LLVM IR level, but it also needs to be handled at the MI level. Ideally we'd propagate convergence from instructions, down through the selection DAG, and into MIs. But this is Hard, and would affect optimizations in the SDNs -- right now only SDNs with two operands have any flags at all. Instead, here's a much simpler hack: Add new opcodes for NVPTX for convergent calls, and generate these when lowering convergent LLVM calls. Reviewers: jholewinski Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits Differential Revision: http://reviews.llvm.org/D17423 llvm-svn: 262373
* Move ObjectYAML code to a new library.Rafael Espindola2016-03-013-8/+17
| | | | | | | It is only ever used by obj2yaml and yaml2obj. No point in linking it everywhere. llvm-svn: 262368
* Fix breakage caused by r262360.Easwaran Raman2016-03-011-1/+1
| | | | llvm-svn: 262363
* Add the beginnings of an update API for preserving MemorySSADaniel Berlin2016-03-011-0/+16
| | | | | | | | | | | | | | | | | | | Summary: This adds the beginning of an update API to preserve MemorySSA. In particular, this patch adds a way to remove memory SSA accesses when instructions are deleted. It also adds relevant unit testing infrastructure for MemorySSA's API. (There is an actual user of this API, i will make that diff dependent on this one. In practice, a ton of opt passes remove memory instructions, so it's hopefully an obviously useful API :P) Reviewers: hfinkel, reames, george.burgess.iv Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17157 llvm-svn: 262362
* Metadata support for profile summary.Easwaran Raman2016-03-011-17/+64
| | | | | | | | This adds support to convert ProfileSummary object to Metadata and create a ProfileSummary object from metadata. This would allow attaching profile summary information to Module allowing optimization passes to use it. llvm-svn: 262360
* Add isScalarInteger helper to EVT/MVTMatt Arsenault2016-03-012-0/+13
| | | | llvm-svn: 262357
* AMDGPU/SI: Implement DS_PERMUTE/DS_BPERMUTE Instruction Definitions and ↵Changpeng Fang2016-03-011-0/+9
| | | | | | | | | | | | | | | | Intrinsics Summary: This patch impleemnts DS_PERMUTE/DS_BPERMUTE instruction definitions and intrinsics, which are new since VI. Reviewers: tstellarAMD, arsenm Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D17614 llvm-svn: 262356
* [LLVM][AVX512] PSRL{DI|QI} Change imm8 to intMichael Zuckerman2016-03-011-6/+6
| | | | | | Differential Revision: http://reviews.llvm.org/D17713 llvm-svn: 262353
* [X86] Check that attribute parameters match for tail calls (PR26590)Hans Wennborg2016-03-011-1/+31
| | | | | | | | | | | | In the code below on 32-bit targets, x would previously get forwarded to g() without sign-extension to 32 bits as required by the parameter attribute. void g(signed short); void f(unsigned short x) { g(x); } llvm-svn: 262352
* [LTO] Fix error reporting from lto_module_create_in_local_context()Petr Pavlu2016-03-011-9/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Function lto_module_create_in_local_context() would previously rely on the default LLVMContext being created for it by LTOModule::makeLTOModule(). This context exits the program on error and is not arranged to update sLastStringError in tools/lto/lto.cpp. Function lto_module_create_in_local_context() now creates an LLVMContext by itself, sets it up correctly to its needs and then passes it to LTOModule::createInLocalContext() which takes ownership of the context and keeps it present for the lifetime of the returned LTOModule. Function LTOModule::makeLTOModule() is modified to take a reference to LLVMContext (instead of a pointer) and no longer creates a default context when nullptr is passed to it. Method LTOModule::createInContext() that takes a pointer to LLVMContext is removed because it allows to pass a nullptr to it. Instead LTOModule::createFromBuffer() (that takes a reference to LLVMContext) should be used. Differential Revision: http://reviews.llvm.org/D17715 llvm-svn: 262330
* [AVX512][PSRAQ][PSRAD] Change imm8 to int.Michael Zuckerman2016-03-011-6/+6
| | | | | | Differential Revision: http://reviews.llvm.org/D17692 llvm-svn: 262320
* Rename embedded bitcode section in MachOSteven Wu2016-02-292-0/+8
| | | | | | | | | | | | | | Summary: Rename the section embeds bitcode from ".llvmbc,.llvmbc" to "__LLVM,__bitcode". The new name matches MachO section naming convention. Reviewers: rafael, pcc Subscribers: davide, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D17388 llvm-svn: 262245
* Move discriminator assignment to the right place.Dehao Chen2016-02-291-9/+0
| | | | | | | | | | | | Summary: Now discriminator is assigned per-function instead of per-module. Reviewers: davidxl, dnovillo Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D17664 llvm-svn: 262240
* [PM] Wire up optimization levels and default pipeline construction APIsChandler Carruth2016-02-281-1/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | in the PassBuilder. These are really just stubs for now, but they give a nice API surface that Clang or other tools can start learning about and enabling for experimentation. I've also wired up parsing various synthetic module pass names to generate these set pipelines. This allows the pipelines to be combined with other passes and have their order controlled, with clear separation between the *kind* of canned pipeline, and the *level* of optimization to be used within that canned pipeline. The most interesting part of this patch is almost certainly the spec for the different optimization levels. I don't think we can ever have hard and fast rules that would make it easy to determine whether a particular optimization makes sense at a particular level -- it will always be in large part a judgement call. But hopefully this will outline the expected rationale that should be used, and the direction that the pipelines should be taken. Much of this was based on a long llvm-dev discussion I started years ago to try and crystalize the intent behind these pipelines, and now, at long long last I'm returning to the task of actually writing it down somewhere that we can cite and try to be consistent with. Differential Revision: http://reviews.llvm.org/D12826 llvm-svn: 262196
* [PM] Appease mingw32's auto-import DLL build with minimal tweaks, with fix ↵NAKAMURA Takumi2016-02-2814-0/+32
| | | | | | | | for clang. char AnalysisBase::ID should be declared as extern and defined in one module. llvm-svn: 262188
* Revert r262185, "[PM] Appease mingw32's auto-import DLL build with minimal ↵NAKAMURA Takumi2016-02-2814-32/+0
| | | | | | | | tweaks." I'll rework soon. llvm-svn: 262186
* [PM] Appease mingw32's auto-import DLL build with minimal tweaks.NAKAMURA Takumi2016-02-2814-0/+32
| | | | | | char AnalysisBase::ID should be declared as extern and defined in one module. llvm-svn: 262185
* [AVX512][PSLLW ][PSLLV] Change imm8 to intMichael Zuckerman2016-02-281-3/+3
| | | | | | Differential Revision: http://reviews.llvm.org/D17684 llvm-svn: 262176
* CodeGen: Update LiveIntervalAnalysis API to use MachineInstr&, NFCDuncan P. N. Exon Smith2016-02-271-3/+3
| | | | | | These parameters aren't expected to be null, so take them by reference. llvm-svn: 262151
* CodeGen: Change MachineInstr to use MachineInstr&, NFCDuncan P. N. Exon Smith2016-02-272-4/+4
| | | | | | | | Change MachineInstr API to prefer MachineInstr& over MachineInstr* whenever the parameter is expected to be non-null. Slowly inching toward being able to fix PR26753. llvm-svn: 262149
* CodeGen: Update DFAPacketizer API to take MachineInstr&, NFCDuncan P. N. Exon Smith2016-02-271-14/+10
| | | | | | | | | In all but one case, change the DFAPacketizer API to take MachineInstr& instead of MachineInstr*. In DFAPacketizer::endPacket(), take MachineBasicBlock::iterator. Besides cleaning up the API, this is in search of PR26753. llvm-svn: 262142
* WIP: CodeGen: Use MachineInstr& in MachineInstrBundle.h, NFCDuncan P. N. Exon Smith2016-02-274-29/+25
| | | | | | | | Update APIs in MachineInstrBundle.h to take and return MachineInstr& instead of MachineInstr* when the instruction cannot be null. Besides being a nice cleanup, this is tacking toward a fix for PR26753. llvm-svn: 262141
* [PM] Provide explicit instantiation declarations and definitions for theChandler Carruth2016-02-273-0/+8
| | | | | | PassManager and AnalysisManager template specializations as well. llvm-svn: 262128
* [PM] Provide two templates for the two directionalities of analysisChandler Carruth2016-02-273-490/+128
| | | | | | | | | | | | | | | | | | | | | | | | manager proxies and use those rather than repeating their definition four times. There are real differences between the two directions: outer AMs are const and don't need to have invalidation tracked. But every proxy in a particular direction is identical except for the analysis manager type and the IR unit they proxy into. This makes them prime candidates for nice templates. I've started introducing explicit template instantiation declarations and definitions as well because we really shouldn't be emitting all this everywhere. I'm going to go back and add the same for the other templates like this in a follow-up patch. I've left the analysis manager as an opaque type rather than using two IR units and requiring it to be an AnalysisManager template specialization. I think its important that users retain the ability to provide their own custom analysis management layer and provided it has the appropriate API everything should Just Work. llvm-svn: 262127
* AMDGPU: Add s_sleep intrinsicMatt Arsenault2016-02-271-0/+5
| | | | llvm-svn: 262120
* AMDGPU: Implement readcyclecounterMatt Arsenault2016-02-271-0/+7
| | | | | | | | | | This matches the behavior of the HSAIL clock instruction. s_realmemtime is used if the subtarget supports it, and falls back to s_memtime if not. Also introduces new intrinsics for each of s_memtime / s_memrealtime. llvm-svn: 262119
* CodeGen: Avoid implicit conversion in MachineInstrBuilder, NFCDuncan P. N. Exon Smith2016-02-271-2/+2
| | | | | | | | Avoid another implicit conversion from MachineInstrBundleIterator to MachineInstr*, this time in MachineInstrBuilder.h (this is in pursuit of PR26753). llvm-svn: 262118
* CodeGen: Remove implicit iterator to pointer conversions, NFCDuncan P. N. Exon Smith2016-02-271-2/+2
| | | | | | | Remove a couple of implicit conversions from MachineInstrBundleIterator to MachineInstr*. llvm-svn: 262116
* CodeGen: Take MachineInstr& in SlotIndexes and LiveIntervals, NFCDuncan P. N. Exon Smith2016-02-273-37/+45
| | | | | | | | | | | | | | Take MachineInstr by reference instead of by pointer in SlotIndexes and the SlotIndex wrappers in LiveIntervals. The MachineInstrs here are never null, so this cleans up the API a bit. It also incidentally removes a few implicit conversions from MachineInstrBundleIterator to MachineInstr* (see PR26753). At a couple of call sites it was convenient to convert to a range-based for loop over MachineBasicBlock::instr_begin/instr_end, so I added MachineBasicBlock::instrs. llvm-svn: 262115
OpenPOWER on IntegriCloud