path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
* [IR] Add a dedicated FNeg IR Instruction | Cameron McInally | 2018-11-13 | 4 | -0/+20
  The IEEE-754 Standard makes it clear that fneg(x) and fsub(-0.0, x) are two different operations. The former is a bitwise operation, while the latter is an arithmetic operation. This patch creates a dedicated FNeg IR Instruction to model that behavior.
  Differential Revision: https://reviews.llvm.org/D53877
  llvm-svn: 346774
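The bitwise/arithmetic distinction in this commit message can be illustrated outside of LLVM. The following is a minimal sketch, assuming `double` is an IEEE-754 binary64 value; the helper names `fnegBitwise`, `fnegViaFSub`, and `rawSignBit` are made up for illustration and are not LLVM APIs.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");

// Bitwise negation: flip only the sign bit and never interpret the value.
// This models the "bitwise operation" reading of fneg(x).
inline double fnegBitwise(double X) {
  std::uint64_t Bits;
  std::memcpy(&Bits, &X, sizeof(Bits));
  Bits ^= (std::uint64_t{1} << 63);
  std::memcpy(&X, &Bits, sizeof(Bits));
  return X;
}

// Arithmetic negation: a real floating-point subtraction, which models
// fsub(-0.0, x). The hardware is free to treat NaN inputs arithmetically.
inline double fnegViaFSub(double X) { return -0.0 - X; }

// Read the sign bit straight from the encoding (works for NaN as well).
inline bool rawSignBit(double X) {
  std::uint64_t Bits;
  std::memcpy(&Bits, &X, sizeof(Bits));
  return (Bits >> 63) != 0;
}
```

For ordinary values the two helpers agree. The difference shows up for NaN inputs, where only the bitwise form guarantees that exactly the sign bit changes.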
* Fix uninitialized variable. | Alexander Kornienko | 2018-11-13 | 1 | -1/+1
  The Flags variable was not initialized before being used (both isMBBSafeToOutlineFrom implementations assume it is initialized), which breaks test/CodeGen/AArch64/machine-outliner.mir under MemorySanitizer:
  MemorySanitizer: use-of-uninitialized-value
    #0 in llvm::AArch64InstrInfo::getOutliningType(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>&, unsigned int) const llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:5494:9
    #1 in (anonymous namespace)::InstructionMapper::convertToUnsignedVec(llvm::MachineBasicBlock&, llvm::TargetInstrInfo const&) llvm/lib/CodeGen/MachineOutliner.cpp:772:19
    #2 in (anonymous namespace)::MachineOutliner::populateMapper((anonymous namespace)::InstructionMapper&, llvm::Module&, llvm::MachineModuleInfo&) llvm/lib/CodeGen/MachineOutliner.cpp:1543:14
    #3 in (anonymous namespace)::MachineOutliner::runOnModule(llvm::Module&) llvm/lib/CodeGen/MachineOutliner.cpp:1645:3
    #4 in (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) llvm/lib/IR/LegacyPassManager.cpp:1744:27
    #5 in llvm::legacy::PassManagerImpl::run(llvm::Module&) llvm/lib/IR/LegacyPassManager.cpp:1857:44
    #6 in compileModule(char**, llvm::LLVMContext&) llvm/tools/llc/llc.cpp:597:8
  llvm-svn: 346761
* [DAGCombiner] Enable tryToFoldExtendOfConstant to run after legalize vector ops | Craig Topper | 2018-11-13 | 1 | -14/+7
  It should be ok to create a new build_vector after legal operations so long as it doesn't cause an infinite loop in DAG combiner. Unfortunately, X86's custom constant folding in combineVSZext is hiding any test changes from this. But I'm trying to get to a point where that X86 specific code isn't necessary at all.
  Differential Revision: https://reviews.llvm.org/D54285
  llvm-svn: 346728
* [MachineOutliner][NFC] Change getMachineOutlinerMBBFlags to isMBBSafeToOutlineFrom | Jessica Paquette | 2018-11-12 | 1 | -1/+6
  Instead of returning Flags, return true if the MBB is safe to outline from. This lets us check for unsafe situations, such as, on AArch64, X17 being live across an MBB without being defined in that MBB. In that case, there's no point in performing an instruction mapping.
  llvm-svn: 346718
* [GC][NFC] Simplify code now that we only have one safepoint kind | Philip Reames | 2018-11-12 | 3 | -17/+7
  This is the NFC follow up to exploit the semantic simplification from r346701
  llvm-svn: 346712
* Use a data structure better suited for large sets in SimplificationTracker. | Ali Tamur | 2018-11-12 | 1 | -11/+156
  Summary: D44571 changed SimplificationTracker to use SmallSetVector to keep phi nodes. As a result, when the number of phi nodes is large, compile-time performance suffers badly. When building for PowerPC, we have a case where there are more than 600,000 nodes, and it takes too long to compile. In this change, I partially revert D44571 to use SmallPtrSet, which does an acceptable job with any number of elements. In the original patch, having a deterministic iteration order was mentioned as a motivation; however, I think it only applies to the nodes already matched in the MatchPhiSet method, which I did not touch.
  Reviewers: bjope, skatkov
  Reviewed By: bjope, skatkov
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D54007
  llvm-svn: 346710
* [GC] Remove so called PreCall safepoints | Philip Reames | 2018-11-12 | 2 | -9/+2
  Remove another bit of unused configuration potential from GCStrategy. It's not entirely clear what the intention here was, but from the docs, it sounds like this may have been subsumed by patchable call support.
  Note: This change is deliberately small to make it clear that while implemented, there's nothing using the option. A following NFC will do most of the simplifications.
  llvm-svn: 346701
* Fix MachineInstr::findRegisterUseOperandIdx subreg checks | Stanislav Mekhanoshin | 2018-11-12 | 1 | -3/+1
  The function only checked that the instruction reads a super-register containing the requested physical register. If a sub-register is being read, that is also a use of the super-register, so this change adds that check. In particular, MI->readsRegister() was broken because of the missing check. The resulting check is essentially regsOverlap().
  Differential Revision: https://reviews.llvm.org/D54128
  llvm-svn: 346686
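The difference between the old super-register test and an overlap test can be sketched with a toy register file. This is only an illustration under stated assumptions: the registers, the unit bitmasks, and both helper functions are hypothetical stand-ins, not the TargetRegisterInfo API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register file, loosely modelled on x86 naming. Each register
// is described by a bitmask of the "register units" it covers.
enum Reg : unsigned { AL, AH, AX, BX };

inline std::uint32_t unitsOf(Reg R) {
  switch (R) {
  case AL: return 0x1; // unit 0
  case AH: return 0x2; // unit 1
  case AX: return 0x3; // units 0 and 1 (AL and AH)
  case BX: return 0xC; // units 2 and 3
  }
  return 0;
}

// Check as described for the old code: the operand counts as a use only if it
// is the requested register or one of its super-registers.
inline bool readsSuperOrSame(Reg Operand, Reg Requested) {
  std::uint32_t Req = unitsOf(Requested);
  return (unitsOf(Operand) & Req) == Req;
}

// Overlap-style check: any shared unit counts, so reading a sub-register is
// also treated as a use of the requested register.
inline bool overlaps(Reg A, Reg B) { return (unitsOf(A) & unitsOf(B)) != 0; }
```

In this model, an instruction that reads `AL` is missed by `readsSuperOrSame(AL, AX)` but is found by `overlaps(AL, AX)`.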
* [MachineOutliner][NFC] Early exit pruning when candidates don't share an MBB | Jessica Paquette | 2018-11-12 | 1 | -0/+8
  There's no way they can overlap in this case. This can save a few iterations when the candidate is close to the beginning of a MachineBasicBlock. It's particularly useful when the average length of a MachineBasicBlock in the program is small.
  llvm-svn: 346682
* [MachineOutliner][NFC] Put suffix tree in buildCandidateList | Jessica Paquette | 2018-11-12 | 1 | -6/+5
  It's only used there, so it doesn't make much sense to have it in runOnModule.
  llvm-svn: 346681
* [DWARFv5] Emit split type units in .debug_info.dwo. | Paul Robinson | 2018-11-12 | 1 | -4/+7
  Differential Revision: https://reviews.llvm.org/D54350
  llvm-svn: 346674
* [DAGCombiner] Fix load-store forwarding of indexed loads. | Nirav Dave | 2018-11-12 | 1 | -3/+17
  Summary: Handle the extra output from indexed loads in cases where we wish to forward a load value directly from a preceding store. Fixes PR39571.
  Reviewers: peter.smith, rengolin
  Subscribers: javed.absar, hiraditya, arphaman, llvm-commits
  Differential Revision: https://reviews.llvm.org/D54265
  llvm-svn: 346654
* [GC] Remove unused configuration variable | Philip Reames | 2018-11-12 | 1 | -6/+1
  The custom root mechanism didn't actually do anything. ShadowStackGC, the only one which used it, just removed the gcroots before they reached the normal lowering in SelectionDAG. As a result, the state flag had no value.
  llvm-svn: 346632
* [GC] Minor style modernization | Philip Reames | 2018-11-12 | 1 | -44/+43
  llvm-svn: 346631
* [GCRoot] Remove some unnecessary complexity | Philip Reames | 2018-11-11 | 2 | -50/+33
  The GCStrategy provides three configuration options that are largely redundant.
  1) Support for conditionally lowering gcread and gcwrite to loads and stores. This is redundant since any GC which wished to use these abstractions would lower them out of existence before the built-in lowering anyway. As such, there's no need for the lowering to be conditional.
  2) Conditional initialization for allocas marked via gcroot. Semantically, roots have to be initialized before first potential use. Arguably, the frontend really should have responsibility for that, but the old API allowed the frontend to ignore this detail. Only one builtin GC used the non-initializing mode. Since no one to my knowledge actually uses the ErlangGC strategy, I decided the slight pessimization was worth the simplicity. If that turns out to be problematic, we can always improve the insertion algorithm to detect more existing initializing stores.
  llvm-svn: 346621
* [DAGCombiner] Make tryToFoldExtendOfConstant return an SDValue instead of an SDNode*. NFC | Craig Topper | 2018-11-10 | 1 | -14/+14
  Removes the need to call getNode internally and to recreate an SDValue after the call.
  llvm-svn: 346600
* [x86] allow vector load narrowing with multi-use values | Sanjay Patel | 2018-11-10 | 1 | -5/+7
  This is a long-awaited follow-up suggested in D33578. Since then, we've picked up even more opportunities for vector narrowing from changes like D53784, so there are a lot of test diffs. Apart from 2-3 strange cases, these are all wins.
  I've structured this to be no-functional-change-intended for any target except for x86 because I couldn't tell if AArch64, ARM, and AMDGPU would improve or not. All of those targets have existing regression tests (4, 4, 10 files respectively) that would be affected. Also, Hexagon overrides the shouldReduceLoadWidth() hook, but doesn't show any regression test diffs.
  The trade-off is deciding if an extra vector load is better than a single wide load + extract_subvector. For x86, this is almost always better (on paper at least) because we often can fold loads into subsequent ops and not increase the official instruction count. There's also some unknown -- but potentially large -- benefit from using narrower vector ops if wide ops are implemented with multiple uops and/or frequency throttling is avoided.
  Differential Revision: https://reviews.llvm.org/D54073
  llvm-svn: 346595
* [GC] Rename a header for consistency | Philip Reames | 2018-11-10 | 3 | -3/+3
  llvm-svn: 346588
* RegAllocFast: Further cleanups; NFC | Matthias Braun | 2018-11-10 | 1 | -210/+217
  llvm-svn: 346576
* [GC] Simplify linking of GC builtin GC strategies | Philip Reames | 2018-11-09 | 1 | -6/+2
  llvm-svn: 346569
* [SelectionDAG] Fix a -Wparentheses warning from gcc in an assert. NFC | Craig Topper | 2018-11-09 | 1 | -2/+2
  gcc wants parentheses around the logical OR since there is a logical AND for the string.
  llvm-svn: 346564
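The warning arises from the common `assert(Cond && "message")` idiom: `&&` binds tighter than `||`, so an unparenthesized `A || B && "message"` parses as `A || (B && "message")`. A minimal sketch of the parenthesized form follows; the function name and its parameters are hypothetical and are not the code touched by this commit.

```cpp
#include <cassert>

// Hypothetical check: a demanded-elements width is accepted when the value is
// scalar (width 1) or when the operation is vector typed.
inline bool checkDemandedWidth(unsigned Width, bool IsVector) {
  // Parenthesizing the OR keeps the whole condition attached to the message
  // string and avoids gcc's -Wparentheses diagnostic.
  assert((Width == 1 || IsVector) && "unexpected demanded-elements width");
  return Width == 1 || IsVector;
}
```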
* [DWARFv5] Emit normal type units in .debug_info comdats. | Paul Robinson | 2018-11-09 | 1 | -1/+6
  Differential Revision: https://reviews.llvm.org/D54282
  llvm-svn: 346540
* [DAGCombiner][X86][Mips] Enable combineShuffleOfScalars to run between vector op legalization and DAG legalization. Fix bad one use check in combineShuffleOfScalars | Craig Topper | 2018-11-09 | 1 | -2/+5
  It's possible for vector op legalization to generate a shuffle. If that happens we should give DAG combine a chance to combine that with a build_vector input.
  I also fixed a bug in combineShuffleOfScalars that was considering the number of uses on an undef input to a shuffle. We don't care how many times undef is used.
  Differential Revision: https://reviews.llvm.org/D54283
  llvm-svn: 346530
* Type safe version of MachinePassRegistry | Serge Guelton | 2018-11-09 | 4 | -41/+6
  The previous version used type erasure through a `void* (*)()` pointer, which triggered a gcc warning and implied a lot of reinterpret_cast. This version should make it harder to shoot ourselves in the foot.
  Differential revision: https://reviews.llvm.org/D54203
  llvm-svn: 346522
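One way to avoid a type-erased `void* (*)()` is to template the registry on the constructor function type, so registration and lookup never need a cast. The sketch below is an assumption-laden illustration of that idea; `PassRegistry`, `Scheduler`, and the two factory functions are hypothetical names and do not reproduce the actual MachinePassRegistry interface.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical registry, templated on the constructor function type.
template <typename PassCtorTy> class PassRegistry {
  struct Node {
    std::string Name;
    PassCtorTy Ctor;
  };
  std::vector<Node> Nodes;

public:
  void add(const std::string &Name, PassCtorTy Ctor) {
    Nodes.push_back({Name, Ctor});
  }

  // Returns the registered constructor, or nullptr if the name is unknown.
  PassCtorTy lookup(const std::string &Name) const {
    for (const Node &N : Nodes)
      if (N.Name == Name)
        return N.Ctor;
    return nullptr;
  }
};

// Hypothetical pass kind and its factory functions.
struct Scheduler {
  int Priority;
};
using SchedulerCtor = Scheduler (*)();

inline Scheduler createFastScheduler() { return Scheduler{1}; }
inline Scheduler createListScheduler() { return Scheduler{2}; }
```

With this shape, passing a function of the wrong signature to `add` is a compile error rather than a silent cast.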
* [Power9] Allow gpr callee saved spills in prologue to vector registers | Zaara Syeda | 2018-11-09 | 2 | -12/+42
  Currently in llvm, CalleeSavedInfo can only assign a callee saved register to a stack frame index to be spilled in the prologue. We would like to enable spilling gprs to vector registers. This patch adds the capability to spill to other registers in addition to the stack. It also adds the changes for power9 to spill gprs to volatile vector registers when they are available. This happens only for leaf functions when using the option -ppc-enable-pe-vector-spills.
  Differential Revision: https://reviews.llvm.org/D39386
  llvm-svn: 346512
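A record that can name either a stack slot or a destination register as the spill location might look roughly like the sketch below. This is a simplified illustration only: the class name `CalleeSaveSlot` and its members are assumptions made for the example and should not be read as the real CalleeSavedInfo definition.

```cpp
#include <cassert>

// Hypothetical callee-saved record: the saved register is spilled either to a
// stack frame index or to another register, never both at once.
class CalleeSaveSlot {
  unsigned Reg; // the callee-saved register being preserved
  union {
    int FrameIdx;    // valid when SpilledToReg is false
    unsigned DstReg; // valid when SpilledToReg is true
  };
  bool SpilledToReg;

public:
  explicit CalleeSaveSlot(unsigned R, int FI = 0)
      : Reg(R), FrameIdx(FI), SpilledToReg(false) {}

  unsigned getReg() const { return Reg; }
  bool isSpilledToReg() const { return SpilledToReg; }

  int getFrameIdx() const {
    assert(!SpilledToReg && "slot is spilled to a register");
    return FrameIdx;
  }
  unsigned getDstReg() const {
    assert(SpilledToReg && "slot is spilled to the stack");
    return DstReg;
  }

  void setFrameIdx(int FI) {
    FrameIdx = FI;
    SpilledToReg = false;
  }
  void setDstReg(unsigned Dst) {
    DstReg = Dst;
    SpilledToReg = true;
  }
};
```

The asserting getters make it an error to read the stack slot of a register spill, or the destination register of a stack spill.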
* [SelectionDAG] swap select_cc operands to enable folding | Alexandros Lamprineas | 2018-11-09 | 1 | -32/+34
  The DAGCombiner tries to SimplifySelectCC as follows:
    select_cc(x, y, 16, 0, cc) -> shl(zext(set_cc(x, y, cc)), 4)
  It can't cope with the situation of reordered operands:
    select_cc(x, y, 0, 16, cc)
  In that case we just need to swap the operands and invert the Condition Code:
    select_cc(x, y, 16, 0, ~cc)
  Differential Revision: https://reviews.llvm.org/D53236
  llvm-svn: 346484
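The equivalence behind this fold can be checked with a small scalar model. This is a sketch under assumptions: the `CondCode` enum, `invert`, and the helper functions are illustrative stand-ins written for this example, not the SelectionDAG API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical condition codes; a small subset chosen for illustration.
enum class CondCode { EQ, NE, LT, GE };

// Logical inverse of a condition code (the "~cc" in the message above).
inline CondCode invert(CondCode CC) {
  switch (CC) {
  case CondCode::EQ: return CondCode::NE;
  case CondCode::NE: return CondCode::EQ;
  case CondCode::LT: return CondCode::GE;
  case CondCode::GE: return CondCode::LT;
  }
  return CC;
}

inline bool setCC(int X, int Y, CondCode CC) {
  switch (CC) {
  case CondCode::EQ: return X == Y;
  case CondCode::NE: return X != Y;
  case CondCode::LT: return X < Y;
  case CondCode::GE: return X >= Y;
  }
  return false;
}

// Reference semantics: select_cc(x, y, t, f, cc) = (x cc y) ? t : f.
inline std::uint32_t selectCC(int X, int Y, std::uint32_t T, std::uint32_t F,
                              CondCode CC) {
  return setCC(X, Y, CC) ? T : F;
}

// Folded form of select_cc(x, y, 16, 0, cc): shl(zext(set_cc(x, y, cc)), 4).
inline std::uint32_t foldSixteenZero(int X, int Y, CondCode CC) {
  return static_cast<std::uint32_t>(setCC(X, Y, CC)) << 4;
}

// select_cc(x, y, 0, 16, cc): swap the operands, invert cc, then fold.
inline std::uint32_t foldZeroSixteen(int X, int Y, CondCode CC) {
  return foldSixteenZero(X, Y, invert(CC));
}
```

Both folded forms compute the same value as the reference `selectCC` for every input, which is what makes the operand swap safe.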
* [SelectionDAG] Assert on the width of DemandedElts argument to computeKnownBits for all vector typed operations not just build_vector. | Craig Topper | 2018-11-08 | 1 | -2/+3
  Fix AArch64 unit test that fails with the assertion added.
  llvm-svn: 346437
* [DAGCombine] Improve alias analysis for chain of independent stores. | Nirav Dave | 2018-11-08 | 1 | -59/+116
  FindBetterNeighborChains simultaneously improves the chain dependencies of a chain of related stores, avoiding the generation of extra token factors. For chains longer than the GatherAllAliasDepths, stores further down in the chain will necessarily fail, a potentially significant waste that also prevents otherwise trivial parallelization.
  This patch directly parallelizes the chains of stores before improving each store. This generally improves DAG-level parallelism.
  Reviewers: courbet, spatel, RKSimon, bogner, efriedma, craig.topper, rnk
  Subscribers: sdardis, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53552
  llvm-svn: 346432
* NFC: DebugInfo: Track the origin CU rather than just the base address for range lists | David Blaikie | 2018-11-08 | 4 | -13/+11
  Turns out knowing more than just the base address might be useful - specifically a future change to respect a DICompileUnit flag for the use of base address specifiers in DWARF < 5.
  llvm-svn: 346380
* [MachineOutliner][NFC] Only map blocks which have adjacent legal instructions | Jessica Paquette | 2018-11-08 | 1 | -14/+36
  If a block doesn't have any ranges of adjacent legal instructions, then it can't have outlining candidates. There's no point in mapping legal instructions in situations like this.
  I noticed this reduces the size of the suffix tree in sqlite3 for AArch64 at -Oz by about 3%.
  llvm-svn: 346379
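The pre-check described here can be sketched as a scan for two neighbouring legal instructions. The sketch assumes that a "range of adjacent legal instructions" means at least two legal instructions in a row, and it represents a block as a plain vector of legality flags; both are simplifications made for illustration, and `hasAdjacentLegalInstrs` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if the block contains at least one pair of neighbouring legal
// instructions. IsLegal[I] says whether instruction I may be outlined.
inline bool hasAdjacentLegalInstrs(const std::vector<bool> &IsLegal) {
  for (std::size_t I = 1; I < IsLegal.size(); ++I)
    if (IsLegal[I - 1] && IsLegal[I])
      return true;
  return false;
}
```

A block for which this returns false could be skipped before any of its instructions are mapped.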
* [MachineOutliner][NFC] Don't map MBBs that don't contain legal instructions | Jessica Paquette | 2018-11-08 | 1 | -18/+47
  I noticed that there are lots of basic blocks that don't have enough legal instructions in them to warrant outlining. We can skip mapping these entirely.
  In sqlite3, compiled for AArch64 at -Oz, this results in a 10% reduction of the total nodes in the suffix tree. These nodes can never be part of a repeated substring, and so they don't impact the result at all.
  Before this, there were 62128 nodes in the tree for sqlite3. After this, there are 56457 nodes.
  llvm-svn: 346373
* [MachineOutliner][NFC] Remove Parent field from SuffixTreeNode | Jessica Paquette | 2018-11-07 | 1 | -28/+14
  This is only used for calculating ConcatLen. This isn't necessary, since it's easily derived from the traversal setting suffix indices. Remove that. Rename CurrIdx to CurrNodeLen to better describe what's going on.
  llvm-svn: 346349
* [MachineOutliner][NFC] Traverse suffix tree using a RepeatedSubstring iterator | Jessica Paquette | 2018-11-07 | 1 | -53/+111
  This takes the traversal methods introduced in r346269 and adapts them into an iterator. This allows the outliner to iterate over repeated substrings within the suffix tree directly, without first having to find all of the substrings and then iterate over them.
  llvm-svn: 346345
* [MachineOutliner] Don't store outlined function numberings on OutlinedFunction | Jessica Paquette | 2018-11-07 | 1 | -5/+13
  NFC-ish. This doesn't change the behaviour of the outliner, but does make sure that you won't end up with say

    OUTLINED_FUNCTION_2:
      ...
      ret
    OUTLINED_FUNCTION_248:
      ...
      ret

  as the only outlined functions in your module. Those should really be

    OUTLINED_FUNCTION_0:
      ...
      ret
    OUTLINED_FUNCTION_1:
      ...
      ret

  If we produce outlined functions, they probably should have sequential numbers attached to them. This makes it a bit easier+stable to write outliner tests.
  The point of this is to move towards a bit more stability in outlined function names. By doing this, we at least don't rely on the traversal order of the suffix tree. Instead, we rely on the order of the candidate list, which is *far* more consistent. The candidate list is ordered by the end indices of candidates, so we're more likely to get a stable ordering. This is still susceptible to changes in the cost model though (like, if we suddenly find new candidates, for example).
  llvm-svn: 346340
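The naming scheme described above amounts to drawing the numeric suffix from a counter that advances only when a function is actually emitted, rather than from the order in which candidates were discovered. A minimal sketch follows, assuming a hypothetical `CandidateGroup` record and a hypothetical `nameOutlinedFunctions` helper that are not part of the outliner.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a group of outlining candidates.
struct CandidateGroup {
  unsigned DiscoveryId; // order in which the group was found (may be sparse)
  bool Beneficial;      // whether the cost model decided to outline it
};

// Assigns names from a counter that only advances for emitted functions, so
// the suffixes are sequential regardless of the discovery ids.
inline std::vector<std::string>
nameOutlinedFunctions(const std::vector<CandidateGroup> &Groups) {
  std::vector<std::string> Names;
  unsigned NextName = 0;
  for (const CandidateGroup &G : Groups) {
    if (!G.Beneficial)
      continue; // rejected groups do not consume a number
    Names.push_back("OUTLINED_FUNCTION_" + std::to_string(NextName++));
  }
  return Names;
}
```

With groups discovered as 2, 100 (rejected), and 248, the emitted names are `OUTLINED_FUNCTION_0` and `OUTLINED_FUNCTION_1`.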
* Fix ignored type qualifier warning [NFC] | Serge Guelton | 2018-11-07 | 1 | -1/+1
  llvm-svn: 346332
* Add support for llvm.is.constant intrinsic (PR4898) | James Y Knight | 2018-11-07 | 4 | -14/+49
  This adds the llvm-side support for post-inlining evaluation of the __builtin_constant_p GCC intrinsic.
  Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call to a function where canConstantFoldTo returns true, and one of the arguments is a struct.
  Updated from patch initially by Janusz Sobczak.
  Differential Revision: https://reviews.llvm.org/D4276
  llvm-svn: 346322
* RegAllocFast: Leave unassigned virtreg entries in map | Matthias Braun | 2018-11-07 | 1 | -93/+74
  Set `LiveReg::PhysReg` to zero when freeing a register instead of removing the entry from `LiveRegMap`. This way no iterators get invalidated and we can avoid passing around and updating iterators all over the place.
  This does not change any allocator decisions. It is not completely NFC because the arbitrary iteration order through `LiveRegMap` in `spillAll()` changes, so we may get a different order in those spill sequences (the number of spills does not change).
  This is in preparation of https://reviews.llvm.org/D52010.
  llvm-svn: 346298
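The idea of marking an entry as unassigned instead of erasing it can be sketched with a plain container. The sketch assumes a `std::vector` in place of the real map type and uses a simplified `LiveReg` with only two fields; `freeReg` and `freeAllAssigned` are hypothetical helpers written for this example.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for a live virtual register entry. A PhysReg of zero
// means "currently unassigned".
struct LiveReg {
  unsigned VirtReg;
  unsigned PhysReg;
};

using LiveRegMap = std::vector<LiveReg>;

// Freeing only clears the assignment; the entry stays in the container, so
// iterators and references into the map remain valid.
inline void freeReg(LiveReg &LR) { LR.PhysReg = 0; }

// Frees every assigned entry while iterating. No iterator bookkeeping is
// needed because nothing is erased. Returns the number of entries freed.
inline unsigned freeAllAssigned(LiveRegMap &Map) {
  unsigned Freed = 0;
  for (LiveRegMap::iterator It = Map.begin(); It != Map.end(); ++It) {
    if (It->PhysReg == 0)
      continue; // already unassigned
    freeReg(*It);
    ++Freed;
  }
  return Freed;
}
```

The cost of this design is that lookups must treat a zero `PhysReg` as "not live" rather than relying on absence from the map.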
* RegAllocFast: Further cleanups; NFC | Matthias Braun | 2018-11-07 | 1 | -31/+35
  This is in preparation of https://reviews.llvm.org/D52010.
  llvm-svn: 346297
* RegAllocFast: Refactor PhysRegState usage; NFC | Matthias Braun | 2018-11-07 | 1 | -10/+18
  This is in preparation of https://reviews.llvm.org/D52010.
  llvm-svn: 346296
* RegAllocFast: Factor spill/reload creation into their own functions; NFC | Matthias Braun | 2018-11-07 | 1 | -32/+50
  This is in preparation of https://reviews.llvm.org/D52010.
  llvm-svn: 346289
* RegAllocFast: Cleanups; NFC | Matthias Braun | 2018-11-07 | 1 | -16/+13
  This is in preparation of https://reviews.llvm.org/D52010.
  llvm-svn: 346288
* RegAllocFast: Rename statistic from NumCopies to NumCoalesced | Matthias Braun | 2018-11-07 | 1 | -2/+2
  The metric does not return the number of remaining (or inserted) copies but the number of copies that were coalesced. Pick a more descriptive name.
  llvm-svn: 346287
* [MachineOutliner][NFC] Remove OccurrenceCount from SuffixTreeNode | Jessica Paquette | 2018-11-06 | 1 | -7/+0
  After changing the way we find candidates in r346269, this is no longer used.
  llvm-svn: 346275
* [MachineOutliner][NFC] Remove IsInTree from SuffixTreeNode | Jessica Paquette | 2018-11-06 | 1 | -4/+0
  After changing the way we find repeated substrings in r346269, this field is no longer used by anything, so it can be removed.
  llvm-svn: 346274
* [MachineOutliner][NFC] Add findRepeatedSubstrings to SuffixTree, kill LeafVector | Jessica Paquette | 2018-11-06 | 1 | -87/+106
  Instead of iterating over the leaves to find repeated substrings, and walking and collecting leaf children when we don't necessarily need them, let's just calculate what we need and iterate over that.
  By doing this, we don't have to save every leaf. It's easier to read the code too and understand what's going on.
  The goal here, at the end of the day, is to set up to allow us to do something like

    for (RepeatedSubstring &RS : ST) {
      ... do stuff with RS ...
    }

  Which would let us perform the cost model stuff and the repeated substring query at the same time.
  llvm-svn: 346269
* LivePhysRegs/IfConversion: Change some types from unsigned to MCPhysReg; NFC | Matthias Braun | 2018-11-06 | 2 | -15/+15
  Change the type in a couple of lists and sets that only store physical registers from unsigned to MCPhysReg. The latter is only 16 bits and saves us a bit of memory.
  llvm-svn: 346254
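The memory saving mentioned here comes down to element size. The sketch below uses a hypothetical alias `PhysReg16` as a stand-in for a 16-bit physical register type and a made-up helper `bytesFor`; it only illustrates the arithmetic and does not include any LLVM header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 16-bit register id type, standing in for MCPhysReg.
using PhysReg16 = std::uint16_t;

// Payload bytes needed to store NumRegs register ids of type RegT in a
// contiguous list (container overhead is ignored).
template <typename RegT> constexpr std::size_t bytesFor(std::size_t NumRegs) {
  return NumRegs * sizeof(RegT);
}
```

On a typical platform where `unsigned` is 32 bits, `bytesFor<PhysReg16>(N)` is half of `bytesFor<unsigned>(N)`.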
* MachineFunction: Store more specific reference to LLVMTargetMachine; NFC | Matthias Braun | 2018-11-05 | 3 | -3/+4
  MachineFunction can only be used in code using lib/CodeGen, hence we can keep a more specific reference to LLVMTargetMachine rather than just TargetMachine around.
  Do the same for references in ScheduleDAG and RegUsageInfoCollector.
  llvm-svn: 346183
* MachineModuleInfo: Store more specific reference to LLVMTargetMachine; NFC | Matthias Braun | 2018-11-05 | 1 | -1/+1
  MachineModuleInfo can only be used in code using lib/CodeGen, hence we can keep a more specific reference to LLVMTargetMachine rather than just TargetMachine around.
  llvm-svn: 346182
* [FPEnv] Add constrained CEIL/FLOOR/ROUND/TRUNC intrinsics | Cameron McInally | 2018-11-05 | 6 | -0/+52
  Differential Revision: https://reviews.llvm.org/D53411
  llvm-svn: 346141
* [TargetLowering] Begin generalizing TargetLowering::expandFP_TO_SINT support. NFCI. | Simon Pilgrim | 2018-11-05 | 1 | -26/+26
  Prior to initial work to add vector expansion support, remove assumptions that we're working on scalar types.
  llvm-svn: 346139