summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* [SelectionDAG] Assert on the width of DemandedElts argument to ↵Craig Topper2018-11-081-2/+3
| | | | | | | | computeKnownBits for all vector typed operations not just build_vector. Fix AArch64 unit test that fails with the assertion added. llvm-svn: 346437
* [DAGCombine] Improve alias analysis for chain of independent stores.Nirav Dave2018-11-081-59/+116
| | | | | | | | | | | | | | | | | | | FindBetterNeighborChains simulateanously improves the chain dependencies of a chain of related stores avoiding the generation of extra token factors. For chains longer than the GatherAllAliasDepths, stores further down in the chain will necessarily fail, a potentially significant waste and preventing otherwise trivial parallelization. This patch directly parallelize the chains of stores before improving each store. This generally improves DAG-level parallelism. Reviewers: courbet, spatel, RKSimon, bogner, efriedma, craig.topper, rnk Subscribers: sdardis, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D53552 llvm-svn: 346432
* NFC: DebugInfo: Track the origin CU rather than just the base address for ↵David Blaikie2018-11-084-13/+11
| | | | | | | | | | range lists Turns out knowing more than just the base address might be useful - specifically a future change to respect a DICompileUnit flag for the use of base address specifiers in DWARF < 5. llvm-svn: 346380
* [MachineOutliner][NFC] Only map blocks which have adjacent legal instructionsJessica Paquette2018-11-081-14/+36
| | | | | | | | | | | If a block doesn't have any ranges of adjacent legal instructions, then it can't have outlining candidates. There's no point in mapping legal isntructions in situations like this. I noticed this reduces the size of the suffix tree in sqlite3 for AArch64 at -Oz by about 3%. llvm-svn: 346379
* [MachineOutliner][NFC] Don't map MBBs that don't contain legal instructionsJessica Paquette2018-11-081-18/+47
| | | | | | | | | | | | | | I noticed that there are lots of basic blocks that don't have enough legal instructions in them to warrant outlining. We can skip mapping these entirely. In sqlite3, compiled for AArch64 at -Oz, this results in a 10% reduction of the total nodes in the suffix tree. These nodes can never be part of a repeated substring, and so they don't impact the result at all. Before this, there were 62128 nodes in the tree for sqlite3. After this, there are 56457 nodes. llvm-svn: 346373
* [MachineOutliner][NFC] Remove Parent field from SuffixTreeNodeJessica Paquette2018-11-071-28/+14
| | | | | | | | | | This is only used for calculating ConcatLen. This isn't necessary, since it's easily derived from the traversal setting suffix indices. Remove that. Rename CurrIdx to CurrNodeLen to better describe what's going on. llvm-svn: 346349
* [MachineOutliner][NFC] Traverse suffix tree using a RepeatedSubstring iteratorJessica Paquette2018-11-071-53/+111
| | | | | | | | | This takes the traversal methods introduced in r346269 and adapts them into an iterator. This allows the outliner to iterate over repeated substrings within the suffix tree directly without having to initially find all of the substrings and then iterate over them after you've found them. llvm-svn: 346345
* [MachineOutliner] Don't store outlined function numberings on OutlinedFunctionJessica Paquette2018-11-071-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NFC-ish. This doesn't change the behaviour of the outliner, but does make sure that you won't end up with say OUTLINED_FUNCTION_2: ... ret OUTLINED_FUNCTION_248: ... ret as the only outlined functions in your module. Those should really be OUTLINED_FUNCTION_0: ... ret OUTLINED_FUNCTION_1: ... ret If we produce outlined functions, they probably should have sequential numbers attached to them. This makes it a bit easier+stable to write outliner tests. The point of this is to move towards a bit more stability in outlined function names. By doing this, we at least don't rely on the traversal order of the suffix tree. Instead, we rely on the order of the candidate list, which is *far* more consistent. The candidate list is ordered by the end indices of candidates, so we're more likely to get a stable ordering. This is still susceptible to changes in the cost model though (like, if we suddenly find new candidates, for example). llvm-svn: 346340
* Fix ignorded type qualifier warning [NFC]Serge Guelton2018-11-071-1/+1
| | | | llvm-svn: 346332
* Add support for llvm.is.constant intrinsic (PR4898)James Y Knight2018-11-074-14/+49
| | | | | | | | | | | | | | | This adds the llvm-side support for post-inlining evaluation of the __builtin_constant_p GCC intrinsic. Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call to a function where canConstantFoldTo returns true, and one of the arguments is a struct. Updated from patch initially by Janusz Sobczak. Differential Revision: https://reviews.llvm.org/D4276 llvm-svn: 346322
* RegAllocFast: Leave unassigned virtreg entries in mapMatthias Braun2018-11-071-93/+74
| | | | | | | | | | | | | | | | Set `LiveReg::PhysReg` to zero when freeing a register instead of removing it from the entry from `LiveRegMap`. This way no iterators get invalidated and we can avoid passing around and updating iterators all over the place. This does not change any allocator decisions. It is not completely NFC because the arbitrary iteration order through `LiveRegMap` in `spillAll()` changes so we may get a different order in those spill sequences (the amount of spills does not change). This is in preparation of https://reviews.llvm.org/D52010. llvm-svn: 346298
* RegAllocFast: Further cleanups; NFCMatthias Braun2018-11-071-31/+35
| | | | | | This is in preparation of https://reviews.llvm.org/D52010. llvm-svn: 346297
* RegAllocFast: Refactor PhysRegState usage; NFCMatthias Braun2018-11-071-10/+18
| | | | | | This is in preparation of https://reviews.llvm.org/D52010. llvm-svn: 346296
* RegAllocFast: Factor spill/reload creation into their own functions; NFCMatthias Braun2018-11-071-32/+50
| | | | | | This is in preparation of https://reviews.llvm.org/D52010. llvm-svn: 346289
* RegAllocFast: Cleanups; NFCMatthias Braun2018-11-071-16/+13
| | | | | | This is in preparation of https://reviews.llvm.org/D52010. llvm-svn: 346288
* RegAllocFast: Rename statistic from NumCopies to NumCoalescedMatthias Braun2018-11-071-2/+2
| | | | | | | | The metric does not return the number of remaining (or inserted) copies but the number of copies that were coalesced. Pick a more descriptive name. llvm-svn: 346287
* [MachineOutliner][NFC] Remove OccurrenceCount from SuffixTreeNodeJessica Paquette2018-11-061-7/+0
| | | | | | After changing the way we find candidates in r346269, this is no longer used. llvm-svn: 346275
* [MachineOutliner][NFC] Remove IsInTree from SuffixTreeNodeJessica Paquette2018-11-061-4/+0
| | | | | | | After changing the way we find repeated substrings in r346269, this field is no longer used by anything, so it can be removed. llvm-svn: 346274
* [MachineOutliner][NFC] Add findRepeatedSubstrings to SuffixTree, kill LeafVectorJessica Paquette2018-11-061-87/+106
| | | | | | | | | | | | | | | | | | | | | Instead of iterating over the leaves to find repeated substrings, and walking collecting leaf children when we don't necessarily need them, let's just calculate what we need and iterate over that. By doing this, we don't have to save every leaf. It's easier to read the code too and understand what's going on. The goal here, at the end of the day, is to set up to allow us to do something like for (RepeatedSubstring &RS : ST) { ... do stuff with RS ... } Which would let us perform the cost model stuff and the repeated substring query at the same time. llvm-svn: 346269
* LivePhysRegs/IfConversion: Change some types from unsigned to MCPhysReg; NFCMatthias Braun2018-11-062-15/+15
| | | | | | | | Change the type in a couple of lists and sets that only store physical registers from unsigned to MCPhysRegs. The later is only 16bits and saves us a bit of memory. llvm-svn: 346254
* MachineFunction: Store more specific reference to LLVMTargetMachine; NFCMatthias Braun2018-11-053-3/+4
| | | | | | | | | | MachineFunction can only be used in code using lib/CodeGen, hence we can keep a more specific reference to LLVMTargetMachine rather than just TargetMachine around. Do the same for references in ScheduleDAG and RegUsageInfoCollector. llvm-svn: 346183
* MachineModuleInfo: Store more specific reference to LLVMTargetMachine; NFCMatthias Braun2018-11-051-1/+1
| | | | | | | | MachineModuleInfo can only be used in code using lib/CodeGen, hence we can keep a more specific reference to LLVMTargetMachine rather than just TargetMachine around. llvm-svn: 346182
* [FPEnv] Add constrained CEIL/FLOOR/ROUND/TRUNC intrinsicsCameron McInally2018-11-056-0/+52
| | | | | | Differential Revision: https://reviews.llvm.org/D53411 llvm-svn: 346141
* [TargetLowering] Begin generalizing TargetLowering::expandFP_TO_SINT ↵Simon Pilgrim2018-11-051-26/+26
| | | | | | | | support. NFCI. Prior to initial work to add vector expansion support, remove assumptions that we're working on scalar types. llvm-svn: 346139
* [DAGCombiner] Use tryFoldToZero to simplify some code and make it work ↵Craig Topper2018-11-051-8/+2
| | | | | | | | correctly between LegalTypes and LegalOperations. The original code avoided creating a zero vector after type legalization, but if we're after type legalization the type we have is legal. The real hazard we need to avoid is creating a build vector after op legalization. tryFoldToZero takes care of checking for this. llvm-svn: 346119
* [DAGCombiner] Remove an unused argument from tryFoldToZero. NFCCraig Topper2018-11-051-4/+3
| | | | llvm-svn: 346118
* [DAGCombiner] Remove 'else' after return. NFCCraig Topper2018-11-041-8/+7
| | | | | | This makes this code consistent with the nearly identical code in visitZERO_EXTEND. llvm-svn: 346090
* [SelectionDAG] Remove special methods for creating *_EXTEND_VECTOR_INREG ↵Craig Topper2018-11-044-44/+22
| | | | | | | | | | nodes. Move asserts into getNode. These methods were just wrappers around getNode with additional asserts (identical and repeated 3 times). But getNode already has a switch that can be used to hold these asserts that allows them to be shared for all 3 opcodes. This also enables checking on the places that create these nodes without using the wrappers. The rest of the patch is just changing all callers to use getNode directly. llvm-svn: 346087
* [codeview] Let the X86 backend tell us the VFRAME offset adjustmentReid Kleckner2018-11-032-8/+6
| | | | | | | | | | | Use MachineFrameInfo's OffsetAdjustment field to pass this information from the target to CodeViewDebug.cpp. The X86 backend doesn't use it for any other purpose. This fixes PR38857 in the case where there is a non-aligned quantity of CSRs and a non-aligned quantity of locals. llvm-svn: 346062
* [X86] Don't emit *_extend_vector_inreg nodes when both the input and output ↵Craig Topper2018-11-021-1/+1
| | | | | | | | | | | | | | | | types are legal with AVX1 We already have custom lowering for the AVX case in LegalizeVectorOps. So its better to keep the regular extend op around as long as possible. I had to qualify one place in DAG combine that created illegal vector extending load operations. This change by itself had no effect on any tests which is why its included here. I've made a few cleanups to the custom lowering. The sign extend code no longer creates an identity shuffle with undef elements. The zero extend code now emits a zero_extend_vector_inreg instead of an unpckl with a zero vector. For the high half of the custom lowering of zero_extend/any_extend, we're now using an unpckh with a zero vector or undef. Previously we used used a pshufd to move the upper 64-bits to the lower 64-bits and then used a zero_extend_vector_inreg. I think the zero vector should require less execution resources and be smaller code size. Differential Revision: https://reviews.llvm.org/D54024 llvm-svn: 346043
* [MachineSink][DebugInfo] Correctly sink DBG_VALUEsJeremy Morse2018-11-021-10/+47
| | | | | | | | | | | | | | | As reported in PR38952, postra-machine-sink relies on DBG_VALUE insns being adjacent to the def of the register that they reference. This is not always true, leading to register copies being sunk but not the associated DBG_VALUEs, which gives the debugger a bad variable location. This patch collects DBG_VALUEs as we walk through a BB looking for copies to sink, then passes them down to performSink. Compile-time impact should be negligable. Differential Revision: https://reviews.llvm.org/D53992 llvm-svn: 345996
* [DAGCombiner] Remove reduceBuildVecConvertToConvertBuildVec and rely on the ↵Simon Pilgrim2018-11-021-75/+0
| | | | | | | | | | | | | vectorizers instead (PR35732) reduceBuildVecConvertToConvertBuildVec vectorizes int2float in the DAGCombiner, which means that even if the LV/SLP has decided to keep scalar code using the cost models, this will override this. While there are cases where vectorization is necessary in the DAG (mainly due to legalization artefacts), I don't think this is the case here, we should assume that the vectorizers know what they are doing. Differential Revision: https://reviews.llvm.org/D53712 llvm-svn: 345964
* LLVMTargetMachine/TargetPassConfig: Simplify handling of start/stop options; NFCMatthias Braun2018-11-022-27/+28
| | | | | | | | | | - Make some TargetPassConfig methods that just check whether options have been set static. - Shuffle code in LLVMTargetMachine around so addPassesToGenerateCode only deals with TargetPassConfig now (but not with MCContext or the creation of MachineModuleInfo) llvm-svn: 345918
* [COFF, ARM64] Implement Intrinsic.sponentry for AArch64Mandeep Singh Grang2018-11-013-0/+6
| | | | | | | | | | | | | | | | Summary: This patch adds Intrinsic.sponentry. This intrinsic is required to correctly support setjmp for AArch64 Windows platform. Patch by: Yin Ma (yinma@codeaurora.org) Reviewers: mgrang, ssijaric, eli.friedman, TomTan, mstorsjo, rnk, compnerd, efriedma Reviewed By: efriedma Subscribers: efriedma, javed.absar, kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D53996 llvm-svn: 345909
* [DAGCombiner] Make the isTruncateOf call from visitZERO_EXTEND work for ↵Craig Topper2018-11-011-16/+13
| | | | | | | | vectors. Remove FIXME. I'm having trouble creating a test case for the ISD::TRUNCATE part of this that shows any codegen differences. But I was able to test the setcc path which is what the test changes here cover. llvm-svn: 345908
* [MachineOutliner][NFC] Remember when you map something illegal across MBBsJessica Paquette2018-11-011-20/+27
| | | | | | | | | | | | | | | | | | | | | Instruction mapping in the outliner uses "illegal numbers" to signify that something can't ever be part of an outlining candidate. This means that the number is unique and can't be part of any repeated substring. Because each of these is unique, we can use a single unique number to represent a range of things we can't outline. The outliner tries to leverage this using a flag which is set in an MBB when the previous instruction we tried to map was "illegal". This patch improves that logic to work across MBBs. As a bonus, this also simplifies the mapping logic somewhat. This also updates the machine-outliner-remarks test, which was impacted by the order of Candidates on an OutlinedFunction changing. This order isn't guaranteed, so I added a FIXME to fix that in a follow-up. The order of Candidates on an OutlinedFunction isn't important, so this still is NFC. llvm-svn: 345906
* [LegalizeDAG] Add generic vector CTPOP expansion (PR32655)Simon Pilgrim2018-11-012-2/+26
| | | | | | | | This patch adds support for expanding vector CTPOP instructions and removes the x86 'bitmath' lowering which replicates the same expansion. Differential Revision: https://reviews.llvm.org/D53258 llvm-svn: 345869
* Revert "[COFF, ARM64] Implement Intrinsic.sponentry for AArch64"Mandeep Singh Grang2018-11-013-6/+0
| | | | | | This reverts commit 585b6667b4712e3c7f32401e929855b3313b4ff2. llvm-svn: 345863
* [DAGCombiner] make sure we have a whole-number extract before trying to ↵Sanjay Patel2018-11-011-1/+5
| | | | | | | | | | | | narrow a vector op (PR39511) The test causes a crash because we were trying to extract v4f32 to v3f32, and the narrowing factor was then 4/3 = 1 producing a bogus narrow type. This should fix: https://bugs.llvm.org/show_bug.cgi?id=39511 llvm-svn: 345842
* [CodeView] Emit the correct TypeIndex for std::nullptr_t.Zachary Turner2018-11-011-0/+2
| | | | | | | | | | | | | | | | | | | | | | The TypeIndex used by cl.exe is 0x103, which indicates a SimpleTypeMode of NearPointer (note the absence of the bitness, normally pointers use a mode of NearPointer32 or NearPointer64) and a SimpleTypeKind of void. So this is basically a void*, but without a specified size, which makes sense given how std::nullptr_t is defined. clang-cl was actually not emitting *anything* for this. Instead, when we encountered std::nullptr_t in a DIType, we would actually just emit a TypeIndex of 0, which is obviously wrong. std::nullptr_t in DWARF is represented as a DW_TAG_unspecified_type with a name of "decltype(nullptr)", so we add that logic along with a test, as well as an update to the dumping code so that we no longer print void* when dumping 0x103 (which would previously treat Void/NearPointer no differently than Void/NearPointer64). Differential Revision: https://reviews.llvm.org/D53957 llvm-svn: 345811
* [COFF, ARM64] Implement Intrinsic.sponentry for AArch64Mandeep Singh Grang2018-10-313-0/+6
| | | | | | | | | | | | | | Summary: This patch adds Intrinsic.sponentry. This intrinsic is required to correctly support setjmp for AArch64 Windows platform. Reviewers: mgrang, TomTan, rnk, compnerd, mstorsjo, efriedma Reviewed By: efriedma Subscribers: majnemer, chrib, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D53673 llvm-svn: 345791
* Check shouldReduceLoadWidth from SimplifySetCCStanislav Mekhanoshin2018-10-311-1/+2
| | | | | | | | | | | | SimplifySetCC could shrink a load without checking for profitability or legality of such shink with a target. Added checks to prevent shrinking of aligned scalar loads in AMDGPU below dword as scalar engine does not support it. Differential Revision: https://reviews.llvm.org/D53846 llvm-svn: 345778
* [SelectionDAG] Handle constant range [0,1) in lowerRangeToAssertZExtScott Linder2018-10-311-1/+2
| | | | | | | | | lowerRangeToAssertZExt currently relies on something like EarlyCSE having eliminated the constant range [0,1). At -O0 this leads to an assert. Differential Revision: https://reviews.llvm.org/D53888 llvm-svn: 345770
* [SelectionDAGISel] Suppress a -Wunused-but-set-variable warning in release ↵Craig Topper2018-10-311-0/+1
| | | | | | builds. NFC llvm-svn: 345761
* Fix comment typo. NFCI.Simon Pilgrim2018-10-311-1/+1
| | | | llvm-svn: 345758
* [SelectionDAG] SelectionDAGLegalize::ExpandBITREVERSE - ensure we use ShiftTySimon Pilgrim2018-10-311-6/+6
| | | | | | We should be using the getShiftAmountTy value type for shift amounts. llvm-svn: 345756
* [globalisel][irtranslator] Verify that DILocations aren't lost in translationDaniel Sanders2018-10-311-23/+69
| | | | | | | | | | | | | | | | Summary: Also fix a couple bugs where DILocations are lost. EntryBuilder wasn't passing on debug locations for PHI's, constants, GLOBAL_VALUE, etc. Reviewers: aprantl, vsk, bogner, aditya_nandakumar, volkan, rtereshin, aemerson Reviewed By: aemerson Subscribers: aemerson, rovka, kristof.beyls, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D53740 llvm-svn: 345743
* MachineModuleInfo: Initialize DbgInfoAvailable depending on debug_cus existingMatthias Braun2018-10-313-3/+8
| | | | | | | | | | | | | | | | Before this patch DbgInfoAvailable was set to true in DwarfDebug::beginModule() or CodeViewDebug::CodeViewDebug(). This made MIR testing weird since passes would suddenly stop dealing with debug info just because we stopped the pipeline before the debug printers. This patch changes the logic to initialize DbgInfoAvailable based on the fact that debug_compile_units exist in the llvm Module. The debug printers may then override it with false in case of debug printing being disabled. Differential Revision: https://reviews.llvm.org/D53885 llvm-svn: 345740
* [DAGCombiner] Fold 0 div/rem X to 0David Bolvansky2018-10-311-2/+5
| | | | | | | | | | | | Reviewers: RKSimon, spatel, javed.absar, craig.topper, t.p.northover Reviewed By: RKSimon Subscribers: craig.topper, llvm-commits Differential Revision: https://reviews.llvm.org/D52504 llvm-svn: 345721
* ADT/STLExtras: Introduce llvm::empty; NFCMatthias Braun2018-10-314-5/+4
| | | | | | | | This is modeled after C++17 std::empty(). Differential Revision: https://reviews.llvm.org/D53909 llvm-svn: 345679
OpenPOWER on IntegriCloud