path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
* [DAGCombiner] visitSHL - pull out repeated shift amount VT. NFCI.Simon Pilgrim2019-06-191-6/+6
| | | | llvm-svn: 363789
* [DAGCombine] Fix (shl (ext (shl x, c1)), c2) -> (shl (ext x), (add c1, c2)) ↵Simon Pilgrim2019-06-191-1/+2
| | | | | | | | comment. NFCI. We pre-extend, not post. llvm-svn: 363787
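The fold named in this commit title can be checked by brute force on small widths. The sketch below is only an illustration: it models an 8-bit value extended to 16 bits in plain Python, and it assumes the fold is applied only when the outer shift amount is at least the number of bits added by the extension. That guard is not stated in the log entry, and all helper names are made up for the example.

```python
NARROW, WIDE = 8, 16          # hypothetical bit widths for the sketch
NMASK, WMASK = (1 << NARROW) - 1, (1 << WIDE) - 1

def zext(v):
    """Zero-extend an 8-bit value to 16 bits."""
    return v & NMASK

def sext(v):
    """Sign-extend an 8-bit value to 16 bits."""
    v &= NMASK
    return v | (WMASK & ~NMASK) if v & (1 << (NARROW - 1)) else v

def original(ext, x, c1, c2):
    # (shl (ext (shl x, c1)), c2): shift in the narrow type, extend, shift again
    return (ext((x << c1) & NMASK) << c2) & WMASK

def folded(ext, x, c1, c2):
    # (shl (ext x), (add c1, c2)): extend first ("pre-extend"), then one shift
    return (ext(x) << (c1 + c2)) & WMASK

def fold_is_safe(c2):
    # Assumed guard: the outer shift discards every bit the extension added.
    return c2 >= WIDE - NARROW

# Every (x, c1, c2) where the guard holds but the two forms disagree.
mismatches = [
    (x, c1, c2)
    for ext in (zext, sext)
    for x in range(1 << NARROW)
    for c1 in range(NARROW)
    for c2 in range(WIDE)
    if fold_is_safe(c2) and original(ext, x, c1, c2) != folded(ext, x, c1, c2)
]
```

Under the assumed guard the two forms agree for both kinds of extension. Without it they can differ, for example with x = 0x80, c1 = 1, c2 = 0 under zero extension.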
* [NFC] Move some hardware loop checking code to a common place for other users.Chen Zheng2019-06-191-82/+10
| | | | | | Differential Revision: https://reviews.llvm.org/D63478 llvm-svn: 363758
* Rename ExpandISelPseudo->FinalizeISel, delay register reservationMatt Arsenault2019-06-197-23/+56
| | | | | | | | | | | This allows targets to make more decisions about reserved registers after isel. For example, it should now be certain whether there are calls or stack objects in the frame, which could have been introduced by legalization. Patch by Matthias Braun. llvm-svn: 363757
* [GlobalISel][Localizer] Remove redundant set lookup.Amara Emerson2019-06-181-1/+1
| | | | | | | After changing the algorithm to only process the entry block we never revisit a processed instruction. llvm-svn: 363745
* [MachinePipeliner][NFC] Do resource tracking log only when requested.Jinsong Ji2019-06-181-22/+43
| | | | | | | In most cases we don't need the resource tracking debug output, so leave it off by default. llvm-svn: 363733
* [TargetLowering] SimplifyDemandedBits - Cleanup ANY_EXTEND handlingSimon Pilgrim2019-06-181-2/+8
| | | | | | Match SIGN_EXTEND + ZERO_EXTEND handling - will be adding ANY_EXTEND_VECTOR_INREG support in a future patch. llvm-svn: 363716
* [TargetLowering] SimplifyDemandedBits - Merge ↵Simon Pilgrim2019-06-181-24/+16
| | | | | | | | ZERO_EXTEND+ZERO_EXTEND_VECTOR_INREG handling Other than adding consistent demanded elts handling which was a trivial addition, the other differences in functionality will be added in later patches. llvm-svn: 363713
* [TargetLowering] SimplifyDemandedBits - Merge ↵Simon Pilgrim2019-06-181-25/+17
| | | | | | | | SIGN_EXTEND+SIGN_EXTEND_VECTOR_INREG handling Other than adding consistent demanded elts handling which was a trivial addition, the other differences in functionality will be added in later patches. llvm-svn: 363710
* [TargetLowering] SimplifyDemandedVectorElts - support MUL and ↵Simon Pilgrim2019-06-181-0/+9
| | | | | | | | | | ANY_EXTEND_VECTOR_INREG Also fold ANY_EXTEND_VECTOR_INREG -> BITCAST if we only need the bottom element. Fixes temporary regression introduced in rL363693. llvm-svn: 363694
* [SelectionDAG] Legalize vaargs that require vector splittingSimon Pilgrim2019-06-182-0/+24
| | | | | | | | | | This adds vector splitting for vaarg instructions during type legalization Committed on behalf of @luke (Luke Lau) Differential Revision: https://reviews.llvm.org/D60762 llvm-svn: 363671
* GlobalISel: Remove redundant pass initializationTom Stellard2019-06-185-13/+4
| | | | | | | | | | | | | | | | | | | Summary: All the GlobalISel passes are initialized when the target calls initializeGlobalISel(), so we don't need to call the initializers from the pass constructors. Reviewers: qcolombet, t.p.northover, paquette, dsanders, aemerson, aditya_nandakumar Reviewed By: aemerson Subscribers: rovka, kristof.beyls, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63235 llvm-svn: 363642
* GlobalISel: Use the original flags when lowering fneg to fsubMatt Arsenault2019-06-171-2/+1
| | | | | | | | | | This was ignoring the flag on fneg, and using the source instruction's flags. Also fixes tests missing from r358702. Note the expansion itself isn't correct without nnan, but that should be fixed separately. llvm-svn: 363637
* hwasan: Add a tag_offset DWARF attribute to instrumented stack variables.Peter Collingbourne2019-06-173-0/+8
| | | | | | | | | | | | | | | | | | | | | | | The goal is to improve hwasan's error reporting for stack use-after-return by recording enough information to allow the specific variable that was accessed to be identified based on the pointer's tag. Currently we record the PC and lower bits of SP for each stack frame we create (which will eventually be enough to derive the base tag used by the stack frame) but that's not enough to determine the specific tag for each variable, which is the stack frame's base tag XOR a value (the "tag offset") that is unique for each variable in a function. In IR, the tag offset is most naturally represented as part of a location expression on the llvm.dbg.declare instruction. However, the presence of the tag offset in the variable's actual location expression is likely to confuse debuggers which won't know about tag offsets, and moreover the tag offset is not required for a debugger to determine the location of the variable on the stack, so at the DWARF level it is represented as an attribute so that it will be ignored by debuggers that don't know about it. Differential Revision: https://reviews.llvm.org/D63119 llvm-svn: 363635
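The XOR relationship described in this entry can be sketched in a few lines. This is an illustration under stated assumptions, not hwasan's implementation: tags are taken to be 8 bits wide, and the frame layout, variable names, and offsets are invented for the example.

```python
TAG_MASK = 0xFF  # assumption: 8-bit tags

def variable_tag(base_tag, tag_offset):
    """Tag of one stack variable: the frame's base tag XOR the variable's offset."""
    return (base_tag ^ tag_offset) & TAG_MASK

def find_variable(pointer_tag, base_tag, tag_offsets):
    """Map a faulting pointer's tag back to a variable name, or None."""
    wanted = (pointer_tag ^ base_tag) & TAG_MASK   # XOR is its own inverse
    for name, offset in tag_offsets.items():
        if offset == wanted:
            return name
    return None

# Hypothetical frame: names and offsets are invented for the example.
offsets = {"buf": 0x00, "len": 0x01, "tmp": 0x02}
base = 0x5A
ptr_tag = variable_tag(base, offsets["len"])
```

Because XOR is its own inverse, knowing the frame's base tag and the per-variable tag offsets is enough to go from a pointer's tag back to a specific variable, which is why the offset needs to be recorded in the debug info.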
* [GlobalISel][Localizer] Rewrite localizer to run in 2 phases, inter & intra ↵Amara Emerson2019-06-171-61/+158
| | | | block. Inter-block localization is the same as what currently happens, except now it only runs on the entry block because that's where the problematic constants with long live ranges come from. The second phase is a new intra-block localization phase which attempts to re-sink the already localized instructions further, right before one of the multiple uses. One additional change is to also localize G_GLOBAL_VALUE as they're constants too. However, on some targets like arm64 it takes multiple instructions to materialize the value, so some additional heuristics with a TTI hook have been introduced to attempt to prevent code size regressions when localizing these. Overall, these changes improve CTMark code size on arm64 by 1.2%. Full code size results:
  Program                                       baseline  new      diff
  ------------------------------------------------------------------------------
  test-suite...-typeset/consumer-typeset.test   1249984   1217216  -2.6%
  test-suite...:: CTMark/ClamAV/clamscan.test   1264928   1232152  -2.6%
  test-suite :: CTMark/SPASS/SPASS.test         1394092   1361316  -2.4%
  test-suite...Mark/mafft/pairlocalalign.test   731320    714928   -2.2%
  test-suite :: CTMark/lencod/lencod.test       1340592   1324200  -1.2%
  test-suite :: CTMark/kimwitu++/kc.test        3853512   3820420  -0.9%
  test-suite :: CTMark/Bullet/bullet.test       3406036   3389652  -0.5%
  test-suite...ark/tramp3d-v4/tramp3d-v4.test   8017000   8016992  -0.0%
  test-suite...TMark/7zip/7zip-benchmark.test   2856588   2856588   0.0%
  test-suite...:: CTMark/sqlite3/sqlite3.test   765704    765704    0.0%
  Geomean difference                                               -1.2%
  Differential Revision: https://reviews.llvm.org/D63303 llvm-svn: 363632
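The "Geomean difference" figure can be recomputed from the per-benchmark sizes listed in this entry. The sketch below assumes the figure is the geometric mean of the new/baseline ratios expressed as a percentage change. The log does not say how the reporting tool computes it, so treat the formula as an assumption.

```python
import math

# (baseline, new) code sizes copied from the results in this entry
sizes = [
    (1249984, 1217216),
    (1264928, 1232152),
    (1394092, 1361316),
    (731320, 714928),
    (1340592, 1324200),
    (3853512, 3820420),
    (3406036, 3389652),
    (8017000, 8016992),
    (2856588, 2856588),
    (765704, 765704),
]

def geomean_diff_percent(pairs):
    """Geometric mean of new/baseline ratios, as a percentage change."""
    log_sum = sum(math.log(new / base) for base, new in pairs)
    return (math.exp(log_sum / len(pairs)) - 1.0) * 100.0

diff = geomean_diff_percent(sizes)
```

Rounded to one decimal place, this reproduces the -1.2% reported in the commit message.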
* Propagate fmf in IRTranslate for fnegMichael Berg2019-06-171-8/+17
| | | | | | | | | | | | | | Summary: This case is related to D63405 in that we need to be propagating FMF on negates. Reviewers: volkan, spatel, arsenm Reviewed By: arsenm Subscribers: wdng, javed.absar Differential Revision: https://reviews.llvm.org/D63458 llvm-svn: 363631
* [globalisel] Fix iterator invalidation in the extload combinesDaniel Sanders2019-06-171-54/+38
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Change the way we deal with iterator invalidation in the extload combines as it was still possible to neglect to visit a use. Even worse, it happened in the in-tree test cases and the checks weren't good enough to detect it. We now take a cheap copy of the use list before iterating over it. This prevents iterator invalidation from occurring and has the nice side effect of making the existing schedule-for-erase/schedule-for-insert mechanism moot. Reviewers: aditya_nandakumar Reviewed By: aditya_nandakumar Subscribers: rovka, kristof.beyls, javed.absar, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61813 llvm-svn: 363616
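The hazard and the fix described here have a close analogue in any language with mutable sequences. The Python sketch below is only an analogy with invented function names, not the GlobalISel code: mutating a list while iterating over it silently skips elements, while iterating over a cheap copy visits every one.

```python
def visit_in_place(uses):
    """Iterate the live list while each visit erases the visited use."""
    visited = []
    for use in uses:
        visited.append(use)
        uses.remove(use)      # mutation shifts later elements under the iterator
    return visited

def visit_snapshot(uses):
    """Iterate over a cheap copy so mutations of the live list are harmless."""
    visited = []
    for use in list(uses):
        visited.append(use)
        uses.remove(use)
    return visited
```

With four uses, the in-place version visits only the first and third, which mirrors the "neglect to visit a use" failure. The snapshot version visits all four.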
* GlobalISel: Ignore callsite attributes when picking intrinsic typeMatt Arsenault2019-06-171-1/+3
| | | | | | | | | | | A target intrinsic may be defined as possibly reading memory, but the call site may have additional knowledge that it doesn't read memory. The intrinsic lowering will expect the pessimistic assumption of the intrinsic definition, so the chain should still be used. I fixed the same bug in SelectionDAG in r287593. llvm-svn: 363580
* GlobalISel: Verify intrinsicsMatt Arsenault2019-06-171-0/+29
| | | | | | | | I keep using the wrong instruction when manually writing tests. This really needs to check the number of operands, but I don't see an easy way to do that right now. llvm-svn: 363579
* PHINode: introduce setIncomingValueForBlock() function, and use it.Whitney Tsang2019-06-171-7/+3
| | | | | | | | | | | | | | | | Summary: There is PHINode::getBasicBlockIndex() and PHINode::setIncomingValue() but no function to replace incoming value for a specified BasicBlock* predecessor. Clearly, there are a lot of places that could use that functionality. Reviewer: craig.topper, lebedev.ri, Meinersbur, kbarton, fhahn Reviewed By: Meinersbur, fhahn Subscribers: fhahn, hiraditya, zzheng, jsji, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D63338 llvm-svn: 363566
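A toy model may help show what the new helper saves callers from writing. The class below is a hypothetical Python stand-in for a PHI node, not LLVM's API. In particular, this entry does not say whether the real function updates one matching entry or all of them, so this sketch assumes all.

```python
class Phi:
    """Toy PHI node: parallel lists of incoming blocks and values."""

    def __init__(self, incoming):
        self.blocks = [b for b, _ in incoming]
        self.values = [v for _, v in incoming]

    def get_basic_block_index(self, block):
        """Index of the first entry for `block`, or -1 if absent."""
        return self.blocks.index(block) if block in self.blocks else -1

    def set_incoming_value(self, index, value):
        """Replace the value at a known operand index."""
        self.values[index] = value

    def set_incoming_value_for_block(self, block, value):
        """Replace the value on every edge coming from `block` (assumed behavior)."""
        found = False
        for i, b in enumerate(self.blocks):
            if b == block:
                self.values[i] = value
                found = True
        assert found, "block is not a predecessor of this PHI"
```

Without the combined helper, each caller has to pair an index lookup with a separate store, and has to decide for itself what to do when a block appears more than once.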
* [CodeGen] Check for HardwareLoop Latch ExitBlockSam Parker2019-06-171-3/+13
| | | | | | | | | | | | The HardwareLoops pass finds exit blocks with a scevable exit count. If the target specifies to update the loop counter in a register, through a phi, we need to ensure that the exit block is a latch so that we can insert the phi with the correct value for the incoming edge. Differential Revision: https://reviews.llvm.org/D63336 llvm-svn: 363556
* [DAGCombiner] [CodeGenPrepare] More comprehensive GEP splittingLuis Marques2019-06-172-12/+67
| | | | Some GEPs were not being split, presumably because that split would just be undone by the DAGCombiner. Not performing those splits can block important optimizations, such as (partially) folding the element indices / member offsets into load/store instruction immediates. This patch:
  - Makes the splits also occur in the cases where the base address and the GEP are in the same BB.
  - Ensures that the DAGCombiner doesn't reassociate them back again.
  Differential Revision: https://reviews.llvm.org/D60294 llvm-svn: 363544
* [SelectionDAG] Fold insert_subvector(undef, extract_subvector(v, c), c) -> v ↵Simon Pilgrim2019-06-171-0/+6
| | | | | | | | in getNode This is already done in DAGCombiner::visitINSERT_SUBVECTOR, but this helps a number of shuffles across different vector widths recognise when they come from the same source. llvm-svn: 363542
* Describe stack-id as an enumSander de Smalen2019-06-174-12/+28
| | | | | | | | | | | | | | | | | This patch changes MIR stack-id from an integer to an enum, and adds printing/parsing support for this in MIR files. The default stack-id '0' is now renamed to 'default'. This should make MIR tests that have stack objects with different stack-ids more descriptive. It also clarifies code operating on StackID. Reviewers: arsenm, thegameg, qcolombet Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D60137 llvm-svn: 363533
* [CodeGenPrepare][x86] shift both sides of a vector select when profitableSanjay Patel2019-06-161-3/+42
| | | | | | | | | | | | | | | | | This is based on the example/discussion in PR37428: https://bugs.llvm.org/show_bug.cgi?id=37428 Proper vector shift instructions don't appear until AVX2, so we may generate several extra instructions within a loop trying to compensate for that. It's difficult to recover from that shift expansion later than this, so use the existing TLI hook and splat analysis to enable better codegen. This extends CGP functionality introduced with: rL201655 Differential Revision: https://reviews.llvm.org/D63233 llvm-svn: 363511
* adding more fmf propagation for selects plus updated testsMichael Berg2019-06-152-20/+37
| | | | llvm-svn: 363484
* Revert "adding more fmf propagation for selects plus tests"Fangrui Song2019-06-152-37/+20
| | | | | | | | | | | This reverts rL363474. -debug-only=isel was added to some tests that don't specify `REQUIRES: asserts`. This causes failures on -DLLVM_ENABLE_ASSERTIONS=off builds. I chose to revert instead of fixing the tests because I'm not sure whether we should add `REQUIRES: asserts` to more tests. llvm-svn: 363482
* Reapply "GlobalISel: Avoid producing Illegal copies in RegBankSelect"Matt Arsenault2019-06-151-4/+26
| | | | | | | This reapplies r363410, avoiding null dereference if there is no AltRegBank. llvm-svn: 363478
* Revert "GlobalISel: Avoid producing Illegal copies in RegBankSelect"Mitch Phillips2019-06-141-25/+4
| | | | | | | | | | | This patch breaks UBSan build bots. See https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild for a guide as to how to reproduce the error. This reverts commit c2864c0de07efb5451d32d27a7d4ff2984830929. This reverts rL363410. llvm-svn: 363476
* adding more fmf propagation for selects plus testsMichael Berg2019-06-142-20/+37
| | | | llvm-svn: 363474
* [MBP] Move a latch block with conditional exit and multi predecessors to top ↵Guozhi Wei2019-06-141-50/+233
| | | | of loop Current findBestLoopTop can find and move one kind of block to the top: a latch block that has one successor. Another common case is:
  * a latch block
  * it has two successors, one is the loop header, the other is an exit
  * it has more than one predecessor
  If it is below one of its predecessors P, only P can fall through to it; all other predecessors need a jump to it, and another conditional jump to the loop header. If it is moved before the loop header, all its predecessors jump to it, then fall through to the loop header. So all its predecessors except P save one taken branch. Differential Revision: https://reviews.llvm.org/D43256 llvm-svn: 363471
* [GlobalISel] Add a G_BRJT opcode.Amara Emerson2019-06-142-0/+23
| | | | | | | | | | | | | This is a branch opcode that takes a jump table pointer, jump table index and an index into the table to do an indirect branch. We pass both the table pointer and JTI to allow targets like ARM64 to more easily use the existing jump table compression optimization without having to walk up the block to find a paired G_JUMP_TABLE. Differential Revision: https://reviews.llvm.org/D63159 llvm-svn: 363434
* GlobalISel: Avoid producing Illegal copies in RegBankSelectMatt Arsenault2019-06-141-4/+25
| | | | | | | | | | | | | | | | | | | | | | Avoid producing illegal register bank copies for reg_sequence and phi. The default implementation assumes it is possible to pick any operand's bank and use that for the result, introducing a copy for operands with a different bank. This does not check for illegal copies. It is not legal to introduce a VGPR->SGPR copy, so any VGPR operand requires the result to be a VGPR. The changes in getInstrMappingImpl aren't strictly necessary, since AMDGPU now just bypasses this for reg_sequence/phi. This could be replaced with an assert in case other targets run into this. It is currently responsible for producing the error for unsatisfiable copies, but this will be better served with a verifier check. For phis, for now assume any undetermined operands must be VGPRs. Eventually, this needs to be able to defer mapping these operations. This also does not yet have a way to check for whether the block is in a divergent region. llvm-svn: 363410
* [CodeGenPrepare] propagate debuginfo when copying a shuffleSanjay Patel2019-06-141-0/+1
| | | | llvm-svn: 363409
* RegBankSelect: Remove checks for invalid mappingsMatt Arsenault2019-06-141-5/+2
| | | | | | | | | Avoid a check for valid and a set of redundant asserts. The place InstructionMapping is constructed asserts all of the default fields are passed anyway for an invalid mapping, so don't overcomplicate this. llvm-svn: 363391
* Fix not calling TargetCustom PSVs printerMatt Arsenault2019-06-141-1/+1
| | | | | | | If the enum value was greater than the starting target custom value, the custom printer wasn't called. llvm-svn: 363386
* DebugInfo: Include enumerators in pubnamesDavid Blaikie2019-06-141-0/+5
| | | | | | | | | This is consistent with GCC's behavior (which is the de facto standard for pubnames). Though I find the presence of enumerators from enum classes to be a bit confusing, possibly a bug on GCC's end (since they can't be named unqualified, unlike the other names - and names nested in classes don't go in pubnames, for instance - presumably because one must name the class first & that's enough to limit the scope of the search) llvm-svn: 363349
* Use fully qualified name when printing S_CONSTANT recordsAmy Huang2019-06-131-4/+5
| | | | | | | | | | | | | | | | Summary: Before it was using the fully qualified name only for static data members. Now it does for all variable names to match MSVC. Reviewers: rnk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63012 llvm-svn: 363335
* [GlobalISel][IRTranslator] Add debug loc with line 0 to constants emitted ↵Amara Emerson2019-06-131-3/+14
| | | | | | | | | | | | | into the entry block. Constants, including G_GLOBAL_VALUE, are all emitted into the entry block which lets us use the vreg def assuming it dominates all other users. However, it can cause jumpy debug behaviour since the DebugLoc attached to these MIs is from a user instruction that could be in a different block. Fixes PR40887. Differential Revision: https://reviews.llvm.org/D63286 llvm-svn: 363331
* [MachinePipeliner] Don't check boundary node in checkValidNodeOrderJinsong Ji2019-06-131-0/+5
| | | | | | | | | | | | | | | | | | | This was exposed by PowerPC target enablement. In ScheduleDAG, if we haven't seen any uses in this scheduling region, we will create a dependence edge to ExitSU to model the live-out latency. This is required for vreg defs with no in-region use, and prefetches with no vreg def. When we build NodeOrder in Scheduler, we ignore these boundary nodes. However, when we check Succs in checkValidNodeOrder, we did not skip them, so we still assume all the nodes have been sorted and in order in Indices array. So when we call lower_bound() for ExitSU, it will return Indices.end(), causing memory issues in following Node access. Differential Revision: https://reviews.llvm.org/D63282 llvm-svn: 363329
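The failure mode described here, a binary search that returns the end position for a node that was never placed, can be sketched with Python's `bisect`. This is an analogy with invented names and a simplified ordering rule, not the MachinePipeliner code: the lookup reports a missing node explicitly, and the successor check skips boundary nodes instead of looking them up.

```python
from bisect import bisect_left

def position_of(indices, node):
    """Binary-search a sorted list of (node, order) pairs.

    Returns the order for `node`, or None when the node was never placed,
    instead of reading past the end the way an unchecked lookup would.
    """
    i = bisect_left(indices, (node,))
    if i == len(indices) or indices[i][0] != node:
        return None
    return indices[i][1]

def successors_in_order(indices, node, succs, boundary):
    """True if every non-boundary successor is ordered after `node`.

    Simplified rule for illustration only.
    """
    here = position_of(indices, node)
    for succ in succs:
        if succ in boundary:      # skip ExitSU-style boundary nodes
            continue
        pos = position_of(indices, succ)
        if pos is None or pos < here:
            return False
    return True
```

A boundary node such as ExitSU is never in the sorted array, so the search lands on the end position. Skipping such nodes up front avoids dereferencing that position.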
* [Codegen] Merge tail blocks with no successors after block placementDavid Bolvansky2019-06-131-20/+18
| | | | Summary: I found the following case, in which tail blocks with no successors offer merging opportunities after block placement.
  Before block placement:
    bb0:
      ...
      bne a0, 0, bb2
    bb1:
      mv a0, 1
      ret
    bb2:
      ...
    bb3:
      mv a0, 1
      ret
    bb4:
      mv a0, -1
      ret
  The conditional branch bne in bb0 is the opposite of beq.
  After block placement:
    bb0:
      ...
      beq a0, 0, bb1
    bb2:
      ...
    bb4:
      mv a0, -1
      ret
    bb1:
      mv a0, 1
      ret
    bb3:
      mv a0, 1
      ret
  After block placement, a new tail merging opportunity appears: bb1 and bb3 can be merged into one block. So the conditional constraint for merging tail blocks with no successors should be removed. In my experiment for RISC-V, it decreases code size.
  Author of original patch: Jim Lin
  Reviewers: haicheng, aheejin, craig.topper, rnk, RKSimon, Jim, dmgreen
  Reviewed By: Jim, dmgreen
  Subscribers: xbolva00, dschuff, javed.absar, sbc100, jgravelle-google, aheejin, kito-cheng, dmgreen, PkmX, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D54411 llvm-svn: 363284
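The merging opportunity in this entry can be illustrated with a small model. This is a simplified sketch, not the branch folding pass: it merges only blocks whose whole bodies are identical (real tail merging works on common suffixes), and the successor edges for bb2 are assumed because the commit message elides that block's contents.

```python
def merge_identical_tails(blocks, successors):
    """Map each no-successor block to the first block with the same body.

    `blocks` maps a label to its instruction list; `successors` maps a label
    to its successor labels. Only whole-block matches are merged here.
    """
    first_seen = {}
    replacement = {}
    for label, insts in blocks.items():
        if successors.get(label):
            continue                      # only exit blocks are candidates
        key = tuple(insts)
        replacement[label] = first_seen.setdefault(key, label)
    return replacement

# The layout after block placement from this entry.
blocks = {
    "bb0": ["...", "beq a0, 0, bb1"],
    "bb2": ["..."],
    "bb4": ["mv a0, -1", "ret"],
    "bb1": ["mv a0, 1", "ret"],
    "bb3": ["mv a0, 1", "ret"],
}
# Assumed edges: bb2's real successors are not shown in the commit message.
successors = {"bb0": ["bb1", "bb2"], "bb2": ["bb3", "bb4"]}
merged = merge_identical_tails(blocks, successors)
```

In this model bb3 maps onto bb1, while bb4 stays on its own because its body differs.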
* Remove ';' after namespace's closing bracket [NFC]David Stenberg2019-06-131-1/+1
| | | | llvm-svn: 363267
* [FIX] Forces shrink wrapping to consider any memory access as aliasing with ↵Diogo N. Sampaio2019-06-131-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | the stack Summary: Related bug: https://bugs.llvm.org/show_bug.cgi?id=37472 The shrink wrapping pass prematurely restores the stack, at a point where the stack might still be accessed. Taking an exception can cause the stack to be corrupted. As a first approach, this patch is overly conservative, assuming that any instruction that may load or store could access the stack. Reviewers: dmgreen, qcolombet Reviewed By: qcolombet Subscribers: simpal01, efriedma, eli.friedman, javed.absar, llvm-commits, eugenis, chill, carwil, thegameg Tags: #llvm Differential Revision: https://reviews.llvm.org/D63152 llvm-svn: 363265
* [NFC] Sink a function call into LiveDebugValues::processJeremy Morse2019-06-131-7/+13
| | | | | | | This was requested in D62904, which I successfully missed. This is just a refactor and shouldn't change any behaviour. llvm-svn: 363259
* [CodeGen] Add getMachineMemOperand + MachineMemOperand::Flags allocator ↵Simon Pilgrim2019-06-131-0/+9
| | | | | | | | helper wrapper. NFCI. Pre-commit for D62726 on behalf of @luke (Luke Lau) llvm-svn: 363257
* [DebugInfo] Honour variable fragments in LiveDebugValuesJeremy Morse2019-06-131-39/+227
| | | | This patch makes the LiveDebugValues pass consider fragments when propagating DBG_VALUE insts between blocks, fixing PR41979. Fragment info for a variable location is added to the open-ranges key, which allows distinct fragments to be tracked separately. To handle overlapping fragments things become slightly funkier. To avoid excessive searching for overlaps in the data-flow part of LiveDebugValues, this patch:
  * Pre-computes pairings of fragments that overlap, for each DILocalVariable.
  * During data-flow, whenever something happens that causes an open range to be terminated (via erase), also terminates any fragments pre-determined to overlap.
  The effect is that when encountering a DBG_VALUE fragment that overlaps others, the overlapped fragments do not get propagated to other blocks. We still rely on later location-list building to correctly handle overlapping fragments within blocks. It's unclear whether a mixture of DBG_VALUEs with and without fragmented expressions is legitimate. To avoid surprises, this patch interprets a DBG_VALUE with no fragment as overlapping any DBG_VALUE _with_ a fragment. Differential Revision: https://reviews.llvm.org/D62904 llvm-svn: 363256
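The two steps in this entry, pre-computing overlap pairings and then terminating overlapping fragments together, can be sketched as follows. This is an illustrative model with invented names, not the LiveDebugValues data structures: a fragment is an (offset_bits, size_bits) pair, and `None` stands for a DBG_VALUE with no fragment.

```python
from itertools import combinations

def overlaps(a, b):
    """Fragments are (offset_bits, size_bits); None means 'no fragment'."""
    if a is None or b is None:
        return True                       # unfragmented value covers everything
    return a[0] < b[0] + b[1] and b[0] < a[0] + a[1]

def overlap_pairs(fragments):
    """Pre-compute, per fragment, the other fragments it overlaps."""
    table = {f: set() for f in fragments}
    for a, b in combinations(fragments, 2):
        if overlaps(a, b):
            table[a].add(b)
            table[b].add(a)
    return table

def terminate(open_ranges, frag, table):
    """Erase `frag` and everything pre-determined to overlap it."""
    return open_ranges - {frag} - table.get(frag, set())
```

Computing the table once per variable means the data-flow loop only does a set lookup when an open range is erased, rather than searching for overlaps each time.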
* [DebugInfo] Move Value struct out of DebugLocEntry as DbgValueLoc (NFC)Nikola Prica2019-06-133-106/+102
| | | | | | | | | | | Since DebugLocEntry::Value is used as part of both DwarfDebug and DebugLocEntry, make it a separate class. Reviewers: aprantl, dstenb Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D63213 llvm-svn: 363246
* [DebugInfo] Use FrameDestroy to extend stack locations to end-of-functionJeremy Morse2019-06-131-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | We aim to ignore changes in variable locations during the prologue and epilogue of functions, to avoid using space documenting location changes that aren't visible. However in D61940 / r362951 this got ripped out as the previous implementation was unsound. Instead, use the FrameDestroy flag to identify when we're in the epilogue of a function, and ignore variable location changes accordingly. This fits in with existing code that examines the FrameSetup flag. Some variable locations get shuffled in modified tests as they now cover greater ranges, which is what would be expected. Some additional single-location variables are generated too. Two tests are un-xfailed, they were only xfailed due to r362951 deleting functionality they depended on. Apparently some out-of-tree backends don't accurately maintain FrameDestroy flags -- if you're an out-of-tree maintainer and see changes in variable locations disappear due to a faulty FrameDestroy flag, it's safe to back this change out. The impact is just slightly more debug info than necessary. Differential Revision: https://reviews.llvm.org/D62314 llvm-svn: 363245
* [TargetLowering] Add MachineMemOperand::Flags to allowsMemoryAccess tests ↵Simon Pilgrim2019-06-123-4/+7
| | | | | | | | | | | | | | (PR42123) As discussed on D62910, we need to check whether particular types of memory access are allowed, not just their alignment/address-space. This NFC patch adds a MachineMemOperand::Flags argument to allowsMemoryAccess and allowsMisalignedMemoryAccesses, and wires up calls to pass the relevant flags to them. If people are happy with this approach I can then update X86TargetLowering::allowsMisalignedMemoryAccesses to handle misaligned NT load/stores. Differential Revision: https://reviews.llvm.org/D63075 llvm-svn: 363179
* StackProtector: Use PointerMayBeCapturedMatt Arsenault2019-06-121-35/+4
| | | | | | | | | | | | | | | | This was using its own, outdated list of possible captures. This was at minimum not catching cmpxchg and addrspacecast captures. One change is now any volatile access is treated as capturing. The test coverage for this pass is quite inadequate, but this required removing volatile in the lifetime capture test. Also fixes some infrastructure issues to allow running just the IR pass. Fixes bug 42238. llvm-svn: 363169