summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* Fixing @llvm.memcpy not honoring volatile.Guillaume Chatelet2019-07-091-19/+24
| | | | | | | | | | | | | | | | This is explicitly not addressing target-specific code, or calls to memcpy. Summary: https://bugs.llvm.org/show_bug.cgi?id=42254 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63215 llvm-svn: 365449
* Revert r364515 and r364524Jeremy Morse2019-07-091-85/+2
| | | | | | | Jordan reports on llvm-commits a performance regression with r364515, backing the patch out while it's investigated. llvm-svn: 365448
* Reland "[LiveDebugValues] Emit the debug entry values"Djordje Todorovic2019-07-093-18/+127
| | | | | | | | | | | | | | | | Emit replacements for clobbered parameters location if the parameter has unmodified value throughout the funciton. This is basic scenario where we can use the debug entry values. ([12/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D58042 llvm-svn: 365444
* [MachinePipeliner] Fix Phi refers to Phi in same stage in 1st epilogueJinsong Ji2019-07-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This is exposed by functional testing on PowerPC. In some pipelined loops, Phi refer to phi did not get value defined by the Phi, hence getting wrong value later. As the comment mentioned, we should "use the value defined by the Phi, unless we're generating the firstepilog and the Phi refers to a Phi in a different stage.", so Phi refering to same stage Phi should use the value defined by the Phi here. Reviewers: bcahoon, hfinkel Reviewed By: hfinkel Subscribers: MaskRay, wuzish, nemanjai, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64035 llvm-svn: 365428
* Changing CodeView debug info type record representation in assembly files to ↵Nilanjana Basu2019-07-091-13/+52
| | | | | | make it more human-readable & editable & fixing bug introduced in r364987 llvm-svn: 365417
* Standardize on MSVC behavior for triples with no environmentReid Kleckner2019-07-083-7/+5
| | | | | | | | | | | | | | | | | | | | | Summary: This makes it so that IR files using triples without an environment work out of the box, without normalizing them. Typically, the MSVC behavior is more desirable. For example, it tends to enable things like constant merging, use of associative comdats, etc. Addresses PR42491 Reviewers: compnerd Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64109 llvm-svn: 365387
* RegUsageInfoCollector: Don't iterate all regs for every reg classMatt Arsenault2019-07-081-31/+6
| | | | | | | | | | This is extremly slow on AMDGPU, which has a lot of physical register and a lot of register classes. determineCalleeSaves, via MachineRegisterInfo::isPhysRegUsed already added all of the super registers to the saved set. llvm-svn: 365370
* GlobalISel: Convert some build functions to using SrcOp/DstOpMatt Arsenault2019-07-081-43/+54
| | | | llvm-svn: 365343
* GlobalISel: widenScalar for G_BUILD_VECTORMatt Arsenault2019-07-081-0/+19
| | | | llvm-svn: 365320
* [TargetLowering] SimplifyDemandedBits - just call computeKnownBits for ↵Simon Pilgrim2019-07-081-23/+3
| | | | | | | | | | BUILD_VECTOR cases. Don't do this locally, computeKnownBits does this better (and can handle non-constant cases as well). A next step would be to actually simplify non-constant elements - building on what we already do in SimplifyDemandedVectorElts. llvm-svn: 365309
* [CodeGen] Add larger vector types for i32 and f32David Majnemer2019-07-071-1/+25
| | | | | | | | | | Some out of tree backend require larger vector type. Since maintaining the changes out of tree is difficult due to the many manual changes needed when adding a new type we are adding it even if no backend currently use it. Differential Revision: https://reviews.llvm.org/D64141 Patch by Thomas Raoux! llvm-svn: 365274
* [DAGCombine] convertBuildVecZextToZext - remove duplicate getOpcode() call. ↵Simon Pilgrim2019-07-061-1/+1
| | | | | | NFCI. llvm-svn: 365269
* [RegisterCoalescer] Fix an overzealous assertQuentin Colombet2019-07-061-1/+8
| | | | | | | | | | Although removeCopyByCommutingDef deals with full copies, it is still possible to copy undef lanes and thus, we wouldn't have any a value number for these lanes. This fixes PR40215. llvm-svn: 365256
* RegUsageInfoCollector: Skip AMDGPU entry point functionsMatt Arsenault2019-07-051-6/+42
| | | | | | | | | | | | | | I'm not sure if it's worth it or not to add a hook to disable the pass for an arbitrary function. This pass is taking up to 5% of compile time in tiny programs by iterating through all of the physical registers in every register class. This pass should be rewritten in terms of regunits. For now, skip doing anything for entry point functions. The vast majority of functions in the real world aren't callable, so just not running this will give the majority of the benefit. llvm-svn: 365255
* [CodeGen] Enhance `MachineInstrSpan` to allow the end of MBB to be used.Michael Liao2019-07-051-3/+3
| | | | | | | | | | | | | | | | Summary: - Explicitly specify the parent MBB to allow the end iterator to be used. Reviewers: aprantl, MatzeB, craig.topper, qcolombet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64261 llvm-svn: 365240
* ScheduleDAG: Fix incorrectly killing registers in bundlesMatt Arsenault2019-07-051-6/+5
| | | | | | | | | | When looking for uses/defs to add kill flags, the iterator was double incremented, skipping the first instruction in the bundle. The use register in the first bundle instruction was then incorrectly killed. The "First" instruction should be the BUNDLE itself as the proper reverse iterator endpoint. llvm-svn: 365216
* Revert r365198 as this accidentally commited something thatRobert Lougher2019-07-051-7/+2
| | | | | | should not have been added. llvm-svn: 365199
* This reverts r365061 and r365062 (test update)Robert Lougher2019-07-051-2/+7
| | | | | | | | Revision r365061 changed a skip of debug instructions for a skip of meta instructions. This is not safe, as IMPLICIT_DEF is classed as a meta instruction. llvm-svn: 365198
* [DAGCombiner] Don't combine (addcarry (uaddo X, Y), 0, Carry) -> (addcarry ↵Craig Topper2019-07-041-1/+4
| | | | | | | | | | | | | | | | | | | | | X, Y, Carry) if the Carry comes from the uaddo. Summary: The uaddo won't be removed and the addcarry will still be dependent on the uaddo. So we'll just increase the use count of X and Y and potentially require a COPY. Reviewers: spatel, RKSimon, deadalnix Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64190 llvm-svn: 365149
* GlobalISel: Fix widenScalar for pointer typed G_MERGE_VALUESMatt Arsenault2019-07-031-1/+1
| | | | llvm-svn: 365093
* [CodeGen] Make branch funnels pass the machine verifierFrancis Visoiu Mistrih2019-07-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We previously marked all the tests with branch funnels as `-verify-machineinstrs=0`. This is an attempt to fix it. 1) `ICALL_BRANCH_FUNNEL` has no defs. Mark it as `let OutOperandList = (outs)` 2) After that we hit an assert: ``` Assertion failed: (Op.getValueType() != MVT::Other && Op.getValueType() != MVT::Glue && "Chain and glue operands should occur at end of operand list!"), function AddOperand, file /Users/francisvm/llvm/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp, line 461. ``` The chain operand was added at the beginning of the operand list. Move that to the end. 3) After that we hit another verifier issue in the pseudo expansion where the registers used in the cmps and jmps are not added to the livein lists. Add the `EFLAGS` to all the new MBBs that we create. PR39436 Differential Review: https://reviews.llvm.org/D54155 llvm-svn: 365058
* Use getAllOnesConstants instead of -1 in DAGCombiner. NFCAmaury Sechet2019-07-031-1/+1
| | | | llvm-svn: 365054
* [DAGCombine] More diamong carry pattern optimization.Amaury Sechet2019-07-031-27/+92
| | | | | | | | | | | | | | | Summary: This diff improve the capability of DAGCOmbine to generate linear carries propagation in presence of a diamond pattern. It is now able to match a large variety of different patterns rather than some hardcoded one. Arguably, the codegen in test cases is not better, but this is to be expected. The goal of this transformation is more about canonicalisation than actual optimisation. Reviewers: hfinkel, RKSimon, craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D57302 llvm-svn: 365051
* [SelectionDAG] Propagate alias metadata to target intrinsic nodesJames Molloy2019-07-032-6/+8
| | | | | | | | When a target intrinsic has been determined to touch memory, we construct a MachineMemOperand during SDAG construction. In this case, we should propagate AAMDNodes metadata to the MachineMemOperand where available. Differential revision: https://reviews.llvm.org/D64131 llvm-svn: 365043
* [ARM] Thumb2: favor R4-R7 over R12/LR in allocation order when opt for minsizeOliver Stannard2019-07-031-1/+3
| | | | | | | | | | | | | | | | | | | | For Thumb2, we prefer low regs (costPerUse = 0) to allow narrow encoding. However, current allocation order is like: R0-R3, R12, LR, R4-R11 As a result, a lot of instructs that use R12/LR will be wide instrs. This patch changes the allocation order to: R0-R7, R12, LR, R8-R11 for thumb2 and -Osize. In most cases, there is no extra push/pop instrs as they will be folded into existing ones. There might be slight performance impact due to more stack usage, so we only enable it when opt for min size. https://reviews.llvm.org/D30324 llvm-svn: 365014
* [Codegen][X86][AArch64][ARM][PowerPC] Inc-of-add vs sub-of-not (PR42457)Roman Lebedev2019-07-031-0/+31
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is the backend part of [[ https://bugs.llvm.org/show_bug.cgi?id=42457 | PR42457 ]]. In middle-end, we'd want to prefer the form with two adds - D63992, but as this diff shows, not every target will prefer that pattern. Out of 4 targets for which i added tests all seem to be ok with inc-of-add for scalars, but only X86 prefer that same pattern for vectors. Here i'm adding a new TLI hook, always defaulting to the inc-of-add, but adding AArch64,ARM,PowerPC overrides to prefer inc-of-add only for scalars. Reviewers: spatel, RKSimon, efriedma, t.p.northover, hfinkel Reviewed By: efriedma Subscribers: nemanjai, javed.absar, kristof.beyls, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64090 llvm-svn: 365010
* Revert Changing CodeView debug info type record representation in assembly ↵Nilanjana Basu2019-07-031-41/+8
| | | | | | | | files to make it more human-readable & editable This reverts r364982 (git commit 2082bf28ebea76cc187b508f801122866420d9ff) llvm-svn: 364987
* Changing CodeView debug info type record representation in assembly files to ↵Nilanjana Basu2019-07-031-8/+41
| | | | | | make it more human-readable & editable llvm-svn: 364982
* [RA] Fix spelling of Greedy register allocator internal optionTeresa Johnson2019-07-021-1/+1
| | | | | | | | The internal option added with r323870 has a typo. It isn't being used by any tests, but I decided to fix the spelling and leave it in for use in debugging the changes added in that patch. llvm-svn: 364958
* GlobalISel: Add G_FENCEMatt Arsenault2019-07-022-0/+15
| | | | | | | The pattern importer is for some reason emitting checks for G_CONSTANT for the immediate operands. llvm-svn: 364926
* [NFC][TargetLowering] Some preparatory cleanups around 'prepareUREMEqFold()' ↵Roman Lebedev2019-07-021-17/+18
| | | | | | from D63963 llvm-svn: 364921
* [TailDuplicator] Fix copy instruction emitting into the wrong block.Amara Emerson2019-07-021-1/+1
| | | | | | | | | | | | | The code for duplicating instructions could sometimes try to emit copies intended to deal with unconstrainable register classes to the tail block of the original instruction, rather than before the newly cloned instruction in the predecessor block. This was exposed by GlobalISel on arm64. Differential Revision: https://reviews.llvm.org/D64049 llvm-svn: 364888
* [DAGCombiner] Exploiting more about the transformation of ↵Zi Xuan Wu2019-07-021-4/+2
| | | | | | | | | | | | | | | | TransformFPLoadStorePair function For a given floating point load / store pair, if the load value isn't used by any other operations, then consider transforming the pair to integer load / store operations if the target deems the transformation profitable. And we can exploiting much more when there are other operation nodes with chain operand between the load/store pair so long as we keep the chain ordering original. We only replace the register used to load/store from float to integer. I only add testcase in ARM because the TLI.isDesirableToTransformToIntegerOp hook is only enabled in ARM target. Differential Revision: https://reviews.llvm.org/D60601 llvm-svn: 364883
* Testing commit access through minor formatting changeNilanjana Basu2019-07-011-2/+3
| | | | llvm-svn: 364843
* GlobalISel: Try to widen merges with other mergesMatt Arsenault2019-07-011-2/+28
| | | | | | | | If the requested source type an be used as a merge source type, create a merge of merges. This avoids creating large, illegal extensions and bit-ops directly to the result type. llvm-svn: 364841
* GlobalISel: Verify G_MERGE_VALUES operand sizesMatt Arsenault2019-07-011-0/+10
| | | | llvm-svn: 364822
* [GlobalISel]: Allow backends to custom legalize IntrinsicsAditya Nandakumar2019-07-012-0/+10
| | | | | | | | | https://reviews.llvm.org/D31359 Add a hook "legalizeInstrinsic" to allow backends to override this and custom lower/legalize intrinsics. llvm-svn: 364821
* GlobalISel: Implement lower for min/maxMatt Arsenault2019-07-011-0/+36
| | | | llvm-svn: 364816
* Fixup r364512Diana Picus2019-07-011-10/+12
| | | | | | | | | | Fix stack-use-after-scope errors from r364512. One instance was already fixed in r364611 - this patch simplifies that fix and addresses one more instance of similar code. Discussed in: https://reviews.llvm.org/D63905 llvm-svn: 364778
* [SelectionDAG] Do minnum->minimum at legalization time instead of building timeBenjamin Kramer2019-07-012-16/+17
| | | | | | | | The SDAGBuilder behavior stems from the days when we didn't have fast math flags available in SDAG. We do now and doing the transformation in the legalizer has the advantage that it also works for vector types. llvm-svn: 364743
* [DebugInfo] Avoid adding too much indirection to pointer-valued variablesJeremy Morse2019-07-011-2/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch addresses PR41675, where a stack-pointer variable is dereferenced too many times by its location expression, presenting a value on the stack as the pointer to the stack. The difference between a stack *pointer* DBG_VALUE and one that refers to a value on the stack, is currently the indirect flag. However the DWARF backend will also try to guess whether something is a memory location or not, based on whether there is any computation in the location expression. By simply prepending the stack offset to existing expressions, we can accidentally convert a register location into a memory location, which introduces a suprise (and unintended) dereference. The solution is to add DW_OP_stack_value whenever we add a DIExpression computation to a stack *pointer*. It's an implicit location computed on the expression stack, thus needs to be flagged as a stack_value. For the edge case where the offset is zero and the location could be a register location, DIExpression::prepend will still generate opcodes, and thus DW_OP_stack_value must still be added. Differential Revision: https://reviews.llvm.org/D63429 llvm-svn: 364736
* [ARM] WLS/LE Code GenerationSam Parker2019-07-011-0/+1
| | | | | | | | | | | | | | | | | Backend changes to enable WLS/LE low-overhead loops for armv8.1-m: 1) Use TTI to communicate to the HardwareLoop pass that we should try to generate intrinsics that guard the loop entry, as well as setting the loop trip count. 2) Lower the BRCOND that uses said intrinsic to an Arm specific node: ARMWLS. 3) ISelDAGToDAG the node to a new pseudo instruction: t2WhileLoopStart. 4) Add support in ArmLowOverheadLoops to handle the new pseudo instruction. Differential Revision: https://reviews.llvm.org/D63816 llvm-svn: 364733
* Cleanup: llvm::bsearch -> llvm::partition_point after r364719Fangrui Song2019-06-302-7/+6
| | | | llvm-svn: 364720
* [SelectionDAG] Use the memory VT instead of result VT for FoldingSet ↵Craig Topper2019-06-301-3/+2
| | | | | | | | | profiling in getMaskedLoad/getMaskedStore. This matches what is done by the Profile function. Otherwise CSE won't work properly. llvm-svn: 364717
* [HardwareLoops] Loop counter guard intrinsicSam Parker2019-06-281-16/+105
| | | | | | | | | Introduce llvm.test.set.loop.iterations which sets the loop counter and also produces an i1 after testing that the count is not zero. Differential Revision: https://reviews.llvm.org/D63809 llvm-svn: 364628
* GlobalISel: Use RegisterMatt Arsenault2019-06-283-39/+39
| | | | llvm-svn: 364618
* GlobalISel: Convert rest of MachineIRBuilder to using RegisterMatt Arsenault2019-06-281-50/+50
| | | | llvm-svn: 364615
* [GlobalISel][IRTranslator] Fix some PHI bugs related to jump tables when ↵Amara Emerson2019-06-271-14/+26
| | | | | | | | | | | | optimizations are used. The new switch lowering code that tries to generate jump tables and range checks were tested at -O0 on arm64, but on -O3 the generic switch lowering code goes to town on trying to generate optimized lowerings, e.g. multiple jump tables, range checks etc. This exposed bugs in the way PHI nodes are handled because the CFG looks even stranger after all of this is done. llvm-svn: 364613
* Fix ASAN error caused by commit r364512.Rumeet Dhindsa2019-06-271-4/+6
| | | | | | | | | This patch intends to fix ASAN stack-use-after-scope error. This is at least a short-term fix to unbreak LLVM's mainline. Differential Revision: https://reviews.llvm.org/D63905 llvm-svn: 364611
* [CodeGen] [SelectionDAG] More efficient code for X % C == 0 (UREM case) (try 3)Roman Lebedev2019-06-271-0/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: I'm submitting a new revision since i don't understand how to reclaim/reopen/take over the existing one, D50222. There is no such action in "Add Action" menu... This implements an optimization described in Hacker's Delight 10-17: when `C` is constant, the result of `X % C == 0` can be computed more cheaply without actually calculating the remainder. The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479. This is a recommit, the original commit rL364563 was reverted in rL364568 because test-suite detected miscompile - the new comparison constant 'Q' was being computed incorrectly (we divided by `D0` instead of `D`). Original patch D50222 by @hermord (Dmytro Shynkevych) Notes: - In principle, it's possible to also handle the `X % C1 == C2` case, as discussed on bugzilla. This seems to require an extra branch on overflow, so I refrained from implementing this for now. - An explicit check for when the `REM` can be reduced to just its LHS is included: the `X % C` == 0 optimization breaks `test1` in `test/CodeGen/X86/jump_sign.ll` otherwise. I hadn't managed to find a better way to not generate worse output in this case. - The `test/CodeGen/X86/jump_sign.ll` regresses, and is being fixed by a followup patch D63390. Reviewers: RKSimon, craig.topper, spatel, hermord, xbolva00 Reviewed By: RKSimon, xbolva00 Subscribers: dexonsmith, kristina, xbolva00, javed.absar, llvm-commits, hermord Tags: #llvm Differential Revision: https://reviews.llvm.org/D63391 llvm-svn: 364600
OpenPOWER on IntegriCloud