summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [IfConversion] Hoist removeBranch calls out of if/else clauses [NFC]Mikael Holmen2017-06-261-4/+9
| | | | | | | | | | | | | | | | | Summary: Also added a comment. Pulled out of https://reviews.llvm.org/D34099. Reviewers: iteratee Reviewed By: iteratee Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34388 llvm-svn: 306279
* This reverts commit r306272.Serguei Katkov2017-06-261-29/+0
| | | | | | | | Revert "[MBP] do not rotate loop if it creates extra branch" It breaks the sanitizer build bots. Need to fix this. llvm-svn: 306276
* [MBP] do not rotate loop if it creates extra branchSerguei Katkov2017-06-261-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a last fix for the corner case of PR32214. Actually this is not really corner case in general. We should not do a loop rotation if we create an additional branch due to it. Consider the case where we have a loop chain H, M, B, C , where H is header with viable fallthrough from pre-header and exit from the loop M - some middle block B - backedge to Header but with exit from the loop also. C - some cold block of the loop. Let's H is determined as a best exit. If we do a loop rotation M, B, C, H we can introduce the extra branch. Let's compute the change in number of branches: +1 branch from pre-header to header -1 branch from header to exit +1 branch from header to middle block if there is such -1 branch from cold bock to header if there is one So if C is not a predecessor of H then we introduce extra branch. This change actually prohibits rotation of the loop if both true 1) Best Exit has next element in chain as successor. 2) Last element in chain is not a predecessor of first element of chain. Reviewers: iteratee, xur Reviewed By: iteratee Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34271 llvm-svn: 306272
* AVX-512: Fixed a crash during legalization of <3 x i8> typeElena Demikhovsky2017-06-251-2/+1
| | | | | | | | | The compiler fails with assertion during legalization of SETCC for <3 x i8> operands. The result is extended to <4 x i8> and then truncated <4 x i1>. It does not happen on AVX2, because the final result of SETCC is <4 x i32>. Differential Revision: https://reviews.llvm.org/D34503 llvm-svn: 306242
* [SelectionDAG] set dereferenceable flag when expanding memcpy/memmoveHiroshi Inoue2017-06-242-8/+43
| | | | | | | | | | When SelectionDAG expands memcpy (or memmove) call into a sequence of load and store instructions, it disregards dereferenceable flag even the source pointer is known to be dereferenceable. This results in an assertion failure if SelectionDAG commonizes a load instruction generated for memcpy with another load instruction for the source pointer. This patch makes SelectionDAG to set the dereferenceable flag for the load instructions properly to avoid the assertion failure. Differential Revision: https://reviews.llvm.org/D34467 llvm-svn: 306209
* GlobalISel: remove G_SEQUENCE instruction.Tim Northover2017-06-232-72/+1
| | | | | | | | It was trying to do too many things. The basic lumping together of values for legalization purposes is now handled by G_MERGE_VALUES. More complex things involving gaps and odd sizes are handled by G_INSERT sequences. llvm-svn: 306120
* GlobalISel: convert buildSequence to use non-deprecated instructions.Tim Northover2017-06-232-10/+26
| | | | | | | | G_SEQUENCE is going away soon so as a first step the MachineIRBuilder needs to be taught how to emulate it with alternatives. We use G_MERGE_VALUES where possible, and a sequence of G_INSERTs if not. llvm-svn: 306119
* Restrict the definition of loop preheader to avoid EH blocksAndrew Kaylor2017-06-221-0/+6
| | | | | | Differential Revision: https://reviews.llvm.org/D34487 llvm-svn: 306070
* [DAG] Add Target Store Merge pass ordering functionNirav Dave2017-06-221-1/+2
| | | | | | | Allow targets to specify if they should merge stores before or after legalization. llvm-svn: 306006
* Mark dump() methods as const. NFCSam Clegg2017-06-214-8/+8
| | | | | | | | | Add const qualifier to any dump() method where adding one was trivial. Differential Revision: https://reviews.llvm.org/D34481 llvm-svn: 305963
* [CGP, memcmp] replace CreateZextOrTrunc with CreateZext because it can never ↵Sanjay Patel2017-06-211-5/+7
| | | | | | trunc llvm-svn: 305936
* [CGP] fix variables to be unsigned in memcmp expansionSanjay Patel2017-06-211-12/+14
| | | | llvm-svn: 305935
* [DAG] Move BaseIndexOffset into separate Libarary. NFC.Nirav Dave2017-06-213-114/+97
| | | | | | | Move BaseIndexOffset analysis out of DAGCombiner for use in other files. llvm-svn: 305921
* [DAG] Remove Node csonstruction from BaseIndexOffset match. NFCI.Nirav Dave2017-06-211-52/+69
| | | | | | | | Move GlobalAddress Offset decomposition from initial match into comparision check and removing the possibility of constructing a new offseted global address when examining addresses. llvm-svn: 305917
* Use range-loop in machine-scheduler. NFCI.Javed Absar2017-06-211-94/+72
| | | | | | | | | | | | Converts to range-loop usage in machine scheduler. This makes the code neater and easier to read, and also keeps pace of the machine scheduler implementation with C++11 features. Reviewed by: Matthias Braun Differential Revision: https://reviews.llvm.org/D34320 llvm-svn: 305887
* [DAGCombiner] Add another combine from build vector to shuffleGuy Blank2017-06-211-0/+5
| | | | | | | Add support for combining a build vector to a shuffle. When the build vector is of extracted elements from 2 vectors (vec1, vec2) where vec2 is 2 times smaller than vec1. llvm-svn: 305883
* [XRay] Reduce synthetic references emitted by XRayDean Michael Berris2017-06-211-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When we're building with XRay instrumentation, we use a trick that preserves references from the function to a function sled index. This index table lives in a separate section, and without this trick the linker is free to garbage-collect this section and all the segments it refers to. Until we're able to tell the linkers to preserve these sections, we use this reference trick to keep around both the index and the entries in the instrumentation map. Before this change we emitted both a synthetic reference to the label in the instrumentation map, and to the entry in the function map index. This change removes the first synthetic reference and only emits one synthetic reference to the index -- the index entry has the references to the labels in the instrumentation map, so the linker will still preserve those if the function itself is preserved. This reduces the amount of synthetic references we emit from 16 bytes to just 8 bytes in x86_64, and similarly to other platforms. Reviewers: dblaikie Subscribers: javed.absar, kpw, pelikan, llvm-commits Differential Revision: https://reviews.llvm.org/D34340 llvm-svn: 305880
* [ImplicitNullChecks] Uphold an invariant in areMemoryOpsAliasedSerguei Katkov2017-06-211-24/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Right now areMemoryOpsAliased has an assertion justified as: MMO1 should have a value due it comes from operation we'd like to use as implicit null check. assert(MMO1->getValue() && "MMO1 should have a Value!"); However, it is possible for that invariant to not be upheld in the following situation (conceptually): Null check %RAX NotNullSucc: %RAX = LEA %RSP, 16 // I0 %RDX = MOV64rm %RAX // I1 With the current code, we will have an early exit from ImplicitNullChecks::isSuitableMemoryOp on I0 with SR_Unsuitable. However, I1 will look plausible (since it loads from %RAX) and will go ahead and call areMemoryOpsAliased(I1, I0). This will cause us to fail the assert mentioned above since I1 does not load from an IR level value and thus is allowed to have a non-Value base address. The fix is to bail out earlier whenever we see an unsuitable instruction overwrite PointerReg. This would guarantee that when we call areMemoryOpsAliased, we're guaranteed to be looking at an instruction that loads from or stores to an IR level value. Original Patch Author: sanjoy Reviewers: sanjoy, mkazantsev, reames Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34385 llvm-svn: 305879
* Fix a crash in DwarfDebug::validThroughout.Adrian Prantl2017-06-201-3/+5
| | | | | | | | | | | The instruction it falls over on is an IMPLICT_DEF that also happens to be the only instruction in its lexical scope. That LexicalScope has never been created because its range is empty. This patch skips over all meta-instructions instead of just DBG_VALUEs. Thanks to David Blaikie for providing a testcase! llvm-svn: 305853
* [GISel]: Add G_FMA opcode for fused multiply addsAditya Nandakumar2017-06-201-0/+7
| | | | | | | | https://reviews.llvm.org/D34372 Reviewed by dsanders llvm-svn: 305824
* RegisterScavenging: Followup to r305625Matthias Braun2017-06-201-41/+38
| | | | | | | | | | | | | This does some improvements/cleanup to the recently introduced scavengeRegisterBackwards() functionality: - Rewrite findSurvivorBackwards algorithm to use the existing LiveRegUnit::accumulateBackward() code. This also avoids the Available and Candidates bitset and just need 1 LiveRegUnit instance (= 1 bitset). - Pick registers in allocation order instead of register number order. llvm-svn: 305817
* DAG: correctly legalize UMULO.Tim Northover2017-06-201-11/+18
| | | | | | | | | We were incorrectly sign extending into the high word (as you would for SMULO) when legalizing UMULO in terms of a wider full multiplication. Patch by James Duley. llvm-svn: 305800
* [globalisel][tablegen] Add support for COPY_TO_REGCLASS.Daniel Sanders2017-06-202-10/+30
| | | | | | | | | | | | | | | | | | | | | | Summary: As part of this * Emitted instructions now have named MachineInstr variables associated with them. This isn't particularly important yet but it's a small step towards multiple-insn emission. * constrainSelectedInstRegOperands() is no longer hardcoded. It's now added as the ConstrainOperandsToDefinitionAction() action. COPY_TO_REGCLASS uses an alternate constraint mechanism ConstrainOperandToRegClassAction() which supports arbitrary constraints such as that defined by COPY_TO_REGCLASS. Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls, aditya_nandakumar Reviewed By: ab Subscribers: javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D33590 llvm-svn: 305791
* [SelectionDAG] Fix an use-after-free issue introduced in r305775.Haojian Wu2017-06-201-2/+2
| | | | | | vector.back() will be invalidated when memory reallocation happens. llvm-svn: 305785
* [GlobalISel] combine not symmetric merge/unmerge nodes.Igor Breger2017-06-201-13/+58
| | | | | | | | | | | | | | | | Summary: In some cases legalization ends up with not symmetric merge/unmerge nodes. Transform it to merge/unmerge nodes. Reviewers: t.p.northover, qcolombet, zvi Reviewed By: t.p.northover Subscribers: rovka, kristof.beyls, guyblank, llvm-commits Differential Revision: https://reviews.llvm.org/D33626 llvm-svn: 305783
* [SelectionDAG] Get rid of recursion in CalcNodeSethiUllmanNumberMax Kazantsev2017-06-201-19/+59
| | | | | | | | | | The recursive implementation of CalcNodeSethiUllmanNumber may overflow stack on extremely long pred chains. This patch replaces it with an equivalent iterative implementation. Differential Revision: https://reviews.llvm.org/D33769 llvm-svn: 305775
* [DAG] Simplify BaseIndexOffset. NFCI.Nirav Dave2017-06-201-59/+57
| | | | | | Remove tail calls and cleanup codeflow. llvm-svn: 305768
* [Target] Fix some Clang-tidy modernize-use-using and Include What You Use ↵Eugene Zelenko2017-06-192-11/+23
| | | | | | warnings; other minor fixes (NFC). llvm-svn: 305757
* Fix typosMatt Arsenault2017-06-191-2/+2
| | | | llvm-svn: 305749
* [CGP, PowerPC] try to constant fold before creating loads for memcmp expansionSanjay Patel2017-06-191-3/+13
| | | | | | | | | | | | | | | This is the last step needed to avoid regressions for x86 before we flip the switch to allow expansion of the smallest set of memcpy() via CGP. The DAG version checks for constant strings, so we need to do that here too. FWIW, the 2 constant test is not handled by LibCallSimplifier::optimizeMemCmp() because that code is limited to 8-bit constant arrays. LibCallSimplifier will also fail to optimize some 1 constant tests because its alignment requirements are too strict (shouldn't require alignment for a constant operand). Differential Revision: https://reviews.llvm.org/D34071 llvm-svn: 305734
* Allow truncated and extend memory operations in Store Merge. NFCI.Nirav Dave2017-06-191-24/+46
| | | | | | | | | | As all store merges checks are based on the memory operation performed, allow use of truncated stores and extended loads as valid input candidates for merging. Relanding after fixing selection between truncated and normal store. llvm-svn: 305701
* Recommit rL305677: [CodeGen] Add generic MacroFusion passFlorian Hahn2017-06-192-0/+151
| | | | | | | | | | | | | Use llvm::make_unique to avoid ambiguity with MSVC. This patch adds a generic MacroFusion pass, that is used on X86 and AArch64, which both define target-specific shouldScheduleAdjacent functions. This generic pass should make it easier for other targets to implement macro fusion and I intend to add macro fusion for ARM shortly. Differential Revision: https://reviews.llvm.org/D34144 llvm-svn: 305690
* Revert r305677 [CodeGen] Add generic MacroFusion pass.Florian Hahn2017-06-192-151/+0
| | | | | | This causes Windows buildbot failures do an ambiguous call. llvm-svn: 305681
* [CodeGen] Add generic MacroFusion pass.Florian Hahn2017-06-192-0/+151
| | | | | | | | | | | | | | | | | | Summary: This patch adds a generic MacroFusion pass, that is used on X86 and AArch64, which both define target-specific shouldScheduleAdjacent functions. This generic pass should make it easier for other targets to implement macro fusion and I intend to add macro fusion for ARM shortly. Reviewers: craig.topper, evandro, t.p.northover, atrick, MatzeB Reviewed By: MatzeB Subscribers: atrick, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34144 llvm-svn: 305677
* Fixed the warning introduced by r305625 to make ubuntu-gcc7.1-werror bot green.Galina Kistanova2017-06-171-1/+1
| | | | llvm-svn: 305640
* RegScavenging: Add scavengeRegisterBackwards()Matthias Braun2017-06-171-116/+317
| | | | | | | | | | | | | | | | | | | Re-apply r276044/r279124/r305516. Fixed a problem where we would refuse to place spills as the very first instruciton of a basic block and thus artifically increase pressure (test in test/CodeGen/PowerPC/scavenging.mir:spill_at_begin) This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 llvm-svn: 305625
* [SelectionDAG] Update Loop info after splitting critical edges.Davide Italiano2017-06-171-6/+9
| | | | | | The analysis is expected to be preserved by SelectionDAG. llvm-svn: 305621
* [WebAssembly] Use __stack_pointer global when writing wasm binarySam Clegg2017-06-161-1/+0
| | | | | | | | | | | | | | | | | | This ensures that symbolic relocations are generated for stack pointer manipulations. These relocations are of type R_WEBASSEMBLY_GLOBAL_INDEX_LEB. This change also adds support for reading relocations of this type in WasmObjectFile.cpp. Since its a globally imported symbol this does mean that the get_global/set_global instruction won't be valid until the objects are linked that global used in no longer an imported global. Differential Revision: https://reviews.llvm.org/D34172 llvm-svn: 305616
* [SelectionDAG] Use APInt::isSubsetOf. NFCCraig Topper2017-06-162-4/+4
| | | | llvm-svn: 305606
* [SelectionDAG] Use APInt::isNullValue/isOneValue. NFCCraig Topper2017-06-162-5/+5
| | | | llvm-svn: 305605
* [TargetLowering] Use ConstantSDNode::isOne and getSExtValue instead of ↵Craig Topper2017-06-161-6/+6
| | | | | | getting the underlying APInt first. NFC llvm-svn: 305604
* Improve the accuracy of variable ranges .debug_loc location lists.Adrian Prantl2017-06-161-12/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For the following motivating example bool c(); void f(); bool start() { bool result = c(); if (!c()) { result = false; goto exit; } f(); result = true; exit: return result; } we would previously generate a single DW_AT_const_value(1) because only the DBG_VALUE in the second-to-last basic block survived codegen. This patch improves the heuristic used to determine when a DBG_VALUE is available at the beginning of its variable's enclosing lexical scope: - Stop giving singular constants blanket permission to take over the entire scope. There is still a special case for constants in the function prologue that we also miight want to retire later. - Use the lexical scope information to determine available-at-entry instead of proximity to the function prologue. After this patch we generate a location list with a more accurate narrower availability for the constant true value. As a pleasant side effect, we also generate inline locations instead of location lists where a loacation covers the entire range of the enclosing lexical scope. Measured on compiling llc with four targets this doesn't have an effect on compile time and reduces the size of the debug info for llc by ~600K. rdar://problem/30286912 llvm-svn: 305599
* Revert "RegScavenging: Add scavengeRegisterBackwards()"Matthias Braun2017-06-161-315/+116
| | | | | | | | | Revert because of reports of some PPC input starting to spill when it was predicted that it wouldn't and no spillslot was reserved. This reverts commit r305516. llvm-svn: 305566
* [Atomics] Rename and change prototype for atomic memcpy intrinsicDaniel Neilson2017-06-162-26/+26
| | | | | | | | | | | | | | | | | | Summary: Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html This change is to alter the prototype for the atomic memcpy intrinsic. The prototype itself is being changed to more closely resemble the semantics and parameters of the llvm.memcpy intrinsic -- to ease later combination of the llvm.memcpy and atomic memcpy intrinsics. Furthermore, the name of the atomic memcpy intrinsic is being changed to make it clear that it is not a generic atomic memcpy, but specifically a memcpy is unordered atomic. Reviewers: reames, sanjoy, efriedma Reviewed By: reames Subscribers: mzolotukhin, anna, llvm-commits, skatkov Differential Revision: https://reviews.llvm.org/D33240 llvm-svn: 305558
* [MachineBlockPlacement] trivial fix in comments, NFCHiroshi Inoue2017-06-161-5/+5
| | | | | | | | - Topologocal is abbreviated as "topo" in comments, but "top" is used in only one comment. Modify it for consistency. - Capitalize "succ" and "pred" for consistency in one figure. - Other trivial fixes. llvm-svn: 305552
* Revert "[DAG] Allow truncated and extend memory operations in Store Merge. ↵Ahmed Bougacha2017-06-151-21/+10
| | | | | | | | NFCI." This reverts commit r305468, as it caused PR33475. llvm-svn: 305527
* RegScavenging: Add scavengeRegisterBackwards()Matthias Braun2017-06-151-116/+315
| | | | | | | | | | | | | | | | | | Re-apply r276044/r279124. Trying to reproduce or disprove the ppc64 problems reported in the stage2 build last time, which I cannot reproduce right now. This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 llvm-svn: 305516
* [MachineLICM] Hoist TOC-based address instructionsLei Huang2017-06-151-2/+5
| | | | | | | | | | | | | | | | | | Add condition for MachineLICM to safely hoist instructions that utilize non constant registers that are reserved. On PPC, global variable access is done through the table of contents (TOC) which is always in register X2. The ABI reserves this register in any functions that have calls or access global variables. A call through a function pointer involves saving, changing and restoring this register around the call and thus MachineLICM does not consider it to be invariant. We can however guarantee the register is preserved across the call and thus is invariant. Differential Revision: https://reviews.llvm.org/D33562 llvm-svn: 305490
* Fold variable into assert.Benjamin Kramer2017-06-151-2/+1
| | | | | | Silences an unused variable warning in Release builds. llvm-svn: 305488
* ISel: Fix FastISel of swifterror valuesArnold Schwaighofer2017-06-153-14/+125
| | | | | | | | | | | | The code assumed that we process instructions in basic block order. FastISel processes instructions in reverse basic block order. We need to pre-assign virtual registers before selecting otherwise we get def-use relationships wrong. This only affects code with swifterror registers. rdar://32659327 llvm-svn: 305484
OpenPOWER on IntegriCloud