summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* Revert "PR32288: More efficient encoding for DWARF expr subregister access."Adrian Prantl2017-03-162-35/+3
| | | | | | This reverts commit 2bf453116889a576956892ea9683db4fcd96e30e while investigating buildbot breakage. llvm-svn: 297962
* PR32288: More efficient encoding for DWARF expr subregister access.Adrian Prantl2017-03-162-3/+35
| | | | | | | | | | | | | | | | | | | | | Citing http://bugs.llvm.org/show_bug.cgi?id=32288 The DWARF generated by LLVM includes this location: 0x55 0x93 0x04 DW_OP_reg5 DW_OP_piece(4) When GCC's DWARF is simply 0x55 (DW_OP_reg5) without the DW_OP_piece. I believe it's reasonable to assume the DWARF consumer knows which part of a register logically holds the value (low bytes, high bytes, how many bytes, etc) for a primitive value like an integer. This patch gets rid of the redundant DW_OP_piece when a subregister is at offset 0. It also adds previously missing subregister masking when a subregister is followed by another operation. rdar://problem/31069390 https://reviews.llvm.org/D31010 llvm-svn: 297960
* Fixing typos.Oren Ben Simhon2017-03-161-4/+5
| | | | llvm-svn: 297932
* [SelectionDAG] Optimize VSELECT->SETCC of incompatible or illegal types.Jonas Paulsson2017-03-165-31/+247
| | | | | | | | | | | | | | | | | | | | | | | | Don't scalarize VSELECT->SETCC when operands/results needs to be widened, or when the type of the SETCC operands are different from those of the VSELECT. (VSELECT SETCC) and (VSELECT (AND/OR/XOR (SETCC,SETCC))) are handled. The previous splitting of VSELECT->SETCC in DAGCombiner::visitVSELECT() is no longer needed and has been removed. Updated tests: test/CodeGen/ARM/vuzp.ll test/CodeGen/NVPTX/f16x2-instructions.ll test/CodeGen/X86/2011-10-19-widen_vselect.ll test/CodeGen/X86/2011-10-21-widen-cmp.ll test/CodeGen/X86/psubus.ll test/CodeGen/X86/vselect-pcmp.ll Review: Eli Friedman, Simon Pilgrim https://reviews.llvm.org/D29489 llvm-svn: 297930
* CodeGen: BlockPlacement: Reduce TriangleChainCount to 2Kyle Butt2017-03-161-1/+1
| | | | | | | | | This produces a 1% speedup on an important internal Google benchmark (protocol buffers), with no other regressions in google or in the llvm test-suite. Only 5 targets in the entire llvm test-suite are affected, and on those 5 targets the size increase is 0.027% llvm-svn: 297925
* [StackColoring] Remove unused header file for post-order traversal. Update ↵Craig Topper2017-03-151-4/+2
| | | | | | comment that indicated we were using it when we really use a depth-first search. NFC llvm-svn: 297904
* CodeGenPrepare: Sink addressing modes for atomicsMatt Arsenault2017-03-151-1/+30
| | | | llvm-svn: 297903
* Fix up grammar in a comment.Eric Christopher2017-03-151-1/+1
| | | | llvm-svn: 297898
* [DAGCombine] Bail out if can't create a vector with at least two elementsZvi Rackover2017-03-151-2/+5
| | | | | | | | | | | | | | | | Summary: Fixes pr32278 Reviewers: igorb, craig.topper, RKSimon, spatel, hfinkel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30978 llvm-svn: 297878
* [GlobalISel] Avoid translating synthetic constants to new G_CONSTANTS.Ahmed Bougacha2017-03-151-17/+20
| | | | | | | | | | | | | | | | | | | | | Currently, we create a G_CONSTANT for every "synthetic" integer constant operand (for instance, for the G_GEP offset). Instead, share the G_CONSTANTs we might have created by going through the ValueToVReg machinery. When we're emitting synthetic constants, we do need to get Constants from the context. One could argue that we shouldn't modify the context at all (for instance, this means that we're going to use a tad more memory if the constant wasn't used elsewhere), but constants are mostly harmless. We currently do this for extractvalue and all. For constant fcmp, this does mean we'll emit an extra COPY, which is not necessarily more optimal than an extra materialized constant. But that preserves the current intended design of uniqued G_CONSTANTs, and the rematerialization problem exists elsewhere and should be resolved with a single coherent solution. llvm-svn: 297875
* ARM: avoid clobbering register in v6 jump-table expansion.Tim Northover2017-03-152-0/+4
| | | | | | | | | | | If we got unlucky with register allocation and actual constpool placement, we could end up producing a tTBB_JT with an index that's already been clobbered. Technically, we might be able to fix this situation up with a MOV, but I think the constant islands pass is complex enough without having to deal with more weird edge-cases. llvm-svn: 297871
* [GlobalISel] Insert translated switch icmp blocks after switch parent.Ahmed Bougacha2017-03-151-1/+2
| | | | | | | | | Now that we preserve the IR layout, we would end up with all the newly synthesized switch comparison blocks at the end of the function. Instead, use a hopefully more reasonable layout, with the comparison blocks immediately following the switch comparison blocks. llvm-svn: 297869
* [GlobalISel] Preserve IR block layout.Ahmed Bougacha2017-03-151-20/+26
| | | | | | | | | | | | | | It makes the output function layout more predictable; the layout has an effect on performance, we don't want it to be at the mercy of the translator's visitation order and such. The predictable output is also easier to digest. getOrCreateBB isn't appropriately named anymore, as it never needs to create anything. Rename it and extract the MBB creation logic out of it. A couple tests were sensitive to the order. Update them. llvm-svn: 297868
* [CodeGen] Use APInt::setLowBits/setHighBits/setBitsFrom in more placesCraig Topper2017-03-152-25/+19
| | | | | | | | | | This patch replaces ORs with getHighBits/getLowBits etc. with setLowBits/setHighBits/setBitsFrom. In a few of the places we weren't ORing, but the KnownZero/KnownOne vectors were already initialized to zero. We exploit this in most places already there were just some that were inconsistent. Differential Revision: https://reviews.llvm.org/D30965 llvm-svn: 297860
* [GlobalISel] Remove dead member. NFC.Ahmed Bougacha2017-03-151-1/+0
| | | | llvm-svn: 297855
* CodeGen: Use the source filename as the argument to .file, rather than the ↵Peter Collingbourne2017-03-151-1/+1
| | | | | | | | | | | | | | | | module ID. Using the module ID here is wrong for a couple of reasons: 1) The module ID is not persisted, so we can end up with different object file contents given the same input file (for example if the same file is accessed via different paths). 2) With ThinLTO the module ID field may contain the path to a bitcode file, which is incorrect, as the .file argument is supposed to contain the path to a source file. Differential Revision: https://reviews.llvm.org/D30584 llvm-svn: 297853
* [SelectionDAG] Support BUILD_VECTOR implicit truncation in ↵Simon Pilgrim2017-03-151-3/+14
| | | | | | SelectionDAG::ComputeNumSignBits (PR32273) llvm-svn: 297852
* fix gcc -Wmisleading-indentation [NFC]Nuno Lopes2017-03-151-1/+1
| | | | llvm-svn: 297816
* NFC: Reformats comments according to the coding guildelines.Taewook Oh2017-03-152-34/+41
| | | | llvm-svn: 297808
* [BranchFolding] Merge debug locations from common tail instead of removingTaewook Oh2017-03-152-4/+43
| | | | | | | | | | | | | | Summary: D25742 improved the precision of debug locations for PGO by removing debug locations from common tail when tail-merging. However, if identical insturctions that are merged into a common tail have the same debug locations, there's no need to remove them. This patch creates a merged debug location of identical instructions across SameTails and assign it to the instruction in the common tail, so that the debug locations are maintained if they are same across identical instructions. Reviewers: aprantl, probinson, MatzeB, rob.lougher Reviewed By: aprantl Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D30226 llvm-svn: 297805
* Ensure that prefix data is preserved with subsections-via-symbolsPeter Collingbourne2017-03-151-2/+17
| | | | | | | | | | | | | | On MachO platforms that use subsections-via-symbols dead code stripping will drop prefix data. Unfortunately there is no great way to convey the relationship between a function and its prefix data to the linker. We are forced to use a bit of a hack: we give the prefix data it’s own symbol, and mark the actual function entry an .alt_entry. Patch by Moritz Angermann! Differential Revision: https://reviews.llvm.org/D30770 llvm-svn: 297804
* [GlobalISel] IRTranslator: Return the scalar for <1 x Ty> constant vectorsVolkan Keles2017-03-141-0/+6
| | | | | | | | | | | | | | | | Summary: <1 x Ty> is not a legal vector type in LLT, we shouldn’t build G_MERGE_VALUES instruction for them. Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, ab, javed.absar Reviewed By: qcolombet Subscribers: dberris, rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D30948 llvm-svn: 297792
* [globalisel][tblgen] Add support for ComplexPatternsDaniel Sanders2017-03-142-0/+11
| | | | | | | | | | | | | | | | | | | Summary: Adds a new kind of MachineOperand: MO_Placeholder. This operand must not appear in the MIR and only exists as a way of creating an 'uninitialized' operand until a matcher function overwrites it. Depends on D30046, D29712 Reviewers: t.p.northover, ab, rovka, aditya_nandakumar, javed.absar, qcolombet Reviewed By: qcolombet Subscribers: dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D30089 llvm-svn: 297782
* [SelectionDAG] Add a signed integer absolute ISD nodeSimon Pilgrim2017-03-144-0/+41
| | | | | | | | | | | | Reduced version of D26357 - based on the discussion on llvm-dev about canonicalization of UMIN/UMAX/SMIN/SMAX as well as ABS I've reduced that patch to just the ABS ISD node (with x86/sse support) to improve basic combines and lowering. ARM/AArch64, Hexagon, PowerPC and NVPTX all have similar instructions allowing us to make this a generic opcode and move away from the hard coded tablegen patterns which makes it tricky to match more complex patterns. At the moment this patch doesn't attempt legalization as we only create an ABS node if its legal/custom. Differential Revision: https://reviews.llvm.org/D29639 llvm-svn: 297780
* [DAG] vector div/rem with any zero element in divisor is undefSanjay Patel2017-03-142-15/+31
| | | | | | | | | | | | | | | | This is the backend counterpart to: https://reviews.llvm.org/rL297390 https://reviews.llvm.org/rL297409 and follow-up to: https://reviews.llvm.org/rL297384 It surprised me that we need to duplicate the check in FoldConstantArithmetic and FoldConstantVectorArithmetic, but one or the other doesn't catch all of the test cases. There is an existing code comment about merging those someday. Differential Revision: https://reviews.llvm.org/D30826 llvm-svn: 297762
* [CodeGen] Fix -Wreorder warning.Benjamin Kramer2017-03-141-3/+3
| | | | llvm-svn: 297729
* [ARM] Move SMULW[B|T] isel to DAG CombineSam Parker2017-03-141-0/+15
| | | | | | | | | | | | Create nodes for smulwb and smulwt and move their selection from DAGToDAG to DAG combine. smlawb and smlawt can then be selected using tablegen. Added some helper functions to detect shift patterns as well as a wrapper around SimplifyDemandBits. Added a couple of extra tests. Differential Revision: https://reviews.llvm.org/D30708 llvm-svn: 297716
* Disable Callee Saved RegistersOren Ben Simhon2017-03-1410-22/+77
| | | | | | | | | | | | | | Each Calling convention (CC) defines a static list of registers that should be preserved by a callee function. All other registers should be saved by the caller. Some CCs use additional condition: If the register is used for passing/returning arguments – the caller needs to save it - even if it is part of the Callee Saved Registers (CSR) list. The current LLVM implementation doesn’t support it. It will save a register if it is part of the static CSR list and will not care if the register is passed/returned by the callee. The solution is to dynamically allocate the CSR lists (Only for these CCs). The lists will be updated with actual registers that should be saved by the callee. Since we need the allocated lists to live as long as the function exists, the list should reside inside the Machine Register Info (MRI) which is a property of the Machine Function and managed by it (and has the same life span). The lists should be saved in the MRI and populated upon LowerCall and LowerFormalArguments. The patch will also assist to implement future no_caller_saved_regsiters attribute intended for interrupt handler CC. Differential Revision: https://reviews.llvm.org/D28566 llvm-svn: 297715
* Recommitting Craig Topper's patch now that r296476 has been recommitted.Nirav Dave2017-03-141-1/+1
| | | | | | | | When checking if chain node is foldable, make sure the intermediate nodes have a single use across all results not just the result that was used to reach the chain node. This recovers a test case that was severely broken by r296476, my making sure we don't create ADD/ADC that loads and stores when there is also a flag dependency. llvm-svn: 297698
* In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵Nirav Dave2017-03-142-371/+391
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enabled. Recommiting with compiler time improvements Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly constructs wider loads, but then promotes them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as some in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 297695
* Revert "Debug Info: Add basic support for external types references."Adrian Prantl2017-03-133-20/+1
| | | | | | | | | | | | | | This reverts commit r242302. External type refs of this form were never used by any LLVM frontend so this is effectively dead code. (They were introduced to support clang module debug info, but in the end we came up with a better design that doesn't use this feature at all.) rdar://problem/25897929 Differential Revision: https://reviews.llvm.org/D30917 llvm-svn: 297684
* [IPRA] Change algorithm for RegUsageInfoCollector.Marcello Maggioni2017-03-131-3/+21
| | | | | | | | | | | | | | | | | The previous algorithm for RegUsageInfoCollector had pretty bad performance on architectures with a lot of registers that alias a lot one another, because we potentially iterate for every register over all the aliasing registers. This costs even more if the function is small and doesn't define a lot of registers. This patch changes the algorithm to one that while iterating over all the registers it will iterate over the aliasing registers only if the register itself is defined. This should be faster based on the assumption that only a subset of the whole LLVM registers set is actually defined in the function. Differential Revision: https://reviews.llvm.org/D30880 llvm-svn: 297673
* GlobalISel: Translate ConstantDataVectorVolkan Keles2017-03-131-0/+7
| | | | | | | | | | | | Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, javed.absar, ab Reviewed By: qcolombet, dsanders, ab Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30216 llvm-svn: 297670
* [Outliner] Add tail call supportJessica Paquette2017-03-131-12/+42
| | | | | | | | | | | | | | This commit adds tail call support to the MachineOutliner pass. This allows the outliner to insert jumps rather than calls in areas where tail calling is possible. Outlined tail calls include the return or terminator of the basic block being outlined from. Tail call support allows the outliner to take returns and terminators into consideration while finding candidates to outline. It also allows the outliner to save more instructions. For example, in the X86-64 outliner, a tail called outlined function saves one instruction since no return has to be inserted. llvm-svn: 297653
* Fix -Wsentinel warningSimon Pilgrim2017-03-111-1/+1
| | | | llvm-svn: 297560
* Use setBits in SelectionDAGAmaury Sechet2017-03-111-9/+8
| | | | | | | | | | | | Summary: As per title. Reviewers: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30836 llvm-svn: 297559
* [IRTranslator] Simplify error handling for translating constants. NFC.Quentin Colombet2017-03-111-9/+3
| | | | | | | | We don't need to check whether the fallback path is enabled to return false. Just do that all the time on error cases, the caller knows (or at least should know!) how to handle the failing case. llvm-svn: 297535
* Fix subreg value numbers in handleMoveUpStanislav Mekhanoshin2017-03-111-1/+3
| | | | | | | | | | | The problem can occur in presence of subregs. If we are swapping two instructions defining different subregs of the same register we will get a new liveout from a block. We need to preserve value number for block's liveout for successor block's livein to match. Differential Revision: https://reviews.llvm.org/D30558 llvm-svn: 297534
* Strip trailing whitespace.Simon Pilgrim2017-03-101-23/+23
| | | | llvm-svn: 297529
* Fix redundant condition (PR32138)Simon Pilgrim2017-03-101-2/+2
| | | | | | '!A || (A && B)' is equivalent to '!A || B' llvm-svn: 297527
* [GlobalISel] LegalizerHelper: Lower (G_FSUB X, Y) to (G_FADD X, (G_FNEG Y))Volkan Keles2017-03-101-0/+18
| | | | | | | | | | | | | | Summary: No test case as none of the in-tree targets with GlobalISel support has this condition. Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, ab Reviewed By: qcolombet Subscribers: dberris, rovka, kristof.beyls, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D30786 llvm-svn: 297512
* GlobalISel: Translate ConstantAggregateZero vectorsVolkan Keles2017-03-101-1/+10
| | | | | | | | | | | | Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, javed.absar Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30259 llvm-svn: 297509
* [GlobalISel] Translate insertelement and extractelementVolkan Keles2017-03-102-0/+70
| | | | | | | | | | | | Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, javed.absar Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30761 llvm-svn: 297495
* [SelectionDAG] Add support for BUILD_VECTOR to ComputeNumSignBitsSimon Pilgrim2017-03-101-0/+6
| | | | llvm-svn: 297492
* [GlobalISel] Make LegalizerInfo accessible in LegalizerHelperVolkan Keles2017-03-102-11/+8
| | | | | | | | | | | | | | | | | | | | Summary: We don’t actually use LegalizerInfo in Legalizer pass, it’s just passed as an argument. In order to check if an instruction is legal or not, we need to get LegalizerInfo by calling `MI.getParent()->getParent()->getSubtarget().getLegalizerInfo()`. Instead, make LegalizerInfo accessible in LegalizerHelper. Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, kristof.beyls Reviewed By: qcolombet Subscribers: dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D30838 llvm-svn: 297491
* [SelectionDAG] Make SelectionDAG aware of the known bits in USUBO and SSUBO ↵Amaury Sechet2017-03-101-4/+13
| | | | | | | | | | | | | | | | | and SUBC. Summary: Depends on D30379 This improves the state of things for the sub class of operation. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30436 llvm-svn: 297482
* [SelectionDAG] Make SelectionDAG aware of the known bits in UADDO and SADDO.Amaury Sechet2017-03-101-13/+37
| | | | | | | | | | | | Summary: As per title. This is extracted from D29872 and I threw SADDO in. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30379 llvm-svn: 297479
* [APInt] Add APInt::insertBits() method to insert an APInt into a larger APIntSimon Pilgrim2017-03-101-4/+4
| | | | | | | | | | | | We currently have to insert bits via a temporary variable of the same size as the target with various shift/mask stages, resulting in further temporary variables, all of which require the allocation of memory for large APInts (MaskSizeInBits > 64). This is another of the compile time issues identified in PR32037 (see also D30265). This patch adds the APInt::insertBits() helper method which avoids the temporary memory allocation and masks/inserts the raw bits directly into the target. Differential Revision: https://reviews.llvm.org/D30780 llvm-svn: 297458
* [GlobalISel] Use ImmutableCallSite instead of templates. NFC.Ahmed Bougacha2017-03-102-21/+10
| | | | | | ImmutableCallSite abstracts away CallInst and InvokeInst. Use it! llvm-svn: 297426
* [GlobalISel] Fallback when failing to translate invoke.Ahmed Bougacha2017-03-101-3/+4
| | | | | | | | We unintentionally stopped falling back in r293670. While there, change an unusual construct. llvm-svn: 297425
OpenPOWER on IntegriCloud