summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* Fix constant folding of fp2int to large integersSimon Pilgrim2017-03-192-16/+9
| | | | | | | | | | | | We make the assumption in most of our constant folding code that a fp2int will target an integer of 128-bits or less, calling the APFloat::convertToInteger with only uint64_t[2] of raw bits for the result. Fuzz testing (PR24662) showed that we don't handle other cases at all, resulting in stack overflows and all sorts of crashes. This patch uses the APSInt version of APFloat::convertToInteger instead to better handle such cases. Differential Revision: https://reviews.llvm.org/D31074 llvm-svn: 298226
* [GlobalISel] Don't select trivially dead instructions.Ahmed Bougacha2017-03-191-0/+31
| | | | | | | | | | | | | | Folding instructions when selecting can cause them to become dead. Don't select these dead instructions (if they don't have other side effects, and don't define physical registers). Preserve existing tests by adding COPYs. In some tests, the G_CONSTANT vregs never get constrained to a class: the only use of the vreg was folded into another instruction, so the G_CONSTANT, now dead, never gets selected. llvm-svn: 298224
* [GlobalISel] Move method definition to the proper file. NFC.Ahmed Bougacha2017-03-192-19/+21
| | | | llvm-svn: 298221
* [MIR] Support Customed Register Mask and CSRsOren Ben Simhon2017-03-194-31/+90
| | | | | | | | | | | | | The MIR printer dumps a string that describe the register mask of a function. A static predefined list of register masks matches a static list of strings. However when the register mask is not from the static predefined list, there is no descriptor string and the printer fails. This patch adds support to custom register mask printing and dumping. Also the list of callee saved registers (describing the registers that must be preserved for the caller) might be dynamic. As such this data needs to be dumped and parsed back to the Machine Register Info. Differential Revision: https://reviews.llvm.org/D30971 llvm-svn: 298207
* ExecutionDepsFix: Let targets specialize the pass; NFCMatthias Braun2017-03-181-212/+2
| | | | | | | | Let targets specialize the pass with the register class so we can get a parameterless default constructor and can put the pass into the pass registry to enable testing with -run-pass=. llvm-svn: 298184
* ExecutionDepsFix: Normalize names; NFCMatthias Braun2017-03-181-32/+32
| | | | | | | Normalize ExeDepsFix, execution-fix, ExecutionDependencyFix and ExecutionDepsFix to the last one. llvm-svn: 298183
* CodeGen.cpp: Sort alphabetically; NFCMatthias Braun2017-03-181-6/+6
| | | | llvm-svn: 298182
* Make library calls sensitive to regparm module flag (Fixes PR3997).Nirav Dave2017-03-187-66/+72
| | | | | | | | | | Reviewers: mkuper, rnk Subscribers: mehdi_amini, jyknight, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D27050 llvm-svn: 298179
* Capitalize ArgListEntry fields. NFC.Nirav Dave2017-03-185-76/+75
| | | | llvm-svn: 298178
* Add !associated metadata.Evgeniy Stepanov2017-03-171-13/+50
| | | | | | | | | | | | | | | | This is an ELF-specific thing that adds SHF_LINK_ORDER to the global's section pointing to the metadata argument's section. The effect of that is a reverse dependency between sections for the linker GC. !associated does not change the behavior of global-dce. The global may also need to be added to llvm.compiler.used. Since SHF_LINK_ORDER is per-section, !associated effectively enables fdata-sections for the affected globals, the same as comdats do. Differential Revision: https://reviews.llvm.org/D29104 llvm-svn: 298157
* [SelectionDAG] Remove redundant stores more aggressively.Eli Friedman2017-03-171-7/+25
| | | | | | | | | | | | | | | | Handle TokenFactors more aggressively in SDValue::reachesChainWithoutSideEffects. This isn't really a very effective change anymore because of other changes to chain handling, but it's a cheap check, and the expanded comments are still useful. It might be possible to loosen the hasOneUse() requirement with a deeper analysis, but a naive implementation of that check would be expensive. Differential Revision: https://reviews.llvm.org/D29845 llvm-svn: 298156
* [CodeGenPrep]Restructure promoting Ext to form ExtLoadJun Bum Lim2017-03-171-90/+125
| | | | | | | | | | | | | | | | Summary: Instead of just looking for a load which is mergable with Ext to form ExtLoad, trying to promote Exts as long as the cost is acceptable. This change is not a NFC as it continue promoting Exts even after finding a load during promotions; the change in arm64-codegen-prepare-extload.ll described in 2.b might show the case. This change was motivated from D26524. Based on this change, I will move the transformation performed in aarch64-type-promotion into CGP. Reviewers: jmolloy, qcolombet, mcrosier, javed.absar Reviewed By: qcolombet Subscribers: rengolin, llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D27853 llvm-svn: 298114
* [SelectionDAG] Add SelectionDAG.computeKnownBits test support for ISD::ABSSimon Pilgrim2017-03-171-0/+20
| | | | llvm-svn: 298108
* SplitKit: Correctly implement partial subregister copiesMatthias Braun2017-03-173-26/+168
| | | | | | | | | | | - This fixes a bug where subregister incompatible with the vregs register class where used. - Implement the case where multiple copies are necessary to cover a given lanemask. Differential Revision: https://reviews.llvm.org/D30438 llvm-svn: 298025
* VirtRegMap: Correctly deal with bundles when deleting identity copies.Matthias Braun2017-03-172-7/+51
| | | | | | | | | | | | | | | | | | This fixes two problems when VirtRegMap encounters bundles: - When substituting a vreg subregister def with an actual register the internal read flag must be cleared. - Removing an identity COPY from a bundle needs to use removeFromBundle() and a newly introduced function to update SlotIndexes. No testcase here, because none of the in-tree targets trigger this, however an upcoming commit of mine will need this and the testcase there will trigger this. Differential Revision: https://reviews.llvm.org/D30925 llvm-svn: 298024
* Remove LessPreciseFPMADOption from TargetOptions along with all of theEric Christopher2017-03-171-8/+0
| | | | | | | associated command line options and functions - it's currently unused in all of llvm and clang other than being set and reset. llvm-svn: 298023
* Remove getArgumentList() in favor of arg_begin(), args(), etcReid Kleckner2017-03-161-1/+1
| | | | | | | | | | | | | | | | | Users often call getArgumentList().size(), which is a linear way to get the number of function arguments. arg_size(), on the other hand, is constant time. In general, the fact that arguments are stored in an iplist is an implementation detail, so I've removed it from the Function interface and moved all other users to the argument container APIs (arg_begin(), arg_end(), args(), arg_size()). Reviewed By: chandlerc Differential Revision: https://reviews.llvm.org/D31052 llvm-svn: 298010
* Remove redundant conditions (PR31753). NFCI.Simon Pilgrim2017-03-161-2/+2
| | | | llvm-svn: 297976
* Attempt to fix bot failure on Windows.Adrian Prantl2017-03-161-1/+1
| | | | | | Looks like this expression was accidentally using 32-bit arithmetic. llvm-svn: 297969
* Rearrange fields. NFC.Adrian Prantl2017-03-161-1/+1
| | | | llvm-svn: 297967
* Rename methods in DwarfExpression to adhere to the LLVM coding guidelines.Adrian Prantl2017-03-165-121/+121
| | | | | | NFC. llvm-svn: 297966
* PR32288: More efficient encoding for DWARF expr subregister access.Adrian Prantl2017-03-162-3/+35
| | | | | | | | | | | | | | | | | | | | | | | Citing http://bugs.llvm.org/show_bug.cgi?id=32288 The DWARF generated by LLVM includes this location: 0x55 0x93 0x04 DW_OP_reg5 DW_OP_piece(4) When GCC's DWARF is simply 0x55 (DW_OP_reg5) without the DW_OP_piece. I believe it's reasonable to assume the DWARF consumer knows which part of a register logically holds the value (low bytes, high bytes, how many bytes, etc) for a primitive value like an integer. This patch gets rid of the redundant DW_OP_piece when a subregister is at offset 0. It also adds previously missing subregister masking when a subregister is followed by another operation. (This reapplies r297960 with two additional testcase updates). rdar://problem/31069390 https://reviews.llvm.org/D31010 llvm-svn: 297965
* Revert "PR32288: More efficient encoding for DWARF expr subregister access."Adrian Prantl2017-03-162-35/+3
| | | | | | This reverts commit 2bf453116889a576956892ea9683db4fcd96e30e while investigating buildbot breakage. llvm-svn: 297962
* PR32288: More efficient encoding for DWARF expr subregister access.Adrian Prantl2017-03-162-3/+35
| | | | | | | | | | | | | | | | | | | | | Citing http://bugs.llvm.org/show_bug.cgi?id=32288 The DWARF generated by LLVM includes this location: 0x55 0x93 0x04 DW_OP_reg5 DW_OP_piece(4) When GCC's DWARF is simply 0x55 (DW_OP_reg5) without the DW_OP_piece. I believe it's reasonable to assume the DWARF consumer knows which part of a register logically holds the value (low bytes, high bytes, how many bytes, etc) for a primitive value like an integer. This patch gets rid of the redundant DW_OP_piece when a subregister is at offset 0. It also adds previously missing subregister masking when a subregister is followed by another operation. rdar://problem/31069390 https://reviews.llvm.org/D31010 llvm-svn: 297960
* Fixing typos.Oren Ben Simhon2017-03-161-4/+5
| | | | llvm-svn: 297932
* [SelectionDAG] Optimize VSELECT->SETCC of incompatible or illegal types.Jonas Paulsson2017-03-165-31/+247
| | | | | | | | | | | | | | | | | | | | | | | | Don't scalarize VSELECT->SETCC when operands/results needs to be widened, or when the type of the SETCC operands are different from those of the VSELECT. (VSELECT SETCC) and (VSELECT (AND/OR/XOR (SETCC,SETCC))) are handled. The previous splitting of VSELECT->SETCC in DAGCombiner::visitVSELECT() is no longer needed and has been removed. Updated tests: test/CodeGen/ARM/vuzp.ll test/CodeGen/NVPTX/f16x2-instructions.ll test/CodeGen/X86/2011-10-19-widen_vselect.ll test/CodeGen/X86/2011-10-21-widen-cmp.ll test/CodeGen/X86/psubus.ll test/CodeGen/X86/vselect-pcmp.ll Review: Eli Friedman, Simon Pilgrim https://reviews.llvm.org/D29489 llvm-svn: 297930
* CodeGen: BlockPlacement: Reduce TriangleChainCount to 2Kyle Butt2017-03-161-1/+1
| | | | | | | | | This produces a 1% speedup on an important internal Google benchmark (protocol buffers), with no other regressions in google or in the llvm test-suite. Only 5 targets in the entire llvm test-suite are affected, and on those 5 targets the size increase is 0.027% llvm-svn: 297925
* [StackColoring] Remove unused header file for post-order traversal. Update ↵Craig Topper2017-03-151-4/+2
| | | | | | comment that indicated we were using it when we really use a depth-first search. NFC llvm-svn: 297904
* CodeGenPrepare: Sink addressing modes for atomicsMatt Arsenault2017-03-151-1/+30
| | | | llvm-svn: 297903
* Fix up grammar in a comment.Eric Christopher2017-03-151-1/+1
| | | | llvm-svn: 297898
* [DAGCombine] Bail out if can't create a vector with at least two elementsZvi Rackover2017-03-151-2/+5
| | | | | | | | | | | | | | | | Summary: Fixes pr32278 Reviewers: igorb, craig.topper, RKSimon, spatel, hfinkel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30978 llvm-svn: 297878
* [GlobalISel] Avoid translating synthetic constants to new G_CONSTANTS.Ahmed Bougacha2017-03-151-17/+20
| | | | | | | | | | | | | | | | | | | | | Currently, we create a G_CONSTANT for every "synthetic" integer constant operand (for instance, for the G_GEP offset). Instead, share the G_CONSTANTs we might have created by going through the ValueToVReg machinery. When we're emitting synthetic constants, we do need to get Constants from the context. One could argue that we shouldn't modify the context at all (for instance, this means that we're going to use a tad more memory if the constant wasn't used elsewhere), but constants are mostly harmless. We currently do this for extractvalue and all. For constant fcmp, this does mean we'll emit an extra COPY, which is not necessarily more optimal than an extra materialized constant. But that preserves the current intended design of uniqued G_CONSTANTs, and the rematerialization problem exists elsewhere and should be resolved with a single coherent solution. llvm-svn: 297875
* ARM: avoid clobbering register in v6 jump-table expansion.Tim Northover2017-03-152-0/+4
| | | | | | | | | | | If we got unlucky with register allocation and actual constpool placement, we could end up producing a tTBB_JT with an index that's already been clobbered. Technically, we might be able to fix this situation up with a MOV, but I think the constant islands pass is complex enough without having to deal with more weird edge-cases. llvm-svn: 297871
* [GlobalISel] Insert translated switch icmp blocks after switch parent.Ahmed Bougacha2017-03-151-1/+2
| | | | | | | | | Now that we preserve the IR layout, we would end up with all the newly synthesized switch comparison blocks at the end of the function. Instead, use a hopefully more reasonable layout, with the comparison blocks immediately following the switch comparison blocks. llvm-svn: 297869
* [GlobalISel] Preserve IR block layout.Ahmed Bougacha2017-03-151-20/+26
| | | | | | | | | | | | | | It makes the output function layout more predictable; the layout has an effect on performance, we don't want it to be at the mercy of the translator's visitation order and such. The predictable output is also easier to digest. getOrCreateBB isn't appropriately named anymore, as it never needs to create anything. Rename it and extract the MBB creation logic out of it. A couple tests were sensitive to the order. Update them. llvm-svn: 297868
* [CodeGen] Use APInt::setLowBits/setHighBits/setBitsFrom in more placesCraig Topper2017-03-152-25/+19
| | | | | | | | | | This patch replaces ORs with getHighBits/getLowBits etc. with setLowBits/setHighBits/setBitsFrom. In a few of the places we weren't ORing, but the KnownZero/KnownOne vectors were already initialized to zero. We exploit this in most places already there were just some that were inconsistent. Differential Revision: https://reviews.llvm.org/D30965 llvm-svn: 297860
* [GlobalISel] Remove dead member. NFC.Ahmed Bougacha2017-03-151-1/+0
| | | | llvm-svn: 297855
* CodeGen: Use the source filename as the argument to .file, rather than the ↵Peter Collingbourne2017-03-151-1/+1
| | | | | | | | | | | | | | | | module ID. Using the module ID here is wrong for a couple of reasons: 1) The module ID is not persisted, so we can end up with different object file contents given the same input file (for example if the same file is accessed via different paths). 2) With ThinLTO the module ID field may contain the path to a bitcode file, which is incorrect, as the .file argument is supposed to contain the path to a source file. Differential Revision: https://reviews.llvm.org/D30584 llvm-svn: 297853
* [SelectionDAG] Support BUILD_VECTOR implicit truncation in ↵Simon Pilgrim2017-03-151-3/+14
| | | | | | SelectionDAG::ComputeNumSignBits (PR32273) llvm-svn: 297852
* fix gcc -Wmisleading-indentation [NFC]Nuno Lopes2017-03-151-1/+1
| | | | llvm-svn: 297816
* NFC: Reformats comments according to the coding guildelines.Taewook Oh2017-03-152-34/+41
| | | | llvm-svn: 297808
* [BranchFolding] Merge debug locations from common tail instead of removingTaewook Oh2017-03-152-4/+43
| | | | | | | | | | | | | | Summary: D25742 improved the precision of debug locations for PGO by removing debug locations from common tail when tail-merging. However, if identical insturctions that are merged into a common tail have the same debug locations, there's no need to remove them. This patch creates a merged debug location of identical instructions across SameTails and assign it to the instruction in the common tail, so that the debug locations are maintained if they are same across identical instructions. Reviewers: aprantl, probinson, MatzeB, rob.lougher Reviewed By: aprantl Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D30226 llvm-svn: 297805
* Ensure that prefix data is preserved with subsections-via-symbolsPeter Collingbourne2017-03-151-2/+17
| | | | | | | | | | | | | | On MachO platforms that use subsections-via-symbols dead code stripping will drop prefix data. Unfortunately there is no great way to convey the relationship between a function and its prefix data to the linker. We are forced to use a bit of a hack: we give the prefix data it’s own symbol, and mark the actual function entry an .alt_entry. Patch by Moritz Angermann! Differential Revision: https://reviews.llvm.org/D30770 llvm-svn: 297804
* [GlobalISel] IRTranslator: Return the scalar for <1 x Ty> constant vectorsVolkan Keles2017-03-141-0/+6
| | | | | | | | | | | | | | | | Summary: <1 x Ty> is not a legal vector type in LLT, we shouldn’t build G_MERGE_VALUES instruction for them. Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, ab, javed.absar Reviewed By: qcolombet Subscribers: dberris, rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D30948 llvm-svn: 297792
* [globalisel][tblgen] Add support for ComplexPatternsDaniel Sanders2017-03-142-0/+11
| | | | | | | | | | | | | | | | | | | Summary: Adds a new kind of MachineOperand: MO_Placeholder. This operand must not appear in the MIR and only exists as a way of creating an 'uninitialized' operand until a matcher function overwrites it. Depends on D30046, D29712 Reviewers: t.p.northover, ab, rovka, aditya_nandakumar, javed.absar, qcolombet Reviewed By: qcolombet Subscribers: dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D30089 llvm-svn: 297782
* [SelectionDAG] Add a signed integer absolute ISD nodeSimon Pilgrim2017-03-144-0/+41
| | | | | | | | | | | | Reduced version of D26357 - based on the discussion on llvm-dev about canonicalization of UMIN/UMAX/SMIN/SMAX as well as ABS I've reduced that patch to just the ABS ISD node (with x86/sse support) to improve basic combines and lowering. ARM/AArch64, Hexagon, PowerPC and NVPTX all have similar instructions allowing us to make this a generic opcode and move away from the hard coded tablegen patterns which makes it tricky to match more complex patterns. At the moment this patch doesn't attempt legalization as we only create an ABS node if its legal/custom. Differential Revision: https://reviews.llvm.org/D29639 llvm-svn: 297780
* [DAG] vector div/rem with any zero element in divisor is undefSanjay Patel2017-03-142-15/+31
| | | | | | | | | | | | | | | | This is the backend counterpart to: https://reviews.llvm.org/rL297390 https://reviews.llvm.org/rL297409 and follow-up to: https://reviews.llvm.org/rL297384 It surprised me that we need to duplicate the check in FoldConstantArithmetic and FoldConstantVectorArithmetic, but one or the other doesn't catch all of the test cases. There is an existing code comment about merging those someday. Differential Revision: https://reviews.llvm.org/D30826 llvm-svn: 297762
* [CodeGen] Fix -Wreorder warning.Benjamin Kramer2017-03-141-3/+3
| | | | llvm-svn: 297729
* [ARM] Move SMULW[B|T] isel to DAG CombineSam Parker2017-03-141-0/+15
| | | | | | | | | | | | Create nodes for smulwb and smulwt and move their selection from DAGToDAG to DAG combine. smlawb and smlawt can then be selected using tablegen. Added some helper functions to detect shift patterns as well as a wrapper around SimplifyDemandBits. Added a couple of extra tests. Differential Revision: https://reviews.llvm.org/D30708 llvm-svn: 297716
* Disable Callee Saved RegistersOren Ben Simhon2017-03-1410-22/+77
| | | | | | | | | | | | | | Each Calling convention (CC) defines a static list of registers that should be preserved by a callee function. All other registers should be saved by the caller. Some CCs use additional condition: If the register is used for passing/returning arguments – the caller needs to save it - even if it is part of the Callee Saved Registers (CSR) list. The current LLVM implementation doesn’t support it. It will save a register if it is part of the static CSR list and will not care if the register is passed/returned by the callee. The solution is to dynamically allocate the CSR lists (Only for these CCs). The lists will be updated with actual registers that should be saved by the callee. Since we need the allocated lists to live as long as the function exists, the list should reside inside the Machine Register Info (MRI) which is a property of the Machine Function and managed by it (and has the same life span). The lists should be saved in the MRI and populated upon LowerCall and LowerFormalArguments. The patch will also assist to implement future no_caller_saved_regsiters attribute intended for interrupt handler CC. Differential Revision: https://reviews.llvm.org/D28566 llvm-svn: 297715
OpenPOWER on IntegriCloud