summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
...
* [Fast-ISel] Clear kill flags on registers replaced by updateValueMap.Pete Cooper2015-05-081-0/+7
| | | | | | | | | | When selecting an extract instruction, we don't actually generate code but instead work out which register we are reading, and rewrite uses of the extract def to the source register. This is done via updateValueMap,. However, its possible that the source register we are rewriting *to* to also have uses. If those uses are after a kill of the value we are rewriting *from* then we have uses after a kill and the verifier fails. This code checks for the case where the to register is also used, and if so it clears all kill on the from register. This is conservative, but better that always clearing kills on the from register. llvm-svn: 236897
* Extend the statepoint intrinsic to allow statepoints to be marked as ↵Pat Gavlin2015-05-082-11/+93
| | | | | | | | | | | | | | | | | | | | | | transitions from GC-aware code to code that is not GC-aware. This changes the shape of the statepoint intrinsic from: @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 unused, ...call args, i32 # deopt args, ...deopt args, ...gc args) to: @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 flags, ...call args, i32 # transition args, ...transition args, i32 # deopt args, ...deopt args, ...gc args) This extension offers the backend the opportunity to insert (somewhat) arbitrary code to manage the transition from GC-aware code to code that is not GC-aware and back. In order to support the injection of transition code, this extension wraps the STATEPOINT ISD node generated by the usual lowering lowering with two additional nodes: GC_TRANSITION_START and GC_TRANSITION_END. The transition arguments that were passed passed to the intrinsic (if any) are lowered and provided as operands to these nodes and may be used by the backend during code generation. Eventually, the lowering of the GC_TRANSITION_{START,END} nodes should be informed by the GC strategy in use for the function containing the intrinsic call; for now, these nodes are instead replaced with no-ops. Differential Revision: http://reviews.llvm.org/D9501 llvm-svn: 236888
* Fix alignment checks in MergeConsecutiveStores.James Y Knight2015-05-081-36/+52
| | | | | | | | | | | | | | | 1) check whether the alignment of the memory is sufficient for the *merged* store or load to be efficient. Not doing so can result in some ridiculously poor code generation, if merging creates a vector operation which must be aligned but isn't. 2) DON'T check that the alignment of each load/store is equal. If you're merging 2 4-byte stores, the first *might* have 8-byte alignment, but the second certainly will have 4-byte alignment. We do want to allow those to be merged. llvm-svn: 236850
* Fix coding standart based on post submit comments.Igor Laevsky2015-05-081-4/+4
| | | | | | Differential Revision: http://reviews.llvm.org/D7760 llvm-svn: 236849
* Switch lowering: handle zero-weight branch probabilitiesHans Wennborg2015-05-071-16/+7
| | | | | | | | | | After r236617, branch probabilities are no longer guaranteed to be >= 1. This patch makes the swich lowering code handle that correctly, without bumping the branch weights by 1 which might cause overflow and skews the probabilities. Covered by @zero_weight_tree in test/CodeGen/X86/switch.ll. llvm-svn: 236739
* Fix incorrect kill flags in fastisel.Pete Cooper2015-05-061-2/+6
| | | | | | | | If called twice in the same BB on the same constant, FastISel::fastEmit_ri_ was marking the materialized vreg as killed on each use, instead of only the last use. Change this to only mark the last use as killed by making earlier uses check if the vreg is already used elsewhere. llvm-svn: 236650
* [SelectionDAG] Delete SelectionDAGBuilder::removeValue. NFC.Sanjoy Das2015-05-061-6/+0
| | | | | | SelectionDAGBuilder::removeValue is dead now, after rL236563. llvm-svn: 236618
* Allow 0-weight branches in BranchProbabilityInfo.Diego Novillo2015-05-061-1/+5
| | | | | | | | | | | | | | | | | | | | | | | Summary: When computing branch weights in BPI, we used to disallow branches with weight 0. This is a minor nuisance, because a branch with weight 0 is different to "don't have information". In the context of instrumentation, it may mean "never executed", in the context of sampling, it means "never or seldom executed". In allowing 0 weight branches, I ran into issues with the switch expansion code in selection DAG. It is currently hardwired to not handle branches with weight 0. To maintain the current behaviour, I changed it to use 1 when it finds 0, but perhaps the algorithm needs changes to tolerate branches with weight zero. Reviewers: hansw Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9533 llvm-svn: 236617
* Reformat.NAKAMURA Takumi2015-05-062-6/+5
| | | | llvm-svn: 236601
* Revert r236546, "propagate IR-level fast-math-flags to DAG nodes (NFC)"NAKAMURA Takumi2015-05-064-65/+60
| | | | | | It caused undefined behavior. llvm-svn: 236600
* SelectionDAG: Handle out-of-bounds index in extract vector elementPawel Bylica2015-05-061-0/+4
| | | | | | | | | | | | | | | | | | Summary: This patch correctly handles undef case of EXTRACT_VECTOR_ELT node where the element index is constant and not less than vector size. Test Plan: CodeGen for X86 test included. Also one incorrect regression test fixed. Reviewers: qcolombet, chandlerc, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, llvm-commits Differential Revision: http://reviews.llvm.org/D9250 llvm-svn: 236584
* [Statepoint] Clean up StatepointLowering: symbolic constants.Sanjoy Das2015-05-061-2/+3
| | | | | | | | For accessors in the `Statepoint` class, use symbolic constants for offsets into the argument vector instead of literals. This makes the code intent clearer and simpler to change. llvm-svn: 236566
* [Statepoint] Clean up Statepoint.h: accessor names.Sanjoy Das2015-05-061-10/+10
| | | | | | Use getFoo() as accessors consistently and some other naming changes. llvm-svn: 236564
* [StatepointLowering] Don't create temporary instructions. NFCI.Sanjoy Das2015-05-061-73/+69
| | | | | | | | | | | | | | Summary: Instead of creating a temporary call instruction and lowering that, use SelectionDAGBuilder::lowerCallOperands. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9480 llvm-svn: 236563
* [SelectionDAG] Make an argument optional in RFV::getCopyToRegs. NFC.Sanjoy Das2015-05-051-5/+6
| | | | | | | | | | | | | | | Summary: We default the value argument to nullptr. The only use of the value is in diagnosePossiblyInvalidConstraint and that seems to be resilient to it being nullptr. Reviewers: atrick, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9479 llvm-svn: 236555
* [SelectionDAG] Move RegsForValue into SelectionDAGBuilder.h. NFC.Sanjoy Das2015-05-052-85/+90
| | | | | | | | | | | | | | | Summary: The exported class will be used in later change, in StatepointLowering.cpp. It is still internal to SelectionDAG (not exported via include/). Reviewers: reames, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9478 llvm-svn: 236554
* [SelectionDAG] Pass explicit type to lowerCallOperands. NFC.Sanjoy Das2015-05-052-5/+6
| | | | | | | | | | | | | | Summary: Currently this does not change anything, but change will be used in a later change to StatepointLowering.cpp Reviewers: reames, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9477 llvm-svn: 236553
* [StatepointLowering] Rename variable, NFC.Sanjoy Das2015-05-051-3/+3
| | | | | | Rename LoweredArgs to LoweredMetaArgs to clarify intent. llvm-svn: 236552
* propagate IR-level fast-math-flags to DAG nodes (NFC)Sanjay Patel2015-05-054-60/+65
| | | | | | | | | | | | | | | | | | | | | | | | This patch adds the minimum plumbing necessary to use IR-level fast-math-flags (FMF) in the backend without actually using them for anything yet. This is a follow-on to: http://reviews.llvm.org/rL235997 ...which split the existing nsw / nuw / exact flags and FMF into their own struct. There are 2 structural changes here: 1. The main diff is that we're preparing to extend the optimization flags to affect more than just binary SDNodes. Eg, IR intrinsics ( https://llvm.org/bugs/show_bug.cgi?id=21290 ) or non-binop nodes that don't even exist in IR such as FMA, FNEG, etc. 2. The other change is that we're actually copying the FP fast-math-flags from the IR instructions to SDNodes. Differential Revision: http://reviews.llvm.org/D8900 llvm-svn: 236546
* [DAGCombiner] Account for getVectorIdxTy() when narrowing vector loadUlrich Weigand2015-05-051-2/+3
| | | | | | | | | | | This patch makes ReplaceExtractVectorEltOfLoadWithNarrowedLoad convert the element number from getVectorIdxTy() to PtrTy before doing pointer arithmetic on it. This is needed on z, where element numbers are i32 but pointers are i64. Original patch by Richard Sandiford. llvm-svn: 236530
* [DAGCombiner] Fix ReplaceExtractVectorEltOfLoadWithNarrowedLoad for BEUlrich Weigand2015-05-051-7/+0
| | | | | | | | | | | | | For little-endian, the function would convert (extract_vector_elt (load X), Y) to X + Y*sizeof(elt). For big-endian it would instead use X + sizeof(vec) - Y*sizeof(elt). The big-endian case wasn't right since vector index order always follows memory/array order, even for big-endian. (Note that the current handling has to be wrong for Y==0 since it would access beyond the end of the vector.) Original patch by Richard Sandiford. llvm-svn: 236529
* [LegalizeVectorTypes] Allow single loads and stores for more short vectorsUlrich Weigand2015-05-051-1/+4
| | | | | | | | | | | | | | | | | | When lowering a load or store for TypeWidenVector, the type legalizer would use a single load or store if the associated integer type was legal. E.g. it would load a v4i8 as an i32 if i32 was legal. This patch extends that behavior to promoted integers as well as legal ones. If the integer type for the full vector width is TypePromoteInteger, the element type is going to be TypePromoteInteger too, and it's still better to use a single promoting load or truncating store rather than N individual promoting loads or truncating stores. E.g. if you have a v2i8 on a target where i16 is promoted to i32, it's better to load the v2i8 as an i16 rather than load both i8s individually. Original patch by Richard Sandiford. llvm-svn: 236528
* Masked gather and scatter intrinsics - enabled codegen for KNL.Elena Demikhovsky2015-05-033-3/+182
| | | | llvm-svn: 236394
* [DAGCombiner] Enabled vector float/double -> int constant foldingSimon Pilgrim2015-05-022-4/+4
| | | | llvm-svn: 236387
* [SelectionDAG] Unary vector constant folding integer legality fixesSimon Pilgrim2015-05-011-5/+25
| | | | | | | | | | | | This patch fixes issues with vector constant folding not correctly handling scalar input operands if they require implicit truncation - this was tested with llvm-stress as recommended by Patrik H Hagglund. The patch ensures that integer input scalars from a build vector are correctly truncated before folding, and that constant integer scalar results are promoted to a legal type before inclusion in the new folded build vector. I have added another crash test case and also a test for UINT_TO_FP / SINT_TO_FP using an non-truncated scalar input, which was failing before this patch. Differential Revision: http://reviews.llvm.org/D9282 llvm-svn: 236308
* Reinstate revisions r234755, r234759, r234760Jan Vesely2015-04-301-0/+32
| | | | | | | | | changes: Don't apply on hexagon and NVPTX since they no longer claim to support UADDO/USUBO Add location to getConstant Drop comment about the ops being turned into expand llvm-svn: 236240
* Inline local variable to silence unused warning.Daniel Jasper2015-04-301-2/+1
| | | | llvm-svn: 236212
* Masked gather and scatter - added DAGCombine visitorsElena Demikhovsky2015-04-302-0/+144
| | | | | | | | | and AVX-512 instruction selection patterns. All other patches, including tests will follow. http://reviews.llvm.org/D7665 llvm-svn: 236211
* Semantically revert r236031, which is not a good idea for in-order targets.Owen Anderson2015-04-301-31/+0
| | | | | | | | | | | | At the least it should be guarded by some kind of target hook. It also introduced catastrophic compile time and code quality regressions on some out of tree targets (test case still being reduced/sanitized). Sanjay agreed with reverting this patch until these issues can be resolved. llvm-svn: 236199
* Switch lowering: use profile info to build weight-balanced binary search treesHans Wennborg2015-04-301-7/+35
| | | | | | | | | | | | | | This will cause hot nodes to appear closer to the root. The literature says building the tree like this makes it a near-optimal (in terms of search time given key frequencies) binary search tree. In LLVM's case, we can do up to 3 comparisons in each leaf node, so it might be better to opt for lower tree height in some cases; that's something to look into in the future. Differential Revision: http://reviews.llvm.org/D9318 llvm-svn: 236192
* generalize binop reassociation; NFCSanjay Patel2015-04-291-17/+30
| | | | | | | | | | | | | Move the fold introduced in r236031: http://reviews.llvm.org/rL236031 to its own helper function, so we can use it for other binops. This is a preliminary step before partially solving: https://llvm.org/bugs/show_bug.cgi?id=21768 https://llvm.org/bugs/show_bug.cgi?id=23116 llvm-svn: 236171
* Run StatepointLowering.{cpp,h} through clang-format.Pat Gavlin2015-04-292-39/+28
| | | | llvm-svn: 236166
* tidy up; NFCSanjay Patel2015-04-291-41/+28
| | | | llvm-svn: 236156
* too much space again; NFCSanjay Patel2015-04-291-4/+0
| | | | llvm-svn: 236150
* too much space; NFCSanjay Patel2015-04-291-4/+0
| | | | llvm-svn: 236147
* IR: Give 'DI' prefix to debug info metadataDuncan P. N. Exon Smith2015-04-296-16/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Finish off PR23080 by renaming the debug info IR constructs from `MD*` to `DI*`. The last of the `DIDescriptor` classes were deleted in r235356, and the last of the related typedefs removed in r235413, so this has all baked for about a week. Note: If you have out-of-tree code (like a frontend), I recommend that you get everything compiling and tests passing with the *previous* commit before updating to this one. It'll be easier to keep track of what code is using the `DIDescriptor` hierarchy and what you've already updated, and I think you're extremely unlikely to insert bugs. YMMV of course. Back to *this* commit: I did this using the rename-md-di-nodes.sh upgrade script I've attached to PR23080 (both code and testcases) and filtered through clang-format-diff.py. I edited the tests for test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns were off-by-three. It should work on your out-of-tree testcases (and code, if you've followed the advice in the previous paragraph). Some of the tests are in badly named files now (e.g., test/Assembler/invalid-mdcompositetype-missing-tag.ll should be 'dicompositetype'); I'll come back and move the files in a follow-up commit. llvm-svn: 236120
* CodeGen: Default overflow operations to expand so we don't have to assume ↵Jan Vesely2015-04-291-6/+0
| | | | | | | | | | targets are lying Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: ab Differential Revision: http://reviews.llvm.org/D9265 llvm-svn: 236119
* Fixed masked gather/scatter switch-caseElena Demikhovsky2015-04-291-0/+2
| | | | llvm-svn: 236092
* fixed comments, blanks, nullptr; NFCElena Demikhovsky2015-04-291-5/+4
| | | | llvm-svn: 236086
* transform fadd chains to increase parallelismSanjay Patel2015-04-281-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a compromise: with this simple patch, we should always handle a chain of exactly 3 operations optimally, but we're not generating the optimal balanced binary tree for a longer sequence. In general, this transform will reduce the dependency chain for a sequence of instructions using N operands from a worst case N-1 dependent operations to N/2 dependent operations. The optimal balanced binary tree would reduce the chain to log2(N). The trade-off for not dealing with longer sequences is: (1) we have less complexity in the compiler, (2) we avoid unknown compile-time blowup calculating a balanced tree, and (3) we don't need to worry about the increased register pressure required to parallelize longer sequences. It also seems unlikely that we would ever encounter really long strings of dependent ops like that in the wild, but I'm not sure how to verify that speculation. FWIW, I see no perf difference for test-suite running on btver2 (x86-64) with -ffast-math and this patch. We can extend this patch to cover other associative operations such as fmul, fmax, fmin, integer add, integer mul. This is a partial fix for: https://llvm.org/bugs/show_bug.cgi?id=17305 and if extended: https://llvm.org/bugs/show_bug.cgi?id=21768 https://llvm.org/bugs/show_bug.cgi?id=23116 The issue also came up in: http://reviews.llvm.org/D8941 Differential Revision: http://reviews.llvm.org/D9232 llvm-svn: 236031
* move IR-level optimization flags into their own structSanjay Patel2015-04-282-8/+11
| | | | | | | | | | | | | | | | | | | | | | This is a preliminary step to using the IR-level floating-point fast-math-flags in the SDAG (D8900). In this patch, we introduce the optimization flags as their own struct. As noted in the TODO comment, we should eventually share this data between the IR passes and the backend. We also switch the existing nsw / nuw / exact bit functionality of the BinaryWithFlagsSDNode class to use the new struct. The tradeoff is that instead of using the free but limited space of SDNode's SubclassData, we add a data member to the subclass. This means we don't have to repeat all of the get/set methods per flag, but we're potentially adding size to all nodes of this subclassi type. In practice on 64-bit systems (measured on Linux and MacOS X), there is no size difference between an SDNode and BinaryWithFlagsSDNode after this change: they're both 80 bytes. This means that we had at least one free byte to play with due to struct alignment. Differential Revision: http://reviews.llvm.org/D9325 llvm-svn: 235997
* Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes"Sergey Dmitrouk2015-04-2813-1172/+1434
| | | | | | | | | | | | | | | | | | | | | | | | | [DebugInfo] Add debug locations to constant SD nodes This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235989
* Revert "[DebugInfo] Add debug locations to constant SD nodes"Daniel Jasper2015-04-2813-1434/+1172
| | | | | | | This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870 llvm-svn: 235987
* [DebugInfo] Add debug locations to constant SD nodesSergey Dmitrouk2015-04-2813-1172/+1434
| | | | | | | | | | | | | | | | | | | | | | | This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235977
* Masked gather and scatter: Added code for SelectionDAG.Elena Demikhovsky2015-04-283-0/+203
| | | | | | | | All other patches, including tests will follow. http://reviews.llvm.org/D7665 llvm-svn: 235970
* Switch lowering: use uint32_t for weights everywhereHans Wennborg2015-04-272-6/+12
| | | | | | | | | I previously thought switch clusters would need to use uint64_t in case the weights of multiple cases overflowed a 32-bit int. It turns out that the weights on a terminator instruction are capped to allow for being added together, so using a uint32_t should be safe. llvm-svn: 235945
* Switch lowering: Take branch weight into account when ordering for fall-throughHans Wennborg2015-04-271-3/+4
| | | | | | | | | | | Previously, the code would try to put a fall-through case last, even if that meant moving a case with much higher branch weight further down the chain. Ordering by branch weight is most important, putting a fall-through block last is secondary. llvm-svn: 235942
* Switch lowering: order bit tests by branch weight.Hans Wennborg2015-04-271-1/+4
| | | | llvm-svn: 235912
* [DAGCombiner] Fix the type used in canFoldInAddressingMode to account for theQuentin Colombet2015-04-241-2/+2
| | | | | | | | | | | | | | | | | | right scaling. In the function canFoldInAddressingMode, VT is computed as the type of the destination/source of a LOAD/STORE operations, instead of the memory type of the operation. On targets with a scaling factor on the offset of the LOAD/STORE operations, the function may return false for actually valid cases. This may then prevent the selection of profitable pre or post indexed load/store operations, and instead select pre or post indexed load/store for unprofitable cases. Patch by Francois de Ferriere <francois.de-ferriere@st.com>! Differential Revision: http://reviews.llvm.org/D9146 llvm-svn: 235780
* [SEH] Implement GetExceptionCode in __except blocksReid Kleckner2015-04-243-20/+35
| | | | | | | | | This introduces an intrinsic called llvm.eh.exceptioncode. It is lowered by copying the EAX value live into whatever basic block it is called from. Obviously, this only works if you insert it late during codegen, because otherwise mid-level passes might reschedule it. llvm-svn: 235768
OpenPOWER on IntegriCloud