summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
...
* Pass address space to allowsUnalignedMemoryAccessesMatt Arsenault2014-02-053-11/+25
| | | | llvm-svn: 200888
* Add address space argument to allowsUnalignedMemoryAccess.Matt Arsenault2014-02-051-1/+1
| | | | | | | On R600, some address spaces have more strict alignment requirements than others. llvm-svn: 200887
* Add CheckChildInteger to ISelMatcher operations. Removes nearly 2000 bytes ↵Craig Topper2014-02-051-2/+23
| | | | | | from X86 matcher table. llvm-svn: 200821
* Expand vector bswap in LegalizeVectorOpsHal Finkel2014-02-031-0/+1
| | | | | | | ISD::BSWAP was missing from the list of node types that should be expanded element-wise. llvm-svn: 200705
* Implement inalloca codegen for x86 with the new inalloca designReid Kleckner2014-01-312-2/+25
| | | | | | | | | | | | | | | | Calls with inalloca are lowered by skipping all stores for arguments passed in memory and the initial stack adjustment to allocate argument memory. Now the frontend is responsible for the memory layout, and the backend doesn't have to do any work. As a result these changes are pretty minimal. Reviewers: echristo Differential Revision: http://llvm-reviews.chandlerc.com/D2637 llvm-svn: 200596
* Don't put non-static allocas in the static alloca mapReid Kleckner2014-01-311-1/+7
| | | | | | | | Allocas marked inalloca are never static, but we were trying to put them into the static alloca map if they were in the entry block. Also add an assertion in x86 fastisel. llvm-svn: 200593
* This patch teaches the DAGCombiner how to fold insert_subvector nodesManman Ren2014-01-311-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | when the input is a concat_vectors and the insert replaces one of the concat halves: Lower half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors Z, Y) Upper half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors X, Z) This can be seen with the following IR: define <8 x float> @lower_half(<4 x float> %v1, <4 x float> %v2, <4 x float> %v3) { %1 = shufflevector <4 x float> %v1, <4 x float> %v2, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7> %2 = tail call <8 x float> @llvm.x86.avx.vinsertf128.ps.256(<8 x float> %1, <4 x float> %v3, i8 0) The vinsertf128 intrinsic is converted into an insert_subvector node in SelectionDAGBuilder.cpp. Using AVX, without the patch this generates two vinsertf128 instructions: vinsertf128 $1, %xmm1, %ymm0, %ymm0 vinsertf128 $0, %xmm2, %ymm0, %ymm0 With the patch this is optimized into: vinsertf128 $1, %xmm1, %ymm2, %ymm0 Patch by Robert Lougher. llvm-svn: 200506
* DAGCombine should not produce ISD::OR nodes after operation legalization if ↵Owen Anderson2014-01-311-2/+4
| | | | | | they're not legal. llvm-svn: 200503
* PGO branch weight: update edge weights in SelectionDAGBuilder.Manman Ren2014-01-312-12/+65
| | | | | | | | | | | | | | | | When converting from "or + br" to two branches, or converting from "and + br" to two branches, we correctly update the edge weights of the two branches. The previous attempt at r200431 was reverted at r200434 because of two testing case failures. I modified my patch a little, but forgot to re-run "make check-all". Testing case CodeGen/ARM/lsr-unfolded-offset.ll is updated because of the patch's impact on branch probability which causes changes in spill placement. llvm-svn: 200502
* Revert r200431 due to bot failures.Manman Ren2014-01-302-65/+12
| | | | llvm-svn: 200434
* PGO branch weight: update edge weights in SelectionDAGBuilder.Manman Ren2014-01-302-12/+65
| | | | | | | | When converting from "or + br" to two branches, or converting from "and + br" to two branches, we correctly update the edge weights of the two branches. llvm-svn: 200431
* [DAGCombiner] Avoid introducing an illegal build_vector when folding a ↵Andrea Di Biagio2014-01-281-9/+15
| | | | | | | | | | | | | | | | | sign_extend. Make sure that we don't introduce illegal build_vector dag nodes when trying to fold a sign_extend of a build_vector. This fixes a regression introduced by r200234. Added test CodeGen/X86/fold-vector-sext-crash.ll to verify that llc no longer crashes with an assertion failure due to an illegal build_vector of type MVT::v4i64. Thanks to Ilia Filippov for spotting this regression and for providing a reproducible test case. llvm-svn: 200313
* [TLI] Add a new hook to TargetLowering to query the target if a load of a ↵Juergen Ributzka2014-01-282-11/+5
| | | | | | | | | | | | | | | | | constant should be converted to simply the constant itself. Before this patch we used getIntImmCost from TargetTransformInfo to determine if a load of a constant should be converted to just a constant, but the threshold for this was set to an arbitrary value. This value works well for the two targets (X86 and ARM) that implement this target-hook, but it isn't target-independent at all. Now targets have the possibility to decide directly if this optimization should be performed. The default value is set to false to preserve the current behavior. The target hook has been moved to TargetLowering, which removed the last use and need of TargetTransformInfo in SelectionDAG. llvm-svn: 200271
* Fix sext(setcc) -> select_cc using wrong type for setcc.Matt Arsenault2014-01-271-10/+16
| | | | | | | | | | | | | | | | Also update the comment, since it actually produces a select (setcc) instead of select_cc. It was checking and using the setcc result type for the type of the sext, instead of the type of the compared items. In my problem case, the sext was to i32 and was used as the setcc type, but the expected type was i64. No test since I haven't been able to hit the problem with this on any in-tree targets. llvm-svn: 200249
* [DAGCombiner] Teach how to fold sext/aext/zext of constant build vectors.Andrea Di Biagio2014-01-271-9/+64
| | | | | | | | | | | | | This patch teaches the DAGCombiner how to fold a sext/aext/zext dag node when the operand in input is a build vector of constants (or UNDEFs). The inability to fold a sext/zext of a constant build_vector was the root cause of some pcg bugs affecting vselect expansion on x86-64 with AVX support. Before this change, the DAGCombiner only knew how to fold a sext/zext/aext of a ConstantSDNode. llvm-svn: 200234
* Fix for PR18102.Stepan Dyatkovskiy2014-01-271-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Issue outcomes from DAGCombiner::MergeConsequtiveStores, more precisely from mem-ops sequence sorting. Consider, how MergeConsequtiveStores works for next example: store i8 1, a[0] store i8 2, a[1] store i8 3, a[1] ; a[1] again. return ; DAG starts here 1. Method will collect all the 3 stores. 2. It sorts them by distance from the base pointer (farthest with highest index). 3. It takes first consecutive non-overlapping stores and (if possible) replaces them with a single store instruction. The point is, we can't determine here which 'store' instruction would be the second after sorting ('store 2' or 'store 3'). It happens that 'store 3' would be the second, and 'store 2' would be the third. So after merging we have the next result: store i16 (1 | 3 << 8), base ; is a[0] but bit-casted to i16 store i8 2, a[1] So actually we swapped 'store 3' and 'store 2' and got wrong contents in a[1]. Fix: In sort routine just also take into account mem-op sequence number. llvm-svn: 200201
* [AArch64 NEON] Fix pattern match failed on FP_ROUND from v1f128 to v1f64.Kevin Qin2014-01-262-0/+13
| | | | llvm-svn: 200109
* Disable the use of TBAA when using AA in CodeGenHal Finkel2014-01-251-2/+14
| | | | | | | | | | | | | | | | | There are currently two issues, of which I currently know, that prevent TBAA from being correctly usable in CodeGen: 1. Stack coloring does not update TBAA when merging allocas. This is easy enough to fix, but is not the largest problem. 2. CGP inserts ptrtoint/inttoptr pairs when sinking address computations. Because BasicAA does not handle inttoptr, we'll often miss basic type punning idioms that we need to catch so we don't miscompile real-world code (like LLVM). I don't yet have a small test case for this, but this fixes self hosting a non-asserts build of LLVM on PPC64 when using -enable-aa-sched-mi and -misched=shuffle. llvm-svn: 200093
* Add combiner-aa-only-func (debug only)Hal Finkel2014-01-251-0/+22
| | | | | | | | | This option (which is !NDEBUG only) allows restricting the use of alias analysis in DAGCombiner to a specific function. This has proved extremely valuable to isolating bugs related to this feature, and mirrors the misched-only-func option provided by the new instruction scheduler. llvm-svn: 200088
* Improve descriptions of combiner-alias-analysis and ↵Hal Finkel2014-01-251-2/+2
| | | | | | combiner-global-alias-analysis llvm-svn: 200087
* Revert "Revert "Add Constant Hoisting Pass" (r200034)"Juergen Ributzka2014-01-255-25/+56
| | | | | | | This reverts commit r200058 and adds the using directive for ARMTargetTransformInfo to silence two g++ overload warnings. llvm-svn: 200062
* Revert "Add Constant Hoisting Pass" (r200034)Hans Wennborg2014-01-255-56/+25
| | | | | | | | | | | | | | | This commit caused -Woverloaded-virtual warnings. The two new TargetTransformInfo::getIntImmCost functions were only added to the superclass, and to the X86 subclass. The other targets were not updated, and the warning highlighted this by pointing out that e.g. ARMTTI::getIntImmCost was hiding the two new getIntImmCost variants. We could pacify the warning by adding "using TargetTransformInfo::getIntImmCost" to the various subclasses, or turning it off, but I suspect that it's wrong to leave the functions unimplemnted in those targets. The default implementations return TCC_Free, which I don't think is right e.g. for ARM. llvm-svn: 200058
* Add Constant Hoisting PassJuergen Ributzka2014-01-245-25/+56
| | | | | | | | Retry commit r200022 with a fix for the build bot errors. Constant expressions have (unlike instructions) module scope use lists and therefore may have users in different functions. The fix is to simply ignore these out-of-function uses. llvm-svn: 200034
* Fix DAGCombiner::GatherAllAliases to account for non-chain dependenciesHal Finkel2014-01-241-1/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DAGCombiner::GatherAllAliases, which is only used when AA used is enabled during DAGCombine, had a fundamentally incorrect assumption for which this change compensates. GatherAllAliases, which is used to find aliasing predecessor chain nodes (so that a better chain can be selected for a load or store to enable subsequent optimizations) assumed that walking up the chain would always catch all possibly-aliasing loads and stores. This is not true: To really find all aliases, we also need to search for aliases through the value operand of a store, etc. Consider the following situation: Token1 = ... L1 = load Token1, %52 S1 = store Token1, L1, %51 L2 = load Token1, %52+8 S2 = store Token1, L2, %51+8 Token2 = Token(S1, S2) L3 = load Token2, %53 S3 = store Token2, L3, %52 L4 = load Token2, %53+8 S4 = store Token2, L4, %52+8 If we search for aliases of S3 (which loads address %52), and we look only through the chain, then we'll miss the trivial dependence on L1 (which loads from %52). We then might change all loads and stores to use Token1 as their chain operand, which could result in copying %53 into %52 before copying %52 into %51 (which should happen first). The problem is, however, that searching for such data dependencies can become expensive, and the cost is not directly related to the chain depth. Instead, we'll rule out such configurations by insisting that we've visited all chain users (except for users of the original chain, which is not necessary). When doing this, we need to look through nodes we don't care about (otherwise, things like register copies will interfere with trivial use cases). Unfortunately, I don't have a small test case for this problem. Creating the underlying situation is not hard (a pair of memcpys will do it), but arranging for the default instruction schedule to be incorrect is very fragile. This unbreaks self hosting on PPC64 when using -mllvm -combiner-global-alias-analysis -mllvm -combiner-alias-analysis. llvm-svn: 200033
* Revert "Add Constant Hoisting Pass"Juergen Ributzka2014-01-245-56/+25
| | | | | | This reverts commit r200022 to unbreak the build bots. llvm-svn: 200024
* Restrict FindBetterChain DAG combines to unindexed nodesHal Finkel2014-01-241-2/+2
| | | | | | | | | | | | These transformations obviously won't work for indexed (pre/post-inc) loads and stores. In practice, I'm not sure there is any benefit to enabling them for indexed nodes because other transformations that these might enable likely also won't handle indexed nodes. I don't have an in-tree test case that hits this problem, but an upcoming bug fix will make it much more likely. llvm-svn: 200023
* Add Constant Hoisting PassJuergen Ributzka2014-01-245-25/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This pass identifies expensive constants to hoist and coalesces them to better prepare it for SelectionDAG-based code generation. This works around the limitations of the basic-block-at-a-time approach. First it scans all instructions for integer constants and calculates its cost. If the constant can be folded into the instruction (the cost is TCC_Free) or the cost is just a simple operation (TCC_BASIC), then we don't consider it expensive and leave it alone. This is the default behavior and the default implementation of getIntImmCost will always return TCC_Free. If the cost is more than TCC_BASIC, then the integer constant can't be folded into the instruction and it might be beneficial to hoist the constant. Similar constants are coalesced to reduce register pressure and materialization code. When a constant is hoisted, it is also hidden behind a bitcast to force it to be live-out of the basic block. Otherwise the constant would be just duplicated and each basic block would have its own copy in the SelectionDAG. The SelectionDAG recognizes such constants as opaque and doesn't perform certain transformations on them, which would create a new expensive constant. This optimization is only applied to integer constants in instructions and simple (this means not nested) constant cast experessions. For example: %0 = load i64* inttoptr (i64 big_constant to i64*) Reviewed by Eric llvm-svn: 200022
* Fix known typosAlp Toker2014-01-241-3/+3
| | | | | | | Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. llvm-svn: 200018
* Revert r162101 and replace it with a solution that works for targets where ↵Owen Anderson2014-01-222-7/+11
| | | | | | | | | | the pointer type is illegal. This is a horrible bit of code. We're calling a simplification routine *in the middle* of type legalization. We tell the simplification routine that it's running after legalization, but some of the types it will encounter will be illegal! The fix is only to invoke the simplification if the types in question were legal, so that none of its invariants will be violated. llvm-svn: 199847
* AVX512: combining setcc and zext is wrong on AVX512Elena Demikhovsky2014-01-221-1/+4
| | | | | | because vector compare instruction puts result in mask register. llvm-svn: 199798
* Allow SMUL_LOHI and UMUL_LOHI to be narrow to MUL on targets where MUL is ↵Owen Anderson2014-01-201-1/+1
| | | | | | Custom rather than Legal. Even if the target is doing some kind of expansion for MUL, it's pretty much guaranteed to be more efficent than whatever it does for SMUL_LOHI or UMUL_LOHI! llvm-svn: 199678
* [DAGCombiner] Fix a wrong check in method SimplifyVBinOp.Andrea Di Biagio2014-01-151-2/+2
| | | | | | | | | | | | | | | | | | | This fixes a regression intruced by r199135. Revision 199135 tried to simplify part of the logic in method DAGCombiner::SimplifyVBinOp introducing calls to method BuildVectorSDNode::isConstant(). However, that revision wrongly changed the check performed by method SimplifyVBinOp to identify dag nodes that can be folded. Before revision 199135, that method only tried to simplify vector binary operations if both operands were build_vector of Constant/ConstantFP/Undef only. After revision 199135, method SimplifyVBinop tried to simplify also vector binary operations with only one constant operand. This fixes the problem restoring the old behavior of SimplifyVBinOp. llvm-svn: 199328
* Always let value types influence register classes.Jakob Stoklund Olesen2014-01-141-4/+13
| | | | | | | | | | | | | | | | | | | | | When creating a virtual register for a def, the value type should be used to pick the register class. If we only use the register class constraint on the instruction, we might pick a too large register class. Some registers can store values of different sizes. For example, the x86 xmm registers can hold f32, f64, and 128-bit vectors. The three different value sizes are represented by register classes with identical register sets: FR32, FR64, and VR128. These register classes have different spill slot sizes, so it is important to use the right one. The register class constraint on an instruction doesn't necessarily care about the size of the value its defining. The value type determines that. This fixes a problem where InstrEmitter was picking 32-bit register classes for 64-bit values on SPARC. llvm-svn: 199187
* [DAG] Refactor ReassociateOps - no functional change intended.Juergen Ributzka2014-01-131-73/+44
| | | | llvm-svn: 199146
* [DAG] Teach DAG to also reassociate vector operationsJuergen Ributzka2014-01-132-16/+60
| | | | | | | | | | This commit teaches DAG to reassociate vector ops, which in turn enables constant folding of vector op chains that appear later on during custom lowering and DAG combine. Reviewed by Andrea Di Biagio llvm-svn: 199135
* Hide the pre-RA-sched= option.Andrew Trick2014-01-131-1/+1
| | | | | | | | | This is a very confusing option for a feature that will go away. -enable-misched is exposed instead to help triage issues with the new scheduler. llvm-svn: 199133
* Fix non-deterministic SDNodeOrder-dependent codegenNico Rieck2014-01-122-1/+6
| | | | | | | Reset SelectionDAGBuilder's SDNodeOrder to ensure deterministic code generation. llvm-svn: 199050
* Fix 'ned' typo in doc commentAlp Toker2014-01-111-1/+1
| | | | | | Patch by Jasper Neumann! llvm-svn: 199007
* Handle masked rotate amountsRichard Sandiford2014-01-091-16/+72
| | | | | | | | | | | | | | At the moment we expect rotates to have the form: (or (shl X, Y), (shr X, Z)) where Y == bitsize(X) - Z or Z == bitsize(X) - Y. This form means that the (or ...) is undefined for Y == 0 or Z == 0. This undefinedness can be avoided by using Y == (C * bitsize(X) - Z) & (bitsize(X) - 1) or Z == (C * bitsize(X) - Y) & (bitsize(X) - 1) for any integer C (including 0, the most natural choice). llvm-svn: 198861
* Match the InstCombine form of rotates by X+CRichard Sandiford2014-01-091-12/+39
| | | | | | | | | | | | | | | | | InstCombine converts (sub 32, (add X, C)) into (sub 32-C, X), so a rotate left of a 32-bit Y by X+C could appear as either: (or (shl Y, (add X, C)), (shr Y, (sub 32, (add X, C)))) without InstCombine or: (or (shl Y, (add X, C)), (shr Y, (sub 32-C, X))) with it. We already matched the first form. This patch handles the second too. llvm-svn: 198860
* Put the functionality for printing a value to a raw_ostream as anChandler Carruth2014-01-093-6/+3
| | | | | | | | | | | | operand into the Value interface just like the core print method is. That gives a more conistent organization to the IR printing interfaces -- they are all attached to the IR objects themselves. Also, update all the users. This removes the 'Writer.h' header which contained only a single function declaration. llvm-svn: 198836
* Teach the DAGCombiner how to fold 'vselect' dag nodes accordingAndrea Di Biagio2014-01-081-0/+7
| | | | | | | | to the following two rules: 1) fold (vselect (build_vector AllOnes), A, B) -> A 2) fold (vselect (build_vector AllZeros), A, B) -> B llvm-svn: 198777
* [DAGCombiner] Factor duplicated rotate code into a separate functionRichard Sandiford2014-01-081-66/+70
| | | | | | No functional change intended. llvm-svn: 198768
* Move the LLVM IR asm writer header files into the IR directory, as theyChandler Carruth2014-01-073-3/+3
| | | | | | | | | | | | | | | | | are part of the core IR library in order to support dumping and other basic functionality. Rename the 'Assembly' include directory to 'AsmParser' to match the library name and the only functionality left their -- printing has been in the core IR library for quite some time. Update all of the #includes to match. All of this started because I wanted to have the layering in good shape before I started adding support for printing LLVM IR using the new pass infrastructure, and commandline support for the new pass infrastructure. llvm-svn: 198688
* [AArch64 NEON] Fix invalid constant used in vselect condition.Kevin Qin2014-01-061-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | There is a wrong assumption that the vector element type and the type of each ConstantSDNode in the build_vector were the same. However, when promoting the integer operand of a legally typed build_vector, the operand type and the vector element type do not need to be the same (See method 'DAGTypeLegalizer::PromoteIntOp_BUILD_VECTOR' in LegalizeIntegerTypes.cpp). in AArch64 backend, the following dag sequence: C0: i1 = Constant<0> C1: i1 = Constant<-1> V: v8i1 = BUILD_VECTOR C1, C1, C0, C0, C0, C0, C0, C0 is type-legalized into: NewC0: i32 = Constant<0> NewC1: i32 = Constant<1> V: v8i8 = BUILD_VECTOR NewC1, NewC1, NewC0, NewC0, NewC0, NewC0, NewC0, NewC0 Forcing a getZeroExtend to VTBits to ensure that the new constant is correctly. llvm-svn: 198582
* Refactor function that checks that __builtin_returnaddress's argument is ↵Bill Wendling2014-01-061-0/+12
| | | | | | | | | constant. This moves the check up into the parent class so that all targets can use it without having to copy (and keep in sync) the same error message. llvm-svn: 198579
* Fix a bug in DAGcombiner about zero-extend after setcc.Kevin Qin2013-12-301-1/+2
| | | | | | | | | | For AArch64 backend, if DAGCombiner see "sext(setcc)", it will combine them together to a single setcc with extended value type. Then if it see "zext(setcc)", it assumes setcc is Vxi1, and try to create "(and (vsetcc), (1, 1, ...)". While setcc isn't Vxi1, DAGcombiner will create wrong node and get wrong code emitted. llvm-svn: 198190
* Teach DAGCombiner how to fold a SIGN_EXTEND_INREG of a BUILD_VECTOR ofAndrea Di Biagio2013-12-272-0/+39
| | | | | | | | | | | | | | | | | | | | ConstantSDNodes (or UNDEFs) into a simple BUILD_VECTOR. For example, given the following sequence of dag nodes: i32 C = Constant<1> v4i32 V = BUILD_VECTOR C, C, C, C v4i32 Result = SIGN_EXTEND_INREG V, ValueType:v4i1 The SIGN_EXTEND_INREG node can be folded into a build_vector since the vector in input is a BUILD_VECTOR of constants. The optimized sequence is: i32 C = Constant<-1> v4i32 Result = BUILD_VECTOR C, C, C, C llvm-svn: 198084
* [stackprotector] Use analysis from the StackProtector pass for stack layout ↵Josh Magee2013-12-192-10/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | in PEI a nd LocalStackSlot passes. This changes the MachineFrameInfo API to use the new SSPLayoutKind information produced by the StackProtector pass (instead of a boolean flag) and updates a few pass dependencies (to preserve the SSP analysis). The stack layout follows the same approach used prior to this change - i.e., only LargeArray stack objects will be placed near the canary and everything else will be laid out normally. After this change, structures containing large arrays will also be placed near the canary - a case previously missed by the old implementation. Out of tree targets will need to update their usage of MachineFrameInfo::CreateStackObject to remove the MayNeedSP argument. The next patch will implement the rules for sspstrong and sspreq. The end goal is to support ssp-strong stack layout rules. WIP. Differential Revision: http://llvm-reviews.chandlerc.com/D2158 llvm-svn: 197653
* Add a machine code print in DEBUG() following instruction selection.Jim Grosbach2013-12-171-0/+3
| | | | | | | Make debugging ISel a bit easier by printing out a dump of the generated code at the end. llvm-svn: 197456
OpenPOWER on IntegriCloud