path: root/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Commit message | Author | Date | Files | Lines
...
* - When DAG combiner is folding a bit convert into a BUILD_VECTOR, it should ... | Evan Cheng | 2008-02-18 | 1 | -5/+8
    check if it's essentially a SCALAR_TO_VECTOR. Avoid turning (v8i16) <10, u, u, u>
    into <10, 0, u, u, u, u, u, u>. Instead, simply convert it to a SCALAR_TO_VECTOR
    of the proper type.
    - X86 now normalizes SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)).
    Get rid of X86ISD::S2VEC.
    llvm-svn: 47290
* teach dag combiner how to eliminate MERGE_VALUES nodes. | Chris Lattner | 2008-02-13 | 1 | -0/+14
    llvm-svn: 47052
* Add an isBigEndian method to complement isLittleEndian. | Duncan Sands | 2008-02-11 | 1 | -4/+4
    llvm-svn: 46954
* Return "(c1 + c2)" instead of yet another ADD node (which made this aBill Wendling2008-02-101-1/+1
| | | | | | no-op). llvm-svn: 46922
* the world doesn't need my debugging code. | Chris Lattner | 2008-02-03 | 1 | -1/+0
    llvm-svn: 46678
* Change the 'global modification' APIs in SelectionDAG to take a new ... | Chris Lattner | 2008-02-03 | 1 | -117/+124
    DAGUpdateListener object pointer instead of just returning a vector of deleted
    nodes. This makes the interfaces more efficient (no more allocating a vector
    [at least a malloc], filling it in, then walking it) and cleaner. This also
    allows the client to be notified of nodes that are *changed* but not deleted.
    llvm-svn: 46677
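A minimal C++ sketch of the listener interface this commit describes; the member
names are inferred from the commit message, not verified against the 2008 headers:

    class SDNode;

    // Clients hand one of these to the mutating SelectionDAG APIs and get
    // callbacks as the DAG changes, instead of receiving a vector of deleted
    // nodes after the fact.
    struct DAGUpdateListener {
      virtual ~DAGUpdateListener() {}
      virtual void NodeDeleted(SDNode *N) = 0;  // N was removed from the DAG
      virtual void NodeUpdated(SDNode *N) = 0;  // N was changed but not deleted
    };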
* Factor the addressing mode and the load/store VT out of LoadSDNode ... | Dan Gohman | 2008-01-30 | 1 | -26/+26
    and StoreSDNode into their common base class LSBaseSDNode. Member functions
    getLoadedVT and getStoredVT are replaced with the common getMemoryVT to
    simplify code that will handle both loads and stores.
    llvm-svn: 46538
* Use empty() instead of comparing size() with zero. | Dan Gohman | 2008-01-29 | 1 | -1/+1
    llvm-svn: 46514
* Fix PowerPC/./2007-10-18-PtrArithmetic.ll | Chris Lattner | 2008-01-27 | 1 | -7/+14
    llvm-svn: 46424
* fix a crash on CodeGen/X86/vector-rem.ll | Chris Lattner | 2008-01-27 | 1 | -4/+6
    llvm-svn: 46422
* Implement some dag combines that allow doing fneg/fabs/fcopysign in integer ... | Chris Lattner | 2008-01-27 | 1 | -2/+79
    registers if used by a bitconvert or using a bitconvert. This allows us to
    avoid constant pool loads and use cheaper integer instructions when the
    values come from or end up in integer regs anyway. For example, we now
    compile CodeGen/X86/fp-in-intregs.ll to:

    _test1:
            movl $2147483648, %eax
            xorl 4(%esp), %eax
            ret
    _test2:
            movl $1065353216, %eax
            orl 4(%esp), %eax
            andl $3212836864, %eax
            ret

    Instead of:

    _test1:
            movss 4(%esp), %xmm0
            xorps LCPI2_0, %xmm0
            movd %xmm0, %eax
            ret
    _test2:
            movss 4(%esp), %xmm0
            andps LCPI3_0, %xmm0
            movss LCPI3_1, %xmm1
            andps LCPI3_2, %xmm1
            orps %xmm0, %xmm1
            movd %xmm1, %eax
            ret

    bitconverts can happen due to various calling conventions that require fp
    values to be passed in integer regs in some cases, e.g. when returning a
    complex.
    llvm-svn: 46414
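The trick these combines exploit, as a small C++ sketch (function name invented;
assumes 32-bit unsigned): negating an IEEE float is just flipping its sign bit, so
once the value sits in an integer register the FP op becomes a plain integer xor:

    #include <cstring>

    float fneg_via_int(float f) {
      unsigned u;
      std::memcpy(&u, &f, sizeof u);  // bitconvert f32 -> i32
      u ^= 0x80000000u;               // flip the sign bit (xorl instead of xorps)
      std::memcpy(&f, &u, sizeof f);  // bitconvert back
      return f;                       // == -f; fabs would use u &= 0x7FFFFFFFu
    }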
* Infer alignment of loads and increase their alignment when we can tell they are ... | Chris Lattner | 2008-01-26 | 1 | -4/+33
    from the stack. This allows us to compile stack-align.ll to:

    _test:
            movsd LCPI1_0, %xmm0
            movapd %xmm0, %xmm1
    ***     andpd 4(%esp), %xmm1
            andpd _G, %xmm0
            addsd %xmm1, %xmm0
            movl 20(%esp), %eax
            movsd %xmm0, (%eax)
            ret

    instead of:

    _test:
            movsd LCPI1_0, %xmm0
    **      movsd 4(%esp), %xmm1
    **      andpd %xmm0, %xmm1
            andpd _G, %xmm0
            addsd %xmm1, %xmm0
            movl 20(%esp), %eax
            movsd %xmm0, (%eax)
            ret

    llvm-svn: 46401
* Fix some bugs in SimplifyNodeWithTwoResults where it would call deletenode to ... | Chris Lattner | 2008-01-26 | 1 | -41/+31
    delete a node even if it was not dead in some cases. Instead, just add it to
    the worklist. Also, make sure to use the CombineTo methods, as it was doing
    things that were unsafe: the top level combine loop could touch dangling
    memory.

    This fixes CodeGen/Generic/2008-01-25-dag-combine-mul.ll
    llvm-svn: 46384
* reduce indentation | Chris Lattner | 2008-01-25 | 1 | -42/+44
    llvm-svn: 46377
* Add skeletal code to increase the alignment of loads and stores when ... | Chris Lattner | 2008-01-25 | 1 | -0/+38
    we can infer it. This will eventually help stuff, though it doesn't do much
    right now because all fixed FI's have an alignment of 1.
    llvm-svn: 46349
* clarify a comment, thanks Duncan. | Chris Lattner | 2008-01-24 | 1 | -2/+3
    llvm-svn: 46313
* Fix this buggy transformation. Two observations: | Chris Lattner | 2008-01-24 | 1 | -17/+8
    1. We already know the value is dead, so don't bother replacing it with undef.
    2. The very case the comment describes actually makes the load live, which
       asserts in deletenode. If we do the replacement and the node becomes live,
       just treat it as new.

    This fixes a failure on X86/2008-01-16-InvalidDAGCombineXform.ll with some
    local changes in my tree.
    llvm-svn: 46306
* The dag combiner is missing revisiting nodes that it really should, and thus ... | Chris Lattner | 2008-01-24 | 1 | -0/+5
    leaving dead stuff around. This gets fed into the isel pass and prevents
    certain foldings from happening because nodes have extraneous uses floating
    around. For example, if we turned foo(bar(x)) -> baz(x), we sometimes left
    bar(x) around.
    llvm-svn: 46305
* fold fp_round(fp_round(x)) -> fp_round(x). | Chris Lattner | 2008-01-24 | 1 | -0/+9
    llvm-svn: 46304
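What the fold looks like at the source level, as an illustrative C++ sketch (not
the commit's testcase): the intermediate round can be dropped because double's
53-bit significand is wide enough that rounding through it to float cannot
introduce a double-rounding error relative to rounding straight to float.

    float two_rounds(long double x) {
      double d = (double)x;   // inner fp_round: f80 -> f64
      return (float)d;        // outer fp_round: f64 -> f32
    }                         // combined into a single fp_round: f80 -> f32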
* This commit changes: | Chris Lattner | 2008-01-17 | 1 | -10/+20
    1. Legalize now always promotes truncstore of i1 to i8.
    2. Remove patterns and gunk related to truncstore i1 from targets.
    3. Rename the StoreXAction stuff to TruncStoreAction in TLI.
    4. Make the TLI TruncStoreAction table a 2d table to handle from/to
       conversions.
    5. Mark a wide variety of invalid truncstores as such in various targets,
       e.g. X86 currently doesn't support truncstore of any of its integer types.
    6. Add legalize support for truncstores with invalid value input types.
    7. Add a dag combine transform to turn store(truncate) into truncstore when
       safe.

    The latter allows us to compile CodeGen/X86/storetrunc-fp.ll to:

    _foo:
            fldt 20(%esp)
            fldt 4(%esp)
            faddp %st(1)
            movl 36(%esp), %eax
            fstps (%eax)
            ret

    instead of:

    _foo:
            subl $4, %esp
            fldt 24(%esp)
            fldt 8(%esp)
            faddp %st(1)
            fstps (%esp)
            movl 40(%esp), %eax
            movss (%esp), %xmm0
            movss %xmm0, (%eax)
            addl $4, %esp
            ret

    llvm-svn: 46140
* code cleanups, no functionality change. | Chris Lattner | 2008-01-17 | 1 | -7/+9
    llvm-svn: 46126
* * Introduce a new SelectionDAG::getIntPtrConstant method ... | Chris Lattner | 2008-01-17 | 1 | -11/+25
      and switch various codegen pieces and the X86 backend over to using it.
    * Add some comments to SelectionDAGNodes.h
    * Introduce a second argument to FP_ROUND, which indicates whether the
      FP_ROUND changes the value of its input. If not, it is safe to xform
      things like fp_extend(fp_round(x)) -> x.
    llvm-svn: 46125
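Why the flag is needed, shown as a small C++ illustration (function name
invented): in general fp_extend(fp_round(x)) is not the identity, so the fold is
only legal when the FP_ROUND node is known to preserve its input's value.

    double round_then_extend(double d) {
      float f = (float)d;   // fp_round f64 -> f32: loses precision in general,
      return (double)f;     //   e.g. d = 0.1 comes back as ~0.100000001490116
    }                       // only foldable to 'd' when the round is known exact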
* Fixes a nasty dag combiner bug that causes a bunch of tests to fail at -O0. | Evan Cheng | 2008-01-16 | 1 | -6/+44
    It's not safe to use the two value CombineTo variant to combine away a dead
    load. e.g.

    v1, chain2 = load chain1, loc
    v2, chain3 = load chain2, loc
    v3         = add v2, c

    Now we replace use of v1 with undef, use of chain2 with chain1.
    ReplaceAllUsesWith() will iterate through uses of the first load and update
    operands:

    v1, chain2 = load chain1, loc
    v2, chain3 = load chain1, loc
    v3         = add v2, c

    Now the second load is the same as the first load, so SelectionDAG cse will
    ensure the use of the second load is replaced with the first load.

    v1, chain2 = load chain1, loc
    v3         = add v1, c

    Then v1 is replaced with undef and bad things happen.
    llvm-svn: 46099
* Factor the ReachesChainWithoutSideEffects out of dag combiner into ... | Chris Lattner | 2008-01-16 | 1 | -28/+1
    a public SDOperand::reachesChainWithoutSideEffects method. No functionality
    change.
    llvm-svn: 46050
* Make load->store deletion a bit smarter. This allows us to compile this: | Chris Lattner | 2008-01-08 | 1 | -3/+33
    void test(long long *P) { *P ^= 1; }

    into just:

    _test:
            movl 4(%esp), %eax
            xorl $1, (%eax)
            ret

    instead of code like this:

    _test:
            movl 4(%esp), %ecx
            xorl $1, (%ecx)
            movl 4(%ecx), %edx
            movl %edx, 4(%ecx)
            ret

    llvm-svn: 45762
* Remove attribution from file headers, per discussion on llvmdev. | Chris Lattner | 2007-12-29 | 1 | -2/+2
    llvm-svn: 45418
* make sure not to zap volatile stores, thanks a lot to Dale for noticing this! | Chris Lattner | 2007-12-29 | 1 | -1/+2
    llvm-svn: 45402
* don't fold fp_round(fp_extend(load)) -> fp_round(extload) | Chris Lattner | 2007-12-29 | 1 | -0/+4
    llvm-svn: 45400
* Delete a store whose input is a load from the same pointer: | Chris Lattner | 2007-12-29 | 1 | -1/+12
    x = load p
    store x -> p
    llvm-svn: 45398
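In C++ terms, the pattern this combine removes looks like the sketch below
(function name invented); the store just writes back the value that was read, so
it is dead, provided nothing intervenes and the access is not volatile:

    void copy_in_place(int *p) {
      int x = *p;  // x = load p
      *p = x;      // store x -> p: redundant, deleted by the combine
    }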
* Tell TargetLoweringOpt whether it is running before ... | Chris Lattner | 2007-12-22 | 1 | -1/+1
    or after legalize.
    llvm-svn: 45321
* Don't leave newly created nodes around if it turns out they are not needed. | Evan Cheng | 2007-12-19 | 1 | -2/+4
    llvm-svn: 45186
* Redo previous patch so the optimization is only done for i1. | Dale Johannesen | 2007-12-06 | 1 | -16/+4
    Simpler and safer.
    llvm-svn: 44663
* third time around: instead of disabling this completely, ... | Chris Lattner | 2007-12-06 | 1 | -6/+13
    only disable it if we don't know it will be obviously profitable. Still
    fixme, but less so. :)
    llvm-svn: 44658
* Actually, disable this code for now. More analysis and improvements to ... | Chris Lattner | 2007-12-06 | 1 | -0/+6
    the X86 backend are needed before this should be enabled by default.
    llvm-svn: 44657
* implement a readme entry, compiling the code into: | Chris Lattner | 2007-12-06 | 1 | -19/+65
    _foo:
            movl $12, %eax
            andl 4(%esp), %eax
            movl _array(%eax), %eax
            ret

    instead of:

    _foo:
            movl 4(%esp), %eax
            shrl $2, %eax
            andl $3, %eax
            movl _array(,%eax,4), %eax
            ret

    As it turns out, this triggers all the time, in a wide variety of situations;
    for example, I see diffs like this in various programs:

    -       movl 8(%eax), %eax
    -       shll $2, %eax
    -       andl $1020, %eax
    -       movl (%esi,%eax), %eax
    +       movzbl 8(%eax), %eax
    +       movl (%esi,%eax,4), %eax

    -       shll $2, %edx
    -       andl $1020, %edx
    -       movl (%edi,%edx), %edx
    +       andl $255, %edx
    +       movl (%edi,%edx,4), %edx

    Unfortunately, I also see stuff like this, which can be fixed in the X86
    backend:

    -       andl $85, %ebx
    -       addl _bit_count(,%ebx,4), %ebp
    +       shll $2, %ebx
    +       andl $340, %ebx
    +       addl _bit_count(%ebx), %ebp

    llvm-svn: 44656
* Fix PR1842. | Dale Johannesen | 2007-12-06 | 1 | -4/+16
    llvm-svn: 44649
* Don't lower srem/urem X%C to X-X/C*C unless the division is actually ... | Dan Gohman | 2007-11-26 | 1 | -14/+18
    optimized. This avoids creating illegal divisions when the combiner is
    running after legalize; this fixes PR1815. Also, it produces better code in
    the included testcase by avoiding the subtract and multiply when the division
    isn't optimized.
    llvm-svn: 44341
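The identity behind the lowering, as a small C++ sketch (function name and the
divisor 10 are invented for illustration): the remainder is rebuilt from the
quotient, which only pays off when the division itself gets strength-reduced,
e.g. into a magic-number multiply.

    unsigned rem10(unsigned x) {
      unsigned q = x / 10;  // only worthwhile if this division is optimized
      return x - q * 10;    // == x % 10, by the identity X % C == X - (X / C) * C
    }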
* Move MinAlign to MathExtras.h. | Duncan Sands | 2007-11-09 | 1 | -1/+0
    llvm-svn: 43944
* Fix some load/store logic that would be wrong for ... | Duncan Sands | 2007-11-09 | 1 | -4/+8
    apints on big-endian machines if the bitwidth is not a multiple of 8.
    Introduce a new helper, MVT::getStoreSizeInBits, and use it.
    llvm-svn: 43934
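A sketch of what such a helper computes (written here as a free function; the
real method lives on MVT and may be phrased differently): the in-memory size
rounds the bit width up to a whole number of bytes, so an i1 or an i7 apint
occupies 8 bits of storage.

    unsigned getStoreSizeInBits(unsigned SizeInBits) {
      return ((SizeInBits + 7) / 8) * 8;  // e.g. 1 -> 8, 7 -> 8, 36 -> 40
    }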
* If both parts of smul_lohi, etc. are used, don't simplify. If only one part ... | Evan Cheng | 2007-11-08 | 1 | -30/+31
    is used, try to simplify it.
    llvm-svn: 43888
* Typo. | Evan Cheng | 2007-10-30 | 1 | -1/+1
    llvm-svn: 43511
* Fix a DAGCombiner abort on a bitcast from a scalar to a vector. | Dan Gohman | 2007-10-29 | 1 | -1/+2
    llvm-svn: 43470
* Enable the (sext (load x)) -> (sext (truncate (sextload x))) fold in more ... | Evan Cheng | 2007-10-29 | 1 | -24/+134
    cases. Previously, the transformation was restricted by ensuring that the
    number of load uses is one. Now the restriction is loosened by allowing setcc
    uses to be "extended" (e.g. setcc x, c, eq -> setcc sext(x), sext(c), eq).
    llvm-svn: 43465
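An illustrative C++ example of the shape of code this helps (names invented):
the narrow load has two users, a compare and a widening use; once the compare's
operands are sign-extended, every user wants the extended value, so the load can
become a single sign-extending load.

    int find(short *p, short key) {
      short x = *p;   // i16 load with two uses
      if (x == key)   // setcc use: both operands can be sign-extended,
        return 1;     //   (setcc x, key, eq) -> (setcc sext(x), sext(key), eq)
      return x;       // sext use: the load becomes one sextload
    }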
* The guaranteed alignment of ptr+offset is only the minimum of ... | Duncan Sands | 2007-10-28 | 1 | -9/+13
    offset and the alignment of ptr, if these are both powers of 2. While the ptr
    alignment is guaranteed to be a power of 2, there is no reason to think that
    offset is. For example, if offset is 12 (the size of a long double on x86-32
    linux) and the alignment of ptr is 8, then the alignment of ptr+offset will
    in general be 4, not 8.

    Introduce a function MinAlign, lifted from gcc, for computing the minimum
    guaranteed alignment. I've tried to fix up everywhere under
    lib/CodeGen/SelectionDAG/. I also changed some places that weren't wrong
    (because both values were a power of 2), as a defensive change against people
    copying and pasting the code.

    Hopefully someone who cares about alignment will review the rest of LLVM and
    fix up the remaining places. Since I'm on x86 I'm not very motivated to do
    this myself...
    llvm-svn: 43421
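A sketch of such a helper (LLVM's actual MinAlign may be written differently):
the minimum guaranteed alignment of ptr+offset is the largest power of 2 that
divides both values, i.e. the lowest set bit of their bitwise OR.

    #include <cstdint>

    uint64_t MinAlign(uint64_t A, uint64_t B) {
      return (A | B) & (~(A | B) + 1);  // lowest set bit; MinAlign(8, 12) == 4
    }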
* Redo "last ppc long double fix" as Chris wants.Dale Johannesen2007-10-191-1/+1
| | | | llvm-svn: 43189
* More ppcf128 issues (maybe the last)? | Dale Johannesen | 2007-10-19 | 1 | -1/+1
    llvm-svn: 43160
* Disable attempts to constant fold PPC f128. | Dale Johannesen | 2007-10-16 | 1 | -12/+16
    Remove the assumption that this will happen from various places.
    llvm-svn: 43053
* One mundane change: Change ReplaceAllUsesOfValueWith to *optionally* ... | Chris Lattner | 2007-10-15 | 1 | -19/+14
    take a deleted nodes vector, instead of requiring it.

    One more significant change: Implement the start of a legalizer that just
    works on types. This legalizer is designed to run before the operation
    legalizer and ensure just that the input dag is transformed into an output
    dag whose operand and result types are all legal, even if the operations on
    those types are not.

    This design/impl has the following advantages:

    1. When finished, this will *significantly* reduce the amount of code in
       LegalizeDAG.cpp. It will remove all the code related to promotion and
       expansion as well as splitting and scalarizing vectors.
    2. The new code is very simple, idiomatic, and modular: unlike
       LegalizeDAG.cpp, it has no 3000 line long functions. :)
    3. The implementation is completely iterative instead of recursive, good for
       hacking on large dags without blowing out your stack.
    4. The implementation updates nodes in place when possible instead of
       deallocating and reallocating the entire graph that points to some mutated
       node.
    5. The code nicely separates out handling of operations with invalid results
       from operations with invalid operands, making some cases simpler and
       easier to understand.
    6. The new -debug-only=legalize-types option is very very handy :), allowing
       you to easily understand what legalize types is doing.

    This is not yet done. Until the ifdef added to SelectionDAGISel.cpp is
    enabled, this does nothing. However, this code is sufficient to legalize all
    of the code in 186.crafty, olden and freebench on an x86 machine. The biggest
    issues are:

    1. Vectors aren't implemented at all yet
    2. SoftFP is a mess, I need to talk to Evan about it.
    3. No lowering to libcalls is implemented yet.
    4. Various operations are missing etc.
    5. There are FIXME's for stuff I hax0r'd out, like softfp.

    Hey, at least it is a step in the right direction :). If you'd like to help,
    just enable the #ifdef in SelectionDAGISel.cpp and compile code with it. If
    this explodes it will tell you what needs to be implemented. Help is
    certainly appreciated.

    Once this goes in, we can do three things:

    1. Add a new pass of dag combine between the "type legalizer" and "operation
       legalizer" passes. This will let us catch some long-standing isel issues
       that we miss because operation legalization often obfuscates the dag with
       target-specific nodes.
    2. We can rip out all of the type legalization code from LegalizeDAG.cpp,
       making it much smaller and simpler. When that happens we can then
       reimplement the core functionality left in it in a much more efficient and
       non-recursive way.
    3. Once the whole legalizer is non-recursive, we can implement whole-function
       selectiondags maybe...
    llvm-svn: 42981
* Enhance the truncstore optimization code to handle shifted ... | Chris Lattner | 2007-10-13 | 1 | -2/+21
    values and propagate demanded bits through them in simple cases. This allows
    this code:

    void foo(char *P) {
       strcpy(P, "abc");
    }

    to compile to:

    _foo:
            ldrb r3, [r1]
            ldrb r2, [r1, #+1]
            ldrb r12, [r1, #+2]!
            ldrb r1, [r1, #+1]
            strb r1, [r0, #+3]
            strb r2, [r0, #+1]
            strb r12, [r0, #+2]
            strb r3, [r0]
            bx lr

    instead of:

    _foo:
            ldrb r3, [r1, #+3]
            ldrb r2, [r1, #+2]
            orr r3, r2, r3, lsl #8
            ldrb r2, [r1, #+1]
            ldrb r1, [r1]
            orr r2, r1, r2, lsl #8
            orr r3, r2, r3, lsl #16
            strb r3, [r0]
            mov r2, r3, lsr #24
            strb r2, [r0, #+3]
            mov r2, r3, lsr #16
            strb r2, [r0, #+2]
            mov r3, r3, lsr #8
            strb r3, [r0, #+1]
            bx lr

    testcase here: test/CodeGen/ARM/truncstore-dag-combine.ll

    This also helps occasionally for X86 and other cases not involving unaligned
    load/stores.
    llvm-svn: 42954
* Add a simple optimization to simplify the input to ... | Chris Lattner | 2007-10-13 | 1 | -0/+42
    truncate and truncstore instructions, based on the knowledge that they don't
    demand the top bits.
    llvm-svn: 42952
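The demanded-bits reasoning, as a small C++ illustration (function name and
constant invented): a truncate to i8 only reads the low 8 bits, so an operation
that affects only higher bits of its input can be simplified away.

    unsigned char low_byte(unsigned x) {
      unsigned y = x | 0xFF00;   // touches only bits 8..15...
      return (unsigned char)y;   // ...which the i8 truncate never reads, so
    }                            // this folds to (unsigned char)x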