path: root/llvm/lib/Transforms/InstCombine
* Propagate LeftDistributes and RightDistributes into their only uses. (Duncan Sands, 2010-11-23; 1 file, -6/+4)
  Stylistic improvement suggested by Frits van Bommel. llvm-svn: 120026
* Fix typo pointed out by Frits van Bommel and Marius Wachtler. (Duncan Sands, 2010-11-23; 1 file, -1/+1)
  llvm-svn: 120025
* Exploit distributive laws (eg: And distributes over Or, Mul over Add, etc) in a fairly systematic way in instcombine. (Duncan Sands, 2010-11-23; 4 files, -42/+133)
  Some of these cases were already dealt with, in which case I removed the
  existing code. The case of Add has a bunch of funky logic which covers some
  of this plus a few variants (considers shifts to be a form of
  multiplication), which I didn't touch. The simplification performed is:
  A*B+A*C -> A*(B+C). The improvement is to do this in cases that were not
  already handled [such as A*B-A*C -> A*(B-C), which was reported on the
  mailing list], and also to do it more often by not checking for "only one
  use" if "B+C" simplifies. llvm-svn: 120024
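  As a hedged illustration (the function names are mine, not from the
  commit), the source-level patterns this factoring targets look like:

    // Sketch only: instcombine does the factoring on LLVM IR; this is the
    // effect seen at the C level.
    unsigned factor_add(unsigned a, unsigned b, unsigned c) {
      return a * b + a * c;   // -> a * (b + c)
    }
    unsigned factor_sub(unsigned a, unsigned b, unsigned c) {
      return a * b - a * c;   // -> a * (b - c), the newly handled case
    }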
* Duncan's spider sense was right: I completely reversed the condition on this instcombine xform. (Chris Lattner, 2010-11-23; 1 file, -8/+8)
  This fixes a miscompilation of 403.gcc. llvm-svn: 119988
* InstCombine: Implement X - A*-B -> X + A*B. (Benjamin Kramer, 2010-11-22; 1 file, -0/+9)
  llvm-svn: 119984
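  A minimal sketch of the pattern in source form (illustrative function,
  not from the commit):

    // The negation folds into the multiply.
    int sub_of_negated_mul(int x, int a, int b) {
      return x - a * -b;   // instcombine rewrites this as x + a * b
    }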
* If a GEP index simply advances by multiples of a type of zero size, then replace the index with zero. (Duncan Sands, 2010-11-22; 1 file, -14/+24)
  llvm-svn: 119974
* Move the "gep undef" -> "undef" transform from instcombine toDuncan Sands2010-11-221-3/+0
| | | | | | InstructionSimplify. llvm-svn: 119970
* optimize: (Chris Lattner, 2010-11-21; 1 file, -2/+72)
    void a(int x) { if (((1<<x)&8)==0) b(); }
  into "x != 3", which occurs over 100 times in 403.gcc but in no other
  program in llvm-test. llvm-svn: 119922
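  A quick sanity check of the equivalence (my own test harness, not part of
  the commit): since 8 == 1<<3, (1<<x)&8 is nonzero exactly when x == 3 for
  in-range shift amounts.

    #include <assert.h>
    int main() {
      for (int x = 0; x < 32; ++x)   // shifting past 31 would be undefined
        assert((((1u << x) & 8) == 0) == (x != 3));
      return 0;
    }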
* InstCombine: Add a missing irem identity (X % X -> 0). (Benjamin Kramer, 2010-11-17; 1 file, -0/+4)
  llvm-svn: 119538
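  In source terms (sketch; note that x == 0 would be undefined behavior
  anyway, so folding to 0 is safe):

    int rem_self(int x) {
      return x % x;   // -> 0
    }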
* Move some of those Xor simplifications which don't require creating new instructions out of InstCombine and into InstructionSimplify. (Duncan Sands, 2010-11-17; 1 file, -31/+2)
  While there, introduce an m_AllOnes pattern to simplify matching with
  integers and vectors with all bits equal to one. llvm-svn: 119536
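  A hedged sketch of the new matcher in use (the wrapper function is mine,
  and the PatternMatch header path varies by LLVM version):

    #include "llvm/Support/PatternMatch.h"
    using namespace llvm;
    using namespace llvm::PatternMatch;

    // Recognize "xor X, -1" (a bitwise not of X). m_AllOnes() matches an
    // integer constant, or a vector constant, with every bit set.
    static bool isBitwiseNot(Value *V, Value *&X) {
      return match(V, m_Xor(m_Value(X), m_AllOnes()));
    }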
* Teach InstructionSimplify about phi nodes. (Duncan Sands, 2010-11-14; 1 file, -2/+3)
  I chose to have it simply offload the work to hasConstantValue rather than
  do something more complicated (such as handling mutually recursive phis)
  because (1) it is not clear it is worth it; and (2) if it is worth it,
  maybe such logic would be better placed in hasConstantValue. Adjust some
  GVN tests which are now cleaned up much further (eg: all phi nodes are
  removed). llvm-svn: 119043
* Generalize the reassociation transform in SimplifyCommutative (now renamed to SimplifyAssociativeOrCommutative). (Duncan Sands, 2010-11-13; 5 files, -46/+128)
  The transform "(A op C1) op C2" -> "A op (C1 op C2)", which previously was
  only done if C1 and C2 were constants, now occurs whenever "C1 op C2"
  simplifies (a la InstructionSimplify). Since the simplifying operand
  combination can no longer be assumed to be the right-hand terms, consider
  all of the possible permutations. When compiling "gcc as one big file",
  transform 2 (i.e. using right-hand operands) fires about 4000 times, but
  it has to be said that most of the time the simplifying operands are both
  constants. Transforms 3, 4 and 5 each fired once. Transform 6, which is an
  existing transform that I didn't change, never fired. With this change,
  the testcase is now optimized perfectly with one run of instcombine
  (previously it required instcombine + reassociate + instcombine, and it
  may just have been luck that this worked). llvm-svn: 119002
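  A sketch of the generalized behaviour (constants and the simplifying
  operand pair are illustrative):

    int constant_case(int a) {
      return (a + 3) + 5;    // classic case: C1 op C2 folds, -> a + 8
    }
    int simplify_case(int a, int b) {
      return (a | b) | b;    // new: "b | b" merely simplifies, -> a | b
    }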
* When checking that the necessary bits are zero in order to reduce ((x<<30)>>24) to x<<6, check the correct bits. (Dale Johannesen, 2010-11-10; 1 file, -2/+2)
  PR 8547. llvm-svn: 118665
* When folding away a (shl (shr)) pair, we need to check that the bits that will BECOME the low bits are zero, not that the current low bits are zero. (Owen Anderson, 2010-11-01; 1 file, -1/+1)
  Fixes <rdar://problem/8606771>. llvm-svn: 117953
* Clean up indentation and other whitespace. (Bob Wilson, 2010-10-29; 1 file, -11/+9)
  llvm-svn: 117728
* Remove trailing whitespace. (Bob Wilson, 2010-10-29; 1 file, -70/+69)
  llvm-svn: 117727
* Fix 80-column violation. (Bob Wilson, 2010-10-29; 1 file, -1/+2)
  llvm-svn: 117722
* Change instcombine's getShuffleMask to represent undef with negative values. (Bob Wilson, 2010-10-29; 1 file, -40/+36)
  This code had previously used 2*N, where N is the mask length, to
  represent undef. That is not safe because the shufflevector operands may
  have more than N elements -- they don't have to match the result type.
  llvm-svn: 117721
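  A sketch of the representation change (values illustrative):

    // A shuffle mask for a 4-element result. Old scheme: 2*N == 8 marked
    // an undef lane, which collides with a real index once an operand has
    // more than 4 elements. New scheme: a negative value marks undef and
    // can never be a valid element index.
    int Mask[4] = {0, 5, -1, 3};   // lane 2 is undef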
* Make instcombine a little more aggressive in combining vector shuffles. (Bob Wilson, 2010-10-29; 1 file, -15/+22)
  Allow splats even if they don't match either of the original shuffles,
  possibly due to undef entries in the shuffle masks. Radar 8597790. Also
  fix some 80-column violations. llvm-svn: 117719
* Teach InstCombine not to use Add and Neg on FP. (Dale Johannesen, 2010-10-27; 1 file, -1/+8)
  PR 8490. llvm-svn: 117510
* Fix a case where instcombine was stripping metadata (and alignment) from stores when folding in bitcasts. (Dan Gohman, 2010-10-25; 1 file, -1/+3)
  llvm-svn: 117265
* SmallVectorize. (Benjamin Kramer, 2010-10-23; 1 file, -3/+1)
  llvm-svn: 117213
* Teach instcombine to set the alignment arguments for NEON load/store intrinsics. (Bob Wilson, 2010-10-22; 1 file, -0/+26)
  llvm-svn: 117154
* Get rid of static constructors for pass registration. (Owen Anderson, 2010-10-19; 1 file, -1/+3)
  Instead, every pass exposes an initializeMyPassFunction(), which must be
  called in the pass's constructor. This function uses static dependency
  declarations to recursively initialize the pass's dependencies. Clients
  that only create passes through the createFooPass() APIs will require no
  changes. Clients that want to use the CommandLine options for passes will
  need to manually call the appropriate initialization functions in
  PassInitialization.h before parsing commandline arguments. I have tested
  this with all standard configurations of clang and llvm-gcc on Darwin. It
  is possible that there are problems with the static dependencies that
  will only be visible with non-standard options. If you encounter any
  crash in pass registration/creation, please send the testcase to me
  directly. llvm-svn: 116820
* Now with fewer extraneous semicolons! (Owen Anderson, 2010-10-07; 1 file, -1/+1)
  llvm-svn: 115996
* Add initialization routines to InstCombine. (Owen Anderson, 2010-10-07; 1 file, -0/+9)
  llvm-svn: 115965
* fix PR8267 - Instcombine shouldn't optimize away volatile memcpy's. (Chris Lattner, 2010-10-01; 1 file, -1/+6)
  llvm-svn: 115296
* Removed a bunch of unnecessary target_link_libraries. (Oscar Fuentes, 2010-09-28; 1 file, -2/+0)
  llvm-svn: 114999
* Revert "CMake: Get rid of LLVMLibDeps.cmake and export the libraries normally."Michael J. Spencer2010-09-131-7/+2
| | | | | | | | | | This reverts commit r113632 Conflicts: cmake/modules/AddLLVM.cmake llvm-svn: 113819
* Re-apply r113679, which was reverted in r113720, which added a pair of new instcombine transforms to expose greater opportunities for store narrowing in codegen. (Owen Anderson, 2010-09-13; 1 file, -5/+31)
  This patch fixes a potential infinite loop in instcombine caused by one of
  the introduced transforms being overly aggressive. llvm-svn: 113763
* Revert 113679, it was causing an infinite loop in a testcase that I've sent on to Owen. (Eric Christopher, 2010-09-12; 1 file, -30/+5)
  llvm-svn: 113720
* Invert and-of-or into or-of-and when doing so would allow us to clear bits of the and's mask. (Owen Anderson, 2010-09-11; 1 file, -5/+30)
  This can result in increased opportunities for store narrowing in code
  generation. Update a number of tests for this change. This fixes
  <rdar://problem/8285027>. Additionally, because this inverts the order of
  ors and ands, some patterns for optimizing or-of-and-of-or no longer fire
  in instances where they did originally. Add a simple transform which
  recaptures most of these opportunities: if we have an or-of-constant-or
  and have failed to fold away the inner or, commute the order of the two
  ors, to give the non-constant or a chance for simplification instead.
  llvm-svn: 113679
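  A sketch with illustrative constants (not taken from the commit): bits
  the inner "or" forces to one can be dropped from the outer "and" mask.

    unsigned f(unsigned x) {
      return (x | 0xFF) & 0xFFF;
      // distributes to (x & 0xFFF) | 0xFF, and the mask then shrinks:
      // -> (x & 0xF00) | 0xFF
    }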
* CMake: Get rid of LLVMLibDeps.cmake and export the libraries normally. (Michael J. Spencer, 2010-09-10; 1 file, -2/+7)
  llvm-svn: 113632
* This transform is also performed by InstructionSimplify, remove the duplicate. (Benjamin Kramer, 2010-09-10; 1 file, -3/+0)
  llvm-svn: 113608
* Generalize instcombine's support for combining multiple bit checks into a single test. (Owen Anderson, 2010-09-08; 1 file, -32/+278)
  Patch by Dirk Steinke! llvm-svn: 113423
* Fix a serious performance regression introduced by r108687 on linux: (Chris Lattner, 2010-09-07; 1 file, -1/+6)
  turning (fptrunc (sqrt (fpext x))) -> (sqrtf x) is great, but we have to
  delete the original sqrt as well. Not doing so causes us to do two sqrt's
  when building with -fmath-errno (the default on linux). llvm-svn: 113260
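  The source-level shape of the pattern (a sketch; whether the fold fires
  also depends on the math/errno settings in effect):

    #include <math.h>
    float root(float x) {
      return (float)sqrt((double)x);  // fptrunc(sqrt(fpext x)) -> sqrtf(x)
    }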
* Remove r111665, which implemented store-narrowing in InstCombine. (Owen Anderson, 2010-08-31; 1 file, -47/+0)
  Chris discovered a miscompilation in it, and it's not easily fixable at
  the optimizer level. I'll investigate reimplementing it in DAGCombine.
  llvm-svn: 112575
* for completeness, allow undef also. (Chris Lattner, 2010-08-28; 1 file, -0/+3)
  llvm-svn: 112351
* handle the constant case of vector insertion. (Chris Lattner, 2010-08-28; 1 file, -3/+32)
  For something like this:
    struct S { float A, B, C, D; };
    struct S g;
    struct S bar() {
      struct S A = g;
      ++A.B;
      A.A = 42;
      return A;
    }
  we now generate:
    _bar:                       ## @bar
    ## BB#0:                    ## %entry
      movq    _g@GOTPCREL(%rip), %rax
      movss   12(%rax), %xmm0
      pshufd  $16, %xmm0, %xmm0
      movss   4(%rax), %xmm2
      movss   8(%rax), %xmm1
      pshufd  $16, %xmm1, %xmm1
      unpcklps %xmm0, %xmm1
      addss   LCPI1_0(%rip), %xmm2
      pshufd  $16, %xmm2, %xmm2
      movss   LCPI1_1(%rip), %xmm0
      pshufd  $16, %xmm0, %xmm0
      unpcklps %xmm2, %xmm0
      ret
  instead of:
    _bar:                       ## @bar
    ## BB#0:                    ## %entry
      movq    _g@GOTPCREL(%rip), %rax
      movss   12(%rax), %xmm0
      pshufd  $16, %xmm0, %xmm0
      movss   4(%rax), %xmm2
      movss   8(%rax), %xmm1
      pshufd  $16, %xmm1, %xmm1
      unpcklps %xmm0, %xmm1
      addss   LCPI1_0(%rip), %xmm2
      movd    %xmm2, %eax
      shlq    $32, %rax
      addq    $1109917696, %rax  ## imm = 0x42280000
      movd    %rax, %xmm0
      ret
  llvm-svn: 112345
* optimize bitcasts from large integers to vector into vector element insertion from the pieces that feed into the vector. (Chris Lattner, 2010-08-28; 2 files, -11/+129)
  This handles a pattern that occurs frequently due to code generated for
  the x86-64 abi. We now compile something like this:
    struct S { float A, B, C, D; };
    struct S g;
    struct S bar() {
      struct S A = g;
      ++A.A;
      ++A.C;
      return A;
    }
  into all nice vector operations:
    _bar:                       ## @bar
    ## BB#0:                    ## %entry
      movq    _g@GOTPCREL(%rip), %rax
      movss   LCPI1_0(%rip), %xmm1
      movss   (%rax), %xmm0
      addss   %xmm1, %xmm0
      pshufd  $16, %xmm0, %xmm0
      movss   4(%rax), %xmm2
      movss   12(%rax), %xmm3
      pshufd  $16, %xmm2, %xmm2
      unpcklps %xmm2, %xmm0
      addss   8(%rax), %xmm1
      pshufd  $16, %xmm1, %xmm1
      pshufd  $16, %xmm3, %xmm2
      unpcklps %xmm2, %xmm1
      ret
  instead of icky integer operations:
    _bar:                       ## @bar
      movq    _g@GOTPCREL(%rip), %rax
      movss   LCPI1_0(%rip), %xmm1
      movss   (%rax), %xmm0
      addss   %xmm1, %xmm0
      movd    %xmm0, %ecx
      movl    4(%rax), %edx
      movl    12(%rax), %esi
      shlq    $32, %rdx
      addq    %rcx, %rdx
      movd    %rdx, %xmm0
      addss   8(%rax), %xmm1
      movd    %xmm1, %eax
      shlq    $32, %rsi
      addq    %rax, %rsi
      movd    %rsi, %xmm1
      ret
  This resolves rdar://8360454. llvm-svn: 112343
* Enhance the shift propagator to handle the case when you have: (Chris Lattner, 2010-08-27; 1 file, -22/+56)
    A = shl x, 42
    ...
    B = lshr ..., 38
  which can be transformed into:
    A = shl x, 4
    ...
  iff we can prove that the would-be-shifted-in bits are already zero. This
  eliminates two shifts in the testcase and allows elimination of the whole
  i128 chain in the real example. llvm-svn: 112314
* Implement a pretty general logical shift propagation framework, which is good at ripping through bitfield operations. (Chris Lattner, 2010-08-27; 2 files, -2/+227)
  This generalizes a bunch of the existing xforms that instcombine does,
  such as (x << c) >> c -> and, to handle intermediate logical nodes. This
  is useful for ripping up the "promote to large integer" code produced by
  SRoA. llvm-svn: 112304
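  The base case in source form (a sketch, assuming 32-bit unsigned):

    unsigned clear_high_byte(unsigned x) {
      return (x << 8) >> 8;   // -> x & 0x00FFFFFF
    }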
* remove some special shift cases that have been subsumed into the more general simplify-demanded-bits logic. (Chris Lattner, 2010-08-27; 1 file, -34/+13)
  llvm-svn: 112291
* teach the truncation optimization that an entire chain of computation can be truncated if it is fed by a sext/zext that doesn't have to be exactly equal to the truncation result type. (Chris Lattner, 2010-08-27; 1 file, -0/+5)
  llvm-svn: 112285
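  A sketch of the shape this enables (widths chosen for illustration):

    // The i16 -> i64 zext feeding the chain is wider than the i32 result
    // of the truncation, yet the whole computation can still be narrowed.
    unsigned narrow(unsigned short a, unsigned short b) {
      unsigned long long wa = a, wb = b;   // zext i16 -> i64
      return (unsigned)(wa * wb + wa);     // trunc i64 -> i32
    }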
* Add an instcombine to clean up a common pattern produced by the SRoA "promote to large integer" code. (Chris Lattner, 2010-08-27; 1 file, -0/+23)
  This eliminates some type conversions like this:
    %94 = zext i16 %93 to i32    ; <i32> [#uses=2]
    %96 = lshr i32 %94, 8        ; <i32> [#uses=1]
    %101 = trunc i32 %96 to i8   ; <i8> [#uses=1]
  This also unblocks other xforms from happening; now clang is able to
  compile:
    struct S { float A, B, C, D; };
    float foo(struct S A) { return A.A + A.B+A.C+A.D; }
  into:
    _foo:                       ## @foo
    ## BB#0:                    ## %entry
      pshufd  $1, %xmm0, %xmm2
      addss   %xmm0, %xmm2
      movdqa  %xmm1, %xmm3
      addss   %xmm2, %xmm3
      pshufd  $1, %xmm1, %xmm0
      addss   %xmm3, %xmm0
      ret
  on x86-64, instead of:
    _foo:                       ## @foo
    ## BB#0:                    ## %entry
      movd    %xmm0, %rax
      shrq    $32, %rax
      movd    %eax, %xmm2
      addss   %xmm0, %xmm2
      movapd  %xmm1, %xmm3
      addss   %xmm2, %xmm3
      movd    %xmm1, %rax
      shrq    $32, %rax
      movd    %eax, %xmm0
      addss   %xmm3, %xmm0
      ret
  This seems pretty close to optimal to me, at least without using
  horizontal adds. This also triggers in lots of other code, including
  SPEC. llvm-svn: 112278
* optimize "integer extraction out of the middle of a vector" as producedChris Lattner2010-08-261-13/+35
| | | | | | | by SRoA. This is part of rdar://7892780, but needs another xform to expose this. llvm-svn: 112232
* optimize bitcast(trunc(bitcast(x))) where the result is a float and 'x' is a vector to be a vector element extraction. (Chris Lattner, 2010-08-26; 1 file, -0/+34)
  This allows clang to compile:
    struct S { float A, B, C, D; };
    float foo(struct S A) { return A.A + A.B+A.C+A.D; }
  into:
    _foo:                       ## @foo
    ## BB#0:                    ## %entry
      movd    %xmm0, %rax
      shrq    $32, %rax
      movd    %eax, %xmm2
      addss   %xmm0, %xmm2
      movapd  %xmm1, %xmm3
      addss   %xmm2, %xmm3
      movd    %xmm1, %rax
      shrq    $32, %rax
      movd    %eax, %xmm0
      addss   %xmm3, %xmm0
      ret
  instead of:
    _foo:                       ## @foo
    ## BB#0:                    ## %entry
      movd    %xmm0, %rax
      movd    %eax, %xmm0
      shrq    $32, %rax
      movd    %eax, %xmm2
      addss   %xmm0, %xmm2
      movd    %xmm1, %rax
      movd    %eax, %xmm1
      addss   %xmm2, %xmm1
      shrq    $32, %rax
      movd    %eax, %xmm0
      addss   %xmm1, %xmm0
      ret
  ... eliminating half of the horribleness. llvm-svn: 112227
* Re-apply r111568 with a fix for the clang self-host. (Owen Anderson, 2010-08-20; 1 file, -0/+47)
  llvm-svn: 111665
* Revert r111568 to unbreak clang self-host. (Owen Anderson, 2010-08-19; 1 file, -45/+0)
  llvm-svn: 111571
* When a set of bitmask operations, typically from a bitfield initialization, only modifies the low bytes of a value, we can narrow the store to only overwrite the affected bytes. (Owen Anderson, 2010-08-19; 1 file, -0/+45)
  llvm-svn: 111568
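  A sketch of the kind of code this targets (the layout and constants are
  illustrative):

    struct Packed { unsigned flags; };
    void set_low_byte(struct Packed *p) {
      // The load/mask/or/store of the full 32-bit word only changes the
      // low byte, so the store can be narrowed to a single-byte store.
      p->flags = (p->flags & ~0xFFu) | 0x05u;
    }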