path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
...
* Now that instruction optzns can update the iterator as they go, we can have objectsize folding recursively simplify away their result when it folds. It is important to catch this here, because otherwise we won't eliminate the cross-block values at isel and other times.
  llvm-svn: 123524
  Author: Chris Lattner | Age: 2011-01-15 | Files: 1 | Lines: -10/+16
* make the current instruction iterator an ivar, allowing xforms that potentially invalidate it (like inline asm lowering) to be sunk into their proper place, cleaning up a ton of code.
  llvm-svn: 123523
  Author: Chris Lattner | Age: 2011-01-15 | Files: 1 | Lines: -35/+38
* implement an instcombine xform that canonicalizes casts outside of and-with-constant operations. This fixes rdar://8808586 which observed that we used to compile:

      union xy {
        struct x { _Bool b[15]; } x;
        __attribute__((packed))
        struct y {
          __attribute__((packed)) unsigned long b0to7;
          __attribute__((packed)) unsigned int b8to11;
          __attribute__((packed)) unsigned short b12to13;
          __attribute__((packed)) unsigned char b14;
        } y;
      };

      struct x foo(union xy *xy) { return xy->x; }

  into:

      _foo:                                   ## @foo
              movq    (%rdi), %rax
              movabsq $1095216660480, %rcx    ## imm = 0xFF00000000
              andq    %rax, %rcx
              movabsq $-72057594037927936, %rdx ## imm = 0xFF00000000000000
              andq    %rax, %rdx
              movzbl  %al, %esi
              orq     %rdx, %rsi
              movq    %rax, %rdx
              andq    $65280, %rdx            ## imm = 0xFF00
              orq     %rsi, %rdx
              movq    %rax, %rsi
              andq    $16711680, %rsi         ## imm = 0xFF0000
              orq     %rdx, %rsi
              movl    %eax, %edx
              andl    $-16777216, %edx        ## imm = 0xFFFFFFFFFF000000
              orq     %rsi, %rdx
              orq     %rcx, %rdx
              movabsq $280375465082880, %rcx  ## imm = 0xFF0000000000
              movq    %rax, %rsi
              andq    %rcx, %rsi
              orq     %rdx, %rsi
              movabsq $71776119061217280, %r8 ## imm = 0xFF000000000000
              andq    %r8, %rax
              orq     %rsi, %rax
              movzwl  12(%rdi), %edx
              movzbl  14(%rdi), %esi
              shlq    $16, %rsi
              orl     %edx, %esi
              movq    %rsi, %r9
              shlq    $32, %r9
              movl    8(%rdi), %edx
              orq     %r9, %rdx
              andq    %rdx, %rcx
              movzbl  %sil, %esi
              shlq    $32, %rsi
              orq     %rcx, %rsi
              movl    %edx, %ecx
              andl    $-16777216, %ecx        ## imm = 0xFFFFFFFFFF000000
              orq     %rsi, %rcx
              movq    %rdx, %rsi
              andq    $16711680, %rsi         ## imm = 0xFF0000
              orq     %rcx, %rsi
              movq    %rdx, %rcx
              andq    $65280, %rcx            ## imm = 0xFF00
              orq     %rsi, %rcx
              movzbl  %dl, %esi
              orq     %rcx, %rsi
              andq    %r8, %rdx
              orq     %rsi, %rdx
              ret

  We now compile this into:

      _foo:                                   ## @foo
      ## BB#0:                                ## %entry
              movzwl  12(%rdi), %eax
              movzbl  14(%rdi), %ecx
              shlq    $16, %rcx
              orl     %eax, %ecx
              shlq    $32, %rcx
              movl    8(%rdi), %edx
              orq     %rcx, %rdx
              movq    (%rdi), %rax
              ret

  A small improvement :-)
  llvm-svn: 123520
  Author: Chris Lattner | Age: 2011-01-15 | Files: 1 | Lines: -2/+12
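The commit message above does not spell out the exact pattern being matched, so the following is only a sketch of the arithmetic fact such a canonicalization can rely on: a truncating cast and an and-with-constant can be applied in either order with the same result. The helper names are hypothetical and are not LLVM APIs; the masks are the byte masks visible in the quoted assembly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (not LLVM APIs) modelling the two possible orders
// of a truncating cast and an and-with-constant on a 64-bit input.
constexpr std::uint32_t truncOfAnd(std::uint64_t x, std::uint64_t c) {
    return static_cast<std::uint32_t>(x & c);  // trunc(and(x, C))
}

constexpr std::uint32_t andOfTrunc(std::uint64_t x, std::uint64_t c) {
    // and(trunc(x), trunc(C))
    return static_cast<std::uint32_t>(x) & static_cast<std::uint32_t>(c);
}

// Spot-checks that both orders agree for a few byte masks.
void checkCastCommutesWithAnd() {
    const std::uint64_t x = 0x0123456789ABCDEFull;
    const std::uint64_t masks[] = {0xFF00ull, 0xFF0000ull, 0xFF00000000ull,
                                   0xFF000000000000ull};
    for (std::uint64_t m : masks)
        assert(truncOfAnd(x, m) == andOfTrunc(x, m));
}
```

Because the two forms are interchangeable, an optimizer is free to pick one as canonical so that later folds only have to recognize a single shape.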
* one more instcombine variant that is needed to work with future changes, no functionality change currently.
  llvm-svn: 123517
  Author: Chris Lattner | Age: 2011-01-15 | Files: 1 | Lines: -0/+9
* fix typo
  llvm-svn: 123516
  Author: Chris Lattner | Age: 2011-01-15 | Files: 1 | Lines: -1/+1
* Catch ~x < cst just like ~x < ~y, we currently handle this through means that are about to disappear.
  llvm-svn: 123515
  Author: Chris Lattner | Age: 2011-01-15 | Files: 1 | Lines: -4/+8
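The two comparison shapes named in r123515 rest on the fact that bitwise NOT reverses ordering. The sketch below is not the InstCombine code; it brute-forces the equivalences over all 8-bit values, under the assumption that the folds rewrite `~x < c` as `x > ~c` and `~x < ~y` as `y < x`. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Exhaustively checks, for every pair of values of the 8-bit type T, that
//   ~x < c   is equivalent to   x > ~c
//   ~x < ~y  is equivalent to   y < x
// Bitwise NOT reverses the ordering, which is why the operands swap sides.
template <typename T>
bool notCompareFoldsHold() {
    const int lo = std::numeric_limits<T>::min();
    const int hi = std::numeric_limits<T>::max();
    for (int xi = lo; xi <= hi; ++xi) {
        for (int yi = lo; yi <= hi; ++yi) {
            const T x = static_cast<T>(xi);
            const T y = static_cast<T>(yi);     // doubles as the constant c
            const T notX = static_cast<T>(~x);  // ~ promotes to int; narrow back
            const T notY = static_cast<T>(~y);
            if ((notX < y) != (x > notY)) return false;
            if ((notX < notY) != (y < x)) return false;
        }
    }
    return true;
}

void checkNotCompareFolds() {
    assert(notCompareFoldsHold<std::uint8_t>());  // unsigned comparison
    assert(notCompareFoldsHold<std::int8_t>());   // signed comparison
}
```

With a constant on the right-hand side, `~c` can be computed at compile time, so the NOT instruction disappears entirely.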
* reduce indentation
  llvm-svn: 123514
  Author: Chris Lattner | Age: 2011-01-15 | Files: 1 | Lines: -29/+29
* Generalize LoadAndStorePromoter a bit and switch LICM to use it.
  llvm-svn: 123501
  Author: Chris Lattner | Age: 2011-01-15 | Files: 3 | Lines: -190/+111
* Fix a false-positive warning.
  llvm-svn: 123480
  Author: Owen Anderson | Age: 2011-01-14 | Files: 1 | Lines: -1/+3
* Enhance GlobalOpt to be able to evaluate initializers that involve stores through bitcasts, at least in simple cases. This fixes clang's CodeGenCXX/virtual-base-dtor.cpp
  llvm-svn: 123477
  Author: Owen Anderson | Age: 2011-01-14 | Files: 1 | Lines: -2/+49
* switch SRoA to use LoadAndStorePromoter instead of its own copy of the code.
  llvm-svn: 123457
  Author: Chris Lattner | Age: 2011-01-14 | Files: 1 | Lines: -136/+26
* Add a new LoadAndStorePromoter class, which implements the general "promote a bunch of load and stores" logic, allowing the code to be shared and reused.
  llvm-svn: 123456
  Author: Chris Lattner | Age: 2011-01-14 | Files: 1 | Lines: -0/+154
* split SROA into two passes: one that uses DomFrontiers (-scalarrepl) and one that uses SSAUpdater (-scalarrepl-ssa)
  llvm-svn: 123436
  Author: Chris Lattner | Age: 2011-01-14 | Files: 2 | Lines: -27/+57
* Implement full support for promoting allocas to registers using SSAUpdater instead of DomTree/DomFrontier. This may be interesting for reducing compile time. This is currently disabled, but seems to work just fine. When this is enabled, we eliminate two runs of dominator frontier, one in the "early per-function" optimizations and one in the "interlaced with inliner" function passes.
  llvm-svn: 123434
  Author: Chris Lattner | Age: 2011-01-14 | Files: 1 | Lines: -5/+162
* indentation
  llvm-svn: 123426
  Author: Chris Lattner | Age: 2011-01-14 | Files: 1 | Lines: -1/+1
* Move some shift transforms out of instcombine and into InstructionSimplify.
  While there, I noticed that the transform "undef >>a X -> undef" was wrong. For example if X is 2 then the top two bits must be equal, so the result can not be anything. I fixed this in the constant folder as well.
  Also, I made the transform for "X << undef" stronger: it now folds to undef always, even though X might be zero. This is in accordance with the LangRef, but I must admit that it is fairly aggressive.
  Also, I added "i32 X << 32 -> undef" following the LangRef and the constant folder, likewise fairly aggressive.
  llvm-svn: 123417
  Author: Duncan Sands | Age: 2011-01-14 | Files: 1 | Lines: -26/+10
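The argument against "undef >>a X -> undef" can be checked by enumeration. The sketch below models an 8-bit arithmetic shift right by hand (the helper names are hypothetical) and searches for an input that produces a given result. A value whose top two bits differ, such as 0x40, is never produced by a shift of 2, so the shift result cannot stand for "any value".

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right on an 8-bit value, written out by hand so the
// result does not depend on how the compiler shifts negative numbers.
std::uint8_t ashr8(std::uint8_t v, unsigned amt) {
    assert(amt < 8);
    std::uint8_t shifted = static_cast<std::uint8_t>(v >> amt);  // logical part
    if (v & 0x80) {
        // Replicate the sign bit into the vacated high bits.
        shifted = static_cast<std::uint8_t>(shifted | (0xFF << (8 - amt)));
    }
    return shifted;
}

// True if some 8-bit input v makes ashr8(v, amt) equal to target.
bool reachable(std::uint8_t target, unsigned amt) {
    for (unsigned v = 0; v < 256; ++v)
        if (ashr8(static_cast<std::uint8_t>(v), amt) == target) return true;
    return false;
}
```

Here `reachable(0x40, 2)` is false, while values with equal top bits such as 0x1F or 0xE0 are reachable, which matches the reasoning in the commit message.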
* Fix whitespace.
  llvm-svn: 123396
  Author: Bob Wilson | Age: 2011-01-13 | Files: 1 | Lines: -120/+120
* Check for empty structs, and for consistency, zero-element arrays.
  llvm-svn: 123383
  Author: Bob Wilson | Age: 2011-01-13 | Files: 1 | Lines: -2/+2
* Extend SROA to handle arrays accessed as homogeneous structs and vice versa.
  This is a minor extension of SROA to handle a special case that is important for some ARM NEON operations. Some of the NEON intrinsics return multiple values, which are handled as struct types containing multiple elements of the same vector type. The corresponding return types declared in the arm_neon.h header have equivalent arrays. We need SROA to recognize that it can split up those arrays and structs into separate vectors, even though they are not always accessed with the same type.
  SROA already handles loads and stores of an entire alloca by using insertvalue/extractvalue to access the individual pieces, and that code works the same regardless of whether the type is a struct or an array. So, all that needs to be done is to check for compatible arrays and homogeneous structs.
  llvm-svn: 123381
  Author: Bob Wilson | Age: 2011-01-13 | Files: 1 | Lines: -14/+57
* Make SROA more aggressive with allocas containing padding.
  SROA only splits up structs and arrays one level at a time, so padding can only cause trouble if it is located in between the struct or array elements.
  llvm-svn: 123380
  Author: Bob Wilson | Age: 2011-01-13 | Files: 1 | Lines: -34/+27
* Use SmallVector instead of SmallPtrSet and avoid non-deterministic behavior.
  llvm-svn: 123318
  Author: Devang Patel | Age: 2011-01-12 | Files: 1 | Lines: -3/+3
* revert 123144, reenabling the rest of memset formation.
  llvm-svn: 123302
  Author: Chris Lattner | Age: 2011-01-12 | Files: 1 | Lines: -3/+0
* revert r123146 which disabled code that wasn't the root cause of the bootstrap miscompare issue.
  llvm-svn: 123299
  Author: Chris Lattner | Age: 2011-01-12 | Files: 1 | Lines: -2/+0
* revert r123149, reenabling an improvement to memcpyopt that wasn't the source of the bootstrap problem.
  llvm-svn: 123298
  Author: Chris Lattner | Age: 2011-01-12 | Files: 1 | Lines: -4/+2
* Remove the PR8954 workaround.
  llvm-svn: 123288
  Author: Jakob Stoklund Olesen | Age: 2011-01-11 | Files: 1 | Lines: -4/+0
* Fix a non-deterministic loop in llvm::MergeBlockIntoPredecessor.
  DT->changeImmediateDominator() trivially ignores identity updates, so there is really no need for the uniqueing provided by SmallPtrSet. I expect this to fix PR8954.
  llvm-svn: 123286
  Author: Jakob Stoklund Olesen | Age: 2011-01-11 | Files: 1 | Lines: -2/+2
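The reasoning in r123286 is that an update which leaves an entry unchanged is harmless, so duplicates in the work list do not need to be filtered by a pointer-keyed set whose iteration order depends on addresses. The sketch below is a toy model only: `IDomMap` and `retarget` are hypothetical stand-ins, not the LLVM `DominatorTree` API, and the message does not say which container replaced `SmallPtrSet`, so a `std::vector` visited in insertion order is an assumption.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a dominator tree (not the LLVM DominatorTree API):
// maps each block name to the name of its immediate dominator.
using IDomMap = std::map<std::string, std::string>;

// Re-points every listed block at newIDom, visiting the blocks in the order
// they were recorded. Assigning a value that is already there changes
// nothing, so a block that appears twice is simply an identity update.
void retarget(IDomMap &idom, const std::vector<std::string> &blocks,
              const std::string &newIDom) {
    for (const std::string &block : blocks)
        idom[block] = newIDom;
}

void checkDuplicatesAreHarmless() {
    IDomMap withDups = {{"b1", "old"}, {"b2", "old"}};
    IDomMap unique = withDups;
    retarget(withDups, {"b1", "b2", "b1"}, "pred");
    retarget(unique, {"b1", "b2"}, "pred");
    assert(withDups == unique);
}
```

Since the final state is the same with or without duplicates, a plain sequence gives a fixed visiting order at no cost in correctness.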
* Dial back the speculative fix for PR8954 a bit, so that we only recompute dominators once at the beginning of GVN instead of once per iteration.
  llvm-svn: 123278
  Author: Cameron Zwarich | Age: 2011-01-11 | Files: 1 | Lines: -1/+3
* Attempt to fix the bootstrap buildbot. Rafael says this works for him on x86-64 Linux.
  llvm-svn: 123270
  Author: Cameron Zwarich | Age: 2011-01-11 | Files: 1 | Lines: -0/+1
* Remove dead variable, const-ref-ize an APInt.
  llvm-svn: 123248
  Author: Owen Anderson | Age: 2011-01-11 | Files: 1 | Lines: -4/+1
* this pass claims to preserve scev, make sure to tell it about deletions.
  llvm-svn: 123247
  Author: Chris Lattner | Age: 2011-01-11 | Files: 1 | Lines: -0/+1
* Factor the actual simplification out of SimplifyIndirectBrOnSelect and into a new helper function so it can be reused in e.g. an upcoming SimplifySwitchOnSelect. No functional change.
  llvm-svn: 123234
  Author: Frits van Bommel | Age: 2011-01-11 | Files: 1 | Lines: -26/+37
* update memdep when an instruction is deleted. This code isn't actually reached in the testcase in PR8954, but it's safe and good practice.
  llvm-svn: 123224
  Author: Chris Lattner | Age: 2011-01-11 | Files: 1 | Lines: -2/+5
* when MergeBlockIntoPredecessor merges two blocks, update MemDep if it is floating around in the ether.
  llvm-svn: 123223
  Author: Chris Lattner | Age: 2011-01-11 | Files: 1 | Lines: -0/+4
* Fix FoldSingleEntryPHINodes to update memdep and AA when it deletes phi nodes. It is called from MergeBlockIntoPredecessor which is called from GVN, which claims to preserve these. I'm skeptical that this is the actual problem behind PR8954, but this is a stab in the right direction.
  llvm-svn: 123222
  Author: Chris Lattner | Age: 2011-01-11 | Files: 2 | Lines: -5/+21
* random cleanups
  llvm-svn: 123221
  Author: Chris Lattner | Age: 2011-01-11 | Files: 2 | Lines: -3/+2
* remove a bogus assertion: the latch block of a loop is not necessarily an uncond branch to the header. This fixes PR8955 (the assertion tripping).
  llvm-svn: 123219
  Author: Chris Lattner | Age: 2011-01-11 | Files: 1 | Lines: -6/+5
* Fix a random missed optimization by making InstCombine more aggressive when determining which bits are demanded by a comparison against a constant.
  llvm-svn: 123203
  Author: Owen Anderson | Age: 2011-01-11 | Files: 1 | Lines: -2/+40
* Teach instcombine about the rest of the SSE and SSE2 conversion intrinsics element dependencies. Reviewed by Nick.
  llvm-svn: 123161
  Author: Chandler Carruth | Age: 2011-01-10 | Files: 1 | Lines: -4/+11
* another random stab in the dark trying to fix llvm-gcc-i386-linux-selfhost
  llvm-svn: 123149
  Author: Chris Lattner | Age: 2011-01-10 | Files: 1 | Lines: -2/+4
* another (more) aggressive attempt to bring llvm-gcc-i386-linux-selfhost back to life.
  llvm-svn: 123146
  Author: Chris Lattner | Age: 2011-01-10 | Files: 1 | Lines: -0/+2
* temporarily disable memset formation from memsets in an effort to restore buildbot stability.
  llvm-svn: 123144
  Author: Chris Lattner | Age: 2011-01-09 | Files: 1 | Lines: -0/+3
* fix a few old bugs (found by inspection) where we would zap instructions without informing memdep. This could cause nondeterministic weirdness based on where instructions happen to get allocated, and will hopefully breathe some life into some broken testers.
  llvm-svn: 123124
  Author: Chris Lattner | Age: 2011-01-09 | Files: 1 | Lines: -1/+4
* Instcombine: Fix pattern where the sext did not dominate the icmp using it
  llvm-svn: 123121
  Author: Tobias Grosser | Age: 2011-01-09 | Files: 1 | Lines: -2/+7
* LoopInstSimplify preserves LoopSimplify.
  llvm-svn: 123117
  Author: Cameron Zwarich | Age: 2011-01-09 | Files: 1 | Lines: -0/+1
* reduce indentation. Print <nuw> and <nsw> when dumping SCEV AddRec's that have the bit set.
  llvm-svn: 123104
  Author: Chris Lattner | Age: 2011-01-09 | Files: 1 | Lines: -3/+2
* fix a latent bug in memcpyoptimizer that my recent patches exposed: it wasn't updating memdep when fusing stores together. This fixes the crash optimizing the bullet benchmark.
  llvm-svn: 123091
  Author: Chris Lattner | Age: 2011-01-08 | Files: 1 | Lines: -2/+4
* tryMergingIntoMemset can only handle constant length memsets.
  llvm-svn: 123090
  Author: Chris Lattner | Age: 2011-01-08 | Files: 1 | Lines: -5/+6
* Merge memsets followed by neighboring memsets and other stores into larger memsets. Among other things, this fixes rdar://8760394 and allows us to handle "Example 2" from http://blog.regehr.org/archives/320, compiling it into a single 4096-byte memset:

      _mad_synth_mute:                        ## @mad_synth_mute
      ## BB#0:                                ## %entry
              pushq   %rax
              movl    $4096, %esi             ## imm = 0x1000
              callq   ___bzero
              popq    %rax
              ret

  llvm-svn: 123089
  Author: Chris Lattner | Age: 2011-01-08 | Files: 1 | Lines: -3/+18
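The core of the merge described in r123089 is interval arithmetic: two fills of the same byte value whose ranges touch or overlap can be replaced by one fill over their union. The sketch below is a simplified model, not the MemCpyOptimizer code; `ByteFill` and `tryMergeFills` are hypothetical names, and it ignores everything the real pass must also check (aliasing, intervening instructions, non-constant lengths).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a memset or a run of constant stores:
// bytes [begin, end) relative to one base pointer are all set to `value`.
struct ByteFill {
    std::size_t begin;
    std::size_t end;
    std::uint8_t value;
};

// Tries to combine two fills into one. Succeeds only when both write the
// same byte value and the ranges touch or overlap, so that their union is
// a single contiguous range.
bool tryMergeFills(const ByteFill &a, const ByteFill &b, ByteFill &merged) {
    assert(a.begin <= a.end && b.begin <= b.end);
    if (a.value != b.value) return false;
    if (a.end < b.begin || b.end < a.begin) return false;  // gap between them
    merged = ByteFill{std::min(a.begin, b.begin), std::max(a.end, b.end),
                      a.value};
    return true;
}
```

For instance, a zero fill of bytes [0, 16) followed by a zero fill of bytes [16, 4096) merges into one zero fill of [0, 4096), which is the shape of the single 4096-byte memset in the assembly above.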
* fix an issue in IsPointerOffset that prevented us from recognizing that P and P+1 are relative to the same base pointer.
  llvm-svn: 123087
  Author: Chris Lattner | Age: 2011-01-08 | Files: 1 | Lines: -3/+19
* enhance memcpyopt to merge a store and a subsequent memset into a single larger memset.
  llvm-svn: 123086
  Author: Chris Lattner | Age: 2011-01-08 | Files: 1 | Lines: -53/+83