path: root/llvm/lib
Commit message  Author  Age  Files  Lines
...
* Instcombine: Fix pattern where the sext did not dominate the icmp using itTobias Grosser2011-01-091-2/+7
| | | | llvm-svn: 123121
* LoopInstSimplify preserves LoopSimplify.Cameron Zwarich2011-01-091-0/+1
| | | | llvm-svn: 123117
* Another missed memset in std::vector initialization.Chandler Carruth2011-01-091-0/+19
| | | | llvm-svn: 123116
* Eliminate some extra hash table lookups.Cameron Zwarich2011-01-091-7/+10
| | | | llvm-svn: 123115
* Add an informative comment.Cameron Zwarich2011-01-091-1/+9
| | | | llvm-svn: 123114
* Fix a cut-paste-o so that the sample code is correct for my last note.Chandler Carruth2011-01-091-2/+6
| | | | | | | Also, switch to a more clear 'sink' function with its declaration to avoid any confusion about 'g'. Thanks for the suggestion Frits. llvm-svn: 123113
* Another missed optimization of trivial vector code.Chandler Carruth2011-01-091-0/+33
| | | | llvm-svn: 123112
* Add a note about vector's size-constructor producing dead stores.Chandler Carruth2011-01-091-0/+55
| | | | llvm-svn: 123111
* Simplify LiveDebugVariables by storing MachineOperand copies locations insteadJakob Stoklund Olesen2011-01-091-169/+48
| | | | | | | | | | | | | | of using a Location class with the same information. When making a copy of a MachineOperand that was already stored in a MachineInstr, it is necessary to clear the parent pointer on the copy. Otherwise the register use-def lists become inconsistent. Add MachineOperand::clearParent() to do that. An alternative would be a custom MachineOperand copy constructor that cleared ParentMI. I didn't want to do that because of the performance impact. llvm-svn: 123109
* Shrink a BitVector that didn't mean to store bits for all physical registers.Jakob Stoklund Olesen2011-01-091-6/+4
| | | | llvm-svn: 123108
* Replace TargetRegisterInfo::printReg with a PrintReg class that also works Jakob Stoklund Olesen2011-01-0914-107/+54
| | | | | | | | | | without a TRI instance. Print virtual registers numbered from 0 instead of the arbitrary FirstVirtualRegister. The first virtual register is printed as %vreg0. TRI::NoRegister is printed as %noreg. llvm-svn: 123107
* Use IndexedMap for MachineRegisterInfo as well. No functional change.Jakob Stoklund Olesen2011-01-091-19/+22
| | | | llvm-svn: 123106
* teach SCEV analysis of PHI nodes that PHI recurrences formedChris Lattner2011-01-091-0/+5
| | | | | | | with GEP instructions are always NUW, because PHIs cannot wrap the end of the address space. llvm-svn: 123105
* reduce indentation. Print <nuw> and <nsw> when dumping SCEV AddRec'sChris Lattner2011-01-092-52/+56
| | | | | | that have the bit set. llvm-svn: 123104
* Add a note about a missed memset optimization from std::fill.Chandler Carruth2011-01-091-0/+30
| | | | llvm-svn: 123103
* Fix the last virtual register enumerations.Jakob Stoklund Olesen2011-01-082-7/+8
| | | | llvm-svn: 123102
* Fix VirtRegMap to use TRI::index2VirtReg and TRI::virtReg2Index instead ofJakob Stoklund Olesen2011-01-082-19/+27
| | | | | | | | depending on TRI::FirstVirtualRegister. Also use TRI::printReg instead of printing virtual registers directly. llvm-svn: 123101
* Fix a MachineVerifier loop that probably didn't mean to skip the last twoJakob Stoklund Olesen2011-01-081-2/+2
| | | | | | virtual registers. llvm-svn: 123100
* Use an IndexedMap for LiveVariables::VirtRegInfo.Jakob Stoklund Olesen2011-01-081-25/+10
| | | | | | | | Provide MRI::getNumVirtRegs() and TRI::index2VirtReg() functions to allow iteration over virtual registers without depending on the representation of virtual register numbers. llvm-svn: 123098
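The index mapping described in this commit can be pictured with a small standalone sketch. This is not the LLVM code: `kFirstVirtualRegister`, `VirtRegTable`, and `countMarked` are hypothetical names invented here, and the free functions only borrow the names `index2VirtReg` and `virtReg2Index` from the commit message. The point is that callers iterate over dense indices and never touch the base offset themselves.

```cpp
#include <cassert>
#include <vector>

// Hypothetical base value, chosen only for illustration. The real
// representation of virtual register numbers is an LLVM internal.
constexpr unsigned kFirstVirtualRegister = 1024;

// Dense index -> register number, and back again.
constexpr unsigned index2VirtReg(unsigned Index) {
  return Index + kFirstVirtualRegister;
}
constexpr unsigned virtReg2Index(unsigned Reg) {
  return Reg - kFirstVirtualRegister;
}

// A per-virtual-register table addressed by register number.
// The base offset is hidden inside operator[].
class VirtRegTable {
  std::vector<int> Storage;

public:
  explicit VirtRegTable(unsigned NumVirtRegs) : Storage(NumVirtRegs, 0) {}

  unsigned getNumVirtRegs() const {
    return static_cast<unsigned>(Storage.size());
  }

  int &operator[](unsigned Reg) { return Storage[virtReg2Index(Reg)]; }
};

// Visit every virtual register by index and count the non-zero entries.
unsigned countMarked(VirtRegTable &Table) {
  unsigned Count = 0;
  for (unsigned I = 0, E = Table.getNumVirtRegs(); I != E; ++I)
    if (Table[index2VirtReg(I)] != 0)
      ++Count;
  return Count;
}
```

If the numbering scheme changes, only the two conversion functions need to change; the iteration loop stays the same.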
* Use an IndexedMap for LiveOutRegInfo to hide its dependence on Jakob Stoklund Olesen2011-01-082-9/+4
| | | | | | TargetRegisterInfo::FirstVirtualRegister. llvm-svn: 123096
* Fix coding style.Cameron Zwarich2011-01-081-64/+64
| | | | llvm-svn: 123093
* fix a latent bug in memcpyoptimizer that my recent patches exposed: it wasn't Chris Lattner2011-01-081-2/+4
| | | | | | | updating memdep when fusing stores together. This fixes the crash optimizing the bullet benchmark. llvm-svn: 123091
* tryMergingIntoMemset can only handle constant length memsets.Chris Lattner2011-01-081-5/+6
| | | | llvm-svn: 123090
* Merge memsets followed by neighboring memsets and other stores intoChris Lattner2011-01-081-3/+18
| larger memsets. Among other things, this fixes rdar://8760394 and allows us to
| handle "Example 2" from http://blog.regehr.org/archives/320, compiling it into
| a single 4096-byte memset:
|
|     _mad_synth_mute:        ## @mad_synth_mute
|     ## BB#0:                ## %entry
|         pushq   %rax
|         movl    $4096, %esi     ## imm = 0x1000
|         callq   ___bzero
|         popq    %rax
|         ret
|
| llvm-svn: 123089
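As a rough illustration of what merging neighboring memsets means, the sketch below treats each memset or store as a byte range relative to one base pointer and merges two ranges when they touch or overlap and write the same byte value. `FillRange` and `tryMerge` are hypothetical names for this example only; the actual pass also has to reason about aliasing, alignment, and the instructions in between, none of which is modeled here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// A byte range [Start, End) relative to a single base pointer,
// filled with the byte Value. Hypothetical type for illustration.
struct FillRange {
  int64_t Start;
  int64_t End;
  uint8_t Value;
};

// Merge A and B into Out when they write the same byte value and
// leave no gap between them. Returns false (Out untouched) otherwise.
bool tryMerge(const FillRange &A, const FillRange &B, FillRange &Out) {
  if (A.Value != B.Value)
    return false;
  // A gap between the ranges means the merged fill would write
  // bytes that neither original operation wrote.
  if (B.Start > A.End || A.Start > B.End)
    return false;
  Out = FillRange{std::min(A.Start, B.Start), std::max(A.End, B.End), A.Value};
  return true;
}
```

Two 2048-byte zero fills at offsets 0 and 2048 merge into one 4096-byte fill, which matches the size in the assembly shown in the commit message.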
* fix an issue in IsPointerOffset that prevented us from recognizing thatChris Lattner2011-01-081-3/+19
| | | | | | P and P+1 are relative to the same base pointer. llvm-svn: 123087
* enhance memcpyopt to merge a store and a subsequentChris Lattner2011-01-081-53/+83
| | | | | | memset into a single larger memset. llvm-svn: 123086
* constify TargetData references.Chris Lattner2011-01-081-86/+96
| | | | | | | Split memset formation logic out into its own "tryMergingIntoMemset" helper function. llvm-svn: 123081
* When loop rotation happens, it is *very* common for the duplicated condbrChris Lattner2011-01-081-21/+48
| to be foldable into an uncond branch. When this happens, we can make a much
| simpler CFG for the loop, which is important for nested loop cases where we
| want the outer loop to be aggressively optimized. Handle this case more
| aggressively. For example, previously on phi-duplicate.ll we would get this:
|
|     define void @test(i32 %N, double* %G) nounwind ssp {
|     entry:
|       %cmp1 = icmp slt i64 1, 1000
|       br i1 %cmp1, label %bb.nph, label %for.end
|
|     bb.nph: ; preds = %entry
|       br label %for.body
|
|     for.body: ; preds = %bb.nph, %for.cond
|       %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
|       %arrayidx = getelementptr inbounds double* %G, i64 %j.02
|       %tmp3 = load double* %arrayidx
|       %sub = sub i64 %j.02, 1
|       %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
|       %tmp7 = load double* %arrayidx6
|       %add = fadd double %tmp3, %tmp7
|       %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
|       store double %add, double* %arrayidx10
|       %inc = add nsw i64 %j.02, 1
|       br label %for.cond
|
|     for.cond: ; preds = %for.body
|       %cmp = icmp slt i64 %inc, 1000
|       br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge
|
|     for.cond.for.end_crit_edge: ; preds = %for.cond
|       br label %for.end
|
|     for.end: ; preds = %for.cond.for.end_crit_edge, %entry
|       ret void
|     }
|
| Now we get the much nicer:
|
|     define void @test(i32 %N, double* %G) nounwind ssp {
|     entry:
|       br label %for.body
|
|     for.body: ; preds = %entry, %for.body
|       %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
|       %arrayidx = getelementptr inbounds double* %G, i64 %j.01
|       %tmp3 = load double* %arrayidx
|       %sub = sub i64 %j.01, 1
|       %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
|       %tmp7 = load double* %arrayidx6
|       %add = fadd double %tmp3, %tmp7
|       %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
|       store double %add, double* %arrayidx10
|       %inc = add nsw i64 %j.01, 1
|       %cmp = icmp slt i64 %inc, 1000
|       br i1 %cmp, label %for.body, label %for.end
|
|     for.end: ; preds = %for.body
|       ret void
|     }
|
| With all of these recent changes, we are now able to compile:
|
|     void foo(char *X) {
|       for (int i = 0; i != 100; ++i)
|         for (int j = 0; j != 100; ++j)
|           X[j+i*100] = 0;
|     }
|
| into a single memset of 10000 bytes. This series of changes should also be
| helpful for other nested loop scenarios as well.
|
| llvm-svn: 123079
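The claim that the nested loop is equivalent to one 10000-byte memset can be checked directly. The sketch below is only a source-level comparison, not a demonstration of what any particular compiler emits; `fooLoops`, `fooMemset`, and `loopsMatchMemset` are names made up for this example. The inner index `j + i*100` covers 0 through 9999 exactly once, so both functions write the same bytes.

```cpp
#include <cassert>
#include <cstring>

// The nested loop from the commit message, unchanged apart from the name.
void fooLoops(char *X) {
  for (int i = 0; i != 100; ++i)
    for (int j = 0; j != 100; ++j)
      X[j + i * 100] = 0;
}

// The single call the optimizer is said to reduce it to.
void fooMemset(char *X) { std::memset(X, 0, 10000); }

// Run both on buffers pre-filled with 1 and compare the results.
// Each buffer has one extra guard byte that must remain untouched.
bool loopsMatchMemset() {
  static char A[10001], B[10001];
  std::memset(A, 1, sizeof A);
  std::memset(B, 1, sizeof B);
  fooLoops(A);
  fooMemset(B);
  return std::memcmp(A, B, sizeof A) == 0 && A[10000] == 1;
}
```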
* make domtree verification print something useful on failure.Chris Lattner2011-01-081-1/+8
| | | | llvm-svn: 123078
* split ssa updating code out to its own helper function. Don't botherChris Lattner2011-01-081-74/+78
| | | | | | | moving the OrigHeader block anymore: we just merge it away anyway so its code layout doesn't matter. llvm-svn: 123077
* Implement a TODO: Enhance loopinfo to merge away the unconditional branchChris Lattner2011-01-081-11/+7
| | | | | | | | that it was leaving in loops after rotation (between the original latch block and the original header). With this change, it is possible for rotated loops to have just a single basic block, which is useful. llvm-svn: 123075
* various code cleanups, enhance MergeBlockIntoPredecessor to preserveChris Lattner2011-01-081-13/+10
| | | | | | loop info. llvm-svn: 123074
* inline preserveCanonicalLoopForm now that it is simple.Chris Lattner2011-01-081-39/+17
| | | | llvm-svn: 123073
* Three major changes:Chris Lattner2011-01-081-115/+20
| 1. Rip out LoopRotate's domfrontier updating code. It isn't needed now that
|    LICM doesn't use DF and it is super complex and gross.
| 2. Make DomTree updating code a lot simpler and faster. The old loop over all
|    the blocks was just to find a block??
| 3. Change the code that inserts the new preheader to just use
|    SplitCriticalEdge instead of doing an overcomplex reimplementation of it.
|
| No behavior change, except for the name of the inserted preheader.
|
| llvm-svn: 123072
* reduce nesting.Chris Lattner2011-01-081-6/+6
| | | | llvm-svn: 123071
* LoopRotate requires canonical loop form, so it always has preheadersChris Lattner2011-01-081-15/+11
| | | | | | | and latch blocks. Reorder entry conditions to make the pass faster and more logical. llvm-svn: 123069
* use the LI ivar.Chris Lattner2011-01-081-3/+2
| | | | llvm-svn: 123068
* some cleanups: remove dead arguments and eliminate ivarsChris Lattner2011-01-081-55/+36
| | | | | | that are just passed to one function. llvm-svn: 123067
* fix an issue duncan pointed out, which could cause loop rotateChris Lattner2011-01-081-12/+16
| | | | | | to violate LCSSA form llvm-svn: 123066
* Fix coding style issues.Cameron Zwarich2011-01-081-2/+2
| | | | llvm-svn: 123065
* Make more passes preserve dominators (or state that they preserve dominators ifCameron Zwarich2011-01-085-2/+49
| | | | | | | | they already do). This removes two dominator recomputations prior to isel, which is a 1% improvement in total llc time for 403.gcc. The only potentially suspect thing is making GCStrategy recompute dominators if it used a custom lowering strategy. llvm-svn: 123064
* First step in fixing PR8927:Rafael Espindola2011-01-087-10/+36
| | | | | | | | | | | | | | | | | Add an unnamed_addr bit to global variables and functions. This will be used to indicate that the address is not significant and therefore the constant or function can be merged with others. If an optimization pass can show that an address is not used, it can set this. Examples of things that can have this set by the FE are globals created to hold string literals and C++ constructors. Adding unnamed_addr to a non-const global should have no effect unless an optimization can transform that global into a constant. Aliases are not allowed to have unnamed_addr since I couldn't figure out any use for it. llvm-svn: 123063
* Contract subloop bodies. However, it is still important to visit the phis at theCameron Zwarich2011-01-081-7/+41
| | | | | | top of subloop headers, as the phi uses logically occur outside of the subloop. llvm-svn: 123062
* Fix a bug in r123034 (trying to sext/zext non-integers) and clean up a little.Frits van Bommel2011-01-081-5/+8
| | | | llvm-svn: 123061
* Have loop-rotate simplify instructions (yay instsimplify!) as it clonesChris Lattner2011-01-081-5/+21
| | | | | | | | | | | | them into the loop preheader, eliminating silly instructions like "icmp i32 0, 100" in fixed tripcount loops. This also better exposes the bigger problem with loop rotate that I'd like to fix: once this has been folded, the duplicated conditional branch *often* turns into an uncond branch. Not aggressively handling this is pessimizing later loop optimizations somethin' fierce by making "dominates all exit blocks" checks fail. llvm-svn: 123060
* Revamp the ValueMapper interfaces in a couple ways:Chris Lattner2011-01-086-164/+87
| 1. Take a flags argument instead of a bool. This makes it more clear to the
|    reader what it is used for.
| 2. Add a flag that says that "remapping a value not in the map is ok".
| 3. Reimplement MapValue to share a bunch of code and be a lot more efficient.
|    For lookup failures, don't drop null values into the map.
| 4. Using the new flag a bunch of code can vaporize in LinkModules and
|    LoopUnswitch, kill it.
|
| No functionality change.
|
| llvm-svn: 123058
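The interface shape described in this commit (a flags argument instead of a bool, a flag that tolerates values missing from the map, and lookups that never insert placeholder entries) might look roughly like the sketch below. All names here (`RemapFlags`, `RemapNone`, `RemapIgnoreMissing`, `ValueMap`, `mapValue`) are assumptions for illustration and plain `int` stands in for a value pointer; this is not the ValueMapper API.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical flag names. A named flag at the call site documents
// intent in a way that a bare `true` or `false` does not.
enum RemapFlags : unsigned {
  RemapNone = 0,
  RemapIgnoreMissing = 1u << 0, // a value absent from the map is acceptable
};

using ValueMap = std::unordered_map<int, int>;

// Look V up in VM. On a hit, Out receives the mapped value. On a miss,
// V passes through unchanged when RemapIgnoreMissing is set; otherwise
// the call fails. The map is taken by const reference, so a failed
// lookup can never leave a placeholder entry behind.
bool mapValue(const ValueMap &VM, int V, unsigned Flags, int &Out) {
  auto It = VM.find(V);
  if (It != VM.end()) {
    Out = It->second;
    return true;
  }
  if (Flags & RemapIgnoreMissing) {
    Out = V;
    return true;
  }
  return false;
}
```

Using `find` rather than `operator[]` is the design choice that keeps lookup failures from growing the map.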
* two minor changes: switch to the standard ValueToValueMapTyChris Lattner2011-01-081-2/+7
| | | | | | | | map from ValueMapper.h (giving us access to its utilities) and add a fastpath in the loop rotation code, avoiding expensive ssa updator manipulation for values with nothing to update. llvm-svn: 123057
* Recognize inline asm 'rev $0, $1' as a bswap intrinsic call.Evan Cheng2011-01-085-53/+101
| | | | llvm-svn: 123048
* Do not model all INLINEASM instructions as having unmodelled side effects.Evan Cheng2011-01-0716-37/+100
| | | | | | | | Instead encode LLVM IR level property "HasSideEffects" in an operand (shared with IsAlignStack). Added MachineInstr::hasUnmodeledSideEffects() to check the operand when the instruction is an INLINEASM. This allows memory instructions to be moved around INLINEASM instructions. llvm-svn: 123044
* Add an explanatory message for an assertion.Bob Wilson2011-01-071-1/+2
| | | | llvm-svn: 123042