summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* Allow targets to inject passes before the virtual register rewriter.Jakob Stoklund Olesen2012-06-261-1/+5
| | | | | | | | Such passes can be used to tweak the register assignments in a target-dependent way, for example to avoid write-after-write dependencies. llvm-svn: 159209
* Update a bunch of stale comments that dated from when this folled theChandler Carruth2012-06-261-14/+11
| | | | | | | very first (and worst) placement algorithm. These should now more accurately reflect the reality of the pass. llvm-svn: 159185
* Enable the new LoopInfo algorithm by default.Andrew Trick2012-06-261-9/+1
| | | | | | | | | | | | | | | | | | | | | | | | The primary advantage is that loop optimizations will be applied in a stable order. This helps debugging and unit test creation. It is also a better overall implementation without pathologically bad performance on deep functions. On large functions (llvm-stress --size=200000 | opt -loops) Before: 0.1263s After: 0.0225s On deep functions (after tweaking llvm-stress, thanks Nadav): Before: 0.2281s After: 0.0227s See r158790 for more comments. The loop tree is now consistently generated in forward order, but loop passes are applied in reverse order over the program. If we have a loop optimization that prefers forward order, that can easily be achieved by adding a different type of LoopPassManager. llvm-svn: 159183
* Make sure type is not extended or untyped before create a constant of the ↵Evan Cheng2012-06-261-0/+4
| | | | | | type. No test case. Found by inspection. llvm-svn: 159179
* Enforce stricter liveness rules for PHIs.Jakob Stoklund Olesen2012-06-251-6/+11
| | | | | | | | | | | | | Verify that all paths from the entry block to a virtual register read pass through a def. Enable this check even when MRI->isSSA() is false. Verify that the live range of a virtual register is live out of all predecessor blocks, even for PHI-values. This requires that PHIElimination sometimes inserts IMPLICIT_DEF instruction in predecessor blocks. llvm-svn: 159150
* Run ProcessImplicitDefs on SSA form where it can be much simpler.Jakob Stoklund Olesen2012-06-252-262/+99
| | | | | | | | | | | Implicitly defined virtual registers can simply have the <undef> bit set on all uses, and copies can be turned into implicit defs recursively. Physical registers are a bit trickier. We handle the common case where a physreg def is used by a nearby instruction in the same basic block. For more complicated cases, just leave the IMPLICIT_DEF instruction in. llvm-svn: 159149
* Teach PHIElimination to handle <undef> operands.Jakob Stoklund Olesen2012-06-251-19/+34
| | | | | | | | | When a PHI use is <undef>, don't emit a copy in the predecessor block, but insert an IMPLICIT_DEF instruction instead. This ensures that virtual register uses are always jointly dominated by defs, even if some of them are IMPLICIT_DEF. llvm-svn: 159121
* Handle <undef> operands in TwoAddressInstructionPass.Jakob Stoklund Olesen2012-06-251-12/+31
| | | | | | | | | | | | | | | | | When the source register to a 2-addr instruction is undefined, there is no need to attempt any transformations - simply replace the source register with the destination register. This also comes up when lowering IMPLICIT_DEF instructions - make sure the <undef> flag is moved to the new partial register def operand: %vreg8<def> = INSERT_SUBREG %vreg9<undef>, %vreg0<kill>, sub_16bit rewrite undef: %vreg8<def> = INSERT_SUBREG %vreg8<undef>, %vreg0<kill>, sub_16bit convert to: %vreg8:sub_16bit<def,read-undef> = COPY %vreg0<kill> llvm-svn: 159120
* llvm/lib: [CMake] Add explicit dependency to intrinsics_gen.NAKAMURA Takumi2012-06-242-0/+4
| | | | llvm-svn: 159112
* DAG legalisation can now handle illegal fma vector types by scalarisationPete Cooper2012-06-242-0/+32
| | | | llvm-svn: 159092
* Teach LiveVariables to handle <undef> operands.Jakob Stoklund Olesen2012-06-231-3/+5
| | | | | | | | It's simple: Don't treat <undef> operands as uses, and don't assume a virtual register has a defining instruction unless a real use has been seen. llvm-svn: 159061
* Remove ProcessImplicitDefs.h which was unused.Jakob Stoklund Olesen2012-06-221-2/+27
| | | | | | The ProcessImplicitDefs class can be local to its implementation file. llvm-svn: 159041
* Also verify the def index for early clobbers.Jakob Stoklund Olesen2012-06-221-2/+3
| | | | llvm-svn: 159039
* Delete a boring statistic.Jakob Stoklund Olesen2012-06-221-6/+0
| | | | llvm-svn: 159030
* Store live intervals in an IndexedMap.Jakob Stoklund Olesen2012-06-221-14/+8
| | | | | | It is both smaller and faster than DenseMap. llvm-svn: 159029
* Revert r158679 - use case is unclear (and it increases the memory footprint).Hal Finkel2012-06-222-8/+8
| | | | | | | | | | Original commit message: Allow up to 64 functional units per processor itinerary. This patch changes the type used to hold the FU bitset from unsigned to uint64_t. This will be needed for some upcoming PowerPC itineraries. llvm-svn: 159027
* Fix a crash in --debug code.Jakob Stoklund Olesen2012-06-221-2/+6
| | | | | | Don't try to print out the live range of a physreg. llvm-svn: 159021
* Don't depend on live ranges being present.Jakob Stoklund Olesen2012-06-221-3/+8
| | | | | | | DBG_VALUE instructions could be referring to non-existing virtual registers. llvm-svn: 159020
* Simplify handleMove() a bit.Jakob Stoklund Olesen2012-06-221-4/+4
| | | | | | | There is no need to check for physreg live ranges. They don't exist any more. llvm-svn: 159019
* Stop computing physreg live ranges.Jakob Stoklund Olesen2012-06-221-189/+1
| | | | | | Everyone is using on-demand regunit ranges now. llvm-svn: 159018
* Remove some redundant LIS->hasInterval() checks.Jakob Stoklund Olesen2012-06-221-22/+0
| | | | | | | These functions only operate on virtual registers now, and they all have live ranges. llvm-svn: 159015
* Use MRI::isConstantPhysReg() to check remat feasibility.Jakob Stoklund Olesen2012-06-221-4/+8
| | | | | | | Don't depend on LiveIntervals::hasInterval() to determine if a physreg is reserved and constant. llvm-svn: 159013
* Use regunit liveness to guide LiveDebugVariables.Jakob Stoklund Olesen2012-06-221-5/+18
| | | | | | This should produce the same results as using physreg liveness directly. llvm-svn: 159009
* Remove LiveIntervals::trackingRegUnits().Jakob Stoklund Olesen2012-06-223-71/+13
| | | | | | | | | With regunit liveness permanently enabled, this function would always return true. Also remove now obsolete code for checking physreg interference. llvm-svn: 159006
* Remove another duplicated variable. We only need one to tell us if the linkerRafael Espindola2012-06-221-1/+1
| | | | | | knows dwarf or not. llvm-svn: 158993
* Fix a FIXME: DwarfRequiresRelocationForSectionOffset is the same asRafael Espindola2012-06-221-1/+1
| | | | | | DwarfUsesRelocationsAcrossSections. llvm-svn: 158992
* Emit relocations for DW_AT_location entries on systems which need it. This isNick Lewycky2012-06-222-11/+16
| | | | | | a recommit of r127757. Fixes PR9493. Patch by Paul Robinson! llvm-svn: 158957
* Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from aLang Hames2012-06-222-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | boolean flag to an enum: { Fast, Standard, Strict } (default = Standard). This option controls the creation by optimizations of fused FP ops that store intermediate results in higher precision than IEEE allows (E.g. FMAs). The behavior of this option is intended to match the behaviour specified by a soon-to-be-introduced frontend flag: '-ffuse-fp-ops'. Fast mode - allows formation of fused FP ops whenever they're profitable. Standard mode - allow fusion only for 'blessed' FP ops. At present the only blessed op is the fmuladd intrinsic. In the future more blessed ops may be added. Strict mode - allow fusion only if/when it can be proven that the excess precision won't effect the result. Note: This option only controls formation of fused ops by the optimizers. Fused operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic) will always be honored, regardless of the value of this option. Internally TargetOptions::AllowExcessFPPrecision has been replaced by TargetOptions::AllowFPOpFusion. llvm-svn: 158956
* The inline asm operand modifier 'n' is suppose Jack Carter2012-06-211-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | to be generic across architectures. It has the following description in the gnu sources: Negate the immediate constant Several Architectures such as x86 have local implementations of operand modifier 'n' which go beyond the above description slightly. This won't affect them. Affected files: lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp Added 'n' to the switch cases. test/CodeGen/Generic/asm-large-immediate.ll Generic compiled test (x86 for me) test/CodeGen/Mips/asm-large-immediate.ll Mips compiled version of the generic one Contributer: Jack Carter llvm-svn: 158939
* Fix potential crash if DAGCombine on stores sees a half typePete Cooper2012-06-211-1/+2
| | | | llvm-svn: 158927
* The inline asm operand modifier 'c' is suppose Jack Carter2012-06-211-3/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to be generic across architectures. It has the following description in the gnu sources: Substitute immediate value without immediate syntax Several Architectures such as x86 have local implementations of operand modifier 'c' which go beyond the above description slightly. To make use of the generic modifiers without overriding local implementation one can make a call to the base class method for AsmPrinter::PrintAsmOperand() in the locally derived method's "default" case in the switch statement. That way if it is already defined locally the generic version will never get called. This change is needed when test/CodeGen/generic/asm-large-immediate.ll failed on a native Mips board. The test was assuming a generic implementation was in place. Affected files: lib/Target/Mips/MipsAsmPrinter.cpp: Changed the default case to call the base method. lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp Added 'c' to the switch cases. test/CodeGen/Mips/asm-large-immediate.ll Mips compiled version of the generic one Contributer: Jack Carter llvm-svn: 158925
* Emit a single _udivmodsi4 libcall instead of two separate _udivsi3 andEvan Cheng2012-06-211-5/+7
| | | | | | | | _umodsi3 libcalls if they have the same arguments. This optimization was apparently broken if one of the node was replaced in place. rdar://11714607 llvm-svn: 158900
* Update regunits in RegisterCoalescer::reMaterializeTrivialDef.Jakob Stoklund Olesen2012-06-211-6/+4
| | | | | | Old code would only update physreg live intervals. llvm-svn: 158881
* Remove spurious typedefs.Jakob Stoklund Olesen2012-06-201-3/+0
| | | | llvm-svn: 158878
* Remove the RenderMachineFunction HTML output pass.Jakob Stoklund Olesen2012-06-206-1374/+0
| | | | | | | I don't think anyone has been using this functionality for a while, and it is getting in the way of refactoring now. llvm-svn: 158876
* Remove the -live-regunits command line option.Jakob Stoklund Olesen2012-06-201-12/+4
| | | | | | Register allocators depend on it being permanently enabled now. llvm-svn: 158873
* Fix some more LiveInterval enumerations.Jakob Stoklund Olesen2012-06-202-13/+12
| | | | | | Deterministically enumerate the virtual registers instead. llvm-svn: 158872
* Remove LiveIntervalUnions from RegAllocBase.Jakob Stoklund Olesen2012-06-204-167/+14
| | | | | | They are living in LiveRegMatrix now. llvm-svn: 158868
* Convert RAGreedy to LiveRegMatrix interference checking.Jakob Stoklund Olesen2012-06-204-92/+169
| | | | | | | | | | | | | | | | | Stop depending on the LiveIntervalUnions in RegAllocBase, they are about to be removed. The changes are mostly replacing register alias iterators with regunit iterators, and querying LiveRegMatrix instrad of RegAllocBase. InterferenceCache is converted to work with per-regunit LiveIntervalUnions, and it checks fixed regunit interference separately, using the fixed live intervals provided by LiveIntervalAnalysis. The local splitting helper calcGapWeights() is also considering fixed regunit interference which is kept on the side now. llvm-svn: 158867
* Convert RABasic to using LiveRegMatrix interference checking.Jakob Stoklund Olesen2012-06-203-69/+67
| | | | | | | Stop using the LiveIntervalUnions provided by RegAllocBase, they will be removed soon. llvm-svn: 158866
* Enable register unit liveness by default.Jakob Stoklund Olesen2012-06-201-1/+1
| | | | | | Soon we won't need to compute live intervals for physical registers. llvm-svn: 158865
* Teach PBQPBuilder::build() about regunit interference.Jakob Stoklund Olesen2012-06-201-33/+31
| | | | | | | Filter out physreg candidates with regunit interferrence. Also compute regmask interference more efficiently. llvm-svn: 158864
* Avoid iterating with LiveIntervals::iterator.Jakob Stoklund Olesen2012-06-203-41/+45
| | | | | | | | | | That is a DenseMap iterator keyed by pointers, so the iteration order is nondeterministic. I would like to replace the DenseMap with an IndexedMap which doesn't allow iteration. llvm-svn: 158856
* Add users of a MERGE_VALUE node to the worklist to process again when the ↵Pete Cooper2012-06-201-0/+3
| | | | | | node is removed. Sorry, no test case. Foudn it by inspection of the code llvm-svn: 158839
* Only update regunit live ranges that have been precomputed.Jakob Stoklund Olesen2012-06-201-4/+8
| | | | | | | | | | | Regunit live ranges are computed on demand, so when mi-sched calls handleMove, some regunits may not have live ranges yet. That makes updating them easier: Just skip the non-existing ranges. They will be computed correctly from the rescheduled machine code when they are needed. llvm-svn: 158831
* Delete dead code.Jakob Stoklund Olesen2012-06-201-48/+0
| | | | llvm-svn: 158827
* Fix DAGCombine to deal with ext-conversion of pre/post_inc loads.Hal Finkel2012-06-201-1/+8
| | | | | | The test case for this will come with the PPC indexed preinc loads commit. llvm-svn: 158822
* Fixing a compiler warning in MSVC 10.Aaron Ballman2012-06-201-1/+1
| | | | llvm-svn: 158820
* Fix two rather subtle internal vs. external linker issues.Chandler Carruth2012-06-201-25/+20
| | | | | | | | | | | | | | | | | | | | | | I'll admit I'm not entirely satisfied with this change, but it seemed the cleanest option. Other suggestions quite welcome The issue is that the traits specializations have static methods which return the typedef'ed PHI_iterator type. In both the IR and MI layers this is typedef'ed to a custom iterator class defined in an anonymous namespace giving the types and the functions returning them internal linkage. However, because the traits specialization is defined in the 'llvm' namespace (where it has to be, specialized template lives there), and is in turn used in the templated implementation of the SSAUpdater. This led to the linkage conflict that Clang now warns about. The simplest solution to me was just to define the PHI_iterator as a nested class inside the trait specialization. That way it still doesn't get scoped widely, it can't be accidentally reused somewhere, etc. This is a little gross just because nested class definitions are a little gross, but the alternatives seem more ad-hoc. llvm-svn: 158799
* A new algorithm for computing LoopInfo. Temporarily disabled.Andrew Trick2012-06-201-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | -stable-loops enables a new algorithm for generating the Loop forest. It differs from the original algorithm in a few respects: - Not determined by use-list order. - Initially guarantees RPO order of block and subloops. - Linear in the number of CFG edges. - Nonrecursive. I didn't want to change the LoopInfo API yet, so the block lists are still inclusive. This seems strange to me, and it means that building LoopInfo is not strictly linear, but it may not be a problem in practice. At least the block lists start out in RPO order now. In the future we may add an attribute or wrapper analysis that allows other passes to assume RPO order. The primary motivation of this work was not to optimize LoopInfo, but to allow reproducing performance issues by decomposing the compilation stages. I'm often unable to do this with the current LoopInfo, because the loop tree order determines Loop pass order. Serializing the IR tends to invert the order, which reverses the optimization order. This makes it nearly impossible to debug interdependent loop optimizations such as LSR. I also believe this will provide more stable performance results across time. llvm-svn: 158790
OpenPOWER on IntegriCloud