path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
* Early exit if we don't have invokes. The 'Unwinds' vector isn't modified unless we have invokes, so there is no functionality change here.
  llvm-svn: 122990
  Author: Bill Wendling | Age: 2011-01-07 | Files: 1 | Lines: -219/+219
* Fix the other problem reported in PR8582. Testcase and patch by Nadav Rotem.
  llvm-svn: 122983
  Author: Duncan Sands | Age: 2011-01-06 | Files: 1 | Lines: -0/+5
* Add some fairly duplicated code to let type legalization split illegal typed atomics. This will lower exclusively to libcalls at the moment.
  llvm-svn: 122979
  Author: Eric Christopher | Age: 2011-01-06 | Files: 3 | Lines: -0/+141
* Emit 128 bit constant.
  This fixes PR 8913 crash.
  llvm-svn: 122971
  Author: Devang Patel | Age: 2011-01-06 | Files: 2 | Lines: -10/+38
* Re-implement r122936 with proper target hooks. Now getMaxStoresPerMemcpy etc. take an option OptSize. If OptSize is true, they return the inline limit for functions with attribute OptSize.
  llvm-svn: 122952
  Author: Evan Cheng | Age: 2011-01-06 | Files: 2 | Lines: -15/+14
* Revert r122936. I'll re-implement the change.
  llvm-svn: 122949
  Author: Evan Cheng | Age: 2011-01-06 | Files: 1 | Lines: -9/+2
* Zap the last two -Wself-assign warnings in llvm.
  Simplify RALinScan::DowngradeRegister with TRI::getOverlaps while we are there.
  llvm-svn: 122940
  Author: Jakob Stoklund Olesen | Age: 2011-01-06 | Files: 1 | Lines: -7/+3
* Add the SpillPlacement analysis pass.
  This pass precomputes CFG block frequency information that can be used by the register allocator to find optimal spill code placement.
  Given an interference pattern, placeSpills() will compute which basic blocks should have the current variable enter or exit in a register, and which blocks prefer the stack.
  The algorithm is ready to consume block frequencies from profiling data, but for now it gets by with the static estimates used for spill weights.
  This is a work in progress and still not hooked up to RegAllocGreedy.
  llvm-svn: 122938
  Author: Jakob Stoklund Olesen | Age: 2011-01-06 | Files: 3 | Lines: -0/+460
* r105228 reduced the memcpy / memset inline limit to 4 with -Os to avoid blowing up the FreeBSD bootloader. However, this doesn't make much sense for Darwin, whose -Os is meant to optimize for size only if it doesn't hurt performance. rdar://8821501
  llvm-svn: 122936
  Author: Evan Cheng | Age: 2011-01-06 | Files: 1 | Lines: -2/+9
* Avoid zero-extending bit test operands to pointer type if all the masks fit in the original type of the switch statement key. rdar://8781238
  llvm-svn: 122935
  Author: Evan Cheng | Age: 2011-01-06 | Files: 3 | Lines: -24/+42
* Optimize:
      r1025 = s/zext r1024, 4
      r1026 = extract_subreg r1025, 4
  to:
      r1026 = copy r1024
  llvm-svn: 122925
  Author: Evan Cheng | Age: 2011-01-05 | Files: 1 | Lines: -23/+39
* Add a hidden command line option to display edge bundle graphs as they are calculated.
  llvm-svn: 122912
  Author: Jakob Stoklund Olesen | Age: 2011-01-05 | Files: 1 | Lines: -0/+7
* 80-cols.
  llvm-svn: 122909
  Author: Eric Christopher | Age: 2011-01-05 | Files: 1 | Lines: -1/+2
* Remove TODO, these appear to be implemented.
  llvm-svn: 122849
  Author: Eric Christopher | Age: 2011-01-04 | Files: 1 | Lines: -1/+0
* Turn the EdgeBundles class into a stand-alone machine CFG analysis pass.
  The analysis will be needed by both the greedy register allocator and the X86FloatingPoint pass. It only needs to be computed once when the CFG doesn't change.
  This pass is very fast, usually showing up as 0.0% wall time.
  llvm-svn: 122832
  Author: Jakob Stoklund Olesen | Age: 2011-01-04 | Files: 4 | Lines: -88/+81
* Switch to path halving from path compression for a small speedup. This also makes getLeader() nonrecursive.
  llvm-svn: 122811
  Author: Cameron Zwarich | Age: 2011-01-04 | Files: 1 | Lines: -6/+12
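The difference between path halving and recursive path compression in r122811 can be illustrated with a small stand-alone union-find. This is only a sketch: the `UnionFind` struct, its `Parent` vector, and `join` are hypothetical names chosen for illustration and are not the code touched by the commit; only the name `getLeader()` comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-alone union-find, not the LLVM data structure.
struct UnionFind {
  std::vector<std::size_t> Parent;

  explicit UnionFind(std::size_t N) : Parent(N) {
    for (std::size_t I = 0; I < N; ++I)
      Parent[I] = I; // every element starts as its own leader
  }

  // Path halving: while walking towards the root, repoint each visited node
  // at its grandparent and then step to that grandparent. This needs neither
  // recursion nor a second pass over the path.
  std::size_t getLeader(std::size_t X) {
    assert(X < Parent.size() && "element out of range");
    while (Parent[X] != X) {
      Parent[X] = Parent[Parent[X]];
      X = Parent[X];
    }
    return X;
  }

  // Merge the classes of A and B (no union-by-rank, to keep the sketch short).
  void join(std::size_t A, std::size_t B) {
    std::size_t LA = getLeader(A), LB = getLeader(B);
    if (LA != LB)
      Parent[LA] = LB;
  }
};
```

Full path compression would repoint every node on the path directly at the root, which typically needs recursion or two passes; halving only shortens the path by roughly half per lookup but does so in a single loop.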
* Eliminate repeated allocation of a per-BB DenseMap for a 4.6% reduction of time spent in StrongPHIElimination on 403.gcc.
  llvm-svn: 122803
  Author: Cameron Zwarich | Age: 2011-01-04 | Files: 1 | Lines: -6/+5
* Clean up a funky pass registration that got passed over when I got rid of static constructors.
  llvm-svn: 122795
  Author: Owen Anderson | Age: 2011-01-04 | Files: 1 | Lines: -7/+1
* Use a RecyclingAllocator to allocate values for MachineCSE's ScopedHashTable for a 28% speedup of MachineCSE time on 403.gcc.
  llvm-svn: 122735
  Author: Cameron Zwarich | Age: 2011-01-03 | Files: 1 | Lines: -3/+7
* Split dom frontier handling stuff out to its own DominanceFrontier header, so that Dominators.h is *just* domtree. Also prune #includes a bit.
  llvm-svn: 122714
  Author: Chris Lattner | Age: 2011-01-02 | Files: 1 | Lines: -0/+1
* Try to reuse the value when lowering memset.
  This allows us to compile:
      void test(char *s, int a) { __builtin_memset(s, a, 15); }
  into 1 mul + 3 stores instead of 3 muls + 3 stores.
  llvm-svn: 122710
  Author: Benjamin Kramer | Age: 2011-01-02 | Files: 1 | Lines: -3/+21
* Lower the i8 extension in memset to a multiply instead of a potentially long series of shifts and ors.
  We could implement a DAGCombine to turn x * 0x0101 back into logic operations on targets that don't support the multiply or where it is slow (p4) if someone cares enough.
  Example code:
      void test(char *s, int a) { __builtin_memset(s, a, 4); }
  before:
      _test:                                  ## @test
          movzbl  8(%esp), %eax
          movl    %eax, %ecx
          shll    $8, %ecx
          orl     %eax, %ecx
          movl    %ecx, %eax
          shll    $16, %eax
          orl     %ecx, %eax
          movl    4(%esp), %ecx
          movl    %eax, 4(%ecx)
          movl    %eax, (%ecx)
          ret
  after:
      _test:                                  ## @test
          movzbl  8(%esp), %eax
          imull   $16843009, %eax, %eax       ## imm = 0x1010101
          movl    4(%esp), %ecx
          movl    %eax, 4(%ecx)
          movl    %eax, (%ecx)
          ret
  llvm-svn: 122707
  Author: Benjamin Kramer | Age: 2011-01-02 | Files: 1 | Lines: -15/+17
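The arithmetic behind r122707 can be checked in isolation: replicating a byte into all four bytes of a 32-bit word by shifts and ors gives the same result as multiplying the zero-extended byte by 0x01010101 (16843009, the immediate in the "after" listing). The function names below are hypothetical and this is only a sketch of the identity, not the DAG lowering code.

```cpp
#include <cassert>
#include <cstdint>

// Shift/or sequence: widen the byte to 16 bits, then to 32 bits.
inline std::uint32_t splatByteShiftOr(std::uint8_t B) {
  std::uint32_t V = B;
  V |= V << 8;  // 0x000000BB -> 0x0000BBBB
  V |= V << 16; // 0x0000BBBB -> 0xBBBBBBBB
  return V;
}

// Single multiply: each set byte of 0x01010101 contributes one copy of B,
// and the copies never overlap because B < 256.
inline std::uint32_t splatByteMul(std::uint8_t B) {
  return std::uint32_t{B} * 0x01010101u;
}

// Exhaustively compare both forms over all 256 byte values.
inline void checkSplatForms() {
  for (unsigned I = 0; I < 256; ++I) {
    std::uint8_t B = static_cast<std::uint8_t>(I);
    assert(splatByteShiftOr(B) == splatByteMul(B));
  }
}
```

Whether the multiply is actually cheaper depends on the target, which is why the commit message mentions a possible DAGCombine for targets where the multiply is unavailable or slow.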
* Use getVRegDef() instead of def_iterator. This leads to fewer defs being added with 2-address instructions, for about a 3.5% speedup of StrongPHIElimination on 403.gcc.
  llvm-svn: 122635
  Author: Cameron Zwarich | Age: 2010-12-30 | Files: 1 | Lines: -4/+3
* None of the other pass names in CodeGen have terminating periods.
  llvm-svn: 122628
  Author: Cameron Zwarich | Age: 2010-12-29 | Files: 1 | Lines: -2/+2
* Instead of processing every instruction when splitting interferences, only process those instructions that define phi sources. This is a 47% speedup of StrongPHIElimination compile time on 403.gcc.
  llvm-svn: 122627
  Author: Cameron Zwarich | Age: 2010-12-29 | Files: 1 | Lines: -27/+61
* Add a missing word to a comment.
  llvm-svn: 122625
  Author: Cameron Zwarich | Age: 2010-12-29 | Files: 1 | Lines: -1/+1
* Add text explaining an assertion.
  llvm-svn: 122617
  Author: Cameron Zwarich | Age: 2010-12-29 | Files: 1 | Lines: -1/+3
* Simplify some code in MachineVerifier that was doing the correct thing, but not in the most obvious way.
  llvm-svn: 122610
  Author: Cameron Zwarich | Age: 2010-12-28 | Files: 1 | Lines: -10/+11
* Revert the optimization in r122596. It is correct for all current targets, but it relies on assumptions that may not be true in the future.
  llvm-svn: 122608
  Author: Cameron Zwarich | Age: 2010-12-28 | Files: 1 | Lines: -1/+8
* Avoid iterating every operand of an instruction in StrongPHIElimination, since we are only interested in the defs when discovering interferences. This is a 28% speedup running StrongPHIElimination on 403.gcc.
  llvm-svn: 122596
  Author: Cameron Zwarich | Age: 2010-12-28 | Files: 1 | Lines: -4/+3
* Pacify the compiler. BestWeight cannot in fact be used uninitialized in this function, but the compiler was warning that it might be when doing a release build.
  llvm-svn: 122595
  Author: Duncan Sands | Age: 2010-12-28 | Files: 1 | Lines: -1/+1
* Change an assertion to assert what the code actually relies upon.
  llvm-svn: 122586
  Author: Cameron Zwarich | Age: 2010-12-27 | Files: 1 | Lines: -1/+1
* Land a first cut at StrongPHIElimination. There are only 5 new test failures when running without the verifier, and I have not yet checked them to see if the new results are still correct. There are more verifier failures, but they all seem to be additional occurrences of verifier failures that occur with the existing PHIElimination pass.
  There are a few obvious issues with the code:
  1) It doesn't properly update the register equivalence classes during copy insertion, and instead recomputes them before merging live intervals and renaming registers. I wanted to keep this first patch simple for debugging purposes, but it shouldn't be very hard to do this.
  2) It doesn't mix the renaming and live interval merging with the copy insertion process, which leads to a lot of virtual register churn. Virtual registers and live intervals are created, only to later be merged into others. The code should be smarter and only create a new virtual register if there is no existing register in the same congruence class.
  3) In one place the code uses a DenseMap per basic block, which is unnecessary heap allocation. There should be an inline storage version of DenseMap.
  I did a quick compile-time test of running llc on 403.gcc with and without StrongPHIElimination. It is slightly slower with StrongPHIElimination, because the small decrease in the coalescer runtime can't beat the increase in phi elimination runtime. Perhaps fixing the above performance issues will narrow the gap.
  I also haven't yet run any tests of the quality of the generated code.
  llvm-svn: 122582
  Author: Cameron Zwarich | Age: 2010-12-27 | Files: 1 | Lines: -64/+590
* Add knowledge of phi-def and phi-kill valnos to MachineVerifier's predecessor valno verification. The "Different value live out of predecessor" check is incorrect in the case of phi-def valnos, so just skip that check for phi-def valnos and instead check that all of the valnos for predecessors have phi-kill. Fixes PR8863.
  llvm-svn: 122581
  Author: Cameron Zwarich | Age: 2010-12-27 | Files: 1 | Lines: -1/+17
* Minor cleanup related to my latest scheduler changes.
  llvm-svn: 122545
  Author: Andrew Trick | Age: 2010-12-24 | Files: 1 | Lines: -3/+5
* Fix a few cases where the scheduler is not checking for phys reg copies. The scheduling node may have a NULL DAG node, yuck.
  llvm-svn: 122544
  Author: Andrew Trick | Age: 2010-12-24 | Files: 2 | Lines: -4/+11
* Various bits of framework needed for precise machine-level selection DAG scheduling during isel.
  Most new functionality is currently guarded by -enable-sched-cycles and -enable-sched-hazard.
  Added InstrItineraryData::IssueWidth field, currently derived from ARM itineraries, but could be initialized differently on other targets.
  Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is active, and if so how many cycles of state it holds.
  Added SchedulingPriorityQueue::HasReadyFilter to allow gating entry into the scheduler's available queue.
  ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to get information about its SUnits, provides RecedeCycle for bottom-up scheduling, correctly computes scoreboard depth, tracks IssueCount, and considers potential stall cycles when checking for hazards.
  ScheduleDAGRRList now models machine cycles and hazards (under flags). It tracks MinAvailableCycle, drives the hazard recognizer and priority queue's ready filter, manages a new PendingQueue, properly accounts for stall cycles, etc.
  llvm-svn: 122541
  Author: Andrew Trick | Age: 2010-12-24 | Files: 8 | Lines: -129/+508
* whitespace
  llvm-svn: 122539
  Author: Andrew Trick | Age: 2010-12-24 | Files: 3 | Lines: -178/+178
* Simplify a check for implicit defs and remove a FIXME.
  llvm-svn: 122537
  Author: Cameron Zwarich | Age: 2010-12-24 | Files: 1 | Lines: -8/+6
* flags -> glue for selectiondag
  llvm-svn: 122509
  Author: Chris Lattner | Age: 2010-12-23 | Files: 6 | Lines: -78/+77
* sdisel flag -> glue.
  llvm-svn: 122507
  Author: Chris Lattner | Age: 2010-12-23 | Files: 1 | Lines: -77/+76
* Reorganize ListScheduleBottomUp in preparation for modeling machine cycles and instruction issue.
  llvm-svn: 122491
  Author: Andrew Trick | Age: 2010-12-23 | Files: 1 | Lines: -130/+153
* Converted LiveRegCycles to LiveRegGens. It's easier to work with and allows multiple nodes per cycle.
  llvm-svn: 122474
  Author: Andrew Trick | Age: 2010-12-23 | Files: 1 | Lines: -17/+18
* In CheckForLiveRegDef use TRI->getOverlaps.
  llvm-svn: 122473
  Author: Andrew Trick | Age: 2010-12-23 | Files: 1 | Lines: -6/+9
* Fixes PR8823: add-with-overflow-128.ll
  In the bottom-up selection DAG scheduling, handle two-address instructions that read/write unspillable registers. Treat the entire chain of two-address nodes as a single live range.
  llvm-svn: 122472
  Author: Andrew Trick | Age: 2010-12-23 | Files: 1 | Lines: -12/+33
* Change all self assignments X=X to (void)X, so that we can turn on a new gcc warning that complains on self-assignments and self-initializations.
  llvm-svn: 122458
  Author: Jeffrey Yasskin | Age: 2010-12-23 | Files: 4 | Lines: -9/+9
* DAGCombine add (sext i1), X into sub X, (zext i1) if sext from i1 is illegal. The latter usually compiles into smaller code.
  example code:
      unsigned foo(unsigned x, unsigned y) {
        if (x != 0) y--;
        return y;
      }
  before:
      _foo:                           ## @foo
          cmpl    $1, 4(%esp)         ## encoding: [0x83,0x7c,0x24,0x04,0x01]
          sbbl    %eax, %eax          ## encoding: [0x19,0xc0]
          notl    %eax                ## encoding: [0xf7,0xd0]
          addl    8(%esp), %eax       ## encoding: [0x03,0x44,0x24,0x08]
          ret                         ## encoding: [0xc3]
  after:
      _foo:                           ## @foo
          cmpl    $1, 4(%esp)         ## encoding: [0x83,0x7c,0x24,0x04,0x01]
          movl    8(%esp), %eax       ## encoding: [0x8b,0x44,0x24,0x08]
          adcl    $-1, %eax           ## encoding: [0x83,0xd0,0xff]
          ret                         ## encoding: [0xc3]
  llvm-svn: 122455
  Author: Benjamin Kramer | Age: 2010-12-22 | Files: 1 | Lines: -0/+9
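The rewrite in r122455 rests on a simple identity in modular arithmetic: sign-extending a true i1 gives all-ones (that is, -1), so adding it to X is the same as subtracting the zero-extended bit (1) from X. The sketch below models this on 32-bit unsigned values; the function names are hypothetical and this is not the DAGCombine code itself.

```cpp
#include <cassert>
#include <cstdint>

// add (sext i1 C), X: a true i1 sign-extends to all-ones, i.e. -1.
inline std::uint32_t addSext(bool C, std::uint32_t X) {
  std::uint32_t Sext = C ? 0xFFFFFFFFu : 0u;
  return Sext + X; // unsigned addition wraps modulo 2^32
}

// sub X, (zext i1 C): a true i1 zero-extends to 1.
inline std::uint32_t subZext(bool C, std::uint32_t X) {
  std::uint32_t Zext = C ? 1u : 0u;
  return X - Zext; // unsigned subtraction wraps modulo 2^32
}

// Spot-check the identity, including the wrap-around case X == 0.
inline void checkSextZextIdentity() {
  const std::uint32_t Samples[] = {0u, 1u, 5u, 0x7FFFFFFFu, 0xFFFFFFFFu};
  for (std::uint32_t X : Samples) {
    assert(addSext(true, X) == subZext(true, X));
    assert(addSext(false, X) == subZext(false, X));
  }
}
```

In the commit's `foo` example the condition `x != 0` plays the role of the i1, and `y--` is the subtraction of its zero-extension.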
* When RegAllocGreedy decides to spill the interferences of the current register, pick the victim with the lowest total spill weight.
  llvm-svn: 122445
  Author: Jakob Stoklund Olesen | Age: 2010-12-22 | Files: 1 | Lines: -37/+89
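The selection rule described in r122445 amounts to a minimum search over summed weights. The sketch below assumes a simplified model in which each candidate physical register carries the spill weights of the live intervals that would have to be spilled; `Candidate`, `totalWeight`, and `pickVictim` are hypothetical names, and the real allocator certainly tracks more state than this.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical model: one candidate physical register and the spill weights
// of the intervals currently interfering with it.
struct Candidate {
  unsigned PhysReg;
  std::vector<float> InterferenceWeights;
};

// Total cost of evicting everything that interferes with this candidate.
inline float totalWeight(const Candidate &C) {
  return std::accumulate(C.InterferenceWeights.begin(),
                         C.InterferenceWeights.end(), 0.0f);
}

// Return the index of the candidate with the lowest total spill weight.
// Ties are resolved in favour of the earliest candidate.
inline std::size_t pickVictim(const std::vector<Candidate> &Cands) {
  assert(!Cands.empty() && "need at least one candidate");
  std::size_t Best = 0;
  float BestWeight = totalWeight(Cands[0]);
  for (std::size_t I = 1; I < Cands.size(); ++I) {
    float W = totalWeight(Cands[I]);
    if (W < BestWeight) {
      Best = I;
      BestWeight = W;
    }
  }
  return Best;
}
```

Summing the weights, rather than comparing only the heaviest interfering interval, matches the "lowest total spill weight" wording of the commit message.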
* Include a shadow of the original CFG edges in the edge bundle graph.
  llvm-svn: 122444
  Author: Jakob Stoklund Olesen | Age: 2010-12-22 | Files: 1 | Lines: -0/+4
* Fix a bug in ReduceLoadWidth that wasn't handling extending loads properly. We miscompiled the testcase into:
      _test:                                  ## @test
          movl    $128, (%rdi)
          movzbl  1(%rdi), %eax
          ret
  Now we get a proper:
      _test:                                  ## @test
          movl    $128, (%rdi)
          movsbl  (%rdi), %eax
          movzbl  %ah, %eax
          ret
  This fixes PR8757.
  llvm-svn: 122392
  Author: Chris Lattner | Age: 2010-12-22 | Files: 1 | Lines: -1/+4