path: root/llvm
Commit message | Author | Age | Files | Lines
* Support for precise scheduling of the instruction selection DAG, | Andrew Trick | 2011-01-14 | 1 | -537/+663
    disabled in this checkin. Sorry for the large diffs due to
    refactoring. New functionality is all guarded by EnableSchedCycles.

    Scheduling the isel DAG is inherently imprecise, but we give it a best
    effort:
    - Added MayReduceRegPressure to allow stalled nodes in the queue only
      if there is a regpressure need.
    - Added BUHasStall to allow checking for either dependence stalls due
      to latency or resource stalls due to pipeline hazards.
    - Added BUCompareLatency to encapsulate and standardize the heuristics
      for minimizing stall cycles (vs. reducing register pressure).
    - Modified the bottom-up heuristic (now in BUCompareLatency) to
      prioritize nodes by their depth rather than height. As long as it
      doesn't stall, height is irrelevant. Depth represents the critical
      path to the DAG root.
    - Added hybrid_ls_rr_sort::isReady to filter stalled nodes before
      adding them to the available queue.

    Related Cleanup: most of the register reduction routines do not need
    to be templates.

    llvm-svn: 123468
* switch SRoA to use LoadAndStorePromoter instead of its own copy of the code. | Chris Lattner | 2011-01-14 | 1 | -136/+26
    llvm-svn: 123457
* Add a new LoadAndStorePromoter class, which implements the general | Chris Lattner | 2011-01-14 | 2 | -0/+186
    "promote a bunch of load and stores" logic, allowing the code to be
    shared and reused.

    llvm-svn: 123456
* OperandTraits<>::Layout isn't used for anything. Remove it. | Jay Foad | 2011-01-14 | 2 | -14/+0
    llvm-svn: 123452
* Update llvm-gcc's tests. | Rafael Espindola | 2011-01-14 | 6 | -10/+10
    llvm-svn: 123447
* Reorder macros on config.h.cmake to easily compare it against | Oscar Fuentes | 2011-01-14 | 1 | -54/+77
    config.h.in.

    Patch by arrowdodger!

    llvm-svn: 123445
* Disable debug mode. | Devang Patel | 2011-01-14 | 1 | -2/+2
    llvm-svn: 123443
* Turn X-(X-Y) into Y. According to my auto-simplifier this is the most common | Duncan Sands | 2011-01-14 | 2 | -1/+23
    simplification present in fully optimized code (I think instcombine
    fails to transform some of these when "X-Y" has more than one use).
    Fires here and there all over the test-suite, for example it
    eliminates 8 subtractions in the final IR for 445.gobmk, 2 subs in
    447.dealII, 2 in paq8p etc.

    llvm-svn: 123442
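The identity behind this fold can be checked by brute force. The following is a minimal sketch in Python, not code from the patch: it models 8-bit wraparound subtraction (the width and the helper names `sub` and `holds_everywhere` are assumptions made for illustration) and confirms that X-(X-Y) equals Y for every pair of values.

```python
# Illustrative model only: 8-bit two's-complement subtraction with wraparound.
BITS = 8
MASK = (1 << BITS) - 1


def sub(a, b):
    """Subtract b from a, wrapping the result to BITS bits."""
    return (a - b) & MASK


def holds_everywhere():
    """Return True if X - (X - Y) == Y for every 8-bit X and Y."""
    return all(
        sub(x, sub(x, y)) == y
        for x in range(1 << BITS)
        for y in range(1 << BITS)
    )
```

Because the check covers every value, it also covers the cases where the inner subtraction wraps around, which is why the fold needs no overflow precondition in this model.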
* Factorize common code out of the InstructionSimplify shift logic. Add in | Duncan Sands | 2011-01-14 | 2 | -62/+47
    threading of shifts over selects and phis while there. This fires here
    and there in the testsuite, to not much effect. For example when
    compiling spirit it fires 5 times, during early-cse, resulting in 6
    more cse simplifications, and 3 more terminators being folded by jump
    threading, but the final bitcode doesn't change in any interesting
    way: other optimizations would have caught the opportunity anyway,
    only later.

    llvm-svn: 123441
* Rename this test. | Duncan Sands | 2011-01-14 | 1 | -0/+0
    llvm-svn: 123440
* switch the second scalarrepl pass to use SSAUpdater. | Chris Lattner | 2011-01-14 | 1 | -1/+2
    We run two scalarrepl passes: one early in the cleanup code and one
    late interlaced with the inliner. The second one is important because
    inlining and other scalar optzns can unpin allocas, allowing them to
    be split up and promoted. While important for performance, this is
    also relatively rare, and we would previously force a (non-lazy)
    computation of DomFrontiers, which happened even if nothing became
    unpinned.

    With this patch, the first pass of scalarrepl still promotes the vast
    bulk of allocas in programs, but the second pass has changed to use
    SSAUpdater, which is more "sparse" and lazy. This speeds up opt -O3
    time on kimwitu++ (a c++ app) by about 1%. The numbers are
    interesting: the first pass promotes ~17500 allocas. The second pass
    promotes about 1600. For non-C++ codes, the compile time win should be
    greater, because the second pass of scalarrepl does less.

    llvm-svn: 123437
* split SROA into two passes: one that uses DomFrontiers (-scalarrepl) | Chris Lattner | 2011-01-14 | 4 | -29/+61
    and one that uses SSAUpdater (-scalarrepl-ssa)

    llvm-svn: 123436
* Remove casts between Value** and Constant**, which won't work if a | Jay Foad | 2011-01-14 | 6 | -33/+82
    static_cast from Constant* to Value* has to adjust the "this" pointer.
    This is groundwork for PR889.

    llvm-svn: 123435
* Implement full support for promoting allocas to registers using SSAUpdater | Chris Lattner | 2011-01-14 | 1 | -5/+162
    instead of DomTree/DomFrontier. This may be interesting for reducing
    compile time. This is currently disabled, but seems to work just fine.

    When this is enabled, we eliminate two runs of dominator frontier, one
    in the "early per-function" optimizations and one in the "interlaced
    with inliner" function passes.

    llvm-svn: 123434
* relax testcase a bit. | Chris Lattner | 2011-01-14 | 1 | -1/+1
    llvm-svn: 123433
* Try for the third time to teach getFirstTerminator() about debug values. | Jakob Stoklund Olesen | 2011-01-14 | 2 | -4/+11
    This time let's rephrase to trick gcc-4.3 into not miscompiling.

    llvm-svn: 123432
* revert my fastisel patch again which apparently still gives the | Chris Lattner | 2011-01-14 | 2 | -18/+1
    llvm-gcc-i386-linux-selfhost buildbot heartburn...

    llvm-svn: 123431
* reapply r123414 now that the botz are calmed down and the fix is already in. | Chris Lattner | 2011-01-14 | 2 | -1/+18
    llvm-svn: 123427
* indentation | Chris Lattner | 2011-01-14 | 1 | -1/+1
    llvm-svn: 123426
* Completed :lower16: / :upper16: support for movw / movt pairs on Darwin. | Evan Cheng | 2011-01-14 | 8 | -44/+221
    - Fixed :upper16: fix up routine. It should be shifting down the top
      16 bits first.
    - Added support for Thumb2 :lower16: and :upper16: fix up.
    - Added :upper16: and :lower16: relocation support to mach-o object
      writer.

    llvm-svn: 123424
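The shift mentioned in the first bullet can be illustrated as follows. This is a minimal sketch in Python, not the fix-up routine from the patch: the helper names `lower16`, `upper16`, and `rebuild` are hypothetical, and the example only models how a 32-bit value splits into the two 16-bit immediates that a movw / movt pair carries.

```python
def lower16(value):
    """Low half of a 32-bit value, as a movw immediate would carry it."""
    return value & 0xFFFF


def upper16(value):
    """High half of a 32-bit value: shift the top 16 bits down first, then mask."""
    return (value >> 16) & 0xFFFF


def rebuild(low, high):
    """Recombine the two halves the way a movw / movt pair fills a register."""
    return (high << 16) | low
```

For the value 0x12345678, masking without the shift would yield 0x5678 for both halves, so the upper immediate would be wrong; with the shift the halves are 0x5678 and 0x1234.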
* Revert r123419. It still breaks llvm-gcc-i386-linux-selfhost. | Jakob Stoklund Olesen | 2011-01-14 | 2 | -24/+7
    llvm-svn: 123423
* r123414 broke llvm-gcc bootstrap apparently, revert | Chris Lattner | 2011-01-14 | 2 | -18/+1
    llvm-svn: 123422
* Set the insertion point correctly for instructions generated by load folding: | Chris Lattner | 2011-01-14 | 1 | -4/+4
    they should go *before* the new instruction not after it.

    llvm-svn: 123420
* Try again to teach getFirstTerminator() about debug values. | Jakob Stoklund Olesen | 2011-01-14 | 2 | -7/+24
    Fix some callers to better deal with debug values.

    llvm-svn: 123419
* Rather than doing early instcombine, try doing early CSE instead. | Owen Anderson | 2011-01-14 | 1 | -1/+1
    This should still handle most important simplifications, as well as
    resolving phase ordering issues where instcombine would inhibit
    important CSE'ing opportunities, for instance on BitBench/drop3.

    llvm-svn: 123418
* Move some shift transforms out of instcombine and into InstructionSimplify. | Duncan Sands | 2011-01-14 | 5 | -34/+189
    While there, I noticed that the transform "undef >>a X -> undef" was
    wrong. For example if X is 2 then the top two bits must be equal, so
    the result can not be anything. I fixed this in the constant folder as
    well. Also, I made the transform for "X << undef" stronger: it now
    folds to undef always, even though X might be zero. This is in
    accordance with the LangRef, but I must admit that it is fairly
    aggressive. Also, I added "i32 X << 32 -> undef" following the LangRef
    and the constant folder, likewise fairly aggressive.

    llvm-svn: 123417
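The argument about "undef >>a X" can be made concrete by enumerating results. The following is a minimal sketch in Python, assuming an 8-bit width for brevity; `ashr` and `reachable` are hypothetical helper names, not LLVM functions. It collects every value an arithmetic shift right by 2 can produce and shows that some bit patterns are unreachable, so the result cannot be treated as an arbitrary value.

```python
BITS = 8
MASK = (1 << BITS) - 1


def ashr(value, amount):
    """Arithmetic shift right of a BITS-wide value given as an unsigned int."""
    # Reinterpret as signed so Python's >> replicates the sign bit.
    signed = value - (1 << BITS) if value & (1 << (BITS - 1)) else value
    return (signed >> amount) & MASK


def reachable(amount):
    """Set of all results of shifting any BITS-wide input right by amount."""
    return {ashr(v, amount) for v in range(1 << BITS)}
```

In this model only a quarter of the 256 bit patterns are reachable for a shift amount of 2, and every reachable pattern has equal top bits.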
* Don't bother conditionalizing the use of SROA in -O1 mode. | Owen Anderson | 2011-01-14 | 1 | -4/+1
    We're already running it unconditionally later in the pipeline.

    llvm-svn: 123416
* fix PR8961 - a fast isel miscompilation where we'd insert a new instruction | Chris Lattner | 2011-01-14 | 2 | -1/+18
    after sext's generated for addressing that got folded. Previously we
    compiled test5 into:

    _test5:                                 ## @test5
    ## BB#0:
            movq    -8(%rsp), %rax          ## 8-byte Reload
            movq    (%rdi,%rax), %rdi
            addq    %rdx, %rdi
            movslq  %esi, %rax
            movq    %rax, -8(%rsp)          ## 8-byte Spill
            movq    %rdi, %rax
            ret

    which is insane and wrong. Now we produce:

    _test5:                                 ## @test5
    ## BB#0:
            movslq  %esi, %rax
            movq    (%rdi,%rax), %rax
            addq    %rdx, %rax
            ret

    llvm-svn: 123414
* Better terminator avoidance. | Jakob Stoklund Olesen | 2011-01-13 | 1 | -9/+3
    This approach also works when the terminator doesn't have a slot
    index. (Which can happen??)

    llvm-svn: 123413
* Add comment about Thumb2 fixup comments being completely bogus. | Evan Cheng | 2011-01-13 | 1 | -1/+3
    llvm-svn: 123411
* Add single entry / single exit accessors. | Tobias Grosser | 2011-01-13 | 2 | -23/+46
    Add methods for accessing the (single) entry / exit edge of a region.
    If no such edge exists, null is returned. Both accessors return the
    start block of the corresponding edge. The edge can finally be formed
    by utilizing Region::getEntry() or Region::getExit().

    Contributed by: Andreas Simbuerger <simbuerg@fim.uni-passau.de>

    llvm-svn: 123410
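The behaviour described above can be modelled on a toy control-flow graph. This is a minimal sketch in Python under stated assumptions, not the Region API: the graph is a list of (source, destination) edges without duplicates, the region is a set of block names that excludes its exit block, and `entering_block` and `exiting_block` are hypothetical names chosen for illustration.

```python
def entering_block(edges, region, entry):
    """Start block of the single edge entering the region, or None."""
    sources = [src for src, dst in edges if dst == entry and src not in region]
    return sources[0] if len(sources) == 1 else None


def exiting_block(edges, region, exit_block):
    """Start block of the single edge leaving the region, or None."""
    sources = [src for src, dst in edges if dst == exit_block and src in region]
    return sources[0] if len(sources) == 1 else None
```

Each helper returns only the start block, since the other end of the edge is already known: it is the region's entry or exit block. A region whose exit is reached from two inner blocks has no single exit edge, so the second helper returns None for it.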
* Recognize alternative register names like ip -> r12. | Owen Anderson | 2011-01-13 | 1 | -3/+14
    Fixes <rdar://problem/8857982>.

    llvm-svn: 123409
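Alias recognition of this kind amounts to a lookup before register matching. The following is a minimal sketch in Python, not the parser code from the patch: only the ip -> r12 pair comes from the commit message, the other entries are the conventional ARM procedure-call names added as an assumption, and `canonical_register` is a hypothetical helper.

```python
# Only "ip" -> "r12" is named in the commit; the rest are conventional
# ARM procedure-call aliases included here for illustration.
ARM_REGISTER_ALIASES = {
    "sb": "r9",
    "sl": "r10",
    "fp": "r11",
    "ip": "r12",
}


def canonical_register(name):
    """Map an alternative register name to its rN form; pass others through."""
    lowered = name.lower()
    return ARM_REGISTER_ALIASES.get(lowered, lowered)
```

Names that are not in the table are returned unchanged (lowercased), so ordinary register names still reach the normal matching path.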
* Fix a few more places that should use MBB::getLastNonDebugInstr(). | Jakob Stoklund Olesen | 2011-01-13 | 3 | -3/+3
    llvm-svn: 123408
* As far as I can tell, unified syntax uses c0-c15 instead of cr0-cr15 for mcr and friends. | Owen Anderson | 2011-01-13 | 1 | -1/+1
    llvm-svn: 123407
* typo | Chris Lattner | 2011-01-13 | 1 | -1/+1
    llvm-svn: 123406
* memcpy + metadata = bliss :) | Chris Lattner | 2011-01-13 | 1 | -0/+48
    llvm-svn: 123405
* Add support to the ARM MC infrastructure to support mcr and friends. | Owen Anderson | 2011-01-13 | 5 | -29/+227
    This requires supporting the symbolic immediate names used for these
    instructions, fixing their pretty-printers, and adding proper encoding
    information for them.

    With this, we can properly pretty-print and encode assembly like:

        mrc p15, #0, r3, c13, c0, #3

    Fixes <rdar://problem/8857858>.

    llvm-svn: 123404
* Relax an assertion. | Evan Cheng | 2011-01-13 | 1 | -2/+6
    On archs like ARM, an immediate field may be scattered. So it's
    possible for some bits of every 8 bits to be encoded already, and the
    rest still needs to be fixed up.

    llvm-svn: 123403
* Temporary workaround for an i386 crash in LiveDebugVariables. | Jakob Stoklund Olesen | 2011-01-13 | 1 | -1/+2
    llvm-svn: 123400
* Teach frame lowering to ignore debug values after the terminators. | Jakob Stoklund Olesen | 2011-01-13 | 14 | -24/+42
    llvm-svn: 123399
* Tidy comments, indentation, and 80-column violations. | Bob Wilson | 2011-01-13 | 1 | -37/+39
    llvm-svn: 123397
* Fix whitespace. | Bob Wilson | 2011-01-13 | 1 | -120/+120
    llvm-svn: 123396
* Fix ARMAsmParser::ParseOperand() to allow it to parse . as a branch target and | Kevin Enderby | 2011-01-13 | 1 | -2/+4
    directional local labels like 1f and 2b.

    llvm-svn: 123393
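The operand forms named above can be told apart with a small classifier. This is a minimal sketch in Python, not the ARMAsmParser logic: `classify_branch_target` and its tuple results are hypothetical, and the sketch assumes the operand has already been isolated as a single token.

```python
import re

# A directional local label is a decimal number followed by f (forward) or b (backward).
_DIRECTIONAL_LABEL = re.compile(r"(\d+)([fb])")


def classify_branch_target(token):
    """Classify a branch operand token as '.', a directional label, or a symbol."""
    if token == ".":
        return ("current_location",)
    match = _DIRECTIONAL_LABEL.fullmatch(token)
    if match:
        direction = "forward" if match.group(2) == "f" else "backward"
        return ("local_label", int(match.group(1)), direction)
    return ("symbol", token)
```

A token such as 1f refers to the next definition of local label 1, while 2b refers to the most recent definition of local label 2.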
* Little help to debug the bugpoint itself. | Devang Patel | 2011-01-13 | 1 | -0/+12
    Patch by Bob Wilson.

    llvm-svn: 123390
* Speculatively revert r123384 to make llvm-gcc-i386-linux-selfhost buildbot happy. | Devang Patel | 2011-01-13 | 2 | -18/+5
    llvm-svn: 123389
* Add some platform tests. | Oscar Fuentes | 2011-01-13 | 2 | -7/+14
    Patch by arrowdodger!

    llvm-svn: 123388
* When updating a tSpill/tRestore instruction to be a tSTRr/tLDRr, correctly | Jim Grosbach | 2011-01-13 | 1 | -4/+7
    set up the source operands. The original instr has an immediate
    operand that should be replaced with the frame reg operand rather than
    just adding the reg operand. Previously, the instruction ended up with
    too many operands causing an assert() when adding the default
    predicate.

    rdar://8825456

    llvm-svn: 123387
* Teach MachineBasicBlock::getFirstTerminator to ignore debug values. | Jakob Stoklund Olesen | 2011-01-13 | 2 | -5/+18
    It will still return an iterator that points to the first terminator
    or end(), but there may be DBG_VALUE instructions following the first
    terminator.

    llvm-svn: 123384
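The behaviour described above can be modelled on a plain list. This is a minimal sketch in Python, not the MachineBasicBlock implementation: a block is assumed to be a list of (opcode, kind) pairs, the kind strings and the name `first_terminator` are hypothetical, and the returned index stands in for the iterator, with `len(block)` playing the role of end().

```python
def first_terminator(block):
    """Index of the first terminator in the trailing run, or len(block) if none."""
    index = len(block)
    # Walk backwards over the trailing run of terminators and debug values.
    while index > 0 and block[index - 1][1] in ("terminator", "debug"):
        index -= 1
    # Walk forwards again past any debug values that precede the first terminator.
    while index < len(block) and block[index][1] != "terminator":
        index += 1
    return index
```

The backward scan treats debug values as transparent so that one trailing after the terminators does not hide them, and the forward scan ensures the result never points at a debug value.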
* Check for empty structs, and for consistency, zero-element arrays. | Bob Wilson | 2011-01-13 | 1 | -2/+2
    llvm-svn: 123383
* Extend SROA to handle arrays accessed as homogeneous structs and vice versa. | Bob Wilson | 2011-01-13 | 2 | -17/+83
    This is a minor extension of SROA to handle a special case that is
    important for some ARM NEON operations. Some of the NEON intrinsics
    return multiple values, which are handled as struct types containing
    multiple elements of the same vector type. The corresponding return
    types declared in the arm_neon.h header have equivalent arrays. We
    need SROA to recognize that it can split up those arrays and structs
    into separate vectors, even though they are not always accessed with
    the same type. SROA already handles loads and stores of an entire
    alloca by using insertvalue/extractvalue to access the individual
    pieces, and that code works the same regardless of whether the type is
    a struct or an array. So, all that needs to be done is to check for
    compatible arrays and homogeneous structs.

    llvm-svn: 123381
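The compatibility check named in the last sentence can be sketched on a toy type model. This is a minimal sketch in Python, not the SROA code: types are assumed to be tuples of the form ("array", element, count) or ("struct", (field, ...)), with element types written as strings, and `element_list` and `compatible` are hypothetical helpers.

```python
def element_list(ty):
    """Flatten an array or homogeneous struct into its elements, else None."""
    if not isinstance(ty, tuple):
        return None
    if ty[0] == "array":
        _, element, count = ty
        return [element] * count
    if ty[0] == "struct":
        fields = list(ty[1])
        # A struct qualifies only if every field has the same type.
        if all(field == fields[0] for field in fields):
            return fields
    return None


def compatible(a, b):
    """True if a and b split into the same non-empty sequence of elements."""
    elements_a, elements_b = element_list(a), element_list(b)
    return bool(elements_a) and elements_a == elements_b
```

Rejecting empty element lists in this model mirrors the concern about empty structs and zero-element arrays, which have no pieces to split.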