path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* Check all overlaps when looking for used registers.Jakob Stoklund Olesen2011-11-151-4/+5
| | | | | | A function using any RC alias is enough to enable the ExeDepsFix pass. llvm-svn: 144636
* Make use of MachinePointerInfo::getFixedStack.Jay Foad2011-11-151-2/+1
| | | | llvm-svn: 144635
* Remove some unnecessary includes of PseudoSourceValue.h.Jay Foad2011-11-157-7/+0
| | | | llvm-svn: 144634
* Fix typo in comment.Jay Foad2011-11-151-1/+1
| | | | llvm-svn: 144633
* Make use of MachinePointerInfo::getFixedStack. This removes all mentionJay Foad2011-11-155-24/+10
| | | | | | of PseudoSourceValue from lib/Target/. llvm-svn: 144632
* Remove some unnecessary includes of PseudoSourceValue.h.Jay Foad2011-11-158-8/+0
| | | | llvm-svn: 144631
* Fix PR11370 for real. Prevents converting 256-bit FP instruction to AVX2 ↵Craig Topper2011-11-151-9/+17
| | | | | | 256-bit integer instructions when AVX2 isn't enabled. llvm-svn: 144629
* Set SeenStore to true to prevent loads from being moved; also eliminates a ↵Evan Cheng2011-11-151-2/+2
| | | | | | non-deterministic behavior. llvm-svn: 144628
* Rather than trying to use the loop block sequence *or* the functionChandler Carruth2011-11-151-27/+24
| | | | | | | | | | | | | | | | | | | | | | | block sequence when recovering from unanalyzable control flow constructs, *always* use the function sequence. I'm not sure why I ever went down the path of trying to use the loop sequence, it is fundamentally not the correct sequence to use. We're trying to preserve the incoming layout in the cases of unreasonable control flow, and that is only encoded at the function level. We already have a filter to select *exactly* the sub-set of blocks within the function that we're trying to form into a chain. The resulting code layout is also significantly better because of this. In several places we were ending up with completely unreasonable control flow constructs due to the ordering chosen by the loop structure for its internal storage. This change removes a completely wasteful vector of basic blocks, saving memory allocation in the common case even though it costs us CPU in the fairly rare case of unnatural loops. Finally, it fixes the latest crasher reduced out of GCC's single source. Thanks again to Benjamin Kramer for the reduction, my bugpoint skills failed at it. llvm-svn: 144627
* Properly qualify AVX2 specific parts of execution dependency table. Also ↵Craig Topper2011-11-152-9/+16
| | | | | | enable converting between 256-bit PS/PD operations when AVX1 is enabled. Fixes PR11370. llvm-svn: 144622
* Add vmov.f32 to materialize f32 immediate splats which cannot be handled byEvan Cheng2011-11-153-0/+28
| | | | | | integer variants. rdar://10437054 llvm-svn: 144608
* ARM parsing datatype suffix variants for fixed-writeback VLD1/VST1 instructions.Jim Grosbach2011-11-151-3/+66
| | | | | | rdar://10435076 llvm-svn: 144606
* Move WEAK marking to the declaration.Nick Lewycky2011-11-151-6/+6
| | | | llvm-svn: 144603
* Break false dependencies before partial register updates.Jakob Stoklund Olesen2011-11-153-0/+84
| | | | | | | | | | | | Two new TargetInstrInfo hooks let the target tell ExecutionDepsFix about instructions with partial register updates causing false unwanted dependencies. The ExecutionDepsFix pass will break the false dependencies if the updated register was written in the previous N instructions. The small loop added to sse-domains.ll runs twice as fast with dependency-breaking instructions inserted. llvm-svn: 144602
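As a rough illustration of the "written in the previous N instructions" rule (a sketch only, not the actual ExecutionDepsFix code), the decision could look like the following. The names `shouldBreakDependency`, `curInstr`, `lastDef`, and `clearance` are hypothetical; `clearance` plays the role of N and is assumed to come from the target.

```cpp
#include <cassert>

// Hypothetical sketch: decide whether to insert a dependency-breaking
// instruction before a partial register update.
//   curInstr  - index of the instruction performing the partial update
//   lastDef   - index of the last instruction that wrote the register
//   clearance - the "N" from the commit message (assumed target-provided)
bool shouldBreakDependency(int curInstr, int lastDef, int clearance) {
  assert(lastDef < curInstr && "the last def must precede the update");
  // Break only if the register was written within the previous N
  // instructions; older writes have most likely completed already.
  return curInstr - lastDef <= clearance;
}
```

With `clearance` set to 3, a write two instructions back triggers the break, while a write eight instructions back does not.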
* Track register ages more accurately.Jakob Stoklund Olesen2011-11-151-101/+184
| | | | | | | | | | | | | | | Keep track of the last instruction to define each register individually instead of per DomainValue. This lets us track more accurately when a register was last written. Also track register ages across basic blocks. When entering a new basic block, use the least stale predecessor def as a worst case estimate for register age. The register age is used to arbitrate between conflicting domains. The most recently defined register wins. llvm-svn: 144601
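The two rules in this message (merging ages at block entry, and letting the most recent def win) can be sketched as below. This is a simplified model under assumed names, not the pass's real data structures: an "age" here is the number of instructions since a register was last defined, and `ageOnEntry` and `pickWinner` are hypothetical helpers.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical sketch. A smaller age means a more recent definition.

// On block entry, use the least stale (smallest) age recorded at the exit
// of any predecessor as the worst-case estimate for one register.
int ageOnEntry(const std::vector<int> &predExitAges) {
  assert(!predExitAges.empty() && "need at least one predecessor");
  return *std::min_element(predExitAges.begin(), predExitAges.end());
}

// Arbitrate between two registers with conflicting domains: the most
// recently defined register (smaller age) wins. Returns 0 or 1.
int pickWinner(int ageA, int ageB) { return ageA <= ageB ? 0 : 1; }
```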
* Fix linking for some users who already have tsan enabled code and are trying toNick Lewycky2011-11-151-6/+6
| | | | | | link it against llvm code, by making our definitions weak. "Some users." llvm-svn: 144596
* ARM parsing datatype suffix variants for non-writeback VST1 instructions.Jim Grosbach2011-11-141-0/+44
| | | | | | rdar://10435076 llvm-svn: 144593
* ARM parsing datatype suffix variants for non-writeback VLD1 instructions.Jim Grosbach2011-11-141-0/+41
| | | | | | rdar://10435076 llvm-svn: 144592
* Add explanatory comment.Jim Grosbach2011-11-141-0/+1
| | | | llvm-svn: 144589
* Split out the plain '.{8|16|32|64}' suffix handling.Jim Grosbach2011-11-141-8/+24
| | | | | | | Make it easier to deal with aliases for instructions that do require a suffix but accept more specific variants of the same size. llvm-svn: 144588
* ARM parsing optional datatype suffix for VAND/VEOR/VORR instructions.Jim Grosbach2011-11-142-1/+38
| | | | | | rdar://10435076 llvm-svn: 144587
* Supporting inline memmove isn't going to be worthwhile. The only way to avoidChad Rosier2011-11-141-16/+9
| | | | | | | violating a dependency is to emit all loads prior to stores. This would likely cause a great deal of spillage offsetting any potential gains. llvm-svn: 144585
* ARM VLDR/VSTR instructions don't need a size suffix.Jim Grosbach2011-11-142-18/+11
| | | | | | | Canonicalize on the non-suffixed form, but continue to accept assembly that has any correctly sized type suffix. llvm-svn: 144583
* Refactor capture tracking (which already had a couple flags for whether returnsNick Lewycky2011-11-142-117/+110
| | | | | | | | | | and stores capture) to permit the caller to see each capture point and decide whether to continue looking. Use this inside memdep to do an analysis that basicaa won't do. This lets us solve another devirtualization case, fixing PR8908! llvm-svn: 144580
* Add support for inlining small memcpys.Chad Rosier2011-11-141-2/+63
| | | | | | rdar://10412592 llvm-svn: 144578
* Fix a performance regression from r144565. Positive offsets were being loweredChad Rosier2011-11-141-3/+3
| | | | | | into registers, rather than encoded directly in the load/store. llvm-svn: 144576
* ARM assembly parsing type suffix options for VLDR/VSTR.Jim Grosbach2011-11-142-0/+28
| | | | | | rdar://10435076 llvm-svn: 144575
* Avoid dereferencing off the beginning of lists.Evan Cheng2011-11-141-7/+4
| | | | llvm-svn: 144569
* At -O0, multiple uses of a virtual register in the same BB are being markedEvan Cheng2011-11-141-1/+2
| | | | | | | | "kill". This looks like a bug upstream. Since that's going to take some time to understand, loosen the assertion and disable the optimization when multiple kills are seen. llvm-svn: 144568
* Add support for tsan annotations (thread sanitizer, a valgrind-based tool).Nick Lewycky2011-11-142-1/+18
| | | | | | | | | | | | These annotations are disabled entirely when either ENABLE_THREADS is off, or building a release build. When enabled, they add calls to functions with no statements to ManagedStatic's getters. Use these annotations to inform tsan that the race used inside ManagedStatic initialization is actually benign. Thanks to Kostya Serebryany for helping write this patch! llvm-svn: 144567
* Add a missing pattern for X86ISD::MOVLPD. rdar://10436044Evan Cheng2011-11-141-0/+5
| | | | llvm-svn: 144566
* Add support for Thumb load/stores with negative offsets.Chad Rosier2011-11-141-16/+60
| | | | | | rdar://10412592 llvm-svn: 144565
* Unbreak Release builds.Benjamin Kramer2011-11-141-1/+1
| | | | llvm-svn: 144560
* Teach two-address pass to re-schedule two-address instructions (or the killEvan Cheng2011-11-141-19/+356
| | | | | | | | | instructions of the two-address operands) in order to avoid inserting copies. This fixes the few regressions introduced when the two-address hack was disabled (without regressing the improvements). rdar://10422688 llvm-svn: 144559
* Changed SSE4/AVX <2 x i64> extract and insert ops to be Custom loweredPete Cooper2011-11-141-5/+8
| | | | | | | | Constant idx case is still done in tablegen but other cases are then expanded. Fixes <rdar://problem/10435460> llvm-svn: 144557
* Fold ConstantVector::isAllOnesValue into Constant::isAllOnesValue and ↵Benjamin Kramer2011-11-141-22/+4
| | | | | | simplify it. llvm-svn: 144555
* 32-to-64-bit extended load.Akira Hatanaka2011-11-141-5/+10
| | | | llvm-svn: 144554
* AnalyzeCallOperands function for N32/64.Akira Hatanaka2011-11-142-0/+45
| | | | | | | | N32/64 places all variable arguments in integer registers (or on the stack), regardless of their types, but follows the calling convention of non-vaarg functions when it handles fixed arguments. llvm-svn: 144553
* Modify LowerFormalArguments to correctly handle vaarg arguments for Mips64.Akira Hatanaka2011-11-141-14/+30
| | | | llvm-svn: 144552
* PTX: Let LLVM use loads/stores for all mem* intrinsics, instead of relying ↵Justin Holewinski2011-11-141-0/+5
| | | | | | on custom implementations. llvm-svn: 144551
* Remove variable that keeps the size of area used to save byval or variableAkira Hatanaka2011-11-143-12/+1
| | | | | | | | | | | argument registers on the callee's stack frame, along with functions that set and get it. It is not necessary to add the size of this area when computing stack size in emitPrologue, since it has already been accounted for in PEI::calculateFrameObjectOffsets. llvm-svn: 144549
* Fix early-clobber handling in shrinkToUses.Jakob Stoklund Olesen2011-11-141-12/+8
| | | | | | | | I broke this in r144515, it affected most ARM testers. <rdar://problem/10441389> llvm-svn: 144547
* Disable generation of compact unwind encodings. <rdar://problem/10441578>Bob Wilson2011-11-141-1/+2
| | | | | | | This still seems to be causing some failures. It needs more testing before it gets enabled again. llvm-svn: 144543
* Tidy up. 80 column.Jim Grosbach2011-11-141-5/+8
| | | | llvm-svn: 144538
* Make headers standalone, move a virtual method out of line.Benjamin Kramer2011-11-141-0/+7
| | | | llvm-svn: 144536
* It helps to deallocate memory as well as allocate it. =] This actuallyChandler Carruth2011-11-141-0/+1
| | | | | | | | cleans up all the chains allocated during the processing of each function so that for very large inputs we don't just grow memory usage without bound. llvm-svn: 144533
* Remove an over-eager assert that was firing on one of the ARM regressionChandler Carruth2011-11-141-3/+6
| | | | | | | | | | | | | | tests when I forcibly enabled block placement. It is apparently possible for an unanalyzable block to fall through to a non-loop block. I don't actually believe this is correct; I believe that 'canFallThrough' is returning true needlessly for the code construct, and I've left a bit of a FIXME on the verification code to try to track down why this is coming up. Anyways, removing the assert doesn't degrade the correctness of the algorithm. llvm-svn: 144532
* Begin chipping away at one of the biggest quadratic-ish behaviors inChandler Carruth2011-11-141-2/+26
| | | | | | | | | | | | | | this pass. We're leaving already merged blocks on the worklist, and scanning them again and again only to determine each time through that indeed they aren't viable. We can instead remove them once we're going to have to scan the worklist. This is the easy way to implement removing them. If this remains on the profile (as I somewhat suspect it will), we can get a lot more clever here, as the worklist's order is essentially irrelevant. We can use swapping and fold the two loops to reduce overhead even when there are many blocks on the worklist but only a few of them are removed. llvm-svn: 144531
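A minimal sketch of the "easy way" described in this message, assuming a worklist of block pointers and a per-block flag that says the block has already been merged. `Block`, `merged`, and `pruneWorklist` are hypothetical names, not the pass's actual types.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a basic block on the worklist.
struct Block {
  int id;
  bool merged; // already folded into a chain, so no longer viable
};

// Drop already merged blocks in one linear pass (erase-remove idiom), so
// later scans of the worklist do not revisit them.
void pruneWorklist(std::vector<Block *> &worklist) {
  worklist.erase(std::remove_if(worklist.begin(), worklist.end(),
                                [](const Block *b) { return b->merged; }),
                 worklist.end());
}
```

Because the worklist's order is essentially irrelevant, a swap-with-last-and-pop variant would also work and would avoid shifting elements; the version here keeps the original order only for simplicity.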
* Under the hood, MBPI is doing a linear scan of every successor everyChandler Carruth2011-11-141-4/+13
| | | | | | | | | | | | | | | | | | time it is queried to compute the probability of a single successor. This makes computing the probability of every successor of a block in sequence... really really slow. ;] This switches to a linear walk of the successors rather than a quadratic one. One of several quadratic behaviors slowing this pass down. I'm not really thrilled with moving the sum code into the public interface of MBPI, but I don't (at the moment) have ideas for a better interface. The direction I'm considering for a better interface is to have MBPI actually retain much more state and make *all* of these queries cheap. That's a lot of work, and would require invasive changes. Until then, this seems like the least bad (i.e., least quadratic) solution. Suggestions welcome. llvm-svn: 144530
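The quadratic-versus-linear point can be sketched as follows. This is an illustration under assumptions, not MBPI's interface: `successorProbabilities` is a hypothetical function, weights are plain `uint32_t` values, and probabilities are returned as `double` for readability rather than as an integer fraction type.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sketch: compute the probability of every successor edge.
// Summing the weights once up front makes the whole walk linear; re-summing
// inside the loop for each successor would make it quadratic.
std::vector<double>
successorProbabilities(const std::vector<uint32_t> &weights) {
  uint64_t sum = 0; // 64-bit accumulator so the total cannot wrap
  for (uint32_t w : weights)
    sum += w;

  std::vector<double> probs;
  probs.reserve(weights.size());
  for (uint32_t w : weights)
    probs.push_back(sum == 0 ? 0.0
                             : static_cast<double>(w) /
                                   static_cast<double>(sum));
  return probs;
}
```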
* Reuse the logic in getEdgeProbability within getHotSucc in order toChandler Carruth2011-11-141-11/+3
| | | | | | | | | | correctly handle blocks whose successor weights sum to more than UINT32_MAX. This is slightly less efficient, but the entire thing is already linear in the number of successors. Calling it within any hot routine is a mistake, and indeed no one is calling it. It also simplifies the code. llvm-svn: 144527
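To illustrate why sums above UINT32_MAX need care, here is one possible overflow-safe way to pick a hot successor. It is a sketch under assumptions and differs from the commit, which reuses getEdgeProbability: `hotSuccessor` is a hypothetical function, the 4/5 threshold is assumed for illustration, and the weights are accumulated in 64 bits instead of being rescaled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: return the index of the "hot" successor, or -1 if
// no successor is hot. The 4/5 threshold is an assumption for illustration.
int hotSuccessor(const std::vector<uint32_t> &weights) {
  uint64_t sum = 0;       // 64-bit, so totals above UINT32_MAX are fine
  uint64_t maxWeight = 0;
  int maxIndex = -1;
  for (std::size_t i = 0; i < weights.size(); ++i) {
    sum += weights[i];
    if (weights[i] > maxWeight) {
      maxWeight = weights[i];
      maxIndex = static_cast<int>(i);
    }
  }
  // Test maxWeight / sum >= 4/5 without division or floating point.
  if (maxIndex >= 0 && maxWeight * 5 >= sum * 4)
    return maxIndex;
  return -1;
}
```

Like the routine described in the message, this is a single linear pass over the successors.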