summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/X86/bt.ll
Commit message (Collapse)AuthorAgeFilesLines
* [x86] make 8-bit shl undesirableSanjay Patel2019-04-081-8/+7
| | | | | | | | | | | | | | | | I was looking at a potential DAGCombiner fix for 1 of the regressions in D60278, and it caused severe regression test pain because x86 TLI lies about the desirability of 8-bit shift ops. We've hinted at making all 8-bit ops undesirable for the reason in the code comment: // TODO: Almost no 8-bit ops are desirable because they have no actual // size/speed advantages vs. 32-bit ops, but they do have a major // potential disadvantage by causing partial register stalls. ...but that leads to massive diffs and exposes all kinds of optimization holes itself. Differential Revision: https://reviews.llvm.org/D60286 llvm-svn: 357912
* [SelectionDAG] Teach GetDemandedBits to look at the known zeros of the LHS ↵Craig Topper2019-02-201-2/+0
| | | | | | | | | | | | when handling ISD::AND If the LHS has known zeros, then the RHS immediate mask might have been simplified to remove those bits. This patch adds a call to computeKnownBits to get the known zeroes to handle that possibility. I left an early out to skip the call if all of the demanded bits are set in the mask. Differential Revision: https://reviews.llvm.org/D58464 llvm-svn: 354514
* [X86] Add test case to show missed opportunity to remove an explicit AND on ↵Craig Topper2019-02-201-0/+31
| | | | | | | | | | the bit position from BT when it has known zeros. NFC If the bit position has known zeros in it, then the AND immediate will likely be optimized to remove bits. This can prevent GetDemandedBits from recognizing that the AND is unnecessary. llvm-svn: 354501
* Revert r354498 "[X86] Add test case to show missed opportunity to remove an ↵Craig Topper2019-02-201-29/+0
| | | | | | | | explicit AND on the bit position from BT when it has known zeros." I accidentally committed more than just the test. llvm-svn: 354499
* [X86] Add test case to show missed opportunity to remove an explicit AND on ↵Craig Topper2019-02-201-0/+29
| | | | | | | | | | the bit position from BT when it has known zeros. If the bit position has known zeros in it, then the AND immediate will likely be optimized to remove bits. This can prevent GetDemandedBits from recognizing that the AND is unnecessary. llvm-svn: 354498
* [DAG] consolidate shift simplificationsSanjay Patel2018-11-231-1/+1
| | | | | | | | | | ...and use them to avoid creating obviously undef values as discussed in the post-commit thread for r347478. The diffs in vector div/rem show that we were missing real optimizations by creating bogus shift nodes. llvm-svn: 347502
* [X86] Handle COPYs of physregs better (regalloc hints)Simon Pilgrim2018-09-191-6/+6
| | | | | | | | | | | | | | Enable enableMultipleCopyHints() on X86. Original Patch by @jonpa: While enabling the mischeduler for SystemZ, it was discovered that for some reason a test needed one extra seemingly needless COPY (test/CodeGen/SystemZ/call-03.ll). The handling for that is resulted in this patch, which improves the register coalescing by providing not just one copy hint, but a sorted list of copy hints. On SystemZ, this gives ~12500 less register moves on SPEC, as well as marginally less spilling. Instead of improving just the SystemZ backend, the improvement has been implemented in common-code (calculateSpillWeightAndHint(). This gives a lot of test failures, but since this should be a general improvement I hope that the involved targets will help and review the test updates. Differential Revision: https://reviews.llvm.org/D38128 llvm-svn: 342578
* [CodeGen] Unify MBB reference format in both MIR and debug outputFrancis Visoiu Mistrih2017-12-041-120/+120
| | | | | | | | | | | | | | | | As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions. * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g' * find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g' * grep -nr 'BB#' and fix Differential Revision: https://reviews.llvm.org/D40422 llvm-svn: 319665
* [SelectionDAG][X86] CombineBT - more aggressively determine demanded bitsSimon Pilgrim2017-07-291-5/+1
| | | | | | | | | | | | This patch is in 2 parts: 1 - replace combineBT's use of SimplifyDemandedBits (hasOneUse only) with SelectionDAG::GetDemandedBits to more aggressively determine the lower bits used by BT. 2 - update SelectionDAG::GetDemandedBits to support ANY_EXTEND - if the demanded bits are only in the non-extended portion, then peek through and demand from the source value and then ANY_EXTEND that if we found a match. Differential Revision: https://reviews.llvm.org/D35896 llvm-svn: 309486
* [X86] Add combineBT test failure because bits have multiple uses.Simon Pilgrim2017-07-261-0/+64
| | | | llvm-svn: 309124
* [X86] Regenerated BT testsSimon Pilgrim2017-07-261-297/+628
| | | | | | Test on 32/64 bit targets where appropriate llvm-svn: 309107
* [x86] regenerate checks with update_llc_test_checks.pySanjay Patel2017-06-121-32/+177
| | | | | | | | | | The dream of a unified check-line auto-generator for all phases of compilation is dead. The llc script has already diverged to be better at its goal, so having 2 scripts that do almost the same thing is just causing confusion for newcomers. I plan to fix up more x86 tests in a next commit. We can rip out the llc ability in update_test_checks.py after that. llvm-svn: 305202
* CodeGen: Allow small copyable blocks to "break" the CFG.Kyle Butt2017-01-311-4/+4
| | | | | | | | | | | When choosing the best successor for a block, ordinarily we would have preferred a block that preserves the CFG unless there is a strong probability the other direction. For small blocks that can be duplicated we now skip that requirement as well, subject to some simple frequency calculations. Differential Revision: https://reviews.llvm.org/D28583 llvm-svn: 293716
* Revert "CodeGen: Allow small copyable blocks to "break" the CFG."Kyle Butt2017-01-111-6/+4
| | | | | | | | | This reverts commit ada6595a526d71df04988eb0a4b4fe84df398ded. This needs a simple probability check because there are some cases where it is not profitable. llvm-svn: 291695
* CodeGen: Allow small copyable blocks to "break" the CFG.Kyle Butt2017-01-101-4/+6
| | | | | | | | | | | When choosing the best successor for a block, ordinarily we would have preferred a block that preserves the CFG unless there is a strong probability the other direction. For small blocks that can be duplicated we now skip that requirement as well. Differential revision: https://reviews.llvm.org/D27742 llvm-svn: 291609
* X86: Improve BT instruction selection for 64-bit values.Peter Collingbourne2016-10-211-0/+12
| | | | | | | | | | If a 64-bit value is tested against a bit which is known to be in the range [0..31) (modulo 64), we can use the 32-bit BT instruction, which has a slightly shorter encoding. Differential Revision: https://reviews.llvm.org/D25862 llvm-svn: 284864
* AVX-512: Fixed BT instruction selection.Elena Demikhovsky2016-07-191-452/+80
| | | | | | | | | | | The following condition expression ( a >> n) & 1 is converted to "bt a, n" instruction. It works on all intel targets. But on AVX-512 it was broken because the expression is modified to (truncate (a >>n) to i1). I added the new sequence (truncate (a >>n) to i1) to the BT pattern. Differential Revision: https://reviews.llvm.org/D22354 llvm-svn: 275950
* X86: Updated a test file. NFC.Elena Demikhovsky2016-07-171-79/+452
| | | | | | | This test shows subotimal code generated for AVX-512 vs PENTIUM4. The issue will be fixed in an upcomming commit. llvm-svn: 275702
* auto-generate checksSanjay Patel2016-07-141-394/+458
| | | | | | | Note: I removed the checks after each jump because that's noise, but we apparently need branches rather than returning i1 to see the bt codegen in some cases. llvm-svn: 275439
* Fix non-deterministic SDNodeOrder-dependent codegenNico Rieck2014-01-121-1/+1
| | | | | | | Reset SelectionDAGBuilder's SDNodeOrder to ensure deterministic code generation. llvm-svn: 199050
* Enable MI Sched for x86.Andrew Trick2013-10-151-27/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | This changes the SelectionDAG scheduling preference to source order. Soon, the SelectionDAG scheduler can be bypassed saving a nice chunk of compile time. Performance differences that result from this change are often a consequence of register coalescing. The register coalescer is far from perfect. Bugs can be filed for deficiencies. On x86 SandyBridge/Haswell, the source order schedule is often preserved, particularly for small blocks. Register pressure is generally improved over the SD scheduler's ILP mode. However, we are still able to handle large blocks that require latency hiding, unlike the SD scheduler's BURR mode. MI scheduler also attempts to discover the critical path in single-block loops and adjust heuristics accordingly. The MI scheduler relies on the new machine model. This is currently unimplemented for AVX, so we may not be generating the best code yet. Unit tests are updated so they don't depend on SD scheduling heuristics. llvm-svn: 192750
* Revert "Temporarily enable MI-Sched on X86."Andrew Trick2013-06-251-27/+27
| | | | | | This reverts commit 98a9b72e8c56dc13a2617de84503a3d78352789c. llvm-svn: 184823
* Temporarily enable MI-Sched on X86.Andrew Trick2013-06-241-27/+27
| | | | | | | Sorry for the unit test churn. I'll try to make the change permanently next time. llvm-svn: 184705
* Remove a recently redundant transform from X86ISelLowering.David Majnemer2013-05-051-4/+1
| | | | | | | | | | | | X86ISelLowering has support to treat: (icmp ne (and (xor %flags, -1), (shl 1, flag)), 0) as if it were actually: (icmp eq (and %flags, (shl 1, flag)), 0) However, r179386 has code at the InstCombine level to handle this. llvm-svn: 181145
* Add x86 isel lowering logic to form bit test with inverted condition. e.g.Evan Cheng2012-12-051-3/+97
| | | | | | | | | x ^ -1. Patch by David Majnemer. rdar://12755626 llvm-svn: 169339
* Eliminate more uses of llvm-as and llvm-dis.Dan Gohman2009-09-081-3/+3
| | | | llvm-svn: 81290
* Add explicit -march=x86 to these tests so that they don'tDan Gohman2009-02-031-2/+2
| | | | | | default to -march=x86-64 on 64-bit hosts. llvm-svn: 63579
* Make x86's BT instruction matching more thorough, and add someDan Gohman2009-01-291-3/+417
| | | | | | | | | dagcombines that help it match in several more cases. Add several more cases to test/CodeGen/X86/bt.ll. This doesn't yet include matching for BT with an immediate operand, it just covers more register+register cases. llvm-svn: 63266
* Disable the register+memory forms of the bt instructions for now. ThanksDan Gohman2009-01-131-1/+6
| | | | | | | | to Eli for pointing out that these forms don't ignore the high bits of their index operands, and as such are not immediately suitable for use by isel. llvm-svn: 62194
* Do not isel load folding bt instructions for pentium m, core, core2, and AMD ↵Evan Cheng2009-01-021-0/+2
| | | | | | processors. These are significantly slower than a load followed by a bt of a register. llvm-svn: 61557
* add PR #Chris Lattner2008-12-251-0/+1
| | | | llvm-svn: 61427
* Add a simple pattern for matching 'bt'.Chris Lattner2008-12-251-0/+20
llvm-svn: 61426
OpenPOWER on IntegriCloud