path: root/llvm/test/CodeGen/X86/bitreverse.ll
* Recommit r367901 "[X86] Enable -x86-experimental-vector-widening-legalization by default." (Craig Topper, 2019-08-07; 1 file changed, -8/+5)

  The assert that caused this to be reverted should be fixed now.

  Original commit message:

  This patch changes our default legalization behavior for 16-, 32-, and 64-bit vectors with i8/i16/i32/i64 scalar types from promotion to widening. For example, v8i8 will now be widened to v16i8 instead of promoted to v8i16. This keeps the element widths the same and pads with undef elements. We believe this is a better legalization strategy, but it carries some issues due to the fragmented vector ISA. For example, i8 shifts and multiplies get widened and then later have to be promoted/split into vXi16 vectors. This has the potential to cause regressions, so we wanted to get it in early in the 10.0 cycle so we have plenty of time to address them.

  Next steps will be to merge the tests that explicitly test the command line option, and then we can remove the option and its associated code.

  llvm-svn: 368183
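As context for the commit above, the kind of vector case it affects can be sketched with a hypothetical test function like the one below (the exact test names and types in bitreverse.ll may differ): a bitreverse of a 64-bit-wide v8i8 vector, which under the new default is widened to v16i8 with undef padding instead of being promoted to v8i16.

    ; Hypothetical illustration: a v8i8 operation that type legalization now
    ; widens to v16i8 (padding with undef lanes) rather than promoting to v8i16.
    declare <8 x i8> @llvm.bitreverse.v8i8(<8 x i8>)

    define <8 x i8> @test_bitreverse_v8i8(<8 x i8> %a) {
      %b = call <8 x i8> @llvm.bitreverse.v8i8(<8 x i8> %a)
      ret <8 x i8> %b
    }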
* Revert "[X86] Enable -x86-experimental-vector-widening-legalization by default."Mitch Phillips2019-08-061-5/+8
| | | | | | | | | This reverts commit 3de33245d2c992c9e0af60372043540b60f3a810. This commit broke the MSan buildbots. See https://reviews.llvm.org/rL367901 for more information. llvm-svn: 368107
* [X86] Enable -x86-experimental-vector-widening-legalization by default. (Craig Topper, 2019-08-05; 1 file changed, -8/+5)

  This patch changes our default legalization behavior for 16-, 32-, and 64-bit vectors with i8/i16/i32/i64 scalar types from promotion to widening. For example, v8i8 will now be widened to v16i8 instead of promoted to v8i16. This keeps the element widths the same and pads with undef elements. We believe this is a better legalization strategy, but it carries some issues due to the fragmented vector ISA. For example, i8 shifts and multiplies get widened and then later have to be promoted/split into vXi16 vectors. This has the potential to cause regressions, so we wanted to get it in early in the 10.0 cycle so we have plenty of time to address them.

  Next steps will be to merge the tests that explicitly test the command line option, and then we can remove the option and its associated code.

  llvm-svn: 367901
* [X86FixupLEAs] Turn optIncDec into a generic two-address LEA optimizer. Support LEA64_32r properly. (Craig Topper, 2019-05-25; 1 file changed, -2/+2)

  INC/DEC is really a special case of a more generic issue: we should also turn LEAs into add reg/reg or add reg/imm regardless of the slow-LEA flags. This also supports LEA64_32r, which has 64-bit input registers and 32-bit output registers, so we need to convert the 64-bit inputs to their 32-bit equivalents to check whether they are equal to the base register.

  One thing to note: the original code preserved the kill flags by adding operands to the new instruction instead of using addReg. But I think tied operands aren't supposed to have the kill flag set. I dropped the kill flags, but I could probably try to preserve them in the add reg/reg case if we think it's important. I'm not sure which operand it's supposed to go on for the LEA64_32r instruction due to the super-register implicit uses, though I'm also not sure those are needed since they were probably just created by an INSERT_SUBREG from a 32-bit input.

  Differential Revision: https://reviews.llvm.org/D61472

  llvm-svn: 361691
* [SelectionDAG] computeKnownBits - support constant pool values from target (Simon Pilgrim, 2019-05-24; 1 file changed, -12/+6)

  This patch adds the overridable TargetLowering::getTargetConstantFromLoad function, which allows targets to return any constant value loaded by a LoadSDNode node - only X86 makes use of this so far, but everything should be in place for other targets. computeKnownBits then uses this function to improve codegen, notably vector code after legalization.

  A future commit will do the same for ComputeNumSignBits, but computeKnownBits sees the bigger benefit.

  This required a couple of fixes:
  * SimplifyDemandedBits must early-out for getTargetConstantFromLoad cases to prevent infinite loops of constant regeneration (similar to what we already do for BUILD_VECTOR).
  * Fix a DAGCombiner::visitTRUNCATE issue as we had trunc(shl(v8i32),v8i16) <-> shl(trunc(v8i16),v8i32) infinite loops after legalization on AVX512 targets.

  Differential Revision: https://reviews.llvm.org/D61887

  llvm-svn: 361620
* [X86] Enable 8-bit OR with disjoint bits to convert to LEA (Craig Topper, 2019-03-05; 1 file changed, -26/+26)

  We already support 8-bit adds in convertToThreeAddress, but we can also support 8-bit OR if the bits are disjoint. We already do this for 16/32/64-bit.

  Differential Revision: https://reviews.llvm.org/D58863

  llvm-svn: 355423
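For reference, a hypothetical IR example of an 8-bit OR with disjoint operands (the real change lives in convertToThreeAddress and fires during two-address rewriting, not on this exact pattern specifically):

    ; %lo can only have bits 0-3 set and the constant 16 only has bit 4 set,
    ; so the OR behaves like an add of values with no bits in common - the
    ; case that can now be converted to an LEA when that is profitable.
    define i8 @disjoint_or_i8(i8 %x) {
      %lo = and i8 %x, 15
      %r  = or i8 %lo, 16
      ret i8 %r
    }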
* [X86][SSE] Canonicalize OR(AND(X,C),AND(Y,~C)) -> OR(AND(X,C),ANDNP(C,Y)) (Simon Pilgrim, 2019-01-22; 1 file changed, -11/+10)

  For constant bit-select patterns, replace one AND with an ANDNP, allowing us to reuse the constant mask. Only do this if the mask has multiple uses (to avoid losing load folding) or if we have XOP, as its VPCMOV can handle most folding commutations.

  This also requires computeKnownBitsForTargetNode support for X86ISD::ANDNP and X86ISD::FOR to prevent regressions in fabs/fcopysign patterns.

  Differential Revision: https://reviews.llvm.org/D55935

  llvm-svn: 351819
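To illustrate the pattern being canonicalized, a constant bit select in IR looks roughly like the hypothetical function below: the two AND masks are bitwise complements, so the OR takes bits of %x where the mask is set and bits of %y elsewhere, and the rewrite lets both ANDs share a single constant.

    ; OR(AND(X,C),AND(Y,~C)): select the low byte of each lane from %x and the
    ; remaining bits from %y (-256 is the bitwise complement of 255).
    define <4 x i32> @const_bit_select(<4 x i32> %x, <4 x i32> %y) {
      %xm = and <4 x i32> %x, <i32 255, i32 255, i32 255, i32 255>
      %ym = and <4 x i32> %y, <i32 -256, i32 -256, i32 -256, i32 -256>
      %r  = or <4 x i32> %xm, %ym
      ret <4 x i32> %r
    }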
* [LegalizeTypes] Prevent an assertion from PromoteIntRes_BSWAP and PromoteIntRes_BITREVERSE if the shift amount is too large for the VT returned by getShiftAmountTy (Craig Topper, 2018-10-13; 1 file changed, -0/+618)

  Summary: getShiftAmountTy for X86 returns MVT::i8. If a BSWAP or BITREVERSE is created that requires promotion and the difference between the original VT and the promoted VT is more than 255, then we won't be able to create the constant. This patch adds a check to replace the result from getShiftAmountTy with MVT::i32 if the difference won't fit. This should get legalized later when the shift is ultimately expanded, since it's clearly an illegal type that we're only promoting to make it a power-of-2 bit width.

  Alternatively we could base the decision completely on the largest shift amount the promoted VT could use.

  Vectors should be immune here because getShiftAmountTy always returns the incoming VT for vectors; only the scalar shift amount can be changed by the targets.

  Reviewers: eli.friedman, RKSimon, spatel

  Reviewed By: RKSimon

  Subscribers: llvm-commits

  Differential Revision: https://reviews.llvm.org/D53232

  llvm-svn: 344460
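To make the failure mode concrete, a hypothetical test along these lines (the exact width and name used by the real test may differ) is a bitreverse of a very wide, non-power-of-2 integer. Promoting i528 to the next power of 2, i1024, requires shifting the promoted result right by 1024 - 528 = 496 bits, and 496 does not fit in X86's MVT::i8 shift amount type, so the shift amount is created as i32 instead.

    ; Hypothetical wide-integer case: i528 is first promoted to i1024, so the
    ; promoted value must be shifted right by 496 bits, which overflows an i8
    ; shift-amount constant.
    declare i528 @llvm.bitreverse.i528(i528)

    define i528 @test_bitreverse_i528(i528 %a) {
      %b = call i528 @llvm.bitreverse.i528(i528 %a)
      ret i528 %b
    }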
* [X86] Handle COPYs of physregs better (regalloc hints) (Simon Pilgrim, 2018-09-19; 1 file changed, -27/+30)

  Enable enableMultipleCopyHints() on X86.

  Original patch by @jonpa: While enabling the mischeduler for SystemZ, it was discovered that for some reason a test needed one extra, seemingly needless COPY (test/CodeGen/SystemZ/call-03.ll). The handling of that resulted in this patch, which improves register coalescing by providing not just one copy hint, but a sorted list of copy hints. On SystemZ, this gives ~12,500 fewer register moves on SPEC, as well as marginally less spilling.

  Instead of improving just the SystemZ backend, the improvement has been implemented in common code (calculateSpillWeightAndHint()). This gives a lot of test failures, but since this should be a general improvement I hope that the involved targets will help review the test updates.

  Differential Revision: https://reviews.llvm.org/D38128

  llvm-svn: 342578
* Followup on Proposal to move MIR physical register namespace to '$' sigil. (Puyan Lotfi, 2018-01-31; 1 file changed, -7/+7)

  Discussed here: http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html

  In preparation for adding support for named vregs we are changing the sigil for physical registers in MIR to '$' from '%'. This will prevent name clashes of named physical registers with named vregs.

  llvm-svn: 323922
* [CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register. (Francis Visoiu Mistrih, 2017-12-07; 1 file changed, -7/+7)

  Work towards the unification of MIR and debug output by refactoring the interfaces. For MachineOperand::print, keep a simple version that can be easily called from `dump()`, and a more complex one which will be called from both the MIRPrinter and MachineInstr::print.

  Add extra checks inside MachineOperand for detached operands (operands with getParent() == nullptr).

  https://reviews.llvm.org/D40836

  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g'
  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g'
  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g'
  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/<def>//g'
  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g'
  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g'
  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g'
  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<def[ ]*,[ ]*dead>/dead \1/g'
  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ]*,[ ]*dead>/implicit-def dead \1/g'
  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g'
  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g'
  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g'
  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g'

  llvm-svn: 320022
* [CodeGen] Unify MBB reference format in both MIR and debug output (Francis Visoiu Mistrih, 2017-12-04; 1 file changed, -30/+30)

  As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions.

  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
  * find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
  * grep -nr 'BB#' and fix

  Differential Revision: https://reviews.llvm.org/D40422

  llvm-svn: 319665
* [CodeGen] Print register names in lowercase in both MIR and debug output (Francis Visoiu Mistrih, 2017-11-28; 1 file changed, -7/+7)

  As part of the unification of the debug format and the MIR format, always print registers as lowercase.

  * Only debug printing is affected. It now follows MIR.

  Differential Revision: https://reviews.llvm.org/D40417

  llvm-svn: 319187
* [X86FixupBWInsts] More precise register liveness if no <imp-use> on MOVs. (Nikolai Bozhenov, 2017-09-18; 1 file changed, -3/+3)

  Summary: Subregister liveness tracking is not implemented for the X86 backend, so sometimes the whole super-register is said to be live when only a subregister is really live. That might happen if the def and the use are located in different MBBs; see the added fixup-bw-isnt.mir test. However, using knowledge of the specific instructions handled by the bw-fixup pass we can get more precise liveness information, which this change does.

  Reviewers: MatzeB, DavidKreitzer, ab, andrew.w.kaylor, craig.topper

  Reviewed By: craig.topper

  Subscribers: n.bozhenov, myatsina, llvm-commits, hiraditya

  Patch by Andrei Elovikov <andrei.elovikov@intel.com>

  Differential Revision: https://reviews.llvm.org/D37559

  llvm-svn: 313524
* [X86] Add extra BITREVERSE tests (Simon Pilgrim, 2017-03-14; 1 file changed, -131/+442)

  Test on 32-bit and 64-bit targets. Add bitreverse tests for i64, i32 and i16.

  llvm-svn: 297741
* [SelectionDAG] Optimization of BITREVERSE legalization for power-of-2 integer scalar/vector types (Simon Pilgrim, 2016-07-22; 1 file changed, -251/+72)

  An extension of D19978, this patch replaces the default BITREVERSE evaluation of individual bit masks+shifts with block masks+shifts when we have integer elements of power-of-2 size in bits.

  After calling BSWAP to reverse the order of the constituent bytes (which typically follows a similar approach), every neighbouring 4-bit, 2-bit and finally 1-bit pair is masked off and swapped over with shifts. In doing so we can significantly reduce the number of operations required.

  Differential Revision: https://reviews.llvm.org/D21578

  llvm-svn: 276432
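A minimal sketch of the block mask+shift expansion described above, written out for a scalar i32 (the actual expansion is built at the SelectionDAG level, and the vector form uses the same constants splatted per element): bytes are reversed with BSWAP first, then nibbles, 2-bit pairs and single bits are swapped with three mask/shift/or steps.

    ; Sketch of the i32 expansion (constants in decimal; hex shown in comments).
    declare i32 @llvm.bswap.i32(i32)

    define i32 @bitreverse_i32_expansion(i32 %x) {
      %b  = call i32 @llvm.bswap.i32(i32 %x)   ; reverse byte order first
      ; swap neighbouring 4-bit nibbles
      %h4 = and i32 %b, 4042322160             ; 0xF0F0F0F0
      %s4 = lshr i32 %h4, 4
      %l4 = and i32 %b, 252645135              ; 0x0F0F0F0F
      %t4 = shl i32 %l4, 4
      %n  = or i32 %s4, %t4
      ; swap neighbouring 2-bit pairs
      %h2 = and i32 %n, 3435973836             ; 0xCCCCCCCC
      %s2 = lshr i32 %h2, 2
      %l2 = and i32 %n, 858993459              ; 0x33333333
      %t2 = shl i32 %l2, 2
      %p  = or i32 %s2, %t2
      ; swap neighbouring single bits
      %h1 = and i32 %p, 2863311530             ; 0xAAAAAAAA
      %s1 = lshr i32 %h1, 1
      %l1 = and i32 %p, 1431655765             ; 0x55555555
      %t1 = shl i32 %l1, 1
      %r  = or i32 %s1, %t1
      ret i32 %r
    }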
* VirtRegMap: Replace some identity copies with KILL instructions. (Matthias Braun, 2016-07-09; 1 file changed, -0/+2)

  An identity COPY like this:

    %AL = COPY %AL, %EAX<imp-def>

  has no semantic effect, but encodes liveness information: further users of %EAX only depend on this instruction even though it does not define the full register.

  Replace the COPY with a KILL instruction in those cases to maintain this liveness information. (This reverts a small part of r238588, but this time adds a comment explaining why a KILL instruction is useful.)

  llvm-svn: 274952
* [X86] Regenerated bitreverse tests to demonstrate what is going on. (Simon Pilgrim, 2016-07-08; 1 file changed, -7/+275)

  llvm-svn: 274915
* [X86] Added bitreverse tests for non-legal types (Simon Pilgrim, 2016-07-08; 1 file changed, -4/+40)

  Requested on D21578

  llvm-svn: 274914
* [x86] avoid printing unnecessary sign bits of hex immediates in asm comments (PR20347) (Sanjay Patel, 2016-05-28; 1 file changed, -1/+1)

  It would be better to check the valid/expected size of the immediate operand, but this is generally better than what we print right now.

  Differential Revision: http://reviews.llvm.org/D20385

  llvm-svn: 271114
* [SelectionDAG] Added bitreverse(bitreverse(v)) --> v (Simon Pilgrim, 2016-05-07; 1 file changed, -320/+20)

  Added bitreverse creation testing.

  llvm-svn: 268865
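The fold being tested here is the involution property of bit reversal: reversing twice gives back the original value. A minimal IR example (the function name is illustrative) looks like this:

    ; bitreverse(bitreverse(x)) folds back to x.
    declare i32 @llvm.bitreverse.i32(i32)

    define i32 @double_bitreverse(i32 %x) {
      %a = call i32 @llvm.bitreverse.i32(i32 %x)
      %b = call i32 @llvm.bitreverse.i32(i32 %a)
      ret i32 %b
    }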
* [X86] Added BITREVERSE constant folding and identity tests (Simon Pilgrim, 2016-05-07; 1 file changed, -1/+363)

  Identity tests are currently failing - this will be fixed soon.

  llvm-svn: 268862
* [SDAG] Introduce a new BITREVERSE node along with a corresponding LLVM intrinsic (James Molloy, 2015-11-12; 1 file changed, -0/+22)

  Several backends have instructions to reverse the order of bits in an integer. Conceptually matching such patterns is similar to @llvm.bswap, and it was mentioned in http://reviews.llvm.org/D14234 that it would be best if these patterns were matched in InstCombine instead of reimplemented in every different target.

  This patch introduces an intrinsic @llvm.bitreverse.i* that operates similarly to @llvm.bswap. For plumbing purposes there is also a new ISD node, ISD::BITREVERSE, with simple expansion and promotion support.

  The intention is that InstCombine's BSWAP detection logic will be extended to support BITREVERSE too, and @llvm.bitreverse intrinsics emitted (if the backend supports lowering it efficiently).

  llvm-svn: 252878
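A minimal usage example of the intrinsic introduced by this commit (it is overloaded over any integer width, scalar or vector; the scalar i32 form is shown, and the function name is illustrative):

    declare i32 @llvm.bitreverse.i32(i32)

    define i32 @reverse_bits(i32 %x) {
      ; returns %x with its 32 bits in reverse order
      %r = call i32 @llvm.bitreverse.i32(i32 %x)
      ret i32 %r
    }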