path: root/llvm/test/CodeGen/X86/avx-arith.ll
* [CodeGen] Unify MBB reference format in both MIR and debug output
  Francis Visoiu Mistrih, 2017-12-04 (1 file, -31/+31)

  As part of the unification of the debug format and the MIR format,
  print MBB references as '%bb.5'. The MIR printer prints the IR name
  of an MBB only for block definitions.

  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
  * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
  * find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
  * grep -nr 'BB#' and fix the remaining references by hand

  Differential Revision: https://reviews.llvm.org/D40422

  llvm-svn: 319665
* [X86][SSE] Improve lowering of vXi64 multiplies
  Simon Pilgrim, 2016-12-21 (1 file, -12/+10)

  As mentioned on PR30845, we were performing our vXi64 multiplication as:

    AloBlo = pmuludq(a, b);
    AloBhi = pmuludq(a, psrlqi(b, 32));
    AhiBlo = pmuludq(psrlqi(a, 32), b);
    return AloBlo + psllqi(AloBhi, 32) + psllqi(AhiBlo, 32);

  when we could avoid one of the upper shifts with:

    AloBlo = pmuludq(a, b);
    AloBhi = pmuludq(a, psrlqi(b, 32));
    AhiBlo = pmuludq(psrlqi(a, 32), b);
    return AloBlo + psllqi(AloBhi + AhiBlo, 32);

  This matches the lowering on gcc/icc.

  Differential Revision: https://reviews.llvm.org/D27756

  llvm-svn: 290267
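  To make the lowering concrete, here is a minimal sketch (the
  function name and vector width are illustrative, not taken from the
  actual test) of the IR that triggers this expansion. x86 has no full
  64-bit-lane multiply before AVX-512DQ's vpmullq, so a plain mul is
  decomposed into the pmuludq partial products above, following the
  schoolbook identity a*b = alo*blo + ((alo*bhi + ahi*blo) << 32)
  (mod 2^64), which is also why the two cross products can be summed
  before a single shift:

    ; Illustrative only: a <4 x i64> multiply, legalized into
    ; pmuludq/psrlq/psllq/paddq partial products as described above.
    define <4 x i64> @mul_v4i64(<4 x i64> %a, <4 x i64> %b) {
      %r = mul <4 x i64> %a, %b
      ret <4 x i64> %r
    }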
* [X86] Remove another weird scalar sqrt/rcp/rsqrt pattern.
  Craig Topper, 2016-12-06 (1 file, -1/+2)

  This pattern turned a vector sqrt/rcp/rsqrt operation of
  sse_load_f32/f64 into the scalar instruction for the operation and
  put undef into the upper bits. For correctness, the resulting code
  should still perform the sqrt/rcp/rsqrt on the upper bits after the
  load is extended, since that's what the operation asked for. In
  particular, when the upper bits are zero, we need to calculate the
  sqrt/rcp/rsqrt of those zeroes and keep the result in the upper
  bits, which means we should still be using the packed instruction.

  The only test case for this pattern is one I just added, so there
  was no prior coverage.

  llvm-svn: 288784
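  A minimal sketch of the shape in question (illustrative IR, not the
  actual test case): because the whole <2 x double> operand is loaded
  and consumed, a correct lowering must use the packed sqrtpd; a
  scalar sqrtsd would compute only lane 0 and leave the upper lane's
  sqrt undone:

    ; Illustrative only: the packed sqrt applies to both lanes of the
    ; loaded vector, so it cannot be matched to a scalar instruction.
    declare <2 x double> @llvm.sqrt.v2f64(<2 x double>)

    define <2 x double> @sqrt_of_load(<2 x double>* %p) {
      %v = load <2 x double>, <2 x double>* %p
      %r = call <2 x double> @llvm.sqrt.v2f64(<2 x double> %v)
      ret <2 x double> %r
    }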
* [X86] Add test case demonstrating a case where a vector sqrt being
  passed (scalar_to_vector loadf64) uses a scalar sqrt instruction.
  Craig Topper, 2016-12-06 (1 file, -0/+12)

  This occurs due to a pattern that uses sse_load_f32/f64 with vector
  sqrt/rcp/rsqrt operations and turns them into scalar instructions.
  Perhaps for the case where the upper bits come from undef this is
  ok. I believe a (vzmovl load64) would do the same thing, but those
  seem to become vzload instead, and selectScalarSSELoad doesn't
  handle that today. In that case we should be performing the vector
  operation on the zeros in the upper bits, which is not equivalent
  to using a scalar instruction.

  I will remove this pattern in a follow-up patch. There appears to
  be no other test content for it.

  llvm-svn: 288783
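  For illustration, a hedged sketch of the (scalar_to_vector loadf64)
  shape this test exercises (names are mine, not from the commit):
  only lane 0 of the vector is defined and the upper lane is undef,
  which is what made matching the scalar instruction look plausible:

    ; Illustrative only: a scalar load widened to a vector with an
    ; undef upper lane, then passed to a vector sqrt.
    declare <2 x double> @llvm.sqrt.v2f64(<2 x double>)

    define <2 x double> @sqrt_of_scalar_load(double* %p) {
      %s = load double, double* %p
      %v = insertelement <2 x double> undef, double %s, i32 0
      %r = call <2 x double> @llvm.sqrt.v2f64(<2 x double> %v)
      ret <2 x double> %r
    }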
* [X86] Regenerate a test using update_llc_test_checks.py
  Craig Topper, 2016-12-06 (1 file, -92/+183)

  llvm-svn: 288782
* [opaque pointer type] Add textual IR support for explicit type
  parameter to load instruction
  David Blaikie, 2015-02-27 (1 file, -3/+3)

  Essentially the same as the GEP change in r230786.

  A similar migration script can be used to update test cases, though
  a few more test case improvements/changes were required this time
  around: (r229269-r229278)

    import fileinput
    import sys
    import re

    pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

    for line in sys.stdin:
        sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

  Reviewers: rafael, dexonsmith, grosser

  Differential Revision: http://reviews.llvm.org/D7649

  llvm-svn: 230794
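  For context, this is what the migration does to each load (a
  made-up example, not taken from the actual diff): the loaded type
  moves out of the pointer type and becomes an explicit first operand:

    define i32 @example(i32* %p) {
      ; Old syntax, type inferred from the pointer operand:
      ;   %v = load i32* %p
      ; New syntax, with the explicit type parameter:
      %v = load i32, i32* %p
      ret i32 %v
    }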
* Enable MI Sched for x86.
  Andrew Trick, 2013-10-15 (1 file, -3/+2)

  This changes the SelectionDAG scheduling preference to source
  order. Soon, the SelectionDAG scheduler can be bypassed, saving a
  nice chunk of compile time.

  Performance differences that result from this change are often a
  consequence of register coalescing. The register coalescer is far
  from perfect. Bugs can be filed for deficiencies.

  On x86 SandyBridge/Haswell, the source order schedule is often
  preserved, particularly for small blocks. Register pressure is
  generally improved over the SD scheduler's ILP mode. However, we
  are still able to handle large blocks that require latency hiding,
  unlike the SD scheduler's BURR mode. The MI scheduler also attempts
  to discover the critical path in single-block loops and adjust
  heuristics accordingly.

  The MI scheduler relies on the new machine model. This is currently
  unimplemented for AVX, so we may not be generating the best code yet.

  Unit tests are updated so they don't depend on SD scheduling
  heuristics.

  llvm-svn: 192750
* Revert "Temporarily enable MI-Sched on X86."Andrew Trick2013-06-251-2/+3
| | | | | | This reverts commit 98a9b72e8c56dc13a2617de84503a3d78352789c. llvm-svn: 184823
* Temporarily enable MI-Sched on X86.
  Andrew Trick, 2013-06-24 (1 file, -3/+2)

  Sorry for the unit test churn. I'll try to make the change permanent
  next time.

  llvm-svn: 184705
* Fixed vsqrt.ss intrinsic usage - order of input operands was wrong.
  Elena Demikhovsky, 2011-11-29 (1 file, -0/+11)

  Added a test. Thanks Bruno for reviewing the patch.

  llvm-svn: 145403
* Break 256-bit vector int add/sub/mul into two 128-bit operations to
  avoid costly scalarization. Fixes PR10711.
  Craig Topper, 2011-08-24 (1 file, -0/+128)

  llvm-svn: 138427
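  As a hedged illustration (the function name and element type are
  mine, not from the patch): AVX1 provides 256-bit floating-point
  ALUs but no 256-bit integer ones, so an add like the one below is
  legalized as two 128-bit halves instead of being scalarized element
  by element:

    ; Illustrative only: without AVX2, this is split into two 128-bit
    ; vpaddd operations on the low and high halves.
    define <8 x i32> @add_v8i32(<8 x i32> %a, <8 x i32> %b) {
      %r = add <8 x i32> %a, %b
      ret <8 x i32> %r
    }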
* Rename and tidy up tests
  Bruno Cardoso Lopes, 2011-08-09 (1 file, -0/+133)

  llvm-svn: 137103