path: root/llvm/test/CodeGen/X86/vec_extract.ll
Commit message | Author | Age | Files | Lines
* [X86][SSE] Avoid specifying unused arguments in SHUFPD lowering | Simon Pilgrim | 2016-08-22 | 1 | -4/+4
  As discussed on PR26491, we are missing the opportunity to make use of the smaller MOVHLPS instruction because we set both arguments of a SHUFPD when using it to lower a single input shuffle.

  This patch sets the lowered argument to UNDEF if that shuffle element is undefined. This in turn makes it easier for target shuffle combining to decode UNDEF shuffle elements, allowing combines to MOVHLPS to occur.

  A fix to match against MOVHPD stores was necessary as well.

  This builds on the improved MOVLHPS/MOVHLPS lowering and memory folding support added in D16956.

  Adding similar support for SHUFPS will have to wait until we have better support for target combining of binary shuffles.

  Differential Revision: https://reviews.llvm.org/D23027
  llvm-svn: 279430
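The lane arithmetic behind that commit can be sketched with a toy model. This is only an illustration in Python, not LLVM code: vectors are lists `[low, high]`, the helper names `shufpd`, `movhlps`, and `matches` are hypothetical, and `None` stands in for an UNDEF shuffle element. It shows that when the high result lane is undefined, MOVHLPS produces an acceptable result for the same single-input shuffle that SHUFPD would otherwise be used for.

```python
# Toy lane-level model of two SSE instructions on 2 x f64 vectors.
# Vectors are Python lists [low, high]; all names here are hypothetical.

def shufpd(a, b, imm):
    """SHUFPD: low lane is picked from a by imm bit 0, high lane from b by imm bit 1."""
    return [a[imm & 1], b[(imm >> 1) & 1]]

def movhlps(dst, src):
    """MOVHLPS: low half of the result is the high half of src; the high half of dst is kept."""
    return [src[1], dst[1]]

def matches(result, wanted):
    """Compare against a wanted vector in which None marks an undefined lane."""
    return all(w is None or r == w for r, w in zip(result, wanted))

v = [1.5, 2.5]
wanted = [v[1], None]            # shuffle mask <1, undef>

via_shufpd = shufpd(v, v, 0b01)  # both operands set to v
via_movhlps = movhlps(v, v)

print(via_shufpd, matches(via_shufpd, wanted))
print(via_movhlps, matches(via_movhlps, wanted))
```

Both results agree in the only lane that is actually required, which is why leaving the unused operand undefined gives later combines room to pick the shorter encoding.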
* [X86][SSE] Regenerated the vec_extract tests. | Simon Pilgrim | 2016-04-01 | 1 | -56/+86
  llvm-svn: 265183
* [opaque pointer type] Add textual IR support for explicit type parameter to load instruction | David Blaikie | 2015-02-27 | 1 | -3/+3
  Essentially the same as the GEP change in r230786.

  A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278)

    import fileinput
    import sys
    import re

    pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

    for line in sys.stdin:
      sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

  Reviewers: rafael, dexonsmith, grosser

  Differential Revision: http://reviews.llvm.org/D7649
  llvm-svn: 230794
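To see what that migration pattern does to individual lines, here is a small sketch that applies the same regular expression and replacement string to a few hand-written sample lines. The `migrate` helper and the sample IR lines are assumptions made for illustration; only the pattern and the replacement come from the commit message.

```python
import re

# Pattern and replacement copied from the commit message above.
PAT = re.compile(
    r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*"
    r"($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|"
    r"inttoptr|\[\[[a-zA-Z]|\{\{).*$)"
)

def migrate(line):
    """Hypothetical helper: rewrite one line of old-style textual IR."""
    return re.sub(PAT, r"\1, \2\3*\4", line)

# Hand-written sample lines, not taken from the test file.
samples = [
    "%x = load i32* %p",
    "%v = load <4 x float>* %q, align 16",
    "%a = load volatile i32* %p",
    "%pp = load i8** %ptr",
    "store i32 0, i32* %p",
]

for old in samples:
    print(old, "->", migrate(old))
```

The loaded type is repeated in front of the pointer operand, while lines that do not contain a `load` are left untouched.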
* [x86] Slap a triple on this test since it is poking around at the stack | Chandler Carruth | 2014-10-04 | 1 | -0/+2
  and calling conventions. Otherwise it's too hard to craft a usefully generic set of assertions.
  llvm-svn: 219047
* [x86] Enable the new vector shuffle lowering by default. | Chandler Carruth | 2014-10-04 | 1 | -13/+30
  Update the entire regression test suite for the new shuffles. Remove most of the old testing which was devoted to the old shuffle lowering path and is no longer relevant really. Also remove a few other random tests that only really exercised shuffles and only incidentally or without any interesting aspects to them.

  Benchmarking that I have done shows a few small regressions with this on LNT, zero measurable regressions on real, large applications, and for several benchmarks where the loop vectorizer fires in the hot path it shows 5% to 40% improvements for SSE2 and SSE3 code running on Sandy Bridge machines. Running on AMD machines shows even more dramatic improvements.

  When using newer ISA vector extensions the gains are much more modest, but the code is still better on the whole. There are a few regressions being tracked (PR21137, PR21138, PR21139) but by and large this is expected to be a win for x86 generated code performance.

  It is also more correct than the code it replaces. I have fuzz tested this extensively with ISA extensions up through AVX2 and found no crashes or miscompiles (yet...). The old lowering had a few miscompiles and crashers after a somewhat smaller amount of fuzz testing.

  There is one significant area where the new code path lags behind and that is in AVX-512 support. However, there was *extremely little* support for that already and so this isn't a significant step backwards and the new framework will probably make it easier to implement lowering that uses the full power of AVX-512's table-based shuffle+blend (IMO).

  Many thanks to Quentin, Andrea, Robert, and others for benchmarking assistance. Thanks to Adam and others for help with AVX-512. Thanks to Hal, Eric, and *many* others for answering my incessant questions about how the backend actually works. =]

  I will leave the old code path in the tree until the 3 PRs above are at least resolved to folks' satisfaction. Then I will rip it (and 1000s of lines of code) out. =] I don't expect this flag to stay around for very long. It may not survive next week.
  llvm-svn: 219046
* [x86] Teach the vector combiner that picks a canonical shuffle from to | Chandler Carruth | 2014-09-14 | 1 | -1/+1
  support transforming the forms from the new vector shuffle lowering to use 'movddup' when appropriate.

  A bunch of the cases where we actually form 'movddup' don't actually show up in the test results because something even later than DAG legalization maps them back to 'unpcklpd'. If this shows back up as a performance problem, I'll probably chase it down, but it is at least an encoded size loss. =/

  To make this work, also always do this canonicalizing step for floating point vectors where the baseline shuffle instructions don't provide any free copies of their inputs. This also causes us to canonicalize unpck[hl]pd into mov{hl,lh}ps (resp.) which is a nice encoding space win.

  There is one test which is "regressed" by this: extractelement-load. There, the test case where the optimization it is testing *fails*, the exact instruction pattern which results is slightly different. This should probably be fixed by having the appropriate extract formed earlier in the DAG, but that would defeat the purpose of the test.... If this test case is critically important for anyone, please let me know and I'll try to work on it. The prior behavior was actually contrary to the comment in the test case and seems likely to have been an accident.
  llvm-svn: 217738
* [x86] Fixup r217565 which baked in an assumption about the function | Chandler Carruth | 2014-09-11 | 1 | -1/+1
  name that breaks on some platforms. This part of the test just doesn't matter...
  llvm-svn: 217575
* [x86] FileCheck-ize this test. | Chandler Carruth | 2014-09-11 | 1 | -5/+24
  llvm-svn: 217565
* Rename features to match what gcc and clang use. | Rafael Espindola | 2013-08-23 | 1 | -1/+1
  There is no advantage in being different and using the same names simplifies clang a bit.
  llvm-svn: 189141
* Fixes a bug in the DAGCombiner. LoadSDNodes have two values (data, chain). | Nadav Rotem | 2011-05-11 | 1 | -3/+3
  If there is a store after the load node, then there is a chain, which means that there is another user. Thus, asking hasOneUser would fail. Instead we ask hasNUsesOfValue on the 'data' value.
  llvm-svn: 131183
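The difference between counting all users of a node and counting the users of one of its result values can be sketched with a toy model. Everything below is hypothetical Python written for illustration (the `Node` class, its methods, and the user names); it is not the DAGCombiner code, and only the two-value (data, chain) shape of a load is taken from the commit message.

```python
class Node:
    """Toy stand-in for a DAG node that produces several result values."""

    def __init__(self, name, num_values):
        self.name = name
        self.num_values = num_values
        self.uses = []  # list of (user_name, value_index) pairs

    def add_use(self, user_name, value_index):
        assert 0 <= value_index < self.num_values
        self.uses.append((user_name, value_index))

    def has_one_use(self):
        """True only if the node has exactly one use across all of its values."""
        return len(self.uses) == 1

    def has_n_uses_of_value(self, n, value_index):
        """True if exactly n uses refer to the given result value."""
        return sum(1 for _, idx in self.uses if idx == value_index) == n


DATA, CHAIN = 0, 1  # a load produces a data value and a chain value

load = Node("load", num_values=2)
load.add_use("extract_element", DATA)  # the only consumer of the loaded data
load.add_use("store", CHAIN)           # a later store is ordered after the load

print(load.has_one_use())                 # the chain user makes this False
print(load.has_n_uses_of_value(1, DATA))  # the data value still has one use
```

In this model the whole-node check rejects the load as soon as anything follows it on the chain, while the per-value check still recognizes that the data result has a single consumer.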
* Eliminate more uses of llvm-as and llvm-dis. | Dan Gohman | 2009-09-08 | 1 | -1/+1
  llvm-svn: 81290
* Remove obsolete -f flags. | Dan Gohman | 2009-08-25 | 1 | -1/+1
  llvm-svn: 79992
* Split the Add, Sub, and Mul instruction opcodes into separate | Dan Gohman | 2009-06-04 | 1 | -3/+3
  integer and floating-point opcodes, introducing FAdd, FSub, and FMul.

  For now, the AsmParser, BitcodeReader, and IRBuilder all preserve backwards compatibility, and the Core LLVM APIs preserve backwards compatibility for IR producers. Most front-ends won't need to change immediately.

  This implements the first step of the plan outlined here: http://nondot.org/sabre/LLVMNotes/IntegerOverflow.txt
  llvm-svn: 72897
* Use movaps / movd to extract vector element 0 even with sse4.1. It's still cheaper than pextrw especially if the value is in memory. | Evan Cheng | 2009-01-02 | 1 | -1/+1
  llvm-svn: 61555
* Implement "punpckldq %xmm0, %xmm0" as "pshufd $0x50, %xmm0, %xmm" unless optimizing for code size. | Evan Cheng | 2008-09-26 | 1 | -4/+4
  llvm-svn: 56711
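Why the immediate is 0x50 can be checked with a short lane-level sketch. The Python helpers `punpckldq` and `pshufd` below are hypothetical models of the two instructions on 4 x i32 vectors (lists ordered from lane 0 upward), written only to work through the arithmetic.

```python
def punpckldq(dst, src):
    """Interleave the two low 32-bit lanes of dst and src."""
    return [dst[0], src[0], dst[1], src[1]]

def pshufd(src, imm):
    """Result lane i is the src lane selected by bits [2*i+1 : 2*i] of imm."""
    return [src[(imm >> (2 * i)) & 3] for i in range(4)]

x = [10, 11, 12, 13]

unpacked = punpckldq(x, x)  # same register used for both operands
shuffled = pshufd(x, 0x50)  # 0x50 = 0b01_01_00_00 selects lanes 0, 0, 1, 1

print(unpacked)
print(shuffled)
```

Reading 0x50 two bits at a time from the low end gives the lane selectors 0, 0, 1, 1, which is the same arrangement as interleaving a register's low half with itself.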
* Remove llvm-upgrade and update tests. | Tanya Lattner | 2008-02-21 | 1 | -19/+19
  llvm-svn: 47432
* Convert tests using "| wc -l | grep ..." to use the count script. | Dan Gohman | 2007-08-15 | 1 | -4/+4
  llvm-svn: 41097
* For PR1319: | Reid Spencer | 2007-04-16 | 1 | -4/+5
  Remove && from the end of the lines to prevent tests from throwing run lines into the background. Also, clean up places where the same command is run multiple times by using a temporary file.
  llvm-svn: 36142
* Regression is gone, don't try to find it on clean target. | Reid Spencer | 2007-01-17 | 1 | -0/+35
  llvm-svn: 33296