path: root/llvm/test/CodeGen/X86/avx-sext.ll
Commit message | Author | Age | Files | Lines
* [x86] Rename avx-{s,z}ext.ll to vector-{s,z}ext.ll. | Chandler Carruth | 2014-10-01 | 1 | -441/+0
  These tests are far and away the best sext and zext tests we have for vectors. I'm going to merge the other similar tests into them and expand the ISA coverage.
  llvm-svn: 218800
* [x86] Clean up and generate detailed FileCheck assertions for | Chandler Carruth | 2014-10-01 | 1 | -123/+365
  avx-sext.ll using my new script. Also add an AVX2 mode to this test. Part of cleaning up the test suite before enabling the new vector shuffle lowering. This also highlights some of the abysmal failures of the old shuffle lowering. Check out those 'pinsrw' and 'pextrw' sequences!
  llvm-svn: 218794
* [x86] Undo a flawed transform I added to form UNPCK instructions when | Chandler Carruth | 2014-09-15 | 1 | -1/+1
  AVX is available, and generally tidy up things surrounding UNPCK formation.

  Originally, I was thinking that the only advantage of PSHUFD over UNPCK instruction variants was its free copy, and otherwise we should use the shorter encoding UNPCK instructions. This isn't right though, there is a larger advantage of being able to fold a load into the operand of a PSHUFD. For UNPCK, the operand *must* be in a register so it can be the second input.

  This removes the UNPCK formation in the target-specific DAG combine for v4i32 shuffles. It also lifts the v8 and v16 cases out of the AVX-specific check as they are potentially replacing multiple instructions with a single instruction and so should always be valuable. The floating point checks are simplified accordingly.

  This also adjusts the formation of PSHUFD instructions to attempt to match the shuffle mask to one which would fit an UNPCK instruction variant. This was originally motivated to allow it to match the UNPCK instructions in the combiner, but clearly won't now.

  Eventually, we should add a MachineCombiner pass that can form UNPCK instructions post-RA when the operand is known to be in a register and thus there is no loss.
  llvm-svn: 217755
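The mask matching described in the entry above can be sketched in a few lines. This is only an illustration in Python, not LLVM's code: `match_unpck` and `apply_mask` are hypothetical helper names, and the sketch assumes a single-input 128-bit shuffle where both UNPCK operands are the same register, so UNPCKL x, x produces lanes [x0, x0, x1, x1] and UNPCKH x, x produces [x2, x2, x3, x3] for four lanes.

```python
# Hypothetical helpers for illustration only; not taken from the LLVM sources.

def match_unpck(mask):
    """Return 'unpckl' or 'unpckh' if a single-input mask is what
    UNPCK x, x would produce on one 128-bit vector, else None."""
    n = len(mask)
    half = n // 2
    lo = [i // 2 for i in range(n)]          # [0, 0, 1, 1] for four lanes
    hi = [half + i // 2 for i in range(n)]   # [2, 2, 3, 3] for four lanes
    if list(mask) == lo:
        return "unpckl"
    if list(mask) == hi:
        return "unpckh"
    return None


def apply_mask(vec, mask):
    """Apply a single-input shuffle mask to a list of lanes."""
    return [vec[i] for i in mask]
```

A mask that matches can be encoded either as PSHUFD or as UNPCK; per the commit message, PSHUFD is preferred while the operand might still be foldable from memory.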
* [x86] Fix the very broken formation of vpunpck instructions in the | Chandler Carruth | 2014-08-15 | 1 | -1/+1
  target-specific shuffle DAG combines. We were recognizing the paired shuffles backwards. This code needs to be replaced anyways as we have the same functionality elsewhere, but I'll do the refactoring in a follow-up, this is the minimal fix to the behavior.

  In addition to fixing miscompiles with the new vector shuffle lowering, it also causes the canonicalization to kick in much better, selecting the smaller encoding variants in lots of places in the new AVX path. This still isn't quite ideal as we don't need both the shufpd and the punpck instructions, but that'll get fixed in a follow-up patch.
  llvm-svn: 215690
* X86: Custom lower sext v16i8 to v16i16, and the corresponding truncate. | Benjamin Kramer | 2013-10-23 | 1 | -0/+11
  Also update the cost model.
  llvm-svn: 193270
* Cleanup: test source files do not need to be executable | Arnaud A. de Grandmaison | 2013-04-22 | 1 | -0/+0
  llvm-svn: 180003
* Optimize sext <4 x i8> and <4 x i16> to <4 x i64>. | Nadav Rotem | 2013-03-19 | 1 | -0/+21
  Patch by Ahmad, Muhammad T <muhammad.t.ahmad@intel.com>
  llvm-svn: 177421
* I optimized the following patterns: | Elena Demikhovsky | 2013-02-20 | 1 | -0/+23
    sext <4 x i1> to <4 x i64>
    sext <4 x i8> to <4 x i64>
    sext <4 x i16> to <4 x i64>

  I'm running Combine on SIGN_EXTEND_IN_REG and revert SEXT patterns:
    (sext_in_reg (v4i64 anyext (v4i32 x)), ExtraVT) ->
    (v4i64 sext (v4i32 sext_in_reg (v4i32 x, ExtraVT)))

  The sext_in_reg (v4i32 x) may be lowered to shl+sar operations. The "sar" does not exist for 64-bit elements, so lowering sext_in_reg (v4i64 x) has no vector solution.

  I also added the cost of these operations to the AVX costs table.
  llvm-svn: 175619
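The shl+sar lowering of sext_in_reg mentioned in the entry above can be modelled on plain integers. The following is a minimal Python sketch, assuming 32-bit lanes; the helper names (`to_signed`, `sext_in_reg`, `sext_v4_to_i64`) are made up for illustration and do not correspond to LLVM functions.

```python
LANE_BITS = 32                      # width of one v4i32 lane
LANE_MASK = (1 << LANE_BITS) - 1


def to_signed(value, bits):
    """Reinterpret the low `bits` of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sext_in_reg(lane, from_bits):
    """Sign-extend the low `from_bits` of a 32-bit lane using shl + sar."""
    shift = LANE_BITS - from_bits
    shifted = (lane << shift) & LANE_MASK          # shl: narrow sign bit moves to bit 31
    return to_signed(shifted, LANE_BITS) >> shift  # sar: >> on a negative int is arithmetic


def sext_v4_to_i64(lanes, from_bits):
    """Model (v4i64 sext (v4i32 sext_in_reg x)): the in-register extension is
    done per 32-bit lane; widening to i64 keeps the signed value unchanged."""
    return [sext_in_reg(lane, from_bits) for lane in lanes]
```

Doing the extension in 32-bit lanes first matters because, as the commit message notes, there is no 64-bit vector arithmetic shift right to apply the same trick to v4i64 directly.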
* Revert 172708. | Nadav Rotem | 2013-01-20 | 1 | -56/+0
  The optimization handles esoteric cases but adds a lot of complexity both to the X86 backend and to other backends. This optimization disables an important canonicalization of chains of SEXT nodes and makes SEXT and ZEXT asymmetrical. Disabling the canonicalization of consecutive SEXT nodes into a single node disables other DAG optimizations that assume that there is only one SEXT node. The AVX mask optimization is one example. Additionally, this optimization does not update the cost model.
  llvm-svn: 172968
* On Sandybridge split unaligned 256bit stores into two xmm-sized stores. | Nadav Rotem | 2013-01-19 | 1 | -12/+0
  llvm-svn: 172894
* Optimization for the following SIGN_EXTEND pairs: | Elena Demikhovsky | 2013-01-17 | 1 | -0/+68
  v8i8 -> v8i64, v8i8 -> v8i32, v4i8 -> v4i64, v4i16 -> v4i64 for AVX and AVX2. Bug 14865.
  llvm-svn: 172708
* X86: Emit vector sext as shuffle + sra if vpmovsx is not available. | Benjamin Kramer | 2012-12-22 | 1 | -23/+96
  Also loosen the SSSE3 dependency a bit, expanded pshufb + psra is still better than scalarized loads. Fixes PR14590.
  llvm-svn: 170984
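The "shuffle + sra" idea from the entry above can be illustrated for sign-extending 8-bit elements to 32-bit lanes. This is a rough Python model under stated assumptions, not the instruction sequence LLVM emits: the function name is hypothetical, the shuffle is modelled as placing each source byte in the top byte of its lane, and the low bits of the lane (don't-care in a real shuffle) are simply zero here.

```python
def sext_bytes_via_shuffle_sra(src_bytes):
    """Sign-extend four 8-bit values to four 32-bit lanes.

    Step 1 ("shuffle"): move each byte into bits 31..24 of its own lane.
    Step 2 ("sra"): arithmetic shift right by 24 replicates the sign bit.
    """
    lanes = []
    for byte in src_bytes[:4]:
        lane = (byte & 0xFF) << 24       # byte now occupies the top of the lane
        if lane & 0x80000000:
            lane -= 1 << 32              # reinterpret as a signed 32-bit value
        lanes.append(lane >> 24)         # arithmetic shift right by 24
    return lanes
```

The same two-step shape applies to other element widths by changing the placement and the shift amount.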
* Optimized load + SIGN_EXTEND patterns in the X86 backend. | Elena Demikhovsky | 2012-12-19 | 1 | -1/+55
  llvm-svn: 170506
* Optimization for SIGN_EXTEND operation on AVX. | Elena Demikhovsky | 2012-02-02 | 1 | -0/+17
  Special handling was added for v4i32 -> v4i64 and v8i16 -> v8i32 extensions.
  llvm-svn: 149600