path: root/llvm/test/Analysis/CostModel/X86
Commit message | Author | Age | Files | Lines
* X86 cost model: Vector shifts are expensive in most cases | Arnold Schwaighofer | 2013-04-03 | 4 | -2/+730
    The default logic does not correctly identify costs of casts because they are marked as custom on x86.
    For some cases, where the shift amount is a scalar, we would be able to generate better code. Unfortunately, when this is the case the value (the splat) will get hoisted out of the loop, thereby making it invisible to ISel.
    radar://13130673
    radar://13537826
    llvm-svn: 178703
* X86TTI: Add accurate costs for itofp operations, based on the actual instruction counts. | Benjamin Kramer | 2013-04-01 | 1 | -0/+75
    llvm-svn: 178459
* Correct cost model for vector shift on AVX2 | Michael Liao | 2013-03-20 | 1 | -0/+54
    - After moving the logic that recognizes a vector shift with a scalar amount from DAG combining into DAG lowering, we declare all vector shifts as custom, even when the vector shift is legal on AVX. As a result, the cost model needs special tuning to identify these legal cases.
    llvm-svn: 177586
* Optimize sext <4 x i8> and <4 x i16> to <4 x i64>. | Nadav Rotem | 2013-03-19 | 1 | -2/+2
    Patch by Ahmad, Muhammad T <muhammad.t.ahmad@intel.com>
    llvm-svn: 177421
* X86 cost model: Adjust cost for custom lowered vector multiplies | Arnold Schwaighofer | 2013-03-02 | 1 | -0/+32
    This matters for example in following matrix multiply:

        int **mmult(int rows, int cols, int **m1, int **m2, int **m3) {
          int i, j, k, val;
          for (i=0; i<rows; i++) {
            for (j=0; j<cols; j++) {
              val = 0;
              for (k=0; k<cols; k++) {
                val += m1[i][k] * m2[k][j];
              }
              m3[i][j] = val;
            }
          }
          return(m3);
        }

    Taken from the test-suite benchmark Shootout.

    We estimate the cost of the multiply to be 2 while we generate 9 instructions for it and end up being quite a bit slower than the scalar version (48% on my machine).

    Also, properly differentiate between avx1 and avx2. On avx-1 we still split the vector into 2 128bits and handle the subvector muls like above with 9 instructions. Only on avx-2 will we have a cost of 9 for v4i64.

    I changed the test case in test/Transforms/LoopVectorize/X86/avx1.ll to use an add instead of a mul because with a mul we now no longer vectorize. I did verify that the mul would be indeed more expensive when vectorized with 3 kernels:

        for (i ...)
          r += a[i] * 3;
        for (i ...)
          m1[i] = m1[i] * 3; // This matches the test case in avx1.ll

    and a matrix multiply.

    In each case the vectorized version was considerably slower.

    radar://13304919
    llvm-svn: 176403
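The avx1/avx2 distinction described in this commit message can be sketched as a tiny cost function. This is only an illustration, not the code from the commit: the names `mul_isa`, `CUSTOM_MUL_COST`, and `v4i64_mul_cost` are hypothetical, and treating the AVX1 cost as exactly twice the 128-bit cost (ignoring any extract/insert overhead for the split) is an assumption. The value 9 comes from the instruction count quoted in the message.

```c
/* Hypothetical ISA levels, for illustration only. */
enum mul_isa { MUL_ISA_AVX1, MUL_ISA_AVX2 };

/* 9 instructions for a custom-lowered vector multiply,
 * as quoted in the commit message. */
enum { CUSTOM_MUL_COST = 9 };

/* Estimated cost of a v4i64 multiply.
 * AVX2: one 256-bit operation lowered to 9 instructions.
 * AVX1: the vector is split into two 128-bit halves, each lowered
 *       to 9 instructions. Summing the halves and ignoring the cost
 *       of the split itself is an assumption of this sketch. */
int v4i64_mul_cost(enum mul_isa isa)
{
    if (isa == MUL_ISA_AVX2)
        return CUSTOM_MUL_COST;
    return 2 * CUSTOM_MUL_COST;
}
```

Under these assumptions a vectorizer comparing either result against the old estimate of 2 would see why the vectorized multiply no longer looks profitable.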
* Cost model support for lowered math builtins. | Benjamin Kramer | 2013-02-28 | 1 | -0/+32
    We make the cost for calling libm functions extremely high as emitting the calls is expensive and causes spills (on x86) so performance suffers. We still vectorize important calls like ceilf and friends on SSE4.1, and fabs.
    Differential Revision: http://llvm-reviews.chandlerc.com/D466
    llvm-svn: 176287
* I optimized the following patterns: | Elena Demikhovsky | 2013-02-20 | 1 | -1/+11
      sext <4 x i1> to <4 x i64>
      sext <4 x i8> to <4 x i64>
      sext <4 x i16> to <4 x i64>
    I'm running Combine on SIGN_EXTEND_IN_REG and revert SEXT patterns:
      (sext_in_reg (v4i64 anyext (v4i32 x )), ExtraVT) -> (v4i64 sext (v4i32 sext_in_reg (v4i32 x , ExtraVT)))
    The sext_in_reg (v4i32 x) may be lowered to shl+sar operations. The "sar" does not exist for 64-bit operations, so lowering sext_in_reg (v4i64 x) has no vector solution.
    I also added the cost of these operations to the AVX costs table.
    llvm-svn: 175619
* ARM cost model: Address computation in vector mem ops not free | Arnold Schwaighofer | 2013-02-08 | 1 | -0/+40
    Adds a function to target transform info to query for the cost of address computation. The cost model analysis pass now also queries this interface. The code in LoopVectorize adds the cost of address computation as part of the memory instruction cost calculation. Only there, we know whether the instruction will be scalarized or not.
    Increase the penalty for inserting into D registers on swift. This becomes necessary because we now always assume that address computation has a cost and three is a closer value to the architecture.
    radar://13097204
    llvm-svn: 174713
* We are not ready to estimate the cost of integer expansions based on the number of parts. This test is too noisy. | Nadav Rotem | 2012-12-23 | 1 | -2/+0
    llvm-svn: 170999
* Improve the X86 cost model for loads and stores. | Nadav Rotem | 2012-12-21 | 2 | -2/+67
    llvm-svn: 170830
* Reverse order of checking SSE level when calculating compare cost, so we check AVX2 before AVX. | Jakub Staszak | 2012-12-18 | 1 | -14/+28
    llvm-svn: 170464
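Why the order of the checks matters can be shown with a small table lookup. The sketch below is hypothetical and is not LLVM's implementation: the type names, the table layout, and every cost value are made up for illustration. A machine with AVX2 also satisfies the AVX check, so the first matching entry wins and the more capable level has to be tested first.

```c
#include <stddef.h>

/* Hypothetical feature levels, ordered from least to most capable. */
enum simd_level { LEVEL_SSE2 = 0, LEVEL_AVX = 1, LEVEL_AVX2 = 2 };

struct cost_entry {
    enum simd_level min_level; /* entry applies at this level or above */
    int elem_bits;             /* element width in bits */
    int num_elems;             /* number of vector elements */
    int cost;                  /* illustrative cost, not a real LLVM value */
};

/* Entries are listed most capable first, so an AVX2 machine hits the
 * AVX2 row before it can fall through to the more expensive AVX row. */
static const struct cost_entry compare_costs[] = {
    { LEVEL_AVX2, 32, 8, 1 },
    { LEVEL_AVX,  32, 8, 4 },
    { LEVEL_SSE2, 32, 4, 1 },
};

/* Return the cost of the first entry that matches the vector shape and
 * that the available feature level satisfies, or -1 if none matches. */
int lookup_compare_cost(enum simd_level have, int elem_bits, int num_elems)
{
    size_t n = sizeof compare_costs / sizeof compare_costs[0];
    for (size_t i = 0; i < n; i++) {
        const struct cost_entry *e = &compare_costs[i];
        if (have >= e->min_level &&
            e->elem_bits == elem_bits &&
            e->num_elems == num_elems)
            return e->cost;
    }
    return -1;
}
```

If the AVX row were listed before the AVX2 row, an AVX2 machine would match the AVX row first and report the higher cost, which is the kind of mistake the reordering in this commit avoids.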
* Cost Model: change the default cost of control flow instructions (br / ret / ...) to zero. | Nadav Rotem | 2012-12-05 | 6 | -9/+9
    llvm-svn: 169423
* CostModel: add another known vector trunc optimization. | Nadav Rotem | 2012-11-06 | 1 | -0/+3
    llvm-svn: 167488
* Cost Model: add tables for some avx type-conversion hacks. | Nadav Rotem | 2012-11-06 | 1 | -0/+32
    llvm-svn: 167480
* CostModel: Add tables for the common x86 compares. | Nadav Rotem | 2012-11-05 | 1 | -0/+42
    llvm-svn: 167421
* Code Model: Improve the accuracy of the zext/sext/trunc vector cost estimation. | Nadav Rotem | 2012-11-05 | 1 | -0/+34
    llvm-svn: 167412
* Cost Model: Normalize the insert/extract index when splitting types | Nadav Rotem | 2012-11-05 | 1 | -0/+7
    llvm-svn: 167402
* Cost Model: teach the cost model about expanding integers. | Nadav Rotem | 2012-11-05 | 1 | -0/+9
    llvm-svn: 167401
* Implement the cost of abnormal x86 instruction lowering as a table. | Nadav Rotem | 2012-11-05 | 1 | -0/+2
    llvm-svn: 167395
* X86 CostModel: Add support for some of the common arithmetic instructions for SSE4, AVX and AVX2. | Nadav Rotem | 2012-11-03 | 2 | -0/+42
    llvm-svn: 167347
* Add a stub for the x86 cost model impl. Implement a basic cost rule for inserting/extracting from XMM registers. | Nadav Rotem | 2012-11-02 | 1 | -0/+33
    llvm-svn: 167333
* CostModel: add support for Vector Insert and Extract. | Nadav Rotem | 2012-11-02 | 1 | -0/+43
    llvm-svn: 167329
* Add a cost model analysis that allows us to estimate the cost of IR-level instructions. | Nadav Rotem | 2012-11-02 | 3 | -0/+93
    llvm-svn: 167324