path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* Clean up code format a bit.Jim Grosbach2013-03-021-4/+2
| | | | llvm-svn: 176412
* Tidy up. Trailing whitespace.Jim Grosbach2013-03-021-7/+7
| | | | llvm-svn: 176411
* ARM NEON: Fix v2f32 float intrinsicsArnold Schwaighofer2013-03-021-0/+18
| | | | | | Mark them as expand, they are not legal as our backend does not match them. llvm-svn: 176410
* recommit r172363 & r171325 (reverted in r172756)Nuno Lopes2013-03-021-12/+29
| | | | | | | | This adds minimalistic support for PHI nodes to llvm.objectsize() evaluation. Fingers crossed that it does not break the clang bootstrap again. llvm-svn: 176408
* add getUnderlyingObjectSize()Nuno Lopes2013-03-022-30/+21
| | | | | | | This is similar to getObjectSize(), but doesn't subtract the offset. Tweak the BasicAA code accordingly (per PR14988). llvm-svn: 176407
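The difference between the two queries described above can be sketched as follows. This is a minimal illustration, not the LLVM implementation: `PtrInfo`, `objectSize`, and `underlyingObjectSize` are hypothetical names, and clamping to zero for out-of-range offsets is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for "a pointer into some allocation".
struct PtrInfo {
  uint64_t allocSize; // total size in bytes of the underlying object
  uint64_t offset;    // byte offset of the pointer from the object's start
};

// Bytes remaining between the pointer and the end of the object
// (the offset is subtracted). Clamping to 0 is an assumption here.
uint64_t objectSize(const PtrInfo &p) {
  return p.offset <= p.allocSize ? p.allocSize - p.offset : 0;
}

// Size of the whole underlying object, regardless of the offset.
uint64_t underlyingObjectSize(const PtrInfo &p) {
  return p.allocSize;
}
```

For a pointer 4 bytes into a 16-byte object, the first query yields 12 and the second yields 16; an alias analysis that reasons about whole objects would want the second.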
* X86 cost model: Adjust cost for custom lowered vector multipliesArnold Schwaighofer2013-03-021-5/+29
| | | | This matters, for example, in the following matrix multiply:
| | | |
| | | |   int **mmult(int rows, int cols, int **m1, int **m2, int **m3) {
| | | |     int i, j, k, val;
| | | |     for (i=0; i<rows; i++) {
| | | |       for (j=0; j<cols; j++) {
| | | |         val = 0;
| | | |         for (k=0; k<cols; k++) {
| | | |           val += m1[i][k] * m2[k][j];
| | | |         }
| | | |         m3[i][j] = val;
| | | |       }
| | | |     }
| | | |     return(m3);
| | | |   }
| | | |
| | | | Taken from the test-suite benchmark Shootout.
| | | |
| | | | We estimate the cost of the multiply to be 2 while we generate 9 instructions for it and end up being quite a bit slower than the scalar version (48% on my machine).
| | | |
| | | | Also, properly differentiate between avx1 and avx2. On avx-1 we still split the vector into two 128-bit halves and handle the subvector muls like above with 9 instructions. Only on avx-2 will we have a cost of 9 for v4i64.
| | | |
| | | | I changed the test case in test/Transforms/LoopVectorize/X86/avx1.ll to use an add instead of a mul because with a mul we no longer vectorize. I did verify that the mul would indeed be more expensive when vectorized with 3 kernels:
| | | |
| | | |   for (i ...) r += a[i] * 3;
| | | |   for (i ...) m1[i] = m1[i] * 3; // This matches the test case in avx1.ll
| | | |
| | | | and a matrix multiply. In each case the vectorized version was considerably slower.
| | | |
| | | | radar://13304919
| | | | llvm-svn: 176403
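The commit message does not spell out the nine-instruction sequence. As an assumption, the scalar sketch below shows one common way to build a 64-bit multiply from 32x32->64 multiplies, which suggests why a "single" vector multiply can expand into several multiplies, shifts, and adds. The function names `mul32x32` and `mul64ViaHalves` are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// 32x32 -> 64 unsigned multiply; the scalar analogue of one lane of a
// widening vector multiply.
static uint64_t mul32x32(uint32_t a, uint32_t b) {
  return static_cast<uint64_t>(a) * b;
}

// 64-bit multiply (mod 2^64) assembled from three 32x32->64 multiplies.
// With a = aHi*2^32 + aLo and b = bHi*2^32 + bLo:
//   a*b mod 2^64 = aLo*bLo + ((aHi*bLo + aLo*bHi) << 32)
// The aHi*bHi term is shifted out entirely and never computed.
uint64_t mul64ViaHalves(uint64_t a, uint64_t b) {
  uint32_t aLo = static_cast<uint32_t>(a);
  uint32_t aHi = static_cast<uint32_t>(a >> 32); // shift 1
  uint32_t bLo = static_cast<uint32_t>(b);
  uint32_t bHi = static_cast<uint32_t>(b >> 32); // shift 2
  uint64_t lo = mul32x32(aLo, bLo);              // multiply 1
  uint64_t cross = mul32x32(aHi, bLo)            // multiply 2
                 + mul32x32(aLo, bHi);           // multiply 3 (sum may wrap; harmless)
  return lo + (cross << 32);                     // shift + add
}
```

Counting the operations in the sketch gives a rough feel for why a cost of 2 underestimates this kind of multiply.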
* Added FIXME for future Hexagon cleanup.Andrew Trick2013-03-021-0/+3
| | | | llvm-svn: 176400
* PR14448 - prevent the loop vectorizer from vectorizing the same loop twice.Nadav Rotem2013-03-021-0/+18
| | | | | | | | | | The LoopVectorizer often runs multiple times on the same function due to inlining. When this happens the loop vectorizer often vectorizes the same loops multiple times, increasing code size and adding unneeded branches. With this patch, the vectorizer during vectorization puts metadata on scalar loops and marks them as 'already vectorized' so that it knows to ignore them when it sees them a second time. PR14448. llvm-svn: 176399
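The "mark and skip" idea described above can be sketched as follows. This is a simplified model, not the LoopVectorizer code: `Loop`, `kAlreadyVectorized`, and `vectorizeLoop` are hypothetical, and the sketch tags the loop it processed, whereas the commit attaches the metadata to the scalar loops left behind by vectorization.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical loop representation carrying string metadata tags.
struct Loop {
  std::set<std::string> metadata;
  int timesVectorized = 0;
};

// Hypothetical tag name; the real metadata name is not given in the log.
const char *const kAlreadyVectorized = "already_vectorized";

// Returns true if the loop was vectorized by this call, false if it was
// skipped because an earlier run already handled it.
bool vectorizeLoop(Loop &L) {
  if (L.metadata.count(kAlreadyVectorized))
    return false; // seen before: do nothing
  ++L.timesVectorized;                    // stand-in for the transformation
  L.metadata.insert(kAlreadyVectorized);  // remember for later runs
  return true;
}
```

Running the pass a second time on the same loop, as can happen after inlining, then becomes a no-op.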
* Modify {Call,Invoke}Inst::addAttribute to take an AttrKind.Peter Collingbourne2013-03-022-11/+5
| | | | llvm-svn: 176397
* Remove duplicate line and move another closer to its actual useEli Bendersky2013-03-011-3/+1
| | | | llvm-svn: 176391
* In llvm::MemoryBuffer::getFile() remove an unnecessary stat call check.Argyrios Kyrtzidis2013-03-011-0/+3
| | | | | | | | The sys::fs::is_directory() check is unnecessary because, if the filename is a directory, the function will fail anyway with the same error code returned. Remove the check to avoid an unnecessary stat call. Someone needs to review on Windows and see if the check is necessary there or not. llvm-svn: 176386
* [mips] Fix inefficient code generation.Akira Hatanaka2013-03-013-1/+16
| | | | This patch eliminates the need to emit a constant move instruction when this pattern is matched:
| | | |
| | | |   (select (setgt a, Constant), T, F)
| | | |
| | | | The pattern above effectively turns into this:
| | | |
| | | |   (conditional-move (setlt a, Constant + 1), F, T)
| | | |
| | | | llvm-svn: 176384
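The rewrite above relies on the integer identity `a > C` if and only if not `a < C + 1`. A minimal sketch of the two forms, assuming 32-bit signed integers and that `C + 1` does not overflow (the names `selectGt` and `cmovLt` are hypothetical):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Original form: (select (setgt a, C), T, F)
int32_t selectGt(int32_t a, int32_t c, int32_t t, int32_t f) {
  return a > c ? t : f;
}

// Rewritten form: (conditional-move (setlt a, C + 1), F, T)
// The compare is inverted and the two result operands are swapped.
int32_t cmovLt(int32_t a, int32_t c, int32_t t, int32_t f) {
  // Assumption of this sketch: C + 1 must be representable.
  assert(c != std::numeric_limits<int32_t>::max());
  return a < c + 1 ? f : t;
}
```

Both functions return the same value for every input where `C + 1` is representable.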
* Removed extraneous #include "LLVMContextImpl.h" from lib/IR/Module.cppJean-Luc Duprat2013-03-011-1/+0
| | | | llvm-svn: 176382
* Fix indentation.Akira Hatanaka2013-03-011-15/+10
| | | | llvm-svn: 176380
* Set properties for f128 type.Akira Hatanaka2013-03-012-17/+71
| | | | llvm-svn: 176378
* Generate an error message instead of asserting or segfaulting when we can'tChad Rosier2013-03-011-0/+1
| | | | | | | handle indirect register inputs. rdar://13322011 llvm-svn: 176367
* LoopVectorize: Don't hang forever if a PHI only has skipped PHI uses.Benjamin Kramer2013-03-011-1/+8
| | | | | | Fixes PR15384. llvm-svn: 176366
* Cache the result of Function::getIntrinsicID() in a DenseMap attached to the LLVMContext.Michael Ilseman2013-03-014-5/+36
| | | | | | | | This reduces the time actually spent doing string to ID conversion and shows a 10% improvement in compile time for a particularly bad case that involves ARM Neon intrinsics (these have many overloads). Patch by Jean-Luc Duprat! llvm-svn: 176365
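The caching idea can be sketched as a memoized lookup keyed by function pointer. This is an illustration only: it uses `std::unordered_map` in place of LLVM's `DenseMap`, and `Function`, `IntrinsicCache`, `slowLookup`, and the ID values are all hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a function with a name.
struct Function {
  std::string name;
};

// Hypothetical per-context cache of name-to-ID conversion results.
struct IntrinsicCache {
  std::unordered_map<const Function *, unsigned> cache;
  unsigned slowLookups = 0; // counts how often the string path runs

  // Stand-in for the expensive string-to-ID conversion. The IDs are
  // invented: 0 means "not an intrinsic", anything else is a fake ID.
  unsigned slowLookup(const Function *f) {
    ++slowLookups;
    bool isIntrinsic = f->name.rfind("llvm.", 0) == 0; // starts with "llvm."
    return isIntrinsic ? 1u + static_cast<unsigned>(f->name.size()) : 0u;
  }

  // Returns the cached ID if present; otherwise computes and stores it.
  unsigned getID(const Function *f) {
    auto it = cache.find(f);
    if (it != cache.end())
      return it->second;
    unsigned id = slowLookup(f);
    cache.emplace(f, id);
    return id;
  }
};
```

A real cache keyed by pointer also has to drop entries when a function is destroyed or renamed; the sketch leaves that out.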
* Fix PR10475Michael Liao2013-03-0111-20/+32
| | | | - ISD::SHL/SRL/SRA must have either both scalar or both vector operands, but TLI.getShiftAmountTy() so far only returns a scalar type. As a result, backend logic assuming that breaks.
| | | | - Rename the original TLI.getShiftAmountTy() to TLI.getScalarShiftAmountTy() and re-define TLI.getShiftAmountTy() to return the target-specified scalar type or the same vector type as the 1st operand.
| | | | - Fix most TICG logic assuming TLI.getShiftAmountTy() is a simple scalar type.
| | | | llvm-svn: 176364
* Add support for using non-pic code for arm and thumb1 when emitting the sjljChad Rosier2013-03-011-10/+21
| | | | | | | | dispatch code. As far as I can tell the thumb2 code is behaving as expected. I was able to compile and run the associated test case for both arm and thumb1. rdar://13066352 llvm-svn: 176363
* Hexagon: Add constant extender support framework.Jyotsna Verma2013-03-014-23/+228
| | | | llvm-svn: 176358
* R600/SI: handle all registers in copyPhysReg v2Christian Konig2013-03-011-16/+88
| | | | | | v2: based on Michel's patch, but now allows copying of all register sizes. Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> llvm-svn: 176346
* R600/SI: remove S_MOV immediate patternsChristian Konig2013-03-011-12/+2
| | | | | | | They won't match anyway. Signed-off-by: Christian König <christian.koenig@amd.com> llvm-svn: 176345
* R600/SI: remove GPR*AlignEncodeChristian Konig2013-03-014-67/+16
| | | | | | | It's much easier to specify the encoding with tablegen directly. Signed-off-by: Christian König <christian.koenig@amd.com> llvm-svn: 176344
* R600/SI: fix warning about overloaded virtualChristian Konig2013-03-011-0/+1
| | | | | Signed-off-by: Christian König <christian.koenig@amd.com> llvm-svn: 176343
* R600/SI: fix inserting waits for unordered definesChristian Konig2013-03-011-2/+21
| | | | | Signed-off-by: Christian König <christian.koenig@amd.com> llvm-svn: 176342
* GCC thinks that this variable might be used uninitialized (it isn't).Duncan Sands2013-03-011-1/+1
| | | | llvm-svn: 176341
* [mips] Remove unused option. Fix 80-column violations.Akira Hatanaka2013-03-011-16/+8
| | | | llvm-svn: 176330
* [mips] Add the capability to search delay slot filling instructions inAkira Hatanaka2013-03-011-32/+303
| | | | | | | | successor basic blocks. Currently this is off by default. llvm-svn: 176329
* [mips] Do not add SecondLastInst to list BranchInstrs if there is only oneAkira Hatanaka2013-03-011-2/+2
| | | | | | | | terminator. No functionality change. llvm-svn: 176326
* [mips] Define an overloaded version of function MipsInstrInfo::AnalyzeBranchAdd.Akira Hatanaka2013-03-012-74/+103
| | | | | | | | This function will be used later when the capability to search delay slot filling instructions in successor blocks is added. No intended functionality changes. llvm-svn: 176325
* [mips] Add options to disable searching backward and in successor blocks.Akira Hatanaka2013-03-011-0/+12
| | | | llvm-svn: 176321
* [mips] Add capability to search in the forward direction for instructions thatAkira Hatanaka2013-03-011-23/+92
| | | | | | | | can fill the delay slot. Currently, this is off by default. llvm-svn: 176320
* [mips] Define helper function searchRangeAkira Hatanaka2013-03-011-9/+29
| | | | | | No functionality change. llvm-svn: 176318
* [mips] Rename function findDelayInstr to searchBackward.Akira Hatanaka2013-03-011-3/+3
| | | | llvm-svn: 176317
* Scheduler diagnostics. Print the register name.Andrew Trick2013-03-011-0/+2
| | | | llvm-svn: 176316
* Instruction schedulers should report correct height/depth.Andrew Trick2013-03-011-2/+2
| | | | | | | | | | | | | We avoided computing DAG height/depth during Node printing because it shouldn't depend on an otherwise valid DAG. But this has become far too annoying for the common case of a valid DAG where we want to see valid values. If doing the computation on-the-fly turns out to be a problem in practice, then I'll add a mode to the diagnostics to only force it when we're likely to have a valid DAG, otherwise explicitly print INVALID instead of bogus numbers. For now, just go for it all the time. llvm-svn: 176314
* [mips] Define class MemDefsUses.Akira Hatanaka2013-03-011-23/+126
| | | | | | | This class tracks dependence between memory instructions using underlying objects of memory operands. llvm-svn: 176313
* A small refactoring + adding comments.Eli Bendersky2013-02-282-10/+5
| | | | | | | | | | | SelectionDAGIsel::LowerArguments needs a function, not a basic block. So it makes sense to pass it the function instead of extracting a basic-block from the function and then tossing it. This is also more self-documenting (functions have arguments, BBs don't). In addition, added comments to a couple of Select* methods. llvm-svn: 176305
* Don't add the 'Value' string if there isn't one.Bill Wendling2013-02-281-1/+1
| | | | | | | | This was causing the folding set to fail to fold attributes, because it was being calculated in one spot without an empty values string but here with an empty values string. llvm-svn: 176301
* Fix a bug in instcombine for fmul in fast math mode.Quentin Colombet2013-02-281-3/+3
| | | | The pattern recognized by instcombine looks like:
| | | |
| | | |   a = b * c
| | | |   d = a +/- Cst
| | | |
| | | | or
| | | |
| | | |   a = b * c
| | | |   d = Cst +/- a
| | | |
| | | | When creating the new operands for the fadd or fsub instruction following the related fmul, the first operand was created with the second original operand (M0 was created with C1) and the second with the first (M1 with Opnd0).
| | | |
| | | | The fix consists of creating the new operands with the appropriate original operand, i.e., M0 with Opnd0 and M1 with C1.
| | | | llvm-svn: 176300
* Move an assert earlier in a file and check that the result ofEric Christopher2013-02-281-2/+5
| | | | | | | | our bitwise compare is equal to the field we're looking for. Noticed on inspection. llvm-svn: 176296
* Don't add an attribute that already exists and don't remove an attribute that doesn't exist.Bill Wendling2013-02-281-0/+2
| | | | llvm-svn: 176289
* Tidy up; no functional change.Chad Rosier2013-02-281-5/+3
| | | | llvm-svn: 176288
* Cost model support for lowered math builtins.Benjamin Kramer2013-02-282-12/+78
| | | | | | | | We make the cost for calling libm functions extremely high as emitting the calls is expensive and causes spills (on x86) so performance suffers. We still vectorize important calls like ceilf and friends on SSE4.1, and fabs. Differential Revision: http://llvm-reviews.chandlerc.com/D466 llvm-svn: 176287
* Style; no functional change.Chad Rosier2013-02-281-7/+4
| | | | llvm-svn: 176285
* Put some per-instruction statistics of fast isel under NDEBUG, together withEli Bendersky2013-02-281-3/+5
| | | | | | other per-instruction statistics. llvm-svn: 176273
* Re-format comments (and check commit access)Yiannis Tsiouris2013-02-281-17/+15
| | | | llvm-svn: 176270
* AArch64: remove post-encoder method from FCMP (immediate) instructions.Tim Northover2013-02-283-27/+30
| | | | | | The work done by the post-encoder (setting architecturally unused bits to 0 as required) can be done by the existing operand that covers the "#0.0". This removes at least one use of the discouraged PostEncoderMethod. llvm-svn: 176261
* AArch64: be more careful resorting to inefficient addressing for weak vars.Tim Northover2013-02-281-5/+4
| | | | | | | | If an otherwise weak var is actually defined in this unit, it can't be undefined at runtime so we can use normal global variable sequences (ADRP/ADD) to access it. llvm-svn: 176259