path: root/llvm
Commit message | Author | Age | Files | Lines
* Remove the separate explicit AES instruction patterns. They are equivalent to the patterns specified by the instructions. Also remove unnecessary bitconverts from the AES patterns.
  llvm-svn: 147342
  Author: Craig Topper | Age: 2011-12-29 | Files: 1 | Lines: -48/+5
* Make SSE42 and SSE4A not imply POPCNT. POPCNT should be able to be disabled on its own without disabling SSE4.2 or SSE4A.
  llvm-svn: 147339
  Author: Craig Topper | Age: 2011-12-29 | Files: 1 | Lines: -3/+2
* Make LowerBUILD_VECTOR keep node vector types consistent when creating MOVL for v16i16 and v32i8.
  llvm-svn: 147337
  Author: Craig Topper | Age: 2011-12-29 | Files: 1 | Lines: -9/+8
* Remove some elses after returns.
  llvm-svn: 147336
  Author: Craig Topper | Age: 2011-12-29 | Files: 1 | Lines: -7/+10
* Remove trailing spaces. Fix an assert to use && instead of || before string. Add same assert on similar code path.
  llvm-svn: 147335
  Author: Craig Topper | Age: 2011-12-29 | Files: 1 | Lines: -7/+5
* Fix grammar error noticed by Duncan.
  llvm-svn: 147333
  Author: Rafael Espindola | Age: 2011-12-29 | Files: 1 | Lines: -1/+1
* Change CaptureTracking to pass a Use* instead of a Value* when a value is captured. This allows the tracker to look at the specific use, which may be especially interesting for function calls.
  Use this to fix 'nocapture' deduction in FunctionAttrs. The existing one does not iterate until a fixpoint and does not guarantee that it produces the same result regardless of iteration order. The new implementation builds up a graph of how arguments are passed from function to function, and uses a bottom-up walk on the argument-SCCs to assign nocapture. This gets us nocapture more often, and does so rather efficiently and independent of iteration order.
  llvm-svn: 147327
  Author: Nick Lewycky | Age: 2011-12-28 | Files: 6 | Lines: -17/+296
* Fix type-checking for load transformation which is not legal on floating-point types. PR11674.
  llvm-svn: 147323
  Author: Eli Friedman | Age: 2011-12-28 | Files: 2 | Lines: -1/+16
* Update OCaml bindings for the new half float type.
  Patch by Jonathan Ragan-Kelley!
  llvm-svn: 147314
  Author: Bob Wilson | Age: 2011-12-28 | Files: 2 | Lines: -0/+3
* Add support for mipsel in configure. Fixes PR11669. Patch by Sylvestre Ledru.
  llvm-svn: 147312
  Author: Rafael Espindola | Age: 2011-12-28 | Files: 2 | Lines: -1/+5
* PR11662.
  Promotion of the mask operand needs to be done using PromoteTargetBoolean, and not padded with garbage.
  llvm-svn: 147309
  Author: Nadav Rotem | Age: 2011-12-28 | Files: 2 | Lines: -1/+25
* Fixed a bug in LowerVECTOR_SHUFFLE and LowerBUILD_VECTOR.
  Matching MOVLP mask for AVX (256-bit vectors) was wrong. The failure was detected by conformance tests.
  llvm-svn: 147308
  Author: Elena Demikhovsky | Age: 2011-12-28 | Files: 2 | Lines: -5/+40
* Demystify this comment.
  llvm-svn: 147307
  Author: Nick Lewycky | Age: 2011-12-28 | Files: 1 | Lines: -5/+16
* PR11642 has been fixed, enable -fvisibility-inlines-hidden everywhere.
  llvm-svn: 147296
  Author: Rafael Espindola | Age: 2011-12-27 | Files: 1 | Lines: -3/+0
* Switch StringMap from an array of structures to a structure of arrays.
  - -25% memory usage of the main table on x86_64 (was wasted in struct padding).
  - no significant performance change.
  llvm-svn: 147294
  Author: Benjamin Kramer | Age: 2011-12-27 | Files: 2 | Lines: -61/+60
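The saving reported above comes from alignment padding. The following is a minimal sketch, assuming a bucket that pairs a 32-bit hash with a pointer; `Bucket`, `aosBytesPerBucket`, and `soaBytesPerBucket` are illustrative names and not the actual StringMap members.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bucket for illustration: a hash stored next to a pointer.
// On a typical x86_64 ABI the 4-byte hash is followed by 4 bytes of padding
// so that the 8-byte pointer stays aligned.
struct Bucket {
  unsigned hash;
  void *item;
};

// Array of structures: one Bucket per slot, padding included.
constexpr std::size_t aosBytesPerBucket() { return sizeof(Bucket); }

// Structure of arrays: one pointer array plus one hash array, so no padding
// sits between a slot's hash and its pointer.
constexpr std::size_t soaBytesPerBucket() {
  return sizeof(void *) + sizeof(unsigned);
}

// The split layout never needs more bytes per slot than the combined one.
inline void checkLayoutSizes() {
  assert(soaBytesPerBucket() <= aosBytesPerBucket());
}
```

With 8-byte pointers and a 4-byte `unsigned` this works out to 16 versus 12 bytes per slot, which is consistent with the 25% figure in the commit message.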
* Use false not zero, as a bool.
  llvm-svn: 147292
  Author: Nick Lewycky | Age: 2011-12-27 | Files: 1 | Lines: -2/+2
* Turn cos(-x) into cos(x). Patch by Alexander Malyshev!
  llvm-svn: 147291
  Author: Nick Lewycky | Age: 2011-12-27 | Files: 2 | Lines: -5/+41
* Clean up some Release build warnings.
  llvm-svn: 147289
  Author: Benjamin Kramer | Age: 2011-12-27 | Files: 4 | Lines: -24/+16
* Add handling of x86_avx2_pmovmskb to computeMaskedBitsForTargetNode for consistency. Add comments and an assert for BMI instructions to PerformXorCombine since the enabling of the combine is conditional on it, but the function itself isn't.
  llvm-svn: 147287
  Author: Craig Topper | Age: 2011-12-27 | Files: 1 | Lines: -1/+6
* Teach simplifycfg to recompute branch weights when merging some branches, and to discard weights when appropriate. Still more to do (and a new TODO), but it's a start!
  llvm-svn: 147286
  Author: Nick Lewycky | Age: 2011-12-27 | Files: 2 | Lines: -2/+131
* Using Inst->setMetadata(..., NULL) should be safe to remove metadata even when there is none of that type to remove. This fixes a crasher in the particular case where the instruction has metadata but no metadata storage in the context (this is only possible if the instruction has !dbg but no other metadata info).
  llvm-svn: 147285
  Author: Nick Lewycky | Age: 2011-12-27 | Files: 1 | Lines: -2/+4
* Fix warning.
  llvm-svn: 147284
  Author: Rafael Espindola | Age: 2011-12-26 | Files: 1 | Lines: -1/+2
* Make sure DAGCombiner doesn't introduce multiple loads from the same memory location. PR10747, part 2.
  llvm-svn: 147283
  Author: Eli Friedman | Age: 2011-12-26 | Files: 3 | Lines: -6/+44
* Update the branch weight metadata when reversing the order of a branch.
  llvm-svn: 147280
  Author: Nick Lewycky | Age: 2011-12-26 | Files: 2 | Lines: -4/+27
* Sort includes, canonicalize whitespace, fix typos. No functionality change.
  llvm-svn: 147279
  Author: Nick Lewycky | Age: 2011-12-26 | Files: 1 | Lines: -12/+12
* Update the LangRef documentation: the codegen does support this instruction.
  llvm-svn: 147274
  Author: Nadav Rotem | Age: 2011-12-25 | Files: 1 | Lines: -3/+0
* Fix a typo in the widening of vectors in PromoteIntRes. Patch by Shemer Anat.
  llvm-svn: 147272
  Author: Nadav Rotem | Age: 2011-12-25 | Files: 1 | Lines: -2/+2
* Sparc: Implement emitFrameIndexDebugValue and getDebugValue Location hooks.
  llvm-svn: 147269
  Author: Venkatraman Govindaraju | Age: 2011-12-25 | Files: 3 | Lines: -1/+28
* Add braces to remove silly warning.
  llvm-svn: 147264
  Author: Bill Wendling | Age: 2011-12-25 | Files: 1 | Lines: -1/+2
* Remove unused variables.
  llvm-svn: 147261
  Author: Rafael Espindola | Age: 2011-12-25 | Files: 2 | Lines: -2/+2
* Add an explicit test that we now fold cttz.i32(..., true) >> 5 -> 0.
  This is a result of Benjamin's work on ValueTracking.
  llvm-svn: 147259
  Author: Chandler Carruth | Age: 2011-12-24 | Files: 1 | Lines: -7/+13
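The fold above holds because a nonzero 32-bit value has at most 31 trailing zeros, so the count fits in five bits. A small sketch of that reasoning follows; `countTrailingZerosNonZero` is a hypothetical stand-in for the zero-undef form of cttz, not LLVM code.

```cpp
#include <cassert>
#include <cstdint>

// Count trailing zeros with a plain loop. Like the zero-undef form of cttz,
// this is only meaningful for x != 0 (the loop would not terminate for 0).
unsigned countTrailingZerosNonZero(std::uint32_t x) {
  unsigned n = 0;
  while ((x & 1u) == 0) {
    x >>= 1;
    ++n;
  }
  return n;
}

// For a nonzero 32-bit input the count is at most 31, so bits 5 and above
// of the result are always clear and shifting right by 5 yields 0.
void checkShiftFoldsToZero() {
  for (unsigned bit = 0; bit < 32; ++bit) {
    const std::uint32_t x = std::uint32_t{1} << bit;
    assert(countTrailingZerosNonZero(x) == bit);
    assert((countTrailingZerosNonZero(x) >> 5) == 0);
  }
}
```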
* InstCombine: Add a combine that turns (2^n)-1 ^ x back into (2^n)-1 - x iff x is smaller than 2^n and it fuses with a following add.
  This was intended to undo the sub canonicalization in cases where it's not profitable, but it also finds some cases on its own.
  llvm-svn: 147256
  Author: Benjamin Kramer | Age: 2011-12-24 | Files: 2 | Lines: -0/+25
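One way to read "fuses with a following add" is that `((2^n)-1 ^ x) + c` can become a single subtraction from the folded constant `(2^n)-1 + c`. The sketch below checks that arithmetic identity only; the function names are illustrative, and it does not reproduce the InstCombine implementation.

```cpp
#include <cassert>
#include <cstdint>

// The xor form followed by a separate add: (mask ^ x) + c.
std::uint32_t xorThenAdd(std::uint32_t mask, std::uint32_t x, std::uint32_t c) {
  return (mask ^ x) + c;
}

// The fused form: one subtraction from the constant (mask + c).
std::uint32_t fusedSub(std::uint32_t mask, std::uint32_t x, std::uint32_t c) {
  return (mask + c) - x;
}

// With mask = 2^5 - 1 and x <= mask, both forms agree for every c tried.
void checkFusion() {
  const std::uint32_t mask = (std::uint32_t{1} << 5) - 1;
  for (std::uint32_t x = 0; x <= mask; ++x)
    for (std::uint32_t c = 0; c < 100; ++c)
      assert(xorThenAdd(mask, x, c) == fusedSub(mask, x, c));
}
```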
* ComputeMaskedBits: Make knownzero computation more aggressive for ctlz with undef zero.
  unsigned foo(unsigned x) { return 31 - __builtin_clz(x); }
  now compiles into a single "bsrl" instruction on x86.
  llvm-svn: 147255
  Author: Benjamin Kramer | Age: 2011-12-24 | Files: 3 | Lines: -4/+24
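The expression in `foo` above is the index of the highest set bit, which is what a bit-scan-reverse computes. A minimal sketch comparing it with a plain loop follows; it assumes a GCC- or Clang-compatible compiler for `__builtin_clz` and a 32-bit `unsigned`, and the function names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Index of the highest set bit, computed with a plain loop. Requires x != 0.
unsigned highestSetBitLoop(std::uint32_t x) {
  unsigned index = 0;
  while (x >>= 1)
    ++index;
  return index;
}

// The same value written as in the commit message. __builtin_clz is
// undefined for 0, so this also requires x != 0.
unsigned highestSetBitClz(std::uint32_t x) {
  return 31u - static_cast<unsigned>(__builtin_clz(x));
}

// Both agree whenever the top set bit is at position `bit`.
void checkHighestSetBit() {
  for (unsigned bit = 0; bit < 32; ++bit) {
    const std::uint32_t x = std::uint32_t{1} << bit;
    assert(highestSetBitLoop(x) == bit);
    assert(highestSetBitClz(x) == bit);
    assert(highestSetBitClz(x | (x >> 1)) == bit);
  }
}
```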
* InstCombine: Canonicalize (2^n)-1 - x into (2^n)-1 ^ x iff x is known to be smaller than 2^n.
  This has the obvious advantage of being commutable and is always a win on x86 because const - x wastes a register there. On less weird architectures this may lead to a regression because other arithmetic doesn't fuse with it anymore. I'll address that problem in a followup.
  llvm-svn: 147254
  Author: Benjamin Kramer | Age: 2011-12-24 | Files: 3 | Lines: -0/+26
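The canonicalization above rests on the fact that subtracting x from an all-ones mask never borrows when every set bit of x lies inside the mask, so the subtraction just clears those bits. A brute-force sketch of the identity, with hypothetical helper names:

```cpp
#include <cassert>
#include <cstdint>

// True when (mask - x) and (mask ^ x) produce the same 32-bit value.
bool subMatchesXor(std::uint32_t mask, std::uint32_t x) {
  return (mask - x) == (mask ^ x);
}

// Exhaustively checks every x in [0, 2^n - 1] against mask = 2^n - 1.
// Intended for small n (1 to 16) so the loop stays cheap.
bool subMatchesXorForAllBelow(unsigned n) {
  const std::uint32_t mask = (std::uint32_t{1} << n) - 1;
  for (std::uint32_t x = 0; x <= mask; ++x)
    if (!subMatchesXor(mask, x))
      return false;
  return true;
}

// The identity holds inside the range and can fail outside it.
void checkSubXor() {
  assert(subMatchesXorForAllBelow(10));
  assert(!subMatchesXor(7u, 8u));  // x = 8 is not smaller than 2^3
}
```

The failing case shows why the "iff x is known to be smaller than 2^n" condition matters: once x has a bit outside the mask, the subtraction borrows and the two forms diverge.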
* Section relative fixups are a coff concept, not an x86 one. Replace the x86 specific reloc_coff_secrel32 with a generic FK_SecRel_4.
  llvm-svn: 147252
  Author: Rafael Espindola | Age: 2011-12-24 | Files: 6 | Lines: -10/+16
* Use standard promotion for i8 CTTZ nodes and i8 CTLZ nodes when the LZCNT instructions are available. Force promotion to i32 to get a smaller encoding since the fix-ups necessary are just as complex for either promoted type.
  We can't do standard promotion for CTLZ when lowering through BSR because it results in poor code surrounding the 'xor' at the end of this instruction. Essentially, if we promote the entire CTLZ node to i32, we end up doing the xor on a 32-bit CTLZ implementation, and then subtracting appropriately to get back to an i8 value. Instead, our custom logic just uses the knowledge of the incoming size to compute a perfect xor. I'd love to know of a way to fix this, but so far I'm drawing a blank. I suspect the legalizer could be more clever and/or it could collude with the DAG combiner, but how... ;]
  llvm-svn: 147251
  Author: Chandler Carruth | Age: 2011-12-24 | Files: 4 | Lines: -23/+19
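The difference between the two i8 CTLZ strategies described above can be sketched arithmetically. This is one reading of the commit message, not the X86ISelLowering code: `bitScanReverse` is a hypothetical stand-in for the x86 `bsr` result, and the other names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Index of the highest set bit; stands in for the value bsr produces.
// Requires x != 0.
unsigned bitScanReverse(std::uint32_t x) {
  unsigned index = 0;
  while (x >>= 1)
    ++index;
  return index;
}

// Promote the whole operation to 32 bits: a 32-bit ctlz (31 ^ bsr), then
// subtract the 24 leading zeros introduced by zero-extending from 8 bits.
unsigned ctlz8ViaPromotion(std::uint8_t x) {
  return (31u ^ bitScanReverse(x)) - 24u;
}

// Use the known 8-bit width directly: a single xor with 7.
unsigned ctlz8ViaNarrowXor(std::uint8_t x) {
  return 7u ^ bitScanReverse(x);
}

// Both give the same count for every nonzero 8-bit input; the second
// needs one operation fewer after the bit scan.
void checkCtlz8() {
  for (unsigned x = 1; x < 256; ++x)
    assert(ctlz8ViaPromotion(static_cast<std::uint8_t>(x)) ==
           ctlz8ViaNarrowXor(static_cast<std::uint8_t>(x)));
}
```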
* Add systematic testing for cttz as well, and fix the bug I spotted by inspection earlier.
  llvm-svn: 147250
  Author: Chandler Carruth | Age: 2011-12-24 | Files: 2 | Lines: -1/+32
* Add i8 and i64 testing for ctlz on x86. Also simplify the i16 test.
  llvm-svn: 147249
  Author: Chandler Carruth | Age: 2011-12-24 | Files: 1 | Lines: -4/+26
* Tidy up this rather crufty test. Put the declarations at the top to make my C-brain happy. Remove the unnecessary bits of pedantic IR fluff like nounwind. Remove stray uses comments. Name things semantically rather than tN so that adding a new test in the middle doesn't cause pain, and so that new tests can be grouped semantically.
  This exposes how little systematic testing is going on here. I noticed this by finding several bugs via inspection and wondering why this test wasn't catching any of them. =[
  llvm-svn: 147248
  Author: Chandler Carruth | Age: 2011-12-24 | Files: 1 | Lines: -33/+32
* Chandler fixed this.
  llvm-svn: 147247
  Author: Benjamin Kramer | Age: 2011-12-24 | Files: 1 | Lines: -32/+0
* Expand more when we have a nice 'tzcnt' instruction, to avoid generating 'bsf' instructions here. This one is actually debatable to my eyes. It's not clear that any chip implementing 'tzcnt' would have a slow 'bsf' for any reason, and unless EFLAGS or a zero input matters, 'tzcnt' is just a longer encoding.
  Still, this restores the old behavior with 'tzcnt' enabled for now.
  llvm-svn: 147246
  Author: Chandler Carruth | Age: 2011-12-24 | Files: 2 | Lines: -0/+32
* Tidy up some of these tests.
  llvm-svn: 147245
  Author: Chandler Carruth | Age: 2011-12-24 | Files: 1 | Lines: -22/+19
* Switch the lowering of CTLZ_ZERO_UNDEF from a .td pattern back to the X86ISelLowering C++ code. Because this is lowered via an xor wrapped around a bsr, we want the dagcombine which runs after isel lowering to have a chance to clean things up. In particular, it is very common to see code which looks like:
    (sizeof(x)*8 - 1) ^ __builtin_clz(x)
  Which is trying to compute the most significant bit of 'x'. That's actually the value computed directly by the 'bsr' instruction, but if we match it too late, we'll get completely redundant xor instructions.
  The more naive code for the above (subtracting rather than using an xor) still isn't handled correctly due to the dagcombine getting confused.
  Also, while here fix an issue spotted by inspection: we should have been expanding the zero-undef variants to the normal variants when there is an 'lzcnt' instruction. Do so, and test for this. We don't want to generate unnecessary 'bsr' instructions.
  These two changes fix some regressions in encoding and decoding benchmarks. However, there is still a *lot* to be improved on in this type of code.
  llvm-svn: 147244
  Author: Chandler Carruth | Age: 2011-12-24 | Files: 5 | Lines: -12/+93
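The xor idiom quoted above and its "more naive" subtracting form compute the same bit index. The sketch below checks this at the source level only; it assumes a GCC- or Clang-compatible compiler for `__builtin_clz` and a power-of-two bit width for `unsigned`, the function names are illustrative, and it says nothing about which instructions the backend selects.

```cpp
#include <cassert>

// Most significant set bit via the xor idiom from the commit message.
// __builtin_clz is undefined for 0, so x must be nonzero.
unsigned msbViaXor(unsigned x) {
  return static_cast<unsigned>(sizeof(x) * 8 - 1) ^
         static_cast<unsigned>(__builtin_clz(x));
}

// The same value via subtraction instead of xor. Requires x != 0.
unsigned msbViaSub(unsigned x) {
  return static_cast<unsigned>(sizeof(x) * 8 - 1) -
         static_cast<unsigned>(__builtin_clz(x));
}

// Both forms return the position of the top set bit.
void checkMsbIdioms() {
  const unsigned width = static_cast<unsigned>(sizeof(unsigned) * 8);
  for (unsigned bit = 0; bit < width; ++bit) {
    const unsigned x = 1u << bit;
    assert(msbViaXor(x) == bit);
    assert(msbViaSub(x) == bit);
    assert(msbViaXor(x | (x >> 1)) == bit);
  }
}
```

The two forms agree because the leading-zero count never exceeds the all-ones constant `width - 1`, so the subtraction cannot borrow.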
* Cleanup this test a bit, sorting things and grouping them more clearly.
  llvm-svn: 147243
  Author: Chandler Carruth | Age: 2011-12-24 | Files: 1 | Lines: -21/+17
* Fix Comments.
  llvm-svn: 147238
  Author: Jakob Stoklund Olesen | Age: 2011-12-24 | Files: 1 | Lines: -3/+3
* Add MachineMemOperands to instructions generated in storeRegToStackSlot or loadRegFromStackSlot.
  llvm-svn: 147235
  Author: Akira Hatanaka | Age: 2011-12-24 | Files: 1 | Lines: -2/+16
* Detect unaligned loads/stores that have been added for Mips64 support.
  llvm-svn: 147234
  Author: Akira Hatanaka | Age: 2011-12-24 | Files: 1 | Lines: -1/+8
* Test case for r147232.
  llvm-svn: 147233
  Author: Akira Hatanaka | Age: 2011-12-24 | Files: 1 | Lines: -0/+12
* If target ABI is N64, LEA should be daddiu.
  llvm-svn: 147232
  Author: Akira Hatanaka | Age: 2011-12-24 | Files: 1 | Lines: -1/+1
* Move x86 specific bits of the COFF writer to lib/Target/X86.
  llvm-svn: 147231
  Author: Rafael Espindola | Age: 2011-12-24 | Files: 7 | Lines: -42/+127