summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* Optimize code a bit by calling push_back only once in some loops. Reduces ↵Craig Topper2012-04-111-26/+24
| | | | | | compiled code size a bit. llvm-svn: 154473
* Match (fneg (fma) to vfnma. rdar://10139676Evan Cheng2012-04-111-0/+8
| | | | llvm-svn: 154469
* Add retw and lretw instructions. Also, fix Intel syntax parsing for allCharles Davis2012-04-111-5/+10
| | | | | | ret instructions. llvm-svn: 154468
* Fix ARM disassembly of VLD instructions with writebacks.  And add test a caseKevin Enderby2012-04-111-0/+12
| | | | | | for all opcodes handed by DecodeVLDInstruction() in ARMDisassembler.cpp . llvm-svn: 154459
* ARM add missing Thumb1 two-operand aliases for shift-by-immediate.Jim Grosbach2012-04-112-0/+39
| | | | | | rdar://11222742 llvm-svn: 154457
* Fix a number of problems with ARM fused multiply add/subtract instructions.Evan Cheng2012-04-118-10/+73
| | | | | | | | | | 1. The new instruction itinerary entries are not properly described. 2. The asm parser can't handle vfms and vfnms. 3. There were no assembler, disassembler test cases. 4. HasNEON2 has the wrong assembler predicate. rdar://10139676 llvm-svn: 154456
* Tweak MachineLICM heuristics for cheap instructions.Jakob Stoklund Olesen2012-04-111-69/+89
| | | | | | | | | | | Allow cheap instructions to be hoisted if they are register pressure neutral or better. This happens if the instruction is the last loop use of another virtual register. Only expensive instructions are allowed to increase loop register pressure. llvm-svn: 154455
* Only check for PHI uses inside the current loop.Jakob Stoklund Olesen2012-04-111-27/+51
| | | | | | | | | | | Hoisting a value that is used by a PHI in the loop will introduce a copy because the live range is extended to cross the PHI. The same applies to PHIs in exit blocks. Also use this opportunity to make HasLoopPHIUse() non-recursive. llvm-svn: 154454
* Move the constant-folding support for FP_ROUND in SelectionDAG from the ↵Owen Anderson2012-04-101-1/+10
| | | | | | | | one-operand version of getNode() to the two-operand version, since it became a two-operand node at sound point. Zap a testcase that this allows us to completely fold away. llvm-svn: 154447
* [tsan] two more compile-time optimizations:Kostya Serebryany2012-04-101-11/+42
| | | | | | | | | | | | | - don't isntrument reads from constant globals. Saves ~1.5% of instrumented instructions on CPU2006 (counting static instructions, not their execution). - don't insrument reads from vtable (which is a global constant too). Saves ~5%. I did not measure the run-time impact of this, but it is certainly non-negative. llvm-svn: 154444
* Handle llvm.fma.* intrinsics. rdar://10914096Evan Cheng2012-04-103-2/+20
| | | | llvm-svn: 154439
* Add a comment noting that the fdiv -> fmul conversion won't generateDuncan Sands2012-04-101-3/+3
| | | | | | multiplication by a denormal, and some tests checking that. llvm-svn: 154431
* The MDString class stored a StringRef to the string which was already in aBill Wendling2012-04-103-8/+14
| | | | | | | | | | | | | StringMap. This was redundant and unnecessarily bloated the MDString class. Because the MDString class is a "Value" and will never have a "name", and because the Name field in the Value class is a pointer to a StringMap entry, we repurpose the Name field for an MDString. It stores the StringMap entry in the Name field, and uses the normal methods to get the string (name) back. PR12474 llvm-svn: 154429
* Whitespace.Chad Rosier2012-04-101-1/+0
| | | | llvm-svn: 154427
* Revert r154396, which looks to be the real culprit behind the bot failures.Chad Rosier2012-04-101-0/+1
| | | | llvm-svn: 154426
* Temporarily revert this patch to see if it brings the buildbots back.Eric Christopher2012-04-104-87/+36
| | | | llvm-svn: 154425
* [tsan] compile-time instrumentation: do not instrument a read ifKostya Serebryany2012-04-101-5/+82
| | | | | | | | | | | | | a write to the same temp follows in the same BB. Also add stats printing. On Spec CPU2006 this optimization saves roughly 4% of instrumented reads (which is 3% of all instrumented accesses): Writes : 161216 Reads : 446458 Reads-before-write: 18295 llvm-svn: 154418
* To ensure that we have more accurate line information for a blockEric Christopher2012-04-101-2/+5
| | | | | | | | | don't elide the branch instruction if it's the only one in the block, otherwise it's ok. PR9796 and rdar://11215207 llvm-svn: 154417
* Revert r154397, which was causing make check failures on the buildbots.Owen Anderson2012-04-101-13/+6
| | | | llvm-svn: 154414
* ARM fix cc_out operand handling for t2SUBrr instructions.Jim Grosbach2012-04-102-3/+7
| | | | | | | | | | | | We were incorrectly conflating some add variants which don't have a cc_out operand with the mirroring sub encodings, which do. Part of the awesome non-orthogonality legacy of thumb1. Similarly, handling of add/sub of an immediate was sometimes incorrectly removing the cc_out operand for add/sub register variants. rdar://11216577 llvm-svn: 154411
* Remove unused variable.David Blaikie2012-04-101-1/+0
| | | | llvm-svn: 154398
* Fix a dagcombine optimization which assumes that the vsetcc result type is ↵Nadav Rotem2012-04-101-6/+13
| | | | | | | | | always of the same size as the compared values. This is ture for SSE/AVX/NEON but not for all targets. llvm-svn: 154397
* Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.Nadav Rotem2012-04-104-35/+87
| | | | | | | blendv uses a register for the selection while vblend uses an immediate. On sandybridge they still have the same latency and execute on the same execution ports. llvm-svn: 154396
* Make a somewhat subtle change in the logic of block placement. SometimesChandler Carruth2012-04-101-0/+12
| | | | | | | | | | | | | | | | the loop header has a non-loop predecessor which has been pre-fused into its chain due to unanalyzable branches. In this case, rotating the header into the body of the loop in order to place a loop exit at the bottom of the loop is a Very Bad Idea as it makes the loop non-contiguous. I'm working on a good test case for this, but it's a bit annoynig to craft. I should get one shortly, but I'm submitting this now so I can begin the (lengthy) performance analysis process. An initial run of LNT looks really, really good, but there is too much noise there for me to trust it much. llvm-svn: 154395
* Transform div to mul with reciprocal only when fp imm is legal.Anton Korobeynikov2012-04-101-2/+9
| | | | | | This fixes PR12516 and uncovers one weird problem in legalize (workarounded) llvm-svn: 154394
* Use the correct section types on Solaris for unwind data on both x86 and x86-64.David Chisnall2012-04-101-3/+8
| | | | | | Patch by Dmitri Shubin! llvm-svn: 154391
* Express the number of ULPs in fpaccuracy metadata as a real rather than aDuncan Sands2012-04-101-0/+12
| | | | | | rational number, eg as 2.5 rather than 5, 2. OK'd by Peter Collingbourne. llvm-svn: 154387
* Fix 12513: Loop unrolling breaks with indirect branches.Andrew Trick2012-04-103-29/+29
| | | | | | | | Take this opportunity to generalize the indirectbr bailout logic for loop transformations. CFG transformations will never get indirectbr right, and there's no point trying. llvm-svn: 154386
* whitespaceAndrew Trick2012-04-101-140/+140
| | | | llvm-svn: 154385
* Make the code slightly more palatable.Evan Cheng2012-04-101-1/+5
| | | | llvm-svn: 154378
* Add a constructor for DataRefImpl and remove excess initialization.Danil Malyshev2012-04-102-16/+0
| | | | llvm-svn: 154371
* Fix a long standing tail call optimization bug. When a libcall is emittedEvan Cheng2012-04-106-46/+62
| | | | | | | | | | | | | legalizer always use the DAG entry node. This is wrong when the libcall is emitted as a tail call since it effectively folds the return node. If the return node's input chain is not the entry (i.e. call, load, or store) use that as the tail call input chain. PR12419 rdar://9770785 rdar://11195178 llvm-svn: 154370
* Don't try to zExt just to check if an integer constant is zero, it mightRafael Espindola2012-04-101-2/+2
| | | | | | not fit in a i64. llvm-svn: 154364
* ARM LDR/LDRT has the same encoding collision as STR/STRT.Jim Grosbach2012-04-101-8/+7
| | | | | | Generalized logic of r154141. llvm-svn: 154362
* Have TargetLowering::getPICJumpTableRelocBase return a node that points to theAkira Hatanaka2012-04-091-1/+5
| | | | | | GOT if jump table uses 64-bit gp-relative relocation. llvm-svn: 154341
* When performing a truncating store, it's possible to rearrange the data Chad Rosier2012-04-091-1/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | in-register, such that we can use a single vector store rather then a series of scalar stores. For func_4_8 the generated code vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vmov.u16 r0, d16[3] strb r0, [r2, #3] vmov.u16 r0, d16[2] strb r0, [r2, #2] vmov.u16 r0, d16[1] strb r0, [r2, #1] vmov.u16 r0, d16[0] strb r0, [r2] bx lr becomes vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vuzp.8 d16, d17 vst1.32 {d16[0]}, [r2, :32] bx lr I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll, but I couldn't think of a way to judiciously apply this combine. This ldrh r0, [r0, #4] strh r0, [r1] becomes vldr d16, [r0] vmov.u16 r0, d16[2] vmov.32 d16[0], r0 vuzp.16 d16, d17 vst1.32 {d16[0]}, [r1, :32] PR11158 rdar://10703339 llvm-svn: 154340
* Patch r153892 for PR11861 apparently broke an external project (see PR12493).Lang Hames2012-04-091-16/+17
| | | | | | | | | | This patch restores TwoAddressInstructionPass's pre-r153892 behaviour when rescheduling instructions in TryInstructionTransform. Hopefully this will fix PR12493. To refix PR11861, lowering of INSERT_SUBREGS is deferred until after the copy that unties the operands is emitted (this seems to be a more appropriate fix for that issue anyway). llvm-svn: 154338
* Update comments and remove unnecessary isVolatile() check.Chad Rosier2012-04-091-3/+5
| | | | llvm-svn: 154336
* Fix accidentally constant conditions found by uncommitted improvements to ↵David Blaikie2012-04-091-4/+4
| | | | | | | | | | | -Wconstant-conversion. A couple of cases where we were accidentally creating constant conditions by something like "x == a || b" instead of "x == a || x == b". In one case a conditional & then unreachable was used - I transformed this into a direct assert instead. llvm-svn: 154324
* Pattern match a setcc of boolean value with 0 as a truncate.Rafael Espindola2012-04-091-9/+48
| | | | llvm-svn: 154322
* This patch adds X86 instruction itineraries, which were missed by thePreston Gurd2012-04-091-28/+30
| | | | | | original patch to add itineraries, to X86InstrArithmetc.td. llvm-svn: 154320
* Lower some x86 shuffle sequences to the vblend family of instructions.Nadav Rotem2012-04-091-0/+67
| | | | llvm-svn: 154313
* Fix a bug in the lowering of broadcasts: ConstantPools need to use the ↵Nadav Rotem2012-04-092-19/+13
| | | | | | | | target pointer type. Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering. llvm-svn: 154310
* Remove unnecessary type check when combining and/or/xor of swizzles. Move ↵Craig Topper2012-04-091-13/+12
| | | | | | some checks to allow better early out. llvm-svn: 154309
* Remove unnecessary 'else' on an 'if' that always returnsCraig Topper2012-04-091-1/+2
| | | | llvm-svn: 154308
* Optimize code slightly. No functionality change.Craig Topper2012-04-091-6/+7
| | | | llvm-svn: 154307
* Replace some explicit checks with asserts for conditions that should never ↵Craig Topper2012-04-091-14/+7
| | | | | | happen. llvm-svn: 154305
* Cleanup and relax a restriction on the matching of global offsets intoChandler Carruth2012-04-091-9/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | x86 addressing modes. This allows PIE-based TLS offsets to fit directly into an addressing mode immediate offset, which is the last remaining code quality issue from PR12380. With this patch, that PR is completely fixed. To understand why this patch is correct to match these offsets into addressing mode immediates, break it down by cases: 1) 32-bit is trivially correct, and unmodified here. 2) 64-bit non-small mode is unchanged and never matches. 3) 64-bit small PIC code which is RIP-relative is handled specially in the match to try to fit RIP into the base register. If it fails, it now early exits. This behavior is unchanged by the patch. 4) 64-bit small non-PIC code which is not RIP-relative continues to work as it did before. The reason these immediates are safe is because the ABI ensures they fit in small mode. This behavior is unchanged. 5) 64-bit small PIC code which is *not* using RIP-relative addressing. This is the only case changed by the patch, and the primary place you see it is in TLS, either the win64 section offset TLS or Linux local-exec TLS model in a PIC compilation. Here the ABI again ensures that the immediates fit because we are in small mode, and any other operations required due to the PIC relocation model have been handled externally to the Wrapper node (extra loads etc are made around the wrapper node in ISelLowering). I've tested this as much as I can comparing it with GCC's output, and everything appears safe. I discussed this with Anton and it made sense to him at least at face value. That said, if there are issues with PIC code after this patch, yell and we can revert it. llvm-svn: 154304
* Optimize code a bit. No functional change intended.Craig Topper2012-04-081-9/+9
| | | | llvm-svn: 154299
* Silence sign-compare warning.Benjamin Kramer2012-04-081-1/+1
| | | | llvm-svn: 154297
OpenPOWER on IntegriCloud