path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [X86] Use the correct alignment for COMDAT constant pool entriesDavid Majnemer2016-02-216-15/+36
| | | | | | | | | | | | | | | | | | | COFF doesn't have sections with mergeable contents. Instead, each constant pool entry ends up in a COMDAT section. The linker, when choosing between COMDAT sections, doesn't choose the max alignment of the two sections. You just get whatever alignment was on the section. If one constant needed a higher alignment in one object file than in another, then we will get into trouble if the linker chooses the lower-alignment one. Instead, let's promote the alignment of the constant pool entry to make sure we don't use an under-aligned constant with an instruction which assumed otherwise. This fixes PR26680. llvm-svn: 261462
* [InstCombine] SSE/SSE2 (u)comiss/(u)comisd comparison intrinsics only use ↵Simon Pilgrim2016-02-201-0/+40
| | | | | | the lowest vector element llvm-svn: 261460
* [WebAssembly] Refine a README.txt entry.Dan Gohman2016-02-201-2/+2
| | | | | | | The register coloring pass may also need to be involved in order to optimally sort registers. llvm-svn: 261458
* [WebAssembly] Handle CopyToReg nodes with flag results in LowerCopyToReg.Dan Gohman2016-02-201-3/+7
| | | | llvm-svn: 261457
* [WebAssembly] Write stack pointer back to memory when FP is usedDerek Schuff2016-02-201-1/+1
| | | | | | | | The stack pointer is bumped when there is a frame pointer or when there are static-size objects, but was only getting written back when there were static-size objects. llvm-svn: 261453
* [WebAssembly] Stackify function prologs and epilogsDerek Schuff2016-02-201-15/+21
| | | | | | | | The instructions are the same, but fewer locals are used. Differential Revision: http://reviews.llvm.org/D17428 llvm-svn: 261452
* Don't scan for SSA register operands to update when not in SSA form.Dan Gohman2016-02-201-22/+24
| | | | | | | TailDuplicate can run on either SSA code or non-SSA code, as indicated to it by MRI->isSSA() ("PreRegAlloc" here). TailDuplicate does extra work to preserve SSA invariants when it duplicates code. This patch makes it skip some of this extra work in the case where the code is not in SSA form. llvm-svn: 261450
* Fix the build bot break caused by rL261441.Nemanja Ivanovic2016-02-201-5/+11
| | | | | | | | The patch has a necessary call to a function inside an assert. Which is fine when you have asserts turned on. Not so much when they're off. Sorry about the regression. llvm-svn: 261447
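The pitfall described above can be illustrated with a minimal sketch. The names `reserveScratch`, `pickScratchBuggy`, and `pickScratchFixed` are hypothetical and are not taken from the patch; the sketch only shows the general shape of the problem, which is that `assert` expands to nothing when `NDEBUG` is defined.

```cpp
#include <cassert>

// Hypothetical helper with a side effect that the caller depends on.
inline bool reserveScratch(int &Reg) {
  Reg = 12; // arbitrary illustrative register number
  return true;
}

// Buggy shape: when NDEBUG is defined, the assert expression is not
// evaluated, so the call never happens and Reg keeps its initial value.
inline int pickScratchBuggy() {
  int Reg = -1;
  assert(reserveScratch(Reg) && "no scratch register");
  return Reg;
}

// Fixed shape: make the call unconditionally, then assert on the result.
inline int pickScratchFixed() {
  int Reg = -1;
  bool Found = reserveScratch(Reg);
  assert(Found && "no scratch register");
  (void)Found; // avoid an unused-variable warning when asserts are off
  return Reg;
}
```

With asserts enabled both functions return the same register; with asserts disabled only the fixed variant still performs the call.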
* Fix for PR 26500Nemanja Ivanovic2016-02-202-52/+182
| | | | | | | | | | | | | | This patch corresponds to review: http://reviews.llvm.org/D17294 It ensures that whatever block we are emitting the prologue/epilogue into, we have the necessary scratch registers. It takes away the hard-coded register numbers for use as scratch registers as registers that are guaranteed to be available in the function prologue/epilogue are not guaranteed to be available within the function body. Since we shrink-wrap, the prologue/epilogue may end up in the function body. llvm-svn: 261441
* [DAGCombiner] Use getBitcast helper when possible. NFCI.Simon Pilgrim2016-02-201-7/+3
| | | | llvm-svn: 261437
* [X86][SSE] Fixed issue with commutation of 'faux unary' target shuffles ↵Simon Pilgrim2016-02-201-5/+4
| | | | | | | | (PR26667) Fixed a bug introduced by D16683 when a binary shuffle is simplified to a unary shuffle (with undef/zero sentinel mask indices) - if this resulted in only the second input being used combineX86ShuffleChain failed to take this into account and still referenced the first input. llvm-svn: 261434
* [X86][SSE] Move all undef/zero cases before target shuffle combining.Simon Pilgrim2016-02-201-20/+14
| | | | | | First small step towards fixing PR26667 - we need to ensure that combineX86ShuffleChain only gets called with a valid shuffle input node (a similar issue was found in D17041). llvm-svn: 261433
* When MemoryDependenceAnalysis hits a CFG with many transparent blocks,Joerg Sonnenberger2016-02-201-6/+26
| | | | | | | | | | | | | | | | | | the algorithm easily degrades into quadratic memory and time complexity. The easiest example is a long chain of BBs that don't otherwise use a location. The caching will add an entry for every intermediate block and limiting the number of results doesn't help as no results are produced until a definition is found. Introduce a limit similar to the existing instructions-per-block limit. This limit counts the total number of blocks checked. If the limit is reached, entries are considered unknown. The initial value is 1000, which avoids regressions for normal-sized functions while still limiting edge cases to reasonable memory consumption and execution time. Differential Revision: http://reviews.llvm.org/D16123 llvm-svn: 261430
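The idea of a per-query block budget can be sketched on a toy CFG. This is not the MemoryDependenceAnalysis code: `Block`, `findDef`, `makeChain`, and the `Unknown` sentinel are hypothetical names, and the walk is a plain worklist over predecessors that gives up once it has checked more than `BlockLimit` blocks.

```cpp
#include <cstddef>
#include <vector>

// Toy CFG node: predecessor indices plus a flag saying whether the block
// defines the queried location. Blocks that do not are "transparent".
struct Block {
  std::vector<std::size_t> Preds;
  bool DefinesLocation = false;
};

const long Unknown = -1; // hypothetical sentinel for "gave up or not found"

// Walk predecessors from Start. Returns the index of the first defining
// block reached, or Unknown when more than BlockLimit blocks were checked
// or the graph was exhausted without finding a definition.
inline long findDef(const std::vector<Block> &CFG, std::size_t Start,
                    std::size_t BlockLimit) {
  std::vector<bool> Seen(CFG.size(), false);
  std::vector<std::size_t> Worklist{Start};
  std::size_t Checked = 0;
  while (!Worklist.empty()) {
    std::size_t B = Worklist.back();
    Worklist.pop_back();
    if (Seen[B])
      continue;
    Seen[B] = true;
    if (++Checked > BlockLimit)
      return Unknown; // budget exhausted: treat the result as unknown
    if (CFG[B].DefinesLocation)
      return static_cast<long>(B);
    for (std::size_t P : CFG[B].Preds)
      Worklist.push_back(P);
  }
  return Unknown;
}

// Builds a straight-line chain of N blocks (N >= 1) in which only block 0
// defines the location and block I has block I-1 as its predecessor.
inline std::vector<Block> makeChain(std::size_t N) {
  std::vector<Block> CFG(N);
  CFG[0].DefinesLocation = true;
  for (std::size_t I = 1; I < N; ++I)
    CFG[I].Preds.push_back(I - 1);
  return CFG;
}
```

On a chain of ten blocks, a query from the last block finds the definition with a generous limit and reports unknown with a limit of five, so the cost of a query is bounded by the limit rather than by the chain length.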
* [X86] Enable the LEA optimization pass by default.Andrey Turetskiy2016-02-201-4/+5
| | | | | | Differential Revision: http://reviews.llvm.org/D16877 llvm-svn: 261429
* [X86] PR26575: Fix LEA optimization pass (Part 2).Andrey Turetskiy2016-02-201-36/+78
| | | | | | | | | | Handle address displacement operands of a type other than Immediate or Global in LEAs and load/stores. Ref: https://llvm.org/bugs/show_bug.cgi?id=26575 Differential Revision: http://reviews.llvm.org/D17374 llvm-svn: 261428
* [SimplifyCFG] Use pointer identity to simplify predicate.Benjamin Kramer2016-02-201-4/+2
| | | | | | No functional change intended. llvm-svn: 261427
* [LVI] Move ConstantRanges instead of copying.Benjamin Kramer2016-02-201-9/+8
| | | | | | | | No functional change intended. Copying small (<= 64 bits) APInts isn't expensive but bloats code by generating the slow path everywhere. Moving doesn't care about the size of the value. llvm-svn: 261426
* Move some code from doInitialization to runOnFunctionDavid Majnemer2016-02-201-3/+4
| | | | | | | This has no observable behavior change, it just makes the state insertion pass look a little more like normal passes. llvm-svn: 261420
* [X86] Add some missing reversed forms of XOP instructions.Craig Topper2016-02-201-0/+29
| | | | llvm-svn: 261417
* [PM/AA] Wire up TBAA to the new pass manager's registry and test it.Chandler Carruth2016-02-202-0/+2
| | | | llvm-svn: 261411
* [PM/AA] Wire up the scoped-no-alias AA to the new pass manager'sChandler Carruth2016-02-202-0/+2
| | | | | | registry and test it. llvm-svn: 261410
* [PM/AA] Wire up SCEVAA to the new pass manager's registry and test it.Chandler Carruth2016-02-202-0/+2
| | | | llvm-svn: 261409
* MachineCopyPropagation: Introduce Reg2MIMap typedef; NFCMatthias Braun2016-02-201-4/+5
| | | | llvm-svn: 261408
* MachineCopyPropagation: Move variables from function to passMatthias Braun2016-02-201-18/+22
| | | | | | | | This avoids unnecessarily passing them around when calling helper functions. It may also be slightly faster to call clear() on the data structures instead of freshly initializing them for each block. llvm-svn: 261407
* MachineCopyPropagation: Use ranged for, cleanup; NFCMatthias Braun2016-02-201-51/+35
| | | | llvm-svn: 261406
* MachineCopyPropagation: Use assert() instead of if{report_error()} for ↵Matthias Braun2016-02-201-8/+5
| | | | | | 'impossible' condition llvm-svn: 261405
* [PM/AA] Wire up CFLAA to the new pass manager fully, and port one of itsChandler Carruth2016-02-203-0/+3
| | | | | | | | | tests over to exercise this code. This uncovered a few missing bits here and there in the analysis, but nothing interesting. llvm-svn: 261404
* [PM/AA] Port alias analysis evaluator to the new pass manager, and useChandler Carruth2016-02-204-71/+68
| | | | | | | | | | | | | | | | it to actually test the new pass manager AA wiring. This patch was extracted from the (somewhat too large) D12357 and rebased on top of the slightly different design of the new pass manager AA wiring that I just landed. With this we can start testing the AA in a thorough way with the new pass manager. Some minor cleanups to the code in the pass were necessitated here, but otherwise it is a very minimal change. Differential Revision: http://reviews.llvm.org/D17372 llvm-svn: 261403
* [SCEV] Don't spell `SCEV *` variables as `Scev`; NFCSanjoy Das2016-02-201-15/+14
| | | | | | | It reads odd since most other places name a `SCEV *` as `S`. Pure renaming change. llvm-svn: 261393
* [SCEV] Don't use std::make_pair; NFCSanjoy Das2016-02-201-15/+14
| | | | | | `{A, B}` reads cleaner than `std::make_pair(A, B)`. llvm-svn: 261392
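For readers unfamiliar with the two spellings, here is a small standalone comparison. The function names are hypothetical and unrelated to the SCEV sources; the sketch only shows that a braced initializer builds the same `std::pair` wherever the target type is already known.

```cpp
#include <map>
#include <string>
#include <utility>

// Older spelling: the pair type is deduced from the arguments.
inline std::pair<int, std::string> viaMakePair() {
  return std::make_pair(1, std::string("one"));
}

// Braced spelling: the return type already names the pair type.
inline std::pair<int, std::string> viaBraces() {
  return {1, "one"};
}

// The same works for calls that take a pair, such as std::map::insert.
inline std::map<int, std::string> fillMap() {
  std::map<int, std::string> M;
  M.insert({2, "two"});
  return M;
}
```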
* [SimplifyCFG] Merge together cleanuppadsDavid Majnemer2016-02-201-2/+45
| | | | | | | | | | Cleanuppads may be merged together if one is the only predecessor of the other, in which case a simple transform can be performed: replace the cleanupret with a branch and remove an unnecessary cleanuppad. Differential Revision: http://reviews.llvm.org/D17459 llvm-svn: 261390
* [X86ISelLowering] Fix TLSADDR lowering when shrink-wrapping is enabled.Davide Italiano2016-02-203-2/+39
| | | | | | | | | | TLSADDR nodes are lowered into actual calls inside MC. In order to prevent shrink-wrapping from pushing prologue/epilogue past them (which results in TLS variables being accessed before the stack frame is set up), we put markers, so that the stack gets adjusted properly. Thanks to Quentin Colombet for guidance/help on how to fix this problem! llvm-svn: 261387
* AMDGPU/SI: Use v_readfirstlane to legalize SMRD with VGPR base pointerTom Stellard2016-02-202-238/+22
| | | | | | | | | | | | | | | | | | | | | | Summary: Instead of trying to replace SMRD instructions with a VGPR base pointer with an equivalent MUBUF instruction, we now copy the base pointer to SGPRs using v_readfirstlane. This is safe to do, because any load selected as an SMRD instruction has been proven to have a uniform base pointer, so each thread in the wave will have the same pointer value in VGPRs. This will fix some errors on VI from trying to replace SMRD instructions with addr64-enabled MUBUF instructions that don't exist. Reviewers: arsenm, cfang, nhaehnle Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17305 llvm-svn: 261385
* [RegAllocFast] Properly track the physical register definitions on calls.Quentin Colombet2016-02-201-4/+6
| | | | | | PR26485 llvm-svn: 261384
* [codeview] Fix emission of file changes in inline line tablesReid Kleckner2016-02-191-1/+4
| | | | | | These are supposed to be file checksum table offsets, not file ids. llvm-svn: 261379
* [X86ISelLowering] Provide a more informative assert message.Davide Italiano2016-02-191-1/+1
| | | | | | I stumbled upon this while debugging a lowering bug. llvm-svn: 261371
* [X86ISelLowering] Merge two conditions inside a single if.Davide Italiano2016-02-191-3/+1
| | | | llvm-svn: 261370
* Revert r255691 "[LoopVectorizer] Refine loop vectorizer's register usage ↵Hans Wennborg2016-02-191-106/+31
| | | | | | | | calculator by ignoring specific instructions." It caused PR26509. llvm-svn: 261368
* Revert r253557 "Alternative to long nops for X86 CPUs, by Andrey Turetsky"Hans Wennborg2016-02-191-32/+14
| | | | | | Turns out the new nop sequences aren't actually nops on x86_64 (PR26554). llvm-svn: 261365
* Fix incorrect selection of AVX512 sqrt when OptForSize is onDimitry Andric2016-02-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | Summary: When optimizing for size, sqrt calls can be incorrectly selected as AVX512 VSQRT instructions. This is because X86InstrAVX512.td has a `Requires<[OptForSize]>` in its `avx512_sqrt_scalar` multiclass definition. Even if the target does not support AVX512, the class can apparently still be chosen, leading to an incorrect selection of `vsqrtss`. In PR26625, this led to an assertion: Reg >= X86::FP0 && Reg <= X86::FP6 && "Expected FP register!", because the `vsqrtss` instruction requires an XMM register, which is not available on i686 CPUs. Reviewers: grosbach, resistor, joker.eph Subscribers: spatel, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D17414 llvm-svn: 261360
* [StatepointLowering] Minor non-semantic cleanupsSanjoy Das2016-02-191-23/+18
| | | | | | Use auto, bring file up to coding standards etc. llvm-svn: 261358
* [WebAssembly] Add another optimization idea to README.txt.Dan Gohman2016-02-191-0/+6
| | | | llvm-svn: 261354
* [AArch64][ShrinkWrap] Fix bug in prolog clobbering live reg when shrink ↵Geoff Berry2016-02-192-9/+63
| | | | | | | | | | | | | | wrapping. Summary: See bug https://llvm.org/bugs/show_bug.cgi?id=26642 Reviewers: qcolombet, t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17350 llvm-svn: 261349
* [StatepointLowering] Update StatepointMaxSlotsRequired correctlySanjoy Das2016-02-191-3/+4
| | | | | | | | | | Now that we don't always add an element to AllocatedStackSlots if we don't find a pre-existing unallocated stack slot, bumping StatepointMaxSlotsRequired to `NumSlots + 1` is not correct. Instead bump the statistic near the push_back, to Builder.FuncInfo.StatepointStackSlots.size(). llvm-svn: 261348
* [StatepointLowering] Fix a mistake in rL261336Sanjoy Das2016-02-191-4/+5
| | | | | | | | The check on MFI->getObjectSize() has to be on the FrameIndex, not on the index of the FrameIndex in AllocatedStackSlots. Weirdly, the tests I added in rL261336 didn't catch this. llvm-svn: 261347
* [LV] Vectorize first-order recurrencesMatthew Simpson2016-02-192-6/+234
| | | | | | | | | | | | | | | | | | This patch enables the vectorization of first-order recurrences. A first-order recurrence is a non-reduction recurrence relation in which the value of the recurrence in the current loop iteration equals a value defined in the previous iteration. The load PRE of the GVN pass often creates these recurrences by hoisting loads from within loops. In this patch, we add a new recurrence kind for first-order phi nodes and attempt to vectorize them if possible. Vectorization is performed by shuffling the values for the current and previous iterations. The vectorization cost estimate is updated to account for the added shuffle instruction. Contributed-by: Matthew Simpson and Chad Rosier <mcrosier@codeaurora.org> Differential Revision: http://reviews.llvm.org/D16197 llvm-svn: 261346
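A first-order recurrence and the shuffle-based strategy described above can be mimicked in plain C++. This is only an illustration and not what the vectorizer emits: `adjacentDiffScalar`, `adjacentDiffBlocked`, and the width `VF = 4` are assumptions made for the sketch, and the "vector" is emulated with small arrays.

```cpp
#include <cstddef>
#include <vector>

// Scalar loop: Prev carries the value loaded in the previous iteration,
// which is the shape load PRE tends to leave behind.
inline std::vector<int> adjacentDiffScalar(const std::vector<int> &A) {
  std::vector<int> B;
  if (A.empty())
    return B;
  int Prev = A[0]; // recurrence value entering the loop
  for (std::size_t I = 1; I < A.size(); ++I) {
    int Cur = A[I];
    B.push_back(Cur - Prev);
    Prev = Cur; // becomes the recurrence value of the next iteration
  }
  return B;
}

// Emulated 4-wide version: the vector of "previous" values is formed by
// combining the last lane of the prior chunk with the first three lanes of
// the current chunk, which is the shuffle mentioned in the commit message.
inline std::vector<int> adjacentDiffBlocked(const std::vector<int> &A) {
  const std::size_t VF = 4;
  std::vector<int> B;
  if (A.empty())
    return B;
  int Carry = A[0]; // last lane of the previous chunk
  std::size_t I = 1;
  for (; I + VF <= A.size(); I += VF) {
    int Cur[VF], Prev[VF];
    for (std::size_t L = 0; L < VF; ++L)
      Cur[L] = A[I + L];
    Prev[0] = Carry;
    for (std::size_t L = 1; L < VF; ++L)
      Prev[L] = Cur[L - 1];
    for (std::size_t L = 0; L < VF; ++L)
      B.push_back(Cur[L] - Prev[L]);
    Carry = Cur[VF - 1];
  }
  for (; I < A.size(); ++I) { // scalar remainder loop
    B.push_back(A[I] - Carry);
    Carry = A[I];
  }
  return B;
}
```

Both functions produce the same output for any input, which is the property the vectorized form has to preserve.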
* [StatepointLowering] Change AllocatedStackSlots to use SmallBitVectorSanjoy Das2016-02-192-13/+15
| | | | | | | | | | | | | | | NFCI. The key motivation here is that I'd like to use SmallBitVector::all() in a later change. Also, using a bit vector here seemed better in general. The only interesting change here is that in the failure case of allocateStackSlot, we no longer (the equivalent of) push_back(true) to AllocatedStackSlots. As far as I can tell, this is fine, since we'd never re-use those slots in the same StatepointLoweringState instance. Technically there was no need to change the operator[] type accesses to set() and test(), but I thought it'd be nice to make it obvious that we're using something other than a std::vector-like thing. llvm-svn: 261337
* [StatepointLowering] Fix bug in allocateStackSlotSanjoy Das2016-02-191-2/+19
| | | | | | | | | | | | | | | | | | | | | | | | | allocateStackSlot did not consider the size of the value to be spilled before deciding to re-use a spill slot. This was originally okay (since originally we'd only ever spill pointers), but it became not okay when we changed our scheme to directly spill vectors of pointers. While this change fixes the bug pointed out, it has two performance caveats: - It matches spill slot and spillee size exactly, while in theory we can spill, e.g., an 8 byte pointer into a 16 byte slot. This is slightly complicated to fix since in the stackmaps section, we report the size of the spill slot as the size of the "indirect value"; and if they're no longer equivalent, we'll have to keep track of the (indirect) value size separately from the stack slot size. - It will "spuriously run out" of reusable slots, since we now have a second check in the search loop in addition to the availability check (e.g. you had two free scalar slots, and you first ask for a vector slot followed by a scalar slot). I'll fix this in a later commit. llvm-svn: 261336
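The exact-size matching rule can be sketched independently of the real StatepointLowering code. `Slot` and `allocateSlot` below are hypothetical; the sketch assumes only what the message states, namely that a free slot is reused when its size equals the size of the value being spilled, and that a new slot is created otherwise.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical spill slot record: size in bytes plus an in-use flag.
struct Slot {
  unsigned Size;
  bool InUse;
};

// Returns the index of a free slot whose size matches Size exactly and
// marks it in use. If no such slot exists, appends a new one.
inline std::size_t allocateSlot(std::vector<Slot> &Slots, unsigned Size) {
  for (std::size_t I = 0; I < Slots.size(); ++I) {
    if (!Slots[I].InUse && Slots[I].Size == Size) {
      Slots[I].InUse = true;
      return I;
    }
  }
  Slots.push_back({Size, true});
  return Slots.size() - 1;
}
```

With two free 8-byte slots, a request for a 16-byte slot creates a third slot instead of reusing either free one, and a following 8-byte request reuses slot 0. The first caveat in the message is visible here as well: an 8-byte value is never placed in a free 16-byte slot.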
* [StatepointLowering] Clean up allocateStackSlotSanjoy Das2016-02-191-35/+22
| | | | | | | | | This removes the unusual loop structure in allocateStackSlot in favor of something more straightforward. I've also removed the cautionary comment in the function, which I suspect is historical cruft now, and confuses more than it enlightens. llvm-svn: 261335
* [LV] Fix PR26600: avoid out of bounds loads for interleaved access vectorizationSilviu Baranga2016-02-191-0/+10
| | | | | | | | | | | | | | | | | | | | Summary: If we don't have the first and last access of an interleaved load group, the first and last wide load in the loop can do an out of bounds access. Even though we discard results from speculative loads, this can cause problems, since it can technically generate page faults (or worse). We now discard interleaved load groups that don't have the first and last load in the group. Reviewers: hfinkel, rengolin Subscribers: rengolin, llvm-commits, mzolotukhin, anemet Differential Revision: http://reviews.llvm.org/D17332 llvm-svn: 261331
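The acceptance rule stated above can be written as a one-line predicate. The representation is an assumption made for illustration: `Present[I]` records whether the loop actually accesses member `I` of each interleaved tuple, and `keepInterleavedLoadGroup` is a hypothetical name rather than a function from the vectorizer.

```cpp
#include <vector>

// Keep a load group only when both its first and its last member are
// accessed by the loop, so the wide load never reads outside the range the
// original scalar loads would have touched.
inline bool keepInterleavedLoadGroup(const std::vector<bool> &Present) {
  return !Present.empty() && Present.front() && Present.back();
}
```

A group that reads only the middle member of each three-element tuple is rejected, because the wide load covering whole tuples would touch elements before the first and after the last scalar access.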