summaryrefslogtreecommitdiffstats
path: root/llvm
Commit message (Collapse)AuthorAgeFilesLines
* [PM] Wire up support for explicitly running the verifier pass.Chandler Carruth2015-01-052-0/+22
| | | | | | | | The required functionality has been there for some time, but I never managed to actually wire it into the command line registry of passes. Let's do that. llvm-svn: 225144
* [PM] Cleanup a const_cast and other machinery left over in this codeChandler Carruth2015-01-041-2/+1
| | | | | | | | | from before I removed thet non-const use of the function. The unused variable that held the const_cast was already kindly removed by Michael. llvm-svn: 225143
* [X86][SSE] Added vector packing test for pr12412Simon Pilgrim2015-01-041-0/+34
| | | | llvm-svn: 225138
* [X86][SSE] Added vector integer truncation tests - based off pr15524Simon Pilgrim2015-01-041-0/+90
| | | | llvm-svn: 225137
* [PowerPC] Materialize i64 constants using rotationHal Finkel2015-01-043-33/+77
| | | | | | | | | | | | | | | | Materializing full 64-bit constants on PPC64 can be expensive, requiring up to 5 instructions depending on the locations of the non-zero bits. Sometimes materializing a rotated constant, and then applying the inverse rotation, requires fewer instructions than the direct method. If so, do that instead. In r225132, I added support for forming constants using bit inversion. In effect, this reverts that commit and replaces it with rotation support. The bit inversion is useful for turning constants that are mostly ones into ones that are mostly zeros (thus enabling a more-efficient shift-based materialization), but the same effect can be obtained by using negative constants and a rotate, and that is at least as efficient, if not more. llvm-svn: 225135
* Fix unused variable warning for non-asserts builds. NFC.Michael Kuperstein2015-01-041-2/+2
| | | | llvm-svn: 225133
* [PowerPC] Materialize i64 constants using bit inversionHal Finkel2015-01-042-2/+50
| | | | | | | | | Materializing full 64-bit constants on PPC64 can be expensive, requiring up to 5 instructions depending on the locations of the non-zero bits. Sometimes materializing the bit-reversed constant, and then flipping the bits, requires fewer instructions than the direct method. If so, do that instead. llvm-svn: 225132
* [PM] Split the AssumptionTracker immutable pass into two separate APIs:Chandler Carruth2015-01-0465-943/+992
| | | | | | | | | | | | | | | | | | | | | | | | | | | | a cache of assumptions for a single function, and an immutable pass that manages those caches. The motivation for this change is two fold. Immutable analyses are really hacks around the current pass manager design and don't exist in the new design. This is usually OK, but it requires that the core logic of an immutable pass be reasonably partitioned off from the pass logic. This change does precisely that. As a consequence it also paves the way for the *many* utility functions that deal in the assumptions to live in both pass manager worlds by creating an separate non-pass object with its own independent API that they all rely on. Now, the only bits of the system that deal with the actual pass mechanics are those that actually need to deal with the pass mechanics. Once this separation is made, several simplifications become pretty obvious in the assumption cache itself. Rather than using a set and callback value handles, it can just be a vector of weak value handles. The callers can easily skip the handles that are null, and eventually we can wrap all of this up behind a filter iterator. For now, this adds boiler plate to the various passes, but this kind of boiler plate will end up making it possible to port these passes to the new pass manager, and so it will end up factored away pretty reasonably. llvm-svn: 225131
* InstCombine: match can find ConstantExprs, don't assume we have a ValueDavid Majnemer2015-01-042-2/+11
| | | | | | | | | | We assumed the output of a match was a Value, this would cause us to assert because we would fail a cast<>. Instead, use a helper in the Operator family to hide the distinction between Value and Constant. This fixes PR22087. llvm-svn: 225127
* ValueTracking: ComputeNumSignBits should tolerate misshapen phi nodesDavid Majnemer2015-01-042-2/+33
| | | | | | | | | | | | PHI nodes can have zero operands in the middle of a transform. It is expected that utilities in Analysis don't freak out when this happens. Note that it is considered invalid to allow these misshapen phi nodes to make it to another pass. This fixes PR22086. llvm-svn: 225126
* [APFloat][ADT] Fix sign handling logic for FMA results that truncate to zero.Lang Hames2015-01-042-1/+42
| | | | | | | | | | | | | | This patch adds a check for underflow when truncating results back to lower precision at the end of an FMA. The additional sign handling logic in APFloat::fusedMultiplyAdd should only be performed when the result of the addition step of the FMA (in full precision) is exactly zero, not when the result underflows to zero. Unit tests for this case and related signed zero FMA results are included. Fixes <rdar://problem/18925551>. llvm-svn: 225123
* llvm-readobj: add support to dump COFF export tablesSaleem Abdulrasool2015-01-037-0/+39
| | | | | | | This enhances llvm-readobj to print out the COFF export table, similar to the -coff-import option. This is useful for testing in lld. llvm-svn: 225120
* ARM: permit tail calls to weak externals on COFFSaleem Abdulrasool2015-01-033-2/+25
| | | | | | | | | | | Weak externals are resolved statically, so we can actually generate the tail call on PE/COFF targets without breaking the requirements. It is questionable whether we want to propagate the current behaviour for MachO as the requirements are part of the ARM ELF specifications, and it seems that prior to the SVN r215890, we would have tail'ed the call. For now, be conservative and only permit it on PE/COFF where the call will always be fully resolved. llvm-svn: 225119
* [PowerPC/BlockPlacement] Allow target to provide a per-loop alignment preferenceHal Finkel2015-01-035-13/+92
| | | | | | | | | | | | | | | | | The existing code provided for specifying a global loop alignment preference. However, the preferred loop alignment might depend on the loop itself. For recent POWER cores, loops between 5 and 8 instructions should have 32-byte alignment (while the others are better with 16-byte alignment) so that the entire loop will fit in one i-cache line. To support this, getPrefLoopAlignment has been made virtual, and can be provided with an optional MachineLoop* so the target can inspect the loop before answering the query. The default behavior, as before, is to return the value set with setPrefLoopAlignment. MachineBlockPlacement now queries the target for each loop instead of only once per function. There should be no functional change for other targets. llvm-svn: 225117
* [PowerPC] Use 16-byte alignment for modern cores for functions/loopsHal Finkel2015-01-033-4/+110
| | | | | | | | | | | | Most modern PowerPC cores prefer that functions and loops start on 16-byte-aligned boundaries (*), so instruct block placement, etc. to make this happen. The branch selector has also been adjusted so account for the extra nops that might now be inserted before loop headers. (*) Some cores actually prefer other alignments for small loops, but that will be addressed in a follow-up commit. llvm-svn: 225115
* Minor cleanup to all the switches after MatchInstructionImpl in all the ↵Craig Topper2015-01-038-29/+20
| | | | | | | | AsmParsers. Make sure they all have llvm_unreachable on the default path out of the switch. Remove unnecessary "default: break". Remove a 'return' after unreachable. Fix some indentation. llvm-svn: 225114
* Fix some formatting in tablegen output.Craig Topper2015-01-031-7/+4
| | | | llvm-svn: 225113
* Replace some 'unreachable' comments with llvm_unreachable.Craig Topper2015-01-031-2/+2
| | | | llvm-svn: 225112
* ValueTracking: Make computeKnownBits for Arguments a little more clearDavid Majnemer2015-01-031-0/+3
| | | | | | | | | | | We would sometimes leave the out-param APInts untouched while going through computeKnownBits. While I don't know of a way to trigger a bug involving this in practice, it goes against the overall design of computeKnownBits. Found via code inspection. llvm-svn: 225109
* [PowerPC] Add support for the CMPB instructionHal Finkel2015-01-0312-9/+538
| | | | | | | | | | | | | | Newer POWER cores, and the A2, support the cmpb instruction. This instruction compares its operands, treating each of the 8 bytes in the GPRs separately, returning a 'mask' result of 0 (for false) or -1 (for true) in each byte. Code generation support is added, in the form of a PPCISelDAGToDAG DAG-preprocessing routine, that recognizes patterns close to what the instruction computes (either exactly, or related by a constant masking operation), and generates the cmpb instruction (along with any necessary constant masking operation). This can be expanded if use cases arise. llvm-svn: 225106
* [asan] simplify the tracing code, make it use the same guard variables as ↵Kostya Serebryany2015-01-032-26/+13
| | | | | | coverage llvm-svn: 225103
* [X86] Disassembler support for move to/from %rax with a 32-bit memory offset ↵Craig Topper2015-01-037-4/+48
| | | | | | is REX.W and AdSize prefix are both present. llvm-svn: 225099
* [X86] Use 32-bit sign extended immediate for 64-bit LOCK_ArithBinOp with ↵Craig Topper2015-01-031-6/+6
| | | | | | sign extended immediate. llvm-svn: 225098
* [PM] Add proper documentation for the ModulePassManager andChandler Carruth2015-01-021-0/+40
| | | | | | | | | | | | | | FunctionPassManager. These never got documented, likely due to the clutter of this header file. This fixes another problem people noticed when they started trying to use the new pass manager. I've also used this to document the aspirational constraints I would like to hold passes to. I don't really have a better place to document such things at this point, but eventually will probably create a proper .rst file and page for the LLVM pass infrastructure that carries such high-level concerns. llvm-svn: 225097
* [PM] Actually include the correct file name. Sorry for the breakage.Chandler Carruth2015-01-021-1/+1
| | | | llvm-svn: 225096
* [PM] Lift the majority of the template boilerplate used to implement theChandler Carruth2015-01-022-306/+343
| | | | | | | | | | | | | | | concept-based polymorphism in the pass manager to a separate header. I got feedback from someone reading the code and trying to use it that this was really making it hard to dive in and start using these APIs and that makes a lot of sense. This only requires a moderate amount of gymnastics to separate in this way, namely rinsing the PreservedAnalysis object through a template argument in a few places so that it is dependent and we only examine it on instantiation. llvm-svn: 225094
* [PM] Fix some formatting where clang-format has improved recently.Chandler Carruth2015-01-022-6/+7
| | | | llvm-svn: 225092
* Reformat statepoint documentation and fix a couple of typosPhilip Reames2015-01-021-87/+287
| | | | | | Patch by Ramkumar Ramachandra <artagnon@gmail.com>. llvm-svn: 225084
* Improved comments. No functional change intended.Andrea Di Biagio2015-01-021-2/+2
| | | | llvm-svn: 225080
* [X86] Bring some better consistency to the naming of the move to/from ↵Craig Topper2015-01-022-44/+41
| | | | | | %al/ax/eax/rax with memory offset. llvm-svn: 225078
* InstCombine: Detect when llvm.umul.with.overflow always overflowsDavid Majnemer2015-01-023-7/+31
| | | | | | | We know overflow always occurs if both ~LHSKnownZero * ~RHSKnownZero and LHSKnownOne * RHSKnownOne overflow. llvm-svn: 225077
* Analysis: Reformulate WillNotOverflowUnsignedMul for reusabilityDavid Majnemer2015-01-025-53/+54
| | | | | | | | WillNotOverflowUnsignedMul's smarts will live in ValueTracking as computeOverflowForUnsignedMul. It now returns a tri-state result: never overflows, always overflows and sometimes overflows. llvm-svn: 225076
* [X86] Make the instructions that use AdSize16/32/64 co-exist together ↵Craig Topper2015-01-029-179/+301
| | | | | | | | | | without using mode predicates. This is necessary to allow the disassembler to be able to handle AdSize32 instructions in 64-bit mode when address size prefix is used. Eventually we should probably also support 'addr32' and 'addr16' in the assembler to override the address size on some of these instructions. But for now we'll just use special operand types that will lookup the current mode size to select the right instruction. llvm-svn: 225075
* [SROA] Teach SROA to be more aggressive in splitting now that we haveChandler Carruth2015-01-024-35/+173
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | a pre-splitting pass over loads and stores. Historically, splitting could cause enough problems that I hamstrung the entire process with a requirement that splittable integer loads and stores must cover the entire alloca. All smaller loads and stores were unsplittable to prevent chaos from ensuing. With the new pre-splitting logic that does load/store pair splitting I introduced in r225061, we can now very nicely handle arbitrarily splittable loads and stores. In order to fully benefit from these smarts, we need to mark all of the integer loads and stores as splittable. However, we don't actually want to rewrite partitions with all integer loads and stores marked as splittable. This will fail to extract scalar integers from aggregates, which is kind of the point of SROA. =] In order to resolve this, what we really want to do is only do pre-splitting on the alloca slices with integer loads and stores fully splittable. This allows us to uncover all non-integer uses of the alloca that would benefit from a split in an integer load or store (and where introducing the split is safe because it is just memory transfer from a load to a store). Once done, we make all the non-whole-alloca integer loads and stores unsplittable just as they have historically been, repartition and rewrite. The result is that when there are integer loads and stores anywhere within an alloca (such as from a memcpy of a sub-object of a larger object), we can split them up if there are non-integer components to the aggregate hiding beneath. I've added the challenging test cases to demonstrate how this is able to promote to scalars even a case where we have even *partially* overlapping loads and stores. This restores the single-store behavior for small arrays of i8s which is really nice. I've restored both the little endian testing and big endian testing for these exactly as they were prior to r225061. It also forced me to be more aggressive in an alignment test to actually defeat SROA. =] Without the added volatiles there, we actually split up the weird i16 loads and produce nice double allocas with better alignment. This also uncovered a number of bugs where we failed to handle splittable load and store slices which didn't have a begininng offset of zero. Those fixes are included, and without them the existing test cases explode in glorious fireworks. =] I've kept support for leaving whole-alloca integer loads and stores as splittable even for the purpose of rewriting, but I think that's likely no longer needed. With the new pre-splitting, we might be able to remove all the splitting support for loads and stores from the rewriter. Not doing that in this patch to try to isolate any performance regressions that causes in an easy to find and revert chunk. llvm-svn: 225074
* [SROA] Make the computation of adjusted pointers not leak GEPChandler Carruth2015-01-021-10/+14
| | | | | | | | | | | | | | | | | | | | | | | | | instructions. I noticed this when working on dialing up how aggressively we can pre-split loads and stores. My test case wasn't passing because dead GEPs into the allocas persisted when they were built by this routine. This isn't terribly harmful, we still rewrote and promoted the alloca and I can't conceive of how to cause this to happen in a case where we will keep the exact same alloca but rewrite and promote the uses of it. If that ever happened, we'd get an assert out of mem2reg. So I don't have a direct test case yet, but the subsequent commit's test case wouldn't pass without this. There are other problems fixed by this patch that I spotted purely by inspection such as the fact that getAdjustedPtr could have actually deleted dead base pointers. I don't know how to get a base pointer to go into getAdjustedPtr today, so I think this bug could never have manifested (and I certainly can't write a test case for it) but, it wasn't the intent of the code. The code really just wanted to GC the new instructions built. That can be done more directly by comparing with the base pointer which is the only non-new instruction that this code can return. llvm-svn: 225073
* [SROA] Add a test case for r225068 / PR22080.Chandler Carruth2015-01-021-0/+36
| | | | llvm-svn: 225070
* [SROA] Fix the loop exit placement to be prior to indexing the splitsChandler Carruth2015-01-021-4/+8
| | | | | | | | array. This prevents it from walking out of bounds on the splits array. Bug found with the existing tests by ASan and by the MSVC debug build. llvm-svn: 225069
* [SROA] Fix two total think-os in r225061 that should have been caught onChandler Carruth2015-01-011-2/+2
| | | | | | | | | | | | | | | | | | | | | a +asserts bootstrap, but my bootstrap had asserts off. Oops. Anyways, in some places it is reasonable to cast (as a sanity check) the pointer operand to a load or store to an instruction within SROA -- namely when the pointer operand is expected to be derived from an alloca, and thus always an instruction. However, the pre-splitting code also deals with loads and stores to non-alloca pointers and there we need to just use the Value*. Nothing about the code relied on the instruction cast, it was only there essentially as an invariant assertion. Remove the two that don't actually hold. This should fix the proximate issue in PR22080, but I'm also doing an asserts bootstrap myself to see if there are other issues lurking. I'll craft a reduced test case in a moment, but I wanted to get the tree healthy as quickly as possible. llvm-svn: 225068
* [PowerPC] use UINT64_C instead of ulHal Finkel2015-01-011-4/+4
| | | | | | | Attempting to fix PR22078 (building on 32-bit systems) by replacing my careless use of 1ul to be a uint64_t constant with UINT64_C(1). llvm-svn: 225066
* Revert "Just use a using directive in SmallMapVector instead of inheriting ↵Michael Gottesman2015-01-011-3/+6
| | | | | | | | | from MapVector itself." This reverts commit r225059. I think MSVC 2012 has a problem with this. This is an attempt to fix one of the MSVC 2012 bots. llvm-svn: 225065
* Revert r225053: Add an ArrayRef upcasting constructor from ArrayRef<U*> -> ↵Chandler Carruth2015-01-012-45/+0
| | | | | | | | | | | ArrayRef<T*> where T is a base of U. This appears to have broken at least the windows build bots due to compile errors in the predicate that didn't simply supress the overload. I'm not sure what the fix is, and the bots have been broken for a long time now so I'm just reverting until Michael can figure out a fix. llvm-svn: 225064
* [SROA] Switch to using a more direct debug logging technique in one partChandler Carruth2015-01-011-4/+6
| | | | | | | | | | | | | | of my new load and store splitting, and fix a bug where it logged a totally irrelevant slice rather than the actual slice in question. The logging here previously worked because we used to place new slices onto the back of the core sequence, but that caused other problems. I updated the actual code to store new slices in their own vector but didn't update the logging. There isn't a good way to reuse the logging any more, and frankly it wasn't needed. We can directly log this bit more easily. llvm-svn: 225063
* [SROA] Fix formatting with clang-format which I managed to fail to doChandler Carruth2015-01-011-48/+48
| | | | | | prior to committing r225061. Sorry for that. llvm-svn: 225062
* [SROA] Teach SROA how to much more intelligently handle split loads andChandler Carruth2015-01-013-81/+520
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | stores. When there are accesses to an entire alloca with an integer load or store as well as accesses to small pieces of the alloca, SROA splits up the large integer accesses. In order to do that, it uses bit math to merge the small accesses into large integers. While this is effective, it produces insane IR that can cause significant problems in the rest of the optimizer: - It can cause load and store mismatches with GVN on the non-alloca side where we end up loading an i64 (or some such) rather than loading specific elements that are stored. - We can't always get rid of the integer bit math, which is why we can't always fix the loads and stores to work well with GVN. - This is especially bad when we have operations that mix poorly with integer bit math such as floating point operations. - It will block things like the vectorizer which might be able to handle the scalar stores that underly the aggregate. At the same time, we can't just directly split up these loads and stores in all cases. If there is actual integer arithmetic involved on the values, then using integer bit math is actually the perfect lowering because we can often combine it heavily with the surrounding math. The solution this patch provides is to find places where SROA is partitioning aggregates into small elements, and look for splittable loads and stores that it can split all the way to some other adjacent load and store. These are uniformly the cases where failing to split the loads and stores hurts the optimizer that I have seen, and I've looked extensively at the code produced both from more and less aggressive approaches to this problem. However, it is quite tricky to actually do this in SROA. We may have loads and stores to the same alloca, or other complex patterns that are hard to handle. This complexity leads to the somewhat subtle algorithm implemented here. We have to do this entire process as a separate pass over the partitioning of the alloca, and split up all of the loads prior to splitting the stores so that we can handle safely the cases of overlapping, including partially overlapping, loads and stores to the same alloca. We also have to reconstitute the post-split slice configuration so we can avoid iterating again over all the alloca uses (the slow part of SROA). But we also have to ensure that when we split up loads and stores to *other* allocas, we *do* re-iterate over them in SROA to adapt to the more refined partitioning now required. With this, I actually think we can fix a long-standing TODO in SROA where I avoided splitting as many loads and stores as probably should be splittable. This limitation historically mitigated the fallout of all the bad things mentioned above. Now that we have more intelligent handling, I plan to remove the FIXME and more aggressively mark integer loads and stores as splittable. I'll do that in a follow-up patch to help with bisecting any fallout. The net result of this change should be more fine-grained and accurate scalars being formed out of aggregates. At the very least, Clang now generates perfect code for this high-level test case using std::complex<float>: #include <complex> void g1(std::complex<float> &x, float a, float b) { x += std::complex<float>(a, b); } void g2(std::complex<float> &x, float a, float b) { x -= std::complex<float>(a, b); } void foo(const std::complex<float> &x, float a, float b, std::complex<float> &x1, std::complex<float> &x2) { std::complex<float> l1 = x; g1(l1, a, b); std::complex<float> l2 = x; g2(l2, a, b); x1 = l1; x2 = l2; } This code isn't just hypothetical either. It was reduced out of the hot inner loops of essentially every part of the Eigen math library when using std::complex<float>. Those loops would consistently and pervasively hop between the floating point unit and the integer unit due to bit math extraction and insertion of floating point values that were "stored" in a 64-bit integer register around the loop backedge. So far, this change has passed a bootstrap and I have done some other testing and so far, no issues. That doesn't mean there won't be though, so I'll be prepared to help with any fallout. If you performance swings in particular, please let me know. I'm very curious what all the impact of this change will be. Stay tuned for the follow-up to also split more integer loads and stores. llvm-svn: 225061
* Just use a using directive in SmallMapVector instead of inheriting from ↵Michael Gottesman2015-01-011-6/+3
| | | | | | MapVector itself. llvm-svn: 225059
* [PowerPC] Improve instruction selection bit-permuting operations (64-bit)Hal Finkel2015-01-013-97/+1107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the second installment of improvements to instruction selection for "bit permutation" instruction sequences. r224318 added logic for instruction selection for 32-bit bit permutation sequences, and this adds lowering for 64-bit sequences. The 64-bit sequences are more complicated than the 32-bit ones because: a) the 64-bit versions of the 32-bit rotate-and-mask instructions work by replicating the lower 32-bits of the value-to-be-rotated into the upper 32 bits -- and integrating this into the cost modeling for the various bit group operations is non-trivial b) unlike the 32-bit instructions in 32-bit mode, the rotate-and-mask instructions cannot, in one instruction, specify the mask starting index, the mask ending index, and the rotation factor. Also, forming arbitrary 64-bit constants is more complicated than in 32-bit mode because the number of instructions necessary is value dependent. Plus, support for 'late masking' was added: it is sometimes more efficient to treat the overall value as if it had no mandatory zero bits when planning the bit-group insertions, and then mask them in at the very end. Unfortunately, as the structure of the bit groups is different in the two cases, the more feasible implementation technique was to generate both instruction sequences, and then pick the shorter one. And finally, we now generate reasonable code for i64 bswap: rldicl 5, 3, 16, 0 rldicl 4, 3, 8, 0 rldicl 6, 3, 24, 0 rldimi 4, 5, 8, 48 rldicl 5, 3, 32, 0 rldimi 4, 6, 16, 40 rldicl 6, 3, 48, 0 rldimi 4, 5, 24, 32 rldicl 5, 3, 56, 0 rldimi 4, 6, 40, 16 rldimi 4, 5, 48, 8 rldimi 4, 3, 56, 0 vs. what we used to produce: li 4, 255 rldicl 5, 3, 24, 40 rldicl 6, 3, 40, 24 rldicl 7, 3, 56, 8 sldi 8, 3, 8 sldi 10, 3, 24 sldi 12, 3, 40 rldicl 0, 3, 8, 56 sldi 9, 4, 32 sldi 11, 4, 40 sldi 4, 4, 48 andi. 5, 5, 65280 andis. 6, 6, 255 andis. 7, 7, 65280 sldi 3, 3, 56 and 8, 8, 9 and 4, 12, 4 and 9, 10, 11 or 6, 7, 6 or 5, 5, 0 or 3, 3, 4 or 7, 9, 8 or 4, 6, 5 or 3, 3, 7 or 3, 3, 4 which is 12 instructions, instead of 25, and seems optimal (at least in terms of code size). llvm-svn: 225056
* Add 2x constructors for TinyPtrVector, one that takes in one elemenet and ↵Michael Gottesman2014-12-312-0/+33
| | | | | | | | | the other that takes in an ArrayRef<EltTy> Currently one can only construct an empty TinyPtrVector. These are just missing elements of the API. llvm-svn: 225055
* Add a SmallMapVector class that is a MapVector with a Map of SmallDenseMap ↵Michael Gottesman2014-12-312-0/+229
| | | | | | and a Vector of SmallVector. llvm-svn: 225054
* Add an ArrayRef upcasting constructor from ArrayRef<U*> -> ArrayRef<T*> ↵Michael Gottesman2014-12-312-0/+45
| | | | | | where T is a base of U. llvm-svn: 225053
* InstCombine: fsub nsz 0, X ==> fsub nsz -0.0, XSanjay Patel2014-12-312-0/+16
| | | | | | | | | | | | | Some day the backend may handle instruction-level fast math flags and make this transform unnecessary, but it's still better practice to use the canonical representation of fneg when possible (use a -0.0). This is a partial fix for PR20870 ( http://llvm.org/bugs/show_bug.cgi?id=20870 ). See also http://reviews.llvm.org/D6723. Differential Revision: http://reviews.llvm.org/D6731 llvm-svn: 225050
OpenPOWER on IntegriCloud