path: root/llvm/test/Transforms
* This should have been part of r224676. (David Majnemer, 2014-12-20, 1 file, -2/+2)
  llvm-svn: 224677
* InstCombine: Squash an icmp+select into bitwise arithmetic (David Majnemer, 2014-12-20, 1 file, -0/+33)
  (X & INT_MIN) == 0 ? X ^ INT_MIN : X  into  X | INT_MIN
  (X & INT_MIN) != 0 ? X ^ INT_MIN : X  into  X & INT_MAX
  This fixes PR21993.
  llvm-svn: 224676
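  A minimal IR sketch of the first pattern (i32, so INT_MIN is -2147483648;
  the function and value names are illustrative, not from the commit):

      define i32 @set_sign_bit(i32 %x) {
        %and = and i32 %x, -2147483648    ; isolate the sign bit
        %cmp = icmp eq i32 %and, 0        ; sign bit clear?
        %xor = xor i32 %x, -2147483648    ; set the (clear) sign bit
        %sel = select i1 %cmp, i32 %xor, i32 %x
        ; either arm is %x with the sign bit forced on, so this combines to:
        ;   %sel = or i32 %x, -2147483648
        ret i32 %sel
      }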
* InstSimplify: Optimize away pointless comparisons (David Majnemer, 2014-12-20, 1 file, -0/+76)
  (X & INT_MIN) ? X & INT_MAX : X  into  X & INT_MAX
  (X & INT_MIN) ? X : X & INT_MAX  into  X
  (X & INT_MIN) ? X | INT_MIN : X  into  X
  (X & INT_MIN) ? X : X | INT_MIN  into  X | INT_MIN
  llvm-svn: 224669
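  A sketch of the second pattern in IR (illustrative names): whichever way
  the sign-bit test goes, both select arms equal %x, so the select
  simplifies away.

      define i32 @fold_to_x(i32 %x) {
        %and = and i32 %x, -2147483648
        %cmp = icmp ne i32 %and, 0
        %msk = and i32 %x, 2147483647
        ; if the sign bit is set the result is %x; if it is clear then
        ; %x & INT_MAX == %x, so the select simplifies to %x
        %sel = select i1 %cmp, i32 %x, i32 %msk
        ret i32 %sel
      }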
* Reapply: [InstCombine] Fix visitSwitchInst to use right operand types for sub cstexpr (Bruno Cardoso Lopes, 2014-12-19, 1 file, -0/+30)
  visitSwitchInst generates SUB constant expressions to recompute the switch
  condition. When truncating the condition to a smaller type, SUB expressions
  should use the previous type (before trunc) for both operands. Also, fix
  the code to return the modified switch when only the truncation is
  performed. This fixes an assertion crash.
  Differential Revision: http://reviews.llvm.org/D6644
  rdar://problem/19191835
  llvm-svn: 224588
* use -0.0 when creating an fneg instruction (Sanjay Patel, 2014-12-19, 1 file, -1/+1)
  Backends recognize (-0.0 - X) as the canonical form for fneg and produce
  better code. E.g., ppc64 with 0.0:

      lis r2, ha16(LCPI0_0)
      lfs f0, lo16(LCPI0_0)(r2)
      fsubs f1, f0, f1
      blr

  vs. -0.0:

      fneg f1, f1
      blr

  Differential Revision: http://reviews.llvm.org/D6723
  llvm-svn: 224583
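  At the IR level, the canonical fneg of this era (which had no dedicated
  fneg instruction) is a subtraction from negative zero; a minimal sketch
  with an illustrative function name:

      define float @neg(float %x) {
        %fneg = fsub float -0.0, %x   ; backends lower this to a single fneg
        ret float %fneg
      }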
* Revert "[InstCombine] Fix visitSwitchInst to use right operand types for sub ↵Bruno Cardoso Lopes2014-12-191-30/+0
| | | | | | | | | | | | | cstexpr" Reverts commit r224574 to appease buildbots: The visitSwitchInst generates SUB constant expressions to recompute the switch condition. When truncating the condition to a smaller type, SUB expressions should use the previous type (before trunc) for both operands. This fixes an assertion crash. llvm-svn: 224576
* [InstCombine] Fix visitSwitchInst to use right operand types for sub cstexpr (Bruno Cardoso Lopes, 2014-12-19, 1 file, -0/+30)
  The visitSwitchInst generates SUB constant expressions to recompute the
  switch condition. When truncating the condition to a smaller type, SUB
  expressions should use the previous type (before trunc) for both operands.
  This fixes an assertion crash.
  Differential Revision: http://reviews.llvm.org/D6644
  rdar://problem/19191835
  llvm-svn: 224574
* ConstantFold: Shifting undef by zero results in undef (David Majnemer, 2014-12-18, 1 file, -0/+21)
  llvm-svn: 224553
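  A minimal sketch of the fold (illustrative): a shift by zero is the
  identity, so the undef input passes through rather than being refined.

      define i32 @shift_undef_by_zero() {
        %r = shl i32 undef, 0   ; constant-folds to undef
        ret i32 %r
      }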
* Revert 224119 "This patch recognizes (+ (+ v0, v1) (+ v2, v3)), reorders them for bundling into vector of loads, and vectorizes it." (Suyog Sarda, 2014-12-17, 1 file, -27/+0)
  This was re-ordering floating point data types, resulting in a mismatch in
  output.
  llvm-svn: 224424
* Added 5 more tests related to the sink-store change in revision 224247 (Elena Demikhovsky, 2014-12-17, 5 files, -0/+219)
  By Ella Bolshinsky
  http://reviews.llvm.org/D6420
  llvm-svn: 224418
* Strength reduce intrinsics with overflow into regular arithmetic operations if possible (Erik Eckstein, 2014-12-17, 1 file, -12/+91)
  Some intrinsics, like s/uadd.with.overflow and umul.with.overflow, are
  already strength reduced. This change adds the other arithmetic intrinsics:
  s/usub.with.overflow and smul.with.overflow. It completes the work on
  PR20194.
  llvm-svn: 224417
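  A hedged sketch of the kind of reduction this enables (the function name
  and the masking trick are illustrative assumptions, not from the commit):
  when analysis proves the operation cannot overflow, the intrinsic becomes
  a plain instruction plus a constant-false overflow flag.

      declare { i8, i1 } @llvm.ssub.with.overflow.i8(i8, i8)

      define { i8, i1 } @no_overflow(i8 %a) {
        %x = and i8 %a, 7   ; %x is known to be in [0, 7]
        %r = call { i8, i1 } @llvm.ssub.with.overflow.i8(i8 %x, i8 1)
        ; [0, 7] - 1 = [-1, 6] cannot overflow i8, so this can become
        ;   %sub = sub nsw i8 %x, 1   with the overflow bit folded to false
        ret { i8, i1 } %r
      }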
* InstSimplify: shl nsw/nuw undef, %V -> undef (David Majnemer, 2014-12-17, 1 file, -0/+28)
  We can always choose a value for undef that causes an important bit to be
  shifted out, except in one case: when %V is zero. However, shl behaves like
  an identity function when the right-hand side is zero, so the fold holds
  there too.
  llvm-svn: 224405
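  A minimal sketch (with nuw, any nonzero %v lets the chosen undef value
  shift out a required bit; for %v == 0 the shift returns the undef input):

      define i32 @shl_nuw_undef(i32 %v) {
        %r = shl nuw i32 undef, %v   ; simplifies to undef
        ret i32 %r
      }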
* Masked Load and Store Intrinsics in loop vectorizer. (Elena Demikhovsky, 2014-12-16, 1 file, -0/+420)
  The loop vectorizer optimizes loops containing conditional memory accesses
  by generating masked load and store intrinsics. This decision is target
  dependent.
  http://reviews.llvm.org/D6527
  llvm-svn: 224334
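  A hedged sketch of the output shape (the vector width, alignment, and
  names are illustrative assumptions): a conditional store such as
  "if (b[i] > 0) a[i] = b[i];" vectorizes into a compare whose result feeds
  the mask operand of a masked store intrinsic.

      ; %vb holds 8 consecutive elements of b
      %mask = icmp sgt <8 x i32> %vb, zeroinitializer
      call void @llvm.masked.store.v8i32(<8 x i32> %vb, <8 x i32>* %pa,
                                         i32 4, <8 x i1> %mask)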
* Teach ScalarEvolution to exploit min and max expressions when proving isKnownPredicate (Sanjoy Das, 2014-12-15, 1 file, -0/+453)
  The motivation for this change is to optimize away checks in loops like
  this:

      limit = min(t, len)
      for (i = 0 to limit)
        if (i >= len || i < 0) throw_array_out_of_bounds();
        a[i] = ...

  Differential Revision: http://reviews.llvm.org/D6635
  llvm-svn: 224285
* IR: Make metadata typeless in assembly (Duncan P. N. Exon Smith, 2014-12-15, 135 files, -1578/+1578)
  Now that `Metadata` is typeless, reflect that in the assembly. These are
  the matching assembly changes for the metadata/value split in r223802.
  - Only use the `metadata` type when referencing metadata from a call
    intrinsic -- i.e., only when it's used as a `Value`.
  - Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode` when
    referencing it from call intrinsics.
  So, assembly like this:

      define @foo(i32 %v) {
        call void @llvm.foo(metadata !{i32 %v}, metadata !0)
        call void @llvm.foo(metadata !{i32 7}, metadata !0)
        call void @llvm.foo(metadata !1, metadata !0)
        call void @llvm.foo(metadata !3, metadata !0)
        call void @llvm.foo(metadata !{metadata !3}, metadata !0)
        ret void, !bar !2
      }
      !0 = metadata !{metadata !2}
      !1 = metadata !{i32* @global}
      !2 = metadata !{metadata !3}
      !3 = metadata !{}

  turns into this:

      define @foo(i32 %v) {
        call void @llvm.foo(metadata i32 %v, metadata !0)
        call void @llvm.foo(metadata i32 7, metadata !0)
        call void @llvm.foo(metadata i32* @global, metadata !0)
        call void @llvm.foo(metadata !3, metadata !0)
        call void @llvm.foo(metadata !{!3}, metadata !0)
        ret void, !bar !2
      }
      !0 = !{!2}
      !1 = !{i32* @global}
      !2 = !{!3}
      !3 = !{}

  I wrote an upgrade script that handled almost all of the tests in llvm and
  many of the tests in cfe (even handling many `CHECK` lines). I've attached
  it (or will attach it in a moment if you're speedy) to PR21532 to help
  everyone update their out-of-tree testcases.
  This is part of PR21532.
  llvm-svn: 224257
* Added a test related to revision 224247 (Elena Demikhovsky, 2014-12-15, 1 file, -0/+43)
  llvm-svn: 224248
* Typo Correction in Test Case. NFC. (Suyog Sarda, 2014-12-15, 1 file, -1/+1)
  llvm-svn: 224244
* Reapply "[ARM] Combine base-updating/post-incrementing vector load/stores."Ahmed Bougacha2014-12-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | r223862 tried to also combine base-updating load/stores. r224198 reverted it, as "it created a regression on the test-suite on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order in which the words are shown." Reapply, with a fix to ignore non-normal load/stores. Truncstores are handled elsewhere (you can actually write a pattern for those, whereas for postinc loads you can't, since they return two values), but it should be possible to also combine extloads base updates, by checking that the memory (rather than result) type is of the same size as the addend. Original commit message: We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD when the base pointer is incremented after the load/store. We can do the same thing for generic load/stores. Note that we can only combine the first load/store+adds pair in a sequence (as might be generated for a v16f32 load for instance), because other combines turn the base pointer addition chain (each computing the address of the next load, from the address of the last load) into independent additions (common base pointer + this load's offset). Differential Revision: http://reviews.llvm.org/D6585 llvm-svn: 224203
* Revert "[ARM] Combine base-updating/post-incrementing vector load/stores."Renato Golin2014-12-131-2/+2
| | | | | | | | | This reverts commit r223862, as it created a regression on the test-suite on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order in which the words are shown. We'll investigate the issue and re-apply when safe. llvm-svn: 224198
* ValueTracking: Don't recurse too deeply in computeKnownBitsFromAssume (David Majnemer, 2014-12-12, 1 file, -0/+18)
  Respect the MaxDepth recursion limit; doing otherwise will trigger an
  assert in computeKnownBits. This fixes PR21891.
  llvm-svn: 224168
* This patch recognizes (+ (+ v0, v1) (+ v2, v3)), reorders them for bundling into vector of loads, and vectorizes it. (Suyog Sarda, 2014-12-12, 1 file, -0/+27)
  Test case:

      float hadd(float* a) {
        return (a[0] + a[1]) + (a[2] + a[3]);
      }

  AArch64 assembly before the patch:

      ldp s0, s1, [x0]
      ldp s2, s3, [x0, #8]
      fadd s0, s0, s1
      fadd s1, s2, s3
      fadd s0, s0, s1
      ret

  AArch64 assembly after the patch:

      ldp d0, d1, [x0]
      fadd v0.2s, v0.2s, v1.2s
      faddp s0, v0.2s
      ret

  Reviewed link: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141208/248531.html
  llvm-svn: 224119
* Fix another infinite loop in InstCombine (Steven Wu, 2014-12-12, 1 file, -0/+12)
  Summary: InstCombine infinite-loops on the testcase added here because it
  generates instructions that it can itself optimize. Fix by not optimizing
  frem if the optimized type is the same as the original type.
  rdar://problem/19150820
  Reviewers: majnemer
  Differential Revision: http://reviews.llvm.org/D6634
  llvm-svn: 224097
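  A hedged sketch of the frem shrinking the fix guards (names are
  illustrative; this shows the normal, type-changing case): when both
  operands are extensions from float, the frem can be performed in float.
  The fix bails out when the "shrunk" type would equal the original type,
  which is what caused the loop.

      define float @shrink_frem(float %a, float %b) {
        %ea = fpext float %a to double
        %eb = fpext float %b to double
        %r  = frem double %ea, %eb
        %t  = fptrunc double %r to float
        ; can safely become: %t = frem float %a, %b
        ret float %t
      }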
* [InstCombine][X86] Improved folding of calls to Intrinsic::x86_sse4a_insertqi. (Andrea Di Biagio, 2014-12-11, 1 file, -0/+27)
  This patch teaches the instruction combiner how to fold a call to
  'insertqi' if the 'length field' (3rd operand) is set to zero, and if the
  sum between field 'length' and 'bit index' (4th operand) is bigger than 64.
  From the AMD64 Architecture Programmer's Manual:
    1. If the sum of the bit index + length field is greater than 64, then
       the results are undefined;
    2. A value of zero in the field length is defined as a length of 64.
  This patch improves the existing combining logic for intrinsic 'insertqi',
  adding extra checks to address both point 1 and point 2.
  Differential Revision: http://reviews.llvm.org/D6583
  llvm-svn: 224054
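  A sketch of the point-1 case (the constants are chosen for illustration):
  a length of 16 at bit index 56 spills past bit 64, so the manual declares
  the result undefined and the call can be folded away.

      declare <2 x i64> @llvm.x86.sse4a.insertqi(<2 x i64>, <2 x i64>, i8, i8)

      define <2 x i64> @out_of_range(<2 x i64> %a, <2 x i64> %b) {
        ; length (16) + bit index (56) = 72 > 64
        %r = call <2 x i64> @llvm.x86.sse4a.insertqi(<2 x i64> %a,
                                                     <2 x i64> %b, i8 16, i8 56)
        ret <2 x i64> %r
      }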
* InstSimplify: Remove useless %a parameter from tests (David Majnemer, 2014-12-11, 1 file, -4/+4)
  No functional change intended.
  llvm-svn: 224016
* The inliner needs to fix up debug information for llvm.dbg.declare, not only for llvm.dbg.value. (Michael Kuperstein, 2014-12-11, 1 file, -0/+96)
  Patch by Amjad Aboud
  Differential Revision: http://reviews.llvm.org/D6525
  llvm-svn: 224015
* ConstantFold, InstSimplify: undef >>a x can be either -1 or 0, choose 0 (David Majnemer, 2014-12-10, 1 file, -2/+2)
  Zero is usually a nicer constant to have than -1.
  llvm-svn: 223969
* ConstantFold: an undef shift amount results in undef (David Majnemer, 2014-12-10, 1 file, -0/+21)
  X shifted by undef results in undef because the undef value can represent
  values greater than the width of the operands.
  llvm-svn: 223968
* ConstantFold: div undef, 0 should fold to undef, not zero (David Majnemer, 2014-12-10, 1 file, -0/+7)
  Dividing by zero yields an undefined value.
  llvm-svn: 223924
* InstSimplify: [al]shr exact undef, %X -> undef (David Majnemer, 2014-12-10, 1 file, -0/+14)
  Exact shifts always keep the non-zero bits of their input, which means
  they also keep its undef bits.
  llvm-svn: 223923
* InstSimplify: div %X, 0 -> undef (David Majnemer, 2014-12-10, 1 file, -0/+14)
  We already optimized rem %X, 0 to undef; we should do the same for div.
  llvm-svn: 223919
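  A minimal sketch of the fold (illustrative names):

      define i32 @div_by_zero(i32 %x) {
        %r = udiv i32 %x, 0   ; simplifies to undef, matching rem %X, 0
        ret i32 %r
      }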
* [ARM] Combine base-updating/post-incrementing vector load/stores. (Ahmed Bougacha, 2014-12-10, 1 file, -2/+2)
  We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD
  when the base pointer is incremented after the load/store. We can do the
  same thing for generic load/stores.
  Note that we can only combine the first load/store+adds pair in a sequence
  (as might be generated for a v16f32 load for instance), because other
  combines turn the base pointer addition chain (each computing the address
  of the next load, from the address of the last load) into independent
  additions (common base pointer + this load's offset).
  Differential Revision: http://reviews.llvm.org/D6585
  llvm-svn: 223862
* Revert r223764 which taught instcombine about integer-based element extraction patterns. (Chandler Carruth, 2014-12-09, 1 file, -210/+0)
  This is causing Clang to miscompile itself for 32-bit x86 somehow, and
  likely also on ARM and PPC. I really don't know how, but reverting now
  that I've confirmed this is actually the culprit. I have a reproduction as
  well and so should be able to restore this shortly.
  This reverts commit r223764. Original commit log follows:

  Teach instcombine to canonicalize "element extraction" from a load of an
  integer and "element insertion" into a store of an integer into actual
  element extraction, element insertion, and vector loads and stores.
  Previously various parts of LLVM (including instcombine itself) would
  introduce integer loads and stores into the code as a way of opaquely
  loading and storing "bits". In some cases (such as a memcpy of a
  std::complex<float> object) we will eventually end up using those bits in
  non-integer types. In order for SROA to effectively promote the allocas
  involved, it splits these "store a bag of bits" integer loads and stores
  up into the constituent parts. However, for non-alloca loads and stores
  which remain, it uses integer math to recombine the values into a large
  integer to load or store.

  All of this would be "fine", except that it forces LLVM to go through
  integer math to combine and split up values. While this makes perfect
  sense for integers (and in fact is critical for bitfields to end up
  lowering efficiently), it is *terrible* for non-integer types, especially
  floating point types. We have a much more canonical way of representing
  the act of concatenating the bits of two SSA values in LLVM: a vector and
  insertelement. This patch teaches InstCombine to use this representation.

  With this patch applied, LLVM will no longer introduce integer math into
  the critical path of every loop over std::complex<float> operations such
  as those that make up the hot path of ... oh, most HPC code, Eigen, and
  any other heavy linear algebra library.

  For the record, I looked *extensively* at fixing this in other parts of
  the compiler, but it just doesn't work:
  - We really do want to canonicalize memcpy and other bit-motion to integer
    loads and stores. SSA values are tremendously more powerful than "copy"
    intrinsics. Not doing this regresses massive amounts of LLVM's scalar
    optimizer.
  - We really do need to split up integer loads and stores of this form in
    SROA or every memcpy of a trivially copyable struct will prevent SSA
    formation of the members of that struct. It essentially turns off SROA.
  - The closest alternative is to actually split the loads and stores when
    partitioning with SROA, but this has all of the downsides historically
    discussed of splitting up loads and stores -- the wide-store information
    is fundamentally lost. We would also see performance regressions for
    bitfield-heavy code and other places where the integers aren't really
    intended to be split without seemingly arbitrary logic to treat integers
    totally differently.
  - We *can* effectively fix this in instcombine, so it isn't that hard of a
    choice to make IMO.
  llvm-svn: 223813
* IR: Split Metadata from Value (Duncan P. N. Exon Smith, 2014-12-09, 1 file, -5/+10)
  Split `Metadata` away from the `Value` class hierarchy, as part of
  PR21532. Assembly and bitcode changes are in the wings, but this is the
  bulk of the change for the IR C++ API.
  I have a follow-up patch prepared for `clang`. If this breaks other
  sub-projects, I apologize in advance :(. Help me compile it on Darwin and
  I'll try to fix it. FWIW, the errors should be easy to fix, so it may be
  simpler to just fix it yourself.
  This breaks the build for all metadata-related code that's out-of-tree.
  Rest assured the transition is mechanical and the compiler should catch
  almost all of the problems.
  Here's a quick guide for updating your code:
  - `Metadata` is the root of a class hierarchy with three main classes:
    `MDNode`, `MDString`, and `ValueAsMetadata`. It is distinct from the
    `Value` class hierarchy. It is typeless -- i.e., instances do *not*
    have a `Type`.
  - `MDNode`'s operands are all `Metadata *` (instead of `Value *`).
  - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be replaced
    with `TrackingMDNodeRef` and `TrackingMDRef`, respectively. If you're
    referring solely to resolved `MDNode`s -- post graph construction --
    just use `MDNode*`.
  - `MDNode` (and the rest of `Metadata`) have only limited support for
    `replaceAllUsesWith()`. As long as an `MDNode` is pointing at a forward
    declaration -- the result of `MDNode::getTemporary()` -- it maintains a
    side map of its uses and can RAUW itself. Once the forward declarations
    are fully resolved, RAUW support is dropped on the ground. This means
    that uniquing collisions on changing operands cause nodes to become
    "distinct". (This already happened fairly commonly, whenever an operand
    went to null.) If you're constructing complex (non self-reference)
    `MDNode` cycles, you need to call `MDNode::resolveCycles()` on each node
    (or on a top-level node that somehow references all of the nodes). Also,
    don't do that. Metadata cycles (and the RAUW machinery needed to
    construct them) are expensive.
  - An `MDNode` can only refer to a `Constant` through a bridge called
    `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`). As a
    side effect, accessing an operand of an `MDNode` that is known to be,
    e.g., `ConstantInt` takes three steps: first, cast from `Metadata` to
    `ConstantAsMetadata`; second, extract the `Constant`; third, cast down
    to `ConstantInt`. The eventual goal is to introduce
    `MDInt`/`MDFloat`/etc. and have metadata schema owners transition away
    from using `Constant`s when the type isn't important (and they don't
    care about referring to `GlobalValue`s). In the meantime, I've added
    transitional API to the `mdconst` namespace that matches semantics with
    the old code, in order to avoid adding the error-prone three-step
    equivalent to every call site. If your old code was:

        MDNode *N = foo();
        bar(isa<ConstantInt>(N->getOperand(0)));
        baz(cast<ConstantInt>(N->getOperand(1)));
        bak(cast_or_null<ConstantInt>(N->getOperand(2)));
        bat(dyn_cast<ConstantInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4)));

    you can trivially match its semantics with:

        MDNode *N = foo();
        bar(mdconst::hasa<ConstantInt>(N->getOperand(0)));
        baz(mdconst::extract<ConstantInt>(N->getOperand(1)));
        bak(mdconst::extract_or_null<ConstantInt>(N->getOperand(2)));
        bat(mdconst::dyn_extract<ConstantInt>(N->getOperand(3)));
        bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4)));

    and when you transition your metadata schema to `MDInt`:

        MDNode *N = foo();
        bar(isa<MDInt>(N->getOperand(0)));
        baz(cast<MDInt>(N->getOperand(1)));
        bak(cast_or_null<MDInt>(N->getOperand(2)));
        bat(dyn_cast<MDInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<MDInt>(N->getOperand(4)));

  - A `CallInst` -- specifically, intrinsic instructions -- can refer to
    metadata through a bridge called `MetadataAsValue`. This is a subclass
    of `Value` where `getType()->isMetadataTy()`. `MetadataAsValue` is the
    *only* class that can legally refer to a `LocalAsMetadata`, which is a
    bridged form of non-`Constant` values like `Argument` and `Instruction`.
    It can also refer to any other `Metadata` subclass.
  (I'll break all your testcases in a follow-up commit, when I propagate
  this change to assembly.)
  llvm-svn: 223802
* Removal Of Duplicate Test Cases and Addition Of Missing Check Statements (Sonam Kumari, 2014-12-09, 1 file, -21/+15)
  llvm-svn: 223768
* [test/Transforms/InstCombine/shift.ll] Removed duplicate test cases. NFC. (Ankur Garg, 2014-12-09, 1 file, -35/+17)
  Removed some duplicate test cases from test/Transforms/InstCombine/shift.ll.
  test54 and test57 were duplicates of each other; test55 and test58 were
  duplicates of each other. (Removed test57 and test58.)
  llvm-svn: 223767
* Teach instcombine to canonicalize "element extraction" from a load of an integer and "element insertion" into a store of an integer into actual element extraction, element insertion, and vector loads and stores. (Chandler Carruth, 2014-12-09, 1 file, -0/+210)
  Previously various parts of LLVM (including instcombine itself) would
  introduce integer loads and stores into the code as a way of opaquely
  loading and storing "bits". In some cases (such as a memcpy of a
  std::complex<float> object) we will eventually end up using those bits in
  non-integer types. In order for SROA to effectively promote the allocas
  involved, it splits these "store a bag of bits" integer loads and stores
  up into the constituent parts. However, for non-alloca loads and stores
  which remain, it uses integer math to recombine the values into a large
  integer to load or store.

  All of this would be "fine", except that it forces LLVM to go through
  integer math to combine and split up values. While this makes perfect
  sense for integers (and in fact is critical for bitfields to end up
  lowering efficiently), it is *terrible* for non-integer types, especially
  floating point types. We have a much more canonical way of representing
  the act of concatenating the bits of two SSA values in LLVM: a vector and
  insertelement. This patch teaches InstCombine to use this representation.

  With this patch applied, LLVM will no longer introduce integer math into
  the critical path of every loop over std::complex<float> operations such
  as those that make up the hot path of ... oh, most HPC code, Eigen, and
  any other heavy linear algebra library.

  For the record, I looked *extensively* at fixing this in other parts of
  the compiler, but it just doesn't work:
  - We really do want to canonicalize memcpy and other bit-motion to integer
    loads and stores. SSA values are tremendously more powerful than "copy"
    intrinsics. Not doing this regresses massive amounts of LLVM's scalar
    optimizer.
  - We really do need to split up integer loads and stores of this form in
    SROA or every memcpy of a trivially copyable struct will prevent SSA
    formation of the members of that struct. It essentially turns off SROA.
  - The closest alternative is to actually split the loads and stores when
    partitioning with SROA, but this has all of the downsides historically
    discussed of splitting up loads and stores -- the wide-store information
    is fundamentally lost. We would also see performance regressions for
    bitfield-heavy code and other places where the integers aren't really
    intended to be split without seemingly arbitrary logic to treat integers
    totally differently.
  - We *can* effectively fix this in instcombine, so it isn't that hard of a
    choice to make IMO.
  Differential Revision: http://reviews.llvm.org/D6548
  llvm-svn: 223764
* InstSimplify: Try to bring back the rest of r223583 (David Majnemer, 2014-12-08, 1 file, -0/+9)
  This reverts r223624 with a small tweak; hopefully this will make stage3
  equivalent.
  llvm-svn: 223679
* Removal Of Duplicate Test Case from shift.ll file (Sonam Kumari, 2014-12-08, 1 file, -1/+1)
  llvm-svn: 223648
* Revert a part of r223583, for now. It seems to cause different emission between stage2 (gcc-clang) and stage3 clang. Investigating. (NAKAMURA Takumi, 2014-12-08, 1 file, -9/+0)
  llvm-svn: 223624
* InstSimplify: Optimize away useless unsigned comparisons (David Majnemer, 2014-12-06, 1 file, -0/+48)
  Code like X < Y && Y == 0 should always be folded away to false.
  llvm-svn: 223583
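  A minimal sketch (the comparison is unsigned, as in the folded case):
  %x <u %y already implies %y != 0, so the conjunction is always false.

      define i1 @always_false(i32 %x, i32 %y) {
        %lt  = icmp ult i32 %x, %y
        %eq  = icmp eq i32 %y, 0
        %and = and i1 %lt, %eq   ; simplifies to false
        ret i1 %and
      }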
* IR: Disallow complicated function-local metadata (Duncan P. N. Exon Smith, 2014-12-06, 1 file, -2/+3)
  Disallow complex types of function-local metadata. The only valid
  function-local metadata is an `MDNode` whose sole argument is a
  non-metadata function-local value.
  Part of PR21532.
  llvm-svn: 223564
* Add some tests for SimplifyCFG's TurnSwitchRangeIntoICmp(). NFC. (Hans Wennborg, 2014-12-04, 1 file, -0/+50)
  llvm-svn: 223396
* Add some tests for SimplifyCFG's ConstantFoldTerminator(). NFC. (Hans Wennborg, 2014-12-04, 1 file, -0/+64)
  llvm-svn: 223395
* Add a test case for argument type coercion in an invoke of a vararg function (Philip Reames, 2014-12-04, 1 file, -0/+20)
  This would have caught the bug I fixed in 223370.
  llvm-svn: 223378
* Revert "r223364 - Revert r223347 which has caused crashes on bootstrap bots."Hal Finkel2014-12-041-1/+152
| | | | | | | | | | | | | | | | | | | | Reapply r223347, with a fix to not crash on uninserted instructions (or more precisely, instructions in uninserted blocks). bugpoint was able to reduce the test case somewhat, but it is still somewhat large (and relies on setting things up to be simplified during inlining), so I've not included it here. Nevertheless, it is clear what is going on and why. Original commit message: Restrict somewhat the memory-allocation pointer cmp opt from r223093 Based on review comments from Richard Smith, restrict this optimization from applying to globals that might resolve lazily to other dynamically-loaded modules, and also from dynamic allocas (which might be transformed into malloc calls). In short, take extra care that the compared-to pointer is really simultaneously live with the memory allocation. llvm-svn: 223371
* Revert r223347 which has caused crashes on bootstrap bots. (Alexander Potapenko, 2014-12-04, 1 file, -152/+1)
  llvm-svn: 223364
* [InstCombine] Minor optimization for bswap with binary ops (Simon Pilgrim, 2014-12-04, 1 file, -15/+169)
  Added instcombine optimizations for BSWAP with AND/OR/XOR ops:

      OP( BSWAP(x), BSWAP(y) ) -> BSWAP( OP(x, y) )
      OP( BSWAP(x), CONSTANT ) -> BSWAP( OP(x, BSWAP(CONSTANT) ) )

  Since it's just a one-liner, I've also added BSWAP to the DAGCombiner
  equivalent as well:

      fold (OP (bswap x), (bswap y)) -> (bswap (OP x, y))

  Refactored bswap-fold tests to use FileCheck instead of just checking that
  the bswaps had gone.
  Differential Revision: http://reviews.llvm.org/D6407
  llvm-svn: 223349
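  A minimal IR sketch of the first pattern (illustrative names): bswap is a
  pure byte permutation, so it commutes with bitwise operations.

      declare i32 @llvm.bswap.i32(i32)

      define i32 @fold_xor_bswap(i32 %x, i32 %y) {
        %bx = call i32 @llvm.bswap.i32(i32 %x)
        %by = call i32 @llvm.bswap.i32(i32 %y)
        %r  = xor i32 %bx, %by
        ; combines to: %t = xor i32 %x, %y
        ;              %r = call i32 @llvm.bswap.i32(i32 %t)
        ret i32 %r
      }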
* Restrict somewhat the memory-allocation pointer cmp opt from r223093 (Hal Finkel, 2014-12-04, 1 file, -1/+152)
  Based on review comments from Richard Smith, restrict this optimization
  from applying to globals that might resolve lazily to other
  dynamically-loaded modules, and also from dynamic allocas (which might be
  transformed into malloc calls). In short, take extra care that the
  compared-to pointer is really simultaneously live with the memory
  allocation.
  llvm-svn: 223347
* [SimplifyLibCalls] Improve double->float shrinking to consider constants (Matthias Braun, 2014-12-03, 1 file, -0/+24)
  This allows cases like

      float x; fmin(1.0, x);

  to be optimized to

      fminf(1.0f, x);

  rdar://19049359
  Differential Revision: http://reviews.llvm.org/D6496
  llvm-svn: 223270
* [SimplifyLibCalls] Enable double to float shrinking for copysign (Matthias Braun, 2014-12-03, 1 file, -0/+14)
  rdar://19049359
  Differential Revision: http://reviews.llvm.org/D6495
  llvm-svn: 223269