summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* move a single test case to where most other instcombine shuffle bug test ↵Sanjay Patel2015-11-212-16/+15
| | | | | | cases exist llvm-svn: 253784
* Move free-zext.ll to llvm/test/Transforms/CodeGenPrepare/AArch64/NAKAMURA Takumi2015-11-201-0/+0
| | | | llvm-svn: 253730
* Fix another infinite loop in Reassociate caused by Constant::isZero().Owen Anderson2015-11-201-0/+13
| | | | | | Not all zero vectors are ConstantDataVector's. llvm-svn: 253723
* [CodeGenPrepare] Create more extloads and fewer andsGeoff Berry2015-11-201-0/+82
| | | | | | | | | | | | | | | Summary: Add and instructions immediately after loads that only have their low bits used, assuming that the (and (load x) c) will be matched as a extload and the ands/truncs fed by the extload will be removed by isel. Reviewers: mcrosier, qcolombet, ab Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14584 llvm-svn: 253722
* SamplePGO - Tweak RUN command for a test. NFC.Diego Novillo2015-11-201-1/+1
| | | | llvm-svn: 253717
* SamplePGO - Do not count never-executed inlined functions when computing ↵Diego Novillo2015-11-202-0/+152
| | | | | | | | | | | | | | | coverage. If a function was originally inlined but not actually hot at runtime, its samples will not be counted inside the parent function. This throws off the coverage calculation because it expects to find more used records than it should. Fixed by ignoring functions that will not be inlined into the parent. Currently, this is inlined functions with 0 samples. In subsequent patches, I'll change this to mean "cold" functions. llvm-svn: 253716
* SamplePGO - Add line offset and discriminator information to sample reports.Diego Novillo2015-11-201-4/+4
| | | | | | | | | While debugging some sampling coverage problems, I found this useful: When applying samples from a profile, it helps to also know what line offset and discriminator the sample belongs to. This makes it easy to correlate against the input profile. llvm-svn: 253670
* Fix a pair of issues that caused an infinite loop in reassociate.Owen Anderson2015-11-201-0/+20
| | | | | | | | Terrifyingly, one of them is a mishandling of floating point vectors in Constant::isZero(). How exactly this issue survived this long is beyond me. llvm-svn: 253655
* ScalarEvolution: do not set nuw when creating exprs of form <expr> + <all-ones>.Peter Collingbourne2015-11-201-0/+49
| | | | | | | | | | | The nuw constraint will not be satisfied unless <expr> == 0. This bug has been around since r102234 (in 2010!), but was uncovered by r251052, which introduced more aggressive optimization of nuw scev expressions. Differential Revision: http://reviews.llvm.org/D14850 llvm-svn: 253627
* [InstCombine] add tests to show missing trunc optimizationsSanjay Patel2015-11-191-0/+46
| | | | llvm-svn: 253609
* [InstCombine] add tests to show missing bitcast optimizationsSanjay Patel2015-11-191-0/+29
| | | | llvm-svn: 253602
* Reimplement discriminator assignment algorithm.Dehao Chen2015-11-192-4/+76
| | | | | | | | | | | | Summary: The new algorithm is more efficient (O(n), n is number of basic blocks). And it is guaranteed to cover all cases of multiple BB mapped to same line. Reviewers: dblaikie, davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14738 llvm-svn: 253594
* [GlobalOpt] Localize some globals that have non-instruction usersJames Molloy2015-11-191-0/+28
| | | | | | We currently bail out of global localization if the global has non-instruction users. However, often these can be simple bitcasts or constant-GEPs, which we can easily turn into instructions before localizing. Be a bit more aggressive. llvm-svn: 253584
* this new test file was accidentally left out of r253573Sanjay Patel2015-11-191-0/+56
| | | | llvm-svn: 253574
* [FunctionAttrs] Provide a mechanism for adding function attributes from the ↵James Molloy2015-11-191-0/+12
| | | | | | | | | | | | command line This provides a way to force a function to have certain attributes from the command line. This can be useful when debugging or doing workload exploration, where manually editing IR is tedious or not possible (due to build systems etc). The syntax is -force-attribute=function_name:attribute_name All function attributes are parsed except alignstack as it requires an argument. llvm-svn: 253550
* Pointers in Masked Load, Store, Gather, Scatter intrinsicsElena Demikhovsky2015-11-191-0/+142
| | | | | | | | | | The masked intrinsics support all integer and floating point data types. I added the pointer type to this list. Added tests for CodeGen and for Loop Vectorizer. Updated the Language Reference. Differential Revision: http://reviews.llvm.org/D14150 llvm-svn: 253544
* Revert "Change memcpy/memset/memmove to have dest and source alignments."Pete Cooper2015-11-19102-653/+648
| | | | | | | | | | This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543
* Fix bug 25440: GVN assertion after coercing loadsWeiming Zhao2015-11-191-0/+108
| | | | | | | | | | | | | | | | Optimizations like LoadPRE in GVN will insert new instructions. If the insertion point is in a already processed BB, they should get a value number explicitly. If the insertion point is after current instruction, then just leave it. However, current GVN framework has no support for it. In this patch, we just bail out if a VN can't be found. Dfferential Revision: http://reviews.llvm.org/D14670 A test/Transforms/GVN/pr25440.ll M lib/Transforms/Scalar/GVN.cpp llvm-svn: 253536
* [SimplifyLibCalls] New trick: pow(x, 0.5) -> sqrt(x) under -ffast-math.Davide Italiano2015-11-181-0/+15
| | | | | | Differential Revision: http://reviews.llvm.org/D14466 llvm-svn: 253521
* Change memcpy/memset/memmove to have dest and source alignments.Pete Cooper2015-11-18102-648/+653
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511
* Disable gvn non-local speculative loads under asan.Mike Aizatsky2015-11-181-0/+57
| | | | | | | | Summary: Fix for https://llvm.org/bugs/show_bug.cgi?id=25550 Differential Revision: http://reviews.llvm.org/D14763 llvm-svn: 253498
* Revert "Revert "Strip metadata when speculatively hoisting instructions ↵Igor Laevsky2015-11-182-0/+70
| | | | | | | | (r252604)" Failing clang test is now fixed by the r253458. llvm-svn: 253459
* Teach the inliner to track deoptimization stateSanjoy Das2015-11-181-0/+97
| | | | | | | | | | | | | | | | Summary: This change teaches LLVM's inliner to track and suitably adjust deoptimization state (tracked via deoptimization operand bundles) as it inlines through call sites. The operation is described in more detail in the LangRef changes. Reviewers: reames, majnemer, chandlerc, dexonsmith Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14552 llvm-svn: 253438
* Add a test for r253323David Majnemer2015-11-181-0/+47
| | | | | | Forgot to do this simultaneously with committing the fix. llvm-svn: 253430
* [EH] Keep filter clauses for types that have been caught.Andrew Kaylor2015-11-171-4/+7
| | | | | | | | The instruction combiner previously removed types from filter clauses in Landing Pad instructions if the type had previously been seen in a catch clause. This is incorrect and prevents unexpected exception handlers from rethrowing the caught type. Differential Revision: http://reviews.llvm.org/D14669 llvm-svn: 253370
* Vector of pointers in function attributes calculationElena Demikhovsky2015-11-171-0/+38
| | | | | | | | | While setting function attributes we check all instructions that may access memory. For a call instruction we check all arguments. The special check is required for pointers. I added vector-of-pointers to the call arguments types that should be checked. Differential Revision: http://reviews.llvm.org/D14693 llvm-svn: 253363
* [PRE] Preserve !invariant.load metadataPhilip Reames2015-11-171-0/+17
| | | | | | Spoted via inspection. Test case included. llvm-svn: 253275
* Add intermediate subtract instructions to reassociation worklist.Owen Anderson2015-11-168-17/+48
| | | | | | | | | | We sometimes create intermediate subtract instructions during reassociation. Adding these to the worklist to revisit exposes many additional reassociation opportunities. Patch by Aditya Nandakumar. llvm-svn: 253240
* [LoopStrengthReduce] Don't increment iterator past the end of the BBDavid Majnemer2015-11-161-0/+51
| | | | | | | | | | | We tried to move the insertion point beyond instructions like landingpad and cleanuppad. However, we *also* tried to move past catchpad. This is problematic because catchpad is also a terminator. This fixes PR25541. llvm-svn: 253238
* Don't generate discriminators for calls to debug intrinsicsPavel Labath2015-11-161-0/+30
| | | | | | | | | | | | | | Summary: This fails a check in Verifier.cpp, which checks for location matches between the declared variable and the !dbg attachments. Reviewers: dnovillo, dblaikie, danielcdh Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14657 llvm-svn: 253194
* [Sink] Don't move landingpadsKeno Fischer2015-11-161-0/+33
| | | | | | | | | | | | | | | Summary: Moving landingpads into successor basic blocks makes the verifier sad. Teach Sink that much like PHI nodes and terminator instructions, landingpads (and cleanuppads, etc.) may not be moved between basic blocks. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14475 llvm-svn: 253182
* [GlobalOpt] Demote globals to locals more aggressivelyJames Molloy2015-11-153-2/+82
| | | | | | | | | | | | | | | | Global to local demotion can speed up programs that use globals a lot. It is particularly useful with LTO, when the entire call graph is known and most functions have been internalized. For a global to be demoted, it must only be accessed by one function and that function: 1. Must never recurse directly or indirectly, else the GV would be clobbered. 2. Must never rely on the value in GV at the start of the function (apart from the initializer). GlobalOpt can already do this, but it is hamstrung and only ever tries to demote globals inside "main", because C++ gives extra guarantees about how main is called - once and only once. In LTO mode, we can often prove the first property (if the function is internal by this point, we know enough about the callgraph to determine if it could possibly recurse). FunctionAttrs now infers the "norecurse" attribute for this reason. The second property can be proven for a subset of functions by proving that all loads from GV are dominated by a store to GV. This is conservative in the name of compile time - this only requires a DominatorTree which is fairly cheap in the grand scheme of things. We could do more fancy stuff with MemoryDependenceAnalysis too to catch more cases but this appears to catch most of the useful ones in my testing. llvm-svn: 253168
* Fixed GEP visitor in the InstCombine pass.Elena Demikhovsky2015-11-151-0/+23
| | | | | | | | | | | | | The current implementation of GEP visitor in InstCombine fails with assertion on Vector GEP with mix of scalar and vector types, like this: getelementptr double, double* %a, <8 x i32> %i (It fails to create a "sext" from <8 x i32> to <8 x i64>) I fixed it and added some tests. Differential Revision: http://reviews.llvm.org/D14485 llvm-svn: 253162
* Don't recompute LCSSA after loop-unrolling when possible.Michael Zolotukhin2015-11-141-0/+119
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently we always recompute LCSSA for outer loops after unrolling an inner loop. That leads to compile time problem when we have big loop nests, and we can solve it by avoiding unnecessary work. For instance, if w eonly do partial unrolling, we don't break LCSSA, so we don't need to rebuild it. Also, if all exits from the inner loop are inside the enclosing loop, then complete unrolling won't break LCSSA either. I replaced unconditional LCSSA recomputation with conditional recomputation + unconditional assert and added several tests, which were failing when I experimented with it. Soon I plan to follow up with a similar patch for recalculation of dominators tree. Reviewers: hfinkel, dexonsmith, bogner, joker.eph, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14526 llvm-svn: 253126
* [LIR] Add support for creating memcpys from loops with a negative stride.Chad Rosier2015-11-131-2/+29
| | | | | | | | | | | | | | | This allows us to transform the below loop into a memcpy. void test(unsigned *__restrict__ a, unsigned *__restrict__ b) { for (int i = 2047; i >= 0; --i) { a[i] = b[i]; } } This is the memcpy version of r251518, which added support for memset with negative strided loops. llvm-svn: 253091
* [safestack] Rewrite isAllocaSafe using SCEV.Evgeniy Stepanov2015-11-134-4/+264
| | | | | | | | | | | | | | | Use ScalarEvolution to calculate memory access bounds. Handle function calls based on readnone/nocapture attributes. Handle memory intrinsics with constant size. This change improves both recall and precision of IsAllocaSafe. See the new tests (ex. BitCastWide) for the kind of code that was wrongly classified as safe. SCEV efficiency seems to be limited by the fact the SafeStack runs late (in CodeGenPrepare), and many loops are unrolled or otherwise not in LCSSA. llvm-svn: 253083
* [llvm-profdata] Add check for text profile formats and improve error ↵Nathan Slingerland2015-11-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | reporting (2nd try) Summary: This change addresses two possible instances of user error / confusion when merging sampled profile data. Previously any input that didn't match the raw or processed instrumented format would automatically be interpreted as instrumented profile text format data. No error would be reported during the merge. Example: If foo-sampled.profdata and bar-sampled.profdata are binary sampled profiles: Old behavior: $ llvm-profdata merge foo-sampled.profdata bar-sampled.profdata -output foobar-sampled.profdata $ llvm-profdata show -sample foobar-sampled.profdata error: foobar-sampled.profdata:1: Expected 'mangled_name:NUM:NUM', found lprofi This change adds basic checks for valid input data when assuming text input. It also makes error messages related to file format validity more specific about the assumbed profile data type. New behavior: $ llvm-profdata merge foo-sampled.profdata bar-sampled.profdata -o foobar-sampled.profdata error: foo.profdata: Unrecognized instrumentation profile encoding format Perhaps you forgot to use the -sample option? Reviewers: bogner, davidxl, dnovillo Subscribers: davidxl, llvm-commits Differential Revision: http://reviews.llvm.org/D14558 llvm-svn: 253009
* Revert "Fix bug 25440: GVN assertion after coercing loads"Tobias Grosser2015-11-121-66/+0
| | | | | | This reverts 252919 which broke LNT: MultiSource/Applications/SPASS llvm-svn: 252936
* Fix bug 25440: GVN assertion after coercing loadsWeiming Zhao2015-11-121-0/+66
| | | | | | | | | | | | | | | | Summary: when coercing loads, it inserts some instructions, which have no GV assigned. https://llvm.org/bugs/show_bug.cgi?id=25440 Reviewers: hfinkel, dberlin Subscribers: dberlin, llvm-commits Differential Revision: http://reviews.llvm.org/D14479 llvm-svn: 252919
* [InstCombine] Add trivial folding (bitreverse (bitreverse x)) -> xJames Molloy2015-11-121-0/+11
| | | | | | There are plenty more instcombines we could probably do with bitreverse, but this seems like a very obvious and trivial starting point and was brought up by Hal in his review. llvm-svn: 252879
* Revert "Revert "[FunctionAttrs] Identify norecurse functions""James Molloy2015-11-125-7/+69
| | | | | | This reapplies this patch, with test fixes. llvm-svn: 252871
* Revert "[FunctionAttrs] Identify norecurse functions"James Molloy2015-11-125-69/+7
| | | | | | This reverts commit r252862. This introduced test failures and I'm reverting while I investigate how this happened. llvm-svn: 252863
* [FunctionAttrs] Identify norecurse functionsJames Molloy2015-11-125-7/+69
| | | | | | | | | | | | | A function can be marked as norecurse if: * The SCC to which it belongs has cardinality 1; and either a) It does not call any non-norecurse function. This includes self-recursion; or b) It only has one callsite and the function that callsite is within is marked norecurse. a) is best propagated bottom-up and b) is best propagated top-down. We build up the norecurse attributes bottom-up using the existing SCC pass, and mark functions with no obvious recursion (but not provably norecurse) to sweep later, top-down. llvm-svn: 252862
* SamplePGO - Fix PR 25482 - Do not rely on llvm.dbg.cu for discriminatorsDiego Novillo2015-11-116-5/+29
| | | | | | | | | | | | | | | The discriminators pass relied on the presence of llvm.dbg.cu to decide whether to add discriminators, but this fails in the case where debug info is only enabled partially when -fprofile-sample-use is active. The reason llvm.dbg.cu is not present in these cases is to prevent codegen from emitting debug info (as it is only used for the sample profile pass). This changes the discriminators pass to also emit discriminators even when debug info is not being emitted. llvm-svn: 252763
* [MIPS] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()Sanjay Patel2015-11-112-0/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MIPS32 has instructions for efficient count-leading/trailing-zeros, so this should be considered a cheap operation (and therefore fair game for speculation) for any MIPS32 implementation. The net result of allowing this speculation for the regression tests in this patch is that we get this code: ctlz: jr $ra clz $2, $4 cttz: addiu $1, $4, -1 not $2, $4 and $1, $2, $1 clz $1, $1 addiu $2, $zero, 32 jr $ra subu $2, $2, $1 Instead of: ctlz: beqz $4, $BB0_2 addiu $2, $zero, 32 clz $2, $4 $BB0_2: jr $ra nop cttz: beqz $4, $BB1_2 addiu $2, $zero, 32 addiu $1, $4, -1 not $2, $4 and $1, $2, $1 clz $1, $1 addiu $2, $zero, 32 subu $2, $2, $1 $BB1_2: jr $ra nop See D14469 for the larger motivation. Differential Revision: http://reviews.llvm.org/D14500 llvm-svn: 252755
* Sort the enums in Attributes.h in case insensitive alphabetical order.Akira Hatanaka2015-11-113-3/+3
| | | | | | | | | Sort the enums in preparation for moving the attributes to a table-gen file. rdar://problem/19836465 llvm-svn: 252692
* [ValueTracking] Teach isImpliedCondition a new bitwise trickSanjoy Das2015-11-101-0/+78
| | | | | | | | | | | | | | | | | | | Summary: This change teaches isImpliedCondition to prove things like (A | 15) < L ==> (A | 14) < L if the low 4 bits of A are known to be zero. Depends on D14391 Reviewers: majnemer, reames, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14392 llvm-svn: 252673
* [PowerPC] Add an MI SSA peephole pass.Bill Schmidt2015-11-101-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a pass for doing PowerPC peephole optimizations at the MI level while the code is still in SSA form. This allows for easy modifications to the instructions while depending on a subsequent pass of DCE. Both passes are very fast due to the characteristics of SSA. At this time, the only peepholes added are for cleaning up various redundancies involving the XXPERMDI instruction. However, I would expect this will be a useful place to add more peepholes for inefficiencies generated during instruction selection. The pass is placed after VSX swap optimization, as it is best to let that pass remove unnecessary swaps before performing any remaining clean-ups. The utility of these clean-ups are demonstrated by changes to four existing test cases, all of which now have tighter expected code generation. I've also added Eric Schweiz's bugpoint-reduced test from PR25157, for which we now generate tight code. One other test started failing for me, and I've fixed it (test/Transforms/PlaceSafepoints/finite-loops.ll) as well; this is not related to my changes, and I'm not sure why it works before and not after. The problem is that the CHECK-NOT: of "statepoint" from test1 fails because of the "statepoint" in test2, and so forth. Adding a CHECK-LABEL in between keeps the different occurrences of that string properly scoped. llvm-svn: 252651
* [ARM] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()Sanjay Patel2015-11-102-0/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ARM V6T2 has instructions for efficient count-leading/trailing-zeros, so this should be considered a cheap operation (and therefore fair game for speculation) for any ARM V6T2 implementation. The net result of allowing this speculation for the regression tests in this patch is that we get this code: ctlz: clz r0, r0 bx lr cttz: rbit r0, r0 clz r0, r0 bx lr Instead of: ctlz: cmp r0, #0 moveq r0, #32 clzne r0, r0 bx lr cttz: cmp r0, #0 moveq r0, #32 rbitne r0, r0 clzne r0, r0 bx lr This will help solve a general speculation/despeculation problem noted in PR24818: https://llvm.org/bugs/show_bug.cgi?id=24818 Differential Revision: http://reviews.llvm.org/D14469 llvm-svn: 252639
* [ValueTracking] Recognize that and(x, add (x, -1)) clears the low bitPhilip Reames2015-11-101-0/+65
| | | | | | | | | | This is a cleaned up version of a patch by John Regehr with permission. Originally found via the souper tool. If we add an odd number to x, then bitwise-and the result with x, we know that the low bit of the result must be zero. Either it was zero in x originally, or the add cleared it in the temporary value. As a result, one of the two values anded together must have the bit cleared. Differential Revision: http://reviews.llvm.org/D14315 llvm-svn: 252629
OpenPOWER on IntegriCloud