summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms
Commit message (Collapse)AuthorAgeFilesLines
...
* An alloca can be equal to an argument. It can't *alias* an alloca, but it couldDan Gohman2013-01-311-0/+13
| | | | | | | be equal, since there's nothing preventing a caller from correctly predicting the stack location of an alloca. llvm-svn: 174119
* Remove the AttrBuilder form of the Attribute::get creators.Bill Wendling2013-01-312-29/+29
| | | | | | | | | | | | | The AttrBuilder is for building a collection of attributes. The Attribute object holds only one attribute. So it's not really useful for the Attribute object to have a creator which takes an AttrBuilder. This has two fallouts: 1. The AttrBuilder no longer holds its internal attributes in a bit-mask form. 2. The attributes are now ordered alphabetically (hence why the tests have changed). llvm-svn: 174110
* Made the min-trip-count-switch test X86-specific to avoidPekka Jaaskelainen2013-01-311-0/+0
| | | | | | breakage with builds without X86-support. llvm-svn: 174052
* Filecheckized 2x tests in SimplifyCFG and removed their date prefix to fit ↵Michael Gottesman2013-01-312-4/+4
| | | | | | with current llvm style for test names. llvm-svn: 174011
* InstCombine: canonicalize sext-and --> selectNadav Rotem2013-01-303-14/+24
| | | | | | | | sext-not-and --> select. Patch by Muhammad Tauqir Ahmad. llvm-svn: 173901
* Adding simple cast cost to ARMRenato Golin2013-01-291-0/+114
| | | | | | | | | | | Changing ARMBaseTargetMachine to return ARMTargetLowering intead of the generic one (similar to x86 code). Tests showing which instructions were added to cast when necessary or cost zero when not. Downcast to 16 bits are not lowered in NEON, so costs are not there yet. llvm-svn: 173849
* LoopVectorize: convert TinyTripCountVectorThreshold constantPekka Jaaskelainen2013-01-291-0/+28
| | | | | | to a command line switch. llvm-svn: 173837
* Convert getAttributes() to return an AttributeSetNode.Bill Wendling2013-01-292-12/+12
| | | | | | | The AttributeSetNode contains all of the attributes. This removes one (hopefully last) use of the Attribute class as a container of multiple attributes. llvm-svn: 173761
* Re-revert r173342, without losing the compile time improvements, flatChandler Carruth2013-01-271-115/+0
| | | | | | out bug fixes, or functionality preserving refactorings. llvm-svn: 173610
* FileCheck-ify some grep testsReid Kleckner2013-01-251-3/+2
| | | | | | | | These tests in particular try to use escaped square brackets as an argument to grep, which is failing for me with native win32 python. It appears the backslash is being lost near the CreateProcess*() call. llvm-svn: 173506
* Switch this code away from Value::isUsedInBasicBlock. That code eitherChandler Carruth2013-01-251-0/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | loops over instructions in the basic block or the use-def list of the value, neither of which are really efficient when repeatedly querying about values in the same basic block. What's more, we already know that the CondBB is small, and so we can do a much more efficient test by counting the uses in CondBB, and seeing if those account for all of the uses. Finally, we shouldn't blanket fail on any such instruction, instead we should conservatively assume that those instructions are part of the cost. Note that this actually fixes a bug in the pass because isUsedInBasicBlock has a really terrible bug in it. I'll fix that in my next commit, but the fix for it would make this code suddenly take the compile time hit I thought it already was taking, so I wanted to go ahead and migrate this code to a faster & better pattern. The bug in isUsedInBasicBlock was also causing other tests to test the wrong thing entirely: for example we weren't actually disabling speculation for floating point operations as intended (and tested), but the test passed because we failed to speculate them due to the isUsedInBasicBlock failure. llvm-svn: 173417
* Reapply chandlerc's r173342 now that the miscompile it was triggering is fixed.Benjamin Kramer2013-01-241-0/+29
| | | | | | | | | | | | | | | | | | | | Original commit message: Plug TTI into the speculation logic, giving it a real cost interface that can be specialized by targets. The goal here is not to be more aggressive, but to just be more accurate with very obvious cases. There are instructions which are known to be truly free and which were not being modeled as such in this code -- see the regression test which is distilled from an inner loop of zlib. Everywhere the TTI cost model is insufficiently conservative I've added explicit checks with FIXME comments to go add proper modelling of these cost factors. If this causes regressions, the likely solution is to make TTI even more conservative in its cost estimates, but test cases will help here. llvm-svn: 173357
* ConstantFolding: Add a missing folding that leads to a miscompile.Benjamin Kramer2013-01-242-1/+37
| | | | | | | | | | We use constant folding to see if an intrinsic evaluates to the same value as a constant that we know. If we don't take the undefinedness into account we get a value that doesn't match the actual implementation, and miscompiled code. This was uncovered by Chandler's simplifycfg changes. llvm-svn: 173356
* Revert r173342 temporarily. It appears to cause a very late miscompileChandler Carruth2013-01-241-29/+0
| | | | | | of stage2 in a bootstrap. Still investigating.... llvm-svn: 173343
* Plug TTI into the speculation logic, giving it a real cost interfaceChandler Carruth2013-01-241-0/+29
| | | | | | | | | | | | | | | | | | that can be specialized by targets. The goal here is not to be more aggressive, but to just be more accurate with very obvious cases. There are instructions which are known to be truly free and which were not being modeled as such in this code -- see the regression test which is distilled from an inner loop of zlib. Everywhere the TTI cost model is insufficiently conservative I've added explicit checks with FIXME comments to go add proper modelling of these cost factors. If this causes regressions, the likely solution is to make TTI even more conservative in its cost estimates, but test cases will help here. llvm-svn: 173342
* Address a large chunk of this FIXME by accumulating the cost forChandler Carruth2013-01-241-0/+42
| | | | | | | unfolded constant expressions rather than checking each one independently. llvm-svn: 173341
* Switch the constant expression speculation cost evaluation away fromChandler Carruth2013-01-241-0/+22
| | | | | | | | | | | | | | | | | | | | a cost fuction that seems both a bit ad-hoc and also poorly suited to evaluating constant expressions. Notably, it is missing any support for trivial expressions such as 'inttoptr'. I could fix this routine, but it isn't clear to me all of the constraints its other users are operating under. The core protection that seems relevant here is avoiding the formation of a select instruction wich a further chain of select operations in a constant expression operand. Just explicitly encode that constraint. Also, update the comments and organization here to make it clear where this needs to go -- this should be driven off of real cost measurements which take into account the number of constants expressions and the depth of the constant expression tree. llvm-svn: 173340
* ConstantFolding: Evaluate GEP indices in the index type.Benjamin Kramer2013-01-231-3/+22
| | | | | | | This fixes some edge cases that we would get wrong with uint64_ts. PR14986. llvm-svn: 173289
* Revert "InstCombine: Clean up weird code that talks about a modulus that's ↵Benjamin Kramer2013-01-231-0/+16
| | | | | | | | | long gone." This causes crashes during the build of compiler-rt during selfhost. Add a testcase for coverage. llvm-svn: 173279
* Add the IR attribute 'sspstrong'.Bill Wendling2013-01-231-0/+155
| | | | | | | | | | | | | | | | | | | | | SSPStrong applies a heuristic to insert stack protectors in these situations: * A Protector is required for functions which contain an array, regardless of type or length. * A Protector is required for functions which contain a structure/union which contains an array, regardless of type or length. Note, there is no limit to the depth of nesting. * A protector is required when the address of a local variable (i.e., stack based variable) is exposed. (E.g., such as through a local whose address is taken as part of the RHS of an assignment or a local whose address is taken as part of a function argument.) This patch implements the SSPString attribute to be equivalent to SSPRequired. This will change in a subsequent patch. llvm-svn: 173230
* Add support for reverse pointer induction variables. These are loops that ↵Nadav Rotem2013-01-233-0/+195
| | | | | | | | | | | | | contain pointers that count backwards. For example, this is the hot loop in BZIP: do { m = *--p; *p = ( ... ); } while (--n); llvm-svn: 173219
* Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ↵Dmitri Gribenko2013-01-221-1/+1
| | | | | | | | ModuleID This is done to avoid odd test failures, like the one fixed in r171243. llvm-svn: 173163
* This test is only supposed to test that the objc-arc alias analysisMichael Gottesman2013-01-221-1/+1
| | | | | | | allows for gvn to perform certain optimizations. Thus the runline should only contain -objc-arc-aa, not the full -objc-arc. llvm-svn: 173126
* Remove target triple from an LSR test.Andrew Trick2013-01-221-1/+0
| | | | | | Manish already fixed this test to work with NoTTI. llvm-svn: 173110
* Transform (sub 0, (zext bool to A)) to (sext bool to A) andPaul Redmond2013-01-213-4/+12
| | | | | | | | | (sub 0, (sext bool to A)) to (zext bool to A). Patch by Muhammad Ahmad Reviewed by Duncan Sands llvm-svn: 173093
* LoopVectorizer: Implement a new heuristics for selecting the unroll factor.Nadav Rotem2013-01-201-0/+71
| | | | | | | We ignore the cpu frontend and focus on pipeline utilization. We do this because we don't have a good way to estimate the loop body size at the IR level. llvm-svn: 172964
* Change the cpu type in the test.Nadav Rotem2013-01-201-1/+1
| | | | llvm-svn: 172963
* LoopVectorizer: Emit memory checks into their own basic block.Benjamin Kramer2013-01-191-0/+4
| | | | | | | | | | | | | | This separates the check for "too few elements to run the vector loop" from the "memory overlap" check, giving a lot nicer code and allowing to skip the memory checks when we're not going to execute the vector code anyways. We still leave the decision of whether to emit the memory checks as branches or setccs, but it seems to be doing a good job. If ugly code pops up we may want to emit them as separate blocks too. Small speedup on MultiSource/Benchmarks/MallocBench/espresso. Most of this is legwork to allow multiple bypass blocks while updating PHIs, dominators and loop info. llvm-svn: 172902
* Reverting r171325 & r172363. This was causing a mis-compile on the ↵Bill Wendling2013-01-171-128/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | self-hosted LTO build bots. Okay, here's how to reproduce the problem: 1) Build a Release (or Release+Asserts) version of clang in the normal way. 2) Using the clang & clang++ binaries from (1), build a Release (or Release+Asserts) version of the same sources, but this time enable LTO --- specify the `-flto' flag on the command line. 3) Run the ARC migrator tests: $ arcmt-test --args -triple x86_64-apple-darwin10 -fsyntax-only -x objective-c++ ./src/tools/clang/test/ARCMT/cxx-rewrite.mm You'll see that the output isn't correct (the whitespace is off). The mis-compile is in the function `RewriteBuffer::RemoveText' in the clang/lib/Rewrite/Core/Rewriter.cpp file. When that function and RewriteRope.cpp are compiled with LTO and the `arcmt-test' executable is regenerated, you'll see the error. When those files are not LTO'ed, then the output of the `arcmt-test' is fine. It is *really* hard to get a testcase out of this. I'll file a PR with what I have currently. --- Reverse-merging r172363 into '.': U include/llvm/Analysis/MemoryBuiltins.h U lib/Analysis/MemoryBuiltins.cpp --- Reverse-merging r171325 into '.': U test/Transforms/InstCombine/objsize.ll G include/llvm/Analysis/MemoryBuiltins.h G lib/Analysis/MemoryBuiltins.cpp llvm-svn: 172756
* Added test for r172599 which fixes bugzilla://14584,rdar://11744105.Michael Gottesman2013-01-161-0/+168
| | | | llvm-svn: 172656
* Move test that depends on the x86 target into a target-specific directory.Benjamin Kramer2013-01-161-0/+0
| | | | | | Should fix the arm buildbot (which only builds the arm target). llvm-svn: 172611
* Remove triple from this test, it makes it fail when X86 TTI is missing.Benjamin Kramer2013-01-161-4/+1
| | | | | | Without a triple opt falls back to NoTTI which comes closer to LSR's pre-TTI behavior. llvm-svn: 172609
* Teach InstCombine to optimize extract of a value from a vector add operation ↵Nadav Rotem2013-01-151-0/+10
| | | | | | with a constant zero. llvm-svn: 172576
* 1. Hoist minus sign as high as possible in an attempt to revealShuxin Yang2013-01-152-33/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | some optimization opportunities (in the enclosing supper-expressions). rule 1. (-0.0 - X ) * Y => -0.0 - (X * Y) if expression "-0.0 - X" has only one reference. rule 2. (0.0 - X ) * Y => -0.0 - (X * Y) if expression "0.0 - X" has only one reference, and the instruction is marked "noSignedZero". 2. Eliminate negation (The compiler was already able to handle these opt if the 0.0s are replaced with -0.0.) rule 3: (0.0 - X) * (0.0 - Y) => X * Y rule 4: (0.0 - X) * C => X * -C if the expr is flagged "noSignedZero". 3. Rule 5: (X*Y) * X => (X*X) * Y if X!=Y and the expression is flagged with "UnsafeAlgebra". The purpose of this transformation is two-fold: a) to form a power expression (of X). b) potentially shorten the critical path: After transformation, the latency of the instruction Y is amortized by the expression of X*X, and therefore Y is in a "less critical" position compared to what it was before the transformation. 4. Remove the InstCombine code about simplifiying "X * select". The reasons are following: a) The "select" is somewhat architecture-dependent, therefore the higher level optimizers are not able to precisely predict if the simplification really yields any performance improvement or not. b) The "select" operator is bit complicate, and tends to obscure optimization opportunities. It is btter to keep it as low as possible in expr tree, and let CodeGen to tackle the optimization. llvm-svn: 172551
* Pattern-matched variables in post-inc-icmpzero.llRenato Golin2013-01-151-4/+4
| | | | | | | | | | | Test was failing for clang-native-arm-cortex-a9 build-bot configuration. The reason for the failure was the test was using hardcoded names. The attached patch fixes this failure by replacing the hard-coded variables names with pattern-matched variable names. Patch by Manish Verma, ARM llvm-svn: 172534
* This change is to implement following rules under the condition C_A and/or C_RShuxin Yang2013-01-141-0/+96
| | | | | | | | | | | | | | | | | | | | | --------------------------------------------------------------------------- C_A: reassociation is allowed C_R: reciprocal of a constant C is appropriate, which means - 1/C is exact, or - reciprocal is allowed and 1/C is neither a special value nor a denormal. ----------------------------------------------------------------------------- rule1: (X/C1) / C2 => X / (C2*C1) (if C_A) => X * (1/(C2*C1)) (if C_A && C_R) rule 2: X*C1 / C2 => X * (C1/C2) if C_A rule 3: (X/Y)/Z = > X/(Y*Z) (if C_A && at least one of Y and Z is symbolic value) rule 4: Z/(X/Y) = > (Z*Y)/X (similar to rule3) rule 5: C1/(X*C2) => (C1/C2) / X (if C_A) rule 6: C1/(X/C2) => (C1*C2) / X (if C_A) rule 7: C1/(C2/X) => (C1/C2) * X (if C_A) llvm-svn: 172488
* SCEVExpander fix. RAUW needs to update the InsertedExpressions cache.Andrew Trick2013-01-141-0/+84
| | | | | | | | Note that this bug is only exposed because LTO fails to use TTI. Fixes self-LTO of clang. rdar://13007381. llvm-svn: 172462
* Added bugzilla PR number to test case.Michael Gottesman2013-01-131-0/+1
| | | | llvm-svn: 172369
* Fixed an infinite loop in the block escape in analysis in ObjCARC caused by ↵Michael Gottesman2013-01-131-0/+86
| | | | | | | | 2x blocks each assigned a value via a phi-node causing each to depend on the other. A test case is provided as well. llvm-svn: 172368
* Fix PR14547. Handle induction variables of small sizes smaller than i32 (i8 ↵Nadav Rotem2013-01-131-0/+35
| | | | | | and i16). llvm-svn: 172348
* Fixed bug in ObjCARC where we were changing a call from objc_autoreleaseRV ↵Michael Gottesman2013-01-121-1/+1
| | | | | | => objc_autorelease but were not updating the InstructionClass to IC_Autorelease. llvm-svn: 172288
* Fixed a bug where we were tail calling objc_autorelease causing an object to ↵Michael Gottesman2013-01-125-13/+97
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | not be placed into an autorelease pool. The reason that this occurs is that tail calling objc_autorelease eventually tail calls -[NSObject autorelease] which supports fast autorelease. This can cause us to violate the semantic gaurantees of __autoreleasing variables that assignment to an __autoreleasing variables always yields an object that is placed into the innermost autorelease pool. The fix included in this patch works by: 1. In the peephole optimization function OptimizeIndividualFunctions, always remove tail call from objc_autorelease. 2. Whenever we convert to/from an objc_autorelease, set/unset the tail call keyword as appropriate. *NOTE* I also handled the case where objc_autorelease is converted in OptimizeReturns to an autoreleaseRV which still violates the ARC semantics. I will be removing that in a later patch and I wanted to make sure that the tree is in a consistent state vis-a-vis ARC always. Additionally some test cases are provided and all tests that have tail call marked objc_autorelease keywords have been modified so that tail call has been removed. *NOTE* One test fails due to a separate bug that I am going to commit soon. Thus I marked the check line TMP: instead of CHECK: so make check does not fail. llvm-svn: 172287
* ARM Cost Model: Modify the target independent cost model to askNadav Rotem2013-01-111-3/+3
| | | | | | | | the target if it supports the different CAST types. We didn't do this on X86 because of the different register sizes and types, but on ARM this makes sense. llvm-svn: 172245
* ARM Cost Model: We need to detect the max bitwidth of types in the loop in ↵Nadav Rotem2013-01-111-0/+52
| | | | | | | | | | | order to select the max vectorization factor. We don't have a detailed analysis on which values are vectorized and which stay scalars in the vectorized loop so we use another method. We look at reduction variables, loads and stores, which are the only ways to get information in and out of loop iterations. If the data types are extended and truncated then the cost model will catch the cost of the vector zext/sext/trunc operations. llvm-svn: 172178
* Converted test dont-tce-tail-marked-call.ll to use FileCheck.Michael Gottesman2013-01-111-2/+2
| | | | llvm-svn: 172172
* This commit is a 4x squash commit consisting of 4x functions converted to ↵Michael Gottesman2013-01-114-6/+12
| | | | | | | | | | | | use FileCheck instead of grep. Messages: Converted test case trivial_codegen_tailcall.ll to use FileCheck. Converted test return_constant.ll to use FileCheck instead of grep. Converted test reorder_load.ll to use FileCheck instead of grep. Converted test intervening-inst.ll to use FileCheck instead of grep. llvm-svn: 172171
* PR14904: Segmentation fault running pass 'Recognize loop idioms'Shuxin Yang2013-01-101-0/+20
| | | | | | | | The root cause is mistakenly taking for granted that "dyn_cast<Instruction>(a-Value)" return a non-NULL instruction. llvm-svn: 172145
* CastInst::castIsValid should return true if the dest type is the same asEvan Cheng2013-01-101-0/+36
| | | | | | Value's current type. The casting is trivial even for aggregate type. llvm-svn: 172143
* Teach InstCombine to hoist FABS and FNEG through FPTRUNC instructions. The ↵Owen Anderson2013-01-101-0/+19
| | | | | | application of these operations commutes with the truncation, so we should prefer to do them in the smallest size we can, to save register space, use smaller constant pool entries, etc. llvm-svn: 172117
* LoopVectorizer: Fix a bug in the vectorization of BinaryOperators. The ↵Nadav Rotem2013-01-101-0/+25
| | | | | | | | BinaryOperator can be folded to an Undef, and we don't want to set NSW flags to undef vals. PR14878 llvm-svn: 172079
OpenPOWER on IntegriCloud