summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* Implement TTI getUnrollingPreferences for PowerPCHal Finkel2013-09-112-0/+52
| | | | | | | | The PowerPC A2 core greatly benefits from aggressive concatenation unrolling; use the new getUnrollingPreferences to enable this by default when targeting the PPC A2 core. llvm-svn: 190549
* Teach loop-idiom about address space pointer sizesMatt Arsenault2013-09-115-0/+360
| | | | llvm-svn: 190491
* Fix missing CHECK-LABELsMatt Arsenault2013-09-101-5/+5
| | | | llvm-svn: 190426
* Don't shrink atomic ops to bool in GlobalOpt.Eli Friedman2013-09-091-0/+15
| | | | | | | | | LLVM IR doesn't currently allow atomic bool load/store operations, and the transformation is dubious anyway because it isn't profitable on all platforms. PR17163. llvm-svn: 190357
* [InstCombiner] Expose opportunities to merge subtract and comparison.Quentin Colombet2013-09-091-0/+65
| | | | | | | | | | | | | | | | | | | | | Several architectures use the same instruction to perform both a comparison and a subtract. The instruction selection framework does not allow to consider different basic blocks to expose such fusion opportunities. Therefore, these instructions are “merged” by CSE at MI IR level. To increase the likelihood of CSE to apply in such situation, we reorder the operands of the comparison, when they have the same complexity, so that they matches the order of the most frequent subtract. E.g., icmp A, B ... sub B, A <rdar://problem/14514580> llvm-svn: 190352
* Revert patches to add case-range support for PR1255.Bob Wilson2013-09-091-32/+32
| | | | | | | | | | | | | | | | | The work on this project was left in an unfinished and inconsistent state. Hopefully someone will eventually get a chance to implement this feature, but in the meantime, it is better to put things back the way the were. I have left support in the bitcode reader to handle the case-range bitcode format, so that we do not lose bitcode compatibility with the llvm 3.3 release. This reverts the following commits: 155464, 156374, 156377, 156613, 156704, 156757, 156804 156808, 156985, 157046, 157112, 157183, 157315, 157384, 157575, 157576, 157586, 157612, 157810, 157814, 157815, 157880, 157881, 157882, 157884, 157887, 157901, 158979, 157987, 157989, 158986, 158997, 159076, 159101, 159100, 159200, 159201, 159207, 159527, 159532, 159540, 159583, 159618, 159658, 159659, 159660, 159661, 159703, 159704, 160076, 167356, 172025, 186736 llvm-svn: 190328
* Debug Info Testing: update context from empty string to null.Manman Ren2013-09-086-9/+9
| | | | | | Context should be either null or MDNode. llvm-svn: 190267
* Debug Info Testing: updated to use NULL instead of "i32 0" in a few fields.Manman Ren2013-09-0613-24/+24
| | | | | | | | Field 2 of DIType (Context), field 9 of DIDerivedType (TypeDerivedFrom), field 12 of DICompositeType (ContainingType), fields 2, 7, 12 of DISubprogram (Context, Type, ContainingType). llvm-svn: 190205
* Merge these 2 tests in a single file.Rafael Espindola2013-09-044-71/+50
| | | | llvm-svn: 189975
* Revert "Add r159136 back now that pr13124 has been fixed."Rafael Espindola2013-09-041-14/+0
| | | | | | | | | | | | | | | | | | | This reverts commit r189886. I found a corner case where this optimization is not valid: Say we have a "linkonce_odr unnamed_addr" in two translation units: * In TU 1 this optimization kicks in and makes it hidden. * In TU 2 it gets const merged with a constant that is *not* unnamed_addr, resulting in a non unnamed_addr constant with default visibility. * The static linker rules for combining visibility them produce a hidden symbol, which is incorrect from the point of view of the non unnamed_addr constant. The one place we can do this is when we know that the symbol is not used from another TU in the same shared object, i.e., during LTO. I will move it there. llvm-svn: 189954
* InstCombine: allow unmasked icmps to be combined with logical opsTim Northover2013-09-041-0/+30
| | | | | | | | | | "(icmp op i8 A, B)" is equivalent to "(icmp op i8 (A & 0xff), B)" as a degenerate case. Allowing this as a "masked" comparison when analysing "(icmp) &/| (icmp)" allows us to combine them in more cases. rdar://problem/7625728 llvm-svn: 189931
* InstCombine: look for masked compares with subset relationTim Northover2013-09-041-0/+122
| | | | | | | | | | | Even in cases which aren't universally optimisable like "(A & B) != 0 && (A & C) != 0", the masks can make one of the comparisons completely redundant. In this case, since we've gone to the effort of spotting masked comparisons we should combine them. rdar://problem/7625728 llvm-svn: 189930
* Add r159136 back now that pr13124 has been fixed.Rafael Espindola2013-09-031-0/+14
| | | | | | | | | | Original message: If a constant or a function has linkonce_odr linkage and unnamed_addr, mark hidden. Being linkonce_odr guarantees that it is available in every dso that needs it. Being a constant/function with unnamed_addr guarantees that the copies don't have to be merged. llvm-svn: 189886
* [objc-arc] Turn off the objc_retainBlock -> objc_retain optimization.Michael Gottesman2013-09-035-672/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The reason that I am turning off this optimization is that there is an additional case where a block can escape that has come up. Specifically, this occurs when a block is used in a scope outside of its current scope. This can cause a captured retainable object pointer whose life is preserved by the objc_retainBlock to be deallocated before the block is invoked. An example of the code needed to trigger the bug is: ---- \#import <Foundation/Foundation.h> int main(int argc, const char * argv[]) { void (^somethingToDoLater)(); { NSObject *obj = [NSObject new]; somethingToDoLater = ^{ [obj self]; // Crashes here }; } NSLog(@"test."); somethingToDoLater(); return 0; } ---- In the next commit, I remove all the dead code that results from this. Once I put in the fixing commit I will bring back the tests that I deleted in this commit. rdar://14802782. rdar://14868830. llvm-svn: 189869
* [objc-arc] Move some block tests from basic.ll -> retain-block.ll and add ↵Michael Gottesman2013-09-033-55/+55
| | | | | | some missing CHECK-LABELS. llvm-svn: 189868
* Teach InstCombineLoadCast about address spaces.Matt Arsenault2013-09-031-1/+22
| | | | | | | | This is another one that doesn't matter much, but uses the right GEP index types in the first place. llvm-svn: 189854
* In this patch we are trying to do two things:Yi Jiang2013-09-033-13/+141
| | | | | | | | | 1) If the width of vectorization list candidate is bigger than vector reg width, we will break it down to fit the vector reg. 2) We do not vectorize the width which is not power of two. The performance result shows it will help some spec benchmarks. mesa improved 6.97% and ammp improved 1.54%. llvm-svn: 189830
* SimplifyLibCalls: When emitting an overloaded fp function check that it's ↵Benjamin Kramer2013-08-311-0/+20
| | | | | | | | | | available. The existing code missed some edge cases when e.g. we're going to emit sqrtf but only the availability of sqrt was checked. This happens on odd platforms like windows. llvm-svn: 189724
* InstCombine: Check for zero shift amounts before subtracting one causing ↵Benjamin Kramer2013-08-301-0/+36
| | | | | | | | | integer overflow. PR17026. Also avoid undefined shifts and shift amounts larger than 64 bits (those are always undef because we can't represent integer types that large). llvm-svn: 189672
* Fix a test to not fail for users with my name. :)Daniel Dunbar2013-08-291-0/+1
| | | | llvm-svn: 189547
* Convert tests to FileCheckMatt Arsenault2013-08-2814-19/+38
| | | | llvm-svn: 189529
* Handle address spaces in TargetTransformInfoMatt Arsenault2013-08-281-1/+44
| | | | llvm-svn: 189527
* Disable unrolling in the loop vectorizer when disabled in the pass managerHal Finkel2013-08-282-1/+32
| | | | | | | | | | | | | | | | | When unrolling is disabled in the pass manager, the loop vectorizer should also not unroll loops. This will allow the -fno-unroll-loops option in Clang to behave as expected (even for vectorizable loops). The loop vectorizer's -force-vector-unroll option will (continue to) override the pass-manager setting (including -force-vector-unroll=0 to force use of the internal auto-selection logic). In order to test this, I added a flag to opt (-disable-loop-unrolling) to force disable unrolling through opt (the analog of -fno-unroll-loops in Clang). Also, this fixes a small bug in opt where the loop vectorizer was enabled only after the pass manager populated the queue of passes (the global_alias.ll test needed a slight update to the RUN line as a result of this fix). llvm-svn: 189499
* Fix inserting instructions before last in bundle.Matt Arsenault2013-08-261-1/+1
| | | | | | | | | | | The builder inserts from before the insert point, not after, so this would insert before the last instruction in the bundle instead of after it. I'm not sure if this can actually be a problem with any of the current insertions. llvm-svn: 189285
* Debug Info: add an identifier field to DICompositeType.Manman Ren2013-08-2621-30/+30
| | | | | | | | | | | | | | | | | | DICompositeType will have an identifier field at position 14. For now, the field is set to null in DIBuilder. For DICompositeTypes where the template argument field (the 13th field) was optional, modify DIBuilder to make sure the template argument field is set. Now DICompositeType has 15 fields. Update DIBuilder to use NULL instead of "i32 0" for null value of a MDNode. Update verifier to check that DICompositeType has 15 fields and the last field is null or a MDString. Update testing cases to include an extra field for DICompositeType. The identifier field will be used by type uniquing so a front end can genearte a DICompositeType with a unique identifer. llvm-svn: 189282
* LoopVectorize: Implement partial loop unrolling when vectorization is not ↵Nadav Rotem2013-08-261-0/+39
| | | | | | | | | | | | | | | | profitable. This patch enables unrolling of loops when vectorization is legal but not profitable. We add a new class InnerLoopUnroller, that extends InnerLoopVectorizer and replaces some of the vector-specific logic with scalars. This patch does not introduce any runtime regressions and improves the following workloads: SingleSource/Benchmarks/Shootout/matrix -22.64% SingleSource/Benchmarks/Shootout-C++/matrix -13.06% External/SPEC/CINT2006/464_h264ref/464_h264ref -3.99% SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding -1.95% llvm-svn: 189281
* Forgot to add slp threshold to testMatt Arsenault2013-08-261-1/+2
| | | | llvm-svn: 189248
* Vectorize starting from insertelements building a vectorMatt Arsenault2013-08-261-0/+196
| | | | llvm-svn: 189233
* Filecheckize some tests.Michael Gottesman2013-08-232-3/+10
| | | | llvm-svn: 189079
* Update StripDeadDebugInfo to use DebugInfoFinder so that it is no longer ↵Michael Gottesman2013-08-233-53/+59
| | | | | | | | | | | | | | | | | | | | | stale to the point of not working and more resilient to debug info changes. The current version of StripDeadDebugInfo became stale and no longer actually worked since it was expecting an older version of debug info. This patch updates it to use DebugInfoFinder and the modern DebugInfo classes as much as possible to make it more redundent to such changes. Additionally, the only place where that was avoided (the code where we replace the old sets with the new), I call verify on the DIContextUnit implying that if the format changes and my live set changes no longer make sense an assert will be hit. In order to ensure that that occurs I have included a test case. The actual stripping of the dead debug info follows the same strategy as was used before in this class: find the live set and replace the old set in the given compile unit (which may contain dead global variables/functions) with the new live one. llvm-svn: 189078
* [Debug Info Tests] Update testing cases.Manman Ren2013-08-222-15/+15
| | | | | | | | | A single metadata will not span multiple lines. This also helps me with my script to automatic update the testing cases. A debug info testing case should have a llvm.dbg.cu. Do not use hard-coded id for debug nodes. llvm-svn: 189033
* Teach the SLP vectorizer the correct way to check for consecutive accessChandler Carruth2013-08-221-1/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | using GEPs. Previously, it used a number of different heuristics for analyzing the GEPs. Several of these were conservatively correct, but failed to fall back to SCEV even when SCEV might have given a reasonable answer. One was simply incorrect in how it was formulated. There was good code already to recursively evaluate the constant offsets in GEPs, look through pointer casts, etc. I gathered this into a form code like the SLP code can use in a previous commit, which allows all of this code to become quite simple. There is some performance (compile time) concern here at first glance as we're directly attempting to walk both pointers constant GEP chains. However, a couple of thoughts: 1) The very common cases where there is a dynamic pointer, and a second pointer at a constant offset (usually a stride) from it, this code will actually not do any unnecessary work. 2) InstCombine and other passes work very hard to collapse constant GEPs, so it will be rare that we iterate here for a long time. That said, if there remain performance problems here, there are some obvious things that can improve the situation immensely. Doing a vectorizer-pass-wide memoizer for each individual layer of pointer values, their base values, and the constant offset is likely to be able to completely remove redundant work and strictly limit the scaling of the work to scrape these GEPs. Since this optimization was not done on the prior version (which would still benefit from it), I've not done it here. But if folks have benchmarks that slow down it should be straight forward for them to add. I've added a test case, but I'm not really confident of the amount of testing done for different access patterns, strides, and pointer manipulation. llvm-svn: 189007
* Teach LoopVectorize about address space sizesMatt Arsenault2013-08-221-2/+29
| | | | llvm-svn: 188980
* TBAA: remove !tbaa from testing cases when they are not needed.Manman Ren2013-08-211-13/+10
| | | | | | | This will make it easier to turn on struct-path aware TBAA since the metadata format will change. llvm-svn: 188944
* Teach InstCombine about address spacesMatt Arsenault2013-08-215-70/+406
| | | | llvm-svn: 188926
* Add test for bitcast array ptrs with address spacesMatt Arsenault2013-08-211-0/+22
| | | | llvm-svn: 188919
* Add enforce known alignment test with address spaceMatt Arsenault2013-08-211-3/+23
| | | | llvm-svn: 188917
* SLPVectorizer: Fix invalid iterator errorsArnold Schwaighofer2013-08-201-0/+30
| | | | | | | | | | | Update iterator when the SLP vectorizer changes the instructions in the basic block by restarting the traversal of the basic block. Patch by Yi Jiang! Fixes PR 16899. llvm-svn: 188832
* Teach ConstantFolding about pointer address spacesMatt Arsenault2013-08-203-4/+296
| | | | llvm-svn: 188831
* Add a llvm.copysign intrinsicHal Finkel2013-08-191-0/+53
| | | | | | | | | | | | | | | | | | | | | This adds a llvm.copysign intrinsic; We already have Libfunc recognition for copysign (which is turned into the FCOPYSIGN SDAG node). In order to autovectorize calls to copysign in the loop vectorizer, we need a corresponding intrinsic as well. In addition to the expected changes to the language reference, the loop vectorizer, BasicTTI, and the SDAG builder (the intrinsic is transformed into an FCOPYSIGN node, just like the function call), this also adds FCOPYSIGN to a few lists in LegalizeVector{Ops,Types} so that vector copysigns can be expanded. In TargetLoweringBase::initActions, I've made the default action for FCOPYSIGN be Expand for vector types. This seems correct for all in-tree targets, and I think is the right thing to do because, previously, there was no way to generate vector-values FCOPYSIGN nodes (and most targets don't specify an action for vector-typed FCOPYSIGN). llvm-svn: 188728
* Teach InstCombine visitGetElementPtr about address spacesMatt Arsenault2013-08-191-1/+28
| | | | llvm-svn: 188721
* Fix assert with GEP ptr vector indexing structsMatt Arsenault2013-08-191-0/+11
| | | | | | | | Also fix it calculating the wrong value. The struct index is not a ConstantInt, so it was being interpreted as an array index. llvm-svn: 188713
* Revert non-test parts of r188507Matt Arsenault2013-08-191-1/+65
| | | | | | Re-add the inboundsless tests I didn't add originally llvm-svn: 188710
* Adds missing TLI check for library simplification ofMichael Kuperstein2013-08-192-0/+25
| | | | | | | * pow(x, 0.5) -> fabs(sqrt(x)) * pow(2.0, x) -> exp2(x) llvm-svn: 188656
* Add missing test for GEP + bitcast transformationMatt Arsenault2013-08-161-0/+24
| | | | llvm-svn: 188529
* [tests] Cleanup initialization of test suffixes.Daniel Dunbar2013-08-1667-78/+0
| | | | | | | | | | | | | | | | | - Instead of setting the suffixes in a bunch of places, just set one master list in the top-level config. We now only modify the suffix list in a few suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py). - Aside from removing the need for a bunch of lit.local.cfg files, this enables 4 tests that were inadvertently being skipped (one in Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been XFAILED). - This commit also fixes a bunch of config files to use config.root instead of older copy-pasted code. llvm-svn: 188513
* InstCombine: Simplify if(x!=0 && x!=-1).Jim Grosbach2013-08-161-0/+12
| | | | | | | | | | | When both constants are positive or both constants are negative, InstCombine already simplifies comparisons like this, but when it's exactly zero and -1, the operand sorting ends up reversed and the pattern fails to match. Handle that special case. Follow up for rdar://14689217 llvm-svn: 188512
* Don't do FoldCmpLoadFromIndexedGlobal for non inbounds GEPsMatt Arsenault2013-08-152-77/+301
| | | | | | | This path wasn't tested before without a datalayout, so add some more tests and re-run with and without one. llvm-svn: 188507
* [tests] Fix refacto in r187764 that effectively disabled SimplifyCFG tests. :(Daniel Dunbar2013-08-154-0/+1
| | | | llvm-svn: 188503
* Fixing a corner-case bug in strchr and strrchr lib call optimizations whereYunzhong Gao2013-08-152-0/+22
| | | | | | | | | the input character is not converted to char before comparing with zero. The patch was discussed in this thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130812/184069.html llvm-svn: 188489
OpenPOWER on IntegriCloud