summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [GVN] Format variable name.Tilmann Scheller2014-09-041-2/+2
| | | | | | Local variables need to start with an upper case letter. llvm-svn: 217133
* IndVarSimplify: Address review comments for r217102David Majnemer2014-09-041-4/+7
| | | | | | No functional change intended, just some cleanups and comments added. llvm-svn: 217115
* [asan] fix debug info produced for asan-coverage=2Kostya Serebryany2014-09-031-1/+3
| | | | llvm-svn: 217106
* IndVarSimplify: Don't let LFTR compare against a poison valueDavid Majnemer2014-09-031-6/+18
| | | | | | | | | | | | | | LinearFunctionTestReplace tries to use the *next* indvar to compare against when possible. However, it may be the case that the calculation for the next indvar has NUW/NSW flags and that it may only be safely used inside the loop. Using it in a comparison to calculate the exit condition could result in observing poison. This fixes PR20680. Differential Revision: http://reviews.llvm.org/D5174 llvm-svn: 217102
* [asan] add -asan-coverage=3: instrument all blocks and critical edges. Kostya Serebryany2014-09-031-2/+11
| | | | llvm-svn: 217098
* Make some helpers static or move into the llvm namespace.Benjamin Kramer2014-09-031-1/+1
| | | | llvm-svn: 217077
* Preserve IR flags (nsw, nuw, exact, fast-math) in SLP vectorizer (PR20802).Sanjay Patel2014-09-031-5/+30
| | | | | | | | | | | | | | | | | The SLP vectorizer should propagate IR-level optimization hints/flags (nsw, nuw, exact, fast-math) when converting scalar instructions into vectors. But this isn't a simple copy - we need to take the intersection (the logical 'and') of the sets of flags on the scalars. The solution is further complicated because we can have non-uniform (non-SIMD) vector ops after: http://reviews.llvm.org/D4015 http://llvm.org/viewvc/llvm-project?view=revision&revision=211339 The vast majority of changed files are existing tests that were not propagating IR flags, but I've also added a new test file for focused testing of IR flag possibilities. Differential Revision: http://reviews.llvm.org/D5172 llvm-svn: 217051
* Change name of copyFlags() to copyIRFlags(). Add convenience method for ↵Sanjay Patel2014-09-031-1/+1
| | | | | | | | | | logical 'and' of all flags. NFC. Adding 'IR' to the names in an attempt to be less ambiguous about the flags we're dealing with here. The 'and' method is needed by the SLPVectorizer (PR20802) and possibly other passes. llvm-svn: 217004
* Add pass-manager flags to use CFL AAHal Finkel2014-09-021-0/+5
| | | | | | | Add -use-cfl-aa (and -use-cfl-aa-in-codegen) to add CFL AA in the default pass managers (for easy testing). llvm-svn: 216978
* [asan] Assign a low branch weight to ASan's slow path, patch by Jonas ↵Kostya Serebryany2014-09-022-2/+5
| | | | | | Wagner. This speeds up asan (at least on SPEC) by 1%-5% or more. Also fix lint in dfsan. llvm-svn: 216972
* Generate extract for in-tree uses if the use is scalar operand in vectorized ↵Yi Jiang2014-09-021-18/+69
| | | | | | instruction. radar://18144665 llvm-svn: 216946
* unique_ptrify the result of SpecialCaseList::createDavid Blaikie2014-09-021-1/+1
| | | | llvm-svn: 216925
* LICM: Don't crash when an instruction is used by an unreachable BBDavid Majnemer2014-09-021-1/+6
| | | | | | | | | | | | | | | | | | | Summary: BBs might contain non-LCSSA'd values after the LCSSA pass is run if they are unreachable from the entry block. Normally, the users of the instruction would be PHIs but the unreachable BBs have normal users; rewrite their uses to be undef values. An alternative fix could involve fixing this at LCSSA but that would require this invariant to hold after subsequent transforms. If a BB created an unreachable block, they would be in violation of this. This fixes PR19798. Differential Revision: http://reviews.llvm.org/D5146 llvm-svn: 216911
* SROA: Don't insert instructions before a PHIDavid Majnemer2014-09-011-1/+4
| | | | | | | | | | | | | | | SROA may decide that it needs to insert a bitcast and would set it's insertion point before a PHI. This will create an invalid module right quick. Instead, choose the first insertion point in the basic block that holds our PHI. This fixes PR20822. Differential Revision: http://reviews.llvm.org/D5141 llvm-svn: 216891
* Revert "Revert two GEP-related InstCombine commits"David Majnemer2014-09-011-11/+42
| | | | | | | | | | | | | | This reverts commit r216698 which reverted r216523 and r216598. We would attempt to perform the transformation even if the match() failed because, as a side effect, it would set V. This would trick us into believing that we correctly found a place to correctly apply the transform. An additional test case was added to getelementptr.ll so that we might not regress in the future. llvm-svn: 216890
* Add a convenience method to copy wrapping, exact, and fast-math flags (NFC).Sanjay Patel2014-09-011-13/+3
| | | | | | | | | | | | | | The loop vectorizer preserves wrapping, exact, and fast-math properties of scalar instructions. This patch adds a convenience method to make that operation easier because we need to do this in the loop vectorizer, SLP vectorizer, and possibly other places. Although this is a 'no functional change' patch, I've added a testcase to verify that the exact flag is preserved by the loop vectorizer. The wrapping and fast-math flags are already checked in existing testcases. Differential Revision: http://reviews.llvm.org/D5138 llvm-svn: 216886
* Fix a really bad miscompile introduced in r216865 - the else-if logicChandler Carruth2014-09-011-10/+14
| | | | | | | | | | | | | | | | | | chain became completely broken here as *all* intrinsic users ended up being skipped, and the ones that seemed to be singled out were actually the exact wrong set. This is a great example of why long else-if chains can be easily confusing. Switch the entire code to use early exits and early continues to have simpler (and more importantly, correct) logic here, as well as fixing the reversed logic for detecting and continuing on lifetime intrinsics. I've also significantly cleaned up the test case and added another test case demonstrating an example where the optimization is not (trivially) safe to perform. llvm-svn: 216871
* Small refactor on VectorizerHint for deduplicationRenato Golin2014-09-011-93/+160
| | | | | | | | | | | | | | | | | | | | Previously, the hint mechanism relied on clean up passes to remove redundant metadata, which still showed up if running opt at low levels of optimization. That also has shown that multiple nodes of the same type, but with different values could still coexist, even if temporary, and cause confusion if the next pass got the wrong value. This patch makes sure that, if metadata already exists in a loop, the hint mechanism will never append a new node, but always replace the existing one. It also enhances the algorithm to cope with more metadata types in the future by just adding a new type, not a lot of code. Re-applying again due to MSVC 2013 being minimum requirement, and this patch having C++11 that MSVC 2012 didn't support. Fixes PR20655. llvm-svn: 216870
* Feed AA to the inliner and use AA->getModRefBehavior in AddAliasScopeMetadataHal Finkel2014-09-014-12/+25
| | | | | | | | | | | | This feeds AA through the IFI structure into the inliner so that AddAliasScopeMetadata can use AA->getModRefBehavior to figure out which functions only access their arguments (instead of just hard-coding some knowledge of memory intrinsics). Most of the information is only available from BasicAA; this is important for preserving alias scoping information for target-specific intrinsics when doing the noalias parameter attribute to metadata conversion. llvm-svn: 216866
* Ignore lifetime intrinsics in use list for MemCpyOptimizer. Patch by Luqman ↵Nick Lewycky2014-09-011-0/+4
| | | | | | Aden, review by Hal Finkel. llvm-svn: 216865
* Fix AddAliasScopeMetadata again - alias.scope must be a complete descriptionHal Finkel2014-09-011-15/+37
| | | | | | | | | | | | | | | | | | | | | | | I thought that I had fixed this problem in r216818, but I did not do a very good job. The underlying issue is that when we add alias.scope metadata we are asserting that this metadata completely describes the aliasing relationships within the current aliasing scope domain, and so in the context of translating noalias argument attributes, the pointers must all be based on noalias arguments (as underlying objects) and have no other kind of underlying object. In r216818 excluding appropriate accesses from getting alias.scope metadata is done by looking for underlying objects that are not identified function-local objects -- but that's wrong because allocas, etc. are also function-local objects and we need to explicitly check that all underlying objects are the noalias arguments for which we're adding metadata aliasing scopes. This fixes the underlying-object check for adding alias.scope metadata, and does some refactoring of the related capture-checking eligibility logic (and adds more comments; hopefully making everything a bit clearer). Fixes self-hosting on x86_64 with -mllvm -enable-noalias-to-md-conversion (the feature is still disabled by default). llvm-svn: 216863
* Fix some cases where StringRef was being passed by const reference. Remove ↵Craig Topper2014-08-301-3/+3
| | | | | | const from some other StringRefs since its implicitly const already. llvm-svn: 216820
* Fix AddAliasScopeMetadata to not add scopes when deriving from unknown pointersHal Finkel2014-08-301-25/+51
| | | | | | | | | | | | | | | | | | | | | | | The previous implementation of AddAliasScopeMetadata, which adds noalias metadata to preserve noalias parameter attribute information when inlining had a flaw: it would add alias.scope metadata to accesses which might have been derived from pointers other than noalias function parameters. This was incorrect because even some access known not to alias with all noalias function parameters could easily alias with an access derived from some other pointer. Instead, when deriving from some unknown pointer, we cannot add alias.scope metadata at all. This fixes a miscompile of the test-suite's tramp3d-v4. Furthermore, we cannot add alias.scope to functions unless we know they access only argument-derived pointers (currently, we know this only for memory intrinsics). Also, we fix a theoretical problem with using the NoCapture attribute to skip the capture check. This is incorrect (as explained in the comment added), but would not matter in any code generated by Clang because we get only inferred nocapture attributes in Clang-generated IR. This functionality is not yet enabled by default. llvm-svn: 216818
* InstCombine: Respect recursion depth in visitUDivOperandDavid Majnemer2014-08-301-4/+4
| | | | llvm-svn: 216817
* InstCombine: Try harder to combine icmp instructionsDavid Majnemer2014-08-301-3/+25
| | | | | | | | | | | | | consider: (and (icmp X, Y), (and Z, (icmp A, B))) It may be possible to combine (icmp X, Y) with (icmp A, B). If we successfully combine, create an 'and' instruction with Z. This fixes PR20814. N.B. There is room for improvement after this change but I'm not convinced it's worth chasing yet. llvm-svn: 216814
* Fix a typo in AddAliasScopeMetadataHal Finkel2014-08-291-1/+1
| | | | llvm-svn: 216741
* Revert two GEP-related InstCombine commitsDavid Majnemer2014-08-291-40/+11
| | | | | | | This reverts commit r216523 and r216598; people have reported regressions. llvm-svn: 216698
* Don't promote byval pointer arguments when padding mattersReid Kleckner2014-08-281-3/+81
| | | | | | | | | | | | | | | | Don't promote byval pointer arguments when when their size in bits is not equal to their alloc size in bits. This can happen for x86_fp80, where the size in bits is 80 but the alloca size in bits in 128. Promoting these types can break passing unions of x86_fp80s and other types. Patch by Thomas Jablin! Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D5057 llvm-svn: 216693
* InstCombine: Remove redundant combinesDavid Majnemer2014-08-281-15/+0
| | | | | | | | InstSimplify already handles icmp (X+Y), X (and things like it) appropriately. The first thing that InstCombine does is run InstSimplify on the instruction. llvm-svn: 216659
* Fix: SLPVectorizer tried to move an instruction which was replaced by a ↵Erik Eckstein2014-08-281-4/+0
| | | | | | | | | | vector instruction. For a detailed description of the problem see the comment in the test file. The problematic moveBefore() calls are not required anymore because the new scheduling algorithm ensures a correct ordering anyway. llvm-svn: 216656
* InstSimplify: Move a transform from InstCombine to InstSimplifyDavid Majnemer2014-08-281-10/+0
| | | | | | | | Several combines involving icmp (shl C2, %X) C1 can be simplified without introducing any new instructions. Move them to InstSimplify; while we are at it, make them more powerful. llvm-svn: 216642
* InstCombine: Combine gep X, (Y-X) to YDavid Majnemer2014-08-271-14/+25
| | | | | | | | We try to perform this transform in InstSimplify but we aren't always able to. Sometimes, we need to insert a bitcast if X and Y don't have the same time. llvm-svn: 216598
* [SLP] Re-enable vectorization of GEP expressions (re-apply r210342 with a fix).Michael Zolotukhin2014-08-271-0/+101
| | | | llvm-svn: 216549
* Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or ↵Craig Topper2014-08-2712-42/+27
| | | | | | just letting them be implicitly created. llvm-svn: 216525
* Fix some cases were ArrayRefs were being passed by reference. Also remove ↵Craig Topper2014-08-271-4/+4
| | | | | | 'const' from some other ArrayRef uses since its implicitly const already. llvm-svn: 216524
* InstCombine: Optimize GEP's involving ptrtoint betterDavid Majnemer2014-08-271-11/+29
| | | | | | | | | | | | | | We supported transforming: (gep i8* X, -(ptrtoint Y)) to: (inttoptr (sub (ptrtoint X), (ptrtoint Y))) However, this only fired if 'X' had type i8*. Generalize this to support various types of different sizes. This results in much better CodeGen, especially for pointers to packed structs. llvm-svn: 216523
* Revert r210342 and r210343, add test case for the crasher.Joerg Sonnenberger2014-08-261-91/+0
| | | | | | PR 20642. llvm-svn: 216475
* This patch enables SimplifyUsingDistributiveLaws() to handle following pattens.Dinesh Dwivedi2014-08-262-54/+45
| | | | | | | | | | | | (X >> Z) & (Y >> Z) -> (X&Y) >> Z for all shifts. (X >> Z) | (Y >> Z) -> (X|Y) >> Z for all shifts. (X >> Z) ^ (Y >> Z) -> (X^Y) >> Z for all shifts. These patterns were previously handled separately in visitAnd()/visitOr()/visitXor(). Differential Revision: http://reviews.llvm.org/D4951 llvm-svn: 216443
* musttail: Don't eliminate varargs packs if there is a forwarding callReid Kleckner2014-08-261-2/+7
| | | | | | Also clean up and beef up this grep test for the feature. llvm-svn: 216425
* fix typos in commentsSanjay Patel2014-08-261-4/+4
| | | | llvm-svn: 216424
* ArgPromotion: Don't touch variadic functionsReid Kleckner2014-08-251-0/+7
| | | | | | | | | | | | | | | | Adding, removing, or changing non-pack parameters can change the ABI classification of pack parameters. Clang and other frontends encode the classification in the IR of the call site, but the callee side determines it dynamically based on the number of registers consumed so far. Changing the prototype affects the number of registers consumed would break such code. Dead argument elimination performs a similar task and already has a similar check to avoid this problem. Patch by Thomas Jablin! llvm-svn: 216421
* Modernize raw_fd_ostream's constructor a bit.Rafael Espindola2014-08-252-5/+4
| | | | | | | | | | Take a StringRef instead of a "const char *". Take a "std::error_code &" instead of a "std::string &" for error. A create static method would be even better, but this patch is already a bit too big. llvm-svn: 216393
* Remove dangling initializers in GlobalDCEBruno Cardoso Lopes2014-08-252-1/+10
| | | | | | | | | | | | | | GlobalDCE deletes global vars and updates their initializers to nullptr while leaving underlying constants to be cleaned up later by its uses. The clean up may never happen, fix this by forcing it every time it's safe to destroy constants. Final patch by Rafael Espindola http://reviews.llvm.org/D4931 <rdar://problem/17523868> llvm-svn: 216390
* MergeFunctions, tiny refactoring:Stepan Dyatkovskiy2014-08-251-3/+3
| | | | | | cmpAPFloat has been renamed to cmpAPFloats (multiple form). llvm-svn: 216376
* MergeFunctions, tiny refactoring:Stepan Dyatkovskiy2014-08-251-5/+5
| | | | | | cmpAPInt has been renamed to cmpAPInts (multiple form). llvm-svn: 216375
* MergeFunctions, tiny refactoring:Stepan Dyatkovskiy2014-08-251-12/+11
| | | | | | cmpType has been renamed to cmpTypes (multiple form). llvm-svn: 216374
* MergeFunctions, tiny refactoring:Stepan Dyatkovskiy2014-08-251-5/+5
| | | | | | cmpGEP has been renamed to cmpGEPs (multiple form). llvm-svn: 216373
* Allow vectorization of division by uniform power of 2.Karthik Bhat2014-08-252-8/+31
| | | | | | | | This patch adds support to recognize division by uniform power of 2 and modifies the cost table to vectorize division by uniform power of 2 whenever possible. Updates Cost model for Loop and SLP Vectorizer.The cost table is currently only updated for X86 backend. Thanks to Hal, Andrea, Sanjay for the review. (http://reviews.llvm.org/D4971) llvm-svn: 216371
* Use range based for loops to avoid needing to re-mention SmallPtrSet size.Craig Topper2014-08-2413-103/+58
| | | | llvm-svn: 216351
* InstCombine: Properly optimize or'ing bittests togetherDavid Majnemer2014-08-241-0/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CFE, with -03, would turn: bool f(unsigned x) { bool a = x & 1; bool b = x & 2; return a | b; } into: %1 = lshr i32 %x, 1 %2 = or i32 %1, %x %3 = and i32 %2, 1 %4 = icmp ne i32 %3, 0 This sort of thing exposes a nasty pathology in GCC, ICC and LLVM. Instead, we would rather want: %1 = and i32 %x, 3 %2 = icmp ne i32 %1, 0 Things get a bit more interesting in the following case: %1 = lshr i32 %x, %y %2 = or i32 %1, %x %3 = and i32 %2, 1 %4 = icmp ne i32 %3, 0 Replacing it with the following sequence is better: %1 = shl nuw i32 1, %y %2 = or i32 %1, 1 %3 = and i32 %2, %x %4 = icmp ne i32 %3, 0 This sequence is preferable because %1 doesn't involve %x and could potentially be hoisted out of loops if it is invariant; only perform this transform in the non-constant case if we know we won't increase register pressure. llvm-svn: 216343
OpenPOWER on IntegriCloud