summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [InstCombine] remove fptrunc (select) code; NFCISanjay Patel2018-05-211-17/+0
| | | | | | | This pattern is handled within commonCastTransforms(), so the code here is dead AFAICT. llvm-svn: 332887
* [EarlyCSE] Improve EarlyCSE of some absolute value cases.Craig Topper2018-05-211-4/+12
| | | | | | | | | | Change matchSelectPattern to return X and -X for ABS/NABS in a well defined order. Adjust EarlyCSE to account for this. Ensure the SPF result is some kind of min/max and not abs/nabs in one place in InstCombine that made me nervous. Prevously we returned the two operands of the compare part of the abs pattern. The RHS is always going to be a 0i, 1 or -1 constant. This isn't a very meaningful thing to return for any one. There's also some freedom in the abs pattern as to what happens when the value is equal to 0. This freedom led to early cse failing to match when different constants were used in otherwise equivalent operations. By returning the input and its negation in a defined order we can ensure an exact match. This also makes sure both patterns use the exact same subtract instruction for the negation. I believe CSE should evebntually make this happen and properly merge the nsw/nuw flags. But I'm not familiar with CSE and what order it does things in so it seemed like it might be good to really enforce that they were the same. Differential Revision: https://reviews.llvm.org/D47037 llvm-svn: 332865
* [VPlan] Reland r332654 and silence unused func warningDiego Caballero2018-05-219-42/+839
| | | | | | | | | | r332654 was reverted due to an unused function warning in release build. This commit includes the same code with the warning silenced. Differential Revision: https://reviews.llvm.org/D44338 llvm-svn: 332860
* [InstCombine] Fix PR37526: MinMax patterns produce an infinite loop.Alexey Bataev2018-05-211-3/+8
| | | | | | | | | | | | | | | | | Summary: This patch fixes PR37526 by simplifying the newly generated LoadInst instructions. If the pointer address is a bitcast from the pointer to the NewType, we can just remove this extra bitcast instead of creating the new one. This fixes the PR37526 + may speed up the whole compilation process. Reviewers: spatel, RKSimon, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47144 llvm-svn: 332855
* revert r332610, it breaks cfi, see D46326Nico Weber2018-05-211-82/+11
| | | | llvm-svn: 332838
* [CVP] Require DomTree for new Pass ManagerDavid Green2018-05-211-9/+13
| | | | | | | | | We were previously using a DT in CVP through SimplifyQuery, but not requiring it in the new pass manager. Hence it would crash if DT was not already available. This now gets DT directly and plumbs it through to where it is used (instead of using it through SQ). llvm-svn: 332836
* Fix up a few grammar issues.Eric Christopher2018-05-211-1/+1
| | | | llvm-svn: 332835
* [X86] Remove mask arguments from permvar builtins/intrinsics. Use a select ↵Craig Topper2018-05-201-20/+12
| | | | | | | | in IR instead. Someday maybe we'll use selects for all intrinsics. llvm-svn: 332824
* [InstCombine] choose 1 form of abs and nabs as canonicalSanjay Patel2018-05-201-0/+51
| | | | | | | | | | | | | | | We already do this for min/max (see the blob above the diff), so we should do the same for abs/nabs. A sign-bit check (<s 0) is used as a predicate for other IR transforms and it's likely the best for codegen. This might solve the motivating cases for D47037 and D47041, but I think those patches still make sense. We can't guarantee this canonicalization if the icmp has more than one use. Differential Revision: https://reviews.llvm.org/D47076 llvm-svn: 332819
* [IRCE] Fix miscompile with range checks against negative valuesMax Kazantsev2018-05-191-2/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the patch rL329547, we have lifted the over-restrictive limitation on collected range checks, allowing to work with range checks with the end of their range not being provably non-negative. However it appeared that the non-negativity of this value was assumed in the utility function `ClampedSubtract`. In particular, its reasoning is based on the fact that `0 <= SINT_MAX - X`, which is not true if `X` is negative. The function `ClampedSubtract` is only called twice, once with `X = 0` (which is OK) and the second time with `X = IRC.getEnd()`, where we may now see the problem if the end is actually a negative value. In this case, we may sometimes miscompile. This patch is the conservative fix of the miscompile problem. Rather than rejecting non-provably non-negative `getEnd()` values, we will check it for non-negativity in runtime. For this, we use function `smax(smin(X, 0), -1) + 1` that is equal to `1` if `X` is non-negative and is equal to 0 if `X` is negative. If we multiply `Begin, End` of safe iteration space by this function calculated for `X = IRC.getEnd()`, we will get the original `[Begin, End)` if `IRC.getEnd()` was non-negative (and, thus, `ClampedSubtract` worked correctly) and the empty range `[0, 0)` in case if ` IRC.getEnd()` was negative. So we in fact prohibit execution of the main loop if at least one of range checks was made against a negative value (and we figured it out in runtime). It is still better than what we have before (non-negativity had to be proved in compile time) and prevents us from miscompile, however it is sometiles too restrictive for unsigned range checks against a negative value (which in fact can be eliminated). Once we re-implement `ClampedSubtract` in a way that it handles negative `X` correctly, this limitation can be lifted, too. Differential Revision: https://reviews.llvm.org/D46860 Reviewed By: samparker llvm-svn: 332809
* [MergeICmps] Don't crash when memcmp is not availableBenjamin Kramer2018-05-191-0/+4
| | | | | | Fixes clang crashing with -fno-builtin, PR37527. llvm-svn: 332808
* Fix evaluator for non-zero alloca addr spaceYaxun Liu2018-05-191-1/+2
| | | | | | | | | | | | | | The evaluator goes through BB and creates global vars as temporary values to evaluate results of LLVM instructions. It creates undef for alloca, however it assumes alloca in addr space 0. If the next instruction is addrspace cast to 0, then we get an invalid cast instruction. This patch let the temp global var have an address space matching alloca addr space, so that the valuation can be done. Differential Revision: https://reviews.llvm.org/D47081 llvm-svn: 332794
* Constant fold launder of null and undefPiotr Padlewski2018-05-181-2/+3
| | | | | | | | | | | | | | | Summary: This might be useful because clang will add some barriers for pointer comparisons. Reviewers: majnemer, dberlin, hfinkel, nlewycky, davide, rsmith, amharc, kuhar Subscribers: davide, amharc, llvm-commits Differential Revision: https://reviews.llvm.org/D32423 llvm-svn: 332786
* [InstCombine] Qualify a select pattern based transform to restrct to only ↵Craig Topper2018-05-181-1/+1
| | | | | | min/max and ignore abs/nabs. llvm-svn: 332770
* [msan] Don't check divisor shadow in fdiv.Evgeniy Stepanov2018-05-181-7/+10
| | | | | | | | | | | | | | | | Summary: Floating point division by zero or even undef does not have undefined behavior and may occur due to optimizations. Fixes https://bugs.llvm.org/show_bug.cgi?id=37523. Reviewers: kcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D47085 llvm-svn: 332761
* Reverted r332654 as it has broken some buildbots and left unfixed for a long ↵Galina Kistanova2018-05-189-838/+42
| | | | | | | | | | | time. The introduced problem is: llvm.src/lib/Transforms/Vectorize/VPlanVerifier.cpp:29:13: error: unused function 'hasDuplicates' [-Werror,-Wunused-function] static bool hasDuplicates(const SmallVectorImpl<VPBlockBase *> &VPBlockVec) { ^ llvm-svn: 332747
* [SimplifyCFG] Fix a debug invariant bug in FoldBranchToCommonDest()David Stenberg2018-05-181-3/+5
| | | | | | | | | | | | | | | | | | | | | Summary: Fix a case where FoldBranchToCommonDest() would bail out from doing CSE when encountering a debug intrinsic. Handle that by skipping past the debug intrinsics. Also, as a minor refactoring, rename checkCSEInPredecessor() to tryCSEWithPredecessor() to make it a bit more clear that the function may remove instructions. Reviewers: fhahn, craig.topper, dblaikie, xbolva00 Reviewed By: fhahn, xbolva00 Subscribers: vsk, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D46635 llvm-svn: 332698
* [asan] Add instrumentation support for MyriadWalter Lee2018-05-181-1/+33
| | | | | | | | | | | | | | | | | | | | 1. Define Myriad-specific ASan constants. 2. Add code to generate an outer loop that checks that the address is in DRAM range, and strip the cache bit from the address. The former is required because Myriad has no memory protection, and it is up to the instrumentation to range-check before using it to index into the shadow memory. 3. Do not add an unreachable instruction after the error reporting function; on Myriad such function may return if the run-time has not been initialized. 4. Add a test. Differential Revision: https://reviews.llvm.org/D46451 llvm-svn: 332692
* [WebAssembly] Add Wasm personality and isScopedEHPersonality()Heejin Ahn2018-05-176-11/+12
| | | | | | | | | | | | | | | | | | | | | Summary: - Add wasm personality function - Re-categorize the existing `isFuncletEHPersonality()` function into two different functions: `isFuncletEHPersonality()` and `isScopedEHPersonality(). This becomes necessary as wasm EH uses scoped EH instructions (catchswitch, catchpad/ret, and cleanuppad/ret) but not outlined funclets. - Changed some callsites of `isFuncletEHPersonality()` to `isScopedEHPersonality()` if they are related to scoped EH IR-level stuff. Reviewers: majnemer, dschuff, rnk Subscribers: jfb, sbc100, jgravelle-google, eraman, JDevlieghere, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D45559 llvm-svn: 332667
* [LV][VPlan] Build plain CFG with simple VPInstructions for outer loops.Diego Caballero2018-05-179-42/+838
| | | | | | | | | | | | | | | | | | | Patch #3 from VPlan Outer Loop Vectorization Patch Series #1 (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html). Expected to be NFC for the current inner loop vectorization path. It introduces the basic algorithm to build the VPlan plain CFG (single-level CFG, no hierarchical CFG (H-CFG), yet) in the VPlan-native vectorization path using VPInstructions. It includes: - VPlanHCFGBuilder: Main class to build the VPlan H-CFG (plain CFG without nested regions, for now). - VPlanVerifier: Main class with utilities to check the consistency of a H-CFG. - VPlanBlockUtils: Main class with utilities to manipulate VPBlockBases in VPlan. Reviewers: rengolin, fhahn, mkuper, mssimpso, a.elovikov, hfinkel, aprantl. Differential Revision: https://reviews.llvm.org/D44338 llvm-svn: 332654
* Add a limit for phi folding instcombineXinliang David Li2018-05-171-1/+9
| | | | | | Differential Revision: http://reviews.llvm.org/D47023 llvm-svn: 332653
* [InstCombine] Propagate the nsw/nuw flags from the add in the 'shifty' abs ↵Craig Topper2018-05-171-1/+5
| | | | | | | | | | | | pattern to the sub in the select version. According to alive this is valid. I'm hoping to use this to make an assumption that the sign bit is zero after this sequence. The only way it wouldn't be is if the input was INT__MIN, but by preserving the flags we can make doing this to INT_MIN UB. The nuw flags is weird because it creates such a contradiction that the original number would have to be positive meaning we could remove the select entirely, but we don't get that far. Differential Revision: https://reviews.llvm.org/D46988 llvm-svn: 332623
* In thin and full LTO + CFI, direct function calls may go through jump tableDmitry Mikulin2018-05-171-11/+82
| | | | | | | | | entries to reach the target. Since these calls don't require type checks, we can short-circuit them to their real targets. Differential Revision: https://reviews.llvm.org/D46326 llvm-svn: 332610
* [SROA] Handle PHI with multiple duplicate predecessorsBjorn Pettersson2018-05-171-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The verifier accepts PHI nodes with multiple entries for the same basic block, as long as the value is the same. As seen in PR37203, SROA did not handle such PHI nodes properly when speculating loads over the PHI, since it inserted multiple loads in the predecessor block and changed the PHI into having multiple entries for the same basic block, but with different values. This patch teaches SROA to reuse the same speculated load for each PHI duplicate entry in such situations. Resolves: https://bugs.llvm.org/show_bug.cgi?id=37203 Reviewers: uabelho, chandlerc, hfinkel, bkramer, efriedma Reviewed By: efriedma Subscribers: dberlin, efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D46426 llvm-svn: 332577
* [SROA] pr37267: fix assertion failure in integer wideningHiroshi Inoue2018-05-171-0/+8
| | | | | | | | | | The current integer widening does not support rewriting partial split slices in rewriteIntegerStore (and rewriteIntegerLoad). This patch adds explicit checks for this case in isIntegerWideningViableForSlice. Before r322533, splitting is allowed only for the whole-alloca slice and hence the above case is implicitly rejected by another check `if (DL.getTypeStoreSize(ValueTy) > Size)` because whole-alloca slice is larger than the partition. Differential Revision: https://reviews.llvm.org/D46750 llvm-svn: 332575
* [STLExtras] Add size() for ranges, and remove distance()Vedant Kumar2018-05-162-2/+3
| | | | | | | | | | r332057 introduced distance() for ranges. Based on post-commit feedback, this renames distance() to size(). The new size() is also only enabled when the operation is O(1). Differential Revision: https://reviews.llvm.org/D46976 llvm-svn: 332551
* [InstCombine] Fix the signature of fgets_unlocked.Benjamin Kramer2018-05-161-2/+2
| | | | | | | It returns a pointer, not an int. This miscompiles all code that uses the return value of fgets. llvm-svn: 332531
* [InstCombine] allow more binop (shuffle X), C transformsSanjay Patel2018-05-161-2/+4
| | | | | | | | | The canonicalization was restricted to shuffle masks with a 1-to-1 mapping to the constant vector, but that disqualifies the common splat pattern. This is part of solving PR37463: https://bugs.llvm.org/show_bug.cgi?id=37463 llvm-svn: 332479
* [SimplifyLibcalls] Replace locked IO with unlocked IODavid Bolvansky2018-05-162-19/+217
| | | | | | | | | | | | | | Summary: If file stream arg is not captured and source is fopen, we could replace IO calls by unlocked IO ("_unlocked" function variants) to gain better speed, Reviewers: efriedma, RKSimon, spatel, sanjoy, hfinkel, majnemer, lebedev.ri, rja Reviewed By: rja Subscribers: rja, srhines, efriedma, lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D45736 llvm-svn: 332452
* [LoopUnroll] Split out simplify code after Unroll into a new function. NFCDavid Green2018-05-161-34/+46
| | | | | | | | | So that it can be shared with other passes that may end up doing the same thing. Differential Revision: https://reviews.llvm.org/D45874 llvm-svn: 332450
* [ObjCARC] Prevent code motion into a catchswitchShoaib Meenai2018-05-162-0/+6
| | | | | | | | | | | A catchswitch must be the only non-phi instruction in its basic block; attempting to move a retain or release into a catchswitch basic block will result in invalid IR. Explicitly mark a CFG hazard in this case to prevent the code motion. Differential Revision: https://reviews.llvm.org/D46482 llvm-svn: 332430
* Fix LSR compile time hang.Evgeny Stupachenko2018-05-161-1/+6
| | | | | | | | | | | | | Summary: Limit number of reassociations in GenerateReassociationsImpl. Reviewers: qcolombet, mkazantsev Differential Revision: https://reviews.llvm.org/D46039 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 332426
* [InstCombine] fix binop (shuffle X), C --> shuffle (binop X, C') to check usesSanjay Patel2018-05-151-1/+1
| | | | llvm-svn: 332407
* StructurizeCFG: fix inverting conditionsMarek Olsak2018-05-151-2/+3
| | | | | | | | | | | | | | | | Author: Samuel Pitoiset Without this patch, it appears to me that we are selecting the wrong operand when inverting conditions. In the attached test, it will select %tmp3 instead of %tmp4. To fix it, just use 'A' as everywhere. This fixes a regression introduced by "[PatternMatch] define m_Not using m_Xor and cst_pred_ty" https://reviews.llvm.org/D46351 llvm-svn: 332403
* [msan] Instrument masked.store, masked.load intrinsics.Evgeniy Stepanov2018-05-151-0/+87
| | | | | | | | | | | | Summary: Instrument masked store/load intrinsics. Reviewers: kcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D46785 llvm-svn: 332402
* [InstCombine] clean up code for binop-shuffle transforms; NFCISanjay Patel2018-05-151-39/+34
| | | | llvm-svn: 332399
* [InstCombine] fix binop-of-shuffles to check usesSanjay Patel2018-05-151-12/+10
| | | | llvm-svn: 332375
* [MergeFunctions] Fix merging of small weak functionswhitequark2018-05-151-35/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | When two interposable functions are merged, we cannot replace uses and have to emit calls to a common internal function. However, writeThunk() will not actually emit a thunk if the function is too small. This leaves us in a broken state where mergeTwoFunctions already rewired the functions, but writeThunk doesn't do anything. This patch changes the implementation so that: * writeThunk() does just that. * The direct replacement of calls is moved into mergeTwoFunctions() into the non-interposable case only. * isThunkProfitable() is extracted and will be called for the non-iterposable case always, and in the interposable case only if uses are still left after replacement. This issue has been introduced in https://reviews.llvm.org/D34806, where the code for checking thunk profitability has been moved. Differential Revision: https://reviews.llvm.org/D46804 Reviewed By: whitequark llvm-svn: 332342
* [NFC] Add const to method signatureMax Kazantsev2018-05-151-1/+1
| | | | llvm-svn: 332317
* [InstCombine] fix crash due to ignored addrspacecastKeno Fischer2018-05-141-2/+3
| | | | | | | | | | | | | | | | | | Summary: Part of the InstCombine code for simplifying GEPs looks through addrspacecasts. However, this was done by updating a variable also used by the next transformation, for marking GEPs as inbounds. This led to replacing a GEP with a similar instruction in a different addrspace, which caused an assertion failure in RAUW. This caused julia issue https://github.com/JuliaLang/julia/issues/27055 Patch by Jeff Bezanson <jeff@juliacomputing.com> Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D46722 llvm-svn: 332302
* [AggressiveInstCombine] avoid crashing on unsimplified code (PR37446)Sanjay Patel2018-05-141-0/+4
| | | | | | | | | This bug: https://bugs.llvm.org/show_bug.cgi?id=37446 ...raises another question: why do we run aggressive-instcombine before regular instcombine? llvm-svn: 332243
* Rename DEBUG macro to LLVM_DEBUG.Nicola Zaghen2018-05-14111-2347/+2573
| | | | | | | | | | | | | | | | The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
* Test commit access.Nicola Zaghen2018-05-141-1/+1
| | | | | | Remove trailing whitespace. llvm-svn: 332220
* [X86] Remove and autoupgrade the cvtusi2sd intrinsic. Use ↵Craig Topper2018-05-141-1/+0
| | | | | | uitofp+insertelement instead. llvm-svn: 332206
* [X86] Extend instcombine folds for pclmuldq intrinsics to the 256 and 512 ↵Craig Topper2018-05-131-9/+12
| | | | | | bit version. llvm-svn: 332202
* [X86] Remove and autoupgrade masked vpermd/vpermps intrinsics.Craig Topper2018-05-131-2/+0
| | | | llvm-svn: 332198
* [X86] Remove an autoupgrade legacy cvtss2sd intrinsics.Craig Topper2018-05-131-1/+0
| | | | llvm-svn: 332187
* [X86] Remove and autoupgrade cvtsi2ss/cvtsi2sd intrinsics to match what ↵Craig Topper2018-05-121-4/+0
| | | | | | clang has used for a very long time. llvm-svn: 332186
* Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading."Michael Zolotukhin2018-05-121-26/+30
| | | | | | | | | Stage3/stage4 bootstrap miscompares should be fixed by a non-determinism fix in IDF (r332167). This reverts commit r330446. llvm-svn: 332168
* [CodeExtractor] Allow extracting blocks with exception handlingSergey Dmitriev2018-05-111-27/+91
| | | | | | | | | | | | | | | | | This is a CodeExtractor improvement which adds support for extracting blocks which have exception handling constructs if that is legal to do. CodeExtractor performs validation checks to ensure that extraction is legal when it finds invoke instructions or EH pads (landingpad, catchswitch, or cleanuppad) in blocks to be extracted. I have also added an option to allow extraction of blocks with alloca instructions, but no validation is done for allocas. CodeExtractor caller has to validate it himself before allowing alloca instructions to be extracted. By default allocas are still not allowed in extraction blocks. Differential Revision: https://reviews.llvm.org/D45904 llvm-svn: 332151
OpenPOWER on IntegriCloud