path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [PM] LoopSimplify. Remove unneeded pass dependencies. NFCI.Davide Italiano2016-06-081-3/+0
| | | | llvm-svn: 272140
* [PM/SimplifyCFG] Preserve GlobalsAA even if the IR is mutated.Davide Italiano2016-06-081-4/+5
| | | | llvm-svn: 272139
* Avoid copies of std::strings and APInt/APFloats where we only read from itBenjamin Kramer2016-06-087-14/+14
| | | | | | | | As suggested by clang-tidy's performance-unnecessary-copy-initialization. This can easily hit lifetime issues, so I audited every change and ran the tests under asan, which came back clean. llvm-svn: 272126
* [PM] Preserve GlobalsAA for SROA.Davide Italiano2016-06-071-1/+6
| | | | | | Differential Revision: http://reviews.llvm.org/D21040 llvm-svn: 272009
* [InstCombine][AVX2] Add support for simplifying AVX2 per-element shifts to native shiftsSimon Pilgrim2016-06-071-0/+125
| | | | Unlike native shifts, the AVX2 per-element shift instructions VPSRAV/VPSRLV/VPSLLV handle out of range shift values (logical shifts set the result to zero, arithmetic shifts splat the sign bit). If the shift amount is constant we can sometimes convert these instructions to native shifts: 1 - if all shift amounts are in range then the conversion is trivial. 2 - out of range arithmetic shifts can be clamped to the (bitwidth - 1) (a legal shift amount) before conversion. 3 - logical shifts just return zero if all elements have out of range shift amounts. In addition, UNDEF shift amounts are handled - either as an UNDEF shift amount in a native shift or as an UNDEF in the logical 'all out of range' zero constant special case for logical shifts. Differential Revision: http://reviews.llvm.org/D19675 llvm-svn: 271996
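The three cases listed in this commit message can be sketched outside of LLVM. The following is a minimal model, assuming four 32-bit lanes and ignoring UNDEF amounts; `lshrLane`, `ashrLane`, `foldConstantShift` and the `Fold` enum are hypothetical names for illustration, not the InstCombine implementation.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// All names below are illustrative; none of them are LLVM APIs.
constexpr uint32_t kLaneBits = 32;
using Amounts = std::array<uint32_t, 4>;

// VPSRLV-style logical right shift of one lane: out-of-range amounts give zero.
inline uint32_t lshrLane(uint32_t value, uint32_t amount) {
  return amount >= kLaneBits ? 0u : (value >> amount);
}

// VPSRAV-style arithmetic right shift of one lane, on raw bit patterns:
// out-of-range amounts behave like a shift by (bitwidth - 1), a sign splat.
inline uint32_t ashrLane(uint32_t value, uint32_t amount) {
  const uint32_t clamped = amount >= kLaneBits ? kLaneBits - 1 : amount;
  const bool negative = (value >> (kLaneBits - 1)) != 0;
  return negative ? ~(~value >> clamped) : (value >> clamped);
}

enum class Fold { NativeShift, ZeroConstant, None };

struct FoldResult {
  Fold kind;
  Amounts amounts;  // Meaningful only when kind == Fold::NativeShift.
};

// Decide how a per-element shift with constant amounts could be rewritten.
inline FoldResult foldConstantShift(const Amounts &amounts, bool isArithmetic) {
  std::size_t outOfRange = 0;
  Amounts clamped = amounts;
  for (uint32_t &a : clamped) {
    if (a >= kLaneBits) {
      ++outOfRange;
      a = kLaneBits - 1;
    }
  }
  if (outOfRange == 0)
    return {Fold::NativeShift, amounts};     // Case 1: everything in range.
  if (isArithmetic)
    return {Fold::NativeShift, clamped};     // Case 2: clamp to bitwidth - 1.
  if (outOfRange == amounts.size())
    return {Fold::ZeroConstant, Amounts{}};  // Case 3: logical, all out of range.
  return {Fold::None, Amounts{}};            // Mixed logical amounts: no fold here.
}
```

In this sketch a logical shift with a mix of in-range and out-of-range amounts is left alone, which is one reading of "sometimes" in the message above.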
* [InstCombine][SSE] Add MOVMSK constant folding (PR27982)Simon Pilgrim2016-06-071-0/+51
| | | | | | | | | | This patch adds support for folding undef/zero/constant inputs to MOVMSK instructions. The SSE/AVX versions can be fully folded, but the MMX version can only handle undef inputs. Differential Revision: http://reviews.llvm.org/D20998 llvm-svn: 271990
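For a fully constant input, a MOVMSK fold amounts to collecting the most significant bit of each element into an integer. A minimal sketch, assuming 32-bit lanes given as signed integers and at most 32 lanes; `foldMovmsk` is a hypothetical helper, not the function added by the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative only: bit i of the result is the sign bit of lane i.
// Assumes lanes.size() <= 32 so every lane has a bit in the result.
inline uint32_t foldMovmsk(const std::vector<int32_t> &lanes) {
  uint32_t mask = 0;
  for (std::size_t i = 0; i < lanes.size(); ++i) {
    if (lanes[i] < 0)  // A negative lane has its most significant bit set.
      mask |= (1u << i);
  }
  return mask;
}
```

An all-zero input folds to 0 under this model, which matches the zero case mentioned in the message.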
* [InstCombine] scalarizePHI should not assume the code it sees has been CSE'dMichael Kuperstein2016-06-061-12/+26
| | | | | | | | | | | | | | scalarizePHI only looked for phis that have exactly two uses - the "latch" use, and an extract. Unfortunately, we can not assume all equivalent extracts are CSE'd, since InstCombine itself may create an extract which is a duplicate of an existing one. This extends it to handle several distinct extracts from the same index. This should fix at least some of the performance regressions from PR27988. Differential Revision: http://reviews.llvm.org/D20983 llvm-svn: 271961
* [PM] Preserve the correct set of analyses for GVN.Davide Italiano2016-06-061-1/+6
| | | | llvm-svn: 271934
* [GVN] Switch dump() definition over to LLVM_DUMP_METHOD.Davide Italiano2016-06-061-2/+1
| | | | llvm-svn: 271932
* Reapply [LSR] Create fewer redundant instructions.Geoff Berry2016-06-061-20/+22
| | | | | | | | | | | | | | | | | | | Summary: Fix LSRInstance::HoistInsertPosition() to check the original insert position block first for a canonical insertion point that is dominated by all inputs. This leads to SCEV being able to reuse more instructions since it currently tracks the instructions it creates for reuse by keeping a table of <Value, insert point> pairs. Originally reviewed in http://reviews.llvm.org/D18001 Reviewers: atrick Subscribers: llvm-commits, mzolotukhin, mcrosier Differential Revision: http://reviews.llvm.org/D18480 llvm-svn: 271929
* [InstCombine] limit icmp transform to ConstantInt (PR28011)Sanjay Patel2016-06-061-3/+5
| | | | | | | | | | | | | | | In r271810 ( http://reviews.llvm.org/rL271810 ), I loosened the check above this to work for any Constant rather than ConstantInt. AFAICT, that part makes sense if we can determine that the shrunken/extended constant remained equal. But it doesn't make sense for this later transform where we assume that the constant DID change. This could assert for a ConstantExpr: https://llvm.org/bugs/show_bug.cgi?id=28011 And it could be wrong for a vector as shown in the added regression test. llvm-svn: 271908
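The phrase "the shrunken/extended constant remained equal" describes a round-trip check: narrow the constant, widen it again, and compare with the original. A minimal sketch for scalar integers, assuming an i32 constant compared against a value extended from i8; both helper names are hypothetical.

```cpp
#include <cstdint>

// Illustrative only: does a 32-bit constant survive i32 -> i8 -> zext i32?
inline bool survivesZExtRoundTrip(uint32_t c) {
  return static_cast<uint32_t>(static_cast<uint8_t>(c)) == c;
}

// Illustrative only: does a 32-bit constant survive i32 -> i8 -> sext i32?
// Equivalent to asking whether the value fits in a signed 8-bit integer.
inline bool survivesSExtRoundTrip(int32_t c) {
  return c >= INT8_MIN && c <= INT8_MAX;
}
```

When the check fails, the constant changed, which is the situation the later transform assumes. That reasoning is only straightforward for a single known integer, which is consistent with restricting the transform to ConstantInt.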
* LICM: Don't sink stores out of loops that may throw.Eli Friedman2016-06-051-0/+10
| | | | | | | | | | | | | | | | Summary: This hasn't been caught before because it requires noalias or similarly strong alias analysis to actually reproduce. Fixes http://llvm.org/PR27952 . Reviewers: hfinkel, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20944 llvm-svn: 271858
* Add safety check to InstCombiner::commonIRemTransformsSanjoy Das2016-06-051-2/+11
| | | | | | | | | | | | | | | | Since FoldOpIntoPhi speculates the binary operation to potentially each of the predecessors of the PHI node (pulling it out of arbitrary control dependence in the process), we can FoldOpIntoPhi only if we know the operation doesn't have UB. This also brings up an interesting profitability question -- the way it is written today, commonIRemTransforms will hoist out work from dynamically dead code into code that will execute at runtime. Perhaps that isn't the best canonicalization? Fixes PR27968. llvm-svn: 271857
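One conservative way to know that a signed remainder has no UB is to require a constant divisor that is neither 0 (division by zero) nor -1 (overflow when the dividend is the minimum value). The sketch below models that rule; `Divisor` and `isSRemSafeToSpeculate` are hypothetical names, and this is not necessarily the check the commit adds.

```cpp
#include <cstdint>

// Illustrative only: a divisor that is either a known constant or unknown.
struct Divisor {
  bool isConstant;
  int32_t value;  // Meaningful only when isConstant is true.
};

// Returns true when `x srem d` is well defined for every 32-bit dividend x.
inline bool isSRemSafeToSpeculate(const Divisor &d) {
  if (!d.isConstant)
    return false;  // An unknown divisor might be zero at runtime.
  return d.value != 0 && d.value != -1;
}
```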
* [PM] Port IndVarSimplify to the new pass managerSanjoy Das2016-06-051-0/+27
| | | | | | | | | | | | | | | | | Summary: There are some rough corners, since the new pass manager doesn't have (as far as I can tell) LoopSimplify and LCSSA, so I've updated the tests to run them separately in the old pass manager in the lit tests. We also don't have an equivalent for AU.setPreservesCFG() in the new pass manager, so I've left a FIXME. Reviewers: bogner, chandlerc, davide Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20783 llvm-svn: 271846
* [IndVars] Remove -liv-reduceSanjoy Das2016-06-052-77/+0
| | | | | | | | | | It is an off-by-default option that no one seems to use[0], and given that SCEV directly understands the overflow intrinsics there is no real need for it anymore. [0]: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098181.html llvm-svn: 271845
* [InstCombine] allow vector icmp bool transformsSanjay Patel2016-06-051-1/+1
| | | | llvm-svn: 271843
* fix documentation comments and other clean-ups; NFCSanjay Patel2016-06-051-74/+67
| | | | llvm-svn: 271839
* [PM] Port GCOVProfiler pass to the new pass managerXinliang David Li2016-06-051-1/+13
| | | | llvm-svn: 271823
* [PM] code refactoring /NFCXinliang David Li2016-06-052-73/+83
| | | | llvm-svn: 271822
* [InstCombine] less 'CI' confusion; NFCSanjay Patel2016-06-051-26/+26
| | | | | | | | | Change the name of the ICmpInst to 'ICmp' and the Constant (was a ConstantInt) to 'C', so that it's hopefully clearer that 'CI' refers to CastInst in this context. While we're scrubbing, fix the documentation comment and use 'auto' with 'dyn_cast'. llvm-svn: 271817
* [SimplifyCFG] Don't kill empty cleanuppads with multiple usesDavid Majnemer2016-06-041-0/+5
| | | | A basic block could contain:

    %cp = cleanuppad []
    cleanupret from %cp unwind to caller

This basic block is empty and is thus a candidate for removal. However, there can be other uses of %cp outside of this basic block. This is only possible in unreachable blocks. Make our transform more correct by checking that the pad has a single user before removing the BB. This fixes PR28005. llvm-svn: 271816
* [InstCombine] allow vector constants for cast+icmp foldSanjay Patel2016-06-041-1/+1
| | | | | | | This is step 1 of unknown towards fixing PR28001: https://llvm.org/bugs/show_bug.cgi?id=28001 llvm-svn: 271810
* clean-up; NFCSanjay Patel2016-06-041-4/+3
| | | | llvm-svn: 271807
* fix formatting, punctuation; NFCSanjay Patel2016-06-041-5/+3
| | | | llvm-svn: 271804
* [InstCombine][MMX] Extend SimplifyDemandedUseBits MOVMSK support to MMXSimon Pilgrim2016-06-041-3/+9
| | | | | | | | Add the MMX implementation to the SimplifyDemandedUseBits SSE/AVX MOVMSK support added in D19614. Requires a minor tweak as llvm.x86.mmx.pmovmskb takes an x86_mmx argument, so we have to be explicit about the implied v8i8 vector type. llvm-svn: 271789
* [pgo] extend r271532 to darwin platformXinliang David Li2016-06-031-4/+4
| | | | llvm-svn: 271746
* [esan|wset] Optionally assume intra-cache-line accessesDerek Bruening2016-06-031-2/+16
| | | | | | | | | | | | | | | | | | Summary: Adds an option -esan-assume-intra-cache-line which causes esan to assume that a single memory access touches just one cache line, even if it is not aligned, for better performance at a potential accuracy cost. Experiments show that the performance difference can be 2x or more, and accuracy loss is typically negligible, so we turn this on by default. This currently applies just to the working set tool. Reviewers: aizatsky Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits Differential Revision: http://reviews.llvm.org/D20978 llvm-svn: 271743
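The accuracy trade-off can be illustrated by comparing an exact count of cache lines touched by an access with the count under the intra-cache-line assumption. This is a sketch only: the 64-byte line size and both function names are assumptions, not taken from the esan sources.

```cpp
#include <cstdint>

// Assumed cache line size for illustration.
constexpr uint64_t kCacheLineSize = 64;

// Exact number of cache lines covered by the byte range [addr, addr + size).
inline uint64_t linesTouchedExact(uint64_t addr, uint64_t size) {
  if (size == 0)
    return 0;
  return (addr + size - 1) / kCacheLineSize - addr / kCacheLineSize + 1;
}

// Under the assumption, only the line containing addr is considered,
// so no per-access straddle check is needed.
inline uint64_t linesTouchedAssumed(uint64_t /*addr*/, uint64_t size) {
  return size == 0 ? 0 : 1;
}
```

An unaligned 8-byte access at address 60 touches two lines exactly but is counted as one under the assumption, which is the kind of accuracy loss the message describes.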
* [esan] Specify which tool via a global variableDerek Bruening2016-06-031-0/+13
| | | | | | | | | | | | | | | Summary: Adds a global variable to specify the tool, to support handling early interceptors that invoke instrumented code and require shadow memory to be initialized prior to __esan_init() being invoked. Reviewers: aizatsky Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits Differential Revision: http://reviews.llvm.org/D20973 llvm-svn: 271715
* [InstCombine] look through bitcasts to find selectsSanjay Patel2016-06-031-18/+49
| | | | There was concern that creating bitcasts for the simpler potential select pattern:

    define <2 x i64> @vecBitcastOp1(<4 x i1> %cmp, <2 x i64> %a) {
      %a2 = add <2 x i64> %a, %a
      %sext = sext <4 x i1> %cmp to <4 x i32>
      %bc = bitcast <4 x i32> %sext to <2 x i64>
      %and = and <2 x i64> %a2, %bc
      ret <2 x i64> %and
    }

might lead to worse code for some targets, so this patch is matching the larger patterns seen in the test cases.

The motivating example for this patch is this IR produced via SSE intrinsics in C:

    define <2 x i64> @gibson(<2 x i64> %a, <2 x i64> %b) {
      %t0 = bitcast <2 x i64> %a to <4 x i32>
      %t1 = bitcast <2 x i64> %b to <4 x i32>
      %cmp = icmp sgt <4 x i32> %t0, %t1
      %sext = sext <4 x i1> %cmp to <4 x i32>
      %t2 = bitcast <4 x i32> %sext to <2 x i64>
      %and = and <2 x i64> %t2, %a
      %neg = xor <4 x i32> %sext, <i32 -1, i32 -1, i32 -1, i32 -1>
      %neg2 = bitcast <4 x i32> %neg to <2 x i64>
      %and2 = and <2 x i64> %neg2, %b
      %or = or <2 x i64> %and, %and2
      ret <2 x i64> %or
    }

For an AVX target, this is currently:

    vpcmpgtd %xmm1, %xmm0, %xmm2
    vpand %xmm0, %xmm2, %xmm0
    vpandn %xmm1, %xmm2, %xmm1
    vpor %xmm1, %xmm0, %xmm0
    retq

With this patch, it becomes:

    vpmaxsd %xmm1, %xmm0, %xmm0

Differential Revision: http://reviews.llvm.org/D20774 llvm-svn: 271676
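The reason the and/xor/and/or sequence in the motivating example collapses to a single signed max can be checked lane by lane. The sketch below models four 32-bit lanes as raw bit patterns; `maskSelectBits` and `laneMaxBits` are hypothetical names for illustration and are not part of the patch.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

using Lanes = std::array<int32_t, 4>;
using Bits = std::array<uint32_t, 4>;

// Illustrative only: (sext(a > b) & a) | (~sext(a > b) & b), per lane.
inline Bits maskSelectBits(const Lanes &a, const Lanes &b) {
  Bits out{};
  for (std::size_t i = 0; i < a.size(); ++i) {
    // Sign-extending an i1 compare result gives all ones or all zeros.
    const uint32_t mask = a[i] > b[i] ? 0xFFFFFFFFu : 0u;
    out[i] = (mask & static_cast<uint32_t>(a[i])) |
             (~mask & static_cast<uint32_t>(b[i]));
  }
  return out;
}

// Illustrative only: the per-lane signed maximum, as bit patterns.
inline Bits laneMaxBits(const Lanes &a, const Lanes &b) {
  Bits out{};
  for (std::size_t i = 0; i < a.size(); ++i)
    out[i] = static_cast<uint32_t>(std::max(a[i], b[i]));
  return out;
}
```

Because the mask is all ones or all zeros within each 32-bit lane, viewing the same bits as <2 x i64> for the and/or operations does not change the result.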
* [esan|cfrag] Instrument GEP instr for struct field access.Qin Zhao2016-06-031-0/+61
| | | | | | | | | | | | | | | Summary: Instrument GEP instructions to count the number of struct field address calculations, as an approximation of the number of struct field accesses. Adds test struct_field_count_basic.ll to test the struct field instrumentation. Reviewers: bruening, aizatsky Subscribers: junbuml, zhaoqin, llvm-commits, eugenis, vitalybuka, kcc, bruening Differential Revision: http://reviews.llvm.org/D20892 llvm-svn: 271619
* [LoopUnroll] Set correct thresholds for new recently enabled unrolling heuristic.Michael Zolotukhin2016-06-031-2/+2
| | | | In r270478, where I enabled the new heuristic, I posted testing results that I got when explicitly passing the threshold values via CL options. However, setting the CL options' init-values is not enough to change the default values of the thresholds, so I'm changing them in another place now. llvm-svn: 271615
* [TailRecursionElimination] Refactor/cleanup.Davide Italiano2016-06-021-150/+121
| | | | | | | | | In preparation for porting to the new PM. Patch by Jake VanAdrighem! (review mainly by me/Justin) Differential Revision: http://reviews.llvm.org/D20610 llvm-svn: 271607
* [PM] Schedule InstSimplify after late LICM run, to clean up LCSSA nodes.Manuel Jacob2016-06-021-0/+3
| | | | | | | | | | | | | Summary: The module pass pipeline includes a late LICM run after loop unrolling. LCSSA is implicitly run as a pass dependency of LICM. However, no cleanup pass was run after this, so the LCSSA nodes ended up in the optimized output. Reviewers: hfinkel, mehdi_amini Subscribers: majnemer, bruno, mzolotukhin, mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D20606 llvm-svn: 271602
* [PM] LoadCombine preserves GlobalsAA, doesn't depend on it.Davide Italiano2016-06-021-1/+0
| | | | llvm-svn: 271601
* [PM/LoadCombine] Inline getAnalysisUsage(). NFCI.Davide Italiano2016-06-021-8/+5
| | | | llvm-svn: 271600
* transform obscured FP sign bit ops into a fabs/fneg using TLI hookSanjay Patel2016-06-021-18/+0
| | | | | | | | | | | | | | | | | | | This is effectively a revert of: http://reviews.llvm.org/rL249702 - [InstCombine] transform masking off of an FP sign bit into a fabs() intrinsic call (PR24886) and: http://reviews.llvm.org/rL249701 - [ValueTracking] teach computeKnownBits that a fabs() clears sign bits and a reimplementation as a DAG combine for targets that have IEEE754-compliant fabs/fneg instructions. This is intended to resolve the objections raised on the dev list: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098154.html and: https://llvm.org/bugs/show_bug.cgi?id=24886#c4 In the interest of patch minimalism, I've only partly enabled AArch64. PowerPC, MIPS, x86 and others can enable later. Differential Revision: http://reviews.llvm.org/D19391 llvm-svn: 271573
* [InstCombine] remove guard for generating a vector selectSanjay Patel2016-06-021-15/+11
| | | | | | | | | | | | | | | | | This is effectively NFC because we already do this transform after r175380: http://reviews.llvm.org/rL175380 and also via foldBoolSextMaskToSelect(). This change should just make it a bit more efficient to match the pattern. The original guard was added in r95058: http://reviews.llvm.org/rL95058 A sampling of codegen for current in-tree targets shows no problems. This makes sense given that we're already producing the vector selects via the other transforms. llvm-svn: 271554
* [esan|cfrag] Create the cfrag struct array for the runtimeQin Zhao2016-06-021-5/+115
| | | | | | | | | | | | | | Summary: Fills the cfrag struct variable with an array of struct information variables. Reviewers: aizatsky, bruening Subscribers: bruening, kcc, vitalybuka, eugenis, llvm-commits, zhaoqin Differential Revision: http://reviews.llvm.org/D20661 llvm-svn: 271547
* [profile] value profiling bug fix -- missing icall targets in profile-useXinliang David Li2016-06-021-1/+7
| | | | | | | | | | | | | | | Inline virtual functions have linkonce_odr linkage (emitted in comdat on supporting targets). If the vtable for the class is not emitted in the defining module, the function won't be address taken, thus its address is not recorded. At the mercy of the linker, if the per-func prf_data from this module (in comdat) is picked at link time, we will lose the mapping from function address to its hash val. This leads to missing icall promotion. The second test case (currently disabled) in compiler_rt (r271528): instrprof-icall-prom.test demonstrates the bug. The first profile-use subtest is fine due to linker order difference. With this change, no missing icall targets are found in instrumented clang's raw profile. llvm-svn: 271532
* make icall pass name consistent /NFCXinliang David Li2016-06-021-3/+3
| | | | llvm-svn: 271467
* [asan] Rename *UAR* into *UseAfterReturn*Vitaly Buka2016-06-021-7/+7
| | | | | | | | | | | | | | | Summary: To improve readability. PR27453 Reviewers: kcc, eugenis, aizatsky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20761 llvm-svn: 271447
* [MemorySSA] Port to new pass managerGeoff Berry2016-06-012-67/+56
| | | | | | | | | | | | | | | | | Add support for the new pass manager to MemorySSA pass. Change MemorySSA to be computed eagerly upon construction. Change MemorySSAWalker to be owned by the MemorySSA object that creates it. Reviewers: dberlin, george.burgess.iv Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19664 llvm-svn: 271432
* [LV] For some IVs, use vector phis instead of widening in the loop bodyMichael Kuperstein2016-06-011-20/+76
| | | | | | | | | | | | | Previously, whenever we needed a vector IV, we would create it on the fly, by splatting the scalar IV and adding a step vector. Instead, we can create a real vector IV. This tends to save a couple of instructions per iteration. This only changes the behavior for the most basic case - integer primary IVs with a constant step. Differential Revision: http://reviews.llvm.org/D20315 llvm-svn: 271410
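The two ways of obtaining a vector IV can be compared with a small model. This sketch assumes a vectorization factor of 4 and a constant integer step; `widenBySplat` and `advanceVectorIV` are hypothetical names, not LoopVectorize functions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed vectorization factor for illustration.
constexpr std::size_t kVF = 4;
using VecIV = std::array<int64_t, kVF>;

// Old scheme: every iteration splats the scalar IV and adds the step
// vector {0, step, 2 * step, ...}.
inline VecIV widenBySplat(int64_t scalarIV, int64_t step) {
  VecIV v{};
  for (std::size_t i = 0; i < kVF; ++i)
    v[i] = scalarIV + static_cast<int64_t>(i) * step;
  return v;
}

// New scheme: keep the vector itself as the loop-carried value (a vector
// phi) and add splat(VF * step) once per vector iteration.
inline VecIV advanceVectorIV(VecIV v, int64_t step) {
  for (int64_t &lane : v)
    lane += static_cast<int64_t>(kVF) * step;
  return v;
}
```

Under this model the new scheme needs one vector add per iteration, while the old one needs a splat plus an add, which is in line with the "couple of instructions per iteration" estimate in the message.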
* IR: Allow multiple global metadata attachments with the same type.Peter Collingbourne2016-06-012-5/+6
| | | | | | | | | | This will be necessary to allow the global merge pass to attach multiple debug info metadata nodes to global variables once we reverse the edge from DIGlobalVariable to GlobalVariable. Differential Revision: http://reviews.llvm.org/D20414 llvm-svn: 271358
* [SLP] Pass in correct alignment when query memory access costGuozhi Wei2016-05-311-4/+8
| | | | | | | | | | This patch fixes bug https://llvm.org/bugs/show_bug.cgi?id=27897. When querying the memory access cost, SLP currently always passes in an alignment value of 1 (unaligned), so it gets a very high cost for scalar memory access and wrongly vectorizes memory loads in the test case. It can be fixed by simply giving the correct alignment. llvm-svn: 271333
* [PM] BDCE: Fix caching of analyses.Davide Italiano2016-05-311-3/+8
| | | | | | | Another chapter in the story. GlobalsAA should be preserved, as well as the CFG. llvm-svn: 271307
* [PM] ADCE: Fix caching of analyses.Davide Italiano2016-05-311-3/+8
| | | | | | | When this pass was originally ported, AA wasn't available for the new PM. Now it is, so we can cache properly. llvm-svn: 271303
* Fix a crash in MergeFunctions related to ordering of weak/strong functionsErik Eckstein2016-05-311-32/+12
| | | | | | | | | | | The assumption made in insert(), that weak functions are always inserted after strong functions, is only true in the first round of adding functions. In subsequent rounds this is no longer guaranteed, because we might remove a strong function from the tree (because it's modified) and add it later, where an equivalent weak function already exists in the tree. This change removes the assert in insert() and explicitly enforces a weak->strong order. This also removes the need for two separate loops in runOnModule(). llvm-svn: 271299
* [esan|cfrag] Create the skeleton of cfrag variable for the runtimeQin Zhao2016-05-311-19/+90
| | | | | | | | | | | | | | | | | Summary: Creates a global variable containing preliminary information for the cache-fragmentation tool runtime. Passes a pointer to the variable (null if no variable is created) to the compilation unit init and exit routines in the runtime. Reviewers: aizatsky, bruening Subscribers: filcab, kubabrecka, bruening, kcc, vitalybuka, eugenis, llvm-commits, zhaoqin Differential Revision: http://reviews.llvm.org/D20541 llvm-svn: 271298
* X86: permit using SjLj EH on x86 targets as an optionSaleem Abdulrasool2016-05-311-0/+2
| | | | | | | | | | | This adds support to the backend to actually support SjLj EH as an exception model. This is *NOT* the default model, and requires explicitly opting into it from the frontend. GCC supports this model, and for MinGW it can still be enabled via the `--using-sjlj-exceptions` option. Addresses PR27749! llvm-svn: 271244