path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* PredicateInfo: Support switch statements | Daniel Berlin | 2017-02-22 | 2 files | -36/+110
  Summary: Depends on D29606 and D29682. Makes us pass GVN's edge.ll (we will also pass a few other testcases; they just need cleaning up).
  Thoughts on the Predicate* hierarchy of classes are especially welcome :) (it's not clear to me how best to organize it, and currently the getBlock* seems ... uglier than maybe wasting a field somewhere or something).
  Reviewers: davide
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D29747
  llvm-svn: 295889
* Move updating functions to MemorySSAUpdater. | Daniel Berlin | 2017-02-22 | 4 files | -131/+110
  Add updater to passes that now need it. Move around code in MemorySSA to expose needed functions.
  Summary: Mostly cleanup
  Reviewers: george.burgess.iv
  Subscribers: llvm-commits, Prazek
  Differential Revision: https://reviews.llvm.org/D30221
  llvm-svn: 295887
* [LSR] Canonicalize formula and put recursive Reg related with current loop in ScaledReg. | Wei Mi | 2017-02-22 | 1 file | -39/+83
  After rL294814, an LSR formula can have multiple SCEVAddRecExprs inside its BaseRegs. The previous canonicalization swapped the first SCEVAddRecExpr in BaseRegs with ScaledReg. Now we want to swap the SCEVAddRecExpr Reg related to the current loop with ScaledReg instead.
  Otherwise, we may generate code like RegA + lsr.iv + RegB, where the loop-invariant parts RegA and RegB are not grouped together and cannot be promoted outside of the loop. This patch ensures that lsr.iv is generated later in the expression, RegA + RegB + lsr.iv, so that RegA + RegB can be promoted outside of the loop.
  Differential Revision: https://reviews.llvm.org/D26781
  llvm-svn: 295884
* [RDF] Support for partial structural aliases in RegisterAggr | Krzysztof Parzyszek | 2017-02-22 | 2 files | -61/+67
  llvm-svn: 295883
* [Support] Re-add the special OSX flags on mmap. | Zachary Turner | 2017-02-22 | 1 file | -0/+19
  The problem appears to be that these flags can only be used when mapping a file read-only, not read-write. So we do that here.
  llvm-svn: 295880
* [Hexagon] Add intrinsics for masked vector stores | Krzysztof Parzyszek | 2017-02-22 | 1 file | -0/+19
  Patch by Harsha Jagasia.
  llvm-svn: 295879
* AMDGPU: Don't look at chain users when adjusting writemask | Matt Arsenault | 2017-02-22 | 1 file | -0/+4
  Fixes not adjusting using new intrinsics with chains.
  llvm-svn: 295878
* AMDGPU: Always allocate emergency stack slot at offset 0 | Matt Arsenault | 2017-02-22 | 1 file | -5/+19
  This allows us to ensure that 0 is never a valid pointer to a user object, and ensures that the offset is always legal without needing a register to access it. This comes at the cost of usable offsets and wasted stack space.
  llvm-svn: 295877
* AMDGPU: Change exp with compr bit printing | Matt Arsenault | 2017-02-22 | 1 file | -3/+11
  llvm-svn: 295873
* Revert "AMDGPU : Update TrapCode based on Trap Handler ABI." | Wei Ding | 2017-02-22 | 4 files | -16/+12
  This reverts commit r295867.
  llvm-svn: 295871
* [SLP] Fix for PR32036: Vectorized horizontal reduction returning wrong result | Alexey Bataev | 2017-02-22 | 1 file | -13/+18
  Summary: If the same value is used several times as an extra value, the SLP vectorizer takes it into account only once instead of the actual number of uses. For example:
  ```
  int val = 1;
  for (int y = 0; y < 8; y++) {
    for (int x = 0; x < 8; x++) {
      val = val + input[y * 8 + x] + 3;
    }
  }
  ```
  We have 2 extra arguments: `1`, the initial value of the horizontal reduction, and `3`, which is added 8*8 times to the reduction. Before the patch we added `1` to the reduction value and added `3` only once, though it must be added 64 times.
  Reviewers: mkuper, mzolotukhin
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D30262
  llvm-svn: 295868
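  The arithmetic in the PR32036 message can be checked with a small scalar model. This is only a sketch, not code from the patch: the function names are hypothetical, and the "buggy" variant merely mimics the behaviour the message describes (the extra value `3` counted once).

  ```cpp
  #include <array>
  #include <cassert>

  // Scalar reference: the loop nest quoted in the commit message.
  int reduceScalar(const std::array<int, 64> &input) {
    int val = 1;
    for (int y = 0; y < 8; y++)
      for (int x = 0; x < 8; x++)
        val = val + input[y * 8 + x] + 3;
    return val;
  }

  // What a vectorized form has to compute: the horizontal sum of the inputs
  // plus every extra value multiplied by the number of times it is used.
  int reduceWithExtraValues(const std::array<int, 64> &input) {
    int sum = 0;
    for (int v : input)
      sum += v;
    const int initial = 1; // extra value used once
    const int addend = 3;  // extra value used 8 * 8 = 64 times
    return sum + initial + addend * static_cast<int>(input.size());
  }

  // Behaviour described for the code before the patch: `3` is added once.
  int reduceCountingExtraOnce(const std::array<int, 64> &input) {
    int sum = 0;
    for (int v : input)
      sum += v;
    return sum + 1 + 3;
  }

  // Hypothetical sanity check a test could run on any input.
  void checkAgainstScalar(const std::array<int, 64> &input) {
    assert(reduceScalar(input) == reduceWithExtraValues(input));
  }
  ```

  With an all-zero input the scalar loop yields 1 + 64 * 3 = 193, while the once-only variant yields 4, which is the kind of mismatch the PR reports.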
* AMDGPU : Update TrapCode based on Trap Handler ABI. | Wei Ding | 2017-02-22 | 4 files | -12/+16
  Differential Revision: http://reviews.llvm.org/D30232
  llvm-svn: 295867
* Move llvm_unreachable out of switch. | Rafael Espindola | 2017-02-22 | 1 file | -2/+5
  This should make gcc happy and still produce a clang warning if we add another value to the enum.
  llvm-svn: 295865
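  A minimal sketch of the pattern this message refers to, assuming a hypothetical enum `Kind` and using `assert` plus `std::abort()` as a stand-in for `llvm_unreachable`. The switch has no `default:` label, so clang's `-Wswitch` can still flag a newly added enumerator, and the unreachable marker after the switch keeps gcc from warning that control reaches the end of a non-void function.

  ```cpp
  #include <cassert>
  #include <cstdlib>
  #include <string>

  enum class Kind { Load, Store, Call }; // hypothetical enum for illustration

  std::string kindName(Kind K) {
    switch (K) { // deliberately no default: keeps -Wswitch useful
    case Kind::Load:
      return "load";
    case Kind::Store:
      return "store";
    case Kind::Call:
      return "call";
    }
    // Only reached for a value outside the enum. Stand-in for
    // llvm_unreachable("unknown Kind"), placed after the switch.
    assert(false && "unknown Kind");
    std::abort();
  }
  ```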
* [AArch64] Extend AArch64RedundantCopyElimination to do simple copy propagation. | Geoff Berry | 2017-02-22 | 1 file | -43/+127
  Summary: Extend AArch64RedundantCopyElimination to catch cases where the register that is known to be zero is COPY'd in the predecessor block. Before this change, this pass would catch cases like:
      CBZW %W0, <BB#1>
    BB#1:
      %W0 = COPY %WZR // removed
  After this change, cases like the one below are also caught:
      %W0 = COPY %W1
      CBZW %W1, <BB#1>
    BB#1:
      %W0 = COPY %WZR // removed
  This change results in a 4% increase in static copies removed by this pass when compiling the llvm test-suite. It also fixes regressions caused by doing post-RA copy propagation (a separate change to be put up for review shortly).
  Reviewers: junbuml, mcrosier, t.p.northover, qcolombet, MatzeB
  Subscribers: aemerson, rengolin, llvm-commits
  Differential Revision: https://reviews.llvm.org/D30113
  llvm-svn: 295863
* [ModuleSummaryAnalysis] Don't crash when referencing unnamed globals. | Davide Italiano | 2017-02-22 | 1 file | -0/+6
  Instead, just be conservative, as these are infrequent enough. Thanks to Peter Collingbourne for the discussion about this on IRC.
  llvm-svn: 295861
* [WebAssembly] Implement the wasm binary container header. | Dan Gohman | 2017-02-22 | 1 file | -1/+3
  Also, update the version number to 0x1, which is what engines are now expecting.
  llvm-svn: 295860
* [LoopVectorize] Added address space check when analysing interleaved accesses | Karl-Johan Karlsson | 2017-02-22 | 1 file | -14/+19
  Prevent memory objects of different address spaces from being part of the same load/store groups when analysing interleaved accesses.
  This fixes PR31900.
  Reviewers: HaoLiu, mssimpso, mkuper
  Reviewed By: mssimpso, mkuper
  Subscribers: llvm-commits, efriedma, mzolotukhin
  Differential Revision: https://reviews.llvm.org/D29717
  This reverts r295042 (re-applies r295038) with an additional fix for the buildbot problem.
  llvm-svn: 295858
* [WebAssembly] Define a table of function signatures for runtime library calls. | Dan Gohman | 2017-02-22 | 3 files | -0/+1345
  LLVM CodeGen emits references to external symbols that are never declared at the LLVM IR level, so they have no declared signature. However, WebAssembly requires that all functions be declared with signatures. This patch adds a table providing signatures for known runtime libcalls, which will be used in subsequent patches to emit declarations for such functions.
  llvm-svn: 295857
* [RDF] Skip undef uses when calculating kill flags | Krzysztof Parzyszek | 2017-02-22 | 1 file | -1/+1
  llvm-svn: 295856
* [RDF] Only access block live-ins when tracking liveness | Krzysztof Parzyszek | 2017-02-22 | 1 file | -2/+4
  llvm-svn: 295855
* [Support] Provide linux/magic.h fallback for older kernels | Michal Gorny | 2017-02-22 | 1 file | -0/+15
  The function for distinguishing local and remote files added in r295768 unconditionally uses the linux/magic.h header to provide the necessary filesystem magic numbers. However, in kernel headers predating 2.6.18 the magic numbers are spread throughout multiple include files. Furthermore, LLVM has not required kernel headers to be installed so far.
  To increase portability across different versions of the Linux kernel and different Linux systems, add CMake header checks for linux/magic.h and, if it is missing, for the linux/nfs_fs.h and linux/smb.h headers which contained the numbers previously.
  Furthermore, since the numbers are static and the feature does not seem critical enough to make LLVM require kernel headers at all, add fallback constants for the case when none of the necessary headers is available.
  Differential Revision: https://reviews.llvm.org/D30261
  llvm-svn: 295854
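  The header-check-plus-fallback scheme described above could look roughly like the following. This is a sketch under assumptions, not the patch itself: the `HAVE_LINUX_*_H` macro names are assumed to be what the CMake checks define, `looksLikeRemoteFs` is a hypothetical helper, and the fallback values are the long-standing NFS and SMB superblock magic numbers.

  ```cpp
  #include <cstdint>

  // Prefer the kernel headers when the build system found them.
  // The HAVE_* names are assumptions about what the CMake checks define.
  #if defined(__linux__)
  #if defined(HAVE_LINUX_MAGIC_H)
  #include <linux/magic.h>
  #else
  #if defined(HAVE_LINUX_NFS_FS_H)
  #include <linux/nfs_fs.h>
  #endif
  #if defined(HAVE_LINUX_SMB_H)
  #include <linux/smb.h>
  #endif
  #endif
  #endif

  // Fallback constants for the case where no header supplied them.
  #ifndef NFS_SUPER_MAGIC
  #define NFS_SUPER_MAGIC 0x6969
  #endif
  #ifndef SMB_SUPER_MAGIC
  #define SMB_SUPER_MAGIC 0x517B
  #endif

  // Hypothetical helper: classify a filesystem type value as remote.
  bool looksLikeRemoteFs(std::uint32_t FsMagic) {
    return FsMagic == static_cast<std::uint32_t>(NFS_SUPER_MAGIC) ||
           FsMagic == static_cast<std::uint32_t>(SMB_SUPER_MAGIC);
  }
  ```

  On Linux the value passed in would typically come from the `f_type` field filled in by `statfs(2)`; that wiring is left out here to keep the sketch portable.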
* Fix an obvious bug in SampleProfileReaderGCC. | Dehao Chen | 2017-02-22 | 1 file | -5/+3
  Summary: The CallTargetProfile should be added to FProfile to be consistent with other profile readers.
  Reviewers: dnovillo, davidxl
  Reviewed By: davidxl
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D30233
  llvm-svn: 295852
* [WebAssembly] Configure codegen to legalize f16 values. | Dan Gohman | 2017-02-22 | 1 file | -0/+5
  llvm-svn: 295850
* [DAGCombiner] revert r295336 | Bill Seurer | 2017-02-22 | 1 file | -19/+8
  r295336 causes a bootstrapped clang to fail for many compilations on powerpc BE. See http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/2315 for example. Reverting as per the developer's request.
  llvm-svn: 295849
* [X86][SSE] getTargetConstantBitsFromNode - insert constant bits directly into masks. | Simon Pilgrim | 2017-02-22 | 1 file | -18/+15
  Minor optimization: don't create temporary mask APInts that are just going to be OR'd into the accumulate masks; insert directly instead.
  llvm-svn: 295848
* [X86][SSE] Use APInt::getBitsSet() instead of APInt::getLowBitsSet().shl() separately. NFCI. | Simon Pilgrim | 2017-02-22 | 2 files | -8/+10
  llvm-svn: 295845
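  The equivalence this refactoring relies on can be illustrated on plain 64-bit integers. The helpers below are hypothetical stand-ins, not the APInt API; they assume the half-open range convention in which bits `[lo, hi)` are set.

  ```cpp
  #include <cassert>
  #include <cstdint>

  // Stand-in for a "low n bits set" mask, n in [0, 64].
  std::uint64_t lowBitsSet(unsigned n) {
    assert(n <= 64);
    return n == 64 ? ~std::uint64_t{0} : ((std::uint64_t{1} << n) - 1);
  }

  // Two-step form: build a low mask of the right width, then shift it up.
  std::uint64_t lowBitsThenShift(unsigned lo, unsigned hi) {
    assert(lo <= hi && hi <= 64);
    if (lo == 64)
      return 0; // shifting a 64-bit value by 64 would be undefined
    return lowBitsSet(hi - lo) << lo;
  }

  // One-step form: set bits [lo, hi) directly.
  std::uint64_t bitsSet(unsigned lo, unsigned hi) {
    assert(lo <= hi && hi <= 64);
    return lowBitsSet(hi) & ~lowBitsSet(lo);
  }
  ```

  Both forms give `0xF0` for the range `[4, 8)`; the one-step form avoids materializing an intermediate mask, which matters more for arbitrary-width integers than for this 64-bit model.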
* Fix -Wunused-but-set-variable warning by removing unused 'aggregateIsPacked' checking | Simon Pilgrim | 2017-02-22 | 1 file | -4/+0
  llvm-svn: 295830
* [GlobalISel] Fix compiler warnings and make assert assert something. | Benjamin Kramer | 2017-02-22 | 3 files | -11/+7
  llvm-svn: 295827
* [SLP] Remove unused initial value from the variable, NFC. | Alexey Bataev | 2017-02-22 | 1 file | -1/+1
  llvm-svn: 295826
* [X86][GlobalISel] Initial implementation, select G_ADD gpr, gpr | Igor Breger | 2017-02-22 | 7 files | -6/+201
  Summary: Initial implementation for X86InstructionSelector. Handle selection of COPY and G_ADD/G_SUB gpr, gpr.
  Reviewers: qcolombet, rovka, zvi, ab
  Reviewed By: rovka
  Subscribers: mgorny, dberris, kristof.beyls, llvm-commits
  Differential Revision: https://reviews.llvm.org/D29816
  llvm-svn: 295824
* [ARM] Fix constant islands pass. | Roger Ferrer Ibanez | 2017-02-22 | 1 file | -0/+7
  The pass tries to fix a spill of LR that turns out to be unnecessary. So it removes the tPOP but forgets to remove the tPUSH. This causes the stack to be misaligned upon returning from the function. Thus, remove the tPUSH as well in this case.
  Differential Revision: https://reviews.llvm.org/D30207
  llvm-svn: 295816
* [X86] Fix memory operands definition for some instructions. | Ayman Musa | 2017-02-22 | 1 file | -10/+14
  Change integer memory operands to FP memory operands for some FP instructions.
  Differential Revision: https://reviews.llvm.org/D30201
  llvm-svn: 295813
* OptDiag: Add const to some interfaces that don't modify anything. NFC | Justin Bogner | 2017-02-22 | 2 files | -4/+4
  This needed a const_cast for the dominator tree recalculation in OptimizationRemarkEmitter, but we do that all over the place already and it's safe.
  llvm-svn: 295812
* [ARM] Classification Improvements to ARM Sched-Models. NFCI. | Javed Absar | 2017-02-22 | 5 files | -69/+111
  This patch adds missing sched classes for Thumb2 instructions. This has been missing so far, and as a consequence, machine scheduler models for individual sub-targets have tended to be larger than they needed to be. These patches should help write schedulers better and faster in the future for ARM sub-targets.
  Reviewer: Diana Picus
  Differential Revision: https://reviews.llvm.org/D29953
  llvm-svn: 295811
* [AVX-512] Allow legacy scalar min/max intrinsics to select EVEX instructions when available | Craig Topper | 2017-02-22 | 7 files | -45/+85
  This patch introduces new X86ISD::FMAXS and X86ISD::FMINS opcodes. The legacy intrinsics now lower to this node, as do the AVX-512 masked intrinsics when the rounding mode is CUR_DIRECTION.
  I've merged a copy of the tablegen multiclass avx512_fp_scalar into avx512_fp_scalar_sae. avx512_fp_scalar still needs to support CUR_DIRECTION appearing as a rounding mode for X86ISD::FADD_ROUND and others.
  Differential revision: https://reviews.llvm.org/D30186
  llvm-svn: 295810
* [ValueTracking] Make poison propagation more aggressive | Sanjoy Das | 2017-02-22 | 1 file | -49/+3
  Summary: Motivation: fix PR31181 without regression (the actual fix is still in progress). However, the actual content of PR31181 is not relevant here.
  This change makes poison propagation more aggressive in the following cases:
  1. poison * Val == poison, for any Val. In particular, this changes existing intentional and documented behavior in these two cases:
     a. Val is 0
     b. Val is 2^k * N
  2. poison << Val == poison, for any Val
  3. getelementptr is poison if any input is poison
  I think all of these are justified (and are axiomatically true in the new poison / undef model):
  1a: we need poison * 0 to be poison to allow transforms like these:
      A * (B + C) ==> A * B + A * C
    If poison * 0 were 0 then the above transform could not be allowed since e.g. we could have A = poison, B = 1, C = -1, making the LHS
      poison * (1 + -1) = poison * 0 = 0
    and the RHS
      poison * 1 + poison * -1 = poison + poison = poison
  1b: we need e.g. poison * 4 to be poison since we want to allow
      A * 4 ==> A + A + A + A
    If poison * 4 were a value with all of its bits poison except the last four, then we'd not be able to do this transform, since if A were poison the LHS would only be "partially" poison while the RHS would be "fully" poison.
  2: Same reasoning as (1b); we'd like to have the following kinds of transforms be legal:
      A << 1 ==> A + A
  Reviewers: majnemer, efriedma
  Subscribers: mcrosier, llvm-commits
  Differential Revision: https://reviews.llvm.org/D30185
  llvm-svn: 295809
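  The argument in case 1a can be replayed with a toy value model. This is an illustrative sketch only: `Val`, `mulStrict`, and `mulZeroAbsorbs` are hypothetical names, and the model captures nothing of LLVM's real poison semantics beyond "an operation with a poison input yields poison".

  ```cpp
  #include <cassert>

  // Toy value: either poison or a concrete integer.
  struct Val {
    bool poison;
    int value;
  };

  Val poisonVal() { return {true, 0}; }
  Val num(int v) { return {false, v}; }

  Val add(Val a, Val b) {
    if (a.poison || b.poison)
      return poisonVal();
    return num(a.value + b.value);
  }

  // Rule argued for in the message: poison * Val == poison for any Val.
  Val mulStrict(Val a, Val b) {
    if (a.poison || b.poison)
      return poisonVal();
    return num(a.value * b.value);
  }

  // Alternative rule under which a concrete 0 absorbs poison.
  Val mulZeroAbsorbs(Val a, Val b) {
    if ((!a.poison && a.value == 0) || (!b.poison && b.value == 0))
      return num(0);
    return mulStrict(a, b);
  }

  bool same(Val a, Val b) {
    if (a.poison || b.poison)
      return a.poison == b.poison;
    return a.value == b.value;
  }
  ```

  With A = poison, B = 1, C = -1, `mulStrict` gives poison on both sides of A * (B + C) ==> A * B + A * C, whereas `mulZeroAbsorbs` gives 0 on the left and poison on the right, so the distribution would turn a defined value into poison.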
* Use const-ref in range-loop for to avoid copying pairs of std::string | Sean Silva | 2017-02-22 | 2 files | -2/+2
  No reason to create temporaries.
  Differential Revision: https://reviews.llvm.org/D29871
  Patch by sergio.martins!
  llvm-svn: 295807
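  The cost being avoided here can be made visible with a copy-counting type. This is a sketch, not code from the patch: `Tracked` is a hypothetical stand-in for the pairs of `std::string` mentioned in the title, and the two functions report how many copies their loop performs.

  ```cpp
  #include <cstddef>
  #include <string>
  #include <utility>
  #include <vector>

  // Hypothetical element type that counts how often it is copied.
  struct Tracked {
    std::string text;
    static int copies;
    explicit Tracked(std::string s) : text(std::move(s)) {}
    Tracked(const Tracked &other) : text(other.text) { ++copies; }
  };
  int Tracked::copies = 0;

  // Iterating by value copy-constructs every element.
  int copiesWhenIteratingByValue(const std::vector<Tracked> &items) {
    const int before = Tracked::copies;
    std::size_t length = 0;
    for (auto item : items)
      length += item.text.size();
    (void)length;
    return Tracked::copies - before;
  }

  // Iterating by const reference binds to the elements in place.
  int copiesWhenIteratingByConstRef(const std::vector<Tracked> &items) {
    const int before = Tracked::copies;
    std::size_t length = 0;
    for (const auto &item : items)
      length += item.text.size();
    (void)length;
    return Tracked::copies - before;
  }
  ```

  For a vector of three elements the by-value loop performs three copies and the const-ref loop none; with strings inside each element, every avoided copy may also be an avoided heap allocation.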
* [WebAssembly] Add skeleton MC support for the Wasm container format | Dan Gohman | 2017-02-22 | 23 files | -16/+892
  This just adds the basic skeleton for supporting a new object file format. All of the actual encoding will be implemented in followup patches.
  Differential Revision: https://reviews.llvm.org/D26722
  llvm-svn: 295803
* Fix -Wcovered-switch-default. | Rui Ueyama | 2017-02-22 | 1 file | -3/+1
  llvm-svn: 295799
* AMDGPU: Add cvt.pkrtz intrinsic | Matt Arsenault | 2017-02-22 | 8 files | -5/+81
  Convert llvm.SI.packf16 test uses
  llvm-svn: 295797
* [LoopUnroll] Enable PGO-based loop peeling by default. | Michael Kuperstein | 2017-02-22 | 1 file | -2/+2
  This enables peeling of loops with low dynamic iteration count by default, when profile information is available.
  Differential Revision: https://reviews.llvm.org/D27734
  llvm-svn: 295796
* AMDGPU: Remove llvm.AMDGPU.clamp intrinsic | Matt Arsenault | 2017-02-21 | 2 files | -13/+0
  llvm-svn: 295789
* AMDGPU: Redefine clamp node as clamp 0.0-1.0 | Matt Arsenault | 2017-02-21 | 12 files | -29/+163
  Change implementation to use max instead of add. min/max/med3 do not flush denormals regardless of the mode, so it is OK to use it whether or not they are enabled.
  Also allow using clamp with f16, and use knowledge of dx10_clamp.
  llvm-svn: 295788
* [NVPTX] Unify vectorization of load/stores of aggregate arguments and return values. | Artem Belevich | 2017-02-21 | 1 file | -710/+420
  Original code only used vector loads/stores for explicit vector arguments. It could also do more loads/stores than necessary (e.g. v5f32 would touch 8 f32 values). Aggregate types were loaded one element at a time, even the vectors contained within.
  This change attempts to generalize (and simplify) parameter space loads/stores so that vector loads/stores can be used more broadly. Functionality of the patch has been verified by compiling the thrust test suite and manually checking the differences between PTX generated by llvm with and without the patch.
  General algorithm:
  * ComputePTXValueVTs() flattens the input/output argument into a flat list of scalars to load/store and returns their types and offsets.
  * VectorizePTXValueVTs() uses that data to create a vectorization plan, which returns an array of flags marking the boundaries of vectorized load/stores. Scalars are represented as 1-element vectors.
  * Code that generates loads/stores implements a simple state machine that constructs a vector according to the plan.
  Differential Revision: https://reviews.llvm.org/D30011
  llvm-svn: 295784
* AMDGPU: Formatting fixes | Matt Arsenault | 2017-02-21 | 1 file | -4/+5
  llvm-svn: 295783
* DAG: Check if extract_vector_elt is legal or custom | Matt Arsenault | 2017-02-21 | 1 file | -1/+1
  Avoids test regressions in future AMDGPU commits when more vector types are custom lowered.
  llvm-svn: 295782
* [AArch64, X86] Add statistics for the MacroFusion pass | Evandro Menezes | 2017-02-21 | 2 files | -0/+8
  llvm-svn: 295777
* [AArch64, X86] Guard against both instrs being wild cards | Evandro Menezes | 2017-02-21 | 2 files | -10/+12
  If both instrs are wild cards, the result can be a crash.
  llvm-svn: 295776
* [CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-02-21 | 5 files | -61/+141
  llvm-svn: 295773
* Try to fix the buildbot on OSX. | Zachary Turner | 2017-02-21 | 1 file | -16/+0
  Since I'm only seeing failures on OSX, and they report permission denied, I suspect this is due to the addition of the MAP_RESILIENT_CODESIGN and/or MAP_RESILIENT_MEDIA flags. Speculatively removing those to get the bots working.
  llvm-svn: 295770