summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* APFloat: Add frexpMatt Arsenault2016-03-211-1/+25
| | | | llvm-svn: 263950
* AMDGPU: Add frexp_mant intrinsicMatt Arsenault2016-03-211-2/+2
| | | | llvm-svn: 263948
* Implement constant folding for bitreverseMatt Arsenault2016-03-212-0/+33
| | | | llvm-svn: 263945
* [AArch64] Fix a -Wdocumentation warning. NFC.Chad Rosier2016-03-211-2/+2
| | | | llvm-svn: 263942
* [IndVars] Fix PR26974: make sure replaceCongruentIVs doesn't break LCSSASilviu Baranga2016-03-211-0/+1
| | | | | | | | | | | | | | | | | | | Summary: replaceCongruentIVs can break LCSSA when trying to replace IV increments since it tries to replace all uses of a phi node with another phi node while both of the phi nodes are not necessarily in the processed loop. This will cause an assert in IndVars. To fix this, we add a check to make sure that the replacement maintains LCSSA. Reviewers: sanjoy Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18266 llvm-svn: 263941
* [DAGCombine] Catch the case where extract_vector_elt can cause an any_ext ↵Silviu Baranga2016-03-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | while processing AND SDNodes Summary: extract_vector_elt can cause an implicit any_ext if the types don't match. When processing the following pattern: (and (extract_vector_elt (load ([non_ext|any_ext|zero_ext] V))), c) DAGCombine was ignoring the possible extend, and sometimes removing the AND even though it was required to maintain some of the bits in the result to 0, resulting in a miscompile. This change fixes the issue by limiting the transformation only to cases where the extract_vector_elt doesn't perform the implicit extend. Reviewers: t.p.northover, jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18247 llvm-svn: 263935
* [NVPTX] Adds a new address space inference pass.Jingyue Wu2016-03-205-9/+609
| | | | | | | | | | | | | | | | | | | Summary: The old address space inference pass (NVPTXFavorNonGenericAddrSpaces) is unable to convert the address space of a pointer induction variable. This patch adds a new pass called NVPTXInferAddressSpaces that overcomes that limitation using a fixed-point data-flow analysis (see the file header comments for details). The new pass is experimental and not enabled by default. Users can turn it on by setting the -nvptx-use-infer-addrspace flag of llc. Reviewers: jholewinski, tra, jlebar Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D17965 llvm-svn: 263916
* [X86][SSE] Tidyup setTargetShuffleZeroElements to match ↵Simon Pilgrim2016-03-201-4/+4
| | | | | | | | computeZeroableShuffleElements Based on feedback for D14261 llvm-svn: 263911
* [X86][SSE] Detect zeroable shuffle elements from different value typesSimon Pilgrim2016-03-201-8/+42
| | | | | | | | Improve computeZeroableShuffleElements to be able to peek through bitcasts to extract zero/undef values from BUILD_VECTOR nodes of different element sizes to the shuffle mask. Differential Revision: http://reviews.llvm.org/D14261 llvm-svn: 263906
* AVX512BW: Enable v32i1/v64i1 BUILD_VECTORIgor Breger2016-03-201-0/+2
| | | | | | Differential Revision: http://reviews.llvm.org/D18211 llvm-svn: 263898
* Suppress a -Wunused-variable warning in release builds.Craig Topper2016-03-201-0/+1
| | | | llvm-svn: 263892
* Use a range-based for loop. NFC.Michael Kuperstein2016-03-201-4/+4
| | | | llvm-svn: 263889
* Expose IRBuilder::CreateAtomicCmpXchg as LLVMBuildAtomicCmpXchg in the C API.Mehdi Amini2016-03-191-0/+55
| | | | | | | | | | | | | Summary: Also expose getters and setters in the C API, so that the change can be tested. Reviewers: nhaehnle, axw, joker.eph Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18260 From: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> llvm-svn: 263886
* CodeGen: use range based for loopSaleem Abdulrasool2016-03-191-5/+4
| | | | | | Convert a loop to use a range based style loop. NFC. llvm-svn: 263884
* [SimplifyLibCalls] Only consider sinpi/cospi functions within the same functionDavid Majnemer2016-03-191-7/+11
| | | | | | | | | | The sinpi/cospi can be replaced with sincospi to remove unnecessary computations. However, we need to make sure that the calls are within the same function! This fixes PR26993. llvm-svn: 263875
* [InstCombine] Don't insert instructions before a catch switchDavid Majnemer2016-03-191-0/+3
| | | | | | | | | CatchSwitches are not splittable, we cannot insert casts, etc. before them. This fixes PR26992. llvm-svn: 263874
* Add a comment on partial hashing of MetadataMehdi Amini2016-03-191-0/+12
| | | | | | | Following r263866, on D. Blaikie suggestion. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263869
* Hash Metadata using pointer for MDString argument instead of value (NFC)Mehdi Amini2016-03-192-140/+127
| | | | | | | | | | | MDString are uniqued in the Context on creation, hashing the pointer is less expensive than hashing the String itself. Reviewers: dexonsmith Differential Revision: http://reviews.llvm.org/D16560 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263867
* Compute some Debug Info Metadata hash key partially (NFC)Mehdi Amini2016-03-191-9/+4
| | | | | | | | | | | | | | | | | | | | | | Summary: This patch changes the computation of the hash key for DISubprogram to be computed on a small subset of the fields. The hash is computed a lot faster, but there might be more collision in the table. However by carefully selecting the fields, colisions should be rare. Using `opt` to load the IR for FastISelEmitter.cpp.o, with this patch: - DISubprogram::getImpl() goes from 28ms to 15ms. - DICompositeType::getImpl() goes from 6ms to 2ms - DIDerivedType::getImpl() goes from 18 to 12ms Reviewers: dexonsmith Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16571 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263866
* Rework linkInModule(), making it oblivious to ThinLTOMehdi Amini2016-03-193-72/+33
| | | | | | | | | | | | | | | | | | | | | | | | Summary: ThinLTO is relying on linkInModule to import selected function. However a lot of "magic" was hidden in linkInModule and the IRMover, who would rename and promote global variables on the fly. This is moving to an approach where the steps are decoupled and the client is reponsible to specify the list of globals to import. As a consequence some test are changed because they were relying on the previous behavior which was importing the definition of *every* single global without control on the client side. Now the burden is on the client to decide if a global has to be imported or not. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18122 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263863
* [CXX_FAST_TLS] Fix issues in ARM.Manman Ren2016-03-181-2/+3
| | | | | | | | | We need to be careful on which registers can be explicitly handled via copies. Prologue, Epilogue use physical registers and if one belongs to the set of CSRsViaCopy, it will no longer be CSRed, since PEI overwrites it after the explicit copies. llvm-svn: 263857
* [CXX_FAST_TLS] Disable tail call when calling conventions are mismatched.Manman Ren2016-03-183-0/+21
| | | | | | | Since CXX_FAST_TLS has a bigger set of CSRs, we don't tail call when caller and callee have mismatched calling conventions. llvm-svn: 263856
* [CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86.Manman Ren2016-03-183-1/+3
| | | | | | | Since at O0, explicit copies via SplitCSR may not be removed even if they are unnecessary, we choose not to use SplitCSR at O0. llvm-svn: 263855
* AArch64: Don't modify other modules in AArch64PromoteConstantDuncan P. N. Exon Smith2016-03-181-148/+177
| | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid modifying other modules in `AArch64PromoteConstant` when the constant is `ConstantData` (a horrible accident, I'm sure, caught by an experimental follow-up to r261464). Previously, this walked through all the users of a constant, but that reaches into other modules when the constant doesn't depend transitively on a `GlobalValue`! Since we're walking instructions anyway, just modify the instructions we actually see. As a drive-by, instead of storing `Use` and getting the instructions again via `Use::getUser()` (which is not a constantant time lookup), store `std::pair<Instruction, unsigned>`. Besides being cheaper, this makes it easier to drop use-lists form `ConstantData` in the future. (I threw this in because I was touching all the code anyway.) Because the patch completely changes the traversal logic, it looks like a rewrite of the pass, but the core logic is all the same (or should be, minus the out-of-module changes). In other words, there should be NFC as long as the LLVMContext only has a single Module. I didn't think of a good way to test this, but I hope to submit a patch eventually that makes walking these use-lists illegal/impossible. llvm-svn: 263853
* [sancov] clang-formatting SanitizerCoverage.cpp and fully pleasing clang-tidy.Mike Aizatsky2016-03-181-72/+78
| | | | | | Differential Revision: http://reviews.llvm.org/D18288 llvm-svn: 263852
* Revert "Revert "[sancov] specifying sanitizer coverage dependencies.""Chandler Carruth2016-03-181-1/+7
| | | | | | This reverts commit r263825, re-instating r263797. llvm-svn: 263847
* [sancov] Fix the sancov pass to initialize itself inside itsChandler Carruth2016-03-181-1/+3
| | | | | | | constructor. This should fix the recent crashes on certain architectures. llvm-svn: 263845
* BPF: emit an error message for unsupported signed division operationAlexei Starovoitov2016-03-181-0/+12
| | | | | | Signed-off-by: Yonghong Song <yhs@plumgrid.com> Signed-off-by: Alexei Starovoitov <ast@fb.com> llvm-svn: 263842
* Interface to get/set profile summary metadata to moduleEaswaran Raman2016-03-181-0/+8
| | | | | | Differential Revision: http://reviews.llvm.org/D17894 llvm-svn: 263835
* [libFuzzer] add a flag close_fd_mask so that we can silence spammy targets ↵Kostya Serebryany2016-03-187-1/+74
| | | | | | by closing stderr/stdout llvm-svn: 263831
* MILexer: Add ErrorCallbackType typedef; NFCMatthias Braun2016-03-181-30/+22
| | | | llvm-svn: 263829
* [IndVars] Make the fix for PR26973 more obvious; NFCISanjoy Das2016-03-181-3/+42
| | | | llvm-svn: 263828
* [IndVars] Pass the right loop to isLoopInvariantPredicateSanjoy Das2016-03-181-3/+2
| | | | | | | | | | | The loop on IVOperand's incoming values assumes IVOperand to be an induction variable on the loop over which `S Pred X` is invariant; otherwise loop invariant incoming values to IVOperand are not guaranteed to dominate the comparision. This fixes PR26973. llvm-svn: 263827
* Revert "[sancov] specifying sanitizer coverage dependencies."Mike Aizatsky2016-03-181-7/+1
| | | | | | | | This fails on arm. This reverts commit 52c8e0f7119d1ea1050c0708565a8c92b73386d2. llvm-svn: 263825
* AMDGPU: add missing braces around multi-line if blockNicolai Haehnle2016-03-181-1/+2
| | | | | | This fixes an issue with rL263658 pointed out by Tom Stellard. llvm-svn: 263823
* [AArch64] Enable more load clustering in the MI Scheduler.Chad Rosier2016-03-183-36/+116
| | | | | | | | | | | | | This patch adds unscaled loads and sign-extend loads to the TII getMemOpBaseRegImmOfs API, which is used to control clustering in the MI scheduler. This is done to create more opportunities for load pairing. I've also added the scaled LDRSWui instruction, which was missing from the scaled instructions. Finally, I've added support in shouldClusterLoads for clustering adjacent sext and zext loads that too can be paired by the load/store optimizer. Differential Revision: http://reviews.llvm.org/D18048 llvm-svn: 263819
* [codeview] Only emit function ids for inlined functionsReid Kleckner2016-03-182-54/+76
| | | | | | | We aren't referencing any other kind of function currently. Should save a bit on our debug info size. llvm-svn: 263817
* [MCParser] Accept uppercase radix variants 0X and 0BColin LeMahieu2016-03-182-4/+4
| | | | | | Differential Revision: http://reviews.llvm.org/D14781 llvm-svn: 263802
* [sancov] specifying sanitizer coverage dependencies.Mike Aizatsky2016-03-181-1/+7
| | | | | | | | | | | | | | | Summary: These dependencies would be used in the future to reduce the number of instrumented blocks(http://reviews.llvm.org/rL262103) This is submitted as a separate CL because of previous problems with ARM. Subscribers: aemerson Differential Revision: http://reviews.llvm.org/D18227 llvm-svn: 263797
* AMDGPU: Overload return type of llvm.amdgcn.buffer.load.formatNicolai Haehnle2016-03-181-36/+43
| | | | | | | | | | | | | | | | Summary: Allow the selection of BUFFER_LOAD_FORMAT_x and _XY. Do this now before the frontend patches land in Mesa. Eventually, we may want to automatically reduce the size of loads at the LLVM IR level, which requires such overloads, and in some cases Mesa can generate them directly. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18255 llvm-svn: 263792
* AMDGPU/SI: Add llvm.amdgcn.buffer.atomic.* intrinsicsNicolai Haehnle2016-03-183-2/+187
| | | | | | | | | | | | | | | | | | | | Summary: These intrinsics expose the BUFFER_ATOMIC_* instructions and will be used by Mesa to implement atomics with buffer semantics. The intrinsic interface matches that of buffer.load.format and buffer.store.format, except that the GLC bit is not exposed (it is automatically deduced based on whether the return value is used). The change of hasSideEffects is required for TableGen to accept the pattern that matches the intrinsic. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, rivanvx, llvm-commits Differential Revision: http://reviews.llvm.org/D18151 llvm-svn: 263791
* AMDGPU: use ComplexPattern for offsets in llvm.amdgcn.buffer.load/store.formatNicolai Haehnle2016-03-183-13/+110
| | | | | | | | | | | | | | | | | | | | | Summary: We cannot easily deduce that an offset is in an SGPR, but the Mesa frontend cannot easily make use of an explicit soffset parameter either. Furthermore, it is likely that in the future, LLVM will be in a better position than the frontend to choose an SGPR offset if possible. Since there aren't any frontend uses of these intrinsics in upstream repositories yet, I would like to take this opportunity to change the intrinsic signatures to a single offset parameter, which is then selected to immediate offsets or voffsets using a ComplexPattern. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18218 llvm-svn: 263790
* [AMDGPU] Assembler: Change dpp_ctrl syntax to match sp3Sam Kolton2016-03-182-50/+95
| | | | | Review: http://reviews.llvm.org/D18267 llvm-svn: 263789
* [Fuzzer] Guard no_sanitize_memory attributes behind __has_feature.Benjamin Kramer2016-03-181-2/+10
| | | | | | Otherwise GCC fails to build it because it doesn't know the attribute. llvm-svn: 263787
* adding another optimization opportunity to readme fileEhsan Amiri2016-03-181-0/+11
| | | | llvm-svn: 263775
* [libFuzzer] read corpus dirs recursivelyKostya Serebryany2016-03-182-14/+25
| | | | llvm-svn: 263773
* [LoopDataPrefetch] Add TTI to limit the number of iterations to prefetch aheadAdam Nemet2016-03-184-0/+22
| | | | | | | | | | | | | | Summary: It can hurt performance to prefetch ahead too much. Be conservative for now and don't prefetch ahead more than 3 iterations on Cyclone. Reviewers: hfinkel Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17949 llvm-svn: 263772
* [LoopDataPrefetch/Aarch64] Allow selective prefetching of large-strided accessesAdam Nemet2016-03-184-0/+43
| | | | | | | | | | | | | | | | | | | | | Summary: And use this TTI for Cyclone. As it was explained in the original RFC (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758), the HW prefetcher work up to 2KB strides. I am also adding tests for this and the previous change (D17943): * Cyclone prefetching accesses with a large stride * Cyclone not prefetching accesses with a small stride * Generic Aarch64 subtarget not prefetching either Reviewers: hfinkel Subscribers: aemerson, rengolin, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17945 llvm-svn: 263771
* [Aarch64] Add pass LoopDataPrefetch for CycloneAdam Nemet2016-03-183-0/+34
| | | | | | | | | | | | | | | | | | | | Summary: This wires up the pass for Cyclone but keeps it off for now because we need a few more TTIs. The getPrefetchMinStride value is not very well tuned right now but it works well with CFP2006/433.milc which motivated this. Tests will be added as part of the upcoming large-stride prefetching patch. Reviewers: t.p.northover Subscribers: llvm-commits, aemerson, hfinkel, rengolin Differential Revision: http://reviews.llvm.org/D17943 llvm-svn: 263770
* [libFuzzer] improve -merge functionalityKostya Serebryany2016-03-186-73/+101
| | | | llvm-svn: 263769
OpenPOWER on IntegriCloud