Commit message | Author | Age | Files | Lines
...
* [coroutines] Spill the result of the invoke instruction correctly | Gor Nishanov | 2017-01-25 | 2 | -9/+82
  Summary: When we decide that the result of the invoke instruction needs to be spilled, we need to insert the spill into a block that is on the normal edge coming out of the invoke instruction. (Prior to this change the code would insert the spill immediately after the invoke instruction, which breaks the IR, since invoke is a terminator instruction.)

  In the following example, we will split the edge going into %cont and insert the spill there.

  ```
      %r = invoke double @print(double 0.0) to label %cont unwind label %pad

  cont:
      %0 = call i8 @llvm.coro.suspend(token none, i1 false)
      switch i8 %0, label %suspend [i8 0, label %resume
                                    i8 1, label %cleanup]
  resume:
      call double @print(double %r)
  ```

  Reviewers: majnemer
  Reviewed By: majnemer
  Subscribers: mehdi_amini, llvm-commits, EricWF
  Differential Revision: https://reviews.llvm.org/D29102
  llvm-svn: 293006
* [OpenMP] Codegen support for 'target teams' on the host. | Arpith Chacko Jacob | 2017-01-25 | 8 | -41/+1423
  This patch adds support for codegen of 'target teams' on the host. This combined directive has two captured statements, one for the 'teams' region, and the other for the 'parallel'. This target teams region is offloaded using the __tgt_target_teams() call. The patch sets the number of teams as an argument to this call.

  Reviewers: ABataev
  Differential Revision: https://reviews.llvm.org/D29084
  llvm-svn: 293005
* Use filename in linemarker when compiling preprocessed source | David Callahan | 2017-01-25 | 3 | -3/+58
  Summary: Clang appears to always use the name as specified on the command line, whereas gcc uses the name as specified in the linemarker at the first line when compiling a preprocessed source. This results in a mismatch between the two compilers in the FILE symbol table entry. This patch makes clang resemble gcc's behavior by finding the original source file name and using it as the input file name.

  Even with this patch, values of the FILE symbol table entry may still differ because clang uses dirname+basename for the entry while gcc uses basename only. I'll write a patch for that once this patch is committed.

  Reviewers: dblaikie, inglorion
  Reviewed By: inglorion
  Subscribers: inglorion, aprantl, bruno
  Differential Revision: https://reviews.llvm.org/D28796
  llvm-svn: 293004
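  As a rough illustration of what "the name as specified in the linemarker" means, the sketch below pulls the file name out of a GCC-style linemarker such as `# 1 "dir/foo.c"`. The function name `LinemarkerFilename` is hypothetical and this is not the code from the patch; it also ignores escaped quotes inside the file name and the optional trailing flags.

  ```cpp
  #include <cstddef>
  #include <string>

  // Hypothetical helper (not clang's implementation): return the file name
  // from a linemarker of the form `# <line> "<file>"`, or an empty string
  // if the line does not have that shape.
  inline std::string LinemarkerFilename(const std::string &line) {
    if (line.size() < 2 || line[0] != '#')
      return "";
    std::size_t i = 1;
    while (i < line.size() && line[i] == ' ')
      ++i;
    const std::size_t digits = i;
    while (i < line.size() && line[i] >= '0' && line[i] <= '9')
      ++i;
    if (i == digits)
      return ""; // no line number, e.g. "#define X"
    while (i < line.size() && line[i] == ' ')
      ++i;
    if (i >= line.size() || line[i] != '"')
      return "";
    const std::size_t end = line.find('"', i + 1);
    if (end == std::string::npos)
      return "";
    return line.substr(i + 1, end - i - 1);
  }
  ```

  Applied to the first line of a preprocessed file, a helper like this would yield the original source name rather than the name given on the command line.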
* Reverting commit because an NVPTX patch sneaked in. Break up into two patches. | Arpith Chacko Jacob | 2017-01-25 | 9 | -1424/+41
  llvm-svn: 293003
* Jim unintentionally had the gdb-format specifiers falling through after r276132 so that 'x/4b' would print out a series of 4 8-byte quantities. | Jason Molenda | 2017-01-25 | 3 | -17/+40
  Fix that, add a test case.
  <rdar://problem/29930833>
  llvm-svn: 293002
* [OpenMP] Codegen support for 'target teams' on the host. | Arpith Chacko Jacob | 2017-01-25 | 9 | -41/+1424
  This patch adds support for codegen of 'target teams' on the host. This combined directive has two captured statements, one for the 'teams' region, and the other for the 'parallel'. This target teams region is offloaded using the __tgt_target_teams() call. The patch sets the number of teams as an argument to this call.

  Reviewers: ABataev
  Differential Revision: https://reviews.llvm.org/D29084
  llvm-svn: 293001
* AMDGPU add support for spilling to a user sgpr pointed buffers | Tom Stellard | 2017-01-25 | 10 | -35/+123
  Summary: This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1].

  Patch By: Dave Airlie
  Reviewers: nhaehnle, arsenm, tstellarAMD
  Reviewed By: arsenm
  Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye
  Differential Revision: https://reviews.llvm.org/D25428
  llvm-svn: 293000
* [OpenMP] Support for the num_threads-clause on 'target parallel' on the NVPTX device. | Arpith Chacko Jacob | 2017-01-25 | 3 | -0/+145
  This patch adds support for the SPMD construct 'target parallel' on the NVPTX device. This involves ignoring the num_threads clause on the device since the number of threads in this combined construct is already set on the host through the call to __tgt_target_teams().

  Reviewers: ABataev
  Differential Revision: https://reviews.llvm.org/D29083
  llvm-svn: 292999
* [asan] fix __sanitizer_cov_with_check to get the correct caller PC. | Kostya Serebryany | 2017-01-25 | 2 | -3/+4
  Before this fix the code relied on the fact that the other function (__sanitizer_cov) is inlined. This was true with clang builds on x86, but not true with gcc builds on x86 and on PPC. This caused bot redness after r292862.
  llvm-svn: 292998
* [OpenMP] Support for the num_threads-clause on 'target parallel'. | Arpith Chacko Jacob | 2017-01-25 | 12 | -52/+555
  The num_threads-clause on the combined directive applies to the 'parallel' region of this construct. We modify the NumThreadsClause class to capture the clause expression within the 'target' region. The offload runtime call for 'target parallel' is changed to __tgt_target_teams() with 1 team and the number of threads set by this clause or a default if none.

  Reviewers: ABataev
  Differential Revision: https://reviews.llvm.org/D29082
  llvm-svn: 292997
* [AArch64] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-01-25 | 5 | -48/+153
  llvm-svn: 292996
* GlobalISel: Use the correct types when translating landingpad instructions | Justin Bogner | 2017-01-25 | 3 | -6/+54
  There was a bug here where we were using p0 instead of s32 for the selector type in the landingpad. Instead of hardcoding these types we should get the types from the landingpad instruction directly.

  Note that we replicate an assert from SDAG here to only support two-valued landingpads.
  llvm-svn: 292995
* [asan] temporarily disable parts of a test that fail after r292862 | Kostya Serebryany | 2017-01-24 | 1 | -2/+2
  llvm-svn: 292994
* Fix llvm-objdump so it picks a good CPU for Mach-O files for CPU_SUBTYPE_ARM_V7S and CPU_SUBTYPE_ARM_V7K. | Kevin Enderby | 2017-01-24 | 3 | -0/+7
  For these two cpusubtypes it should default to a cortex-a7 CPU to give proper disassembly without a -mcpu= flag.
  rdar://27431703
  llvm-svn: 292993
* Implement LWG2556: Wide contract for future::share() | Marshall Clow | 2017-01-24 | 3 | -10/+16
  llvm-svn: 292992
* PR31742: Don't emit a bogus "zero size array" extwarn when initializing a runtime-sized array from an empty list in an array new. | Richard Smith | 2017-01-24 | 2 | -2/+4
  llvm-svn: 292991
* Change the return type of emplace_[front|back] back to void when building with C++14 or before. | Marshall Clow | 2017-01-24 | 16 | -22/+256
  Resolves PR31680.
  llvm-svn: 292990
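  Because the return type now depends on the language mode, code that must build as both C++14 and C++17 cannot use the value returned by emplace_back. A minimal sketch of a portable pattern follows; `AppendName` is a made-up helper for illustration, not part of libc++.

  ```cpp
  #include <string>
  #include <vector>

  // Portable across standard modes: ignore whatever emplace_back returns
  // (void before C++17, a reference from C++17 on) and use back() instead.
  inline std::string &AppendName(std::vector<std::string> &names,
                                 const char *name) {
    names.emplace_back(name);
    return names.back();
  }
  ```

  The same approach applies to emplace_front on containers that provide it, with front() in place of back().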
* Provide option to set pc of the file loaded in memory. | Hafiz Abid Qadeer | 2017-01-24 | 5 | -8/+23
  Summary: This commit adds an option to set PC to the entry point of the file loaded using the "target module load" command. In D28804, Greg asked me to separate this part under a different option.

  Reviewers: clayborg
  Reviewed By: clayborg
  Subscribers: lldb-commits
  Differential Revision: https://reviews.llvm.org/D28944
  llvm-svn: 292989
* [XCore] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-01-24 | 5 | -42/+53
  llvm-svn: 292988
* Fix a bug where lldb does not respect the packet size. | Hafiz Abid Qadeer | 2017-01-24 | 1 | -5/+25
  Summary: LLDB was using the packet size advertised by the target as the maximum memory size to write in one go. This is wrong because packets have other overhead apart from the memory payload. Also, memory transferred through 'm' and 'M' packets needs 2 bytes in the packet to transfer 1 byte of memory.

  Reviewers: clayborg
  Reviewed By: clayborg
  Subscribers: lldb-commits
  Differential Revision: https://reviews.llvm.org/D28808
  llvm-svn: 292987
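  The arithmetic described above can be sketched as follows. This is an assumption-laden illustration rather than LLDB's code: `MaxMemoryPayload` is a hypothetical name, and the `overhead` parameter stands in for whatever framing and header bytes a given packet needs, which the commit message does not spell out.

  ```cpp
  #include <cstddef>

  // Hypothetical helper: how many bytes of target memory fit in one 'M'
  // packet, given the packet size advertised by the remote and an assumed
  // number of non-payload bytes (framing, command, address, length).
  inline std::size_t MaxMemoryPayload(std::size_t max_packet_size,
                                      std::size_t overhead) {
    if (max_packet_size <= overhead)
      return 0;
    // Each byte of memory is hex-encoded as two characters in the packet.
    return (max_packet_size - overhead) / 2;
  }
  ```

  For example, with an advertised packet size of 1024 and an assumed overhead of 24 bytes, at most 500 bytes of memory can go into a single write, roughly half of what the old code would have attempted.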
* Remove auto_ptr in C++17. Get it back by defining _LIBCPP_ENABLE_CXX17_REMOVED_AUTO_PTR | Marshall Clow | 2017-01-24 | 19 | -7/+81
  llvm-svn: 292986
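  For code that prefers to migrate rather than define the macro, std::unique_ptr is the usual replacement for std::auto_ptr. The sketch below uses made-up names (`Widget`, `MakeWidget`, `Consume`) purely for illustration; the main visible difference is that ownership transfer must be written explicitly with std::move.

  ```cpp
  #include <memory>
  #include <utility>

  struct Widget {
    int value;
  };

  // Factory returning unique ownership, where older code might have
  // returned std::auto_ptr<Widget>.
  inline std::unique_ptr<Widget> MakeWidget(int v) {
    return std::unique_ptr<Widget>(new Widget{v});
  }

  // Taking the pointer by value makes the transfer of ownership explicit
  // at the call site: callers must pass std::move(ptr).
  inline int Consume(std::unique_ptr<Widget> w) { return w->value; }
  ```

  After `Consume(std::move(w))`, the caller's `w` is null, which is the same end state auto_ptr produced silently on copy.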
* AMDGPU: Remove spurious out branches after a kill | Matt Arsenault | 2017-01-24 | 2 | -2/+49
  A sequence like this:

      v_cmpx_le_f32_e32 vcc, 0, v0
      s_branch BB0_30
      s_cbranch_execnz BB0_30
      ; BB#29:
      exp null off, off, off, off done vm
      s_endpgm
      BB0_30: ; %endif110

  is likely wrong. The s_branch instruction will unconditionally jump to BB0_30 and the skip block (exp done + endpgm) inserted for performing the kill instruction will never be executed. This results in a GPU hang with Star Ruler 2.

  The s_branch instruction is added during the "Control Flow Optimizer" pass, which seems to re-organize the basic blocks, and we assume that SI_KILL_TERMINATOR is always the last instruction inside a basic block. Thus, after inserting a skip block we just go to the next BB without looking at the subsequent instructions after the kill, and the s_branch op is never removed.

  Instead, we should remove the unconditional out branches and let s_cbranch_execnz skip the two instructions if the exec mask is non-zero.

  This patch fixes the GPU hang and doesn't introduce any regressions with "make check".

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99019
  Patch by Samuel Pitoiset <samuel.pitoiset@gmail.com>
  llvm-svn: 292985
* Revert rL292621. Caused some internal build bot failures in apple. | Wei Mi | 2017-01-24 | 4 | -625/+0
  llvm-svn: 292984
* [SystemZ] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-01-24 | 10 | -98/+185
  llvm-svn: 292983
* Enable FeatureFlatForGlobal on Volcanic Islands | Matt Arsenault | 2017-01-24 | 276 | -379/+391
  This switches to the workaround that HSA defaults to for the mesa path.

  This should be applied to the 4.0 branch.

  Patch by Vedran Miletić <vedran@miletic.net>
  llvm-svn: 292982
* [tsan] Enable ignore_noninstrumented_modules=1 on Darwin by default | Kuba Mracek | 2017-01-24 | 51 | -53/+66
  TSan recently got the "ignore_noninstrumented_modules" flag, which disables tracking of reads and writes that come from noninstrumented modules (via interceptors). This is a way of suppressing false positives coming from system libraries and other noninstrumented code. This patch turns this on by default on Darwin, where it's supposed to replace the previous solution, "ignore_interceptors_accesses", which disables tracking in *all* interceptors. The new approach should re-enable TSan's ability to find races via interceptors on Darwin.

  Differential Revision: https://reviews.llvm.org/D29041
  llvm-svn: 292981
* Add a file comment to SyntheticSections.h. | Rui Ueyama | 2017-01-24 | 1 | -0/+15
  llvm-svn: 292980
* Explicitly promote indirect calls before sample profile annotation. | Dehao Chen | 2017-01-24 | 3 | -5/+58
  Summary: In iterative sample PGO, where the profile is collected from a PGOed binary, we may see indirect call targets promoted and inlined in the profile. Before profile annotation, we need to make this happen in order to annotate correctly on the IR. This patch explicitly promotes these indirect calls and inlines them before profile annotation.

  Reviewers: xur, davidxl
  Reviewed By: davidxl
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D29040
  llvm-svn: 292979
* Strengthen test from r292632 to also check we get the mangling correct for this case. | Richard Smith | 2017-01-24 | 1 | -2/+3
  llvm-svn: 292978
* Demangle: correct demangling for CV-qualified functions | Saleem Abdulrasool | 2017-01-24 | 1 | -6/+9
  When demangling a CV-qualified function type with a final reference type parameter, we would accidentally treat the reference type parameter as an r-value ref. This would result in the improper decoration of the function type itself.

  Resolves PR31741!
  llvm-svn: 292976
* Demangle: use named values for CV qualifiers | Saleem Abdulrasool | 2017-01-24 | 1 | -12/+18
  Rather than hard-coding magic values of 1, 2, 4 (bit-field), use an enum to name the values. NFC.
  llvm-svn: 292975
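  The kind of change described can be sketched like this. The enumerator names and the assignment of const, volatile, and restrict to the values 1, 2, and 4 are assumptions made for illustration; the commit message only states that the magic values 1, 2, 4 were given names.

  ```cpp
  // Illustrative names only; the enum in the actual patch may differ.
  enum Qualifiers : unsigned {
    QualNone = 0,
    QualConst = 1,    // previously a bare `1`
    QualVolatile = 2, // previously a bare `2`
    QualRestrict = 4, // previously a bare `4`
  };

  // Bit tests read more clearly with named values than `cv & 1`.
  inline bool HasConst(unsigned cv) { return (cv & QualConst) != 0; }
  inline bool HasVolatile(unsigned cv) { return (cv & QualVolatile) != 0; }
  inline bool HasRestrict(unsigned cv) { return (cv & QualRestrict) != 0; }
  ```

  Since the values are unchanged, such a rewrite has no functional effect, which matches the NFC marking.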
* Revert [AMDGPU][mc][tests][NFC] Add coverage/smoke tests for Gfx7 and Gfx8. | Ivan Krasin | 2017-01-24 | 3 | -241206/+0
  Reason: broke ASAN bots with a global buffer overflow.
  http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/2291

  Each test contains 20-30K test cases but takes only several (from 4 to 10) seconds to complete on an average machine. The tests cover the majority of AMDGPU Gfx7/Gfx8 instructions, including many dark corners, and are intended to quickly find out if something is broken.
  llvm-svn: 292974
* cxa_demangle: fix rvalue ref check | Saleem Abdulrasool | 2017-01-24 | 2 | -4/+4
  When checking if the type is an r-value ref, we would not do a complete check. This would result in us treating a trailing parameter reference `&)` as an r-value ref, and improperly injecting the cv qualifier on the type. We now correctly demangle the type `KFvRmE` as a constant function rather than a constant reference.

  Fixes PR31741!
  llvm-svn: 292973
* IRGen: Factor out function CodeGenAction::loadModule. NFCI. | Peter Collingbourne | 2017-01-24 | 2 | -28/+40
  llvm-svn: 292972
* Remove the load hoisting code of MLSM, it is completely subsumed by GVNHoist | Daniel Berlin | 2017-01-24 | 3 | -179/+7
  Summary: GVNHoist performs all the optimizations that MLSM does to loads, in a more general way, and in a faster time bound (MLSM is N^3 in most cases, N^4 in a few edge cases).

  This disables the load portion.

  Note that the way ld_hoist_st_sink.ll is written makes one think that the loads should be moved to the while.preheader block, but
  1. Neither MLSM nor GVNHoist do it (they both move them to identical places).
  2. MLSM couldn't possibly do it anyway, as the while.preheader block is not the head of the diamond, while.body is. (GVNHoist could do it if it was legal.)
  3. At a glance, it's not legal anyway because the in-loop loads conflict with the in-loop store, so the loads must stay in-loop.

  I am happy to update the test to use update_test_checks so that checking is tighter, just was going to do it as a followup.

  Note that I can find no particular benefit to the store portion on any real testcase/benchmark I have (even size-wise). If we really still want it, I am happy to commit to writing a targeted store sinker, just taking the code from the MemorySSA port of MergedLoadStoreMotion (which is N^2 worst case, and N most of the time).

  We can do what it does in a much better time bound.

  We also should be both hoisting and sinking stores, not just sinking them, anyway, since whether we should hoist or sink to merge depends basically on luck of the draw of where the blockers are placed.

  Nonetheless, I have left it alone for now.

  Reviewers: chandlerc, davide
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D29079
  llvm-svn: 292971
* IRGen: Factor out function clang::FindThinLTOModule. NFCI. | Peter Collingbourne | 2017-01-24 | 2 | -21/+26
  llvm-svn: 292970
* Add a test to make sure that implicit conversion from error_code to bool will fail | Marshall Clow | 2017-01-24 | 1 | -0/+30
  llvm-svn: 292969
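  The property being tested follows from std::error_code declaring its conversion to bool as explicit. The compile-time checks below are a sketch of that idea and not the test added by the commit; `Failed` is a hypothetical helper.

  ```cpp
  #include <system_error>
  #include <type_traits>

  // Implicit conversion is rejected because operator bool is explicit...
  static_assert(!std::is_convertible<std::error_code, bool>::value,
                "error_code must not convert to bool implicitly");
  // ...while direct initialization and contextual conversion still work.
  static_assert(std::is_constructible<bool, std::error_code>::value,
                "error_code must convert to bool explicitly");

  // An explicit cast is required wherever a plain bool is needed.
  inline bool Failed(const std::error_code &ec) {
    return static_cast<bool>(ec);
  }
  ```

  In an `if (ec)` condition the conversion is contextual, so no cast is needed there.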
* [docs] Add TableGen-based generator for command line argument documentation, and generate documentation for all (non-hidden) options supported by the 'clang' driver. | Richard Smith | 2017-01-24 | 10 | -64/+3823
  llvm-svn: 292968
* Update status for LWG2733 | Marshall Clow | 2017-01-24 | 1 | -1/+1
  llvm-svn: 292967
* AMDGPU/SI: Give up in promote alloca when a pointer may be captured. | Changpeng Fang | 2017-01-24 | 2 | -0/+51
  Differential Revision: http://reviews.llvm.org/D28970
  Reviewer: Matt
  llvm-svn: 292966
* Demangle: avoid butchering parameter type | Saleem Abdulrasool | 2017-01-24 | 1 | -2/+2
  When demangling a CV-qualified function type with a final parameter with a reference type, we would insert the CV qualification on the parameter rather than the function, and in the process adjust the insertion point by one extra, splitting the type name. This avoids doing so, even though the attribution is still incorrect.
  llvm-svn: 292965
* Fix test/Driver/embed-bitcode.c on non-Darwin host by setting the target explicitly | Mehdi Amini | 2017-01-24 | 1 | -2/+2
  llvm-svn: 292964
* cxa_demangle: avoid butchering the last parameter type | Saleem Abdulrasool | 2017-01-24 | 2 | -2/+8
  Fix an off-by-one case which would destroy the final parameter in a CV-qualified function type with a reference. We still get the CV qualification incorrect, but at least we do not clobber the type name any longer.

  Partially fixes PR31741.
  llvm-svn: 292963
* Implement LWG2733: [fund.ts.v2] gcd / lcm and bool. We already did this for C++17, so replicate the changes in experimental. | Marshall Clow | 2017-01-24 | 9 | -0/+204
  llvm-svn: 292962
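  For context, the C++17 counterparts std::gcd and std::lcm live in <numeric> and, per LWG2733, are ill-formed when an argument has type bool. A small sketch using the C++17 functions (not the experimental ones this commit touches) follows; `Reduce` is a made-up helper and the block assumes a C++17 compiler.

  ```cpp
  #include <numeric>
  #include <utility>

  // Reduce a fraction to lowest terms with std::gcd (C++17).
  // Assumes den != 0.
  inline std::pair<int, int> Reduce(int num, int den) {
    const int g = std::gcd(num, den);
    return {num / g, den / g};
  }

  // A call such as std::gcd(true, 4) is rejected at compile time, since
  // bool is not an acceptable argument type for gcd or lcm.
  ```

  Callers holding a bool need to convert it to an integer type explicitly before calling either function.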
* Forward -bitcode_process_mode to ld64 in marker-only mode | Mehdi Amini | 2017-01-24 | 2 | -2/+13
  Reviewers: steven_wu
  Subscribers: cfe-commits
  Differential Revision: https://reviews.llvm.org/D29066
  llvm-svn: 292961
* Split isUsingLTO() outside of embedBitcodeInObject() and embedBitcodeMarkerOnly(). | Mehdi Amini | 2017-01-24 | 3 | -11/+8
  Summary: These accessors map directly to the command line option.

  Reviewers: steven_wu
  Subscribers: cfe-commits
  Differential Revision: https://reviews.llvm.org/D29065
  llvm-svn: 292960
* [AArch64] Fix typo. NFC. | Chad Rosier | 2017-01-24 | 1 | -2/+2
  llvm-svn: 292959
* Mark LWG2736 as complete. No code changes, but we have more tests now | Marshall Clow | 2017-01-24 | 3 | -4/+36
  llvm-svn: 292958
* Use InstCombine's builder in foldSelectCttzCtlz instead of creating a new one. | Amaury Sechet | 2017-01-24 | 1 | -3/+2
  Summary: As per title. This will add the instructions we are interested in to the worklist.

  Reviewers: mehdi_amini, majnemer, andreadb
  Differential Revision: https://reviews.llvm.org/D29081
  llvm-svn: 292957
* [AMDGPU] Add VGPR copies post regalloc fix pass | Stanislav Mekhanoshin | 2017-01-24 | 5 | -0/+122
  Regalloc creates COPY instructions which do not formally use VALU. That results in v_mov instructions displaced after exec mask modification. One pass which does this is SIOptimizeExecMasking, but potentially it can be done by other passes too.

  This patch adds a pass immediately after regalloc to add an implicit exec use operand to all VGPR copy instructions.

  Differential Revision: https://reviews.llvm.org/D28874
  llvm-svn: 292956