Commit message | Author | Age | Files | Lines
...
* [StreamExecutor] Add kernel typesJason Henline2016-08-059-1/+383
| | | | | | | | | | | | Summary: Add StreamExecutor kernel types. Reviewers: jlebar, tra Subscribers: parallel_libs-commits Differential Revision: https://reviews.llvm.org/D23138 llvm-svn: 277827
* Print a more useful BP value from MSVC-built ASan runtimesReid Kleckner2016-08-051-3/+3
| | | | | | | | MSVC doesn't have an exact equivalent for __builtin_frame_address, but _AddressOfReturnAddress() + sizeof(void*) should be equivalent for all frames built with -fno-omit-frame-pointer. llvm-svn: 277826
* Fixed x2APIC discovery for 256-processor architectures.Andrey Churbanov2016-08-051-3/+3
| | | | | | | | Mask for value read from ebx register returned by CPUID expanded to 0xFFFF. Differential Revision: https://reviews.llvm.org/D23203 llvm-svn: 277825
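The effect of widening the mask can be sketched as follows. This is only an illustration: the commit message does not state the previous mask, so the narrower 0xFF value is an assumption, `processor_count` is a hypothetical helper, and the EBX value is passed in as a plain integer rather than read from CPUID.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: extract a logical-processor count from the EBX value
// returned by a CPUID topology leaf, using the supplied mask.
constexpr std::uint32_t processor_count(std::uint32_t ebx, std::uint32_t mask) {
    return ebx & mask;
}

// Assumed old mask (not stated in the commit message) and the new mask.
constexpr std::uint32_t kAssumedOldMask = 0xFF;
constexpr std::uint32_t kNewMask = 0xFFFF;
```

With an 8-bit mask a count of 256 wraps to zero, whereas the 16-bit mask preserves it.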
* AMDGPU : Add Clang builtin intrinsics for compare with the fullWei Ding2016-08-054-0/+89
| | | | | | | | wavefront result. Differential Revision: http://reviews.llvm.org/D22934 llvm-svn: 277824
* [PowerPC] Wrong fast-isel codegen for VSX floating-point loadsUlrich Weigand2016-08-052-12/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There were two locations where fast-isel would generate a LFD instruction with a target register class VSFRC instead of F8RC when VSX was enabled. This can cause invalid registers to be used in certain cases, like: lfd 36, ... instead of using a VSX load instruction. The wrong register number gets silently truncated, causing invalid code to be generated. The first place is PPCFastISel::PPCEmitLoad, which had multiple problems: 1.) The IsVSSRC and IsVSFRC flags are not initialized correctly, since they are computed from resultReg, which is still zero at this point in many cases. Fixed by changing the helper routines to operate on a register class instead of a register and passing in UseRC. 2.) Even with this fixed, Is64VSXLoad is still wrong due to a typo: bool Is32VSXLoad = IsVSSRC && Opc == PPC::LFS; bool Is64VSXLoad = IsVSSRC && Opc == PPC::LFD; The second line needs to use IsVSFRC (like PPCEmitStore does). 3.) Once both the above are fixed, we're now generating a VSX instruction -- but an incorrect one, since generation of an indexed instruction with null index is wrong. Fixed by copying the code handling the same issue in PPCEmitStore. The second place is PPCFastISel::PPCMaterializeFP, where we would emit an LFD to load a constant from the literal pool, and use the wrong result register class. Fixed by hardcoding a F8RC class even on systems supporting VSX. Fixes: https://llvm.org/bugs/show_bug.cgi?id=28630 Differential Revision: https://reviews.llvm.org/D22632 llvm-svn: 277823
* [SystemZ] Add missing classes and instructionsZhan Jun Liau2016-08-056-0/+464
| | | | | | | | | | | | | | | | Summary: Add instruction formats E, RSI, SSd, SSE, and SSF. Added BRXH, BRXLE, PR, MVCK, STRAG, and ECTG instructions to test out those formats. Reviewers: uweigand Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23179 llvm-svn: 277822
* Actually, r277337 was fine. Just kill the DAGs that made the test allow nondeterminism.Benjamin Kramer2016-08-051-20/+20
| | | | | | llvm-svn: 277821
* [SimplifyCFG] Make range reduction code deterministic.Benjamin Kramer2016-08-052-22/+23
| | | | | | | | | | | This generated IR based on the order of evaluation, which is different between GCC and Clang. With that in mind you get bootstrap miscompares if you compare a Clang built with GCC-built Clang vs. Clang built with Clang-built Clang. Diagnosing that made my head hurt. This also reverts commit r277337, which "fixed" the test case. llvm-svn: 277820
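The kind of nondeterminism described above can be sketched in isolation. The commit message does not show the affected code, so everything here is a hypothetical stand-in: `Recorder` plays the role of an IR builder that records creation order, and the two `build_*` functions contrast an unspecified evaluation order with a sequenced one.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an IR builder: records the order in which
// values are created.
struct Recorder {
    std::vector<int> order;
    int make(int id) { order.push_back(id); return id; }
};

int combine(int a, int b) { return a * 10 + b; }

// The order of the two make() calls is unspecified in C++, so different
// compilers may record {1, 2} or {2, 1}.
int build_unspecified(Recorder &r) { return combine(r.make(1), r.make(2)); }

// Each call is its own full-expression, so 1 is always created before 2.
int build_deterministic(Recorder &r) {
    int first = r.make(1);
    int second = r.make(2);
    return combine(first, second);
}

// Convenience wrapper returning the recorded creation order.
std::vector<int> deterministic_order() {
    Recorder r;
    build_deterministic(r);
    return r.order;
}
```

Both variants compute the same result; only the recorded creation order can differ, which is the sort of difference that shows up as a bootstrap miscompare.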
* reduce tests; auto-generate checksSanjay Patel2016-08-051-59/+68
| | | | llvm-svn: 277819
* [OpenMP] Sema and parsing for 'teams distribute' pragmaKelvin Li2016-08-0536-13/+3543
| | | | | | | | This patch is to implement sema and parsing for 'teams distribute' pragma. Differential Revision: https://reviews.llvm.org/D23189 llvm-svn: 277818
* [X86][SSE] Update the target shuffle matches to use the effective mask's value type directly instead of via the input value type.Simon Pilgrim2016-08-051-31/+29
| | | | | | | | Preparation for adding 2 input support so we want to avoid unnecessary references to the input value type. llvm-svn: 277817
* testing commit accessGor Nishanov2016-08-051-1/+1
| | | | llvm-svn: 277816
* [X86][SSE] Consistently use the target shuffle root value type for vector size calculations. NFCI.Simon Pilgrim2016-08-051-11/+12
| | | | | | | | Preparation for adding 2 input support so we want to avoid unnecessary references to the input value type. llvm-svn: 277814
* LLLexer.cpp: Avoid using BitsToDouble() to preserve SNaN like "double 0x7FF4000000000000".NAKAMURA Takumi2016-08-051-1/+2
| | | | | | | | | We should not use double (or float) in the LLVM, unless it is really needed. x87 FP register doesn't preserve SNaN to move the value. FIXME: APFloat() may have the constructor by raw bit. llvm-svn: 277813
* Reformat.NAKAMURA Takumi2016-08-051-1/+1
| | | | llvm-svn: 277812
* [include-fixer] Correct some header mappings.Haojian Wu2016-08-051-8/+8
| | | | | | | | | | Reviewers: bkramer Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23199 llvm-svn: 277811
* [DependenceInfo] Reset operations counter when setting limit.Michael Kruse2016-08-051-1/+3
| | | | | | | | | | | | | | | | | When entering the dependence computation with max_operations set, the operations counter may already have exceeded the limit, thus aborting any ISL computation from the start. The counter is reset at the end of the dependence calculation such that a follow-up recomputation might succeed, i.e. the success of the first dependence calculation depends on unrelated ISL operations that happened before, putting it at a disadvantage relative to the following calculations. This patch resets the operations counter at the beginning of the dependence recalculation so that it does not depend on previous actions. Otherwise, additional preprocessing of the Scop that aims to improve its schedulability (e.g. DeLICM) makes DependenceInfo, and hence the scheduling, more likely to fail, which is counterproductive to the goal of said preprocessing. llvm-svn: 277810
* Add a missing backslash to my previous commitJohn Brawn2016-08-051-1/+1
| | | | llvm-svn: 277809
* [X86][SSE] Added target shuffle combine binary compute matching function. NFCI.Simon Pilgrim2016-08-051-72/+80
| | | | | | Added matchBinaryPermuteVectorShuffle and moved the blend+zero and insertps matching code into it. llvm-svn: 277808
* Reapply r276973 "Adjust Registry interface to not require plugins to export a registry"John Brawn2016-08-0512-72/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This differs from the previous version by being more careful about template instantiation/specialization in order to prevent errors when building with clang -Werror. Specifically: * begin is not defined in the template and is instead instantiated when Head is. I think the warning when we don't do that is wrong (PR28815) but for now at least do it this way to avoid the warning. * Instead of performing template specializations in LLVM_INSTANTIATE_REGISTRY, provide a template definition then do explicit instantiation. No compiler I've tried has problems with doing it the other way, but strictly speaking it's not permitted by the C++ standard so better safe than sorry. Original commit message: Currently the Registry class contains the vestiges of a previous attempt to allow plugins to be used on Windows without using BUILD_SHARED_LIBS, where a plugin would have its own copy of a registry and export it to be imported by the tool that's loading the plugin. This only works if the plugin is entirely self-contained with the only interface between the plugin and tool being the registry, and in particular this conflicts with how IR pass plugins work. This patch changes things so that instead the add_node function of the registry is exported by the tool and then imported by the plugin, which solves this problem and also means that instead of every plugin having to export every registry they use, LLVM only has to export the add_node functions. This allows plugins that use a registry to work on Windows if LLVM_EXPORT_SYMBOLS_FOR_PLUGINS is used. llvm-svn: 277806
* [PowerPC] fix passing long double arguments to function (soft-float)Strahinja Petrovic2016-08-054-0/+65
| | | | | | | | | | This patch fixes passing long double type arguments to function in soft float mode. If fewer than 4 argument registers are free (long double type is mapped to 4 GPR registers in soft float mode), the long double type argument must be passed on the stack. Differential Revision: https://reviews.llvm.org/D20114. llvm-svn: 277804
* GPGPU: Sort dimension sizes of multi-dimensional shared memory arrays correctlyTobias Grosser2016-08-052-1/+110
| | | | | | | | | | Before this commit we generated the array type in reverse order and we also added the outermost dimension size to the new array declaration, which is incorrect as Polly already assumes an additional unsized outermost dimension, such that we had an off-by-one error in the linearization of access expressions. llvm-svn: 277802
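Linearization with an unsized outermost dimension can be sketched as below. This is a generic illustration rather than Polly's code: `linearize` and its parameters are hypothetical names, and `inner_sizes` is assumed to hold every dimension size except the outermost, in outer-to-inner order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: compute the flat offset of a multi-dimensional access.
// idx has one entry per dimension; inner_sizes has one entry fewer because
// the outermost dimension is unsized and never needed for the offset.
std::size_t linearize(const std::vector<std::size_t> &idx,
                      const std::vector<std::size_t> &inner_sizes) {
    assert(!idx.empty() && idx.size() == inner_sizes.size() + 1);
    std::size_t offset = idx[0];
    for (std::size_t d = 1; d < idx.size(); ++d)
        offset = offset * inner_sizes[d - 1] + idx[d];
    return offset;
}
```

For an array declared as A[][3][4], the access A[1][2][3] maps to (1 * 3 + 2) * 4 + 3. Supplying the outermost size as well, or listing the sizes in reverse, shifts every size by one position and yields a wrong offset.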
* [InstCombine] try to fold (select C, (sext A), B) into logical opsNicolai Haehnle2016-08-054-28/+109
| | | | | | | | | | | | | | | | | | | | | | Summary: Turn (select C, (sext A), B) into (sext (select C, A, B')) when A is i1 and B is a compatible constant, also for zext instead of sext. This will then be further folded into logical operations. The transformation would be valid for non-i1 types as well, but other parts of InstCombine prefer to have sext from non-i1 as an operand of select. Motivated by the shader compiler frontend in Mesa for AMDGPU, which emits i32 for boolean operations. With this change, the boolean logic is fully recovered. Reviewers: majnemer, spatel, tstellarAMD Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22747 llvm-svn: 277801
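The equivalence behind this fold can be checked exhaustively for the sext case with a small model. It is a sketch under stated assumptions: the 32-bit result type is chosen for illustration, the function names are hypothetical, and "compatible constant" is taken to mean B is 0 or -1 so that it has an i1 counterpart B'.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend an i1 value to 32 bits: true becomes -1, false becomes 0.
std::int32_t sext_i1(bool a) { return a ? -1 : 0; }

// Original form: select C, (sext A), B
std::int32_t before_fold(bool c, bool a, std::int32_t b) {
    return c ? sext_i1(a) : b;
}

// Folded form: sext (select C, A, B'), where B' is the i1 version of B.
std::int32_t after_fold(bool c, bool a, bool b_narrow) {
    return sext_i1(c ? a : b_narrow);
}

// Checks every combination of C, A and B in {0, -1}.
bool fold_is_equivalent() {
    for (int c = 0; c < 2; ++c)
        for (int a = 0; a < 2; ++a)
            for (int b = 0; b < 2; ++b)
                if (before_fold(c != 0, a != 0, b ? -1 : 0) !=
                    after_fold(c != 0, a != 0, b != 0))
                    return false;
    return true;
}
```

Once the select operates purely on i1 values, it can be expressed with and/or/not, which is the further folding the summary mentions.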
* Add missing 'REQUIRES' lineTobias Grosser2016-08-051-0/+2
| | | | llvm-svn: 277800
* GPGPU: Add cuda annotations to specify maximal number of threads per blockTobias Grosser2016-08-052-3/+75
| | | | | | | | These annotations ensure that the NVIDIA PTX assembler limits the number of registers used such that we can be certain the resulting kernel can be executed for the number of threads in a thread block that we are planning to use. llvm-svn: 277799
* Reverting r277632 as it breaks the build on MacOS.Ivan Krasin2016-08-052-18/+0
| | | | | | | | Reviewers: kcc Differential Revision: https://reviews.llvm.org/D23190 llvm-svn: 277798
* Fix crash in template type diffing.Richard Trieu2016-08-052-0/+36
| | | | | | | | | When the type being diffed is a type alias, and the original type is not a templated type, then there will be no unsugared TemplateSpecializationType. When this happens, exit early from the constructor. Also add assertions to the other iterator accessor to prevent the iterator from being used. llvm-svn: 277797
* Allow -1 to assign max value to unsigned bitfields.Richard Trieu2016-08-053-1/+19
| | | | | | | | Silence the -Wbitfield-constant-conversion warning for when -1 or other negative values are assigned to unsigned bitfields, provided that the bitfield is wider than the minimum number of bits needed to encode the negative value. llvm-svn: 277796
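The idiom this change stops warning about looks roughly like the following. The struct and function names are hypothetical, and the sketch only shows the language behaviour; it makes no claim about exactly which assignments still warn.

```cpp
#include <cassert>

// Hypothetical struct with a 3-bit unsigned bitfield (values 0..7).
struct Flags {
    unsigned mode : 3;
};

// Assigning -1 converts to unsigned (all bits set) and is then reduced
// modulo 2^3, so the field ends up holding its maximum value.
unsigned assign_minus_one() {
    Flags f{};
    f.mode = -1;
    return f.mode;
}
```

Because unsigned conversion is well defined as modulo arithmetic, `f.mode = -1` is a common way to set every bit of a field without spelling out its width.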
* CFI: add XFAIL test for a future optimization of two vcalls.Ivan Krasin2016-08-051-0/+60
| | | | | | | | | | | | | | | Summary: Often, code will call multiple virtual methods of a given object. If the calls are in a linear block, it should be possible to check the vtable before the first call, then store the vtable pointer and reuse it for the second vcall without any additional checks. This is expected to have a positive performance impact on a hot path in Blink, see https://crbug.com/634139. Reviewers: kcc Differential Revision: https://reviews.llvm.org/D23151 llvm-svn: 277795
* Simplify. NFC.Rui Ueyama2016-08-051-19/+8
| | | | llvm-svn: 277794
* InstCombine: Clean up some trailing whitespace. NFCJustin Bogner2016-08-054-13/+13
| | | | llvm-svn: 277793
* InstCombine: Replace some never-null pointers with references. NFCJustin Bogner2016-08-0513-104/+102
| | | | llvm-svn: 277792
* Move invariants outside of a lambda. NFC.Rui Ueyama2016-08-051-3/+3
| | | | llvm-svn: 277791
* Make combine() non-member function.Rui Ueyama2016-08-051-35/+34
| | | | | | Because this function depends only on its arguments. llvm-svn: 277790
* Change the indexing done for kernel/kext directories to be recursive.Jason Molenda2016-08-052-498/+257
| | | | | | | | | | | | | | Also re-write how most of the directory indexing is done - as it has grown over the years, it has become a bit of a mess and was overdue for a cleanup. Most importantly, this allows you to specify a directory with the platform.plugin.darwin-kernel.kext-directories setting and now lldb will search for kexts and kernels in those directories recursively. <rdar://problem/20754467> llvm-svn: 277789
* [LIT][Darwin] Change %ld64 to be prefixed with DYLD_INSERT_LIBRARIESBruno Cardoso Lopes2016-08-043-5/+8
| | | | | | | | | | | | Followup from r277778, after Mehdi's comments. Expand %ld64 to perform the necessary preload instead, that way new tests do not need to worry about setting up DYLD_INSERT_LIBRARIES themselves. rdar://problem/24300926 llvm-svn: 277788
* [Sema] Add sizeof diagnostics for bzeroBruno Cardoso Lopes2016-08-043-3/+54
| | | | | | | | | | | | | | | | | | | | | | | | | For memset (and others) we can get diagnostics like: struct stat { int x; }; void foo(struct stat *stamps) { bzero(stamps, sizeof(stamps)); memset(stamps, 0, sizeof(stamps)); } t.c:7:28: warning: 'memset' call operates on objects of type 'struct stat' while the size is based on a different type 'struct stat *' [-Wsizeof-pointer-memaccess] memset(stamps, 0, sizeof(stamps)); ~~~~~~ ^~~~~~ t.c:7:28: note: did you mean to dereference the argument to 'sizeof' (and multiply it by the number of elements)? memset(stamps, 0, sizeof(stamps)); ^~~~~~ This patch implements the same class of warnings for bzero. Differential Revision: https://reviews.llvm.org/D22525 rdar://problem/18963514 llvm-svn: 277787
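The fix that the diagnostic's note suggests can be sketched as follows. `stat_like` and the helper names are hypothetical, and `std::memset` is used for portability; the same size expression would be passed as the second argument of bzero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the struct in the example above.
struct stat_like { int x; };

// sizeof(*stamps) is the size of one element; sizeof(stamps) would be the
// size of the pointer, which is what the warning flags.
void clear_stamps(stat_like *stamps, std::size_t count) {
    std::memset(stamps, 0, sizeof(*stamps) * count);
}

// Fills a small array with non-zero values, clears it, and reports whether
// every element ended up zero.
bool all_cleared() {
    stat_like s[4] = {{1}, {2}, {3}, {4}};
    clear_stamps(s, 4);
    for (const stat_like &e : s)
        if (e.x != 0)
            return false;
    return true;
}
```

Dereferencing inside sizeof and multiplying by the element count is exactly what the note in the diagnostic proposes.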
* GVN-hoist: enable by defaultSebastian Pop2016-08-042-2/+3
| | | | llvm-svn: 277786
* GVN-hoist: fix early exit logicSebastian Pop2016-08-044-6/+41
| | | | | | | | | | The patch splits a complex && if condition into easier to read and understand logic. That wrong early exit condition was letting some instructions with not all operands available pass through when HoistingGeps was true. Differential Revision: https://reviews.llvm.org/D23174 llvm-svn: 277785
* IR: Provide an IRBuilder Inserter that calls a callback after insertionJustin Bogner2016-08-043-25/+27
| | | | | | | | | | Add a generalized IRBuilderCallbackInserter, which is just given a callback to execute after insertion. This can be used to get rid of the custom inserter in InstCombine, which will in turn allow me to add target specific InstCombineCalls API for intrinsics without horrible layering violations. llvm-svn: 277784
* [ADT] Migrate DepthFirstIterator to use NodeRefTim Shen2016-08-043-0/+5
| | | | | | | | | | | | Summary: The corresponding LLVM change is D23146. Reviewers: dblaikie, chandlerc Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23147 llvm-svn: 277783
* [LV, X86] Be more optimistic about vectorizing shifts.Michael Kuperstein2016-08-046-24/+103
| | | | | | | | | | | | | Shifts with a uniform but non-constant count were considered very expensive to vectorize, because the splat of the uniform count and the shift would tend to appear in different blocks. That made the splat invisible to ISel, and we'd scalarize the shift at codegen time. Since r201655, CodeGenPrepare sinks those splats to be next to their use, and we are able to select the appropriate vector shifts. This updates the cost model to take this into account by making shifts by a uniform cheap again. Differential Revision: https://reviews.llvm.org/D23049 llvm-svn: 277782
* Split InputSectionDescription::Sort into SortInner and SortOuter.Rui Ueyama2016-08-042-44/+44
| | | | | | | | | | | | | | | Summary: The comparator function to compare input sections as instructed by SORT command was a bit too complicated because it needed to handle four different cases. This patch splits it into two function calls. This patch also simplifies the parser. Reviewers: grimar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23140 llvm-svn: 277780
* [InstCombine] use m_APInt to allow icmp eq (mul X, C1), C2 folds for splat constant vectorsSanjay Patel2016-08-042-8/+6
| | | | | | | | | | | | | | | | This concludes the splat vector enhancements for foldICmpEqualityWithConstant(). Other commits in this series: https://reviews.llvm.org/rL277762 https://reviews.llvm.org/rL277752 https://reviews.llvm.org/rL277738 https://reviews.llvm.org/rL277731 https://reviews.llvm.org/rL277659 https://reviews.llvm.org/rL277638 https://reviews.llvm.org/rL277629 llvm-svn: 277779
* [LIT][Darwin] Preload libclang_rt.asan_osx_dynamic.dylib when necessaryBruno Cardoso Lopes2016-08-043-3/+30
| | | | | | | | | | | | | | | | | | | | | Green Dragon's darwin stage2 asan bot fails on some checks: http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check test/tools/lto/hide-linkonce-odr.ll test/tools/lto/opt-level.ll ERROR: Interceptors are not working. This may be because AddressSanitizer is loaded too late (e.g. via dlopen) To fix this, %ld64 needs to load 'libclang_rt.asan_osx_dynamic.dylib' before libLTO.dylib, via DYLD_INSERT_LIBRARIES. This won't work by updating config.environment, since some shim binary in the way scrubs the env vars. Instead, provide the path to this lib through %asanrtlib, which can then be used by tests directly with DYLD_INSERT_LIBRARIES. rdar://problem/24300926 llvm-svn: 277778
* builtins: split out the EABI and VFP ARM sourcesSaleem Abdulrasool2016-08-041-43/+55
| | | | | | | | These are meant to only be included on certain targets. This only disables it for Windows ARM for now. Ideally these would be conditionally included as appropriate. llvm-svn: 277777
* Clean up the logic of the Archive::Child::Child() with an assert to know Err is not a nullptr when we are pointed at real data.Kevin Enderby2016-08-041-21/+23
| | | | | | | | | | | | | | | | | David Blaikie pointed out some odd logic in the case the Err value was a nullptr and Lang Hames suggested it could be cleaned up with an assert to know that Err is not a nullptr when we are pointed at real data. Only in the case of constructing the sentinel value by pointing it at null data is Err permitted to be a nullptr, since no error could occur in that case. With this change the testing for “if (Err)” is removed from the constructor’s logic and *Err is used directly without any check after the assert(). llvm-svn: 277776
* GlobalISel: extend add widening to SUB, MUL, OR, AND and XOR.Tim Northover2016-08-047-3/+169
| | | | | | | These are the operations that are trivially identical. Division is omitted for now because you need to use the correct sign/zero extension. llvm-svn: 277775
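Why these operations are "trivially identical" under widening, and why division is not, can be illustrated with plain integers. This is a sketch and not GlobalISel code: the `Op` enum and helper names are hypothetical, 8-bit values are widened to 32 bits, and the upper bits are deliberately filled with arbitrary data.

```cpp
#include <cassert>
#include <cstdint>

enum class Op { Add, Sub, Mul, Or, And, Xor };

std::uint32_t apply(Op op, std::uint32_t a, std::uint32_t b) {
    switch (op) {
    case Op::Add: return a + b;
    case Op::Sub: return a - b;
    case Op::Mul: return a * b;
    case Op::Or:  return a | b;
    case Op::And: return a & b;
    case Op::Xor: return a ^ b;
    }
    return 0;
}

// Reference result: the operation truncated to 8 bits.
std::uint8_t narrow(Op op, std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(apply(op, a, b));
}

// Widened result: the upper 24 bits of each operand hold arbitrary data,
// yet the low 8 bits of the result are unaffected.
std::uint8_t widened(Op op, std::uint8_t a, std::uint8_t b, std::uint32_t garbage) {
    std::uint32_t wa = (garbage << 8) | a;
    std::uint32_t wb = (~garbage << 8) | b;
    return static_cast<std::uint8_t>(apply(op, wa, wb));
}

// Unsigned division does depend on the upper bits of the dividend, so it
// needs a proper zero (or sign) extension first.
std::uint8_t udiv_with_high_bits(std::uint8_t a, std::uint8_t b, std::uint32_t garbage) {
    std::uint32_t wa = (garbage << 8) | a;
    return static_cast<std::uint8_t>(wa / b);
}
```

For the six listed operations the low bits of the result depend only on the low bits of the inputs, so any extension works; for division, 4 / 2 with a stray bit above the low byte gives 130 instead of 2.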
* GlobalISel: add support for G_MULTim Northover2016-08-044-2/+30
| | | | llvm-svn: 277774
* [CloneFunction] Add a testcase for r277691/r277693David Majnemer2016-08-041-0/+21
| | | | | | | | | PR28848 had a very nice reduction of the underlying cause of the bug. Our ValueMap had, in an entry for an Instruction, a ConstantInt. This is not at all unexpected but should be handled properly. llvm-svn: 277773