Commit message (Collapse)AuthorAgeFilesLines
* OpenCL: Improve vector printf warningsMatt Arsenault2018-12-016-32/+192
| | | | | | | | | | | | The vector modifier is considered separate, so don't treat it as a conversion specifier. This is still not warning on some cases, like using a type that isn't a valid vector element. Fixes bug 39652 llvm-svn: 348084
* OpenCL: Extend argument promotion rules to vector typesMatt Arsenault2018-12-012-5/+57
| | | | | | | | The spec is ambiguous on whether vector types are allowed to be implicitly converted. The only legal context I think this can be used for OpenCL is printf, where it seems necessary. llvm-svn: 348083
* [X86] Add vXi8 division/remainder by non-splat constant test cases to ↵Craig Topper2018-12-016-0/+1966
| | | | | | prepare for an upcoming patch. llvm-svn: 348082
* [MachineOutliner][AArch64] Improve checks for stack instructionsJessica Paquette2018-12-014-22/+45
| | | | | | | | | | | | | If we know that we'll definitely save LR to a register, there's no reason to pre-check whether or not a stack instruction is unsafe to fix up. This makes it so that we check for that condition before mapping instructions. This allows us to outline more, since we don't pessimise as many instructions. Also update some tests, since we outline more. llvm-svn: 348081
* Replace w16/w17 in machine-outliner.mir with w11/w12Jessica Paquette2018-12-011-52/+52
| | | | | | | These registers should not be used here, since they are interprocedural scratch registers in AArch64. llvm-svn: 348080
* [X86] Don't use zero_extend_vector_inreg for mulhu lowering with sse 4.1Craig Topper2018-12-013-98/+101
| | | | | | | | | | | | | | Summary: With sse4.1 we use two zero_extend_vector_inreg and a pshufd to expand the v16i8 input into two v8i16 vectors for the multiply. That's 3 shuffles to extend one operand. The other operand is usually constant as this is mostly used by division by constant optimization. Pre sse4.1 we use a punpckhbw and a punpcklbw with a zero vector. That's two shuffles and an xor and a copy due to tied register constraints. That seems maybe better than the 3 shuffles. With AVX we avoid the copy so that's obviously better. Reviewers: spatel, RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55138 llvm-svn: 348079
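As a rough illustration of the lowering described above, the following Python sketch models the pre-SSE4.1 approach lane by lane: each `v16i8` input is widened into two `v8i16` halves (as `punpcklbw`/`punpckhbw` against a zero vector would do), the halves are multiplied, and the high byte of each 16-bit product is kept. The function names are hypothetical, and the model says nothing about instruction selection or shuffle counts.

```python
def unpack_with_zero(v16i8):
    """Zero-extend 16 unsigned bytes into two lists of eight 16-bit lanes."""
    assert len(v16i8) == 16 and all(0 <= x <= 0xFF for x in v16i8)
    lo = list(v16i8[:8])   # models punpcklbw against a zero vector
    hi = list(v16i8[8:])   # models punpckhbw against a zero vector
    return lo, hi


def mulhu_v16i8(a, b):
    """Unsigned multiply-high of two 16 x i8 vectors (hypothetical helper)."""
    a_lo, a_hi = unpack_with_zero(a)
    b_lo, b_hi = unpack_with_zero(b)
    # Multiply in 16-bit lanes; a product of two bytes always fits in 16 bits.
    products = [(x * y) & 0xFFFF for x, y in zip(a_lo + a_hi, b_lo + b_hi)]
    # Keep the high byte of every product.
    return [p >> 8 for p in products]
```

Calling `mulhu_v16i8([200] * 16, [2] * 16)` yields `1` in every lane, since 400 >> 8 is 1.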
* Introduce a way to allow the ASan dylib on Darwin platforms to be loaded via ↵Dan Liew2018-12-0110-1/+117
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | `dlopen()`. Summary: The purpose of this option is to provide a way for the ASan dylib to be loaded via `dlopen()` without triggering most initialization steps (e.g. shadow memory set up) that normally occur when the ASan dylib is loaded. This new functionality is exposed by - A `SANITIZER_SUPPORTS_INIT_FOR_DLOPEN` macro which indicates if the feature is supported. This is only true for Darwin currently. - A `HandleDlopenInit()` function which should return true if the library is being loaded via `dlopen()` and `SANITIZER_SUPPORTS_INIT_FOR_DLOPEN` is supported. Platforms that support this may perform any initialization they wish inside this function. Although disabling initialization is something that could potentially apply to other sanitizers, it appears to be unnecessary for them, so this patch only makes the change for ASan. rdar://problem/45284065 Reviewers: kubamracek, george.karpenkov, kcc, eugenis, krytarowski Subscribers: #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D54469 llvm-svn: 348078
* [TTI] Reduction costs only need to include a single extract element cost ↵Simon Pilgrim2018-12-0125-2354/+2242
| | | | | | | | | | | | | | | | (REAPPLIED) We were adding the entire scalarization extraction cost for reductions, which returns the total cost of extracting every element of a vector type. For reductions we don't need to do this - we just need to extract the 0'th element after the reduction pattern has completed. Fixes PR37731 Rebased and reapplied after being reverted in rL347541 due to PR39774 - which was fixed by D54955/rL347759 and D55017/rL347997 Differential Revision: https://reviews.llvm.org/D54585 llvm-svn: 348076
* [AMDGPU] Split 64-Bit XNOR to 64-Bit NOT/XORGraham Sellers2018-12-013-8/+140
| | | | | | | | | The identity ~(x ^ y) == (~x ^ y) == (x ^ ~y) allows XNOR (XOR/NOT) to turn into NOT/XOR. Handling this case with its own split means we can make the NOT remain in the scalar unit. Previously, we split 64-bit XNOR into two 32-bit XNOR, then lowered. Now, we get three instructions (s_not, v_xor, v_xor) rather than four in the case where either of the sources is a scalar 64-bit. Add test cases to xnor.ll to attempt XNOR Vx, Sy and XNOR Sx, Vy. Also adding test that uses the opposite identity such that (~x ^ y) on the scalar unit (or vector for gfx906) can generate XNOR. This already worked, but I didn't see a test for it. Differential: https://reviews.llvm.org/D55071 llvm-svn: 348075
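The identity quoted in the message above can be checked on 64-bit values with a short sketch. Python is used purely for illustration; `MASK64` and the helper names are assumptions and do not appear in the patch.

```python
import random

MASK64 = (1 << 64) - 1  # model 64-bit registers by masking Python integers


def not64(x):
    """Bitwise NOT restricted to 64 bits."""
    return ~x & MASK64


def xnor64(x, y):
    """XNOR written as NOT applied to the XOR result."""
    return not64(x ^ y)


def check_identity(trials=1000, seed=0):
    """Check ~(x ^ y) == (~x ^ y) == (x ^ ~y) on random 64-bit inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.getrandbits(64)
        y = rng.getrandbits(64)
        if not (xnor64(x, y) == (not64(x) ^ y) == (x ^ not64(y))):
            return False
    return True
```

Because the NOT can be applied to either operand, a backend is free to apply it to whichever source is scalar, which is the property the commit relies on.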
* [llvm-readobj] Improve dynamic section iteration NFC.Xing GUO2018-12-011-3/+9
| | | | llvm-svn: 348074
* [SelectionDAG] Improve SimplifyDemandedBits to SimplifyDemandedVectorElts ↵Simon Pilgrim2018-12-0111-537/+475
| | | | | | | | | | | | simplification D52935 introduced the ability for SimplifyDemandedBits to call SimplifyDemandedVectorElts through BITCASTs if the demanded bit mask entirely covered the sub element. This patch relaxes this to demanding an element if we need any bit from it. Differential Revision: https://reviews.llvm.org/D54761 llvm-svn: 348073
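A simplified model of the relaxation described above, assuming a little-endian layout in which element `i` occupies bits `[i * elt_bits, (i + 1) * elt_bits)` of the bitcast result. The function name and the `relaxed` flag are hypothetical; this is a sketch of the rule as stated in the message, not the SelectionDAG code.

```python
def demanded_elts(demanded_bits, num_elts, elt_bits, relaxed=True):
    """Map a demanded-bits mask on a wide value to per-element demanded flags.

    Strict rule: every touched element must be fully covered, otherwise the
    mapping is abandoned (None is returned).
    Relaxed rule: an element is demanded if any of its bits is demanded.
    """
    elt_mask = (1 << elt_bits) - 1
    flags = []
    for i in range(num_elts):
        sub = (demanded_bits >> (i * elt_bits)) & elt_mask
        if sub == 0:
            flags.append(False)      # no bit of this element is needed
        elif sub == elt_mask or relaxed:
            flags.append(True)       # fully covered, or partially under the relaxed rule
        else:
            return None              # partial cover is rejected by the strict rule
    return flags
```

For a 64-bit value viewed as four 16-bit elements with demanded mask `0x000000FF0000FFFF`, the strict rule gives up because element 2 is only partially covered, while the relaxed rule reports elements 0 and 2 as demanded.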
* [InstCombine] Support ssub.sat canonicalization for non-splatsNikita Popov2018-12-013-21/+16
| | | | | | | | | | | | Extend ssub.sat(X, C) -> sadd.sat(X, -C) canonicalization to also support non-splat vector constants. This is done by generalizing the implementation of the isNotMinSignedValue() helper to return true for constants that are non-splat, but don't contain any signed min elements. Differential Revision: https://reviews.llvm.org/D55011 llvm-svn: 348072
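The reason the canonicalization must exclude signed-minimum elements can be seen in a small 8-bit model. The helpers below are hypothetical stand-ins for the saturating intrinsics and for the per-element check; they are not the InstCombine implementation.

```python
def saturate(value, bits=8):
    """Clamp an integer to the signed range of the given width."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def ssub_sat(x, c, bits=8):
    return saturate(x - c, bits)


def sadd_sat(x, c, bits=8):
    return saturate(x + c, bits)


def negate_wrapping(c, bits=8):
    """Two's-complement negation; the signed minimum wraps to itself."""
    r = (-c) & ((1 << bits) - 1)
    return r - (1 << bits) if r >> (bits - 1) else r


def has_no_signed_min(constants, bits=8):
    """Per-element check: true if no lane holds the signed minimum."""
    return all(c != -(1 << (bits - 1)) for c in constants)
```

With `C = -128`, `ssub_sat(0, -128)` saturates to `127`, but `sadd_sat(0, negate_wrapping(-128))` gives `-128`, so the rewrite is only safe when every lane of a non-splat constant passes the check.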
* Correct indentation.Bill Wendling2018-12-011-1/+1
| | | | llvm-svn: 348071
* Specify constant context in constant emitterBill Wendling2018-12-015-4/+177
| | | | | | | The constant emitter may need to evaluate the expression in a constant context. For example, global initializer lists. llvm-svn: 348070
* [X86] Remove stale FIXME from test case. NFCCraig Topper2018-12-011-1/+0
| | | | | | This was fixed in r346581. I just forgot to remove it. llvm-svn: 348069
* [ThinLTO] Allow importing of functions with var argsTeresa Johnson2018-12-013-19/+9
| | | | | | | | | | | | | | | | | | Summary: Follow up to D54270, which allowed importing of var args functions unless they called va_start. As pointed out in the post-commit comments on that patch, the inliner can handle functions that call va_start in certain situations as well. Go ahead and enable importing of all var args functions. Measurements on a large binary show that this increases imports and binary size by an insignificant amount. Reviewers: davidxl Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D54607 llvm-svn: 348068
* [RISCV] Remove RV64I SLLW/SRLW/SRAW patterns and add new test casesAlex Bradbury2018-12-015-88/+198
| | | | | | | | | | | | | | | | | | | | | | | | | | | As noted by Eli Friedman <https://reviews.llvm.org/D52977?id=168629#1315291>, the RV64I shift patterns for SLLW/SRLW/SRAW make some incorrect assumptions. SRAW assumed that (sext_inreg foo, i32) could only be produced when sign-extending an i32. However, it can be produced by input such as: define i64 @tricky_ashr(i64 %a, i64 %b) { %1 = shl i64 %a, 32 %2 = ashr i64 %1, 32 %3 = ashr i64 %2, %b ret i64 %3 } It's important not to select sraw in the above case, because sraw only uses the lower 5 bits of the shift amount, while a shift of 32-63 would be valid. Similarly, the patterns for srlw assumed (and foo, 0xffffffff) would only be produced when zero-extending a value that was originally i32 in LLVM IR. This is obviously incorrect. This patch removes the SLLW/SRLW/SRAW shift patterns for the time being and adds test cases that would demonstrate a miscompile if the incorrect patterns were re-added. llvm-svn: 348067
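The miscompile described above can be reproduced with a small integer model. The sketch below mirrors the `@tricky_ashr` IR from the message and compares it with a model of `sraw`; the Python helper names are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def ashr64(a, b):
    """Model of `ashr i64 a, b` for 0 <= b < 64, returned as a 64-bit pattern."""
    return (to_signed(a, 64) >> b) & MASK64


def sext_inreg_i32(a):
    """Model of `shl 32` followed by `ashr 32`: sign-extend the low 32 bits."""
    return to_signed(a, 32) & MASK64


def tricky_ashr(a, b):
    """The IR function from the commit message."""
    return ashr64(sext_inreg_i32(a), b)


def sraw(a, b):
    """Model of RV64I SRAW: shift the low 32 bits of a by the low 5 bits of b."""
    return (to_signed(a, 32) >> (b & 0x1F)) & MASK64
```

For shift amounts below 32 the two agree, but `tricky_ashr(0x40000000, 32)` is `0` while `sraw(0x40000000, 32)` leaves the value unshifted at `0x40000000`, because only the low 5 bits of the shift amount are used.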
* [clangd] Recommit the "AnyScope" changes in requests.json by rCTE347753 ↵Fangrui Song2018-12-011-7/+7
| | | | | | | | (reverted by rCTE347792) This fixes IndexBenchmark tests. llvm-svn: 348066
* [Basic] Move DiagnosticsEngine::dump from .h to .cppFangrui Song2018-12-012-4/+10
| | | | | | | | | | | | | The two LLVM_DUMP_METHOD methods have an undefined reference on clang::DiagnosticsEngine::DiagStateMap::dump. tools/clang/tools/extra/clangd/benchmarks/IndexBenchmark links in clangDaemon but does not link in clangBasic explicitly, which causes a linker error "undefined symbol" in !NDEBUG + -DBUILD_SHARED_LIBS=on builds. Move LLVM_DUMP_METHOD methods to .cpp to fix IndexBenchmark. They should be unconditionally defined as they are also used by non-dump-method #pragma clang __debug diag_mapping llvm-svn: 348065
* [projects] Use add_llvm_external_project for implicit projectsShoaib Meenai2018-12-011-1/+1
| | | | | | | | | | | This allows disabling implicit projects via the LLVM_TOOL_*_BUILD variables, similar to how implicit tools can be disabled. They'll still be enabled by default, since add_llvm_external_project defaults the LLVM_TOOL_*_BUILD variables to ON for in-tree implicit projects. Differential Revision: https://reviews.llvm.org/D55105 llvm-svn: 348064
* [X86][LoopVectorize] Replace -mcpu=skylake-avx512 with -mattr=avx512f in ↵Craig Topper2018-12-013-3/+3
| | | | | | some tests that failed when experimenting with defaulting to -mprefer-vector-width=256 for skylake-avx512. llvm-svn: 348063
* Relax test to also work on Windows.Adrian Prantl2018-12-011-1/+1
| | | | llvm-svn: 348062
* [compiler-rt] Use "ColumnLimit: 0" instead of "clang-format off" in testsVitaly Buka2018-12-017-12/+2
| | | | | | | | | | Reviewers: eugenis, jfb Subscribers: kubamracek, dberris, llvm-commits Differential Revision: https://reviews.llvm.org/D55152 llvm-svn: 348061
* Honor -fdebug-prefix-map when creating function names for the debug info.Adrian Prantl2018-12-015-21/+43
| | | | | | | | | | | | | This adds a callback to PrintingPolicy to allow CGDebugInfo to remap file paths according to -fdebug-prefix-map. Otherwise the debug info (particularly function names for C++ lambdas) may contain paths that should have been remapped in the debug info. <rdar://problem/46128056> Differential Revision: https://reviews.llvm.org/D55137 llvm-svn: 348060
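The effect of the remapping described above can be sketched as follows. This is a hypothetical model only: `prefix_map` is an ordered list of `(old, new)` pairs standing in for `-fdebug-prefix-map=old=new` options, the first matching entry wins (the real matching order is not modelled here), and the `(lambda at file:line:column)` spelling is an assumption about how the printed name looks.

```python
def remap_path(path, prefix_map):
    """Replace the first matching old prefix with its new prefix."""
    for old, new in prefix_map:
        if path.startswith(old):
            return new + path[len(old):]
    return path  # no entry matched; leave the path untouched


def lambda_debug_name(filename, line, column, prefix_map):
    """Build a lambda's printed name with the file path remapped first."""
    remapped = remap_path(filename, prefix_map)
    return "(lambda at {}:{}:{})".format(remapped, line, column)
```

Without the remapping step, a name built from `/home/user/src/a.cpp` would embed the build machine's path in the debug info even though the rest of the debug info had been remapped.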
* Use RequireNullTerminator=false in identify_magic.Zachary Turner2018-12-011-1/+1
| | | | | | | | | | | identify_magic does not need the file to be null terminated. Passing true here causes the file reading code to decide not to use mmap in some rare cases (which happen to be true 100% of the time in PDB files) which can lead to very large files failing to load. Since it was probably just an accident that we were passing true here (since it is the default function parameter), this should be strictly an improvement. llvm-svn: 348059
* [lit] Add a generic build script with a lit substitution.Zachary Turner2018-12-0117-56/+702
| | | | | | | | | | | | | | | | | | | This adds a script called build.py as well as a lit substitution called %build that we can use to invoke it. The idea is that this allows a lit test to build test inferiors without having to worry about architecture / platform specific differences, command line syntax, finding / configuring a proper toolchain, and other issues. They can simply write something like: %build --arch=32 -o %t.exe %p/Inputs/foo.cpp and it will just work. This paves the way for being able to run lit tests with multiple configurations, platforms, and compilers with a single test. Differential Revision: https://reviews.llvm.org/D54914 llvm-svn: 348058
* [NVPTX] Add lowering of i128 numbers as struct fieldsArtem Belevich2018-12-012-0/+25
| | | | | | | | | | | Addition to D34555 - override VTs computation with ComputePTXValueVTs for struct fields. Author: Denys Zariaiev<denys.zariaiev@gmail.com> Differential Revision: https://reviews.llvm.org/D55144 llvm-svn: 348057
* [X86] Replace '-mcpu=skx' with -mattr=avx512f or -mattr=avx512bw in ↵Craig Topper2018-12-016-6/+6
| | | | | | interleave/strided load/store cost model tests. llvm-svn: 348056
* [windows] Fix two minor bugs on WindowsStella Stamenova2018-12-012-5/+6
| | | | | | | 1. In ProcessWindows if we fail to allocate memory, we need to return LLDB_INVALID_ADDRESS rather than 0 or nullptr as that is the invalid address that LLDB looks for 2. In RegisterContextWindows in ReadAllRegisterValues, always create a new buffer. This is what the other platforms do and data_sp is always null in all tested scenarios on Windows as well llvm-svn: 348055
* [gn build] Add action to generate VCSRevision.h and use it to add ↵Nico Weber2018-12-014-3/+141
| | | | | | | | llvm/lib/Object/BUILD.gn Differential Revision: https://reviews.llvm.org/D55090 llvm-svn: 348054
* Revert "Revert r347417 "Re-Reinstate 347294 with a fix for the failures.""Fangrui Song2018-11-3035-194/+396
| | | | | | | | | It seems the two failing tests can be simply fixed after r348037 Fix 3 cases in Analysis/builtin-functions.cpp Delete the bad CodeGen/builtin-constant-p.c for now llvm-svn: 348053
* [codeview] Remove dead macros for codeview record serialization, NFCReid Kleckner2018-11-301-23/+0
| | | | | | | These weren't needed when we went to the yaml IO style of serialization, which has "mapOptional". llvm-svn: 348052
* LegacyDivergenceAnalysis: fix uninitialized valueNicolai Haehnle2018-11-301-1/+3
| | | | | Change-Id: I014502e431a68f7beddf169f6a3d19dac5dd2c26 llvm-svn: 348051
* AMDGPU: Divergence-driven selection of scalar buffer load intrinsicsNicolai Haehnle2018-11-308-240/+123
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Moving SMRD to VMEM in SIFixSGPRCopies is rather bad for performance if the load is really uniform. So select the scalar load intrinsics directly to either VMEM or SMRD buffer loads based on divergence analysis. If an offset happens to end up in a VGPR -- either because a floating point calculation was involved, or due to other remaining deficiencies in SIFixSGPRCopies -- we use v_readfirstlane. There is some unrelated churn in tests since we now select MUBUF offsets in a unified way with non-scalar buffer loads. Change-Id: I170e6816323beb1348677b358c9d380865cd1a19 Reviewers: arsenm, alex-t, rampitec, tpr Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D53283 llvm-svn: 348050
* AMDGPU: Fix various issues around the VirtReg2Value mappingNicolai Haehnle2018-11-304-32/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The VirtReg2Value mapping is crucial for getting consistently reliable divergence information into the SelectionDAG. This patch fixes a bunch of issues that lead to incorrect divergence info and introduces tight assertions to ensure we don't regress: 1. VirtReg2Value is generated lazily; there were some cases where a lookup was performed before all relevant virtual registers were created, leading to an out-of-sync mapping. Those cases were: - Complex code to lower formal arguments that generated CopyFromReg nodes from live-in registers (fixed by never querying the mapping for live-in registers). - Code that generates CopyToReg for formal arguments that are used outside the entry basic block (fixed by never querying the mapping for Register nodes, which don't need the divergence info anyway). 2. For complex values that are lowered to a sequence of registers, all registers must be reflected in the VirtReg2Value mapping. I am not adding any new tests, since I'm not actually aware of any bugs that these problems are causing with trunk as-is. However, I recently added a test case (in r346423) which fails when D53283 is applied without this change. Also, the new assertions should provide most of the effective test coverage. There is one test change in sdwa-peephole.ll. The underlying issue is that since the divergence info is now correct, the DAGISel will select V_OR_B32 directly instead of S_OR_B32. This leads to an extra COPY which affects the behavior of MachineLICM in a way that ends up with the S_MOV_B32 with the constant in a different basic block than the V_OR_B32, which is presumably what defeats the peephole. Reviewers: alex-t, arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D54340 llvm-svn: 348049
* [DA] GPUDivergenceAnalysis for unstructured GPU kernelsNicolai Haehnle2018-11-3023-27/+1359
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is patch #3 of the new DivergenceAnalysis <https://lists.llvm.org/pipermail/llvm-dev/2018-May/123606.html> The GPUDivergenceAnalysis is intended to eventually supersede the existing LegacyDivergenceAnalysis. The existing LegacyDivergenceAnalysis produces incorrect results on unstructured Control-Flow Graphs: <https://bugs.llvm.org/show_bug.cgi?id=37185> This patch adds the option -use-gpu-divergence-analysis to the LegacyDivergenceAnalysis to turn it into a transparent wrapper for the GPUDivergenceAnalysis. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: jholewinski, jvesely, jfb, llvm-commits, alex-t, sameerds, arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D53493 llvm-svn: 348048
* [x86] add tests for undef + partial undef constant folding; NFCSanjay Patel2018-11-301-0/+90
| | | | | | Keep this file sync'd with the instsimplify version (rL348045). llvm-svn: 348047
* [X86] Split skylake-avx512 run lines in SLP vectorizer tests to cover ↵Craig Topper2018-11-3014-368/+493
| | | | | | | | -mprefer-vector-width=256 and -mprefer-vector-width=512. This will make these tests immune if we ever change the default behavior of -march=skylake-avx512 to prefer 256 bit vectors. llvm-svn: 348046
* [InstSimplify] add tests for undef + partial undef constant folding; NFCSanjay Patel2018-11-301-0/+80
| | | | | | | | These tests should probably go under a separate test file because they should fold with just -constprop, but they're similar to the scalar tests already in here. llvm-svn: 348045
* [analyzer] Deleting unnecessary test fileKristof Umann2018-11-301-54/+0
| | | | | | That I really should've done in rC348031. llvm-svn: 348044
* [ValueTracking] Make unit tests easier to write; NFCNikita Popov2018-11-301-106/+63
| | | | | | | | | | | | | | Generalize the existing MatchSelectPatternTest class to also work with other types of tests. This reduces the amount of boilerplate necessary to write ValueTracking tests in general, and computeKnownBits tests in particular. The inherited convention is that the function must be @test and the tested instruction %A. Differential Revision: https://reviews.llvm.org/D55141 llvm-svn: 348043
* Support: use std::is_trivially_copyable on MSVCSaleem Abdulrasool2018-11-301-2/+3
| | | | | | | | MSVC 2015 and newer have std::is_trivially_copyable available for use. We should prefer that over std::is_class so that this check is correct. llvm-svn: 348042
* Add myself as code owner for OpenBSD driverBrad Smith2018-11-301-0/+4
| | | | llvm-svn: 348041
* Add a test to verify that lldb can load a kext binary.Jason Molenda2018-11-302-0/+260
| | | | | | <rdar://problem/46356062> llvm-svn: 348040
* Revert r347417 "Re-Reinstate 347294 with a fix for the failures."Fangrui Song2018-11-3037-558/+197
| | | | | | | | | | Kept the "indirect_builtin_constant_p" test case in test/SemaCXX/constant-expression-cxx1y.cpp while we are investigating why the following snippet fails: extern char extern_var; struct { int a; } a = {__builtin_constant_p(extern_var)}; llvm-svn: 348039
* [analyzer] Emit an error for invalid -analyzer-config inputsKristof Umann2018-11-306-17/+175
| | | | | | Differential Revision: https://reviews.llvm.org/D53280 llvm-svn: 348038
* [ExprConstant] Try fixing __builtin_constant_p after D54355 (rC347417)Fangrui Song2018-11-301-1/+0
| | | | | | | | | | | Summary: Reinstate the original behavior (Success(false, E)) before D54355 when this branch is taken. This fixes a spurious error for the following snippet: extern char extern_var; struct { int a; } a = {__builtin_constant_p(extern_var)}; llvm-svn: 348037
* [MachineOutliner] Outline both register save calls + no LR save calls togetherJessica Paquette2018-11-302-42/+52
| | | | | | | | | | | | | | | | | Instead of treating the outlined functions for these as distinct frames, they should be combined into one case. Neither allows for stack fixups, and both generate the same frame. Thus, they ought to be considered one case. This makes the code far easier to understand, for one thing. It also offers some small code size improvements. It's fairly rare to see a class of outlined functions that doesn't fall entirely into one variant (on CTMark anyway). It does happen from time to time though. This mostly offers some serious simplification. Also update the test to show the added functionality. llvm-svn: 348036
* AArch64: Don't emit CFI for SCS register in nounwind functions.Peter Collingbourne2018-11-302-14/+24
| | | | | | | | | | All that you can legitimately do with the CFI for a nounwind function is get a backtrace, and adjusting the SCS register is not (currently) required for this purpose. Differential Revision: https://reviews.llvm.org/D54988 llvm-svn: 348035
* [TableGen] Fix negation of simple predicatesEvandro Menezes2018-11-301-14/+41
| | | | | | | | | | | | Simple predicates, such as those defined by `CheckRegOperandSimple` or `CheckImmOperandSimple`, were not being negated when used with `CheckNot`. This change fixes this issue by defining the previously declared methods to handle simple predicates. Differential revision: https://reviews.llvm.org/D55089 llvm-svn: 348034