Commit message (Collapse)AuthorAgeFilesLines
* [X86][SSE] Add support for extracting constant bit data from broadcasted ↵Simon Pilgrim2016-12-024-46/+57
| | | | | | constants llvm-svn: 288499
* [clang-move] some tweaks.Haojian Wu2016-12-022-71/+61
| | | | | | | | | * Don't save SourceManager for each declarations. * Rename some out-dated methods. No functionality change. llvm-svn: 288498
* [SLP] Fix for PR6246: vectorization for scalar ops on vector elements.Alexey Bataev2016-12-023-840/+440
| | | | | | | | | | | | | | | When trying to vectorize trees that start at insertelement instructions function tryToVectorizeList() uses vectorization factor calculated as MinVecRegSize/ScalarTypeSize. But sometimes it does not work as tree cost for this fixed vectorization factor is too high. Patch tries to improve the situation. It tries different vectorization factors from max(PowerOf2Floor(NumberOfVectorizedValues), MinVecRegSize/ScalarTypeSize) to MinVecRegSize/ScalarTypeSize and tries to choose the best one. Differential Revision: https://reviews.llvm.org/D27215 llvm-svn: 288497
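The factor search described in the SLP commit above can be sketched roughly as follows. This is a minimal illustration on plain integers, not the vectorizer's code: `powerOf2Floor` and `candidateFactors` are hypothetical names, the cost model is omitted, and halving the factor at each step is an assumption (the message only says that factors between the two bounds are tried).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Largest power of two that is <= N. Hypothetical stand-in for PowerOf2Floor.
unsigned powerOf2Floor(unsigned N) {
  assert(N > 0 && "expects a positive value");
  unsigned P = 1;
  while (P <= N / 2)
    P *= 2;
  return P;
}

// Candidate vectorization factors, widest first, down to the fixed minimum
// MinVecRegSize / ScalarTypeSize. Halving per step is an assumption.
std::vector<unsigned> candidateFactors(unsigned NumValues,
                                       unsigned MinVecRegSize,
                                       unsigned ScalarTypeSize) {
  unsigned MinVF = MinVecRegSize / ScalarTypeSize;
  assert(MinVF > 0 && "scalar type must fit in the vector register");
  unsigned MaxVF = std::max(powerOf2Floor(NumValues), MinVF);
  std::vector<unsigned> VFs;
  for (unsigned VF = MaxVF; VF >= MinVF; VF /= 2)
    VFs.push_back(VF);
  return VFs;
}
```

A caller would evaluate the tree cost for each returned factor in turn and keep the cheapest one.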
* [X86] Refactored getTargetConstantBitsFromNode to allow for expansion. NFCI.Simon Pilgrim2016-12-021-44/+56
| | | | | | | | getTargetConstantBitsFromNode currently only extracts constant pool vector data, but it will need to be generalized to support broadcast and scalar constant pool data as well. Converted Constant bit extraction and Bitset splitting to helper lambda functions. llvm-svn: 288496
* Fix a buildbot failure in include-fixer.Eric Liu2016-12-021-7/+4
| | | | llvm-svn: 288495
* Replace __ANDROID_NDK__ with __ANDROID__Pavel Labath2016-12-0218-37/+25
| | | | | | | | | | | | | | | | | | | | Summary: This replaces all the uses of the __ANDROID_NDK__ define with __ANDROID__. This is a preparatory step to remove our custom android toolchain file and rely on the standard android NDK one instead, which does not provide this define. Instead, I rely on __ANDROID__, which is set by the compiler. I haven't yet removed the cmake variable with the same name, as we will need to do something completely different there -- NDK toolchain defines CMAKE_SYSTEM_NAME to Android, while our current one pretends it's linux. Reviewers: tberghammer, zturner Subscribers: danalbert, srhines, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D27305 llvm-svn: 288494
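A guard keyed on the compiler-provided macro might look like the sketch below. The function name `platformName` and the returned strings are made up for illustration; only the `__ANDROID__` macro comes from the commit message.

```cpp
#include <cassert>
#include <string>

// Returns a label for the platform this translation unit was compiled for.
// __ANDROID__ is predefined by the compiler when targeting Android, so no
// toolchain file has to inject a custom define.
std::string platformName() {
#if defined(__ANDROID__)
  return "android";
#else
  return "other";
#endif
}
```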
* [ClangFormat] Only insert #include into the #include block in the beginning ↵Eric Liu2016-12-023-31/+195
| | | | | | | | | | | | | | | | | | of the file. Summary: This avoids inserting #include into: - raw string literals containing #include. - #if block. - Special #include among declarations (e.g. functions). Reviewers: djasper Subscribers: cfe-commits, klimek Differential Revision: https://reviews.llvm.org/D26909 llvm-svn: 288493
* [SLPVectorizer][X86] Add tests for vectorization of buildvector of scalar ↵Simon Pilgrim2016-12-021-0/+1573
| | | | | | fp-ops (PR6246) llvm-svn: 288492
* [Frontend] Fix an issue where a quoted search path is incorrectlyAlex Lorenz2016-12-022-1/+13
| | | | | | | | | | | | | | | removed as a duplicate header search path The commit r126167 started passing the First index into RemoveDuplicates, but forgot to update 0 to First in the loop that looks for the duplicate. This resulted in a bug where an -iquoted search path was incorrectly removed if you passed in the same path into -iquote and more than one time into -isystem. rdar://23991350 Differential Revision: https://reviews.llvm.org/D27298 llvm-svn: 288491
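The shape of the off-by-index bug can be shown with a simplified sketch. This is not Clang's RemoveDuplicates, which has additional rules for system and non-system directories; the function below is a hypothetical reduction that only shows why the inner scan has to start at `First` rather than 0.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Remove duplicates within SearchList[First, end), comparing each entry only
// against earlier entries at index >= First. Entries before First (for
// example the quoted search paths) are never consulted.
void removeDuplicates(std::vector<std::string> &SearchList, std::size_t First) {
  for (std::size_t I = First; I < SearchList.size(); ++I) {
    bool Duplicate = false;
    // The buggy form of this loop started at 0 instead of First.
    for (std::size_t J = First; J < I; ++J) {
      if (SearchList[J] == SearchList[I]) {
        Duplicate = true;
        break;
      }
    }
    if (Duplicate) {
      SearchList.erase(SearchList.begin() + static_cast<std::ptrdiff_t>(I));
      --I; // Re-examine the element that moved into slot I.
    }
  }
}
```

With the list {"a", "a", "b", "a"} and `First` = 1, only the final "a" is dropped; the entry at index 0 is left alone.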
* compiler-rt/test/profile/Linux/lit.local.cfg: [Py3] Use text mode ↵NAKAMURA Takumi2016-12-021-0/+1
| | | | | | (universal_newlines=True). llvm-svn: 288490
* [ScopInfo] Fold constant coefficients in array dimensions to the rightTobias Grosser2016-12-024-11/+279
| | | | | | | | | | | | | | | | | | | | | | | | | This allows us to delinearize code such as the one below, where the array sizes are A[][2 * n] as there are n times two elements in the innermost dimension. Alternatively, we could try to generate another dimension for the struct in the innermost dimension, but as the struct has constant size, recovering this dimension is easy. struct com { double Real; double Img; }; void foo(long n, struct com A[][n]) { for (long i = 0; i < 100; i++) for (long j = 0; j < 1000; j++) A[i][j].Real += A[i][j].Img; } int main() { struct com A[100][1000]; foo(1000, A); llvm-svn: 288489
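The index arithmetic behind the A[][2 * n] view can be sketched as follows. This assumes the struct is laid out as two adjacent doubles with no padding; `flatIndex` is a hypothetical helper, not part of Polly.

```cpp
#include <cassert>
#include <cstddef>

struct com {
  double Real;
  double Img;
};

// Offset, counted in doubles, of A[i][j].Real (Field = 0) or A[i][j].Img
// (Field = 1) when struct com A[][n] is viewed as double A[][2 * n].
// The constant factor 2 from the struct size is folded into the innermost
// dimension.
std::size_t flatIndex(std::size_t i, std::size_t j, std::size_t n,
                      std::size_t Field) {
  assert(Field < 2 && "struct com has two double fields");
  return i * (2 * n) + 2 * j + Field;
}
```

For n = 1000, A[1][2].Img sits at offset 1 * 2000 + 2 * 2 + 1 = 2005 doubles from the start of the array.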
* [sanitizer] Add a bunch of ifdefs for sparc targets to avoid build failures.Maxim Ostapenko2016-12-022-3/+71
| | | | | | Differential Revision: https://reviews.llvm.org/D27301 llvm-svn: 288488
* Port parallel ICF to COFF.Rui Ueyama2016-12-022-121/+142
| | | | | | | | LLD used to take 11.73 seconds to link Clang. Now it is 6.94 seconds. MSVC link takes 83.02 seconds. Note that ICF is enabled by default on Windows, so a low latency ICF is more important than in ELF. llvm-svn: 288487
* Don't include system header inside namespaceStephan Bergmann2016-12-021-16/+16
| | | | | | | ...causes build failure at least with GCC 6.2.1, as smmintrin.h indirectly includes cstdlib, which then runs into problems. llvm-svn: 288486
* Ignore R_X86_64_NONE.Rafael Espindola2016-12-022-0/+26
| | | | | | | | | | It looks like the way dtrace works is * The user creates .o files that reference magical symbol names. * dtrace reads those files, collects the info it needs and changes the relocation to R_X86_64_NONE expecting the linker to ignore them. llvm-svn: 288485
* [AVX-512] Add EVEX vpshuflw/vpshufhw/vpshufd instructions to load folding ↵Craig Topper2016-12-023-0/+281
| | | | | | tables. llvm-svn: 288484
* Fix a bug in ICF involving COFF associative sections.Rui Ueyama2016-12-022-10/+104
| | | | | | | | | | | | | | | Associative sections are sections that need to be linked if their associated sections are linked. Associative sections are used to append auxiliary data such as debug info. Previously, we compared all associative sections when comparing two comdat sections. Because usually associative sections are not mergeable sections, we missed a lot of mergeable sections. MSVC linker doesn't seem to check the identity of associative sections. This patch makes LLD ignore associative sections when doing ICF. llvm-svn: 288483
* [AVX-512] Add EVEX PSHUFB instructions to load folding tables.Craig Topper2016-12-023-0/+87
| | | | llvm-svn: 288482
* [AVX-512] Add masked VINSERTF/VINSERTI instructions to load folding tables.Craig Topper2016-12-022-1/+45
| | | | llvm-svn: 288481
* Fix the worst-case performance of ICF.Rui Ueyama2016-12-021-90/+89
| | | | | | | | | | | | | | | | | | | | | r288228 seems to have regressed ICF performance in some cases in which a lot of sections are actually mergeable. In r288228, I made a change to create a Range object for each new color group. So every time we split a group, we allocated and added a new group to a list of groups. This patch essentially reverted r288228 with an improvement to parallelize the original algorithm. Now the ICF main loop is entirely allocation-free and lock-free. Just like pre-r288228, we search for group boundaries by linear scan instead of managing the information using Range class. r288228 was performance-neutral, and so is this patch. I confirmed that this produces the exact same result as before using chromium and clang as tests. llvm-svn: 288480
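Finding group boundaries by linear scan can be sketched like this. It is a simplified model, not LLD's code: sections are reduced to a vector of color IDs in which equal colors are contiguous, and `findBoundary` and `countGroups` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index one past the last element that shares the color of
// Colors[Begin]. No per-group object is created.
std::size_t findBoundary(const std::vector<unsigned> &Colors,
                         std::size_t Begin) {
  assert(Begin < Colors.size());
  std::size_t End = Begin + 1;
  while (End < Colors.size() && Colors[End] == Colors[Begin])
    ++End;
  return End;
}

// Visits every group [Begin, End) in order and counts them. The loop itself
// performs no allocation.
std::size_t countGroups(const std::vector<unsigned> &Colors) {
  std::size_t Groups = 0;
  for (std::size_t Begin = 0; Begin < Colors.size();
       Begin = findBoundary(Colors, Begin))
    ++Groups;
  return Groups;
}
```

The trade-off is a rescan of each group on every pass in exchange for not maintaining a list of range objects.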
* [ScopInfo] Separate construction and finalization of memory accesses [NFC]Tobias Grosser2016-12-022-12/+59
| | | | | | | | | | | | | | | | | | After having built memory accesses we perform some additional transformations on them to increase the chances that our delinearization guesses the right shape. Only after these transformations, we take the assumptions that the array shape we predict is such that no out-of-bounds memory accesses arise. Before this change, the construction of the memory access, the access folding that improves the representation for certain parametric subscripts, and taking the assumption were all done right after a memory access was created. In this change we split this now into three separate iterations over all memory accesses. This means only after all memory accesses have been built, we start to canonicalize accesses, and to take assumptions. This split prepares for future canonicalizations that must consider all memory accesses for deriving additional beneficial transformations. llvm-svn: 288479
* clang/test/Driver/defsym.s: Appease targeting msc. It is incapable of ↵NAKAMURA Takumi2016-12-021-1/+1
| | | | | | external assembler in trunk. llvm-svn: 288478
* Add a test documenting how we handle addends on Elf_Rela.Rafael Espindola2016-12-021-0/+34
| | | | llvm-svn: 288477
* IR: Move NumElements field from {Array,Vector}Type to SequentialType.Peter Collingbourne2016-12-0210-75/+33
| | | | | | | | | | Now that PointerType is no longer a SequentialType, all SequentialTypes have an associated number of elements, so we can move that information to the base class, allowing for a number of simplifications. Differential Revision: https://reviews.llvm.org/D27122 llvm-svn: 288464
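The effect of hoisting the element count into the base class can be pictured with a standalone sketch. The class names mirror the commit message, but these are toy types written for illustration, not LLVM's IR classes.

```cpp
#include <cassert>
#include <cstdint>

// Toy base class: every sequential type carries its element count.
class SequentialType {
public:
  explicit SequentialType(std::uint64_t N) : NumElements(N) {}
  std::uint64_t getNumElements() const { return NumElements; }

private:
  std::uint64_t NumElements;
};

// The derived types no longer need their own NumElements field.
class ArrayType : public SequentialType {
public:
  using SequentialType::SequentialType;
};

class VectorType : public SequentialType {
public:
  using SequentialType::SequentialType;
};

// Code that used to branch on array vs. vector can query the base directly.
std::uint64_t elementCount(const SequentialType &T) {
  return T.getNumElements();
}
```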
* Change LoopUnrollPass cost from int to unsigned to make it consistent. (NFC)Dehao Chen2016-12-021-5/+5
| | | | llvm-svn: 288463
* IR: Change PointerType to derive from Type rather than SequentialType.Peter Collingbourne2016-12-0211-39/+34
| | | | | | | | | | | | | | | | | | | As proposed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-October/106640.html This is for a couple of reasons: - Values of type PointerType are unlike the other SequentialTypes (arrays and vectors) in that they do not hold values of the element type. By moving PointerType we can unify certain aspects of how the other SequentialTypes are handled. - PointerType will have no place in the SequentialType hierarchy once pointee types are removed, so this is a necessary step towards removing pointee types. Differential Revision: https://reviews.llvm.org/D26595 llvm-svn: 288462
* Allow duplicated abs symbols with the same value.Rafael Espindola2016-12-022-6/+31
| | | | | | | | | This is a fairly reasonable bfd extension since there is one obvious value. dtrace depends on this feature as it creates multiple absolute symbols with the same value. llvm-svn: 288461
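The rule for duplicated absolute symbols can be sketched with a toy symbol table. `addAbsolute` and the map-based table are hypothetical and much simpler than a real linker's symbol resolution.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy table holding absolute symbols only. Returns true if the definition is
// accepted: either the name is new, or it repeats an existing definition with
// the same value. A redefinition with a different value is rejected.
bool addAbsolute(std::map<std::string, std::uint64_t> &Table,
                 const std::string &Name, std::uint64_t Value) {
  auto Result = Table.emplace(Name, Value);
  return Result.second || Result.first->second == Value;
}
```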
* Fix GlobalISel build.Peter Collingbourne2016-12-021-1/+1
| | | | llvm-svn: 288460
* ConstantFolding: Factor code into helper functionMatt Arsenault2016-12-021-23/+34
| | | | llvm-svn: 288459
* IR: Change the gep_type_iterator API to avoid always exposing the "current" ↵Peter Collingbourne2016-12-0231-144/+141
| | | | | | | | | | | | | type. Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458
* Update implementation of ABI support for throwing noexcept function pointersRichard Smith2016-12-025-81/+30
| | | | | and catching as non-noexcept to match the final design per discussion on cxx-abi-dev. llvm-svn: 288457
* [CUDA] Fix faulty test from rL288448Jason Henline2016-12-021-1/+1
| | | | | | | | | | | | | | | | | | Summary: The test introduced by rL288448 is currently failing because unimportant but unexpected errors appear as output from a test compile line. This patch looks for a more specific error message, in order to avoid false positives. Reviewers: jlebar Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D27328 Switch to more specific error llvm-svn: 288453
* p0012r1: define corresponding feature test macroRichard Smith2016-12-022-3/+2
| | | | llvm-svn: 288452
* Write the addend to got entries when using Elf_Rel.Rafael Espindola2016-12-022-2/+37
| | | | llvm-svn: 288451
* [DWARF] Put linkage-name on abstract origin even when there's a declaration.Paul Robinson2016-12-022-36/+97
| | | | | | | | | | In r266692, we made it possible to emit linkage names for just inlined functions, putting the attribute on the abstract origin. Make sure we don't think the linkage-name was already emitted on a declaration. Differential Revision: http://reviews.llvm.org/D27320 llvm-svn: 288450
* Recover better from an incompatible .pcm file being provided by -fmodule-file=.Richard Smith2016-12-024-15/+51
| | | | | | | | | | | We try to include the headers of the module textually in this case, still enforcing the modules semantic rules. In order to make that work, we need to still track that we're entering and leaving the module. Also, if the module was also marked as unavailable (perhaps because it was missing a file), we shouldn't mark the module unavailable -- we don't need the module to be complete if we're going to enter it textually. llvm-svn: 288449
* [CUDA] "Support" ASAN arguments in CudaToolChainJason Henline2016-12-023-0/+12
| | | | | | | | | | | | | | | | | | | | | | This fixes a bug that was introduced in rL287285. The bug made it illegal to pass -fsanitize=address during CUDA compilation because the CudaToolChain class was switched from deriving from the Linux toolchain class to deriving directly from the ToolChain toolchain class. When CudaToolChain derived from Linux, it used Linux's getSupportedSanitizers method, and that method allowed ASAN, but when it switched to deriving directly from ToolChain, it inherited a getSupportedSanitizers method that didn't allow for ASAN. This patch fixes that bug by creating a getSupportedSanitizers method for CudaToolChain that supports ASAN. This patch also fixes the test that checks that -fsanitize=address is passed correctly for CUDA builds. That test previously didn't notice if an error message was emitted, and that's why it didn't catch this bug when it was first introduced. With the fix from this patch, that test will now catch any similar bug in the future. llvm-svn: 288448
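The inheritance change behind this bug can be modelled with toy classes. The types below are hypothetical stand-ins, not Clang's driver classes, and the toy base class supports no sanitizers at all, which is a simplification.

```cpp
#include <cassert>

// Toy sanitizer bitmask; the names are made up for this sketch.
constexpr unsigned SanitizeAddress = 1u << 0;
constexpr unsigned SanitizeThread = 1u << 1;

struct ToolChainBase {
  virtual ~ToolChainBase() = default;
  // Simplification: the toy base supports nothing.
  virtual unsigned getSupportedSanitizers() const { return 0; }
};

struct LinuxLike : ToolChainBase {
  unsigned getSupportedSanitizers() const override {
    return ToolChainBase::getSupportedSanitizers() | SanitizeAddress |
           SanitizeThread;
  }
};

// Before the fix, a class like this inherited the base method and lost ASAN.
// Adding an override restores it.
struct CudaLike : ToolChainBase {
  unsigned getSupportedSanitizers() const override {
    return ToolChainBase::getSupportedSanitizers() | SanitizeAddress;
  }
};
```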
* [WebAssembly] Add an -mdirect flag for the direct wasm object feature.Dan Gohman2016-12-022-0/+6
| | | | | | | Add a target flag for enabling the new direct wasm object emission feature. llvm-svn: 288447
* [ThinLTO] Stop importing constant global vars as copies in the backendTeresa Johnson2016-12-028-31/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: We were doing an optimization in the ThinLTO backends of importing constant unnamed_addr globals unconditionally as a local copy (regardless of whether the thin link decided to import them). This should be done in the thin link instead, so that resulting exported references are marked and promoted appropriately, but will need a summary enhancement to mark these variables as constant unnamed_addr. The function import logic during the thin link was trying to handle this proactively, by conservatively marking all values referenced in the initializer lists of exported global variables as also exported. However, this only handled values referenced directly from the initializer list of an exported global variable. If the value is itself a constant unnamed_addr variable, we could end up exporting its references as well. This caused multiple issues. The first is that the transitively exported references weren't promoted. Secondly, some could not be promoted/renamed (e.g. they had a section or other constraint). recursively, instead of just adding the first level of initializer list references to the ExportList directly. Remove this optimization and the associated handling in the function import backend. SPEC measurements indicate we weren't getting much from it in any case. Fixes PR31052. Reviewers: mehdi_amini Subscribers: krasin, llvm-commits Differential Revision: https://reviews.llvm.org/D26880 llvm-svn: 288446
* AMDGPU: Use wider scalar spills for SGPR spillingMatt Arsenault2016-12-024-53/+259
| | | | | | | | | | | | | | | | Since the spill is for the whole wave, these don't have the swizzling problems that vector stores do and a single 4-byte allocation is enough to spill a 64 element register. This should reduce the number of spill instructions and put all the spills for a register in the same cacheline. This should save allocated private size, but for now it doesn't. The extra slots are allocated for each component, but never used because the frame layout is essentially finalized before frame indices are replaced. For always using the scalar store path, this should probably be moved into processFunctionBeforeFrameFinalized. llvm-svn: 288445
* Delete tautological assertion.Jonathan Roelofs2016-12-021-1/+0
| | | | | | | | After r256463, both the LHS and RHS now refer to the same variable. Before, they referred to the member, the parameter respectively. Now GCC6's -Wtautological-compare complains. llvm-svn: 288444
* Fix undefined behavior.Rui Ueyama2016-12-021-7/+9
| | | | | New items can be added to Ranges here, and that invalidates an iterator that previously pointed to the end of the vector. llvm-svn: 288443
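One common way to avoid this class of bug is to iterate by index and re-read the size on every step, as in the sketch below. This is a generic illustration with a hypothetical `splitEvens` function and is not necessarily how the patch itself fixed the problem.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends half of every non-zero even element, including elements appended
// during the loop. Caching Ranges.end() before the loop would be undefined
// behavior, because push_back may reallocate and invalidate that iterator.
void splitEvens(std::vector<int> &Ranges) {
  for (std::size_t I = 0; I < Ranges.size(); ++I) {
    if (Ranges[I] != 0 && Ranges[I] % 2 == 0) {
      int Half = Ranges[I] / 2; // Copy the value before the vector may grow.
      Ranges.push_back(Half);
    }
  }
}
```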
* When instructions are hoisted out of loops by MachineLICM, remove their ↵Wolfgang Pieb2016-12-023-4/+151
| | | | | | | | | | | | | | | debug loc. This prevents erratic stepping behavior as well as incorrect source attribution for sample profiling. Reviewers: dblakie Subscribers: llvm-commit Differential Revision: https://reviews.llvm.org/D27290 llvm-svn: 288442
* SDAG: Avoid a large, usually empty SmallVector in a recursive functionJustin Bogner2016-12-021-2/+2
| | | | | | | | | | | | | | | This SmallVector is using up 128 bytes on the stack every time despite almost always being empty[1], and since this function can recurse quite deeply that adds up to a lot of overhead. We've seen this run afoul of ulimits in some cases with ASAN on. Replacing the SmallVector with a std::vector trades an occasional heap allocation for vastly less stack usage. [1]: I gathered some stats on an internal test suite and the vector was non-empty in only 45,000 of 10,000,000 calls to this function. llvm-svn: 288441
* Struct GEPs must use i32, not whatever size_t is. It should be safeJohn McCall2016-12-011-2/+4
| | | | | | | to do this unconditionally, given that the indices will always be small constant integers anyway. llvm-svn: 288440
* [AArch64] Fold more spilled/refilled COPYs.Geoff Berry2016-12-013-64/+114
| | | | | | | | | | | | | | | Summary: Make AArch64InstrInfo::foldMemoryOperandImpl more general by folding all full COPYs between register classes of the same size that are either spilled or refilled. Reviewers: MatzeB, qcolombet Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27271 llvm-svn: 288439
* [libclang] Add APIs to check the result of an integer expression in ↵Argyrios Kyrtzidis2016-12-015-6/+81
| | | | | | | | | CXEvalResult without overflow Patch by Emilio Cobos Álvarez! See https://reviews.llvm.org/D26788 llvm-svn: 288438
* [MC] Refactor emitELFSize to make usage more consistent. NFC.Dan Gohman2016-12-017-14/+11
| | | | | | | | | | | | | Move the cast<MCSymbolELF> inside emitELFSize, so that: - it's done in one place instead of at each call - it's more consistent with similar functions like EmitCOFFSafeSEH - ambiguity between cast<> and dyn_cast<> is avoided (which also eliminates an unnecessary dyn_cast call) This also makes it easier to experiment with using ".size" directives on non-ELF targets. llvm-svn: 288437
* Extend CompilationDatabase by a field for the output filenameJoerg Sonnenberger2016-12-016-9/+31
| | | | | | | | | | | | | | In bigger projects like an Operating System, the same source code is often compiled in slightly different ways. This could be the difference between PIC and non-PIC code for static vs dynamic libraries, it could also be the difference between size optimised versions of tools for ramdisk images. At the moment, the compilation database has no way to distinguish such cases. As first step, add a field in the JSON format for it and process it accordingly. Differential Revision: https://reviews.llvm.org/D27138 llvm-svn: 288436
* llvm-modextract: Call keep() on the output stream before exiting.Peter Collingbourne2016-12-012-0/+7
| | | | llvm-svn: 288435