Commit message, Author, Age, Files, Lines
* AMDGPU: Fix i128 mulMatt Arsenault2016-12-093-1/+76
| | | | llvm-svn: 289231
* AMDGPU: Allow TBA, TMA, TTMP* registers with SMEM instructionsMatt Arsenault2016-12-093-7/+155
| | | | | | Fixes assembler regressions. llvm-svn: 289230
* AMDGPU: Clean up instruction bitsMatt Arsenault2016-12-092-98/+117
| | | | | | | | | Sort the instruction bits by type and make sure there is one for each format. Also cleanup namespaces. llvm-svn: 289229
* Don't assert when redefining a built-in macro in a PCH, PR29119Nico Weber2016-12-094-5/+64
| | | | | | | | | | | | | | | | PCH files store the macro history for a given macro, and the whole history list for one identifier is given to the Preprocessor at once via Preprocessor::setLoadedMacroDirective(). This contained an assert that no macro history exists yet for that identifier. That's usually true, but it's not true for builtin macros, which are created in Preprocessor() before flags and pchs are processed. Luckily, ASTWriter stops writing macro history lists at builtins (see shouldIgnoreMacro() in ASTWriter.cpp), so the head of the history list was missing for builtin macros. So make the assert weaker, and splice the history list to the existing single define for builtins. https://reviews.llvm.org/D27545 llvm-svn: 289228
* [PPC] Add intrinsics for vector extract word and vector insert word.Sean Fertile2016-12-093-0/+34
| | | | | Revision: https://reviews.llvm.org/D26547 llvm-svn: 289227
* Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵Nirav Dave2016-12-0967-1682/+1995
| | | | | | | | UseAA is enabled." This reverts commit r289221 which appears to be triggering an assertion llvm-svn: 289226
* Store decls in prototypes on the declarator instead of in the ASTReid Kleckner2016-12-0916-109/+176
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This saves two pointers from FunctionDecl that were being used for some rare and questionable C-only functionality. The DeclsInPrototypeScope ArrayRef was added in r151712 in order to parse this kind of C code: enum e {x, y}; int f(enum {y, x} n) { return x; // should return 1, not 0 } The challenge is that we parse 'int f(enum {y, x} n)' in its own function prototype scope that gets popped before we build the FunctionDecl for 'f'. The original change was doing two questionable things: 1. Saving all tag decls introduced in prototype scope on a TU-global Sema variable. This is problematic when you have cases like this, where 'x' and 'y' shouldn't be visible in 'f': void f(void (*fp)(enum { x, y } e)) { /* no x */ } This patch fixes that, so now 'f' can't see 'x', which is consistent with GCC. 2. Storing the decls in FunctionDecl in ActOnFunctionDeclarator so that they could be used in ActOnStartOfFunctionDef. This is just an inefficient way to move information around. The AST lives forever, but the list of non-parameter decls in prototype scope is short-lived. Moving these things to the Declarator solves both of these issues. Reviewers: rsmith Subscribers: jmolloy, cfe-commits Differential Revision: https://reviews.llvm.org/D27279 llvm-svn: 289225
* Fix parsing when one extern follows another.Rafael Espindola2016-12-092-1/+18
| | | | llvm-svn: 289224
* Fix buildbots that are failing due to this test by adding all expected fails ↵Greg Clayton2016-12-091-0/+9
| | | | | | that TestMultipleDebuggers.py has. llvm-svn: 289223
* Rename multiple target test so it is unique.Greg Clayton2016-12-091-1/+1
| | | | llvm-svn: 289222
* In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵Nirav Dave2016-12-0967-1995/+1682
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enabled. Retrying after fixing an overly aggressive load-store forwarding optimization. Simplify Consecutive Merge Store Candidate Search. Now that address aliasing is much less conservative, push through a simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner, as it separates non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations. Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.ll - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates *worse* code due to store merging succeeding, as we do not do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? 
CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores *CAN* be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289221
* Use SelectionDAG.getSplatBuildVector helper. NFCI.Simon Pilgrim2016-12-091-6/+5
| | | | llvm-svn: 289220
* AMDGPU/SI: Don't mark VINTRP instructions as mayLoadTom Stellard2016-12-093-6/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: These instructions technically do read from memory, but the memory is considered to be out of bounds for normal load/store instructions. shader-db stats: SGPRS: 1416075 -> 1413323 (-0.19 %) VGPRS: 867413 -> 863935 (-0.40 %) Spilled SGPRs: 1409 -> 1354 (-3.90 %) Spilled VGPRs: 63 -> 63 (0.00 %) Private memory VGPRs: 880 -> 880 (0.00 %) Scratch size: 2648 -> 2632 (-0.60 %) dwords per thread Code Size: 37889052 -> 37897340 (0.02 %) bytes LDS: 2147 -> 2147 (0.00 %) blocks Max Waves: 279243 -> 280369 (0.40 %) Wait states: 0 -> 0 (0.00 %) Reviewers: nhaehnle, mareko, arsenm Subscribers: kzhuravl, wdng, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27593 llvm-svn: 289219
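The percentages in the shader-db stats above are plain relative changes. As a minimal sketch (the helper name `percentChange` is an assumption for illustration and is not part of shader-db or LLVM), the arithmetic looks like this:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: relative change from `before` to `after`, in percent.
// A negative result means the metric went down.
double percentChange(double before, double after) {
  return (after - before) / before * 100.0;
}
```

For example, the SGPR count going from 1416075 to 1413323 gives roughly -0.19 %, and Max Waves going from 279243 to 280369 gives roughly +0.40 %, matching the figures quoted in the commit message.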
* [SelectionDAG] Use SelectionDAG.getBuildVector helper. NFCI.Simon Pilgrim2016-12-092-9/+6
| | | | | | Makes interception of BUILD_VECTOR creation easier for debugging. llvm-svn: 289218
* Don't crash on an extra symbol in a version script.Rafael Espindola2016-12-092-1/+9
| | | | llvm-svn: 289217
* [SCEVExpander] Remove \brief, reflow comments; NFCSanjoy Das2016-12-091-81/+73
| | | | llvm-svn: 289216
* [SCEVExpander] Use llvm data structures; NFCSanjoy Das2016-12-091-6/+8
| | | | llvm-svn: 289215
* [SelectionDAG] Add additional checks to CONCAT_VECTORS creationSimon Pilgrim2016-12-091-0/+10
| | | | | | Part of the work for PR31323 - add extra asserts checking that the input vectors are of consistent type and result in the correct number of vector elements. llvm-svn: 289214
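The two checks described above (inputs share one type, and their element counts add up to the result's element count) can be sketched outside of LLVM. This is an illustration only: `VecType` and `isValidConcat` are hypothetical names, and the real change adds asserts inside SelectionDAG rather than a boolean helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified vector type: element width in bits and lane count.
struct VecType {
  unsigned elementBits;
  unsigned numElements;
};

// Returns true if concatenating `inputs` could legally produce `result`:
// every input has the same type, the element width is preserved, and the
// input lane counts sum to the result lane count.
bool isValidConcat(const std::vector<VecType> &inputs, const VecType &result) {
  if (inputs.empty())
    return false;
  const VecType &first = inputs.front();
  unsigned totalElements = 0;
  for (const VecType &in : inputs) {
    if (in.elementBits != first.elementBits ||
        in.numElements != first.numElements)
      return false; // inconsistent input types
    totalElements += in.numElements;
  }
  return result.elementBits == first.elementBits &&
         result.numElements == totalElements;
}
```

Under these assumptions, two v4i32 inputs are accepted for a v8i32 result, while mixing v4i32 with v8i16, or claiming a v4i32 result, is rejected.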
* Document and publish the useful module-file-info flag.Vassil Vassilev2016-12-091-1/+2
| | | | llvm-svn: 289213
* Give preempting symbols precedence over linker script.Rafael Espindola2016-12-092-1/+14
| | | | llvm-svn: 289212
* [LLDB][MIPS] Fix TestWatchpointIter failureNitesh Jain2016-12-091-2/+9
| | | | | | | | | | Reviewers: jingham Subscribers: jaydeep, bhushan, slthakur, lldb-commits Differential Revision: https://reviews.llvm.org/D27124 llvm-svn: 289211
* [LLDB][MIPS] Fix TestMultipleHits for MIPSNitesh Jain2016-12-091-2/+2
| | | | | | | | | | Reviewers: clayborg, labath, zturner Subscribers: jaydeep, bhushan, slthakur, lldb-commits Differential Revision: https://reviews.llvm.org/D27085 llvm-svn: 289210
* [LLDB][MIPS] Fix some test case failures due to elf_abi field of ↵Nitesh Jain2016-12-091-0/+1
| | | | | | | | | | | | qprocessInfo packet. Reviewers: jaydeep, bhushan, clayborg Subscribers: slthakur, lldb-commits Differential Revision: https://reviews.llvm.org/D26542 llvm-svn: 289209
* Plug another leak in the DWARF unittests, DIEInlineStrings are never destroyed.Benjamin Kramer2016-12-093-5/+7
| | | | llvm-svn: 289208
* Fix memory leak in unit test.Benjamin Kramer2016-12-092-3/+3
| | | | | | | The StringPool entries are destroyed with the allocator, the string pool itself is not. llvm-svn: 289207
* Update doc version to 4.0Eric Fiselier2016-12-091-2/+2
| | | | llvm-svn: 289206
* [NFC] Change whitespace to force docs rebuildEric Fiselier2016-12-091-0/+1
| | | | llvm-svn: 289205
* Fix missing const on set::count. Patch from Andrey KhalyavinEric Fiselier2016-12-092-10/+12
| | | | llvm-svn: 289204
* [clang-format] calculate MaxInsertOffset in the original code correctly.Eric Liu2016-12-092-0/+26
| | | | | | | | | | Reviewers: djasper Subscribers: klimek, cfe-commits Differential Revision: https://reviews.llvm.org/D27615 llvm-svn: 289203
* llvm/test/Object/archive-thin-create.test: Make sure that %t is empty to ↵NAKAMURA Takumi2016-12-091-0/+1
| | | | | | stabilize the test. llvm-svn: 289202
* [AVR] Remove a set of redundant testsDylan McKay2016-12-094-88/+0
| | | | | | This fixes the build. llvm-svn: 289201
* [SelectionDAG] Add partial BITCAST support to computeKnownBitsSimon Pilgrim2016-12-092-315/+167
| | | | | | | | | | Adds support for bitcasting a little endian 'small element' vector to 'large element' scalar/vector (e.g. v16i8 to v4i32 or v2i32 to i64), which is required for PR30845. We extract the knownbits for each 'small element' part and concatenate the results together. We can add support for big endian and 'large element' scalar/vector to 'small element' vector bitcasting once we have test cases for them. Differential Revision: https://reviews.llvm.org/D27129 llvm-svn: 289200
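The "extract the known bits of each small element and concatenate them" step can be illustrated with a standalone sketch for the v4i8 to i32 case. The types `KnownBits8` and `KnownBits32` and the function `concatKnownBitsLE` are assumptions made for this example; LLVM's actual code works on APInt masks of arbitrary width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for a known-bits pair: a set bit in `zero` means
// "known to be 0", a set bit in `one` means "known to be 1".
struct KnownBits8 {
  uint8_t zero;
  uint8_t one;
};

struct KnownBits32 {
  uint32_t zero;
  uint32_t one;
};

// Concatenate the known bits of four i8 elements into one i32, assuming a
// little-endian layout: element 0 supplies the least significant byte.
KnownBits32 concatKnownBitsLE(const std::vector<KnownBits8> &parts) {
  assert(parts.size() == 4 && "expected four i8 parts for one i32");
  KnownBits32 result{0, 0};
  for (std::size_t i = 0; i < parts.size(); ++i) {
    result.zero |= static_cast<uint32_t>(parts[i].zero) << (8 * i);
    result.one |= static_cast<uint32_t>(parts[i].one) << (8 * i);
  }
  return result;
}
```

The shift amount `8 * i` is where the little-endian assumption lives; a big-endian variant would place element 0 in the most significant byte instead, which is the case the commit message defers until test cases exist.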
* Fix TestMultipleTargets for on x86_64 architecturesPavel Labath2016-12-091-0/+2
| | | | | | | | | This test links against liblldb, so it can only run when the target arch is the same arch as liblldb. We already have a decorator for that, so apply it. While I'm in there, also mark the test as debug-info independent. llvm-svn: 289199
* [ELF][I386] .got.plt entries for i386 should use VA of ifunc resolverPeter Smith2016-12-094-5/+94
| | | | | | | | | | | | | | | | | | The i386 glibc ld.so expects the .got.slot entry that is relocated by a R_386_IRELATIVE relocation to point directly at the ifunc resolver and not the address of the PLT entry + 6 (thus entering the lazy resolver). This is also the case for ARM and I suspect it is because these use REL relocations and can't use the addend field to store the address of the ifunc resolver. If the lazy resolver is used we get an error message stating that only R_386_JUMP_SLOT is supported. As ARM and i386 share the same code, I've removed the ARM specific test and added a writeIgotPlt() function that by default calls writeGotPlt(). ARM and i386 override this to write the address of the ifunc resolver. Differential Revision: https://reviews.llvm.org/D27581 llvm-svn: 289198
* Refactor uses_allocator test types for upcoming fixesEric Fiselier2016-12-099-378/+666
| | | | llvm-svn: 289197
* Update Doxygen comment in StringSaver (NFC)Malcolm Parsons2016-12-091-2/+2
| | | | llvm-svn: 289196
* Put C++ ABI headers in a special build directory instead of the top level.Eric Fiselier2016-12-094-6/+97
| | | | | | | | | | | | This patch changes where the C++ ABI headers are put during the build. Previously they were put in the top level include directory (not the libc++ header directory). However that just pollutes the top level directory. Instead this patch creates a special directory to put them in. They can't be put under c++/v1 until after the build because libc++ uses the in-source headers, so we can't add the include path of the libc++ headers in the object dir. Additionally this patch teaches the test suite how to find the ABI headers, and adds a demangling utility to help debug tests with. llvm-svn: 289195
* Revert "[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes"Daniel Jasper2016-12-092-38/+10
| | | | | | | | This reverts commit r288916 as it is currently causing a crasher in Halide. Reproducer on llvm.org/PR31323. While it might be that halide is generating invalid IR, llc shouldn't crash. llvm-svn: 289194
* [X86] Modify patterns from memory form of RCP/RSQRT/SQRT intrinsics to only ↵Craig Topper2016-12-091-14/+11
| | | | | | | | | | allow (scalar_to_vector (loadf32/load64)) instead of anything that sse_load_f32/f64 can match. sse_load_f32/f64 can also match loads that are zero extended to vectors. We shouldn't match that because we wouldn't be able to get the instruction to zero the upper bits like the intrinsic semantics would require for such a case. There is a test case that does depend on this behavior. llvm-svn: 289193
* [AVR] Use a more appropriate integer type for wide IN/OUT instructionsDylan McKay2016-12-091-2/+2
| | | | | | | | | | We could previously select an integer which would hit an assertion error in pseudo expansion. The new type will also generate the appropriate fixups if needed, which wasn't done beforehand. llvm-svn: 289192
* [AVR] Add tests for a large number of pseudo instructionsDylan McKay2016-12-0928-4/+572
| | | | | | This adds MIR tests for 24 pseudo instructions. llvm-svn: 289191
* [AVX-512] Correctly preserve the passthru semantics of the FMA scalar intrinsicsCraig Topper2016-12-097-108/+156
| | | | | | | | | | | | | | | | | | | | | Summary: Scalar intrinsics have specific semantics about which input's upper bits are passed through to the output. The same input is also supposed to be the input we use for the lower element when the mask bit is 0 in a masked operation. We aren't currently keeping these semantics with instruction selection. This patch corrects this by introducing new scalar FMA ISD nodes that indicate whether operand 1 (one of the multiply inputs) or operand 3 (the addition/subtraction input) should pass thru its upper bits. We use this information to select 213/132 form for the operand 1 version and the 231 form for the operand 3 version. We also use this information to suppress combining FNEG operations on the passthru input since semantically the passthru bits aren't negated. This is stronger than the earlier check added for a user being SELECTS so we can remove that. This fixes PR30913. Reviewers: delena, zvi, v_klochkov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27144 llvm-svn: 289190
* AMDGPU: Select i16 instructions to VOP3 formsMatt Arsenault2016-12-093-20/+20
| | | | | | | | | | | | | | These were selecting directly to the VOP2 form instead of VOP3 like the i32 instructions. Fixes regressions in future commits where an immediate isn't folded because it was initially used for the second operand. Because uniform 16-bit operations are promoted to i32, it's difficult to get a simple testcase where this matters. Fold failures in SIFoldOperands here tend to be hidden by commute and fold in SIShrinkInstructions. llvm-svn: 289189
* Remove some more uses of Args::GetArgumentAtIndex.Zachary Turner2016-12-096-157/+142
| | | | llvm-svn: 289188
* Re-commit r289184, "Support: Use a 64-bit seek in raw_fd_ostream::seek()." ↵Peter Collingbourne2016-12-093-0/+12
| | | | | | with a configure-time check for lseek64. llvm-svn: 289187
* [X86] Add masked versions of VPERMT2* and VPERMI2* to load folding tables.Craig Topper2016-12-092-6/+112
| | | | llvm-svn: 289186
* Revert r289184, we need more configury for Darwin and *BSD.Peter Collingbourne2016-12-091-5/+1
| | | | llvm-svn: 289185
* Support: Use a 64-bit seek in raw_fd_ostream::seek().Peter Collingbourne2016-12-091-1/+5
| | | | llvm-svn: 289184
* Add type records to TPI stream.Rui Ueyama2016-12-092-111/+285
| | | | | | | | | I don't think the data I add to a TPI stream in this patch is correct, but at least it can be displayed using llvm-pdbdump. Until I add more streams to a PDB file, I'm not able to know whether the data will be accepted by MSVC tools or not. llvm-svn: 289183
* [SCCP] Make the test added in r289175 more meaningful.Davide Italiano2016-12-091-1/+2
| | | | | | Add a comment while here. llvm-svn: 289182