path: root/llvm/lib
Commit message, Author, Age, Files, Lines
...
* [MC] Implement the COFF directives in MCNullStreamer.Dan Gohman2017-02-271-0/+4
| | | | | | This fixes -filetype=null errors introduced in r296403. llvm-svn: 296410
* AMDGPU: Basic folds for fmed3 intrinsicMatt Arsenault2017-02-272-0/+84
| | | | | | | Constant fold, canonicalize constants to RHS, reduce to minnum/maxnum when inputs are nan/undef. llvm-svn: 296409
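As a rough illustration of what constant folding a median-of-three means, here is a minimal sketch. The helper name `med3` is hypothetical and this is not the code from the patch; it assumes none of the inputs is NaN and does not model the nan/undef reduction to minnum/maxnum.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, not the LLVM implementation: median of three floats.
// Assumes none of the inputs is NaN.
inline float med3(float A, float B, float C) {
  return std::max(std::min(A, B), std::min(std::max(A, B), C));
}

// For non-NaN inputs the median does not depend on operand order, which is
// why a fold can move a constant operand to the right-hand side.
inline void checkMed3() {
  assert(med3(1.0f, 2.0f, 3.0f) == 2.0f);
  assert(med3(3.0f, 1.0f, 2.0f) == 2.0f);
  assert(med3(2.0f, 3.0f, 1.0f) == 2.0f);
}
```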
* Remove some code accidentally left in.Zachary Turner2017-02-271-2/+0
| | | | llvm-svn: 296407
* [AddressSanitizer] Put shadow at 0 for FuchsiaPetr Hosek2017-02-271-1/+6
| | | | | | | | | | The Fuchsia ASan runtime reserves the low part of the address space. Patch by Roland McGrath Differential Revision: https://reviews.llvm.org/D30426 llvm-svn: 296405
* [CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; ↵Eugene Zelenko2017-02-276-31/+97
| | | | | | other minor fixes (NFC). llvm-svn: 296404
* [MC] Factor out non-COFF handling of COFF-specific directives.Dan Gohman2017-02-274-52/+12
| | | | | | | | | Instead of requiring every non-COFF MCObjectStreamer to implement the COFF hooks just to do an llvm_unreachable to say that they're not supported, do the llvm_unreachable in the default implementation, as suggested by rnk in https://reviews.llvm.org/D26722. llvm-svn: 296403
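The refactoring described above can be sketched as follows. All class and method names here are hypothetical stand-ins, and `unreachable` is only a simplified substitute for llvm_unreachable; the point is that the base class carries the "not supported" default so that only the COFF streamer overrides the hook.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for llvm_unreachable: report and abort.
[[noreturn]] inline void unreachable(const char *Msg) {
  std::fprintf(stderr, "UNREACHABLE: %s\n", Msg);
  std::abort();
}

// Hypothetical base class, loosely modeled on an object streamer.
class ObjectStreamerSketch {
public:
  virtual ~ObjectStreamerSketch() = default;
  // Default: the directive is COFF-only, so reaching it is a bug.
  virtual void emitCOFFSymbolType(int /*Type*/) {
    unreachable("this file format does not support this directive");
  }
  virtual std::string formatName() const = 0;
};

// A non-COFF streamer no longer needs to mention the COFF hook at all.
class ELFStreamerSketch : public ObjectStreamerSketch {
public:
  std::string formatName() const override { return "ELF"; }
};

// Only the COFF streamer overrides the hook.
class COFFStreamerSketch : public ObjectStreamerSketch {
public:
  void emitCOFFSymbolType(int Type) override { LastType = Type; }
  std::string formatName() const override { return "COFF"; }
  int LastType = -1;
};

inline void demoStreamers() {
  COFFStreamerSketch COFF;
  COFF.emitCOFFSymbolType(32);
  assert(COFF.LastType == 32);
  ELFStreamerSketch ELF;
  assert(ELF.formatName() == "ELF");
}
```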
* [WebAssembly] Add some comments and tidy up whitespace.Dan Gohman2017-02-274-5/+6
| | | | llvm-svn: 296402
* AMDGPU: Use v_med3_{f16|i16|u16}Matt Arsenault2017-02-277-33/+52
| | | | llvm-svn: 296401
* [WebAssembly] Split CFG-sorting into its own pass. NFC.Dan Gohman2017-02-277-223/+302
| | | | | | | CFG sorting was already an independent algorithm from block/loop insertion; this change makes it more convenient to debug. llvm-svn: 296399
* Revert r296366 "[InlineFunction] add nonnull assumptions based on argument ↵Hans Wennborg2017-02-271-36/+22
| | | | | | | | attributes" It causes miscompiles e.g. during self-host of Clang (PR32082). llvm-svn: 296398
* AMDGPU: Support v2i16/v2f16 packed operationsMatt Arsenault2017-02-2711-63/+378
| | | | llvm-svn: 296396
* ISel: We need to notify FastIS of the IMPLICIT_DEF we created in ↵Arnold Schwaighofer2017-02-271-1/+7
| | | | | | | | | | createSwiftErrorEntriesInEntryBlock Otherwise, it will insert instructions before it. rdar://30536186 llvm-svn: 296395
* [PDB] Partial resubmit of r296215, which improved PDB Stream Library.Zachary Turner2017-02-2730-205/+187
| This was reverted because it was breaking some builds, and because of incorrect error code usage. Since the CL was large and contained many different things, I'm resubmitting it in pieces.
|
| This portion is NFC, and consists of:
|
| 1) Renaming classes to follow a consistent naming convention.
| 2) Fixing the const-ness of the interface methods.
| 3) Adding detailed doxygen comments.
| 4) Fixing a few instances of passing `const BinaryStream& X`. These are now passed as `BinaryStreamRef X`.
|
| llvm-svn: 296394
* Revert "DAG: Check if extract_vector_elt is legal or custom"Matt Arsenault2017-02-271-1/+1
| | | | | | | This reverts r295782. This could potentially result in some legalization loops and I avoided the need for this. llvm-svn: 296393
* Empty line. NFCIXin Tong2017-02-271-1/+0
| | | | llvm-svn: 296392
* [PGO] Fix a bug in reading text format value profile.Rong Xu2017-02-271-2/+3
| | | | | | | | | | | | | | Summary: Should use the Valuekind read from the profile. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits, xur Differential Revision: https://reviews.llvm.org/D30420 llvm-svn: 296391
* [ARM] don't transform an add(ext Cond), C to select unless there's a setcc ↵Sanjay Patel2017-02-271-1/+1
| of the condition
|
| The transform in question claims to be doing:
|
|   // fold (add (select cc, 0, c), x) -> (select cc, x, (add, x, c))
|
| ...starting in PerformADDCombineWithOperands(), but it wasn't actually checking for a setcc node for the sext/zext patterns.
|
| This is exactly the opposite of a transform I'd like to add to DAGCombiner's foldSelectOfConstants(), so I was seeing infinite loops with my draft of a patch applied.
|
| The changes in select_const.ll look positive (fewer instructions).
|
| The change in arm-and-tst-peephole.ll is unrelated. We're changing the input IR in that test to preserve the intent of the test, but that's not affected by this code change.
|
| Differential Revision: https://reviews.llvm.org/D30355
|
| llvm-svn: 296389
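The fold quoted in the message is an algebraic identity, which a small check can make concrete. This is only a sketch with hypothetical function names; it uses unsigned 32-bit arithmetic so that wraparound is well defined, and it says nothing about the setcc check that the patch adds.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the fold: add (select cc, 0, c), x
inline uint32_t addOfSelect(bool CC, uint32_t C, uint32_t X) {
  uint32_t Sel = CC ? 0u : C;
  return Sel + X;
}

// Right-hand side of the fold: select cc, x, (add x, c)
inline uint32_t selectOfAdd(bool CC, uint32_t C, uint32_t X) {
  return CC ? X : X + C;
}

// Compare both forms over a few sample values, including wraparound cases.
inline void checkFormsAgree() {
  const uint32_t Samples[] = {0u, 1u, 7u, 0x7fffffffu, 0xffffffffu};
  for (uint32_t C : Samples)
    for (uint32_t X : Samples)
      for (int CC = 0; CC < 2; ++CC)
        assert(addOfSelect(CC != 0, C, X) == selectOfAdd(CC != 0, C, X));
}
```

Because both directions compute the same value, two combines that each prefer the opposite form can undo each other indefinitely, which is consistent with the infinite loops mentioned above.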
* AMDGPU: Add some of the new gfx9 VOP3 instructionsMatt Arsenault2017-02-271-0/+12
| | | | llvm-svn: 296382
* [X86][SSE] Attempt to extract vector elements through target shufflesSimon Pilgrim2017-02-272-0/+112
| | | | | | | | DAGCombiner already supports peeking through shuffles to improve vector element extraction, but legalization often leaves us in situations where we need to extract vector elements after shuffles have already been lowered. This patch adds support for VECTOR_EXTRACT_ELEMENT/PEXTRW/PEXTRB instructions to attempt to handle target shuffles as well. I've covered some basic scenarios, including handling shuffle mask scaling and the implicit zero-extension of PEXTRW/PEXTRB. There is more that could be done here (that I've mentioned in TODOs), but I haven't found many cases where it's worth it. Differential Revision: https://reviews.llvm.org/D30176 llvm-svn: 296381
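The general idea of extracting an element "through" a shuffle can be sketched on plain arrays. Everything here is an assumption made for illustration: the function name is hypothetical, the mask follows the usual convention that indices 0..N-1 select from the first input, N..2N-1 from the second, and a negative entry means undef. Mask scaling and the PEXTRW/PEXTRB zero-extension are not modeled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical sketch: read lane Idx of shuffle(A, B, Mask) directly from
// the shuffle inputs. Returns false when the lane is undef (negative mask).
template <std::size_t N>
bool extractThroughShuffle(const std::array<int, N> &A,
                           const std::array<int, N> &B,
                           const std::array<int, N> &Mask,
                           std::size_t Idx, int &Out) {
  assert(Idx < N && "extract index out of range");
  const int M = Mask[Idx];
  if (M < 0)
    return false; // undef lane: nothing to extract
  const std::size_t Src = static_cast<std::size_t>(M);
  assert(Src < 2 * N && "mask entry out of range");
  Out = Src < N ? A[Src] : B[Src - N];
  return true;
}

inline void demoExtract() {
  const std::array<int, 4> A = {{10, 11, 12, 13}};
  const std::array<int, 4> B = {{20, 21, 22, 23}};
  const std::array<int, 4> Mask = {{0, 5, -1, 3}};
  int V = 0;
  assert(extractThroughShuffle(A, B, Mask, 1, V) && V == 21);
  assert(!extractThroughShuffle(A, B, Mask, 2, V));
}
```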
* AMDGPU: Support inlineasm for packed instructionsMatt Arsenault2017-02-271-1/+42
| | | | | | | Add packed types as legal so they may be used with inlineasm. Keep all operations expanded for now. llvm-svn: 296379
* AMDGPU: Don't fold immediate if clamp/omod are setMatt Arsenault2017-02-272-8/+13
| | | | | | | Doesn't fix any practical problems because clamp/omod are currently folded after peephole optimizer. llvm-svn: 296375
* AMDGPU: Fold omod into instructionsMatt Arsenault2017-02-273-6/+146
| | | | llvm-svn: 296372
* [TailDuplicator] Maintain DebugLoc for branch instructionsTaewook Oh2017-02-271-1/+2
| | | | | | | | | | | | Summary: The existing implementation of the duplicateSimpleBB function drops the DebugLoc metadata of branch instructions during the transformation. This patch addresses the issue by making newly created branch instructions keep the metadata of the branch instructions they replace. Reviewers: qcolombet, craig.topper, aprantl, MatzeB, sanjoy, dblaikie Reviewed By: dblaikie Subscribers: dblaikie, llvm-commits Differential Revision: https://reviews.llvm.org/D30026 llvm-svn: 296371
* AMDGPU: Add f16 to shader calling conventionsMatt Arsenault2017-02-271-3/+3
| | | | | | Mostly useful for writing tests for f16 features. llvm-svn: 296370
* AMDGPU: Add VOP3P instruction formatMatt Arsenault2017-02-2723-86/+879
| | | | | | Add a few non-VOP3P instructions related to packed operations. Includes a hack with dummy operands for the benefit of the assembler. llvm-svn: 296368
* [InlineFunction] add nonnull assumptions based on argument attributesSanjay Patel2017-02-271-22/+36
| | | | | | | | | | | This was suggested in D27855: have the inliner add assumptions, so we don't lose nonnull info provided by argument attributes. This still doesn't solve PR28430 (dyn_cast), but this gets us closer. https://reviews.llvm.org/D29999 llvm-svn: 296366
* [Hexagon] Defs and clobbers can overlapKrzysztof Parzyszek2017-02-271-5/+4
| | | | llvm-svn: 296365
* Fix a bug when unswitching on partial LIV for SwitchInstXin Tong2017-02-271-32/+128
| | | | | | | | | | | | | | Summary: Fix a bug when unswitching on partial LIV for SwitchInst. Reviewers: hfinkel, efriedma, sanjoy Reviewed By: sanjoy Subscribers: david2050, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D29107 llvm-svn: 296363
* Fix comments. NFC.Rong Xu2017-02-271-1/+1
| | | | | | Change "Thin-LTO" to "ThinLTO" in the comments for consistency. llvm-svn: 296362
* [X86] Use APInt instead of SmallBitVector tracking undef elements from ↵Craig Topper2017-02-271-25/+25
| | | | | | | | | | | | | | | | | getTargetConstantBitsFromNode and getConstVector. Summary: SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here use more than that number of elements and therefore cause a malloc. APInt on the other hand supports up to 64 bits without a malloc. That's the maximum number of bits we need here so we can avoid a malloc for all cases by using APInt. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30392 llvm-svn: 296355
* [X86] Use APInt instead of SmallBitVector for tracking Zeroable elements in ↵Craig Topper2017-02-271-63/+57
| | | | | | | | | | | | | | | | | shuffle lowering Summary: SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here use more than that number of elements and therefore cause a malloc. APInt on the other hand supports up to 64 bits without a malloc. That's the maximum number of bits we need here so we can avoid a malloc for all cases by using APInt. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30390 llvm-svn: 296354
* [X86] Fix SmallVector sizes in constant pool shuffle decoding to avoid heap ↵Craig Topper2017-02-271-5/+5
| | | | | | | | allocation Some of the vectors were too small to avoid heap allocation. In one case the vector was oversized. Differential Revision: https://reviews.llvm.org/D30387 llvm-svn: 296353
* [X86] Use APInt instead of SmallBitVector for tracking undef elements in ↵Craig Topper2017-02-271-10/+10
| | | | | | | | | | | | | | | | | constant pool shuffle decoding Summary: SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here use more than that number of elements and therefore cause a malloc. APInt on the other hand supports up to 64 bits without a malloc. That's the maximum number of bits we need here so we can avoid a malloc for all cases by using APInt. This will incur a minor increase in stack usage due to APInt storing the bit count separately from the data bits unlike SmallBitVector, but that should be ok. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30386 llvm-svn: 296352
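The trade-off described in these three commits can be pictured with a tiny fixed-capacity mask. The class below is a hypothetical stand-in written for this note, not APInt or SmallBitVector: it keeps up to 64 flags in one inline word and, like the message says of APInt, stores the element count in a separate field, so it never touches the heap but takes a little more stack space.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: an undef-element mask for at most 64 vector elements.
// The flags live in one inline 64-bit word; the element count is stored
// separately from the data bits, so no heap allocation is ever needed.
class UndefMask64 {
  uint64_t Bits = 0;
  unsigned NumElts;

public:
  explicit UndefMask64(unsigned N) : NumElts(N) {
    assert(N >= 1 && N <= 64 && "sketch supports 1..64 elements");
  }
  void setBit(unsigned I) {
    assert(I < NumElts && "element index out of range");
    Bits |= (uint64_t(1) << I);
  }
  bool operator[](unsigned I) const {
    assert(I < NumElts && "element index out of range");
    return ((Bits >> I) & 1u) != 0;
  }
  unsigned size() const { return NumElts; }
  bool none() const { return Bits == 0; }
};

inline void demoUndefMask() {
  UndefMask64 Undefs(64); // e.g. a 64-element byte vector
  Undefs.setBit(0);
  Undefs.setBit(63);
  assert(Undefs[0] && Undefs[63] && !Undefs[1]);
}
```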
* Loop predication expand both sides of the widened conditionArtur Pilipenko2017-02-271-5/+7
| | | | | | | | | | | | This is a fix for a loop predication bug which resulted in malformed IR generation. Loop invariant side of the widened condition is not guaranteed to be available in the preheader as is, so we need to expand it as well. See added unsigned_loop_0_to_n_hoist_length test for example. Reviewed By: sanjoy, mkazantsev Differential Revision: https://reviews.llvm.org/D30099 llvm-svn: 296345
* AArch64InstPrinter: rewrite of printSysAliasSjoerd Meijer2017-02-273-316/+163
| | | | | | | | | | | | | | | | | | This is a cleanup/rewrite of the printSysAlias function. This was not using the tablegen instruction descriptions, but was "manually" decoding the instructions. This has been replaced with calls to lookup_XYZ_ByEncoding tablegen calls. This revealed several problems. First, instruction IVAU had the wrong encoding. This was cancelled out by the parser that incorrectly matched the wrong encoding. Second, instruction CVAP was missing from the SystemOperands tablegen descriptions, so this has been added. And third, the required target features were not captured in the tablegen descriptions, so support for this has also been added. Differential Revision: https://reviews.llvm.org/D30329 llvm-svn: 296343
* [ARM] LSL #0 is an alias of MOVJohn Brawn2017-02-272-12/+39
| | | | | | | | | | | | | | | | | | | | | | | | Currently we handle this correctly in arm, but in thumb we don't which leads to an unpredictable instruction being emitted for LSL #0 in an IT block and SP not being permitted in some cases when it should be. For the thumb2 LSL we can handle this by making LSL #0 an alias of MOV in the .td file, but for thumb1 we need to handle it in checkTargetMatchPredicate to get the IT handling right. We also need to adjust the handling of MOV rd, rn, LSL #0 to avoid generating the 16-bit encoding in an IT block. We should also adjust it to allow SP in the same way that it is allowed in MOV rd, rn, but I haven't done that here because it looks like it would take quite a lot of work to get right. Additionally correct the selection of the 16-bit shift instructions in processInstruction, where it was checking if the two registers were equal when it should have been checking if they were low. It appears that previously this code was never executed and the 16-bit encoding was selected by default, but the other changes I've done here have somehow made it start being used. Differential Revision: https://reviews.llvm.org/D30294 llvm-svn: 296342
* [DAGCombine] Fix for a load combine bug with non-zero offset patterns on BE ↵Artur Pilipenko2017-02-271-0/+4
| targets
|
| This pattern is essentially an i16 load from p+1 address:
|
|   %p1.i16 = bitcast i8* %p to i16*
|   %p2.i8 = getelementptr i8, i8* %p, i64 2
|   %v1 = load i16, i16* %p1.i16
|   %v2.i8 = load i8, i8* %p2.i8
|   %v2 = zext i8 %v2.i8 to i16
|   %v1.shl = shl i16 %v1, 8
|   %res = or i16 %v1.shl, %v2
|
| Current implementation would identify %v1 load as the first byte load and would mistakenly emit an i16 load from %p1.i16 address. This patch adds a check that the first byte is loaded from a non-zero offset of the first load address. This way this address can be used as the base address for the combined value. Otherwise just give up combining.
|
| llvm-svn: 296336
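To see why the pattern amounts to an i16 load from p+1 on a big-endian target, the IR can be evaluated by hand on a byte array. This is a sketch with hypothetical helper names; `loadBE16` models a big-endian load explicitly so the result does not depend on the host's endianness.

```cpp
#include <cassert>
#include <cstdint>

// Big-endian i16 load from Mem[Off], Mem[Off + 1], modeled explicitly.
inline uint16_t loadBE16(const uint8_t *Mem, unsigned Off) {
  return static_cast<uint16_t>((Mem[Off] << 8) | Mem[Off + 1]);
}

// The IR pattern from the commit message, evaluated with big-endian loads.
inline uint16_t evalPattern(const uint8_t *P) {
  uint16_t V1 = loadBE16(P, 0);                     // %v1 = load i16 from %p
  uint16_t V2 = P[2];                               // zext of the i8 at %p + 2
  uint16_t V1Shl = static_cast<uint16_t>(V1 << 8);  // shl i16 drops the high byte
  return static_cast<uint16_t>(V1Shl | V2);         // %res
}

inline void demoPattern() {
  const uint8_t Mem[3] = {0x11, 0x22, 0x33};
  assert(evalPattern(Mem) == 0x2233);           // bytes at p+1 and p+2
  assert(evalPattern(Mem) == loadBE16(Mem, 1)); // same as an i16 load from p+1
  assert(evalPattern(Mem) != loadBE16(Mem, 0)); // not an i16 load from p
}
```

The shift discards the byte at p, so the result is built from the bytes at p+1 and p+2 only, and a combined load from %p1.i16 would read the wrong pair.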
* [DAGCombine] NFC. MatchLoadCombine extract MemoryByteOffset lambda helperArtur Pilipenko2017-02-271-9/+13
| | | | | | This refactoring will simplify the upcoming change to fix the bug in folding patterns with non-zero offsets on BE targets. llvm-svn: 296332
* [DAGCombine] NFC. MatchLoadCombine remember the first byte provider, not the ↵Artur Pilipenko2017-02-271-3/+5
| | | | | | | | load node This refactoring will simplify the upcoming change to fix a bug in folding patterns with non-zero offsets on BE targets. llvm-svn: 296331
* AArch64AsmParser: don't try to parse “[1]” for non-vector register operandsSjoerd Meijer2017-02-271-25/+0
| | | | | | | | | There are no instructions that have "[1]" as part of the assembly string; FMOVXDhighr is out of date. This removes dead code. Differential Revision: https://reviews.llvm.org/D30165 llvm-svn: 296327
* [AMDGPU] Runtime metadata fixes:Konstantin Zhuravlyov2017-02-275-32/+79
| - Verify that runtime metadata is actually valid runtime metadata when assembling, otherwise we could accept the following when assembling, but ocl runtime will reject it:
|     .amdgpu_runtime_metadata { amd.MDVersion: [ 2, 1 ], amd.RandomUnknownKey, amd.IsaInfo: ...
| - Make IsaInfo optional, and always emit it.
|
| Differential Revision: https://reviews.llvm.org/D30349
|
| llvm-svn: 296324
* [X86] Check for less than 0 rather than explicit compare with -1. NFCCraig Topper2017-02-271-2/+3
| | | | llvm-svn: 296321
* Update comments. NFCIXin Tong2017-02-261-2/+2
| | | | llvm-svn: 296298
* Revert "[CGP] Split some critical edges coming out of indirect branches"Daniel Jasper2017-02-261-155/+0
| | | | | | | This reverts commit r296149 as it leads to crashes when compiling for PPC. llvm-svn: 296295
* [LoopDeletion] Modernize and simplify a bit. NFCI.Davide Italiano2017-02-261-8/+3
| | | | llvm-svn: 296294
* [X86] Fix execution domain for cmpss/sd instructions.Craig Topper2017-02-261-0/+8
| | | | llvm-svn: 296293
* [AVX-512] Fix execution domain for scalar commutable min/max instructions.Craig Topper2017-02-261-1/+1
| | | | llvm-svn: 296292
* [AVX-512] Fix execution domain for vmovhpd/lpd/hps/lps.Craig Topper2017-02-261-0/+1
| | | | llvm-svn: 296291
* [AVX-512] Fix the execution domain for AVX-512 integer broadcasts.Craig Topper2017-02-261-0/+1
| | | | llvm-svn: 296290
* [AVX-512] Disable the redundant patterns in the VPBROADCASTBr_Alt and ↵Craig Topper2017-02-261-14/+16
| | | | | | VPBROADCASTWr_Alt instructions. NFC llvm-svn: 296289