summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [MSP430] Align functions on 2-byte boundary instead of 4.Vadzim Dambrouski2017-09-191-1/+1
| | | | | | | | | | | | | | | | Summary: There is no benefit in having the 4-byte alignment, and removing this restriction can save a lot of space for some applications. Reviewers: asl, awygle Reviewed By: awygle Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36165 llvm-svn: 313676
* [SimplifyCFG] fix typos/formatting; NFCSanjay Patel2017-09-191-24/+22
| | | | llvm-svn: 313671
* [AMDGPU] Prevent post-RA scheduler from breaking memory clausesStanislav Mekhanoshin2017-09-192-0/+58
| | | | | | | | | The pre-RA scheduler does load/store clustering, but post-RA scheduler undoes it. Add mutation to prevent it. Differential Revision: https://reviews.llvm.org/D38014 llvm-svn: 313670
* [SystemZ] Fix truncstore + bswap codegen bugUlrich Weigand2017-09-191-1/+2
| | | | | | | | | | | | | SystemZTargetLowering::combineSTORE contains code to transform a combination of STORE + BSWAP into a STRV type instruction. This transformation is correct for regular stores, but not for truncating stores. The routine neglected to check for that case. Fixes a miscompilation of llvm-objcopy with clang, which caused test suite failures in the SystemZ multistage build bot. llvm-svn: 313669
* Revert "ExecutionEngine: add R_AARCH64_ABS{16,32}"Saleem Abdulrasool2017-09-191-12/+0
| | | | | | | | This reverts commit SVN r313654. Seems that it is triggering an assertion on Windows specifically. Revert until I can build on Windows and look into what is happening there. llvm-svn: 313668
* dwarfdump/symbolizer: Avoid loading unneeded CUs from a DWPDavid Blaikie2017-09-193-9/+11
| | | | | | | | When symbolizing large binaries, parsing every CU in a DWP file is a significant performance penalty. Instead, use the index to only load the CUs that are needed. llvm-svn: 313659
* Handle profile mismatch correctly for SamplePGO.Dehao Chen2017-09-191-1/+6
| | | | | | | | | | | | | | Summary: Fix the bug when promoted call return type mismatches with the promoted function, we should not try to inline it. Otherwise it may lead to compiler crash. Reviewers: davidxl, tejohnson, eraman Reviewed By: tejohnson Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D38018 llvm-svn: 313658
* Re-land "Fix Bug 30978 by emitting cv file checksums."Reid Kleckner2017-09-197-49/+179
| | | | | | | This reverts r313431 and brings back r313374 with a fix to write checksums as binary data and not ASCII hex strings. llvm-svn: 313657
* ExecutionEngine: add R_AARCH64_ABS{16,32}Saleem Abdulrasool2017-09-191-0/+12
| | | | | | | | | | Add support for the R_AARCH64_ABS{16,32} relocations in the execution engine. This is primarily used for DWARF debug information relocations and needed by the LLVM JIT to support JITing for lldb. Patch by Alex Langford! llvm-svn: 313654
* [X86] Convert X86ISD::SELECT to ISD::VSELECT just before instruction ↵Craig Topper2017-09-193-22/+3
| | | | | | | | | | selection to avoid duplicate patterns Similar to what we do for X86ISD::SHRUNKBLEND just turn X86ISD::SELECT into ISD::VSELECT. This allows us to remove the duplicated TRUNC patterns. Differential Revision: https://reviews.llvm.org/D38022 llvm-svn: 313644
* Re-land r313400 "[DebugInfo] Insert DW_OP_deref when spilling indirect ↵Reid Kleckner2017-09-191-35/+54
| | | | | | | | | | | | | | DBG_VALUEs" I forgot to zero out the BitVector when reusing it between UserValues. Later uses of the same location number for a different UserValue would falsely indicate that they were spilled. Usually this would lead to incorrect debug info, but in some cases they would indicate something nonsensical like a memory location based on a vector register (Q8 on ARM). llvm-svn: 313640
* [PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD.Tony Jiang2017-09-191-0/+116
| | | | | | | | | | Two blocks prior to the join each perform an li and the the join block has an add using the initialized register. Optimize each predecessor block to instead use addi and delete the li's and add. Differential Revision: https://reviews.llvm.org/D36734 llvm-svn: 313639
* [Power9] Add missing Power9 instructions.Tony Jiang2017-09-195-442/+67
| | | | | | | The following 8 instructions are implemented in this patch. addpcis(subpcis, lnia), darn, maddhd, maddhdu, maddld, setb llvm-svn: 313636
* dwarfdump: Delay parsing abbreviations until they're neededDavid Blaikie2017-09-192-10/+33
| | | | | | | | | | | This speeds up dumping specific DIEs by not parsing abbreviations for units that are not used. (this is also handy to have in eventually to speed up llvm-symbolizer for .dwp files, where parsing most of the DWP file can be avoided by using the index) llvm-svn: 313635
* [globalisel] Add a G_BSWAP instruction and support bswap using it.Daniel Sanders2017-09-191-0/+3
| | | | llvm-svn: 313633
* [Nios2] Subtarget, basic infrastructure for frame, instructions and registersNikolai Bozhenov2017-09-1915-20/+545
| | | | | | | | | | | | | | This is the second minimal patch keeping Nios2 target buildable. I'm adding subtarget here and other stuff for frame lowering, instruction, register information methods. I do not add any test cases, as still there are missing parts like DAG selector and assembly printing. I plan to include them into the next patch. Patch by Andrei Grischenko <andrei.l.grischenko@intel.com> Differential Revision: https://reviews.llvm.org/D37256 llvm-svn: 313626
* [x86] Lowering Mask Set1 intrinsics to LLVM IRJina Nahias2017-09-192-24/+7
| | | | | | | | This patch, together with a matching clang patch (https://reviews.llvm.org/D37668), implements the lowering of X86 mask set1 intrinsics to IR. Differential Revision: https://reviews.llvm.org/D37669 llvm-svn: 313625
* [ARM] Use ADDCARRY / SUBCARRYRoger Ferrer Ibanez2017-09-192-20/+169
| | | | | | | | | | | | | | | | | | | | | | | | | | | This is a preparatory step for D34515. This change: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C - fixes PR34045 - fixes PR34564 Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 313618
* AMDGPU: Run internalize symbols at -O0Matt Arsenault2017-09-191-21/+21
| | | | | | | | The relocations used for externally visible functions aren't supported, so the direct call emitted ends up hitting a linker error. llvm-svn: 313616
* [X86][Skylake] Adding the scheduling information for the SkylakeClient targetGadi Haber2017-09-193-3/+4014
| | | | | | | | | | | | | | This patch adds the instruction scheduling information for the SkylakeClient (SKL) architecture target by adding the file X86SchedSkylakeClient.td located under the X86 Target. We used the scheduling information retrieved from the Skylake architects in order to create the file. The scheduling information includes latency, number of micro-Ops and used ports by each SKL instruction. The patch continues the scheduling replacement and insertion effort started with the SNB target in r307529 and r310792 and for HSW in r311879. Please expect some performance fluctuations due to code alignment effects. Reviewers: craig.topper, zvi, chandlerc, igorb, aymanmus, RKSimon, delena Differential Revision: https://reviews.llvm.org/D37294 llvm-svn: 313613
* [X86] Remove some unnecessary patterns for truncate with X86ISD::SELECT and ↵Craig Topper2017-09-191-6/+0
| | | | | | | | undef preserved source. We canonicalize undef preserved sources to zero during intrinsic lowering. llvm-svn: 313612
* [X86] Add VPERMPD/VPERMQ and VPERMPS/VPERMD to the execution domain fixing ↵Craig Topper2017-09-191-0/+16
| | | | | | table. llvm-svn: 313610
* Allow public Triple deduction from ObjectFiles.Vlad Tsyrklevich2017-09-191-0/+25
| | | | | | | | | | | | | | | Move logic that allows for Triple deduction from an ObjectFile object out of llvm-objdump.cpp into a public factory, found in the ObjectFile class. This should allow other tools in the future to use this logic without reimplementation. Patch by Mitch Phillips Differential Revision: https://reviews.llvm.org/D37719 llvm-svn: 313605
* [Coverage] Use gap regions to select better line exec countsVedant Kumar2017-09-183-7/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | After clang started emitting deferred regions (r312818), llvm-cov has had a hard time picking reasonable line execuction counts. There have been one or two generic improvements in this area (e.g r310012), but line counts can still report coverage for whitespace instead of code (llvm.org/PR34612). To fix the problem: * Introduce a new region kind so that frontends can explicitly label gap areas. This is done by changing the encoding of the columnEnd field of MappingRegion. This doesn't substantially increase binary size, and makes it easy to maintain backwards-compatibility. * Don't set the line count to a count from a gap area, unless the count comes from a wrapped segment. * Don't highlight gap areas as uncovered. Fixes llvm.org/PR34612. llvm-svn: 313597
* bpf: add inline-asm supportYonghong Song2017-09-184-0/+121
| | | | | | Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 313593
* Revert r313400 "[DebugInfo] Insert DW_OP_deref when spilling indirect ↵Hans Wennborg2017-09-181-53/+35
| | | | | | | | | | | | | | | | | | | | | | | | | DBG_VALUEs" This caused asserts in Chromium. See http://crbug.com/766261 > Summary: > This comes up in optimized debug info for C++ programs that pass and > return objects indirectly by address. In these programs, > llvm.dbg.declare survives optimization, which causes us to emit indirect > DBG_VALUE instructions. The fast register allocator knows to insert > DW_OP_deref when spilling indirect DBG_VALUE instructions, but the > LiveDebugVariables did not until this change. > > This fixes part of PR34513. I need to look into why this doesn't work at > -O0 and I'll send follow up patches to handle that. > > Reviewers: aprantl, dblaikie, probinson > > Subscribers: qcolombet, hiraditya, llvm-commits > > Differential Revision: https://reviews.llvm.org/D37911 llvm-svn: 313589
* [DAGCombiner] fold assertzexts separated by truncSanjay Patel2017-09-182-35/+25
| | | | | | | | | | | | | If we have an AssertZext of a truncated value that has already been AssertZext'ed, we can assert on the wider source op to improve the zext-y knowledge: assert (trunc (assert X, i8) to iN), i1 --> trunc (assert X, i1) to iN This moves a fold from being Mips-specific to general combining, and x86 shows improvements. Differential Revision: https://reviews.llvm.org/D37017 llvm-svn: 313577
* llvm-dwarfdump: use more efficient API (NFC)Adrian Prantl2017-09-181-4/+4
| | | | llvm-svn: 313573
* [gcov] Emit errors when opening the notes file failsReid Kleckner2017-09-181-0/+6
| | | | | | | | No time to write a test case, on to the next bug. =P Discovered while investigating PR34659 llvm-svn: 313571
* Fix indentation.Adrian Prantl2017-09-181-1/+1
| | | | llvm-svn: 313568
* llvm-dwarfdump: add a --show-parents options when selectively dumping DIEs.Adrian Prantl2017-09-181-1/+16
| | | | llvm-svn: 313567
* AMDGPU: Start selecting s_xnor_{b32, b64}Konstantin Zhuravlyov2017-09-183-2/+48
| | | | | | Differential Revision: https://reviews.llvm.org/D37981 llvm-svn: 313565
* [DAG, x86] allow store merging before and after legalization (PR34217)Sanjay Patel2017-09-181-4/+4
| | | | | | | | | | | | | | | | | rL310710 allowed store merging to occur after legalization to catch stores that are created late, but this exposes a logic hole seen in PR34217: https://bugs.llvm.org/show_bug.cgi?id=34217 We will miss merging stores if the target lowers vector extracts into target-specific operations. This patch allows store merging to occur both before and after legalization if the target chooses to get maximum merging. I don't think the potential regressions in the other tests are relevant. The tests are for correctness of weird IR constructs rather than perf tests, and I think those are still correct. Differential Revision: https://reviews.llvm.org/D37987 llvm-svn: 313564
* [X86] Make sure we still emit zext for GR32 to GR64 when the source of the ↵Craig Topper2017-09-181-2/+4
| | | | | | | | | | | | zext is AssertZext The AssertZext we might see in this case is only giving information about the lower 32 bits. It isn't providing information about the upper 32 bits. So we should emit a zext. This fixes PR28540. Differential Revision: https://reviews.llvm.org/D37729 llvm-svn: 313563
* llvm-dwarfdump: Sink the handling of ShowChildren into DWARFDie::dump(). NFC.Adrian Prantl2017-09-182-4/+3
| | | | llvm-svn: 313560
* [X86] Don't emit COPY_TO_REG to ABCD registers before EXTRACT_SUBREG of sub_8bitCraig Topper2017-09-183-59/+7
| | | | | | | | This is similar to D37843, but for sub_8bit. This fixes all of the patterns except for the 2 that emit only an EXTRACT_SUBREG. That causes a verifier error with global isel because global isel doesn't know to issue the ABCD when doing this extract on 32-bits targets. Differential Revision: https://reviews.llvm.org/D37890 llvm-svn: 313558
* [X86] Don't emit COPY_TO_REG to ABCD registers before EXTRACT_SUBREG of ↵Craig Topper2017-09-182-56/+19
| | | | | | | | | | | | sub_8bit_hi I'm pretty sure that InstrEmitter::EmitSubregNode will take care of this itself by calling ConstrainForSubReg which in turn calls TRI->getSubClassWithSubReg. I think Jakob Stoklund Olesen alluded to this in his commit message for r141207 which added the code to EmitSubregNode. Differential Revision: https://reviews.llvm.org/D37843 llvm-svn: 313557
* [AArch64] Adjust the cost model for Exynos M1 and M2Evandro Menezes2017-09-181-18/+57
| | | | | | Refine the model of FP loads and stores. llvm-svn: 313555
* [AArch64] Adjust the cost model for Exynos M1 and M2Evandro Menezes2017-09-182-15/+84
| | | | | | | Refine the model of loads and stores using the register offset addressing modes. llvm-svn: 313554
* [AArch64] Adjust the cost model for Exynos M1 and M2Evandro Menezes2017-09-181-1/+2
| | | | | | Fix formatting in the predicate function AArch64InstrInfo::isExynosShiftLeftFast(). llvm-svn: 313553
* [GlobalISel] Only build expensive remarks if they're enabled. NFC.Ahmed Bougacha2017-09-182-7/+14
| | | | | | | | r313390 taught 'allowExtraAnalysis' to check whether remarks are enabled at all. Use that to only do the expensive instruction printing if they are. llvm-svn: 313552
* [X86][AVX] Improve (i8 bitcast (v8i1 x)) handling for 256-bit vector compare ↵Simon Pilgrim2017-09-181-1/+1
| | | | | | | | results. As commented on D37849, AVX1 targets were missing a chance to use vmovmskps for v8f32/v8i32 results for bool vector bitcasts llvm-svn: 313547
* [SelectionDAG] Add BITCAST handling to ComputeNumSignBits for splatted sign ↵Simon Pilgrim2017-09-181-0/+24
| | | | | | | | | | | | bits. For cases where we are BITCASTing to vectors of smaller elements, then if the entire source was a splatted sign (src's NumSignBits == SrcBitWidth) we can say that the dst's NumSignBit == DstBitWidth, as we're just splitting those sign bits across multiple elements. We could generalize this but at the moment the only use case I have is to peek through bitcasts to vector comparison results. Differential Revision: https://reviews.llvm.org/D37849 llvm-svn: 313543
* [X86] Fix two more places to prefer VPERMQ/PD over VPERM2X128 when AVX2 is ↵Craig Topper2017-09-181-9/+7
| | | | | | | | | | | | enabled The shuffle combining and lowerVectorShuffleAsLanePermuteAndBlend were both still trying to use VPERM2XF128 for unary shuffles when AVX2 is enabled. VPERM2X128 takes two inputs meaning when we use it for a unary shuffle one of those inputs is left undefined creating a false dependency on whatever register gets allocated there. If we have VPERMQ/PD we should prefer those since they only have a single input. Differential Revision: https://reviews.llvm.org/D37947 llvm-svn: 313542
* [SLP] clean up for vector store case; NFCI Sanjay Patel2017-09-181-12/+11
| | | | llvm-svn: 313541
* [AArch64] Add V8_2aOps feature to Cortex-A55 and 75Sam Parker2017-09-181-0/+2
| | | | | | | | | | Add the missing hardware features the ProcA55 and ProcA75 feature. These are already enabled via the target parser, but I had missed them in the backend. Differential Revision: https://reviews.llvm.org/D37974 llvm-svn: 313535
* [ARM] Implement isTruncateFreeSam Parker2017-09-182-1/+22
| | | | | | | | | | Implement the isTruncateFree hooks, lifted from AArch64, that are used by TargetTransformInfo. This allows simplifycfg to reduce the test case into a single basic block. Differential Revision: https://reviews.llvm.org/D37516 llvm-svn: 313533
* [X86][SSE] Improve support for vselect(Cond, 0, X) -> ANDN(Cond, X)Simon Pilgrim2017-09-181-0/+14
| | | | | | | | As discussed on PR28925 and D37849. Differential Revision: https://reviews.llvm.org/D37975 llvm-svn: 313532
* [ARM] Fix for indexed dot product instruction descriptionsSjoerd Meijer2017-09-181-1/+1
| | | | | | | | | | | The indexed dot product instructions only accept the lower 16 D-registers as the indexed register, but we were e.g. incorrectly accepting: vudot.u8 d16,d16,d18[0] Differential Revision: https://reviews.llvm.org/D37968 llvm-svn: 313531
* [dwarfdump] Make .eh_frame an alias for .debug_frameJonas Devlieghere2017-09-182-11/+16
| | | | | | | | | | | | | | | | | | | | This patch makes the `.eh_frame` extension an alias for `.debug_frame`. Up till now it was only possible to dump the section using objdump, but not with dwarfdump. Since the two are essentially interchangeable, we dump whichever of the two is present. As a workaround, this patch also adds parsing for 3 currently unimplemented CFA instructions: `DW_CFA_def_cfa_expression`, `DW_CFA_expression`, and `DW_CFA_val_expression`. Because I lack the required knowledge, I just parse the fields without actually creating the instructions. Finally, this also fixes the typo in the `.debug_frame` section name which incorrectly contained a trailing `s`. Differential revision: https://reviews.llvm.org/D37852 llvm-svn: 313530
OpenPOWER on IntegriCloud