summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [dwarfdump][NFC] Consistent printing of address rangesJonas Devlieghere2017-09-292-8/+8
| | | | | | | | | | | | | | This implement the insertion operator for DWARF address ranges so they are consistently printed as [LowPC, HighPC). While a dump method might have felt more consistent, it is used exclusively for printing error messages in the verifier and never used for actual dumping. Hence this approach is more intuitive and creates less clutter at the call sites. Differential revision: https://reviews.llvm.org/D38395 llvm-svn: 314523
* AMDGPU: VALU carry-in and v_cndmask condition cannot be EXECNicolai Haehnle2017-09-295-11/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | The hardware will only forward EXEC_LO; the high 32 bits will be zero. Additionally, inline constants do not work. At least, v_addc_u32_e64 v0, vcc, v0, v1, -1 which could conceivably be used to combine (v0 + v1 + 1) into a single instruction, acts as if all carry-in bits are zero. The llvm.amdgcn.ps.live test is adjusted; it would be nice to combine s_mov_b64 s[0:1], exec v_cndmask_b32_e64 v0, v1, v2, s[0:1] into v_mov_b32 v0, v3 but it's not particularly high priority. Fixes dEQP-GLES31.functional.shaders.helper_invocation.value.* llvm-svn: 314522
* Use the basic cost if a GEP is not used as addressing modeJun Bum Lim2017-09-293-2/+7
| | | | | | | | | | | | | | | | | | | Summary: Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target. However, since it doesn't check its actual users, it will return FREE even in cases where the GEP cannot be folded away as a part of actual addressing mode. For example, if an user of the GEP is a call instruction taking the GEP as a parameter, then the GEP may not be folded in isel. Reviewers: hfinkel, efriedma, mcrosier, jingyue, haicheng Reviewed By: hfinkel Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38085 llvm-svn: 314517
* [SystemZ] implement shouldCoalesce()Jonas Paulsson2017-09-297-5/+91
| | | | | | | | | | | | | | | | | | | Implement shouldCoalesce() to help regalloc avoid running out of GR128 registers. If a COPY involving a subreg of a GR128 is coalesced, the live range of the GR128 virtual register will be extended. If this happens where there are enough phys-reg clobbers present, regalloc will run out of registers (if there is not a single GR128 allocatable register available). This patch tries to allow coalescing only when it can prove that this will be safe by checking the (local) interval in question. Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D37899 https://bugs.llvm.org/show_bug.cgi?id=34610 llvm-svn: 314516
* [X86] Improve codegen for inverted overflow checking intrinsics.Amara Emerson2017-09-291-0/+20
| | | | | | | | Adds a new combine for: xor(setcc cc, val), 1 --> setcc (invert(cc), val) Differential Revision: https://reviews.llvm.org/D38161 llvm-svn: 314514
* [ARM] v8.3-a complex number supportSam Parker2017-09-297-2/+298
| | | | | | | | | | | | | | | New instructions are added to AArch32 and AArch64 to aid floating-point multiplication and addition of complex numbers, where the complex numbers are packed in a vector register as a pair of elements. The Imaginary part of the number is placed in the more significant element, and the Real part of the number is placed in the less significant element. This patch adds assembler for the ARM target. Differential Revision: https://reviews.llvm.org/D36789 llvm-svn: 314511
* Small modification <NFC>Michael Zuckerman2017-09-291-1/+1
| | | | | Change-Id: I360abccee12cae29bd2ac4f8399c9ecc92eb7f13 llvm-svn: 314510
* [mips] Reordering callseq* nodes to be linearAleksandar Beserminji2017-09-292-26/+27
| | | | | | | | | | | | | Fix nested callseq* nodes by moving callseq_start after the arguments calculation to temporary registers, so that callseq* nodes in resulting DAG are linear. Recommitting r314497. This version does not contain test which fails when compiler is not build in debug mode. Differential Revision: https://reviews.llvm.org/D37328 llvm-svn: 314507
* Revert "[mips] Reordering callseq* nodes to be linear"Aleksandar Beserminji2017-09-292-27/+26
| | | | | | | | | Added test relies on the compiler being built in debug mode, which may not be the case. This reverts commit r314497. llvm-svn: 314506
* [mips] Add missing license info, formatting changes. NFCISimon Dardis2017-09-291-30/+47
| | | | | | | | Add missing license information to MicroMipsInstrFPU.td and fix most of the formatting errors present. Others will be addressed in a follow up commits. llvm-svn: 314505
* [AMDGPU] calling conventions for AMDPAL OS typeTim Renouf2017-09-2910-2/+31
| | | | | | | | | | | | | | | Summary: This commit adds comments on how the AMDPAL OS type overloads the existing AMDGPU_ calling conventions used by Mesa, and adds a couple of new ones. Reviewers: arsenm, nhaehnle, dstuttard Subscribers: mehdi_amini, kzhuravl, wdng, yaxunl, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D37752 llvm-svn: 314502
* [AMDGPU] AMDPAL scratch buffer supportTim Renouf2017-09-295-12/+95
| | | | | | | | | | | | | | | | | | | | | | | Summary: Added support for scratch (including spilling) for OS type amdpal: generates code to set up the scratch descriptor if it is needed. With amdpal, the scratch resource descriptor is loaded from offset 0 of the global information table. The low 32 bits of the address of the global information table is passed in s0. Added amdgpu-git-ptr-high function attribute to hard-wire the high 32 bits of the address of the global information table. If the function attribute is not specified, or is 0xffffffff, then the backend generates code to use the high 32 bits of pc. The documentation for the AMDPAL ABI will be added in a later commit. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye Differential Revision: https://reviews.llvm.org/D37483 llvm-svn: 314501
* [Triple] Add AMDPAL operating system typeTim Renouf2017-09-292-0/+6
| | | | | | | | | | | | | | | | | | Summary: This operating system type represents the AMDGPU PAL runtime, and will be required by the AMDGPU backend in order to generate correct code for this runtime. Currently it generates the same code as not specifying an OS at all. That will change in future commits. Patch from Tim Corringham. Subscribers: arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D37380 llvm-svn: 314500
* [dwarfdump][NFC] Consistent errors and warnings with --verifyJonas Devlieghere2017-09-293-88/+118
| | | | | | | | | | | This patch introduces 3 helper functions: error(), warn() and note() to make printing during verification more consistent. When supported, the respective prefixes are printed in color using the same color scheme as clang. Differential revision: https://reviews.llvm.org/D38368 llvm-svn: 314498
* [mips] Reordering callseq* nodes to be linearAleksandar Beserminji2017-09-292-26/+27
| | | | | | | | | | Fix nested callseq* nodes by moving callseq_start after the arguments calculation to temporary registers, so that callseq* nodes in resulting DAG are linear. Differential Revision: https://reviews.llvm.org/D37328 llvm-svn: 314497
* [X86][MS-InlineAsm] Extended support for variables / identifiers on memory / ↵Coby Tayree2017-09-291-60/+90
| | | | | | | | | | | immediate expressions Allow the proper recognition of Enum values and global variables inside ms inline-asm memory / immediate expressions, as they require some additional overhead and treated incorrect if doesn't early recognized. supersedes D33278, D35774 Differential Revision: https://reviews.llvm.org/D37412 llvm-svn: 314493
* Revert "[BypassSlowDivision] Improve our handling of divisions by constants"Sanjoy Das2017-09-291-13/+7
| | | | | | | This reverts commit r314253. It causes a miscompile on P100 in an internal benchmark. Reverting while I investigate. llvm-svn: 314482
* llvm-dwarfdump: support .apple-namespaces in --findAdrian Prantl2017-09-291-15/+13
| | | | llvm-svn: 314481
* llvm-dwarfdump: add support for .apple_types in --findAdrian Prantl2017-09-291-2/+6
| | | | llvm-svn: 314479
* [MachineOutliner][NFC] Simplify logic in pruneCandidatesJessica Paquette2017-09-281-70/+61
| | | | | | | | | This commit yanks out the repeated sections of code in pruneCandidates into two lambdas: ShouldSkipCandidate and Prune. This simplifies the logic in pruneCandidates significantly, and reduces the chance of introducing bugs by folding all of the shared logic into one place. llvm-svn: 314475
* [X86] Don't select (cmp (and, imm), 0) to testwCraig Topper2017-09-281-1/+4
| | | | | | | | | | | | | | | | | Summary: X86ISelDAGToDAG tries to analyze ANDs compared with 0 to optimize to narrower immediates using subregisters. I don't think we should be optimizing to 16-bit test instructions. It goes against our normal behavior of promoting i16 operations to i32. It only saves one byte due to the need to add a 0x66 prefix. I think it would also be subject to a length changing prefix penalty in the decoders on Intel CPUs. Reviewers: RKSimon, zvi, spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38273 llvm-svn: 314474
* ARM: Fix cases where CSI Restored bit is not clearedMatthias Braun2017-09-283-9/+19
| | | | | | | | | | LR is an untypical callee saved register in that it is restored into a different register (PC) and thus does not live-out of the return block. This case requires the `Restored` flag in CalleeSavedInfo to be cleared. This fixes a number of cases where this wasn't handled correctly yet. llvm-svn: 314471
* bpf: fix a bug for disassembling ld_pseudo instYonghong Song2017-09-281-1/+2
| | | | | Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 314469
* [Hexagon] Fix some Clang-tidy modernize and Include What You Use warnings; ↵Eugene Zelenko2017-09-2814-313/+398
| | | | | | other minor fixes (NFC). llvm-svn: 314467
* [SystemZ] Fix fall-out from r314428Ulrich Weigand2017-09-281-0/+6
| | | | | | | | | | | The expensive-checks build bot found a problem with the r314428 commit: if CC is live after a ATOMIC_CMP_SWAPW instruction, it needs to be marked as live-in to the block after the loop the pseudo gets expanded to. This actually fixes a code-gen bug as well, since if the CC isn't live, the CR and JLH are merged to a CRJLH which doesn't actually set the condition code any more. llvm-svn: 314465
* [X86] Make use of vpmovwb when possible in LowerMULHCraig Topper2017-09-281-15/+8
| | | | | | | | If we have BWI, we can truncate in a much simpler way by using vpmovwb. This even works without VLX by using the wider zmm->ymm truncate with a subvector extract. Differential Revision: https://reviews.llvm.org/D38375 llvm-svn: 314457
* [ARM] Restore the right frame pointer register in Int_eh_sjlj_longjmpMartin Storsjo2017-09-281-14/+58
| | | | | | | | | | | | | | | | | | | | | | In setupEntryBlockAndCallSites in CodeGen/SjLjEHPrepare.cpp, we fetch and store the actual frame pointer, but on return via the longjmp intrinsic, it always was restored into the r7 variable. On windows, the frame pointer should be restored into r11 instead of r7. On Darwin (where sjlj exception handling is used by default), the frame pointer is always r7, both in arm and thumb mode, and likewise, on windows, the frame pointer always is r11. On linux however, if sjlj exception handling is enabled (which it isn't by default), libcxxabi and the user code can be built in differing modes using different registers as frame pointer. Therefore, when restoring registers on a platform where we don't always use the same register depending on code mode, restore both r7 and r11. Differential Revision: https://reviews.llvm.org/D38253 llvm-svn: 314451
* [ARM] Fix SJLJ exception handling when manually chosen on a platform where ↵Martin Storsjo2017-09-281-1/+3
| | | | | | | | it isn't default Differential Revision: https://reviews.llvm.org/D38252 llvm-svn: 314450
* MIR: Serialize CaleeSavedInfo Restored flagMatthias Braun2017-09-282-7/+14
| | | | llvm-svn: 314449
* [X86] Use target independent ZERO_EXTEND/SIGN_EXTEND nodes were possible in ↵Craig Topper2017-09-281-9/+10
| | | | | | | | LowerMULH We aren't do any in register extends here so we should be able to just the target independent nodes directly and allow them to be lowered as necessary. llvm-svn: 314447
* [X86] Move a setOperation action for ISD::TRUNCATE near another one in the ↵Craig Topper2017-09-281-2/+1
| | | | | | same if. Remove one that is redundant with another subtarget features. llvm-svn: 314446
* llvm-dwarfdump: implement --find for .apple_namesAdrian Prantl2017-09-284-11/+95
| | | | | | | | | | | | This patch implements the dwarfdump option --find=<name>. This option looks for a DIE in the accelerator tables and dumps it if found. This initial patch only adds support for .apple_names to keep the review small, adding the other sections and pubnames support should be trivial though. Differential Revision: https://reviews.llvm.org/D38282 llvm-svn: 314439
* [ORC] Fix the type of RTDyldObjectLinkingLayer::NotifyLoadedFtor.Lang Hames2017-09-281-1/+1
| | | | | | Bug found by Stefan Granitz. Thanks Stefan! llvm-svn: 314436
* [JumpThreading] Preserve DT and LVI across the passEvandro Menezes2017-09-283-65/+253
| | | | | | | | | | | | | | | | | JumpThreading now preserves dominance and lazy value information across the entire pass. The pass manager is also informed of this preservation with the goal of DT and LVI being recalculated fewer times overall during compilation. This change prepares JumpThreading for enhanced opportunities; particularly those across loop boundaries. Patch by: Brian Rzycki <b.rzycki@samsung.com>, Sebastian Pop <s.pop@samsung.com> Differential revision: https://reviews.llvm.org/D37528 llvm-svn: 314435
* [X86] Use BWI instructions to improve lowering of v32i8 MULHU/SCraig Topper2017-09-281-0/+18
| | | | | | | | | | | | | | Summary: If we have BWI instructions we can widen to v32i16 to do the multiply instead of splitting. Reviewers: RKSimon, spatel, zvi Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38305 llvm-svn: 314432
* [X86] Remove dead code from X86ISelDAGToDAG.cpp multiply handlingCraig Topper2017-09-281-1/+1
| | | | | | | | | | | | | | | | | Summary: Lowering never creates X86ISD::UMUL for 8-bit types. X86ISD::UMUL8 is used instead. If X86ISD::UMUL 8-bit were ever used it would crash. DAGCombiner replaces UMUL_LOHI/SMUL_LOHI with a wider MUL and a shift if the type twice as wide is legal. So we should never see i8 UMUL_LOHI/SMUL_LOHI. In fact I think there was a bug in part of the i8 code. Similar is true for i16 though without the bug. Reviewers: RKSimon, spatel, zvi Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38276 llvm-svn: 314430
* [X86] Use correct subvector index when combining two insert subvectors ↵Craig Topper2017-09-281-1/+1
| | | | | | | | | | featuring zero vectors. Previously we were using one of the subvector indices twice. The included test case causes an assert without this change. Thanks to Simon Pilgrim for catching this. llvm-svn: 314429
* [SystemZ] Custom-expand ATOMIC_CMP_AND_SWAP_WITH_SUCCESSUlrich Weigand2017-09-284-22/+67
| | | | | | | | The SystemZ compare-and-swap instructions already provide the "success" indication via a condition-code value, so the default expansion of those operations generates an unnecessary extra comparsion. llvm-svn: 314428
* [dwarfdump] Verify that CUs have a unit DIE.Jonas Devlieghere2017-09-281-3/+8
| | | | | | | | | This patch adds a check to the DWARF verifier to detect CUs without a unit DIE. Differential revision: https://reviews.llvm.org/D38363 llvm-svn: 314426
* Use SDValue::getConstantOperandVal helper. NFCI.Simon Pilgrim2017-09-281-3/+3
| | | | llvm-svn: 314425
* [mips] Remove codegen support for branch likely instructions.Simon Dardis2017-09-282-18/+49
| | | | | | | | | | | | | | | | | | This patch disables codegen support for branch likely instructions to address a potential bug. These branches were unselectable as they had the same patterns as the normal branches but came after them when ISel was concerned. The branch likely instructions were marked as having no delay slots when they have annulling delay slots. The delay slot filler does not currently handle annulling delay slot branches, so this would lead to wrong codegen if these branches were generated. Reviewers: atanasyan, nitesh.jain Differential Revision: https://reviews.llvm.org/D38169 llvm-svn: 314421
* [LoopUnroll] Fix use after poison.Benjamin Kramer2017-09-281-1/+3
| | | | llvm-svn: 314418
* [DebugInfo] Do not extend range for physreg in LiveDebugVariablesBjorn Pettersson2017-09-281-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: A DBG_VALUE that is referring to a physical register is valid up until the next def of the register, or the end of the basic block that it belongs to. LiveDebugVariables is computing live intervals (slot index ranges) for DBG_VALUE instructions, before regalloc, in order to be able to re-insert DBG_VALUE instructions again after regalloc. When the DBG_VALUE is mapping a variable to a physical register we do not need to compute the range. We should simply re-insert the DBG_VALUE at the start position. The problem that was found, resulting in this patch, was a situation when the DBG_VALUE was the last real use of the physical register. The computeIntervals/extendDef methods extended the range to cover the whole basic block, even though the physical register very well could be allocated to some virtual register inside the basic block. So the extended range could not be trusted. This patch is a preparation for https://reviews.llvm.org/D38229, where the goal is to insert DBG_VALUE after each new definition of a variable, even if the virtual registers that the variable was connected to has been coalesced into using the same physical register (e.g. due to two address instructions). For more info see https://bugs.llvm.org/show_bug.cgi?id=34545 Reviewers: aprantl, rnk, echristo Reviewed By: aprantl Subscribers: Ka-Ka, llvm-commits Differential Revision: https://reviews.llvm.org/D38140 llvm-svn: 314414
* [LVI] Move LVILatticeVal class to separate header file (NFC).Florian Hahn2017-09-283-347/+160
| | | | | | | | | | | | | | | | | Summary: This allows sharing the lattice value code between LVI and SCCP (D36656). It also adds a `satisfiesPredicate` function, used by D36656. Reviewers: davide, sanjoy, efriedma Reviewed By: sanjoy Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D37591 llvm-svn: 314411
* [x86][AsmParser] Allow some more MS size directivesCoby Tayree2017-09-281-0/+3
| | | | | | | MS allows the following size directives: float/double and long as synonymous to dword/qword and dword, respectively. Differential Revision: https://reviews.llvm.org/D37190 llvm-svn: 314410
* Teach TargetInstrInfo::getInlineAsmLength to parse .space directives with ↵Alex Bradbury2017-09-283-55/+32
| | | | | | | | | | | | | | | | | | | | | | | | integer arguments It's currently quite difficult to test passes like branch relaxation, which requires branches with large displacement to be generated. The .space assembler directive makes it easy to create arbitrarily large basic blocks, but getInlineAsmLength is not able to parse it and so the size of the block is not correctly estimated. Other backends (AArch64, AMDGPU) introduce options just for testing that artificially restrict the ranges of branch instructions (e.g. aarch64-tbz-offset-bits). Although parsing a single form of the .space directive feels inelegant, it does allow a more direct testing approach. This patch adapts the .space parsing code from Mips16InstrInfo::getInlineAsmLength and removes it now the extra functionality is provided by the base implementation. I want to move this functionality to the generic getInlineAsmLength as 1) I need the same for RISC-V, and 2) I feel other backends will benefit from more direct testing of large branch displacements. Differential Revision: https://reviews.llvm.org/D37798 llvm-svn: 314393
* [PowerPC] eliminate partially redundant compare instructionHiroshi Inoue2017-09-281-14/+180
| | | | | | | | | | | | | | | | | | This is a follow-on of D37211. D37211 eliminates a compare instruction if two conditional branches can be made based on the one compare instruction, e.g. if (a == 0) { ... } else if (a < 0) { ... } This patch extends this optimization to support partially redundant cases, which often happen in while loops. For example, one compare instruction is moved from the loop body into the preheader by this optimization in the following example. do { if (a == 0) dummy1(); a = func(a); } while (a > 0); Differential Revision: https://reviews.llvm.org/D38236 llvm-svn: 314390
* [RISCV] Add common fixups and relocationsAlex Bradbury2017-09-2811-39/+593
| | | | | | | | | | | | | %lo(), %hi(), and %pcrel_hi() are supported and test cases have been added to ensure the appropriate fixups and relocations are generated. I've added an instruction format field which is used in RISCVMCCodeEmitter to, for instance, tell whether it should emit a lo12_i fixup or a lo12_s fixup (RISC-V has two 12-bit immediate encodings depending on the instruction type). Differential Revision: https://reviews.llvm.org/D23568 llvm-svn: 314389
* [RegAllocGreedy]: Allow recoloring of done register if it's non-tiedMikael Holmen2017-09-281-2/+14
| | | | | | | | | | | | | | | | | | | | | Summary: If we have a non-allocated register, we allow us to try recoloring of an already allocated and "Done" register, even if they are of the same register class, if the non-allocated register has at least one tied def and the allocated one has none. It should be easier to recolor the non-tied register than the tied one, so it might be an improvement even if they use the same regclasses. Reviewers: qcolombet Reviewed By: qcolombet Subscribers: llvm-commits, MatzeB Differential Revision: https://reviews.llvm.org/D38309 llvm-svn: 314388
* [DAGCombiner] Fix an off-by-one error in vector logicGeorge Burgess IV2017-09-281-2/+2
| | | | | | | | | Without this, we could end up trying to get the Nth (0-indexed) element from a subvector of size N. Differential Revision: https://reviews.llvm.org/D37880 llvm-svn: 314380
OpenPOWER on IntegriCloud