path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* Clear LastMappingSymbols and LastEMS(Info) when resetting the ARM(AArch64)ELFStreamer | Yichao Yu | 2017-10-26 | 2 | -0/+10

  Summary: This causes a segfault on ARM when (I think) the pass manager is used multiple times. Reset sets the (last) current section to NULL without saving the corresponding LastEMSInfo back into the map. The next use of the streamer then saves the LastEMSInfo for the NULL section, leaving the LastEMSInfo mapping for the last current section (the one that was there before the reset) NULL, which causes the LastEMSInfo to be set to NULL when the section is used again.

  The reuse of the section (pointer) might mean that the map was previously holding dangling pointers, which is why I went for clearing the map and resetting the info, making the state as similar as possible to the state right after the constructor runs.

  The AArch64 one doesn't segfault (since LastEMS isn't a pointer) but it seems to have the same issue.

  The segfault is likely caused by https://reviews.llvm.org/D30724, which turns LastEMSInfo into a pointer. As mentioned above, it seems that the actual issue was older though.

  No test is included since the test is believed to be too complicated for such an obvious fix and not worth doing.

  Reviewers: llvm-commits, shankare, t.p.northover, peter.smith, rengolin
  Reviewed By: rengolin
  Subscribers: mgorny, aemerson, rengolin, javed.absar, kristof.beyls
  Differential Revision: https://reviews.llvm.org/D38588
  llvm-svn: 316679
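The reset behaviour described in this commit can be sketched with a toy streamer. This is only an illustration under stated assumptions: `ToyStreamer`, `Section` and `MappingState` are hypothetical stand-ins, not the real ARM/AArch64 ELF streamer classes, and the per-section bookkeeping is reduced to a single enum.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical stand-ins for the streamer state; not the real LLVM types.
enum class MappingState { None, Code, Data };
struct Section { int Id; };

class ToyStreamer {
  // Saved mapping state for every section that has been left so far.
  std::map<const Section *, MappingState> LastMappingSymbols;
  const Section *Current = nullptr;
  MappingState LastEMS = MappingState::None;

public:
  void changeSection(const Section *S) {
    assert(S && "section must not be null");
    if (Current)
      LastMappingSymbols[Current] = LastEMS; // save state of the section being left
    auto It = LastMappingSymbols.find(S);
    LastEMS = It == LastMappingSymbols.end() ? MappingState::None : It->second;
    Current = S;
  }

  void emitCode() { LastEMS = MappingState::Code; }
  MappingState state() const { return LastEMS; }
  std::size_t savedEntries() const { return LastMappingSymbols.size(); }

  // Return to the state right after construction: no saved entries,
  // no current section, no remembered mapping state.
  void reset() {
    LastMappingSymbols.clear();
    LastEMS = MappingState::None;
    Current = nullptr;
  }
};
```

Clearing the whole map, rather than saving the current entry back, also avoids keeping keys that may point at sections freed between runs.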
* Represent runtime preemption in the IR. | Sean Fertile | 2017-10-26 | 1 | -1/+2

  Currently we do not represent runtime preemption in the IR, which has several drawbacks:

  1) The semantics of GlobalValues differ depending on the object file format you are targeting (as well as the relocation-model and -fPIE value).
  2) We have no way of disabling inlining of run time interposable functions, since in the IR we only know if a function is link-time interposable. Because of this llvm cannot support elf-interposition semantics.
  3) In LTO builds of executables we will have extra knowledge that a symbol resolved to a local definition and can't be preemptable, but have no way to propagate that knowledge through the compiler.

  This patch adds preemptability specifiers to the IR with the following meaning:

  dso_local --> means the compiler may assume the symbol will resolve to a definition within the current linkage unit and the symbol may be accessed directly even if the definition is not within this compilation unit.

  dso_preemptable --> means that the compiler must assume the GlobalValue may be replaced with a definition from outside the current linkage unit at runtime.

  To ease transitioning, dso_preemptable is treated as a 'default' in that low-level codegen will still do the same checks it did previously to see if a symbol should be accessed indirectly. Eventually, when IR producers emit the specifiers on all GlobalValues, we can change dso_preemptable to mean 'always access indirectly' and remove the current logic.

  Differential Revision: https://reviews.llvm.org/D20217
  llvm-svn: 316668
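The transitional rule in the last paragraph can be written down as a small decision function. This is a sketch only: `Preemption`, `mayAccessDirectly` and the `LegacyCheckSaysLocal` parameter are hypothetical names, and the pre-existing target checks are collapsed into one boolean.

```cpp
#include <cassert>

// Hypothetical model of the two specifiers added by the patch.
enum class Preemption { DSOLocal, DSOPreemptable };

// LegacyCheckSaysLocal stands in for the checks codegen already performed
// before the patch (object format, relocation model, -fPIE and so on).
bool mayAccessDirectly(Preemption P, bool LegacyCheckSaysLocal) {
  if (P == Preemption::DSOLocal)
    return true; // resolves within the current linkage unit
  // Transitional behaviour: dso_preemptable acts as a default and defers
  // to the legacy checks instead of forcing indirect access.
  return LegacyCheckSaysLocal;
}
```

Once every producer emits the specifiers, the second branch would simply return false, which is the 'always access indirectly' meaning the message anticipates.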
* AMDGPU: Handle s_buffer_load_dword hazard on SI | Marek Olsak | 2017-10-26 | 1 | -0/+27

  Reviewers: arsenm, nhaehnle
  Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
  Differential Revision: https://reviews.llvm.org/D39171
  llvm-svn: 316666
* [mips] Fix (dis)assembly of abs.fmt for micromips | Simon Dardis | 2017-10-26 | 2 | -7/+16

  These instructions were previously marked as codegen only, preventing them from being assembled as microMIPS or disassembled.

  Reviewers: atanasyan, abeserminji
  Differential Revision: https://reviews.llvm.org/D39123
  llvm-svn: 316656
* [mips] Fix PR35071 | Simon Dardis | 2017-10-26 | 1 | -13/+12

  PR35071 exposed the fact that MipsInstrInfo::removeBranch did not walk past debug instructions when removing branches for the control flow optimizer, which led to duplicated conditional branches.

  If the target of the branch was a removable block, only the conditional branch in the terminating position would have its MBB operands updated, leaving the first branch with a dangling MBB operand. The MIPS long branch pass would then trigger an assertion when attempting to examine the instruction with the dangling MBB operand.

  This resolves PR35071.

  Thanks to Alex Richardson for reporting the issue!

  Reviewers: atanasyan
  Differential Revision: https://reviews.llvm.org/D39288
  llvm-svn: 316654
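The idea of walking past debug instructions while removing trailing branches can be sketched on a toy block. This is an assumption-laden illustration, not MipsInstrInfo::removeBranch itself: `Kind` and `removeTrailingBranches` are hypothetical, and the toy removes every trailing branch rather than stopping at a fixed count.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction kinds for a toy basic block.
enum class Kind { Normal, Debug, Branch };

// Remove all branches at the end of the block, skipping over debug
// entries instead of stopping at them. Returns the number removed.
unsigned removeTrailingBranches(std::vector<Kind> &Block) {
  unsigned Removed = 0;
  std::size_t I = Block.size();
  while (I > 0) {
    --I;
    if (Block[I] == Kind::Debug)
      continue; // walk past debug instructions
    if (Block[I] != Kind::Branch)
      break; // first real non-branch instruction ends the scan
    Block.erase(Block.begin() + I);
    ++Removed;
  }
  return Removed;
}
```

A scan that stopped at the first debug entry would remove only the last branch here and leave the earlier one behind, which is the duplication the message describes.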
* [PowerPC] Use record-form instruction for Less-or-Equal -1 and Greater-or-Equal 1 | Hiroshi Inoue | 2017-10-26 | 1 | -30/+39

  Currently a record-form instruction is used for comparison of "greater than -1" and "less than 1" by modifying the predicate (e.g. LT 1 into LE 0) in addition to the naive case of comparison against 0.

  This patch also enables emitting a record-form instruction for "less than or equal to -1" (i.e. "less than 0") and "greater than or equal to 1" (i.e. "greater than 0") to increase the optimization opportunities.

  Differential Revision: https://reviews.llvm.org/D38941
  llvm-svn: 316647
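The four predicate rewrites named in this message are plain signed-integer identities and can be sketched as follows. `Pred`, `Compare` and `normalizeToZero` are hypothetical names for illustration; the real patch works on PowerPC machine instructions, not on this struct.

```cpp
#include <cassert>

// Hypothetical signed comparison "x <P> Imm".
enum class Pred { LT, LE, GT, GE };
struct Compare { Pred P; int Imm; };

// Rewrite a comparison against +1 or -1 into an equivalent comparison
// against 0. Returns true if C compares against 0 afterwards.
bool normalizeToZero(Compare &C) {
  if (C.Imm == 0)
    return true;
  if (C.Imm == 1 && C.P == Pred::LT) { C = {Pred::LE, 0}; return true; }  // x < 1   <=> x <= 0
  if (C.Imm == 1 && C.P == Pred::GE) { C = {Pred::GT, 0}; return true; }  // x >= 1  <=> x > 0
  if (C.Imm == -1 && C.P == Pred::GT) { C = {Pred::GE, 0}; return true; } // x > -1  <=> x >= 0
  if (C.Imm == -1 && C.P == Pred::LE) { C = {Pred::LT, 0}; return true; } // x <= -1 <=> x < 0
  return false;
}
```

The first and third rewrites are the ones the message says were already handled; the second and fourth are the cases the patch adds.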
* [AsmParser][TableGen] Add VariantID argument to the generated mnemonic spell check function so it can use the correct table based on variant. | Craig Topper | 2017-10-26 | 3 | -3/+6

  I'm considering implementing the mnemonic spell checker for x86, and that would require the separate intel and att variants.

  llvm-svn: 316641
* [AsmParser][TableGen] Make the generated mnemonic spell checker function a file local static function. | Craig Topper | 2017-10-26 | 3 | -3/+6

  Also only emit it in targets that specifically request it. This is required so we don't get an unused static function error.

  llvm-svn: 316640
* [X86] Use correct type for return value of ComputeAvailableFeatures in the AsmParser. NFC | Craig Topper | 2017-10-26 | 1 | -1/+1

  There aren't enough used bits to make this a functional change, but we should fix it for consistency.

  llvm-svn: 316639
* Hexagon: Fold a single-use textual header into its use | David Blaikie | 2017-10-25 | 2 | -79/+56

  llvm-svn: 316604
* [Hexagon] Account for negative offset when limiting max deviation | Krzysztof Parzyszek | 2017-10-25 | 1 | -2/+8

  In getOffsetRange, Max can be set to 0 to force the extender replacement to be at or below the original value. This would cause the new offset to be non-negative, which is preferred for memory instructions (to reduce the likelihood of it getting constant-extended due to predication).

  The problem happens when the range is shifted by an offset (present in the instruction being examined) and the offset is negative. The entire range for the allowable deviation will then be strictly negative. This creates a problem, since 0 is assumed to be a valid deviation.

  llvm-svn: 316601
* [X86] Add avx512vpopcntdq to Knights Mill | Craig Topper | 2017-10-25 | 1 | -1/+2

  As indicated by Table 1-1 in Intel Architecture Instruction Set Extensions and Future Features Programming Reference from October 2017.

  llvm-svn: 316592
* [mips] Clean up some whitespace (NFC). | Simon Dardis | 2017-10-25 | 1 | -1/+1

  Also test that my email address was updated.

  llvm-svn: 316575
* [ARM GlobalISel] Fix call opcodes | Diana Picus | 2017-10-25 | 1 | -4/+11

  We were generating BLX for all the calls, which was incorrect in most cases. Update ARMCallLowering to generate BL for direct calls, and BLX, BX_CALL or BMOVPCRX_CALL for indirect calls.

  llvm-svn: 316570
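The opcode choice described here can be sketched as a small selection function. The message does not say how the three indirect opcodes are told apart; the rule below (prefer BLX on ARMv5T and later, BX_CALL on ARMv4T, BMOVPCRX_CALL otherwise) is an assumption for illustration, and `ToySubtarget` with its two flags is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical feature flags; the real subtarget queries are not given
// in the commit message.
struct ToySubtarget {
  bool HasV5T;
  bool HasV4T;
};

// Pick a call opcode name: BL for direct calls, and one of the three
// indirect forms depending on what the (assumed) subtarget supports.
std::string callOpcode(const ToySubtarget &STI, bool IsDirect) {
  if (IsDirect)
    return "BL";
  if (STI.HasV5T)
    return "BLX";
  if (STI.HasV4T)
    return "BX_CALL";
  return "BMOVPCRX_CALL";
}
```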
* [ARM] OrCombineToBFI function | Sam Parker | 2017-10-25 | 1 | -92/+109

  Extract the functionality to combine OR to BFI into its own function.

  Differential Revision: https://reviews.llvm.org/D39001
  llvm-svn: 316563
* [ARM] Swap cmp operands for automatic shifts | Sam Parker | 2017-10-25 | 1 | -0/+6

  Swap the compare operands if the lhs is a shift and the rhs isn't, as in ARM and Thumb-2 the shift can be performed by the compare for its second operand.

  Differential Revision: https://reviews.llvm.org/D39004
  llvm-svn: 316562
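A minimal sketch of the operand swap follows. Swapping the operands of a comparison only preserves its meaning if the condition is mirrored at the same time (LT becomes GT and so on), so the sketch does both. `Cond`, `Operand`, `Cmp` and `canonicalize` are hypothetical names; the real change operates on SelectionDAG nodes.

```cpp
#include <cassert>
#include <utility>

// Hypothetical comparison model.
enum class Cond { EQ, NE, LT, LE, GT, GE };
struct Operand { bool IsShift; int Id; };
struct Cmp { Operand LHS, RHS; Cond CC; };

// Condition that gives the same result once the operands are exchanged.
Cond swappedCond(Cond CC) {
  switch (CC) {
  case Cond::LT: return Cond::GT;
  case Cond::LE: return Cond::GE;
  case Cond::GT: return Cond::LT;
  case Cond::GE: return Cond::LE;
  default:       return CC; // EQ and NE are symmetric
  }
}

// Move a shifted value into the second operand slot, where the compare
// instruction can apply the shift itself.
void canonicalize(Cmp &C) {
  if (C.LHS.IsShift && !C.RHS.IsShift) {
    std::swap(C.LHS, C.RHS);
    C.CC = swappedCond(C.CC);
  }
}
```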
* [AArch64] Add support for dllimport of values and functions | Martin Storsjo | 2017-10-25 | 4 | -20/+63

  Previously, the dllimport attribute did the right thing in terms of treating it as a pointer to a value, but this makes sure the names get mangled properly, and calls to such functions load the function from the __imp_ pointer.

  This is based on SVN r212431 and r212430 where the same was implemented for ARM.

  Differential Revision: https://reviews.llvm.org/D38530
  llvm-svn: 316555
* AMDGPU: Add max-mix-insts subtarget feature | Matt Arsenault | 2017-10-25 | 4 | -8/+22

  llvm-svn: 316553
* bpf: fix an uninitialized variable issue | Yonghong Song | 2017-10-24 | 1 | -1/+3

  Signed-off-by: Yonghong Song <yhs@fb.com>
  llvm-svn: 316519
* ARMAddressingModes.h: Don't mark header functions as file local | David Blaikie | 2017-10-24 | 1 | -86/+63

  llvm-svn: 316517
* HexagonDepTimingClasses.h: Don't mark header functions as file local | David Blaikie | 2017-10-24 | 1 | -5/+14

  llvm-svn: 316508
* WebassemblyAsmPrinter.h: Include WebAssemblyMachineFunctionInfo for use with MachineFunction::getInfo | David Blaikie | 2017-10-24 | 1 | -1/+1

  llvm-svn: 316507
* X86Operand.h: Include X86MCTargetDesc.h for SSE register enum/names | David Blaikie | 2017-10-24 | 1 | -0/+1

  llvm-svn: 316506
* X86AsmPrinter.h: Add missing header for complete type needed for MCCodeEmitter dtor. | David Blaikie | 2017-10-24 | 1 | -0/+1

  llvm-svn: 316505
* [NVPTX] allow address space inference for volatile loads/stores. | Artem Belevich | 2017-10-24 | 1 | -0/+16

  If a particular target supports volatile memory access operations, we can avoid AS casting to generic AS. Currently it's only enabled in NVPTX for loads and stores that access global & shared AS.

  Differential Revision: https://reviews.llvm.org/D39026
  llvm-svn: 316495
* [X86][Broadwell] Added the instruction scheduling information for the Broadwell CPU. | Gadi Haber | 2017-10-24 | 3 | -2/+4078

  Adding the scheduling information for the Broadwell (BDW) CPU target.

  This patch adds the instruction scheduling information for the Broadwell (BDW) architecture target by adding the file X86SchedBroadwell.td located under the X86 Target. We used the scheduling information retrieved from the Broadwell architects in order to create the file. The scheduling information includes the latency, the number of micro-ops and the ports used by each BDW instruction.

  The patch continues the scheduling replacement and insertion effort started with the SandyBridge (SNB) target in r310792, the Haswell (HSW) target in r311879, the SkylakeClient (SKL) target in rL313613 + rL315978 and the SkylakeServer (SKX) in rL315175.

  Performance fluctuations may be expected due to code alignment effects.

  Reviewers: zvi, RKSimon, craig.topper
  Differential Revision: https://reviews.llvm.org/D39054
  Change-Id: If6f799e5ff60e1091c8d43b05ea78c53581bae01
  llvm-svn: 316492
* bpf: fix a bug in trunc-op optimization | Yonghong Song | 2017-10-24 | 1 | -1/+8

  Previous implementation for per-function scope is incorrect and too conservative.

  Signed-off-by: Yonghong Song <yhs@fb.com>
  llvm-svn: 316481
* [PowerPC] Try to simplify a Swap if it feeds a Splat | Stefan Pintilie | 2017-10-24 | 1 | -0/+47

  If we have the situation where a Swap feeds a Splat we can sometimes change the index on the Splat and then remove the Swap instruction.

  Fixed the test case that was failing and recommit after pulling the original commit.

  Original revision is here: https://reviews.llvm.org/D39009
  llvm-svn: 316478
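Why the Swap can be dropped is easy to check on a toy four-element vector. The sketch assumes the Swap exchanges the two halves of the vector and the Splat broadcasts one element; `Vec`, `swapHalves`, `splat` and `adjustedIndex` are hypothetical helpers, not the PowerPC instructions themselves.

```cpp
#include <array>
#include <cassert>

using Vec = std::array<int, 4>;

// Toy "Swap": exchange the low and high halves of the vector.
Vec swapHalves(const Vec &V) { return {V[2], V[3], V[0], V[1]}; }

// Toy "Splat": broadcast element Idx to every lane.
Vec splat(const Vec &V, unsigned Idx) {
  assert(Idx < 4 && "index out of range");
  return {V[Idx], V[Idx], V[Idx], V[Idx]};
}

// Index to use on the un-swapped vector to get the same splat result.
unsigned adjustedIndex(unsigned Idx) { return (Idx + 2) % 4; }
```

For every index, `splat(swapHalves(V), Idx)` equals `splat(V, adjustedIndex(Idx))`, so the swap becomes dead once the index is rewritten.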
* bpf: fix a bug in bpf-isel trunc-op optimization | Yonghong Song | 2017-10-24 | 1 | -0/+5

  In the BPF backend, we try to optimize away redundant trunc operations so that the kernel verifier rewrite remains valid. The previous implementation only works for a single function. This patch fixes the issue for multiple functions. It clears the internal map data structure before performing the optimization for each function.

  Signed-off-by: Yonghong Song <yhs@fb.com>
  Acked-by: Alexei Starovoitov <ast@kernel.org>
  llvm-svn: 316469
* [X86][AVX] ComputeNumSignBitsForTargetNode - add support for X86ISD::VTRUNC | Simon Pilgrim | 2017-10-24 | 1 | -0/+10

  llvm-svn: 316462
* PowerPC: support the separator character in the IAS | Saleem Abdulrasool | 2017-10-24 | 1 | -0/+1

  PowerPC uses ; as a comment leader and the @ as a separator character. Support this properly.

  llvm-svn: 316454
* [X86] truncateVectorCompareWithPACKSS - use PACKSSDW/PACKSSWB instead of just PACKSSWB. | Simon Pilgrim | 2017-10-24 | 1 | -7/+19

  By using the widest type possible for PACKSS truncation we have a better chance of being able to peek through bitcasts, and we improve other combines driven by ComputeNumSignBits.

  llvm-svn: 316448
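The reason PACKSS is a valid truncation for compare results can be shown per lane. A vector compare yields all-ones (-1) or zero in each lane, and signed saturation leaves both values unchanged at every narrower width. The scalar helpers below model a single lane only; `packssdw` and `packsswb` here are hypothetical functions named after the instructions, not intrinsics.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// One lane of PACKSSDW: 32-bit to 16-bit with signed saturation.
int16_t packssdw(int32_t V) {
  return static_cast<int16_t>(
      std::max<int32_t>(INT16_MIN, std::min<int32_t>(V, INT16_MAX)));
}

// One lane of PACKSSWB: 16-bit to 8-bit with signed saturation.
int8_t packsswb(int16_t V) {
  return static_cast<int8_t>(
      std::max<int16_t>(INT8_MIN, std::min<int16_t>(V, INT8_MAX)));
}
```

Chaining the two steps carries a 32-bit compare mask down to 8 bits unchanged, while values outside the narrow range are clamped rather than wrapped.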
* [ARM] Error for invalid shift in memory operand | Oliver Stannard | 2017-10-24 | 1 | -1/+1

  Report a diagnostic when we fail to parse a shift in a memory operand because the shift type is not an identifier. Without this, we were silently ignoring the whole instruction.

  Differential revision: https://reviews.llvm.org/D39237
  llvm-svn: 316441
* [X86] truncateVectorCompareWithPACKSS - remove duplicate variables. NFCI. | Simon Pilgrim | 2017-10-24 | 1 | -11/+10

  llvm-svn: 316440
* Update f16c instruction scheduling on btver2. | Andrew V. Tischenko | 2017-10-24 | 1 | -0/+50

  Differential Revision: https://reviews.llvm.org/D39051
  llvm-svn: 316435
* X86CallFrameOptimization: Update comments and variable names. NFCI. | Zvi Rackover | 2017-10-24 | 1 | -15/+15

  Following up on D38738.

  llvm-svn: 316434
* X86CallFrameOptimization: Recognize 'store 0/-1 using and/or' idioms | Zvi Rackover | 2017-10-24 | 1 | -7/+29

  Summary: r264440 added or/and patterns for storing -1 or 0 with the intention of decreasing code size. However, X86CallFrameOptimization does not recognize these memory accesses, so it will not replace them with pushes when profitable. This patch fixes this problem by teaching X86CallFrameOptimization these store 0/-1 idioms.

  An alternative fix would be to prevent the 'store 0/-1 idioms' patterns from firing when accessing the stack. This would save the need to teach the pass about these idioms. However, because X86CallFrameOptimization does not always fire, we may end up with cases where neither X86CallFrameOptimization nor the patterns for 'store 0/-1 idioms' fire.

  Fixes pr34863

  Reviewers: DavidKreitzer, guyblank, aymanmus
  Reviewed By: aymanmus
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D38738
  llvm-svn: 316431
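The idioms rely on two identities: `x & 0` is always 0 and `x | -1` is always -1, so an AND with 0 or an OR with -1 on a memory location is a store of that constant regardless of what was there before. A sketch of how a pass might classify them follows; `Op`, `Store` and `storedConstant` are hypothetical names, not the pass's real interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical memory-writing instruction with an immediate operand.
enum class Op { Mov, And, Or };
struct Store { Op Opcode; int32_t Imm; };

// If the instruction unconditionally leaves a known constant in memory,
// write it to Value and return true.
bool storedConstant(const Store &S, int32_t &Value) {
  switch (S.Opcode) {
  case Op::Mov:
    Value = S.Imm; // plain store of an immediate
    return true;
  case Op::And:
    if (S.Imm == 0) { Value = 0; return true; } // x & 0 == 0
    break;
  case Op::Or:
    if (S.Imm == -1) { Value = -1; return true; } // x | -1 == -1
    break;
  }
  return false; // result depends on the previous memory contents
}
```

Any other AND or OR immediate depends on the old contents, so it cannot be turned into a push of a constant.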
* AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1) | Marek Olsak | 2017-10-24 | 8 | -36/+172

  Summary: Kill the thread if operand 0 == false. llvm.amdgcn.wqm.vote can be applied to the operand. Also allow kill in all shader stages.

  Reviewers: arsenm, nhaehnle
  Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D38544
  llvm-svn: 316427
* AMDGPU: Add llvm.amdgcn.wqm.vote intrinsic | Marek Olsak | 2017-10-24 | 1 | -1/+3

  Reviewers: arsenm, nhaehnle
  Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye
  Differential Revision: https://reviews.llvm.org/D38543
  llvm-svn: 316426
* [ARM] Replace development diagnostics with normal DEBUG macro | Oliver Stannard | 2017-10-24 | 1 | -14/+9

  * Remove the -arm-asm-parser-dev-diags option.
  * Use normal DEBUG(dbgs()) printing for the extra development information about missing diagnostics.

  Differential Revision: https://reviews.llvm.org/D39194
  llvm-svn: 316423
* [ARM] tSETEND needs IsThumb | Oliver Stannard | 2017-10-24 | 1 | -1/+1

  This is the Thumb encoding, so the Requires list must include IsThumb.

  No test because we happen to select the ARM one first, but that's just luck.

  Differential Revision: https://reviews.llvm.org/D39190
  llvm-svn: 316421
* [ARM] Remove tCPS alias which just crashed | Oliver Stannard | 2017-10-24 | 1 | -7/+0

  This alias caused a crash when trying to print the "cps #0" instruction in a diagnostic for thumbv6 (which doesn't have that instruction).

  The comment was incorrect: this instruction is UNPREDICTABLE if no flag bits are set, so I don't think it's worth keeping.

  Differential Revision: https://reviews.llvm.org/D39191
  llvm-svn: 316420
* X86: Fix X86CallFrameOptimization to search for the COPY StackPointer | Zvi Rackover | 2017-10-24 | 1 | -16/+24

  SelectionDAG inserts a copy of ESP into a virtual register. X86CallFrameOptimization assumed that the COPY, if present, is always right after the call-frame setup instruction (ADJCALLSTACKDOWN). This was a wrong assumption, as the COPY can be located anywhere between the call-frame setup instruction and its first use. If the COPY happened to be located in a different location than what X86CallFrameOptimization assumed, visiting it while processing the call chain would lead to a conservative bail-out.

  The fix is quite straightforward: scan ahead for the stack-pointer copy and make note of it so it can be ignored while processing the call chain.

  Fixes pr34903

  Differential Revision: https://reviews.llvm.org/D38730
  llvm-svn: 316416
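The scan-ahead step can be sketched on a toy instruction list. This is an illustration under assumptions: `Kind` and `findStackPointerCopy` are hypothetical, the scan is bounded by the call for simplicity, and the real pass works on MachineInstrs with register operands.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction kinds in a call sequence.
enum class Kind { FrameSetup, CopySP, StoreArg, Other, Call };

// Starting after the frame-setup instruction at index Start, look for the
// stack-pointer copy anywhere before the call. Returns its index, or -1
// if there is none.
int findStackPointerCopy(const std::vector<Kind> &Insts, std::size_t Start) {
  assert(Start < Insts.size() && "start index out of range");
  for (std::size_t I = Start + 1;
       I < Insts.size() && Insts[I] != Kind::Call; ++I)
    if (Insts[I] == Kind::CopySP)
      return static_cast<int>(I);
  return -1;
}
```

With the index recorded up front, the later walk over the call chain can skip that instruction instead of bailing out when it meets it in an unexpected place.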
* [MC] Adding code padding for performance stability - infrastructure. NFC. | Omer Paparo Bivas | 2017-10-24 | 2 | -0/+2

  Infrastructure designed for padding code with nop instructions in key places such that performance improvement will be achieved. The infrastructure is implemented such that the padding is done in the Assembler after the layout is done and all IPs and alignments are known.

  This patch by itself is NFC. Future patches will make use of this infrastructure to implement required policies for code padding.

  Reviewers: aaboud, zvi, craig.topper, gadi.haber
  Differential revision: https://reviews.llvm.org/D34393
  Change-Id: I92110d0c0a757080a8405636914a93ef6f8ad00e
  llvm-svn: 316413
* X86: Register the X86CallFrameOptimization pass | Zvi Rackover | 2017-10-24 | 2 | -4/+15

  Summary: The motivation of this change is to enable .mir testing for this pass. Added one test case to cover the functionality; this same case will be improved by a future patch.

  Reviewers: igorb, guyblank, DavidKreitzer
  Reviewed By: guyblank, DavidKreitzer
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D38729
  llvm-svn: 316412
* AMDGPU: Initialize WavefrontSize from TD files | Konstantin Zhuravlyov | 2017-10-23 | 2 | -2/+2

  Differential Revision: https://reviews.llvm.org/D39205
  llvm-svn: 316389
* [X86][SSE] combineBitcastvxi1 - use PACKSSWB directly to pack v8i16 to v16i8 | Simon Pilgrim | 2017-10-23 | 1 | -5/+4

  Avoid difficulties determining the number of sign bits later on in shuffle lowering to lower to PACKSS

  llvm-svn: 316383
* Revert "[PowerPC] Try to simplify a Swap if it feeds a Splat" | Stefan Pintilie | 2017-10-23 | 1 | -47/+0

  Revert commit r316366. Previous commit causes p8-scalar_vector_conversions.ll to fail.

  This reverts commit 990e764ad8a2eec206ce5dda6aefab059ccd4e92.

  llvm-svn: 316371
* [Hexagon] Return the correct chain edge for i1 function calls | Krzysztof Parzyszek | 2017-10-23 | 1 | -1/+2

  In HexagonISelLowering, there is code to handle the case when a function returns an i1 type. In this case, we need to generate extra nodes to copy the result from R0 to a predicate register.

  The code was returning the wrong value for the chain edge, which caused an assert "Wrong topological sorting" when converting the instructions to MIs. This patch fixes the problem by returning the chain for the final copy.

  Patch by Brendon Cahoon.

  llvm-svn: 316367
* [PowerPC] Try to simplify a Swap if it feeds a Splat | Stefan Pintilie | 2017-10-23 | 1 | -0/+47

  If we have the situation where a Swap feeds a Splat we can sometimes change the index on the Splat and then remove the Swap instruction.

  Differential Revision: https://reviews.llvm.org/D39009
  llvm-svn: 316366