summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU: Slightly simplify prolog reserved register handlingMatt Arsenault2017-04-241-25/+27
| | | | | | | | | | | | | | Rely on MachineRegisterInfo's knowledge of used physical registers. Move flat_scratch initialization earlier, so the uses are visible when making these decisions. This will make it easier to add another reserved register at the end for the stack pointer rather than handling another special case. llvm-svn: 301254
* Move value type list from TargetRegisterClass to TargetRegisterInfoKrzysztof Parzyszek2017-04-249-27/+40
| | | | | | Differential Revision: https://reviews.llvm.org/D31937 llvm-svn: 301234
* Revert r301231: Accidentally committed stale filesKrzysztof Parzyszek2017-04-249-34/+27
| | | | | | I forgot to commit local changes before commit. llvm-svn: 301232
* Move value type list from TargetRegisterClass to TargetRegisterInfoKrzysztof Parzyszek2017-04-249-27/+34
| | | | | | Differential Revision: https://reviews.llvm.org/D31937 llvm-svn: 301231
* AMDGPU: Select scratch mubuf offsets when pointer is a constantMatt Arsenault2017-04-242-34/+94
| | | | | | | | In call sequence setups, there may not be a frame index base and the pointer is a constant offset from the frame pointer / scratch wave offset register. llvm-svn: 301230
* AMDGPU: Set StackGrowsUp in MCAsmInfoMatt Arsenault2017-04-241-0/+1
| | | | | | Not sure what this does though. llvm-svn: 301229
* [AMDGPU] Merge M0 initializationsStanislav Mekhanoshin2017-04-242-9/+179
| | | | | | | | | | Merges equivalent initializations of M0 and hoists them into a common dominator block. Technically the same code can be used with any register, physical or virtual. Differential Revision: https://reviews.llvm.org/D32279 llvm-svn: 301228
* Move size and alignment information of regclass to TargetRegisterInfoKrzysztof Parzyszek2017-04-2434-198/+246
| | | | | | | | | | | | | | | 1. RegisterClass::getSize() is split into two functions: - TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const; - TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const; 2. RegisterClass::getAlignment() is replaced by: - TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const; This will allow making those values depend on subtarget features in the future. Differential Revision: https://reviews.llvm.org/D31783 llvm-svn: 301221
* CodeGen: Add a hook for getFenceOperandTyYaxun Liu2017-04-242-0/+10
| | | | | | | | | | | | | | Currently the operand type for ATOMIC_FENCE assumes value type of a pointer in address space 0. This is fine for most targets. However for amdgcn target, the size of pointer in address space 0 depends on triple environment. For amdgiz environment, it is 64 bit but for other environment it is 32 bit. On the other hand, amdgcn target expects 32 bit fence operands independent of the target triple environment. Therefore a hook is need in target lowering for getting the fence operand type. This patch has no effect on targets other than amdgcn. Differential Revision: https://reviews.llvm.org/D32186 llvm-svn: 301215
* X86RegisterInfo: eliminateFrameIndex: Avoid code duplication; NFCMatthias Braun2017-04-243-34/+24
| | | | | | | | | | | | | | | | | Re-Commit of r300922 and r300923 with less aggressive assert (see discussion at the end of https://reviews.llvm.org/D32205) X86RegisterInfo::eliminateFrameIndex() and X86FrameLowering::getFrameIndexReference() both had logic to compute the base register. This consolidates the code. Also use MachineInstr::isReturn instead of manually enumerating tail call instructions (return instructions were not included in the previous list because they never reference frame indexes). Differential Revision: https://reviews.llvm.org/D32206 llvm-svn: 301211
* AMDGPU: Add StackPtr and FramePtr registers to MFIMatt Arsenault2017-04-242-0/+26
| | | | | | These will be necessary for setting up call sequences. llvm-svn: 301208
* AMDGPU: Move trap lowering to DAGMatt Arsenault2017-04-245-59/+66
| | | | | | | | | | | Fixes traps in any block besides the entry block, and fixes depending on a live-in physical register by using a virtual register copy. Also happens to stop emitting a nop in the case debug trap is not supported. llvm-svn: 301206
* AMDGPU: Move v_readlane lane select from VGPR to SGPRNicolai Haehnle2017-04-241-0/+13
| | | | | | | | | | | | | | | | | Summary: Fix a compiler bug when the lane select happens to end up in a VGPR. Clarify the semantic of the corresponding intrinsic to be that of the corresponding GLSL: the lane select must be uniform across a wave front, otherwise results are undefined. Reviewers: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D32343 llvm-svn: 301197
* [GlobalISel][X86] Lower FormalArgument/Ret using ↵Igor Breger2017-04-242-22/+11
| | | | | | | | | | | | | | | | G_MERGE_VALUES/G_UNMERGE_VALUES. Summary: [GlobalISel][X86] Lower FormalArgument/Ret using G_MERGE_VALUES/G_UNMERGE_VALUES. Reviewers: zvi, t.p.northover, guyblank Reviewed By: t.p.northover Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D32288 llvm-svn: 301194
* AMDGPU: Fix crash when scheduling non-memory SMRD instructionsNicolai Haehnle2017-04-241-0/+5
| | | | | | | | | | | | Summary: Fixes piglit spec/arb_shader_clock/execution/* Reviewers: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D32345 llvm-svn: 301191
* [SystemZ] Update kill-flag in splitMove().Jonas Paulsson2017-04-241-2/+3
| | | | | | | EarlierMI needs to clear the kill flag on the first operand in case of a store. Review: Ulrich Weigand llvm-svn: 301177
* [ARM] GlobalISel: Legalize s8 and s16 G_(S|U)DIVDiana Picus2017-04-242-0/+55
| | | | | | | | | | | | | | We have to widen the operands to 32 bits and then we can either use hardware division if it is available or lower to a libcall otherwise. At the moment it is not enough to set the Legalizer action to WidenScalar, since for libcalls it won't know what to do (it won't be able to find what size to widen to, because it will find Libcall and not Legal for 32 bits). To hack around this limitation, we request Custom lowering, and as part of that we widen first and then we run another legalizeInstrStep on the widened DIV. llvm-svn: 301166
* [Arch64AsmParser] better diagnostic for isbSjoerd Meijer2017-04-241-7/+5
| | | | | | | | | | | Instruction isb takes as an operand either 'sy' or an immediate value. This improves the diagnostic when the string is not 'sy' and adds a test case for this which was missing. This also adds tests to check invalid inputs for dsb and dmb. Differential Revision: https://reviews.llvm.org/D32227 llvm-svn: 301165
* [ARM] GlobalISel: Support G_(S|U)DIV for s32Diana Picus2017-04-243-0/+19
| | | | | | | | | Add support for both targets with hardware division and without. For hardware division we have to add support throughout the pipeline (legalizer, reg bank select, instruction select). For targets without hardware division, we only need to mark it as a libcall. llvm-svn: 301164
* [ARM] GlobalISel: Select G_CONSTANT with CImm operandsDiana Picus2017-04-241-0/+12
| | | | | | | | | | | | | | When selecting a G_CONSTANT to a MOVi, we need the value to be an Imm operand. We used to just leave the G_CONSTANT operand unchanged, which works in some cases (such as the GEP offsets that we create when referring to stack slots). However, in many other places the G_CONSTANTs are created with CImm operands. This patch makes sure to handle those as well, and to error out gracefully if in the end we don't end up with an Imm operand. Thanks to Oliver Stannard for reporting this issue. llvm-svn: 301162
* [X86][SSE] Add scheduler class support for SSE42 (PCMPGT) instructionsSimon Pilgrim2017-04-231-6/+10
| | | | llvm-svn: 301142
* Revert "[APInt] Fix a few places that use APInt::getRawData to operate ↵Renato Golin2017-04-233-16/+17
| | | | | | | | | | | | | | | | within the normal API." This reverts commit r301105, 4, 3 and 1, as a follow up of the previous revert, which broke even more bots. For reference: Revert "[APInt] Use operator<<= where possible. NFC" Revert "[APInt] Use operator<<= instead of shl where possible. NFC" Revert "[APInt] Use ashInPlace where possible." PR32754. llvm-svn: 301111
* [X86][MPX] Add load & store instructions of bnd values to ↵Ayman Musa2017-04-231-22/+30
| | | | | | | | | | getLoadStoreRegOpcode function. This is needed for a follow up patch that generates the memory folding tables. Differential Revision: https://reviews.llvm.org/D32232 llvm-svn: 301109
* [APInt] Fix a few places that use APInt::getRawData to operate within the ↵Craig Topper2017-04-231-1/+1
| | | | | | | | | | normal API. getRawData exposes the internal type of the APInt class directly to its users. Ideally we wouldn't expose such an implementation detail. This patch fixes a few of the easy cases by using truncate, extract, or a rotate. llvm-svn: 301105
* [APInt] Use operator<<= where possible. NFCCraig Topper2017-04-231-2/+2
| | | | llvm-svn: 301104
* [APInt] Use operator<<= instead of shl where possible. NFCCraig Topper2017-04-232-12/+11
| | | | llvm-svn: 301103
* [APInt] Use ashInPlace where possible.Craig Topper2017-04-231-2/+2
| | | | llvm-svn: 301101
* [globalisel][tablegen] Revise API for ComplexPattern operands to improve ↵Daniel Sanders2017-04-223-17/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | flexibility. Summary: Some targets need to be able to do more complex rendering than just adding an operand or two to an instruction. For example, it may need to insert an instruction to extract a subreg first, or it may need to perform an operation on the operand. In SelectionDAG, targets would create SDNode's to achieve the desired effect during the complex pattern predicate. This worked because SelectionDAG had a form of garbage collection that would take care of SDNode's that were created but not used due to a later predicate rejecting a match. This doesn't translate well to GlobalISel and the churn was wasteful. The API changes in this patch enable GlobalISel to accomplish the same thing without the waste. The API is now: InstructionSelector::OptionalComplexRendererFn selectArithImmed(MachineOperand &Root) const; where Root is the root of the match. The return value can be omitted to indicate that the predicate failed to match, or a function with the signature ComplexRendererFn can be returned. For example: return OptionalComplexRendererFn( [=](MachineInstrBuilder &MIB) { MIB.addImm(Immed).addImm(ShVal); }); adds two immediate operands to the rendered instruction. Immed and ShVal are captured from the predicate function. As an added bonus, this also reduces the amount of information we need to provide to GIComplexOperandMatcher. Depends on D31418 Reviewers: aditya_nandakumar, t.p.northover, qcolombet, rovka, ab, javed.absar Reviewed By: ab Subscribers: dberris, kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D31761 llvm-svn: 301079
* AArch64FrameLowering: Check if the ExtraCSSpill register is actually unusedMatthias Braun2017-04-211-6/+6
| | | | | | | | | | | The code assumed that when saving an additional CSR register (ExtraCSSpill==true) we would have a free register throughout the function. This was not true if this CSR register is also used to pass values as in the swiftself case. rdar://31451816 llvm-svn: 301057
* Re-commit r301040 "X86: Don't emit zero-byte functions on Windows"Hans Wennborg2017-04-2114-26/+20
| | | | | | | | | In addition to the original commit, tighten the condition for when to pad empty functions to COFF Windows. This avoids running into problems when targeting e.g. Win32 AMDGPU, which caused test failures when this was committed initially. llvm-svn: 301047
* Revert r301040 "X86: Don't emit zero-byte functions on Windows"Hans Wennborg2017-04-2114-20/+26
| | | | | | This broke almost all bots. Reverting while fixing. llvm-svn: 301041
* X86: Don't emit zero-byte functions on WindowsHans Wennborg2017-04-2114-26/+20
| | | | | | | | | | | | | | | | | | Empty functions can lead to duplicate entries in the Guard CF Function Table of a binary due to multiple functions sharing the same RVA, causing the kernel to refuse to load that binary. We had a terrific bug due to this in Chromium. It turns out we were already doing this for Mach-O in certain situations. This patch expands the code for that in AsmPrinter::EmitFunctionBody() and renames TargetInstrInfo::getNoopForMachoTarget() to simply getNoop() since it seems it was used for not just Mach-O anyway. Differential Revision: https://reviews.llvm.org/D32330 llvm-svn: 301040
* ARM: make sure we use all entries in a vector before forming a vpaddl.Tim Northover2017-04-211-5/+5
| | | | | | | | | Otherwise there's some mismatch, and we'll either form an illegal type or an illegal node. Thanks to Eli Friedman for pointing out the problem with my original solution. llvm-svn: 301036
* AMDGPU/GFX9: Enable FastFMAF32Konstantin Zhuravlyov2017-04-211-1/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D32363 llvm-svn: 301029
* AMDGPU: Temporarily disable packed inlinable literals (v2f16, v2i16)Konstantin Zhuravlyov2017-04-211-0/+9
| | | | | | Differential Revision: https://reviews.llvm.org/D32361 llvm-svn: 301028
* AMDGPU: Fix S_PACK_HH_B32_B16Konstantin Zhuravlyov2017-04-211-1/+1
| | | | | | | | - We really ought to zero out lower 16 bits Differential Revision: https://reviews.llvm.org/D32356 llvm-svn: 301026
* [AMDGPU] Handle SI_MASKED_UNREACHABLE in instruction emitterYaxun Liu2017-04-211-0/+6
| | | | | | | | | | | | SI_MASKED_UNREACHABLE does not have machine instruction encoding. It needs special handling in AMDGPUAsmPrinter::EmitInstruction like some other pseudo instructions. This patch fixes compilation failure of RadeonRays. Differential Revision: https://reviews.llvm.org/D32364 llvm-svn: 301025
* Revert "X86RegisterInfo: eliminateFrameIndex: Avoid code duplication; NFC"Matthias Braun2017-04-213-23/+34
| | | | | | | | | | | It seems we have on situation in a sanitizer enable bootstrap build where the return instruction has a frame index operand that does not point to a fixed object and fails the assert added here. This reverts commit r300923. This reverts commit r300922. llvm-svn: 301024
* AMDGPU: Do not lower fast unsafe div for safe, f32, with fp32 denormalsKonstantin Zhuravlyov2017-04-211-2/+4
| | | | | | Differential Revision: https://reviews.llvm.org/D32085 llvm-svn: 301023
* [AArch64] Improve code generation for logical instructions takingAkira Hatanaka2017-04-216-7/+152
| | | | | | | | | | | | | | | | | | | | | | | | | | | | immediate operands. This commit adds an AArch64 dag-combine that optimizes code generation for logical instructions taking immediate operands. The optimization uses demanded bits to change a logical instruction's immediate operand so that the immediate can be folded into the immediate field of the instruction. This recommits r300932 and r300930, which was causing dag-combine to loop forever. The problem was that optimizeLogicalImm was returning true even when there was no change to the immediate node (which happened when the immediate was all zeros or ones), which caused dag-combine to push and pop the same node to the work list over and over again without making any progress. This commit fixes the bug by returning false early in optimizeLogicalImm if the immediate is all zeros or ones. Also, it changes the code to compare the immediate with 0 or Mask rather than calling countPopulation. rdar://problem/18231627 Differential Revision: https://reviews.llvm.org/D5591 llvm-svn: 301019
* [AArch64] Refactor instruction selection lowering for addresses. NFCIJoel Jones2017-04-212-85/+91
| | | | | | | | | | | | Factor out the common code used for generating addresses into common templated functions that call overloaded versions of a new function, getTargetNode. Tested with make check-llvm with targets AArch64. Differential Revision: https://reviews.llvm.org/D32169 llvm-svn: 301005
* ARM: don't try to create an i8 -> i32 vpaddl.Tim Northover2017-04-211-2/+5
| | | | | | | | DAG combine was mistakenly assuming that the step-up it was looking at was always a doubling, but it can sometimes be a larger extension in which case we'd crash. llvm-svn: 301002
* [globalisel][tablegen] Import SelectionDAG's rule predicates and support the ↵Daniel Sanders2017-04-214-9/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | equivalent in GIRule. Summary: The SelectionDAG importer now imports rules with Predicate's attached via Requires, PredicateControl, etc. These predicates are implemented as bitset's to allow multiple predicates to be tested together. However, unlike the MC layer subtarget features, each target only pays for it's own predicates (e.g. AArch64 doesn't have 192 feature bits just because X86 needs a lot). Both AArch64 and X86 derive at least one predicate from the MachineFunction or Function so they must re-initialize AvailableFeatures before each function. They also declare locals in <Target>InstructionSelector so that computeAvailableFeatures() can use the code from SelectionDAG without modification. Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab Reviewed By: rovka Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D31418 llvm-svn: 300993
* [AArch64][Falkor] Refine modeling of store-release exclusive instructions.Chad Rosier2017-04-212-2/+8
| | | | llvm-svn: 300987
* [Mips] Document Mips Backend Relocation PrinciplesJoel Jones2017-04-211-0/+125
| | | | | | | | | | | This revision documents the combination of C++ and table-gen code that handles relocations and addresses. Thanks for Simon Dardis for the careful reviews. Differential Revision: https://reviews.llvm.org/D31628 llvm-svn: 300986
* [AArch64][Falkor] Refine resource needs of STRQ with register offset.Chad Rosier2017-04-212-0/+8
| | | | llvm-svn: 300984
* Revert r300964 + r300970 - [globalisel][tablegen] Import SelectionDAG's rule ↵Daniel Sanders2017-04-214-48/+9
| | | | | | | | | predicates and support the equivalent in GIRule. It's causing llvm-clang-x86_64-expensive-checks-win to fail to compile and I haven't worked out why. Reverting to make it green while I figure it out. llvm-svn: 300978
* [AArch64][Falkor] Refine loads/stores that require an extra LD pipe.Chad Rosier2017-04-212-5/+21
| | | | llvm-svn: 300976
* [AArch64][Falkor] Fix number of microops for WriteSTIdx missed in r300892.Chad Rosier2017-04-211-1/+1
| | | | llvm-svn: 300975
* [AArch64] Fix a few missed pre/post-inc in Falkor.Chad Rosier2017-04-211-5/+9
| | | | llvm-svn: 300974
OpenPOWER on IntegriCloud