summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86] Set the execution domain for vptest instruction to the integer domain.Craig Topper2017-11-111-0/+3
| | | | llvm-svn: 317973
* [X86] Correct the execution domain on ROUND/VROUND instructions.Craig Topper2017-11-111-6/+12
| | | | llvm-svn: 317968
* [X86] Remove the default for one of the arguments to some tablegen ↵Craig Topper2017-11-111-5/+3
| | | | | | | | multiclasses. NFC No one ever uses this default and probably shouldn't since it sets the execution domain to generic. llvm-svn: 317967
* Recommit r317904: [Hexagon] Create HexagonISelDAGToDAG.h, NFCKrzysztof Parzyszek2017-11-102-109/+139
| | | | | | | The Windows builder did not reconstruct the HexagonGenDAGISel.inc file after the TableGen binary has changed. llvm-svn: 317921
* AMDGPU/NFC: Split Processors.td into GCNProcessors.td and R600Processors.tdKonstantin Zhuravlyov2017-11-104-218/+258
| | | | | | Differential Revision: https://reviews.llvm.org/D39880 llvm-svn: 317920
* Revert "[Hexagon] Create HexagonISelDAGToDAG.h, NFC"Krzysztof Parzyszek2017-11-102-139/+109
| | | | | | This reverts r317904: broke Windows build. llvm-svn: 317916
* [X86] Merge the template method selectAddrOfGatherScatterNode into ↵Craig Topper2017-11-101-25/+16
| | | | | | | | selectVectorAddr. NFCI Just need to initialize a couple variables differently based on the node type. No need for a whole separate template method. llvm-svn: 317915
* [RISCV] Silence an unused variable warning in release builds [NFC]Mandeep Singh Grang2017-11-102-5/+5
| | | | | | | | | | | | | | | | | | Summary: Also minor cleanups: 1. Avoided multiple calls to Fixup.getKind() 2. Avoided multiple calls to getFixupKindInfo() 3. Removed a redundant return. Reviewers: asb, apazos Reviewed By: asb Subscribers: rbar, johnrusso, llvm-commits Differential Revision: https://reviews.llvm.org/D39881 llvm-svn: 317908
* [Hexagon] Create HexagonISelDAGToDAG.h, NFCKrzysztof Parzyszek2017-11-102-109/+139
| | | | llvm-svn: 317904
* [RegAlloc, SystemZ] Increase number of LOCRs by passing "hard" regalloc hints.Jonas Paulsson2017-11-105-4/+99
| | | | | | | | | | | | | | | | | | | | * The method getRegAllocationHints() is now of bool type instead of void. If true is returned, regalloc (AllocationOrder) will *only* try to allocate the hints, as opposed to merely trying them before non-hinted registers. * TargetRegisterInfo::getRegAllocationHints() is implemented for SystemZ with an increase in number of LOCRs. In this case, it is desired to force the hints even though there is a slight increase in spilling, because if a non-hinted register would be allocated, the LOCRMux pseudo would have to be expanded with a jump sequence. The LOCR (Load On Condition) SystemZ instruction must have both operands in either the low or high part of the 64 bit register. Reviewers: Quentin Colombet and Ulrich Weigand https://reviews.llvm.org/D36795 llvm-svn: 317879
* [X86] Add support for combining FMADDSUB(A, B, FNEG(C))->FMSUBADD(A, B, C)Craig Topper2017-11-101-0/+31
| | | | | | Support the opposite direction as well. Also add a TODO for not being able to combine FMSUB/FNMADD/FNMSUB with FNEG. llvm-svn: 317878
* [AMDGPU] Fix pointer info for lowering load/store for r600 for amdgiz ↵Yaxun Liu2017-11-101-3/+7
| | | | | | | | | | | | | | | | environment r600 uses dummy pointer info for lowering load/store. Since dummy pointer info assumes address space 0, this causes isel failure when temporary load/store SDNodes are generated for amdgiz environment. Since the offest is not constant, FixedStack pseudo source value cannot be used to create the pointer info. This patch creates pointer info using llvm undef value. At least this provides correct address space so that isel can be done correctly. Differential Revision: https://reviews.llvm.org/D39698 llvm-svn: 317862
* [AMDGPU] Fix pointer info for pseudo source for r600Yaxun Liu2017-11-102-0/+21
| | | | | | | | | | | The pointer info for pseudo source for r600 is not correct when alloca addr space is not 0, which causes invalid SDNode for r600---amdgiz. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39670 llvm-svn: 317861
* [SystemZ] Add support for the "o" inline asm constraintUlrich Weigand2017-11-092-0/+5
| | | | | | | | | We don't really need any special handling of "offsettable" memory addresses, but since some existing code uses inline asm statements with the "o" constraint, add support for this constraint for compatibility purposes. llvm-svn: 317807
* [mips] Correct microMIP's jump and add unconditional branch pseudoSimon Dardis2017-11-094-18/+29
| | | | | | | | | | | | | | Correct the definition of 'j' as being unavailable for microMIPS32R6 and provide the 'b' assembly idiom for codegen purposes for microMIPS32r3. Provide the necessary 'br' pattern for microMIPS32R6 as it now longer incorrectly uses the 'j' instruction. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D39741 llvm-svn: 317801
* [RISCV] MC layer support for the standard RV32A instruction set extensionAlex Bradbury2017-11-096-12/+128
| | | | llvm-svn: 317791
* [RISCV] MC layer support for the standard RV32M instruction set extensionAlex Bradbury2017-11-094-4/+45
| | | | llvm-svn: 317788
* Sched model improving on btver2: JFPU01 resource, vtestp* for xmm.Andrew V. Tischenko2017-11-091-11/+26
| | | | | | Differential Revision: https://reviews.llvm.org/D39802 llvm-svn: 317785
* Add -print-schedule scheduling comments to inline asm.Andrew V. Tischenko2017-11-093-14/+16
| | | | | | Differential Revision: https://reviews.llvm.org/D39728 llvm-svn: 317782
* [X86] Give priority to EVEX FMA instructions over FMA4 instructions.Craig Topper2017-11-093-63/+69
| | | | | | No existing processor has both so it doesn't really matter what we do here. But we were previously just relying on pattern order which gave FMA4 priority. llvm-svn: 317775
* Fix "default label in switch which covers all enumeration values" warningVitaly Buka2017-11-091-2/+0
| | | | llvm-svn: 317771
* [X86] Make X86ISD::FMADDS3 isel patterns commutable.Craig Topper2017-11-091-4/+4
| | | | | | This was missed when FMADDS3 was split from X86ISD::FMADDS3_RND. llvm-svn: 317769
* AMDGPU: Merge BUFFER_STORE_DWORD_OFFEN/OFFSET into x2, x4Marek Olsak2017-11-091-4/+109
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Only 56 shaders (out of 48486) are affected. Totals from affected shaders (changed stats only): SGPRS: 2420 -> 2460 (1.65 %) Spilled VGPRs: 94 -> 112 (19.15 %) Scratch size: 524 -> 528 (0.76 %) dwords per thread Code Size: 187400 -> 184992 (-1.28 %) bytes One DiRT Showdown shader spills 6 more VGPRs. One Grid Autosport shader spills 12 more VGPRs. The other 54 shaders only have a decrease in code size. (I'm ignoring the SGPR noise) Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D39012 llvm-svn: 317755
* AMDGPU: Lower buffer store and atomic intrinsics manuallyMarek Olsak2017-11-095-20/+206
| | | | | | | | | | | | | | Summary: Without this, SIMemoryLegalizer inserts s_waitcnt vmcnt(0) before every buffer store and atomic instruction. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D39060 llvm-svn: 317754
* AMDGPU: Merge BUFFER_LOAD_DWORD_OFFSET into x2, x4Marek Olsak2017-11-091-13/+37
| | | | | | | | | | | | Summary: Only 3 (out of 48486) shaders are affected. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D38951 llvm-svn: 317753
* AMDGPU: Merge BUFFER_LOAD_DWORD_OFFEN into x2, x4Marek Olsak2017-11-091-26/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: -9.9% code size decrease in affected shaders. Totals (changed stats only): SGPRS: 2151462 -> 2170646 (0.89 %) VGPRS: 1634612 -> 1640288 (0.35 %) Spilled SGPRs: 8942 -> 8940 (-0.02 %) Code Size: 52940672 -> 51727288 (-2.29 %) bytes Max Waves: 373066 -> 371718 (-0.36 %) Totals from affected shaders: SGPRS: 283520 -> 302704 (6.77 %) VGPRS: 227632 -> 233308 (2.49 %) Spilled SGPRs: 3966 -> 3964 (-0.05 %) Code Size: 12203080 -> 10989696 (-9.94 %) bytes Max Waves: 44070 -> 42722 (-3.06 %) Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D38950 llvm-svn: 317752
* AMDGPU: Merge S_BUFFER_LOAD_DWORD_IMM into x2, x4Marek Olsak2017-11-092-14/+125
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Only constant offsets (*_IMM opcodes) are merged. It reuses code for LDS load/store merging. It relies on the scheduler to group loads. The results are mixed, I think they are mostly positive. Most shaders are affected, so here are total stats only: SGPRS: 2072198 -> 2151462 (3.83 %) VGPRS: 1628024 -> 1634612 (0.40 %) Spilled SGPRs: 7883 -> 8942 (13.43 %) Spilled VGPRs: 97 -> 101 (4.12 %) Scratch size: 1488 -> 1492 (0.27 %) dwords per thread Code Size: 60222620 -> 52940672 (-12.09 %) bytes Max Waves: 374337 -> 373066 (-0.34 %) There is 13.4% increase in SGPR spilling, DiRT Showdown spills a few more VGPRs (now 37), but 12% decrease in code size. These are the new stats for SGPR spilling. We already spill a lot SGPRs, so it's uncertain whether more spilling will make any difference since SGPRs are always spilled to VGPRs: SGPR SPILLING APPS Shaders SpillSGPR AvgPerSh alien_isolation 2938 100 0.0 batman_arkham_origins 589 6 0.0 bioshock-infinite 1769 4 0.0 borderlands2 3968 22 0.0 counter_strike_glob.. 1142 60 0.1 deus_ex_mankind_div.. 1410 79 0.1 dirt-showdown 533 4 0.0 dirt_rally 364 1163 3.2 divinity 1052 2 0.0 dota2 1747 7 0.0 f1-2015 776 1515 2.0 grid_autosport 1767 1505 0.9 hitman 1413 273 0.2 left_4_dead_2 1762 4 0.0 life_is_strange 1296 26 0.0 mad_max 358 96 0.3 metro_2033_redux 2670 60 0.0 payday2 1362 22 0.0 portal 474 3 0.0 saints_row_iv 1704 8 0.0 serious_sam_3_bfe 392 1348 3.4 shadow_of_mordor 1418 12 0.0 shadow_warrior 3956 239 0.1 talos_principle 324 1735 5.4 thea 172 17 0.1 tomb_raider 1449 215 0.1 total_war_warhammer 242 56 0.2 ue4_effects_cave 295 55 0.2 ue4_elemental 572 12 0.0 unigine_tropics 210 56 0.3 unigine_valley 278 152 0.5 victor_vran 1262 84 0.1 yofrankie 82 2 0.0 Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D38949 llvm-svn: 317751
* AMDGPU: Fold immediate offset into BUFFER_LOAD_DWORD lowered from SMEMMarek Olsak2017-11-093-14/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: -5.3% code size in affected shaders. Changed stats only: 48486 shaders in 30489 tests Totals: SGPRS: 2086406 -> 2072430 (-0.67 %) VGPRS: 1626872 -> 1627960 (0.07 %) Spilled SGPRs: 7865 -> 7912 (0.60 %) Code Size: 60978060 -> 60188764 (-1.29 %) bytes Max Waves: 374530 -> 374342 (-0.05 %) Totals from affected shaders: SGPRS: 299664 -> 285688 (-4.66 %) VGPRS: 233844 -> 234932 (0.47 %) Spilled SGPRs: 3959 -> 4006 (1.19 %) Code Size: 14905272 -> 14115976 (-5.30 %) bytes Max Waves: 46202 -> 46014 (-0.41 %) Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D38915 llvm-svn: 317750
* [X86] Make sure we don't read too many operands from X86ISD::FMADDS1/FMADDS3 ↵Craig Topper2017-11-091-2/+5
| | | | | | | | nodes when doing FNEG combine. r317453 added new ISD nodes without rounding modes that were added to an existing if/else chain. But all the previous nodes handled there included a rounding mode. The final code after this if/else chain expected an extra operand that isn't present for the new nodes. llvm-svn: 317748
* [X86] X86MaskedGatherSDNode shouldn't inherit from MaskedGatherScatterSDNodeCraig Topper2017-11-081-2/+10
| | | | | | The classof implementation in MaskedGatherScatterSDNode doesn't consider X86MaskedGatherSDNode so its misleading. llvm-svn: 317733
* [X86] Preserve memory refs when folding loads into divides.Craig Topper2017-11-081-1/+5
| | | | | | This is similar to what we already do for multiplies. Without this we can't unfold and hoist an invariant load. llvm-svn: 317732
* [X86] Remove an if check on the result of a cast. NFCCraig Topper2017-11-081-12/+6
| | | | | | cast takes a non-null input and produces a non-null output. So this if can never fail. llvm-svn: 317731
* Revert "Correct dwarf unwind information in function epilogue for X86"Reid Kleckner2017-11-083-54/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts r317579, originally committed as r317100. There is a design issue with marking CFI instructions duplicatable. Not all targets support the CFIInstrInserter pass, and targets like Darwin can't cope with duplicated prologue setup CFI instructions. The compact unwind info emission fails. When the following code is compiled for arm64 on Mac at -O3, the CFI instructions end up getting tail duplicated, which causes compact unwind info emission to fail: int a, c, d, e, f, g, h, i, j, k, l, m; void n(int o, int *b) { if (g) f = 0; for (; f < o; f++) { m = a; if (l > j * k > i) j = i = k = d; h = b[c] - e; } } We get assembly that looks like this: ; BB#1: ; %if.then Lloh3: adrp x9, _f@GOTPAGE Lloh4: ldr x9, [x9, _f@GOTPAGEOFF] mov w8, wzr Lloh5: str wzr, [x9] stp x20, x19, [sp, #-16]! ; 8-byte Folded Spill .cfi_def_cfa_offset 16 .cfi_offset w19, -8 .cfi_offset w20, -16 cmp w8, w0 b.lt LBB0_3 b LBB0_7 LBB0_2: ; %entry.if.end_crit_edge Lloh6: adrp x8, _f@GOTPAGE Lloh7: ldr x8, [x8, _f@GOTPAGEOFF] Lloh8: ldr w8, [x8] stp x20, x19, [sp, #-16]! ; 8-byte Folded Spill .cfi_def_cfa_offset 16 .cfi_offset w19, -8 .cfi_offset w20, -16 cmp w8, w0 b.ge LBB0_7 LBB0_3: ; %for.body.lr.ph Note the multiple .cfi_def* directives. Compact unwind info emission can't handle that. llvm-svn: 317726
* Set hasSideEffects=0 for PHI and fix affected passesAlex Bradbury2017-11-081-3/+2
| | | | | | | | | | | | | | Previously, hasSideEffects was ? for TargetOpcode::PHI and would be inferred as 1. D37065 sets the previously inferred properties explicitly. This patch sets hasSideEffects=0 for PHI, as it is for G_PHI. MachineInstr::isSafeToMove has been updated so it still returns false for PHI. Additionally, HexagonBitSimplify relied on a PHI node having the hasUnmodeledSideEffects property. This patch fixes that assumption. Differential Revision: https://reviews.llvm.org/D37097 llvm-svn: 317721
* [X86] Correct the implementation of BEXTR load folding to use the shift as ↵Craig Topper2017-11-081-6/+14
| | | | | | | | | | the parent node and pass a separate root. We were calling tryFoldLoad with the 'and' node was the root and parent node of the load. But the parent of the load should be the shift that proceeds the and. While the and node is correctly the root node. To fix this I had to make tryFoldLoad take a separate use and root input. I've added a convenience version with the old signature to avoid updating the other call sites. llvm-svn: 317720
* [WebAssembly] Update test expectationsSam Clegg2017-11-081-14/+3
| | | | | | | | I believe these were fixed in rL317707 Differential Revision: https://reviews.llvm.org/D39813 llvm-svn: 317718
* [X86] Don't call validateInstruction from MatchAndEmitInstruction when ↵Craig Topper2017-11-081-2/+2
| | | | | | | | MatchingInlineAsm is set. The MCInst won't be populated. Without this we can't parse gather instructions in ms inline asm blocks. The validateInstruction function was introduced in r316700 to check gather constraints. llvm-svn: 317713
* [WebAssembly] Call signExtend to get sign extended registerDan Gohman2017-11-081-1/+1
| | | | | | | | Patch by Jatin Bhateja! Differential Revision: https://reviews.llvm.org/D39529 llvm-svn: 317710
* [WebAssembly] Revise the strategy for inline asm.Dan Gohman2017-11-082-6/+25
| | | | | | | | | | | | | | | | | | | | | | Previously, an "r" constraint would mean the compiler provides a value on WebAssembly's operand stack. This was tricky to use properly, particularly since it isn't possible to declare a new local from within an inline asm string. With this patch, "r" provides the value in a WebAssembly local, and the local index is provided to the inline asm string. This requires inline asm to use get_local and set_local to read the register. This does potentially result in larger code size, however inline asm should hopefully be quite rare in WebAssembly. This also means that the "m" constraint can no longer be supported, as WebAssembly has nothing like a "memory operand" that includes an implicit get_local. This fixes PR34599 for the wasm32-unknown-unknown-wasm target (though not for the ELF target). llvm-svn: 317707
* [RISCV] Initial support for function callsAlex Bradbury2017-11-088-4/+186
| | | | | | | | | Note that this is just enough for simple function call examples to generate working code. Support for varargs etc follows in future patches. Differential Revision: https://reviews.llvm.org/D29936 llvm-svn: 317691
* [RISCV] Codegen for conditional branchesAlex Bradbury2017-11-088-4/+118
| | | | | | | | | | | | | | | | | | | | A good portion of this patch is the extra functions that needed to be implemented to support the test case. e.g. storeRegToStackSlot, loadRegFromStackSlot, eliminateFrameIndex. Setting ISD::BR_CC to Expand may appear non-obvious on an architecture with branch+cmp instructions. However, I found it much easier to deal with matching the expanded form. I had to change simm13_lsb0 and simm21_lsb0 to inherit from the Operand<OtherVT> class rather than Operand<i32> in order to keep tablegen happy. This isn't a big deal, but it does seem a shame to lose the uniformity across immediate types when there's not an obvious benefit (I'm hoping a tablegen expert will educate me on what I'm missing here!). Differential Revision: https://reviews.llvm.org/D29935 llvm-svn: 317690
* [RISCV] Codegen support for memory operations on global addressesAlex Bradbury2017-11-085-22/+99
| | | | | | Differential Revision: https://reviews.llvm.org/D39103 llvm-svn: 317688
* [RISCV] Codegen support for memory operationsAlex Bradbury2017-11-084-0/+47
| | | | | | | | | This required the implementation of RISCVTargetInstrInfo::copyPhysReg. Support for lowering global addresses follow in the next patch. Differential Revision: https://reviews.llvm.org/D29934 llvm-svn: 317685
* [RISCV] Codegen support for materializing constantsAlex Bradbury2017-11-081-0/+24
| | | | | | Differential Revision: https://reviews.llvm.org/D39101 llvm-svn: 317684
* [mips] Guard indirect and tailcall pseudo instructions correctly.Simon Dardis2017-11-083-11/+23
| | | | | | | | | | | Previously these pseudo instructions were not guarded by ISA, so their select was dependant on the ordering of the entries in the DAG matcher. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D39723 llvm-svn: 317681
* [NFCI] Ensure TargetOpcode::* are compatible with guessInstructionProperties=0Alex Bradbury2017-11-081-3/+1
| | | | | | | | | | | | | | | | | | | rL162640 introduced CodeGenTarget::guessInstructionProperties. If a target sets guessInstructionProperties=0 in its FooInstrInfo, tablegen will error if it has to guess properties from patterns. Unfortunately, guessInstructionProperties=0 can't be used with current upstream LLVM as instructions in the TargetOpcode namespace are always included and sometimes have inferred properties for mayLoad, mayStore, and hasSideEffects. This patch provides the simplest possible fix to this problem, setting default values for these fields in the TargetOpcode scope. There is no intended functional change, as the explicitly set properties should match what was previously inferred. A number of the instructions had hasSideEffects=1 inferred unintentionally. This patch makes it explicit, while future patches (such as D37097) correct the property. Differential Revision: https://reviews.llvm.org/D37065 llvm-svn: 317674
* [X86] Add patterns to fold EVEX store with EVEX encoded vcvtps2ph ↵Craig Topper2017-11-081-11/+23
| | | | | | instructions. Remove bad pattern that had vf432 vcvtps2ph storing 128-bits. llvm-svn: 317662
* [X86] Allow legacy vcvtps2ph intrinsics to select EVEX encoded instructions. ↵Craig Topper2017-11-082-18/+16
| | | | | | | | Rely on EVEX->VEX to convert back. Missed store folding opportunities will be fixed in a subsequent commit. llvm-svn: 317661
* Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layeringDavid Blaikie2017-11-0886-99/+99
| | | | | | | | This header includes CodeGen headers, and is not, itself, included by any Target headers, so move it into CodeGen to match the layering of its implementation. llvm-svn: 317647
* AMDGPU: Set correct sched model on v_mad_u64_u32Matt Arsenault2017-11-081-0/+2
| | | | llvm-svn: 317645
OpenPOWER on IntegriCloud