path: root/llvm/lib/Target
Commit message    Author    Age    Files    Lines
* Revert "[AArch64] Simplify AES*Tied pseudo expansion (NFC)."Tim Northover2017-08-031-3/+10
| | | | | | | | | | This reverts commit r309821. My suggestion was wrong because it left the MachineOperands tied which confused the verifier. Since there's no easy way to untie operands, the original BuildMI solution is probably best. llvm-svn: 309962
* AMDGPU/SI: Don't fix a PHI under uniform branch in SIFixSGPRCopies only when sources and destination are all sgprsChangpeng Fang2017-08-031-3/+3
| | | | | | | | | | | | | | | | Summary: If a PHI has at least one VGPR operand, we have to fix the PHI in SIFixSGPRCopies. Reviewer: Matt Differential Revision: http://reviews.llvm.org/D34727 llvm-svn: 309959
* Revert r309923, it caused PR34045.Nico Weber2017-08-032-156/+13
| | | | llvm-svn: 309950
* [ARM] GlobalISel: Select simple G_GLOBAL_VALUE instructionsDiana Picus2017-08-031-0/+57
| | | | | | | | | | | | | Add support in the instruction selector for G_GLOBAL_VALUE for ELF and MachO for the static relocation model. We don't handle Windows yet because that's Thumb-only, and we don't handle Thumb in general at the moment. Support for PIC, ROPI, RWPI and TLS will be added in subsequent commits. Differential Revision: https://reviews.llvm.org/D35883 llvm-svn: 309927
* [X86] SET0 to use XMM registers where possible PR26018 PR32862Dinar Temirbulatov2017-08-031-8/+13
| | | | | | Differential Revision: https://reviews.llvm.org/D35965 llvm-svn: 309926
* [ARM] Use ADDCARRY / SUBCARRYRoger Ferrer Ibanez2017-08-032-13/+156
| | | | | | | | | | | | | | | | | | | | | | | | This patch: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) <- (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) <- (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRY into ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) -> C Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 309923
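The carry conversions described in the ARM ADDCARRY commit above can be modelled in plain C++. This is a minimal sketch of the arithmetic only, not the LLVM lowering code: the helper names (addc, adde, boolToCarry, carryToBool, addCarry) and the result structs are hypothetical, and the model covers the addition case only.

```cpp
#include <cassert>
#include <cstdint>

// Result of a 32-bit add that also produces a carry flag (illustrative type).
struct FlagResult {
  uint32_t value;
  bool carry;
};

// Models an add that sets the carry flag, in the spirit of ARMISD::ADDC.
inline FlagResult addc(uint32_t a, uint32_t b) {
  uint64_t wide = static_cast<uint64_t>(a) + b;
  return {static_cast<uint32_t>(wide), (wide >> 32) != 0};
}

// Models an add that consumes and sets the carry flag, like ARMISD::ADDE.
inline FlagResult adde(uint32_t a, uint32_t b, bool carryIn) {
  uint64_t wide = static_cast<uint64_t>(a) + b + (carryIn ? 1u : 0u);
  return {static_cast<uint32_t>(wide), (wide >> 32) != 0};
}

// (_, C) <- ADDC R, -1: adding 0xFFFFFFFF carries out exactly when R != 0.
inline bool boolToCarry(uint32_t r) { return addc(r, 0xFFFFFFFFu).carry; }

// (R, _) <- ADDE 0, 0, C: materialises the flag as the integer 0 or 1.
inline uint32_t carryToBool(bool c) { return adde(0u, 0u, c).value; }

// Result of the modelled ISD::ADDCARRY: sum plus carry-out as an integer.
struct AddCarryResult {
  uint32_t value;
  uint32_t carryOut;
};

// Boolean carry-in -> flag, flag-based add, flag -> boolean carry-out.
inline AddCarryResult addCarry(uint32_t a, uint32_t b, uint32_t carryIn) {
  assert(carryIn <= 1u && "carry-in is expected to be a boolean value");
  FlagResult sum = adde(a, b, boolToCarry(carryIn));
  return {sum.value, carryToBool(sum.carry)};
}
```

In this model carryToBool followed by boolToCarry returns the original flag, which is the redundancy the new combine (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) -> C removes.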
* Fix WebAssembly target after r309911.Daniel Jasper2017-08-033-16/+5
| | | | llvm-svn: 309922
* Fix the ppc jit tests.Rafael Espindola2017-08-031-3/+4
| | | | llvm-svn: 309921
* Delete Default and JITDefault code modelsRafael Espindola2017-08-0341-312/+344
| | | | | | | | | | | | | | | IMHO it is an antipattern to have an enum value that is Default. At any given piece of code it is not clear if we have to handle Default or if it has already been mapped to a concrete value. In this case in particular, only the target can do the mapping and it is nice to make sure it is always done. This deletes the two default enum values of CodeModel and uses an explicit Optional<CodeModel> when it is possible that it is unspecified. llvm-svn: 309911
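The pattern this commit argues for can be sketched as follows. This is an illustration under assumptions, not the LLVM code: it uses std::optional (C++17) as a stand-in for llvm::Optional, the enum is a simplified copy, and resolveCodeModel is a hypothetical helper name.

```cpp
#include <optional>

// Simplified stand-in for the code model enum, with no Default member.
enum class CodeModel { Small, Kernel, Medium, Large };

// Hypothetical target-side helper: an unspecified request is mapped to the
// target's own default in exactly one place, so later code never sees an
// "unresolved" value.
inline CodeModel resolveCodeModel(std::optional<CodeModel> Requested,
                                  CodeModel TargetDefault) {
  return Requested.value_or(TargetDefault);
}
```

A caller with no preference passes std::nullopt, which makes the "unspecified" state visible in the type rather than hidden in an enum value.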
* [ARM] Tidy up banked registers encodingJaved Absar2017-08-035-77/+74
| | | | | | | | | | Moves encoding (SYSm) information of banked registers to ARMSystemRegister.td, where it rightly belongs and forms a single point of reference in the code. Reviewed by: @fhahn, @rovka, @olista01 Differential Revision: https://reviews.llvm.org/D36219 llvm-svn: 309910
* AMDGPU/GlobalISel: Mark 32-bit G_FMUL as legalTom Stellard2017-08-021-0/+2
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D36218 llvm-svn: 309898
* AMDGPU/R600: Initialize more passesTom Stellard2017-08-027-8/+68
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D36128 llvm-svn: 309893
* Don't pass the code model to MCRafael Espindola2017-08-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | I was surprised to see the code model being passed to MC. After all, it assembles code, it doesn't create it. The one place it is used is in the expansion of .cfi directives to handle .eh_frame being more than 2 GB away from the code. As far as I can tell, the GNU assembler doesn't even have an option to enable this. Compiling a C file with gcc -mcmodel=large produces a regular looking .eh_frame. This is probably because in practice linkers parse and recreate .eh_frames. In llvm this is used because the JIT can place the code and .eh_frame very far apart. Ideally we would fix the jit and delete this option. This is hard. Apart from the confusion, another problem with the current interface is that most callers pass CodeModel::Default, which is bad since MC has no way to map it to the target default if it actually needed to. This patch then replaces the argument with a boolean with a default value. The vast majority of users don't ever need to look at it. In fact, only CodeGen and llvm-mc use it, and llvm-mc just to enable more testing. llvm-svn: 309884
* [Power9] Exploit vector absolute difference instructions on Power 9Stefan Pintilie2017-08-022-1/+52
| | | | | | | | | Power 9 has instructions to do absolute difference (VABSDUB, VABSDUH, VABSDUW) for byte, halfword and word. We should take advantage of these. Differential Revision: https://reviews.llvm.org/D34684 llvm-svn: 309876
* AMDGPU: Restore using MRI to find highest used regsMatt Arsenault2017-08-021-5/+23
| | | | | | | | | | If there are no calls, this is a faster path than searching the entire program for calls. This was supposed to be left in r309781. Fixes unused variable warning. llvm-svn: 309832
* Remove unused includes of MachineLocation.h (NFC)Adrian Prantl2017-08-021-1/+0
| | | | llvm-svn: 309824
* [AArch64] Simplify AES*Tied pseudo expansion (NFC).Florian Hahn2017-08-021-10/+3
| | | | | | | | | | | | | | | | Summary: Suggested by @t.p.northover in https://bugs.llvm.org/show_bug.cgi?id=34015. Reviewers: javed.absar, t.p.northover, rengolin Reviewed By: t.p.northover Subscribers: aemerson, kristof.beyls, llvm-commits, t.p.northover Differential Revision: https://reviews.llvm.org/D36223 llvm-svn: 309821
* AMDGPU: Fix clobbering CSR VGPRs when spilling SGPR to itMatt Arsenault2017-08-023-6/+60
| | | | llvm-svn: 309783
* AMDGPU: Fix emitting encoded callsMatt Arsenault2017-08-022-4/+11
| | | | | | | | | | This was failing on out of bounds access to the extra operands on the s_swappc_b64 beyond those in the instruction definition. This was working, but somehow regressed within the past few weeks, although I don't see any obvious commit. llvm-svn: 309782
* AMDGPU: Analyze callee resource usage in AsmPrinterMatt Arsenault2017-08-025-19/+186
| | | | llvm-svn: 309781
* [AMDGPU] Fix asan error after last commitStanislav Mekhanoshin2017-08-021-1/+1
| | | | | | | | | | The previous change "Turn s_and_saveexec_b64 into s_and_b64 if result is unused" introduced an asan use-after-poison error: the instruction was analyzed after the eraseFromParent() calls. Move the analysis before the erase. llvm-svn: 309779
* AMDGPU: Don't place arguments in emergency stack slotMatt Arsenault2017-08-021-1/+9
| | | | | | | | When finding the fixed offsets for function arguments, this needs to skip over the 4 bytes reserved for the emergency stack slot. llvm-svn: 309776
* [AMDGPU] Turn s_and_saveexec_b64 into s_and_b64 if result is unusedStanislav Mekhanoshin2017-08-012-1/+65
| | | | | | | | | | With SI_END_CF elimination for some nested control flow we can now eliminate saved exec register completely by turning a saveexec version of instruction into just a logical instruction. Differential Revision: https://reviews.llvm.org/D36007 llvm-svn: 309766
* [AMDGPU] Collapse adjacent SI_END_CFStanislav Mekhanoshin2017-08-014-0/+168
| | | | | | | | | | | | | | | | Add a pass to remove redundant S_OR_B64 instructions enabling lanes in the exec. If two SI_END_CF (lowered as S_OR_B64) come together without any vector instructions between them, we can keep only the outer SI_END_CF, given that the CFG is structured and the exec bits of the outer end statement are always at least those of the inner one. This needs to be done before the RA to eliminate saved exec bits registers, but after the register coalescer so that there are no vector register copies between different end cf statements. Differential Revision: https://reviews.llvm.org/D35967 llvm-svn: 309762
* [AArch64] Fix a typo in isExtFreeImpl()Haicheng Wu2017-08-011-1/+1
| | | | | | | | next => not Differential Revision: https://reviews.llvm.org/D36104 llvm-svn: 309748
* [Hexagon] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).Eugene Zelenko2017-08-0125-398/+562
| | | | llvm-svn: 309746
* [AArch64] Rewrite stack frame handling for win64 vararg functionsMartin Storsjo2017-08-011-22/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous attempt, which made do with a single offset in computeCalleeSaveRegisterPairs, wasn't quite enough. The previous attempt only worked as long as CombineSPBump == true (since the offset would be adjusted later in fixupCalleeSaveRestoreStackOffset). Instead include the size for the fixed stack area used for win64 varargs in calculations in emitPrologue/emitEpilogue. The stack consists mainly of three parts: - AFI->getLocalStackSize() - AFI->getCalleeSavedStackSize() - FixedObject Most of the places in the code which previously used the CSStackSize now use PrologueSaveSize instead, which is the sum of the latter two, while some cases which need exactly the middle one use AFI->getCalleeSavedStackSize() explicitly instead of a local variable. In addition to moving the offsetting into emitPrologue/emitEpilogue (which fixes functions with CombineSPBump == false), also set the frame pointer to point to the right location, where the frame pointer and link register actually are stored. In addition to the prologue/epilogue, this also requires changes to resolveFrameIndexReference. Add tests for a function that keeps a frame pointer and another one that uses a VLA. Differential Revision: https://reviews.llvm.org/D35919 llvm-svn: 309744
* AMDGPU: Fix handling of div_scale with undef inputsMatt Arsenault2017-08-011-1/+55
| | | | | | | | | | | | The src0 register must match src1 or src2, but if these were undefined they could end up using different implicit_defed virtual registers. Force these to use one undef vreg or pick the defined other register. Also fixes producing invalid nodes without the right number of inputs when src2 is undef. llvm-svn: 309743
* AMDGPU: Initial implementation of callsMatt Arsenault2017-08-0118-14/+574
| | | | | | | | | Includes a hack to fix the type selected for the GlobalAddress of the function, which will be fixed by changing the default datalayout to use generic pointers for 0. llvm-svn: 309732
* [AMDGPU] Put a function used only inside assert() under NDEBUG.Davide Italiano2017-08-011-0/+4
| | | | llvm-svn: 309723
* [lanai] Add getIntImmCost in LanaiTargetTransformInfo.Jacques Pienaar2017-08-011-0/+27
| | | | | | Add simple int immediate cost function. llvm-svn: 309721
* [X86][SSE] Added missing vector logic intrinsic schedulesSimon Pilgrim2017-08-011-10/+6
| | | | | | | | Improves atom scheduler test coverage (to make it easier to upgrade them for PR32431). Merged SSE_VEC_BIT_ITINS_P + SSE_BIT_ITINS_P as we were interchanging between them. llvm-svn: 309715
* [X86] Use BEXTR/BEXTRI for 64-bit 'and' with a large maskCraig Topper2017-08-011-5/+36
| | | | | | | | | | | | | | Summary: The 64-bit 'and' with immediate instruction only supports a 32-bit immediate. So for larger constants we have to load the constant into a register first. If the immediate happens to be a mask we can use the BEXTRI instruction to perform the masking. We already do something similar using the BZHI instruction from the BMI2 instruction set. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36129 llvm-svn: 309706
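The equivalence behind the BEXTR commit above can be sketched with a software model. This assumes the simplest case, where the mask has the form 2^n - 1 and the extraction starts at bit 0; the exact conditions the patch checks are not stated in the commit message. All function names here are hypothetical, and bextr() is a plain C++ model rather than a compiler intrinsic.

```cpp
#include <cassert>
#include <cstdint>

// True if imm has the form 2^n - 1 with n >= 1 (a contiguous run of low bits).
inline bool isLowBitMask(uint64_t imm) {
  return imm != 0 && ((imm + 1) & imm) == 0;
}

// Number of set bits in a low-bit mask, i.e. the field length n.
inline unsigned maskWidth(uint64_t mask) {
  unsigned n = 0;
  while (mask != 0) {
    ++n;
    mask >>= 1;
  }
  return n;
}

// Software model of BEXTR: extract len bits of src starting at bit start.
inline uint64_t bextr(uint64_t src, unsigned start, unsigned len) {
  if (start >= 64 || len == 0)
    return 0;
  uint64_t shifted = src >> start;
  if (len >= 64)
    return shifted;
  return shifted & ((uint64_t{1} << len) - 1);
}

// BEXTR control word: start in bits 7:0, length in bits 15:8.
inline uint16_t bextrControl(unsigned start, unsigned len) {
  assert(start < 256 && len < 256);
  return static_cast<uint16_t>((len << 8) | start);
}

// x & mask expressed as a bit-field extract, valid for low-bit masks only.
inline uint64_t andWithLowBitMask(uint64_t x, uint64_t mask) {
  assert(isLowBitMask(mask));
  return bextr(x, 0, maskWidth(mask));
}
```

For a 48-bit mask the control word fits in 16 bits, whereas the mask itself does not fit the 32-bit immediate of the 64-bit 'and' instruction.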
* [X86][SSE] Added missing PACKSS/PACKUS intrinsic schedulesSimon Pilgrim2017-08-013-8/+10
| | | | | | | | Improves atom scheduler test coverage (to make it easier to upgrade them for PR32431). Checked on Agner that these actually match the UNPACK schedules, but better to include a separate class llvm-svn: 309701
* [X86][SSSE3] Added missing PHADDS/PHSUBS/PSIGN intrinsic schedulesSimon Pilgrim2017-08-011-2/+2
| | | | llvm-svn: 309699
* [AVX-512] Don't use unmasked VMOVDQU8/16 for 8-bit or 16-bit element stores even when BWI instructions are supported. Always use VMOVDQA32/VMOVDQU32.Craig Topper2017-08-011-13/+29
| | | | | | | | | | | | We were already using the 32 bit element opcode if BWI isn't enabled, but there's no reason to change opcode if we have BWI. We will still use the 8/16 opcodes for masked stores though. This allows us to use the aligned opcode when we can which makes our test output more consistent between different modes. It also reduces the number of isel patterns we need. This is a slight inconsistency with loads which default to 64 bit element opcodes. I'll probably rectify that in a future patch. Differential Revision: https://reviews.llvm.org/D35978 llvm-svn: 309693
* [Mips] Fix for BBIT octeon instructionStrahinja Petrovic2017-08-011-1/+7
| | | | | | | | | | | This patch enables control flow optimization for variations of BBIT instruction. In this case optimization removes unnecessary branch after BBIT instruction. Differential Revision: https://reviews.llvm.org/D35359 llvm-svn: 309679
* [Hexagon] Convert HVX vector constants of i1 to i8Krzysztof Parzyszek2017-08-011-0/+36
| | | | | | | | | Certain operations require vector of i1 values. However, for Hexagon architecture compatibility, they need to be represented as vector of i8. Patch by Suyog Sarda. llvm-svn: 309677
* AMDGPU/GlobalISel: Add support for amdgpu_vs calling conventionTom Stellard2017-08-011-4/+24
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D35916 llvm-svn: 309675
* [AVX-512] Add unmasked subvector inserts and extract to the execution domain tables.Craig Topper2017-07-311-0/+24
| | | | llvm-svn: 309632
* [X86][MMX] Added custom lowering action for MMX SELECT (PR30418)Konstantin Belochapka2017-07-311-0/+13
| | | | | | | Fix for pr30418 - error in backend: Cannot select: t17: x86mmx = select_cc t2, Constant:i64<0>, t7, t8, seteq:ch Differential Revision: https://reviews.llvm.org/D34661 llvm-svn: 309614
* [AVX-512] Remove patterns that select vmovdqu8/16 for unmasked loads. Prefer vmovdqa64/vmovdqu64 instead.Craig Topper2017-07-311-11/+18
| | | | | | | | | | | | These were taking priority over the aligned load instructions since there is no vmovdqa8/16. I don't think there is really a difference between aligned and unaligned on newer cpus so I don't think it matters which instructions we use. But with this change we reduce the size of the isel table a little and we allow the aligned information to pass through to the evex->vex pass and produce the same output as avx/avx2 in some cases. I also generally dislike patterns rooted in a bitcast which these were. Differential Revision: https://reviews.llvm.org/D35977 llvm-svn: 309589
* Strip trailing whitespace. NFCI.Simon Pilgrim2017-07-311-7/+7
| | | | llvm-svn: 309584
* Fix typo in comment.Simon Pilgrim2017-07-311-1/+1
| | | | llvm-svn: 309583
* [GISel]: Support Widening G_ICMP's destination operand.Aditya Nandakumar2017-07-312-6/+10
| | | | | | | | | Updated AArch64 to widen destination to s32. https://reviews.llvm.org/D35737 Reviewed by Tim llvm-svn: 309579
* Do not recombine FMA when that is not needed.Amaury Sechet2017-07-311-4/+16
| | | | | | | | | | | | Summary: As per title. This creates useless recombines. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33848 llvm-svn: 309578
* Exclude more unused functions from release build.Florian Hahn2017-07-311-0/+4
| | | | llvm-svn: 309576
* [Cost] Rename getReductionCost() to getArithmeticReductionCost(), NFC.Alexey Bataev2017-07-312-4/+5
| | | | llvm-svn: 309563
* Guard print() functions only used by dump() functions.Florian Hahn2017-07-313-1/+9
| | | | | | | | | | | | | | | | | | | Summary: Since r293359, most dump() functions are only defined when `!defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)` holds. print() functions only used by dump() functions are now unused in release builds, generating lots of warnings. This patch only defines some print() functions if they are used. Reviewers: MatzeB Reviewed By: MatzeB Subscribers: arsenm, mzolotukhin, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D35949 llvm-svn: 309553
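The guard described in this commit can be sketched as follows. Only the preprocessor condition comes from the commit message; the Range type, the EXAMPLE_ENABLE_DUMP macro and the function names are hypothetical and exist only for illustration.

```cpp
#include <iostream>
#include <ostream>

// Same condition the commit message quotes for dump() functions.
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
#define EXAMPLE_ENABLE_DUMP 1
#else
#define EXAMPLE_ENABLE_DUMP 0
#endif

// Illustrative half-open integer range.
struct Range {
  int Begin;
  int End;
};

// Always available: ordinary functionality is not guarded.
inline int rangeSize(const Range &R) { return R.End - R.Begin; }

#if EXAMPLE_ENABLE_DUMP
// File-local printer used only by dumpRange(). Without the guard it would be
// an unused static function in release builds and trigger a warning.
static void printRange(std::ostream &OS, const Range &R) {
  OS << "[" << R.Begin << ", " << R.End << ")";
}

// Debug-only entry point, defined under the same condition as its helper.
void dumpRange(const Range &R) {
  printRange(std::cerr, R);
  std::cerr << '\n';
}
#endif
```

Placing the helper and its only caller under one condition keeps the two in step, so neither can be compiled without the other.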
* [X86][AVX512] Add masked MOVS[S|D] patternsGuy Blank2017-07-311-0/+16
| | | | | | | | | Added patterns to recognize that an AND with 1 on the mask of a scalar masked move is not needed, since only the lowest bit is relevant for the instruction. Differential Revision: https://reviews.llvm.org/D35897 llvm-svn: 309546
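The reasoning in the masked MOVS[S|D] commit above can be illustrated with a small scalar model. This is a sketch under assumptions: it models only the low element of the result, ignores how the upper vector elements are handled, and uses hypothetical function names.

```cpp
#include <cstdint>

// Model of the low element of a masked scalar move: only bit 0 of the mask
// selects between the source value and the pass-through value.
inline float maskedScalarMove(uint8_t mask, float src, float passthru) {
  return (mask & 1u) != 0 ? src : passthru;
}

// Checks, over every 8-bit mask, that masking with 1 first never changes the
// result of the move, which is why the explicit AND can be dropped.
inline bool andOneIsRedundant(float src, float passthru) {
  for (unsigned m = 0; m < 256; ++m) {
    uint8_t mask = static_cast<uint8_t>(m);
    uint8_t premasked = static_cast<uint8_t>(mask & 1u);
    if (maskedScalarMove(premasked, src, passthru) !=
        maskedScalarMove(mask, src, passthru))
      return false;
  }
  return true;
}
```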