path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86] Don't emit MULX by default with BMI2Craig Topper2018-12-121-49/+17
| | | | | | | | | | MULX has somewhat improved register allocation constraints compared to the legacy MUL instruction. Both output registers are encoded instead of fixed to EAX/EDX, but EDX is used as input. It also doesn't touch flags. Unfortunately, the encoding is longer. Preferring it whenever BMI2 is enabled is probably not optimal. Choosing it should somehow be a function of register allocation constraints like converting adds to three address. gcc and icc definitely don't pick MULX by default. Not sure what if any rules they have for using it. Differential Revision: https://reviews.llvm.org/D55565 llvm-svn: 348975
* [AMDGPU] Support for "uniform-work-group-size" attributeAakanksha Patil2018-12-122-7/+65
| | | | | | | | Updated the annotate-kernel-features pass to support the propagation of uniform-work-group attribute from the kernel to the called functions. Once this pass is run, all kernels, even the ones which initially did not have the attribute, will be able to indicate whether or not they have uniform work group size depending on the value of the attribute. Differential Revision: https://reviews.llvm.org/D50200 llvm-svn: 348971
* [AMDGPU] Emit MessagePack HSA Metadata for v3 code objectScott Linder2018-12-1210-142/+831
| | | | | | | | | Continue to present HSA metadata as YAML in ASM and when output by tools (e.g. llvm-readobj), but encode it in MessagePack in the code object. Differential Revision: https://reviews.llvm.org/D48179 llvm-svn: 348963
* [X86] Emit SBB instead of SETCC_CARRY from LowerSELECT. Break false ↵Craig Topper2018-12-122-8/+22
| | | | | | | | | | | | dependency on the SBB input. I'm hoping we can just replace SETCC_CARRY with SBB. This is another step towards that. I've explicitly used zero as the input to the setcc to avoid a false dependency that we've had with the SETCC_CARRY. I changed one of the patterns that used NEG to instead use an explicit compare with 0 on the LHS. We needed the zero anyway to avoid the false dependency. The negate would clobber its input register. By using a CMP we can avoid that which could be useful. Differential Revision: https://reviews.llvm.org/D55414 llvm-svn: 348959
* [SelectionDAG] Add a generic isSplatValue functionSimon Pilgrim2018-12-122-50/+18
| | | | | | | | | | | | | | This patch introduces a generic function to determine whether a given vector type is known to be a splat value for the specified demanded elements, recursing up the DAG looking for BUILD_VECTOR or VECTOR_SHUFFLE splat patterns. It also keeps track of the elements that are known to be UNDEF - it returns true if all the demanded elements are UNDEF (as this may be useful under some circumstances), so this needs to be handled by the caller. A wrapper variant is also provided that doesn't take the DemandedElts or UndefElts arguments for cases where we just want to know if the SDValue is a splat or not (with/without UNDEFS). I had hoped to completely remove the X86 local version of this function, but I'm seeing some regressions in shift/rotate codegen that will take a little longer to fix and I hope to get this in sooner so I can continue work on PR38243 which needs more capable splat detection. Differential Revision: https://reviews.llvm.org/D55426 llvm-svn: 348953
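The splat check described above can be illustrated with a minimal sketch. It is a toy model over a flat element list, not the SelectionDAG implementation: it does not recurse through BUILD_VECTOR or VECTOR_SHUFFLE nodes, and the names `ToyElt` and `toyIsSplat` are hypothetical. It only shows the demanded-elements and undef-tracking contract from the commit message, including the case where all demanded elements are UNDEF.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one vector lane: either undef or a known value.
struct ToyElt {
  bool IsUndef;
  int Value;
};

// Returns true if every demanded, non-undef lane holds the same value.
// UndefElts is filled with the demanded lanes that are undef. As in the
// commit message, the function also returns true when all demanded lanes
// are undef, so the caller has to inspect UndefElts to handle that case.
bool toyIsSplat(const std::vector<ToyElt> &Elts,
                const std::vector<bool> &Demanded,
                std::vector<bool> &UndefElts) {
  assert(Elts.size() == Demanded.size() && "one demanded bit per lane");
  UndefElts.assign(Elts.size(), false);
  bool HaveValue = false;
  int SplatValue = 0;
  for (std::size_t I = 0; I < Elts.size(); ++I) {
    if (!Demanded[I])
      continue; // Lanes that are not demanded never affect the answer.
    if (Elts[I].IsUndef) {
      UndefElts[I] = true;
      continue;
    }
    if (HaveValue && Elts[I].Value != SplatValue)
      return false; // Two demanded lanes disagree.
    HaveValue = true;
    SplatValue = Elts[I].Value;
  }
  return true;
}
```

With lanes {7, undef, 7, 9} and only the first three demanded, the sketch reports a splat and marks lane 1 as undef; demanding the fourth lane as well makes it report no splat.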
* [NVPTX] do not rely on cached subtarget info.Artem Belevich2018-12-122-13/+14
| | | | | | | | | | | | | | If a module has function references, but no functions themselves, we may end up never calling runOnMachineFunction and therefore would never initialize nvptxSubtarget field which would eventually cause a crash. Instead of relying on nvptxSubtarget being initialized by one of the methods, retrieve subtarget info directly. Differential Revision: https://reviews.llvm.org/D55580 llvm-svn: 348952
* [x86] allow 8-bit adds to be promoted by convertToThreeAddress() to form LEASanjay Patel2018-12-123-12/+23
| | | | | | | | | | This extends the code that handles 16-bit add promotion to form LEA to also allow 8-bit adds. That allows us to combine add ops with register moves and save some instructions. This is another step towards allowing add truncation in generic DAGCombiner (see D54640). Differential Revision: https://reviews.llvm.org/D55494 llvm-svn: 348946
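One reason an 8-bit add can be carried out by a wider operation such as LEA is that the low 8 bits of a sum depend only on the low 8 bits of the inputs. The sketch below checks that property in plain C++. It is an illustration of the arithmetic only, not of convertToThreeAddress(); the function names and the "garbage" upper-bit parameters are hypothetical.

```cpp
#include <cstdint>

// Reference result: an 8-bit add that wraps modulo 256.
uint8_t add8Direct(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>(A + B);
}

// Places each 8-bit value in the low byte of a 32-bit word whose upper
// 24 bits are arbitrary, adds at 32 bits, then truncates back to 8 bits.
uint8_t add8ViaWideAdd(uint8_t A, uint8_t B, uint32_t GarbageA,
                       uint32_t GarbageB) {
  uint32_t WideA = (GarbageA & 0xFFFFFF00u) | A;
  uint32_t WideB = (GarbageB & 0xFFFFFF00u) | B;
  uint32_t Sum = WideA + WideB; // Carries only propagate upward.
  return static_cast<uint8_t>(Sum);
}
```

For any choice of upper bits, `add8ViaWideAdd` agrees with `add8Direct`, which is why the wide result can be used after truncation.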
* [AMDGPU] Extend the SI Load/Store optimizer to combine more things.Neil Henning2018-12-124-238/+544
| | | | | | | | | | I've extended the load/store optimizer to be able to produce dwordx3 loads and stores. This change allows many more load/stores to be combined, and results in much more optimal code for our hardware. Differential Revision: https://reviews.llvm.org/D54042 llvm-svn: 348937
* [mips] Enable using of integrated assembler in all cases.Simon Atanasyan2018-12-121-21/+1
| | | | llvm-svn: 348934
* [AMDGPU] Set metadata access for explicit sectionPiotr Sobczak2018-12-122-0/+12
| | | | | | | | | | | | | | | | | | | Summary: This patch provides a means to set Metadata section kind for a global variable, if its explicit section name is prefixed with ".AMDGPU.metadata." This could be useful to make the global variable go to an ELF section without any section flags set. Reviewers: dstuttard, tpr, kzhuravl, nhaehnle, t-tye Reviewed By: dstuttard, kzhuravl Subscribers: llvm-commits, arsenm, jvesely, wdng, yaxunl, t-tye Differential Revision: https://reviews.llvm.org/D55267 llvm-svn: 348922
* [ARM GlobalISel] Select load/store for Thumb2Diana Picus2018-12-122-6/+34
| | | | | | | | | | | | Unfortunately we can't use TableGen for this because it doesn't yet support predicates on the source pattern root. Therefore, add a bit of handwritten code to the instruction selector to handle the most basic cases. Also mark them as legal and extract their legalizer test cases to a new test file. llvm-svn: 348920
* [SystemZ] Minor cleanup of SchedModelsJonas Paulsson2018-12-122-21/+21
| | | | | | | Some fixes of a few InstRWs for z13 and z14. Review: Ulrich Weigand llvm-svn: 348917
* [X86] Combine vpmovdw+vpacksswb into vpmovdb.Craig Topper2018-12-121-8/+8
| | | | | | This is similar to the combine we already have for vpmovdw+vpackuswb. llvm-svn: 348910
* [COFF, ARM64] Emit COFF function headerMandeep Singh Grang2018-12-111-5/+27
| | | | | | | | | | | | | | | | | | | | Summary: Emit COFF header when printing out the function. This is important as the header contains two important pieces of information: the storage class for the symbol and the symbol type information. This bit of information is required for the linker to correctly identify the type of symbol that it is dealing with. This patch mimics X86 and ARM COFF behavior for function header emission. Reviewers: rnk, mstorsjo, compnerd, TomTan, ssijaric Reviewed By: mstorsjo Subscribers: dmajor, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D55535 llvm-svn: 348875
* Fix not correct imm operand assertion for SUB32ri in ↵Craig Topper2018-12-111-1/+2
| | | | | | | | | | | | | | | | | | | | | | | X86CondBrFolding::analyzeCompare Summary: X86CondBrFolding::analyzeCompare can encounter a SUB32ri instruction whose operand is a global address, as below: %733:gr32 = SUB32ri %62:gr32(tied-def 0), @img2buf_normal, implicit-def $eflags JNE_1 %bb.41, implicit $eflags So the assertion "assert(MI.getOperand(ValueIndex).isImm() && "Expecting Imm operand")" is not correct. Change the assert to an if, so that X86CondBrFolding::analyzeCompare returns false (no compare found) in this case. Patch by Jianping Chen Reviewers: smaslov, LuoYuanke, liutianle, Jianping Reviewed By: Jianping Subscribers: lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D54250 llvm-svn: 348853
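The shape of this fix, replacing an assertion with an early return, can be sketched with a toy operand type. `ToyOperand` and `toyAnalyzeCompare` are hypothetical names and do not mirror the real MachineOperand or X86CondBrFolding code; the sketch only assumes that an operand is either an immediate or something else, such as a global address.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical operand: either an immediate or a global address.
struct ToyOperand {
  enum Kind { Immediate, GlobalAddress } K;
  int64_t Imm;

  bool isImm() const { return K == Immediate; }
  int64_t getImm() const {
    assert(isImm() && "getImm() called on a non-immediate operand");
    return Imm;
  }
};

// Returns false when no analyzable compare is found, instead of asserting
// that the operand must be an immediate. CmpValue is written only on success.
bool toyAnalyzeCompare(const ToyOperand &Op, int64_t &CmpValue) {
  if (!Op.isImm())
    return false; // For example a global address: give up gracefully.
  CmpValue = Op.getImm();
  return true;
}
```

A caller passing a global-address operand gets `false` back and can skip the transformation, where the assert version would abort in debug builds.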
* [x86] clean up code for converting 16-bit ops to LEA; NFCSanjay Patel2018-12-112-62/+60
| | | | | | | | As discussed in D55494, we want to extend this to handle 8-bit ops too, but that could be extended further to enable this on 32-bit systems too. llvm-svn: 348851
* [x86] remove dead code for 16-bit LEA formation; NFCSanjay Patel2018-12-111-57/+13
| | | | | | | | | | | As discussed in: D55494 ...this code has been disabled/dead for a long time (the code references Athlon and Pentium 4), and there's almost no chance that it will be used given the last decade of uarch evolution. Also, in SDAG we promote 16-bit ops to 32-bit, so there's almost no way to test this code any more. llvm-svn: 348845
* [PPC][NFC] store operands are dst not srcMartell Malone2018-12-111-9/+9
| | | | | | Differential Revision: https://reviews.llvm.org/D55502 llvm-svn: 348826
* [WebAssembly] Add '.eventtype' directive supportHeejin Ahn2018-12-113-33/+70
| | | | | | | | | | | | | | Summary: This patch supports `.eventtype` directive printing and parsing in the same syntax with `.functype`. Reviewers: aardappel, sbc100 Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55353 llvm-svn: 348818
* [WebAssembly] TargetStreamer cleanup (NFC)Heejin Ahn2018-12-113-42/+32
| | | | | | | | | | | | | | | | | | Summary: - Unify mixed argument names (`Symbol` and `Sym`) to `Sym` - Changed `MCSymbolWasm*` argument of `emit***` functions to `const MCSymbolWasm*`. It seems not very intuitive that emit function in the streamer modifies symbol contents. - Moved empty function bodies to the header - clang-format Reviewers: aardappel, dschuff, sbc100 Subscribers: jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55347 llvm-svn: 348816
* [GISel]: Refactor MachineIRBuilder to allow passing additional parameters to ↵Aditya Nandakumar2018-12-112-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | build Instrs https://reviews.llvm.org/D55294 Previously MachineIRBuilder::buildInstr used to accept variadic arguments for sources (which were either unsigned or MachineInstrBuilder). While this worked well in common cases, it doesn't allow us to build instructions that have multiple destinations. Additionally, passing in other optional parameters at the end (such as flags) is not trivially possible. Also a trivial call such as B.buildInstr(Opc, Reg1, Reg2, Reg3) can be interpreted differently based on the opcode (2 defs + 1 src for unmerge vs 1 def + 2 srcs). This patch refactors buildInstr to buildInstr(Opc, ArrayRef<DstOps>, ArrayRef<SrcOps>) where DstOps and SrcOps are typed unions that know how to add themselves to MachineInstrBuilder. After this patch, most invocations would look like B.buildInstr(Opc, {s32, DstReg}, {SrcRegs..., SrcMIBs..}); Now all the other calls (such as buildAdd, buildSub etc) forward to buildInstr. It also makes it possible to build instructions with multiple defs. Additionally in a subsequent patch, we should make it possible to add flags directly while building instructions. Additionally, the main buildInstr method is now virtual, so other builders only have to override buildInstr (for, say, constant folding/CSEing), which is straightforward. Also attached here (https://reviews.llvm.org/F7675680) is a clang-tidy patch that should upgrade the API calls if necessary. llvm-svn: 348815
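The ambiguity that motivates this refactoring can be shown with a toy builder. This is a sketch under stated assumptions, not the MachineIRBuilder API: `ToyInstr` and `toyBuildInstr` are hypothetical, registers are plain integers, and the typed-union DstOp/SrcOp classes are reduced to two separate lists. The point is only that separating destinations from sources makes the number of defs explicit at the call site.

```cpp
#include <initializer_list>
#include <vector>

// Hypothetical instruction record with explicit def and use lists.
struct ToyInstr {
  unsigned Opcode;
  std::vector<unsigned> Defs;
  std::vector<unsigned> Uses;
};

// Destinations and sources are passed as two separate lists, so a call
// cannot be misread as "2 defs + 1 src" versus "1 def + 2 srcs".
ToyInstr toyBuildInstr(unsigned Opcode, std::initializer_list<unsigned> Dsts,
                       std::initializer_list<unsigned> Srcs) {
  return ToyInstr{Opcode, Dsts, Srcs};
}
```

For example, `toyBuildInstr(Opc, {10, 11}, {12})` describes two defs and one use, while `toyBuildInstr(Opc, {10}, {11, 12})` describes one def and two uses, regardless of the opcode.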
* [Hexagon] Couple of fixes in optimize addressing modeKrzysztof Parzyszek2018-12-101-16/+21
| | | | | | | | | | - Check if an operand is an immediate before calling getImm. Some operands that take constant values can actually have global symbols or other constant expressions. - When a load-constant instruction can be folded into users, make sure to only delete it when all users have been successfully converted. llvm-svn: 348802
* Revert "[Hexagon] Check if operand is an immediate before getImm"Krzysztof Parzyszek2018-12-101-15/+12
| | | | | | This reverts r348787. The patch wasn't quite correct. llvm-svn: 348792
* [GlobalISel] Restrict G_MERGE_VALUES capability and replace with new opcodes.Amara Emerson2018-12-105-9/+159
| | | | | | | | | | | | This patch restricts the capability of G_MERGE_VALUES, and uses the new G_BUILD_VECTOR and G_CONCAT_VECTORS opcodes instead in the appropriate places. This patch also includes AArch64 support for selecting G_BUILD_VECTOR of <4 x s32> and <2 x s64> vectors. Differential Revisions: https://reviews.llvm.org/D53629 llvm-svn: 348788
* [Hexagon] Check if operand is an immediate before getImmKrzysztof Parzyszek2018-12-101-12/+15
| | | | llvm-svn: 348787
* [Hexagon] Add patterns for any_extend from i1 and short vectors of i1Krzysztof Parzyszek2018-12-101-29/+28
| | | | llvm-svn: 348785
* [x86] fix formatting; NFCSanjay Patel2018-12-101-19/+17
| | | | | | | | This should really be generalized to allow increment and/or we should replace it by using ISD::matchUnaryPredicate(). See D55515 for context. llvm-svn: 348776
* [AArch64] Refactor the Exynos scheduling predicatesEvandro Menezes2018-12-106-329/+249
| | | | | | | | | Refactor the scheduling predicates based on `MCInstPredicate`. In this case, for the Exynos processors. Differential revision: https://reviews.llvm.org/D55345 llvm-svn: 348774
* [AMDGPU] Change the l1 flush instruction for AMDPAL/MESA3D.Neil Henning2018-12-101-1/+7
| | | | | | | | | | This commit changes which l1 flush instruction is used for AMDPAL and MESA3d workloads to flush the entire l1 cache instead of just the volatile lines. Differential Revision: https://reviews.llvm.org/D55367 llvm-svn: 348771
* [AArch64] Refactor the scheduling predicatesEvandro Menezes2018-12-101-61/+325
| | | | | | | | | Refactor the scheduling predicates based on `MCInstPredicate`. Augment the number of helper predicates used by processor specific predicates. Differential revision: https://reviews.llvm.org/D55375 llvm-svn: 348768
* [AMDGPU] Add new Mode Register pass - minor fixTim Corringham2018-12-101-1/+1
| | | | | | | Trivial change to add parentheses to an expression to avoid a sanitizer error in SIModeRegister.cpp, which was committed earlier. llvm-svn: 348767
* [AVX512] Update typo in commentCameron McInally2018-12-101-1/+1
| | | | | | | | Should be "Sae" for "Suppress All Exceptions". NFC llvm-svn: 348763
* [mips][mc] Emit R_{MICRO}MIPS_JALR when expanding jal to jalrVladimir Stefanovic2018-12-101-3/+21
| | | | | | | | | | | When replacing jal with jalr, also emit '.reloc R_MIPS_JALR' (R_MICROMIPS_JALR for micromips). The linker might then be able to turn jalr into a direct call. Add '-mips-jalr-reloc' to enable/disable this feature (default is true). Differential revision: https://reviews.llvm.org/D55292 llvm-svn: 348760
* [AMDGPU] Add new Mode Register passTim Corringham2018-12-1011-11/+487
| | | | | | | | | | | | | | | A new pass to manage the Mode register. Currently this just manages the floating point double precision rounding requirements, but is intended to be easily extended to encompass all Mode register settings. The immediate motivation comes from the requirement to use the round-to-zero rounding mode for the 16 bit interpolation instructions, where the rounding mode setting is shared between 16 and 64 bit operations. llvm-svn: 348754
* [X86] Fix AvoidStoreForwardingBlocks pass for negative displacementsNikita Popov2018-12-101-1/+1
| | | | | | | | | | | | | | | | Fixes https://bugs.llvm.org/show_bug.cgi?id=39926. The size of the first copy was computed as std::abs(std::abs(LdDisp2) - std::abs(LdDisp1)), which results in skipped bytes if the signs of LdDisp2 and LdDisp1 differ. As far as I can see, this should just be LdDisp2 - LdDisp1. The case where LdDisp1 > LdDisp2 is already handled in the code above, in which case LdDisp2 is set to LdDisp1 and this subtraction will evaluate to Size1 = 0, which is the correct value to skip an overlapping copy. Differential Revision: https://reviews.llvm.org/D55485 llvm-svn: 348750
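The two size computations quoted in this commit message can be compared directly. In the sketch below only the two expressions come from the commit; the helper names `oldSize1` and `newSize1` are hypothetical, and the new form assumes LdDisp2 >= LdDisp1, which the message says earlier code guarantees.

```cpp
#include <cstdint>
#include <cstdlib>

// Size of the first copy as computed before the fix.
int64_t oldSize1(int64_t LdDisp1, int64_t LdDisp2) {
  return std::abs(std::abs(LdDisp2) - std::abs(LdDisp1));
}

// Size of the first copy after the fix. Assumes LdDisp2 >= LdDisp1.
int64_t newSize1(int64_t LdDisp1, int64_t LdDisp2) {
  return LdDisp2 - LdDisp1;
}
```

With LdDisp1 = -4 and LdDisp2 = 4, the old expression yields 0 while the displacements are actually 8 bytes apart, which is the skipped-bytes case. When both displacements have the same sign the two expressions agree.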
* [X86] Merge addcarryx/addcarry intrinsic into a single addcarry intrinsic.Craig Topper2018-12-101-6/+4
| | | | | | | | Both intrinsics do the exact same thing so we really only need one. Earlier in the 8.0 cycle we changed the signature of this intrinsic without renaming it. But it looks difficult to get the autoupgrade code to allow me to merge the intrinsics and change the signature at the same time. So I've renamed the intrinsic slightly for the new merged intrinsic. I'm skipping autoupgrading from the previous new to 8.0 signature. I've also renamed the subborrow for consistency. llvm-svn: 348737
* [AMDGPU] Fix discarded result of addAttributeBrian Gesiak2018-12-091-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: `llvm::AttributeList` and `llvm::AttributeSet` are immutable, and so methods defined on these classes, such as `addAttribute`, return a new immutable object with the attribute added. In https://reviews.llvm.org/D55217 I attempted to annotate methods such as `addAttribute` with `LLVM_NODISCARD`, since calling these methods has no side-effects, and so ignoring the result that is returned is almost certainly a programmer error. However, committing the change resulted in new warnings in the AMDGPU target. The AMDGPU simplify libcalls pass added in https://reviews.llvm.org/D36436 attempts to add the readonly and nounwind attributes to simplified library functions, but instead calls the `addAttribute` methods and ignores the result. Modify the simplify libcalls pass to actually add the nounwind and readonly attributes. Also update the simplify libcalls test to assert that these attributes are actually being set. Reviewers: rampitec, vpykhtin, rnk Reviewed By: rampitec Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D55435 llvm-svn: 348732
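The bug class described here, dropping the result of a method on an immutable object, can be reproduced with a small stand-in type. `ToyAttrList` is hypothetical and is not `llvm::AttributeList`; it only assumes the same value semantics, and it uses the standard `[[nodiscard]]` attribute in place of `LLVM_NODISCARD`.

```cpp
#include <set>
#include <string>

// Hypothetical immutable-style attribute container.
class ToyAttrList {
  std::set<std::string> Attrs;

public:
  // Returns a new list with Name added; the receiver is left unchanged.
  [[nodiscard]] ToyAttrList addAttribute(const std::string &Name) const {
    ToyAttrList Copy = *this;
    Copy.Attrs.insert(Name);
    return Copy;
  }

  bool hasAttribute(const std::string &Name) const {
    return Attrs.count(Name) != 0;
  }
};

// Buggy pattern: the returned list is dropped, so nothing is added. The
// cast to void silences the nodiscard warning for the sake of the sketch.
ToyAttrList addReadOnlyBuggy(ToyAttrList L) {
  (void)L.addAttribute("readonly");
  return L;
}

// Fixed pattern: the result is assigned back before use.
ToyAttrList addReadOnlyFixed(ToyAttrList L) {
  L = L.addAttribute("readonly");
  return L;
}
```

Without the `(void)` cast, a compiler that honors `[[nodiscard]]` would warn on the buggy call, which is how the annotation surfaced the problem described in the commit.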
* [X86] If the carry input to an addcarry/subborrow intrinsic is known to be ↵Craig Topper2018-12-092-10/+19
| | | | | | | | 0, emit a flag setting ADD/SUB instead of ADC/SBB. Previously we had to take the carry in and add -1 to it to set the carry flag so we could use it with ADC/SBB. But if we know it's 0 then we don't need to bother. This should go a long way towards fixing PR24545. llvm-svn: 348727
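The "add -1 to the carry in" trick and the simplification for a zero carry-in can be modelled with ordinary integer arithmetic. This is a sketch of the flag behaviour only, under the assumption that the carry-in arrives as an 8-bit value; the function and struct names are hypothetical and no X86 lowering code is reproduced.

```cpp
#include <cstdint>

// Carry-out of an 8-bit ADD of CarryIn and -1 (0xFF): set exactly when
// CarryIn is non-zero. This is how a value is turned into the carry flag.
bool carryFlagAfterAddMinusOne(uint8_t CarryIn) {
  unsigned Wide = static_cast<unsigned>(CarryIn) + 0xFFu;
  return Wide > 0xFFu;
}

struct AddResult {
  uint32_t Sum;
  bool CarryOut;
};

// Models ADC: 32-bit add with a carry-in, producing a carry-out.
AddResult addWithCarry(uint32_t A, uint32_t B, bool CarryIn) {
  uint64_t Wide = uint64_t(A) + uint64_t(B) + (CarryIn ? 1u : 0u);
  return {static_cast<uint32_t>(Wide), (Wide >> 32) != 0};
}

// Models a plain flag-setting ADD with no carry-in.
AddResult addNoCarryIn(uint32_t A, uint32_t B) {
  uint64_t Wide = uint64_t(A) + uint64_t(B);
  return {static_cast<uint32_t>(Wide), (Wide >> 32) != 0};
}
```

When the carry-in is known to be 0, `addWithCarry(A, B, false)` and `addNoCarryIn(A, B)` give the same sum and carry-out, so the step that materializes the carry flag is unnecessary.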
* Remove unneeded dependency from lib/Target/X86/Utils/ to lib/IR (aka Core).Nico Weber2018-12-091-1/+1
| | | | | | | The dependency was added in r213995 in response to r213986 which did make X86/Utils depend on IR, but r256680 later removed that dependency again. llvm-svn: 348724
* [x86] don't try to convert add with undef operands to LEASanjay Patel2018-12-092-37/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing code tries to handle an undef operand while transforming an add to an LEA, but it's incomplete because we will crash on the i16 test with the debug output shown below. It's better to just give up instead. Really, GlobalISel should have folded these before we could get into trouble. # Machine code for function add_undef_i16: NoPHIs, TracksLiveness, Legalized, RegBankSelected, Selected bb.0 (%ir-block.0): liveins: $edi %1:gr32 = COPY killed $edi %0:gr16 = COPY %1.sub_16bit:gr32 %5:gr64_nosp = IMPLICIT_DEF %5.sub_16bit:gr64_nosp = COPY %0:gr16 %6:gr64_nosp = IMPLICIT_DEF %6.sub_16bit:gr64_nosp = COPY %2:gr16 %4:gr32 = LEA64_32r killed %5:gr64_nosp, 1, killed %6:gr64_nosp, 0, $noreg %3:gr16 = COPY killed %4.sub_16bit:gr32 $ax = COPY killed %3:gr16 RET 0, implicit killed $ax # End machine code for function add_undef_i16. *** Bad machine code: Reading virtual register without a def *** - function: add_undef_i16 - basic block: %bb.0 (0x7fe6cd83d940) - instruction: %6.sub_16bit:gr64_nosp = COPY %2:gr16 - operand 1: %2:gr16 LLVM ERROR: Found 1 machine code errors. Differential Revision: https://reviews.llvm.org/D54710 llvm-svn: 348722
* [X86] Extend pfm counter coverage for llvm-exegesisSimon Pilgrim2018-12-091-5/+29
| | | | | | Extension to rL348617, turns out llvm-exegesis doesn't need to match the perf counter name against a scheduler model resource name - so I've added a few more counters that I could find in the libpfm4 source code (and fixed a typo in the knl/knm retired_uops counter - which uses 'all' instead of 'any'). llvm-svn: 348721
* AMDGPU: Fix offsets for < 4-byte aggregate kernel argumentsMatt Arsenault2018-12-071-4/+7
| | | | | | | | We were still using the rounded down offset and alignment even though they aren't handled because you can't trivially bitcast the loaded value. llvm-svn: 348658
* [Hexagon] Fix post-ra expansion of PS_wselectKrzysztof Parzyszek2018-12-071-1/+0
| | | | llvm-svn: 348655
* Fix unused variable warning. NFCI.Simon Pilgrim2018-12-071-2/+2
| | | | llvm-svn: 348649
* [WebAssembly] clang-format/clang-tidy AsmParser (NFC)Heejin Ahn2018-12-071-60/+88
| | | | | | | | | | | | | | | | | | | Summary: - LLVM clang-format style doesn't allow one-line ifs. - LLVM clang-tidy style says method names should start with a lowercase letter. But currently WebAssemblyAsmParser's parent class MCTargetAsmParser is mixing lowercase and uppercase method names itself so overridden methods cannot be renamed now. - Changed else ifs after returns to ifs. - Added some newlines for readability. Reviewers: aardappel, sbc100 Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55350 llvm-svn: 348648
* Delete registerScope functionHeejin Ahn2018-12-071-20/+2
| | | | | | `unregisterScope()` is not currently used, so removing it. llvm-svn: 348647
* [X86] Replace instregex with instrs list. NFCI. Simon Pilgrim2018-12-073-3/+3
| | | | llvm-svn: 348626
* AMDGPU: Allow f32 types for llvm.amdgcn.s.buffer.loadMatt Arsenault2018-12-072-5/+12
| | | | llvm-svn: 348625
* [X86] Initialize and Register X86CondBrFoldingPassCraig Topper2018-12-073-4/+8
| | | | | | | | Initialize and register X86CondBrFoldingPass so that it can be run with the --run-pass option. This makes it possible to test the incorrect assertion in the analyzeCompare function for SUB32ri when its operand is not an immediate. Patch by Jianping Chen Differential Revision: https://reviews.llvm.org/D55412 llvm-svn: 348620
* AMDGPU: Remove llvm.SI.tbuffer.storeMatt Arsenault2018-12-072-67/+0
| | | | llvm-svn: 348619