path: root/llvm/lib/Target
Commit message, Author, Age, Files, Lines
* [X86][SSE] SimplifyDemandedBitsForTargetNode - PCMPGT(0,X) sign maskSimon Pilgrim2019-02-041-0/+7
| | | | | | | | For PCMPGT(0, X) patterns where we only demand the sign bit (e.g. BLENDV or MOVMSK) then we can use X directly. Differential Revision: https://reviews.llvm.org/D57667 llvm-svn: 353051
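The identity behind this fold can be checked one lane at a time. The following is a minimal scalar sketch in plain C++, not LLVM code; `pcmpgtLane`, `signBit` and `sameSignBit` are hypothetical helper names, and a single 32-bit lane is assumed.

```cpp
#include <cstdint>

// Scalar model of one 32-bit lane of PCMPGT(A, B):
// all ones if A > B (signed compare), otherwise zero.
constexpr uint32_t pcmpgtLane(int32_t A, int32_t B) {
  return A > B ? 0xFFFFFFFFu : 0u;
}

// Extract the sign (top) bit of a 32-bit lane.
constexpr bool signBit(uint32_t Lane) { return (Lane >> 31) != 0; }

// 0 > X holds exactly when X is negative, so the sign bit of
// PCMPGT(0, X) is the same as the sign bit of X itself.
constexpr bool sameSignBit(int32_t X) {
  return signBit(pcmpgtLane(0, X)) == signBit(static_cast<uint32_t>(X));
}

static_assert(sameSignBit(-1) && sameSignBit(0) && sameSignBit(INT32_MIN) &&
                  sameSignBit(INT32_MAX),
              "sign bit of PCMPGT(0, X) matches sign bit of X");
```

A consumer that reads only the sign bit, such as BLENDV or MOVMSK, therefore sees the same result whether it is given PCMPGT(0, X) or X.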
* AMDGPU/GlobalISel: Legalize select for v4s16Matt Arsenault2019-02-041-3/+3
| | | | | | | Also add some more select tests to help show future legalization changes. llvm-svn: 353045
* [AsmPrinter] Remove hidden flag -print-schedule.Andrea Di Biagio2019-02-0414-48/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes hidden codegen flag -print-schedule effectively reverting the logic originally committed as r300311 (https://llvm.org/viewvc/llvm-project?view=revision&revision=300311). Flag -print-schedule was originally introduced by r300311 to address PR32216 (https://bugs.llvm.org/show_bug.cgi?id=32216). That bug was about adding "Better testing of schedule model instruction latencies/throughputs". These days, we can use llvm-mca to test scheduling models. So there is no longer a need for flag -print-schedule in LLVM. The main use case for PR32216 is now addressed by llvm-mca. Flag -print-schedule is mainly used for debugging purposes, and it is only actually used by x86-specific tests. We already have extensive (latency and throughput) tests under "test/tools/llvm-mca" for X86 processor models. That means, most (if not all) existing -print-schedule tests for X86 are redundant. When flag -print-schedule was first added to LLVM, several files had to be modified; a few APIs gained new arguments (see for example method MCAsmStreamer::EmitInstruction), and MCSubtargetInfo/TargetSubtargetInfo gained a couple of getSchedInfoStr() methods. Method getSchedInfoStr() originally had to work for both MCInst and MachineInstr. The original implementation of getSchedInfoStr() introduced a subtle layering violation (reported as PR37160 and then fixed/worked-around by r330615). In retrospect, that new API could have been designed better. We can always query MCSchedModel to get the latency and throughput. More importantly, the "sched-info" string should not have been generated by the subtarget. Note, r317782 fixed an issue where "print-schedule" didn't work very well in the presence of inline assembly. That commit is also reverted by this change. Differential Revision: https://reviews.llvm.org/D57244 llvm-svn: 353043
* Use auto for dyn_cast case to save a line. NFCI.Simon Pilgrim2019-02-041-2/+1
| | | | llvm-svn: 353041
* [ARM] Mark 255 and 65535 as cheap for Thumb1 "And"David Green2019-02-041-3/+7
| | | | | | | | | | This prevents Constant Hoisting from pulling the constant out of the block, allowing us to still produce LDRH/UXTH nodes. LDRB/UXTB (255) is already cheap by the default getIntImmCost, but I've added it for clarity. Differential Revision: https://reviews.llvm.org/D57671 llvm-svn: 353040
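These two masks are special because an AND with 255 or 65535 is exactly a zero-extension from 8 or 16 bits, which is what UXTB and UXTH compute. A small standalone sketch (plain C++, not LLVM code; the function names are illustrative only):

```cpp
#include <cstdint>

// AND with 65535 keeps only the low 16 bits...
constexpr uint32_t andWith65535(uint32_t X) { return X & 65535u; }
// ...which is the same value as zero-extending the low halfword (UXTH).
constexpr uint32_t zext16(uint32_t X) { return static_cast<uint16_t>(X); }

// Likewise, AND with 255 matches zero-extending the low byte (UXTB).
constexpr uint32_t andWith255(uint32_t X) { return X & 255u; }
constexpr uint32_t zext8(uint32_t X) { return static_cast<uint8_t>(X); }

static_assert(andWith65535(0x12345678u) == zext16(0x12345678u),
              "AND 65535 is a 16-bit zero-extension");
static_assert(andWith255(0x12345678u) == zext8(0x12345678u),
              "AND 255 is an 8-bit zero-extension");
```

If the constant is hoisted into a register in another block, the AND no longer has an immediate operand next to it, and the zero-extension pattern is harder to recognise.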
* Recommit r352660 "[X86] Mark EMMS and FEMMS as clobbering MM0-7 and ST0-7."Craig Topper2019-02-042-2/+6
| | | | | | | | | | | | | | We now print ST0 as 'st' when generating the clobber list for MS inline assembly in clang. This matches what the gcc reg name list expects. Original commit message: This fixes the test case in PR35982 by preventing MMX instructions that read MM0-7 from being moved below EMMS/FEMMS by the post RA scheduler. Though as discussed in bugzilla, this is not a complete fix. There is still the possibility of reordering in IR or by the pre-RA scheduler. Differential Revision: https://reviews.llvm.org/D57298 llvm-svn: 353016
* [X86] Print %st(0) as %st when it's implicit to the instruction. Continue ↵Craig Topper2019-02-048-49/+75
| | | | | | printing it as %st(0) when it's encoded in the instruction. This is a step back from the change I made in r352985. This appears to be more consistent with gcc and objdump behavior. llvm-svn: 353015
* Revert r352985 "[X86] Print %st(0) as %st to match what gcc inline asm uses ↵Craig Topper2019-02-044-24/+25
| | | | | | | | as the clobber name to make MS inline asm work correctly" Looking into gcc and objdump behavior more, this was overly aggressive. If the register is encoded in the instruction we should print %st(0); if it's implicit we should print %st. I'll be making a more directed change in a future patch. llvm-svn: 353013
* [X86][AVX] Support shuffle combining for VBROADCAST with smaller vector sourcesSimon Pilgrim2019-02-031-0/+20
| | | | | | getTargetShuffleMask can only do this safely if we're extracting the lowest subvector from a vector of the same result type. llvm-svn: 352999
* [X86][AVX] Support shuffle combining for VPMOVZX with smaller vector sourcesSimon Pilgrim2019-02-031-5/+13
| | | | llvm-svn: 352997
* [X86][AVX] More aggressively simplify BROADCAST source operandSimon Pilgrim2019-02-031-2/+14
| | | | | | | | Aim to use scalar source or lowest 128-bit vector directly. We're still missing some VZMOVL_LOAD combines. llvm-svn: 352994
* [X86] Print %st(0) as %st to match what gcc inline asm uses as the clobber ↵Craig Topper2019-02-034-25/+24
| | | | | | | | | | | | | | | | | name to make MS inline asm work correctly Summary: When calculating clobbers for MS style inline assembly we fail if the asm clobbers stack top because we print st(0) and try to pass it through the gcc register name check. This was found when I attempted to make emms/femms clobber all ST registers. If you use emms/femms in MS inline asm we would try to use st(0) as the clobber name but clang would think that wasn't a valid clobber name. This also matches what objdump disassembly prints. It's also what is printed by gcc -S. Reviewers: RKSimon, rnk, efriedma, spatel, andreadb, lebedev.ri Reviewed By: rnk Subscribers: eraman, gbedwell, lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D57621 llvm-svn: 352985
* [X86] Lower ISD::UADDO to use the Z flag instead of C flag when the RHS is a ↵Craig Topper2019-02-031-1/+7
| | | | | | | | | | | | | | | | | | | | | constant 1 to encourage INC formation. Summary: Add an additional combine to combineCarryThroughADD to reverse it back to the C flag to avoid regressions. I believe this catches the cases that D57547 got. Reviewers: RKSimon, spatel Reviewed By: spatel Subscribers: javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57637 llvm-svn: 352984
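The reason the Z flag can stand in for the C flag here is that an unsigned add of 1 wraps exactly when the result is zero. On x86, INC leaves the carry flag untouched but does set the zero flag, so testing Z lets INC be used. A minimal sketch in plain C++ (not LLVM code; both function names are hypothetical):

```cpp
#include <cstdint>

// Generic unsigned add-with-overflow: the carry out is set when the
// 32-bit sum wraps around, i.e. the result is smaller than an operand.
constexpr bool uaddOverflowsViaCarry(uint32_t X, uint32_t Y) {
  return static_cast<uint32_t>(X + Y) < X;
}

// Special case Y == 1: the sum wraps exactly when the result is zero,
// so a zero-flag test gives the same answer as a carry-flag test.
constexpr bool incOverflowsViaZero(uint32_t X) {
  return static_cast<uint32_t>(X + 1u) == 0u;
}

static_assert(uaddOverflowsViaCarry(0xFFFFFFFFu, 1u) ==
                  incOverflowsViaZero(0xFFFFFFFFu),
              "both tests agree at the wrap-around point");
static_assert(uaddOverflowsViaCarry(41u, 1u) == incOverflowsViaZero(41u),
              "both tests agree away from the wrap-around point");
```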
* [AMDGPU] Fix -Wunused-variable after rL352978Fangrui Song2019-02-031-1/+0
| | | | llvm-svn: 352982
* GlobalISel: Implement widenScalar for G_UNMERGE_VALUESMatt Arsenault2019-02-031-1/+2
| | | | | | | | | For the scalar case only. Also move the similar G_MERGE_VALUES handling to a separate function and cleanup to make them look more similar. llvm-svn: 352979
* GlobalISel: Implement widenScalar for G_EXTRACT vector sourcesMatt Arsenault2019-02-021-0/+18
| | | | | | Handle the basic element extract case. llvm-svn: 352978
* AMDGPU/GlobalISel: Avoid reporting illegal extloads as legalMatt Arsenault2019-02-021-1/+1
| | | | | | This avoids breaking a test in a future commit. llvm-svn: 352977
* AMDGPU/GlobalISel: Legalize icmp for pointer typesMatt Arsenault2019-02-021-1/+10
| | | | llvm-svn: 352976
* AMDGPU/GlobalISel: Legalize constant for pointer typesMatt Arsenault2019-02-021-3/+4
| | | | llvm-svn: 352975
* AMDGPU/GlobalISel: Legalize select for pointer typesMatt Arsenault2019-02-021-4/+12
| | | | llvm-svn: 352974
* GlobalISel: Legalization for inttoptr/ptrtointMatt Arsenault2019-02-021-6/+44
| | | | llvm-svn: 352973
* [X86][AVX] Enable INSERT_SUBVECTOR(SRC0, SHUFFLE(SRC1)) shuffle combiningSimon Pilgrim2019-02-021-15/+27
| | | | | | | | Push the insert_subvector up through the shuffle operands to help find more cross-lane shuffles. This exposes a couple of minor issues that will be fixed shortly: (1) missed broadcast folds - we have a mixture of vzext_load lengths that need cleaning up; (2) combine-sdiv.ll - AVX1 SimplifyDemandedVectorElts failure (hits max depth due to a couple of extra bitcasts). llvm-svn: 352963
* [SDAG] Add SDNode/SDValue getConstantOperandAPInt helper. NFCI.Simon Pilgrim2019-02-021-15/+11
| | | | | | We already have the getConstantOperandVal helper which returns a uint64_t, but along comes the fuzzer and inserts an i128 -1 constant or something and the whole thing asserts. I've updated a few obvious cases, and tried to make use of the const reference where possible, but there's more to do. A number of existing oss-fuzz tickets should be fixed if we start using APInt and perform value clamping where necessary. llvm-svn: 352961
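The failure mode described above is a constant wider than 64 bits being forced into a `uint64_t`. The sketch below models that situation without any LLVM types: `Wide128`, `fitsInU64` and `clampToU64` are hypothetical stand-ins, and the saturating clamp is only one possible reading of "value clamping".

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a 128-bit constant, split into two halves.
struct Wide128 {
  uint64_t Lo;
  uint64_t Hi;
};

// A 128-bit value can be returned as uint64_t only if the high half is zero.
constexpr bool fitsInU64(Wide128 V) { return V.Hi == 0; }

// Saturating clamp: values that do not fit become the largest uint64_t
// instead of triggering an assertion.
constexpr uint64_t clampToU64(Wide128 V) {
  return fitsInU64(V) ? V.Lo : std::numeric_limits<uint64_t>::max();
}
```

An i128 -1 has both halves set to all ones, so `fitsInU64` rejects it, while `clampToU64` still yields a usable bound for code that only needs an upper limit (a shift amount check, for example).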
* [BPF] [BTF] Process FileName with absolute path correctlyYonghong Song2019-02-021-1/+1
| | | | | | | | | | | | | | | | | In IR, sometimes the following attributes for DIFile may be generated: filename: /home/yhs/test.c directory: /tmp The /tmp may represent the working directory of the compilation process. In such cases, since filename is with absolute path, the directory should be ignored by BTF. The filename alone is enough to get the source. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 352952
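The rule described in this message can be sketched as a small path-joining helper. This is an illustration under stated assumptions, not the BPF backend's code: `resolveSourcePath` is a hypothetical name and only POSIX-style paths (absolute means a leading `/`) are handled.

```cpp
#include <string>

// Hypothetical helper: combine a DIFile-style (filename, directory) pair.
// If the filename is already absolute, the directory is ignored.
std::string resolveSourcePath(const std::string &FileName,
                              const std::string &Directory) {
  if (!FileName.empty() && FileName.front() == '/')
    return FileName; // absolute: the filename alone locates the source
  if (Directory.empty())
    return FileName; // nothing to prepend
  return Directory + "/" + FileName;
}
```

With the values from the message, `resolveSourcePath("/home/yhs/test.c", "/tmp")` returns the filename unchanged rather than a path under `/tmp`.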
* Revert "[BPF] [BTF] Process FileName with absolute path correctly"Yonghong Song2019-02-011-1/+1
| | | | | | | | This reverts commit r352939. Some tests failed. Revert to unblock others. llvm-svn: 352941
* [AArch64] Fix unused variable [NFC]Mandeep Singh Grang2019-02-011-0/+1
| | | | llvm-svn: 352940
* [BPF] [BTF] Process FileName with absolute path correctlyYonghong Song2019-02-011-1/+1
| | | | | | | | | | | | | | | | | In IR, sometimes the following attributes for DIFile may be generated: filename: /home/yhs/test.c directory: /tmp The /tmp may represent the working directory of the compilation process. In such cases, since filename is with absolute path, the directory should be ignored by BTF. The filename alone is enough to get the source. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 352939
* [WebAssembly] Add codegen support for the import_field attributeDan Gohman2019-02-013-6/+27
| | | | | | | | | This adds the LLVM side of https://reviews.llvm.org/D57602 -- the import_field attribute. See that patch for details. Differential Revision: https://reviews.llvm.org/D57603 llvm-svn: 352931
* [COFF, ARM64] Fix localaddress to handle stack realignment and variable size ↵Mandeep Singh Grang2019-02-015-25/+56
| | | | | | | | | | | | | | | | objects Summary: This fixes using the correct stack registers for SEH when stack realignment is needed or when variable size objects are present. Reviewers: rnk, efriedma, ssijaric, TomTan Reviewed By: rnk, efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D57183 llvm-svn: 352923
* [X86][AVX] Add VMOVDDUP-VPBROADCASTQ execution domain mappingSimon Pilgrim2019-02-011-0/+4
| | | | | | | | Noticed in D57514. Differential Revision: https://reviews.llvm.org/D57519 llvm-svn: 352922
* [opaque pointer types] Pass value type to GetElementPtr creation.James Y Knight2019-02-016-22/+25
| | | | | | | | | This cleans up all GetElementPtr creation in LLVM to explicitly pass a value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57173 llvm-svn: 352913
* [opaque pointer types] Pass value type to LoadInst creation.James Y Knight2019-02-0111-37/+44
| | | | | | | | | This cleans up all LoadInst creation in LLVM to explicitly pass the value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57172 llvm-svn: 352911
* [opaque pointer types] Pass function types to CallInst creation.James Y Knight2019-02-0113-20/+21
| | | | | | | | | This cleans up all CallInst creation in LLVM to explicitly pass a function type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57170 llvm-svn: 352909
* test commit (add blank line) NFCRoland Froese2019-02-011-0/+1
| | | | llvm-svn: 352897
* [AMDGPU] Fix for vector element insertionTim Corringham2019-02-011-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Incorrect code was generated when lowering insertelement operations for vectors with 8 or 16 bit elements. The value being inserted was not adjusted for the position of the element within the 32 bit word and so only the low element within each 32 bit word could receive the intended value. Fixed by simply replicating the value to each element of a congruent vector before the mask-and-OR operation used to update the intended element. A number of affected LIT tests have been updated appropriately. Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: llvm-commits, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Tags: #llvm Differential Revision: https://reviews.llvm.org/D57588 llvm-svn: 352885
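The fix can be illustrated with 16-bit elements packed into one 32-bit word. This is a scalar sketch in plain C++, not the AMDGPU lowering itself; `insertElt16` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Insert a 16-bit value into element Idx (0 = low half, 1 = high half)
// of a 32-bit word using the replicate-then-mask approach.
uint32_t insertElt16(uint32_t Word, uint16_t Val, unsigned Idx) {
  assert(Idx < 2 && "a 32-bit word holds two 16-bit elements");
  // Replicate the value into every element position of the word.
  uint32_t Splat = (static_cast<uint32_t>(Val) << 16) | Val;
  // Select the element being written.
  uint32_t Mask = 0xFFFFu << (16 * Idx);
  // Keep the other element, take the new value from the splat.
  return (Word & ~Mask) | (Splat & Mask);
}
```

Without the replication step, `Val & Mask` would be zero for `Idx == 1`, which matches the reported symptom that only the low element of each word received the intended value.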
* [X86][SSE] Use PSLLDQ/PSRLDQ to mask out zeroable ends of a shuffleSimon Pilgrim2019-02-011-0/+73
| | | | | | | | | | As suggested on PR40318, this patch uses PSLLDQ/PSRLDQ to lower shuffles to zero out the ends of a vector, leaving a sequential inner section. For pre-SSSE3 we do this for shuffles with zeros at either end (requiring up to 3 shifts), but once PSHUFB is available I've limited this to shuffles with a single zeroable end (2 shifts). Differential Revision: https://reviews.llvm.org/D56784 llvm-svn: 352883
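A byte-level model shows how whole-register byte shifts can zero the ends of a vector while leaving the middle in place. This is a sketch in plain C++ with index 0 as the lowest byte. The helper names are illustrative, and the three-shift order in `zeroEnds` is one sequence that works, not necessarily the one LLVM emits.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec16 = std::array<uint8_t, 16>;

// PSLLDQ model: shift the 128-bit value left by N bytes (toward higher
// indices), filling the low N bytes with zero.
Vec16 pslldq(const Vec16 &V, std::size_t N) {
  Vec16 R{};
  for (std::size_t I = N; I < 16; ++I)
    R[I] = V[I - N];
  return R;
}

// PSRLDQ model: shift right by N bytes, filling the high N bytes with zero.
Vec16 psrldq(const Vec16 &V, std::size_t N) {
  Vec16 R{};
  for (std::size_t I = 0; I + N < 16; ++I)
    R[I] = V[I + N];
  return R;
}

// Zero the low Lo bytes and the high Hi bytes (Lo + Hi <= 16 assumed):
// shift out the low end, shift past both ends, then shift back.
Vec16 zeroEnds(const Vec16 &V, std::size_t Lo, std::size_t Hi) {
  return psrldq(pslldq(psrldq(V, Lo), Lo + Hi), Hi);
}
```

When only one end needs zeroing, one of `Lo` or `Hi` is 0 and the sequence degenerates to two effective shifts, which corresponds to the single-zeroable-end case mentioned for targets with PSHUFB.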
* [X86][AVX] Combine INSERT_SUBVECTOR(SRC0, ↵Simon Pilgrim2019-02-011-3/+4
| | | | | | | | | | BITCAST(SHUFFLE(EXTRACT_SUBVECTOR(SRC1))) Enable peeking through one use bitcasts to the subvector shuffle. This still depends on the subvector being the same scalar-size but D57514 has already helped with the more tricky patterns llvm-svn: 352879
* [AArch64] Optimize floating point materializationAdhemerval Zanella2019-02-012-28/+23
| | | | | | | | | | | | | This patch changes isFPImmLegal to also return true if the value can be encoded as the immediate operand of a logical instruction, in addition to checking the immediate field for fmov. This optimizes some floating point materialization, including values used in isinf lowering. Reviewed By: rengolin, efriedma, evandro Differential Revision: https://reviews.llvm.org/D57044 llvm-svn: 352866
* [X86][BdVer2] Transfer delays from the integer to the floating point unit.Roman Lebedev2019-02-011-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: I'm unable to find this number in the "AMD SOG for family 15h". llvm-exegesis measures the latencies of these instructions as `2`, which matches the latencies specified in "AMD SOG for family 15h". However if we look at Agner, Microarchitecture, "AMD Bulldozer, Piledriver, Steamroller and Excavator pipeline", "Data delay between different execution domains", the int->ivec transfer is listed as `8`..`10`cy of additional latency. Also, Agner's "Instruction tables", for Piledriver, lists their latencies as `12`, which is consistent with `2cy` from exegesis / AMD SOG + `10cy` transfer delay. Additional data point comes from the fact that Agner's "Instruction tables", for Jaguar, lists their latencies as `8`; and "AMD SOG for family 16h" does state the `+6cy` int->ivec delay, which is consistent with instr latency of `1` or `2`. Reviewers: andreadb, RKSimon, craig.topper Reviewed By: andreadb Subscribers: gbedwell, courbet, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57300 llvm-svn: 352861
* Provide reason messages for unviable inliningYevgeny Rouban2019-02-011-2/+3
| | | | | | | | | | | InlineCost's isInlineViable() is changed to return InlineResult instead of bool. This provides messages for failure reasons and makes it possible to get more specific messages for cases where callsites are not viable for inlining. Reviewed By: xbolva00, anemet Differential Revision: https://reviews.llvm.org/D57089 llvm-svn: 352849
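The general pattern, replacing a bare `bool` with a result object that also carries a reason, can be sketched as follows. `ViabilityResult` and `isViable` are hypothetical stand-ins written for illustration, not LLVM's `InlineResult` or `isInlineViable`, and the reason strings are invented.

```cpp
#include <string>
#include <utility>

// Hypothetical result type: converts to true on success and otherwise
// carries a human-readable reason.
class ViabilityResult {
  std::string Reason; // empty means "viable"
  explicit ViabilityResult(std::string R) : Reason(std::move(R)) {}

public:
  static ViabilityResult success() { return ViabilityResult(""); }
  static ViabilityResult failure(std::string Why) {
    return ViabilityResult(std::move(Why));
  }
  explicit operator bool() const { return Reason.empty(); }
  const std::string &message() const { return Reason; }
};

// Hypothetical check that reports why a function is not viable.
ViabilityResult isViable(bool HasIndirectBranch, bool IsRecursive) {
  if (HasIndirectBranch)
    return ViabilityResult::failure("contains indirect branches");
  if (IsRecursive)
    return ViabilityResult::failure("recursive call");
  return ViabilityResult::success();
}
```

Existing callers that only test the result in a condition keep working through the `bool` conversion, while diagnostics code can read `message()`.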
* [RISCV] Implement RV64D codegenAlex Bradbury2019-02-012-4/+30
| | | | | | | | | | | | This patch: * Adds necessary RV64D codegen patterns * Modifies CC_RISCV so it will properly handle f64 types (with soft float ABI) Note that in general there is no reason to try to select fcvt.w[u].d rather than fcvt.l[u].d for i32 conversions because fptosi/fptoui produce poison if the input won't fit into the target type. Differential Revision: https://reviews.llvm.org/D53237 llvm-svn: 352833
* [opaque pointer types] Add a FunctionCallee wrapper type, and use it.James Y Knight2019-02-019-68/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recommit r352791 after tweaking DerivedTypes.h slightly, so that gcc doesn't choke on it, hopefully. Original Message: The FunctionCallee type is effectively a {FunctionType*,Value*} pair, and is a useful convenience to enable code to continue passing the result of getOrInsertFunction() through to EmitCall, even once pointer types lose their pointee-type. Then: - update the CallInst/InvokeInst instruction creation functions to take a Callee, - modify getOrInsertFunction to return FunctionCallee, and - update all callers appropriately. One area of particular note is the change to the sanitizer code. Previously, they had been casting the result of `getOrInsertFunction` to a `Function*` via `checkSanitizerInterfaceFunction`, and storing that. That would report an error if someone had already inserted a function declaration with a mismatching signature. However, in general, LLVM allows for such mismatches, as `getOrInsertFunction` will automatically insert a bitcast if needed. As part of this cleanup, cause the sanitizer code to do the same. (It will call its functions using the expected signature, however they may have been declared.) Finally, in a small number of locations, callers of `getOrInsertFunction` actually were expecting/requiring that a brand new function was being created. In such cases, I've switched them to Function::Create instead. Differential Revision: https://reviews.llvm.org/D57315 llvm-svn: 352827
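The "{FunctionType*,Value*} pair" idea can be sketched with stand-in types. Nothing below is LLVM's actual definition: `FunctionType` and `Value` are empty placeholders and `CalleeSketch` is a hypothetical name chosen to avoid confusion with the real `FunctionCallee`.

```cpp
// Placeholder types standing in for the real IR classes.
struct FunctionType {
  unsigned NumParams;
};
struct Value {
  const char *Name;
};

// A callee bundled with the function type to call it through, so the
// type no longer has to be recovered from the pointer's pointee type.
class CalleeSketch {
  FunctionType *FnTy = nullptr;
  Value *Callee = nullptr;

public:
  CalleeSketch() = default;
  CalleeSketch(FunctionType *FT, Value *V) : FnTy(FT), Callee(V) {}

  FunctionType *getFunctionType() const { return FnTy; }
  Value *getCallee() const { return Callee; }
  explicit operator bool() const { return Callee != nullptr; }
};
```

Carrying the type alongside the value is what lets a call be built from the pair even when the pointer itself says nothing about what it points to.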
* [WebAssembly] Fix a regression selecting negative build_vector lanesThomas Lively2019-01-311-1/+1
| | | | | | | | | | | | | | | | Summary: The custom lowering introduced in rL352592 creates build_vector nodes with negative i32 operands, but these operands did not meet the value range constraints necessary to match build_vector nodes. This CL fixes the issue by removing the unnecessary constraints. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish Differential Revision: https://reviews.llvm.org/D57481 llvm-svn: 352813
* [RISCV] Add RV64F codegen supportAlex Bradbury2019-01-313-2/+130
| | | | | | | | | | | This requires a little extra work due to the fact i32 is not a legal type. When call lowering happens post-legalisation (e.g. when an intrinsic was inserted during legalisation), a bitcast from f32 to i32 can't be introduced. This is similar to the challenges with RV32D. To handle this, we introduce target-specific DAG nodes that perform bitcast+anyext for f32->i64 and trunc+bitcast for i64->f32. Differential Revision: https://reviews.llvm.org/D53235 llvm-svn: 352807
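The two combined operations can be modelled on plain integers. This is a scalar sketch, not the RISC-V DAG nodes: the function names are hypothetical, IEEE-754 single precision is assumed, and the upper 32 bits are zero-filled here even though an any-extend leaves them unspecified.

```cpp
#include <cstdint>
#include <cstring>

// f32 -> i64: reinterpret the float's bits as a 32-bit integer, then widen.
// (anyext may leave the upper bits arbitrary; this model zero-fills them.)
uint64_t f32ToI64Bits(float F) {
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return static_cast<uint64_t>(Bits);
}

// i64 -> f32: truncate to the low 32 bits, then reinterpret as a float.
float i64BitsToF32(uint64_t X) {
  uint32_t Bits = static_cast<uint32_t>(X);
  float F;
  std::memcpy(&F, &Bits, sizeof(F));
  return F;
}
```

Because the return direction truncates first, whatever ends up in the upper 32 bits has no effect on the recovered float, which is why an any-extend is sufficient on the way in.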
* [Hexagon] Rename textually included file from .h to .incRichard Trieu2019-01-312-1/+1
| | | | llvm-svn: 352802
* Revert "[opaque pointer types] Add a FunctionCallee wrapper type, and use it."James Y Knight2019-01-319-71/+68
| | | | | | | | | This reverts commit f47d6b38c7a61d50db4566b02719de05492dcef1 (r352791). Seems to run into compilation failures with GCC (but not clang, where I tested it). Reverting while I investigate. llvm-svn: 352800
* [WebAssembly] Add bulk memory target featureThomas Lively2019-01-313-16/+30
| | | | | | | | | | | | Summary: Also clean up some preexisting target feature code. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, jfb Differential Revision: https://reviews.llvm.org/D57495 llvm-svn: 352793
* [opaque pointer types] Add a FunctionCallee wrapper type, and use it.James Y Knight2019-01-319-68/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The FunctionCallee type is effectively a {FunctionType*,Value*} pair, and is a useful convenience to enable code to continue passing the result of getOrInsertFunction() through to EmitCall, even once pointer types lose their pointee-type. Then: - update the CallInst/InvokeInst instruction creation functions to take a Callee, - modify getOrInsertFunction to return FunctionCallee, and - update all callers appropriately. One area of particular note is the change to the sanitizer code. Previously, they had been casting the result of `getOrInsertFunction` to a `Function*` via `checkSanitizerInterfaceFunction`, and storing that. That would report an error if someone had already inserted a function declaration with a mismatching signature. However, in general, LLVM allows for such mismatches, as `getOrInsertFunction` will automatically insert a bitcast if needed. As part of this cleanup, cause the sanitizer code to do the same. (It will call its functions using the expected signature, however they may have been declared.) Finally, in a small number of locations, callers of `getOrInsertFunction` actually were expecting/requiring that a brand new function was being created. In such cases, I've switched them to Function::Create instead. Differential Revision: https://reviews.llvm.org/D57315 llvm-svn: 352791
* [DAG][SystemZ] Define unwrapAddress for PCREL_WRAPPER.Nirav Dave2019-01-312-0/+8
| | | | | | | | | | | | | | | | Summary: Like with X86, this allows better DAG-level alias analysis and alignment inference for wrapped addresses. Reviewers: jonpa, uweigand Reviewed By: uweigand Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D57407 llvm-svn: 352786
* Revert "[X86] Mark EMMS and FEMMS as clobbering MM0-7 and ST0-7."Craig Topper2019-01-312-6/+2
| | | | | | This is causing a failure in chromium llvm-svn: 352782