summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [X86] Remove VT.isSimple() check from detectAVGPattern.Craig Topper2018-02-261-1/+1
| | | | | | Which types are considered 'simple' is a function of the requirements of all targets that LLVM supports. That shouldn't directly affect what types we are able to handle. The remainder of this code checks that the number of elements is a power of 2 and takes care of splitting down to a legal size. llvm-svn: 326063
* TableGen: Remove VarInit::getFieldTypeNicolai Haehnle2018-02-251-7/+0
| | | | | | | | | | | | | | It is redundant with the implementation in TypedInit. Change-Id: I8ab1fb5c77e4923f7eb3ffae5889f0f8af6093b4 Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43678 llvm-svn: 326061
* TableGen: Get rid of Init::getFieldInitNicolai Haehnle2018-02-251-24/+6
| | | | | | | | | | | | | | | | | | | | Summary: FieldInit will just rely on the standardized resolving mechanism to give us DefInits for folding, thus simplifying the code. Unlike the removal of resolveListElementReference, this shouldn't have performance implications, because DefInits do not recurse inside their record. Change-Id: Id4544c774c9d9ee92f293615af6ecff706453f21 Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43563 llvm-svn: 326060
* TableGen: Remove Init::resolveListElementReferenceNicolai Haehnle2018-02-251-88/+9
| | | | | | | | | | | | | | | | | | | | | | Summary: Resolving a VarListElementInit should just resolve the list and then take its element. This eliminates a lot of duplicated logic and simplifies the next steps of refactoring resolveReferences. This does potentially cause sub-elements of the entire list to be resolved resulting in more work, but I didn't notice a measurable change in performance, and a later patch adds a caching mechanism that covers at least the common case of `var[i]` in a more generic way. Change-Id: I7b59185b855c7368585c329c31e5be38c5749dac Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43562 llvm-svn: 326059
* [DebugInfo] Stable sort symbols to remove non-deterministic orderingMandeep Singh Grang2018-02-251-1/+1
| | | | | | | | | | | | | | | | Summary: This fixes failure in DebugInfo/X86/multiple-aranges.ll uncovered by D39245. Reviewers: rafael, echristo, probinson Reviewed By: probinson Subscribers: probinson, llvm-commits, JDevlieghere Tags: #debug-info Differential Revision: https://reviews.llvm.org/D39950 llvm-svn: 326056
* [X86] Use SDNode instead of SDPatternOperator. NFCCraig Topper2018-02-251-7/+7
| | | | llvm-svn: 326048
* [TargetLowering] SimplifyDemandedVectorElts - pass demanded elts through ↵Simon Pilgrim2018-02-241-0/+13
| | | | | | ADD/SUB ops llvm-svn: 326044
* [TargetLowering] SimplifyDemandedVectorElts - pass demanded elts through ↵Simon Pilgrim2018-02-241-0/+5
| | | | | | TRUNCATE ops llvm-svn: 326043
* [X86] Allow int_x86_sse2_cvtps2dq and int_x86_avx_cvt_ps2dq_256 to select ↵Craig Topper2018-02-243-7/+11
| | | | | | EVEX encoded instructions. llvm-svn: 326041
* Revert "StructurizeCFG: Test for branch divergence correctly"Adam Nemet2018-02-241-12/+3
| | | | | | | | This reverts commit r325881. Breaks many bots llvm-svn: 326037
* [X86][SSE] combineSubToSubus - support v8i64 handling from SSSE3Simon Pilgrim2018-02-241-1/+1
| | | | | | Our UMIN/UMAX, vector truncation and shuffle combining is good enough to efficiently handle v8i64 with the number of leading zeros that are necessary for PSUBUS. llvm-svn: 326034
* [X86][SSE] combineSubToSubus - support v8i32 handling from SSSE3 (not SSE41)Simon Pilgrim2018-02-241-3/+3
| | | | | | Now that UMIN etc are Legal/Custom for SSE2+, we can efficiently match SUBUS v8i32 cases from SSSE3 which can perform efficient truncation with PSHUFB. llvm-svn: 326033
* [X86][SSE] combineSubToSubus - begun generalizing to work with any type ↵Simon Pilgrim2018-02-241-4/+11
| | | | | | sizes with SplitBinaryOpsAndApply llvm-svn: 326030
* Fix spelling in comment. NFCI.Simon Pilgrim2018-02-241-1/+1
| | | | llvm-svn: 326029
* [Sparc] Return true in enableMultipleCopyHints().Jonas Paulsson2018-02-241-0/+2
| | | | | | | | | | Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: James Y Knight llvm-svn: 326028
* [X86] Use SelectionDAG::getNot instead of implementing manually. NFCCraig Topper2018-02-241-2/+1
| | | | llvm-svn: 326020
* [AMDGPU] Shrinking V_SUBBREV_U32Stanislav Mekhanoshin2018-02-241-1/+2
| | | | | | | | | | V_SUBBREV_U32 is a commute opcode for V_SUBB_U32. However, when we try to commute V_SUBB_U32 in order to shrink it we do not then process V_SUBBREV_U32 and it stay VOP3. This is fixed. Differential Revision: https://reviews.llvm.org/D43699 llvm-svn: 326011
* [WebAssembly] Add exception handling option and featureHeejin Ahn2018-02-248-9/+26
| | | | | | | | | | | | | | Summary: Add a llc command line option and WebAssembly architecture feature for exception handling. Reviewers: dschuff Subscribers: jfb, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D43683 llvm-svn: 326004
* Implement equal_range for the DWARF v5 accelerator tablePavel Labath2018-02-241-19/+177
| | | | | | | | | | | | | | | | | | | Summary: This patch implements the name lookup functionality of the .debug_names accelerator table and hooks it up to "llvm-dwarfdump -find". To make the interface of the two kinds of accelerator tables more consistent, I've created an abstract "DWARFAcceleratorTable::Entry" class, which provides a consistent interface to access the common functionality of the table entries (such as getting the die offset, die tag, etc.). I've also modified the apple table to vend entries conforming to this interface. Reviewers: JDevlieghere, aprantl, probinson, dblaikie Subscribers: vleschuk, clayborg, echristo, llvm-commits Differential Revision: https://reviews.llvm.org/D43067 llvm-svn: 326003
* [MemorySSA] Remove a redundant dyn_cast.George Burgess IV2018-02-241-3/+2
| | | | | | StartingAccess is a MemoryUseOrDef. No need to check again. llvm-svn: 326000
* [X86] Remove checks for '(scalar_to_vector (i8 (trunc GR32:)))' from scalar ↵Craig Topper2018-02-241-4/+4
| | | | | | | | masked move patterns. This portion can be matched by other patterns. We don't need it to make the larger pattern valid. It's sufficient to have a v1i1 mask input without caring where it came from. llvm-svn: 325999
* bpf: New optimization pass for eliminating unnecessary i32 promotionsYonghong Song2018-02-234-0/+197
| | | | | | | | | | | | | | | | | | This pass performs peephole optimizations to cleanup ugly code sequences at MachineInstruction layer. Currently, the only optimization in this pass is to eliminate type promotion sequences for zero extending 32-bit subregisters to 64-bit registers. If the compiler could prove the zero extended source come from 32-bit subregistere then it is safe to erase those promotion sequece, because the upper half of the underlying 64-bit registers were zeroed implicitly already. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325991
* bpf: New decoder namespace for 32-bit subregister load/storeYonghong Song2018-02-231-2/+43
| | | | | | | | | | | | | | | | | | | | | | | When -mattr=+alu32 passed to the disassembler, use decoder namespace for 32-bit subregister. This is to disassemble load and store instructions in preferred B format as described in previous commit: w = *(u8 *) (r + off) // BPF_LDX | BPF_B w = *(u16 *)(r + off) // BPF_LDX | BPF_H w = *(u32 *)(r + off) // BPF_LDX | BPF_W *(u8 *) (r + off) = w // BPF_STX | BPF_B *(u16 *)(r + off) = w // BPF_STX | BPF_H *(u32 *)(r + off) = w // BPF_STX | BPF_W NOTE: all other instructions should still use the default decoder namespace. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325990
* bpf: Enable 32-bit subregister support for -mattr=+alu32Yonghong Song2018-02-231-0/+2
| | | | | | | | | After all those preparation patches, now we could enable 32-bit subregister support once -mattr=+alu32 specified. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325989
* bpf: Support 32-bit subregister in various InstrInfo hooksYonghong Song2018-02-231-0/+10
| | | | | | | | | This patch support 32-bit subregister in three InstrInfo hooks, i.e. copyPhysReg, loadRegFromStackSlot and storeRegToStackSlot, Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325988
* bpf: New instruction patterns for 32-bit subregister load and storeYonghong Song2018-02-232-10/+103
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The instruction mapping between eBPF/arm64/x86_64 are: eBPF arm64 x86_64 LD1 BPF_LDX | BPF_B ldrb movzbl LD2 BPF_LDX | BPF_H ldrh movzwl LD4 BPF_LDX | BPF_W ldr movl movzbl/movzwl/movl on x86_64 accept 32-bit sub-register, for example %eax, the same for ldrb/ldrh on arm64 which accept 32-bit "w" register. And actually these instructions only accept sub-registers. There is no point to have LD1/2/4 (unsigned) for 64-bit register, because on these arches, upper 32-bits are guaranteed to be zeroed by hardware or VM, so load into the smallest available register class is the best choice for maintaining type information. For eBPF we should adopt the same philosophy, to change current format (A): r = *(u8 *) (r + off) // BPF_LDX | BPF_B r = *(u16 *)(r + off) // BPF_LDX | BPF_H r = *(u32 *)(r + off) // BPF_LDX | BPF_W *(u8 *) (r + off) = r // BPF_STX | BPF_B *(u16 *)(r + off) = r // BPF_STX | BPF_H *(u32 *)(r + off) = r // BPF_STX | BPF_W into B: w = *(u8 *) (r + off) // BPF_LDX | BPF_B w = *(u16 *)(r + off) // BPF_LDX | BPF_H w = *(u32 *)(r + off) // BPF_LDX | BPF_W *(u8 *) (r + off) = w // BPF_STX | BPF_B *(u16 *)(r + off) = w // BPF_STX | BPF_H *(u32 *)(r + off) = w // BPF_STX | BPF_W There is no change on encoding nor how should they be interpreted, everything is as it is, load the specified length, write into low bits of the register then zeroing all remaining high bits. The only change is their associated register class and how compiler view them. Format A still need to be kept, because eBPF LLVM backend doesn't support sub-registers at default, but once 32-bit subregister is enabled, it should use format B. This patch implemented this together with all those necessary extended load and truncated store patterns. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325987
* bpf: Support i32 in getScalarShiftAmountTy methodYonghong Song2018-02-232-0/+7
| | | | | | | | | | getScalarShiftAmount method should be implemented for eBPF backend to make sure shift amount could still get correct type once 32-bit subregisters support are enabled. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325986
* bpf: Support condition comparison on i32Yonghong Song2018-02-233-27/+126
| | | | | | | | | | | | | | | | | | | We need to support condition comparison on i32. All these comparisons are supposed to be combined into BPF_J* instructions which only support i64. For ISD::BR_CC we need to promote it to i64 first, then do custom lowering. For ISD::SET_CC, just expand to SELECT_CC like what's been done for i64. For ISD::SELECT_CC, we also want to do custom lower for i32. However, after 32-bit subregister support enabled, it is possible the comparison operands are i32 while the selected value are i64, or the comparison operands are i64 while the selected value are i32. We need to define extra instruction pattern and support them in custom instruction inserter. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325985
* bpf: Handle i32 for ALU operations without ISA supportYonghong Song2018-02-231-21/+26
| | | | | | | | | | | | There is no eBPF ISA support for BSWAP, ROTR, ROTL, SREM, SDIVREM, MULHU, ADDC, ADDE etc on i32. They could be emulated by other basic BPF_ALU operations, we'd set their lowering action the same as i64. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325984
* bpf: New calling convention for 32-bit subregistersYonghong Song2018-02-233-9/+37
| | | | | | | | | | | This patch add new calling conventions to allow GPR32RegClass as valid register class for arguments and return types. New calling convention will only be choosen when -mattr=+alu32 specified. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325983
* bpf: New target attribute "alu32" for 32-bit subregister supportYonghong Song2018-02-233-0/+9
| | | | | | | | | | | | | | | | | | This new attribute aims to control the enablement of 32-bit subregister support on eBPF backend. Name the interface as "alu32" is because we in particular want to enable the generation of BPF_ALU32 instructions by enable subregister support. This attribute could be used in the following format with llc: llc -mtriple=bpf -mattr=[+|-]alu32 It is disabled at default. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325982
* bpf: Define instruction patterns for extensions and truncations between i32 ↵Yonghong Song2018-02-231-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | to i64 For transformations between i32 and i64, if it is explicit signed extension: - first cast the operand to i64 - then use SLL + SRA to finish the extension. if it is explicit zero extension: - first cast the operand to i64 - then use SLL + SRL to finish the extension. if it is explicit any extension: - just refer to 64-bit register. if it is explicit truncation: - just refer to 32-bit subregister. NOTE: Some of the zero extension sequences might be unnecessary, they will be removed by an peephole pass on MachineInstruction layer. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325981
* bpf: Tighten the immediate predication for 32-bit alu instructionsYonghong Song2018-02-231-2/+4
| | | | | | | | | | | | | | | | | These 32-bit ALU insn patterns which takes immediate as one operand were initially added to enable AsmParser support, and the AsmMatcher uses "ins" and "outs" fields to deduct the operand constraint. However, the instruction selector doesn't work the same as AsmMatcher. The selector will use the "pattern" field for which we are not setting the predication for immediate operands correctly. Without this patch, i32 would eventually means all i32 operands are valid, both imm and gpr, while these patterns should allow imm only. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325980
* bpf: Use markSuperRegs to mark reserved registersYonghong Song2018-02-231-2/+2
| | | | | | | | | markSuperRegs is the canonical helper function used to mark reserved registers. It could mark any overlapping sub-registers automatically. Reviewed-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 325979
* [PowerPC] Disable shrink-wrapping when getting PC address through the LRNemanja Ivanovic2018-02-233-0/+23
| | | | | | | | | | | | | | The instruction sequence used to get the address of the PC into a GPR requires that we clobber the link register. Doing so without having first saved it in the prologue leaves the function unable to return. Currently, this sequence is emitted into the entry block. To ensure the prologue is inserted before this sequence, disable shrink-wrapping. This fixes PR33547. Differential Revision: https://reviews.llvm.org/D43677 llvm-svn: 325972
* [MemorySSA] Fix a cache invalidation bug with removed accessesGeorge Burgess IV2018-02-231-1/+1
| | | | | | | I suspect there's a deeper issue here, but we probably shouldn't be using INVALID_MEMORYSSA_ID as liveOnEntry's ID anyway. llvm-svn: 325971
* [DebugInfo] Support DWARF v5 source code embedding extensionScott Linder2018-02-2317-87/+215
| | | | | | | | | | | | | | | | | | | In DWARF v5 the Line Number Program Header is extensible, allowing values with new content types. In this extension a content type is added, DW_LNCT_LLVM_source, which contains the embedded source code of the file. Add new optional attribute for !DIFile IR metadata called source which contains source text. Use this to output the source to the DWARF line table of code objects. Analogously extend METADATA_FILE in Bitcode and .file directive in ASM to support optional source. Teach llvm-dwarfdump and llvm-objdump about the new values. Update the output format of llvm-dwarfdump to make room for the new attribute on file_names entries, and support embedded sources for the -source option in llvm-objdump. Differential Revision: https://reviews.llvm.org/D42765 llvm-svn: 325970
* [InstCombine] simplify code for fabs(X) * fabs(X) -> X * X; NFCSanjay Patel2018-02-231-13/+4
| | | | llvm-svn: 325968
* Sink the verification code around the assert where it's handled and wrap in ↵Eric Christopher2018-02-231-25/+14
| | | | | | | | | | NDEBUG. This has the advantage of making release only builds more warning free and there's no need to make this routine a class function if it isn't using class members anyhow. llvm-svn: 325967
* [InstSimplify] sqrt(X) * sqrt(X) --> XSanjay Patel2018-02-232-4/+6
| | | | | | This was misplaced in InstCombine. We can loosen the FMF as a follow-up step. llvm-svn: 325965
* Intrinsics calls should avoid the PLT when "RtLibUseGOT" metadata is present.Sriraman Tallam2018-02-233-2/+22
| | | | | | Differential Revision: https://reviews.llvm.org/D42216 llvm-svn: 325962
* [InstCombine] allow fmul-sqrt folds with less than full -ffast-mathSanjay Patel2018-02-232-18/+25
| | | | | | Also, add a Builder method for intrinsics to reduce code duplication for clients. llvm-svn: 325960
* Simplify a DEBUG statement to remove a set but not used variable in release ↵Eric Christopher2018-02-231-7/+4
| | | | | | builds. llvm-svn: 325959
* [X86] Add assembler/disassembler support for blendm with zero masking and ↵Craig Topper2018-02-231-0/+8
| | | | | | | | broacast. Fixes PR31617 llvm-svn: 325957
* [Power9] Add missing instructions to the Power 9 schedulerStefan Pintilie2018-02-231-47/+97
| | | | | | | | | | This is the first in a series of patches that will define more instructions using InstRW so that we can move away from ItinRW and ultimately have a complete Power 9 scheduler. Differential Revision: https://reviews.llvm.org/D43635 llvm-svn: 325956
* [Hexagon] Recognize non-immediate constants in HexagonConstPropagationKrzysztof Parzyszek2018-02-232-6/+11
| | | | llvm-svn: 325954
* Fixed unused variable warning. NFCI.Simon Pilgrim2018-02-231-1/+1
| | | | llvm-svn: 325950
* [X86] Add DAG combine to remove (and X, 1) from in front of a v1i1 scalar to ↵Craig Topper2018-02-232-4/+24
| | | | | | | | | | | | vector. These can be created by type legalization promoting the inputs to select to match scalar boolean contents. We were trying to pattern match them away during isel, but its better to just remove them from the DAG. I've cleaned up some patterns to not check for this 'and' anymore. But I suspect this has also opened up opportunities for pattern removal. llvm-svn: 325949
* [WebAssembly] Fix macro metaprogram to not duplicate code as much.Benjamin Kramer2018-02-231-7/+11
| | | | | | No functionality change intended. llvm-svn: 325947
* [X86][SSE] Generalize x > C-1 ? x+-C : 0 --> subus x, C combine for ↵Simon Pilgrim2018-02-231-22/+24
| | | | | | non-uniform constants llvm-svn: 325944
OpenPOWER on IntegriCloud