path: root/llvm/lib
Commit message | Author | Date | Files | Lines
...
* [AMDGPU] Shrinking V_SUBBREV_U32 | Stanislav Mekhanoshin | 2018-02-24 | 1 | -1/+2
  V_SUBBREV_U32 is the commuted opcode for V_SUBB_U32. However, when we try to
  commute V_SUBB_U32 in order to shrink it, we do not then process
  V_SUBBREV_U32 and it stays VOP3. This is fixed.
  Differential Revision: https://reviews.llvm.org/D43699
  llvm-svn: 326011
* [WebAssembly] Add exception handling option and feature | Heejin Ahn | 2018-02-24 | 8 | -9/+26
  Summary: Add an llc command line option and a WebAssembly architecture
  feature for exception handling.
  Reviewers: dschuff
  Subscribers: jfb, sbc100, jgravelle-google, sunfish, llvm-commits
  Differential Revision: https://reviews.llvm.org/D43683
  llvm-svn: 326004
* Implement equal_range for the DWARF v5 accelerator table | Pavel Labath | 2018-02-24 | 1 | -19/+177
  Summary: This patch implements the name lookup functionality of the
  .debug_names accelerator table and hooks it up to "llvm-dwarfdump -find".
  To make the interface of the two kinds of accelerator tables more
  consistent, I've created an abstract "DWARFAcceleratorTable::Entry" class,
  which provides a consistent interface to access the common functionality of
  the table entries (such as getting the die offset, die tag, etc.). I've
  also modified the apple table to vend entries conforming to this interface.
  Reviewers: JDevlieghere, aprantl, probinson, dblaikie
  Subscribers: vleschuk, clayborg, echristo, llvm-commits
  Differential Revision: https://reviews.llvm.org/D43067
  llvm-svn: 326003
* [MemorySSA] Remove a redundant dyn_cast. | George Burgess IV | 2018-02-24 | 1 | -3/+2
  StartingAccess is a MemoryUseOrDef. No need to check again.
  llvm-svn: 326000
* [X86] Remove checks for '(scalar_to_vector (i8 (trunc GR32:)))' from scalar masked move patterns. | Craig Topper | 2018-02-24 | 1 | -4/+4
  This portion can be matched by other patterns. We don't need it to make the
  larger pattern valid. It's sufficient to have a v1i1 mask input without
  caring where it came from.
  llvm-svn: 325999
* bpf: New optimization pass for eliminating unnecessary i32 promotions | Yonghong Song | 2018-02-23 | 4 | -0/+197
  This pass performs peephole optimizations to clean up ugly code sequences
  at the MachineInstruction layer. Currently, the only optimization in this
  pass is to eliminate type promotion sequences for zero extending 32-bit
  subregisters to 64-bit registers. If the compiler can prove the
  zero-extended source comes from a 32-bit subregister, then it is safe to
  erase the promotion sequence, because the upper half of the underlying
  64-bit register was already zeroed implicitly.
  Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
  Reviewed-by: Yonghong Song <yhs@fb.com>
  llvm-svn: 325991
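A minimal C sketch of the idea behind this pass (illustrative only; the pass
itself works on MachineInstructions, and the function name here is
hypothetical):

```c
#include <stdint.h>

/* On eBPF, writing a 32-bit subregister implicitly zeroes bits 63..32 of the
 * containing 64-bit register. The widening below therefore needs no extra
 * shift/mask instructions, so any shift-based zero-extension sequence the
 * backend emitted for it carries no information and can be peephole-eliminated. */
uint64_t widen_u32(const uint32_t *p) {
    uint32_t w = *p;        /* 32-bit load: upper half already zero */
    return (uint64_t)w;     /* zero extension is free at the machine level */
}
```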
* bpf: New decoder namespace for 32-bit subregister load/store | Yonghong Song | 2018-02-23 | 1 | -2/+43
  When -mattr=+alu32 is passed to the disassembler, use a decoder namespace
  for the 32-bit subregister. This is to disassemble load and store
  instructions in the preferred B format as described in the previous commit:

    w = *(u8 *) (r + off) // BPF_LDX | BPF_B
    w = *(u16 *)(r + off) // BPF_LDX | BPF_H
    w = *(u32 *)(r + off) // BPF_LDX | BPF_W

    *(u8 *) (r + off) = w // BPF_STX | BPF_B
    *(u16 *)(r + off) = w // BPF_STX | BPF_H
    *(u32 *)(r + off) = w // BPF_STX | BPF_W

  NOTE: all other instructions should still use the default decoder namespace.
  Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
  Reviewed-by: Yonghong Song <yhs@fb.com>
  llvm-svn: 325990
* bpf: Enable 32-bit subregister support for -mattr=+alu32 | Yonghong Song | 2018-02-23 | 1 | -0/+2
  After all those preparation patches, we can now enable 32-bit subregister
  support once -mattr=+alu32 is specified.
  Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
  Reviewed-by: Yonghong Song <yhs@fb.com>
  llvm-svn: 325989
* bpf: Support 32-bit subregister in various InstrInfo hooks | Yonghong Song | 2018-02-23 | 1 | -0/+10
  This patch supports 32-bit subregisters in three InstrInfo hooks:
  copyPhysReg, loadRegFromStackSlot and storeRegToStackSlot.
  Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
  Reviewed-by: Yonghong Song <yhs@fb.com>
  llvm-svn: 325988
* bpf: New instruction patterns for 32-bit subregister load and store | Yonghong Song | 2018-02-23 | 2 | -10/+103
  The instruction mapping between eBPF/arm64/x86_64 is:

         eBPF              arm64   x86_64
    LD1  BPF_LDX | BPF_B   ldrb    movzbl
    LD2  BPF_LDX | BPF_H   ldrh    movzwl
    LD4  BPF_LDX | BPF_W   ldr     movl

  movzbl/movzwl/movl on x86_64 accept a 32-bit sub-register, for example
  %eax; the same holds for ldrb/ldrh on arm64, which accept a 32-bit "w"
  register. In fact these instructions only accept sub-registers. There is no
  point in having LD1/2/4 (unsigned) for a 64-bit register, because on these
  arches the upper 32 bits are guaranteed to be zeroed by hardware or the VM,
  so loading into the smallest available register class is the best choice
  for maintaining type information.

  For eBPF we should adopt the same philosophy and change the current
  format (A):

    r = *(u8 *) (r + off) // BPF_LDX | BPF_B
    r = *(u16 *)(r + off) // BPF_LDX | BPF_H
    r = *(u32 *)(r + off) // BPF_LDX | BPF_W
    *(u8 *) (r + off) = r // BPF_STX | BPF_B
    *(u16 *)(r + off) = r // BPF_STX | BPF_H
    *(u32 *)(r + off) = r // BPF_STX | BPF_W

  into (B):

    w = *(u8 *) (r + off) // BPF_LDX | BPF_B
    w = *(u16 *)(r + off) // BPF_LDX | BPF_H
    w = *(u32 *)(r + off) // BPF_LDX | BPF_W
    *(u8 *) (r + off) = w // BPF_STX | BPF_B
    *(u16 *)(r + off) = w // BPF_STX | BPF_H
    *(u32 *)(r + off) = w // BPF_STX | BPF_W

  There is no change to the encoding nor to how the instructions should be
  interpreted; everything is as it is: load the specified length, write into
  the low bits of the register, then zero all remaining high bits. The only
  change is the associated register class and how the compiler views it.

  Format A still needs to be kept, because the eBPF LLVM backend doesn't
  support sub-registers by default, but once 32-bit subregister support is
  enabled it should use format B. This patch implements this together with
  all the necessary extended-load and truncated-store patterns.
  Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
  Reviewed-by: Yonghong Song <yhs@fb.com>
  llvm-svn: 325987
* bpf: Support i32 in getScalarShiftAmountTy method | Yonghong Song | 2018-02-23 | 2 | -0/+7
  The getScalarShiftAmountTy method should be implemented for the eBPF
  backend to make sure the shift amount still gets the correct type once
  32-bit subregister support is enabled.
  Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
  Reviewed-by: Yonghong Song <yhs@fb.com>
  llvm-svn: 325986
* bpf: Support condition comparison on i32 | Yonghong Song | 2018-02-23 | 3 | -27/+126
  We need to support condition comparison on i32. All these comparisons are
  supposed to be combined into BPF_J* instructions, which only support i64.

  For ISD::BR_CC we need to promote it to i64 first, then do custom lowering.
  For ISD::SET_CC, just expand to SELECT_CC like what's been done for i64.
  For ISD::SELECT_CC, we also want to do custom lowering for i32. However,
  after 32-bit subregister support is enabled, it is possible that the
  comparison operands are i32 while the selected values are i64, or that the
  comparison operands are i64 while the selected values are i32. We need to
  define extra instruction patterns and support them in the custom
  instruction inserter.
  Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
  Reviewed-by: Yonghong Song <yhs@fb.com>
  llvm-svn: 325985
* bpf: Handle i32 for ALU operations without ISA support | Yonghong Song | 2018-02-23 | 1 | -21/+26
  There is no eBPF ISA support for BSWAP, ROTR, ROTL, SREM, SDIVREM, MULHU,
  ADDC, ADDE etc. on i32. They can be emulated by other basic BPF_ALU
  operations, so we set their lowering action the same as for i64.
  Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
  Reviewed-by: Yonghong Song <yhs@fb.com>
  llvm-svn: 325984
* bpf: New calling convention for 32-bit subregisters | Yonghong Song | 2018-02-23 | 3 | -9/+37
  This patch adds new calling conventions to allow GPR32RegClass as a valid
  register class for arguments and return types. The new calling convention
  will only be chosen when -mattr=+alu32 is specified.
  Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
  Reviewed-by: Yonghong Song <yhs@fb.com>
  llvm-svn: 325983
* bpf: New target attribute "alu32" for 32-bit subregister support | Yonghong Song | 2018-02-23 | 3 | -0/+9
  This new attribute controls the enablement of 32-bit subregister support
  on the eBPF backend. The interface is named "alu32" because we particularly
  want to enable the generation of BPF_ALU32 instructions by enabling
  subregister support. The attribute can be used with llc in the following
  format:

    llc -mtriple=bpf -mattr=[+|-]alu32

  It is disabled by default.
  Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
  Reviewed-by: Yonghong Song <yhs@fb.com>
  llvm-svn: 325982
* bpf: Define instruction patterns for extensions and truncations between i32 and i64 | Yonghong Song | 2018-02-23 | 1 | -0/+20
  For transformations between i32 and i64:

  If it is an explicit signed extension:
    - first cast the operand to i64,
    - then use SLL + SRA to finish the extension.

  If it is an explicit zero extension:
    - first cast the operand to i64,
    - then use SLL + SRL to finish the extension.

  If it is an explicit any extension:
    - just refer to the 64-bit register.

  If it is an explicit truncation:
    - just refer to the 32-bit subregister.

  NOTE: Some of the zero extension sequences might be unnecessary; they will
  be removed by a peephole pass at the MachineInstruction layer.
  Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
  Reviewed-by: Yonghong Song <yhs@fb.com>
  llvm-svn: 325981
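A minimal C sketch of the shift sequences described above (illustrative only;
it mirrors the intended semantics rather than the backend's actual patterns,
and assumes the usual two's-complement arithmetic right shift):

```c
#include <stdint.h>

/* Sign extension of the low 32 bits: shift them into the upper half, then
 * arithmetic-shift right so the sign bit fills bits 63..32 (SLL + SRA). */
int64_t sext_low32(uint64_t r) {
    return (int64_t)(r << 32) >> 32;
}

/* Zero extension of the low 32 bits: same left shift, but a logical right
 * shift clears bits 63..32 instead (SLL + SRL). */
uint64_t zext_low32(uint64_t r) {
    return (r << 32) >> 32;
}

/* Truncation: simply refer to the low 32 bits (the 32-bit subregister). */
uint32_t trunc_to_32(uint64_t r) {
    return (uint32_t)r;
}
```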
* bpf: Tighten the immediate predication for 32-bit alu instructions | Yonghong Song | 2018-02-23 | 1 | -2/+4
  These 32-bit ALU insn patterns, which take an immediate as one operand,
  were initially added to enable AsmParser support, and the AsmMatcher uses
  the "ins" and "outs" fields to deduce the operand constraints. However, the
  instruction selector doesn't work the same way as the AsmMatcher: the
  selector uses the "pattern" field, for which we were not setting the
  predication for immediate operands correctly. Without this patch, i32 would
  eventually mean all i32 operands are valid, both imm and gpr, while these
  patterns should allow imm only.
  Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
  Reviewed-by: Yonghong Song <yhs@fb.com>
  llvm-svn: 325980
* bpf: Use markSuperRegs to mark reserved registers | Yonghong Song | 2018-02-23 | 1 | -2/+2
  markSuperRegs is the canonical helper function used to mark reserved
  registers. It marks any overlapping sub-registers automatically.
  Reviewed-by: Yonghong Song <yhs@fb.com>
  Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
  llvm-svn: 325979
* [PowerPC] Disable shrink-wrapping when getting PC address through the LR | Nemanja Ivanovic | 2018-02-23 | 3 | -0/+23
  The instruction sequence used to get the address of the PC into a GPR
  requires that we clobber the link register. Doing so without having first
  saved it in the prologue leaves the function unable to return. Currently,
  this sequence is emitted into the entry block. To ensure the prologue is
  inserted before this sequence, disable shrink-wrapping. This fixes PR33547.
  Differential Revision: https://reviews.llvm.org/D43677
  llvm-svn: 325972
* [MemorySSA] Fix a cache invalidation bug with removed accesses | George Burgess IV | 2018-02-23 | 1 | -1/+1
  I suspect there's a deeper issue here, but we probably shouldn't be using
  INVALID_MEMORYSSA_ID as liveOnEntry's ID anyway.
  llvm-svn: 325971
* [DebugInfo] Support DWARF v5 source code embedding extension | Scott Linder | 2018-02-23 | 17 | -87/+215
  In DWARF v5 the Line Number Program Header is extensible, allowing values
  with new content types. In this extension a content type is added,
  DW_LNCT_LLVM_source, which contains the embedded source code of the file.

  Add a new optional attribute for !DIFile IR metadata called "source" which
  contains the source text. Use this to output the source to the DWARF line
  table of code objects. Analogously extend METADATA_FILE in Bitcode and the
  .file directive in ASM to support the optional source.

  Teach llvm-dwarfdump and llvm-objdump about the new values. Update the
  output format of llvm-dwarfdump to make room for the new attribute on
  file_names entries, and support embedded sources for the -source option in
  llvm-objdump.
  Differential Revision: https://reviews.llvm.org/D42765
  llvm-svn: 325970
* [InstCombine] simplify code for fabs(X) * fabs(X) -> X * X; NFC | Sanjay Patel | 2018-02-23 | 1 | -13/+4
  llvm-svn: 325968
* Sink the verification code around the assert where it's handled and wrap in NDEBUG. | Eric Christopher | 2018-02-23 | 1 | -25/+14
  This has the advantage of making release-only builds more warning-free,
  and there's no need to make this routine a class function if it isn't using
  class members anyhow.
  llvm-svn: 325967
* [InstSimplify] sqrt(X) * sqrt(X) --> X | Sanjay Patel | 2018-02-23 | 2 | -4/+6
  This was misplaced in InstCombine. We can loosen the FMF as a follow-up
  step.
  llvm-svn: 325965
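A minimal C sketch of the fold's intent (illustrative only; the simplification
requires the appropriate fast-math flags, since for negative or NaN inputs
sqrt(x)*sqrt(x) is not x):

```c
#include <math.h>

/* With -ffast-math (reassociation allowed, NaNs/infs assumed absent), the
 * whole body can be simplified to just `return x;`. */
double sqrt_times_sqrt(double x) {
    double s = sqrt(x);
    return s * s;
}
```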
* Intrinsics calls should avoid the PLT when "RtLibUseGOT" metadata is present. | Sriraman Tallam | 2018-02-23 | 3 | -2/+22
  Differential Revision: https://reviews.llvm.org/D42216
  llvm-svn: 325962
* [InstCombine] allow fmul-sqrt folds with less than full -ffast-math | Sanjay Patel | 2018-02-23 | 2 | -18/+25
  Also, add a Builder method for intrinsics to reduce code duplication for
  clients.
  llvm-svn: 325960
* Simplify a DEBUG statement to remove a set but not used variable in release builds. | Eric Christopher | 2018-02-23 | 1 | -7/+4
  llvm-svn: 325959
* [X86] Add assembler/disassembler support for blendm with zero masking and broadcast. | Craig Topper | 2018-02-23 | 1 | -0/+8
  Fixes PR31617.
  llvm-svn: 325957
* [Power9] Add missing instructions to the Power 9 scheduler | Stefan Pintilie | 2018-02-23 | 1 | -47/+97
  This is the first in a series of patches that will define more instructions
  using InstRW so that we can move away from ItinRW and ultimately have a
  complete Power 9 scheduler.
  Differential Revision: https://reviews.llvm.org/D43635
  llvm-svn: 325956
* [Hexagon] Recognize non-immediate constants in HexagonConstPropagation | Krzysztof Parzyszek | 2018-02-23 | 2 | -6/+11
  llvm-svn: 325954
* Fixed unused variable warning. NFCI. | Simon Pilgrim | 2018-02-23 | 1 | -1/+1
  llvm-svn: 325950
* [X86] Add DAG combine to remove (and X, 1) from in front of a v1i1 scalar to vector. | Craig Topper | 2018-02-23 | 2 | -4/+24
  These can be created by type legalization promoting the inputs to select to
  match scalar boolean contents. We were trying to pattern match them away
  during isel, but it's better to just remove them from the DAG. I've cleaned
  up some patterns to not check for this 'and' anymore, but I suspect this
  has also opened up opportunities for pattern removal.
  llvm-svn: 325949
* [WebAssembly] Fix macro metaprogram to not duplicate code as much. | Benjamin Kramer | 2018-02-23 | 1 | -7/+11
  No functionality change intended.
  llvm-svn: 325947
* [X86][SSE] Generalize x > C-1 ? x+-C : 0 --> subus x, C combine for non-uniform constants | Simon Pilgrim | 2018-02-23 | 1 | -22/+24
  llvm-svn: 325944
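A scalar C sketch of the pattern being combined (illustrative only; the
constant is made up, and the actual combine operates on vector constants that
may now differ per lane):

```c
#include <stdint.h>

/* For unsigned x and a constant C, `x > C-1 ? x - C : 0` is exactly an
 * unsigned saturating subtract, which SSE can do in a single instruction
 * (e.g. PSUBUSB for bytes). */
uint8_t subus_like(uint8_t x) {
    const uint8_t C = 42;                               /* hypothetical */
    return x > (uint8_t)(C - 1) ? (uint8_t)(x - C) : 0;
}
```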
* Shrink various scheduling tables by using narrower types. | Benjamin Kramer | 2018-02-23 | 1 | -1/+1
  16 bits ought to be enough for everyone. This shrinks clang by ~1MB.
  llvm-svn: 325941
* [PATCH] [AArch64] Add new target feature to fuse conditional select | Evandro Menezes | 2018-02-23 | 3 | -22/+73
  This feature enables the fusion of the comparison and the conditional
  select instructions together.
  Differential revision: https://reviews.llvm.org/D42392
  llvm-svn: 325939
* Fix compiler warning introduced in r325931. NFC. | Geoff Berry | 2018-02-23 | 1 | -3/+2
  llvm-svn: 325938
* [X86] Custom split v32i16/v64i8 bitcasts when AVX512F is available, but BWI is not. | Craig Topper | 2018-02-23 | 1 | -1/+20
  The test changes you can see are related to the changes in
  ReplaceNodeResults, though shuffle-vs-trunc-512.ll does have a test that
  exercises the code in LowerBITCAST. It looks like the test output didn't
  change because DAG combining is able to clean up the resulting type
  legalization. Adding the custom hook just makes type legalization work less
  hard.
  Differential Revision: https://reviews.llvm.org/D43447
  llvm-svn: 325933
* [MachineOperand][Target] MachineOperand::isRenamable semantics changes | Geoff Berry | 2018-02-23 | 24 | -57/+36
  Summary:
  Add a target option AllowRegisterRenaming that is used to opt in to
  post-register-allocation renaming of registers. This is set to 0 by
  default, which causes the hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq
  fields of all opcodes to be set to 1, causing MachineOperand::isRenamable
  to always return false.

  Set the AllowRegisterRenaming flag to 1 for all in-tree targets that have
  lit tests that were affected by enabling COPY forwarding in
  MachineCopyPropagation (AArch64, AMDGPU, ARM, Hexagon, Mips, PowerPC,
  RISCV, Sparc, SystemZ and X86).

  Add some more comments describing the semantics of the
  MachineOperand::isRenamable function and how it is set and maintained.

  Change isRenamable to check the operand's opcode
  hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq bit directly instead of
  relying on it being consistently reflected in the IsRenamable bit setting.

  Clear the IsRenamable bit when changing an operand's register value.

  Remove target code that was clearing the IsRenamable bit when changing
  registers/opcodes now that this is done conservatively by default.

  Change setting of hasExtraSrcRegAllocReq in the AMDGPU target to be done in
  one place covering all opcodes that have constant pipe read limit
  restrictions.

  Reviewers: qcolombet, MatzeB
  Subscribers: aemerson, arsenm, jyknight, mcrosier, sdardis, nhaehnle,
  javed.absar, tpr, arichardson, kristof.beyls, kbarton, fedor.sergeev, asb,
  rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD,
  escha, nemanjai, llvm-commits
  Differential Revision: https://reviews.llvm.org/D43042
  llvm-svn: 325931
* [Debug] Add dbg.value intrinsics for PHIs created during LCSSA. | Matt Davis | 2018-02-23 | 2 | -4/+12
  Summary:
  This patch is an enhancement to propagate dbg.value information when Phis
  are created on behalf of LCSSA. I noticed a case where a value carried
  across a loop was reported as <optimized out>. Specifically this case:

  ```
  int bar(int x, int y) {
    return x + y;
  }

  int foo(int size) {
    int val = 0;
    for (int i = 0; i < size; ++i) {
      val = bar(val, i); // Both val and i are correct
    }
    return val; // <optimized out>
  }
  ```

  In the above case, after all of the interesting computation completes our
  value is reported as "optimized out." This change will add a dbg.value to
  correct this.

  This patch also moves the dbg.value insertion routine from LoopRotation.cpp
  into Local.cpp, so that we can share it in both places (LoopRotation and
  LCSSA).

  Reviewers: mzolotukhin, aprantl, vsk, davide
  Reviewed By: aprantl, vsk
  Subscribers: dberlin, llvm-commits
  Differential Revision: https://reviews.llvm.org/D42551
  llvm-svn: 325926
* [BPI] Detect branches in loops that make themselves not taken | John Brawn | 2018-02-23 | 1 | -14/+135
  If we have a loop like this:

    int n = 0;
    while (...) {
      if (++n >= MAX) {
        n = 0;
      }
    }

  then the body of the 'if' statement will only be executed once every MAX
  iterations. Detect this by looking for branches in loops where taking the
  branch makes the branch condition evaluate to 'not taken' in the next
  iteration of the loop, and reduce the probability of such branches.

  This slightly improves EEMBC benchmarks on cortex-m4/cortex-m33 due to
  making better choices in if-conversion, but has no effect on any other
  cpu/benchmark that I could detect.
  Differential Revision: https://reviews.llvm.org/D35804
  llvm-svn: 325925
* [InstCombine] refactor fmul with negated op folds; NFCI | Sanjay Patel | 2018-02-23 | 1 | -24/+18
  The existing code was inefficiently looking for 'nsz' variants. That's
  unnecessary because we canonicalize those to the expected form with -0.0.

  We may also want to adjust or remove the fold that sinks negation. We don't
  do that for fdiv (or integer ops?). That should be uniform? It may also
  lead to missed optimization as in PR21914:
  https://bugs.llvm.org/show_bug.cgi?id=21914
  ...or we just have to fix other passes to avoid that problem.
  llvm-svn: 325924
* [InstCombine] use FMF-copying functions to reduce code; NFCI | Sanjay Patel | 2018-02-23 | 1 | -28/+12
  llvm-svn: 325923
* [PowerPC] Code cleanup. Remove instructions that were withdrawn from Power 9. | Stefan Pintilie | 2018-02-23 | 1 | -21/+0
  The following set of instructions was originally planned to be added for
  Power 9, and so code was added to support them. However, a decision was
  made later on to withdraw support for these instructions in the hardware:

    xscmpnedp
    xvcmpnesp
    xvcmpnedp

  This patch removes support for the instructions that were not added.
  Differential Revision: https://reviews.llvm.org/D43641
  llvm-svn: 325918
* [mips] finish removal of unused fields in MipsInstructionSelector | Petar Jovanovic | 2018-02-23 | 1 | -2/+2
  r325916 missed removing the calls in the constructor.
  llvm-svn: 325917
* [mips] remove unused fields in MipsInstructionSelector | Petar Jovanovic | 2018-02-23 | 1 | -3/+0
  Unused fields cause a build break if -Werror,-Wunused-private-field is
  passed.
  llvm-svn: 325916
* Support for the mno-stack-arg-probe flag | Hans Wennborg | 2018-02-23 | 4 | -10/+25
  Adds support for this flag. There is also another piece for clang (separate
  review). More info: https://bugs.llvm.org/show_bug.cgi?id=36221

  By Ruslan Nikolaev!
  Differential Revision: https://reviews.llvm.org/D43107
  llvm-svn: 325900
* llvm-config: Add advapi32 to --system-libs on Windows (PR36372) | Hans Wennborg | 2018-02-23 | 1 | -1/+2
  llvm-svn: 325894
* [DAGCombine] Ensure that (brcond (setcc ...)) is handled in a canonical manner. | Amaury Sechet | 2018-02-23 | 2 | -75/+69
  Summary: There are transforms that change setcc into other constructs, and
  transforms that try to reconstruct a setcc from the brcond condition.
  Depending on the order in which these transforms are done, the end result
  differs. Most of the time it is preferable to get a setcc as the brcond
  argument (and this is why brcond tries to recreate the setcc in the first
  place), so we ensure this is done every time by also doing it at the setcc
  level when the only user is a brcond.
  Reviewers: spatel, hfinkel, niravd, craig.topper
  Subscribers: nhaehnle, llvm-commits
  Differential Revision: https://reviews.llvm.org/D41235
  llvm-svn: 325892
* Revert "TableGen: Fix typeIsConvertibleTo for record types"Nicolai Haehnle2018-02-232-16/+15
| | | | | | | | | | This reverts r325884. Clang's TableGen has dependencies on the exact ordering of superclasses. Revert this change fully for now to fix the build. Change-Id: Ib297f5571cc7809f00838702ad7ab53d47335b26 llvm-svn: 325891