bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[DWARF] Use a function-local offset for AT_call_return_pc	Vedant Kumar	2018-10-22	5	-9/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Logs provided by @stella.stamenova indicate that on Linux, lldb adds a spurious slide offset to the return PC it loads from AT_call_return_pc attributes (see the list thread: "[PATCH] D50478: Add support for artificial tail call frames"). This patch side-steps the issue by getting rid of the load address calculation in lldb's CallEdge::GetReturnPCAddress. The idea is to have the DWARF writer emit function-local offsets to the instruction after a call. I.e. return-pc = label-after-call-insn - function-entry. LLDB can simply add this offset to the base address of a function to get the return PC. Differential Revision: https://reviews.llvm.org/D53469 llvm-svn: 344960
*	Reapply "[MachineCopyPropagation] Reimplement CopyTracker in terms of ↵	Justin Bogner	2018-10-22	1	-58/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	register units" Recommits r342942, which was reverted in r343189, with a fix for an issue where we would propagate unsafely if we defined only the upper part of a register. Original message: Change the copy tracker to keep a single map of register units instead of 3 maps of registers. This gives a very significant compile time performance improvement to the pass. I measured a 30-40% decrease in time spent in MCP on x86 and AArch64 and much more significant improvements on out of tree targets with more registers. llvm-svn: 344942
*	DAG: Change behavior of fminnum/fmaxnum nodes	Matt Arsenault	2018-10-22	8	-3/+89
\| \| \| \| \| \| \| \| \| \| \|	Introduce new versions that follow the IEEE semantics to help with legalization that may need quieted inputs. There are some regressions from inserting unnecessary canonicalizes when these are matched from fast math fcmp + select which should be fixed in a future commit. llvm-svn: 344914
*	[DAGCombiner] reduce insert+bitcast+extract vector ops to truncate (PR39016)	Sanjay Patel	2018-10-21	1	-4/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a late backend subset of the IR transform added with: D52439 We can confirm that the conversion to a 'trunc' is correct by running: $ opt -instcombine -data-layout="e" (assuming the IR transforms are correct; change "e" to "E" for big-endian) As discussed in PR39016: https://bugs.llvm.org/show_bug.cgi?id=39016 ...the pattern may emerge during legalization, so that's we are waiting for an insertelement to become a scalar_to_vector in the pattern matching here. The DAG allows for fun variations that are not possible in IR. Result types for extracts and scalar_to_vector don't necessarily match input types, so that means we have to be a bit more careful in the transform (see code comments). The tests show that we don't handle cases that require a shift (as we did in the IR version). I've left that as a potential follow-up because I'm not sure if that's a real concern at this late stage. Differential Revision: https://reviews.llvm.org/D53201 llvm-svn: 344872
*	DebugInfo: Use base address specifiers more aggressively	David Blaikie	2018-10-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Using a base address specifier even for a single-element range is a size win for object files (7 words versus 8 words - more significant savings if the debug info is compressed (since it's 3 words of uncompressable reloc + 4 compressable words compared to 6 uncompressable reloc + 2 compressable words) - does trade off executable size increase though. llvm-svn: 344841
*	DebugInfo: Use DW_OP_addrx in DWARFv5	David Blaikie	2018-10-20	1	-4/+11
\| \| \| \| \| \|	Reuse addresses in the address pool, even in non-split cases. llvm-svn: 344838
*	DebugInfo: Implement debug_rnglists.dwo	David Blaikie	2018-10-20	2	-3/+29
\| \| \| \| \| \| \|	Save space/relocations in .o files by keeping dwo ranges in the dwo file rather than the .o file. llvm-svn: 344837
*	DebugInfo: Use address pool forms in debug_rnglists	David Blaikie	2018-10-20	8	-107/+123
\| \| \| \| \| \|	Save no relocations by reusing addresses from the address pool. llvm-svn: 344836
*	DebugInfo: Use debug_addr for non-dwo addresses in DWARF 5	David Blaikie	2018-10-20	5	-14/+22
\| \| \| \| \| \| \| \|	Putting addresses in the address pool, even with non-fission, can reduce relocations - reusing the addresses from debug_info and debug_rnglists (the latter coming soon) llvm-svn: 344834
*	[MachineCSE][GlobalISel] Making sure MachineCSE works mid-GlobalISel (again)	Roman Tereshin	2018-10-20	2	-34/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change of approach, it looks like it's a much better idea to deal with the vregs that have LLTs and reg classes both properly, than trying to avoid creating those across all GlobalISel passes and all targets. The change mostly touches MachineRegisterInfo::constrainRegClass, which is apparently only used by MachineCSE. The changes are NFC for any pipeline but one that contains MachineCSE mid-GlobalISel. NOTE on isCallerPreservedOrConstPhysReg change in MachineCSE: There is no test covering it as the only way to insert a new pass (MachineCSE) from a command line I know of is llc's -run-pass option, which only works with MIR, but MIRParser freezes reserved registers upon MachineFunctions creation, making it impossible to reproduce the state that exposes the issue. Reviwed By: aditya_nandakumar Differential Revision: https://reviews.llvm.org/D53144 llvm-svn: 344822
*	[GISel]: Allow PHIs to be DCEd	Aditya Nandakumar	2018-10-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	https://reviews.llvm.org/D53304 Currently dead phis are not cleaned up during DCE. This patch allows dead PHI and G_PHI insts to be deleted. Reviewed by: dsanders llvm-svn: 344811
*	Fix a use-after-RAUW bug in large GEP splitting	Krzysztof Pszeniczny	2018-10-19	1	-3/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Large GEP splitting, introduced in rL332015, uses a `DenseMap<AssertingVH<Value>, ...>`. This causes an assertion to fail (in debug builds) or undefined behaviour to occur (in release builds) when a value is RAUWed. This manifested itself in the 7zip benchmark from the llvm test suite built on ARM with `-fstrict-vtable-pointers` enabled while RAUWing invariant group launders and splits in CodeGenPrepare. This patch merges the large offsets of the argument and the result of an invariant.group strip/launder intrinsic before RAUWing. Reviewers: Prazek, javed.absar, haicheng, efriedma Reviewed By: Prazek, efriedma Subscribers: kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D51936 llvm-svn: 344802
*	Use llvm::{all,any,none}_of instead std::{all,any,none}_of. NFC	Fangrui Song	2018-10-19	1	-4/+3
\| \| \| \|	llvm-svn: 344774
*	[Pipeliner] copyToPhi DAG Mutation to improve scheduling.	Sumanth Gundapaneni	2018-10-18	1	-1/+95
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In a loop, create artificial dependences between the source of a COPY/REG_SEQUENCE to the use in next iteration. Eg: SRC ----Data Dep--> COPY COPY ---Anti Dep--> PHI (implies, to be used in next iteration) PHI ----Data Dep--> USE This patches creates USE ----Artificial Dep---> SRC This will effectively schedule the COPY late to eliminate additional copies. Before this patch, the schedule can be SRC, COPY, USE : The COPY is used in next iteration and it needs to be preserved. After this patch, the schedule can be USE, SRC, COPY : The COPY is used in next iteration and the live interval is reduced. Differential Revision: https://reviews.llvm.org/D53303 llvm-svn: 344748
*	Revert "[WebAssembly] LSDA info generation"	Krasimir Georgiev	2018-10-16	11	-234/+58
\| \| \| \| \| \| \| \|	This reverts commit r344575. Newly introduced test eh-lsda.ll.test fails with use-after-free under ASAN build. llvm-svn: 344639
*	[Intrinsic] Signed Saturation Addition Intrinsic	Leonard Chan	2018-10-16	9	-0/+105
\| \| \| \| \| \| \| \| \| \| \|	Add an intrinsic that takes 2 integers and perform saturation addition on them. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D53053 llvm-svn: 344629
*	[LegalizeDAG] ExpandLegalINT_TO_FP - cleanup UINT_TO_FP i64 -> f64 expansion.	Simon Pilgrim	2018-10-16	1	-20/+17
\| \| \| \| \| \| \| \|	Use SrcVT/DestVT types, correct shift type and AND instead of ZERO_EXTEND_IN_REG. Part of prep work for D52965 llvm-svn: 344602
*	[WebAssembly] LSDA info generation	Heejin Ahn	2018-10-16	11	-58/+234
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This adds support for LSDA (exception table) generation for wasm EH. Wasm EH mostly follows the structure of Itanium-style exception tables, with one exception: a call site table entry in wasm EH corresponds to not a call site but a landing pad. In wasm EH, the VM is responsible for stack unwinding. After an exception occurs and the stack is unwound, the control flow is transferred to wasm 'catch' instruction by the VM, after which the personality function is called from the compiler-generated code. (Refer to WasmEHPrepare pass for more information on this part.) This patch: - Changes wasm.landingpad.index intrinsic to take a token argument, to make this 1:1 match with a catchpad instruction - Stores landingpad index info and catch type info MachineFunction in before instruction selection - Lowers wasm.lsda intrinsic to an MCSymbol pointing to the start of an exception table - Adds WasmException class with overridden methods for table generation - Adds support for LSDA section in Wasm object writer Reviewers: dschuff, sbc100, rnk Subscribers: mgorny, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52748 llvm-svn: 344575
*	[SelectionDAG] allow FP binops in SimplifyDemandedVectorElts	Sanjay Patel	2018-10-15	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \|	This is intended to make the backend on par with functionality that was added to the IR version of SimplifyDemandedVectorElts in: rL343727 ...and the original motivation is that we need to improve demanded-vector-elements in several ways to avoid problems that would be exposed in D51553. Differential Revision: https://reviews.llvm.org/D52912 llvm-svn: 344541
*	[DAGCombiner] allow undef elts in vector fmul matching	Sanjay Patel	2018-10-15	1	-1/+1
\| \| \| \|	llvm-svn: 344534
*	[DAGCombiner] refactor folds for fadd (fmul X, -2.0), Y; NFCI	Sanjay Patel	2018-10-15	1	-16/+18
\| \| \| \| \| \|	The transform doesn't work if the vector constant has undef elements. llvm-svn: 344532
*	[DAGCombiner] allow undef elts in vector fma matching	Sanjay Patel	2018-10-15	1	-21/+22
\| \| \| \|	llvm-svn: 344528
*	[DAGCombiner] allow undef elts in vector fma matching	Sanjay Patel	2018-10-15	1	-9/+10
\| \| \| \|	llvm-svn: 344525
*	[TI removal] Make variables declared as `TerminatorInst` and initialized	Chandler Carruth	2018-10-15	5	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	by `getTerminator()` calls instead be declared as `Instruction`. This is the biggest remaining chunk of the usage of `getTerminator()` that insists on the narrow type and so is an easy batch of updates. Several files saw more extensive updates where this would cascade to requiring API updates within the file to use `Instruction` instead of `TerminatorInst`. All of these were trivial in nature (pervasively using `Instruction` instead just worked). llvm-svn: 344502
*	[TwoAddressInstructionPass] Replace subregister uses when processing tied ↵	Bjorn Pettersson	2018-10-15	1	-8/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	operands Summary: TwoAddressInstruction pass typically rewrites %1:short = foo %0.sub_lo:long as %1:short = COPY %0.sub_lo:long %1:short = foo %1:short when having tied operands. If there are extra un-tied operands that uses the same reg and subreg, such as the second and third inputs to fie here: %1:short = fie %0.sub_lo:long, %0.sub_hi:long, %0.sub_lo:long then there was a bug which replaced the register %0 also for the un-tied operand, but without changing the subregister indices. So we used to get: %1:short = COPY %0.sub_lo:long %1:short = fie %1, %1.sub_hi:short, %1.sub_lo:short With this fix we instead get: %1:short = COPY %0.sub_lo:long %1:short = fie %1, %0.sub_hi:long, %1 Reviewers: arsenm, JesperAntonsson, kparzysz, MatzeB Reviewed By: MatzeB Subscribers: bjope, kparzysz, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D36224 llvm-svn: 344492
*	[LegalizeDAG] Don't bother with final MUL+SRL stage for byte CTPOP.	Simon Pilgrim	2018-10-14	1	-3/+4
\| \| \| \| \| \| \| \|	The final stage of CTPOP expansion (v = (v * 0x01010101...) >> (Len - 8)) is completely pointless for the byte (Len = 8) case as it reduces to (v = (v * 0x01...) >> 0), but annoyingly this doesn't always get optimized away. Found while investigating generic vector CTPOP expansion (PR32655). llvm-svn: 344477
*	Pull out repeated variables from SelectionDAGLegalize::ExpandBitCount.	Simon Pilgrim	2018-10-13	1	-8/+2
\| \| \| \| \| \|	The CTPOP case has been changed from VT.getSizeInBits to VT.getScalarSizeInBits - but this fits in with future work for vector support (PR32655) and doesn't affect any current (scalar) uses. llvm-svn: 344461
*	[LegalizeTypes] Prevent an assertion from PromoteIntRes_BSWAP and ↵	Craig Topper	2018-10-13	1	-8/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PromoteIntRes_BITREVERSE if the shift amount is too large for the VT returned by getShiftAmountTy Summary: getShiftAmountTy for X86 returns MVT::i8. If a BSWAP or BITREVERSE is created that requires promotion and the difference between the original VT and the promoted VT is more than 255 then we won't able to create the constant. This patch adds a check to replace the result from getShiftAmountTy to MVT::i32 if the difference won't fit. This should get legalized later when the shift is ultimately expanded since its clearly an illegal type that we're only promoting to make it a power of 2 bit width. Alternatively we could base the decision completely on the largest shift amount the promoted VT could use. Vectors should be immune here because getShiftAmountTy always returns the incoming VT for vectors. Only the scalar shift amount can be changed by the targets. Reviewers: eli.friedman, RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53232 llvm-svn: 344460
*	[X86][SSE] Remove most of vector CTTZ custom lowering and use LegalizeDAG ↵	Simon Pilgrim	2018-10-13	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	instead. There is one remnant - AVX1 custom splitting of 256-bit vectors - which is due to a regression where the X86ISD::ANDNP is still performed as a YMM. I've also tightened the CTLZ or CTPOP lowering in SelectionDAGLegalize::ExpandBitCount to require a legal CTLZ - it doesn't affect existing users and fixes an issue with AVX512 codegen. llvm-svn: 344457
*	[X86][SSE] Begin removing vector CTTZ custom lowering and use LegalizeDAG ↵	Simon Pilgrim	2018-10-13	2	-4/+16
\| \| \| \| \| \| \| \|	instead. Adds CTTZ vector legalization support and begins the removal of the X86/SSE custom lowering. llvm-svn: 344453
*	[Intrinsic] Add llvm.minimum and llvm.maximum instrinsic functions	Thomas Lively	2018-10-13	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: These new intrinsics have the semantics of the `minimum` and `maximum` operations specified by the latest draft of IEEE 754-2018. Unlike llvm.minnum and llvm.maxnum, these new intrinsics propagate NaNs and always treat -0.0 as less than 0.0. `minimum` and `maximum` lower directly to the existing `fminnan` and `fmaxnan` ISel DAG nodes. It is safe to reuse these DAG nodes because before this patch were only emitted in situations where there were known to be no NaN arguments or where NaN propagation was correct and there were known to be no zero arguments. I know of only four backends that lower fminnan and fmaxnan: WebAssembly, ARM, AArch64, and SystemZ, and each of these lowers fminnan and fmaxnan to instructions that are compatible with the IEEE 754-2018 semantics. Reviewers: aheejin, dschuff, sunfish, javed.absar Subscribers: kristof.beyls, dexonsmith, kristina, llvm-commits Differential Revision: https://reviews.llvm.org/D52764 llvm-svn: 344437
*	[LegalizeVectorTypes] Use TLI.getVectorIdxTy instead of DAG.getIntPtrConstant.	Craig Topper	2018-10-12	1	-2/+3
\| \| \| \| \| \|	There's no guarantee that vector indices should use pointer types. So use the correct query method. llvm-svn: 344428
*	[LegalizeVectorTypes] When widening the result of a bitcast from a scalar ↵	Craig Topper	2018-10-12	1	-14/+12
\| \| \| \| \| \| \| \| \| \|	type, use a scalar_to_vector to turn the scalar into a vector intead of a build vector full of mostly undefs. This is more consistent with what we usually do and matches some code X86 custom emits in some cases that I think I can cleanup. The MIPS test change just looks to be an instruction ordering change. llvm-svn: 344422
*	Revert BTF commit series.	Eli Friedman	2018-10-12	7	-662/+3
\| \| \| \| \| \| \| \| \| \|	The initial patch was not reviewed, and does not have any tests; it should not have been merged. This reverts 344395, 344390, 344387, 344385, 344381, 344376, and 344366. llvm-svn: 344405
*	[LegalizeVectorTypes] When widening the operands to a concat_vectors, see if ↵	Craig Topper	2018-10-12	1	-5/+16
\| \| \| \| \| \| \| \| \| \| \| \|	we can use the widened operand 0 if the width matches and the other operands are undef. This saves a conversion to extracts and build_vector. We already do this when both the result and the input need to be widened to the same type. This changed the sse-intrinsics-fast-isel test because we don't lower (insert_vector_elt (scalar_to_vector X), Y, 1) well. We turn it into (vector_shuffle (scalar_to_vector X), (scalar_to_vector Y), <0, 4, 2, 3>) losing track of the fact that the upper elts could be undef. We should probably find a way to prevent the scalarization of the <2 x f32> load on these tests. llvm-svn: 344404
*	[LegalizeVectorTypes] When unrolling in WidenVecRes_Convert, make sure we ↵	Craig Topper	2018-10-12	1	-12/+6
\| \| \| \| \| \| \| \| \| \|	use the original vector element count. Not min of the widened result type and the possibly widened input type. If the input type is widened as well, but we still were forced to unroll, we shouldn't be considering the widened input element count. We should only create as many scalar operations as the original type called for. This will be important for an upcoming patch. llvm-svn: 344403
*	Replace assert() with llvm_unreachable because it's obviously a typo.	Rui Ueyama	2018-10-12	1	-1/+1
\| \| \| \|	llvm-svn: 344395
*	[codeview] Emit S_BUILDINFO and LF_BUILDINFO with cwd and source file	Reid Kleckner	2018-10-12	2	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: We can fill in the command line and compiler path later if we want. Reviewers: zturner Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D53179 llvm-svn: 344393
*	[BPF] Use cstdint {,u}int*_t instead of linux/types.h __u32 __u16 ...	Fangrui Song	2018-10-12	2	-9/+9
\| \| \| \|	llvm-svn: 344387
*	Disambiguate: s/make_unique/llvm::make_unique/. NFC	Eric Liu	2018-10-12	1	-13/+13
\| \| \| \|	llvm-svn: 344385
*	[BPF] Don't include linux/types.h and fix style	Fangrui Song	2018-10-12	3	-153/+147
\| \| \| \|	llvm-svn: 344381
*	Better support for POSIX paths in PDBs.	Zachary Turner	2018-10-12	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This a resubmission of a patch which was previously reverted due to breaking several lld tests. The issues causing those failures have been fixed, so the patch is now resubmitted. ---Original Commit Message--- While it doesn't make a ton of sense for POSIX paths to be in PDBs, it's possible to occur in real scenarios involving cross compilation. The tools need to be able to handle this, because certain types of debugging scenarios are possible without a running process and so don't necessarily require you to be on a Windows system. These include post-mortem debugging and binary forensics (e.g. using a debugger to disassemble functions and examine symbols without running the process). There's changes in clang, LLD, and lldb in this patch. After this the cross-platform disassembly and source-list tests pass on Linux. Furthermore, the behavior of LLD can now be summarized by a much simpler rule than before: Unless you specify /pdbsourcepath and /pdbaltpath, the PDB ends up with paths that are valid within the context of the machine that the link is performed on. Differential Revision: https://reviews.llvm.org/D53149 llvm-svn: 344377
*	[BPF] Some fixes after rL344366	Fangrui Song	2018-10-12	2	-0/+3
\| \| \| \| \| \| \| \| \| \|	* Move #include outside of namespaces * Add missing #include * Add out-of-line virtual destructor to BTFTypeEntry designated initializers should also be fixed llvm-svn: 344376
*	[BPF] Add BTF generation for BPF target	Yonghong Song	2018-10-12	7	-0/+662
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	BTF is the debug format for BPF, a kernel virtual machine and widely used for tracing, networking and security, etc ([1]). Currently only instruction streams are passed to kernel, the kernel verifier verifies them before execution. In order to provide better visibility of bpf programs to user space tools, some debug information, e.g., function names and debug line information are desirable for kernel so tools can get such information with better annotation for jited instructions for performance or other reasons. The dwarf is too complicated in kernel and for BPF. Hence, BTF is designed to be the debug format for BPF ([2]). Right now, pahole supports BTF for types, which are generated based on dwarf sections in the ELF file. In order to annotate performance metrics for jited bpf insns, it is necessary to pass debug line info to the kernel. Furthermore, we want to pass the actual code to the kernel because of the following reasons: . bpf program typically is small so storage overhead should be small. . in bpf land, it is totally possible that an application loads the bpf program into the kernel and then that application quits, so holding debug info by the user space application is not practical. . having source codes directly kept by kernel would ease deployment since the original source code does not need ship on every hosts and kernel-devel package does not need to be deployed even if kernel headers are used. The only reliable time to get the source code is during compilation time. This will result in both more accurate information and easier deployment as stated in the above. Another consideration is for JIT. The project like bcc use MCJIT to compile a C program into bpf insns and load them to the kernel ([3]). The generated BTF sections will be readily available for such cases as well. This patch implemented generation of BTF info in llvm compiler. The BTF related sections will be generated when both -target bpf and -g are specified. Two sections are generated: .BTF contains all the type and string information, and .BTF.ext contains the func_info and line_info. The separation is related to how two sections are used differently in bpf loader, e.g., linux libbpf ([4]). The .BTF section can be loaded into the kernel directly while .BTF.ext needs loader manipulation before loading to the kernel. The format of the each section is roughly defined in llvm:include/llvm/MC/MCBTFContext.h and from the implementation in llvm:lib/MC/MCBTFContext.cpp. A later example also shows the contents in each section. The type and func_info are gathered during CodeGen/AsmPrinter by traversing dwarf debug_info. The line_info is gathered in MCObjectStreamer before writing to the object file. After all the information is gathered, the two sections are emitted in MCObjectStreamer::finishImpl. With cmake CMAKE_BUILD_TYPE=Debug, the compiler can dump out all the tables except insn offset, which will be resolved later as relocation records. The debug type "btf" is used for BTFContext dump. Dwarf tests the debug info generation with llvm-dwarfdump to decode the binary sections and check whether the result is expected. Currently we do not have such a tool yet. We will implement btf dump functionality in bpftool ([5]) as the bpftool is considered the recommended tool for bpf introspection. The implementation for type and func_info is tested with linux kernel test cases. The line_info is visually checked with dump from linux kernel libbpf ([4]) and checked with readelf dumping section raw data. Note that the .BTF and .BTF.ext information will not be emitted to assembly code and there is no assembler support for BTF either. In the below, with a clang/llvm built with CMAKE_BUILD_TYPE=Debug, Each table contents are shown for a simple C program. -bash-4.2$ cat -n test.c 1 struct A { 2 int a; 3 char b; 4 }; 5 6 int test(struct A t) { 7 return t->a; 8 } -bash-4.2$ clang -O2 -target bpf -g -mllvm -debug-only=btf -c test.c Type Table: [1] FUNC name_off=1 info=0x0c000001 size/type=2 param_type=3 [2] INT name_off=12 info=0x01000000 size/type=4 desc=0x01000020 [3] PTR name_off=0 info=0x02000000 size/type=4 [4] STRUCT name_off=16 info=0x04000002 size/type=8 name_off=18 type=2 bit_offset=0 name_off=20 type=5 bit_offset=32 [5] INT name_off=22 info=0x01000000 size/type=1 desc=0x02000008 String Table: 0 : 1 : test 6 : .text 12 : int 16 : A 18 : a 20 : b 22 : char 27 : test.c 34 : int test(struct A t) { 58 : return t->a; FuncInfo Table: sec_name_off=6 insn_offset=<Omitted> type_id=1 LineInfo Table: sec_name_off=6 insn_offset=<Omitted> file_name_off=27 line_off=34 line_num=6 column_num=0 insn_offset=<Omitted> file_name_off=27 line_off=58 line_num=7 column_num=3 -bash-4.2$ readelf -S test.o ...... [12] .BTF PROGBITS 0000000000000000 0000028d 00000000000000c1 0000000000000000 0 0 1 [13] .BTF.ext PROGBITS 0000000000000000 0000034e 0000000000000050 0000000000000000 0 0 1 [14] .rel.BTF.ext REL 0000000000000000 00000648 0000000000000030 0000000000000010 16 13 8 ...... -bash-4.2$ The latest linux kernel ([6]) can already support .BTF with type information. The [7] has the reference implementation in linux kernel side to support .BTF.ext func_info. The .BTF.ext line_info support is not implemented yet. If you have difficulty accessing [6], you can manually do the following to access the code: git clone https://github.com/yonghong-song/bpf-next-linux.git cd bpf-next-linux git checkout btf The change will push to linux kernel soon once this patch is landed. References: [1]. https://www.kernel.org/doc/Documentation/networking/filter.txt [2]. https://lwn.net/Articles/750695/ [3]. https://github.com/iovisor/bcc [4]. https://github.com/torvalds/linux/tree/master/tools/lib/bpf [5]. https://github.com/torvalds/linux/tree/master/tools/bpf/bpftool [6]. https://github.com/torvalds/linux [7]. https://github.com/yonghong-song/bpf-next-linux/tree/btf Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Differential Revision: https://reviews.llvm.org/D52950 llvm-svn: 344366
*	[MC][ELF] fix newly added test	Nick Desaulniers	2018-10-12	1	-25/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Reland of - r344197 "[MC][ELF] compute entity size for explicit sections" - r344206 "[MC][ELF] Fix section_mergeable_size.ll" after being reverted in r344278 due to build breakages from not specifying a target triple. Move test from test/CodeGen/Generic/ to test/MC/ELF/. Add explicit target triple so we don't try to run this test on non ELF targets. Reported: https://reviews.llvm.org/D53056#1261707 Reviewers: fhahn, rnk, espindola, NoQ Reviewed By: fhahn, rnk Subscribers: NoQ, MaskRay, rengolin, emaste, arichardson, llvm-commits, pirama, srhines Differential Revision: https://reviews.llvm.org/D53146 llvm-svn: 344360
*	Pull out repeated value types. NFCI.	Simon Pilgrim	2018-10-12	1	-6/+5
\| \| \| \|	llvm-svn: 344355
*	Pull out repeated value types. NFCI.	Simon Pilgrim	2018-10-12	1	-3/+5
\| \| \| \|	llvm-svn: 344354
*	[SelectionDAG] Move VectorLegalizer::ExpandCTLZ codegen into ↵	Simon Pilgrim	2018-10-12	2	-24/+5
\| \| \| \| \| \| \| \|	SelectionDAGLegalize Generalize SelectionDAGLegalize's CTLZ expansion to handle vectors - lets VectorLegalizer::ExpandCTLZ to just pass the expansion on instead of repeating the same codegen. llvm-svn: 344349
*	[DAGCombiner] rearrange extract_element+bitcast fold; NFC	Sanjay Patel	2018-10-11	1	-6/+8
\| \| \| \| \| \| \| \| \| \|	I want to add another pattern here that includes scalar_to_vector, so this makes that patch smaller. I was hoping to remove the hasOneUse() check because it shouldn't be necessary for common codegen, but an AMDGPU test has a comment suggesting that the extra check makes things better on one of those targets. llvm-svn: 344320
*	Revert "DwarfDebug: Pick next location in case of missing location at block ↵	Matthias Braun	2018-10-11	2	-74/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	begin" It originally triggered a stepping problem in the debugger, which could be fixed by adjusting CodeGen/LexicalScopes.cpp however it seems we prefer the previous behavior anyway. See the discussion for details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181008/593833.html This reverts commit r343880. This reverts commit r343874. llvm-svn: 344318