bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[WebAssembly][NFC] Remove WebAssemblyStackifier TableGen backend	Thomas Lively	2018-10-22	5	-32/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Replace its functionality with a TableGen InstrInfo relational instruction mapping. Although arguably more complex than the TableGen backend, the relational mapping is a smaller maintenance burden than a TableGen backend. Reviewers: aardappel, aheejin, dschuff Subscribers: mgorny, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D53307 llvm-svn: 344962
*	[DWARF] Use a function-local offset for AT_call_return_pc	Vedant Kumar	2018-10-22	5	-9/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Logs provided by @stella.stamenova indicate that on Linux, lldb adds a spurious slide offset to the return PC it loads from AT_call_return_pc attributes (see the list thread: "[PATCH] D50478: Add support for artificial tail call frames"). This patch side-steps the issue by getting rid of the load address calculation in lldb's CallEdge::GetReturnPCAddress. The idea is to have the DWARF writer emit function-local offsets to the instruction after a call. I.e. return-pc = label-after-call-insn - function-entry. LLDB can simply add this offset to the base address of a function to get the return PC. Differential Revision: https://reviews.llvm.org/D53469 llvm-svn: 344960
*	[Reassociate] add 'using namespace' to reduce bloat; NFC	Sanjay Patel	2018-10-22	1	-3/+4
\| \| \| \|	llvm-svn: 344959
*	[ORC] Guard access to the MemMgrs vector in RTDyldObjectLinkingLayer.	Lang Hames	2018-10-22	1	-3/+10
\| \| \| \| \| \| \| \| \|	Otherwise we can end up with a data-race when linking concurrently. This should fix an intermittent failure in the multiple-compile-threads-basic.ll testcase. llvm-svn: 344956
*	X86: add alias for pushfw/popfw in Intel mode	Tim Northover	2018-10-22	1	-0/+4
\| \| \| \| \| \| \| \| \|	A while ago we changed pushf and popf in Intel mode to generate pushfq and popfq. Unfortunately that left us with no way to get the 16-bit encoding in Intel mode so this patch adds pushfw and popfw as aliases there. llvm-svn: 344949
*	Reapply "[MachineCopyPropagation] Reimplement CopyTracker in terms of ↵	Justin Bogner	2018-10-22	1	-58/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	register units" Recommits r342942, which was reverted in r343189, with a fix for an issue where we would propagate unsafely if we defined only the upper part of a register. Original message: Change the copy tracker to keep a single map of register units instead of 3 maps of registers. This gives a very significant compile time performance improvement to the pass. I measured a 30-40% decrease in time spent in MCP on x86 and AArch64 and much more significant improvements on out of tree targets with more registers. llvm-svn: 344942
*	[hot-cold-split] Add opt remark on success	Teresa Johnson	2018-10-22	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Emit optimization remark on successful hot cold split. Reviewers: sebpop, hiraditya Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53512 llvm-svn: 344938
*	Revert rL344931 from llvm/trunk: [X86][SSE] getTargetShuffleMaskIndices - ↵	Simon Pilgrim	2018-10-22	1	-9/+10
\| \| \| \| \| \| \| \| \| \|	allow opt-in support for whole undef shuffle mask elements We can't safely assume that certain RawMask entries are UNDEF as most variable shuffles ignore non-index bits - PSHUFB only works on i8 elts so it'd be safe to use but I'm intending to come up with an alternative approach that works for all. ........ Enable this for PSHUFB constant mask decoding and remove the ConstantPool DecodePSHUFBMask llvm-svn: 344937
*	Revert rL344933 from llvm/trunk: [X86][SSE] Tidyup DecodeVPERMILPMask ↵	Simon Pilgrim	2018-10-22	2	-5/+5
\| \| \| \| \| \| \| \| \| \|	shuffle mask decoding We can't safely assume that certain RawMask entries are UNDEF as most variable shuffles ignore non-index bits. ........ Add support for UNDEF raw mask elements and remove the ConstantPool DecodeVPERMILPMask usage in X86ISelLowering.cpp llvm-svn: 344936
*	Revert r344930 as it broke some of the bots on Windows.	Aaron Ballman	2018-10-22	2	-92/+79
\| \| \| \| \| \|	http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/739 llvm-svn: 344935
*	[X86][SSE] Tidyup DecodeVPERMILPMask shuffle mask decoding	Simon Pilgrim	2018-10-22	2	-5/+5
\| \| \| \| \| \|	Add support for UNDEF raw mask elements and remove the ConstantPool DecodeVPERMILPMask usage in X86ISelLowering.cpp llvm-svn: 344933
*	[X86][SSE] getTargetShuffleMaskIndices - allow opt-in support for whole ↵	Simon Pilgrim	2018-10-22	1	-10/+9
\| \| \| \| \| \| \| \|	undef shuffle mask elements Enable this for PSHUFB constant mask decoding and remove the ConstantPool DecodePSHUFBMask llvm-svn: 344931
*	[SourceMgr][FileCheck] Obey -color by extending WithColor	Joel E. Denny	2018-10-22	2	-79/+92
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While this change specifically targets FileCheck, it affects any tool using the same SourceMgr facilities. Previously, -color was documented in FileCheck's -help output, but -color had no effect. Now, -color obeys its documentation: it forces colors to be used in FileCheck diagnostics even when stderr is not a terminal. -color is especially helpful when combined with FileCheck's -v, which can produce a long series of diagnostics that you might wish to pipe to a pager, such as less -R. The WithColor extensions here will also help to clean up color usage in FileCheck's annotated dump of input, which is proposed in D52999. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D53419 llvm-svn: 344930
*	[X86] getTargetConstantBitsFromNode - handle extraction from larger constant ↵	Simon Pilgrim	2018-10-22	1	-2/+3
\| \| \| \| \| \| \| \|	pool entries First step towards removing X86ShuffleDecodeConstantPool usage from X86ISelLowering.cpp llvm-svn: 344924
*	Revert r344877 "[X86] Stop promoting integer loads to vXi64"	Craig Topper	2018-10-22	9	-631/+521
\| \| \| \| \| \|	Sam McCall reported miscompiles in some tensorflow code. Reverting while I try to figure out. llvm-svn: 344921
*	DAG: Change behavior of fminnum/fmaxnum nodes	Matt Arsenault	2018-10-22	17	-50/+280
\| \| \| \| \| \| \| \| \| \| \|	Introduce new versions that follow the IEEE semantics to help with legalization that may need quieted inputs. There are some regressions from inserting unnecessary canonicalizes when these are matched from fast math fcmp + select which should be fixed in a future commit. llvm-svn: 344914
*	Some cleanups to the native pdb plugin [NFC].	Zachary Turner	2018-10-22	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \|	This is mostly some cleanup done in the process of implementing some basic support for types. I tried to split up the patch a bit to get some of the NFC portion of the patch out into a separate commit, and this is the result of that. It moves some code around, deletes some spurious namespace qualifications, removes some unnecessary header includes, forward declarations, etc. llvm-svn: 344913
*	[X86][SSE] getTargetShuffleMask - pull out repeated shuffle mask element ↵	Simon Pilgrim	2018-10-22	1	-29/+22
\| \| \| \| \| \|	size. NFCI. llvm-svn: 344910
*	Revert "[PDB] Extend IPDBSession's interface to retrieve frame data"	Aleksandr Urakov	2018-10-22	6	-118/+0
\| \| \| \| \| \|	This reverts commit b5c7e2f9a4dbb34e3667c4bb4972735eadd3247a. llvm-svn: 344909
*	[X86] X86DAGToDAGISel: handle BZHI selection too, not just BEXTR.	Roman Lebedev	2018-10-22	5	-26/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: As discussed in D52304 / IRC, we now have pattern matching for 'bit extract' in two places - tablegen and `X86DAGToDAGISel`. There are 4 patterns. And we will have a problem with `x & (-1 >> (32 - y))` pattern. * If the mask is one-use, then it is always unfolded into `x << (32 - y) >> (32 - y)` first. Thus, the existing test coverage is already broken. * If it is not one-use, then it is not unfolded, and is matched as BZHI. * If it is not one-use, we will not match it as BEXTR. And if it is one-use, it will have been unfolded already. So we will either not handle that pattern for BEXTR, or not have test coverage for it. This is bad. As discussed with @craig.topper, let's unify this matching, and do everything in `X86DAGToDAGISel`. Then we will not have code duplication, and will have proper test coverage. This indeed does not affect any tests, and this is great. It means that for these two patterns, the `X86DAGToDAGISel` is identical to the tablegen version. Please review carefully, I'm not fully sure about that intrinsic change, and introduction of the new `X86ISD` opcode. Reviewers: craig.topper, RKSimon, spatel Reviewed By: craig.topper Subscribers: llvm-commits, craig.topper Differential Revision: https://reviews.llvm.org/D53164 llvm-svn: 344904
*	[X86][BMI1]: X86DAGToDAGISel: select BEXTR from x & ((1 << nbits) + (-1)) ↵	Roman Lebedev	2018-10-22	1	-3/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	pattern Summary: Trivial continuation of D52304. While this pattern is not canonical, we do select it in the BZHI case, so this should not be any different. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52348 llvm-svn: 344902
*	Test commit: change comment.	Petar Avramovic	2018-10-22	1	-1/+1
\| \| \| \|	llvm-svn: 344900
*	[llvm-dwarfdump] - Fix win10 build bot failure.	George Rimar	2018-10-22	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \|	Bot failed: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/20877/steps/test/logs/stdio This was broken after the r344895 "[llvm-dwarfdump] - Add the support of parsing .debug_loclists." because of wrong formatting specifiers used. llvm-svn: 344896
*	[llvm-dwarfdump] - Add the support of parsing .debug_loclists.	George Rimar	2018-10-22	3	-28/+108
\| \| \| \| \| \| \| \| \| \| \| \|	This teaches llvm-dwarfdump to dump the content of .debug_loclists sections. It converts the DWARFDebugLocDWO class to DWARFDebugLoclists, teaches llvm-dwarfdump about .debug_loclists section and adds the implementation for parsing the DW_LLE_offset_pair entries. Differential revision: https://reviews.llvm.org/D53364 llvm-svn: 344895
*	[PowerPC][NFC] Fix bugs in r+r to r+i conversion	Nemanja Ivanovic	2018-10-22	2	-13/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The D-Form VSX loads introduced in ISA 3.0 are not direct D-Form equivalents of the corresponding X-Forms since they only target the Altivec registers. Namely LXSSPX can load into any of the 64 VSX registers whereas LXSSP can only load into the upper 32 VSX registers. Similarly with the remaining affected instructions. There is currently no way that I can see to trigger the bug, but as we add other ways of exploiting these instructions, there may very well be instances that do. This is an NFC patch in practical terms since the changes it introduces cannot be triggered without an MIR test. Differential revision: https://reviews.llvm.org/D53323 llvm-svn: 344894
*	[CGProfile] Turn constant-size SmallVector into array	Benjamin Kramer	2018-10-22	1	-5/+4
\| \| \| \| \| \|	No functionality change. llvm-svn: 344893
*	[PDB] Extend IPDBSession's interface to retrieve frame data	Aleksandr Urakov	2018-10-22	6	-0/+118
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch just extends the `IPDBSession` interface to allow retrieving of frame data through it, and adds an implementation over DIA. It is needed for an implementation (for now with DIA) of the conversion from FPO programs to DWARF expressions mentioned in D53086. Reviewers: zturner, asmith, rnk Reviewed By: asmith Subscribers: mgorny, aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D53324 llvm-svn: 344886
*	[X86] Add patterns for vector and/or/xor/andn with other types than vXi64.	Craig Topper	2018-10-22	2	-0/+205
\| \| \| \| \| \| \| \| \| \| \| \|	This makes fast isel treat all legal vector types the same way. Previously only vXi64 was in the fast-isel tables. This unfortunately prevents matching of andn by fast-isel for these types since that requires SelectionDAG. But we already had this issue for vXi64. So at least we're consistent now. Interestingly it looks like fast-isel can't handle instructions with constant vector arguments so the not part of the andn patterns is selected with SelectionDAG. This explains why VPTERNLOG shows up in some of the tests. This is a subset of D53268. As I make progress on that, I will try to reduce the number of lines in the tablegen files. llvm-svn: 344884
*	[IAI,LV] Avoid creating a scalar epilogue due to gaps in interleave-groups when	Dorit Nuzman	2018-10-22	2	-2/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	optimizing for size LV is careful to respect -Os and not to create a scalar epilog in all cases (runtime tests, trip-counts that require a remainder loop) except for peeling due to gaps in interleave-groups. This patch fixes that; -Os will now have us invalidate such interleave-groups and vectorize without an epilog. The patch also removes a related FIXME comment that is now obsolete, and was also inaccurate: "FIXME: return None if loop requiresScalarEpilog(<MaxVF>), or look for a smaller MaxVF that does not require a scalar epilog." (requiresScalarEpilog() has nothing to do with VF). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53420 llvm-svn: 344883
*	[X86] Stop promoting integer loads to vXi64	Craig Topper	2018-10-21	9	-521/+631
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Theoretically this was done to simplify the amount of isel patterns that were needed. But it also meant a substantial number of our isel patterns have to match an explicit bitcast. By making the vXi32/vXi16/vXi8 types legal for loads, DAG combiner should be able to change the load type to remove the bitcast. I had to add some additional plain load instruction patterns and a few other special cases, but overall the isel table has reduced in size by ~12000 bytes. So it looks like this promotion was hurting us more than helping. I still have one crash in vector-trunc.ll that I'm hoping @RKSimon can help with. It seems to relate to using getTargetConstantFromNode on a load that was shrunk due to an extract_subvector combine after the constant pool entry was created. So we end up decoding more mask elements than the load size. I'm hoping this patch will simplify the number of patterns needed to remove the and/or/xor promotion. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits, RKSimon Differential Revision: https://reviews.llvm.org/D53306 llvm-svn: 344877
*	Revert r344873 "foo"	Craig Topper	2018-10-21	4	-62/+45
\| \| \| \| \| \|	Rebase gone wrong left this in my tree. llvm-svn: 344875
*	[X86] Remove SDIVREM8_SEXT_HREG/UDIVREM8_ZEXT_HREG and their associated DAG ↵	Craig Topper	2018-10-21	3	-73/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	combine and target bits support. Use a post isel peephole instead. Summary: These nodes exist to overcome an isel problem where we can generate a zero extend of an AH register followed by an extract subreg, and another zero extend. The first zero extend exists to avoid a partial register update copying the AH register into the low 8-bits. The second zero extend exists if the user wanted the remainder zero extended. To make this work we had a DAG combine to morph the DIVREM opcode to a special opcode that included the extend. But then we had to add the new node to computeKnownBits and computeNumSignBits to process the extension portion. This patch instead removes all of that and adds a late peephole to detect the two extends. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53449 llvm-svn: 344874
*	foo	Craig Topper	2018-10-21	4	-45/+62
\| \| \| \|	llvm-svn: 344873
*	[DAGCombiner] reduce insert+bitcast+extract vector ops to truncate (PR39016)	Sanjay Patel	2018-10-21	1	-4/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a late backend subset of the IR transform added with: D52439 We can confirm that the conversion to a 'trunc' is correct by running: $ opt -instcombine -data-layout="e" (assuming the IR transforms are correct; change "e" to "E" for big-endian) As discussed in PR39016: https://bugs.llvm.org/show_bug.cgi?id=39016 ...the pattern may emerge during legalization, so that's why we are waiting for an insertelement to become a scalar_to_vector in the pattern matching here. The DAG allows for fun variations that are not possible in IR. Result types for extracts and scalar_to_vector don't necessarily match input types, so that means we have to be a bit more careful in the transform (see code comments). The tests show that we don't handle cases that require a shift (as we did in the IR version). I've left that as a potential follow-up because I'm not sure if that's a real concern at this late stage. Differential Revision: https://reviews.llvm.org/D53201 llvm-svn: 344872
*	Schedule Hot Cold Splitting pass after most optimization passes	Aditya Kumar	2018-10-21	2	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In the new+old pass manager, hot cold splitting was scheduled too early. Thanks to Vedant for pointing this out. Reviewers: sebpop, vsk Reviewed By: sebpop, vsk Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D53437 llvm-svn: 344869
*	[X86][AVX] Enable lowerVectorShuffleAsLanePermuteAndPermute v16i16/v32i8 ↵	Simon Pilgrim	2018-10-21	1	-2/+12
\| \| \| \| \| \|	unary shuffle lowering llvm-svn: 344868
*	[X86] Only extract constant pool shuffle mask data with zero offsets	Simon Pilgrim	2018-10-21	2	-2/+2
\| \| \| \| \| \|	D53306 exposes an issue where we sometimes use constant pool data from bigger vectors than the target shuffle mask. This should be safe to do, but we have to be certain that we're using the bottom most part of the vector as the shuffle mask decoders have no way to peek into subvectors with non-zero offsets. llvm-svn: 344867
*	[InstCombine] use 'match' to simplify code; NFC	Sanjay Patel	2018-10-20	1	-59/+56
\| \| \| \|	llvm-svn: 344855
*	[InstCombine] make code more flexible with lambda; NFC	Sanjay Patel	2018-10-20	1	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I couldn't tell from svn history when these checks were added, but it pre-dates the split of instcombine into its own directory at rL92459. The motivation for changing the check is partly shown by the code in PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724 There are also existing regression tests for SLPVectorizer with sequences of extract+insert that are likely assumed to become shuffles by the vectorizer cost models. llvm-svn: 344854
*	[InstCombine] add explanatory comment for strange vector logic; NFC	Sanjay Patel	2018-10-20	1	-0/+16
\| \| \| \|	llvm-svn: 344852
*	Replace setFeature macro with lambda to fix MSVC "shift count negative or ↵	Simon Pilgrim	2018-10-20	1	-10/+10
\| \| \| \| \| \|	too big" warnings. NFCI. llvm-svn: 344843
*	DebugInfo: Use base address specifiers more aggressively	David Blaikie	2018-10-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Using a base address specifier even for a single-element range is a size win for object files: 7 words versus 8 words, with more significant savings if the debug info is compressed (3 words of incompressible reloc + 4 compressible words, compared to 6 words of incompressible reloc + 2 compressible words). It does trade off an executable size increase, though. llvm-svn: 344841
*	Add missed file from previous commit (r344838)	David Blaikie	2018-10-20	1	-0/+3
\| \| \| \|	llvm-svn: 344839
*	DebugInfo: Use DW_OP_addrx in DWARFv5	David Blaikie	2018-10-20	2	-4/+12
\| \| \| \| \| \|	Reuse addresses in the address pool, even in non-split cases. llvm-svn: 344838
*	DebugInfo: Implement debug_rnglists.dwo	David Blaikie	2018-10-20	2	-3/+29
\| \| \| \| \| \| \|	Save space/relocations in .o files by keeping dwo ranges in the dwo file rather than the .o file. llvm-svn: 344837
*	DebugInfo: Use address pool forms in debug_rnglists	David Blaikie	2018-10-20	8	-107/+123
\| \| \| \| \| \|	Save on relocations by reusing addresses from the address pool. llvm-svn: 344836
*	llvm-dwarfdump: Support RLE_addressx and RLE_startx_length in .debug_rnglists	David Blaikie	2018-10-20	3	-22/+82
\| \| \| \|	llvm-svn: 344835
*	DebugInfo: Use debug_addr for non-dwo addresses in DWARF 5	David Blaikie	2018-10-20	10	-36/+57
\| \| \| \| \| \| \| \|	Putting addresses in the address pool, even with non-fission, can reduce relocations - reusing the addresses from debug_info and debug_rnglists (the latter coming soon) llvm-svn: 344834
*	[X86] Add additional CPUs and features to Host.cpp and X86TargetParser.def ↵	Craig Topper	2018-10-20	1	-50/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to match compiler-rt and enable __builtin_cpu_supports/__builtin_cpu_is support in clang Summary: This matches LLVM to D53461 for compiler-rt. Reviewers: echristo, erichkeane Reviewed By: echristo Subscribers: dberris, llvm-commits Differential Revision: https://reviews.llvm.org/D53462 llvm-svn: 344831
*	[WebAssembly] Implement vector sext_inreg and tests with comparisons	Thomas Lively	2018-10-20	2	-4/+9
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Depends on D53251. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D53252 llvm-svn: 344826
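
The entries for llvm-svn 344904 and 344902 above refer to three source-level spellings of a "keep the low n bits" mask: `x & (-1 >> (32 - y))`, its unfolded form `x << (32 - y) >> (32 - y)`, and `x & ((1 << nbits) + (-1))`. The following is a minimal sketch, not LLVM's matching code, that models these expressions on 32-bit unsigned values in Python to show they compute the same result. The function names and the `MASK32` constant are hypothetical and exist only for this illustration; the shift is assumed to be a logical (unsigned) shift.

```python
MASK32 = 0xFFFFFFFF  # models a 32-bit unsigned register (assumption for this sketch)


def low_bits_shifted_mask(x, n):
    """x & (-1 >> (32 - n)), with -1 modelled as 32-bit all-ones."""
    return x & (MASK32 >> (32 - n))


def low_bits_shl_shr(x, n):
    """x << (32 - n) >> (32 - n), truncating the left shift to 32 bits."""
    return ((x << (32 - n)) & MASK32) >> (32 - n)


def low_bits_pow2_minus_one(x, n):
    """x & ((1 << n) + (-1)), truncated to 32 bits."""
    return x & (((1 << n) + (-1)) & MASK32)


if __name__ == "__main__":
    x = 0xDEADBEEF
    for n in (4, 8, 16):
        print(n,
              hex(low_bits_shifted_mask(x, n)),
              hex(low_bits_shl_shr(x, n)),
              hex(low_bits_pow2_minus_one(x, n)))
```

The sketch only compares bit counts from 1 to 31, where every shift amount in all three spellings stays below the 32-bit width; the edge cases 0 and 32 are left out because a 32-bit shift by 32 is not well defined in C-like source.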