bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[TargetLowering] expandUnalignedStore - cleanup EVT variables. NFCI.	Simon Pilgrim	2019-05-03	1	-23/+18
\| \| \| \| \| \|	Avoid duplicated EVTs and rename Store/Load VTs to avoid -Wshadow warnings. llvm-svn: 359877
*	Revert "[MIR] Add simple PRE pass to MachineCSE"	Anton Afanasyev	2019-05-03	1	-117/+9
\| \| \| \| \| \| \|	This reverts commit 9c20156de39b377190d7a91783d61877b303fe35. It breaks stage 2 of clang-ppc64be-linux-multistage. llvm-svn: 359875
*	[SelectionDAG] Use INT_MIN as (1 << 31) is UB for signed integers. NFCI.	Simon Pilgrim	2019-05-03	1	-2/+2
\| \| \| \|	llvm-svn: 359873
*	[SelectionDAG] computeKnownBits - remove some duplicate/shadow variables. NFCI.	Simon Pilgrim	2019-05-03	1	-6/+4
\| \| \| \|	llvm-svn: 359872
*	[MIR] Add simple PRE pass to MachineCSE	Anton Afanasyev	2019-05-03	1	-9/+117
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the second part of the commit fixing PR38917 (hoisting partitially redundant machine instruction). Most of PRE (partitial redundancy elimination) and CSE work is done on LLVM IR, but some of redundancy arises during DAG legalization. Machine CSE is not enough to deal with it. This simple PRE implementation works a little bit intricately: it passes before CSE, looking for partitial redundancy and transforming it to fully redundancy, anticipating that the next CSE step will eliminate this created redundancy. If CSE doesn't eliminate this, than created instruction will remain dead and eliminated later by Remove Dead Machine Instructions pass. The third part of the commit is supposed to refactor MachineCSE, to make it more clear and to merge MachinePRE with MachineCSE, so one need no rely on further Remove Dead pass to clear instrs not eliminated by CSE. First step: https://reviews.llvm.org/D54839 Fixes llvm.org/PR38917 Reviewers: RKSimon Subscribers: hfinkel, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D56772 llvm-svn: 359870
*	[IRTranslator] Use the alloc size instead of the store size when translating ↵	Quentin Colombet	2019-05-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	allocas We use to incorrectly use the store size instead of the alloc size when creating the stack slot for allocas. On aarch64 this can be demonstrated by allocating weirdly sized types. For instance, in the added test case, we use an alloca for i19. We used to allocate a slot of size 24-bit (19 rounded up to the next byte), whereas we really want to use a full 32-bit slot for this type. llvm-svn: 359856
*	[AArch64][Windows] Compute function length correctly in unwind tables.	Eli Friedman	2019-05-03	2	-3/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The primary fix here is to WinException.cpp: we need to exclude jump tables when computing the length of a function, or else we fail to correctly compute the length. (We can only compute the number of bytes consumed by certain assembler directives after the entire file is parsed. ".p2align" is one of those directives, and is used by jump table generation.) The secondary fix, to MCWin64EH, is to make sure we don't silently miscompile if we hit a similar situation in the future. It's possible we could extend ARM64EmitUnwindInfo so it allows function bodies that contain assembler directives, but that's a lot more complicated; see the FIXME in MCWin64EH.cpp. Fixes https://bugs.llvm.org/show_bug.cgi?id=41581 . Differential Revision: https://reviews.llvm.org/D61095 llvm-svn: 359849
*	[SelectionDAG] Add asserts to verify the vectorness of input and output ↵	Craig Topper	2019-05-02	1	-0/+12
\| \| \| \| \| \| \| \| \| \|	types of TRUNCATE/ZERO_EXTEND/ANY_EXTEND/SIGN_EXTEND agree As a result of the underlying cause of PR41678 we created an ANY_EXTEND node with a scalar result type and v1i1 input type. Ideally we would have asserted for this instead of letting it go through to instruction selection and generate bad machine IR Differential Revision: https://reviews.llvm.org/D61463 llvm-svn: 359836
*	[DAGCombiner] try repeated fdiv divisor transform before building estimate ↵	Sanjay Patel	2019-05-02	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(2nd try) The original patch was committed at rL359398 and reverted at rL359695 because of infinite looping. This includes a fix to check for a vector splat of "1.0" to avoid the infinite loop. Original commit message: This was originally part of D61028, but it's an independent diff. If we try the repeated divisor reciprocal transform before producing an estimate sequence, then we have an opportunity to use scalar fdiv. On x86, the trade-off is 1 divss vs. 5 vector FP ops in the default estimate sequence. On recent chips (Skylake, Ryzen), the full-precision division is only 3 cycle throughput, so that's probably the better perf default option and avoids problems from x86's inaccurate estimates. The last 2 tests show that users still have the option to override the defaults by using the function attributes for reciprocal estimates, but those patterns are potentially made faster by converting the vector ops (including ymm ops) to scalar math. Differential Revision: https://reviews.llvm.org/D61149 llvm-svn: 359793
*	[SelectionDAG] remove constant folding limitations based on FP exceptions	Sanjay Patel	2019-05-02	2	-27/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't have FP exception limits in the IR constant folder for the binops (apart from strict ops), so it does not make sense to have them here in the DAG either. Nothing else in the backend tries to preserve exceptions (again outside of strict ops), so I don't see how this could have ever worked for real code that cares about FP exceptions. There are still cases (examples: unary opcodes in SDAG, FMA in IR) where we are trying (at least partially) to preserve exceptions without even asking if the target supports FP exceptions. Those should be corrected in subsequent patches. Real support for FP exceptions requires several changes to handle the constrained/strict FP ops. Differential Revision: https://reviews.llvm.org/D61331 llvm-svn: 359791
*	Revert "[DAGCombiner] try repeated fdiv divisor transform before building ↵	Sanjay Patel	2019-05-01	1	-3/+3
\| \| \| \| \| \| \| \| \|	estimate" This reverts commit fb9a5307a94e6f1f850e4d89f79103b123f16279 (rL359398) because it can cause an infinite loop due to opposing combines. llvm-svn: 359695
*	DAG: allow DAG pointer size different from memory representation.	Tim Northover	2019-05-01	4	-47/+134
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In preparation for supporting ILP32 on AArch64, this modifies the SelectionDAG builder code so that pointers are allowed to have a larger type when "live" in the DAG compared to memory. Pointers get zero-extended whenever they are loaded, and truncated prior to stores. In addition, a few not quite so obvious locations need updating: * A GEP that has not been marked inbounds needs to enforce the IR-documented 2s-complement wrapping at the memory pointer size. Inbounds GEPs are undefined if they overflow the address space, so no additional operations are needed. * Signed comparisons would give incorrect results if performed on the zero-extended values. This shouldn't affect CodeGen for now, but will become active when the AArch64 ILP32 support is committed. llvm-svn: 359676
*	[SelectionDAG] remove div-by-zero constant folding restriction	Sanjay Patel	2019-04-30	1	-7/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't have this restriction in IR, so it should not be here either simply out of consistency. Code that wants to handle FP exceptions is expected to use the 'strict' variants of these nodes. We don't get the frem case because frem by 0.0 produces NaN (invalid), and that's the remaining check here (so the removed check for frem was dead code AFAIK). This is the only place in SDAG that uses "HasFPExceptions", so I think we should remove that entirely as a follow-up patch. llvm-svn: 359566
*	[TargetLowering] findOptimalMemOpLowering. NFCI.	Sjoerd Meijer	2019-04-30	2	-123/+119
\| \| \| \| \| \| \| \| \| \|	This was a local static funtion in SelectionDAG, which I've promoted to TargetLowering so that I can reuse it to estimate the cost of a memory operation in D59787. Differential Revision: https://reviews.llvm.org/D59766 llvm-svn: 359543
*	[AsmPrinter] Make AsmPrinter::HandlerInfo::Handler a unique_ptr	Fangrui Song	2019-04-30	1	-13/+13
\| \| \| \| \| \| \|	Handlers.clear() in AsmPrinter::doFinalization() will destroy these handlers. A unique_ptr makes the ownership clearer. llvm-svn: 359541
*	[TargetLowering] Change getOptimalMemOpType to take a function attribute list	Sjoerd Meijer	2019-04-30	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \|	The MachineFunction wasn't used in getOptimalMemOpType, but more importantly, this allows reuse of findOptimalMemOpLowering that is calling getOptimalMemOpType. This is the groundwork for the changes in D59766 and D59787, that allows implementation of TTI::getMemcpyCost. Differential Revision: https://reviews.llvm.org/D59785 llvm-svn: 359537
*	[DebugInfo] DW_OP_deref_size in PrologEpilogInserter.	Markus Lavin	2019-04-30	6	-3/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The PrologEpilogInserter need to insert a DW_OP_deref_size before prepending a memory location expression to an already implicit expression to avoid having the existing expression act on the memory address instead of the value behind it. The reason for using DW_OP_deref_size and not plain DW_OP_deref is that big-endian targets need to read the right size as simply truncating a larger read would yield the wrong result (LSB bytes are not at the lower address). This re-commit fixes issues reported in the first one. Namely deref was inserted under wrong conditions and additionally the deref_size argument was incorrectly encoded. Differential Revision: https://reviews.llvm.org/D59687 llvm-svn: 359535
*	[DAGCombiner] Do not generate ISD::ADDE node if adde is not legal for the ↵	Zi Xuan Wu	2019-04-30	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	target when combine ISD::TRUNC node Do not combine (trunc adde(X, Y, Carry)) into (adde trunc(X), trunc(Y), Carry), if adde is not legal for the target. Even it's at type-legalize phase. Because adde is special and will not be legalized at operation-legalize phase later. This fixes: PR40922 https://bugs.llvm.org/show_bug.cgi?id=40922 Differential Revision: https://reviews.llvm.org//D60854 llvm-svn: 359532
*	computePolynomialFromPointer - add missing early-out return for non-pointer ↵	Simon Pilgrim	2019-04-29	1	-0/+1
\| \| \| \| \| \| \| \|	types. Reported in https://www.viva64.com/en/b/0629/ llvm-svn: 359486
*	[globalisel] Improve Legalizer debug output	Daniel Sanders	2019-04-29	2	-6/+62
\| \| \| \| \| \| \| \| \| \|	* LegalizeAction should be printed by name rather than number * Newly created instructions are incomplete at the point the observer first sees them. They are therefore recorded in a small vector and printed just before the legalizer moves on to another instruction. By this point, the instruction must be complete. llvm-svn: 359481
*	[DAG] Refactor DAGCombiner::ReassociateOps	Bjorn Pettersson	2019-04-29	1	-45/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Extract the logic for doing reassociations from DAGCombiner::reassociateOps into a helper function DAGCombiner::reassociateOpsCommutative, and use that helper to trigger reassociation on the original operand order, or the commuted operand order. Codegen is not identical since the operand order will be different when doing the reassociations for the commuted case. That causes some unfortunate churn in some test cases. Apart from that this should be NFC. Reviewers: spatel, craig.topper, tstellar Reviewed By: spatel Subscribers: dmgreen, dschuff, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61199 llvm-svn: 359476
*	[DebugInfo] Terminate more location-list ranges at the end of blocks	Jeremy Morse	2019-04-29	2	-20/+82
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes PR40795, where constant-valued variable locations can "leak" into blocks placed at higher addresses. The root of this is that DbgEntityHistoryCalculator terminates all register variable locations at the end of each block, but not constant-value variable locations. Fixing this requires constant-valued DBG_VALUE instructions to be broadcast into all blocks where the variable location remains valid, as documented in the LiveDebugValues section of SourceLevelDebugging.rst, and correct termination in DbgEntityHistoryCalculator. Differential Revision: https://reviews.llvm.org/D59431 llvm-svn: 359426
*	[DAGCombiner] try repeated fdiv divisor transform before building estimate	Sanjay Patel	2019-04-28	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was originally part of D61028, but it's an independent diff. If we try the repeated divisor reciprocal transform before producing an estimate sequence, then we have an opportunity to use scalar fdiv. On x86, the trade-off is 1 divss vs. 5 vector FP ops in the default estimate sequence. On recent chips (Skylake, Ryzen), the full-precision division is only 3 cycle throughput, so that's probably the better perf default option and avoids problems from x86's inaccurate estimates. The last 2 tests show that users still have the option to override the defaults by using the function attributes for reciprocal estimates, but those patterns are potentially made faster by converting the vector ops (including ymm ops) to scalar math. Differential Revision: https://reviews.llvm.org/D61149 llvm-svn: 359398
*	[AsmPrinter] refactor to support %c w/ GlobalAddress'	Nick Desaulniers	2019-04-26	1	-4/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Targets like ARM, MSP430, PPC, and SystemZ have complex behavior when printing the address of a MachineOperand::MO_GlobalAddress. Move that handling into a new overriden method in each base class. A virtual method was added to the base class for handling the generic case. Refactors a few subclasses to support the target independent %a, %c, and %n. The patch also contains small cleanups for AVRAsmPrinter and SystemZAsmPrinter. It seems that NVPTXTargetLowering is possibly missing some logic to transform GlobalAddressSDNodes for TargetLowering::LowerAsmOperandForConstraint to handle with "i" extended inline assembly asm constraints. Fixes: - https://bugs.llvm.org/show_bug.cgi?id=41402 - https://github.com/ClangBuiltLinux/linux/issues/449 Reviewers: echristo, void Reviewed By: void Subscribers: void, craig.topper, jholewinski, dschuff, jyknight, dylanmckay, sdardis, nemanjai, javed.absar, sbc100, jgravelle-google, eraman, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, jrtc27, atanasyan, jsji, llvm-commits, kees, tpimh, nathanchance, peter.smith, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60887 llvm-svn: 359337
*	[DAGCombine] Cleanup visitEXTRACT_SUBVECTOR. NFCI.	Simon Pilgrim	2019-04-26	1	-10/+11
\| \| \| \| \| \|	Use ArrayRef::slice, reduce some rather awkward long lines for legibility and run clang-format. llvm-svn: 359326
*	[X86][SSE] Disable shouldFoldConstantShiftPairToMask for btver1/btver2 ↵	Simon Pilgrim	2019-04-26	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	targets (PR40758) As detailed on PR40758, Bobcat/Jaguar can perform vector immediate shifts on the same pipes as vector ANDs with the same latency - so it doesn't make sense to replace a shl+lshr with a shift+and pair as it requires an additional mask (with the extra constant pool, loading and register pressure costs). Differential Revision: https://reviews.llvm.org/D61068 llvm-svn: 359293
*	[GlobalISel] Fix inserting copies in the right position for reg definitions	Marcello Maggioni	2019-04-26	2	-12/+38
\| \| \| \| \| \| \| \| \| \| \| \| \|	When constrainRegClass is called if the constraining happens on a use the COPY needs to be inserted before the instruction that contains the MachineOperand, but if we are constraining a definition it actually needs to be added after the instruction. In addition, the COPY needs to have its operands flipped (in the use case we are copying from the old unconstrained register to the new constrained register, while in the definition case we are copying from the new constrained register that the instruction defines to the old unconstrained register). llvm-svn: 359282
*	[SelectionDAG][X86] Use stack load/store in PromoteIntRes_BITCAST when the ↵	Craig Topper	2019-04-25	1	-15/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	input needs to be be split and the output type is a vector. We had special case handling here, but it uses a scalar any_extend for the promotion then bitcasts to the final type. This won't split up the input data into multiple promoted elements like we need. This patch falls back to doing the conversion through memory. Fixes PR41594 which I believe was reflected in the bitcast-vector-bool.ll changes. The changes to vector-half-conversions.ll are fixing a previously unknown miscompile from this issue. Differential Revision: https://reviews.llvm.org/D61114 llvm-svn: 359219
*	[GlobalISel][AArch64] Legalize G_FNEARBYINT	Jessica Paquette	2019-04-25	1	-0/+2
\| \| \| \| \| \| \| \| \|	Add legalizer support for G_FNEARBYINT. It's the same as G_FCEIL etc. Since the importer allows us to automatically select this after legalization, also add tests for selection etc. Also update arm64-vfloatintrinsics.ll. llvm-svn: 359204
*	[GlobalISel] Add IRTranslator support for G_FNEARBYINT	Jessica Paquette	2019-04-25	1	-0/+2
\| \| \| \| \| \| \| \| \|	Translate llvm.nearbyint into G_FNEARBYINT as a simple intrinsic. Update arm64-irtranslator.ll. Differential Revision: https://reviews.llvm.org/D60922 llvm-svn: 359203
*	Recommitting r358783 and r358786 "[MS] Emit S_HEAPALLOCSITE debug info" with ↵	Amy Huang	2019-04-24	5	-0/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fixes for buildbot error (undefined assembler label). Summary: This emits labels around heapallocsite calls and S_HEAPALLOCSITE debug info in codeview. Currently only changes FastISel, so emitting labels still needs to be implemented in SelectionDAG. Reviewers: rnk Subscribers: aprantl, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D61083 llvm-svn: 359149
*	[DAGCombiner] scale repeated FP divisor by splat factor	Sanjay Patel	2019-04-24	1	-3/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we have a vector FP division with a splatted divisor, use the existing transform that converts 'x/y' into 'x * (1.0/y)' to allow more conversions. This can then potentially be converted into a scalar FP division by existing combines (rL358984) as seen in the tests here. That can be a potentially big perf difference if scalar fdiv has better timing (including avoiding possible frequency throttling for vector ops). Differential Revision: https://reviews.llvm.org/D61028 llvm-svn: 359147
*	DebugInfo: Emit only declarations (not whole definitions) of non-unit user ↵	David Blaikie	2019-04-24	6	-13/+63
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	defined types into type units While this doesn't come up in reasonable cases currently (the only user defined types not in type units are ones without linkage - which makes for near-ODR violations, because it'd be a type with linkage referencing a type without linkage - such a type can't be validly defined in more than one TU, so arguably it shouldn't be in a type unit to begin with - but it's a convenient way to demonstrate an issue that will become more revalent with homed modular debug info type definitions - which also don't need to be in type units but more legitimately so). Precursor to the Clang change to de-type-unit (by omitting the 'identifier') types homed due to strong linkage vtables. (making that change without this one would lead to major type duplication in type units) llvm-svn: 359122
*	Add "const" in GetUnderlyingObjects. NFC	Bjorn Pettersson	2019-04-24	3	-14/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Both the input Value pointer and the returned Value pointers in GetUnderlyingObjects are now declared as const. It turned out that all current (in-tree) uses of GetUnderlyingObjects were trivial to update, being satisfied with have those Value pointers declared as const. Actually, in the past several of the users had to use const_cast, just because of ValueTracking not providing a version of GetUnderlyingObjects with "const" Value pointers. With this patch we get rid of those const casts. Reviewers: hfinkel, materi, jkorous Reviewed By: jkorous Subscribers: dexonsmith, jkorous, jholewinski, sdardis, eraman, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61038 llvm-svn: 359072
*	[Remarks] Add string deduplication using a string table	Francis Visoiu Mistrih	2019-04-24	2	-1/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add support for uniquing strings in the remark streamer and emitting the string table in the remarks section. * Add parsing support for the string table in the RemarkParser. From this remark: ``` --- !Missed Pass: inline Name: NoDefinition DebugLoc: { File: 'test-suite/SingleSource/UnitTests/2002-04-17-PrintfChar.c', Line: 7, Column: 3 } Function: printArgsNoRet Args: - Callee: printf - String: ' will not be inlined into ' - Caller: printArgsNoRet DebugLoc: { File: 'test-suite/SingleSource/UnitTests/2002-04-17-PrintfChar.c', Line: 6, Column: 0 } - String: ' because its definition is unavailable' ... ``` to: ``` --- !Missed Pass: 0 Name: 1 DebugLoc: { File: 3, Line: 7, Column: 3 } Function: 2 Args: - Callee: 4 - String: 5 - Caller: 2 DebugLoc: { File: 3, Line: 6, Column: 0 } - String: 6 ... ``` And the string table in the .remarks/__remarks section containing: ``` inline\0NoDefinition\0printArgsNoRet\0 test-suite/SingleSource/UnitTests/2002-04-17-PrintfChar.c\0printf\0 will not be inlined into \0 because its definition is unavailable\0 ``` This is mostly supposed to be used for testing purposes, but it gives us a 2x reduction in the remark size, and is an incremental change for the updates to the remarks file format. Differential Revision: https://reviews.llvm.org/D60227 llvm-svn: 359050
*	[CGP] Look through bitcasts when duplicating returns for tail calls	Francis Visoiu Mistrih	2019-04-23	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The simple case of: ``` int callee(); void caller(void *a) { if (a == NULL) return callee(); return a; } ``` would generate a regular call instead of a tail call because we don't look through the bitcast of the call to `callee` when duplicating the return blocks. Differential Revision: https://reviews.llvm.org/D60837 llvm-svn: 359041
*	Revert "[MS] Emit S_HEAPALLOCSITE debug info" because of ToTWin64(db)	Amy Huang	2019-04-23	4	-36/+0
\| \| \| \| \| \| \| \| \|	buildbot failure. This reverts commit d07d6d617713bececf57f3547434dd52f0f13f9e and c774f687b6880484a126ed3e3d737e74c926f0ae. llvm-svn: 359034
*	[AArch64][GlobalISel] Legalize G_INTRINSIC_ROUND	Jessica Paquette	2019-04-23	1	-0/+1
\| \| \| \| \| \| \|	Add it to the same rule as G_FCEIL etc. Add a legalizer test, and add a missing switch case to AArch64LegalizerInfo.cpp. llvm-svn: 359033
*	Reapply: "DebugInfo: Emit only one kind of accelerated access/name table""	David Blaikie	2019-04-23	3	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Originally committed in r358931 Reverted in r358997 Seems this change made Apple accelerator tables miss names (because names started respecting the CU NameTableKind GNU & assuming that shouldn't produce accelerated names too), which is never correct (apple accelerator tables don't have separators or CU lists - if present, they must describe all names in all CUs). Original Description: Currently to opt in to debug_names in DWARFv5, the IR must contain 'nameTableKind: Default' which also enables debug_pubnames. Instead, only allow one of {debug_names, apple_names, debug_pubnames, debug_gnu_pubnames}. nameTableKind: Default gives debug_names in DWARFv5 and greater, debug_pubnames in v4 and earlier - and apple_names when tuning for lldb on MachO. nameTableKind: GNU always gives gnu_pubnames llvm-svn: 359026
*	[AArch64][GlobalISel] Legalize G_INTRINSIC_TRUNC	Jessica Paquette	2019-04-23	1	-0/+1
\| \| \| \| \| \| \| \| \|	Same patch as G_FCEIL etc. Add the missing switch case in widenScalar, add G_INTRINSIC_TRUNC to the correct rule in AArch64LegalizerInfo.cpp, and add a test. llvm-svn: 359021
*	Revert "DebugInfo: Emit only one kind of accelerated access/name table"	David Blaikie	2019-04-23	3	-8/+3
\| \| \| \| \| \| \| \|	Regresses some apple_names situations - still investigating. This reverts commit r358931. llvm-svn: 358997
*	Use llvm::stable_sort	Fangrui Song	2019-04-23	12	-62/+56
\| \| \| \| \| \|	While touching the code, simplify if feasible. llvm-svn: 358996
*	[DAGCombiner] generalize binop-of-splats scalarization	Sanjay Patel	2019-04-23	1	-46/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we only match build vectors, we can miss some patterns that use shuffles as seen in the affected tests. Note that the underlying calls within getSplatSourceVector() have the potential for compile-time explosion because of exponential recursion looking through binop opcodes, but currently the list of supported opcodes is very limited. Both of those problems should be addressed in follow-up patches. llvm-svn: 358984
*	[DAGCombiner] Combine OR as ADD when no common bits are set	Bjorn Pettersson	2019-04-23	1	-16/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The DAGCombiner is rewriting (canonicalizing) an ISD::ADD with no common bits set in the operands as an ISD::OR node. This could sometimes result in "missing out" on some combines that normally are performed for ADD. To be more specific this could happen if we already have rewritten an ADD into OR, and later (after legalizations or combines) we expose patterns that could have been optimized if we had seen the OR as an ADD (e.g. reassociations based on ADD). To make the DAG combiner less sensitive to if ADD or OR is used for these "no common bits set" ADD/OR operations we now apply most of the ADD combines also to an OR operation, when value tracking indicates that the operands have no common bits set. Reviewers: spatel, RKSimon, craig.topper, kparzysz Reviewed By: spatel Subscribers: arsenm, rampitec, lebedev.ri, jvesely, nhaehnle, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59758 llvm-svn: 358965
*	Revert "Use const DebugLoc&"	Chandler Carruth	2019-04-23	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts r358910 (git commit 2b744665308fc8d30a3baecb4947f2bd81aa7d30) While this patch seems trivial and safe and correct, it is not. The copies are actually load bearing copies. You can observe this with MSan or other ways of checking for use-after-destroy, but otherwise this may result in ... difficult to debug inexplicable behavior. I suspect the issue is that the debug location is used after the original reference to it is removed. The metadata backing it gets destroyed as its last references goes away, and then we reference it later through these const references. llvm-svn: 358940
*	DebugInfo: Emit only one kind of accelerated access/name table	David Blaikie	2019-04-22	3	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently to opt in to debug_names in DWARFv5, the IR must contain 'nameTableKind: Default' which also enables debug_pubnames. Instead, only allow one of {debug_names, apple_names, debug_pubnames, debug_gnu_pubnames}. nameTableKind: Default gives debug_names in DWARFv5 and greater, debug_pubnames in v4 and earlier - and apple_names when tuning for lldb on MachO. nameTableKind: GNU always gives gnu_pubnames llvm-svn: 358931
*	[SelectionDAG] move splat util functions up from x86 lowering	Sanjay Patel	2019-04-22	1	-0/+52
\| \| \| \| \| \| \| \| \| \|	This was supposed to be NFC, but the change in SDLoc definitions causes instruction scheduling changes. There's nothing x86-specific in this code, and it can likely be used from DAGCombiner's simplifyVBinOp(). llvm-svn: 358930
*	Use const DebugLoc&	Matt Arsenault	2019-04-22	1	-2/+2
\| \| \| \|	llvm-svn: 358910
*	GlobalISel: Legalize scalar G_EXTRACT sources	Matt Arsenault	2019-04-22	1	-0/+7
\| \| \| \|	llvm-svn: 358892
*	[TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits bitcast handling	Simon Pilgrim	2019-04-22	1	-1/+25
\| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for BigBitWidth -> SmallBitWidth bitcasts, splitting the DemandedBits/Elts accordingly. The AMDGPU backend needed an extra (srl (and x, c1 << c2), c2) -> (and (srl(x, c2), c1) combine to encourage BFE creation, I investigated putting this in DAGCombine but it caused a lot of noise on other targets - some improvements, some regressions. The X86 changes are all definite wins. Differential Revision: https://reviews.llvm.org/D60462 llvm-svn: 358887