bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Move if() to newline to stop ambiguity over whether it should be else if. NFCI.	Simon Pilgrim	2019-04-29	1	-1/+2
	Reported in https://www.viva64.com/en/b/0629/ llvm-svn: 359472
*	Fix operator precedence warning. NFCI.	Simon Pilgrim	2019-04-29	1	-1/+2
	Reported in https://www.viva64.com/en/b/0629/ llvm-svn: 359469
*	Remove superfluous break from switch statement. NFCI.	Simon Pilgrim	2019-04-29	1	-1/+0
	Reported in https://www.viva64.com/en/b/0629/ llvm-svn: 359467
*	[BlockExtractor] Expose a constructor for the group extraction	Quentin Colombet	2019-04-29	1	-3/+29
	NFC Differential Revision: https://reviews.llvm.org/D60971 llvm-svn: 359463
*	[BlockExtractor] Change the basic block separator from ',' to ';'	Quentin Colombet	2019-04-29	1	-1/+1
	This change aims to make the file format compatible with the way LLVM handles command line options. Differential Revision: https://reviews.llvm.org/D60970 llvm-svn: 359462
*	[X86] Remove duplicate string comparison	Simon Pilgrim	2019-04-29	1	-1/+0
	Fix typo introduced in rL332824 where we simplified the exact string matches for "avx512.mask.permvar.sf.256" and "avx512.mask.permvar.si.256" to a string startswith test for "avx512.mask.permvar." llvm-svn: 359460
*	[AArch64][SVE] Asm: add aliases for unpredicated bitwise logical instructions	Cullen Rhodes	2019-04-29	2	-4/+14
	This patch adds aliases for element sizes .B/.H/.S to the AND/ORR/EOR/BIC bitwise logical instructions. The assembler now accepts these instructions with all element sizes up to 64-bit (.D). The preferred disassembly is .D. llvm-svn: 359457
*	FileCheck [2/12]: Stricter parsing of -D option	Thomas Preud'homme	2019-04-29	1	-43/+112
	Summary: This patch is part of a patch series to add support for FileCheck numeric expressions. This specific patch gives earlier and better diagnostics for the -D option. Prior to this change, parsing of the -D option was very loose: it assumed that there was an equal sign (which to be fair is now checked by the FileCheck executable) and that the part on the left of the equal sign was a valid variable name. This commit adds logic to ensure that this is the case and gives a diagnostic when it is not, making it clear that the issue came from a command-line option error. This is achieved by moving the variable parsing code into a new shared function ParseVariable. Copyright: - Linaro (changes up to diff 183612 of revision D55940) - GraphCore (changes in later versions of revision D55940 and in new revision created off D55940) Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield Tags: #llvm Differential Revision: https://reviews.llvm.org/D60382 llvm-svn: 359447
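The stricter -D check described above can be sketched roughly as follows. This is not the FileCheck implementation: `isValidVarName`, `parseDefine`, the error strings, and the assumed variable-name grammar (`[A-Za-z_][A-Za-z0-9_]*`) are all hypothetical names and choices made for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Assumed grammar for a variable name: [A-Za-z_][A-Za-z0-9_]*
inline bool isValidVarName(const std::string &s) {
  if (s.empty())
    return false;
  const unsigned char first = static_cast<unsigned char>(s[0]);
  if (!(std::isalpha(first) || first == '_'))
    return false;
  for (char ch : s) {
    const unsigned char c = static_cast<unsigned char>(ch);
    if (!(std::isalnum(c) || c == '_'))
      return false;
  }
  return true;
}

// Splits "NAME=VALUE" (the text following -D). Returns false and fills
// `error` when the '=' is missing or NAME is not a valid variable name,
// so the failure can be reported as a command-line option error.
inline bool parseDefine(const std::string &arg, std::string &name,
                        std::string &value, std::string &error) {
  const std::string::size_type eq = arg.find('=');
  if (eq == std::string::npos) {
    error = "missing '=' in -D option: " + arg;  // hypothetical message
    return false;
  }
  const std::string candidate = arg.substr(0, eq);
  if (!isValidVarName(candidate)) {
    error = "invalid variable name in -D option: '" + candidate + "'";
    return false;
  }
  name = candidate;
  value = arg.substr(eq + 1);
  error.clear();
  return true;
}
```

Validating the name at option-parsing time is what lets the diagnostic point at the command line rather than at a later pattern-matching failure.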
*	[LoopSimplifyCFG] Suppress expensive DomTree verification	Yevgeny Rouban	2019-04-29	1	-1/+7
	This patch lowers the verification level for builds with inexpensive checks. Differential Revision: https://reviews.llvm.org/D61055 llvm-svn: 359446
*	[ARM] Add bitcast/extract_subvec. of fp16 vectors	Diogo N. Sampaio	2019-04-29	1	-91/+144
	Summary: This patch adds some basic operations for fp16 vectors, such as bitcast from fp16 to i16, required to perform extract_subvector (also added here) and extract_element. Reviewers: SjoerdMeijer, DavidSpickett, t.p.northover, ostannard Reviewed By: ostannard Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60618 llvm-svn: 359433
*	[ARM] Add v4f16 and v8f16 types to the CallingConv	Diogo N. Sampaio	2019-04-29	1	-18/+18
	Summary: The Procedure Call Standard for the Arm Architecture states that float16x4_t and float16x8_t behave just as uint16x4_t and uint16x8_t for argument passing. This patch adds the fp16 vectors to the ARMCallingConv.td file. Reviewers: miyuki, ostannard Reviewed By: ostannard Subscribers: ostannard, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60720 llvm-svn: 359431
*	Try to use /proc on FreeBSD for getExecutablePath	David Chisnall	2019-04-29	1	-1/+14
	Currently, clang's libTooling passes this function a fake argv0, which means that no libTooling tools can find the standard headers on FreeBSD. With this change, these will now work on any FreeBSD systems that have procfs mounted. This isn't the right fix for the libTooling issue, but it does bring the FreeBSD implementation of getExecutablePath closer to the Linux and macOS implementations. llvm-svn: 359427
*	[DebugInfo] Terminate more location-list ranges at the end of blocks	Jeremy Morse	2019-04-29	2	-20/+82
	This patch fixes PR40795, where constant-valued variable locations can "leak" into blocks placed at higher addresses. The root of this is that DbgEntityHistoryCalculator terminates all register variable locations at the end of each block, but not constant-value variable locations. Fixing this requires constant-valued DBG_VALUE instructions to be broadcast into all blocks where the variable location remains valid, as documented in the LiveDebugValues section of SourceLevelDebugging.rst, and correct termination in DbgEntityHistoryCalculator. Differential Revision: https://reviews.llvm.org/D59431 llvm-svn: 359426
*	[DWARF] Fix dump of local/foreign TU lists in .debug_names	Fangrui Song	2019-04-29	1	-2/+3
	Differential Revision: https://reviews.llvm.org/D61241 llvm-svn: 359425
*	[DWARF] Delete a redundant check in getFileNameByIndex()	Fangrui Song	2019-04-29	1	-2/+1
	llvm-svn: 359422
*	[X86] Remove some intel syntax aliases on (v)cvtpd2(u)dq, (v)cvtpd2ps, ↵	Craig Topper	2019-04-29	2	-42/+160
	(v)cvt(u)qq2ps. Add 'x' and 'y' suffix aliases to masked version of the same in att syntax. The 128/256 bit versions of these instructions require an 'x' or 'y' suffix to disambiguate the memory form in att syntax. We were allowing the same suffix in intel syntax, but it appears gas does not do that. gas does allow the 'x' and 'y' suffix on register and broadcast forms even though it's not needed. We were allowing it on unmasked register form, but not on masked versions or on masked or unmasked broadcast form. While there, fix some test coverage holes so they can be extended with the 'x' and 'y' suffix tests. llvm-svn: 359418
*	llvm-cvtres: Attempt to make llvm-cvtres/duplicate.test work on big-endian ↵	Nico Weber	2019-04-29	1	-12/+14
	systems llvm-svn: 359414
*	[X86][SSE] combineExtractVectorElt - add early-out to return zero/undef for ↵	Simon Pilgrim	2019-04-28	1	-2/+4
	out-of-range extraction indices. llvm-svn: 359406
*	[ConstantRange] Add makeExactNoWrapRegion()	Nikita Popov	2019-04-28	2	-6/+12
	I got confused on the terminology, and the change in D60598 was not correct. I was thinking of "exact" in terms of the result being non-approximate. However, the relevant distinction here is whether the result is * Largest range such that: Forall Y in Other: Forall X in Result: X BinOp Y does not wrap. (makeGuaranteedNoWrapRegion) * Smallest range such that: Forall Y in Other: Forall X not in Result: X BinOp Y wraps. (A hypothetical makeAllowedNoWrapRegion) * Both. (makeExactNoWrapRegion) I'm adding a separate makeExactNoWrapRegion method accepting a single APInt (same as makeExactICmpRegion) and using it in the places where the guarantee is relevant. Differential Revision: https://reviews.llvm.org/D60960 llvm-svn: 359402
*	[X86][AVX] Combine non-lane crossing binary shuffles using X86ISD::VPERMV3	Simon Pilgrim	2019-04-28	1	-0/+22
	Some of the combines might be further improved if we lower more shuffles with X86ISD::VPERMV3 directly, instead of waiting to combine the results. llvm-svn: 359400
*	[DAGCombiner] try repeated fdiv divisor transform before building estimate	Sanjay Patel	2019-04-28	1	-3/+3
	This was originally part of D61028, but it's an independent diff. If we try the repeated divisor reciprocal transform before producing an estimate sequence, then we have an opportunity to use scalar fdiv. On x86, the trade-off is 1 divss vs. 5 vector FP ops in the default estimate sequence. On recent chips (Skylake, Ryzen), the full-precision division is only 3 cycle throughput, so that's probably the better perf default option and avoids problems from x86's inaccurate estimates. The last 2 tests show that users still have the option to override the defaults by using the function attributes for reciprocal estimates, but those patterns are potentially made faster by converting the vector ops (including ymm ops) to scalar math. Differential Revision: https://reviews.llvm.org/D61149 llvm-svn: 359398
*	[X86][SSE] Optimize llvm.experimental.vector.reduce.xor.vXi1 parity ↵	Simon Pilgrim	2019-04-28	1	-2/+12
	reduction (PR38840) An xor reduction of a bool vector can be optimized to a parity check of the MOVMSK/BITCAST'd integer - if the population count is odd return 1, else return 0. Differential Revision: https://reviews.llvm.org/D61230 llvm-svn: 359396
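The identity behind the parity reduction above can be illustrated with a small scalar model. This is only a sketch of the arithmetic, not the X86 lowering code; `toMask`, `xorReduce`, and `parity` are hypothetical helper names, and the mask is limited to 32 lanes for simplicity.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pack a bool vector into an integer with one bit per lane, standing in
// for what a MOVMSK or bitcast of the vector would produce.
template <std::size_t N>
std::uint32_t toMask(const std::array<bool, N> &v) {
  static_assert(N <= 32, "sketch only handles up to 32 lanes");
  std::uint32_t m = 0;
  for (std::size_t i = 0; i < N; ++i)
    if (v[i])
      m |= std::uint32_t{1} << i;
  return m;
}

// Reference semantics: xor all lanes together.
template <std::size_t N>
bool xorReduce(const std::array<bool, N> &v) {
  bool r = false;
  for (bool b : v)
    r = (r != b);  // xor on bools
  return r;
}

// Parity check: true when the population count of the mask is odd.
inline bool parity(std::uint32_t m) {
  unsigned count = 0;
  for (; m != 0; m &= m - 1)  // clear the lowest set bit each iteration
    ++count;
  return (count & 1u) != 0;
}
```

For any input, `parity(toMask(v))` and `xorReduce(v)` agree, because each set lane flips the running xor exactly once.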
*	[X86] Remove (V)MOV64toSDrr/m and (V)MOVDI2SSrr/m. Use 128-bit result ↵	Craig Topper	2019-04-28	3	-70/+22
	MOVD/MOVQ and COPY_TO_REGCLASS instead Summary: The register forms of these instructions are CodeGenOnly instructions that cover GR32->FR32 and GR64->FR64 bitcasts. There is a similar set of instructions for the opposite bitcast. Due to the patterns using bitcasts these instructions get marked as "bitcast" machine instructions as well. The peephole pass is able to look through these as well as other copies to try to avoid register bank copies. Because FR32/FR64/VR128 are all coalescable to each other we can end up in a situation where a GR32->FR32->VR128->FR64->GR64 sequence can be reduced to GR32->GR64 which the copyPhysReg code can't handle. To prevent this, this patch removes one set of the 'bitcast' instructions. So now we can only go GR32->VR128->FR32 or GR64->VR128->FR64. The instruction that converts from GR32/GR64->VR128 has no special significance to the peephole pass and won't be looked through. I guess the other option would be to add support to copyPhysReg to just promote the GR32->GR64 to a GR64->GR64 copy. The upper bits were basically undefined anyway. But removing the CodeGenOnly instruction in favor of one that won't be optimized seemed safer. I deleted the peephole test because it couldn't be made to work with the bitcast instructions removed. The load versions of the instructions were unnecessary as the pattern that selects them contains a bitcasted load which should never happen. Fixes PR41619. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61223 llvm-svn: 359392
*	Revert rL359389: [X86][SSE] Add support for <64 x i1> bool reduction	Simon Pilgrim	2019-04-27	1	-12/+10
	Minor generalization of the existing <32 x i1> pre-AVX2 split code. ........ Causing irregular buildbot failures. llvm-svn: 359391
*	[X86][SSE] Add support for <64 x i1> bool reduction	Simon Pilgrim	2019-04-27	1	-10/+12
	Minor generalization of the existing <32 x i1> pre-AVX2 split code. llvm-svn: 359389
*	[X86][AVX512] Improve vector bool reductions	Simon Pilgrim	2019-04-27	1	-11/+22
	As predicate masks are legal on AVX512 targets, we avoid MOVMSK in these cases, but we can just bitcast the bool vector to the integer equivalent directly - avoiding expansion of the reduction to a shuffle pattern. llvm-svn: 359386
*	[DJB] Fix variable case after D61178	Fangrui Song	2019-04-27	1	-3/+3
	llvm-svn: 359381
*	[X86][AVX] Merge mask select with shuffles across extract_subvector (PR40332)	Simon Pilgrim	2019-04-27	1	-4/+44
	Fixes PR40332 in the limited case where we're selecting between a target shuffle and a zero vector. We can extend this in the future to handle more opcodes and non-zero selections. llvm-svn: 359378
*	[MCA] Add field `IsEliminated` to class Instruction. NFCI	Andrea Di Biagio	2019-04-27	1	-3/+3
	llvm-svn: 359377
*	[X86] Use MOVQ for i64 atomic_stores when SSE2 is enabled	Craig Topper	2019-04-27	5	-19/+72
	Summary: If we have SSE2 we can use a MOVQ to store 64-bits and avoid falling back to a cmpxchg8b loop. If it's a seq_cst store we need to insert an mfence after the store. Reviewers: spatel, RKSimon, reames, jfb, efriedma Reviewed By: RKSimon Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60546 llvm-svn: 359368
*	Revert "AMDGPU: Split block for si_end_cf"	Mark Searles	2019-04-27	5	-128/+17
	This reverts commit 7a6ef3004655dd86d722199c471ae78c28e31bb4. We discovered some internal test failures, so reverting for now. Differential Revision: https://reviews.llvm.org/D61213 llvm-svn: 359363
*	[AMDGPU] gfx1010 VOPC implementation	Stanislav Mekhanoshin	2019-04-26	8	-361/+696
	Differential Revision: https://reviews.llvm.org/D61208 llvm-svn: 359358
*	[ORC] Add a 'plugin' interface to ObjectLinkingLayer for events/configuration.	Lang Hames	2019-04-26	2	-51/+153
	ObjectLinkingLayer::Plugin provides event notifications when objects are loaded, emitted, and removed. It also provides a modifyPassConfig callback that allows plugins to modify the JITLink pass configuration. This patch moves eh-frame registration into its own plugin, and teaches llvm-jitlink to only add that plugin when performing execution runs on non-Windows platforms. This should allow us to re-enable the test case that was removed in r359198. llvm-svn: 359357
*	[GlobalISel][AArch64] Use getConstantVRegValWithLookThrough for extracts	Jessica Paquette	2019-04-26	1	-38/+6
	getConstantVRegValWithLookThrough does the same thing as the getConstantValueForReg function, and has more visibility across GISel. Plus, it supports looking through G_TRUNC, G_SEXT, and G_ZEXT. So, we get better code reuse and more functionality for free by using it. Add some test cases to select-extract-vector-elt.mir to show that we can now look through those instructions. llvm-svn: 359351
*	[AsmPrinter] refactor to support %c w/ GlobalAddress'	Nick Desaulniers	2019-04-26	18	-82/+90
	Summary: Targets like ARM, MSP430, PPC, and SystemZ have complex behavior when printing the address of a MachineOperand::MO_GlobalAddress. Move that handling into a new overridden method in each base class. A virtual method was added to the base class for handling the generic case. Refactors a few subclasses to support the target independent %a, %c, and %n. The patch also contains small cleanups for AVRAsmPrinter and SystemZAsmPrinter. It seems that NVPTXTargetLowering is possibly missing some logic to transform GlobalAddressSDNodes for TargetLowering::LowerAsmOperandForConstraint to handle with "i" extended inline assembly asm constraints. Fixes: - https://bugs.llvm.org/show_bug.cgi?id=41402 - https://github.com/ClangBuiltLinux/linux/issues/449 Reviewers: echristo, void Reviewed By: void Subscribers: void, craig.topper, jholewinski, dschuff, jyknight, dylanmckay, sdardis, nemanjai, javed.absar, sbc100, jgravelle-google, eraman, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, jrtc27, atanasyan, jsji, llvm-commits, kees, tpimh, nathanchance, peter.smith, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60887 llvm-svn: 359337
*	[X86][AVX] Fold extract_subvector(broadcast(x)) -> broadcast(x) iff x has ↵	Simon Pilgrim	2019-04-26	1	-0/+7
	one use llvm-svn: 359332
*	[AArch64][GlobalISel] Select G_BSWAP for vectors of s32 and s64	Jessica Paquette	2019-04-26	2	-1/+38
	There are instructions for these, so mark them as legal. Select the correct instruction in AArch64InstructionSelector.cpp. Update select-bswap.mir and arm64-rev.ll to reflect the changes. llvm-svn: 359331
*	[AMDGPU] gfx1010 VOP3 and VOP3P implementation	Stanislav Mekhanoshin	2019-04-26	4	-102/+281
	Differential Revision: https://reviews.llvm.org/D61202 llvm-svn: 359328
*	[DAGCombine] Cleanup visitEXTRACT_SUBVECTOR. NFCI.	Simon Pilgrim	2019-04-26	1	-10/+11
	Use ArrayRef::slice, reduce some rather awkward long lines for legibility and run clang-format. llvm-svn: 359326
*	[ConstantRange] Add abs() support	Nikita Popov	2019-04-26	1	-0/+31
	Add support for abs() to ConstantRange. This will allow handling the SPF_ABS select flavor in LVI and will also come in handy as a primitive for the srem implementation. The implementation is slightly tricky, because a) abs of signed min is signed min and b) sign-wrapped ranges may have an abs() that is smaller than a full range, so we need to explicitly handle them. Differential Revision: https://reviews.llvm.org/D61084 llvm-svn: 359321
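The two tricky cases named above can be checked by brute force on 8-bit values. This is a sketch under stated assumptions, not the ConstantRange code: values are stored as raw two's-complement bit patterns (loosely in the spirit of APInt), and `Bits8`, `isNegative`, `wrappingAbs`, and `absOfRange` are hypothetical names. An empty range is modelled here as `lo == hi`.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// 8-bit two's-complement value kept as a raw bit pattern.
using Bits8 = std::uint8_t;

inline bool isNegative(Bits8 v) { return (v & 0x80u) != 0; }

// Wrapping abs: negate negative values modulo 2^8.
// The signed minimum 0x80 (-128) maps to itself.
inline Bits8 wrappingAbs(Bits8 v) {
  return isNegative(v) ? static_cast<Bits8>(0u - v) : v;
}

// Collect abs() over the half-open range [lo, hi), which may wrap
// around the top of the 8-bit space.
inline std::set<Bits8> absOfRange(Bits8 lo, Bits8 hi) {
  std::set<Bits8> out;
  for (Bits8 v = lo; v != hi; ++v)
    out.insert(wrappingAbs(v));
  return out;
}
```

For example, the sign-wrapped range [100, -100), written as `absOfRange(100, 156)`, covers 100..127 and -128..-101. Its abs() set is 100..127 plus the signed minimum, far fewer than 256 values, which is case b) from the commit message.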
*	[X86] Sink NoRegister creation for unused Base/Index registers into ↵	Craig Topper	2019-04-26	1	-36/+26
	getAddressOperands. NFCI llvm-svn: 359318
*	[X86] Segment registers should have i16 type not i32.	Craig Topper	2019-04-26	1	-1/+1
	Probably doesn't really matter, but was inconsistent with the rest of the code. llvm-svn: 359317
*	[AMDGPU] gfx1010 VOP2 changes	Stanislav Mekhanoshin	2019-04-26	6	-154/+605
	Differential Revision: https://reviews.llvm.org/D61156 llvm-svn: 359316
*	[PowerPC] Update P9 vector costs for insert/extract element	Roland Froese	2019-04-26	1	-0/+29
	The PPC vector cost model values for insert/extract element reflect older processors that lacked vector insert/extract and move-to/move-from VSR instructions. Update getVectorInstrCost to give appropriate values for when the newer instructions are present. Differential Revision: https://reviews.llvm.org/D60160 llvm-svn: 359313
*	s/Dwarf 5/DWARF v5/ NFC	Fangrui Song	2019-04-26	1	-1/+1
	llvm-svn: 359307
*	Fix Wparentheses warning. NFCI.	Simon Pilgrim	2019-04-26	1	-5/+5
	llvm-svn: 359299
*	[X86][SSE] Pull out OR(EXTRACTELT(X,0),OR(EXTRACTELT(X,1),...)) matching ↵	Simon Pilgrim	2019-04-26	1	-50/+59
	code from LowerVectorAllZeroTest Create a matchBitOpReduction helper that checks for the pattern with any opcode. First step towards reusing this code to recognize other scalar reduction patterns. llvm-svn: 359296
*	caseFoldingDjbHash: simplify and make the US-ASCII fast path faster	Fangrui Song	2019-04-26	1	-19/+16
	The slow path (with at least one non US-ASCII character) will be slower but that doesn't matter. Differential Revision: https://reviews.llvm.org/D61178 llvm-svn: 359294
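A fast path of the kind described above might look like the following sketch. It assumes the common DJB variant (seed 5381, multiply by 33, add the byte) and shows only the US-ASCII side; `asciiFoldingDjbHash` and its `allAscii` out-parameter are hypothetical, and the Unicode case-folding slow path is deliberately not shown.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hash the bytes with ASCII-only case folding in a single pass, and report
// through `allAscii` whether every byte was below 0x80. When `allAscii` is
// false the returned value is not a valid case-folded hash, and a caller
// would have to redo the work on a slow path with full Unicode folding.
inline std::uint32_t asciiFoldingDjbHash(const std::string &s, bool &allAscii) {
  std::uint32_t h = 5381;  // assumed DJB seed
  allAscii = true;
  for (char ch : s) {
    unsigned char c = static_cast<unsigned char>(ch);
    allAscii = allAscii && c < 0x80;
    if (c >= 'A' && c <= 'Z')
      c = static_cast<unsigned char>(c - 'A' + 'a');  // fold to lower case
    h = h * 33 + c;
  }
  return h;
}
```

Doing the fold and the ASCII check in the same loop keeps the common case to one pass over the string, at the cost of wasted work when a non-ASCII byte turns up.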
*	[X86][SSE] Disable shouldFoldConstantShiftPairToMask for btver1/btver2 ↵	Simon Pilgrim	2019-04-26	4	-2/+26
	targets (PR40758) As detailed on PR40758, Bobcat/Jaguar can perform vector immediate shifts on the same pipes as vector ANDs with the same latency - so it doesn't make sense to replace a shl+lshr with a shift+and pair as it requires an additional mask (with the extra constant pool, loading and register pressure costs). Differential Revision: https://reviews.llvm.org/D61068 llvm-svn: 359293
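The two forms being traded off above compute the same value, which a scalar model makes easy to check. This is a sketch of the arithmetic identity only, not the DAG combine; `shiftPair` and `shiftThenMask` are hypothetical names, and both shift amounts are assumed to be less than 32.

```cpp
#include <cassert>
#include <cstdint>

// Form 1: logical shift left by c1, then logical shift right by c2.
inline std::uint32_t shiftPair(std::uint32_t x, unsigned c1, unsigned c2) {
  return (x << c1) >> c2;
}

// Form 2: one shift by the difference plus an AND with a constant mask.
// The mask is the extra constant the commit message refers to.
inline std::uint32_t shiftThenMask(std::uint32_t x, unsigned c1, unsigned c2) {
  const std::uint32_t mask = UINT32_MAX >> c2;
  return c2 >= c1 ? (x >> (c2 - c1)) & mask
                  : (x << (c1 - c2)) & mask;
}
```

Both forms use two operations, so the second only wins when an AND is cheaper than a shift. On a target where they cost the same, the mask constant is pure overhead.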
*	[X86][AVX] Combine shuffles extracted from a common vector	Simon Pilgrim	2019-04-26	1	-0/+45
	A small step towards combining shuffles across vector sizes - this recognizes when a shuffle's operands are all extracted from the same larger source and tries to combine to a unary shuffle of that source instead. Fixes one of the test cases from PR34380. Differential Revision: https://reviews.llvm.org/D60512 llvm-svn: 359292