bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Revert 336426 (and follow-ups 428, 440), it very likely caused PR38084.	Nico Weber	2018-07-06	3	-213/+20
\| \| \| \|	llvm-svn: 336453
*	[Debugify] Allow unsigned values narrower than their variables	Vedant Kumar	2018-07-06	1	-3/+10
\| \| \| \| \| \| \| \|	Suppress the diagnostic for mis-sized dbg.values when a value operand is narrower than the unsigned variable it describes. Assume that a debugger would implicitly zero-extend these values. llvm-svn: 336452
*	[Local] replaceAllDbgUsesWith: Update debug values before RAUW	Vedant Kumar	2018-07-06	4	-2/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The replaceAllDbgUsesWith utility helps passes preserve debug info when replacing one value with another. This improves upon the existing insertReplacementDbgValues API by: - Updating debug intrinsics in-place, while preventing use-before-def of the replacement value. - Falling back to salvageDebugInfo when a replacement can't be made. - Moving the responsibility for rewriting llvm.dbg.* DIExpressions into common utility code. Along with the API change, this teaches replaceAllDbgUsesWith how to create DIExpressions for three basic integer and pointer conversions: - The no-op conversion. Applies when the values have the same width, or have bit-for-bit compatible pointer representations. - Truncation. Applies when the new value is wider than the old one. - Zero/sign extension. Applies when the new value is narrower than the old one. Testing: - check-llvm, check-clang, a stage2 `-g -O3` build of clang, regression/unit testing. - This resolves a number of mis-sized dbg.value diagnostics from Debugify. Differential Revision: https://reviews.llvm.org/D48676 llvm-svn: 336451
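The three conversions listed in the message above can be illustrated with a minimal sketch. This is not LLVM code: `classify_conversion`, `recover_old_value`, and the `signed` parameter are hypothetical names, and values are modelled as plain unsigned bit patterns rather than DIExpressions.

```python
def classify_conversion(old_bits, new_bits, signed):
    """Name the conversion needed to describe the old value using the new one."""
    if new_bits == old_bits:
        return "no-op"
    if new_bits > old_bits:
        # The replacement is wider, so the debug expression must truncate it.
        return "truncate"
    # The replacement is narrower, so it must be extended to the old width.
    return "sign-extend" if signed else "zero-extend"


def recover_old_value(new_value, old_bits, new_bits, signed):
    """Return the old_bits-wide bit pattern implied by new_value."""
    old_mask = (1 << old_bits) - 1
    if new_bits >= old_bits:
        # No-op or truncation: keep only the low old_bits bits.
        return new_value & old_mask
    new_mask = (1 << new_bits) - 1
    low = new_value & new_mask
    if signed and (low >> (new_bits - 1)) & 1:
        # Sign extension: replicate the sign bit into the upper bits.
        low |= old_mask ^ new_mask
    return low


print(classify_conversion(8, 16, signed=False))
print(hex(recover_old_value(0x80, 16, 8, signed=True)))
```

In this sketch the signedness is passed in explicitly; it only decides between the two kinds of extension and has no effect on the no-op or truncation cases.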
*	[InstCombine] add more tests with poison and undef; NFC	Sanjay Patel	2018-07-06	1	-5/+540
\| \| \| \| \| \| \| \|	As discussed in D48987 and D48893, there are many different ways to go wrong depending on the binop (and as shown here we already do go wrong in some cases). llvm-svn: 336450
*	[ARM] ParallelDSP: added statistics, NFC.	Sjoerd Meijer	2018-07-06	13	-17/+18
\| \| \| \| \| \| \| \| \|	Added statistics for the number of SMLAD instructions created, and also renamed the pass to -arm-parallel-dsp. Differential Revision: https://reviews.llvm.org/D48971 llvm-svn: 336441
*	Commit rL336426 caused buildbot failures	Diogo N. Sampaio	2018-07-06	2	-6/+6
\| \| \| \| \| \| \| \|	http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/50537/testReport/junit/LLVM/CodeGen_AArch64/FoldRedundantShiftedMasking_ll/ This removes the comments of the function label causing this error. llvm-svn: 336440
*	[AArch64] Armv8.4-A: TLB support	Sjoerd Meijer	2018-07-06	2	-0/+414
\| \| \| \| \| \| \| \|	This adds: - outer shareable TLB Maintenance instructions, and - TLB range maintenance instructions. llvm-svn: 336434
*	Recommit: [AArch64] Armv8.4-A: Flag manipulation instructions	Sjoerd Meijer	2018-07-06	3	-0/+82
\| \| \| \| \| \|	Now with the asm operand definition included. llvm-svn: 336432
*	[SelectionDAG] https://reviews.llvm.org/D48278	Diogo N. Sampaio	2018-07-06	3	-20/+213
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	D48278 Allow to reduce redundant shift masks. For example: x1 = x & 0xAB00 x2 = (x >> 8) & 0xAB can be reduced to: x1 = x & 0xAB00 x2 = x1 >> 8 It only allows folding when the masks and shift values are constants. llvm-svn: 336426
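The reduction described in the message above can be checked with a short sketch. The function names are illustrative only, and Python integers stand in for the machine values; the check is exhaustive over 16-bit inputs.

```python
def masks_unfolded(x):
    # Original form: two independent mask operations on x.
    x1 = x & 0xAB00
    x2 = (x >> 8) & 0xAB
    return x1, x2


def masks_folded(x):
    # Reduced form: x2 reuses x1, so the second mask is redundant.
    x1 = x & 0xAB00
    x2 = x1 >> 8
    return x1, x2


# Both forms agree for every 16-bit input.
assert all(masks_unfolded(x) == masks_folded(x) for x in range(1 << 16))
```

The two forms agree because shifting the masked value right by 8 also shifts the mask: `0xAB00 >> 8` is `0xAB`.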
*	Revert [AArch64] Armv8.4-A: Flag manipulation instructions	Sjoerd Meijer	2018-07-06	3	-82/+0
\| \| \| \| \| \|	It's causing build errors. llvm-svn: 336422
*	[AArch64] Armv8.4-A: Flag manipulation instructions	Sjoerd Meijer	2018-07-06	3	-0/+82
\| \| \| \| \| \| \| \|	These instructions are added to AArch64 only. Differential Revision: https://reviews.llvm.org/D48926 llvm-svn: 336421
*	[llvm-mca] improve the instruction issue logic implemented by the Scheduler.	Andrea Di Biagio	2018-07-06	3	-60/+256
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch modifies the Scheduler heuristic used to select the next instruction to issue to the pipelines. The motivating example is test X86/BtVer2/add-sequence.s, for which llvm-mca wrongly reported an estimated IPC of 1.50. According to perf, the actual IPC for that test should have been ~2.00. It turns out that an IPC of 2.00 for test add-sequence.s cannot possibly be predicted by a Scheduler that only prioritizes instructions based on their "age". A similar issue also affected test X86/BtVer2/dependent-pmuld-paddd.s, for which llvm-mca wrongly estimated an IPC of 0.84 instead of an IPC of 1.00. Instructions in the ReadyQueue are now ranked based on two factors: - The "age" of an instruction. - The number of unique users of writes associated with an instruction. The new logic still prioritizes older instructions over younger instructions to minimize the pressure on the reorder buffer. However, the number of users of an instruction now also affects the overall rank. This potentially increases the ability of the Scheduler to extract instruction level parallelism. This patch fixes the problem with the wrong IPC reported for test add-sequence.s and test dependent-pmuld-paddd.s. llvm-svn: 336420
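A minimal sketch of ranking by two factors, as the message above describes. The scoring rule used here (age plus number of users) is an assumption made purely for illustration; the message does not give the actual formula, and `ReadyInst`, `rank`, and `select_next` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ReadyInst:
    name: str
    age: int        # cycles spent waiting; larger means older
    num_users: int  # unique users of this instruction's writes


def rank(inst):
    # Assumed scoring rule: both age and user count raise the rank.
    return inst.age + inst.num_users


def select_next(ready_queue):
    # Highest rank wins; ties go to the older instruction.
    return max(ready_queue, key=lambda inst: (rank(inst), inst.age))


queue = [ReadyInst("old_leaf", age=3, num_users=0),
         ReadyInst("young_producer", age=1, num_users=4)]
print(select_next(queue).name)
```

Under an age-only policy `old_leaf` would issue first; with the user count included, the instruction that unblocks more dependents can be preferred.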
*	CallGraphSCCPass: iterate over all functions.	Tim Northover	2018-07-06	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously we only iterated over functions reachable from the set of external functions in the module. But since some of the passes under this (notably the always-inliner and coroutine lowerer) are required for correctness, they need to run over everything. This just adds an extra layer of iteration over the CallGraph to keep track of which functions we've already visited and get the next batch of SCCs. Should fix PR38029. llvm-svn: 336419
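The extra layer of iteration described above can be sketched as follows. This is a simplified model, not the pass itself: the call graph is a plain dictionary, the walk tracks reachability only and does not compute SCCs, and all names are hypothetical.

```python
def visit_order(call_graph, external_roots):
    """Visit functions reachable from external roots, then everything else."""
    seen = set()
    order = []

    def walk(root):
        stack = [root]
        while stack:
            fn = stack.pop()
            if fn in seen:
                continue
            seen.add(fn)
            order.append(fn)
            stack.extend(call_graph.get(fn, ()))

    # First batch: what the old behaviour already covered.
    for root in external_roots:
        walk(root)
    # Extra layer: pick up any function that was never reached.
    for fn in call_graph:
        if fn not in seen:
            walk(fn)
    return order


graph = {
    "main": ["helper"],
    "helper": [],
    "internal_coro": ["always_inline_me"],
    "always_inline_me": [],
}
print(visit_order(graph, ["main"]))
```

Without the second loop, `internal_coro` and `always_inline_me` would never be visited, which is the situation the message says is unsafe for passes that are required for correctness.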
*	[AArch64][ARM] Armv8.4-A: Trace synchronization barrier instruction	Sjoerd Meijer	2018-07-06	7	-0/+84
\| \| \| \| \| \| \| \|	This adds the Armv8.4-A Trace synchronization barrier (TSB) instruction. Differential Revision: https://reviews.llvm.org/D48918 llvm-svn: 336418
*	[X86] Remove FMA4 scalar intrinsics. Use llvm.fma intrinsic instead.	Craig Topper	2018-07-06	3	-16/+28
\| \| \| \| \| \| \| \|	The intrinsics can be implemented with a f32/f64 llvm.fma intrinsic and an insert into a zero vector. There are a couple regressions here due to SelectionDAG not being able to pull an fneg through an extract_vector_elt. I'm not super worried about this though as InstCombine should be able to do it before we get to SelectionDAG. llvm-svn: 336416
*	Reapply: "objdump: Support newer ObjC image info flags"	Dave Lee	2018-07-06	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add support for two additional ObjC image info flags: `IS_SIMULATED` and `HAS_CATEGORY_CLASS_PROPERTIES`. `IS_SIMULATED` indicates a Mach-O binary built for iOS simulator. `HAS_CATEGORY_CLASS_PROPERTIES` indicates a Mach-O binary built by a compiler that supports class properties in categories. Reviewers: enderby, compnerd Reviewed By: compnerd Subscribers: keith, llvm-commits Differential Revision: https://reviews.llvm.org/D48568 llvm-svn: 336411
*	Revert "[InstCombine] Delay foldICmpUsingKnownBits until simple transforms ↵	Max Kazantsev	2018-07-06	6	-25/+35
\| \| \| \| \| \|	are done" llvm-svn: 336410
*	[X86] Remove all of the avx512 masked packed fma intrinsics. Use llvm.fma or ↵	Craig Topper	2018-07-06	3	-356/+1967
\| \| \| \| \| \| \| \| \| \|	unmasked 512-bit intrinsics with rounding mode. This upgrades all of the intrinsics to use fneg instructions to convert fma into fmsub/fnmsub/fnmadd/fmsubadd. And uses a select instruction for masking. This matches how clang uses the intrinsics these days. llvm-svn: 336409
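A small sketch of expressing the FMA variants through negation and of masking through a per-lane select, as the message above describes. The `fma` helper is a stand-in that is not actually fused, lists stand in for vectors, and all names are illustrative rather than LLVM or intrinsic APIs.

```python
def fma(a, b, c):
    # Stand-in for a fused multiply-add; this version rounds twice.
    return a * b + c


def fmsub(a, b, c):
    return fma(a, b, -c)    # a*b - c


def fnmadd(a, b, c):
    return fma(-a, b, c)    # -(a*b) + c


def fnmsub(a, b, c):
    return fma(-a, b, -c)   # -(a*b) - c


def select(mask, result, passthru):
    # Per-lane select: keep the result where the mask bit is set.
    return [r if m else p for m, r, p in zip(mask, result, passthru)]


lanes = [fmsub(2.0, 3.0, 1.0), fnmadd(2.0, 3.0, 1.0), fnmsub(2.0, 3.0, 1.0)]
print(select([True, False, True], lanes, [0.0, 0.0, 0.0]))
```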
*	[X86] Cleanup some of the avx512 masked fma tests to prepare for removing ↵	Craig Topper	2018-07-06	3	-1217/+1099
\| \| \| \| \| \| \| \| \| \|	and autoupgrading. -Split cases that call 2 intrinsics in the same case. -Remove testing mask3 and maskz intrinsics with an all ones mask. These won't be interesting after the upgrade. -Restore test cases for some intrinsics that are marked for deletion, but haven't been deleted yet. llvm-svn: 336408
*	[Power9] Add __float128 library call for frem	Stefan Pintilie	2018-07-06	1	-0/+14
\| \| \| \| \| \| \| \|	Power 9 does not have a hardware instruction for frem but we can call fmodf128. Differential Revision: https://reviews.llvm.org/D48552 llvm-svn: 336406
*	[x86]Add a test case to show missed vfnmadd generation.	Easwaran Raman	2018-07-06	1	-0/+25
\| \| \| \|	llvm-svn: 336404
*	Revert "objdump: Support newer ObjC image info flags"	Dave Lee	2018-07-06	1	-7/+0
\| \| \| \| \| \|	This reverts commit 8c4cc472e7a67bd3b2b20cc4cf32d31af29bc7e9. llvm-svn: 336402
*	[X86][Disassembler] Fix LOCK prefix disassembler support	Maksim Panchenko	2018-07-05	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If LOCK prefix is not the first prefix in an instruction, LLVM disassembler silently drops the prefix. The fix is to select a proper instruction with a builtin LOCK prefix if one exists. Reviewers: craig.topper Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49001 llvm-svn: 336400
*	objdump: Support newer ObjC image info flags	Dave Lee	2018-07-05	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add support for two additional ObjC image info flags: `IS_SIMULATED` and `HAS_CATEGORY_CLASS_PROPERTIES`. `IS_SIMULATED` indicates a Mach-O binary built for iOS simulator. `HAS_CATEGORY_CLASS_PROPERTIES` indicates a Mach-O binary built by a compiler that supports class properties in categories. Reviewers: enderby, compnerd Reviewed By: compnerd Subscribers: keith, llvm-commits Differential Revision: https://reviews.llvm.org/D48568 llvm-svn: 336399
*	This is a recommit of r336322, previously reverted in r336324 due to	Sander de Smalen	2018-07-05	18	-0/+378
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	a deficiency in TableGen that has been addressed in r336334. [AArch64][SVE] Asm: Support for predicated FP rounding instructions. This patch also adds instructions for predicated FP square-root and reciprocal exponent. The added instructions are: - FRINTI Round to integral value (current FPCR rounding mode) - FRINTX Round to integral value (current FPCR rounding mode, signalling inexact) - FRINTA Round to integral value (to nearest, with ties away from zero) - FRINTN Round to integral value (to nearest, with ties to even) - FRINTZ Round to integral value (toward zero) - FRINTM Round to integral value (toward minus Infinity) - FRINTP Round to integral value (toward plus Infinity) - FSQRT Floating-point square root - FRECPX Floating-point reciprocal exponent llvm-svn: 336387
*	[X86] Remove the last of the 'x86.fma.' intrinsics and autoupgrade them to ↵	Craig Topper	2018-07-05	4	-184/+1586
\| \| \| \| \| \| \| \|	'llvm.fma'. Add upgrade tests for all. Still need to remove the AVX512 masked versions. llvm-svn: 336383
*	[X86] Add SHUF128 to target shuffle decoding.	Craig Topper	2018-07-05	2	-65/+127
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D48954 llvm-svn: 336376
*	Fix asserts in AMDGCN fmed3 folding by handling more cases of NaN	Matt Arsenault	2018-07-05	1	-3/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Better NaN handling for AMDGCN fmed3. All operands are checked for NaN now. The checks were moved before the canonicalization to provide a better mapping from fclamp. Changed the behaviour of fmed3(x,y,NaN) to return max(x,y) instead of min(x,y) in light of this. Updated tests as a result and added some new cases to cover the fix. Patch by Alan Baker llvm-svn: 336375
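The changed behaviour for a NaN third operand can be illustrated with a minimal sketch. It models only the case stated in the message above, assumes the first two operands are not NaN, and uses a hypothetical function name.

```python
import math


def fmed3_third_nan(x, y, z):
    """Median of three, with the stated rule for a NaN third operand."""
    if math.isnan(z):
        # Stated behaviour after the change: max(x, y), not min(x, y).
        return max(x, y)
    # Ordinary median when no NaN is involved.
    return sorted((x, y, z))[1]


print(fmed3_third_nan(1.0, 2.0, float("nan")))
print(fmed3_third_nan(3.0, 1.0, 2.0))
```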
*	AMDGPU: Don't use spir_kernel in a test	Matt Arsenault	2018-07-05	1	-3/+2
\| \| \| \| \| \|	Also use verify-machineinstrs. llvm-svn: 336374
*	AMDGPU/GlobalISel: Implement custom kernel arg lowering	Matt Arsenault	2018-07-05	2	-20/+789
\| \| \| \| \| \| \| \| \| \| \| \| \|	Avoid using allocateKernArg / AssignFn. We do not want any of the type splitting properties of normal calling convention lowering. For now at least this exists alongside the IR argument lowering pass. This is necessary to handle struct padding correctly while some arguments are still skipped by the IR argument lowering pass. llvm-svn: 336373
*	[CostModel][X86] Add UDIV/UREM by pow2 costs	Simon Pilgrim	2018-07-05	2	-174/+421
\| \| \| \| \| \|	Normally InstCombine would have simplified these to SRL/AND instructions but we may still see these during SLP vectorization etc. llvm-svn: 336371
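The simplification mentioned above, unsigned division and remainder by a power of two becoming a logical shift right and an AND, can be sketched for non-negative integers. The function names are illustrative only.

```python
def udiv_pow2(x, k):
    # x / 2**k for unsigned x is a logical shift right by k.
    return x >> k


def urem_pow2(x, k):
    # x % 2**k for unsigned x keeps only the low k bits.
    return x & ((1 << k) - 1)


# Check against ordinary division and remainder.
for x in range(1024):
    for k in range(8):
        assert udiv_pow2(x, k) == x // (1 << k)
        assert urem_pow2(x, k) == x % (1 << k)
```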
*	[llvm-objdump] Removed archive-headers-disas test	Paul Semel	2018-07-05	1	-29/+0
\| \| \| \| \| \| \| \|	This test is failing because of the disas part. For the moment, I will just remove it. I will add it again tomorrow with a proper fix. llvm-svn: 336370
*	[llvm-objcopy] Fix timezone dependent tests	Paul Semel	2018-07-05	2	-18/+18
\| \| \| \|	llvm-svn: 336363
*	[Power9] Add lib calls for float128 operations with no equivalent PPC ↵	Lei Huang	2018-07-05	1	-1/+146
\| \| \| \| \| \| \| \| \| \| \|	instructions Map the following instructions to the proper float128 lib calls: pow[i], exp[2], log[2\|10], sin, cos, fmin, fmax Differential Revision: https://reviews.llvm.org/D48544 llvm-svn: 336361
*	[X86][SSE] Add srem x, (1 << c) combine tests	Simon Pilgrim	2018-07-05	1	-0/+227
\| \| \| \| \| \|	Now that D45806 has landed we can start trying to avoid scalarizing srem by constant - these tests demonstrate some example cases. llvm-svn: 336360
*	[llvm-objdump] Add --archive-headers (-a) option	Paul Semel	2018-07-05	3	-0/+50
\| \| \| \|	llvm-svn: 336357
*	[AArch64, PowerPC, x86] add tests for signbit bit hacks; NFC	Sanjay Patel	2018-07-05	3	-0/+459
\| \| \| \|	llvm-svn: 336348
*	[AMDGPU] Add VALU to V_INTERP Instructions	Ryan Taylor	2018-07-05	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \|	Wait states are not properly being inserted after buffer_store for v_interp instructions. Add VALU to V_INTERP instructions so that the GCNHazardRecognizer can check and insert the appropriate wait states when needed. Differential Revision: https://reviews.llvm.org/D48772 Change-Id: Id540c9b074fc69b5c1de6b182276aa089c74aa64 llvm-svn: 336339
*	Partially revert r336268 in address-offsets.ll	Krasimir Georgiev	2018-07-05	1	-40/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: There the typos are intentional, explicitly introduced to disable these cases in r280285. Reviewers: bkramer Reviewed By: bkramer Subscribers: dschuff, sbc100, jgravelle-google, aheejin, llvm-commits Differential Revision: https://reviews.llvm.org/D48962 llvm-svn: 336336
*	[TableGen] Increase the number of supported decoder fix-ups.	Sander de Smalen	2018-07-05	3	-18/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The vast number of added instructions for SVE causes TableGen to fail with an assertion: Assertion `Delta < 65536U && "disassembler decoding table too large!"' This patch increases the number of supported decoder fix-ups. Reviewers: dmgreen, stoklund, petpav01 Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D48937 llvm-svn: 336334
*	[X86][SSE] Add extra v16i16 shl x,c -> pmullw test	Simon Pilgrim	2018-07-05	1	-0/+27
\| \| \| \| \| \|	We want to compare shifts with repeated vs non-repeated v8i16 shuffle masks (for PBLENDW ymm) llvm-svn: 336333
*	[mips] Fix atomic operations at O0, v3	Aleksandar Beserminji	2018-07-05	4	-428/+9037
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to PR/25526, fast-regalloc introduces spills at the end of basic blocks. When this occurs in between an ll and sc, the stores can cause the atomic sequence to fail. This patch fixes the issue by introducing more pseudos to represent atomic operations and moving their lowering to after the expansion of postRA pseudos. This version addresses issues with the initial implementation and covers all atomic operations. This resolves PR/32020. Thanks to James Cowgill for reporting the issue! Patch By: Simon Dardis Differential Revision: https://reviews.llvm.org/D31287 llvm-svn: 336328
*	[NEON] Fix combining of vldx_dup intrinsics with updating of base addresses	Ivan A. Kosarev	2018-07-05	1	-0/+43
\| \| \| \| \| \| \| \| \| \| \| \| \|	Resolves: Unsupported ARM Neon intrinsics in Target-specific DAG combine function for VLDDUP https://bugs.llvm.org/show_bug.cgi?id=38031 Related diff: D48439 Differential Revision: https://reviews.llvm.org/D48920 llvm-svn: 336325
*	Reverting r336322 for now, as it causes an assert failure	Sander de Smalen	2018-07-05	18	-378/+0
\| \| \| \| \| \| \|	in TableGen, for which there is already a patch in Phabricator (D48937) that needs to be committed first. llvm-svn: 336324
*	Partial revert of "NFC - Various typo fixes in tests"	Mikael Holmen	2018-07-05	3	-15/+19
\| \| \| \| \| \| \| \|	This partially reverts r336268 since it causes buildbot failures. Added FIXME at the places where the CHECKs are misspelled. llvm-svn: 336323
*	[AArch64][SVE] Asm: Support for predicated FP rounding instructions.	Sander de Smalen	2018-07-05	18	-0/+378
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch also adds instructions for predicated FP square-root and reciprocal exponent. The added instructions are: - FRINTI Round to integral value (current FPCR rounding mode) - FRINTX Round to integral value (current FPCR rounding mode, signalling inexact) - FRINTA Round to integral value (to nearest, with ties away from zero) - FRINTN Round to integral value (to nearest, with ties to even) - FRINTZ Round to integral value (toward zero) - FRINTM Round to integral value (toward minus Infinity) - FRINTP Round to integral value (toward plus Infinity) - FSQRT Floating-point square root - FRECPX Floating-point reciprocal exponent llvm-svn: 336322
*	[ARM] ParallelDSP: only support i16 loads for now	Sjoerd Meijer	2018-07-05	1	-1/+46
\| \| \| \| \| \| \| \| \|	We were miscompiling i8 loads, so reject them as unsupported narrow operations for now. Differential Revision: https://reviews.llvm.org/D48944 llvm-svn: 336319
*	[AArch64][SVE] Asm: Support for signed/unsigned MIN/MAX/ABD	Sander de Smalen	2018-07-05	12	-0/+460
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch implements the following varieties: - Unpredicated signed max, e.g. smax z0.h, z1.h, #-128 - Unpredicated signed min, e.g. smin z0.h, z1.h, #-128 - Unpredicated unsigned max, e.g. umax z0.h, z1.h, #255 - Unpredicated unsigned min, e.g. umin z0.h, z1.h, #255 - Predicated signed max, e.g. smax z0.h, p0/m, z0.h, z1.h - Predicated signed min, e.g. smin z0.h, p0/m, z0.h, z1.h - Predicated signed abd, e.g. sabd z0.h, p0/m, z0.h, z1.h - Predicated unsigned max, e.g. umax z0.h, p0/m, z0.h, z1.h - Predicated unsigned min, e.g. umin z0.h, p0/m, z0.h, z1.h - Predicated unsigned abd, e.g. uabd z0.h, p0/m, z0.h, z1.h llvm-svn: 336317
*	[Power9] Optimize codegen for conversions of int to float128	Lei Huang	2018-07-05	2	-10/+136
\| \| \| \| \| \| \| \| \| \| \| \|	Optimize code sequences for integer conversion to fp128 when the integer is a result of: * float->int * float->long * double->int * double->long Differential Revision: https://reviews.llvm.org/D48429 llvm-svn: 336316
*	[X86] Remove X86 specific scalar FMA intrinsics and upgrade to target ↵	Craig Topper	2018-07-05	5	-118/+168
\| \| \| \| \| \|	independent FMA and extractelement/insertelement. llvm-svn: 336315