bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[X86] Remove the unused 128 and 256-bit masked padds/psubs intrinsics.	Craig Topper	2018-08-16	1	-8/+0
	Still need to remove masking from the 512-bit versions. llvm-svn: 339841
*	[X86] Change legacy SSE scalar fp to integer intrinsics to use specific ISD ↵	Craig Topper	2018-08-15	1	-16/+24
	opcodes instead of keeping as intrinsics. Unify SSE and AVX512 isel patterns. AVX512 added new versions of these intrinsics that take a rounding mode. If the rounding mode is 4 the new intrinsics are equivalent to the old intrinsics. The AVX512 intrinsics were being lowered to ISD opcodes, but the legacy SSE intrinsics were left as intrinsics. This resulted in the AVX512 instructions needing separate patterns for the ISD opcodes and the legacy SSE intrinsics. Now we convert SSE intrinsics and AVX512 intrinsics with rounding mode 4 to the same ISD opcode so we can share the isel patterns. llvm-svn: 339749
*	[X86] Lowering addus/subus intrinsics to native IR	Tomasz Krupa	2018-08-14	1	-20/+0
	Summary: This revision improves on the previous version (rL330322), which was reverted due to crashes. This is the patch that lowers x86 intrinsics to native IR in order to enable optimizations. The patch also includes folding of previously missing saturation patterns so that IR emits the same machine instructions as the intrinsics. Reviewers: craig.topper, spatel, RKSimon Reviewed By: craig.topper Subscribers: mike.dvoretsky, DavidKreitzer, sroland, llvm-commits Differential Revision: https://reviews.llvm.org/D46179 llvm-svn: 339650
*	[X86] Remove and autoupgrade the scalar fma intrinsics with masking.	Craig Topper	2018-07-12	1	-16/+1
	This converts them to what clang is now using for codegen. Unfortunately, there seem to be a few kinks to work out still. I'll try to address with follow up patches. llvm-svn: 336871
*	[X86] Add back some intrinsic table entries lost in r336506.	Craig Topper	2018-07-08	1	-0/+6
	llvm-svn: 336508
*	[X86] Add new scalar fma intrinsics with rounding mode that use f32/f64 types.	Craig Topper	2018-07-08	1	-6/+2
	This allows us to handle masking in a very similar way to the default rounding version that uses llvm.fma. I had to add new rounding mode CodeGenOnly instructions to support isel when we can't find a movss to grab the upper bits from to use the b_Int instruction. Fast-isel tests have been updated to match new clang codegen. We are currently having trouble folding fneg into the new intrinsic. I'm going to correct that in a follow up patch to keep the size of this one down. A future patch will also remove the old intrinsics. llvm-svn: 336506
*	[X86] Merge INTR_TYPE_3OP_RM with INTR_TYPE_3OP. Remove unused INTR_TYPE_1OP_RM.	Craig Topper	2018-07-07	1	-5/+5
	llvm-svn: 336476
*	[X86] Remove FMA4 scalar intrinsics. Use llvm.fma intrinsic instead.	Craig Topper	2018-07-06	1	-2/+0
	The intrinsics can be implemented with a f32/f64 llvm.fma intrinsic and an insert into a zero vector. There are a couple regressions here due to SelectionDAG not being able to pull an fneg through an extract_vector_elt. I'm not super worried about this though as InstCombine should be able to do it before we get to SelectionDAG. llvm-svn: 336416
*	[X86] Remove all of the avx512 masked packed fma intrinsics. Use llvm.fma or ↵	Craig Topper	2018-07-06	1	-93/+1
	unmasked 512-bit intrinsics with rounding mode. This upgrades all of the intrinsics to use fneg instructions to convert fma into fmsub/fnmsub/fnmadd/fmsubadd, and uses a select instruction for masking. This matches how clang uses the intrinsics these days. llvm-svn: 336409
*	[X86] Remove the last of the 'x86.fma.' intrinsics and autoupgrade them to ↵	Craig Topper	2018-07-05	1	-4/+0
	'llvm.fma'. Add upgrade tests for all. Still need to remove the AVX512 masked versions. llvm-svn: 336383
*	[X86] Remove X86 specific scalar FMA intrinsics and upgrade to target ↵	Craig Topper	2018-07-05	1	-2/+0
	independent FMA and extractelement/insertelement. llvm-svn: 336315
*	[X86] Remove some of the packed FMA3 intrinsics since we no longer use them ↵	Craig Topper	2018-07-05	1	-4/+0
	in clang. There's a regression in here due to inability to combine fneg inputs of X86ISD::FMSUB/FNMSUB/FNMADD nodes. More removals to come, but I wanted to stop and fix the regression that showed up in this first. llvm-svn: 336303
*	[X86] Remove masking from avx512 rotate intrinsics. Use select in IR instead.	Craig Topper	2018-06-30	1	-26/+26
	llvm-svn: 336035
*	[X86] Remove masking from the avx512 packed sqrt intrinsics. Use select in ↵	Craig Topper	2018-06-29	1	-5/+3
	IR instead. While there, improve the coverage of the intrinsic testing and add fast-isel tests. llvm-svn: 335944
*	[X86] Rename the autoupgraded versions of packed fp compare and fpclass intrinsics ↵	Craig Topper	2018-06-27	1	-14/+12
	that don't take a mask as input to exclude '.mask.' from their name. I think the intrinsics named 'avx512.mask.' should refer to the previous behavior of taking a mask argument in the intrinsic instead of using a 'select' or 'and' instruction in IR to accomplish the masking. This is more consistent with the goal that eventually we will have no intrinsics that have masking built in. When we reach that goal, we should have no intrinsics named "avx512.mask". llvm-svn: 335744
*	[X86] Simplify intrinsic table binary search to not require a temporary struct.	Craig Topper	2018-06-25	1	-10/+9
	std::lower_bound doesn't require the thing to search for to be the same type as the table entries. We just need to define an appropriate comparison function that can take a table entry and an intrinsic number. llvm-svn: 335518
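The heterogeneous search described in this commit can be sketched as follows. This is a minimal illustration, not the code from the patch: the `IntrinsicEntry` struct, its field names, the table contents, and the `lookup` helper are all hypothetical. It only shows that `std::lower_bound` accepts a key whose type differs from the element type, as long as the comparator takes (element, key).

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>

// Hypothetical table entry; the real table's fields are not shown in this log.
struct IntrinsicEntry {
  uint16_t Id;
  uint16_t Opcode;
};

// The table must be sorted by Id for the binary search to be valid.
static const IntrinsicEntry Table[] = {
    {3, 100}, {7, 101}, {12, 102}, {20, 103}};

// Look up an intrinsic number without building a temporary IntrinsicEntry.
// The comparator takes (table entry, key), so the key can be a plain integer.
const IntrinsicEntry *lookup(unsigned IntNo) {
  const IntrinsicEntry *End = std::end(Table);
  const IntrinsicEntry *I = std::lower_bound(
      std::begin(Table), End, IntNo,
      [](const IntrinsicEntry &E, unsigned Key) { return E.Id < Key; });
  return (I != End && I->Id == IntNo) ? I : nullptr;
}
```

A caller would write `lookup(12)` and check the result for null before reading `Opcode`.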
*	[X86] Remove masking from 512-bit floating max/min intrinsics. Use select ↵	Craig Topper	2018-06-21	1	-8/+4
	instruction instead. llvm-svn: 335199
*	[X86] Lowering sqrt intrinsics to native IR	Tomasz Krupa	2018-06-15	1	-8/+0
	Summary: Complementary patch to lowering sqrt intrinsics in Clang. Reviewers: craig.topper, spatel, RKSimon, DavidKreitzer, uriel.k Reviewed By: craig.topper Subscribers: tkrupa, mike.dvoretsky, llvm-commits Differential Revision: https://reviews.llvm.org/D41599 llvm-svn: 334849
*	[x86] fix mappings of cvttp2si/cvttp2ui x86 intrinsics to x86-specific nodes ↵	Craig Topper	2018-06-14	1	-20/+20
	and isel patterns (PR37551) Summary: The tests in: https://bugs.llvm.org/show_bug.cgi?id=37751 ...show miscompiles because we wrongly mapped and folded x86-specific intrinsics into generic DAG nodes. This patch corrects the mappings in X86IntrinsicsInfo.h and adds isel matching corresponding to the new patterns. The complete tests for the failure cases should be in avx-cvttp2si.ll and sse-cvttp2si.ll and avx512-cvttp2i.ll Reviewers: RKSimon, gbedwell, spatel Reviewed By: spatel Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D47993 llvm-svn: 334685
*	[X86] Remove masking from avx512vbmi2 concat and shift by immediate ↵	Craig Topper	2018-06-13	1	-19/+19
	intrinsics. Use select in IR instead. llvm-svn: 334576
*	[X86] Remove masking from dbpsadbw intrinsics, use select in IR instead.	Craig Topper	2018-06-11	1	-7/+4
	llvm-svn: 334384
*	[X86] Remove and autoupgrade the expandload and compressstore intrinsics.	Craig Topper	2018-06-11	1	-74/+1
	We use the target independent intrinsics now. llvm-svn: 334381
*	[X86] Remove masking from the 512-bit masked floating point add/sub/mul/div ↵	Craig Topper	2018-06-10	1	-16/+8
	intrinsics. Use a select in IR instead. llvm-svn: 334358
*	[X86][BMI][TBM] Only demand bottom 16-bits of the BEXTR control op (PR34042)	Simon Pilgrim	2018-06-06	1	-0/+4
	Only the bottom 16-bits of BEXTR's control op are required (7:0 INDEX, 15:8 LENGTH). Differential Revision: https://reviews.llvm.org/D47690 llvm-svn: 334083
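To see why only the low 16 bits of the control operand matter, here is a small software model of a 32-bit BEXTR. It is a sketch based on the documented instruction semantics (start index in control bits 7:0, length in bits 15:8); the function name `bextr32` is made up for illustration and this is not code from the patch.

```cpp
#include <cstdint>

// Software model of 32-bit BEXTR. Only Control[15:0] is ever read:
// bits 7:0 give the start index, bits 15:8 give the field length.
uint32_t bextr32(uint32_t Src, uint32_t Control) {
  uint32_t Start = Control & 0xFF;
  uint32_t Len = (Control >> 8) & 0xFF;
  if (Start >= 32 || Len == 0)
    return 0; // nothing to extract
  uint32_t Shifted = Src >> Start;
  if (Len >= 32)
    return Shifted; // field covers every remaining bit
  return Shifted & ((1u << Len) - 1);
}
```

Because the upper control bits never reach the result, a demanded-bits analysis can treat them as don't-care, which is the simplification the commit enables.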
*	[X86] Remove and autoupgrade masked avx512vnni intrinsics using the unmasked ↵	Craig Topper	2018-06-03	1	-26/+0
	intrinsics and select instructions. llvm-svn: 333857
*	[X86] Lowering FMA intrinsics to native IR (LLVM part)	Gabor Buella	2018-05-30	1	-0/+7
	Support for Clang lowering of fused intrinsics. This patch: 1. Removes bindings to clang fma intrinsics. 2. Introduces new LLVM unmasked intrinsics with rounding mode: int_x86_avx512_vfmadd_pd_512 int_x86_avx512_vfmadd_ps_512 int_x86_avx512_vfmaddsub_pd_512 int_x86_avx512_vfmaddsub_ps_512 supported with a new intrinsic type (INTR_TYPE_3OP_RM). 3. Introduces new x86 fmaddsub/fmsubadd folding. 4. Introduces new tests for code emitted by sequences introduced in Clang part. Patch by tkrupa Reviewers: craig.topper, sroland, spatel, RKSimon Reviewed By: craig.topper, RKSimon Differential Revision: https://reviews.llvm.org/D47443 llvm-svn: 333554
*	[X86] Add unmasked AVX512VNNI intrinsics. Use a select in IR instead.	Craig Topper	2018-05-30	1	-0/+14
	A future patch will remove the old masked intrinsics. llvm-svn: 333508
*	[X86] Remove masked vpermi2var/vpermt2var intrinsics and autoupgrade.	Craig Topper	2018-05-29	1	-112/+1
	We have unmasked intrinsics now and wrap them with a select. This is a net reduction of 36 intrinsics from before the unmasked intrinsics were added. llvm-svn: 333388
*	[X86] Add unmasked vpermi2var intrinsics so we can use explicit select ↵	Craig Topper	2018-05-29	1	-0/+18
	instructions for masking in clang. This will allow us to remove the 3 different flavors of masked intrinsics. I'm leaving the actual intrinsic removal for another patch. llvm-svn: 333386
*	[X86] Converge X86ISD::VPERMV3 and X86ISD::VPERMIV3 to a single opcode.	Craig Topper	2018-05-28	1	-37/+37
	These do the same thing with the first and second sources swapped. They previously came from separate intrinsics that specified different masking behavior. But we can cover that with isel patterns and a single node. This is a step towards reducing the number of intrinsics needed. A bunch of tests change because we are now biased to choosing VPERMT over VPERMI when there is nothing to signal that commuting is beneficial. llvm-svn: 333383
*	[X86] Remove masking from avx512ifma intrinsics. Use a select instead.	Craig Topper	2018-05-26	1	-25/+7
	This allows us to avoid having mask and maskz variants, reducing from 12 intrinsics to 6. llvm-svn: 333346
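The reason a single unmasked intrinsic plus a select can stand in for both the mask and maskz variants can be modeled per lane. The sketch below is only an illustration of the semantics in plain C++; `selectLanes` is a hypothetical helper, the lanes are plain `int` values, and at most 32 lanes are assumed so the mask fits in a `uint32_t`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Per-lane model of a vector select: lane I takes Result[I] when mask bit I
// is set, otherwise Passthru[I]. Assumes N <= 32.
template <std::size_t N>
std::array<int, N> selectLanes(uint32_t Mask, const std::array<int, N> &Result,
                               const std::array<int, N> &Passthru) {
  std::array<int, N> Out{};
  for (std::size_t I = 0; I != N; ++I)
    Out[I] = ((Mask >> I) & 1u) ? Result[I] : Passthru[I];
  return Out;
}
```

Passing the original destination as `Passthru` models merge masking (the old "mask" form), while passing an all-zero array models zero masking (the old "maskz" form), so the operation itself needs no masking variants.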
*	[X86] Remove 128/256-bit cvtdq2ps, cvtudq2ps, cvtqq2pd, cvtuqq2pd intrinsics.	Craig Topper	2018-05-21	1	-14/+0
	These can all be implemented with sitofp/uitofp instructions. llvm-svn: 332916
*	[X86] Remove masking from vpternlog intrinsics. Use a select in IR instead.	Craig Topper	2018-05-21	1	-25/+7
	This removes 6 intrinsics since we no longer need separate mask and maskz intrinsics. Differential Revision: https://reviews.llvm.org/D47124 llvm-svn: 332890
*	[X86] Remove mask arguments from permvar builtins/intrinsics. Use a select ↵	Craig Topper	2018-05-20	1	-29/+17
	in IR instead. Someday maybe we'll use selects for all intrinsics. llvm-svn: 332824
*	[X86] Remove and autoupgrade masked vpermd/vpermps intrinsics.	Craig Topper	2018-05-13	1	-4/+0
	llvm-svn: 332198
*	[X86] Remove some unused masked conversion intrinsics that can be replaced ↵	Craig Topper	2018-05-12	1	-18/+0
	with an older intrinsic and a select. This is what clang already uses. llvm-svn: 332170
*	[X86] Remove and autoupgrade a bunch of FMA intrinsics that are no longer ↵	Craig Topper	2018-05-11	1	-22/+0
	used by clang. llvm-svn: 332146
*	[x86] Revert r330322 (& r330323): Lowering x86 adds/addus/subs/subus intrinsics	Chandler Carruth	2018-04-26	1	-0/+40
	The LLVM commit introduces a crash in LLVM's instruction selection. I filed http://llvm.org/PR37260 with the test case. llvm-svn: 330997
*	Lowering x86 adds/addus/subs/subus intrinsics (llvm part)	Alexander Ivchenko	2018-04-19	1	-40/+0
	This is the patch that lowers x86 intrinsics to native IR in order to enable optimizations. The patch also includes folding of previously missing saturation patterns so that IR emits the same machine instructions as the intrinsics. Patch by tkrupa Differential Revision: https://reviews.llvm.org/D44785 llvm-svn: 330322
*	[X86] Remove 128/256-bit masked pmaddubsw and pmaddwd intrinsics. Replace ↵	Craig Topper	2018-04-11	1	-12/+4
	512-bit masked intrinsic with unmasked intrinsic and a select. The 128/256-bit versions were no longer used by clang. It uses the legacy SSE/AVX2 version and a select. The 512-bit was changed to the same for consistency. llvm-svn: 329774
*	[X86] Change X86::PMULDQ/PMULUDQ opcodes to take vXi64 type as input instead ↵	Craig Topper	2018-03-08	1	-6/+0
	of vXi32. This instruction can be thought of as reading either the even elements of a vXi32 input or the lower half of each element of a vXi64 input. We currently use the vXi32 interpretation, but vXi64 matches better with its broadcast behavior in EVEX. I'm looking at moving MULDQ/MULUDQ creation to a DAG combine so we can do it when AVX512DQ is enabled without having to go through Custom lowering. But in some of the test cases we failed to use a broadcast load due to the size difference. This should help with that. I'm also wondering if we can model these instructions in native IR and remove the intrinsics and I think using a vXi64 type will work better with that. llvm-svn: 326991
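The vXi64 interpretation described above can be illustrated for a single 64-bit lane. This is a sketch of the instruction semantics only, not code from the patch; the function names `pmuludqLane` and `pmuldqLane` are invented for the example.

```cpp
#include <cstdint>

// One 64-bit lane of PMULUDQ under the vXi64 view: only the low 32 bits of
// each input lane are read, zero-extended, and multiplied to a 64-bit result.
uint64_t pmuludqLane(uint64_t A, uint64_t B) {
  return (A & 0xFFFFFFFFull) * (B & 0xFFFFFFFFull);
}

// One 64-bit lane of PMULDQ: the low 32 bits of each input lane are
// sign-extended before the multiply.
int64_t pmuldqLane(uint64_t A, uint64_t B) {
  int64_t LoA = static_cast<int32_t>(static_cast<uint32_t>(A));
  int64_t LoB = static_cast<int32_t>(static_cast<uint32_t>(B));
  return LoA * LoB;
}
```

The upper 32 bits of each input lane are ignored in both cases, which is the same data the vXi32 view describes as skipping the odd elements.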
*	[X86] Allow int_x86_sse2_cvtps2dq and int_x86_avx_cvt_ps2dq_256 to select ↵	Craig Topper	2018-02-24	1	-0/+2
	EVEX encoded instructions. llvm-svn: 326041
*	[X86] Add 512-bit unmasked pmulhrsw/pmulhw/pmulhuw intrinsics. Remove and ↵	Craig Topper	2018-02-20	1	-9/+3
	auto upgrade 128/256/512 bit masked pmulhrsw/pmulhw/pmulhuw intrinsics. The 128 and 256 bit versions were already not used by clang. This adds an equivalent unmasked 512 bit version. Then autoupgrades all sizes to use unmasked intrinsics plus select. llvm-svn: 325559
*	[X86] Remove MASK_BINOP intrinsic type. NFC	Craig Topper	2018-02-11	1	-1/+1
	llvm-svn: 324858
*	[X86] Remove and autoupgrade kand/kandn/kor/kxor/kxnor/knot intrinsics.	Craig Topper	2018-02-03	1	-3/+0
	Clang already stopped using these a couple months ago. The test cases aren't great as there is nothing forcing the operations to stay in k-registers so some of them moved back to scalar ops due to the bitcasts being moved around. llvm-svn: 324177
*	[X86] Remove unused intrinsic type handling. NFC	Craig Topper	2018-01-26	1	-2/+2
	llvm-svn: 323503
*	[X86] Use ISD::TRUNCATE instead of X86ISD::VTRUNC when input and output ↵	Craig Topper	2018-01-14	1	-8/+8
	types have the same number of elements. llvm-svn: 322455
*	[X86] Remove llvm.x86.avx512.cvt2mask. intrinsics and autoupgrade to (icmp ↵	Craig Topper	2018-01-09	1	-13/+1
	slt X, 0). I had to drop fast-isel-abort from a test because we can't fast isel some of the mask stuff. When we used intrinsics we implicitly fell back to SelectionDAG for the intrinsic call without triggering the abort error. But with native IR that doesn't happen the same way. llvm-svn: 322050
*	[X86] Replace CVT2MASK ISD opcode with PCMPGTM compared to zero.	Craig Topper	2018-01-08	1	-12/+12
	CVT2MASK is just checking the sign bit which can be represented with a comparison with zero. llvm-svn: 321985
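The equivalence this commit relies on, that collecting each lane's sign bit is the same as a signed "less than zero" compare, can be sketched in scalar C++. The helper name `signMask` and the fixed 8 x i32 shape are assumptions made for the example.

```cpp
#include <array>
#include <cstdint>

// Build a mask with bit I set when lane I is negative. For two's-complement
// integers, (V[I] < 0) is true exactly when the lane's sign bit is set.
uint8_t signMask(const std::array<int32_t, 8> &V) {
  uint8_t Mask = 0;
  for (unsigned I = 0; I != V.size(); ++I)
    if (V[I] < 0)
      Mask |= static_cast<uint8_t>(1u << I);
  return Mask;
}
```

Expressing the operation as a compare lets it share the generic comparison handling instead of needing a dedicated opcode.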
*	[X86] Remove unneeded code for handling the old kunpck intrinsics.	Craig Topper	2017-12-16	1	-1/+1
	llvm-svn: 320917