bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[X86] Fix (completely overridden) WriteFHAdd/WritePHAdd classes to allow us ↵	Simon Pilgrim	2018-04-22	1	-46/+6
\| \| \| \| \| \|	to remove unnecessary instrw overrides. llvm-svn: 330546
*	[X86] Remove unnecessary WriteFVarBlend/WriteVarBlend InstRW overrides.	Simon Pilgrim	2018-04-22	1	-29/+5
\| \| \| \| \| \|	This also fixes some of the ReadAfterLd issues due to InstRW. llvm-svn: 330544
*	[X86] Fix WriteMPSAD/WritePSADBW values to allow us to remove unnecessary ↵	Simon Pilgrim	2018-04-22	1	-15/+1
\| \| \| \| \| \|	instrw overrides. llvm-svn: 330542
*	[X86][SandyBridge] Remove unnecessary WritePOPCNTLd overrides by fixing load ↵	Simon Pilgrim	2018-04-22	1	-2/+1
\| \| \| \| \| \|	latency. llvm-svn: 330541
*	[X86][AVX] VPERM2F128/VINSERTF128 should be a shuffle256 schedule like ↵	Simon Pilgrim	2018-04-21	1	-0/+2
\| \| \| \| \| \|	VPERM2I128/VINSERTI128 llvm-svn: 330522
*	[X86] Add SchedWrites for LDMXCSR/STMXCSR.	Craig Topper	2018-04-21	1	-8/+4
\| \| \| \|	llvm-svn: 330517
*	[X86][SandyBridge] Strip unnecessary MOVQ/CVT instruction instrw overrides.	Simon Pilgrim	2018-04-21	1	-9/+3
\| \| \| \|	llvm-svn: 330505
*	[X86] Strip unnecessary x87 instruction instrw overrides from scheduler models.	Simon Pilgrim	2018-04-21	1	-26/+2
\| \| \| \|	llvm-svn: 330501
*	[X86] Add WriteFSign/WriteFLogic scheduler classes	Simon Pilgrim	2018-04-20	1	-27/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Split the fp and integer vector logical instruction scheduler classes - older CPUs especially often handled these on different pipes. This unearthed a couple of things that are also handled in this patch: (1) We were tagging avx512 fp logic ops as WriteFAdd, probably because of the lack of WriteFLogic (2) SandyBridge had integer logic ops only using Port5, when afaict they can use Ports015. (3) Cleaned up x86 FCHS/FABS scheduling as they are typically treated as fp logic ops. Differential Revision: https://reviews.llvm.org/D45629 llvm-svn: 330480
*	[X86][SandyBridge] Remove duplicate InstRWs from Sandy Bridge scheduler model.	Craig Topper	2018-04-20	1	-60/+6
\| \| \| \|	llvm-svn: 330465
*	[X86] Correct the scheduling data for register forms of XCHG and XADD on ↵	Craig Topper	2018-04-19	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \|	Intel CPUs. The XCHG16rr/XCHG32rr/XCHG64rr instructions should be 3 uops just like XCHG8rr. I believe they're just implemented as 3 move uops with a temporary register. XADD is probably 2 moves and an add also using a temporary register. Change the latency for both from 2 cycles to 3 cycles. Only 2 of the uops are serialized in their execution, the move into the temporary and the move out of the temporary. The move from one GPR to the other should be able to go in parallel with this if there are ALU resources available. llvm-svn: 330349
*	[X86] Merge some MMX instregex	Simon Pilgrim	2018-04-19	1	-26/+7
\| \| \| \| \| \|	There's a lot more but I'd prefer focussing on removing unnecessary InstRWs first. llvm-svn: 330347
*	[X86] Scrub scheduling information for MUL/IMUL on Intel CPUs.	Craig Topper	2018-04-19	1	-5/+44
\| \| \| \| \| \|	This removes a bunch of unnecessary InstRW overrides. It also cleans up the missing information from the Sandy Bridge model. Other fixes to other models. llvm-svn: 330308
*	[X86] Add separate scheduling class for PSADBW instruction.	Craig Topper	2018-04-17	1	-3/+2
\| \| \| \|	llvm-svn: 330204
*	[X86] Remove unnecessary InstRW overrides. Add some FIXMEs/TODOs.	Craig Topper	2018-04-17	1	-70/+9
\| \| \| \|	llvm-svn: 330203
*	[X86] Add FP comparison scheduler classes	Simon Pilgrim	2018-04-17	1	-24/+2
\| \| \| \| \| \| \| \|	Split VCMP/VMAX/VMIN instructions off to WriteFCmp and VCOMIS instructions off to WriteFCom instead of assuming they match WriteFAdd Differential Revision: https://reviews.llvm.org/D45656 llvm-svn: 330179
*	[X86] Add variable shuffle schedule classes	Simon Pilgrim	2018-04-11	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Split variable index shuffles from immediate index shuffles WriteFVarShuffle - variable 'in-lane' shuffles (VPERMILPS/VPERMIL2PS etc.) WriteVarShuffle - variable 'in-lane' shuffles (PSHUFB/VPPERM etc.) WriteFVarShuffle256 - variable 'cross-lane' shuffles (VPERMPS etc.) WriteVarShuffle256 - variable 'cross-lane' shuffles (VPERMD etc.) Differential Revision: https://reviews.llvm.org/D45404 llvm-svn: 329806
*	[X86] Add SchedWrites for CMOV and SETCC. Use them to remove InstRWs.	Craig Topper	2018-04-08	1	-10/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Cmov and setcc previously used WriteALU, but on Intel processors at least they are more restricted than basic ALU ops. This patch adds new SchedWrites for them and removes the InstRWs. I had to leave some InstRWs for CMOVA/CMOVBE and SETA/SETBE because those have an extra uop relative to the other condition codes on Intel CPUs. The test changes are due to fixing a missing ZnAGU dependency on the memory form of setcc. Reviewers: RKSimon, andreadb, GGanesh Reviewed By: RKSimon Subscribers: GGanesh, llvm-commits Differential Revision: https://reviews.llvm.org/D45380 llvm-svn: 329539
*	[X86] Add appropriate ReadAfterLd for the register input to memory forms of ↵	Craig Topper	2018-04-06	1	-5/+5
\| \| \| \| \| \|	ADC/SBB. llvm-svn: 329424
*	[X86] Remove InstRWs for basic arithmetic instructions from Sandy Bridge ↵	Craig Topper	2018-04-06	1	-64/+4
\| \| \| \| \| \| \| \|	scheduler model. We can get this right through WriteALU and friends now. llvm-svn: 329417
*	[X86] Add an extra store address cycle to WriteRMW in the Sandy ↵	Craig Topper	2018-04-06	1	-3/+3
\| \| \| \| \| \| \| \|	Bridge/Broadwell/Haswell/Skylake scheduler model. Even though the address was calculated for the load, it's calculated again for the store. llvm-svn: 329415
*	[X86][SandyBridge] Add (V)DPPS memory fold latencies	Simon Pilgrim	2018-04-06	1	-0/+14
\| \| \| \| \| \|	Noticed this during D44654 llvm-svn: 329389
*	[X86][SandyBridge] SBWriteResPair +5cy Memory Folds	Simon Pilgrim	2018-04-06	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \|	As mentioned on D44647, this patch increases the default memory latency to +5cy, which more closely matches what most custom cases are doing for reg-mem instructions. I've bumped LoadLatency, ReadAfterLd and WriteLoad values to 5cy to be consistent. As Sandy Bridge is currently our default generic model, this affects a lot of scheduling tests... Differential Revision: https://reviews.llvm.org/D44654 llvm-svn: 329388
*	[X86] Separate CDQ and CDQE in the scheduler model.	Craig Topper	2018-04-05	1	-4/+2
\| \| \| \| \| \|	According to Agner's data, CDQE is closer to CWDE. llvm-svn: 329354
*	[X86] Add MOVZPQILo2PQIrr to the Sandy Bridge scheduler model	Craig Topper	2018-04-05	1	-1/+1
\| \| \| \|	llvm-svn: 329351
*	[X86] Add LEAVE instruction to the scheduler models using the same data as ↵	Craig Topper	2018-04-05	1	-2/+8
\| \| \| \| \| \| \| \| \| \|	LEAVE64. Make LEAVE/LEAVE64 more correct on Sandy Bridge. This is the 32-bit mode version of LEAVE64. It should be at least somewhat similar to LEAVE64. The Sandy Bridge version was missing a load port use. llvm-svn: 329347
*	[X86] Remove some InstRWs for plain store instructions on Sandy Bridge.	Craig Topper	2018-04-05	1	-28/+5
\| \| \| \| \| \|	We were forcing the latency of these instructions to 5 cycles, but every other scheduler model had them as 1 cycle. I'm sure I didn't get everything, but this gets a big portion. llvm-svn: 329339
*	[X86] Revert r329251-329254	Craig Topper	2018-04-05	1	-5/+28
\| \| \| \| \| \| \| \| \| \| \| \| \|	It's failing on the bots and I'm not sure why. This reverts: [X86] Synchronize the SchedRW on some EVEX instructions with their VEX equivalents. [X86] Use WriteFShuffle256 for VEXTRACTF128 to be consistent with VEXTRACTI128 which uses WriteShuffle256. [X86] Remove some InstRWs for plain store instructions on Sandy Bridge. [X86] Auto-generate complete checks. NFC llvm-svn: 329256
*	[X86] Auto-generate complete checks. NFC	Craig Topper	2018-04-05	1	-28/+5
\| \| \| \|	llvm-svn: 329251
*	[X86] Separate BSWAP32r and BSWAP64r scheduling data in ↵	Craig Topper	2018-04-04	1	-1/+8
\| \| \| \| \| \| \| \|	SandyBridge/Haswell/Broadwell/Skylake scheduler models. The BSWAP64r version is 2 uops and BSWAP32r is only 1 uop. The regular expressions also looked for a non-existent BSWAP16r. llvm-svn: 329211
*	[X86] Correct the throughput for divide instructions in Sandy ↵	Craig Topper	2018-04-02	1	-20/+22
\| \| \| \| \| \| \| \|	Bridge/Haswell/Broadwell/Skylake scheduler models. Fixes most of PR36898. Still need to fix the 512-bit instructions, but Agner's tables don't have those. llvm-svn: 328960
*	[X86] Add SchedRW for PMULLD	Craig Topper	2018-03-31	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: It seems many CPUs don't implement this instruction as well as the other vector multiplies. Often using a multi uop flow. Silvermont in particular has a 7 uop flow with 11 cycle throughput. Sandy Bridge implements it as a single uop with 5 cycle latency and 1 cycle throughput. But Haswell and later use 2 uops with 10 cycle latency and 2 cycle throughput. This patch adds a new X86SchedWritePair we can use to tag this instruction separately. I've provided correct information for Silvermont, Btver2, and Sandy Bridge. I've removed the InstRWs for SandyBridge. I've left Haswell/Broadwell/Skylake InstRWs in place because I wasn't sure how to account for the different load latency between 128 and 256 bits. I also left Znver1 InstRWs in place because the existing values don't match Agner's spreadsheet. I also left a FIXME in the SandyBridge model because it being used for the "generic" model is too optimistic for the 256/512-bit versions since those are multiple uops on all known CPUs. Reviewers: RKSimon, GGanesh, courbet Reviewed By: RKSimon Subscribers: gchatelet, gbedwell, andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D44972 llvm-svn: 328914
*	[X86] Correct the placement of ReadAfterLd in BEXTR and BZHI. Add dedicated ↵	Craig Topper	2018-03-29	1	-0/+5
\| \| \| \| \| \| \| \| \| \|	SchedRW for BEXTR/BZHI. These instructions have the memory operand before the register operand, so we need to put ReadDefault for all the load ops first, then the ReadAfterLd. Differential Revision: https://reviews.llvm.org/D44838 llvm-svn: 328823
*	[X86] Add WriteFMOVMSK/WriteVecMOVMSK/WriteMMXMOVMSK scheduler classes	Simon Pilgrim	2018-03-27	1	-4/+6
\| \| \| \| \| \| \| \|	Currently MOVMSK instructions use the WriteVecLogic class, which is a very poor choice given that MOVMSK involves a SSE->GPR transfer. Differential Revision: https://reviews.llvm.org/D44924 llvm-svn: 328664
*	[X86] Add WriteCRC32 scheduler class	Simon Pilgrim	2018-03-26	1	-5/+2
\| \| \| \| \| \| \| \|	Currently CRC32 instructions use the WriteFAdd class; this patch splits them off into their own class. At the moment it is still mostly just a duplicate of WriteFAdd, but it can now be tweaked on a target-by-target basis. Differential Revision: https://reviews.llvm.org/D44647 llvm-svn: 328582
*	[X86] Add WriteBitScan/WriteLZCNT/WriteTZCNT/WritePOPCNT scheduler classes ↵	Simon Pilgrim	2018-03-26	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \|	(PR36881) Give the bit count instructions their own scheduler classes instead of forcing them into existing classes. These were mostly overridden anyway, but I had to add in costs from Agner for silvermont and znver1 and the Fam16h SoG for btver2 (Jaguar). Differential Revision: https://reviews.llvm.org/D44879 llvm-svn: 328566
*	[X86] Merge the SSE and AVX versions of fp divs and sqrts in the ↵	Craig Topper	2018-03-26	1	-27/+11
\| \| \| \| \| \| \| \|	SandyBridge/Haswell/Broadwell/Skylake scheduler models. I've used Agner's data as best I could to get the values to converge on. llvm-svn: 328473
*	[X86] Add the ability to override memory folding latency to schedules and ↵	Simon Pilgrim	2018-03-25	1	-5/+6
\| \| \| \| \| \| \| \| \| \|	add 1uop for memory folds for Intel models The Intel models need an extra 1uop for memory folded instructions, plus a lot of instructions take a non-default memory latency which should allow us to use the multiclass a lot more to tidy things up. Differential Revision: https://reviews.llvm.org/D44840 llvm-svn: 328446
*	[X86][SandyBridge] Merge xmm/ymm instructions instregex entries to reduce ↵	Simon Pilgrim	2018-03-24	1	-158/+79
\| \| \| \| \| \|	regex matches and compile time llvm-svn: 328431
*	[X86][SandyBridge] Fix missing comma that was causing string concatenation ↵	Simon Pilgrim	2018-03-23	1	-1/+1
\| \| \| \| \| \| \| \|	of 2 instregex entries Found while updating D44687 llvm-svn: 328308
*	[X86] Correct the latencies of SNB integer vector multiplies based on ↵	Craig Topper	2018-03-23	1	-9/+25
\| \| \| \| \| \|	Agner's data. Add missing MMX multiplies. llvm-svn: 328295
*	[X86] Merge VMOVMSKBrr and MOVMSKBrr in the SNB scheduler model.	Craig Topper	2018-03-23	1	-3/+2
\| \| \| \| \| \|	The VMOVMSKBrr was in a separate InstRW with a lower latency, but I assume they should be the same, and the higher latency matches Agner's table, so I'm going with that. llvm-svn: 328291
*	[X86] Rename VROUNDYPS* and VROUNDYPD* instructions to VROUNDPSY* and ↵	Craig Topper	2018-03-22	1	-4/+4
\| \| \| \| \| \| \| \| \| \|	VROUNDPDY*. Fix itinerary mistake on all memory forms of VROUNDPD This makes the Y position consistent with other instructions. This should have been NFC, but while refactoring the multiclass I noticed that VROUNDPD memory forms were using the register itinerary. llvm-svn: 328254
*	[X86] Correct the scheduling data for some of the 32 and 64 bit multiplies ↵	Craig Topper	2018-03-22	1	-0/+2
\| \| \| \| \| \|	to match, as best I understand, how they are implemented. llvm-svn: 328231
*	[X86][SSE42] Use the default PCMPEST/PCMPIST scheduler classes directly. NFCI.	Simon Pilgrim	2018-03-22	1	-20/+7
\| \| \| \| \| \|	Models were completely overriding all SSE42 strins instructions when the default classes could be used for exactly the same coverage. llvm-svn: 328203
*	[X86] Use the default AES scheduler classes directly. NFCI.	Simon Pilgrim	2018-03-22	1	-34/+0
\| \| \| \| \| \| \| \| \|	Models were completely overriding all AES instructions when the WriteAES default classes could be used for exactly the same coverage. Removes 6 unnecessary scheduler classes from every model. Note: Still looking for a way for tblgen to warn when this is happening - often the override is more complete than the default. llvm-svn: 328192
*	[X86][SandyBridge] Merge more VEX/non-VEX instregex patterns (NFCI) (PR35955)	Simon Pilgrim	2018-03-21	1	-637/+326
\| \| \| \|	llvm-svn: 328110
*	[X86][SandyBridge] Merge multiple InstrRW entries that map to the same ↵	Simon Pilgrim	2018-03-20	1	-1520/+1352
\| \| \| \| \| \| \| \|	SchedWriteRes group (NFCI) (PR35955) I've also merged some VEX/non-VEX instregex strings with a (V?) prefix - there are still a lot more of these to do. llvm-svn: 327974
*	[X86] Add TEST16mi/TEST32mi/TEST64mi32 to the ↵	Craig Topper	2018-03-20	1	-2/+2
\| \| \| \| \| \| \| \|	Sandybridge/Haswell/Broadwell/Skylake scheduler models. Move it from a load+store group on SNB to a load only group, the same group as CMP. llvm-svn: 327944
*	[X86] Add ROR/ROL/SHR/SAR by 1 instructions to the Sandy Bridge scheduler model.	Craig Topper	2018-03-20	1	-0/+8
\| \| \| \| \| \|	I assume these match the generic immediate version like they do in the other models. llvm-svn: 327943