bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[X86] Remove unnecessary vector memory folded InstRW overrides.	Simon Pilgrim	2018-04-23	1	-16/+0
\| \| \| \| \| \|	We have test coverage for these with resources-sse/avx llvm-svn: 330662
*	[X86] Remove unnecessary BMI2 InstRW overrides.	Simon Pilgrim	2018-04-23	1	-10/+2
\| \| \| \| \| \|	We have test coverage for these with resources-bmi2.s llvm-svn: 330659
*	[X86] Remove unnecessary WriteLEA InstRW overrides.	Simon Pilgrim	2018-04-23	1	-2/+1
\| \| \| \|	llvm-svn: 330648
*	[X86] Replace x87 instregex with instrs if they only match one instruction	Simon Pilgrim	2018-04-23	1	-7/+6
\| \| \| \|	llvm-svn: 330611
*	[X86] Remove instregex matching from CLAC/STAC.	Simon Pilgrim	2018-04-23	1	-4/+2
\| \| \| \| \| \|	Note - noticed this because the STAC case was unintentionally matching against STACK pseudo instructions. llvm-svn: 330588
*	[X86] Remove unnecessary MMX reg-mem InstRW scheduler overrides.	Simon Pilgrim	2018-04-23	1	-19/+1
\| \| \| \|	llvm-svn: 330581
*	[X86] Remove unnecessary WriteFBlend/WriteBlend InstRW overrides.	Simon Pilgrim	2018-04-22	1	-7/+3
\| \| \| \| \| \|	Fixed a lot of the default classes which were being completely overridden. llvm-svn: 330554
*	[X86] Fix (completely overridden) WriteFHAdd/WritePHAdd classes to allow us ↵	Simon Pilgrim	2018-04-22	1	-21/+1
\| \| \| \| \| \|	to remove unnecessary instrw overrides. llvm-svn: 330546
*	[X86] Remove unnecessary WriteFVarBlend/WriteVarBlend InstRW overrides.	Simon Pilgrim	2018-04-22	1	-26/+2
\| \| \| \| \| \|	This also fixes some of the ReadAfterLd issues due to InstRW. llvm-svn: 330544
*	[X86] Fix WriteMPSAD/WritePSADBW values to allow us to remove unnecessary ↵	Simon Pilgrim	2018-04-22	1	-15/+1
\| \| \| \| \| \|	instrw overrides. llvm-svn: 330542
*	[X86] Strip unnecessary prefetch + vector move/load instrw overrides from ↵	Simon Pilgrim	2018-04-21	1	-27/+2
\| \| \| \| \| \|	scheduler models. llvm-svn: 330527
*	[X86] Strip unnecessary broadcast/shuffle256 instrw overrides from scheduler ↵	Simon Pilgrim	2018-04-21	1	-16/+1
\| \| \| \| \| \|	models. llvm-svn: 330523
*	[X86] Strip unnecessary vector integer math, shift-imm, extend, shuffle, ↵	Simon Pilgrim	2018-04-21	1	-43/+3
\| \| \| \| \| \|	pack/unpack instruction instrw overrides from scheduler models. llvm-svn: 330521
*	[X86] Add SchedWrites for LDMXCSR/STMXCSR.	Craig Topper	2018-04-21	1	-9/+5
\| \| \| \|	llvm-svn: 330517
*	[X86] Strip unnecessary WriteFRcp/WriteFRsqrt instruction instrw overrides ↵	Simon Pilgrim	2018-04-21	1	-7/+3
\| \| \| \| \| \| \| \|	from scheduler models. This required the default Skylake schedules to be updated - these were being completely overridden by the InstRW and the existing values were not used at all. llvm-svn: 330510
*	[X86] Strip unnecessary WriteFShuffle instruction instrw overrides from ↵	Simon Pilgrim	2018-04-21	1	-20/+2
\| \| \| \| \| \|	scheduler models. llvm-svn: 330508
*	[X86] Strip unnecessary MMX instruction instrw overrides from scheduler models.	Simon Pilgrim	2018-04-21	1	-29/+1
\| \| \| \|	llvm-svn: 330503
*	[X86] Add WriteFSign/WriteFLogic scheduler classes	Simon Pilgrim	2018-04-20	1	-32/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Split the fp and integer vector logical instruction scheduler classes - older CPUs especially often handled these on different pipes. This unearthed a couple of things that are also handled in this patch: (1) We were tagging avx512 fp logic ops as WriteFAdd, probably because of the lack of WriteFLogic (2) SandyBridge had integer logic ops only using Port5, when afaict they can use Ports015. (3) Cleaned up x86 FCHS/FABS scheduling as they are typically treated as fp logic ops. Differential Revision: https://reviews.llvm.org/D45629 llvm-svn: 330480
*	[X86] Correct the scheduling data for register forms of XCHG and XADD on ↵	Craig Topper	2018-04-19	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \|	Intel CPUs. The XCHG16rr/XCHG32rr/XCHG64rr instructions should be 3 uops just like XCHG8rr. I believe they're just implemented as 3 move uops with a temporary register. XADD is probably 2 moves and an add also using a temporary register. Change the latency for both from 2 cycles to 3 cycles. Only 2 of the uops are serialized in their execution, the move into the temporary and the move out of the temporary. The move from one GPR to the other should be able to go in parallel with this if there are ALU resources available. llvm-svn: 330349
*	[X86] Merge some MMX instregex	Simon Pilgrim	2018-04-19	1	-40/+12
\| \| \| \| \| \|	There's a lot more but I'd prefer focussing on removing unnecessary InstRWs first. llvm-svn: 330347
*	[X86][FMA] Remove FMA reg-reg InstRW scheduler overrides.	Simon Pilgrim	2018-04-19	1	-4/+0
\| \| \| \| \| \|	These are all already handled identically by WriteFMA. llvm-svn: 330319
*	[X86] Scrub scheduling information for MUL/IMUL on Intel CPUs.	Craig Topper	2018-04-19	1	-11/+8
\| \| \| \| \| \|	This removes a bunch of unnecessary InstRW overrides. It also cleans up the missing information from the Sandy Bridge model. Other fixes to other models. llvm-svn: 330308
*	[X86] Add separate scheduling class for PSADBW instruction.	Craig Topper	2018-04-17	1	-3/+2
\| \| \| \|	llvm-svn: 330204
*	[X86] Remove unnecessary InstRW overrides. Add some FIXMEs/TODOs.	Craig Topper	2018-04-17	1	-7/+2
\| \| \| \|	llvm-svn: 330203
*	[X86] Add FP comparison scheduler classes	Simon Pilgrim	2018-04-17	1	-34/+4
\| \| \| \| \| \| \| \|	Split VCMP/VMAX/VMIN instructions off to WriteFCmp and VCOMIS instructions off to WriteFCom instead of assuming they match WriteFAdd Differential Revision: https://reviews.llvm.org/D45656 llvm-svn: 330179
*	[X86] Add variable shuffle schedule classes	Simon Pilgrim	2018-04-11	1	-6/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Split variable index shuffles from immediate index shuffles WriteFVarShuffle - variable 'in-lane' shuffles (VPERMILPS/VPERMIL2PS etc.) WriteVarShuffle - variable 'in-lane' shuffles (PSHUFB/VPPERM etc.) WriteFVarShuffle256 - variable 'cross-lane' shuffles (VPERMPS etc.) WriteVarShuffle256 - variable 'cross-lane' shuffles (VPERMD etc.) Differential Revision: https://reviews.llvm.org/D45404 llvm-svn: 329806
*	[X86] Add SchedWrites for CMOV and SETCC. Use them to remove InstRWs.	Craig Topper	2018-04-08	1	-10/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Cmov and setcc previously used WriteALU, but on Intel processors at least they are more restricted than basic ALU ops. This patch adds new SchedWrites for them and removes the InstRWs. I had to leave some InstRWs for CMOVA/CMOVBE and SETA/SETBE because those have an extra uop relative to the other condition codes on Intel CPUs. The test changes are due to fixing a missing ZnAGU dependency on the memory form of setcc. Reviewers: RKSimon, andreadb, GGanesh Reviewed By: RKSimon Subscribers: GGanesh, llvm-commits Differential Revision: https://reviews.llvm.org/D45380 llvm-svn: 329539
*	[X86] Add appropriate ReadAfterLd for the register input to memory forms of ↵	Craig Topper	2018-04-06	1	-8/+8
\| \| \| \| \| \|	ADC/SBB. llvm-svn: 329424
*	[X86] Attempt to model basic arithmetic instructions in the ↵	Craig Topper	2018-04-06	1	-61/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Haswell/Broadwell/Skylake scheduler models without InstRWs Summary: This patch removes InstRW overrides for basic arithmetic/logic instructions. To do this I've added the store address port to RMW. And used a WriteSequence to make the latency additive. It does not cover ADC/SBB because they have different latency. Apparently we were inconsistent about whether the store has latency or not thus the test changes. I've also left out Sandy Bridge because the load latency there is currently 4 cycles and should be 5. Reviewers: RKSimon, andreadb Reviewed By: andreadb Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45351 llvm-svn: 329416
*	[X86] Add an extra store address cycle to WriteRMW in the Sandy ↵	Craig Topper	2018-04-06	1	-3/+3
\| \| \| \| \| \| \| \|	Bridge/Broadwell/Haswell/Skylake scheduler model. Even though the address was calculated for the load, it's calculated again for the store. llvm-svn: 329415
*	[X86] Separate CDQ and CDQE in the scheduler model.	Craig Topper	2018-04-05	1	-4/+2
\| \| \| \| \| \|	According to Agner's data, CDQE is closer to CWDE. llvm-svn: 329354
*	[X86] Add LEAVE instruction to the scheduler models using the same data as ↵	Craig Topper	2018-04-05	1	-5/+2
\| \| \| \| \| \| \| \| \| \|	LEAVE64. Make LEAVE/LEAVE64 more correct on Sandy Bridge. This is the 32-bit mode version of LEAVE64. It should be at least somewhat similar to LEAVE64. The Sandy Bridge version was missing a load port use. llvm-svn: 329347
*	[X86] Remove some InstRWs for plain store instructions on Sandy Bridge.	Craig Topper	2018-04-05	1	-2/+0
\| \| \| \| \| \|	We were forcing the latency of these instructions to 5 cycles, but every other scheduler model had them as 1 cycle. I'm sure I didn't get everything, but this gets a big portion. llvm-svn: 329339
*	[X86] Revert r329251-329254	Craig Topper	2018-04-05	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	It's failing on the bots and I'm not sure why. This reverts: [X86] Synchronize the SchedRW on some EVEX instructions with their VEX equivalents. [X86] Use WriteFShuffle256 for VEXTRACTF128 to be consistent with VEXTRACTI128 which uses WriteShuffle256. [X86] Remove some InstRWs for plain store instructions on Sandy Bridge. [X86] Auto-generate complete checks. NFC llvm-svn: 329256
*	[X86] Auto-generate complete checks. NFC	Craig Topper	2018-04-05	1	-2/+0
\| \| \| \|	llvm-svn: 329251
*	[X86] Separate BSWAP32r and BSWAP64r scheduling data in ↵	Craig Topper	2018-04-04	1	-1/+8
\| \| \| \| \| \| \| \|	SandyBridge/Haswell/Broadwell/Skylake scheduler models. The BSWAP64r version is 2 uops and BSWAP32r is only 1 uop. The regular expressions also looked for a non-existent BSWAP16r. llvm-svn: 329211
*	[X86] Correct the throughput for divide instructions in Sandy ↵	Craig Topper	2018-04-02	1	-36/+84
\| \| \| \| \| \| \| \|	Bridge/Haswell/Broadwell/Skylake scheduler models. Fixes most of PR36898. Still need to fix the 512-bit instructions, but Agner's tables don't have those. llvm-svn: 328960
*	[X86] Give ADC8/16/32/64mi the same scheduling information as ↵	Craig Topper	2018-04-01	1	-8/+2
\| \| \| \| \| \| \| \|	ADC8/16/32/64mr and SBB8/16/32/64mi. It doesn't make a lot of sense that it would be different. llvm-svn: 328946
*	[X86] Add SchedRW for PMULLD	Craig Topper	2018-03-31	1	-14/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: It seems many CPUs don't implement this instruction as well as the other vector multiplies, often using a multi uop flow. Silvermont in particular has a 7 uop flow with 11 cycle throughput. Sandy Bridge implements it as a single uop with 5 cycle latency and 1 cycle throughput. But Haswell and later use 2 uops with 10 cycle latency and 2 cycle throughput. This patch adds a new X86SchedWritePair we can use to tag this instruction separately. I've provided correct information for Silvermont, Btver2, and Sandy Bridge. I've removed the InstRWs for SandyBridge. I've left Haswell/Broadwell/Skylake InstRWs in place because I wasn't sure how to account for the different load latency between 128 and 256 bits. I also left Znver1 InstRWs in place because the existing values don't match Agner's spreadsheet. I also left a FIXME in the SandyBridge model because using it for the "generic" model is too optimistic for the 256/512-bit versions, since those are multiple uops on all known CPUs. Reviewers: RKSimon, GGanesh, courbet Reviewed By: RKSimon Subscribers: gchatelet, gbedwell, andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D44972 llvm-svn: 328914
*	[X86] Correct the placement of ReadAfterLd in BEXTR and BZHI. Add dedicated ↵	Craig Topper	2018-03-29	1	-11/+5
\| \| \| \| \| \| \| \| \| \|	SchedRW for BEXTR/BZHI. These instructions have the memory operand before the register operand. So we need to put ReadDefault for all the load ops first. Then the ReadAfterLd Differential Revision: https://reviews.llvm.org/D44838 llvm-svn: 328823
*	[X86] Add WriteFMOVMSK/WriteVecMOVMSK/WriteMMXMOVMSK scheduler classes	Simon Pilgrim	2018-03-27	1	-4/+5
\| \| \| \| \| \| \| \|	Currently MOVMSK instructions use the WriteVecLogic class, which is a very poor choice given that MOVMSK involves a SSE->GPR transfer. Differential Revision: https://reviews.llvm.org/D44924 llvm-svn: 328664
*	[X86] Add WriteCRC32 scheduler class	Simon Pilgrim	2018-03-26	1	-0/+1
\| \| \| \| \| \| \| \|	Currently CRC32 instructions use the WriteFAdd class; this patch splits them off into their own class. At the moment it is still mostly just a duplicate of WriteFAdd, but it can now be tweaked on a target-by-target basis. Differential Revision: https://reviews.llvm.org/D44647 llvm-svn: 328582
*	[X86] Add WriteBitScan/WriteLZCNT/WriteTZCNT/WritePOPCNT scheduler classes ↵	Simon Pilgrim	2018-03-26	1	-14/+10
\| \| \| \| \| \| \| \| \| \| \| \|	(PR36881) Give the bit count instructions their own scheduler classes instead of forcing them into existing classes. These were mostly overridden anyway, but I had to add in costs from Agner for silvermont and znver1 and the Fam16h SoG for btver2 (Jaguar). Differential Revision: https://reviews.llvm.org/D44879 llvm-svn: 328566
*	[X86] Merge the SSE and AVX versions of fp divs and sqrts in the ↵	Craig Topper	2018-03-26	1	-32/+12
\| \| \| \| \| \| \| \|	SandyBridge/Haswell/Broadwell/Skylake scheduler models. I've used Agner's data as best I could to get the values to converge on. llvm-svn: 328473
*	[X86] Move (v)movss to port 5 only for Skylake. Move (v)movups/d to port 015 ↵	Craig Topper	2018-03-25	1	-3/+3
\| \| \| \| \| \| \| \|	for Skylake. This matches Agner's data and is consistent with what the EVEX instructions were doing on SKX. llvm-svn: 328465
*	[X86] Use WriteResPair for WriteIDiv to cleanup sched defs. NFCI.	Simon Pilgrim	2018-03-25	1	-10/+4
\| \| \| \|	llvm-svn: 328460
*	[X86][SkylakeClient] Fix missing comma	Simon Pilgrim	2018-03-25	1	-1/+1
\| \| \| \|	llvm-svn: 328458
*	[X86][SkylakeClient] Fix a set of regular expressions that were checking for ↵	Craig Topper	2018-03-25	1	-18/+18
\| \| \| \| \| \| \| \|	optionally starting with 'Y' instead of 'V'. These bad regexes were introduced by r328435. llvm-svn: 328454
*	[X86] Add the ability to override memory folding latency to schedules and ↵	Simon Pilgrim	2018-03-25	1	-5/+6
\| \| \| \| \| \| \| \| \| \|	add 1uop for memory folds for Intel models The Intel models need an extra 1uop for memory folded instructions, plus a lot of instructions take a non-default memory latency which should allow us to use the multiclass a lot more to tidy things up. Differential Revision: https://reviews.llvm.org/D44840 llvm-svn: 328446
*	[X86][SkylakeClient] Merge xmm/ymm instructions instregex entries to reduce ↵	Simon Pilgrim	2018-03-24	1	-1095/+478
\| \| \| \| \| \|	regex matches to reduce compile time llvm-svn: 328435