bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[X86][SKX] Setup WriteFMul and remove unnecessary InstRW scheduler overrides.	Simon Pilgrim	2018-04-24	1	-27/+1
\| \| \| \|	llvm-svn: 330760
*	[X86] Split off PHMINPOSUW to their own schedule class	Simon Pilgrim	2018-04-24	1	-4/+1
\| \| \| \| \| \|	This also fixes Jaguar's schedule which was treating it as the WriteVecIMul default. llvm-svn: 330756
*	[X86][F16C] Add WriteCvtF2FSt scheduling class	Simon Pilgrim	2018-04-24	1	-7/+6
\| \| \| \| \| \|	Fixes the classification of VCVTPS2PHmr/VCVTPS2PHYmr which were tagged as WriteCvtF2FLd_WriteRMW (PR36887) llvm-svn: 330737
*	[X86] Remove unnecessary FMA reg-mem InstRW scheduler overrides.	Simon Pilgrim	2018-04-24	1	-2/+0
\| \| \| \|	llvm-svn: 330720
*	[X86] Add vector element insertion/extraction scheduler classes	Simon Pilgrim	2018-04-24	1	-54/+22
\| \| \| \| \| \| \| \| \| \| \| \|	Split off pinsr/pextr and extractps instructions. (Mostly) fixes PR36887. Note: It might be worth adding a WriteFInsertLd class as well in the future. Differential Revision: https://reviews.llvm.org/D45929 llvm-svn: 330714
*	[X86] Remove unnecessary vector memory folded InstRW overrides.	Simon Pilgrim	2018-04-23	1	-30/+0
\| \| \| \| \| \|	We have test coverage for these with resources-sse/avx llvm-svn: 330662
*	[X86] Remove unnecessary BMI2 InstRW overrides.	Simon Pilgrim	2018-04-23	1	-10/+2
\| \| \| \| \| \|	We have test coverage for these with resources-bmi2.s llvm-svn: 330659
*	[X86] Remove unnecessary WriteLEA InstRW overrides.	Simon Pilgrim	2018-04-23	1	-2/+1
\| \| \| \|	llvm-svn: 330648
*	[X86] Replace x87 instregex with instrs if they only match one instruction	Simon Pilgrim	2018-04-23	1	-7/+6
\| \| \| \|	llvm-svn: 330611
*	[X86] Remove instregex matching from CLAC/STAC.	Simon Pilgrim	2018-04-23	1	-4/+2
\| \| \| \| \| \|	Note - noticed this in the STAC case, where the regex was unintentionally matching against STACK pseudo instructions. llvm-svn: 330588
*	[X86] Remove unnecessary MMX reg-mem InstRW scheduler overrides.	Simon Pilgrim	2018-04-23	1	-18/+1
\| \| \| \|	llvm-svn: 330581
*	[X86] Remove unnecessary WriteFBlend/WriteBlend InstRW overrides.	Simon Pilgrim	2018-04-22	1	-12/+3
\| \| \| \| \| \|	Fixed a lot of the default classes which were being completely overridden. llvm-svn: 330554
*	[X86] Remove unnecessary WriteFMul/WriteFRcp/WriteFRsqrt InstRW overrides.	Simon Pilgrim	2018-04-22	1	-20/+2
\| \| \| \|	llvm-svn: 330553
*	[X86][SkylakeServer] Remove unnecessary PMULLD instrw overrides.	Simon Pilgrim	2018-04-22	1	-21/+0
\| \| \| \|	llvm-svn: 330549
*	[X86] Fix (completely overridden) WriteFHAdd/WritePHAdd classes to allow us ↵	Simon Pilgrim	2018-04-22	1	-24/+2
\| \| \| \| \| \|	to remove unnecessary instrw overrides. llvm-svn: 330546
*	[X86] Remove unnecessary WriteFVarBlend/WriteVarBlend InstRW overrides.	Simon Pilgrim	2018-04-22	1	-30/+2
\| \| \| \| \| \|	This also fixes some of the ReadAfterLd issues due to InstRW. llvm-svn: 330544
*	[X86] Fix WriteMPSAD/WritePSADBW values to allow us to remove unnecessary ↵	Simon Pilgrim	2018-04-22	1	-11/+4
\| \| \| \| \| \|	instrw overrides. llvm-svn: 330542
*	[X86] Strip unnecessary prefetch + vector move/load instrw overrides from ↵	Simon Pilgrim	2018-04-21	1	-70/+3
\| \| \| \| \| \|	scheduler models. llvm-svn: 330527
*	[X86] Strip unnecessary broadcast/shuffle256 instrw overrides from scheduler ↵	Simon Pilgrim	2018-04-21	1	-82/+1
\| \| \| \| \| \|	models. llvm-svn: 330523
*	[X86] Strip unnecessary vector integer math, shift-imm, extend, shuffle, ↵	Simon Pilgrim	2018-04-21	1	-146/+2
\| \| \| \| \| \|	pack/unpack instruction instrw overrides from scheduler models. llvm-svn: 330521
*	[X86] Add SchedWrites for LDMXCSR/STMXCSR.	Craig Topper	2018-04-21	1	-10/+5
\| \| \| \|	llvm-svn: 330517
*	[X86] Strip unnecessary WriteFRcp/WriteFRsqrt instruction instrw overrides ↵	Simon Pilgrim	2018-04-21	1	-25/+3
\| \| \| \| \| \| \| \|	from scheduler models. This required the default Skylake schedules to be updated - these were being completely overridden by the InstRW and the existing values were not used at all. llvm-svn: 330510
*	[X86] Strip unnecessary WriteFShuffle instruction instrw overrides from ↵	Simon Pilgrim	2018-04-21	1	-85/+2
\| \| \| \| \| \|	scheduler models. llvm-svn: 330508
*	[X86] Strip unnecessary MMX instruction instrw overrides from scheduler models.	Simon Pilgrim	2018-04-21	1	-29/+1
\| \| \| \|	llvm-svn: 330503
*	[X86] Add WriteFSign/WriteFLogic scheduler classes	Simon Pilgrim	2018-04-20	1	-131/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Split the fp and integer vector logical instruction scheduler classes - older CPUs especially often handled these on different pipes. This unearthed a couple of things that are also handled in this patch: (1) We were tagging avx512 fp logic ops as WriteFAdd, probably because of the lack of WriteFLogic (2) SandyBridge had integer logic ops only using Port5, when afaict they can use Ports015. (3) Cleaned up x86 FCHS/FABS scheduling as they are typically treated as fp logic ops. Differential Revision: https://reviews.llvm.org/D45629 llvm-svn: 330480
*	[X86] Correct the scheduling data for register forms of XCHG and XADD on ↵	Craig Topper	2018-04-19	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \|	Intel CPUs. The XCHG16rr/XCHG32rr/XCHG64rr instructions should be 3 uops just like XCHG8rr. I believe they're just implemented as 3 move uops with a temporary register. XADD is probably 2 moves and an add also using a temporary register. Change the latency for both from 2 cycles to 3 cycles. Only 2 of the uops are serialized in their execution, the move into the temporary and the move out of the temporary. The move from one GPR to the other should be able to go in parallel with this if there are ALU resources available. llvm-svn: 330349
*	[X86] Merge some MMX instregex	Simon Pilgrim	2018-04-19	1	-43/+13
\| \| \| \| \| \|	There's a lot more but I'd prefer focussing on removing unnecessary InstRWs first. llvm-svn: 330347
*	[X86][FMA] Remove FMA reg-reg InstRW scheduler overrides.	Simon Pilgrim	2018-04-19	1	-9/+0
\| \| \| \| \| \|	These are all already handled identically by WriteFMA. llvm-svn: 330319
*	[X86] Scrub scheduling information for MUL/IMUL on Intel CPUs.	Craig Topper	2018-04-19	1	-7/+5
\| \| \| \| \| \|	This removes a bunch of unnecessary InstRW overrides. It also cleans up the missing information from the Sandy Bridge model. Other fixes to other models. llvm-svn: 330308
*	[X86] Add separate scheduling class for PSADBW instruction.	Craig Topper	2018-04-17	1	-7/+2
\| \| \| \|	llvm-svn: 330204
*	[X86] Remove unnecessary InstRW overrides. Add some FIXMEs/TODOs.	Craig Topper	2018-04-17	1	-7/+2
\| \| \| \|	llvm-svn: 330203
*	[X86] Add FP comparison scheduler classes	Simon Pilgrim	2018-04-17	1	-99/+5
\| \| \| \| \| \| \| \|	Split VCMP/VMAX/VMIN instructions off to WriteFCmp and VCOMIS instructions off to WriteFCom instead of assuming they match WriteFAdd Differential Revision: https://reviews.llvm.org/D45656 llvm-svn: 330179
*	[X86] Add variable shuffle schedule classes	Simon Pilgrim	2018-04-11	1	-51/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Split variable index shuffles from immediate index shuffles: WriteFVarShuffle - variable 'in-lane' shuffles (VPERMILPS/VPERMIL2PS etc.); WriteVarShuffle - variable 'in-lane' shuffles (PSHUFB/VPPERM etc.); WriteFVarShuffle256 - variable 'cross-lane' shuffles (VPERMPS etc.); WriteVarShuffle256 - variable 'cross-lane' shuffles (VPERMD etc.). Differential Revision: https://reviews.llvm.org/D45404 llvm-svn: 329806
*	[X86] Add SchedWrites for CMOV and SETCC. Use them to remove InstRWs.	Craig Topper	2018-04-08	1	-10/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Cmov and setcc previously used WriteALU, but on Intel processors at least they are more restricted than basic ALU ops. This patch adds new SchedWrites for them and removes the InstRWs. I had to leave some InstRWs for CMOVA/CMOVBE and SETA/SETBE because those have an extra uop relative to the other condition codes on Intel CPUs. The test changes are due to fixing a missing ZnAGU dependency on the memory form of setcc. Reviewers: RKSimon, andreadb, GGanesh Reviewed By: RKSimon Subscribers: GGanesh, llvm-commits Differential Revision: https://reviews.llvm.org/D45380 llvm-svn: 329539
*	[X86] Add appropriate ReadAfterLd for the register input to memory forms of ↵	Craig Topper	2018-04-06	1	-8/+8
\| \| \| \| \| \|	ADC/SBB. llvm-svn: 329424
*	[X86] Attempt to model basic arithmetic instructions in the ↵	Craig Topper	2018-04-06	1	-61/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Haswell/Broadwell/Skylake scheduler models without InstRWs Summary: This patch removes InstRW overrides for basic arithmetic/logic instructions. To do this I've added the store address port to RMW. And used a WriteSequence to make the latency additive. It does not cover ADC/SBB because they have different latency. Apparently we were inconsistent about whether the store has latency or not thus the test changes. I've also left out Sandy Bridge because the load latency there is currently 4 cycles and should be 5. Reviewers: RKSimon, andreadb Reviewed By: andreadb Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45351 llvm-svn: 329416
*	[X86] Add an extra store address cycle to WriteRMW in the Sandy ↵	Craig Topper	2018-04-06	1	-3/+3
\| \| \| \| \| \| \| \|	Bridge/Broadwell/Haswell/Skylake scheduler model. Even though the address was calculated for the load, it's calculated again for the store. llvm-svn: 329415
*	[X86][SkylakeServer] Merge 2 InstRW entries to the same sched group. NFCI.	Simon Pilgrim	2018-04-06	1	-2/+2
\| \| \| \|	llvm-svn: 329386
*	[X86] Separate CDQ and CDQE in the scheduler model.	Craig Topper	2018-04-05	1	-4/+2
\| \| \| \| \| \|	According to Agner's data, CDQE is closer to CWDE. llvm-svn: 329354
*	[X86] Add LEAVE instruction to the scheduler models using the same data as ↵	Craig Topper	2018-04-05	1	-5/+2
\| \| \| \| \| \| \| \| \| \|	LEAVE64. Make LEAVE/LEAVE64 more correct on Sandy Bridge. This is the 32-bit mode version of LEAVE64. It should be at least somewhat similar to LEAVE64. The Sandy Bridge version was missing a load port use. llvm-svn: 329347
*	[X86] Remove some InstRWs for plain store instructions on Sandy Bridge.	Craig Topper	2018-04-05	1	-3/+0
\| \| \| \| \| \|	We were forcing the latency of these instructions to 5 cycles, but every other scheduler model had them as 1 cycle. I'm sure I didn't get everything, but this gets a big portion. llvm-svn: 329339
*	[X86] Revert r329251-329254	Craig Topper	2018-04-05	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	It's failing on the bots and I'm not sure why. This reverts: [X86] Synchronize the SchedRW on some EVEX instructions with their VEX equivalents. [X86] Use WriteFShuffle256 for VEXTRACTF128 to be consistent with VEXTRACTI128 which uses WriteShuffle256. [X86] Remove some InstRWs for plain store instructions on Sandy Bridge. [X86] Auto-generate complete checks. NFC llvm-svn: 329256
*	[X86] Auto-generate complete checks. NFC	Craig Topper	2018-04-05	1	-3/+0
\| \| \| \|	llvm-svn: 329251
*	[X86] Separate BSWAP32r and BSWAP64r scheduling data in ↵	Craig Topper	2018-04-04	1	-1/+8
\| \| \| \| \| \| \| \|	SandyBridge/Haswell/Broadwell/Skylake scheduler models. The BSWAP64r version is 2 uops and BSWAP32r is only 1 uop. The regular expressions also looked for a non-existent BSWAP16r. llvm-svn: 329211
*	[X86][SkylakeServer] Correct throughput for 512-bit sqrt and divide.	Craig Topper	2018-04-02	1	-29/+28
\| \| \| \| \| \|	Data taken from the AVX512_SKX_PortAssign spreadsheet at http://instlatx64.atw.hu/ llvm-svn: 328961
*	[X86] Correct the throughput for divide instructions in Sandy ↵	Craig Topper	2018-04-02	1	-87/+101
\| \| \| \| \| \| \| \|	Bridge/Haswell/Broadwell/Skylake scheduler models. Fixes most of PR36898. Still need to fix the 512-bit instructions, but Agner's tables don't have those. llvm-svn: 328960
*	[X86] Give ADC8/16/32/64mi the same scheduling information as ↵	Craig Topper	2018-04-01	1	-8/+2
\| \| \| \| \| \| \| \|	ADC8/16/32/64mr and SBB8/16/32/64mi. It doesn't make a lot of sense that it would be different. llvm-svn: 328946
*	[X86] Add SchedRW for PMULLD	Craig Topper	2018-03-31	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: It seems many CPUs don't implement this instruction as well as the other vector multiplies. Often using a multi uop flow. Silvermont in particular has a 7 uop flow with 11 cycle throughput. Sandy Bridge implements it as a single uop with 5 cycle latency and 1 cycle throughput. But Haswell and later use 2 uops with 10 cycle latency and 2 cycle throughput. This patch adds a new X86SchedWritePair we can use to tag this instruction separately. I've provided correct information for Silvermont, Btver2, and Sandy Bridge. I've removed the InstRWs for SandyBridge. I've left Haswell/Broadwell/Skylake InstRWs in place because I wasn't sure how to account for the different load latency between 128 and 256 bits. I also left Znver1 InstRWs in place because the existing values don't match Agner's spreadsheet. I also left a FIXME in the SandyBridge model because it being used for the "generic" model is too optimistic for the 256/512-bit versions since those are multiple uops on all known CPUs. Reviewers: RKSimon, GGanesh, courbet Reviewed By: RKSimon Subscribers: gchatelet, gbedwell, andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D44972 llvm-svn: 328914
*	[X86] Correct the placement of ReadAfterLd in BEXTR and BZHI. Add dedicated ↵	Craig Topper	2018-03-29	1	-11/+5
\| \| \| \| \| \| \| \| \| \|	SchedRW for BEXTR/BZHI. These instructions have the memory operand before the register operand, so we need to put ReadDefault for all the load ops first, then the ReadAfterLd. Differential Revision: https://reviews.llvm.org/D44838 llvm-svn: 328823
*	[X86][SkylakeServer] Remove checks for 'k', 'z', '_Int' and 'b' from ↵	Craig Topper	2018-03-28	1	-2116/+2116
\| \| \| \| \| \| \| \| \| \|	scheduler regexs. Most of these were optional matches at the end of the strings, but since the strings themselves are prefix matches by default you don't need to check for something optional at the end. I've left the 'b' on memory instructions where it means 'broadcast' because I'm not sure those really have the same load latency and we may need to split them explicitly in the future. llvm-svn: 328730
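
Two of the commits above (r328730 and r330588) depend on instregex patterns being prefix matches. The sketch below illustrates that behaviour with Python's `re` module as a stand-in; it is not the TableGen implementation. The instruction list, including `STACK_PSEUDO`, and the helper names `prefix_matches` and `exact_matches` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical instruction names, used only to illustrate the matching rules.
INSTRUCTIONS = [
    "STAC",
    "STACK_PSEUDO",
    "VADDPSZrr",
    "VADDPSZrrk",
    "VADDPSZrrkz",
    "VADDPSZrm",
    "VADDPSZrmb",
]


def prefix_matches(pattern, names):
    """Names matched when the pattern is anchored only at the start.

    re.match() anchors at the beginning of the string but not at the end,
    which is used here as an analogy for a prefix-matching instregex.
    """
    rx = re.compile(pattern)
    return [name for name in names if rx.match(name)]


def exact_matches(pattern, names):
    """Names matched when the whole string must match the pattern."""
    rx = re.compile(pattern)
    return [name for name in names if rx.fullmatch(name)]


# An optional suffix at the end of a prefix-matched pattern is redundant:
# both patterns select the same set of names.
with_suffix = prefix_matches(r"VADDPSZrr(k|kz)?", INSTRUCTIONS)
without_suffix = prefix_matches(r"VADDPSZrr", INSTRUCTIONS)

# The same rule explains the STAC problem: a prefix match on "STAC" also
# picks up any name that merely starts with "STAC".
stac_prefix = prefix_matches(r"STAC", INSTRUCTIONS)
stac_exact = exact_matches(r"STAC", INSTRUCTIONS)

print(with_suffix == without_suffix)
print(stac_prefix)
print(stac_exact)
```

Under these assumptions, dropping the optional `k`/`kz` suffix does not change which names are selected, while naming a single instruction exactly avoids catching unrelated names that share its prefix.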