path: root/llvm/test/tools/llvm-mca/X86/BtVer2
Commit message | Author | Age | Files | Lines
...
* [X86][Btver2] Double the AGU and schedule pipe resources for YMM | Simon Pilgrim | 2018-03-26 | 2 | -105/+105
  Both the AGUs and schedule pipes are double pumped for 256-bit instructions as well as the functional units which we already model.
  llvm-svn: 328491
* [llvm-mca] Add flag -instruction-tables to print the theoretical resource pressure distribution for instructions (PR36874) | Andrea Di Biagio | 2018-03-26 | 10 | -706/+707
  The goal of this patch is to address most of PR36874. To fully fix PR36874 we need to split the "InstructionInfo" view from the "SummaryView". That would make it easy to check the latency and rthroughput as well.

  The patch reuses all the logic from ResourcePressureView to print out the "instruction tables". We have an entry for every instruction in the input sequence. Each entry reports the theoretical resource pressure distribution. Resource pressure is uniformly distributed across all the processor resource units of a group.

  At the moment, the backend pipeline is not configurable, so the only way to fix this is by creating a different driver that simply sends instruction events to the resource pressure view. That means we don't use the Backend interface. Instead, it is simpler to just have a different code-path for when flag -instruction-tables is specified.

  Once Clement addresses bug 36663, we can port the "instruction tables" logic into a stage of our configurable pipeline.

  Updated the BtVer2 test cases (thanks Simon for the help). Now we pass flag -instruction-tables to each modified test.

  Differential Revision: https://reviews.llvm.org/D44839
  llvm-svn: 328487
* [X86][Btver2] Cleanup TEST instructions to use JFPA (+JFPX on ymms) function unit | Simon Pilgrim | 2018-03-23 | 2 | -35/+35
  llvm-svn: 328343
* [X86][Btver2] Cleanup MOVMSK instructions to use JFPA function unit | Simon Pilgrim | 2018-03-23 | 3 | -298/+298
  Add missing non-VEX and (V)PMOVMSKB instructions to the pattern
  llvm-svn: 328338
* [X86][Btver2] Vector permutes use a JFPU01 scheduler pipe and JFPX/JVALU function unit | Simon Pilgrim | 2018-03-23 | 2 | -303/+303
  llvm-svn: 328331
* [X86][Btver2] Vector store instructions use a JFPU1 scheduler pipe and JSAGU/JSTC function units | Simon Pilgrim | 2018-03-23 | 5 | -415/+415
  llvm-svn: 328328
* [X86][Btver2] Cleanup DPPS/DPPD instructions to use JFPA/JFPM function units | Simon Pilgrim | 2018-03-23 | 2 | -81/+81
  llvm-svn: 328324
* [X86][Btver2] Fix MicroOps counts for DPPS/YMM memory folded instructions | Simon Pilgrim | 2018-03-23 | 2 | -56/+56
  This was due to a misunderstanding: what llvm calls a micro-op (retirement unit) is actually called a macro-op on the AMD/Jaguar target. Folded loads don't affect the number of macro-ops.
  llvm-svn: 328320
* [X86][Btver2] Cleanup SSE42 PCMPISTR/PCMPESTR string instructions to correctly use JFPU1 scheduler pipe followed by JLAGU/JSAGU/JFPA/JVALU function units | Simon Pilgrim | 2018-03-23 | 1 | -9/+9
  Fixes throughput to match Agner/Fam16h-SoG as well.
  llvm-svn: 328318
* [X86][Btver2] Vector move/load/store instructions use a JFPU01 scheduler pipe and JFPX/JVALU function unit as well as the AGUs | Simon Pilgrim | 2018-03-23 | 7 | -520/+520
  llvm-svn: 328304
* [X86] Match vpblendvb/vblendvps/vblendvpd itineraries to the SSE equivalent. Change pblendvb/blendvps/blendvpd to use WriteFVarBlend | Craig Topper | 2018-03-23 | 1 | -66/+66
  llvm-svn: 328294
* [X86] Change VPSADBW itinerary to SSE_INTALU_ITINS_P to match the SSE version. | Craig Topper | 2018-03-23 | 1 | -55/+55
  llvm-svn: 328293
* [X86][Btver2] Conversion, MaskedLoad/MaskedStore and NTStores all are scheduled through the JFPU1 pipe | Simon Pilgrim | 2018-03-22 | 4 | -269/+269
  llvm-svn: 328226
* [X86][Btver2] FCMP (inc FMAX/FMIN) instructions use the JFPA functional pipe | Simon Pilgrim | 2018-03-22 | 3 | -77/+77
  The ymm instructions are double pumped as well.
  llvm-svn: 328222
* [X86][Btver2] FMUL ymm instructions are double pumped on the JFPM functional pipe | Simon Pilgrim | 2018-03-22 | 2 | -255/+255
  llvm-svn: 328217
* [llvm-mca] Move the logic that computes the register file usage to the BackendStatistics view. | Andrea Di Biagio | 2018-03-21 | 2 | -0/+58
  With this patch, the "instruction dispatched" event now provides information related to the number of microarchitectural registers used in each register file. Similarly, the "instruction retired" event is now able to tell how many registers are freed in each register file.

  Currently, the BackendStatistics view is the only consumer of register usage/pressure information. BackendStatistics uses that info to print out a few general statistics (i.e. max number of mappings used; total mappings created).

  Before this patch, the BackendStatistics view was forced to query the Backend to obtain the register pressure information. This patch removes that dependency. Now views are completely independent from the Backend. As a consequence, it should be easier to address PR36663 and further modularize the pipeline.

  Added a couple of test cases in the BtVer2 specific directory.
  llvm-svn: 328129
* [X86][Btver2] Fix crc32 schedule costs | Simon Pilgrim | 2018-03-18 | 1 | -11/+11
  The default is currently FAdd for some reason
  llvm-svn: 327807
* [X86][Btver2] Add crc32 resource tests | Simon Pilgrim | 2018-03-18 | 1 | -1/+26
  llvm-svn: 327805
* [X86][Btver2] FADD/FHADD ymm instructions are double pumped on the JFPA functional pipe | Simon Pilgrim | 2018-03-18 | 2 | -46/+46
  llvm-svn: 327804
* [X86][Btver2] Float bitwise ymm instructions are double pumped on the JFPX (JFPA/JFPM) functional pipes | Simon Pilgrim | 2018-03-18 | 1 | -27/+27
  llvm-svn: 327803
* [X86][Btver2] F16C instructions are performed on the JSTC functional pipe | Simon Pilgrim | 2018-03-18 | 1 | -8/+8
  llvm-svn: 327801
* [X86][Btver2] SSE4A EXTRQ/INSERTQ instructions are performed on the JVALU0/JVALU1 functional pipes | Simon Pilgrim | 2018-03-18 | 1 | -4/+4
  llvm-svn: 327794
* [X86][Btver2] Modelled float bitwise instructions as being performed on the float cluster (FPA/FPM) not the integer. | Simon Pilgrim | 2018-03-18 | 3 | -40/+40
  llvm-svn: 327793
* [X86][Btver2] Correctly distinguish between scheduling pipe and functional unit for JWriteResFpuPair defs | Simon Pilgrim | 2018-03-18 | 10 | -828/+828
  Jaguar's FPU has 2 scheduler pipes (JFPU0/JFPU1) which forward to multiple functional sub-units each. We need to model that a micro-op consumes both the scheduler pipe and a functional unit.

  This patch just handles the ops defined through JWriteResFpuPair; I'll go through the custom cases later.
  llvm-svn: 327791
* [X86][Btver2] Add llvm-mca tests to show pipe resource usage of most vector instructions | Simon Pilgrim | 2018-03-18 | 11 | -0/+3206
  Hopefully these tests can be easily reused should any other subtarget get in-depth llvm-mca coverage (we can either copy the tests or move them into a common dir and run them with multiple prefixes).
  llvm-svn: 327788
* [X86][Btver2] Tweak pipes test to remove register dependencies | Simon Pilgrim | 2018-03-15 | 1 | -34/+34
  It gives us a better view of pipe usage in the timeline which is what the test is trying to show.
  llvm-svn: 327685
* [X86][Btver2] Fix ymm div/sqrt to use fmul unit | Simon Pilgrim | 2018-03-15 | 1 | -27/+26
  YMM FDiv/FSqrt are dispatched on pipe JFPU1 but should be performed on the JFPM unit - that is where most of the cycles are spent.

  This matches the pipes for WriteFSqrt/WriteFDiv definitions.
  llvm-svn: 327682
* [X86][Btver2] Add test to show timeline of fpu instructions on different pipes/units | Simon Pilgrim | 2018-03-15 | 1 | -0/+114
  Try to demonstrate the scheduling from fpu0/fpu1 pipes to the valu0/vimul/fpa or valu1/stc/fpm functional units
  llvm-svn: 327676
* [llvm-mca] BackendStatistics: early exit from method printSchedulerUsage if no scheduler resources were consumed. | Andrea Di Biagio | 2018-03-10 | 1 | -0/+34
  llvm-svn: 327215
* [llvm-mca] Emit the 'Instruction Info' table before the resource pressure view. | Andrea Di Biagio | 2018-03-08 | 4 | -61/+66
  In future, both the summary information and the 'instruction info' table should be moved into a separate "Summary" view.
  llvm-svn: 327010
* [llvm-mca] LLVM Machine Code Analyzer. | Andrea Di Biagio | 2018-03-08 | 4 | -0/+320
  llvm-mca is an LLVM based performance analysis tool that can be used to statically measure the performance of code, and to help triage potential problems with target scheduling models.

  llvm-mca uses information which is already available in LLVM (e.g. scheduling models) to statically measure the performance of machine code on a specific CPU. Performance is measured in terms of throughput as well as processor resource consumption. The tool currently works for processors with an out-of-order backend, for which there is a scheduling model available in LLVM.

  The main goal of this tool is not just to predict the performance of the code when run on the target, but also to help with diagnosing potential performance issues.

  Given an assembly code sequence, llvm-mca estimates the IPC (instructions per cycle), as well as hardware resource pressure. The analysis and reporting style were mostly inspired by the IACA tool from Intel.

  This patch is related to the RFC on llvm-dev visible at this link: http://lists.llvm.org/pipermail/llvm-dev/2018-March/121490.html

  Differential Revision: https://reviews.llvm.org/D43951
  llvm-svn: 326998
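
Illustrative note on the -instruction-tables entry (llvm-svn: 328487): that commit message says resource pressure is "uniformly distributed across all the processor resource units of a group". The Python sketch below only illustrates that arithmetic. It is not llvm-mca code; the function name `distribute_pressure` is hypothetical, and treating JFPU0 and JFPU1 as the two units of a single group is an assumption made for the example.

```python
from fractions import Fraction


def distribute_pressure(cycles, group_units):
    """Split `cycles` of resource usage evenly across the units of a group.

    Hypothetical helper for illustration only; not part of llvm-mca.
    """
    if not group_units:
        raise ValueError("a resource group needs at least one unit")
    share = Fraction(cycles, len(group_units))
    return {unit: share for unit in group_units}


# Assumed example: one resource cycle spread over a two-unit group.
pressure = distribute_pressure(1, ["JFPU0", "JFPU1"])
for unit, share in pressure.items():
    print(f"{unit}: {float(share):.2f}")
```

Under these assumptions each unit is charged half a cycle, and the shares always sum back to the original cycle count.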