summaryrefslogtreecommitdiffstats
path: root/llvm/test/tools/llvm-mca/X86/BtVer2/register-files-4.s
Commit message (Collapse)AuthorAgeFilesLines
* [MCA] Always check if scheduler resources are unavailable when reporting ↵Andrea Di Biagio2019-02-261-1/+1
| | | | | | | | | | | dispatch stalls. Dispatch stall cycles may be associated to multiple dispatch stall events. Before this patch, each stall cycle was associated with a single stall event. This patch also improves a couple of code comments, and adds a helper method to query the Scheduler for dispatch stalls. llvm-svn: 354877
* [llvm-mca] Report the number of dispatched micro opcodes in the ↵Andrea Di Biagio2018-08-301-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DispatchStatistics view. This patch introduces the following changes to the DispatchStatistics view: * DispatchStatistics now reports the number of dispatched opcodes instead of the number of dispatched instructions. * The "Dynamic Dispatch Stall Cycles" table now also reports the percentage of stall cycles against the total simulated cycles. This change allows users to easily compare dispatch group sizes with the processor DispatchWidth. Before this change, it was difficult to correlate the two numbers, since DispatchStatistics view reported numbers of instructions (instead of opcodes). DispatchWidth defines the maximum size of a dispatch group in terms of number of micro opcodes. The other change introduced by this patch is related to how DispatchStage generates "instruction dispatch" events. In particular: * There can be multiple dispatch events associated with a same instruction * Each dispatch event now encapsulates the number of dispatched micro opcodes. The number of micro opcodes declared by an instruction may exceed the processor DispatchWidth. Therefore, we cannot assume that instructions are always fully dispatched in a single cycle. DispatchStage knows already how to handle instructions declaring a number of opcodes bigger that DispatchWidth. However, DispatchStage always emitted a single instruction dispatch event (during the first simulated dispatch cycle) for instructions dispatched. With this patch, DispatchStage now correctly notifies multiple dispatch events for instructions that cannot be dispatched in a single cycle. A few views had to be modified. Views can no longer assume that there can only be one dispatch event per instruction. Tests (and docs) have been updated. Differential Revision: https://reviews.llvm.org/D51430 llvm-svn: 341055
* [llvm-mca] Add fields "Total uOps" and "uOps Per Cycle" to the report ↵Andrea Di Biagio2018-08-291-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | generated by the SummaryView. This patch adds two new fields to the perf report generated by the SummaryView. Fields are now logically organized into two small groups; only the second group contains throughput indicators. Example: ``` Iterations: 100 Instructions: 300 Total Cycles: 414 Total uOps: 700 Dispatch Width: 4 uOps Per Cycle: 1.69 IPC: 0.72 Block RThroughput: 4.0 ``` This patch also updates the docs for llvm-mca. Due to the nature of this change, several tests in the tools/llvm-mca directory were affected, and had to be updated using script `update_mca_test_checks.py`. llvm-svn: 340946
* [llvm-mca] Use a different character to flag instructions with side-effects ↵Andrea Di Biagio2018-07-111-2/+2
| | | | | | | | | | | | | | | | | | | | | in the Instruction Info View. NFC This makes easier to identify changes in the instruction info flags. It also helps spotting potential regressions similar to the one recently introduced at r336728. Using the same character to mark MayLoad/MayStore/HasSideEffects is problematic for llvm-lit. When pattern matching substrings, llvm-lit consumes tabs and spaces. A change in position of the flag marker may not trigger a test failure. This patch only changes the character used for flag `hasSideEffects`. The reason why I didn't touch other flags is because I want to avoid spamming the mailing because of the massive diff due to the numerous tests affected by this change. In future, each instruction flag should be associated with a different character in the Instruction Info View. llvm-svn: 336797
* [llvm-mca] Make sure not to end the test files with an empty line.Roman Lebedev2018-06-041-1/+0
| | | | | | | | | | | | | | | | | | | Summary: It's super irritating. [properly configured] git client then complains about that double-newline, and you have to use `--force` to ignore the warning, since even if you fix it manually, it will be reintroduced the very next runtime :/ Reviewers: RKSimon, andreadb, courbet, craig.topper, javed.absar, gbedwell Reviewed By: gbedwell Subscribers: javed.absar, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D47697 llvm-svn: 333887
* [llvm-mca] Print the "Block RThroughput" in the SummaryView.Andrea Di Biagio2018-05-231-5/+6
| | | | | | | | | | | | | | | | | | | This patch implements the "block reciprocal throughput" computation in the SummaryView. The block reciprocal throughput is computed as the MAX of: - NumMicroOps / DispatchWidth - Resource Cycles / #Units (for every resource consumed). The block throughput is bounded from above by the hardware dispatch throughput. That is because the DispatchWidth is an upper bound on how many opcodes can be part of a single dispatch group. The block throughput is also limited by the amount of hardware parallelism. The number of available resource units affects how the resource pressure is distributed, and also how many blocks can be delivered every cycle. llvm-svn: 333095
* [X86][BtVer2] Add a 'J' prefix to the PRF/RCU defs. NFCAndrea Di Biagio2018-05-211-2/+2
| | | | | | | This is to keep the Jaguar model's naming convention. Processor resources all have a 'J' prefix in the BtVer2 scheduling model. llvm-svn: 332851
* [llvm-mca] Regenerate tests after r332381 and r332361. NFCAndrea Di Biagio2018-05-161-8/+8
| | | | llvm-svn: 332447
* [UpdateTestChecks] Add update_mca_test_checks.py scriptGreg Bedwell2018-04-181-5/+22
| | | | | | | | | | | This script can be used to regenerate tests in the test/tools/llvm-mca directory (PR36904). Regenerated a number of tests using the pattern: test/tools/llvm-mca/*/*/*.s Differential Revision: https://reviews.llvm.org/D45369 llvm-svn: 330246
* [llvm-mca] Move the logic that prints dispatch unit statistics from ↵Andrea Di Biagio2018-04-101-1/+1
| | | | | | | | | | | BackendStatistics to its own view. This patch moves the logic that collects and analyzes dispatch events to the DispatchStatistics view. Added flag -dispatch-stats to print statistics related to the dispatch logic. llvm-svn: 329708
* Reapply "[llvm-mca] Do not separate iterations with a newline in the ↵Andrea Di Biagio2018-04-101-3/+3
| | | | | | | | timeline view." This reapplies r329403 with a fix for the floating point rounding issue. llvm-svn: 329680
* Revert r329403 "[llvm-mca] Do not separate iterations with a newline in the ↵Hans Wennborg2018-04-091-3/+3
| | | | | | | | | | | | timeline view." This made AArch64/CortexA57/direct-branch.s fail on Windows, e.g. http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/11251 > Also, update a few tests to minimize the diff in D45369. > No functional change intended. llvm-svn: 329569
* [llvm-mca] Do not separate iterations with a newline in the timeline view.Andrea Di Biagio2018-04-061-3/+3
| | | | | | | Also, update a few tests to minimize the diff in D45369. No functional change intended. llvm-svn: 329403
* [llvm-mca] Move the logic that prints register file statistics to its own ↵Andrea Di Biagio2018-04-031-1/+1
| | | | | | | | | | | | | view. NFCI Before this patch, the "BackendStatistics" view was responsible for printing the register file usage (as well as many other statistics). Now users can enable register file usage statistics using the command line flag `-register-file-stats`. By default, the tool doesn't print register file statistics. llvm-svn: 329083
* [MC][Tablegen] Allow the definition of processor register files in the ↵Andrea Di Biagio2018-04-031-0/+49
scheduling model for llvm-mca This patch allows the description of register files in processor scheduling models. This addresses PR36662. A new tablegen class named 'RegisterFile' has been added to TargetSchedule.td. Targets can optionally describe register files for their processors using that class. In particular, class RegisterFile allows to specify: - The total number of physical registers. - Which target registers are accessible through the register file. - The cost of allocating a register at register renaming stage. Example (from this patch - see file X86/X86ScheduleBtVer2.td) def FpuPRF : RegisterFile<72, [VR64, VR128, VR256], [1, 1, 2]> Here, FpuPRF describes a register file for MMX/XMM/YMM registers. On Jaguar (btver2), a YMM register definition consumes 2 physical registers, while MMX/XMM register definitions only cost 1 physical register. The syntax allows to specify an empty set of register classes. An empty set of register classes means: this register file models all the registers specified by the Target. For each register class, users can specify an optional register cost. By default, register costs default to 1. A value of 0 for the number of physical registers means: "this register file has an unbounded number of physical registers". This patch is structured in two parts. * Part 1 - MC/Tablegen * A first part adds the tablegen definition of RegisterFile, and teaches the SubtargetEmitter how to emit information related to register files. Information about register files is accessible through an instance of MCExtraProcessorInfo. The idea behind this design is to logically partition the processor description which is only used by external tools (like llvm-mca) from the processor information used by the llvm machine schedulers. I think that this design would make easier for targets to get rid of the extra processor information if they don't want it. * Part 2 - llvm-mca related * The second part of this patch is related to changes to llvm-mca. The main differences are: 1) class RegisterFile now needs to take into account the "cost of a register" when allocating physical registers at register renaming stage. 2) Point 1. triggered a minor refactoring which lef to the removal of the "maximum 32 register files" restriction. 3) The BackendStatistics view has been updated so that we can print out extra details related to each register file implemented by the processor. The effect of point 3. is also visible in tests register-files-[1..5].s. Differential Revision: https://reviews.llvm.org/D44980 llvm-svn: 329067
OpenPOWER on IntegriCloud