summaryrefslogtreecommitdiffstats
path: root/llvm/tools/llvm-mca
Commit message (Collapse)AuthorAgeFilesLines
...
* [llvm-mca] Simplify eventing by adding an onEvent templated method.Matt Davis2018-07-1222-46/+40
| | | | | | | | | | | | | | | | | | | | Summary: This patch eliminates some redundancy in iterating across Listeners for the Instruction and Stall HWEvents, by introducing a template onEvent routine. This change was suggested by @courbet in https://reviews.llvm.org/D48576. I hope that this patch addresses that suggestion appropriately. I do like this change better than what we had previously. Reviewers: andreadb, courbet, RKSimon Reviewed By: andreadb, courbet Subscribers: javed.absar, tschuett, gbedwell, llvm-commits, courbet Differential Revision: https://reviews.llvm.org/D48672 llvm-svn: 336916
* [llvm-mca] Use a different character to flag instructions with side-effects ↵Andrea Di Biagio2018-07-111-2/+2
| | | | | | | | | | | | | | | | | | | | | in the Instruction Info View. NFC This makes easier to identify changes in the instruction info flags. It also helps spotting potential regressions similar to the one recently introduced at r336728. Using the same character to mark MayLoad/MayStore/HasSideEffects is problematic for llvm-lit. When pattern matching substrings, llvm-lit consumes tabs and spaces. A change in position of the flag marker may not trigger a test failure. This patch only changes the character used for flag `hasSideEffects`. The reason why I didn't touch other flags is because I want to avoid spamming the mailing because of the massive diff due to the numerous tests affected by this change. In future, each instruction flag should be associated with a different character in the Instruction Info View. llvm-svn: 336797
* [llvm-mca] report an error if the assembly sequence contains an unsupported ↵Andrea Di Biagio2018-07-093-25/+46
| | | | | | | | | | | | | instruction. This is a short-term fix for PR38093. For now, we llvm::report_fatal_error if the instruction builder finds an unsupported instruction in the instruction stream. We need to revisit this fix once we start addressing PR38101. Essentially, we need a better framework for error handling. llvm-svn: 336543
* [llvm-mca] Add HardwareUnit and Context classes.Matt Davis2018-07-069-25/+207
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves the construction of the default backend from llvm-mca.cpp and into mca::Context. The Context class is responsible for holding ownership of the simulated hardware components. These components are subclasses of HardwareUnit. Right now the HardwareUnit is pretty bare-bones, but eventually we might want to add some common functionality across all hardware components, such as isReady() or something similar. I have a feeling this patch will probably need some updates, but it's a start. One thing I am not particularly fond of is the rather large interface for createDefaultPipeline. That convenience routine takes a rather large set of inputs from the llvm-mca driver, where many of those inputs are generated via command line options. One item I think we might want to change is the separating of ownership of hardware components (owned by the context) and the pipeline (which owns Stages). In short, a Pipeline owns Stages, a Context (currently) owns hardware. The Pipeline's Stages make use of the components, and thus there is a lifetime dependency generated. The components must outlive the pipeline. We could solve this by having the Context also own the Pipeline, and not return a unique_ptr<Pipeline>. Now that I think about it, I like that idea more. Differential Revision: https://reviews.llvm.org/D48691 llvm-svn: 336456
* [llvm-mca] A write latency cannot be a negative value. NFCAndrea Di Biagio2018-07-063-10/+10
| | | | llvm-svn: 336437
* [llvm-mca] improve the instruction issue logic implemented by the Scheduler.Andrea Di Biagio2018-07-062-8/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch modifies the Scheduler heuristic used to select the next instruction to issue to the pipelines. The motivating example is test X86/BtVer2/add-sequence.s, for which llvm-mca wrongly reported an estimated IPC of 1.50. According to perf, the actual IPC for that test should have been ~2.00. It turns out that an IPC of 2.00 for test add-sequence.s cannot possibly be predicted by a Scheduler that only prioritizes instructions based on their "age". A similar issue also affected test X86/BtVer2/dependent-pmuld-paddd.s, for which llvm-mca wrongly estimated an IPC of 0.84 instead of an IPC of 1.00. Instructions in the ReadyQueue are now ranked based on two factors: - The "age" of an instruction. - The number of unique users of writes associated with an instruction. The new logic still prioritizes older instructions over younger instructions to minimize the pressure on the reorder buffer. However, the number of users of an instruction now also affects the overall rank. This potentially increases the ability of the Scheduler to extract instruction level parallelism. This patch fixes the problem with the wrong IPC reported for test add-sequence.s and test dependent-pmuld-paddd.s. llvm-svn: 336420
* [llvm-mca] Fix RegisterFile debug prints. NFCAndrea Di Biagio2018-07-052-3/+9
| | | | llvm-svn: 336367
* [llvm-mca] Clear the content of map VariantDescriptors in InstrBuilder ↵Andrea Di Biagio2018-07-022-0/+5
| | | | | | | | | | | | | | before we start analyzing a new CodeBlock. NFCI. Different CodeBlocks don't overlap. The same MCInst cannot appear in more than one code block because all blocks are instantiated before the simulation is run. We should always clear the content of map VariantDescriptors before every simulation, since VariantDescriptors cannot possibly store useful information for the next blocks. It is also "safer" to clear its content because `MCInst*` is used as the key type for map VariantDescriptors. llvm-svn: 336142
* [MC] Error on a .zerofill directive in a non-virtual sectionFrancis Visoiu Mistrih2018-07-021-1/+2
| | | | | | | | | | | | | | | On darwin, all virtual sections have zerofill type, and having a .zerofill directive in a non-virtual section is not allowed. Instead of asserting, show a nicer error. In order to use the equivalent of .zerofill in a non-virtual section, the usage of .zero of .space is required. This patch replaces the assert with an error. Differential Revision: https://reviews.llvm.org/D48517 llvm-svn: 336127
* [llvm-mca] Remove field HasReadAdvanceEntries from class ReadDescriptor.Andrea Di Biagio2018-06-293-15/+0
| | | | | | | | | | | | This simplifies the logic that updates RAW dependencies in the DispatchStage. There is no advantage in storing that flag in the ReadDescriptor; we should simply rely on the call to `STI.getReadAdvanceCycles()` to obtain the ReadAdvance cycles. If there are no read-advance entries, then method `getReadAdvanceCycles()` quickly returns 0. No functional change intended. llvm-svn: 335977
* [llvm-mca] Delete Pipeline's copy ctor and assignement operator.Matt Davis2018-06-281-0/+3
| | | | | | Prevent copying of the Pipeline. llvm-svn: 335885
* [llvm-mca] Use a WriteRef to describe register writes in class RegisterFile.Andrea Di Biagio2018-06-286-50/+103
| | | | | | | | | | | This patch introduces a new class named WriteRef. A WriteRef is used by the RegisterFile to keep track of register definitions. Internally it wraps a WriteState, as well as the source index of the defining instruction. This patch allows the tool to propagate additional information to support future analysis on data dependencies. llvm-svn: 335867
* [llvm-mca] Refactor method RegisterFile::collectWrites(). NFCIAndrea Di Biagio2018-06-281-7/+13
| | | | | | | | | | | | Rather than calling std::find in a loop, just sort the vector and remove duplicate entries at the end of the function. Also, move the debug print at the end of the function, and query the MCRegisterInfo to print register names rather than physreg IDs. No functional change intended. llvm-svn: 335837
* [llvm-mca] Register listeners with stages; remove Pipeline dependency from ↵Matt Davis2018-06-2711-71/+49
| | | | | | | | | | | | | | | | | | | | | | | | Stage. Summary: This patch removes a few callbacks from Pipeline. It comes at the cost of registering Listeners with all Stages. Not all stages need listeners or issue callbacks, this registration is a bit redundant. However, as we build-out the API, this redundancy can disappear. The main purpose here is to move callback code from the Pipeline and into the stages that actually issue those callbacks. This removes the back-pointer to the Pipeline that was put into a few Stage subclasses. Reviewers: andreadb, courbet, RKSimon Reviewed By: andreadb, courbet Subscribers: tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D48576 llvm-svn: 335748
* [llvm-mca] Avoid calling method update() on instructions that are already in ↵Andrea Di Biagio2018-06-273-15/+25
| | | | | | | | | | the IS_READY state. NFCI When promoting instructions from the wait queue to the ready queue, we should check if an instruction has already reached the IS_READY state before calling method update(). llvm-svn: 335722
* [llvm-mca] Add a comment to Stage::execute and fix a spelling error. NFC.Matt Davis2018-06-271-1/+3
| | | | llvm-svn: 335697
* [llvm-mca] Removed wrong NDEBUG guards introduced by my last commit.Andrea Di Biagio2018-06-264-10/+0
| | | | | | This partially reverts r335589. llvm-svn: 335592
* [llvm-mca] Remove unused header files and correctly guard some include ↵Andrea Di Biagio2018-06-267-8/+19
| | | | | | headers under NDEBUG. NFC llvm-svn: 335589
* [llvm-mca] Rename Backend to Pipeline. NFC.Matt Davis2018-06-2522-96/+97
| | | | | | | | | | | | | | | | | | Summary: This change renames the Backend and BackendPrinter to Pipeline and PipelinePrinter respectively. Variables and comments have also been updated to reflect this change. The reason for this rename, is to be slightly more correct about what MCA is modeling. MCA models a Pipeline, which implies some logical sequence of stages. Reviewers: andreadb, courbet, RKSimon Reviewed By: andreadb, courbet Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D48496 llvm-svn: 335496
* [llvm-mca] Remove unnecessary include and forward decl in RCU. NFC.Matt Davis2018-06-222-5/+3
| | | | | | | The DispatchUnit is no longer a dependency of RCU, so this patch removes a stale include and forward decl. This patch also cleans up some comments. llvm-svn: 335392
* [llvm-mca] Remove redundant call. NFCAndrea Di Biagio2018-06-221-2/+0
| | | | llvm-svn: 335368
* [llvm-mca] Set the operand ID for implicit register reads/writes. NFCAndrea Di Biagio2018-06-222-36/+43
| | | | | | | Also, move the definition of InstRef at the end of Instruction.h to avoid a forward declaration. llvm-svn: 335363
* [llvm-mca] Introduce a sequential container of StagesMatt Davis2018-06-2210-74/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Remove explicit stages and introduce a list of stages. A pipeline should be composed of an arbitrary list of stages, and not any predefined list of stages in the Backend. The Backend should not know of any particular stage, rather it should only be concerned that it has a list of stages, and that those stages will fulfill the contract of what it means to be a Stage (namely pre/post/execute a given instruction). For now, we leave the original set of stages defined in the Backend ctor; however, I imagine these will be moved out at a later time. This patch makes an adjustment to the semantics of Stage::isReady. Specifically, what the Backend really needs to know is if a Stage has unfinished work. With that said, it is more appropriately renamed Stage::hasWorkToComplete(). This change will clean up the check in Backend::run(), allowing us to query each stage to see if there is unfinished work, regardless of what subclass a stage might be. I feel that this change simplifies the semantics too, but that's a subjective statement. Given how RetireStage and ExecuteStage handle data in their preExecute(), I've had to change the order of Retire and Execute in our stage list. Retire must complete any of its preExecute actions before ExecuteStage's preExecute can take control. This is mainly because both stages utilize the RCU. In the meantime, I want to see if I can adjust that or remove that coupling. Reviewers: andreadb, RKSimon, courbet Reviewed By: andreadb Subscribers: tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D46907 llvm-svn: 335361
* [llvm-mca] Updates comment in code, and remove some stale comments. NFCAndrea Di Biagio2018-06-214-145/+81
| | | | | | | Also, rename fields `TotalMappings` and `NumUsedMappings` in struct RegisterMappingTracker into `NumPhysRegs` and `NumUsedPhysRegs`. llvm-svn: 335219
* [llvm-mca] use APint::operator[] to obtain the bit value. NFCAndrea Di Biagio2018-06-201-4/+2
| | | | llvm-svn: 335131
* [llvm-mca][X86] Teach how to identify register writes that implicitly clear ↵Andrea Di Biagio2018-06-205-33/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | the upper portion of a super-register. This patch teaches llvm-mca how to identify register writes that implicitly zero the upper portion of a super-register. On X86-64, a general purpose register is implemented in hardware as a 64-bit register. Quoting the Intel 64 Software Developer's Manual: "an update to the lower 32 bits of a 64 bit integer register is architecturally defined to zero extend the upper 32 bits". Also, a write to an XMM register performed by an AVX instruction implicitly zeroes the upper 128 bits of the aliasing YMM register. This patch adds a new method named clearsSuperRegisters to the MCInstrAnalysis interface to help identify instructions that implicitly clear the upper portion of a super-register. The rest of the patch teaches llvm-mca how to use that new method to obtain the information, and update the register dependencies accordingly. I compared the kernels from tests clear-super-register-1.s and clear-super-register-2.s against the output from perf on btver2. Previously there was a large discrepancy between the estimated IPC and the measured IPC. Now the differences are mostly in the noise. Differential Revision: https://reviews.llvm.org/D48225 llvm-svn: 335113
* [llvm-mca] Cleanup the header syntax line. Fix a comment. NFC.Matt Davis2018-06-181-3/+2
| | | | | | This patch removes a few dashes from the header comment to make room for the syntax line. llvm-svn: 334986
* [llvm-mca] Use an ordered map to collect hardware statistics. NFC.Andrea Di Biagio2018-06-184-9/+11
| | | | | | | Histogram entries are now ordered by key. This should improves their readability when statistics are printed. llvm-svn: 334961
* [MCA] Add -summary-view optionRoman Lebedev2018-06-151-1/+10
| | | | | | | | | | | | | | | | | | | Summary: While that is indeed a quite interesting summary stat, there are cases where it does not really add anything other than consuming extra lines. Declutters the output of D48190. Reviewers: RKSimon, andreadb, courbet, craig.topper Reviewed By: andreadb Subscribers: javed.absar, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D48209 llvm-svn: 334833
* [llvm-mca] Clean up the header comment. NFC.Matt Davis2018-06-142-4/+2
| | | | | | This change removes a few dashes to make room for the header syntax string. llvm-svn: 334770
* [llvm-mca] Introduce the ExecuteStage (was originally the Scheduler class).Matt Davis2018-06-149-213/+397
| | | | | | | | | | | | | | Summary: This patch transforms the Scheduler class into the ExecuteStage. Most of the logic remains. Reviewers: andreadb, RKSimon, courbet Reviewed By: andreadb Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D47246 llvm-svn: 334679
* [llvm-mca] Fixed a bug in the logic that checks if a memory operation is ↵Andrea Di Biagio2018-06-131-8/+12
| | | | | | | | | | | | | | | ready to execute. Fixes PR37790. In some (very rare) cases, the LSUnit (Load/Store unit) was wrongly marking a load (or store) as "ready to execute" effectively bypassing older memory barrier instructions. To reproduce this bug, the memory barrier must be the first instruction in the input assembly sequence, and it doesn't have to perform any register writes. llvm-svn: 334633
* Revert: [llvm-mca] Flush the output stream before we start the analysis of a ↵Andrea Di Biagio2018-06-131-1/+0
| | | | | | | | | new code region. NFC Not sure why, but it breaks buildbot clang-cmake-armv8-full. It causes a failure in TEST 'Xray-armhf-linux :: TestCases/Posix/profiling-single-threaded.cc'. llvm-svn: 334617
* [llvm-mca] Flush the output stream before we start the analysis of a new ↵Andrea Di Biagio2018-06-131-0/+1
| | | | | | code region. NFC llvm-svn: 334610
* [llvm-mca] Correctly update the CyclesLeft of a register read in the ↵Andrea Di Biagio2018-06-052-7/+20
| | | | | | | | | | | presence of partial register updates. This patch fixe the logic in ReadState::cycleEvent(). That method was not correctly updating field `TotalCycles`. Added extra code comments in class ReadState to better describe each field. llvm-svn: 334028
* [RFC][patch 3/3] Add support for variant scheduling classes in llvm-mca.Andrea Di Biagio2018-06-043-16/+38
| | | | | | | | | | | | | | | | | | | | | | | | This patch is the last of a sequence of three patches related to LLVM-dev RFC "MC support for variant scheduling classes". http://lists.llvm.org/pipermail/llvm-dev/2018-May/123181.html This fixes PR36672. The main goal of this patch is to teach llvm-mca how to solve variant scheduling classes. This patch does that, plus it adds new variant scheduling classes to the BtVer2 scheduling model to identify so-called zero-idioms (i.e. so-called dependency breaking instructions that are known to generate zero, and that are optimized out in hardware at register renaming stage). Without the BtVer2 change, this patch would not have had any meaningful tests. This patch is effectively the union of two changes: 1) a change that teaches llvm-mca how to resolve variant scheduling classes. 2) a change to the BtVer2 scheduling model that allows us to special-case packed XOR zero-idioms (this partially fixes PR36671). Differential Revision: https://reviews.llvm.org/D47374 llvm-svn: 333909
* [llvm-mca] Track cycles contributed by resources that are in a 'Super' ↵Andrea Di Biagio2018-06-041-1/+19
| | | | | | | | | | | | | | relationship. This is required if we want to correctly match the behavior of method SubtargetEmitter::ExpandProcResource() in Tablegen. When computing the set of "consumed" processor resources and resource cycles, the logic in ExpandProcResource() doesn't update the number of resource cycles contributed by a "Super" resource to a group. We need to take this into account when a model declares a processor resource which is part of a 'processor resource group', and it is also used as the "Super" of other resources. llvm-svn: 333892
* [llvm-mca] Move the logic that computes the block throughput into Support.h. NFCAndrea Di Biagio2018-06-014-48/+63
| | | | | | | This will allow us to share the logic that computes the block throughput with other views. llvm-svn: 333755
* [llvm-mca] Fixed a problem caused by an invalid use of a processor resource ↵Andrea Di Biagio2018-05-311-9/+7
| | | | | | | | | | | | | mask in the Scheduler. The lambda functions used by method ResourceManager::mustIssueImmediately() was incorrectly truncating masks of buffered processor resources to 32-bit quantities. The invalid mask values were then used to access a map of processor resource descriptors. Fixes PR37643. llvm-svn: 333692
* [llvm-mca] Update the header's guard name. NFC.Matt Davis2018-05-251-3/+3
| | | | | | This patch also places a comment at the end of the header guard. llvm-svn: 333297
* [llvm-mca] Update DispatchStage header comment. NFC.Matt Davis2018-05-252-3/+10
| | | | | | Updated the comment to be a wee bit more descriptive. llvm-svn: 333296
* [llvm-mca] Add the RetireStage. Matt Davis2018-05-2511-82/+175
| | | | | | | | | | | | | | | | | Summary: This class maintains the same logic as the original RetireControlUnit. This is just an intermediate patch to make the RCU a Stage. Future patches will remove the dependency on the DispatchStage, and then more properly populate the pre/execute/post Stage interface. Reviewers: andreadb, RKSimon, courbet Reviewed By: andreadb, courbet Subscribers: javed.absar, mgorny, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D47244 llvm-svn: 333292
* [llvm-mca] Fix a rounding problem in SummaryView.cpp exposed by r333204.Andrea Di Biagio2018-05-241-4/+7
| | | | | | | | Before printing the block reciprocal throughput, ensure that the floating point number is always rounded the same way on every target. No functional change intended. llvm-svn: 333210
* [llvm-mca] Fix header comments. NFC.Matt Davis2018-05-232-2/+2
| | | | llvm-svn: 333096
* [llvm-mca] Print the "Block RThroughput" in the SummaryView.Andrea Di Biagio2018-05-233-14/+99
| | | | | | | | | | | | | | | | | | | This patch implements the "block reciprocal throughput" computation in the SummaryView. The block reciprocal throughput is computed as the MAX of: - NumMicroOps / DispatchWidth - Resource Cycles / #Units (for every resource consumed). The block throughput is bounded from above by the hardware dispatch throughput. That is because the DispatchWidth is an upper bound on how many opcodes can be part of a single dispatch group. The block throughput is also limited by the amount of hardware parallelism. The number of available resource units affects how the resource pressure is distributed, and also how many blocks can be delivered every cycle. llvm-svn: 333095
* [llvm-mca] Move DispatchStage::cycleEvent to preExecute. NFC.Matt Davis2018-05-223-10/+10
| | | | | | | | | | | | | | | | | | | | | | | Summary: This is an intermediate change, it moves the non-notification logic from Backend::notifyCycleBegin to runCycle(). Once the scheduler becomes part of the Execution stage the explicit call to Scheduler::cycleEvent will disappear. The logic for Dispatch::cycleEvent() can be in the preExecute phase, which this patch addresses. Reviewers: andreadb, RKSimon, courbet Reviewed By: andreadb Subscribers: tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D47213 llvm-svn: 333029
* [llvm-mca] Removed an empty line generated by the timeline view. NFC.Andrea Di Biagio2018-05-211-7/+10
| | | | | | Also, regenerate all tests. llvm-svn: 332853
* [llvm-mca] Make Dispatch a subclass of Stage.Matt Davis2018-05-1711-130/+134
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The logic of dispatch remains the same, but now DispatchUnit is a Stage (DispatchStage). This change has the benefit of simplifying the backend runCycle() code. The same logic applies, but it belongs to different components now. This is just a start, eventually we will need to remove the call to the DispatchStage in Scheduler.cpp, but that will be a separate patch. This change is mostly a renaming and moving of existing logic. This change also encouraged me to remove the Subtarget (STI) member from the Backend class. That member was used to initialize the other members of Backend and to eventually call DispatchUnit::dispatch(). Now that we have Stages, we can eliminate this by instantiating the DispatchStage with everything it needs at the time of construction (e.g., Subtarget). That change allows us to call DispatchStage::execute(IR) as we expect to call execute() for all other stages. Once we add the Stage list (D46907) we can more cleanly call preExecute() on all of the stages, DispatchStage, will probably wrap cycleEvent() in that case. Made some formatting and minor cleanups to README.txt. Some of the text was re-flowed to stay within 80 cols. Reviewers: andreadb, courbet, RKSimon Reviewed By: andreadb, courbet Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D46983 llvm-svn: 332652
* [llvm-mca] Hide unrelated flags from the -help output.Andrea Di Biagio2018-05-171-23/+35
| | | | llvm-svn: 332615
* [llvm-mca] add flag -all-views and flag -all-stats.Andrea Di Biagio2018-05-171-0/+38
| | | | | | | Flag -all-views enables all the views. Flag -all-stats enables all the views that print hardware statistics. llvm-svn: 332602
OpenPOWER on IntegriCloud