path: root/llvm/tools/llvm-mca
Commit message Author Age Files Lines
...
* [llvm-mca] InstrBuilder: warnings for call/ret instructions are only ↵Andrea Di Biagio2018-11-242-4/+14
  reported once. llvm-svn: 347514
* [llvm-mca] Refactor some of the logic in InstrBuilder, and add a ↵Andrea Di Biagio2018-11-232-77/+118
  verifyOperands method. With this change, InstrBuilder emits an error if the MCInst sequence contains an instruction with a variadic opcode, and a non-zero number of variadic operands. Currently we don't know how to correctly analyze variadic opcodes. The problem with variadic operands is that there is no information for them in the opcode descriptor (i.e. MCInstrDesc). That means, we don't know which variadic operands are defs, and which are uses. In future, we could try to conservatively assume that any extra register operand is both a register use and a register definition. This patch fixes a subtle bug in the evaluation of read/write operands for ARM VLD1 with implicit index update. Added test vld1-index-update.s llvm-svn: 347503
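The check described in this commit can be sketched as follows. This is a minimal illustration, not the InstrBuilder code: `OpcodeDesc`, `Inst`, and `hasUnsupportedVariadicOperands` are hypothetical stand-ins for MCInstrDesc, MCInst, and the verifyOperands logic.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for an opcode descriptor (MCInstrDesc).
struct OpcodeDesc {
  unsigned NumOperands; // operands described by the opcode descriptor
  bool IsVariadic;      // opcode accepts extra, undescribed operands
};

// Hypothetical, simplified stand-in for an instruction (MCInst).
struct Inst {
  unsigned NumOperands; // operands actually present on the instruction
};

// Returns true if the instruction carries variadic operands. There is no
// def/use information for such operands, so the caller should report an
// error instead of guessing.
bool hasUnsupportedVariadicOperands(const OpcodeDesc &Desc, const Inst &I) {
  assert(I.NumOperands >= Desc.NumOperands && "missing declared operands");
  unsigned NumVariadicOps = I.NumOperands - Desc.NumOperands;
  return Desc.IsVariadic && NumVariadicOps != 0;
}
```

A variadic opcode with zero extra operands still passes the check; only the extra, undescribed operands are rejected.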
* [llvm-mca][View] Improved Retire Control Unit Statistics.Andrea Di Biagio2018-11-233-16/+59
  RetireControlUnitStatistics now reports extra information about the ROB and the avg/maximum number of entries consumed over the entire simulation.
  Example:
    Retire Control Unit - number of cycles where we saw N instructions retired:
    [# retired], [# cycles]
     0,           109  (17.9%)
     1,           102  (16.7%)
     2,           399  (65.4%)
    Total ROB Entries:                64
    Max Used ROB Entries:             35  ( 54.7% )
    Average Used ROB Entries per cy:  32  ( 50.0% )
  Documentation in llvm/docs/CommandGuide/llvm-mca.rst has been updated to reflect this change.
  llvm-svn: 347493
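The percentages in the example report are each row's cycle count divided by the total number of simulated cycles (109 + 102 + 399 = 610). A minimal sketch of that arithmetic, assuming hypothetical helpers `totalCycles` and `formatRow` that are not part of the view's real interface:

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <string>

// Maps "instructions retired in a cycle" to "number of cycles observed".
using RetiredHistogram = std::map<unsigned, unsigned>;

// Sums the cycle counts of every histogram bucket.
unsigned totalCycles(const RetiredHistogram &H) {
  unsigned Total = 0;
  for (const auto &Entry : H)
    Total += Entry.second;
  return Total;
}

// Formats one histogram row as "<retired>, <cycles> (<pct>%)".
std::string formatRow(unsigned Retired, unsigned Cycles, unsigned Total) {
  assert(Total != 0 && "simulation must run for at least one cycle");
  char Buffer[64];
  double Pct = 100.0 * Cycles / Total;
  std::snprintf(Buffer, sizeof(Buffer), "%u, %u (%.1f%%)", Retired, Cycles,
                Pct);
  return Buffer;
}
```

The column alignment of the real report is not reproduced here; only the numbers are.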
* [llvm-mca] LSUnit: use a SmallSet to model load/store queues. NFCIAndrea Di Biagio2018-11-222-25/+32
  Also, try to minimize the number of queries to the memory queues to speed up the analysis. On average, this change gives a small 2% speedup. For memcpy-like kernels, the speedup is up to 5.5%. llvm-svn: 347469
* [llvm-mca] Use a SmallVector instead of std::vector to track register ↵Andrea Di Biagio2018-11-222-9/+11
  reads/writes. NFCI This avoids a heap allocation most of the time. This patch gives a small but consistent 3% speedup on a release build (up to ~5% on a debug build). llvm-svn: 347464
* [llvm-mca] Fix an invalid memory read introduced by r346487.Andrea Di Biagio2018-11-223-16/+49
  This patch fixes an invalid memory read introduced by r346487. Before this patch, a partial register write had to query the latency of the dependent full register write by calling a method on the full write descriptor. However, if the full write is from an already retired instruction, chances are that the EntryStage already reclaimed its memory. In some partial register write tests, valgrind was reporting an invalid memory read. This change fixes the invalid memory access problem. Writes are now responsible for tracking dependent partial register writes, and for notifying them when the instruction is issued. That means, partial register writes no longer need to query their associated full write to check when they are ready to execute. Added test X86/BtVer2/partial-reg-update-7.s llvm-svn: 347459
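The "push instead of pull" idea behind this fix can be sketched as below. The `Write` class and its members are hypothetical simplifications, not the mca::WriteState interface: the full write pushes its latency to the dependent partial writes at issue time, so no dependent has to read the full write after its memory may have been reclaimed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified register write.
class Write {
  int CyclesLeft = -1;                // -1 means "owner not issued yet"
  int DependentWriteCyclesLeft = 0;   // latency pushed by the full write
  std::vector<Write *> PartialWrites; // dependent partial writes to notify

public:
  int cyclesLeft() const { return CyclesLeft; }
  int dependentWriteCyclesLeft() const { return DependentWriteCyclesLeft; }

  // Registers a partial write that must wait for this full write.
  void addPartialWrite(Write &Partial) { PartialWrites.push_back(&Partial); }

  // Called once, when the instruction owning this write is issued.
  // Every dependent receives the latency now, so none of them needs to
  // query this object later.
  void onInstructionIssued(int Latency) {
    assert(Latency >= 0 && "latency must be known at issue time");
    CyclesLeft = Latency;
    for (Write *Partial : PartialWrites)
      Partial->DependentWriteCyclesLeft = Latency;
    PartialWrites.clear();
  }
};
```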
* [llvm-mca] Correctly update the resource strategy for processor resources ↵Andrea Di Biagio2018-11-121-1/+7
  with multiple units. When looking at the tests committed by Roman at r346587, I noticed that numbers reported by the resource pressure for PdAGU01 were wrong. In particular, according to the auto-generated CHECK lines in tests memcpy-like-test.s and store-throughput.s, resource pressure for PdAGU01 was not uniformly distributed among the two AGEN pipes. It turns out that pressure was not correctly distributed because the "resource selection strategy" object associated with PdAGU01 was not updated when an AGEN pipe was used. As a result, llvm-mca was not simulating a round-robin pipeline allocation for PdAGU01. Instead, PdAGU1 was always prioritized over PdAGU0. This patch fixes the issue; now processor resource strategy objects for resources declaring multiple units are correctly notified in the event of "resource used". llvm-svn: 346650
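A minimal sketch of why the missing notification matters, assuming a hypothetical `RoundRobinSelector` (not the llvm-mca strategy class): if `used()` is never called, `select()` keeps returning the same unit, which is the kind of skew the commit describes.

```cpp
#include <cassert>

// Hypothetical round-robin selector for a resource with several units,
// for example the two AGEN pipes behind PdAGU01.
class RoundRobinSelector {
  unsigned NumUnits;
  unsigned Next = 0; // unit that select() will hand out

public:
  explicit RoundRobinSelector(unsigned N) : NumUnits(N) {
    assert(N > 0 && "a resource needs at least one unit");
  }

  // Picks the unit to allocate. Does not change any state.
  unsigned select() const { return Next; }

  // Must be called on every "resource used" event. Skipping this call
  // leaves the selector stuck on one unit.
  void used(unsigned Unit) {
    assert(Unit < NumUnits && "unknown unit");
    Next = (Unit + 1) % NumUnits;
  }
};
```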
* [llvm-mca] Account for buffered resources when analyzing "Super" resources.Andrea Di Biagio2018-11-091-1/+28
  This was noticed when working on PR3946. By construction, a group cannot be used as a "Super" resource. That constraint is enforced by method `SubtargetEmitter::ExpandProcResource()`. A Super resource S can be part of a group G. However, method `SubtargetEmitter::ExpandProcResource()` would not update the number of consumed resource cycles in G based on S. In practice, this is perfectly fine because the resource usage is correctly computed for processor resource units. However, llvm-mca should still check if G is a buffered resource. Before this patch, llvm-mca didn't correctly check if S was part of a group that defines a buffer. So, the instruction descriptor was not correctly set. For now, the semantic change introduced by this patch doesn't affect any of the upstream scheduling models. However, it will allow us to make some progress on PR3946. llvm-svn: 346545
* [llvm-mca] Use a small vector for instructions in the EntryStage.Andrea Di Biagio2018-11-093-11/+15
  Use a simple SmallVector to track the lifetime of simulated instructions. An ordered map was not needed because instructions are already picked in program order. It is also much faster if we avoid searching for already retired instructions at the end of every cycle. The new policy only triggers a "garbage collection" when the number of retired instructions becomes significantly big when compared with the total size of the vector. While working on this, I noticed that instructions were correctly retired, but their internal state was not updated (i.e. there was no transition from the EXECUTED state, to the RETIRED state). While this was not a problem for the views, it prevented the EntryStage from correctly garbage collecting already retired instructions. That was a bad oversight, and this patch fixes it. The observed speedup on a debug build of llvm-mca after this patch is ~6%. On a release build of llvm-mca, the observed speedup is ~15%. llvm-svn: 346487
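The lazy "garbage collection" policy can be sketched as follows. `InstructionPool` is a hypothetical class, `std::vector` stands in for SmallVector, and the "at least half of the entries are retired" threshold is an assumption made for the example; the commit message does not state the actual ratio.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical, simplified simulated instruction.
struct Instr {
  bool Retired = false;
};

class InstructionPool {
  std::vector<std::unique_ptr<Instr>> Instructions; // kept in program order
  std::size_t NumRetired = 0;

public:
  Instr &add() {
    Instructions.push_back(std::unique_ptr<Instr>(new Instr()));
    return *Instructions.back();
  }

  // Records the EXECUTED -> RETIRED transition; without it the pool
  // could never tell which entries are safe to reclaim.
  void retire(Instr &I) {
    assert(!I.Retired && "instruction retired twice");
    I.Retired = true;
    ++NumRetired;
  }

  std::size_t size() const { return Instructions.size(); }

  // End-of-cycle hook. Does nothing until retired entries make up at
  // least half of the vector (assumed threshold), then sweeps them all.
  void cycleEnd() {
    if (NumRetired * 2 < Instructions.size())
      return;
    Instructions.erase(
        std::remove_if(Instructions.begin(), Instructions.end(),
                       [](const std::unique_ptr<Instr> &I) {
                         return I->Retired;
                       }),
        Instructions.end());
    NumRetired = 0;
  }
};
```

Most cycles pay only for one comparison; the linear sweep runs occasionally and removes many entries at once.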
* [llvm-mca] Partially revert r346417.Matt Davis2018-11-081-16/+19
  Restored the llvm:: namespace qualifier on make_unique. This removes the ambiguity with make_unique. llvm-svn: 346424
* [llvm-mca] PR39261: Rename FetchStage to EntryStage.Andrea Di Biagio2018-11-085-22/+23
  This fixes PR39261. FetchStage is a misnomer. It causes confusion with the frontend fetch stage, which we don't currently simulate. I decided to rename it into EntryStage mainly because this is meant to be a "source" stage for all pipelines. Differential Revision: https://reviews.llvm.org/D54268 llvm-svn: 346419
* [llvm-mca] Remove unneeded namespace qualifier. NFC.Matt Davis2018-11-081-24/+21
  llvm-svn: 346417
* [llvm-mca] Move the AssembleInput logic into its own class.Matt Davis2018-11-075-103/+218
  Summary: This patch introduces a CodeRegionGenerator class which is responsible for parsing some type of input and creating a 'CodeRegions' instance for use by llvm-mca. In the future, we will also have a CodeRegionGenerator subclass for converting an input object file into CodeRegions. For now, we only have the subclass for converting input assembly into CodeRegions. This is mostly a NFC patch, as the logic remains close to the original, but now encapsulated in its own class and moved outside of llvm-mca.cpp. Reviewers: andreadb, courbet, RKSimon Reviewed By: andreadb Subscribers: mgorny, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D54179 llvm-svn: 346344
* [llvm-mca] Add extra counters for move elimination in view ↵Andrea Di Biagio2018-11-017-47/+163
  RegisterFileStatistics. This patch teaches view RegisterFileStatistics how to report events for optimizable register moves. For each processor register file, view RegisterFileStatistics reports the following extra information: - Number of optimizable register moves - Number of register moves eliminated - Number of zero moves (i.e. register moves that propagate a zero) - Max Number of moves eliminated per cycle. Differential Revision: https://reviews.llvm.org/D53976 llvm-svn: 345865
* [llvm-mca] Remove the verb 'assemble' from a few options in help. NFC.Matt Davis2018-10-311-17/+17
  * MCA does not assemble anything. * Ran clang-format. llvm-svn: 345750
* [llvm-mca] Remove namespace prefixes made redundant by r345612. NFCAndrea Di Biagio2018-10-3118-149/+129
  llvm-svn: 345730
* [llvm-mca] Move namespace mca inside llvm::Fangrui Song2018-10-3059-42/+118
  Summary: This allows us to remove `using namespace llvm;` in those *.cpp files. When we want to revisit the decision (everything resides in llvm::mca::*) in the future, we can move things to a nested namespace of llvm::mca::, to conceptually make them separate from the rest of llvm::mca::* Reviewers: andreadb, mattd Reviewed By: andreadb Subscribers: javed.absar, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D53407 llvm-svn: 345612
* [llvm-mca] Lower to mca::Instruction before the pipeline is run.Andrea Di Biagio2018-10-297-59/+63
  Before this change, the lowering of instructions from llvm::MCInst to mca::Instruction was done as part of the first stage of the pipeline (i.e. the FetchStage). In particular, FetchStage was responsible for picking the next instruction from the source sequence, and lowering it to an mca::Instruction with the help of an object of class InstrBuilder. The dependency on InstrBuilder was problematic for a number of reasons. Class InstrBuilder only knows how to lower from llvm::MCInst to mca::Instruction. That means, it is hard to support a different scenario where instructions in input are not instances of class llvm::MCInst. Even if we managed to specialize InstrBuilder, and generalize most of its internal logic, the dependency on InstrBuilder in FetchStage would have caused more trouble (other than complicating the pipeline logic). With this patch, the lowering step is done before the pipeline is run. The pipeline is no longer responsible for lowering from MCInst to mca::Instruction. As a consequence of this, the FetchStage no longer needs to interact with an InstrBuilder. The mca::SourceMgr class now simply wraps a reference to a sequence of mca::Instruction objects. This simplifies the logic of FetchStage, and makes it easier to reuse. As a result, on a debug build, we see a 7-9% speedup; on a release build, the speedup is around 3-4%. llvm-svn: 345500
* [llvm-mca] Fix -Wreorder and -Wunused-private-field after r345376. NFCSam McCall2018-10-262-4/+2
  llvm-svn: 345378
* [llvm-mca] Removed dependency on mca::SourceMgr in some Views. NFCAndrea Di Biagio2018-10-269-44/+61
  llvm-svn: 345376
* [llvm-mca] Introduce a new base class for mca::Instruction, and change how ↵Andrea Di Biagio2018-10-255-79/+84
  read/write information is stored. This patch introduces a new base class for Instruction named InstructionBase. Class InstructionBase is responsible for tracking data dependencies with the help of ReadState and WriteState objects. Class Instruction now derives from InstructionBase, and adds extra information related to the `InstrStage` as well as the `RCUTokenID`. ReadState and WriteState objects are no longer unique pointers. This avoids extra heap allocation and pointer checks that weren't really needed. Now, those objects are simply stored into SmallVectors. We use a SmallVector instead of a std::vector because we expect most instructions to only have a very small number of reads and writes. By using a simple SmallVector we also avoid extra heap allocations most of the time. In a debug build, this improves the performance of llvm-mca by roughly 10% (I still have to verify the impact on performance on a release build). llvm-svn: 345280
* [llvm-mca] Removed a couple of redundant method declarations, and simplified ↵Andrea Di Biagio2018-10-257-38/+23
  code in ResourcePressureView. NFC llvm-svn: 345259
* [llvm-mca] Replace InstRef::isValid with operator bool. NFC.Matt Davis2018-10-246-17/+12
  llvm-svn: 345190
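A minimal sketch of the pattern named in this commit title, replacing an `isValid()` method with an explicit conversion to bool. The `InstRef` below is a hypothetical, pointer-only simplification and does not mirror the real class layout.

```cpp
#include <cassert>

// Hypothetical, simplified instruction.
struct Instruction {};

// Hypothetical reference wrapper; the real InstRef is not reproduced here.
class InstRef {
  Instruction *I = nullptr;

public:
  InstRef() = default;
  explicit InstRef(Instruction *Inst) : I(Inst) {}

  // Explicit, so the reference works in conditions such as `if (IR)`
  // without silently converting to an integer.
  explicit operator bool() const { return I != nullptr; }
};
```

Call sites change from `if (IR.isValid())` to `if (IR)`.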
* [llvm-mca] Simplify the logic in FetchStage. NFCIAndrea Di Biagio2018-10-243-22/+18
  Only method 'getNextInstruction()' needs to interact with the SourceMgr. llvm-svn: 345185
* [llvm-mca] Remove dependency from InstrBuilder in class InstructionTables.Andrea Di Biagio2018-10-246-12/+9
  Also, removed the initialization of vectors used for processor resource masks. Support function 'computeProcResourceMasks()' already calls method resize on those vectors. No functional change intended. llvm-svn: 345161
* [llvm-mca] Refactor class SourceMgr. NFCIAndrea Di Biagio2018-10-244-44/+43
  Added begin()/end() methods to allow the usage of SourceMgr in range-based for loops. With this change, method getMCInstFromIndex() (as well as a couple of other methods) is now redundant, and can be removed from the public interface. llvm-svn: 345147
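A minimal sketch of how begin()/end() make index-based accessors unnecessary. `SourceView`, `Inst`, and `sumOpcodes` are hypothetical names for illustration; this is not the SourceMgr interface.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified instruction.
struct Inst {
  unsigned Opcode;
};

// Hypothetical read-only view over a sequence of instructions.
class SourceView {
  const std::vector<Inst> &Sequence;

public:
  using const_iterator = std::vector<Inst>::const_iterator;

  explicit SourceView(const std::vector<Inst> &S) : Sequence(S) {}

  // With these two methods the view can be used in a range-based for
  // loop, so callers no longer need a "get instruction at index" method.
  const_iterator begin() const { return Sequence.begin(); }
  const_iterator end() const { return Sequence.end(); }
};

// Example client: visits every instruction without using indices.
unsigned sumOpcodes(const SourceView &V) {
  unsigned Sum = 0;
  for (const Inst &I : V)
    Sum += I.Opcode;
  return Sum;
}
```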
* [llvm-mca] Improved error handling and error reporting from class ↵Andrea Di Biagio2018-10-244-47/+78
  InstrBuilder. A new class named InstructionError has been added to Support.h in order to improve the error reporting from class InstrBuilder. The llvm-mca driver is responsible for handling InstructionError objects, and printing them out to stderr. The goal of this patch is to remove all the remaining error handling logic from the library code. In particular, this allows us to: - Simplify the logic in InstrBuilder by removing a needless dependency from MCInstrPrinter. - Centralize all the error handling logic in a new function named 'runPipeline' (see llvm-mca.cpp). This is also a first step towards generalizing class InstrBuilder, so that in future, we will be able to reuse its logic to also "lower" MachineInstr to mca::Instruction objects. Differential Revision: https://reviews.llvm.org/D53585 llvm-svn: 345129
* [llvm-mca] Remove a couple of using directives and a bunch of redundant ↵Andrea Di Biagio2018-10-227-16/+13
  namespace llvm prefixes. NFC llvm-svn: 344916
* [llvm-mca] Use llvm::ArrayRef in class SourceMgr. NFCIAndrea Di Biagio2018-10-225-33/+31
  Class SourceMgr now uses type ArrayRef<MCInst> to reference the sequence of code from a "CodeRegion". llvm-svn: 344911
* [llvm-mca] Remove a stale TODO comment. NFCAndrea Di Biagio2018-10-191-2/+0
  Starting from revision r344334, we can now describe optimizable register-register moves in the machine scheduling models. llvm-svn: 344797
* Use llvm::{all,any,none}_of instead std::{all,any,none}_of. NFCFangrui Song2018-10-191-7/+6
  llvm-svn: 344774
* [llvm-mca] Correctly set aliases for register writes introduced by optimized ↵Andrea Di Biagio2018-10-123-16/+72
  register moves. This fixes a problem introduced by r344334. A write from a non-zero move eliminated at register renaming stage was not correctly handled by the PRF. This would have led to an assertion failure if the processor model declares a PRF that enables non-zero move elimination. llvm-svn: 344392
* [llvm-mca] Remove method ↵Andrea Di Biagio2018-10-122-7/+3
  RegisterFileStatistics::initializeRegisterFileInfo(). NFC llvm-svn: 344339
* [tblgen][llvm-mca] Add the ability to describe move elimination candidates ↵Andrea Di Biagio2018-10-122-6/+10
  via tablegen.

  This patch adds the ability to identify instructions that are "move elimination candidates". It also allows scheduling models to describe processor register files that allow move elimination.

  A move elimination candidate is an instruction that can be eliminated at register renaming stage. Each subtarget can specify which instructions are move elimination candidates with the help of tablegen class "IsOptimizableRegisterMove" (see llvm/Target/TargetInstrPredicate.td).

  For example, on X86, BtVer2 allows both GPR and MMX/SSE moves to be eliminated. The definition of 'IsOptimizableRegisterMove' for BtVer2 looks like this:

```
def : IsOptimizableRegisterMove<[
  InstructionEquivalenceClass<[
    // GPR variants.
    MOV32rr, MOV64rr,

    // MMX variants.
    MMX_MOVQ64rr,

    // SSE variants.
    MOVAPSrr, MOVUPSrr,
    MOVAPDrr, MOVUPDrr,
    MOVDQArr, MOVDQUrr,

    // AVX variants.
    VMOVAPSrr, VMOVUPSrr,
    VMOVAPDrr, VMOVUPDrr,
    VMOVDQArr, VMOVDQUrr
  ], CheckNot<CheckSameRegOperand<0, 1>> >
]>;
```

  Definitions of IsOptimizableRegisterMove from processor models of the same Target are processed by the SubtargetEmitter to auto-generate a target-specific override for each of the following predicate methods:

```
bool TargetSubtargetInfo::isOptimizableRegisterMove(const MachineInstr *MI) const;
bool MCInstrAnalysis::isOptimizableRegisterMove(const MCInst &MI, unsigned CPUID) const;
```

  By default, those methods return false (i.e. conservatively assume that there are no move elimination candidates).

  Tablegen class RegisterFile has been extended with the following information:
  - The set of register classes that allow move elimination.
  - Maximum number of moves that can be eliminated every cycle.
  - Whether move elimination is restricted to moves from registers that are known to be zero.

  This patch is structured in three parts:
  A first part (which is mostly boilerplate) adds the new 'isOptimizableRegisterMove' target hooks, and extends existing register file descriptors in MC by introducing new fields to describe properties related to move elimination.
  A second part uses the new tablegen constructs to describe move elimination in the BtVer2 scheduling model.
  A third part teaches llvm-mca how to query the new 'isOptimizableRegisterMove' hook to mark instructions that are candidates for move elimination. It also teaches class RegisterFile how to describe constraints on move elimination at PRF granularity.

  llvm-mca tests for btver2 show differences before/after this patch.
  Differential Revision: https://reviews.llvm.org/D53134
  llvm-svn: 344334
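The register file constraints listed in this commit can be sketched as a small per-cycle filter. This is an illustration under stated assumptions, not the RegisterFile implementation: `Move`, `RegisterFileInfo`, and `MoveEliminator` are hypothetical, a limit of 0 is taken to mean "no limit" for the example, and the set of register classes that allow elimination is left out.

```cpp
#include <cassert>

// Hypothetical, simplified register move.
struct Move {
  unsigned DstReg;
  unsigned SrcReg;
  bool SrcIsKnownZero; // source register is known to hold zero
};

// Hypothetical subset of the properties a register file can declare.
struct RegisterFileInfo {
  unsigned MaxMovesEliminatedPerCycle; // 0: no limit (assumption)
  bool AllowZeroMoveEliminationOnly;   // only moves from zero registers
};

class MoveEliminator {
  RegisterFileInfo Info;
  unsigned EliminatedThisCycle = 0;

public:
  explicit MoveEliminator(RegisterFileInfo I) : Info(I) {}

  void cycleStart() { EliminatedThisCycle = 0; }

  // Returns true if the move is eliminated at register renaming.
  bool tryEliminate(const Move &M) {
    // Same idea as CheckNot<CheckSameRegOperand<0, 1>>: a move of a
    // register onto itself is not a candidate.
    if (M.DstReg == M.SrcReg)
      return false;
    if (Info.AllowZeroMoveEliminationOnly && !M.SrcIsKnownZero)
      return false;
    if (Info.MaxMovesEliminatedPerCycle != 0 &&
        EliminatedThisCycle == Info.MaxMovesEliminatedPerCycle)
      return false;
    ++EliminatedThisCycle;
    return true;
  }
};
```

A move that fails any check is simply executed as a normal instruction.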
* [tblgen][CodeGenSchedule] Add a check for invalid RegisterFile definitions ↵Andrea Di Biagio2018-10-111-6/+4
  with zero physical registers. llvm-svn: 344235
* [llvm-mca] Minor refactoring in preparation for a patch that will fully fix ↵Andrea Di Biagio2018-10-102-12/+15
  PR36671. NFCI llvm-svn: 344149
* [llvm-mca] Remove unused/stale forward decl. NFC.Matt Davis2018-10-041-2/+0
  llvm-svn: 343823
* [llvm-mca] Move field 'AllowZeroMoveEliminationOnly' to class RegisterFile. NFC.Andrea Di Biagio2018-10-042-5/+23
  Flag 'AllowZeroMoveEliminationOnly' should have been a property of the PRF, and not set at register granularity. This change also restricts move elimination to writes that update a full physical register. We assume that there is a strong correlation between logical registers that allow move elimination, and how those same registers are allocated to physical registers by the register renamer. This is still a no functional change, because this experimental code path is disabled for now. This is done in preparation for another patch that will add the ability to describe how move elimination works in scheduling models. llvm-svn: 343787
* [llvm-mca] Check for inconsistencies when constructing instruction descriptors.Andrea Di Biagio2018-10-044-2/+43
  This should help with catching inconsistent definitions of instructions with zero opcodes, which also declare to consume scheduler/pipeline resources. llvm-svn: 343766
* [llvm-mca] Add support for move elimination in class RegisterFile.Andrea Di Biagio2018-10-037-10/+164
  This patch teaches class RegisterFile how to analyze register writes from instructions that are move elimination candidates. In particular, it teaches it how to check if a move can be effectively eliminated by the underlying PRF, and (if necessary) how to perform move elimination. The long term goal is to allow processor models to describe instructions that are valid move elimination candidates. The idea is to let register file definitions in tablegen declare if/when moves can be eliminated. This patch is a non functional change. The logic that performs move elimination is currently disabled. A future patch will add support for move elimination in the processor models, and enable this new code path. llvm-svn: 343691
* [llvm-mca] Remove unnecessary forward decls. NFC.Matt Davis2018-10-024-5/+0
  This patch also removes an unnecessary include. llvm-svn: 343621
* [llvm-mca] Constify the 'notify' routines. NFC.Matt Davis2018-10-026-16/+18
  Also fixed up some whitespace formatting in DispatchStage.cpp. llvm-svn: 343615
* [MCA] Remove SM.hasNext() call in FetchStage::execute.Owen Rodley2018-10-021-1/+1
  Summary: This is redundant, as FetchStage::getNextInstruction already checks this and returns llvm::ErrorSuccess() as appropriate. NFC. Reviewers: andreadb Subscribers: gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D52642 llvm-svn: 343555
* [llvm-mca] Rename the 'Subtract' method to 'subtract'Matt Davis2018-10-012-2/+2
  llvm-svn: 343549
* [llvm-mca] Remove redundant namespace prefixes. NFCAndrea Di Biagio2018-09-288-37/+36
  We are already "using" namespace llvm in all the files modified by this change. llvm-svn: 343312
* [llvm-mca] Teach how to track zero registers in class RegisterFile.Andrea Di Biagio2018-09-283-26/+78
  This change is in preparation for future work on improving support for optimizable register moves. We already know if a write is from a zero-idiom, so we can propagate that bit of information to the PRF. We use an APInt mask to identify registers that are set to zero. llvm-svn: 343307
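A minimal sketch of the zero-register mask idea, with one bit per register. `ZeroRegisterTracker` is a hypothetical class and `std::vector<bool>` stands in for the APInt mask named in the commit message so that the example needs no LLVM headers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical tracker: bit N is set when register N is known to be zero.
class ZeroRegisterTracker {
  std::vector<bool> ZeroRegisters;

public:
  explicit ZeroRegisterTracker(unsigned NumRegs)
      : ZeroRegisters(NumRegs, false) {}

  // Called for every register write. A write from a zero-idiom sets the
  // bit; any other write clears it.
  void onWrite(unsigned Reg, bool IsZeroIdiom) {
    assert(Reg < ZeroRegisters.size() && "unknown register");
    ZeroRegisters[Reg] = IsZeroIdiom;
  }

  bool isKnownZero(unsigned Reg) const {
    assert(Reg < ZeroRegisters.size() && "unknown register");
    return ZeroRegisters[Reg];
  }
};
```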
* Test commit. NFC.Owen Rodley2018-09-281-1/+1
  llvm-svn: 343296
* llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...)Fangrui Song2018-09-272-5/+4
  Summary: The convenience wrapper in STLExtras is available since rL342102. Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52573 llvm-svn: 343163
* [llvm-mca] Improve code comments in LSUnit.{h, cpp}. NFCAndrea Di Biagio2018-09-242-15/+25
  llvm-svn: 342877
* [MCA] Remove dependency on CodeGen.Dean Michael Berris2018-09-213-3/+1
  Summary: There isn't any actual dependency - there's one #include from CodeGen but nothing from the header is actually used. With this change we can use the MCA library from CodeGen without circular dependencies (e.g. for scheduling). Reviewers: andreadb Reviewed By: andreadb Authored By: orodley Subscribers: mgorny, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D52288 llvm-svn: 342706