summaryrefslogtreecommitdiffstats
path: root/llvm/include
Commit message (Collapse)AuthorAgeFilesLines
...
* [Attributor] Implement "norecurse" function attribute deductionHideto Ueno2019-09-211-8/+41
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch introduces `norecurse` function attribute deduction. `norecurse` will be deduced if the following conditions hold: * The size of SCC in which the function belongs equals to 1. * The function doesn't have self-recursion. * We have `norecurse` for all call site. To avoid a large change, SCC is calculated using scc_iterator in InfoCache initialization for now. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67751 llvm-svn: 372475
* [Support] Add a DataExtractor constructor that takes ArrayRef<uint8_t>Fangrui Song2019-09-211-0/+5
| | | | | | | | | | The new constructor can simplify some llvm-readobj call sites. Reviewed By: grimar, dblaikie Differential Revision: https://reviews.llvm.org/D67797 llvm-svn: 372473
* Revert "[SampleFDO] Expose an interface to return the size of a section or ↵Amara Emerson2019-09-212-23/+0
| | | | | | | | | | the size" This reverts commit f118852046a1d255ed8c65c6b5db320e8cea53a0. Broke the macOS build/greendragon bots. llvm-svn: 372464
* [MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCountJames Molloy2019-09-212-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recommit: fix asan errors. The way MachinePipeliner uses these target hooks is stateful - we reduce trip count by one per call to reduceLoopCount. It's a little overfit for hardware loops, where we don't have to worry about stitching a loop induction variable across prologs and epilogs (the induction variable is implicit). This patch introduces a new API: /// Analyze loop L, which must be a single-basic-block loop, and if the /// conditions can be understood enough produce a PipelinerLoopInfo object. virtual std::unique_ptr<PipelinerLoopInfo> analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const; The return value is expected to be an implementation of the abstract class: /// Object returned by analyzeLoopForPipelining. Allows software pipelining /// implementations to query attributes of the loop being pipelined. class PipelinerLoopInfo { public: virtual ~PipelinerLoopInfo(); /// Return true if the given instruction should not be pipelined and should /// be ignored. An example could be a loop comparison, or induction variable /// update with no users being pipelined. virtual bool shouldIgnoreForPipelining(const MachineInstr *MI) const = 0; /// Create a condition to determine if the trip count of the loop is greater /// than TC. /// /// If the trip count is statically known to be greater than TC, return /// true. If the trip count is statically known to be not greater than TC, /// return false. Otherwise return nullopt and fill out Cond with the test /// condition. virtual Optional<bool> createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB, SmallVectorImpl<MachineOperand> &Cond) = 0; /// Modify the loop such that the trip count is /// OriginalTC + TripCountAdjust. virtual void adjustTripCount(int TripCountAdjust) = 0; /// Called when the loop's preheader has been modified to NewPreheader. virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0; /// Called when the loop is being removed. virtual void disposed() = 0; }; The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while allowing the target to hold its own state across all calls. This API, in particular the disjunction of creating a trip count check condition and adjusting the loop, improves the code quality in ModuloSchedule.cpp. llvm-svn: 372463
* LiveIntervals: Add missing operator!= for segmentsMatt Arsenault2019-09-211-0/+4
| | | | llvm-svn: 372449
* [SampleFDO] Expose an interface to return the size of a section or the sizeWei Mi2019-09-202-0/+23
| | | | | | | | | | | | | | of the profile for profile in ExtBinary format. Sometimes we want to limit the size of the profile by stripping some functions with low sample count or by stripping some function names with small text size from profile symbol list. That requires the profile reader to have the interfaces returning the size of a section or the size of total profile. The patch add those interfaces. At the same time, add some dump facility to show the size of each section. llvm-svn: 372439
* Revert "[MachinePipeliner] Improve the TargetInstrInfo API ↵Mitch Phillips2019-09-202-46/+0
| | | | | | | | | | | analyzeLoop/reduceLoopCount" This commit broke the ASan buildbot. See comments in rL372376 for more information. This reverts commit 15e27b0b6d9d51362fad85dbe95ac5b3fadf0a06. llvm-svn: 372425
* Fix -Wdocumentation warning. NFCI.Simon Pilgrim2019-09-201-1/+1
| | | | llvm-svn: 372418
* [SelectionDAG][Mips][Sparc] Don't allow SimplifyDemandedBits to constant ↵Craig Topper2019-09-201-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | fold TargetConstant nodes to a Constant. Summary: After the switch in SimplifyDemandedBits, it tries to create a constant when possible. If the original node is a TargetConstant the default in the switch will call computeKnownBits on the TargetConstant which will succeed. This results in the TargetConstant becoming a Constant. But TargetConstant exists to avoid being changed. I've fixed the two cases that relied on this in tree by explicitly making the nodes constant instead of target constant. The Sparc case is an old bug. The Mips case was recently introduced now that ImmArg on intrinsics gets turned into a TargetConstant when the SelectionDAG is created. I've removed the ImmArg since it lowers to generic code. Reviewers: arsenm, RKSimon, spatel Subscribers: jyknight, sdardis, wdng, arichardson, hiraditya, fedor.sergeev, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67802 llvm-svn: 372409
* Remove assert from MachineLoop::getLoopPredecessor()Stanislav Mekhanoshin2019-09-201-2/+0
| | | | | | | | | | | | | | | | | | According to the documentation method returns predecessor if the given loop's header has exactly one unique predecessor outside the loop. Otherwise return null. In reality it asserts if there is no predecessor outside of the loop. The testcase has the loop where predecessors outside of the loop were not identified as analyzeBranch() was unable to process the mask branch and returned true. That is also not correct to assert for the truly dead loops. Differential Revision: https://reviews.llvm.org/D67634 llvm-svn: 372405
* [MVT] Add v256i1 to MachineValueTypeKrzysztof Parzyszek2019-09-202-248/+254
| | | | | | This type can show up when lowering some HVX vector code on Hexagon. llvm-svn: 372403
* [TextAPI] Arch&Platform to TargetCyndy Ishida2019-09-206-64/+218
| | | | | | | | | | | | | | | | | | | | Summary: This is a patch for updating TextAPI/Macho to read in targets as opposed to arch/platform. This is because in previous versions tbd files only supported a single platform but that is no longer the case, so, now its tracked by unique triples. This precedes a seperate patch that will add the TBD-v4 format Reviewers: ributzka, steven_wu, plotfi, compnerd, smeenai Reviewed By: ributzka Subscribers: mgorny, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67527 llvm-svn: 372396
* [Alignment][NFC] migrate DataLayout internal struct to llvm::AlignGuillaume Chatelet2019-09-201-14/+14
| | | | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 With this patch the PointerAlignElem struct goes from 20B to 16B. Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67400 llvm-svn: 372390
* [FastISel] Fix insertion of unconditional branches during FastISelDavid Tellenbach2019-09-201-0/+5
| | | | | | | | | | | | | | | | | | | | | | The insertion of an unconditional branch during FastISel can differ depending on building with or without debug information. This happens because FastISel::fastEmitBranch emits an unconditional branch depending on the size of the current basic block without distinguishing between debug and non-debug instructions. This patch fixes this issue by ignoring debug instructions when getting the size of the basic block. Reviewers: aprantl Reviewed By: aprantl Subscribers: ormris, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67703 llvm-svn: 372389
* [IntrinsicEmitter] Add overloaded types for SVE intrinsics (Subdivide2 & ↵Kerry McLaughlin2019-09-203-9/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | Subdivide4) Summary: Both match the type of another intrinsic parameter of a vector type, but where each element is subdivided to form a vector with more elements of a smaller type. Subdivide2Argument allows intrinsics such as the following to be defined: - declare <vscale x 4 x i32> @llvm.something.nxv4i32(<vscale x 8 x i16>) Subdivide4Argument allows intrinsics such as: - declare <vscale x 4 x i32> @llvm.something.nxv4i32(<vscale x 16 x i8>) Tests are included in follow up patches which add intrinsics using these types. Reviewers: sdesmalen, SjoerdMeijer, greened, rovka Reviewed By: sdesmalen Subscribers: rovka, tschuett, jdoerfert, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67549 llvm-svn: 372380
* [MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCountJames Molloy2019-09-202-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The way MachinePipeliner uses these target hooks is stateful - we reduce trip count by one per call to reduceLoopCount. It's a little overfit for hardware loops, where we don't have to worry about stitching a loop induction variable across prologs and epilogs (the induction variable is implicit). This patch introduces a new API: /// Analyze loop L, which must be a single-basic-block loop, and if the /// conditions can be understood enough produce a PipelinerLoopInfo object. virtual std::unique_ptr<PipelinerLoopInfo> analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const; The return value is expected to be an implementation of the abstract class: /// Object returned by analyzeLoopForPipelining. Allows software pipelining /// implementations to query attributes of the loop being pipelined. class PipelinerLoopInfo { public: virtual ~PipelinerLoopInfo(); /// Return true if the given instruction should not be pipelined and should /// be ignored. An example could be a loop comparison, or induction variable /// update with no users being pipelined. virtual bool shouldIgnoreForPipelining(const MachineInstr *MI) const = 0; /// Create a condition to determine if the trip count of the loop is greater /// than TC. /// /// If the trip count is statically known to be greater than TC, return /// true. If the trip count is statically known to be not greater than TC, /// return false. Otherwise return nullopt and fill out Cond with the test /// condition. virtual Optional<bool> createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB, SmallVectorImpl<MachineOperand> &Cond) = 0; /// Modify the loop such that the trip count is /// OriginalTC + TripCountAdjust. virtual void adjustTripCount(int TripCountAdjust) = 0; /// Called when the loop's preheader has been modified to NewPreheader. virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0; /// Called when the loop is being removed. virtual void disposed() = 0; }; The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while allowing the target to hold its own state across all calls. This API, in particular the disjunction of creating a trip count check condition and adjusting the loop, improves the code quality in ModuloSchedule.cpp. llvm-svn: 372376
* [CallSiteSplitting] Remove unused includes (NFC).Florian Hahn2019-09-201-5/+0
| | | | llvm-svn: 372375
* llvm-undname: Delete an empty, unused method.Nico Weber2019-09-201-2/+0
| | | | llvm-svn: 372367
* [SVFS] Vector Function ABI demangling.Francesco Petrogalli2019-09-191-0/+111
| | | | | | | | | | | | | | | | This patch implements the demangling functionality as described in the Vector Function ABI. This patch will be used to implement the SearchVectorFunctionSystem (SVFS) as described in the RFC: http://lists.llvm.org/pipermail/llvm-dev/2019-June/133484.html A fuzzer is added to test the demangling utility. Patch by Sumedh Arani <sumedh.arani@arm.com> Differential revision: https://reviews.llvm.org/D66024 llvm-svn: 372343
* [Float2Int] avoid crashing on unreachable code (PR38502)Sanjay Patel2019-09-191-2/+4
| | | | | | | | | | | | | In the example from: https://bugs.llvm.org/show_bug.cgi?id=38502 ...we hit infinite looping/crashing because we have non-standard IR - an instruction operand is used before defined. This and other unusual constructs are allowed in unreachable blocks, so avoid the problem by using DominatorTree to step around landmines. Differential Revision: https://reviews.llvm.org/D67766 llvm-svn: 372339
* Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"Matt Arsenault2019-09-194-1/+26
| | | | | | | | | This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297) This was missing one switch to getTargetConstant in an untested case. llvm-svn: 372338
* Make appendCallNB lambda mutableChris Bieneman2019-09-191-1/+1
| | | | | | Lambdas are by deafult const so that they produce the same output every time they are run. This lambda needs to set the value on a captured promise which is a mutating operation, so it must be mutable. llvm-svn: 372336
* [DAG][X86] Convert isNegatibleForFree/GetNegatedExpression to a target hook ↵Simon Pilgrim2019-09-191-0/+12
| | | | | | | | | | | | | | | | (PR42863) This patch converts the DAGCombine isNegatibleForFree/GetNegatedExpression into overridable TLI hooks and includes a demonstration X86 implementation. The intention is to let us extend existing FNEG combines to work more generally with negatible float ops, allowing it work with target specific combines and opcodes (e.g. X86's FMA variants). Unlike the SimplifyDemandedBits, we can't just handle target nodes through a Target callback, we need to do this as an override to allow targets to handle generic opcodes as well. This does mean that the target implementations has to duplicate some checks (recursion depth etc.). I've only begun to replace X86's FNEG handling here, handling FMADDSUB/FMSUBADD negation and some low impact codegen changes (some FMA negatation propagation). We can build on this in future patches. Differential Revision: https://reviews.llvm.org/D67557 llvm-svn: 372333
* [TableGen] Support encoding per-HwModeJames Molloy2019-09-193-2/+15
| | | | | | | | | | | | | | Much like ValueTypeByHwMode/RegInfoByHwMode, this patch allows targets to modify an instruction's encoding based on HwMode. When the EncodingInfos field is non-empty the Inst and Size fields of the Instruction are ignored and taken from EncodingInfos instead. As part of this promote getHwMode() from TargetSubtargetInfo to MCSubtargetInfo. This is NFC for all existing targets - new code is generated only if targets use EncodingByHwMode. llvm-svn: 372320
* [DAG] Add SelectionDAG::MaxRecursionDepth constantSimon Pilgrim2019-09-191-0/+4
| | | | | | | | | | As commented on D67557 we have a lot of uses of depth checks all using magic numbers. This patch adds the SelectionDAG::MaxRecursionDepth constant and moves over some general cases to use this explicitly. Differential Revision: https://reviews.llvm.org/D67711 llvm-svn: 372315
* Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"Hans Wennborg2019-09-194-26/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This broke the Chromium build, causing it to fail with e.g. fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15> See llvm-commits thread of r372285 for details. This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit. > Encode them directly as an imm argument to G_INTRINSIC*. > > Since now intrinsics can now define what parameters are required to be > immediates, avoid using registers for them. Intrinsics could > potentially want a constant that isn't a legal register type. Also, > since G_CONSTANT is subject to CSE and legalization, transforms could > potentially obscure the value (and create extra work for the > selector). The register bank of a G_CONSTANT is also meaningful, so > this could throw off future folding and legalization logic for AMDGPU. > > This will be much more convenient to work with than needing to call > getConstantVRegVal and checking if it may have failed for every > constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth > immarg operands, many of which need inspection during lowering. Having > to find the value in a register is going to add a lot of boilerplate > and waste compile time. > > SelectionDAG has always provided TargetConstant for constants which > should not be legalized or materialized in a register. The distinction > between Constant and TargetConstant was somewhat fuzzy, and there was > no automatic way to force usage of TargetConstant for certain > intrinsic parameters. They were both ultimately ConstantSDNode, and it > was inconsistently used. It was quite easy to mis-select an > instruction requiring an immediate. For SelectionDAG, start emitting > TargetConstant for these arguments, and using timm to match them. > > Most of the work here is to cleanup target handling of constants. Some > targets process intrinsics through intermediate custom nodes, which > need to preserve TargetConstant usage to match the intrinsic > expectation. Pattern inputs now need to distinguish whether a constant > is merely compatible with an operand or whether it is mandatory. > > The GlobalISelEmitter needs to treat timm as a special case of a leaf > node, simlar to MachineBasicBlock operands. This should also enable > handling of patterns for some G_* instructions with immediates, like > G_FENCE or G_EXTRACT. > > This does include a workaround for a crash in GlobalISelEmitter when > ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372314
* [Unroll] Add an option to control complete unrollingSerguei Katkov2019-09-192-1/+9
| | | | | | | | | | | | Add an ability to specify the max full unroll count for LoopUnrollPass pass in pass options. Reviewers: fhahn, fedor.sergeev Reviewed By: fedor.sergeev Subscribers: hiraditya, zzheng, dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D67701 llvm-svn: 372305
* MachineScheduler: Fix assert from not checking subregsMatt Arsenault2019-09-191-0/+3
| | | | | | | The assert would fail if there was a dead def of a subregister if there was a previous use of a different subregister. llvm-svn: 372287
* GlobalISel: Don't materialize immarg arguments to intrinsicsMatt Arsenault2019-09-193-1/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Encode them directly as an imm argument to G_INTRINSIC*. Since now intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU. This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time. SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them. Most of the work here is to cleanup target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory. The GlobalISelEmitter needs to treat timm as a special case of a leaf node, simlar to MachineBasicBlock operands. This should also enable handling of patterns for some G_* instructions with immediates, like G_FENCE or G_EXTRACT. This does include a workaround for a crash in GlobalISelEmitter when ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372285
* [Object] Extend MachOUniversalBinary::getObjectForArchAlexander Shaposhnikov2019-09-191-1/+7
| | | | | | | | | | | | Make the method MachOUniversalBinary::getObjectForArch return MachOUniversalBinary::ObjectForArch and add helper methods MachOUniversalBinary::getMachOObjectForArch, MachOUniversalBinary::getArchiveForArch for those who explicitly expect to get a MachOObjectFile or an Archive. Differential revision: https://reviews.llvm.org/D67700 Test plan: make check-all llvm-svn: 372278
* Remove the obsolete BlockByRefStruct flag from LLVM IRAdrian Prantl2019-09-183-3/+3
| | | | | | | | | | | | | | | | | | | | | | | DIFlagBlockByRefStruct is an unused DIFlag that originally was used by clang to express (Objective-)C block captures in debug info. For the last year Clang has been emitting complex DIExpressions to describe block captures instead, which makes all the code supporting this flag redundant. This patch removes the flag and all supporting "dead" code, so we can reuse the bit for something else in the future. Since this only affects debug info generated by Clang with the block extension this mostly affects Apple platforms and I don't have any bitcode compatibility concerns for removing this. The Verifier will reject debug info that uses the bit and thus degrade gracefully when LTO'ing older bitcode with a newer compiler. rdar://problem/44304813 Differential Revision: https://reviews.llvm.org/D67453 llvm-svn: 372272
* Add AutoUpgrade function to add new address space datalayout string to ↵Amy Huang2019-09-182-1/+5
| | | | | | | | | | | | | | | | | | | | | | existing datalayouts. Summary: Add function to AutoUpgrade to change the datalayout of old X86 datalayout strings. This adds "-p270:32:32-p271:32:32-p272:64:64" to X86 datalayouts that are otherwise valid and don't already contain it. This also removes the compatibility changes in https://reviews.llvm.org/D66843. Datalayout change in https://reviews.llvm.org/D64931. Reviewers: rnk, echristo Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67631 llvm-svn: 372267
* Fix compile-time regression caused by rL371928Daniel Sanders2019-09-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Also fixup rL371928 for cases that occur on our out-of-tree backend There were still quite a few intermediate APInts and this caused the compile time of MCCodeEmitter for our target to jump from 16s up to ~5m40s. This patch, brings it back down to ~17s by eliminating pretty much all of them using two new APInt functions (extractBitsAsZExtValue(), insertBits() but with a uint64_t). The exact conditions for eliminating them is that the field extracted/inserted must be <=64-bit which is almost always true. Note: The two new APInt API's assume that APInt::WordSize is at least 64-bit because that means they touch at most 2 APInt words. They statically assert that's true. It seems very unlikely that someone is patching it to be smaller so this should be fine. Reviewers: jmolloy Reviewed By: jmolloy Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67686 llvm-svn: 372243
* [DDG] Break a cyclic dependency from Analysis to ScalarOptsBenjamin Kramer2019-09-181-2/+2
| | | | llvm-svn: 372240
* Data Dependence Graph BasicsBardia Mahjour2019-09-182-0/+480
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper: D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS. This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges. The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored. The algorithm for building the graph involves the following steps in order: 1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph. 2. For each node in the graph establish def-use edges to/from other nodes in the graph. 3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it. Authored By: bmahjour Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert Reviewed By: Meinersbur, fhahn, myhsu Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto Tag: #llvm Differential Revision: https://reviews.llvm.org/D65350 llvm-svn: 372238
* [SampleFDO] Minimize performance impact when profile-sample-accurateWei Mi2019-09-181-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | is enabled. We can save memory and reduce binary size significantly by enabling ProfileSampleAccurate. However when ProfileSampleAccurate is true, function without sample will be regarded as cold and this could potentially cause performance regression. To minimize the potential negative performance impact, we want to be a little conservative here saying if a function shows up in the profile, no matter as outline instance, inline instance or call targets, treat the function as not being cold. This will handle the cases such as most callsites of a function are inlined in sampled binary (thus outline copy don't get any sample) but not inlined in current build (because of source code drift, imprecise debug information, or the callsites are all cold individually but not cold accumulatively...), so that the outline function showing up as cold in sampled binary will actually not be cold after current build. After the change, such function will be treated as not cold even profile-sample-accurate is enabled. At the same time we lower the hot criteria of callsiteIsHot check when profile-sample-accurate is enabled. callsiteIsHot is used to determined whether a callsite is hot and qualified for early inlining. When profile-sample-accurate is enabled, functions without profile will be regarded as cold and much less inlining will happen in CGSCC inlining pass, so we can worry less about size increase and be aggressive to allow more early inlining to happen for warm callsites and it is helpful for performance overall. Differential Revision: https://reviews.llvm.org/D67561 llvm-svn: 372232
* [Alignment][NFC] Remove LogAlignment functionsGuillaume Chatelet2019-09-181-8/+2
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67620 llvm-svn: 372231
* [Alignment][NFC] Use Align::None instead of 1Guillaume Chatelet2019-09-182-2/+2
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67704 llvm-svn: 372230
* Revert "[AArch64][DebugInfo] Do not recompute CalleeSavedStackSize"Krasimir Georgiev2019-09-181-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This reverts commit r372204. This change causes build bot failures under msan: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/35236/steps/check-llvm%20msan/logs/stdio: ``` FAIL: LLVM :: DebugInfo/AArch64/asan-stack-vars.mir (19531 of 33579) ******************** TEST 'LLVM :: DebugInfo/AArch64/asan-stack-vars.mir' FAILED ******************** Script: -- : 'RUN: at line 1'; /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc -O0 -start-before=livedebugvalues -filetype=obj -o - /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-dwarfdump -v - | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir -- Exit Code: 2 Command Output (stderr): -- ==62894==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0xdfcafb in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3 #1 0xdfae8a in resolveFrameIndexReference /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1580:10 #2 0xdfae8a in llvm::AArch64FrameLowering::getFrameIndexReference(llvm::MachineFunction const&, int, unsigned int&) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1536 #3 0x46642c1 in (anonymous namespace)::LiveDebugValues::extractSpillBaseRegAndOffset(llvm::MachineInstr const&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:582:21 #4 0x4647cb3 in transferSpillOrRestoreInst /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:883:11 #5 0x4647cb3 in process /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1079 #6 0x4647cb3 in (anonymous namespace)::LiveDebugValues::ExtendRanges(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1361 #7 0x463ac0e in (anonymous namespace)::LiveDebugValues::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1415:18 #8 0x4854ef0 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:13 #9 0x53b0b01 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1648:27 #10 0x53b15f6 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1685:16 #11 0x53b298d in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1750:27 #12 0x53b298d in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1863 #13 0x905f21 in compileModule(char**, llvm::LLVMContext&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:601:8 #14 0x8fdc4e in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:355:22 #15 0x7f67673632e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) #16 0x882369 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc+0x882369) MemorySanitizer: use-of-uninitialized-value /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3 in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const Exiting error: -: The file was not recognized as a valid object file FileCheck error: '-' is empty. FileCheck command line: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir ``` Reviewers: bkramer Reviewed By: bkramer Subscribers: sdardis, aprantl, kristof.beyls, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67710 llvm-svn: 372228
* Fix -Wdocumentation warning. NFCI.Simon Pilgrim2019-09-181-2/+2
| | | | llvm-svn: 372215
* Fix -Wdocumentation "empty paragraph passed to '\brief'" warning. NFCI.Simon Pilgrim2019-09-181-2/+2
| | | | llvm-svn: 372214
* Fix -Wdocumentation "Unknown param" warning. NFCI.Simon Pilgrim2019-09-181-1/+0
| | | | llvm-svn: 372211
* [Alignment] Add a None() member functionGuillaume Chatelet2019-09-181-1/+8
| | | | | | | | | | | | | | | | | | | Summary: This will allow writing `if(A != llvm::Align::None())` which is clearer than `if(A > llvm::Align(1))` This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67697 llvm-svn: 372207
* [AArch64][DebugInfo] Do not recompute CalleeSavedStackSizeSander de Smalen2019-09-181-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a bug exposed by D65653 where a subsequent invocation of `determineCalleeSaves` ends up with a different size for the callee save area, leading to different frame-offsets in debug information. In the invocation by PEI, `determineCalleeSaves` tries to determine whether it needs to spill an extra callee-saved register to get an emergency spill slot. To do this, it calls 'estimateStackSize' and manually adds the size of the callee-saves to this. PEI then allocates the spill objects for the callee saves and the remaining frame layout is calculated accordingly. A second invocation in LiveDebugValues causes estimateStackSize to return the size of the stack frame including the callee-saves. Given that the size of the callee-saves is added to this, these callee-saves are counted twice, which leads `determineCalleeSaves` to believe the stack has become big enough to require spilling an extra callee-save as emergency spillslot. It then updates CalleeSavedStackSize with a larger value. Since CalleeSavedStackSize is used in the calculation of the frame offset in getFrameIndexReference, this leads to incorrect offsets for variables/locals when this information is recalculated after PEI. Reviewers: omjavaid, eli.friedman, thegameg, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D66935 llvm-svn: 372204
* Revert "r372201: [Support] Replace function with function_ref in ↵Ilya Biryukov2019-09-181-4/+3
| | | | | | | | | | | writeFileAtomically. NFC" function_ref causes calls to the function to be ambiguous, breaking compilation. Reverting for now. llvm-svn: 372202
* [Support] Replace function with function_ref in writeFileAtomically. NFCIlya Biryukov2019-09-181-3/+4
| | | | | | | | | | | | | | | | | | | Summary: The latter is slightly more efficient and communicates the intent of the API: writeFileAtomically does not own or copy the callback, it merely calls it at some point. Reviewers: jkorous Reviewed By: jkorous Subscribers: hiraditya, dexonsmith, jfb, llvm-commits, cfe-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67584 llvm-svn: 372201
* [Remarks] Allow the RemarkStreamer to be used directly with a streamFrancis Visoiu Mistrih2019-09-181-6/+16
| | | | | | | | | | The filename in the RemarkStreamer should be optional to allow clients to stream remarks to memory or to existing streams. This introduces a new overload of `setupOptimizationRemarks`, and avoids enforcing the presence of a filename at different places. llvm-svn: 372195
* Revert "Data Dependence Graph Basics"Bardia Mahjour2019-09-172-480/+0
| | | | | | This reverts commit c98ec60993a7aa65073692b62f6d728b36e68ccd, which broke the sphinx-docs build. llvm-svn: 372168
* Data Dependence Graph BasicsBardia Mahjour2019-09-172-0/+480
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper: D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS. This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges. The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored. The algorithm for building the graph involves the following steps in order: 1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph. 2. For each node in the graph establish def-use edges to/from other nodes in the graph. 3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it. Authored By: bmahjour Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert Reviewed By: Meinersbur, fhahn, myhsu Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto Tag: #llvm Differential Revision: https://reviews.llvm.org/D65350 llvm-svn: 372162
* GSYM: Add the llvm::gsym::Header header class with testsGreg Clayton2019-09-171-0/+124
| | | | | | | | This patch adds the llvm::gsym::Header class which appears at the start of a stand alone GSYM file, or in the first bytes of the GSYM data in a GSYM section within a file. Added encode and decode methods with full error handling and full tests. Differential Revision: https://reviews.llvm.org/D67666 llvm-svn: 372149
OpenPOWER on IntegriCloud