summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [mips] Pick the right variant of DINS upfront and enable target instruction ↵Simon Dardis2017-09-148-44/+119
| | | | | | | | | | | | | | | | | | | | | | | | | verification This patch complements D16810 "[mips] Make isel select the correct DEXT variant up front.". Now ISel picks the right variant of DINS, so now there is no need to replace DINS with the appropriate variant during MipsMCCodeEmitter::encodeInstruction(). This patch also enables target specific instruction verification for ins, dins, dinsm, dinsu, ext, dext, dextm, dextu. These instructions have constraints that are checked when generating MipsISD::Ins and MipsISD::Ext nodes, but these constraints are not checked during instruction selection. Adding machine verification should catch outstanding cases. Finally, correct a bug that instruction verification uncovered, where the position operand of a DINSU generated during lowering was being silently and accidently corrected to the correct value. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D34809 llvm-svn: 313254
* Revert "[dwarfdump] Add DWARF verifiers for address ranges"Jonas Devlieghere2017-09-141-135/+4
| | | | | | This reverts commit r313250. llvm-svn: 313253
* [DAGCombine] (shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2)Simon Pilgrim2017-09-141-2/+4
| | | | | | | | | | We already have a combine for this pattern when the input to shl is add, so we just need to enable the transformation when the input is or. Original patch by @tstellar Differential Revision: https://reviews.llvm.org/D19325 llvm-svn: 313251
* [dwarfdump] Add DWARF verifiers for address rangesJonas Devlieghere2017-09-141-4/+135
| | | | | | | | | | | | | | | | This patch started as an attempt to rebase Greg's differential (D32821). The result is both quite similar and different at the same time. It adds the following checks: - Verify that all address ranges in a DIE are valid. - Verify that no ranges within the DIE overlap. - Verify that no ranges overlap with the ranges of a sibling. - Verify that children are completely contained in its (direct) parent's address range. (unless both are subprograms) Differential revision: https://reviews.llvm.org/D37696 llvm-svn: 313250
* [SelectionDAG] ComputeNumSignBits - cleanup ROTL/ROTR wrapping to match ↵Simon Pilgrim2017-09-141-3/+3
| | | | | | | | | | DAGCombine etc. Use RotAmt.urem(VTBits) instead of AND(RotAmt, VTBits - 1) TBH I don't expect non-power-of-2 types to be created, but it makes the logic clearer and matches what we do in other rotation combines. llvm-svn: 313245
* [PM/CGSCC] Teach the CGSCC pass manager components to gracefully handleChandler Carruth2017-09-141-1/+6
| | | | | | | | | | | | | | | | | | | | | invalidated SCCs even when we do not have an updated SCC to redirect towards. This comes up in a fairly subtle and surprising circumstance: we need to have a connected but internal node in the call graph which later becomes a disconnected island, and then gets deleted. All of this needs to happen mid-CGSCC walk. Because it is disconnected, we have no way of computing a new "current" SCC when it gets deleted. Instead, we need to explicitly check for a deleted "current" SCC and bail out of the current CGSCC step. This will bubble all the way up to the post-order walk and then resume correctly. I've included minimal tests for this bug. The specific behavior matches something we've seen in the wild with the new PM combined with ThinLTO and sample PGO, but I've not yet confirmed whether this is the only issue there. llvm-svn: 313242
* [LV] Fix maximum legal VF calculationAlon Kom2017-09-142-31/+22
| | | | | | | | | | | | | | | | This patch fixes pr34283, which exposed that the computation of maximum legal width for vectorization was wrong, because it relied on MaxInterleaveFactor to obtain the maximum stride used in the loop, however not all strided accesses in the loop have an interleave-group associated with them. Instead of recording the maximum stride in the loop, which can be over conservative (e.g. if the access with the maximum stride is not involved in the dependence limitation), this patch tracks the actual maximum legal width imposed by accesses that are involved in dependencies. Differential Revision: https://reviews.llvm.org/D37507 llvm-svn: 313237
* [XRay][CodeGen] Use the current function symbol as the associated symbol for ↵Dean Michael Berris2017-09-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the instrumentation map Summary: XRay had been assuming that the previous section is the "text" section of the function when lowering the instrumentation map. Unfortunately this is not a safe assumption, because we may be coming from lowering debug type information for the function being lowered. This fixes an issue with combining -gsplit-dwarf, -generate-type-units, -debug-compile and -fxray-instrument for sole member functions. When the split dwarf section is stripped, we're left with references from the xray_instr_map to the debug section. The change now uses the function's symbol instead of the previous section's start symbol. We found the bug while attempting to strip the split debug sections off an XRay-instrumented object file, which had a peculiar edge-case for single-function classes where the single function is being lowered. Because XRay had assocaited the instrumentation map for a function to the debug types section instead of the function's section, the objcopy call will fail due to the misplaced reference from the xray_instr_map section. Reviewers: pcc, dblaikie, echristo Subscribers: llvm-commits, aprantl Differential Revision: https://reviews.llvm.org/D37791 llvm-svn: 313233
* [mips] Recognise the triple used by Debian for MIPS n32 ABISimon Atanasyan2017-09-141-0/+2
| | | | | | | Triples like mips64-linux-gnuabin32 are documented in this article: https://wiki.debian.org/Multiarch/Tuples llvm-svn: 313231
* Revert "Invoke GetInlineCost for legality check before inline functions in ↵Vitaly Buka2017-09-141-37/+6
| | | | | | | | | | SampleProfileLoader." Patch introduced uninitialized value. This reverts commit r313195. llvm-svn: 313230
* Reland r313157, "ThinLTO: Correctly follow aliasee references when dead ↵Peter Collingbourne2017-09-142-17/+10
| | | | | | | | | | | stripping." which was reverted in r313222. This reland includes a fix for the LowerTypeTests pass so that it looks past aliases when determining which type identifiers are live. Differential Revision: https://reviews.llvm.org/D37842 llvm-svn: 313229
* [SLPVectorizer] Prefer auto over explicit type for VL0, NFCI.Dinar Temirbulatov2017-09-141-1/+1
| | | | llvm-svn: 313228
* Revert r313157 "ThinLTO: Correctly follow aliasee references when dead ↵Hans Wennborg2017-09-141-5/+8
| | | | | | | | | | | | | | | | | | | | | | stripping." This broke Chromium's CFI build; see crbug.com/765004. > We were previously handling aliases during dead stripping by adding > the aliased global's "original name" GUID to the worklist. This will > lead to incorrect behaviour if the global has local linkage because > the original name GUID will not correspond to the global's GUID in > the summary. > > Because an alias is just another name for the global that it > references, there is no need to mark the referenced global as used, > or to follow references from any other copies of the global. So all > we need to do is to follow references from the aliasee's summary > instead of the alias. > > Differential Revision: https://reviews.llvm.org/D37789 llvm-svn: 313222
* AMDGPU: Don't spill SP reg like a normal CSRMatt Arsenault2017-09-133-0/+16
| | | | llvm-svn: 313217
* [codeview] Fold FIXME into comment, there's nothing to do. NFCReid Kleckner2017-09-131-4/+4
| | | | llvm-svn: 313214
* Revert r312719 "[MachineCombiner] Update instruction depths incrementally ↵Hans Wennborg2017-09-132-90/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | for large BBs." This caused PR34596. > [MachineCombiner] Update instruction depths incrementally for large BBs. > > Summary: > For large basic blocks with lots of combinable instructions, the > MachineTraceMetrics computations in MachineCombiner can dominate the compile > time, as computing the trace information is quadratic in the number of > instructions in a BB and it's relevant successors/predecessors. > > In most cases, knowing the instruction depth should be enough to make > combination decisions. As we already iterate over all instructions in a basic > block, the instruction depth can be computed incrementally. This reduces the > cost of machine-combine drastically in cases where lots of instructions > are combined. The major drawback is that AFAIK, computing the critical path > length cannot be done incrementally. Therefore we only compute > instruction depths incrementally, for basic blocks with more > instructions than inc_threshold. The -machine-combiner-inc-threshold > option can be used to set the threshold and allows for easier > experimenting and checking if using incremental updates for all basic > blocks has any impact on the performance. > > Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn > > Reviewed By: fhahn > > Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits > > Differential Revision: https://reviews.llvm.org/D36619 llvm-svn: 313213
* Allow target to decide when to cluster loads/stores in mischedStanislav Mekhanoshin2017-09-135-8/+49
| | | | | | | | | | | | | | | | MachineScheduler when clustering loads or stores checks if base pointers point to the same memory. This check is done through comparison of base registers of two memory instructions. This works fine when instructions have separate offset operand. If they require a full calculated pointer such instructions can never be clustered according to such logic. Changed shouldClusterMemOps to accept base registers as well and let it decide what to do about it. Differential Revision: https://reviews.llvm.org/D37698 llvm-svn: 313208
* llvm-dwarfdump: automatically dump both regular and .dwo variant of sectionsAdrian Prantl2017-09-131-74/+91
| | | | | | | | | | | | Since users typically don't really care about the .dwo / non.dwo distinction, this patch makes it so dwarfdump --debug-<info,...> dumps .debug_info and (if available) also .debug_info.dwo. This simplifies the command line interface (I've removed all dwo-specific dump options) and makes the tool friendlier to use. Differential Revision: https://reviews.llvm.org/D37771 llvm-svn: 313207
* AMDGPU: Handle coldcc in more placesMatt Arsenault2017-09-131-0/+2
| | | | | | Missed in r312936 llvm-svn: 313205
* [codeview] VLAs and unsized arrays should use a size of zeroReid Kleckner2017-09-131-3/+4
| | | | | | | | | | | | | | | | | Previously we used a size of '1' for VLAs because we weren't sure what MSVC did. However, MSVC does support declaring an array without a size, for which it emits an array type with a size of zero. Clang emits the same DI metadata for VLAs and arrays without bound, so we would describe arrays without bound as having one element. This lead to Microsoft debuggers only printing a single element. Emitting a size of zero appears to cause these debuggers to search the symbol information to find a definition of the variable with accurate array bounds. Fixes http://crbug.com/763580 llvm-svn: 313203
* [ARM] Add more CPUs to host detectionEli Friedman2017-09-131-0/+3
| | | | | | | | | This returns "cortex-a73" for second-generation Kryo; not precisely correct, but close enough. Differential Revision: https://reviews.llvm.org/D37724 llvm-svn: 313200
* [Transforms] Fix some Clang-tidy modernize-use-using and Include What You ↵Eugene Zelenko2017-09-133-63/+133
| | | | | | Use warnings; other minor fixes (NFC). llvm-svn: 313198
* [RegAlloc] Keep a copy of live interval for the spilled vregs in ↵Wei Mi2017-09-131-24/+29
| | | | | | | | | | | | | HoistSpillHelper. This is to fix PR34502. After rL311401, the live range of spilled vreg will be cleared. HoistSpill need to use the live range of the original vreg before splitting to know the moving range of the spills. The patch saves a copy of live interval for the spilled vreg inside of HoistSpillHelper. Differential Revision: https://reviews.llvm.org/D37578 llvm-svn: 313197
* Invoke GetInlineCost for legality check before inline functions in ↵Dehao Chen2017-09-131-6/+37
| | | | | | | | | | | | | | | | SampleProfileLoader. Summary: SampleProfileLoader inlines hot functions if it is inlined in the profiled binary. However, the inline needs to be guarded by legality check, otherwise it could lead to correctness issues. Reviewers: eraman, davidxl Reviewed By: eraman Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37779 llvm-svn: 313195
* [CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; ↵Eugene Zelenko2017-09-1312-136/+248
| | | | | | other minor fixes (NFC). llvm-svn: 313194
* Mark static member functions as static in CodeViewDebugAdrian McCarthy2017-09-132-11/+14
| | | | | | | | | | | | | | Summary: To improve CodeView quality for static member functions, we need to make the static explicit. In addition to a small change in LLVM's CodeViewDebug to return the appropriate MethodKind, this requires a small change in Clang to note the staticness in the debug info metadata. Subscribers: aprantl, hiraditya Differential Revision: https://reviews.llvm.org/D37715 llvm-svn: 313192
* [Inliner] Add another way to compute full inline cost.Easwaran Raman2017-09-131-5/+5
| | | | | | | | | | | | | | | | Summary: Full inline cost is computed when -inline-cost-full is true or ORE is non-null. This patch adds another way to compute full inline cost by adding a field to InlineParams. This will be used by SampleProfileLoader to check legality of inlining a callee that it wants to inline. Reviewers: danielcdh, haicheng Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37819 llvm-svn: 313185
* [LV] Avoid computing the register usage for default VF. NFCAnna Thomas2017-09-131-2/+4
| | | | | | | | | | | These are changes to reduce redundant computations when calculating a feasible vectorization factor: 1. early return when target has no vector registers 2. don't compute register usage for the default VF. Suggested during review for D37702. llvm-svn: 313176
* Refactoring the stride 4 code in the X86interleavedaccess NFCMichael Zuckerman2017-09-131-34/+32
| | | | llvm-svn: 313166
* llvm-dwarfdump: support dumping UUIDs of Mach-O binaries.Adrian Prantl2017-09-133-5/+39
| | | | | | | This is a feature supported by Darwin dwarfdump. UUIDs are used to associate executables with their .dSYM bundles. llvm-svn: 313165
* Add options to dump PGO counts in text.Hiroshi Yamauchi2017-09-132-34/+55
| | | | | | | | | | | | | | | | | Summary: Added text options to -pgo-view-counts and -pgo-view-raw-counts that dump block frequency and branch probability info in text. This is useful when the graph is very large and complex (the dot command crashes, lines/edges too close to tell apart, hard to navigate without textual search) or simply when text is preferred. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37776 llvm-svn: 313159
* [ThinLTO] AliasSummary should not have any referencesTeresa Johnson2017-09-132-4/+3
| | | | | | | | | | | | Summary: References should only be on the aliasee. Reviewers: pcc Subscribers: llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D37814 llvm-svn: 313158
* ThinLTO: Correctly follow aliasee references when dead stripping.Peter Collingbourne2017-09-131-8/+5
| | | | | | | | | | | | | | | | | | We were previously handling aliases during dead stripping by adding the aliased global's "original name" GUID to the worklist. This will lead to incorrect behaviour if the global has local linkage because the original name GUID will not correspond to the global's GUID in the summary. Because an alias is just another name for the global that it references, there is no need to mark the referenced global as used, or to follow references from any other copies of the global. So all we need to do is to follow references from the aliasee's summary instead of the alias. Differential Revision: https://reviews.llvm.org/D37789 llvm-svn: 313157
* Convenience/safety fix for llvm::sys::Execute(And|No)WaitAlexander Kornienko2017-09-135-22/+22
| | | | | | | | | | | | | | | | | | | | Summary: Change the type of the Redirects parameter of llvm::sys::ExecuteAndWait, ExecuteNoWait and other APIs that wrap them from `const StringRef **` to `ArrayRef<Optional<StringRef>>`, which is safer and simplifies the use of these APIs (no more local StringRef variables just to get a pointer to). Corresponding clang changes will be posted as a separate patch. Reviewers: bkramer Reviewed By: bkramer Subscribers: vsk, llvm-commits Differential Revision: https://reviews.llvm.org/D37563 llvm-svn: 313155
* [ThinLTO] For SamplePGO, need to handle ICP targets consistently in thin linkTeresa Johnson2017-09-131-11/+29
| | | | | | | | | | | | | | | | | | | | | | Summary: SamplePGO indirect call profiles record the target as the original GUID for statics. The importer had special handling to map to the normal GUID in that case. The dead global analysis needs the same treatment or inconsistencies arise, resulting in linker unsats due to some dead symbols being exported and kept, leaving in references to other dead symbols that are removed. This can happen when a SamplePGO profile collected by one binary is used for a different binary, so the indirect call profiles may not accurately reflect live targets. Reviewers: danielcdh Subscribers: mehdi_amini, inglorion, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D37783 llvm-svn: 313151
* [mips] correct operand range for DINSM instructionPetar Jovanovic2017-09-131-1/+1
| | | | | | | | | | | | | This patch corrects the definition of the DINSM instruction. Specification for DINSM instruction for Mips64 says that size operand should be 2 <= size <= 64, but it is defined as uimm5_inssize_plus1 which gives range of 1 .. 32. Patch by Aleksandar Beserminji. Differential Revision: https://reviews.llvm.org/D37683 llvm-svn: 313149
* [MachineScheduler] Put SchedRegion in an anonymous namespace.Mikael Holmen2017-09-131-0/+2
| | | | | | | | | | | | | | | | Summary: It pollutes the global namespace otherwise. Patch by: Bevin Hansson Reviewers: jonpa Reviewed By: jonpa Subscribers: MatzeB, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D37555 llvm-svn: 313148
* [Power9] Add missing instructions: extswsli, popcntbStefan Pintilie2017-09-132-0/+22
| | | | | | | | Added the following P9 instructions: extswsli, extswsli., popcntb Differential Revision: https://reviews.llvm.org/D37342 llvm-svn: 313147
* [MachO] Prevent heap overflow when load command extends past EOFJonas Devlieghere2017-09-131-1/+4
| | | | | | | | | | | This patch fixes a heap-buffer-overflow when a malformed Mach-O has a load command who's size extends past the end of the binary. Fixes: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3225 Differential revision: https://reviews.llvm.org/D37439 llvm-svn: 313145
* [dwarfdump] Rename Brief to Verbose in DIDumpOptionsJonas Devlieghere2017-09-135-28/+29
| | | | | | | | | | | This patches renames "brief" to "verbose" in de DIDumpOptions and inverts the logic to match the new behavior where brief is the default. Changing the default value uncovered some bugs related to the DIDumpOptions not being propagated and have been fixed as well. Differential revision: https://reviews.llvm.org/D37745 llvm-svn: 313139
* [GlobalISel][X86] support G_FPEXT operation.Igor Breger2017-09-132-3/+15
| | | | | | | | | | | | | | Summary: Support G_FPEXT operation. Selection done via TableGen'erated code. Reviewers: zvi, guyblank, aymanmus, m_zuckerman Reviewed By: zvi Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34816 llvm-svn: 313135
* [X86] [PATCH] [intrinsics] Lowering X86 ABS intrinsics to IR. (llvm)Uriel Korach2017-09-132-19/+26
| | | | | | | | This patch, together with a matching clang patch (https://reviews.llvm.org/D37694), implements the lowering of X86 ABS intrinsics to IR. differential revision: https://reviews.llvm.org/D37693. llvm-svn: 313134
* [X86] Adding X86 Processor FamiliesMohammed Agabaria2017-09-133-5/+59
| | | | | | | | | Adding x86 Processor families to initialize several uArch properties (based on the family) This patch shows how gather cost can be initialized based on the proc. family Differential Revision: https://reviews.llvm.org/D35348 llvm-svn: 313132
* [X86] Make sure we emit a SUBREG_TO_REG after the MOV32ri when creating a ↵Craig Topper2017-09-131-2/+9
| | | | | | | | BEXTR64rr instruction from a shift/and pair. Fixes PR34589. llvm-svn: 313126
* [X86 CodeGen] Optimization of ZeroExtendLoad for v2i8 vectorElena Demikhovsky2017-09-131-0/+1
| | | | | | Load with zero-extend and sign-extend from v2i8 to v2i32 is "Legal" since SSE4.1 and may be performed using PMOVZXBD , PMOVSXBD instructions. llvm-svn: 313121
* [LV] Fix PR34523 - avoid generating redundant selectsAyal Zaks2017-09-131-3/+3
| | | | | | | | | | | | | | | | | When converting a PHI into a series of 'select' instructions to combine the incoming values together according their edge masks, initialize the first value to the incoming value In0 of the first predecessor, instead of generating a redundant assignment 'select(Cond[0], In0, In0)'. The latter fails when the Cond[0] mask is null, representing a full mask, which can happen only when there's a single incoming value. No functional changes intended nor expected other than surviving null Cond[0]'s. This fix follows D35725, which introduced using null to represent full masks. Differential Revision: https://reviews.llvm.org/D37619 llvm-svn: 313119
* [GVNHoist] Factor out reachability to search for anticipable instructions ↵Aditya Kumar2017-09-131-288/+418
| | | | | | | | | | | | | | | | | | | | | | | quickly Factor out the reachability such that multiple queries to find reachability of values are fast. This is based on finding the ANTIC points in the CFG which do not change during hoisting. The ANTIC points are basically the dominance-frontiers in the inverse graph. So we introduce a data structure (CHI nodes) to keep track of values flowing out of a basic block. We only do this for values with multiple occurrences in the function as they are the potential hoistable candidates. This patch allows us to hoist instructions to a basic block with >2 successors, as well as deal with infinite loops in a trivial way. Relevant test cases are added to show the functionality as well as regression fixes from PR32821. Regression from previous GVNHoist: We do not hoist fully redundant expressions because fully redundant expressions are already handled by NewGVN Differential Revision: https://reviews.llvm.org/D35918 Reviewers: dberlin, sebpop, gberry, llvm-svn: 313116
* [X86] Use isUInt<32> to simplify some code. NFCCraig Topper2017-09-131-1/+1
| | | | llvm-svn: 313112
* [ARC] Prepare the implementation of relocation for LLDLeslie Zhai2017-09-132-0/+11
| | | | | | | | | | | | Reviewers: ruiu, kparzysz, petecoup, rafael Reviewed By: kparzysz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37556 llvm-svn: 313109
* [InstCombine] Add a flag to disable LowerDbgDeclareReid Kleckner2017-09-131-1/+30
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This should improve optimized debug info for address-taken variables at the cost of inaccurate debug info in some situations. We patched this into clang and deployed this change to Chromium developers, and this significantly improved debuggability of optimized code. The long-term solution to PR34136 seems more and more like it's going to take a while, so I would like to commit this change under a flag so that it can be used as a stop-gap measure. This flag should really help so for C++ aggregates like std::string and std::vector, which are typically address-taken, even after inlining, and cannot be SROA-ed. Reviewers: aprantl, dblaikie, probinson, dberlin Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D36596 llvm-svn: 313108
OpenPOWER on IntegriCloud