path: root/llvm/lib
...
* [SCCP] Manually fold branches on undef. (Davide Italiano, 2017-12-23, 1 file, -3/+26)
  This code was originally removed and replaced with an assertion because it was believed to be unnecessary. It turns out there was simply no test coverage for this case, and the constant folder doesn't yet know about patterns like `br undef %label1, %label2`. Presumably at some point the constant folder might learn about these patterns, but that is a broader change. A testcase will be added to make sure this doesn't regress again in the future.
  Fixes PR35723.
  llvm-svn: 321402
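  A simplified sketch of the kind of manual fold this refers to (illustrative only; the real SCCP code also has to update PHI nodes and its solver state):

      #include "llvm/IR/Constants.h"
      #include "llvm/IR/Instructions.h"
      using namespace llvm;

      // Replace a conditional branch on undef with an unconditional branch
      // to one successor (the false edge here, an arbitrary but fixed choice).
      static void foldBranchOnUndef(BranchInst *BI) {
        if (!BI->isConditional() || !isa<UndefValue>(BI->getCondition()))
          return;
        BasicBlock *Keep = BI->getSuccessor(1);
        BranchInst::Create(Keep, BI); // insert the new branch before BI
        BI->eraseFromParent();
      }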
* [X86] Add default InstrItinClass to PseudoI (Simon Pilgrim, 2017-12-23, 1 file, -2/+3)
  This will be used to help tidy up existing pseudos that we've added scheduling info to.
  llvm-svn: 321401
* [X86] Pass the right VT to the getZeroExtendInReg introduced in r321398 (Craig Topper, 2017-12-23, 1 file, -1/+1)
  Apparently we don't have tests for this, which I didn't realize before. I'll try to fix that, but I wanted to fix the obvious bug first.
  llvm-svn: 321399
* [X86] Use SelectionDAG::getZeroExtendInReg instead of implementing it manually. (Craig Topper, 2017-12-23, 1 file, -9/+3)
  llvm-svn: 321398
* [SelectionDAG][X86] Don't use ->getValueType(0) after a call to getOperand to get the type of the operand. (Craig Topper, 2017-12-23, 4 files, -14/+14)
  getOperand returns an SDValue that contains the node and the result number. There is no guarantee that the result number is 0. By using the -> operator we are calling SDNode::getValueType rather than SDValue::getValueType. This requires supplying a result number, and we shouldn't assume it was 0.
  I don't have a test case. I just noticed this while cleaning up some other code and saw that it occurred in other places.
  llvm-svn: 321397
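  For context, a minimal sketch of the difference (illustrative only, not code from this commit):

      #include "llvm/CodeGen/SelectionDAGNodes.h"
      using namespace llvm;

      EVT typeOfOperand(SDNode *N, unsigned OpNo) {
        SDValue Op = N->getOperand(OpNo); // a (node, result number) pair
        // Correct: the type of the specific result that Op refers to.
        return Op.getValueType();
        // Risky: Op->getValueType(0) forwards to SDNode::getValueType and
        // always reads result 0, regardless of Op.getResNo().
      }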
* [DAG] Add missing case check from findBaseOffset merge from r321389. (Nirav Dave, 2017-12-22, 1 file, -2/+4)
  llvm-svn: 321391
* Integrate findBaseOffset address analyses to BaseIndexOffset. NFCI. (Nirav Dave, 2017-12-22, 2 files, -70/+27)
  BaseIndexOffset supersedes the findBaseOffset analysis, save only for Constant Pool addresses. Migrate the analysis to BaseIndexOffset.
  Relanding after correcting the base address matching check.
  llvm-svn: 321389
* [WebAssembly] MC: Fix for address taken aliases (Sam Clegg, 2017-12-22, 1 file, -39/+30)
  Previously, taking the address of an alias would result in: "Symbol not found in table index space".
  Increase test coverage for weak aliases. This code should be more efficient too, as it avoids building the `IsAddressTaken` set.
  Differential Revision: https://reviews.llvm.org/D41510
  llvm-svn: 321384
* [MemorySSA] Allow reordering of loads that alias in the presence of volatile loads. (Alina Sbirlea, 2017-12-22, 1 file, -29/+10)
  Summary: Make MemorySSA allow reordering of two loads that may alias, when one is volatile. This makes MemorySSA less conservative, behaving the same as the AliasSetTracker. For more context, see D16875.
  LLVM language reference: "The optimizers must not change the number of volatile operations or change their order of execution relative to other volatile operations. The optimizers may change the order of volatile operations relative to non-volatile operations. This is not Java’s “volatile” and has no cross-thread synchronization behavior."
  Reviewers: george.burgess.iv, dberlin
  Subscribers: sanjoy, reames, hfinkel, llvm-commits, Prazek
  Differential Revision: https://reviews.llvm.org/D41525
  llvm-svn: 321382
* Revert "[DAG] Integrate findBaseOffset address analyses to BaseIndexOffset. ↵Nirav Dave2017-12-222-27/+70
| | | | | | | | | | NFCI." which was causing miscompilations in for some test-suite components. This reverts commit 3e9de9ff0f3162920a2a3cba51c7dc14b54b4d16. llvm-svn: 321380
* [SimplifyCFG] Don't do if-conversion if there is a long dependence chain (Guozhi Wei, 2017-12-22, 2 files, -0/+183)
  If, after if-conversion, most of the instructions in the new BB form a long and slow dependence chain, the result may be slower than cmp/branch even if the branch has a high miss rate, because the control dependence is transformed into a data dependence. A control dependence can be speculated, so with a branch the second part can execute in parallel with the first part on a modern OOO processor.
  This patch checks for such a long dependence chain and gives up on if-conversion if one is found.
  Differential Revision: https://reviews.llvm.org/D39352
  llvm-svn: 321377
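  A minimal illustration of the tradeoff described above (hypothetical code, not taken from the patch):

      // Branchy form: the branch on `cond` can be predicted, so the dependence
      // on `cond` (and on the untaken side) is broken by speculation and the
      // load can start early.
      int branchy(bool cond, int a, int b, const int *p) {
        int x;
        if (cond)
          x = a * 3 + 1;
        else
          x = b;
        return p[x];
      }

      // If-converted form: the select makes `x` data-dependent on cond, a and
      // b, so the load cannot issue until all of them are computed.
      int ifConverted(bool cond, int a, int b, const int *p) {
        int x = cond ? (a * 3 + 1) : b;
        return p[x];
      }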
* [ThinLTO][CachePruning] explicitly disable pruning (Ben Dunbobbin, 2017-12-22, 1 file, -5/+5)
  In https://reviews.llvm.org/rL321077 and https://reviews.llvm.org/D41231 I fixed a regression in the C API which prevented the pruning from being *effectively* disabled. However this approach, helpfully recommended by @labath, is cleaner. It is also nice to remove the weasel words about "effectively disabling" from the API comments.
  Differential Revision: https://reviews.llvm.org/D41497
  llvm-svn: 321376
* (Re-landing) Expose a TargetMachine::getTargetTransformInfo function (Sanjoy Das, 2017-12-22, 29 files, -85/+72)
  Re-land r321234. It had to be reverted because it broke the shared library build. The shared library build broke because there was a missing LLVMBuild dependency from lib/Passes (which calls TargetMachine::getTargetIRAnalysis) to lib/Target. As far as I can tell, this problem was always there but was somehow masked before (perhaps because TargetMachine::getTargetIRAnalysis was a virtual function).
  Original commit message:
  This makes the TargetMachine interface a bit simpler. We still need the std::function in TargetIRAnalysis to avoid having to add a dependency from Analysis to Target.
  See discussion: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html
  I avoided adding all of the backend owners to this review since the change is simple, but let me know if you feel differently about this.
  Reviewers: echristo, MatzeB, hfinkel
  Reviewed By: hfinkel
  Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits
  Differential Revision: https://reviews.llvm.org/D41464
  llvm-svn: 321375
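  A rough sketch of how a client that already has a configured TargetMachine might use the new hook (illustrative; the helper and its cost check are my own, not part of the patch):

      #include "llvm/Analysis/TargetTransformInfo.h"
      #include "llvm/IR/Function.h"
      #include "llvm/IR/Instruction.h"
      #include "llvm/Target/TargetMachine.h"
      using namespace llvm;

      // Query a user cost through TTI obtained directly from the
      // TargetMachine, instead of going through a TargetIRAnalysis callback.
      bool isFreeUser(TargetMachine &TM, Function &F, const Instruction *I) {
        TargetTransformInfo TTI = TM.getTargetTransformInfo(F);
        return TTI.getUserCost(I) == TargetTransformInfo::TCC_Free;
      }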
* [AMDGPU][MC] Corrected handling of negative expressions (Dmitry Preobrazhensky, 2017-12-22, 1 file, -1/+6)
  See bug 35716: https://bugs.llvm.org/show_bug.cgi?id=35716
  Reviewers: artem.tamazov, arsenm
  Differential Revision: https://reviews.llvm.org/D41488
  llvm-svn: 321372
* [SelectionDAG] Reverse the order of operands in the ISD::ADD created by TargetLowering::getVectorElementPointer so that the FrameIndex is on the left. (Craig Topper, 2017-12-22, 1 file, -1/+1)
  This seems to improve X86's ability to match this into an address computation. Otherwise the other operand gets assigned to the base register and the stack pointer + frame index ends up in the index register. But index registers can't encode ESP/RSP so we end up having to move it into another register to meet the constraint. I could try to improve the address matcher in X86, but swapping the producer seemed easier. Several other places already have the operands in this order so this is at least consistent.
  llvm-svn: 321370
* [X86] When lowering insert_vector_elt/extract_vector_elt of vXi1 with a non-constant index just use either a 128-bit type or the vXi8 type with the correct number of elements. (Craig Topper, 2017-12-22, 1 file, -8/+6)
  Despite what the comment said, there isn't better codegen for 512-bit vectors. The 128/256/512 bit implementation just stores to memory and loads an element. There's no advantage to doing that with a larger size. In fact in many cases it causes a stack realignment and generates worse code.
  llvm-svn: 321369
* [X86] Improve the printing of address mode during isel matching. (Craig Topper, 2017-12-22, 1 file, -4/+5)
  Fix some inconsistent new line behavior and only print the FrameIndex when the address mode is a FrameIndexBase addressing mode.
  llvm-svn: 321368
* [AMDGPU][MC] Corrected parsing of optional operands for ds_swizzle_b32 (Dmitry Preobrazhensky, 2017-12-22, 1 file, -1/+3)
  See bug 35645: https://bugs.llvm.org/show_bug.cgi?id=35645
  Reviewers: artem.tamazov, arsenm
  Differential Revision: https://reviews.llvm.org/D41186
  llvm-svn: 321367
* [InlineCost] Find more free binary operations (Haicheng Wu, 2017-12-22, 1 file, -40/+18)
  Currently, the inline cost model considers a binary operator as free only if both its operands are constants. Some simple cases are missed, such as a + 0 and a - a. This patch modifies visitBinaryOperator() to call SimplifyBinOp() without going through simplifyInstruction(), to get rid of the constant restriction. Thus, visitAnd() and visitOr() are no longer needed.
  Differential Revision: https://reviews.llvm.org/D41494
  llvm-svn: 321366
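  Hypothetical source-level examples of operations the new check can treat as free (the values here are made up for illustration):

      int examples(int a) {
        int x = a + 0; // simplifies to a
        int y = a - a; // simplifies to 0
        int z = a & a; // simplifies to a
        // None of these have two constant operands, which is why the old
        // "both operands are constants" rule missed them.
        return x + y + z;
      }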
* [DAG] Integrate findBaseOffset address analyses to BaseIndexOffset. NFCI. (Nirav Dave, 2017-12-22, 2 files, -70/+27)
  BaseIndexOffset supersedes the findBaseOffset analysis, save only for Constant Pool addresses. Migrate the analysis to BaseIndexOffset.
  llvm-svn: 321364
* [AMDGPU][MC] Added support of 256- and 512-bit tuples of ttmp registers (Dmitry Preobrazhensky, 2017-12-22, 7 files, -43/+197)
  See bug 35561: https://bugs.llvm.org/show_bug.cgi?id=35561
  This patch also affects the implementation of SGPR and VGPR registers, though the changes are cosmetic.
  Reviewers: artem.tamazov, arsenm
  Differential Revision: https://reviews.llvm.org/D41437
  llvm-svn: 321359
* [ARM GlobalISel] Support G_INTTOPTR and G_PTRTOINT for s32 (Diana Picus, 2017-12-22, 3 files, -0/+30)
  Mark conversions between pointers and 32-bit scalars as legal, map them to the GPR and select to a simple COPY.
  llvm-svn: 321356
* [ARM GlobalISel] Support pointer constants (Diana Picus, 2017-12-22, 2 files, -1/+11)
  Pointer constants are pretty rare, since we usually represent them as integer constants and then cast to pointer. One notable exception is the null pointer constant, which is represented directly as a G_CONSTANT 0 with pointer type. Mark it as legal and make sure it is selected like any other integer constant.
  llvm-svn: 321354
* [DAGCombine] Revert r321259 (Sam Parker, 2017-12-22, 1 file, -26/+0)
  This reverts r321259 ("Improve ReduceLoadWidth for SRL"); the patch is causing an issue on the PPC64 BE sanitizer.
  llvm-svn: 321349
* Rewrite the cached map used for locating the most precise DIE among inlined subroutines for a given address. (Chandler Carruth, 2017-12-22, 1 file, -26/+360)
  This is essentially the hot path of llvm-symbolizer when extracting inlined frames during symbolization. Previously, we would read every subprogram and every inlined subroutine, building a std::map across the entire PC space to the best DIE, and then do only a handful of queries as we symbolized a backtrace. A huge fraction of the time was spent building the map itself.
  This patch changes it to a two-level system. First, we just build a map from PC-interval to DWARF subprograms. These are required to be disjoint and so constructing this is pretty easy. Second, we build a map *just* for the inlined subroutines within the subprogram containing the query address. This allows us to look at far fewer DIEs and build a *much* smaller set of cached maps in the llvm-symbolizer case where only a few addresses get symbolized during the entire run.
  It also builds both interval maps in a very different way. It constructs a single flat vector of pairs that maps from offset -> index. The indices point into collections of DIE objects, but can also be "tombstones" (-1) to mark gaps. In the case of subprograms, this mostly just simplifies the data structure a bit. For inlined subroutines, because we carefully split them as we build the map, we end up in many cases having no holes and not having to store both start and stop offsets.
  Finally, the PC ranges for the inlined subroutines are compressed into 32 bits by making them relative to the base PC of the outer subprogram. This means that if you have a single function body with over 2gb of executable code in it, we will stop mapping addresses past the first 2gb of that function into inlined subroutines and just give you the subprogram. This doesn't seem like a problem. ;]
  All of this combines to make llvm-symbolizer *well* over 2x faster for symbolizing backtraces out of LLVM's unittests. Death-test heavy unit tests are running >2x faster. I'm still going to look at completely disabling symbolization there, but figured while I had a good benchmark we should make symbolization a bit better.
  Sadly, the logic to build the flat interval map for the inlined subroutines is fairly complex. I'm not super happy about this and welcome any simplifying suggestions. Huge thanks to Dave Blaikie who helped walk me through the various things I needed to do in DWARF to make this work.
  Differential Revision: https://reviews.llvm.org/D40987
  llvm-svn: 321345
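  A rough sketch of the "flat vector of (offset -> index) pairs with tombstones" lookup described above (illustrative only, not the actual DWARF code):

      #include <algorithm>
      #include <cstdint>
      #include <iterator>
      #include <utility>
      #include <vector>

      // Each entry is live from its offset up to the next entry's offset.
      // An index of -1 is a tombstone marking a gap with no DIE.
      struct FlatIntervalMap {
        std::vector<std::pair<uint64_t, int64_t>> Entries; // sorted by offset

        int64_t lookup(uint64_t Addr) const {
          auto It = std::upper_bound(
              Entries.begin(), Entries.end(), Addr,
              [](uint64_t A, const std::pair<uint64_t, int64_t> &E) {
                return A < E.first;
              });
          if (It == Entries.begin())
            return -1;                    // before the first interval
          return std::prev(It)->second;   // may be a tombstone (-1)
        }
      };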
* [X86] Add missing initialization for the HasPREFETCHWT1 subtarget variable. (Craig Topper, 2017-12-22, 1 file, -0/+1)
  llvm-svn: 321340
* [X86] Enable PRFCHW feature on KNL/KNM and all CPUs inherited from Broadwell. (Craig Topper, 2017-12-22, 1 file, -2/+4)
  llvm-svn: 321336
* [X86] Add prefetchwt1 instruction and overhaul priorities and isel enabling for prefetch instructions. (Craig Topper, 2017-12-22, 6 files, -7/+33)
  Previously prefetch was only considered legal if sse was enabled, but it should be supported with 3dnow as well.
  The prfchw flag now implies that at least some form of prefetch without the write hint is available, either the sse or 3dnow version. This is true even if 3dnow and sse are explicitly disabled. Similarly the prefetchwt1 feature implies availability of prefetchw and the prefetcht0/1/2/nta instructions. This way we can support _MM_HINT_ET0 using prefetchw and _MM_HINT_ET1 with prefetchwt1. And it's assumed that if we have levels for the write hint we would have levels for the non-write hint, which is why we enable the sse prefetch instructions.
  I believe this behavior is consistent with gcc. I've updated the prefetch.ll to test all of these combinations.
  llvm-svn: 321335
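  A hedged usage example of the write-hint prefetches mentioned above (assumes headers that define _MM_HINT_ET0/_MM_HINT_ET1, as recent clang and gcc do, and that the relevant target features such as prfchw/prefetchwt1 are enabled):

      #include <xmmintrin.h>

      void warm(const char *p) {
        _mm_prefetch(p, _MM_HINT_T0);  // read hint: prefetcht0
        _mm_prefetch(p, _MM_HINT_ET0); // write hint: expected to lower to prefetchw
        _mm_prefetch(p, _MM_HINT_ET1); // write hint: expected to lower to prefetchwt1
      }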
* [X86] Use SIGN_EXTEND to implement ANY_EXTEND from vXi1. (Craig Topper, 2017-12-22, 1 file, -15/+15)
  llvm-svn: 321334
* [Inliner] Restrict soft-float inlining penalty. (Eli Friedman, 2017-12-22, 3 files, -32/+23)
  The penalty is currently getting applied in a bunch of places where it doesn't make sense, like bitcasts (which are free) and calls (which were getting the call penalty applied twice). Instead, just apply the penalty to binary operators and floating-point casts.
  While I'm here, also fix getFPOpCost() to do the right thing in more cases, so we don't have to dig into function attributes.
  Differential Revision: https://reviews.llvm.org/D41522
  llvm-svn: 321332
* Add hasProfileData() to check if a function has profile data. NFC. (Easwaran Raman, 2017-12-22, 9 files, -16/+15)
  Summary: This replaces calls to getEntryCount().hasValue() with hasProfileData(), which does the same thing. This refactoring is useful to do before adding synthetic function entry counts, but it is also a useful cleanup IMO even otherwise. I have used hasProfileData instead of hasRealProfileData, as David had earlier suggested, since I think profile implies "real" and I use the phrase "synthetic entry count" and not "synthetic profile count", but I am fine calling it hasRealProfileData if you prefer.
  Reviewers: davidxl, silvas
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D41461
  llvm-svn: 321331
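  The refactoring in a nutshell (a minimal sketch, assuming an llvm::Function &F):

      #include "llvm/IR/Function.h"
      using namespace llvm;

      bool shouldUseProfile(const Function &F) {
        // Before this change, callers wrote: F.getEntryCount().hasValue()
        // The new helper states the intent directly.
        return F.hasProfileData();
      }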
* [DWARF] Fix formatting bug with r321295. This fixes a MIPS buildbot failure. (Wolfgang Pieb, 2017-12-22, 1 file, -1/+1)
  llvm-svn: 321330
* [X86] Use SIGN_EXTEND rather than ZERO_EXTEND for lowering extract_vector_elt from vXi1 with a non-const index. (Craig Topper, 2017-12-21, 1 file, -2/+2)
  We have a better range of instructions we can use if we can fill with the i1 value rather than zeroing.
  llvm-svn: 321315
* [ModRefInfo] Add must alias info to ModRefInfo. (Alina Sbirlea, 2017-12-21, 5 files, -27/+166)
  Summary: Add an additional bit to ModRefInfo, ModRefInfo::Must, to be cleared for known must aliases. Shift existing Mod/Ref/ModRef values to include an additional most significant bit. Update wrappers that modify ModRefInfo values to reflect the change.
  Notes:
  * ModRefInfo::Must is almost entirely cleared in the AAResults methods; the remaining changes are trying to preserve it.
  * Only some small changes to make custom AA passes set ModRefInfo::Must (BasicAA).
  * GlobalsModRef already declares a bit whose meaning overlaps with the most significant bit in ModRefInfo (MayReadAnyGlobal). No changes to shift the value of MayReadAnyGlobal (see AlignedMap). FunctionInfo.getModRef() adjusts the most significant bit so correctness is preserved, but the Must info is lost.
  * There are cases where ModRefInfo::Must is not set, e.g. 2 calls that only read will return ModRefInfo::NoModRef, though they may read from exactly the same location.
  Reviewers: dberlin, hfinkel, george.burgess.iv
  Subscribers: llvm-commits, sanjoy
  Differential Revision: https://reviews.llvm.org/D38862
  llvm-svn: 321309
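  A deliberately simplified sketch of the encoding idea (a toy enum of my own, not the actual LLVM definition):

      #include <cstdint>

      // Toy version: Mod/Ref live in the low bits; a separate "may only" bit
      // is set by default and cleared only when the access is a known must
      // alias.
      enum ToyModRef : uint8_t {
        TMR_NoModRef = 0,
        TMR_Ref = 1,
        TMR_Mod = 2,
        TMR_ModRef = TMR_Ref | TMR_Mod,
        TMR_MayOnly = 4, // cleared => must-alias information is known
      };

      inline ToyModRef clearMust(ToyModRef MR) {
        return ToyModRef(MR | TMR_MayOnly); // drop to "may" information
      }
      inline bool hasMustInfo(ToyModRef MR) {
        return (MR & TMR_MayOnly) == 0;
      }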
* [X86] When lowering truncates to vXi1, don't sign extend i16/i8 types to 512-bit if we have VLX. (Craig Topper, 2017-12-21, 1 file, -1/+2)
  This should only affect what we do for v8i16. Previously we went to v8i64, but if we have VLX we only need v8i32. This prevents an unnecessary zmm usage.
  llvm-svn: 321303
* [DWARF v5] Rework of string offsets table reader (Wolfgang Pieb, 2017-12-21, 2 files, -66/+206)
  Reorganizes the DWARF consumer to derive the string offsets table contribution's format from the contribution header instead of (incorrectly) from the unit's format.
  Reviewers: JDevliegehere, aprantl
  Differential Revision: https://reviews.llvm.org/D41146
  llvm-svn: 321295
* [X86] Promote v8i1 shuffles to v8i32 instead of v8i64 if we have VLX. (Craig Topper, 2017-12-21, 1 file, -1/+3)
  We should have equally good shuffle options for v8i32 with VLX. This was spotted during my attempts to remove 512-bit vectors from SKX.
  We still use 512-bits for v16i1, v32i1, and v64i1. I'm less sure we can handle those well with narrower vectors. i32 and i64 element sizes get the best shuffle support.
  llvm-svn: 321291
* [X86][SSE] Split large PAVGB/PAVGW vectors to legal widths (Simon Pilgrim, 2017-12-21, 1 file, -15/+31)
  Patch to allow detectAVGPattern to handle vectors larger than the legal size (128 SSE2, 256 AVX2, 512 AVX512BW), splitting the vectors accordingly.
  Differential Revision: https://reviews.llvm.org/D41440
  llvm-svn: 321288
* [YAML] Fix UTF-8 handling (Francis Visoiu Mistrih, 2017-12-21, 1 file, -1/+6)
  Previous YAML quoting patches broke UTF-8 printing in YAML: see https://reviews.llvm.org/D41290#961801.
  Differential Revision: https://reviews.llvm.org/D41490
  llvm-svn: 321283
* [DAGCombiner] Remove (xor (xor x, c1), c2) -> (xor x, (xor c1, c2)) fold. NFCI. (Simon Pilgrim, 2017-12-21, 1 file, -15/+0)
  More general cases are already handled by constant canonicalization and then the ReassociateOps call at line 5327.
  llvm-svn: 321280
* [DAGCombiner] Generalize (or (and X, c1), c2) -> (and (or X, c2), c1|c2) combine to work on non-splat vectors (Simon Pilgrim, 2017-12-21, 1 file, -10/+10)
  The knownbits_mask_or_shuffle_uitofp change is interesting - shuffle combines manage to kick in, removing the AND constant mask load. For targets with fast-variable-shuffle this should reduce further to VPOR+VPSHUFB+VCVTDQ2PS.
  llvm-svn: 321279
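  A quick sanity check of the underlying bitwise identity, on made-up mask values:

      #include <cassert>
      #include <cstdint>

      int main() {
        const uint32_t C1 = 0x00FF00FF, C2 = 0x0F0F0000;
        for (uint32_t X : {0x00000000u, 0x12345678u, 0xDEADBEEFu, 0xFFFFFFFFu}) {
          // (or (and X, c1), c2) == (and (or X, c2), c1|c2)
          assert(((X & C1) | C2) == ((X | C2) & (C1 | C2)));
        }
        return 0;
      }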
* [PowerPC] Fix parest build failure in SPEC2017. (Tony Jiang, 2017-12-21, 1 file, -5/+6)
  The build failure was caused by an assertion in pre-legalization DAGCombine:
    Combining: t6: ppcf128 = uint_to_fp t5
    ... into: t20: f32 = PPCISD::FCFIDUS t19
  which is clearly wrong since ppcf128 is definitely a different type from f32 and we cannot change the node value type when doing DAGCombine.
  The fix is to not handle ppc_fp128 or i1 conversions in PPCTargetLowering::combineFPToIntToFP and leave it to downstream to legalize it and expand it to small legal types.
  Differential Revision: https://reviews.llvm.org/D41411
  llvm-svn: 321276
* [DAGCombiner] Generalize (and (or x, C), D) -> D iff (C & D) == D combine to work on non-splat vectors (Simon Pilgrim, 2017-12-21, 1 file, -4/+6)
  llvm-svn: 321275
* [DAGCombine] Improve ReduceLoadWidth for SRL (Sam Parker, 2017-12-21, 1 file, -0/+26)
  If the SRL node is only used by an AND, we may be able to set the ExtVT to the width of the mask, making the AND redundant. To support this, another check has been added in isLegalNarrowLoad which queries whether the load is valid.
  Differential Revision: https://reviews.llvm.org/D41350
  llvm-svn: 321259
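  An illustrative source-level view of the pattern (hypothetical code; little-endian, and it ignores strict-aliasing concerns): a load feeding an SRL whose only user is an AND by a narrow mask can become a narrow extending load, at which point the AND is redundant.

      #include <cstdint>

      // Before the combine (conceptually): 32-bit load, shift right 16, mask.
      uint32_t hi16(const uint32_t *p) {
        return (*p >> 16) & 0xFFFF;
      }

      // What the narrowed DAG amounts to: a 16-bit zero-extending load from
      // offset +2, with no shift and no mask.
      uint32_t hi16_narrow(const uint32_t *p) {
        return *(reinterpret_cast<const uint16_t *>(p) + 1);
      }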
* [Support] Remove MemoryBuffer::getNewUninitMemBuffer (Pavel Labath, 2017-12-21, 1 file, -8/+3)
  There is nothing useful that can be done with a read-only uninitialized buffer without const_casting its contents to initialize it. A better solution is to obtain a writable buffer (WritableMemoryBuffer::getNewUninitMemBuffer), and then convert it to a read-only buffer after initialization. All callers of this function have already been updated to do this, so this function is now unused.
  llvm-svn: 321257
* [ARM] Armv8-R DFB instruction (Sam Parker, 2017-12-21, 5 files, -5/+21)
  Implement MC support for the Armv8-R 'Data Full Barrier' instruction.
  Differential Revision: https://reviews.llvm.org/D41430
  llvm-svn: 321256
* [X86] Use PSHUFB for v32i16 shuffles before falling back to VPERMW/VPERMI2W. (Craig Topper, 2017-12-21, 1 file, -0/+4)
  PSHUFB has the ability to implicitly zero elements, which VPERMI2W can't do. So give it a chance to be used first.
  llvm-svn: 321251
* [X86] Use VPERMI2B for v16i8 shuffles if we have VBMI+VLX and would have otherwise used two PSHUFBs ORed together. (Craig Topper, 2017-12-21, 1 file, -13/+17)
  llvm-svn: 321249
* [X86] Use VPERMB/VPERMI2B for v32i8 shuffle lowering if VBMI and VLX are supported. (Craig Topper, 2017-12-21, 1 file, -0/+4)
  llvm-svn: 321248
* [WebAssembly] Remove unneeded sub-directory (Sam Clegg, 2017-12-21, 2 files, -2/+2)
  This is the only wasm def file (and likely will be for the foreseeable future), so there is no need for a sub-directory.
  Differential Revision: https://reviews.llvm.org/D41476
  llvm-svn: 321246