summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [XRay] Add CPU ID in Custom Event FDR RecordsDean Michael Berris2018-11-015-14/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change cuts across compiler-rt and llvm, to increment the FDR log version number to 4, and include the CPU ID in the custom event records. This is a step towards allowing us to change the `llvm::xray::Trace` object to start representing both custom and typed events in the stream of records. Follow-on changes will allow us to change the kinds of records we're presenting in the stream of traces, to incorporate the data in custom/typed events. A follow-on change will handle the typed event case, where it may not fit within the 15-byte buffer for metadata records. This work is part of the larger effort to enable writing analysis and processing tools using a common in-memory representation of the events found in traces. The work will focus on porting existing tools in LLVM to use the common representation and informing the design of a library/framework for expressing trace event analysis as C++ programs. Reviewers: mboerger, eizan Subscribers: hiraditya, mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D53920 llvm-svn: 345798
* [WebAssembly] Lower vselectThomas Lively2018-11-011-0/+9
| | | | | | | | | | Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D53630 llvm-svn: 345797
* [WebAssembly] Process p2align operands for SIMD loads and storesThomas Lively2018-10-311-0/+12
| | | | | | | | | | Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D53886 llvm-svn: 345795
* [WebAssembly] Handle vector IMPLICIT_DEFs.Thomas Lively2018-10-311-0/+5
| | | | | | | | | | | | | | Summary: Also reduce the test case for implicit defs and test it with all register classes. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D53855 llvm-svn: 345794
* [VFS] Add support for "no_push" to VFS recursive iterators.Jonas Devlieghere2018-10-311-12/+17
| | | | | | | | | | | The "regular" file system has a useful feature that makes it possible to stop recursing when using the recursive directory iterators. This functionality was missing for the VFS recursive iterator and this patch adds that. Differential revision: https://reviews.llvm.org/D53465 llvm-svn: 345793
* [COFF, ARM64] Implement Intrinsic.sponentry for AArch64Mandeep Singh Grang2018-10-316-0/+35
| | | | | | | | | | | | | | Summary: This patch adds Intrinsic.sponentry. This intrinsic is required to correctly support setjmp for AArch64 Windows platform. Reviewers: mgrang, TomTan, rnk, compnerd, mstorsjo, efriedma Reviewed By: efriedma Subscribers: majnemer, chrib, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D53673 llvm-svn: 345791
* [IR] Allow increasing the alignment of dso-local globals.Eli Friedman2018-10-311-1/+1
| | | | | | | | | I think this is the actual important property; the previous visibility check was an approximation. Differential Revision: https://reviews.llvm.org/D53852 llvm-svn: 345790
* [AArch64] Sort switch cases (NFC)Evandro Menezes2018-10-311-20/+23
| | | | llvm-svn: 345786
* Revert r345165 "[X86] Bring back the MOV64r0 pseudo instruction"Craig Topper2018-10-315-41/+43
| | | | | | Google is reporting regressions on some benchmarks. llvm-svn: 345785
* [ARM] Add missing pseudo-instruction for Thumb1 RSBS.Eli Friedman2018-10-312-0/+7
| | | | | | | | | Shows up rarely for 64-bit arithmetic, more frequently for the compare patterns added in r325323. Differential Revision: https://reviews.llvm.org/D53848 llvm-svn: 345782
* revert rL345717 : [InstSimplify] fold icmp based on range of abs/nabsSanjay Patel2018-10-311-42/+0
| | | | | | | This can miscompile as shown in PR39510: https://bugs.llvm.org/show_bug.cgi?id=39510 llvm-svn: 345780
* Check shouldReduceLoadWidth from SimplifySetCCStanislav Mekhanoshin2018-10-312-1/+14
| | | | | | | | | | | | SimplifySetCC could shrink a load without checking for profitability or legality of such shink with a target. Added checks to prevent shrinking of aligned scalar loads in AMDGPU below dword as scalar engine does not support it. Differential Revision: https://reviews.llvm.org/D53846 llvm-svn: 345778
* [DWARF][NFC] Refactor a function to return Optional<> instead of boolWolfgang Pieb2018-10-312-9/+9
| | | | | | | | Minor refactor of DWARFUnit::getStringOffsetSectionItem(). Differential Revision: https://reviews.llvm.org/D53948 llvm-svn: 345776
* [SelectionDAG] Handle constant range [0,1) in lowerRangeToAssertZExtScott Linder2018-10-311-1/+2
| | | | | | | | | lowerRangeToAssertZExt currently relies on something like EarlyCSE having eliminated the constant range [0,1). At -O0 this leads to an assert. Differential Revision: https://reviews.llvm.org/D53888 llvm-svn: 345770
* [AMDGPU] Remove FeatureVGPRSpillingScott Linder2018-10-316-48/+8
| | | | | | | | | | | This feature is only relevant to shaders, and is no longer used. When disabled, lowering of reserved registers for shaders causes a compiler crash. Remove the feature and add a test for compilation of shaders at OptNone. Differential Revision: https://reviews.llvm.org/D53829 llvm-svn: 345763
* [SelectionDAGISel] Suppress a -Wunused-but-set-variable warning in release ↵Craig Topper2018-10-311-0/+1
| | | | | | builds. NFC llvm-svn: 345761
* Fix comment typo. NFCI.Simon Pilgrim2018-10-311-1/+1
| | | | llvm-svn: 345758
* [SelectionDAG] SelectionDAGLegalize::ExpandBITREVERSE - ensure we use ShiftTySimon Pilgrim2018-10-311-6/+6
| | | | | | We should be using the getShiftAmountTy value type for shift amounts. llvm-svn: 345756
* [InstCombine] Combine nested min/max intrinsics with constantsVolkan Keles2018-10-311-1/+35
| | | | | | | | | | | | Reviewers: arsenm, spatel Reviewed By: spatel Subscribers: lebedev.ri, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D53774 llvm-svn: 345751
* [globalisel][irtranslator] Verify that DILocations aren't lost in translationDaniel Sanders2018-10-311-23/+69
| | | | | | | | | | | | | | | | Summary: Also fix a couple bugs where DILocations are lost. EntryBuilder wasn't passing on debug locations for PHI's, constants, GLOBAL_VALUE, etc. Reviewers: aprantl, vsk, bogner, aditya_nandakumar, volkan, rtereshin, aemerson Reviewed By: aemerson Subscribers: aemerson, rovka, kristof.beyls, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D53740 llvm-svn: 345743
* MachineModuleInfo: Initialize DbgInfoAvailable depending on debug_cus existingMatthias Braun2018-10-313-3/+8
| | | | | | | | | | | | | | | | Before this patch DbgInfoAvailable was set to true in DwarfDebug::beginModule() or CodeViewDebug::CodeViewDebug(). This made MIR testing weird since passes would suddenly stop dealing with debug info just because we stopped the pipeline before the debug printers. This patch changes the logic to initialize DbgInfoAvailable based on the fact that debug_compile_units exist in the llvm Module. The debug printers may then override it with false in case of debug printing being disabled. Differential Revision: https://reviews.llvm.org/D53885 llvm-svn: 345740
* [InstCombine] refactor fabs+fcmp fold; NFCSanjay Patel2018-10-311-39/+45
| | | | | | | Also, remove/replace/minimize/enhance the tests for this fold. The code drops FMF, so it needs more tests and at least 1 fix. llvm-svn: 345734
* [Hexagon] Make sure not to use GP-relative addressing with PICKrzysztof Parzyszek2018-10-313-4/+10
| | | | | | | Make sure that -relocation-model=pic prevents use of GP-relative addressing modes. llvm-svn: 345731
* [InstSimplify] fold 'fcmp nnan ult X, 0.0' when X is not negativeSanjay Patel2018-10-311-1/+4
| | | | | | This is the inverted case for the transform added with D53874 / rL345725. llvm-svn: 345728
* [InstCombine] add assertion that InstSimplify has folded a fabs+fcmp; NFCSanjay Patel2018-10-311-2/+5
| | | | | | | | The 'OLT' case was updated at rL266175, so I assume it was just an oversight that 'UGE' was not included because that patch handled both predicates in InstSimplify. llvm-svn: 345727
* [InstSimplify] fold 'fcmp nnan oge X, 0.0' when X is not negativeSanjay Patel2018-10-312-2/+7
| | | | | | | | | | | | | | This re-raises some of the open questions about how to apply and use fast-math-flags in IR from PR38086: https://bugs.llvm.org/show_bug.cgi?id=38086 ...but given the current implementation (no FMF on casts), this is likely the only way to predicate the transform. This is part of solving PR39475: https://bugs.llvm.org/show_bug.cgi?id=39475 Differential Revision: https://reviews.llvm.org/D53874 llvm-svn: 345725
* [LoopUnroll] allow customization for new-pass-manager version of LoopUnrollFedor Sergeev2018-10-313-13/+11
| | | | | | | | | | | | | | | | | Unlike its legacy counterpart new pass manager's LoopUnrollPass does not provide any means to select which flavors of unroll to run (runtime, peeling, partial), relying on global defaults. In some cases having ability to run a restricted LoopUnroll that does more than LoopFullUnroll is needed. Introduced LoopUnrollOptions to select optional unroll behaviors. Added 'unroll<peeling>' to PassRegistry mainly for the sake of testing. Reviewers: chandlerc, tejohnson Differential Revision: https://reviews.llvm.org/D53440 llvm-svn: 345723
* [DAGCombiner] Fold 0 div/rem X to 0David Bolvansky2018-10-311-2/+5
| | | | | | | | | | | | Reviewers: RKSimon, spatel, javed.absar, craig.topper, t.p.northover Reviewed By: RKSimon Subscribers: craig.topper, llvm-commits Differential Revision: https://reviews.llvm.org/D52504 llvm-svn: 345721
* AMDGPU: Rewrite SILowerI1Copies to always stay on SALUNicolai Haehnle2018-10-316-189/+749
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Instead of writing boolean values temporarily into 32-bit VGPRs if they are involved in PHIs or are observed from outside a loop, we use bitwise masking operations to combine lane masks in a way that is consistent with wave control flow. Move SIFixSGPRCopies to before this pass, since that pass incorrectly attempts to move SGPR phis to VGPRs. This should recover most of the code quality that was lost with the bug fix in "AMDGPU: Remove PHI loop condition optimization". There are still some relevant cases where code quality could be improved, in particular: - We often introduce redundant masks with EXEC. Ideally, we'd have a generic computeKnownBits-like analysis to determine whether masks are already masked by EXEC, so we can avoid this masking both here and when lowering uniform control flow. - The criterion we use to determine whether a def is observed from outside a loop is conservative: it doesn't check whether (loop) branch conditions are uniform. Change-Id: Ibabdb373a7510e426b90deef00f5e16c5d56e64b Reviewers: arsenm, rampitec, tpr Subscribers: kzhuravl, jvesely, wdng, mgorny, yaxunl, dstuttard, t-tye, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D53496 llvm-svn: 345719
* AMDGPU: Remove PHI loop condition optimizationNicolai Haehnle2018-10-315-138/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The optimization to early break out of loops if all threads are dead was never fully implemented. But the PHI node analyzing is actually causing a number of problems, so remove all the extra code for it. (This does actually regress code quality in a few places because it ends up relying more heavily on phi's of i1, which we don't do a great job with. However, since it fixes real bugs in the wild, we should take this change. I have some prototype changes to improve i1 lowering in general -- not just for control flow -- which should help recover the code quality, I just need to make those changes fit for general consumption. -- Nicolai) Change-Id: I6fc6c6c8961857ac6009fcfb9f7e5e48dc23fbb1 Patch-by: Christian König <christian.koenig@amd.com> Reviewers: arsenm, rampitec, tpr Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D53359 llvm-svn: 345718
* [InstSimplify] fold icmp based on range of abs/nabsSanjay Patel2018-10-311-0/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a fix for PR39475: https://bugs.llvm.org/show_bug.cgi?id=39475 We managed to get some of these patterns using computeKnownBits in D47041, but that can't be used for nabs(). Instead, put in some range-based logic, so we can fold both abs/nabs with icmp with a constant value. Alive proofs: https://rise4fun.com/Alive/21r Name: abs_nsw_is_positive %cmp = icmp slt i32 %x, 0 %negx = sub nsw i32 0, %x %abs = select i1 %cmp, i32 %negx, i32 %x %r = icmp sgt i32 %abs, -1 => %r = i1 true Name: abs_nsw_is_not_negative %cmp = icmp slt i32 %x, 0 %negx = sub nsw i32 0, %x %abs = select i1 %cmp, i32 %negx, i32 %x %r = icmp slt i32 %abs, 0 => %r = i1 false Name: nabs_is_negative_or_0 %cmp = icmp slt i32 %x, 0 %negx = sub i32 0, %x %nabs = select i1 %cmp, i32 %x, i32 %negx %r = icmp slt i32 %nabs, 1 => %r = i1 true Name: nabs_is_not_over_0 %cmp = icmp slt i32 %x, 0 %negx = sub i32 0, %x %nabs = select i1 %cmp, i32 %x, i32 %negx %r = icmp sgt i32 %nabs, 0 => %r = i1 false Differential Revision: https://reviews.llvm.org/D53844 llvm-svn: 345717
* [tblgen][PredicateExpander] Add the ability to describe more complex ↵Andrea Di Biagio2018-10-312-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | constraints on instruction operands. Before this patch, class PredicateExpander only knew how to expand simple predicates that performed checks on instruction operands. In particular, the new scheduling predicate syntax was not rich enough to express checks like this one: Foo(MI->getOperand(0).getImm()) == ExpectedVal; Here, the immediate operand value at index zero is passed in input to function Foo, and ExpectedVal is compared against the value returned by function Foo. While this predicate pattern doesn't show up in any X86 model, it shows up in other upstream targets. So, being able to support those predicates is fundamental if we want to be able to modernize all the scheduling models upstream. With this patch, we allow users to specify if a register/immediate operand value needs to be passed in input to a function as part of the predicate check. Now, register/immediate operand checks all derive from base class CheckOperandBase. This patch also changes where TIIPredicate definitions are expanded by the instructon info emitter. Before, definitions were expanded in class XXXGenInstrInfo (where XXX is a target name). With the introduction of this new syntax, we may want to have TIIPredicates expanded directly in XXXInstrInfo. That is because functions used by the new operand predicates may only exist in the derived class (i.e. XXXInstrInfo). This patch is a non functional change for the existing scheduling models. In future, we will be able to use this richer syntax to better describe complex scheduling predicates, and expose them to llvm-mca. Differential Revision: https://reviews.llvm.org/D53880 llvm-svn: 345714
* [AMDGPU] support image load/store a16Neil Henning2018-10-311-2/+4
| | | | | | | | | | | | Our a16 support was only enabled for sample/gather and buffer load/store, but not for image load/store operations (which take an i16 as the pixel index rather than a half). Fix our isel lowering and add test cases to prove it out. Differential Revision: https://reviews.llvm.org/D53750 llvm-svn: 345710
* [IndVars] Strengthen restricton in rewriteLoopExitValuesMax Kazantsev2018-10-311-28/+7
| | | | | | | | | | | | | | | | | | | | For some unclear reason rewriteLoopExitValues considers recalculation after the loop profitable if it has some "soft uses" outside the loop (i.e. any use other than call and return), even if we have proved that it has a user inside the loop which we think will not be optimized away. There is no existing unit test that would explain this. This patch provides an example when rematerialisation of exit value is not profitable but it passes this check due to presence of a "soft use" outside the loop. It makes no sense to recalculate value on exit if we are going to compute it due to some irremovable within the loop. This patch disallows applying this transform in the described situation. Differential Revision: https://reviews.llvm.org/D51581 Reviewed By: etherzhhb llvm-svn: 345708
* [LV] Support vectorization of interleave-groups that require an epilog underDorit Nuzman2018-10-3115-70/+173
| | | | | | | | | | | | | | | | | | | | | | optsize using masked wide loads Under Opt for Size, the vectorizer does not vectorize interleave-groups that have gaps at the end of the group (such as a loop that reads only the even elements: a[2*i]) because that implies that we'll require a scalar epilogue (which is not allowed under Opt for Size). This patch extends the support for masked-interleave-groups (introduced by D53011 for conditional accesses) to also cover the case of gaps in a group of loads; Targets that enable the masked-interleave-group feature don't have to invalidate interleave-groups of loads with gaps; they could now use masked wide-loads and shuffles (if that's what the cost model selects). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53668 llvm-svn: 345705
* [MSan] another take at instrumenting inline assembly - now with callsAlexander Potapenko2018-10-311-22/+108
| | | | | | | | | | | | | | | | | Turns out it's not always possible to figure out whether an asm() statement argument points to a valid memory region. One example would be per-CPU objects in the Linux kernel, for which the addresses are calculated using the FS register and a small offset in the .data..percpu section. To avoid pulling all sorts of checks into the instrumentation, we replace actual checking/unpoisoning code with calls to msan_instrument_asm_load(ptr, size) and msan_instrument_asm_store(ptr, size) functions in the runtime. This patch doesn't implement the runtime hooks in compiler-rt, as there's been no demand in assembly instrumentation for userspace apps so far. llvm-svn: 345702
* [ARM64] [Windows] Exception handling support in frame loweringSanjin Sijaric2018-10-316-52/+398
| | | | | | | | | | Emit pseudo instructions indicating unwind codes corresponding to each instruction inside the prologue/epilogue. These are used by the MCLayer to populate the .xdata section. Differential Revision: https://reviews.llvm.org/D50288 llvm-svn: 345701
* [AArch64] Mark condition flags and x16/x17 as clobbered when calling __chkstkMartin Storsjo2018-10-311-0/+6
| | | | | | | | This is similar to SVN r311061 for ARM. Differential Revision: https://reviews.llvm.org/D53878 llvm-svn: 345698
* [ORC] Fix hex printing of uint64_t values.Lang Hames2018-10-312-6/+6
| | | | | | | A plain "%x" format string will drop the high 32-bits. Use the PRIx64 macro instead. llvm-svn: 345696
* [DWARF] Revert r345546: Refactor range list extraction and dumpingWolfgang Pieb2018-10-317-214/+229
| | | | | | This patch caused some internal tests to break which are being investigated. llvm-svn: 345687
* Use llvm::any_of instead std::any_of. NFCFangrui Song2018-10-311-2/+1
| | | | llvm-svn: 345683
* ADT/STLExtras: Introduce llvm::empty; NFCMatthias Braun2018-10-3111-14/+13
| | | | | | | | This is modeled after C++17 std::empty(). Differential Revision: https://reviews.llvm.org/D53909 llvm-svn: 345679
* DWARFVerifier: make the verifier more comprehensive for objectsSaleem Abdulrasool2018-10-301-1/+1
| | | | | | | Make the code do what was mentioned in the comment: only skip the CU types. This enables the lexical blocks to be verified as well. llvm-svn: 345675
* MachineOperand/MIParser: Do not print debug-use flag, infer itMatthias Braun2018-10-302-2/+4
| | | | | | | | | | | | | | The debug-use flag must be set exactly for uses on DBG_VALUEs. This is so obvious that it can be trivially inferred while parsing. This will reduce noise when printing while omitting an information that has little value to the user. The parser will keep recognizing the flag for compatibility with old `.mir` files. Differential Revision: https://reviews.llvm.org/D53903 llvm-svn: 345671
* Revert r345542: AMDGPU: Enable code object v3 by defaultKonstantin Zhuravlyov2018-10-301-30/+15
| | | | | | It breaks mesa. llvm-svn: 345662
* [FPEnv] [FPEnv] Add constrained intrinsics for MAXNUM and MINNUMCameron McInally2018-10-307-0/+28
| | | | | | Differential Revision: https://reviews.llvm.org/D53216 llvm-svn: 345650
* [InstCombine] use 'match' to reduce code; NFCSanjay Patel2018-10-301-92/+90
| | | | llvm-svn: 345647
* [InstCombine] Teach the move free before null test opti how to deal with ↵Quentin Colombet2018-10-301-12/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | noop casts InstCombine features an optimization that essentially replaces: if (a) free(a) into: free(a) Right now, this optimization is gated by the minsize attribute and therefore we only perform it if we can prove that we are going to be able to eliminate the branch and the destination block. However when casts are involved the optimization would fail to apply, because the optimization was not smart enough to realize that it is possible to also move the casts away from the destination block and that is harmless to the performance since they are just noops. E.g., foo(int *a) if (a) free((char*)a) Wouldn't be optimized by instcombine, because - We would refuse to hoist the `bitcast i32* %a to i8` in the source block - We would fail to see that `bitcast i32* %a to i8` and %a are the same value. This patch fixes both these problems: - It teaches the pattern matching of the comparison how to look through casts. - It checks that whether the additional instruction in the destination block can be hoisted and are harmless performance-wise. - It hoists all the code of the destination block in the source block. Differential Revision: D53356 llvm-svn: 345644
* [COFF, ARM64] Make sure to forward arguments from vararg to musttail varargMandeep Singh Grang2018-10-302-0/+27
| | | | | | | | | | | | | | | | | | Summary: Thunk functions in Windows are varag functions that call a musttail function to pass the arguments after the fixup is done. We need to make sure that we forward the arguments from the caller vararg to the callee vararg function. This is the same mechanism that is used for Windows on X86. Reviewers: ssijaric, eli.friedman, TomTan, mgrang, mstorsjo, rnk, compnerd, efriedma Reviewed By: efriedma Subscribers: efriedma, kristof.beyls, chrib, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D53843 llvm-svn: 345641
* [ScalarizeMaskedMemIntrin] Limit the scope of some variables that are only ↵Craig Topper2018-10-301-8/+5
| | | | | | used inside loops. llvm-svn: 345638
OpenPOWER on IntegriCloud