summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [InstCombine] Unify handling of atomic memtransfer with non-atomic memtransferDaniel Neilson2018-05-112-112/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change reworks the handling of atomic memcpy within the instcombine pass. Previously, a constant length atomic memcpy would be lowered into loads & stores as long as no more than 16 load/store pairs are created. This is quite different from the lowering done for a non-atomic memcpy; which only ever lowers into a single load/store pair of no more than 8 bytes. Larger constant-sized memcpy calls are expanded to load/stores in later passes, such as SelectionDAG lowering. In this change the behaviour for atomic memcpy is unified with non-atomic memcpy; atomic memcpy is now treated in the same was as non-atomic memcpy has always been. We leave it to later passes to lower longer-length atomic memcpy calls. Due to the structure of the pass's handling of memtransfer intrinsics, this change also gives us handling of atomic memmove that we did not previously have. Reviewers: apilipenko, skatkov, mkazantsev, anna, reames Reviewed By: reames Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D46658 llvm-svn: 332093
* [X86] Added scheduler helper classes to split move/load/store by sizeSimon Pilgrim2018-05-114-198/+261
| | | | | | Nothing uses this yet but this will allow us to specialize MMX/XMM/YMM/ZMM vector moves. llvm-svn: 332090
* [APFloat] Set losesInfo on no-op convertSven van Haastregt2018-05-111-1/+3
| | | | | | | | | | | | losesInfo would be left unset when no conversion needs to be done. A caller such as InstCombine's fitsInFPType would then branch on an uninitialized value. Caught using valgrind on an out-of-tree target. Differential Revision: https://reviews.llvm.org/D46645 llvm-svn: 332087
* AMDGPU/GlobalISel: Implement select() for 32-bit G_FPTOUITom Stellard2018-05-113-0/+18
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45883 llvm-svn: 332082
* [X86] Remove and autoupgrade the avx512.mask.store.ss intrinsic.Craig Topper2018-05-112-4/+11
| | | | llvm-svn: 332079
* [Coroutines] PR34897: Fix incorrect elisionsBrian Gesiak2018-05-111-13/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: https://bugs.llvm.org/show_bug.cgi?id=34897 demonstrates an incorrect coroutine frame allocation elision in the coro-elide pass. The elision is performed on the basis that the SSA variables from all llvm.coro.begin are directly referenced in subsequent llvm.coro.destroy instructions. However, this ignores the fact that the function may exit through paths that do not run these destroy instructions. In the sample program from PR34897, for example, the llvm.coro.destroy instruction is only executed in exception handling code. When the coroutine function exits normally, llvm.coro.destroy is not called. Eliding the allocation in this case causes a subsequent reference to the coroutine handle from outside of the function to access freed memory. To fix the issue, when finding an llvm.coro.destroy for each llvm.coro.begin, only consider llvm.coro.destroy that are executed along non-exceptional paths. Test Plan: 1. Download the sample program from https://bugs.llvm.org/show_bug.cgi?id=34897, compile it with `clang++ -fcoroutines-ts -stdlib=libc++ -std=c++1z -O2`, and run it. It should print `"run1\ncheck1\nrun2\ncheck2"` and then exit successfully. 2. Compile https://godbolt.org/g/mCKfnr and confirm it is still optimized to a single instruction, 'return 1190'. 3. `check-llvm` Reviewers: rsmith, GorNishanov, eric_niebler Reviewed By: GorNishanov Subscribers: andrewrk, lewissbaker, EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D43242 llvm-svn: 332077
* [Support] Add docs for 'openFileFor{Write,Read}'Brian Gesiak2018-05-111-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add documentation for the LLVM Support functions `openFileForWrite` and `openFileForRead`. The `openFileForRead` parameter `RealPath`, in particular, I think warranted some explanation. In addition, make the behavior of the functions more consistent across platforms. Prior to this patch, Windows would set or not set the result file descriptor based on the nature of the error, whereas Unix would consistently set it to `-1` if the open failed. Make Windows consistently set it to `-1` as well. Test Plan: 1. `ninja check-llvm` 2. `ninja docs-llvm-html` Reviewers: zturner, rnk, danielmartin, scanon Reviewed By: danielmartin, scanon Subscribers: scanon, danielmartin, llvm-commits Differential Revision: https://reviews.llvm.org/D46499 llvm-svn: 332075
* [sanitizer-coverage] don't instrument a function if it's entry block ends ↵Kostya Serebryany2018-05-111-0/+2
| | | | | | with 'unreachable' llvm-svn: 332072
* Register NetBSD/i386 in AddressSanitizer.cppKamil Rytarowski2018-05-111-0/+3
| | | | | | | | | | | | | | | | | | | Summary: Ship kNetBSD_ShadowOffset32 set to 1ULL << 30. This is prepared for the amd64 kernel runtime. Sponsored by <The NetBSD Foundation> Reviewers: vitalybuka, joerg, kcc Reviewed By: vitalybuka Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46724 llvm-svn: 332069
* [SampleFDO] Don't treat warm callsite with inline instance in the profile as ↵Wei Mi2018-05-102-50/+63
| | | | | | | | | | | | | | | | | | | | | cold We found current sampleFDO had a performance issue when triaging a regression. For a callsite with inline instance in the profile, even if hot callsite inliner cannot inline it, it may still execute enough times and should not be treated as cold in regular inliner later. However, currently if such callsite is not inlined by hot callsite inliner, and the BB where the callsite locates doesn't get samples from other instructions inside of it, the callsite will have no profile metadata annotated. In regular inliner cost analysis, if the callsite has no profile annotated and its caller has profile information, it will be treated as cold. The fix changes the isCallsiteHot check and chooses to compare CallsiteTotalSamples with hot cutoff value computed by ProfileSummaryInfo. Differential Revision: https://reviews.llvm.org/D45377 llvm-svn: 332058
* [STLExtras] Add distance() for ranges, pred_size(), and succ_size()Vedant Kumar2018-05-1020-49/+31
| | | | | | | | | | | This commit adds a wrapper for std::distance() which works with ranges. As it would be a common case to write `distance(predecessors(BB))`, this also introduces `pred_size()` and `succ_size()` helpers to make that easier to write. Differential Revision: https://reviews.llvm.org/D46668 llvm-svn: 332057
* [InstCombine] Replace an 'if' that should always be true with an assert.Craig Topper2018-05-101-7/+6
| | | | | | The bitwidth of the operation should always be wider than the result width of the truncate since we don't recurse through any width changing operations. llvm-svn: 332055
* [WebAssembly] Initial Disassembler.Sam Clegg2018-05-105-14/+139
| | | | | | | | | | | | | | | | | | | | | This implements a new table-gen emitter to create tables for a wasm disassembler, and a dissassembler to use them. Comes with 2 tests, that tests a few instructions manually. Is also able to disassemble large .wasm files with objdump reasonably. Not working so well, to be addressed in followups: - objdump appears to be passing an incorrect starting point. - since the disassembler works an instruction at a time, and it is disassembling stack instruction, it has no idea of pseudo register assignments. These registers are required for the instruction printing code that follows. For now, all such registers appear in the output as $0. Patch by Wouter van Oortmerssen Differential Revision: https://reviews.llvm.org/D45848 llvm-svn: 332052
* [X86] Add new patterns for masked scalar load/store to match clang's codegen ↵Craig Topper2018-05-101-0/+117
| | | | | | | | | | | | from r331958. Clang's codegen now uses 128-bit masked load/store intrinsics in IR. The backend will widen to 512-bits on AVX512F targets. So this patch adds patterns to detect codegen's widening and patterns for AVX512VL that don't get widened. We may be able to drop some of the old patterns, but I leave that for a future patch. llvm-svn: 332049
* Revert "[InstCombine] snprintf optimizations"Martin Storsjo2018-05-101-90/+0
| | | | | | | | | This reverts commit SVN r331889, which could trigger failed assertions for cases where the snprintf function is declared with a vaguely differing signature (e.g. being defined as static inline), see PR37408. llvm-svn: 332043
* AMDGPU/GlobalISel: Implement select() for G_BITCAST s32 <--> <2 x s16>Tom Stellard2018-05-102-0/+21
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45881 llvm-svn: 332042
* [LLVM-C] Consolidate llgo's DIBuilder BindingsRobert Widmann2018-05-101-0/+11
| | | | | | | | | | | | | | Summary: Move and correct LLVMDIBuilderCreateTypedef. This is the last API in DIBuilderBindings.h, so it is being removed and the C API will now be re-exported from IRBindings.h. Reviewers: whitequark, harlanhaskins, deadalnix Reviewed By: whitequark Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46725 llvm-svn: 332041
* AMDGPU/GlobalISel: Enable TableGen'd instruction selectorTom Stellard2018-05-107-4/+132
| | | | | | | | | | | | Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, mgorny, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45994 llvm-svn: 332039
* [InstCombine] add folds for minnum(-a, -b) --> -maxnum(a, b)Sanjay Patel2018-05-101-0/+16
| | | | | | | | | | | This is similar to what we do for integer min/max with 'not' ops (rL321882). This should fix: https://bugs.llvm.org/show_bug.cgi?id=37404 https://bugs.llvm.org/show_bug.cgi?id=37405 llvm-svn: 332031
* [DWARF] Fixing a bug in DWARF v5 string offsets tables where the length ↵Wolfgang Pieb2018-05-103-5/+14
| | | | | | | | | | | | | encoded the contribution length excluding the table header. Instead it must encode the contribution length minus the length field itself. Reviewer: JDevliegehere Differential Revision: https://reviews.llvm.org/D45922 llvm-svn: 332030
* [InstCombine] Moving overflow computation logic from InstCombine to ↵Omer Paparo Bivas2018-05-104-86/+114
| | | | | | | | | ValueTracking; NFC Differential Revision: https://reviews.llvm.org/D46704 Change-Id: Ifabcbe431a2169743b3cc310f2a34fd706f13f02 llvm-svn: 332026
* [X86] Initialize HasPTWRITE member of X86SubtargetGabor Buella2018-05-101-0/+1
| | | | | | | This was missing from r331961. Caught by sanitizer bots. llvm-svn: 332024
* [X86] Convert/Merge more instregex patterns to reduce InstrRW compile time.Simon Pilgrim2018-05-106-243/+162
| | | | | | Use instrs lists or merge multiple instregex patterns. llvm-svn: 332022
* [CGP] Split large data structres to sink more GEPsHaicheng Wu2018-05-103-26/+243
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Accessing the members of a large data structures needs a lot of GEPs which usually have large offsets due to the size of the underlying data structure. If the offsets are too large to fit into the r+i addressing mode, these GEPs cannot be sunk to their users' blocks and many extra registers are needed then to carry the values of these GEPs. This patch tries to split a large data struct starting from %base like the following. Before: BB0: %base = BB1: %gep0 = gep %base, off0 %gep1 = gep %base, off1 %gep2 = gep %base, off2 BB2: %load1 = load %gep0 %load2 = load %gep1 %load3 = load %gep2 After: BB0: %base = %new_base = gep %base, off0 BB1: %new_gep0 = %new_base %new_gep1 = gep %new_base, off1 - off0 %new_gep2 = gep %new_base, off2 - off0 BB2: %load1 = load i32, i32* %new_gep0 %load2 = load i32, i32* %new_gep1 %load3 = load i32, i32* %new_gep2 In the above example, the struct is split into two parts. The first part still starts from %base and the second part starts from %new_base. After the splitting, %new_gep1 and %new_gep2 have smaller offsets and then can be sunk to BB2 and folded into their users. The algorithm to split data structure is simple and very similar to the work of merging SExts. First, it collects GEPs that have large offsets when iterating the blocks. Second, it splits the underlying data structures and updates the collected GEPs to use smaller offsets. Differential Revision: https://reviews.llvm.org/D42759 llvm-svn: 332015
* [LLVM-C] Add Accessors for Common DIType and DILocation PropertiesRobert Widmann2018-05-101-0/+42
| | | | | | | | | | | | | | | | Summary: - Adds getters for the line, column, and scope of a DILocation - Adds getters for the name, size in bits, offset in bits, alignment in bits, line, and flags of a DIType Reviewers: whitequark, harlanhaskins, deadalnix Reviewed By: whitequark Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46627 llvm-svn: 332014
* [LLVM-C] Move DIBuilder Bindings For Temporary MDNodesRobert Widmann2018-05-101-0/+17
| | | | | | | | | | | | | | Summary: Move LLVMTemporaryMDNode and LLVMMetadataReplaceAllUsesWith to the C bindings and add LLVMDeleteTemporaryMDNode for deleting non-RAUW'ed temporary nodes. Reviewers: whitequark, harlanhaskins, deadalnix Reviewed By: whitequark Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46632 llvm-svn: 332010
* [X86][Znver1] Remove unnecessary SchedWritePMULLD InstRW overrides.Simon Pilgrim2018-05-101-17/+2
| | | | llvm-svn: 332006
* [WebAssembly] Create section start symbols automatically for all sectionsSam Clegg2018-05-104-22/+39
| | | | | | | | | | | These symbols only get included in the output symbols table if they are used in a relocation. This behaviour matches more closely the ELF object writer. Differential Revision: https://reviews.llvm.org/D46561 llvm-svn: 332005
* [PM/LoopUnswitch] Avoid pointlessly creating an exit block set.Chandler Carruth2018-05-101-19/+4
| | | | | | | This code can just test whether blocks are *in* the loop, which we already have a dedicated set tracking in the loop itself. llvm-svn: 332004
* [X86][SNB] Fix typo in PEXTRDmr instregex, was missing VPEXTRDmr.Simon Pilgrim2018-05-101-4/+2
| | | | llvm-svn: 332002
* [X86] Split ↵Simon Pilgrim2018-05-1012-341/+136
| | | | | | | | WriteVecALU/WriteVecLogic/WriteShuffle/WriteVarShuffle/WritePSADBW/WritePHAdd scheduler classes Split off XMM classes from the default (MMX) classes. llvm-svn: 331999
* [mips] Accept 32-bit offsets for ld/sd/lld commandsSimon Atanasyan2018-05-102-4/+4
| | | | | | | | | | | | This is a follow up to the rL330983. The patch teaches ld, sd, and lld commands accept 32-bit memory offsets by replacing `mem_simm16` operand to `mem_simmptr`. In fact, these commands should accept 64-bit offsets, but so large offsets require another command expanding and will be supported by a separate patch. Differential Revision: https://reviews.llvm.org/D46629 llvm-svn: 331997
* [mips] Accept 32-bit offsets for lh and lhu commandsSimon Atanasyan2018-05-102-4/+4
| | | | | | | | | | This is a follow up to the rL330983. The patch teaches lh and lhu commands accepts 32-bit memory offsets by replacing `mem_simm16` operand to `mem_simmptr`. Differential Revision: https://reviews.llvm.org/D46513 llvm-svn: 331996
* [x86] fix fmaxnum/fminnum with nnanSanjay Patel2018-05-101-9/+13
| | | | | | | | | | | | | | | | | | | | | | With nnan, there's no need for the masked merge / blend sequence (that probably costs much more than the min/max instruction). Somewhere between clang 5.0 and 6.0, we started producing these intrinsics for fmax()/fmin() in C source instead of libcalls or fcmp/select. The backend wasn't prepared for that, so we regressed perf in those cases. Note: it's possible that other targets have similar problems as seen here. Noticed while investigating PR37403 and related bugs: https://bugs.llvm.org/show_bug.cgi?id=37403 The IR FMF propagation cases still don't work. There's a proposal that might fix those cases in D46563. llvm-svn: 331992
* [DSE] Teach the pass about partial overwrite of atomic memory intrinsicsDaniel Neilson2018-05-101-9/+16
| | | | | | | | | | | | | | | | | | | Summary: This change teaches DSE that the atomic memory intrinsics can be overwriten partially in the same way as the non-atomic forms. Specifically, that the atomic memcpy & memset can be shortened at the end and that the atomic memset can be shortened at the beginning, if they partially overwritten by later stores. Reviewers: mkazantsev, skatkov, apilipenko, efriedma, rsmith, spatel, filcab, sanjoy Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45584 llvm-svn: 331991
* [PR37339] Fix assertion in FunctionComparator::cmpInlineAsmwhitequark2018-05-101-1/+1
| | | | | | | | | | | | | | | | | Fixes bug https://bugs.llvm.org/show_bug.cgi?id=37339. InlineAsm is only uniqued if the FunctionTypes are exactly the same, while cmpTypes() for example considers all pointer types in the default address space to be the same. For this reason the end of cmpInlineAsm() can be reached. This patch replaces the unreachable assertion with a check that the function types are not identical. Differential Revision: https://reviews.llvm.org/D46495 Reviewers: jfb llvm-svn: 331990
* [DAG] Avoid using deleted node in rebuildSetCCNirav Dave2018-05-101-8/+21
| | | | | | | | | | | | | | | | | Summary: The combine in rebuildSetCC may be combined to another node leaving our references stale. Keep a handle on it to avoid stale references. Fixes PR36602. Reviewers: dbabokin, RKSimon, eli.friedman, davide Subscribers: hiraditya, uabelho, JesperAntonsson, qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D46404 llvm-svn: 331985
* Fix signed/unsigned comparison warning and print formatJames Henderson2018-05-101-1/+1
| | | | | | | | | | | | The print format was causing at least 2 unit-test failures from r331971. The signed/unsigned comparison warnings only appeared to affect two lines but it was unclear whether it might just pop up on other lines, so I have been explicit in all the literals in the tests. There were other bot unit-test failures that I am still investigating. llvm-svn: 331978
* [CFLGraph] Fixed Select instruction handlingDavid Bolvansky2018-05-101-1/+6
| | | | | | | | | | | | | | | | | Summary: Operand 0 is the condition, not the true value. Use op 1 and op 2 as the correct values. Reviewers: george.burgess.iv, nlopes, efriedma Reviewed By: george.burgess.iv Subscribers: craig.topper, rjmccall, lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D46343 llvm-svn: 331976
* [InstCombine] Only propagate known leading zeros from udiv input to output.Benjamin Kramer2018-05-101-2/+7
| | | | | | | | | Put in a conservatively correct estimate for now. Avoids miscompiling clang in FDO mode. This is really tricky to trigger in reality as basically all interesting cases will be folded away by computeKnownBits earlier, I was unable to find a reasonably small test case. llvm-svn: 331975
* [DWARF] Rework debug line parsing to use llvm::Error and callbacksJames Henderson2018-05-103-122/+195
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reviewed by: dblaikie, JDevlieghere, espindola Differential Revision: https://reviews.llvm.org/D44560 Summary: The .debug_line parser previously reported errors by printing to stderr and return false. This is not particularly helpful for clients of the library code, as it prevents them from handling the errors in a manner based on the calling context. This change switches to using llvm::Error and callbacks to indicate what problems were detected during parsing, and has updated clients to handle the errors in a location-specific manner. In general, this means that they continue to do the same thing to external users. Below, I have outlined what the known behaviour changes are, relating to this change. There are two levels of "errors" in the new error mechanism, to broadly distinguish between different fail states of the parser, since not every failure will prevent parsing of the unit, or of subsequent unit. Malformed table errors that prevent reading the remainder of the table (reported by returning them) and other minor issues representing problems with parsing that do not prevent attempting to continue reading the table (reported by calling a specified callback funciton). The only example of this currently is when the last sequence of a unit is unterminated. However, I think it would be good to change the handling of unrecognised opcodes to report as minor issues as well, rather than just printing to the stream if --verbose is used (this would be a subsequent change however). I have substantially extended the DwarfGenerator to be able to handle custom-crafted .debug_line sections, allowing for comprehensive unit-testing of the parser code. For now, I am just adding unit tests to cover the basic error reporting, and positive cases, and do not currently intend to test every part of the parser, although the framework should be sufficient to do so at a later point. Known behaviour changes: - The dump function in DWARFContext now does not attempt to read subsequent tables when searching for a specific offset, if the unit length field of a table before the specified offset is a reserved value. - getOrParseLineTable now returns a useful Error if an invalid offset is encountered, rather than simply a nullptr. - The parse functions no longer use `WithColor::warning` directly to report errors, allowing LLD to call its own warning function. - The existing parse error messages have been updated to not specifically include "warning" in their message, allowing consumers to determine what severity the problem is. - If the line table version field appears to have a value less than 2, an informative error is returned, instead of just false. - If the line table unit length field uses a reserved value, an informative error is returned, instead of just false. - Dumping of .debug_line.dwo sections is now implemented the same as regular .debug_line sections. - Verbose dumping of .debug_line[.dwo] sections now prints the prologue, if there is a prologue error, just like non-verbose dumping. As a helper for the generator code, I have re-added emitInt64 to the AsmPrinter code. This previously existed, but was removed way back in r100296, presumably because it was dead at the time. This change also requires a change to LLD, which will be committed separately. llvm-svn: 331971
* [mips] Correct the predicates of cvt.fmt.fmt instructionsSimon Dardis2018-05-102-23/+24
| | | | | | | | Reviewers: atanasyan, smaksimovic, abeserminji Differential Revision: https://reviews.llvm.org/D46390 llvm-svn: 331969
* [X86] ptwrite intrinsicGabor Buella2018-05-105-13/+30
| | | | | | | | | | Reviewers: craig.topper, RKSimon Reviewed By: craig.topper, RKSimon Differential Revision: https://reviews.llvm.org/D46539 llvm-svn: 331961
* SCEV] Do not use induction in isKnownPredicate for simplification umax.Serguei Katkov2018-05-101-5/+6
| | | | | | | | | | | | | | | | | | During simplification umax we trigger isKnownPredicate twice. As a first attempt it tries the induction. To do that it tries to get post increment of SCEV. Re-writing the SCEV may result in simplification of umax. If the SCEV contains a lot of umax operations this recursion becomes very slow. The added test demonstrates the slow behavior. To resolve this we use only simple ways to check whether the predicate is known. Reviewers: sanjoy, mkazantsev Reviewed By: sanjoy Subscribers: lebedev.ri, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46046 llvm-svn: 331949
* [InstCombine] Reorder an if condition to put a cheap check in front of a ↵Craig Topper2018-05-101-3/+3
| | | | | | computeKnownBits call. NFC llvm-svn: 331948
* [InstCombine] Use APInt::getBitsSetFrom to shortern a line and fix an 80 ↵Craig Topper2018-05-101-2/+2
| | | | | | | | columns violation. NFC Fix a similar line in the same function. llvm-svn: 331947
* [Inscombine] fix a signedness warning which broke -Werror buildsPhilip Reames2018-05-101-1/+1
| | | | llvm-svn: 331944
* [NVPTX] Added a feature to use short pointers for const/local/shared AS.Artem Belevich2018-05-098-61/+108
| | | | | | | | | | | | Const/local/shared address spaces are all < 4GB and we can always use 32-bit pointers to access them. This has substantial performance impact on kernels that uses shared memory for intermediary results. The feature is disabled by default. Differential Revision: https://reviews.llvm.org/D46147 llvm-svn: 331941
* [AggressiveInstCombine] convert a chain of 'and-shift' bits into masked compareSanjay Patel2018-05-091-32/+81
| | | | | | | | | | | | | | | | | | | | | | This is a follow-up to D45986. As suggested there, we should match the "all-bits-set" pattern in addition to "any-bits-set". This was a little more complicated than I thought it would be initially because the "and 1" instruction can be anywhere in the chain. Hopefully, the code comments make that logic understandable, but if you see a way to simplify or improve that, it's most appreciated. This transforms patterns that emerge from bitfield tests as seen in PR37098: https://bugs.llvm.org/show_bug.cgi?id=37098 I think it would also help reduce the large test from: D46336 D46595 but we need something to reassociate that case to the forms we're expecting here first. Differential Revision: https://reviews.llvm.org/D46649 llvm-svn: 331937
* [InstCombine] Widen guards with conditions betweenPhilip Reames2018-05-091-1/+22
| | | | | | | | The previous handling for guard widening in InstCombine was extremely restrictive. In particular, it didn't handle the common case where we had two guards separated by a single icmp. Handle this by scanning through a small fixed window of instructions to find the next guard if needed. Differential Revision: https://reviews.llvm.org/D46203 llvm-svn: 331935
OpenPOWER on IntegriCloud