summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [DWARF] Fix DWARFVerifier::DieRangeInfo::containsFangrui Song2019-04-151-19/+18
| | | | | | | | | It didn't handle empty LHS correctly. If two ranges of LHS were contiguous and jointly contained one range of RHS, it could also be incorrect. DWARFAddressRange::contains can be removed and its tests can be merged into DWARFVerifier::DieRangeInfo::contains llvm-svn: 358387
* [Transforms][ASan] Move findAllocaForValue() to Utils/Local.cpp. NFCAlexander Potapenko2019-04-152-39/+41
| | | | | | | | | | | | | | | | Summary: Factor out findAllocaForValue() from ASan so that we can use it in MSan to handle lifetime intrinsics. Reviewers: eugenis, pcc Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60615 llvm-svn: 358380
* [NewPM] Add Option handling for SimplifyCFGSerguei Katkov2019-04-152-1/+39
| | | | | | | | | | | This patch enables passing options to SimplifyCFGPass via the passes pipeline. Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe Reviewed By: fedor.sergeev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D60675 llvm-svn: 358379
* [DWARF] Fix DWARFVerifier::DieRangeInfo::intersectsFangrui Song2019-04-151-16/+9
| | | | | | It was incorrect if RHS had more than 1 ranges and one of the ranges interacted with *this llvm-svn: 358376
* [DWARF] Make DWARFDebugLine::ParsingState::RowNumber a local variableFangrui Song2019-04-151-2/+2
| | | | llvm-svn: 358374
* [SelectionDAG] Use KnownBits::computeForAddSub/computeForAddCarryBjorn Pettersson2019-04-151-58/+21
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Use KnownBits::computeForAddSub/computeForAddCarry in SelectionDAG::computeKnownBits when doing value tracking for addition/subtraction. This should improve the precision of the known bits, as we only used to make a simple estimate of known zeroes. The KnownBits support functions are also able to deduce bits that are known to be one in the result. Reviewers: spatel, RKSimon, nikic, lebedev.ri Reviewed By: nikic Subscribers: nikic, javed.absar, lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60460 llvm-svn: 358372
* [Sparc] Fix typo. NFC.Jim Lin2019-04-151-2/+2
| | | | llvm-svn: 358370
* [GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with ↵Amara Emerson2019-04-157-34/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | constants only. Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't be degraded. This change also improves the IRTranslator so that in most places, but not all, it creates constants using the MIRBuilder directly instead of first creating a new destination vreg and then creating a constant. By doing this, the buildConstant() method can just return the vreg of an existing G_CONSTANT instead of having to create a COPY from it. I measured a 0.2% improvement in compile time and a 0.9% improvement in code size at -O0 ARM64. Compile time: Program base cse diff test-suite...ark/tramp3d-v4/tramp3d-v4.test 9.04 9.12 0.8% test-suite...Mark/mafft/pairlocalalign.test 2.68 2.66 -0.7% test-suite...-typeset/consumer-typeset.test 5.53 5.51 -0.4% test-suite :: CTMark/lencod/lencod.test 5.30 5.28 -0.3% test-suite :: CTMark/Bullet/bullet.test 25.82 25.76 -0.2% test-suite...:: CTMark/ClamAV/clamscan.test 6.92 6.90 -0.2% test-suite...TMark/7zip/7zip-benchmark.test 34.24 34.17 -0.2% test-suite :: CTMark/SPASS/SPASS.test 6.25 6.24 -0.1% test-suite...:: CTMark/sqlite3/sqlite3.test 1.66 1.66 -0.1% test-suite :: CTMark/kimwitu++/kc.test 13.61 13.60 -0.0% Geomean difference -0.2% Code size: Program base cse diff test-suite...-typeset/consumer-typeset.test 1315632 1266480 -3.7% test-suite...:: CTMark/ClamAV/clamscan.test 1313892 1297508 -1.2% test-suite :: CTMark/lencod/lencod.test 1439504 1423112 -1.1% test-suite...TMark/7zip/7zip-benchmark.test 2936980 2904172 -1.1% test-suite :: CTMark/Bullet/bullet.test 3478276 3445460 -0.9% test-suite...ark/tramp3d-v4/tramp3d-v4.test 8082868 8033492 -0.6% test-suite :: CTMark/kimwitu++/kc.test 3870380 3853972 -0.4% test-suite :: CTMark/SPASS/SPASS.test 1434904 1434896 -0.0% test-suite...Mark/mafft/pairlocalalign.test 764528 764528 0.0% test-suite...:: CTMark/sqlite3/sqlite3.test 782092 782092 0.0% Geomean difference -0.9% Differential Revision: https://reviews.llvm.org/D60580 llvm-svn: 358369
* [GlobalISel] Introduce a CSEConfigBase class to allow targets to define ↵Amara Emerson2019-04-159-8/+54
| | | | | | | | | | | | | | their own CSE configs. Because CodeGen can't depend on GlobalISel, we need a way to encapsulate the CSE configs that can be passed between TargetPassConfig and the targets' custom pass configs. This CSEConfigBase allows targets to create custom CSE configs which is then used by the GISel passes for the CSEMIRBuilder. This support will be used in a follow up commit to allow constant-only CSE for -O0 compiles in D60580. llvm-svn: 358368
* llvm-undname: Fix oss-fuzz-foudn crash-on-invalid with incomplete special ↵Nico Weber2019-04-141-0/+4
| | | | | | table nodes llvm-svn: 358367
* llvm-undname: Fix another crash-on-invalid found by oss-fuzzNico Weber2019-04-141-1/+4
| | | | llvm-svn: 358363
* [X86] Redefine KUNPCK instructions to take a narrower source register class ↵Craig Topper2019-04-141-11/+9
| | | | | | | | | than destination register class. Remove copies from the isel output pattern. There's no reason for the inputs to be the destination register class. This just forces an unnecessary copy in the output patterns. llvm-svn: 358362
* [X86] Put the locked mi8 instrutions above the locked mi/mi32 so they will ↵Craig Topper2019-04-141-24/+26
| | | | | | | | | be prefered. We want 64mi8 to be prefered over 64mi32. The order for 16mi/32mi doesn't really matter. llvm-svn: 358361
* [X86] Change IMUL with immediate instruction order to ri8 instructions come ↵Craig Topper2019-04-141-31/+34
| | | | | | | | | | before ri/ri32 instructions. This will ensure IMUL64ri8 is tried before IMUL64ri32. For IMUL32 and IMUL16 the order doesn't really matter because only the ri8 versions use a predicate. That automatically gives them priority. llvm-svn: 358360
* [X86] Move VPTESTM matching from the isel table to custom code in ↵Craig Topper2019-04-142-251/+396
| | | | | | | | | | | | | | | | | | | | | | | | | | | | X86ISelDAGToDAG. We had many tablegen patterns for these instructions. And due to the commutability of the patterns, tablegen expands them to even more patterns. All together VPTESTMD patterns accounted for more the 50K of the 610K isel table. This had gotten bad when we stopped canonicalizing AND to vXi64. This required a pattern for every combination of bitcast input type. This change moves the matching to custom code where it is easier to look through the bitcasts without being concerned with the specific types. The test changes are because we are now stricter with one use checks as its required to make load folding legal. We now require the AND and any BITCAST to only have a single use. This prevents forming VPTESTM and a VPAND with the same inputs. We now support broadcast loads for 128/256 patterns without VLX. We'll widen to 512-bit like and still fold the broadcast since the amount of memory read doesn't change. There are a few tests that got slightly longer because are now prefering load + VPTESTM over XOR+VPCMPEQ for (seteq (load), allzeros). Previously we were able to share the XOR with multiple VPTESTM instructions. llvm-svn: 358359
* [X86] Don't form masked vpcmp/vcmp/vptestm operations if the setcc node has ↵Craig Topper2019-04-141-148/+256
| | | | | | | | | | | | | | more than one use. We're better of emitting a single compare + kand rather than a compare for the other use and a masked compare. I'm looking into using custom instruction selection for VPTESTM to reduce the ridiculous number of permutations of patterns in the isel table. Putting a one use check on all masked compare folding makes load fold matching in the custom code easier. llvm-svn: 358358
* [Mem2Reg] Delete unused PointerAllocaValuesFangrui Song2019-04-141-5/+0
| | | | | | It is unused after AliasSetTracker support was removed. llvm-svn: 358352
* [Mem2Reg] Simplify and micro optimizeFangrui Song2019-04-141-13/+9
| | | | | | | | * Rearrange continu/break * BBNumbers.lookup(A) -> BBNumbers.find(A)->second BBNumbers has been computed, thus we can assume the value exists in the predicate. llvm-svn: 358351
* [Mem2Reg] Don't call LBI.deleteValue on AllocInst/DbgVariableIntrinsicFangrui Song2019-04-141-6/+1
| | | | | | Only StoreInst/LoadInst are assigned numbers. Other types of instructions are not in LBI. llvm-svn: 358350
* [Mem2Reg] Simplify rewriteSingleStoreAllocaFangrui Song2019-04-141-5/+2
| | | | llvm-svn: 358349
* [ConstantRange] Delete unused getSetSizeFangrui Song2019-04-141-8/+0
| | | | | | getSetSize returns an APInt that is 1 bit wider. The APInt is typically 65-bit and requires memory allocation. isSizeStrictlySmallerThan and isSizeLargerThan are preferred. The last use of this helper method was removed by rL302385. llvm-svn: 358347
* [X86] Remove some unused tablegen multiclasses. NFCCraig Topper2019-04-141-27/+0
| | | | llvm-svn: 358345
* [X86] Use PC-relative mode for the kernel code modelBill Wendling2019-04-131-9/+14
| | | | | | | | | | | | | | | | | | Summary: The Linux kernel uses PC-relative mode, so allow that when the code model is "kernel". Reviewers: craig.topper Reviewed By: craig.topper Subscribers: llvm-commits, kees, nickdesaulniers Tags: #llvm Differential Revision: https://reviews.llvm.org/D60643 llvm-svn: 358343
* [ConstantRange] Disallow NUW | NSW in makeGuaranteedNoWrapRegion()Nikita Popov2019-04-131-18/+14
| | | | | | | | | | | | | | | As motivated in D60598, this drops support for specifying both NUW and NSW in makeGuaranteedNoWrapRegion(). None of the users of this function currently make use of this. When both NUW and NSW are specified, the exact nowrap region has two disjoint parts and makeGNWR() returns one of them. This result doesn't seem to be useful for anything, but makes the semantics of the function fuzzier. Differential Revision: https://reviews.llvm.org/D60632 llvm-svn: 358340
* [InstCombine] Remove redundant/bogus mul_with_overflow combinesNikita Popov2019-04-131-8/+0
| | | | | | | | | | | | As pointed out in D60518 folding mulo(%x, undef) to {undef, undef} isn't correct. As a correct version of this already exists in InstructionSimplify (https://github.com/llvm-mirror/llvm/blob/bd8056ef326e075cc500f3f0cfcd1193bc200594/lib/Analysis/InstructionSimplify.cpp#L4750-L4757) this is just dead code though. Drop it together with the mul(%x, 0) -> {0, false} fold that is also already handled by InstSimplify. Differential Revision: https://reviews.llvm.org/D60649 llvm-svn: 358339
* [X86] Use int64_t and isInt<N> instead of APInt operations in ↵Craig Topper2019-04-131-8/+6
| | | | | | | | | | foldLoadStoreIntoMemOperand. NFC We know all our values are limited to 64 bits here so we don't need an APInt. This should save some generated code checking between large and small size. llvm-svn: 358338
* [CommandLineParser] Add DefaultOption flagDon Hinton2019-04-131-5/+43
| | | | | | | | | | | | | | | | | | | | | | Summary: Add DefaultOption flag to CommandLineParser which provides a default option or alias, but allows users to override it for some other purpose as needed. Also, add `-h` as a default alias to `-help`, which can be seamlessly overridden by applications like llvm-objdump and llvm-readobj which use `-h` as an alias for other options. Reviewers: alexfh, klimek Reviewed By: klimek Subscribers: MaskRay, mehdi_amini, inglorion, dexonsmith, hiraditya, llvm-commits, jhenderson, arphaman, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D59746 llvm-svn: 358337
* [WebAssembly] Use Function::hasOptSize() (NFC)Heejin Ahn2019-04-131-2/+1
| | | | | | | | | | | | | | | | Summary: Use member function. Reviewers: aheejin Subscribers: sunfish, hiraditya, sbc100, jgravelle-google, dschuff, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60651 Patch by Hideto Ueno (uenoku) llvm-svn: 358336
* [Mem2Reg] Delete unused AllocaPointerValFangrui Song2019-04-131-4/+0
| | | | | | It is no longer used after the AliasSetTracker updating logic was removed. llvm-svn: 358334
* [InstCombine] Canonicalize (-X srem Y) to -(X srem Y).Chen Zheng2019-04-131-0/+5
| | | | | | Differential Revision: https://reviews.llvm.org/D60647 llvm-svn: 358328
* [AArch64][GlobalISel] Enable copy elision in the pre-legalizer combine and ↵Amara Emerson2019-04-132-0/+3
| | | | | | | | | | | | | fix a crash. This enables the simple copy combine that already exists in the CombinerHelper. However, it exposed a bug in the GISelChangeObserver where it wouldn't clear a set of MIs to process, and so would end up causing a crash when deleted MIs were being added to the combiner worklist again. Differential Revision: https://reviews.llvm.org/D60579 llvm-svn: 358318
* [WebAssembly] Add DataCount section to object filesThomas Lively2019-04-123-0/+34
| | | | | | | | | | | | | | | | | | | Summary: This ensures that object files will continue to validate as WebAssembly modules in the presence of bulk memory operations. Engines that don't support bulk memory operations will not recognize the DataCount section and will report validation errors, but that's ok because object files aren't supposed to be run directly anyway. Reviewers: aheejin, dschuff, sbc100 Subscribers: jgravelle-google, hiraditya, sunfish, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60623 llvm-svn: 358315
* [GlobalISel] Fix a crash when handling an invalid MVT during call lowering.Amara Emerson2019-04-121-1/+1
| | | | | | | | This crash was introduced in r358032 as we try to construct an EVT from an MVT in order to find the register type for the calling conv. Fall back instead of trying to do this with an invalid MVT coming from i256. llvm-svn: 358314
* [MemorySSA] Add previous def to cache when found, even if trivial.Alina Sbirlea2019-04-121-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When inserting a new Def, MemorySSA may be have non-minimal number of Phis. While inserting, the walk to find the previous definition may cleanup minimal Phis. When the last definition is trivial to obtain, we do not cache it. It is possible while getting the previous definition for a Def to get two different answers: - one that was straight-forward to find when walking the first path (a trivial phi in this case), and - another that follows a cleanup of the trivial phi, it determines it may need additional Phi nodes, it inserts them and returns a new phi in the same position as the former trivial one. While the Phis added for the second path are all redundant, they are not complete (the walk is only done upwards), and they are not properly cleaned up afterwards. A way to fix this problem is to cache the straight-forward answer we got on the first walk. The caching is only kept for the duration of a getPreviousDef call, and for Phis we use TrackingVH, so removing the trivial phi will lead to replacing it with the next dominating phi in the cache. Resolves PR40749. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60634 llvm-svn: 358313
* [AArch64][GlobalISel] Fix a crash when selecting shufflevectors with an ↵Amara Emerson2019-04-121-7/+17
| | | | | | | | | | | undef mask element. If a shufflevector's mask vector has an element with "undef" then the generic instruction defining that element register is a G_IMPLICT_DEF instead of G_CONSTANT. This fixes the selector to handle this case, and for now assumes that undef just means zero. In future we'll optimize this case properly. llvm-svn: 358312
* [WebAssembly] Add mutable-globals to bleeding-edge CPUThomas Lively2019-04-121-1/+2
| | | | | | | | | | | | | | Summary: This brings the backend in line with Clang. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60594 llvm-svn: 358310
* [ConstantRange] Clarify makeGuaranteedNoWrapRegion() guarantees; NFCNikita Popov2019-04-121-2/+1
| | | | | | | | | | | | | | | | | | | | | makeGuaranteedNoWrapRegion() is actually makeExactNoWrapRegion() as long as only one of NUW or NSW is specified. This is not obvious from the current documentation, and some code seems to think that it is only exact for single-element ranges. Clarify docs and add tests to be more confident this really holds. There are currently no users of makeGuaranteedNoWrapRegion() that pass both NUW and NSW. I think it would be best to drop support for this entirely and then rename the function to makeExactNoWrapRegion(). Knowing that the no-wrap region is exact is useful, because we can backwards-constrain values. What I have in mind in particular is that LVI should be able to constrain values on edges where the with.overflow overflow flag is false. Differential Revision: https://reviews.llvm.org/D60598 llvm-svn: 358305
* [SCEV] Add option to forget everything in SCEV.Alina Sbirlea2019-04-126-36/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Create a method to forget everything in SCEV. Add a cl::opt and PassManagerBuilder option to use this in LoopUnroll. Motivation: Certain Halide applications spend a very long time compiling in forgetLoop, and prefer to forget everything and rebuild SCEV from scratch. Sample difference in compile time reduction: 21.04 to 14.78 using current ToT release build. Testcase showcasing this cannot be opensourced and is fairly large. The option disabled by default, but it may be desirable to enable by default. Evidence in favor (two difference runs on different days/ToT state): File Before (s) After (s) clang-9.bc 7267.91 6639.14 llvm-as.bc 194.12 194.12 llvm-dis.bc 62.50 62.50 opt.bc 1855.85 1857.53 File Before (s) After (s) clang-9.bc 8588.70 7812.83 llvm-as.bc 196.20 194.78 llvm-dis.bc 61.55 61.97 opt.bc 1739.78 1886.26 Reviewers: sanjoy Subscribers: mehdi_amini, jlebar, zzheng, javed.absar, dmgreen, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60144 llvm-svn: 358304
* [MemorySSA] Small fix for the clobber limit.Alina Sbirlea2019-04-121-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: After introducing the limit for clobber walking, `walkToPhiOrClobber` would assert that the limit is at least 1 on entry. The test included triggered that assert. The callsite in `tryOptimizePhi` making the calls to `walkToPhiOrClobber` is structured like this: ``` while (true) { if (getBlockingAccess()) { // calls walkToPhiOrClobber } for (...) { walkToPhiOrClobber(); } } ``` The cleanest fix is to check if the limit was reached inside `walkToPhiOrClobber`, and give an allowence of 1. This approach not make any alias() calls (no calls to instructionClobbersQuery), so the performance condition is enforced. The limit is set back to 0 if not used, as this provides info on the fact that we stopped before reaching a true clobber. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60479 llvm-svn: 358303
* [InstCombine] Fix a nasty miscompile introduced w/masked.gather demanded eltsPhilip Reames2019-04-121-1/+5
| | | | | | | | This fixes a miscompile which was introduced in r356510 (https://reviews.llvm.org/D57372). The problem is that the original patch removed pointer operands where the load results we're demanded, but without considering the legality of the load itself. If the masked.gather had active, but undemanded, lanes, then we could end up creating a load which loaded from an undef address. The result could be a segfault, or, in theory, an arbitrary read from a random memory location into an used register. llvm-svn: 358299
* [CVP] Set NSW/NUW flags when simplifying with.overflowNikita Popov2019-04-121-2/+6
| | | | | | | | | | When CVP determines that a with.overflow intrinsic cannot overflow, it currently inserts a simple add/sub. As we already determined that there can be no overflow, we should add the appropriate NUW/NSW flag. Differential Revision: https://reviews.llvm.org/D60585 llvm-svn: 358298
* [KnownBits] Add computeForAddCarry()Nikita Popov2019-04-121-12/+31
| | | | | | | | | | | | | | | | This is for D60460. computeForAddSub() essentially already supports carries because it has to deal with subtractions. This revision extracts a lower-level computeForAddCarry() function, which allows computing the known bits for add (carry known zero), sub (carry known one) and addcarry (carry unknown). As we don't seem to have any yet, I've added a unit test file for KnownBits and exhaustive tests for the new computeForAddCarry() functionality, as well the existing computeForAddSub() function. Differential Revision: https://reviews.llvm.org/D60522 llvm-svn: 358297
* Simplify decoupling between RuntimeDyld/RuntimeDyldChecker, add 'got_addr' util.Lang Hames2019-04-1212-134/+101
| | | | | | | | | | | | | | | | | | | | | | | | | This patch reduces the number of functions in the interface between RuntimeDyld and RuntimeDyldChecker by combining "GetXAddress" and "GetXContent" functions into "GetXInfo" functions that return a struct describing both the address and content. The GetStubOffset function is also replaced with a pair of utilities, GetStubInfo and GetGOTInfo, that fit the new scheme. For RuntimeDyld both of these functions will return the same result, but for the new JITLink linker (https://reviews.llvm.org/D58704) these will provide the addresses of PLT stubs and GOT entries respectively. For JITLink's use, a 'got_addr' utility has been added to the rtdyld-check language, and the syntax of 'got_addr' and 'stub_addr' has been changed: both functions now take two arguments, a 'stub container name' and a target symbol name. For llvm-rtdyld/RuntimeDyld the stub container name is the object file name and section name, separated by a slash. E.g.: rtdyld-check: *{8}(stub_addr(foo.o/__text, y)) = y For the upcoming llvm-jitlink utility, which creates stubs on a per-file basis rather than a per-section basis, the container name is just the file name. E.g.: jitlink-check: *{8}(got_addr(foo.o, y)) = y llvm-svn: 358295
* [Hexagon] Fix reuse bug in Vector Loop Carried Reuse passBrendon Cahoon2019-04-121-3/+10
| | | | | | | | | | | | | | | The Hexagon Vector Loop Carried Reuse pass was allowing reuse between two shufflevectors with different masks. The reason is that the masks are not instruction objects, so the code that checks each operand just skipped over the operands. This patch fixes the bug by checking if the operands are the same when they are not instruction objects. If the objects are not the same, then the code assumes that reuse cannot occur. Differential Revision: https://reviews.llvm.org/D60019 llvm-svn: 358292
* [DAGCombiner] narrow shuffle of concatenated vectorsSanjay Patel2019-04-121-0/+50
| | | | | | | | | | | | | | | | | | // shuffle (concat X, undef), (concat Y, undef), Mask --> // concat (shuffle X, Y, Mask0), (shuffle X, Y, Mask1) The ARM changes with 'vtrn' and narrowed 'vuzp' are improvements. The x86 changes look neutral or better. There's one test with an extra instruction, but that could be reversed for a subtarget with the right attributes. But by default, we want to avoid the 256-bit op when possible (in my motivating benchmark, a handful of ymm ops sprinkled into a sequence of xmm ops are triggering frequency throttling on Haswell resulting in significantly worse perf). Differential Revision: https://reviews.llvm.org/D60545 llvm-svn: 358291
* Add options for MaxLoadsPerMemcmp(OptSize).Hiroshi Yamauchi2019-04-121-2/+15
| | | | | | | | | | | | | | Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60587 llvm-svn: 358287
* [X86][SSE] Recognise vXi1 boolean anyof/allof reduction patternsSimon Pilgrim2019-04-121-33/+56
| | | | | | | | | | | | Currently combineHorizontalPredicateResult only handles anyof/allof reduction patterns of legal types, which can be tricky to match as type legalization of bools can introduce bitcasts/truncs/extensions. This patch extends combineHorizontalPredicateResult to recognise vXi1 bool reductions as well and uses the existing combineBitcastvxi1 helper to create the MOVMSK necessary to then compare the signmask result. This ensures the accuracy of the reduction costs added in D60403 which assume the MOVMSK generation. Differential Revision: https://reviews.llvm.org/D60610 llvm-svn: 358286
* Revert r358268 "[DebugInfo] DW_OP_deref_size in PrologEpilogInserter."Hans Wennborg2019-04-124-37/+3
| | | | | | | | | | | | | | | | | | | It causes clang to crash while building Chromium. See https://crbug.com/952230 for reproducer. > The PrologEpilogInserter need to insert a DW_OP_deref_size before > prepending a memory location expression to an already implicit > expression to avoid having the existing expression act on the memory > address instead of the value behind it. > > The reason for using DW_OP_deref_size and not plain DW_OP_deref is that > big-endian targets need to read the right size as simply truncating a > larger read would yield the wrong result (LSB bytes are not at the lower > address). > > Differential Revision: https://reviews.llvm.org/D59687 llvm-svn: 358281
* Use llvm::upper_bound. NFCFangrui Song2019-04-124-13/+8
| | | | llvm-svn: 358277
* [PowerPC] Add initialization for some ppc passesKang Zhang2019-04-1213-49/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Some llc debug options need pass-name as the parameters. But if we use the pass-name ppc-early-ret, we will get below error: llc test.ll -stop-after ppc-early-ret LLVM ERROR: "ppc-early-ret" pass is not registered. Below pass-names have the pass is not registered error: ppc-ctr-loops ppc-ctr-loops-verify ppc-loop-preinc-prep ppc-toc-reg-deps ppc-vsx-copy ppc-early-ret ppc-vsx-fma-mutate ppc-vsx-swaps ppc-reduce-cr-ops ppc-qpx-load-splat ppc-branch-coalescing ppc-branch-select Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D60248 llvm-svn: 358271
OpenPOWER on IntegriCloud