summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [CGP] use narrower types in memcmp expansion when possibleSanjay Patel2017-08-011-1/+6
| | | | | | | | This only affects very small memcmp on x86 for now, but it will become more important if we allow vector-sized load and compares. llvm-svn: 309711
* [DAG] Convert extload check to equivalent type check. NFC.Nirav Dave2017-08-011-5/+10
| | | | | | Replace check with check that consuming store has the same type. llvm-svn: 309708
* [X86] Use BEXTR/BEXTRI for 64-bit 'and' with a large maskCraig Topper2017-08-011-5/+36
| | | | | | | | | | | | | | Summary: The 64-bit 'and' with immediate instruction only supports a 32-bit immediate. So for larger constants we have to load the constant into a register first. If the immediate happens to be a mask we can use the BEXTRI instruction to perform the masking. We already do something similar using the BZHI instruction from the BMI2 instruction set. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36129 llvm-svn: 309706
* [X86][SSE] Added missing PACKSS/PACKUS intrinsic schedulesSimon Pilgrim2017-08-013-8/+10
| | | | | | | | Improves atom scheduler test coverage (to make it easier to upgrade them for PR32431). Checked on Agner that these actually match the UNPACK schedules, but better to include a separate class llvm-svn: 309701
* [X86][SSSE3] Added missing PHADDS/PHSUBS/PSIGN intrinsic schedulesSimon Pilgrim2017-08-011-2/+2
| | | | llvm-svn: 309699
* [DAG] Move extload check in store merge. NFC.Nirav Dave2017-08-011-5/+3
| | | | | | Move candidate check from later check to initial candidate check. llvm-svn: 309698
* [X86] Fix a crash in FEntryInserter Pass.Manoj Gupta2017-08-011-3/+1
| | | | | | | | | | | | | | | | | | | | | Summary: FEntryInserter pass unconditionally derefs the first Instruction in the first Basic Block. The pass crashes when the first BasicBlock is empty. Fix the crash by not dereferencing the basic Block iterator. This fixes an issue observed when building Linux kernel 4.4 with clang. Fixes PR33971. Reviewers: hfinkel, niravd, dblaikie Reviewed By: niravd Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D35979 llvm-svn: 309694
* [AVX-512] Don't use unmasked VMOVDQU8/16 for 8-bit or 16-bit element stores ↵Craig Topper2017-08-011-13/+29
| | | | | | | | | | | | | | even when BWI instructions are supported. Always use VMOVDQA32/VMOVDQU32. We were already using the 32 bit element opcode if BWI isn't enabled, but there's no reason to change opcode if we have BWI. We will still use the 8/16 opcodes for masked stores though. This allows us to use the aligned opcode when we can which makes our test output more consistent between different modes. It also reduces the number of isel patterns we need. This is a slight inconsistency with loads which default to 64 bit element opcodes. I'll probably rectify that in a future patch. Differential Revision: https://reviews.llvm.org/D35978 llvm-svn: 309693
* [InstCombine] Remove explicit check for impossible condition. Replace with ↵Craig Topper2017-08-011-1/+2
| | | | | | | | | | | | | | | | | | | assert Summary: As far as I can tell the earlier call getLimitedValue will guaranteed ShiftAmt is saturated to BitWidth-1 preventing it from ever being equal or greater than BitWidth. At one point in the past the getLimitedValue call was only passed BitWidth not BitWidth - 1. This would have allowed the equality case to get here. And in fact this check was initially added as just BitWidth == ShiftAmt, but was changed shortly after to include > which should have never been possible. Reviewers: spatel, majnemer, davide Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36123 llvm-svn: 309690
* DebugInfo: Update flag description that'd been copypasted from anotherDavid Blaikie2017-08-011-1/+1
| | | | | | Post-commit review feedback from Paul Robinson on r309630. Thanks Paul! llvm-svn: 309685
* [DebugInfo] Use shrink_to_fit to simplify code. NFCI.Benjamin Kramer2017-08-012-14/+4
| | | | llvm-svn: 309683
* [DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector.Nirav Dave2017-08-011-11/+26
| | | | | | | | | | | | | | | Summary: Allow SCALAR_TO_VECTOR of EXTRACT_VECTOR_ELT to reduce to EXTRACT_SUBVECTOR of vector shuffle when output is smaller. Marginally improves vector shuffle computations. Reviewers: efriedma, RKSimon, spatel Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D35566 llvm-svn: 309680
* [Mips] Fix for BBIT octeon instructionStrahinja Petrovic2017-08-011-1/+7
| | | | | | | | | | | This patch enables control flow optimization for variations of BBIT instruction. In this case optimization removes unnecessary branch after BBIT instruction. Differential Revision: https://reviews.llvm.org/D35359 llvm-svn: 309679
* [Hexagon] Convert HVX vector constants of i1 to i8Krzysztof Parzyszek2017-08-011-0/+36
| | | | | | | | | Certain operations require vector of i1 values. However, for Hexagon architecture compatibility, they need to be represented as vector of i8. Patch by Suyog Sarda. llvm-svn: 309677
* AMDGPU/GlobalISel: Add support for amdgpu_vs calling conventionTom Stellard2017-08-011-4/+24
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D35916 llvm-svn: 309675
* Support itineraries in TargetSubtargetInfo::getSchedInfoStr - Now if the ↵Andrew V. Tischenko2017-08-011-3/+10
| | | | | | | | given instr does not have sched model then we try to calculate the latecy/throughput with help of itineraries. Differential Revision https://reviews.llvm.org/D35997 llvm-svn: 309666
* [IRCE][NFC] Add another assert that AddRecExpr's step is not zeroMax Kazantsev2017-08-011-0/+1
| | | | | | | One more assertion of this kind. It is a preparation step for generalizing to the case of stride not equal to +1/-1. llvm-svn: 309663
* [PM] Add a comment clarifying what a particular predicate is doing.Chandler Carruth2017-08-011-0/+8
| | | | | | | This came up as a point of confusion while working on a fundamental problem with the combination of CGSCC iteration and the inliner. llvm-svn: 309662
* [IRCE][NFC] Add assert that AddRecExpr's step is not zeroMax Kazantsev2017-08-011-0/+1
| | | | | | | We should never return zero steps, ensure this fact by adding a sanity check when we are analyzing the induction variable. llvm-svn: 309661
* Revert r309415: "[LVI] Constant-propagate a zero extension of the switch ↵Daniel Jasper2017-08-011-108/+4
| | | | | | | | | condition value through case edges" This causes assertion failures in (a somewhat old version of) SpiderMonkey. I have already forwarded reproduction instructions to the patch author. llvm-svn: 309659
* [MetaRenamer] Leave `@main` alone.Davide Italiano2017-08-011-1/+5
| | | | | | | | | | | | | To the best of my knowledge -metarenamer is used in two cases: 1) obfuscate names, when e.g. they contain informations that can't be shared. 2) Improve clarity of the textual IR for testcases. One of the usecases if getting the output of `opt` and passing it to the lli interpreter to run the test. If metarenamer renames @main, lli can't find an entry point. llvm-svn: 309657
* [StackColoring] Update AliasAnalysis information in stack coloring passHiroshi Inoue2017-08-014-68/+131
| | | | | | | | | | | | | | | | | Stack coloring pass need to maintain AliasAnalysis information when merging stack slots of different types. Actually, there is a FIXME comment in StackColoring.cpp // FIXME: In order to enable the use of TBAA when using AA in CodeGen, // we'll also need to update the TBAA nodes in MMOs with values // derived from the merged allocas. But, TBAA has been already enabled in CodeGen without fixing this pass. The incorrect TBAA metadata results in recent failures in bootstrap test on ppc64le (PR33928) by allowing unsafe instruction scheduling. Although we observed the problem on ppc64le, this is a platform neutral issue. This patch makes the stack coloring pass maintains AliasAnalysis information when merging multiple stack slots. llvm-svn: 309651
* [libFuzzer] implement more correct way of computing feature index for ↵Kostya Serebryany2017-08-012-11/+18
| | | | | | Inline8bitCounters llvm-svn: 309647
* [libFuzzer] enable -fsanitize-coverage=pc-table for all testsKostya Serebryany2017-08-014-11/+22
| | | | llvm-svn: 309646
* Default MemoryLocation passed to getModRefInfo should be None (D35441)Alina Sbirlea2017-08-011-1/+1
| | | | llvm-svn: 309645
* [sanitizer-coverage] relax an assertionKostya Serebryany2017-08-011-4/+5
| | | | llvm-svn: 309644
* [ScheduleDAG] Don't schedule node with physical register interferenceEli Friedman2017-08-011-25/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | https://reviews.llvm.org/D31536 didn't really solve the problem it was trying to solve; it got rid of the assertion failure, but we were still scheduling the DAG incorrectly (mixing together instructions from different calls), leading to a MachineVerifier failure. In order to schedule the DAG correctly, we have to make sure we don't schedule a node which should be blocked by an interference. Fix ScheduleDAGRRList::PickNodeToScheduleBottomUp so it doesn't pick a node like that. The added call to FindAvailableNode() is the key change here; this makes sure we don't try to schedule a call while we're in the middle of scheduling a different call. I'm not sure this is the right approach; in particular, I'm not sure how to prove we don't end up with an infinite loop of repeatedly backtracking. This also reverts the code change from D31536. It doesn't do anything useful: we should never schedule an ADJCALLSTACKDOWN unless we've already scheduled the corresponding ADJCALLSTACKUP. Differential Revision: https://reviews.llvm.org/D33818 llvm-svn: 309642
* Allow None as a MemoryLocation to getModRefInfoAlina Sbirlea2017-08-012-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Adding part of the changes in D30369 (needed to make progress): Current patch updates AliasAnalysis and MemoryLocation, but does _not_ clean up MemorySSA. Original summary from D30369, by dberlin: Currently, we have instructions which affect memory but have no memory location. If you call, for example, MemoryLocation::get on a fence, it asserts. This means things specifically have to avoid that. It also means we end up with a copy of each API, one taking a memory location, one not. This starts to fix that. We add MemoryLocation::getOrNone as a new call, and reimplement the old asserting version in terms of it. We make MemoryLocation optional in the (Instruction, MemoryLocation) version of getModRefInfo, and kill the old one argument version in favor of passing None (it had one caller). Now both can handle fences because you can just use MemoryLocation::getOrNone on an instruction and it will return a correct answer. We use all this to clean up part of MemorySSA that had to handle this difference. Note that literally every actual getModRefInfo interface we have could be made private and replaced with: getModRefInfo(Instruction, Optional<MemoryLocation>) and getModRefInfo(Instruction, Optional<MemoryLocation>, Instruction, Optional<MemoryLocation>) and delegating to the right ones, if we wanted to. I have not attempted to do this yet. Reviewers: dberlin, davide, dblaikie Subscribers: sanjoy, hfinkel, chandlerc, llvm-commits Differential Revision: https://reviews.llvm.org/D35441 llvm-svn: 309641
* [AVX-512] Add unmasked subvector inserts and extract to the execution domain ↵Craig Topper2017-07-311-0/+24
| | | | | | tables. llvm-svn: 309632
* DebugInfo: Put range base specifier entry functionality behind a flagDavid Blaikie2017-07-311-4/+9
| | | | | | | | Chromium's gold build seems to have trouble with this (gold produces errors) - not sure if it's gold that's not coping with the valid representation, or a bug in the implementation in LLVM, etc. llvm-svn: 309630
* [codeview] Ignore DBG_VALUEs when choosing a BB start source locReid Kleckner2017-07-311-0/+2
| | | | | | | | | | When the first instruction of a basic block has no location (consider a LEA materializing the address of an alloca for a call), we want to start the line table for the block with the first valid source location in the block. We need to ignore DBG_VALUE instructions during this scan to get decent line tables. llvm-svn: 309628
* [InstCombine] allow mask hoisting transform for vector typesSanjay Patel2017-07-311-33/+27
| | | | llvm-svn: 309627
* Update phi nodes in LowerTypeTests control flow simplificationPeter Collingbourne2017-07-311-0/+4
| | | | | | | | | | | | | | D33925 added a control flow simplification for -O2 --lto-O0 builds that manually splits blocks and reassigns conditional branches but does not correctly update phi nodes. If the else case being branched to had incoming phi nodes the control-flow simplification would leave phi nodes in that BB with an unhandled predecessor. Patch by Vlad Tsyrklevich! Differential Revision: https://reviews.llvm.org/D36012 llvm-svn: 309621
* [libFuzzer] implement __sanitizer_cov_pcs_init and add pc-table to build ↵Kostya Serebryany2017-07-313-6/+32
| | | | | | flags for one test (for now) llvm-svn: 309615
* [X86][MMX] Added custom lowering action for MMX SELECT (PR30418)Konstantin Belochapka2017-07-311-0/+13
| | | | | | | Fix for pr30418 - error in backend: Cannot select: t17: x86mmx = select_cc t2, Constant:i64<0>, t7, t8, seteq:ch Differential Revision: https://reviews.llvm.org/D34661 llvm-svn: 309614
* [sanitizer-coverage] don't instrument available_externally functionsKostya Serebryany2017-07-311-0/+3
| | | | llvm-svn: 309611
* [sanitizer-coverage] ensure minimal alignment for coverage counters and guardsKostya Serebryany2017-07-311-1/+2
| | | | llvm-svn: 309610
* [lld/pdb] Add an empty globals stream.Zachary Turner2017-07-314-1/+107
| | | | | | | | | | We don't write any actual symbols to this stream yet, but for now we just create the stream and hook it up to the appropriate places and give it a valid header. Differential Revision: https://reviews.llvm.org/D35290 llvm-svn: 309608
* [SLPVectorizer] Unbreak the build with -Werror.Davide Italiano2017-07-311-3/+3
| | | | | | | GCC was complaining about `&&` within `||` without explicit parentheses. NFCI. llvm-svn: 309606
* [X86][InstCombine] Add some simplifications for BZHI intrinsicsCraig Topper2017-07-311-0/+20
| | | | | | | | | | This intrinsic clears the upper bits starting at a specified index. If the index is a constant we can do some simplifications. This could be in InstSimplify, but we don't handle any target specific intrinsics there today. Differential Revision: https://reviews.llvm.org/D36069 llvm-svn: 309604
* [X86][InstCombine] Add basic simplification support for BEXTR/BEXTRI intrinsics.Craig Topper2017-07-311-0/+26
| | | | | | | | | | This patch adds simplification support for the BEXTR/BEXTRI intrinsics to match gcc. This only supports cases that fold to 0 or can be fully constant folded. Theoretically we could support converting to AND if the shift part is unused or to only a shift if the mask doesn't modify any bits after an equivalent shl. gcc doesn't do these transformations either. I put this in InstCombine, but it could be done in InstSimplify. It would be the first target specific intrinsic in InstSimplify. Differential Revision: https://reviews.llvm.org/D36063 llvm-svn: 309603
* [TargetPassConfig] Feature generic options to setup start/stop-after/beforeQuentin Colombet2017-07-312-21/+107
| | | | | | | | | | | | | | | | This patch refactors the code used in llc such that all the users of the addPassesToEmitFile API have access to a homogeneous way of handling start/stop-after/before options right out of the box. In particular, just invoking addPassesToEmitFile will set the proper pipeline without additional effort (modulo parsing a .mir file if the start-before/after options are used. NFC. Differential Revision: https://reviews.llvm.org/D30913 llvm-svn: 309599
* [CGP] use subtract or subtract-of-cmps for result of memcmp expansionSanjay Patel2017-07-311-6/+18
| | | | | | | | | | | | | | | | | | | | | | As noted in the code comment, transforming this in the other direction might require a separate transform here in CGP given the block-at-a-time DAG constraint. Besides that theoretical motivation, there are 2 practical motivations for the subtract-of-cmps form: 1. The codegen for both x86 and PPC is better for this IR (though PPC could be better still). There is discussion about canonicalizing IR to the select form ( http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html ), so we probably need to add DAG transforms for those patterns anyway, but this improves the memcmp output without waiting for that step. 2. If we allow vector-sized chunks for the load and compare, x86 is better prepared to convert that to optimal code when using subtract-of-cmps, so another prerequisite patch is avoided if we choose to enable that. Differential Revision: https://reviews.llvm.org/D34904 llvm-svn: 309597
* [DWARF] Added verification check for tags in accelerator tables. This patch ↵Spyridoula Gravani2017-07-312-7/+23
| | | | | | | | verifies that the atom tag is actually the same with the tag of the DIE that we retrieve from the table. Differential Revision: https://reviews.llvm.org/D35963 llvm-svn: 309596
* [IPSCCP] Guard a user of getInitializer with hasDefinitiveInitializerDavid Majnemer2017-07-311-1/+2
| | | | | | | We are not allowed to reason about an initializer value without first consulting hasDefinitiveInitializer. llvm-svn: 309594
* [AVX-512] Remove patterns that select vmovdqu8/16 for unmasked loads. Prefer ↵Craig Topper2017-07-311-11/+18
| | | | | | | | | | | | | | vmovdqa64/vmovdqu64 instead. These were taking priority over the aligned load instructions since there is no vmovda8/16. I don't think there is really a difference between aligned and unaligned on newer cpus so I don't think it matters which instructions we use. But with this change we reduce the size of the isel table a little and we allow the aligned information to pass through to the evex->vec pass and produce the same output has avx/avx2 in some cases. I also generally dislike patterns rooted in a bitcast which these were. Differential Revision: https://reviews.llvm.org/D35977 llvm-svn: 309589
* Strip trailing whitespace. NFCI.Simon Pilgrim2017-07-311-7/+7
| | | | llvm-svn: 309584
* Fix typo in comment.Simon Pilgrim2017-07-311-1/+1
| | | | llvm-svn: 309583
* [GISel]: Support Widening G_ICMP's destination operand.Aditya Nandakumar2017-07-313-15/+55
| | | | | | | | | Updated AArch64 to widen destination to s32. https://reviews.llvm.org/D35737 Reviewed by Tim llvm-svn: 309579
* Do not recombine FMA when that is not needed.Amaury Sechet2017-07-311-4/+16
| | | | | | | | | | | | Summary: As per title. This creates useless recombines. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33848 llvm-svn: 309578
OpenPOWER on IntegriCloud