summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [AArch64] Prefer static_cast<> to C-style cast. NFCI.Davide Italiano2017-02-191-2/+4
| | | | llvm-svn: 295615
* [X86][SSE] Use getTargetConstantBitsFromNode to find zeroable shuffle elements.Simon Pilgrim2017-02-191-35/+31
| | | | | | | | Replaces existing approach that could only search BUILD_VECTOR nodes. Requires getTargetConstantBitsFromNode to discriminate cases with all/partial UNDEF bits in each element - this should also be useful when we get around to supporting getTargetShuffleMaskIndices with UNDEF elements. llvm-svn: 295613
* [AVX-512] Add patterns to recognize masked vpternlog when the passthrough ↵Craig Topper2017-02-191-0/+39
| | | | | | | | operand is not operand 0. This uses a SDNodeXForm to swizzle the appropriate immediate bits to allow this to be matched. llvm-svn: 295612
* [X86][SSE] Enable initial support for domain crossing at high shuffle ↵Simon Pilgrim2017-02-191-3/+3
| | | | | | | | | | combine depths. As discussed on D27692, this permits another domain to be used to combine a shuffle at high depths. We currently set the required depth at 4 or more combined shuffles, this is probably too high for most targets but is a good starting point and already helps avoid a number of costly variable shuffles. llvm-svn: 295608
* Remove redundant call to GluedNodes.back() [NFC]Artyom Skrobov2017-02-191-2/+1
| | | | llvm-svn: 295607
* [X86][SSE] Generalize INSERTPS/SHUFPS/SHUFPD combines across domains.Simon Pilgrim2017-02-191-14/+23
| | | | | | Relax the INSERTPS/SHUFPS/SHUFPD combines to support integer inputs if permitted. llvm-svn: 295606
* [X86][SSE] Add domain crossing support for target shuffle combines.Simon Pilgrim2017-02-191-36/+48
| | | | | | | | Add the infrastructure to flag whether float and/or int domains are permitable. A future patch will enable domain crossing based off shuffle depth and the value types of the source vectors. llvm-svn: 295604
* Removed extra ';'Simon Pilgrim2017-02-191-1/+1
| | | | llvm-svn: 295603
* [AVX-512] Add broadcast VPTERNLOG instructions to special case commuting switch.Craig Topper2017-02-191-13/+37
| | | | | | | | The instructions are marked commutable, but without special handling we don't get the immediate correct. While here also remove the masked memory forms that aren't commutable. llvm-svn: 295602
* Add two files lost in rebase, causing build breakDaniel Berlin2017-02-191-0/+108
| | | | llvm-svn: 295595
* Add a DebugCounter for PredicateInfo renaming, and an associated testDaniel Berlin2017-02-191-0/+8
| | | | llvm-svn: 295594
* Add initial support for debug countingDaniel Berlin2017-02-192-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: We have support for bisection, and bugpoint can reduce testcases often to a single pass. But that doesn't help reduce it to a single transform by a single pass. Which debug counting lets us do. Debug counting lets you instrument a pass so that it only executes a certain thing (rwhatever you want) after skipping it a certain time of times, and then only does a certain number of executions before saying "skip" again. To make it concrete, for predicateinfo, if i instrument use renaming, i can make it so it skips renaming the first N uses, renames the next N, and then skips the rest. This lets you narrow down a miscompilation to, often, a single transformation, and then also debug it (by using the same command line parameters). Reviewers: chandlerc, davide, mehdi_amini Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D29998 llvm-svn: 295593
* [X86] Remove patterns for MOVSD with v4i32 types. We don't appear to really ↵Craig Topper2017-02-192-12/+0
| | | | | | need them and if we do we should just use a bitcast to a 64-bit element type. llvm-svn: 295589
* [X86] Tighten up some of the SDNode type constraints.Craig Topper2017-02-191-26/+43
| | | | llvm-svn: 295588
* Fix unused variable warning when assertions are disabled.Simon Pilgrim2017-02-191-4/+4
| | | | llvm-svn: 295587
* [X86] Fix enumeral/non-enumeral conditional expression warning.Simon Pilgrim2017-02-191-4/+4
| | | | | | gcc only allows you to mix enums / ints if they have the same signedness. llvm-svn: 295586
* Fix 'variable set but not used' warning when assertions are disabled.Simon Pilgrim2017-02-191-0/+2
| | | | llvm-svn: 295585
* NewGVN: Start making use of predicateinfo pass.Daniel Berlin2017-02-181-18/+275
| | | | | | | | | | | | Summary: This begins using the predicateinfo pass in NewGVN. Reviewers: davide Subscribers: llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D29682 llvm-svn: 295583
* NewGVN: Make ranking prefer undef to constants. Fix direction ofDaniel Berlin2017-02-181-8/+9
| | | | | | shouldSwapOperands to be correct. llvm-svn: 295582
* PredicateInfo: Clean up predicate info a little, using insertionDaniel Berlin2017-02-181-67/+93
| | | | | | helpers, and fixing support for the renaming the comparison. llvm-svn: 295581
* Fix signed/unsigned comparison warning.Simon Pilgrim2017-02-181-2/+2
| | | | llvm-svn: 295580
* [X86][XOP] Reduce the size of a multiclass by moving more stuff to ↵Craig Topper2017-02-184-70/+42
| | | | | | | | parameters instead of doing 128-bit and 256-bit simultaneously. This requires some instructions to be renamed to move the Y earlier in the instruction name. The new names are more consistent with other instructions. llvm-svn: 295579
* [AArch64] Fix enumeral/non-enumeral conditional expression warning.Simon Pilgrim2017-02-181-4/+4
| | | | | | gcc only allows you to mix enums / ints if they have the same signedness. llvm-svn: 295577
* [X86] Fix enumeral/non-enumeral comparison warning.Simon Pilgrim2017-02-181-1/+1
| | | | | | gcc only allows you to mix enums / ints if they have the same signedness. llvm-svn: 295576
* [X86][SSE] Avoid repeated calls to SDValue::getValueType.Simon Pilgrim2017-02-181-4/+7
| | | | | | Added assertion to check input type of X86ISD::VZEXT during target known bits calculation. llvm-svn: 295575
* [InstCombine] add nsw/nuw X, signbit --> or X, signbitSanjay Patel2017-02-181-2/+9
| | | | | | | | | Changing to 'or' (rather than 'xor' when no wrapping flags are set) allows icmp simplifies to happen as expected. Differential Revision: https://reviews.llvm.org/D29729 llvm-svn: 295574
* [InstSimplify] add nsw/nuw (xor X, signbit), signbit --> XSanjay Patel2017-02-181-1/+11
| | | | | | | | | The change to InstCombine in: https://reviews.llvm.org/D29729 ...exposes this missing fold in InstSimplify, so adding this first to avoid a regression. llvm-svn: 295573
* Recommit "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR."Craig Topper2017-02-183-65/+22
| | | | | | Clang has now been fixed to not use these intrinsics. llvm-svn: 295571
* [x86] fold sext (xor Bool, -1) --> sub (zext Bool), 1Sanjay Patel2017-02-181-0/+10
| | | | | | | This is the same transform that is current used for: select Bool, 0, -1 llvm-svn: 295568
* [MemorySSA] NFC small fixesPiotr Padlewski2017-02-181-9/+6
| | | | | | | | | | | | | | Summary: 2 small fixes extracted from https://reviews.llvm.org/D29064 Reviewers: kuhar, davide, dberlin, george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30109 llvm-svn: 295566
* Revert "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR."Craig Topper2017-02-183-22/+65
| | | | | | This reverts r295564. I missed that clang was still using the intrinsics despite our half implemented autoupgrade support. llvm-svn: 295565
* [X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR.Craig Topper2017-02-183-65/+22
| | | | | | It seems we were already upgrading 128-bit VPCMOV, but the intrinsic was still defined and being used in isel patterns. While I was here I also simplified the tablegen multiclasses. llvm-svn: 295564
* [X86][IR] Simplify the XOP vpcmov autoupgrade code. NFCCraig Topper2017-02-181-7/+3
| | | | llvm-svn: 295563
* [X86][IR] Merge together some very similar AutoUpgrade handling. NFCCraig Topper2017-02-181-24/+16
| | | | llvm-svn: 295562
* AMDGPU: Fix assembler subtarget predicate for gfx9Matt Arsenault2017-02-183-1/+13
| | | | | | This was accepting GFX9 instructions on VI. llvm-svn: 295557
* AMDGPU: Fix disassembly of aperture registersMatt Arsenault2017-02-181-0/+5
| | | | llvm-svn: 295555
* AMDGPU: Merge initial gfx9 supportMatt Arsenault2017-02-1818-41/+239
| | | | llvm-svn: 295554
* Refactor instruction simplification code in visitors. NFC.Easwaran Raman2017-02-181-88/+67
| | | | | | | | | | | | | | Several visitors check if operands to the instruction are constants, either as it is or after looking up SimplifiedValues, check if the result is a constant and update the SimplifiedValues map. This refactoring splits it into a common function that does the checking of whether the operands are constants and updating of the SimplifiedValues table, and an instruction specific part that is implemented by each instruction visitor as a lambda and passed to the common function. Differential revision: https://reviews.llvm.org/D30104 llvm-svn: 295552
* [AVX-512] Remove 128/256-bit masked fp max/min intrinsics. Upgrade them to ↵Craig Topper2017-02-182-8/+38
| | | | | | legacy unmasked intrinsics and select instructions. llvm-svn: 295543
* AMDGPU/R600: Assert on infinite loop in EmitClauseMarkersJan Vesely2017-02-181-3/+5
| | | | | | Differential Revision: https://reviews.llvm.org/D29792 llvm-svn: 295539
* Increases full-unroll threshold.Dehao Chen2017-02-183-31/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The default threshold for fully unroll is too conservative. This patch doubles the full-unroll threshold This change will affect the following speccpu2006 benchmarks (performance numbers were collected from Intel Sandybridge): Performance: 403 0.11% 433 0.51% 445 0.48% 447 3.50% 453 1.49% 464 0.75% Code size: 403 0.56% 433 0.96% 445 2.16% 447 2.96% 453 0.94% 464 8.02% The compiler time overhead is similar with code size. Reviewers: davidxl, mkuper, mzolotukhin, hfinkel, chandlerc Reviewed By: hfinkel, chandlerc Subscribers: mehdi_amini, zzheng, efriedma, haicheng, hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D28368 llvm-svn: 295538
* [IR/Verifier] Don't visit DISubprograms more than needed.Davide Italiano2017-02-181-2/+0
| | | | | | | | | | | | | Before this patch we happened to visit twice, one when scanning MDNodes and the other one while visiting the function. Remove the explicit call to visitDISubprogram there, so we don't emit the same error twice in case the verifier fail and we save some time when running it. Thanks to Justin Bogner for the report and Adrian for the quick review! PR: 31995 llvm-svn: 295537
* [AVR] Set UseIntegratedAssemblerDylan McKay2017-02-181-0/+1
| | | | llvm-svn: 295535
* OptDiag: Allow constructing DiagnosticLocation from DISubprogramsJustin Bogner2017-02-182-3/+10
| | | | | | | | This avoids creating a DILocation just to represent a line number, since creating Metadata is expensive. Creating a DiagnosticLocation directly is much cheaper. llvm-svn: 295531
* Don't assume little endian in StreamReader / StreamWriter.Zachary Turner2017-02-1815-152/+75
| | | | | | | In an effort to generalize this so it can be used by more than just PDB code, we shouldn't assume little endian. llvm-svn: 295525
* OptDiag: Decouple backend diagnostics from debug info metadataJustin Bogner2017-02-182-52/+49
| | | | | | | | | This creates and uses a DiagnosticLocation type rather than using DebugLoc for this purpose in the backend diagnostics. This is NFC for now, but will allow us to create locations for diagnostics without having to create new metadata nodes when we don't have a DILocation. llvm-svn: 295519
* MachineRegionInfo: Fix pass initializationMatthias Braun2017-02-183-9/+14
| | | | | | | | | | | | | | | | - Adapt MachineBasicBlock::getName() to have the same behavior as the IR BasicBlock (Value::getName()). - Add it to lib/CodeGen/CodeGen.cpp::initializeCodeGen so that it is linked in the CodeGen library. - MachineRegionInfoPass's name conflicts with RegionInfoPass's name ("region"). - MachineRegionInfo should depend on MachineDominatorTree, MachinePostDominatorTree and MachineDominanceFrontier instead of their respective IR versions. - Since there were no tests for this, add a X86 MIR test. Patch by Francis Visoiu Mistrih<fvisoiumistrih@apple.com> llvm-svn: 295518
* Verifier: Disallow a line number without a file in DISubprogramJustin Bogner2017-02-171-0/+2
| | | | | | | | A line number doesn't make much sense if you don't say where it's from. Add a verifier check for this and update some tests that had bogus debug info. llvm-svn: 295516
* AArch64LoadStoreOptimizer: Correctly clear kill flagsMatthias Braun2017-02-171-2/+4
| | | | | | | | | | | When promoting the Load of a Store-Load pair to a COPY all kill flags between the store and the load need to be cleared. rdar://30402435 Differential Revision: https://reviews.llvm.org/D30110 llvm-svn: 295512
* [PPC] Give unaligned memory access lower cost on processor that supports itGuozhi Wei2017-02-171-0/+4
| | | | | | | | | | | | Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handles this case in PPCTTIImpl::getMemoryOpCost. This patch fixes pr31492. Differential Revision: https://reviews.llvm.org/D28630 This is resubmit of r292680, which was reverted by r293092. The internal application failures were actually caused by a source code bug. llvm-svn: 295506
OpenPOWER on IntegriCloud