path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [DAGCombiner] Fix for big endian in ForwardStoreValueToDirectLoad | Bjorn Pettersson | 2018-10-30 | 1 | -9/+13
  Summary: Normalize the offset for endianness before checking whether the store covers the load in ForwardStoreValueToDirectLoad. Without this we missed some optimizations for big-endian targets. For example, given a 4-byte store followed by a 1-byte load of the least significant byte of the stored value, the STCoversLD check would fail (see @test4 in test/CodeGen/AArch64/load-store-forwarding.ll).
  This patch also fixes a problem seen in an out-of-tree target. The target has i40 as a legal type, it is big endian, and the StoreSize for i40 is 48 bits. So when normalizing the offset for endianness we need to take the StoreSize into account (assuming that padding added when storing into a larger StoreSize is always added at the most significant end).
  Reviewers: niravd
  Reviewed By: niravd
  Subscribers: javed.absar, kristof.beyls, llvm-commits, uabelho
  Differential Revision: https://reviews.llvm.org/D53776
  llvm-svn: 345636
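The offset normalization described in this commit can be sketched outside of LLVM as follows. This is a minimal model, not the DAGCombiner code: the function name `storeCoversLoad` and its parameters are hypothetical, and the formula is an assumption derived from the commit message (padding sits at the most significant end of the StoreSize).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API). Returns true if a stored value of
// `storeValueBits` bits, occupying `storeSizeBits` bits in memory, covers a
// load of `loadBits` bits at byte offset `offset` from the store address.
bool storeCoversLoad(int64_t offset, unsigned storeValueBits,
                     unsigned storeSizeBits, unsigned loadBits,
                     bool bigEndian) {
  // On big-endian targets, convert the memory offset into an offset counted
  // from the least significant byte. Using the StoreSize here accounts for
  // padding assumed to sit at the most significant end.
  if (bigEndian)
    offset = (static_cast<int64_t>(storeSizeBits) -
              static_cast<int64_t>(loadBits)) / 8 - offset;
  // The load is covered only if it stays within the real value bits.
  return offset >= 0 &&
         offset * 8 + static_cast<int64_t>(loadBits) <=
             static_cast<int64_t>(storeValueBits);
}
```

With an i40 value in a 48-bit slot on a big-endian target, a 1-byte load at memory offset 0 reads padding and is rejected, while a load at memory offset 5 reads the least significant byte and is accepted.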
* [AArch64] [Windows] SEH opcodes should be scheduling boundaries. | Eli Friedman | 2018-10-30 | 2 | -0/+41
  Prevents the post-RA scheduler from modifying the prologue sequences emitted by frame lowering.
  This is roughly similar to what we do for other targets: TargetInstrInfo::isSchedulingBoundary checks isPosition(), which checks for CFI_INSTRUCTION.
  isSEHInstruction is taken from D50288; it'll land with whatever patch lands first.
  Differential Revision: https://reviews.llvm.org/D53851
  llvm-svn: 345634
* [AArch64] Create proper memoperand for multi-vector stores | David Greene | 2018-10-30 | 1 | -1/+1
  Re-apply r345315 with testcase fixes.
  Include all of the store's source vector operands when creating the MachineMemOperand. Previously, we were missing the first operand, making the store size seem smaller than it really is.
  Differential Revision: https://reviews.llvm.org/D52816
  llvm-svn: 345631
* [X86] In lowerVectorShuffleAsBroadcast, make peeking through CONCAT_VECTORS work correctly if we already walked through a bitcast that changed the element size. | Craig Topper | 2018-10-30 | 1 | -1/+2
  The CONCAT_VECTORS case was using the original mask element count to determine how to adjust the broadcast index. But if we looked through a bitcast, the original mask size doesn't tell us anything about the concat_vectors.
  This patch switches to using the concat_vectors input element count directly instead.
  Differential Revision: https://reviews.llvm.org/D53823
  llvm-svn: 345626
* [GCOV] Function counters are wrong when on one line | Calixte Denizet | 2018-10-30 | 1 | -4/+3
  Summary: After commit https://reviews.llvm.org/rL344228, function definitions have a counter, but the counter is wrong when the function is defined on one line (e.g. void foo() { }).
  I added a test in: https://reviews.llvm.org/D53601
  Reviewers: marco-c
  Reviewed By: marco-c
  Subscribers: llvm-commits, sylvestre.ledru
  Differential Revision: https://reviews.llvm.org/D53600
  llvm-svn: 345624
* [DAG] Add const variants for BaseIndexOffset functions. | Nirav Dave | 2018-10-30 | 1 | -3/+4
  llvm-svn: 345623
* Fix printing bug in pdb2yaml. | Zachary Turner | 2018-10-30 | 1 | -1/+1
  We were using the wrong enum table when mapping enum values to strings for public symbol flags.
  llvm-svn: 345622
* [SystemZ] Simplify LRV/STRV ISD nodes | Ulrich Weigand | 2018-10-30 | 4 | -47/+40
  The LRV and STRV nodes carry an extra operand to indicate the type of the memory access. This is redundant, since the nodes are actually of class MemIntrinsicNode and therefore hold that same information already as MemoryVT.
  NFC intended.
  llvm-svn: 345618
* [TTI] Fix uses of SK_ExtractSubvector shuffle costs (PR39368) | Simon Pilgrim | 2018-10-30 | 1 | -1/+1
  Correct costing of SK_ExtractSubvector requires the SubTy argument to indicate the type/size of the extracted subvector.
  Unlike the rest of the shuffle kinds, this means that the main Ty argument represents the source vector type, not the destination!
  I've done my best to fix a number of vectorizer uses:
  SLP - the reduction epilogue costs should be using a SK_PermuteSingleSrc shuffle as these all occur at the hardware vector width - we're not extracting (illegal) subvector types. This is causing the cost model diffs as SK_ExtractSubvector costs are poorly handled and tend to just return 1 at the moment.
  LV - I'm not clear on what the SK_ExtractSubvector should represent for recurrences - I've used a <1 x ?> subvector extraction as that seems to match the VF delta.
  Differential Revision: https://reviews.llvm.org/D53573
  llvm-svn: 345617
* [InstCombine] use getFltSemantics() instead of duplicating it; NFC | Sanjay Patel | 2018-10-30 | 1 | -19/+3
  llvm-svn: 345613
* [InstCombine] try to turn shuffle into insertelement | Sanjay Patel | 2018-10-30 | 1 | -0/+70
  shuffle (insert ?, Scalar, IndexC), V1, Mask --> insert V1, Scalar, IndexC'
  The motivating case is at least a couple of steps away: I noticed that SLPVectorizer does not analyze shuffles as well as sequences of insert/extract in PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724
  ...so SLP may fail to vectorize when source code has shuffles to start with or instcombine has converted insert/extract to shuffles.
  Independent of that, an insertelement is always a simpler op for IR analysis vs. a shuffle, so we should transform to insert when possible.
  I don't think there's any codegen concern here - if a target can't insert a scalar directly to some fixed element in a vector (x86?), then this should get expanded to the insert+shuffle that we started with.
  Differential Revision: https://reviews.llvm.org/D53507
  llvm-svn: 345607
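The mask condition behind this fold can be illustrated with a small standalone model. This is a simplified sketch, not the InstCombine implementation: the function name `shuffleAsInsertIndex` is hypothetical, undef mask elements and the commuted operand order are ignored, and the mask convention (values below N select from the first operand, values from N upward select from the second) follows the usual two-operand shuffle numbering.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not an LLVM API). Models
//   shuffle (insert ?, Scalar, indexC), V1, mask
// Returns the lane IndexC' such that the shuffle equals
//   insert V1, Scalar, IndexC'
// or -1 if the mask does not have that shape. Assumes 0 <= indexC < N.
int shuffleAsInsertIndex(const std::vector<int>& mask, int indexC) {
  const int n = static_cast<int>(mask.size());
  int newIndex = -1;
  for (int i = 0; i < n; ++i) {
    if (mask[i] == n + i)
      continue;  // lane i is copied unchanged from V1
    if (mask[i] == indexC && newIndex == -1) {
      newIndex = i;  // exactly one lane may take the inserted scalar
      continue;
    }
    return -1;  // any other lane choice breaks the pattern
  }
  return newIndex;
}
```

For a 4-element shuffle with mask {4, 5, 1, 7} and indexC = 1, lanes 0, 1 and 3 come straight from V1 and lane 2 takes the inserted scalar, so the model reports IndexC' = 2.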
* [SchedModel] Fix for read advance cycles with implicit pseudo operands. | Jonas Paulsson | 2018-10-30 | 1 | -4/+16
  The SchedModel allows the addition of ReadAdvances to express that certain operands of the instructions are needed at a later point than the others.
  RegAlloc may add pseudo operands that are not part of the instruction descriptor, and therefore cannot have any read advance entries. This meant that in some cases the desired read advance was nullified by such a pseudo operand, which still had the original latency.
  This patch fixes this by making sure that such pseudo operands get a zero latency during DAG construction.
  Review: Matthias Braun, Ulrich Weigand.
  https://reviews.llvm.org/D49671
  llvm-svn: 345606
* [LoopVectorizer] Fix for cost values of memory accesses. | Jonas Paulsson | 2018-10-30 | 1 | -1/+8
  This commit is a combination of two patches:
  * "Fix in getScalarizationOverhead()"
    If target returns false in TTI.prefersVectorizedAddressing(), it means the address registers will not need to be extracted. Therefore, there should be no operands scalarization overhead for a load instruction.
  * "Don't pass the instruction pointer from getMemInstScalarizationCost."
    Since VF is always > 1, this is a cost query for an instruction in the vectorized loop and it should not be evaluated within the scalar context of the instruction.
  Review: Ulrich Weigand, Hal Finkel
  https://reviews.llvm.org/D52351
  https://reviews.llvm.org/D52417
  llvm-svn: 345603
* [DAGCombiner] narrow vector binops when extraction is cheap | Sanjay Patel | 2018-10-30 | 1 | -11/+30
  Narrowing vector binops came up in the demanded bits discussion in D52912.
  I don't think we're going to be able to do this transform in IR as a canonicalization because of the risk of creating unsupported widths for vector ops, but we already have a DAG TLI hook to allow what I was hoping for: isExtractSubvectorCheap(). This is currently enabled for x86, ARM, and AArch64 (although only x86 has existing regression test diffs).
  This is artificially limited to not look through bitcasts because there are so many test diffs already, but that's marked with a TODO and is a small follow-up.
  Differential Revision: https://reviews.llvm.org/D53784
  llvm-svn: 345602
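The idea of narrowing a vector binop can be shown with plain arrays: extracting a subvector of an element-wise result gives the same values as applying the operation to the extracted halves. The sketch below is only an illustration of that equivalence; the names `addVec`, `extractSub`, `wideThenExtract` and `extractThenNarrow` are hypothetical, and element-wise addition stands in for any lane-wise binop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Element-wise add, standing in for any lane-wise vector binop.
Vec addVec(const Vec& x, const Vec& y) {
  Vec r(x.size());
  for (std::size_t i = 0; i < x.size(); ++i)
    r[i] = x[i] + y[i];
  return r;
}

// Models a subvector extraction: `len` elements starting at `start`.
Vec extractSub(const Vec& v, std::size_t start, std::size_t len) {
  return Vec(v.begin() + start, v.begin() + start + len);
}

// Wide form: compute the full-width binop, then extract.
Vec wideThenExtract(const Vec& x, const Vec& y, std::size_t start,
                    std::size_t len) {
  return extractSub(addVec(x, y), start, len);
}

// Narrow form: extract both operands first, then do the smaller binop.
Vec extractThenNarrow(const Vec& x, const Vec& y, std::size_t start,
                      std::size_t len) {
  return addVec(extractSub(x, start, len), extractSub(y, start, len));
}
```

Both forms produce identical lanes; the narrow form only pays off when the target reports that the extraction itself is cheap, which is what the TLI hook named in the commit decides.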
* [SelectionDAG] fix build warning for mismatched signs in compare; NFC | Sanjay Patel | 2018-10-30 | 1 | -1/+1
  llvm-svn: 345598
* [SystemZ] Improve isFoldableLoad() for Sub, SDiv and UDiv. | Jonas Paulsson | 2018-10-30 | 1 | -0/+5
  Sub, SDiv and UDiv are not commutative, so only the RHS operand can fold a load. This patch adds a check for this.
  Review: Ulrich Weigand
  https://reviews.llvm.org/D53791
  llvm-svn: 345596
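The commutativity rule stated here can be sketched as a small predicate. This is a toy model rather than the SystemZ code: the `Op` enum and the function `canFoldLoadOperand` are hypothetical, and the real isFoldableLoad() checks other conditions that are not modeled.

```cpp
#include <cassert>

// Hypothetical opcode set for illustration only.
enum class Op { Add, Mul, Sub, SDiv, UDiv };

bool isCommutative(Op op) { return op == Op::Add || op == Op::Mul; }

// operandIdx: 0 = LHS, 1 = RHS. A commutative op can swap its operands so
// that the load ends up on the RHS; a non-commutative op cannot, so only a
// load that is already the RHS operand is foldable.
bool canFoldLoadOperand(Op op, unsigned operandIdx) {
  return isCommutative(op) || operandIdx == 1;
}
```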
* [X86] Re-enable the machine verifier after fixing more tests | Francis Visoiu Mistrih | 2018-10-30 | 1 | -4/+0
  Was disabled again in r345528. Hopefully this fixes the bots.
  llvm-svn: 345593
* [llc] Error out when -print-machineinstrs is used with an unknown pass | Francis Visoiu Mistrih | 2018-10-30 | 1 | -9/+11
  We used to assert instead of reporting an error.
  PR39494
  llvm-svn: 345589
* [SROA] Use offset sizes from the DataLayout instead of the pointer sizes. | Nicola Zaghen | 2018-10-30 | 1 | -6/+6
  This fixes an assertion when constant folding a GEP when part of the offset was in i32 (IndexSize, as per DataLayout) and part in i64 (PointerSize) in the newly created test case.
  Differential Revision: https://reviews.llvm.org/D52609
  llvm-svn: 345585
* [X86][BMI1] X86DAGToDAGISel: select BEXTR from x & (-1 >> (32 - y)) pattern | Roman Lebedev | 2018-10-30 | 2 | -58/+40
  Summary: The final pattern. There are no test changes:
  * We are looking for the pattern with one-use of its mask,
  * If the mask is one-use, D48768 will unfold it into pattern d.
  * Thus, the tests have extra-use on the mask.
  * Thus, only the BMI2 BZHI can be tested, and it already worked.
  * So there is no BMI1 test coverage, we just assume it works since it uses the same codepath.
  Reviewers: craig.topper, RKSimon
  Reviewed By: RKSimon
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D53575
  llvm-svn: 345584
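The pattern in the title keeps the low y bits of x, which is why it can be matched to a bit-field extract starting at bit 0. The sketch below shows that equivalence on 32-bit values. Both functions are hypothetical stand-ins: `lowBitsViaMask` spells out the source pattern, and `bextr32` is a simplified software model of a "start, length" bit-field extract, not the instruction's exact encoding of its control operand.

```cpp
#include <cassert>
#include <cstdint>

// The source pattern: x & (-1 >> (32 - y)), with a logical shift of
// all-ones. Only valid for 1 <= y <= 32; y == 0 would shift by 32, which
// is undefined in C++.
uint32_t lowBitsViaMask(uint32_t x, unsigned y) {
  return x & (UINT32_MAX >> (32 - y));
}

// Simplified model of a bit-field extract: `len` bits of x starting at bit
// `start`, zero-extended.
uint32_t bextr32(uint32_t x, unsigned start, unsigned len) {
  if (start >= 32)
    return 0;
  x >>= start;
  if (len >= 32)
    return x;
  return x & ((UINT32_C(1) << len) - 1);
}
```

For every y in the valid range, `lowBitsViaMask(x, y)` equals `bextr32(x, 0, y)`.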
* [AArch64] Add support for UDF instruction | Diogo N. Sampaio | 2018-10-30 | 2 | -10/+29
  Summary: Add support for AArch64 UDF instruction.
  UDF - Permanently Undefined generates an Undefined Instruction exception (ESR_ELx.EC = 0b000000).
  Reviewers: DavidSpickett, javed.absar, t.p.northover
  Reviewed By: javed.absar
  Subscribers: nhaehnle, kristof.beyls
  Differential Revision: https://reviews.llvm.org/D53319
  llvm-svn: 345581
* [SelectionDAG] Add FoldBUILD_VECTOR to simplify new BUILD_VECTOR nodes | Simon Pilgrim | 2018-10-30 | 2 | -16/+76
  Similar to FoldCONCAT_VECTORS, this patch adds FoldBUILD_VECTOR to simplify cases that can avoid the creation of the BUILD_VECTOR - if all the operands are UNDEF or if the BUILD_VECTOR simplifies to a copy.
  This exposed an assumption in some AMDGPU code that getBuildVector was guaranteed to be a BUILD_VECTOR node that I've tried to handle.
  Differential Revision: https://reviews.llvm.org/D53760
  llvm-svn: 345578
* [DAGCombiner] Improve X div/rem Y fold if single bit element type | David Bolvansky | 2018-10-30 | 1 | -3/+4
  Summary: Tests by @spatel, thanks
  Reviewers: spatel, RKSimon
  Reviewed By: spatel
  Subscribers: sdardis, atanasyan, llvm-commits, spatel
  Differential Revision: https://reviews.llvm.org/D52668
  llvm-svn: 345575
* [LegalizeTypes] Teach PromoteIntRes_BITCAST to better handle a bitcast with vector output type and a vector input type that needs to be widened | Craig Topper | 2018-10-30 | 2 | -8/+22
  Summary: Previously if we had a bitcast vector output type that needs promotion and a vector input type that needs widening we would just do a stack store and load to handle the conversion.
  We can do a little better if we can widen the bitcast to a legal vector type the same size as the widened input type. Then we can do the bitcast between this widened type and the widened input type. Afterwards we can extract_subvector back to the original output and any_extend that. Type legalization will then circle back and handle promotion of the extract_subvector and the any_extend will just be removed.
  This will avoid going through the stack and allows us to remove a custom version of this legalization from X86.
  Reviewers: efriedma, RKSimon
  Reviewed By: efriedma
  Subscribers: javed.absar, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53229
  llvm-svn: 345567
* [X86] Cleanup the code in LowerFABSorFNEG and LowerFCOPYSIGN a little. NFC | Craig Topper | 2018-10-30 | 1 | -30/+20
  Use SelectionDAG::EVTToAPFloatSemantics.
  Make the LogicVT calculation in LowerFABSorFNEG similar to LowerFCOPYSIGN.
  Use APInt::getSignedMaxValue instead of ~APInt::getSignMask.
  llvm-svn: 345565
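The last item relies on the fact that the bitwise complement of the sign mask is the same bit pattern as the signed maximum value. The sketch below shows that identity with plain 32-bit integers and uses the mask to clear a float's sign bit, which is the kind of constant an FABS lowering needs. It uses only standard C++ rather than APInt, the helper names are hypothetical, and it assumes `float` is a 32-bit IEEE-754 value.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Complement of the 32-bit sign mask: every bit set except the sign bit.
uint32_t invertedSignMask() { return ~UINT32_C(0x80000000); }

// The signed 32-bit maximum, viewed as an unsigned bit pattern.
uint32_t signedMaxBits() {
  return static_cast<uint32_t>(std::numeric_limits<int32_t>::max());
}

// Clears the sign bit of a float by AND-ing its bits with the mask.
// Assumes float is a 32-bit IEEE-754 value.
float fabsViaMask(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  bits &= signedMaxBits();
  std::memcpy(&f, &bits, sizeof f);
  return f;
}
```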
* [X86] Stop changing f128 fand/for/fxor to v2i64. | Craig Topper | 2018-10-30 | 2 | -21/+38
  The additional patterns don't cost us much and it seems better than changing element widths.
  llvm-svn: 345564
* AMDGPU: Remove custom BUILD_VECTOR combine | Matt Arsenault | 2018-10-30 | 2 | -46/+0
  This was looping in a testcase and removing it now slightly improves a test.
  llvm-svn: 345560
* AMDGPU: Use scavengeRegisterBackwards | Matt Arsenault | 2018-10-30 | 1 | -2/+3
  llvm-svn: 345559
* Remove dead declaration | Matt Arsenault | 2018-10-30 | 1 | -7/+0
  llvm-svn: 345555
* Pass TRI to printReg | Matt Arsenault | 2018-10-30 | 1 | -1/+1
  llvm-svn: 345553
* Remove unneeded friend declarations that clang-cl warns on | Reid Kleckner | 2018-10-29 | 2 | -4/+0
  llvm-svn: 345549
* [AliasSetTracker] Cleanup addPointer interface. [NFCI] | Alina Sbirlea | 2018-10-29 | 1 | -6/+6
  Summary: Attempting to simplify the addPointer interface. Currently there's code decomposing a MemoryLocation into (Ptr, Size, AAMDNodes) only to recreate the MemoryLocation inside the call.
  Reviewers: reames, mkazantsev
  Subscribers: sanjoy, jlebar, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53836
  llvm-svn: 345548
* [DWARF][NFC] Refactor range list extraction and dumping | Wolfgang Pieb | 2018-10-29 | 7 | -229/+214
  The purpose of this patch is twofold:
  - Fold pre-DWARF v5 functionality into v5 to eliminate the need for 2 different versions of range list handling. We get rid of DWARFDebugRangelist{.cpp,.h}.
  - Templatize the handling of range list tables so that location list handling can take advantage of it as well. Location list and range list tables have the same basic layout.
  A non-NFC version of this patch was previously submitted with r342218, but it caused errors with some TSan tests. This patch has no functional changes. The difference to the non-NFC patch is that there are no changes to rangelist dumping in this patch.
  Differential Revision: https://reviews.llvm.org/D53545
  llvm-svn: 345546
* Add parens to fix incorrect assert check. | Erich Keane | 2018-10-29 | 1 | -1/+1
  && has higher priority than ||, so this assert works really oddly. Add parens to match the programmer's intent.
  Change-Id: I3abe1361ee0694462190c5015779db664012f3d4
  llvm-svn: 345543
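The precedence issue can be reproduced with three booleans. The condition below is a made-up stand-in, since the actual assert expression is not shown in this log: `a || b && c` groups as `a || (b && c)`, which differs from `(a || b) && c` whenever `a` is true and `c` is false.

```cpp
#include <cassert>

// How the compiler groups "a || b && c": && binds tighter than ||.
bool asParsed(bool a, bool b, bool c) { return a || (b && c); }

// The grouping obtained by adding explicit parentheses around the ||.
bool withParens(bool a, bool b, bool c) { return (a || b) && c; }
```

With a = true, b = false, c = false, the first form is true while the second is false, so an assert written without the parentheses can pass when the intended check would fail.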
* AMDGPU: Enable code object v3 by default | Konstantin Zhuravlyov | 2018-10-29 | 1 | -15/+30
  Differential Revision: https://reviews.llvm.org/D53525
  llvm-svn: 345542
* [MachineOutliner] Inherit target features from parent function | Jessica Paquette | 2018-10-29 | 1 | -0/+8
  If a function has target features, it may contain instructions that aren't represented in the default set of instructions. If the outliner pulls out one of these instructions, and the function doesn't have the right attributes attached, we'll run into an LLVM error explaining that the target doesn't support the necessary feature for the instruction.
  This makes outlined functions inherit target features from their parents.
  It also updates the machine-outliner.ll test to check that we're properly inheriting target features.
  llvm-svn: 345535
* [X86] Set isMachineVerifierClean() back to false (PR27481) | Simon Pilgrim | 2018-10-29 | 1 | -0/+4
  Put back the isMachineVerifierClean() override removed at rL345513 to fix Windows ThinLTO tests
  llvm-svn: 345528
* [HotColdSplitting] Allow outlining single-block cold regions | Vedant Kumar | 2018-10-29 | 1 | -3/+20
  It can be profitable to outline single-block cold regions because they may be large.
  Allow outlining single-block regions if they have over some threshold of non-debug, non-terminator instructions. I chose 3 as the threshold after experimenting with several internal frameworks.
  In practice, reducing the threshold further did not give much improvement, whereas increasing it resulted in substantial regressions.
  Differential Revision: https://reviews.llvm.org/D53824
  llvm-svn: 345524
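The threshold rule can be sketched as a simple instruction count. This is an illustrative model, not the pass's code: the `Inst` struct and the function `worthOutliningSingleBlock` are hypothetical, and reading "over some threshold" as a strict greater-than comparison is an assumption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical instruction record for illustration only.
struct Inst {
  bool isDebug;
  bool isTerminator;
};

// Counts the instructions that are neither debug nor terminator and compares
// the count against the threshold (3 in the commit message). The strict
// comparison is an assumption about what "over" means.
bool worthOutliningSingleBlock(const std::vector<Inst>& block,
                               unsigned threshold = 3) {
  unsigned count = 0;
  for (const Inst& inst : block)
    if (!inst.isDebug && !inst.isTerminator)
      ++count;
  return count > threshold;
}
```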
* [WebAssembly] Lower away condition truncations for scalar selects | Thomas Lively | 2018-10-29 | 2 | -0/+14
  Reviewers: aheejin, dschuff
  Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53676
  llvm-svn: 345521
* [X86][SSE] getFauxShuffleMask - Fix shuffle mask adjustment for multiple inserted subvectors | Simon Pilgrim | 2018-10-29 | 1 | -4/+3
  Part of the issue discovered in PR39483, although it's not fully exposed until I reapply rL345395 (by reverting rL345451)
  llvm-svn: 345520
* [X86] Add AES to KNL CPUs to match clang. | Craig Topper | 2018-10-29 | 1 | -0/+1
  I believe this was lost from KNL when AES was pushed from Westmere to Skylake recently. KNL used to inherit from IVB.
  llvm-svn: 345519
* [AMDGPU] Fixed return value causing warning and regression | Stanislav Mekhanoshin | 2018-10-29 | 1 | -1/+1
  llvm-svn: 345518
* [AArch64] Rename FP16FML instruction format (NFC) | Bryan Chan | 2018-10-29 | 2 | -72/+78
  Rename SIMDThreeSameMult (etc.) to SIMDThreeSameVectorFML (etc.) to follow usual naming convention, and add some comments in the .td files.
  llvm-svn: 345515
* [AMDGPU] Match v_swap_b32 | Stanislav Mekhanoshin | 2018-10-29 | 2 | -0/+175
  Differential Revision: https://reviews.llvm.org/D52677
  llvm-svn: 345514
* [X86] Enable the MachineVerifier by default | Francis Visoiu Mistrih | 2018-10-29 | 1 | -4/+0
  The machine verifier was disabled for x86 by default. There are now only 9 tests failing, compared to what previously was between 20 and 30.
  This is a good opportunity to file bugs for all the remaining issues, then explicitly disable the failing tests and enable the machine verifier by default.
  This allows us to avoid adding new tests that break the verifier.
  PR27481
  llvm-svn: 345513
* [Intrinsic] Signed and Unsigned Saturation Subtraction Intrinsics | Leonard Chan | 2018-10-29 | 10 | -38/+101
  Add an intrinsic that takes 2 integers and performs saturation subtraction on them.
  This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics.
  Differential Revision: https://reviews.llvm.org/D53783
  llvm-svn: 345512
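Saturating subtraction clamps the result to the representable range instead of wrapping. The sketch below gives a plain C++ reference model for 32-bit operands. The function names `ssubSat32` and `usubSat32` are hypothetical, and the code models only the arithmetic, not how the intrinsics are lowered.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Signed saturating subtract: compute in 64 bits, then clamp to int32_t.
int32_t ssubSat32(int32_t a, int32_t b) {
  const int64_t wide = static_cast<int64_t>(a) - static_cast<int64_t>(b);
  const int64_t lo = std::numeric_limits<int32_t>::min();
  const int64_t hi = std::numeric_limits<int32_t>::max();
  if (wide < lo)
    return std::numeric_limits<int32_t>::min();
  if (wide > hi)
    return std::numeric_limits<int32_t>::max();
  return static_cast<int32_t>(wide);
}

// Unsigned saturating subtract: clamp at zero instead of wrapping around.
uint32_t usubSat32(uint32_t a, uint32_t b) { return a > b ? a - b : 0; }
```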
* [AArch64] Return address signing B key support | Luke Cheeseman | 2018-10-29 | 1 | -3/+20
  - Add support to generate AUTIBSP, PACIBSP, RETAB instructions for return address signing
  - The key used to sign the function is controlled by the function attribute "sign-return-address-key"
  Differential Revision: https://reviews.llvm.org/D51427
  llvm-svn: 345511
* [LLVM-C] Add Builder Bindings to Common Memory Intrinsics | Robert Widmann | 2018-10-29 | 1 | -0/+24
  Summary: Add IRBuilder bindings for memmove, memcpy, and memset.
  Reviewers: whitequark, deadalnix
  Reviewed By: whitequark
  Subscribers: harlanhaskins, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53555
  llvm-svn: 345508
* [X86] Force floating point values in constant pool decoding to print in scientific notation so they can't be confused with integers. | Craig Topper | 2018-10-29 | 1 | -1/+2
  When the floating point constants are whole numbers they have no decimal point so look like integers, but mean something very different in something like an 'and' instruction.
  Ideally we would just print a decimal point and a 0, but I couldn't see how to make APFloat::toString do that.
  llvm-svn: 345488
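The ambiguity described here is easy to reproduce with standard C++ streams. The sketch below is only an analogy: it uses `std::ostringstream` rather than APFloat::toString, and the helper names are hypothetical. A whole-number double prints without a decimal point by default, while scientific formatting always shows a mantissa and exponent.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Default stream formatting: a whole number prints like an integer.
std::string formatDefault(double v) {
  std::ostringstream os;
  os << v;
  return os.str();
}

// Scientific formatting: the output cannot be mistaken for an integer.
std::string formatScientific(double v) {
  std::ostringstream os;
  os << std::scientific << v;
  return os.str();
}
```

With these helpers, `formatDefault(2.0)` yields "2" and `formatScientific(2.0)` yields "2.000000e+00".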
* [X86] Recognize constant splats in LowerFCOPYSIGN. | Craig Topper | 2018-10-28 | 1 | -1/+1
  llvm-svn: 345484