summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* Re-land "[ThinLTO] Add call edges' relative block frequency to per-module ↵Easwaran Raman2018-01-254-21/+59
| | | | | | | | | | | | | | summary." It was reverted after buildbot regressions. Original commit message: This allows relative block frequency of call edges to be passed to the thinlink stage where it will be used to compute synthetic entry counts of functions. llvm-svn: 323460
* [Hexagon] SETEQ and SETNE are valid integer condition codesKrzysztof Parzyszek2018-01-251-1/+2
| | | | llvm-svn: 323452
* Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements ↵Alexey Bataev2018-01-251-350/+129
| | | | | | | | as shuffle." This reverts commit r323441 to fix buildbots. llvm-svn: 323447
* [LTO] - Introduce GlobalResolution::Prevailing flag.George Rimar2018-01-251-15/+9
| | | | | | | | | It is NFC refactoring change that will make D42107 a bit smaller. Differential revision: https://reviews.llvm.org/D42528 llvm-svn: 323444
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle.Alexey Bataev2018-01-251-129/+350
| | | | | | | | | | | | | | | | | Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and cost of this gather is counter as cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323441
* [X86] Apply clang-format to detectUSatPattern. NFCI.Simon Pilgrim2018-01-251-5/+4
| | | | | | Cleanup from D42544 llvm-svn: 323439
* Revert "[Hexagon] Replace EmitFunctionEntryCode with a DAG preprocessing code"Krzysztof Parzyszek2018-01-252-22/+16
| | | | | | This reverts r323374. The fix needs a different approach. llvm-svn: 323438
* [InstCombine] narrow masked zexted binops (PR35792)Sanjay Patel2018-01-252-0/+71
| | | | | | | | | | | | | | | | | This is guarded by shouldChangeType(), so the tests show that we don't do the fold if the narrower type is not legal. Note that there is a proposal (D42424) that would change the results for the specific cases shown in these tests. That difference is also discussed in PR35792: https://bugs.llvm.org/show_bug.cgi?id=35792 Alive proofs for the cases handled here as well as the bitwise logic binops that we should already do better on: https://rise4fun.com/Alive/c97 https://rise4fun.com/Alive/Lc5E https://rise4fun.com/Alive/kdf llvm-svn: 323437
* Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements ↵Alexey Bataev2018-01-251-352/+130
| | | | | | | | as shuffle." This reverts commit r323430 to fix buildbots. llvm-svn: 323432
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle.Alexey Bataev2018-01-251-130/+352
| | | | | | | | | | | | | | | | | Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and cost of this gather is counter as cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323430
* Another try to commit 323321 (aggressive instruction combine).Amjad Aboud2018-01-2514-4/+682
| | | | llvm-svn: 323416
* [Dwarf] Add dsymutil Atom extensions. NFCJonas Devlieghere2018-01-251-0/+3
| | | | | | | | | | This patch extends the atom types used by the Apple accelerator tables with two dsymutil extensions: - DW_ATOM_type_type_flags - DW_ATOM_qual_name_hash llvm-svn: 323414
* [GlobalOpt] Emit fragments using field offsets from struct layoutMikael Holmen2018-01-251-4/+2
| | | | | | | | | | | | | | | | | | | | | | Summary: When creating the debug fragments for a SRA'd struct, use the fields' offsets, taken from the struct layout, as the offsets for the resulting fragments. This fixes an issue where GlobalOpt would emit fragments with incorrect offsets for padded fields. This should solve PR36016. Patch by David Stenberg. Reviewers: aprantl Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42489 llvm-svn: 323411
* [FuzzMutate] Inst deleter doesn't work with PhiNodesIgor Laevsky2018-01-251-4/+9
| | | | | | Differential Revision: https://reviews.llvm.org/D42412 llvm-svn: 323409
* [IRMover] Add comment and fix test caseEugene Leviant2018-01-251-0/+6
| | | | llvm-svn: 323407
* [X86] Expand IMUL/MUL instregexs in Intel scheduler models. Add load latency ↵Craig Topper2018-01-255-95/+95
| | | | | | | | | | | | to some of them in SkylakeClient model. The regular expressions and the imul names caused some instructions to be matched by multiple regexs creating unpredictable results. This changes them all to use explicit instrs instead. While doing this I also found that some instructions in Skylake were missing load latency so I fixed that too. llvm-svn: 323406
* [X86] Expand IMUL/MUL instregexs in Znver1 scheduler to show what's actually ↵Craig Topper2018-01-251-22/+16
| | | | | | | | | | implemented. The IMUL instruction names mixed with the prefix matching of the instregex lead to some strange matches. The worst being that several memory instructions are using the register form latency. I don't know what the right answer is, so I've left TODOs and will try to work with the AMD folks to get this cleaned up. llvm-svn: 323405
* [X86] Name the MMX phaddd instruction with 3 Ds instead of just 2. NFCCraig Topper2018-01-258-13/+13
| | | | llvm-svn: 323403
* [X86] Remove 64/128/256 from MMX/SSE/AVX instruction names for overall ↵Craig Topper2018-01-259-400/+400
| | | | | | | | | | consistency. NFC MMX instrutions all start with MMX_ so the 64 isn't needed for disambigutation. SSE/AVX1 instructions are assumed 128-bit so we don't need to say 128. AVX2 instructions should use a Y to indicate 256-bits. llvm-svn: 323402
* [X86] Remove unnecessary '_alt' and '_Int' from scheduler model regular ↵Craig Topper2018-01-255-18/+18
| | | | | | | | expressions. These were treated as optional suffixes, but the regular expressions are already prefix matches so this is unnecessary. It breaks the binary search optimization in tablegen due to the top level question mark. llvm-svn: 323401
* [ORC] Refactor the various lookupFlags methods to return the flags map via theLang Hames2018-01-252-5/+6
| | | | | | | | | | | | first argument. This makes lookupFlags more consistent with lookup (which takes the query as the first argument) and composes better in practice, since lookups are usually linearly chained: Each lookupFlags can populate the result map based on the symbols not found in the previous lookup. (If the maps were returned rather than passed by reference there would have to be a merge step at the end). llvm-svn: 323398
* [GISel]: Implement GlobalISel combiner API.Aditya Nandakumar2018-01-253-0/+124
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://reviews.llvm.org/D41373 The various components are GICombinerHelper contains transformations that are common to all targets. Targets can pick and choose which transformations (at function/opcode granularity) each pass uses via configuring a GICombinerInfo. GICombiner contains some common code and it does the traversal, driving of combines, worklist management and iterating until convergence. GICombinerInfo is an interface with a virtual method called combine. The combiner info will allow targets to pick and choose (or implement their own specific combines). CombineInfos can make use of available combines in GICombineHelper to configure the transformations for a particular pass. Currently this approach allows cherry picking transformations from helpers (at function/opcode granularity) and also allows early returning on specific transformations. Targets also get to prioritize whether target specific combines run before/after the opt-in generic combines. Ideally we would like this part to be configured by both C++ and Tablegen. The CombinerInfo also has a field which indicates how to deal with IllegalOps (ie - should we allow to create them/or legalize them?). A CombinerPass would configure a CombinerInfo, create the GICombiner with the Info, and call GICombiner::combineMachineInstrs(MachineFunction&). This organization is very similar to the GISelLegalizer. llvm-svn: 323392
* [Hexagon] Replace EmitFunctionEntryCode with a DAG preprocessing codeKrzysztof Parzyszek2018-01-242-16/+22
| | | | | | | | | The code in EmitFunctionEntryCode needs to know the maximum stack alignment, but it runs very early in the selection process (before lowering). The final stack alignment may change during lowering, so the code needs to be moved to where the alignment is known. llvm-svn: 323374
* [AArch64][GlobalISel] Fall back during AArch64 isel if we have a volatile load.Amara Emerson2018-01-241-0/+6
| | | | | | | | | | | | | | | The tablegen imported patterns for sext(load(a)) don't check for single uses of the load or delete the original after matching. As a result two loads are left in the generated code. This particular issue will be fixed by adding support for a G_SEXTLOAD opcode in future. There are however other potential issues around this that wouldn't be fixed by a G_SEXTLOAD, so until we have a proper solution we don't try to handle volatile loads at all in the AArch64 selector. Fixes/works around PR36018. llvm-svn: 323371
* [GlobalISel] Don't fall back to FastISel.Amara Emerson2018-01-242-1/+5
| | | | | | | Apparently checking the pass structure isn't enough to ensure that we don't fall back to FastISel, as it's set up as part of the SelectionDAGISel. llvm-svn: 323369
* [X86][SSE] Aggressively use PMADDWD for v4i32 multiplies with 17 or more ↵Simon Pilgrim2018-01-241-8/+20
| | | | | | | | | | | | leading zeros As discussed in D41484, PMADDWD for 'zero extended' vXi32 is nearly always a better option than PMULLD: On SNB it will result in code that isn't any faster, but not any slower so we may as well keep it. On KNL it only has half the throughput, so I've disabled it on there - ideally there'd be a better way than this. Differential Revision: https://reviews.llvm.org/D42258 llvm-svn: 323367
* Simplify. NFC.Rafael Espindola2018-01-241-1/+1
| | | | | | Thanks to Teresa Johnson for the suggestion. llvm-svn: 323365
* Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements ↵Alexey Bataev2018-01-241-346/+122
| | | | | | | | as shuffle." This reverts commit r323348 because of the broken buildbots. llvm-svn: 323359
* Revert "[ThinLTO] Add call edges' relative block frequency to per-module ↵Easwaran Raman2018-01-243-54/+18
| | | | | | | | summary." Causes buildbot regressions. llvm-svn: 323358
* [AMDGPU] Make sure all super regs of reserved regs are marked reserved.Geoff Berry2018-01-247-29/+30
| | | | | | | | | | | | | | | | | | | | | Summary: Move reserveRegisterTuples into AMDGPURegisterInfo and use it in R600RegisterInfo::getReservedRegs and R600InstrInfo::reserveIndirectRegisters to ensure that all super registers of reserved registers are also marked as reserved. Before this change, under certain circumstances, the registers %t1_x and %t1_xyzw would be marked as reserved, but %t1_xy and %t1_xyz would not be, leading to the register allocator sometimes assigning a register to %t1_xy, which is invalid since %t1_x is reserved. Reviewers: arsenm, tstellar, MatzeB, qcolombet Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D42448 llvm-svn: 323356
* Revert r321751, "StructurizeCFG: Fix broken backedge detection"Nicolai Haehnle2018-01-241-28/+82
| | | | | | | | | | | It causes regressions in various OpenGL test suites. Keep the test cases introduced by r321751 as XFAIL, and add a test case for the regression. Change-Id: I90b4cc354f68cebe5fcef1f2422dc8fe1c6d3514 Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36015 llvm-svn: 323355
* [ARM] Expand long shifts for Thumb1 to __aeabi_ callsWeiming Zhao2018-01-241-0/+7
| | | | | | | | | | | | | | Summary: For long shifts, the inlined version takes about 20 instructions on Thumb1. To avoid the code bloat, expand to __aeabi_ calls if target is Thumb1. Reviewers: samparker Reviewed By: samparker Subscribers: samparker, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42401 llvm-svn: 323354
* [X86] Fix some inconsistencies in the itineraries and Sched for ↵Craig Topper2018-01-242-6/+6
| | | | | | | | (V)PEXTRW/(V)PINSRW The weirdest being that PEXTRWrr was tagged as a memory operation. llvm-svn: 323353
* [X86] Adjust names of PINSRW/PEXTRW intructions between MMX/SSE/AVX/AVX512 ↵Craig Topper2018-01-249-82/+74
| | | | | | for consistency and to maybe enable more regular expression compaction in the scheduler models. NFCI llvm-svn: 323352
* [X86] Remove '(_REV)?' from a bunch of scheduler regular expressions. NFCCraig Topper2018-01-245-195/+193
| | | | | | The regexs are treated as a prefix match already so the checking for optional text at the end provides no value. Instead it prevents the binary search optimization in tablegen from kicking in due to the top level question mark. llvm-svn: 323351
* [ThinLTO] Add call edges' relative block frequency to per-module summary.Easwaran Raman2018-01-243-18/+54
| | | | | | | | | | | | | | | Summary: This allows relative block frequency of call edges to be passed to the thinlink stage where it will be used to compute synthetic entry counts of functions. Reviewers: tejohnson, pcc Subscribers: mehdi_amini, llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D42212 llvm-svn: 323349
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle.Alexey Bataev2018-01-241-122/+346
| | | | | | | | | | | | | | | | | Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and cost of this gather is counter as cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323348
* [Hexagon] Run late copy propagation and dead code elimination passesKrzysztof Parzyszek2018-01-241-0/+5
| | | | llvm-svn: 323346
* Handle R_386_PLT32 in RuntimeDyldELF.Rafael Espindola2018-01-241-0/+3
| | | | | | This should fix the 32 bit buildbots. llvm-svn: 323344
* InstSimplify: If divisor element is undef simplify to undefZvi Rackover2018-01-241-2/+3
| | | | | | | | | | | | | | | | Summary: If any vector divisor element is undef, we can arbitrarily choose it be zero which would make the div/rem an undef value by definition. Reviewers: spatel, reames Reviewed By: spatel Subscribers: magabari, llvm-commits Differential Revision: https://reviews.llvm.org/D42485 llvm-svn: 323343
* [globalisel] Introduce LegalityQuery to better encapsulate the legalizer ↵Daniel Sanders2018-01-242-14/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | decisions. NFC. Summary: `getAction(const InstrAspect &) const` breaks encapsulation by exposing the smaller components that are used to decide how to legalize an instruction. This is a problem because we need to change the implementation of LegalizerInfo so that it's able to describe particular type combinations rather than just cartesian products of types. For example, declaring the following setAction({..., 0, s32}, Legal) setAction({..., 0, s64}, Legal) setAction({..., 1, s32}, Legal) setAction({..., 1, s64}, Legal) currently declares these type combinations as legal: {s32, s32} {s64, s32} {s32, s64} {s64, s64} but we currently have no means to say that, for example, {s64, s32} is not legal. Some operations such as G_INSERT/G_EXTRACT/G_MERGE_VALUES/ G_UNMERGE_VALUES has relationships between the types that are currently described incorrectly. Additionally, G_LOAD/G_STORE currently have no means to legalize non-atomics differently to atomics. The necessary information is in the MMO but we have no way to use this in the legalizer. Similarly, there is currently no way for the register type and the memory type to differ so there is no way to cleanly represent extending-load/truncating-store in a way that can't be broken by optimizers (resulting in illegal MIR). This patch introduces LegalityQuery which provides all the information needed by the legalizer to make a decision on whether something is legal and how to legalize it. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar, volkan, reames, bogner Reviewed By: bogner Subscribers: bogner, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D42244 llvm-svn: 323342
* [NFC] Make magic number for DJB hash function customizable.Jonas Devlieghere2018-01-241-2/+1
| | | | | | | | This allows us to specify the magic number for the DJB hash function. This feature is needed by dsymutil to emit Apple types accelerator table. llvm-svn: 323341
* [ValueTracking] add recursion depth param to matchSelectPattern Sanjay Patel2018-01-241-11/+18
| | | | | | | | | | | | | | | | | | | | | | | We're getting bug reports: https://bugs.llvm.org/show_bug.cgi?id=35807 https://bugs.llvm.org/show_bug.cgi?id=35840 https://bugs.llvm.org/show_bug.cgi?id=36045 ...where we blow up the stack in value tracking because other passes are sending in selects that have an operand that is itself the select. We don't currently have a reliable way to avoid analyzing dead code that may take non-standard forms, so bail out when things go too far. This mimics the recursion depth limitations in other parts of value tracking. Unfortunately, this pushes the underlying problems for other passes (jump-threading, simplifycfg, correlated-propagation) into hiding. If someone wants to uncover those again, the first draft of this patch on Phab would do that (it would assert rather than bail out). Differential Revision: https://reviews.llvm.org/D42442 llvm-svn: 323331
* Reverted 323321.Amjad Aboud2018-01-2414-679/+4
| | | | llvm-svn: 323326
* [AArch64] Avoid unnecessary vector byte-swapping in big-endianPablo Barrio2018-01-241-12/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Loads/stores of some NEON vector types are promoted to other vector types with different lane sizes but same vector size. This is not a problem in little-endian but, when in big-endian, it requires additional byte reversals required to preserve the lane ordering while keeping the right endianness of the data inside each lane. For example: %1 = load <4 x half>, <4 x half>* %p results in the following assembly: ld1 { v0.2s }, [x1] rev32 v0.4h, v0.4h This patch changes the promotion of these loads/stores so that the actual vector load/store (LD1/ST1) takes care of the endianness correctly and there is no need for further byte reversals. The previous code now results in the following assembly: ld1 { v0.4h }, [x1] Reviewers: olista01, SjoerdMeijer, efriedma Reviewed By: efriedma Subscribers: aemerson, rengolin, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D42235 llvm-svn: 323325
* [Hexagon] Remove unused HexagonISD opcodes, NFCKrzysztof Parzyszek2018-01-244-28/+5
| | | | llvm-svn: 323324
* [DebugInfo] Emit DWARF reference for DIVariable 'count' in DISubrangeSander de Smalen2018-01-242-1/+8
| | | | | | | | | | | | | | | | | | | | | | Summary: This patch implements the codegen of DWARF debug info for non-constant 'count' fields for DISubrange. This is patch [2/3] in a series to extend LLVM's DISubrange Metadata node to support debugging of C99 variable length arrays and vectors with runtime length like the Scalable Vector Extension for AArch64. It is also a first step towards representing more complex cases like arrays in Fortran. Reviewers: echristo, pcc, aprantl, dexonsmith, clayborg, kristof.beyls, dblaikie Reviewed By: aprantl Subscribers: fhahn, aemerson, rengolin, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D41696 llvm-svn: 323323
* [InstCombine] Introducing Aggressive Instruction Combine pass ↵Amjad Aboud2018-01-2414-4/+679
| | | | | | | | | | | | | | | | | | (-aggressive-instcombine). Combine expression patterns to form expressions with fewer, simple instructions. This pass does not modify the CFG. For example, this pass reduce width of expressions post-dominated by TruncInst into smaller width when applicable. It differs from instcombine pass in that it contains pattern optimization that requires higher complexity than the O(1), thus, it should run fewer times than instcombine pass. Differential Revision: https://reviews.llvm.org/D38313 llvm-svn: 323321
* [X86][SSE] Avoid calls to combineX86ShufflesRecursively that can't combine ↵Simon Pilgrim2018-01-241-9/+14
| | | | | | | | | | | | to target shuffles (PR32037) Don't bother making recursive calls to combineX86ShufflesRecursively if we have more shuffle source operands than will be combined together with the remaining recursive depth. See https://bugs.llvm.org/show_bug.cgi?id=32037#c26 and https://bugs.llvm.org/show_bug.cgi?id=32037#c27 for the reduction in compile times from this patch. Differential Revision: https://reviews.llvm.org/D42378 llvm-svn: 323320
* Fix typos of occurred and occurrenceMalcolm Parsons2018-01-242-2/+2
| | | | llvm-svn: 323318
OpenPOWER on IntegriCloud