summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU: Split wide vectors of i16/f16 into 32-bit regs on callsMatt Arsenault2018-07-311-2/+24
| | | | | | | This improves code for the same reasons as scalarizing 32-bit element vectors. llvm-svn: 338418
* [CodeView] Minimal support for S_UNAMESPACE recordsAlexandre Ganea2018-07-315-1/+23
| | | | | | Differential Revision: https://reviews.llvm.org/D50007 llvm-svn: 338417
* AMDGPU: Scalarize vector argument types to callsMatt Arsenault2018-07-311-31/+15
| | | | | | | | | | | | | | | | | When lowering calling conventions, prefer to decompose vectors into the constitute register types. This avoids artifical constraints to satisfy a wide super-register. This improves code quality because now optimizations don't need to deal with the super-register constraint. For example the immediate folding code doesn't deal with 4 component reg_sequences, so by breaking the register down earlier the existing immediate folding code is able to work. This also avoids the need for the shader input processing code to manually split vector types. llvm-svn: 338416
* [X86] WriteBSWAP sched classes are reg-reg only.Simon Pilgrim2018-07-3110-20/+20
| | | | | | | Don't declare them as X86SchedWritePair when the folded class will never be used. Note: MOVBE (load/store endian conversion) instructions tend to have a very different behaviour to BSWAP. llvm-svn: 338412
* Revert "[DebugInfo] Generate DWARF debug information for labels."Vlad Tsyrklevich2018-07-3116-406/+149
| | | | | | | This reverts commits r338390 and r338398, they were causing LSan failures on the ASan bot. llvm-svn: 338408
* [X86][SSE] Use ISD::MULHU for constant/non-zero ISD::SRL lowering (PR38151)Simon Pilgrim2018-07-311-0/+18
| | | | | | | | | | As was done for vector rotations, we can efficiently use ISD::MULHU for vXi8/vXi16 ISD::SRL lowering. Shift-by-zero cases are still problematic (mainly on v32i8 due to extra AND/ANDN/OR or VPBLENDVB blend masks but v8i16/v16i16 aren't great either if PBLENDW fails) so I've limited this first patch to known non-zero cases if we can't easily use PBLENDW. Differential Revision: https://reviews.llvm.org/D49562 llvm-svn: 338407
* Make ICF log output order deterministic.Rui Ueyama2018-07-311-0/+4
| | | | | | | | | | This patch does the same thing as r338153 for COFF. Note that this patch affects only the order of log messages. The output file is already deterministic. Differential Revision: https://reviews.llvm.org/D50023 llvm-svn: 338406
* Resubmit r338340 "[MS Demangler] Better demangling of template arguments."Zachary Turner2018-07-311-45/+87
| | | | | | This broke the build with GCC, but has since been fixed. llvm-svn: 338403
* [X86] Add pattern matching for PMADDUBSWCraig Topper2018-07-311-0/+143
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Similar to D49636, but for PMADDUBSW. This instruction has the additional complexity that the addition of the two products saturates to 16-bits rather than wrapping around. And one operand is treated as signed and the other as unsigned. A C example that triggers this pattern ``` static const int N = 128; int8_t A[2*N]; uint8_t B[2*N]; int16_t C[N]; void foo() { for (int i = 0; i != N; ++i) C[i] = MIN(MAX((int16_t)A[2*i]*(int16_t)B[2*i] + (int16_t)A[2*i+1]*(int16_t)B[2*i+1], -32768), 32767); } ``` Reviewers: RKSimon, spatel, zvi Reviewed By: RKSimon, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49829 llvm-svn: 338402
* [X86] Preserve more liveness information in emitStackProbeInlineFrancis Visoiu Mistrih2018-07-311-18/+37
| | | | | | | | | | | | | | | | This commit fixes two issues with the liveness information after the call: 1) The code always spills RCX and RDX if InProlog == true, which results in an use of undefined phys reg. 2) FinalReg, JoinReg, RoundedReg, SizeReg are not added as live-ins to the basic blocks that use them, therefore they are seen undefined. https://llvm.org/PR38376 Differential Revision: https://reviews.llvm.org/D50020 llvm-svn: 338400
* Fix InstCombine address space assertEwan Crawford2018-07-311-0/+6
| | | | | | | | | | | | | | | | | | | | | Workaround bug where the InstCombine pass was asserting on the IR added in lit test, where we have a bitcast instruction after a GEP from an addrspace cast. The second bitcast in the test was getting combined into `bitcast <16 x i32>* %0 to <16 x i32> addrspace(3)*`, which looks like it should be an addrspace cast instruction instead. Otherwise if control flow is allowed to continue as it is now we create a GEP instruction `<badref> = getelementptr inbounds <16 x i32>, <16 x i32>* %0, i32 0`. However because the type of this instruction doesn't match the address space we hit an assert when replacing the bitcast with that GEP. ``` void llvm::Value::doRAUW(llvm::Value*, bool): Assertion `New->getType() == getType() && "replaceAllUses of value with new value of different type!"' failed. ``` Differential Revision: https://reviews.llvm.org/D50058 llvm-svn: 338395
* [DebugInfo][LCSSA] Preserve debug location in lcssa phisAnastasis Grammenos2018-07-311-1/+2
| | | | | | | | | | | | | | Summary: When inserting lcssa Phi Nodes in the exit block mak sure to preserve the original instructions DL. Reviewers: vsk Subscribers: JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D50009 llvm-svn: 338391
* [DebugInfo] Generate DWARF debug information for labels.Hsiangkai Wang2018-07-3116-149/+406
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two forms for label debug information in DWARF format. 1. Labels in a non-inlined function: DW_TAG_label DW_AT_name DW_AT_decl_file DW_AT_decl_line DW_AT_low_pc 2. Labels in an inlined function: DW_TAG_label DW_AT_abstract_origin DW_AT_low_pc We will collect label information from DBG_LABEL. Before every DBG_LABEL, we will generate a temporary symbol to denote the location of the label. The symbol could be used to get DW_AT_low_pc afterwards. So, we create a mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase. The DBG_LABEL in the mapping is used to query the symbol before it. The AbstractLabels in DwarfCompileUnit is used to process labels in inlined functions. We also keep a mapping between scope and labels in DwarfFile to help to generate correct tree structure of DIEs. It also generates label debug information under global isel. Differential Revision: https://reviews.llvm.org/D45556 llvm-svn: 338390
* Revert Enrich inline messagesDavid Bolvansky2018-07-315-152/+111
| | | | llvm-svn: 338389
* Enrich inline messagesDavid Bolvansky2018-07-315-111/+152
| | | | | | | | | | | | | | | | | | | | | | Summary: This patch improves Inliner to provide causes/reasons for negative inline decisions. 1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message. 2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for negative decision. 3. Changed remark priniting and debug output messages to provide the additional messages and related inline cost. 4. Adjusted tests for changed printing. Patch by: yrouban (Yevgeny Rouban) Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00 Reviewed By: tejohnson, xbolva00 Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith Differential Revision: https://reviews.llvm.org/D49412 llvm-svn: 338387
* [MemDep] Use PhiValuesAnalysis to improve alias analysis resultsJohn Brawn2018-07-311-3/+14
| | | | | | | | | | This is being done in order to make GVN able to better optimize certain inputs. MemDep doesn't use PhiValues directly, but does need to notifiy it when things get invalidated. Differential Revision: https://reviews.llvm.org/D48489 llvm-svn: 338384
* [InstSimplify] Fold another Select with And/Or patternDavid Bolvansky2018-07-311-14/+22
| | | | | | | | | | | | | | Summary: Proof: https://rise4fun.com/Alive/L5J Reviewers: lebedev.ri, spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49975 llvm-svn: 338383
* DAG: Fix PromoteFloatResult for fcanonicalizeMatt Arsenault2018-07-311-1/+2
| | | | llvm-svn: 338382
* AMDGPU: Don't handle FP16_TO_FP in isCanonicalizedMatt Arsenault2018-07-311-4/+0
| | | | | | | This needs more special handling to do correctly. Fixes test in subsequent commit. llvm-svn: 338381
* [SLP] Fix PR38339: Instruction does not dominate all uses!Alexey Bataev2018-07-311-0/+6
| | | | | | | | | | | | | | | | Summary: If the ExtractElement instructions can be optimized out during the vectorization and we need to reshuffle the parent vector, this ShuffleInstruction may be inserted in the wrong place causing compiler to produce incorrect code. Reviewers: spatel, RKSimon, mkuper, hfinkel, javed.absar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49928 llvm-svn: 338380
* AMDGPU: Fold undef fcanonicalize to qNaNMatt Arsenault2018-07-311-2/+10
| | | | | | | | | | We could choose a free 0 for this, but this matches the behavior for fmul undef, 1.0. Also, the NaN use is more useful for folding use operations although if it's not eliminated it is more expensive in terms of code size. llvm-svn: 338376
* [llvm-mca][BtVer2] Teach how to identify dependency-breaking idioms.Andrea Di Biagio2018-07-312-0/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch teaches llvm-mca how to identify dependency breaking instructions on btver2. An example of dependency breaking instructions is the zero-idiom XOR (example: `XOR %eax, %eax`), which always generates zero regardless of the actual value of the input register operands. Dependency breaking instructions don't have to wait on their input register operands before executing. This is because the computation is not dependent on the inputs. Not all dependency breaking idioms are also zero-latency instructions. For example, `CMPEQ %xmm1, %xmm1` is independent on the value of XMM1, and it generates a vector of all-ones. That instruction is not eliminated at register renaming stage, and its opcode is issued to a pipeline for execution. So, the latency is not zero. This patch adds a new method named isDependencyBreaking() to the MCInstrAnalysis interface. That method takes as input an instruction (i.e. MCInst) and a MCSubtargetInfo. The default implementation of isDependencyBreaking() conservatively returns false for all instructions. Targets may override the default behavior for specific CPUs, and return a value which better matches the subtarget behavior. In future, we should teach to Tablegen how to automatically generate the body of isDependencyBreaking from scheduling predicate definitions. This would allow us to expose the knowledge about dependency breaking instructions to the machine schedulers (and, potentially, other codegen passes). Differential Revision: https://reviews.llvm.org/D49310 llvm-svn: 338372
* Revert r338365: [X86] Improved sched models for X86 BT*rr instructions.Simon Pilgrim2018-07-3111-18/+48
| | | | | | | | https://reviews.llvm.org/D49243 Contains WIP code that should not have been included. llvm-svn: 338369
* [SystemZ] Improve decoding in case of instructions with four register operands.Jonas Paulsson2018-07-313-11/+46
| | | | | | | | | | | | | | Since z13, the max group size will be 2 if any μop has more than 3 register sources. This has been ignored sofar in the SystemZHazardRecognizer, but is now handled by recognizing those instructions and adjusting the tracking of decoding and the cost heuristic for grouping. Review: Ulrich Weigand https://reviews.llvm.org/D49847 llvm-svn: 338368
* [InstCombine] simplify code for A & (A ^ B) --> A & ~BSanjay Patel2018-07-311-25/+7
| | | | | | | | | | | | | This fold was written in an odd way and tried to avoid an endless loop by bailing out on all constants instead of the supposedly problematic case of -1. But (X & -1) should always be simplified before we reach here, so I'm not sure how that is a problem. There were no tests for the commuted patterns, so I added those at rL338364. llvm-svn: 338367
* [X86] Improved sched models for X86 BT*rr instructions.Andrew V. Tischenko2018-07-3111-48/+18
| | | | | | https://reviews.llvm.org/D49243 llvm-svn: 338365
* [X86] Improved sched models for X86 SHLD/SHRD* instructions.Andrew V. Tischenko2018-07-3111-196/+70
| | | | | | Differential Revision: https://reviews.llvm.org/D9611 llvm-svn: 338359
* [X86][SSE] isFNEG - Use getTargetConstantBitsFromNode to handle all constant ↵Simon Pilgrim2018-07-311-31/+7
| | | | | | | | | | cases isFNEG was duplicating much of what was done by getTargetConstantBitsFromNode in its own calls to getTargetConstantFromNode. Noticed while reviewing D48467. llvm-svn: 338358
* [ARM] Allow automatically deducing the thumb instruction size for .instMartin Storsjo2018-07-311-3/+14
| | | | | | | | This matches GAS, that allows unsuffixed .inst for thumb. Differential Revision: https://reviews.llvm.org/D49937 llvm-svn: 338357
* [ARM] Support the .inst directive for MachO and COFF targetsMartin Storsjo2018-07-312-7/+43
| | | | | | | | | | Contrary to ELF, we don't add any markers that distinguish data generated with .short/.long from normal instructions, so the .inst directive only adds compatibility with assembly that uses it. Differential Revision: https://reviews.llvm.org/D49936 llvm-svn: 338356
* [AArch64] Support the .inst directive for MachO and COFF targetsMartin Storsjo2018-07-312-7/+18
| | | | | | | | | | Contrary to ELF, we don't add any markers that distinguish data generated with .long from normal instructions, so the .inst directive only adds compatibility with assembly that uses it. Differential Revision: https://reviews.llvm.org/D49935 llvm-svn: 338355
* [ARM] Revert r337821Sam Parker2018-07-311-1/+1
| | | | | | | Re-enabling ARMCodeGenPrepare by default after failing to reproduce the bootstrap issues that I was concerned it was causing. llvm-svn: 338354
* Test commit.Hsiangkai Wang2018-07-311-1/+1
| | | | llvm-svn: 338352
* [NFC] Collect statistics in GuardWideningMax Kazantsev2018-07-311-0/+4
| | | | llvm-svn: 338348
* [VPlan] Introduce VPLoopInfo analysis.Diego Caballero2018-07-313-1/+66
| | | | | | | | | | | | | | The patch introduces loop analysis (VPLoopInfo/VPLoop) for VPBlockBases. This analysis will be necessary to perform some H-CFG transformations and detect and introduce regions representing a loop in the H-CFG. Reviewers: fhahn, rengolin, mkuper, hfinkel, mssimpso Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48816 llvm-svn: 338346
* Revert r338340 "[MS Demangler] Better demangling of template arguments."Reid Kleckner2018-07-311-87/+46
| | | | | | Breaks the build with GCC, apparently. llvm-svn: 338344
* [X86] Stop accidentally running the Bonnell LEA fixup path on Goldmont.Craig Topper2018-07-311-1/+1
| | | | | | In one place we checked X86Subtarget.slowLEA() to decide if the pass should run. But to decide what the pass should we only check isSLM. This resulted in Goldmont going down the Bonnell path. llvm-svn: 338342
* [MS Demangler] Better demangling of template arguments.Zachary Turner2018-07-311-46/+87
| | | | | | | | | | | This patch fixes demangling of template aliases as template-template arguments, and also fixes function pointers and references as not type template parameters. All of these can be properly demangled now, so I've ported over the test clang/test/CodeGenCXX/ms-template-callbacks.cpp. All of these tests pass llvm-svn: 338340
* [AArch64][GlobalISel] Add isel support for G_BLOCK_ADDR.Amara Emerson2018-07-311-31/+65
| | | | | | | | | | | Also refactors some existing code to materialize addresses for the large code model so it can be shared between G_GLOBAL_VALUE and G_BLOCK_ADDR. This implements PR36390. Differential Revision: https://reviews.llvm.org/D49903 llvm-svn: 338337
* [AArch64][GlobalISel] Make G_BLOCK_ADDR legal.Amara Emerson2018-07-311-0/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D49902 llvm-svn: 338336
* [GlobalISel] Add a G_BLOCK_ADDR opcode to handle IR blockaddress constants.Amara Emerson2018-07-312-0/+11
| | | | | | Differential Revision: https://reviews.llvm.org/D49900 llvm-svn: 338335
* [DAGCombiner][TargetLowering] Pass a SmallVector instead of a std::vector to ↵Craig Topper2018-07-306-14/+13
| | | | | | | | BuildSDIV/BuildUDIV/etc. The vector contains the SDNodes that these functions create. The number of nodes is always a small number so we should use SmallVector to avoid a heap allocation. llvm-svn: 338329
* [MS Demangler] Add rudimentary C++11 SupportZachary Turner2018-07-301-48/+173
| | | | | | | | | | | | | | This patch adds support for demangling r-value references, new operators such as the ""_foo operator, lambdas, alias types, nullptr_t, and various other C++11'isms. There is 1 failing test remaining in this file, which appears to be related to back-referencing. This type of problem has the potential to get ugly so I'd rather fix it in a separate patch. Differential Revision: https://reviews.llvm.org/D50013 llvm-svn: 338324
* [DAGCombiner] transform sub-of-shifted-signbit to addSanjay Patel2018-07-301-0/+11
| | | | | | | | | | | | | | | | This is exchanging a sub-of-1 with add-of-minus-1: https://rise4fun.com/Alive/plKAH This is another step towards improving select-of-constants codegen (see D48970). x86 is the motivating target, and those diffs all appear to be wins. PPC and AArch64 look neutral. I've limited this to early combining (!LegalOperations) in case a target wants to reverse it, but I think canonicalizing to 'add' is more likely to produce further transforms because we have more folds for 'add'. Differential Revision: https://reviews.llvm.org/D49924 llvm-svn: 338317
* [VPlan] Introduce VPlan-based dominator analysis.Diego Caballero2018-07-306-21/+172
| | | | | | | | | | | | | | | The patch introduces dominator analysis for VPBlockBases and extend VPlan's GraphTraits specialization with the required interfaces. Dominator analysis will be necessary to perform some H-CFG transformations and to introduce VPLoopInfo (LoopInfo analysis on top of the VPlan representation). Reviewers: fhahn, rengolin, mkuper, hfinkel, mssimpso Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48815 llvm-svn: 338310
* This fixes a crash when a second pass is required for the Codeview Type ↵Alexandre Ganea2018-07-301-1/+4
| | | | | | | | | | merging *and* the index points outside of the table (which should lead to an error being printed). This occurs currently until MS precompiled headers .obj is added (see D45213) Differential Revision: https://reviews.llvm.org/D50006 llvm-svn: 338308
* [TargetLowering] In BuildSDIV, add the MULHS/SMUL_LOHI to the Created vector.Craig Topper2018-07-301-0/+3
| | | | | | BuildUDIV was already correct. llvm-svn: 338304
* [DAGCombiner][PowerPC][AArch64] Pass Created vector by reference to ↵Craig Topper2018-07-306-17/+12
| | | | | | BuildSDIVPow2. llvm-svn: 338303
* [InstCombine] Fold Select with binary opDavid Bolvansky2018-07-301-0/+33
| | | | | | | | | | | | | | | | | | | | | | | Summary: Fold %A = icmp eq i8 %x, 0 %B = xor i8 %x, %z %C = select i1 %A, i8 %B, i8 %y To %C = select i1 %A, i8 %z, i8 %y Fixes https://bugs.llvm.org/show_bug.cgi?id=38345 Proof: https://rise4fun.com/Alive/43J Reviewers: lebedev.ri, spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49954 llvm-svn: 338300
* Revert r338222 "[DAGCombiner] Remove unnecessary calls to AddToWorklist."Craig Topper2018-07-301-8/+46
| | | | | | | | Thinking about it more it might be possible for the later nodes to be folded in getNode in such a way that the other created nodes are left dead. This can cause use counts to be incorrect on nodes that aren't dead. So its probably safer to leave this alone. llvm-svn: 338298
OpenPOWER on IntegriCloud