path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* Reformat slightly.Eric Christopher2017-02-141-4/+3
| | | | llvm-svn: 295096
* Reapply r294532, reverted in r294787.Wolfgang Pieb2017-02-141-9/+147
| | | | | | | | | | | | | Store instructions can have more than one memory operand as a result of optimizations that fold different stores into one. When we identify spill instructions to generate DBG_VALUE instructions to record the spilling of a variable, we disregard stores with multiple memory operands for now. We may miss some relevant spills but the handling is a bit more complex, so we'll do it in a different patch. This fixes PR31935. llvm-svn: 295093
* Revert "[profiling] Remove dead profile name vars after emitting name data"Vedant Kumar2017-02-141-3/+0
| | | | | | | | This reverts commit r295084. There is a test failure on: http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/2620/ llvm-svn: 295092
* [Support] Add StringRef::getAsDouble.Zachary Turner2017-02-141-0/+13
| | | | | | Differential Revision: https://reviews.llvm.org/D29918 llvm-svn: 295089
* [profiling] Remove dead profile name vars after emitting name dataVedant Kumar2017-02-141-0/+3
| | | | | | | | | | | | | | | | The profile name variables passed to counter increment intrinsics are dead after we emit the finalized name data in __llvm_prf_nm. However, we neglect to erase these name variables. This causes huge size increases in the __TEXT,__const section as well as slowdowns when linker dead stripping is disabled. Some affected projects are so massive that they fail to link on Darwin, because only the small code model is supported. Fix the issue by throwing away the name constants as soon as we're done with them. Differential Revision: https://reviews.llvm.org/D29921 llvm-svn: 295084
* [Tablegen] Instrumenting table gen DAGGenISelDAGAditya Nandakumar2017-02-141-0/+9
| | | | To help debug ISel or to prioritize GlobalISel backend work, this patch adds two more tables to <Target>GenISelDAGISel.inc: one containing the patterns that are used during selection, and the other containing the include source locations of those patterns. Enabled through the CMake variable LLVM_ENABLE_DAGISEL_COV. llvm-svn: 295081
* [Hexagon] Remove leftover debugging codeKrzysztof Parzyszek2017-02-141-4/+0
| | | | llvm-svn: 295078
* Do not apply redundant LastCallToStaticBonusTaewook Oh2017-02-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | Summary: As written in the comments above, LastCallToStaticBonus is already applied to the cost if Caller has only one user, so it is redundant to reapply the bonus here. If the only user is not a caller, TotalSecondaryCost will not be adjusted anyway because callerWillBeRemoved is false. If there's no caller at all, we don't need to care about TotalSecondaryCost because inliningPreventsSomeOuterInline is false. Reviewers: chandlerc, eraman Reviewed By: eraman Subscribers: haicheng, davidxl, davide, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D29169 llvm-svn: 295075
* [LazyBFI] Fix typosAdam Nemet2017-02-141-1/+1
| | | | llvm-svn: 295073
* Add new pass LazyMachineBlockFrequencyInfoAdam Nemet2017-02-144-8/+89
| | | | | | | | | | | | | | | | | And use it in MachineOptimizationRemarkEmitter. A test will follow on top of Justin's changes to enable MachineORE in AsmPrinter. The approach is similar to the IR-level pass. It's a bit simpler because BPI is immutable at the Machine level so we don't need to make that lazy. Because of this, a new function mapping is introduced (BPIPassTrait::getBPI). This function extracts BPI from the pass. In case of the lazy pass, this is when the calculation of the BFI occurs. For Machine-level, this is the identity function. Differential Revision: https://reviews.llvm.org/D29836 llvm-svn: 295072
* fix documentation comments for Argument; NFCSanjay Patel2017-02-141-28/+0
| | | | llvm-svn: 295068
* Correct a typo, s/hosting/hoisting/Brian Cain2017-02-141-1/+1
| | | | llvm-svn: 295066
* Remove unused variable.Diego Novillo2017-02-141-1/+0
| | | | llvm-svn: 295065
* Reapply "[LV] Extend trunc optimization to all IVs with constant integer steps"Matthew Simpson2017-02-141-10/+47
| | | | | | | | | | | This reapplies commit r294967 with a fix for the execution time regressions caught by the clang-cmake-aarch64-quick bot. We now extend the truncate optimization to non-primary induction variables only if the truncate isn't already free. Differential Revision: https://reviews.llvm.org/D29847 llvm-svn: 295063
* [X86][SSE] Allow matchVectorShuffleWithUNPCK to recognise UNDEF inputsSimon Pilgrim2017-02-141-7/+21
| | | | | | Add support for specifying an UNPCK input as UNDEF llvm-svn: 295061
* [SCEV] Cache results during GetMinTrailingZeros queryIgor Laevsky2017-02-141-8/+22
| | | | | | Differential Revision: https://reviews.llvm.org/D29759 llvm-svn: 295060
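The caching idea named in the [SCEV] entry above can be illustrated with a small memoized query. This is only a sketch under stated assumptions: `Expr`, `TrailingZeroAnalysis`, and the 32-bit width are hypothetical stand-ins, not LLVM's `SCEV` classes or the code from the patch. The rules used here (an add takes the minimum over its operands, a multiply takes the capped sum) are simplified for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified expression node; not LLVM's SCEV class.
struct Expr {
  enum Kind { Constant, Add, Mul } kind;
  uint32_t value;                      // meaningful only for Constant
  std::vector<const Expr *> operands;  // meaningful only for Add and Mul
};

constexpr uint32_t kBitWidth = 32;  // assumed width for this sketch

class TrailingZeroAnalysis {
public:
  // Returns a lower bound on the number of trailing zero bits of `e`,
  // computing each distinct node at most once.
  uint32_t minTrailingZeros(const Expr *e) {
    auto it = cache_.find(e);
    if (it != cache_.end())
      return it->second;  // cache hit: no recursion needed
    ++computed_;
    uint32_t result = compute(e);
    cache_.emplace(e, result);
    return result;
  }

  // Number of nodes that were actually computed (cache misses).
  unsigned computedCount() const { return computed_; }

private:
  uint32_t compute(const Expr *e) {
    switch (e->kind) {
    case Expr::Constant: {
      if (e->value == 0)
        return kBitWidth;  // zero has all bits clear
      uint32_t n = 0;
      for (uint32_t v = e->value; (v & 1u) == 0; v >>= 1)
        ++n;
      return n;
    }
    case Expr::Add: {
      // A sum keeps at least the minimum of its operands' trailing zeros.
      uint32_t r = kBitWidth;
      for (const Expr *op : e->operands)
        r = std::min(r, minTrailingZeros(op));
      return r;
    }
    case Expr::Mul: {
      // A product accumulates trailing zeros, capped at the bit width.
      uint32_t r = 0;
      for (const Expr *op : e->operands)
        r = std::min(kBitWidth, r + minTrailingZeros(op));
      return r;
    }
    }
    return 0;
  }

  std::unordered_map<const Expr *, uint32_t> cache_;
  unsigned computed_ = 0;
};
```

When the same subexpression appears several times in a larger expression, the cache turns repeated recursive walks into single lookups, which is the kind of saving the commit title suggests.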
* [SLP] Fix for PR31879: vectorize repeated scalar ops that don't get put back into a vectorAlexey Bataev2017-02-141-1/+7
| | | | Previously the cost of the existing ExtractElement/ExtractValue instructions was considered a dead cost only if they were detected to have exactly one use. But these instructions may also be considered dead if their users are going to be vectorized as well, like:
```
%x0 = extractelement <2 x float> %x, i32 0
%x1 = extractelement <2 x float> %x, i32 1
%x0x0 = fmul float %x0, %x0
%x1x1 = fmul float %x1, %x1
%add = fadd float %x0x0, %x1x1
```
This can be transformed to
```
%1 = fmul <2 x float> %x, %x
%2 = extractelement <2 x float> %1, i32 0
%3 = extractelement <2 x float> %1, i32 1
%add = fadd float %2, %3
```
because although `%x0` and `%x1` each have 2 uses, these users are part of the vectorized tree and we can consider these `extractelement` instructions dead.
Differential Revision: https://reviews.llvm.org/D29900
llvm-svn: 295056
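The change in the dead-cost rule described in the [SLP] entry above can be sketched as two predicates. This is a minimal model, not the vectorizer's code: `Inst`, `hasSingleUse`, and `extractIsDead` are hypothetical names, and instructions are identified by plain strings.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model of an instruction: its name plus one entry per use.
struct Inst {
  std::string name;
  std::vector<std::string> users;  // a user appears once per operand use
};

// Old rule (as described in the entry): dead only with exactly one use.
bool hasSingleUse(const Inst &extract) {
  return extract.users.size() == 1;
}

// New rule (sketch): dead when every user belongs to the vectorized tree,
// because the scalar extract disappears together with its users.
bool extractIsDead(const Inst &extract,
                   const std::set<std::string> &vectorizedTree) {
  for (const std::string &user : extract.users)
    if (vectorizedTree.count(user) == 0)
      return false;  // some scalar user survives, so the extract is live
  return true;
}
```

With `%x0` used twice by `%x0x0`, the single-use test rejects it, while the tree-membership test accepts it as long as `%x0x0` is part of the vectorized tree.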
* Removing a redundant assignmentArtyom Skrobov2017-02-141-1/+0
| | | | llvm-svn: 295055
* Revert "[AMDGPU] Fix for SIMachineScheduler crash. SI Scheduler should track"Alexander Timofeev2017-02-142-3/+4
| | | | | | This reverts commit ce06d9cb99298eb844b66e117f5108a06747c907. llvm-svn: 295054
* [X86][SSE] Move unary inputs handling inside matchVectorShuffleWithUNPCK.Simon Pilgrim2017-02-141-2/+3
| | | | llvm-svn: 295053
* [X86][SSE] Tidyup matchVectorShuffleWithUNPCK helper function call.Simon Pilgrim2017-02-141-7/+3
| | | | | | | | Don't bother setting the V1/V2 operands again for unary shuffles. Don't bother legalizing the value type unless the match succeeds. llvm-svn: 295051
* Revert "[LoopVectorize] Added address space check when analysing interleaved accesses"Karl-Johan Karlsson2017-02-141-20/+14
| | | | This reverts r295038. The buildbot clang-with-thin-lto-ubuntu failed. I'm reverting to investigate. llvm-svn: 295042
* [LoopVectorize] Added address space check when analysing interleaved accessesKarl-Johan Karlsson2017-02-141-14/+20
| | | | Prevent memory objects in different address spaces from being part of the same load/store group when analysing interleaved accesses. This fixes PR31900. Reviewers: HaoLiu, mssimpso, mkuper Reviewed By: mssimpso, mkuper Subscribers: llvm-commits, efriedma, mzolotukhin Differential Revision: https://reviews.llvm.org/D29717 llvm-svn: 295038
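One way to picture the address-space check in the [LoopVectorize] entry above is to partition accesses by address space before any grouping is attempted. The sketch below is an illustration only; `Access` and `candidateGroups` are hypothetical names, and the actual analysis considers strides and distances that are omitted here.

```cpp
#include <map>
#include <vector>

// Hypothetical memory access: an identifier and the address space it touches.
struct Access {
  int id;
  unsigned addressSpace;
};

// Buckets access ids by address space, so two accesses can only end up in
// the same candidate group when their address spaces match.
std::map<unsigned, std::vector<int>>
candidateGroups(const std::vector<Access> &accesses) {
  std::map<unsigned, std::vector<int>> groups;
  for (const Access &a : accesses)
    groups[a.addressSpace].push_back(a.id);
  return groups;
}
```

Any later interleaving analysis would then run per bucket, which rules out mixed-address-space groups by construction.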
* Test commit permissionKarl-Johan Karlsson2017-02-141-1/+1
| | | | | | Removing whitespace. llvm-svn: 295037
* [AVX-512] Add PAVGB/PAVGW to load folding tables.Craig Topper2017-02-141-0/+18
| | | | llvm-svn: 295035
* [LSR] Pointers with different address spaces are considered incompatible.Mikael Holmen2017-02-141-1/+6
| | | | | | | | | | | | | | | | | | | | | | Summary: Function isCompatibleIVType is already used as a guard before the call to SE.getMinusSCEV(OperExpr, PrevExpr); in LSRInstance::ChainInstruction. getMinusSCEV requires the expressions to be of the same type, so we now consider two pointers with different address spaces to be incompatible, since it is possible that the pointers in fact have different sizes. Reviewers: qcolombet, eli.friedman Reviewed By: qcolombet Subscribers: nhaehnle, Ka-Ka, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D29885 llvm-svn: 295033
* [RISCV] Fix RV32 datalayout string and ensure initAsmInfo is calledAlex Bradbury2017-02-141-2/+4
| | | | llvm-svn: 295028
* [RISCV] Pseudo instructions are isCodeGenOnly, have blank asmstrAlex Bradbury2017-02-141-1/+2
| | | | llvm-svn: 295027
* [RISCV] Fix unused variable in RISCVMCTargetDesc. NFCAlex Bradbury2017-02-141-3/+2
| | | | | | | Also, for better uniformity use TargetRegistry::RegisterMCAsmInfo rather than RegisterMCAsmInfoFn. Again, no functional change. llvm-svn: 295026
* ThinLTOBitcodeWriter: Write available_externally copies of VCP eligible functions to merged module.Peter Collingbourne2017-02-141-13/+79
| | | | Differential Revision: https://reviews.llvm.org/D29701 llvm-svn: 295021
* [ThinLTO] Make a copy of buffer identifier in ThinLTOCodeGeneratorMehdi Amini2017-02-141-14/+17
| | | | We can't assume that the `const char *` provided through libLTO has a lifetime that extends beyond the code generator itself. llvm-svn: 295018
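The lifetime concern in the [ThinLTO] entry above can be sketched with a small wrapper that copies the identifier on construction. `OwnedIdBuffer` is a hypothetical class written for this illustration; it is not the type used in `ThinLTOCodeGenerator`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical buffer record: owns a copy of the identifier, but only
// views the (assumed longer-lived) payload bytes.
class OwnedIdBuffer {
public:
  OwnedIdBuffer(const char *identifier, const char *data, std::size_t size)
      : identifier_(identifier),  // std::string copies the characters here
        data_(data), size_(size) {}

  const std::string &identifier() const { return identifier_; }
  const char *data() const { return data_; }
  std::size_t size() const { return size_; }

private:
  std::string identifier_;  // owned copy, safe after the caller frees its own
  const char *data_;        // non-owning view
  std::size_t size_;
};
```

Because the identifier is copied, the caller may free or overwrite its own string right after construction without affecting later uses of `identifier()`.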
* [LICM] Make store promotion work in the face of unordered atomicsPhilip Reames2017-02-141-5/+27
| | | | Extend our store promotion code to deal with unordered atomic accesses. Ordered atomics continue to be unhandled. Most of the change is straightforward; the only complicated bit is in the reasoning around mixing of atomic and non-atomic memory access. Rather than trying to reason about the complex semantics in these cases, I simply disallowed promotion when both atomic and non-atomic accesses are present. This is conservatively correct.
It seems really tempting to just promote all access to atomics, but the original accesses might have been conditional. Since we can't lower an arbitrary atomic type, it might not be safe to promote all access to atomic. Consider a loop like the following:
    while(b) {
      load i128 ...
      if (can lower i128 atomic)
        store atomic i128 ...
      else
        store i128
    }
It could be there's no race on the location and thus the code is perfectly well defined even if we can't lower an i128 atomically. It's not clear we need to be this conservative - arguably the program above is broken since it can't be lowered unless the branch is folded - but I didn't want to have to fix any fallout which might result.
Differential Revision: https://reviews.llvm.org/D15592
llvm-svn: 295015
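The conservative rule described in the [LICM] entry above can be written as a small classifier. This is a sketch under assumptions: the `Ordering` and `Promotion` enums and `classifyPromotion` are hypothetical names for illustration, and the real pass checks many other conditions (aliasing, dominance, alignment) that are left out.

```cpp
#include <vector>

// Hypothetical mirror of the relevant atomic orderings.
enum class Ordering { NotAtomic, Unordered, Monotonic, Acquire, Release, SeqCst };

// Outcome of the atomicity check for one promotion candidate.
enum class Promotion { None, NonAtomic, UnorderedAtomic };

// Decides how (or whether) a location may be promoted, looking only at the
// atomicity of the accesses to it inside the loop.
Promotion classifyPromotion(const std::vector<Ordering> &accesses) {
  bool sawUnordered = false;
  bool sawNonAtomic = false;
  for (Ordering o : accesses) {
    if (o == Ordering::NotAtomic)
      sawNonAtomic = true;
    else if (o == Ordering::Unordered)
      sawUnordered = true;
    else
      return Promotion::None;  // ordered atomics remain unhandled
  }
  if (sawUnordered && sawNonAtomic)
    return Promotion::None;  // mixed accesses: conservatively refuse
  if (sawUnordered)
    return Promotion::UnorderedAtomic;
  if (sawNonAtomic)
    return Promotion::NonAtomic;
  return Promotion::None;  // no accesses, nothing to promote
}
```

Refusing the mixed case avoids having to decide whether the promoted store should itself be atomic, which is the question the i128 loop in the commit message raises.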
* [MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).Eugene Zelenko2017-02-1411-97/+186
| | | | Same changes in files affected by reduced MC headers dependencies. llvm-svn: 295009
* FunctionAttrs: Factor out a function for querying memory access of a specific copy of a function. NFC.Peter Collingbourne2017-02-141-16/+21
| | | | This will later be used by ThinLTOBitcodeWriter to add copies of readnone functions to the regular LTO module. Differential Revision: https://reviews.llvm.org/D29695 llvm-svn: 295008
* [X86] Add MXCSR registerAndrew Kaylor2017-02-133-21/+33
| | | | | | | | | | This adds MXCSR to the set of recognized registers for X86 targets and updates the instructions that read or write it. I do not intend for all of the various floating point instructions that implicitly use the control bits or update the status bits of this register to ever have that usage modeled by default. However, when constrained floating point modes (such as strict FP exception status modeling or dynamic rounding modes) are enabled, implicit use/def information for MXCSR will be added to those instructions. Until those additional updates are made this should cause (almost?) no functional changes. Theoretically, this will prevent instructions like LDMXCSR and STMXCSR from being moved past one another, but that should be prevented anyway and I haven't found a case where it is happening now. Differential Revision: https://reviews.llvm.org/D29903 llvm-svn: 295004
* [FunctionAttrs] try to extend nonnull-ness of arguments from a callsite back to its parent functionSanjay Patel2017-02-131-0/+53
| | | | As discussed here: http://lists.llvm.org/pipermail/llvm-dev/2016-December/108182.html ...we should be able to propagate 'nonnull' info from a callsite back to its parent. The original motivation for this patch is our botched optimization of "dyn_cast" (PR28430), but this won't solve that problem. The transform is currently disabled by default while we wait for clang to work around potential security problems: http://lists.llvm.org/pipermail/cfe-dev/2017-January/052066.html Differential Revision: https://reviews.llvm.org/D27855 llvm-svn: 294998
* GlobalISel: represent atomic loads & stores via the MachineMemOperand.Tim Northover2017-02-132-11/+10
| | | | | | | Also make sure the AArch64 backend doesn't try to convert them into normal loads and stores. llvm-svn: 294993
* MIR: parse & print the atomic parts of a MachineMemOperand.Tim Northover2017-02-132-2/+49
| | | | | | We're going to need them very soon for GlobalISel. llvm-svn: 294992
* Address post-commit comments for https://reviews.llvm.org/D29596. NFCI.Taewook Oh2017-02-131-1/+1
| | | | llvm-svn: 294985
* swiftcc: Don't emit tail calls from callers with swifterror parametersArnold Schwaighofer2017-02-131-0/+9
| | | | | | | | | Backends don't support this yet. They would have to move to the swifterror register before the tail call to make sure it is live-in to the call. rdar://30495920 llvm-svn: 294982
* IR: Type ID summary extensions for WPD; thread summary into WPD pass.Peter Collingbourne2017-02-132-9/+86
| | | | | | | | | | Make the whole thing testable by adding YAML I/O support for the WPD summary information and adding some negative tests that exercise the YAML support. Differential Revision: https://reviews.llvm.org/D29782 llvm-svn: 294981
* Make MachineBasicBlock::updateTerminator to update DebugLoc as wellTaewook Oh2017-02-131-2/+21
| | | | Summary: Currently MachineBasicBlock::updateTerminator simply drops the DebugLoc for newly created branch instructions, which may cause incorrect stepping and/or imprecise sample profile data. Below is an example:
```
 1 extern int bar(int x);
 2
 3 int foo(int *begin, int *end) {
 4   int *i;
 5   int ret = 0;
 6   for (
 7     i = begin ;
 8     i != end ;
 9     i++)
10   {
11     ret += bar(*i);
12   }
13   return ret;
14 }
```
Below is the bitcode of 'foo' at the end of LLVM-IR level optimizations with -O3:
```
define i32 @foo(i32* readonly %begin, i32* readnone %end) !dbg !4 {
entry:
  %cmp6 = icmp eq i32* %begin, %end, !dbg !9
  br i1 %cmp6, label %for.end, label %for.body.preheader, !dbg !12

for.body.preheader: ; preds = %entry
  br label %for.body, !dbg !13

for.body: ; preds = %for.body.preheader, %for.body
  %ret.08 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ]
  %i.07 = phi i32* [ %incdec.ptr, %for.body ], [ %begin, %for.body.preheader ]
  %0 = load i32, i32* %i.07, align 4, !dbg !13, !tbaa !15
  %call = tail call i32 @bar(i32 %0), !dbg !19
  %add = add nsw i32 %call, %ret.08, !dbg !20
  %incdec.ptr = getelementptr inbounds i32, i32* %i.07, i64 1, !dbg !21
  %cmp = icmp eq i32* %incdec.ptr, %end, !dbg !9
  br i1 %cmp, label %for.end.loopexit, label %for.body, !dbg !12, !llvm.loop !22

for.end.loopexit: ; preds = %for.body
  br label %for.end, !dbg !24

for.end: ; preds = %for.end.loopexit, %entry
  %ret.0.lcssa = phi i32 [ 0, %entry ], [ %add, %for.end.loopexit ]
  ret i32 %ret.0.lcssa, !dbg !24
}
```
where
```
!12 = !DILocation(line: 6, column: 3, scope: !11)
```
As you can see, the terminator of the 'entry' block, which is a loop control branch, has a DebugLoc of line 6, column 3.
However, after the execution of the 'MachineBasicBlock::updateTerminator' function, which is triggered by the MachineSinking pass, the DebugLoc info is dropped as below (note that there is no debug-location for JNE_1):
```
bb.0.entry:
  successors: %bb.4(0x30000000), %bb.1.for.body.preheader(0x50000000)
  liveins: %rdi, %rsi

  %6 = COPY %rsi
  %5 = COPY %rdi
  %8 = SUB64rr %5, %6, implicit-def %eflags, debug-location !9
  JNE_1 %bb.1.for.body.preheader, implicit %eflags
```
This patch addresses this issue and makes newly created branch instructions keep their debug-location info.
Reviewers: aprantl, MatzeB, craig.topper, qcolombet
Reviewed By: qcolombet
Subscribers: qcolombet, llvm-commits
Differential Revision: https://reviews.llvm.org/D29596
llvm-svn: 294976
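The behaviour described in the updateTerminator entry above, keeping the debug location when a block's branch is recreated, can be modelled in a few lines. Everything here is a hypothetical stand-in (`DebugLoc`, `Instr`, `Block`, `findBranchDebugLoc`, `replaceBranches`); it shows the general idea of capturing the location before erasing the old branch, not the code from the patch.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical source location; line 0 means "no location".
struct DebugLoc {
  unsigned line;
  unsigned column;
  bool valid() const { return line != 0; }
};

// Hypothetical machine instruction.
struct Instr {
  std::string opcode;
  bool isBranch;
  DebugLoc loc;
};

struct Block {
  std::vector<Instr> instrs;
};

// Returns the location of the first branch that carries one, if any.
DebugLoc findBranchDebugLoc(const Block &block) {
  for (const Instr &i : block.instrs)
    if (i.isBranch && i.loc.valid())
      return i.loc;
  return DebugLoc{0, 0};
}

// Replaces all branches in the block with a single new branch, carrying the
// old branch's location over instead of dropping it.
void replaceBranches(Block &block, const std::string &newOpcode) {
  DebugLoc loc = findBranchDebugLoc(block);  // capture before erasing
  block.instrs.erase(
      std::remove_if(block.instrs.begin(), block.instrs.end(),
                     [](const Instr &i) { return i.isBranch; }),
      block.instrs.end());
  block.instrs.push_back(Instr{newOpcode, true, loc});
}
```

In terms of the example in the commit message, the recreated `JNE_1` would keep the line 6, column 3 location of the branch it replaces rather than ending up with none.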
* Revert "[LV] Extend trunc optimization to all IVs with constant integer steps"Matthew Simpson2017-02-131-20/+12
| | | | | | | | This reverts commit r294967. This patch caused execution time slowdowns in a few LLVM test-suite tests, as reported by the clang-cmake-aarch64-quick bot. I'm reverting to investigate. llvm-svn: 294973
* [FastISel] Add a diagnostic to warn on fallback.Quentin Colombet2017-02-131-0/+13
| | | | | | | | This is consistent with what we do for GlobalISel. That way, it is easy to see whether or not FastISel is able to fully select a function. At some point we may want to switch that to an optimization remark. llvm-svn: 294970
* [ARM] Fix crash caused by r294945James Molloy2017-02-131-2/+4
| | | | | | | | I'd missed a creator of FCMP nodes - duplicateCmp(). Kindly and promptly reported by Gabor Ballabas, due to his CSiBE test suite. llvm-svn: 294968
* [LV] Extend trunc optimization to all IVs with constant integer stepsMatthew Simpson2017-02-131-12/+20
| | | | | | | | | | | | | | This patch extends the optimization of truncations whose operand is an induction variable with a constant integer step. Previously we were only applying this optimization to the primary induction variable. However, the cost model assumes the optimization is applied to the truncation of all integer induction variables (even regardless of step type). The transformation is now applied to the other induction variables, and I've updated the cost model to ensure it is better in sync with the transformation we actually perform. Differential Revision: https://reviews.llvm.org/D29847 llvm-svn: 294967
* [mips] divide macro instruction cleanup.Simon Dardis2017-02-135-80/+223
| | | | Clean up the implementation of divide macro expansion by getting rid of a FIXME regarding magic numbers and branch instructions. Match GAS' behaviour for expansion of ddiv / div in the two and three operand cases. Add the two operand alias for MIPSR6. Finally, optimize macro expansion cases where the divisor is the $zero register. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D29887 llvm-svn: 294960
* Fix indentation. NFCI.Simon Pilgrim2017-02-131-1/+1
| | | | llvm-svn: 294959
* [PM] Hook up the instrumented PGO machinery in the new PM.Davide Italiano2017-02-131-0/+60
| | | | | | Differential Revision: https://reviews.llvm.org/D29308 llvm-svn: 294955
* [LTO] Make sure we flush buffers to work around linker shenanigans.Davide Italiano2017-02-131-2/+17
| | | | | | | lld, at least, doesn't call global destructors by default (unless --full-shutdown is passed) because it's, allegedly, expensive. llvm-svn: 294953
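The hazard mentioned in the [LTO] entry above is that buffered output is normally written out by destructors, which never run if the process exits without a full shutdown. A minimal sketch of the defensive pattern, using standard C++ streams rather than LLVM's stream classes (`writeAndFlush` is a hypothetical helper, not a function from the patch):

```cpp
#include <fstream>
#include <string>

// Writes `contents` to `path` and flushes explicitly, so the data reaches
// the file even if stream destructors never get a chance to run.
bool writeAndFlush(const std::string &path, const std::string &contents) {
  std::ofstream out(path, std::ios::binary | std::ios::trunc);
  if (!out)
    return false;  // could not open the file
  out << contents;
  out.flush();  // do not rely on the destructor to flush
  return out.good();
}
```

The explicit flush costs little and removes the dependence on how the host process chooses to exit.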