path: root/llvm
Commit message | Author | Age | Files | Lines
* De-flake a test that is failing due to coroutine spill insertion non-determinismReid Kleckner2017-04-071-4/+6
| | | | llvm-svn: 299791
* [Dominators] Simplify a member function. NFCI.Davide Italiano2017-04-071-8/+2
| | | | llvm-svn: 299789
* Revert "[SelectionDAG] Enable target specific vector scalarization of calls and returns"Simon Dardis2017-04-0715-2121/+104
| | | | | | | | | | | | | This reverts commit r299766. This change appears to have broken the MIPS buildbots. Reverting while I investigate. Revert "[mips] Remove usage of debug only variable (NFC)" This reverts commit r299769. Follow up commit. llvm-svn: 299788
* [AMDGPU] Unroll more to eliminate phis and conditionsStanislav Mekhanoshin2017-04-072-2/+86
| | | | | | | | | | | | | Increase the threshold to unroll a loop which contains an "if" statement whose condition is defined by a PHI belonging to the loop. This may help to eliminate the if region and potentially even the PHI itself, saving on both divergence and registers used for the PHI. Add a small bonus for each such "if" statement. Differential Revision: https://reviews.llvm.org/D31693 llvm-svn: 299779
* Use PMADDWD to expand reduction in a loopDehao Chen2017-04-072-0/+150
| | | | | | | | | | | | | | | | | | Summary: PMADDWD can help improve 8/16 bit integer multiply-add operation performance for cases like: for (int i = 0; i < count; i++) a += x[i] * y[i]; Reviewers: wmi, davidxl, hfinkel, RKSimon, zvi, mkuper Reviewed By: mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31679 llvm-svn: 299776
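As a rough illustration of the loop shape in the summary above, the sketch below writes the reduction for 16-bit inputs and then regroups it into pairs, which is the shape a PMADDWD lane computes (two signed 16-bit multiplies whose 32-bit products are added together). The function names are hypothetical, this is a scalar model rather than the code the patch generates, and it assumes the sums stay within the range of a 32-bit integer.

```cpp
#include <cstddef>
#include <cstdint>

// The plain reduction loop from the commit message, with 16-bit inputs.
int32_t dot_scalar(const int16_t *x, const int16_t *y, size_t count) {
  int32_t a = 0;
  for (size_t i = 0; i < count; i++)
    a += int32_t(x[i]) * int32_t(y[i]);
  return a;
}

// Scalar model of a single PMADDWD lane (hypothetical helper): multiply two
// adjacent pairs of signed 16-bit values and add the two 32-bit products.
int32_t pmaddwd_lane(int16_t x0, int16_t y0, int16_t x1, int16_t y1) {
  return int32_t(x0) * int32_t(y0) + int32_t(x1) * int32_t(y1);
}

// The same reduction regrouped into pairs, with a scalar tail for odd counts.
int32_t dot_pairwise(const int16_t *x, const int16_t *y, size_t count) {
  int32_t a = 0;
  size_t i = 0;
  for (; i + 1 < count; i += 2)
    a += pmaddwd_lane(x[i], y[i], x[i + 1], y[i + 1]);
  if (i < count)
    a += int32_t(x[i]) * int32_t(y[i]);
  return a;
}
```

Both functions return the same value for the same inputs; the pairwise form only makes visible the grouping that a vector instruction can exploit.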
* [lit] Try using process pools by default againReid Kleckner2017-04-071-1/+1
| | | | | | | | | | | Both pickling errors encountered on clang bots and Darwin compiler-rt should now be fixed. This has no impact on testing time on Linux, and on Windows goes from 88s to 63s for 'check'. The tests pass on Mac, but I haven't compared execution time. llvm-svn: 299775
* [GlobalISel] implement narrowing for G_CONSTANT.Igor Breger2017-04-073-0/+79
| | | | | | | | | | | | | | Summary: [GlobalISel] implement narrowing for G_CONSTANT. Reviewers: bogner, zvi, t.p.northover Reviewed By: t.p.northover Subscribers: llvm-commits, dberris, rovka, kristof.beyls Differential Revision: https://reviews.llvm.org/D31744 llvm-svn: 299772
* [coroutines] Insert spills of PHI instructions correctlyGor Nishanov2017-04-072-0/+64
| Summary:
| Fix a bug where we were inserting a spill in between the PHIs at the beginning of the block. Consider this fragment:
|
| ```
| begin:
|   %phi1 = phi i32 [ 0, %entry ], [ 2, %alt ]
|   %phi2 = phi i32 [ 1, %entry ], [ 3, %alt ]
|   %sp1 = call i8 @llvm.coro.suspend(token none, i1 false)
|   switch i8 %sp1, label %suspend [i8 0, label %resume
|                                   i8 1, label %cleanup]
| resume:
|   call i32 @print(i32 %phi1)
| ```
|
| Unless we are spilling the argument or result of the invoke, we were always inserting the spill immediately following the instruction. The fix adds a check: if the spilled instruction is a PHI Node, an appropriate insert point is selected with `getFirstInsertionPt()`, which skips all the PHI Nodes and EH pads.
|
| Reviewers: majnemer, rnk
| Reviewed By: rnk
| Subscribers: qcolombet, EricWF, llvm-commits
| Differential Revision: https://reviews.llvm.org/D31799
| llvm-svn: 299771
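The insert-point rule described in this commit can be sketched with a toy model. Everything below (`Kind`, `Inst`, `spillInsertIndex`) is hypothetical and is not LLVM's API; a block is simply a vector of instructions, and EH pads, which the real `getFirstInsertionPt()` also skips, are left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction model (an assumption for illustration, not LLVM types).
enum class Kind { Phi, Other };
struct Inst {
  Kind kind;
  std::string name;
};

// Returns the index at which a spill of block[def] could be inserted.
// A non-PHI value is spilled right after its definition; a PHI is spilled
// after the last leading PHI, so no instruction lands between the PHIs.
size_t spillInsertIndex(const std::vector<Inst> &block, size_t def) {
  if (block[def].kind != Kind::Phi)
    return def + 1;
  size_t i = 0;
  while (i < block.size() && block[i].kind == Kind::Phi)
    ++i;
  return i;
}
```

For the fragment in the commit message, both `%phi1` and `%phi2` would get their spills at index 2, just before the `llvm.coro.suspend` call.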
* Reapply r298620: [LV] Vectorize GEPsMatthew Simpson2017-04-075-186/+453
| | | | | | | | | | | | | This patch reapplies r298620. The original patch was reverted because of two issues. First, the patch exposed a bug in InstCombine that caused the Chromium builds to fail (PR32414). This issue was fixed in r299017. Second, the patch introduced a bug in the vectorizer's scalars analysis that caused test suite builds to fail on SystemZ. The scalars analysis was too aggressive and marked a memory instruction scalar, even though it was going to be vectorized. This issue has been fixed in the current patch and several new test cases for the scalars analysis have been added. llvm-svn: 299770
* [mips] Remove usage of debug only variable (NFC)Simon Dardis2017-04-071-2/+2
| | | | | | | Fix the lld-x86_64-darwin13 buildbot by removing the declaration of a debug only variable and instead moving the value into the debug statement. llvm-svn: 299769
* [mips][msa] Fix generation of bm(n)zi and bins[lr]i instructionsPetar Jovanovic2017-04-076-17/+72
| | | | | | | | | | | | | | | | | | | | | | | | | We have two cases here, the first one being the following instruction selection from the builtin function: bm(n)zi builtin -> vselect node -> bins[lr]i machine instruction In case of bm(n)zi having an immediate which has either its high or low bits set, a bins[lr] instruction can be selected through the selectVSplatMask[LR] function. The function counts the number of bits set, and that value is being passed to the bins[lr]i instruction as its immediate, which in turn copies immediate modulo the size of the element in bits plus 1 as per specs, where we get the off-by-one-error. The other case is: bins[lr]i -> vselect node -> bsel.v In this case, a bsel.v instruction gets selected with a mask having one bit less set than required. Patch by Stefan Maksimovic. Differential Revision: https://reviews.llvm.org/D30579 llvm-svn: 299768
* [AMDGPU][MC] Fix for Bug 28211 + LIT testsDmitry Preobrazhensky2017-04-077-75/+209
| - corrected DS_GWS_* opcodes (see VI_Shader_Programming#16.pdf for detailed description)
|   - address operand is not used
|   - several opcodes have data operand
|   - all opcodes have offset modifier
| - DS_AND_SRC2_B32: corrected typo in mnemo
| - DS_WRAP_RTN_F32 replaced with DS_WRAP_RTN_B32
| - added CI/VI opcodes:
|   - DS_CONDXCHG32_RTN_B64
|   - DS_GWS_SEMA_RELEASE_ALL
| - added VI opcodes:
|   - DS_CONSUME
|   - DS_APPEND
|   - DS_ORDERED_COUNT
|
| Differential Revision: https://reviews.llvm.org/D31707
| llvm-svn: 299767
* [SelectionDAG] Enable target specific vector scalarization of calls and returnsSimon Dardis2017-04-0715-104/+2121
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown, backends can request that LLVM scalarize vector types for calls and returns. The MIPS vector ABI requires that vector arguments and returns are passed in integer registers. With SelectionDAG's new hooks, the MIPS backend can now handle LLVM-IR with vector types in calls and returns. E.g. 'call @foo(<4 x i32> %4)'. Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for calls and returns if vector types were not legal. If vector types were legal, a single 128bit vector argument would be assigned to a single 32 bit / 64 bit integer register. By teaching the MIPS backend to inspect the original types, it can now implement the MIPS vector ABI which requires a particular method of scalarizing vectors. Previously, the MIPS backend relied on clang to scalarize types such as "call @foo(<4 x float> %a)" into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3, i32 inreg %4)". This patch enables the MIPS backend to take either form for vector types. Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur Differential Revision: https://reviews.llvm.org/D27845 llvm-svn: 299766
* [SystemZ] Check for presence of vector support in SystemZISelLoweringJonas Paulsson2017-04-073-2/+24
| | | | | | | | | | | | | | A test case was found with llvm-stress that caused DAGCombiner to crash when compiling for an older subtarget without vector support. SystemZTargetLowering::combineTruncateExtract() should do nothing for older subtargets. This check was placed in canTreatAsByteVector(), which also helps in a few other places. Review: Ulrich Weigand llvm-svn: 299763
* [SystemZ] Remove confusing comment in combineEXTRACT_VECTOR_ELT()Jonas Paulsson2017-04-071-2/+0
| | | | | | It isn't just one-element vectors that can appear here. llvm-svn: 299762
* [ARM] GlobalISel: Test hard float properlyDiana Picus2017-04-071-16/+26
| | | | | | | | It turns out -float-abi=hard doesn't set the hard float calling convention for libcalls. We need to use a hard float triple instead (e.g. gnueabihf). llvm-svn: 299761
* [AMDGPU] Move SiShrinkInstruction and SDWAPeephole to SSAOptimization passesSam Kolton2017-04-073-7/+7
| | | | | | | | | | | | | | Summary: The difference between PreRegAlloc() and MachineSSAOptimization() is that the former runs even at the -O0 optimization level. In my understanding, SiShrinkInstructions and SDWAPeephole shouldn't run when optimizations are disabled. With this change the order of passes will not change. Reviewers: arsenm, vpykhtin, rampitec Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31705 llvm-svn: 299757
* [ARM] GlobalISel: Support frem for 64-bit valuesDiana Picus2017-04-073-0/+59
| | | | | | Legalize to a libcall. llvm-svn: 299756
* [ARM] GlobalISel: Support frem for 32-bit valuesDiana Picus2017-04-074-5/+51
| | | | | | | | Legalize to a libcall. On this occasion, also start allowing soft float subtargets. For the moment G_FREM is the only legal floating point operation for them. llvm-svn: 299753
* [InstCombine] Handle more commuted cases of ((A & B) | ~A) -> (~A | B)Craig Topper2017-04-072-7/+6
| | | | llvm-svn: 299747
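The fold named in this commit's subject rests on a bitwise identity that is easy to check by brute force. The sketch below is only a sanity check of that identity over 8-bit values, covering each commuted operand order; the function names are hypothetical and nothing here reflects how InstCombine matches the pattern.

```cpp
#include <cstdint>

// Checks every commuted spelling of ((A & B) | ~A) against (~A | B)
// for one pair of 8-bit values.
bool foldHoldsFor(uint8_t a, uint8_t b) {
  const uint8_t na = static_cast<uint8_t>(~a);
  const uint8_t expected = static_cast<uint8_t>(na | b);
  const uint8_t forms[] = {
      static_cast<uint8_t>((a & b) | na),
      static_cast<uint8_t>((b & a) | na),
      static_cast<uint8_t>(na | (a & b)),
      static_cast<uint8_t>(na | (b & a)),
  };
  for (uint8_t f : forms)
    if (f != expected)
      return false;
  return true;
}

// Exhaustive check over all 256 x 256 input pairs.
bool foldHoldsForAllBytes() {
  for (int a = 0; a < 256; ++a)
    for (int b = 0; b < 256; ++b)
      if (!foldHoldsFor(static_cast<uint8_t>(a), static_cast<uint8_t>(b)))
        return false;
  return true;
}
```

The identity follows from distributing the OR: (A & B) | ~A = (A | ~A) & (B | ~A) = ~A | B.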
* [InstCombine] Add additional tests with varied commuting to show missing combines. NFCCraig Topper2017-04-071-0/+38
| | | | | | llvm-svn: 299746
* [InstSimplify] Use Instruction::BinaryOps instead of unsigned for a few function operands to remove some casts. NFCCraig Topper2017-04-071-10/+11
| | | | | | llvm-svn: 299745
* AliasAnalysis: Be less conservative about volatile than atomic.Daniel Berlin2017-04-074-7/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: getModRefInfo is meant to answer the question "what impact does this instruction have on a given memory location" (not even another instruction). Long debate on this on IRC comes to the conclusion the answer should be "nothing special". That is, a noalias volatile store does not affect a memory location just by being volatile. Note: DSE and GVN and memdep currently believe this, because memdep just goes behind AA's back after it says "modref" right now. see line 635 of memdep. Prior to this patch we would get modref there, then check aliasing, and if it said noalias, we would continue. getModRefInfo *already* has this same AA check, it just wasn't being used because volatile was lumped in with ordering. (I am separately testing whether this code in memdep is now dead except for the invariant load case) Reviewers: jyknight, chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31726 llvm-svn: 299741
* [InstCombine] Add more commuted patterns to support folding ((~A & B) | A) -> (A | B).Craig Topper2017-04-072-13/+8
| | | | | | llvm-svn: 299737
* [WebAssembly] Fix -Wcovered-switch-default warningDerek Schuff2017-04-061-2/+1
| | | | llvm-svn: 299736
* Allow specification of what kinds of class members to dump.Zachary Turner2017-04-068-118/+62
| | | | | | | | | | | | | | Previously when dumping class definitions, there were only two modes - on or off. But it's useful to sometimes get a little more fine-grained. For example, you might only want to see the record layout (for example to look for extraneous padding). This patch adds a third mode, layout mode, which does exactly that. Only this-relative data members are displayed in this mode. Differential Revision: https://reviews.llvm.org/D31794 llvm-svn: 299733
* [llvm-pdbdump] Allow pretty to only dump specific types of types.Zachary Turner2017-04-063-30/+59
| | | | | | | | | | | | | Previously we just had the -types option, which would dump all classes, typedefs, and enums. But this produces a lot of output if you only want to view classes, for example. This patch breaks this down into 3 additional options, -classes, -enums, and -typedefs, and keeps the -types option around which implies all 3 more specific options. Differential Revision: https://reviews.llvm.org/D31791 llvm-svn: 299732
* AMDGPU/GFX9: Fix shared and private aperture queriesKonstantin Zhuravlyov2017-04-064-17/+50
| | | | | | Differential Revision: https://reviews.llvm.org/D31786 llvm-svn: 299727
* Remove the default subtarget from the Power port. It's unnecessary and harmful if used.Eric Christopher2017-04-062-4/+1
| | | | | | llvm-svn: 299726
* [InstCombine] Add a few cases for OR we fail to optimize due to missing commuted patterns checks.Craig Topper2017-04-061-0/+45
| | | | | | llvm-svn: 299725
* Revert "Revert "[ARM] Add Kryo to available targets""Yi Kong2017-04-064-1/+12
| | | | | | | | This reverts commit dc9458d5a747a02a9a8f198b84c2b92a6939a8dd. Added missing case for PreISelOperandLatencyAdjustment. llvm-svn: 299724
* Turn on -addr-sink-using-gep by default.Eli Friedman2017-04-0618-93/+66
| | | | | | | | | The new codepath has been in the tree for years, and there isn't any reason to use two codepaths here. Differential Revision: https://reviews.llvm.org/D30596 llvm-svn: 299723
* [X86] Revert r299387 due to AVX legalization infinite loop.Michael Kuperstein2017-04-0612-142/+103
| | | | llvm-svn: 299720
* [InstCombine] Remove testing assert I accidentally left in r299710.Craig Topper2017-04-061-3/+1
| | | | llvm-svn: 299715
* iwyu fixes for lldbCore.Zachary Turner2017-04-061-0/+2
| | | | | | | | | | | | | | This adjusts header file includes for headers and source files in Core. In doing so, one dependency cycle is eliminated because all the includes from Core to that project were dead includes anyway. In places where some files in other projects were only compiling due to a transitive include from another header, fixups have been made so that those files also include the header they need. Tested on Windows and Linux, and plan to address failures on OSX and FreeBSD after watching the bots. llvm-svn: 299714
* AMDGPU: Diagnose illegal SGPR to VGPR copiesMatt Arsenault2017-04-063-3/+85
| | | | | | | | | | This is possible in ways that are not compiler bugs, so stop asserting on them. This emits an extra error when emitting objects when it can't encode the new pseudo, but I'm not sure that matters. llvm-svn: 299712
* [InstCombine] When checking to see if we can turn subtracts of 2^n - 1 into xor, we only need to call computeKnownBits on the RHS not the whole subtract. While there use isMask instead of isPowerOf2(C+1)Craig Topper2017-04-061-5/+7
| | | | | | | | Calling computeKnownBits on the RHS should allow us to recurse one step further. isMask is equivalent to isPowerOf2(C+1) except in the case where C is all ones. But that was already handled earlier by creating a not, which is an Xor with all ones. So this should be fine. llvm-svn: 299710
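A small numeric sketch may help with the two claims in this message. The helpers `isPowerOf2` and `isMask` below are hypothetical 32-bit stand-ins written for this example, not the LLVM functions themselves, and `subIsXorForMaskBits` is an invented name; the sketch only considers nonzero constants and assumes the subtracted value has no bits set outside the mask.

```cpp
#include <cstdint>

// Hypothetical 32-bit stand-ins for the predicates named in the commit.
bool isPowerOf2(uint32_t v) { return v != 0 && (v & (v - 1)) == 0; }
bool isMask(uint32_t v) { return v != 0 && ((v + 1) & v) == 0; }

// For C = 2^n - 1 and any X with no bits set above bit n-1, subtracting X
// from C never borrows, so C - X equals C ^ X. Checked exhaustively here
// for small n (1 <= n <= 16).
bool subIsXorForMaskBits(unsigned n) {
  const uint32_t c = (1u << n) - 1;
  for (uint32_t x = 0; x <= c; ++x)
    if (c - x != (c ^ x))
      return false;
  return true;
}
```

In this model the two predicates agree for every C of the form 2^n - 1 with n below 32, and differ when C is all ones: C + 1 wraps to zero, so `isPowerOf2(C + 1)` is false while `isMask(C)` is true.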
* AMDGPU: Replace fp16SrcZerosHighBits with a whitelistMatt Arsenault2017-04-062-24/+71
| | | | | | | FCOPYSIGN is lowered to bit operations which don't clear the high bits. llvm-svn: 299708
* [PGO] Preserve GlobalsAA in pgo-memop-opt pass.Rong Xu2017-04-061-1/+5
| | | | | | | Preserve GlobalsAA analysis in memory intrinsic calls optimization based on profiled size. llvm-svn: 299707
* [llvm-extract] Add option for recursive extractionKeno Fischer2017-04-062-1/+66
| Summary:
| Particularly, with --delete, this can be very useful for testing new optimizations on some hotspots, without having to run it on the whole application. E.g. as such:
|
| ```
| llvm-extract app.bc --recursive --rfunc .*hotspot.* > hotspot.bc
| llvm-extract app.bc --recursive --delete --rfunc .*hotspot.* > residual.bc
| llc -filetype=obj residual.bc > residual.o
| llc -filetype=obj hotspot.bc > hotspot.o
| cc -o app residual.o hotspot.o
| ```
|
| Reviewed By: davide
| Differential Revision: https://reviews.llvm.org/D31722
| llvm-svn: 299706
* [InstCombine] Remove redundant combine from visitAndCraig Topper2017-04-062-92/+0
| | | | | | | | This combine is fully handled by SimplifyDemandedInstructionBits as of r299658 where I fixed this code to ensure the Add/Sub had only a single user. Otherwise it would fire and create additional instructions. That fix resulted in an improvement to code generated for tsan which is why I committed it before deleting. Differential Revision: https://reviews.llvm.org/D31543 llvm-svn: 299704
* [BFIterator] Remove an assertion that doesn't hold. NFCI.Davide Italiano2017-04-061-1/+0
| | | | llvm-svn: 299703
* Revert "Turn some C-style vararg into variadic templates"Mehdi Amini2017-04-0622-205/+251
| | | | | | This reverts commit r299699; the examples need to be updated. llvm-svn: 299702
* [SelectionDAG] [ARM CodeGen] Fix chain information of LowerMULHuihui Zhang2017-04-062-2/+128
| | | | | | | | | | | | | | In LowerMUL, the chain information is not preserved for the newly created Load SDNodes. For example, if a Store aliases one of the operands of the Mul, the Load for that operand needs to be scheduled before the Store. The dependence is recorded in the chain of the Store, in a TokenFactor. However, when lowering MUL, the SDNodes for the new Loads for VMULL are not updated in the TokenFactor for the Store. Thus the chain is not preserved for the lowered VMULL. llvm-svn: 299701
* Turn some C-style vararg into variadic templatesMehdi Amini2017-04-0622-251/+205
| | | | | | | | | | | | | | | | Module::getOrInsertFunction is using C-style vararg instead of variadic templates. From a user perspective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's no type checking on the arguments. The variadic template is an obvious solution to both issues. Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu> Differential Revision: https://reviews.llvm.org/D31070 llvm-svn: 299699
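The two calling styles contrasted in this message can be sketched in isolation. The functions below, `collectVararg` and `collectVariadic`, are hypothetical and have nothing to do with the real signature of Module::getOrInsertFunction; they only show the nullptr sentinel that a C-style vararg interface needs and how a variadic template removes it while letting the compiler check each argument.

```cpp
#include <cstdarg>
#include <string>
#include <vector>

// C-style vararg: the caller must end the list with a null pointer, and the
// compiler cannot check that every argument really is a const char *.
std::vector<std::string> collectVararg(const char *first, ...) {
  std::vector<std::string> out;
  va_list ap;
  va_start(ap, first);
  for (const char *s = first; s != nullptr; s = va_arg(ap, const char *))
    out.push_back(s);
  va_end(ap);
  return out;
}

// Variadic template: no sentinel, and each argument must be convertible to
// std::string or the call fails to compile.
template <typename... Ts>
std::vector<std::string> collectVariadic(Ts... args) {
  return std::vector<std::string>{std::string(args)...};
}
```

A call such as `collectVariadic("i32", "i8*")` needs no terminator, whereas the vararg form needs a trailing `static_cast<const char *>(nullptr)`; forgetting it compiles cleanly and reads past the argument list.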
* [asan] Fix dead stripping of globals on Linux.Evgeniy Stepanov2017-04-067-48/+163
| | | | | | | | | | | | | | | | | | | | | | | Use a combination of !associated, comdat, @llvm.compiler.used and custom sections to allow dead stripping of globals and their asan metadata. Sometimes. Currently this works on LLD, which supports SHF_LINK_ORDER with sh_link pointing to the associated section. This also works on BFD, which seems to treat comdats as all-or-nothing with respect to linker GC. There is a weird quirk where the "first" global in each link is never GC-ed because of the section symbols. At this moment it does not work on Gold (as in the globals are never stripped). This is a re-land of r298158 rebased on D31358. This time, asan.module_ctor is put in a comdat as well to avoid quadratic behavior in Gold. llvm-svn: 299697
* [asan] Put ctor/dtor in comdat.Evgeniy Stepanov2017-04-063-10/+54
| | | | | | | | | | | | | | | | | When possible, put ASan ctor/dtor in comdat. The only reason not to is global registration, which can be TU-specific. This is not the case when there are no instrumented globals. This is also limited to ELF targets, because MachO does not have comdat, and COFF linkers may GC comdat constructors. The benefit of this is far fewer __asan_init() calls: one per DSO instead of one per TU. It's also necessary for the upcoming gc-sections-for-globals change on Linux, where multiple references to section start symbols trigger quadratic behaviour in the gold linker. This is a rebase of r298756. llvm-svn: 299696
* [asan] Delay creation of asan ctor.Evgeniy Stepanov2017-04-064-23/+31
| | | | | | | | | | | Create the constructor in the module pass. This is needed for the GC-friendly globals change, where the constructor can be put in a comdat in some cases, but we don't know about that in the function pass. This is a rebase of r298731 which was reverted due to a false alarm. llvm-svn: 299695
* Bitcode: Do not create FNENTRYs for aliases of functions.Peter Collingbourne2017-04-063-19/+6
| | | | | | | | There doesn't seem to be any point in doing this. Differential Revision: https://reviews.llvm.org/D31691 llvm-svn: 299694
* [StripDeadDebugInfo] Drop dead CUs entirelyKeno Fischer2017-04-062-2/+34
| | | | | | | | | | | | | Summary: Prior to this, while it would delete the dead DIGlobalVariables, it would leave dead DICompileUnits and everything referenced therefrom. For a big bitcode file with thousands of compile units, those dead nodes easily outnumbered the real ones. Clean that up. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D31720 llvm-svn: 299692