path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [New PM][PassInstrumentation] IR printing support for New Pass Manager | Fedor Sergeev | 2018-09-24 | 6 | -12/+135
  Implementing -print-before-all/-print-after-all/-filter-print-func support through PassInstrumentation callbacks.
  - PrintIR routines implement printing callbacks.
  - StandardInstrumentations class provides a central place to manage all the "standard" in-tree pass instrumentations. Currently it registers PrintIR callbacks.
  Reviewers: chandlerc, paquette, philip.pfaffe
  Differential Revision: https://reviews.llvm.org/D50923
  llvm-svn: 342896
* [X86] Split WriteIMul into 8/16/32/64 implementations (PR36931) | Simon Pilgrim | 2018-09-24 | 11 | -373/+183
  Split WriteIMul by size and also by IMUL multiply-by-imm and multiply-by-reg cases.
  This removes all the scheduler overrides for gpr multiplies and stops WriteMULH being ignored for BMI2 MULX instructions.
  llvm-svn: 342892
* [Arm][AsmParser] Restrict register list size for VSTM/VLDM | Luke Cheeseman | 2018-09-24 | 1 | -0/+9
  - The assembler accepts VSTM/VLDM with register lists (specifically double register lists) with more than 16 registers specified
  - The Arm architecture reference manual says this instruction must not contain more than 16 registers when the registers are doubleword registers
  - This addresses one of the concerns in https://bugs.llvm.org/show_bug.cgi?id=38389
  Differential Revision: https://reviews.llvm.org/D52082
  llvm-svn: 342891
* [DAGCombiner] use UADDO to optimize saturated unsigned add | Sanjay Patel | 2018-09-24 | 1 | -0/+29
  This is a preliminary step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613
  If we have an 'add' instruction that sets flags, we can use that to eliminate an explicit compare instruction or some other instruction (cmn) that sets flags for use in the later select.
  As shown in the unchanged tests that use 'icmp ugt %x, %a', we're effectively reversing an IR icmp canonicalization that replaces a variable operand with a constant: https://rise4fun.com/Alive/V1Q
  But we're not using 'uaddo' in those cases via DAG transforms. This happens in CGP after D8889 without checking target lowering to see if the op is supported. So AArch already shows 'uaddo' codegen for the i8/i16/i32/i64 test variants with "using_cmp_sum" in the title. That's the pattern that CGP matches as an unsigned saturated add and converts to uaddo without checking target capabilities.
  This patch is gated by isOperationLegalOrCustom(ISD::UADDO, VT), so we only see AArch diffs for i32/i64 in the tests with "using_cmp_notval" in the title (unlike x86, which sees improvements for all sizes because all sizes are 'custom'). But the AArch code (like x86) looks better when translated to 'uaddo' in all cases. So someone who is involved with AArch may want to set i8/i16 to 'custom' for UADDO, so this patch will fire on those tests.
  Another possibility given the existing behavior: we could remove the legal-or-custom check altogether because we're assuming that a UADDO sequence is canonical/optimal before we ever reach here. But that seems like a bug to me. If the target doesn't have an add-with-flags op, then it's not likely that we'll get optimal DAG combining using a UADDO node. This is a similar justification for why we don't canonicalize IR to the overflow math intrinsic sibling (llvm.uadd.with.overflow) for UADDO in the first place.
  Differential Revision: https://reviews.llvm.org/D51929
  llvm-svn: 342886
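The saturated unsigned add discussed in this commit can be illustrated at source level. The following is a minimal C++ sketch, not code from the patch: the function names are hypothetical, the two compare-based forms only loosely mirror the "using_cmp_sum" and "using_cmp_notval" test titles, and `__builtin_add_overflow` is a GCC/Clang builtin assumed to be available.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: compare the wrapped sum against an operand.
// If x + y wrapped around, the result is smaller than x.
constexpr uint32_t satAddCmpSum(uint32_t x, uint32_t y) {
  return (x + y < x) ? UINT32_MAX : x + y;
}

// Hypothetical helper: compare against the inverted operand.
// x > ~y holds exactly when x + y does not fit in 32 bits.
constexpr uint32_t satAddCmpNotVal(uint32_t x, uint32_t y) {
  return (x > static_cast<uint32_t>(~y)) ? UINT32_MAX : x + y;
}

// Add-with-overflow-flag form: one add produces both the sum and the
// overflow bit, so no separate compare is needed (GCC/Clang builtin).
inline uint32_t satAddOverflowFlag(uint32_t x, uint32_t y) {
  uint32_t sum = 0;
  const bool overflowed = __builtin_add_overflow(x, y, &sum);
  return overflowed ? UINT32_MAX : sum;
}

// Runtime check that all three forms agree for one pair of inputs.
inline void checkAgreement(uint32_t x, uint32_t y) {
  assert(satAddCmpSum(x, y) == satAddOverflowFlag(x, y));
  assert(satAddCmpNotVal(x, y) == satAddOverflowFlag(x, y));
}

static_assert(satAddCmpSum(0xFFFFFFF0u, 0x20u) == UINT32_MAX, "saturates");
static_assert(satAddCmpNotVal(0xFFFFFFF0u, 0x20u) == UINT32_MAX, "saturates");
static_assert(satAddCmpSum(1u, 2u) == 3u, "no overflow");
```

All three functions compute the same value; the third form is the one that corresponds most directly to an add that also sets an overflow flag.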
* [Mips][FastISel] Fix selectBranch on icmp i1 | Petar Jovanovic | 2018-09-24 | 1 | -0/+5
  r337288 tried to fix the result of icmp i1 when its input is not sanitized by falling back to DagISel. While it now produces the correct result for bit 0, the other bits can still hold arbitrary values, which is not supported by MipsFastISel branch lowering. This patch fixes the issue by falling back to DagISel in this case.
  Patch by Dragan Mladjenovic.
  Differential Revision: https://reviews.llvm.org/D52045
  llvm-svn: 342884
* [PowerPC] Support operand modifier 'x' in inline asm | Zaara Syeda | 2018-09-24 | 1 | -0/+15
  gcc uses operand modifier 'x' in inline asm for VSX registers. Without this modifier, instructions which use VSX numbering for their operands are printed as VMX registers. This patch adds support for the operand modifier 'x'.
  Differential Revision: https://reviews.llvm.org/D52244
  llvm-svn: 342882
* AMDGPU: Fix private handling for allowsMisalignedMemoryAccesses | Matt Arsenault | 2018-09-24 | 1 | -1/+5
  If the alignment is at least 4, this should report true.
  Something still seems off with how < 4-byte types are handled here though.
  Fixing this seems to change how some combines get to where they get, but somehow isn't changing the net result.
  llvm-svn: 342879
* [ARM] Do not fuse VADD and VMUL on the Cortex-M4 and Cortex-M33 | Sjoerd Meijer | 2018-09-24 | 2 | -3/+5
  A sequence of VMUL and VADD instructions always gives the same or better performance than a fused VMLA instruction on the Cortex-M4 and Cortex-M33. Executing the VMUL and VADD back-to-back requires the same cycles, but having separate instructions allows scheduling to avoid the hazard between these 2 instructions.
  Differential Revision: https://reviews.llvm.org/D52289
  llvm-svn: 342874
* Revert r341932 "[ARM] Enable ARMCodeGenPrepare by default" | Hans Wennborg | 2018-09-24 | 1 | -1/+1
  This caused miscompilation of WebRTC for Android: PR39060.
  > We've had the pass enabled downstream for a couple of weeks and it
  > seems to be okay, so enable it by default.
  >
  > Differential Revision: https://reviews.llvm.org/D51920
  llvm-svn: 342873
* [ARM][ARMLoadStoreOptimizer] | Luke Cheeseman | 2018-09-24 | 1 | -0/+14
  - The load store optimizer is currently merging multiple loads/stores into VLDM/VSTM with more than 16 doubleword registers
  - This is an UNPREDICTABLE instruction and shouldn't be done
  - It looks like the Limit for how many registers are included in a merge got dropped at some point, so I am reintroducing it in this patch
  - This fixes https://bugs.llvm.org/show_bug.cgi?id=38389
  Differential Revision: https://reviews.llvm.org/D52085
  llvm-svn: 342872
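The effect of a per-merge register limit can be sketched as follows. This is an illustrative C++ model only, not the ARMLoadStoreOptimizer code: `splitIntoMerges` and `kMaxDRegsPerMerge` are hypothetical names, and the greedy chunking is an assumption about how a run of consecutive doubleword registers might be grouped once no single VLDM/VSTM may cover more than 16 of them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Limit stated in the commit message: at most 16 doubleword registers
// per VLDM/VSTM. The constant name is hypothetical.
constexpr std::size_t kMaxDRegsPerMerge = 16;

// Hypothetical helper: split a run of `numDRegs` consecutive doubleword
// registers into merge groups that never exceed `limit` registers.
inline std::vector<std::size_t> splitIntoMerges(
    std::size_t numDRegs, std::size_t limit = kMaxDRegsPerMerge) {
  assert(limit > 0 && "limit must be positive");
  std::vector<std::size_t> groups;
  while (numDRegs > 0) {
    const std::size_t take = numDRegs < limit ? numDRegs : limit;
    groups.push_back(take);  // one merged load/store covers `take` registers
    numDRegs -= take;
  }
  return groups;
}
```

For example, a run of 40 registers would be covered by groups of 16, 16, and 8 under this model, instead of one oversized merge.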
* [deadargelim] Update dbg.value of 'unused' parameters | Petar Jovanovic | 2018-09-24 | 1 | -3/+8
  DeadArgElim pass marks unused function arguments as 'undef' without updating existing dbg.values referring to them. As a consequence, the debug info metadata in the final executable was wrong.
  Patch by Djordje Todorovic.
  Differential Revision: https://reviews.llvm.org/D51968
  llvm-svn: 342871
* [ARM] bottom-top mul support ARMParallelDSP | Sam Parker | 2018-09-24 | 1 | -27/+154
  Originally committed in rL342210 but was reverted in rL342260 because it was causing issues in vectorized code, because I had forgotten to ensure that we're operating on scalar values.
  Original commit message:
  On failing to find sequences that can be converted into dual macs, try to find sequential 16-bit loads that are used by muls which we can then use smultb, smulbt, smultt with a wide load.
  Differential Revision: https://reviews.llvm.org/D51983
  llvm-svn: 342870
* Remove debug printf leftover from r342397 | Hans Wennborg | 2018-09-24 | 1 | -2/+0
  llvm-svn: 342863
* [XRay] Clean up XRay build configuration | Dean Michael Berris | 2018-09-24 | 2 | -0/+24
  Summary: This change spans both LLVM and compiler-rt, where we do the following:
  - Add XRay to the LLVMBuild system, to allow for distributing the XRay trace loading library along with the LLVM distributions.
  - Use `llvm-config` better in the compiler-rt XRay implementation, to depend on the potentially already-distributed LLVM XRay library.
  While this is tested with the standalone compiler-rt build, it does require that the LLVMXRay library (and LLVMSupport as well) are available during the build. In case the static libraries are available, the unit tests will build and work fine. We're still having issues with attempting to use a shared library version of the LLVMXRay library since the shared library might not be accessible from the standard shared library lookup paths.
  The larger change here is the inclusion of the LLVMXRay library in the distribution, which allows for building tools around the XRay traces and profiles that the XRay runtime already generates.
  Reviewers: echristo, beanz
  Subscribers: mgorny, hiraditya, mboerger, llvm-commits
  Differential Revision: https://reviews.llvm.org/D52349
  llvm-svn: 342859
* Fix asserts when linking wrong address space declarations | Matt Arsenault | 2018-09-24 | 1 | -3/+6
  llvm-svn: 342858
* [DAGCombiner] Remove some dead code from ConstantFoldBITCASTofBUILD_VECTOR | Craig Topper | 2018-09-24 | 1 | -9/+2
  This code handled SCALAR_TO_VECTOR being returned by the recursion, but the code that used to return SCALAR_TO_VECTOR was removed in 2015.
  llvm-svn: 342856
* [ORC] Add some debugging output to Core.h/Core.cpp | Lang Hames | 2018-09-23 | 1 | -0/+4
  Core now logs when materialization units are dispatched or returned to JITDylibs.
  llvm-svn: 342853
* [X86] Split WriteShift/WriteRotate schedule classes by CL usage. | Simon Pilgrim | 2018-09-23 | 11 | -95/+64
  Variable Shifts/Rotates using the CL register have different behaviours to the immediate instructions - split accordingly to help remove yet more repeated overrides from the schedule models.
  llvm-svn: 342852
* [DAGCombiner] Clarify a comment. NFC | Craig Topper | 2018-09-23 | 1 | -2/+4
  This comment was misleading about why we were restricting to before legalize types. The reason given would only apply to before legalize ops. But there is a before legalize types reason that should also be listed.
  llvm-svn: 342851
* [LegalizeTypes] Fix bad indentation. NFC | Craig Topper | 2018-09-23 | 1 | -1/+1
  llvm-svn: 342850
* [X86] Remove unnecessary WriteRotate override. NFCI. | Simon Pilgrim | 2018-09-23 | 1 | -4/+2
  SNB was the last override for ROT(L|R)r(1|i) - they now all use WriteRotate correctly.
  llvm-svn: 342848
* Fix line ending mismatches. NFCI. | Simon Pilgrim | 2018-09-23 | 1 | -6/+6
  llvm-svn: 342847
* [X86] ROR*mCL instruction models should match ROL*mCL etc. | Simon Pilgrim | 2018-09-23 | 4 | -28/+4
  Confirmed with Craig Topper - fix a typo that was missing a Port4 uop for ROR*mCL instructions on some Intel models.
  Yet another step on the scheduler model cleanup marathon...
  llvm-svn: 342846
* [Aarch64] Fix memcpy that was copying 4x too many bytes | Benjamin Kramer | 2018-09-23 | 1 | -1/+1
  Found by asan.
  llvm-svn: 342845
* [DAGCombiner][x86] extend decompose of integer multiply into shift/add with negation | Sanjay Patel | 2018-09-23 | 2 | -7/+16
  This is an alternative to https://reviews.llvm.org/D37896. We can't decompose multiplies generically without a target hook to tell us when it's profitable.
  ARM and AArch64 may be able to remove some existing code that overlaps with this transform.
  This extends D52195 and may resolve PR34474: https://bugs.llvm.org/show_bug.cgi?id=34474 (still an open question about transforming legal vector multiplies, but we could open another bug report for those)
  llvm-svn: 342844
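The arithmetic behind decomposing a multiply into shift/add with negation can be checked with a small C++ sketch. The exact constants handled by the patch are not listed in the commit message, so the two shapes below, multiplication by -(2^n - 1) and by -(2^n + 1), are an assumption chosen for illustration, and the function names are hypothetical. Unsigned arithmetic is used so that wraparound is well defined.

```cpp
#include <cassert>
#include <cstdint>

// x * -(2^n - 1) == x - (x << n)      (one shift, one subtract)
constexpr uint32_t mulByNegPow2Minus1(uint32_t x, unsigned n) {
  return x - (x << n);
}

// x * -(2^n + 1) == -((x << n) + x)   (one shift, one add, one negate)
constexpr uint32_t mulByNegPow2Plus1(uint32_t x, unsigned n) {
  return 0u - ((x << n) + x);
}

// Compare both decompositions against a real multiply for every valid
// shift amount. All arithmetic is modulo 2^32.
inline void checkDecomposition(uint32_t x) {
  for (unsigned n = 1; n < 32; ++n) {
    const uint32_t pow2 = uint32_t{1} << n;
    assert(mulByNegPow2Minus1(x, n) == x * (0u - (pow2 - 1u)));
    assert(mulByNegPow2Plus1(x, n) == x * (0u - (pow2 + 1u)));
  }
}

static_assert(mulByNegPow2Minus1(5u, 3) == 5u * static_cast<uint32_t>(-7),
              "5 * -7 == 5 - (5 << 3)");
static_assert(mulByNegPow2Plus1(5u, 3) == 5u * static_cast<uint32_t>(-9),
              "5 * -9 == -((5 << 3) + 5)");
```

Whether such a sequence is cheaper than the multiply depends on the target, which is why the commit message ties the transform to a target hook.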
* [X86] Added missing RCL/RCR schedule overrides to the generic SNB model | Simon Pilgrim | 2018-09-23 | 1 | -0/+24
  The SandyBridge model was missing schedule values for the RCL/RCR instructions - instead using the (incredibly optimistic) WriteShift (now WriteRotate) defaults, which is necessary to change before WriteRotate can be updated.
  I've added overrides with more realistic (slow) values, based on a mixture of Agner/instlatx64 numbers and what later Intel models do as well.
  This is necessary to allow WriteRotate to be updated to remove other rotate overrides.
  It'd probably be a good idea to investigate a WriteRotateCarry class at some point but it's not high priority given the unusualness of these instructions.
  llvm-svn: 342842
* [X86] Remove unnecessary WriteRotate overrides. NFCI. | Simon Pilgrim | 2018-09-23 | 4 | -26/+6
  llvm-svn: 342841
* [X86] Move RORX instructions back to WriteShift schedule class | Simon Pilgrim | 2018-09-23 | 1 | -2/+4
  Despite being rotates, these more modern instructions avoid many of the quirks of the regular x86 rotate instructions and consistently have a schedule closer to shifts.
  llvm-svn: 342839
* [X86] Add WriteRotate schedule class, splitting off from WriteShift. | Simon Pilgrim | 2018-09-23 | 11 | -16/+27
  NFCI for now, but it should make it easier to remove a lot of unnecessary overrides in a future commit.
  Now that funnel shift intrinsics are coming online we need to get this cleaned up to make vectorization costs from scalar rotate patterns more straightforward.
  llvm-svn: 342837
* [WholeProgramDevirt] Don't process declarations when building type id map | Eugene Leviant | 2018-09-23 | 1 | -1/+1
  Differential revision: https://reviews.llvm.org/D52175
  llvm-svn: 342836
* Build PassBuilder.cpp with /bigobj to try and appease MSVC EXPENSIVE_CHECKS buildbot | Simon Pilgrim | 2018-09-23 | 1 | -0/+4
  llvm-svn: 342835
* [X86] Add isel pattern for (v8i16 (sext (v8i1))) with DQI and no BWI. | Craig Topper | 2018-09-23 | 1 | -0/+5
  Our lowering that tries to avoid this sign extend can be defeated by the DAG combine folding it with a truncate.
  The pattern needs to extend to a v8i32 then truncate back down to v8i16.
  llvm-svn: 342830
* [X86] Fix a few typos in comments. | Craig Topper | 2018-09-23 | 1 | -2/+2
  llvm-svn: 342829
* [ORC] Update ORC C bindings to use the new llvm::Error C API. | Lang Hames | 2018-09-23 | 2 | -145/+138
  This replaces instances of the LLVMOrcErrorCode type with LLVMErrorRef, simplifying the implementation of the OrcCBindingsStack class and ORC C API bindings and making it possible to return arbitrary (wrapped) llvm::Errors.
  llvm-svn: 342828
* [DAGCombiner] Simplify some code in visitBITCAST. NFCI | Craig Topper | 2018-09-22 | 1 | -9/+3
  llvm-svn: 342826
* [AArch64] Support adding X[8-15,18] registers as CSRs. | Tri Vo | 2018-09-22 | 9 | -7/+84
  Summary: Specifying X[8-15,18] registers as callee-saved is used to support CONFIG_ARM64_LSE_ATOMICS in Linux kernel. As part of this patch we:
  - use custom CSR list/mask when user specifies custom CSRs
  - update Machine Register Info's list of CSRs with additional custom CSRs in LowerCall and LowerFormalArguments.
  Reviewers: srhines, nickdesaulniers, efriedma, javed.absar
  Reviewed By: nickdesaulniers
  Subscribers: kristof.beyls, jfb, llvm-commits
  Differential Revision: https://reviews.llvm.org/D52216
  llvm-svn: 342824
* [DAGCombiner] Rewrite r331896 in a different way to address a FIXME. NFCI | Craig Topper | 2018-09-22 | 1 | -11/+14
  llvm-svn: 342809
* [InstCombine][x86] try even harder to convert blendv intrinsic to generic IR (PR38814) | Sanjay Patel | 2018-09-22 | 1 | -7/+20
  Follow-up to rL342324 (D52059): Missing optimizations with blendv are shown in: https://bugs.llvm.org/show_bug.cgi?id=38814
  This is an easier and more powerful solution than adding pattern matching for a few special cases in the backend. The potential danger with this transform in IR is that the condition value can get separated from the select, and the backend might not be able to make a blendv out of it again.
  llvm-svn: 342806
* [lib/MC] - Set SHF_EXCLUDE flag for .dwo sections. | George Rimar | 2018-09-22 | 1 | -11/+11
  DWARF5 spec says about single file split case: "The sections that do not require relocation, however, can be written to the relocatable object (.o) file but ignored by the linker or they can be written to a separate DWARF object (.dwo) file that need not be accessed by the linker."
  A nice way to make the linker ignore them is to set the SHF_EXCLUDE flag. It seems to be not harmful to always set it for .dwo sections. That is what this patch does.
  Differential revision: https://reviews.llvm.org/D52303
  llvm-svn: 342800
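The flag handling described here can be sketched in a few lines of C++. This is an illustrative model under stated assumptions, not the lib/MC implementation: `isDwoSection` and `sectionFlags` are hypothetical helpers, and treating "name ends with .dwo" as the selection rule is an assumption based on the commit title. The numeric value of SHF_EXCLUDE (0x80000000) is the one used in ELF headers.

```cpp
#include <cassert>  // used by the usage checks that accompany this sketch
#include <cstdint>
#include <string>

// ELF section flag telling the linker to exclude the section from its
// output. The constant name is local to this sketch.
constexpr uint64_t kShfExclude = 0x80000000ULL;

// Hypothetical helper: treat any section whose name ends in ".dwo" as a
// split-DWARF section.
inline bool isDwoSection(const std::string &name) {
  const std::string suffix = ".dwo";
  return name.size() >= suffix.size() &&
         name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper: add SHF_EXCLUDE to the flags of .dwo sections and
// leave every other section's flags untouched.
inline uint64_t sectionFlags(const std::string &name, uint64_t baseFlags) {
  return isDwoSection(name) ? (baseFlags | kShfExclude) : baseFlags;
}
```

Under this model `sectionFlags(".debug_info.dwo", 0)` has the exclude bit set, while `sectionFlags(".debug_info", 0)` does not.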
* [mips] Provide more detailed description for MIPS targets. NFC | Simon Atanasyan | 2018-09-22 | 1 | -4/+5
  llvm-svn: 342799
* [mips] Remove obsoleted "experimental" tag from MIPS 64-bit targets. NFC | Simon Atanasyan | 2018-09-22 | 1 | -2/+2
  llvm-svn: 342798
* [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible | Craig Topper | 2018-09-22 | 1 | -10/+36
  Summary: This restores the combine that was reverted in r341883. The infinite loop from the failing test no longer occurs due to changes from r342163.
  Reviewers: spatel, dmgreen
  Reviewed By: spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D52070
  llvm-svn: 342797
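The identity behind this fold can be verified directly: bitwise NOT reverses integer ordering, so taking the minimum of `~X` and `Y` equals inverting the maximum of `X` and `~Y`. The C++ sketch below checks the signed min case; the function names are hypothetical and the code is an illustration of the algebra, not of the InstCombine implementation. It assumes C++14 or later so that `std::min`/`std::max` are usable in constant expressions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Left-hand side of the fold: min(~X, Y).
constexpr int32_t minOfNot(int32_t x, int32_t y) { return std::min(~x, y); }

// Right-hand side of the fold: ~max(X, ~Y).
constexpr int32_t notOfMax(int32_t x, int32_t y) { return ~std::max(x, ~y); }

static_assert(minOfNot(7, -3) == notOfMax(7, -3), "identity holds");
static_assert(minOfNot(-100, 50) == notOfMax(-100, 50), "identity holds");

// Exhaustively compare both sides over every pair of 8-bit signed values.
inline void checkAllInt8Pairs() {
  for (int x = -128; x <= 127; ++x)
    for (int y = -128; y <= 127; ++y)
      assert(minOfNot(x, y) == notOfMax(x, y));
}
```

The rewritten form only pays off when `~Y` costs nothing extra, which matches the "freely invertible" condition in the commit title.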
* [X86] Fix inline expansion for memset in x32 | Craig Topper | 2018-09-22 | 2 | -23/+36
  Summary: Similar to D51893 which was for memcpy
  Reviewers: efriedma
  Reviewed By: efriedma
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D52063
  llvm-svn: 342796
* [X86] Fold (movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C)) for vXi8 vectors. | Craig Topper | 2018-09-22 | 1 | -8/+15
  We don't have a vXi8 shift left so we need to bitcast to a vXi16 vector to perform the shift. If we let lowering legalize the vXi8 shift we get an extra AND that we don't need and fail to remove.
  llvm-svn: 342795
* [X86] Teach fast isel to use MOV32ri64 for loading an unsigned 32 immediate into a 64-bit register. | Craig Topper | 2018-09-21 | 1 | -9/+1
  Previously we used SUBREG_TO_REG+MOV32ri. But regular isel was changed recently to use the MOV32ri64 pseudo. Fast isel now does the same.
  llvm-svn: 342788
* [Loop Vectorizer] Abandon vectorization when no integer IV found | Warren Ristow | 2018-09-21 | 2 | -0/+5
  Support for vectorizing loops with secondary floating-point induction variables was added in r276554. A primary integer IV is still required for vectorization to be done. If an FP IV was found, but no integer IV was found at all (primary or secondary), the attempt to vectorize still went forward, causing a compiler-crash. This change abandons that attempt when no integer IV is found. (Vectorizing FP-only cases like this, rather than bailing out, is discussed as possible future work in D52327.)
  See PR38800 for more information.
  Differential Revision: https://reviews.llvm.org/D52327
  llvm-svn: 342786
* Add missing include. | Zachary Turner | 2018-09-21 | 1 | -0/+1
  llvm-svn: 342781
* [NativePDB] Add support for reading function signatures. | Zachary Turner | 2018-09-21 | 7 | -30/+248
  This adds support for parsing function signature records and returning them through the native DIA interface.
  llvm-svn: 342780
* [PDB] Add native reading support for UDT / class types. | Zachary Turner | 2018-09-21 | 10 | -131/+431
  This allows the native reader to find records of class/struct/union type and dump them. This behavior is tested by using the diadump subcommand against golden output produced by actual DIA SDK on the same PDB file, and again using pretty -native to confirm that we actually dump the classes. We don't find class members or anything like that yet, for now it's just the class itself.
  llvm-svn: 342779
* [WebAssembly] Simplified selecting asmmatcher stack instructions. | Wouter van Oortmerssen | 2018-09-21 | 2 | -0/+3
  Summary: By using the existing isCodeGenOnly bit in the tablegen defs, as suggested by tlively in https://reviews.llvm.org/D51662
  Tested: llvm-lit -v `find test -name WebAssembly`
  Reviewers: tlively
  Subscribers: dschuff, sbc100, jgravelle-google, aheejin, sunfish, llvm-commits
  Differential Revision: https://reviews.llvm.org/D52373
  llvm-svn: 342772