path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [SimplifyLibCalls] Fix memchr expansion for constant strings. | Eli Friedman | 2019-01-09 | 1 | -1/+4
  The C standard says "The memchr function locates the first occurrence of c (converted to an unsigned char)[...]". The expansion was missing the conversion to unsigned char.
  Fixes https://bugs.llvm.org/show_bug.cgi?id=39041 .
  Differential Revision: https://reviews.llvm.org/D55947
  llvm-svn: 350775
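The conversion quoted from the C standard can be illustrated with a small stand-alone sketch. This is not the SimplifyLibCalls expansion itself; `refMemchr` and `memchrNoConversion` are hypothetical helpers written only to show why the search value has to be narrowed to unsigned char before comparing.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reference implementation (not LLVM code): the search value
// is converted to unsigned char, as the C standard requires for memchr.
inline const void *refMemchr(const void *s, int c, std::size_t n) {
  const unsigned char *p = static_cast<const unsigned char *>(s);
  const unsigned char target = static_cast<unsigned char>(c);
  for (std::size_t i = 0; i < n; ++i)
    if (p[i] == target)
      return p + i;
  return nullptr;
}

// Hypothetical buggy variant for contrast: compares each byte against the
// unconverted int, so values outside 0..255 never match.
inline const void *memchrNoConversion(const void *s, int c, std::size_t n) {
  const unsigned char *p = static_cast<const unsigned char *>(s);
  for (std::size_t i = 0; i < n; ++i)
    if (static_cast<int>(p[i]) == c)
      return p + i;
  return nullptr;
}
```

With a buffer holding "abc", searching for `'b' + 256` finds the second byte with `refMemchr` but nothing with `memchrNoConversion`, since only the former reduces the value modulo 256.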
* Don't require a null terminator when loading objects | David Major | 2019-01-09 | 1 | -1/+2
  When a null terminator is required and the file size is a multiple of the system page size, MemoryBuffer will prefer pread() over mmap(), which can result in excessive memory usage.
  Patch by Mike Hommey!
  Differential Revision: https://reviews.llvm.org/D56475
  llvm-svn: 350774
* [WebAssembly] Print a debug message at the start of each pass | Heejin Ahn | 2019-01-09 | 10 | -3/+28
  Summary: Many passes print their pass description as a debug message at the start of the pass, so this adds the same message to the other (mostly newly added) passes as well.
  Reviewers: dschuff
  Subscribers: jgravelle-google, sbc100, sunfish, llvm-commits
  Differential Revision: https://reviews.llvm.org/D56142
  llvm-svn: 350771
* Refactor synthetic profile count computation. NFC. | Easwaran Raman | 2019-01-09 | 3 | -37/+41
  Summary: Instead of using two separate callbacks to return the entry count and the relative block frequency, use a single callback to return callsite count. This would allow better supporting hybrid mode in the future as the count of callsite need not always be derived from entry count (as in sample PGO).
  Reviewers: davidxl
  Subscribers: mehdi_amini, steven_wu, dexonsmith, dang, llvm-commits
  Differential Revision: https://reviews.llvm.org/D56464
  llvm-svn: 350755
* [CodeGen] Ignore return sext/zext attributes of unused results for tail calls | Francis Visoiu Mistrih | 2019-01-09 | 1 | -0/+15
  If the caller's return type does not have a zeroext attribute but the callee does a tail call zeroext, we won't consider the tail call during CodeGenPrepare because the attributes don't match.
  However, if the result of the tail call has no uses, it makes sense to drop the sext/zext attributes.
  Differential Revision: https://reviews.llvm.org/D56486
  llvm-svn: 350753
* [Inliner] Assert that the computed inline threshold is non-negative. | Easwaran Raman | 2019-01-09 | 1 | -0/+7
  Reviewers: chandlerc
  Subscribers: haicheng, llvm-commits
  Differential Revision: https://reviews.llvm.org/D56409
  llvm-svn: 350751
* refactor BlockFrequencyInfo::view to take a title parameter | David Callahan | 2019-01-09 | 1 | -2/+2
  Summary: Allow a non-default title for this debugging aid.
  Reviewers: twoh, Kader, modocache
  Reviewed By: twoh
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D56499
  llvm-svn: 350749
* [WebAssembly] Standardize order of SIMD bitselect arguments | Thomas Lively | 2019-01-09 | 1 | -1/+1
  Summary: For some reason the backend assumed that the condition mask would be the first argument to the LLVM intrinsic, but everywhere else the condition mask is the third argument.
  Reviewers: aheejin
  Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits
  Differential Revision: https://reviews.llvm.org/D56412
  llvm-svn: 350746
* [mips][micromips] Emit 16bit NOPs by default | Aleksandar Beserminji | 2019-01-09 | 2 | -6/+19
  Emit 16bit NOPs by default. Use 32bit NOPs in delay slots where necessary.
  Differential Revision: https://reviews.llvm.org/D55323
  llvm-svn: 350733
* Revert "[AMDGPU] Fix DPP combiner" | Valery Pykhtin | 2019-01-09 | 3 | -171/+68
  This reverts commit e3e2923a39cbec3b3bc3a7d3f0e9a77a4115080e, svn revision rL350721
  llvm-svn: 350730
* Initial AArch64 SLH implementation. | Kristof Beyls | 2019-01-09 | 3 | -14/+300
  This is an initial implementation for Speculative Load Hardening for AArch64. It builds on top of the recently introduced AArch64SpeculationHardening pass.
  This doesn't implement (yet) some of the optimizations implemented for the X86SpeculativeLoadHardening pass. I thought introducing the optimizations incrementally in follow-up patches should make this easier to review.
  Differential Revision: https://reviews.llvm.org/D55929
  llvm-svn: 350729
* [AMDGPU] Fix DPP combiner | Valery Pykhtin | 2019-01-09 | 3 | -68/+171
  Fixed issue with identity values and other cases; f32/f16 identity values to be added later. fma/mac instructions are disabled for now.
  Test is fully reworked, added comments.
  Other fixes:
  1. dpp move with uses and old reg initializer should be in the same BB.
  2. bound_ctrl:0 is only considered when bank_mask and row_mask are fully enabled (0xF). Otherwise the old register value is checked for identity.
  3. Added add, subrev, and, or instructions to the old folding function.
  4. Kill flag is cleared for the src0 (DPP register) as it may be copied into more than one user.
  Differential revision: https://reviews.llvm.org/D55444
  llvm-svn: 350721
* Revert r350647: "[NewPM] Port tsan" | Florian Hahn | 2019-01-09 | 4 | -60/+41
  This patch breaks thread sanitizer on some macOS builders, e.g. http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/52725/
  llvm-svn: 350719
* [X86] Enable combining shuffles to PACKSS/PACKUS for 256/512-bit vectors | Simon Pilgrim | 2019-01-09 | 1 | -3/+4
  llvm-svn: 350716
* [MSP430] Optimize 'shl x, 8[+ N] -> swpb(zext(x)) [<< N]' for i16 | Anton Korobeynikov | 2019-01-09 | 1 | -7/+18
  Perform additional simplification to reduce shift amount.
  Patch by Kristina Bessonova!
  Differential Revision: https://reviews.llvm.org/D56016
  llvm-svn: 350712
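The identity in the subject line can be checked with a short model of 16-bit arithmetic. This is only a sketch under the assumption that `zext(x)` means zero-extending the low byte of `x` and that `swpb` swaps the two bytes of a 16-bit word; the function names are made up for illustration and do not come from the MSP430 backend.

```cpp
#include <cassert>
#include <cstdint>

// Model of a 16-bit byte swap (what the subject line calls swpb).
constexpr std::uint16_t swpb(std::uint16_t v) {
  return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

// Rewritten form: keep the low byte, swap bytes, then shift by the
// remaining amount n (assumed to be in the range 0..7).
constexpr std::uint16_t shlViaSwpb(std::uint16_t x, unsigned n) {
  return static_cast<std::uint16_t>(swpb(x & 0x00FF) << n);
}

// Original form: a plain 16-bit left shift by `amount` (8..15 here).
constexpr std::uint16_t shlDirect(std::uint16_t x, unsigned amount) {
  return static_cast<std::uint16_t>(static_cast<std::uint32_t>(x) << amount);
}

// Compile-time spot checks of the identity shl x, 8+N == swpb(zext(x)) << N.
static_assert(shlViaSwpb(0x1234, 0) == shlDirect(0x1234, 8), "N = 0");
static_assert(shlViaSwpb(0x1234, 3) == shlDirect(0x1234, 11), "N = 3");
```

The point of the rewrite is that only a shift by N remains after the byte swap, instead of a shift by 8 + N.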
* [MSP430] Fix crash while lowering llvm.stacksave/stackrestore | Anton Korobeynikov | 2019-01-09 | 1 | -0/+2
  Perform the usual expansion of stacksave / restore intrinsics.
  Patch by Kristina Bessonova!
  Differential Revision: https://reviews.llvm.org/D54890
  llvm-svn: 350710
* [AArch64] Move feature predctrl to predres | Diogo N. Sampaio | 2019-01-09 | 5 | -9/+9
  Follow-up patch to rL350385, which added the predres command line option. This patch renames the feature to keep it aligned with the option passed by/to clang.
  Differential Revision: https://reviews.llvm.org/D56484
  llvm-svn: 350702
* [X86] Fix gcc7 -Wunused-but-set-variable warning. NFCI. | Simon Pilgrim | 2019-01-09 | 1 | -2/+0
  llvm-svn: 350701
* [DebugInfo] Omit location list entries with empty ranges | David Stenberg | 2019-01-09 | 1 | -0/+13
  Summary: This fixes PR39710. In that case we emitted a location list looking like this:

      .Ldebug_loc0:
        .quad .Lfunc_begin0-.Lfunc_begin0
        .quad .Lfunc_begin0-.Lfunc_begin0
        .short 1 # Loc expr size
        .byte 85 # DW_OP_reg5
        .quad .Lfunc_begin0-.Lfunc_begin0
        .quad .Lfunc_end0-.Lfunc_begin0
        .short 1 # Loc expr size
        .byte 85 # super-register DW_OP_reg5
        .quad 0
        .quad 0

  As seen, the first entry's beginning and ending addresses evaluate to 0, which meant that the entry inadvertently became an "end of list" entry, resulting in the location list ending sooner than expected.
  To fix this, omit all entries with empty ranges. Location list entries with empty ranges do not have any effect, as specified by DWARF, so we might as well drop them:
  "A location list entry (but not a base address selection or end of list entry) whose beginning and ending addresses are equal has no effect because the size of the range covered by such an entry is zero."
  Reviewers: davide, aprantl, dblaikie
  Reviewed By: aprantl
  Subscribers: javed.absar, JDevlieghere, llvm-commits
  Tags: #debug-info
  Differential Revision: https://reviews.llvm.org/D55919
  llvm-svn: 350698
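The failure mode and the fix described in this message can be modelled in a few lines. This is a sketch only: `LocEntry`, `dropEmptyRanges` and `entriesBeforeEndOfList` are hypothetical names, not the types or functions LLVM uses, and the reader model assumes the simple rule from the message that a pair of zero addresses ends the list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one location list entry: a half-open address
// range [Begin, End) plus the register that holds the variable.
struct LocEntry {
  std::uint64_t Begin;
  std::uint64_t End;
  unsigned Reg;
};

// The fix as described: entries whose range is empty have no effect, so
// they are simply not emitted.
inline std::vector<LocEntry> dropEmptyRanges(const std::vector<LocEntry> &In) {
  std::vector<LocEntry> Out;
  for (const LocEntry &E : In)
    if (E.Begin != E.End)
      Out.push_back(E);
  return Out;
}

// Model of a consumer: it stops at the first entry whose addresses are
// both zero, because that encoding marks the end of the list.
inline std::size_t entriesBeforeEndOfList(const std::vector<LocEntry> &L) {
  std::size_t N = 0;
  for (const LocEntry &E : L) {
    if (E.Begin == 0 && E.End == 0)
      break;
    ++N;
  }
  return N;
}
```

For a list shaped like the one in the message, an empty entry at offset 0 followed by a real entry covering the function, the consumer model sees zero entries before the fix and one entry after `dropEmptyRanges` is applied.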
* GlobalISel: Implement fewerElements for implicit_def | Matt Arsenault | 2019-01-09 | 2 | -3/+41
  llvm-svn: 350697
* GlobalISel: Implement widenScalar for implicit_def | Matt Arsenault | 2019-01-09 | 1 | -0/+6
  llvm-svn: 350695
* [IPT] Drop cache less eagerly in GVN and LoopSafetyInfo | Max Kazantsev | 2019-01-09 | 4 | -15/+21
  The current strategy for dropping the `InstructionPrecedenceTracking` cache is to invalidate the entire basic block whenever we change its contents. In fact, `InstructionPrecedenceTracking` has 2 internal structures: `OrderedInstructions`, which needs to be invalidated whenever the contents change, and the map with the first special instructions in each block. This second map does not need an update if we add/remove a non-special instruction because that cannot affect the contents of this map.
  This patch changes the API of `InstructionPrecedenceTracking` so that it now accounts for the reasons for which we invalidate blocks. This should lead to far fewer recalculations of the map and should save us some compile time because in practice we don't typically add/remove special instructions.
  Differential Revision: https://reviews.llvm.org/D54462
  Reviewed By: efriedma
  llvm-svn: 350694
* Revert "[PowerPC] Fix assert from machine verify pass that unmatched register class about fcmp selection in fast-isel" | Zi Xuan Wu | 2019-01-09 | 1 | -20/+13
  This reverts commit r350685. See compile assert in compiler-rt.
  llvm-svn: 350693
* [NFC] fix trivial typos in comments | Hiroshi Inoue | 2019-01-09 | 7 | -10/+10
  llvm-svn: 350690
* [X86] Correct the MaskVT for avx512 gather/scatter intrinsics to use the min of the number of index and data elements. | Craig Topper | 2019-01-09 | 1 | -4/+7
  When the result type is v2i64/v2f64 and the index element size is i32, the index vector has two unused elements making the type v4i32. The mask VT should match the number of memory accesses that will be made. This is consistent with the isel patterns used for the target independent gather/scatter intrinsic.
  llvm-svn: 350687
* [PowerPC] Fix assert from machine verify pass that unmatched register class about fcmp selection in fast-isel | Zi Xuan Wu | 2019-01-09 | 1 | -13/+20
  The machine verifier reported:

      Bad machine code: Illegal virtual register for instruction
      function: TestULE
      basic block: %bb.0 entry (0x1000a39b158)
      instruction: %2:crrc = FCMPUD %1:vsfrc, %3:f8rc
      operand 1: %1:vsfrc

  Fix the assert about a mismatch between the fcmp instruction and its register class. We should use the VSX compare instruction xvcmpudp instead of fcmpu when VSX is enabled.
  Add the -verifymachineinstrs option to the related test cases to enable the verify pass.
  Differential Revision: https://reviews.llvm.org/D55686
  llvm-svn: 350685
* Remove check for single use in ShrinkDemandedConstant | Stanislav Mekhanoshin | 2019-01-09 | 2 | -5/+1
  This moves the check for single use from the general ShrinkDemandedConstant to the BE because of the AArch64 regression after D56289/rL350475.
  After several hours of experiments I did not come up with a testcase that fails on any other target if the check is not performed.
  Moreover, the direct call to ShrinkDemandedConstant is not really needed and is superseded by SimplifyDemandedBits.
  Differential Revision: https://reviews.llvm.org/D56406
  llvm-svn: 350684
* RegisterCoalescer: Assume CR_Replace for SubRangeJoin | Matt Arsenault | 2019-01-08 | 1 | -0/+6
  Currently it's possible for the following check on V.WriteLanes (which is not really meaningful during SubRangeJoin) to pass for one half of the pair, and then fall through to one of the impossible or unresolved states. This then fails as inconsistent on the other half.
  During the main range join, the check between V.WriteLanes and OtherV.ValidLanes must have passed, meaning this should be a CR_Replace.
  Fixes most of the testcases in bugs 39542 and 39602
  llvm-svn: 350678
* RegisterCoalescer: Defer clearing implicit_def lanes | Matt Arsenault | 2019-01-08 | 1 | -16/+33
  We can't go back and recover the lanes if it turns out the implicit_def really can't be erased.
  Assume all lanes are valid if an unresolved conflict is encountered. There aren't any tests where this seems to matter either way, but this seems like a safer option.
  Fixes bug 39602
  llvm-svn: 350676
* [InstCombine] canonicalize another raw IR rotate pattern to funnel shift | Sanjay Patel | 2019-01-08 | 2 | -3/+56
  This is matching the equivalent of the DAG expansion, so it should never end up with worse perf than the original code even if the target doesn't have a rotate instruction.
  llvm-svn: 350672
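The relationship between a hand-written rotate and a funnel shift can be sketched as follows. The log entry does not say which pattern the commit matches, so the masked-negate form below is an assumption chosen for illustration; `rotlMasked` and `fshl` are hypothetical helper names, with `fshl` modelled on the documented semantics of the `llvm.fshl` intrinsic for 32-bit operands.

```cpp
#include <cassert>
#include <cstdint>

// A common way to write rotate-left by hand: both shift amounts are masked
// to the bit width, so no shift is ever out of range (assumed pattern).
constexpr std::uint32_t rotlMasked(std::uint32_t x, std::uint32_t y) {
  return (x << (y & 31u)) | (x >> ((0u - y) & 31u));
}

// Model of a 32-bit funnel shift left: concatenate a (high) and b (low),
// shift the 64-bit value left by s modulo 32, and keep the high 32 bits.
constexpr std::uint32_t fshl(std::uint32_t a, std::uint32_t b, std::uint32_t s) {
  const std::uint64_t wide = (static_cast<std::uint64_t>(a) << 32) | b;
  return static_cast<std::uint32_t>((wide << (s % 32u)) >> 32);
}

// A rotate is a funnel shift whose two inputs are the same value.
static_assert(rotlMasked(0x80000001u, 1) == fshl(0x80000001u, 0x80000001u, 1),
              "rotate by 1");
static_assert(rotlMasked(0x12345678u, 0) == 0x12345678u, "rotate by 0");
```

Because the funnel-shift form has a single shift amount and a defined modulo behaviour, it gives later passes one canonical shape to recognize instead of several equivalent shift-and-or sequences.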
* [PGO] Use SourceFileName rather than module name in PGOFuncName | Rong Xu | 2019-01-08 | 1 | -5/+6
  In LTO or Thin-lto mode (through the linker plugin), the module names are temp file names, which differ between compilations. Using SourceFileName avoids the issue. This should not change any functionality for current PGO as all the current callers of getPGOFuncName() are before LTO.
  Differential Revision: https://reviews.llvm.org/D56327
  llvm-svn: 350671
* [WebAssembly] Rename StoreResults to MemIntrinsicResults | Heejin Ahn | 2019-01-08 | 5 | -38/+46
  Summary: The StoreResults pass does not optimize store instructions anymore because store instructions no longer return result values. The pass is now used solely for memory intrinsics, so update the pass name accordingly and fix outdated pass descriptions as well.
  This patch does not change any meaningful behavior, but it is not marked as NFC because it changes a comment check line in a test case.
  Reviewers: dschuff
  Subscribers: mgorny, sbc100, jgravelle-google, sunfiish, llvm-commits
  Differential Revision: https://reviews.llvm.org/D56093
  llvm-svn: 350669
* [PGO] Revert r350442 to fix commit message. | Rong Xu | 2019-01-08 | 1 | -6/+5
  Will re-commit it using the correct commit message.
  llvm-svn: 350667
* [AArch64] Adjust the cost model for Exynos | Evandro Menezes | 2019-01-08 | 3 | -56/+58
  Improve the modeling of ALU instructions.
  llvm-svn: 350663
* [llvm-mca] Improve debugging (NFC) | Evandro Menezes | 2019-01-08 | 2 | -0/+4
  llvm-svn: 350661
* [llvm-undname] Add support for demangling msvc's noexcept types. | Zachary Turner | 2019-01-08 | 2 | -3/+9
  Starting in C++17, MSVC introduced a new mangling for function parameters that are themselves noexcept functions. This patch makes llvm-undname properly demangle them.
  Patch by Zachary Henkel
  Differential Revision: https://reviews.llvm.org/D55769
  llvm-svn: 350656
* Don't write #include "Windows/WindowsSupport.h" from the Windows dir. | Zachary Turner | 2019-01-08 | 1 | -1/+1
  This generates -Wnonportable-include-dir warnings, and doesn't need to be there. It seems this was just checked in by accident.
  llvm-svn: 350655
* Revert "Revert "Revert "Resubmit rL345008 "Split MachinePipeliner code into header and cpp files"""" | Adrian Prantl | 2019-01-08 | 1 | -5/+595
  This reverts commit D56084.
  llvm-svn: 350654
* [NewPM] Port tsan | Philip Pfaffe | 2019-01-08 | 4 | -41/+60
  A straightforward port of tsan to the new PM, following the same path as D55647.
  Differential Revision: https://reviews.llvm.org/D56433
  llvm-svn: 350647
* Rename DIFlagFixedEnum to DIFlagEnumClass. NFC | Paul Robinson | 2019-01-08 | 2 | -3/+3
  llvm-svn: 350641
* [UnrollRuntime] Fix domTree failures in multiexit unrolling | Anna Thomas | 2019-01-08 | 1 | -24/+24
  Summary: This fixes the IDom for exit blocks and all blocks reachable from the exit blocks, when runtime unrolling under multiexit/exiting case.
  We initially had a restrictive check that the IDom is only updated when it is the header of the loop. However, we also need to update the IDom to the correct one when the IDom is any block within the original loop.
  See added test cases (which fail dom tree verification without the patch).
  Reviewers: reames, mzolotukhin, mkazantsev, hfinkel
  Reviewed by: brzycki, kuhar
  Subscribers: zzheng, dmgreen, llvm-commits
  Differential Revision: https://reviews.llvm.org/D56284
  llvm-svn: 350640
* [BPF] Fix .BTF.ext reloc type assignment issue | Yonghong Song | 2019-01-08 | 1 | -2/+10
  Commit f1db33c5c1a9 ("[BPF] Disable relocation for .BTF.ext section") assigned relocation type R_BPF_NONE if the fixup type is FK_Data_4 and the symbol is temporary. The reason is we use FK_Data_4 as a fixup type for insn offsets in .BTF.ext section.
  Just checking whether the symbol is temporary is not enough. For example, .debug_info may reference some strings whose fixup is FK_Data_4 with a temporary symbol as well.
  To truly reflect the case for .BTF.ext section, this patch further checks that the section associated with the symbol must be SHF_ALLOC and SHF_EXECINSTR, i.e., in the text section. This fixes the above-mentioned problem.
  Signed-off-by: Yonghong Song <yhs@fb.com>
  llvm-svn: 350637
* [MachineVerifier] Include offending register in allocatable live-in error msg. | Florian Hahn | 2019-01-08 | 1 | -0/+6
  This patch adds a convenience report() method for physical registers and uses it to print the offending register with the 'MBB has allocatable live-in' error.
  Reviewers: MatzeB, rtereshin, dsanders
  Reviewed By: dsanders
  Differential Revision: https://reviews.llvm.org/D55946
  llvm-svn: 350630
* [GlobalISel] Fix choice of instruction selector for AArch64 at -O0 with -global-isel=0 | Petr Pavlu | 2019-01-08 | 1 | -12/+23
  Commit rL347861 introduced an unintentional change in the behaviour when compiling for AArch64 at -O0 with -global-isel=0. Previously, explicitly disabling GlobalISel resulted in using FastISel but an updated condition in the commit changed it to using SelectionDAG.
  The patch fixes this condition and slightly better organizes the code that chooses the instruction selector.
  Fixes PR40131.
  Differential Revision: https://reviews.llvm.org/D56266
  llvm-svn: 350626
* [DA][NewPM] Add a printerpass and port the testsuite | Philip Pfaffe | 2019-01-08 | 2 | -0/+8
  The new-pm version of DA is untested. Testing requires a printer, so add that and use it in the existing DA tests.
  Differential Revision: https://reviews.llvm.org/D56386
  llvm-svn: 350624
* [X86][Darwin] Emit compact-unwind for register-sized stack adjustments | Francis Visoiu Mistrih | 2019-01-08 | 1 | -10/+0
  For stack frames of the size of a register in x86, a code size optimization emits "push rax/eax" instead of "sub" for stack allocation.
  For example:

      foo:
        .cfi_startproc
      BB#0:
        pushq %rax
      Ltmp0:
        .cfi_def_cfa_offset 16
        ...
        .cfi_endproc

  However, we are falling back to DWARF in this case because we cannot encode %rax as a saved register.
  This requirement is wrong, since we don't care about the contents of %rax; it is the equivalent of a sub.
  In order to specify that we care about the contents of %rax, we would need a .cfi_offset %rax, <offset>.
  It's also overzealous in the case where there are pushes for callee saved registers followed by a "push rax/eax" instead of "sub", in which case we should also be able to encode the callee saved regs and everything else using compact unwind.
  Patch authored by Bruno Cardoso Lopes.
  Differential Revision: https://reviews.llvm.org/D13793
  llvm-svn: 350623
* Revert "Revert "Resubmit rL345008 "Split MachinePipeliner code into header and cpp files""" | Lama Saba | 2019-01-08 | 1 | -595/+5
  This reverts commit rL350497. The reported remaining issues seem to be unrelated to modules or this change.
  More info: https://reviews.llvm.org/D56084
  llvm-svn: 350621
* AArch64: avoid splitting vector truncating stores. | Tim Northover | 2019-01-08 | 1 | -0/+11
  We have code to split vector splats (of zero and non-zero) for performance reasons, but it ignores the fact that a store might be truncating.
  Actually, truncating stores are formed for vNi8 and vNi16 types. Since the truncation is from a legal type, the size of the store is always <= 64-bits and so they don't actually benefit from being split up anyway, so this patch just disables that transformation.
  llvm-svn: 350620
* [GlobalISel] Fix unused variable warning in Release builds. | Benjamin Kramer | 2019-01-08 | 1 | -3/+3
  llvm-svn: 350618
* [ARM] Add missing patterns for DSP muls | Sam Parker | 2019-01-08 | 2 | -80/+60
  Using a PatLeaf for sext_16_node allowed matching smulbb and smlabb instructions once the operands had been sign extended. But we also need to use sext_inreg operands along with sext_16_node to catch a few more cases that enable us to remove the unnecessary sxth.
  Differential Revision: https://reviews.llvm.org/D55992
  llvm-svn: 350613