path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [opaque pointer types] Remove some calls to generic Type subtype accessors.James Y Knight2019-01-1012-59/+47
| | | | | | | | | | | | That is, remove many of the calls to Type::getNumContainedTypes(), Type::subtypes(), and Type::getContainedType(N). I'm not intending to remove these accessors -- they are useful/necessary in some cases. However, removing the pointee type from pointers would potentially break some uses, and reducing the number of calls makes it easier to audit. llvm-svn: 350835
* [RISCV][MC] Add support for evaluating constant symbols as immediatesAlex Bradbury2019-01-101-8/+1
| This further improves compatibility with GNU as, allowing input such as the following to be assembled:
|
|     .equ CONST, 0x123456
|     li a0, CONST
|     addi a0, a0, %lo(CONST)
|
|     .equ CONST, 1
|     slli a0, a0, CONST
|
| Note that we don't have perfect compatibility with gas, as it will avoid emitting a relocation in this case:
|
|     addi a0, a0, %lo(CONST2)
|     .equ CONST2, 0x123456
|
| Thanks to Shiva Chen for suggesting a better way to approach this during review.
|
| Differential Revision: https://reviews.llvm.org/D52298
| llvm-svn: 350831
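To illustrate what `%lo(CONST)` evaluates to once `CONST` is a known constant, here is a minimal sketch of the usual RISC-V split of a 32-bit value into an upper 20-bit part and a sign-extended lower 12-bit part. The function names `hi20` and `lo12` are hypothetical and are not taken from the patch; the sketch only models the arithmetic, not the assembler's relocation handling.

```python
def lo12(value: int) -> int:
    """Low 12 bits of value, sign-extended, as an addi immediate expects."""
    low = value & 0xFFF
    return low - 0x1000 if low >= 0x800 else low


def hi20(value: int) -> int:
    """Upper 20 bits, rounded so that (hi20 << 12) + lo12 rebuilds the value."""
    return ((value + 0x800) >> 12) & 0xFFFFF


CONST = 0x123456
# The two parts recombine to the original 32-bit constant.
rebuilt = ((hi20(CONST) << 12) + lo12(CONST)) & 0xFFFFFFFF
```

The `+ 0x800` rounding compensates for the sign extension of the low part, which is why a value such as `0xFFF` splits into an upper part of 1 and a lower part of -1.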
* [x86] fix remaining miscompile bug in horizontal binop matching (PR40243)Sanjay Patel2019-01-101-5/+8
| | | | | | | | | | | When we use the partial-matching function on a 128-bit chunk, we must account for the possibility that we've matched undef halves of the original source vectors, so the outputs may need to be reset. This should allow closing PR40243: https://bugs.llvm.org/show_bug.cgi?id=40243 llvm-svn: 350830
* [x86] fix horizontal binop matching for 256-bit vectors (PR40243)Sanjay Patel2019-01-101-70/+169
| | | | | | | | | | | | | | | | | This is a partial fix for: https://bugs.llvm.org/show_bug.cgi?id=40243 ...as seen in the integer test, we still need to correct the result when using the existing (old) horizontal op matching function because it does not model the way x86 256-bit horizontal ops return results (each 128-bit half is its own horizontal-op). A potential follow-up change for that is discussed in the bug report - see also D56490. This generally duplicates a lot of the existing matching code, but we can't just remove that without introducing regressions, so the existing code is renamed and used less often. Follow-ups may try to reduce that overlap. Differential Revision: https://reviews.llvm.org/D56450 llvm-svn: 350826
* [AArch64] Fix operation actions for FP16 vector intrinsicsBryan Chan2019-01-101-20/+22
| Summary:
| This patch changes the legalization action for some half-precision floating-point vector intrinsics (FSIN, FLOG, etc.) from Promote to Expand. These ops are not supported in hardware for half-precision vectors, but promotion is not always possible (for v8f16 operands). Changing the action to Expand fixes an assertion failure in the legalizer when the frontend produces such ops. In addition, a quick microbenchmark shows that, in the v4f16 case, expanding introduces fewer spills and is therefore slightly faster than promoting.
|
| Reviewers: t.p.northover, SjoerdMeijer
| Reviewed By: SjoerdMeijer
| Subscribers: javed.absar, kristof.beyls, llvm-commits
|
| Differential Revision: https://reviews.llvm.org/D56296
| llvm-svn: 350825
* [MCA] Fix wrong definition of ResourceUnitMask in DefaultResourceStrategy.Andrea Di Biagio2019-01-105-15/+36
| | | | | | | | | | | | | | Field ResourceUnitMask was incorrectly defined as a 'const unsigned' mask. It should have been a 64 bit quantity instead. That means, ResourceUnitMask was always implicitly truncated to a 32 bit quantity. This issue has been found by inspection. Surprisingly, that bug was latent, and it never negatively affected any existing upstream targets. This patch fixes the wrong definition of ResourceUnitMask, and adds a bunch of extra debug prints to help debugging potential issues related to invalid processor resource masks. llvm-svn: 350820
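The truncation described above can be sketched in a few lines. This assumes that `unsigned` is 32 bits wide on the host, which is the case the commit message describes; the helper name `store_in_unsigned32` is hypothetical.

```python
def store_in_unsigned32(mask: int) -> int:
    """Model assigning a 64-bit mask to a 32-bit unsigned field."""
    return mask & 0xFFFFFFFF


# A mask whose only set bit is above bit 31 is lost entirely.
high_resource_mask = 1 << 40
truncated = store_in_unsigned32(high_resource_mask)

# A mask that fits in 32 bits is unaffected, which is consistent with the
# bug staying latent while no target used the upper bits.
low_resource_mask = 0x0000F00F
unchanged = store_in_unsigned32(low_resource_mask)
```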
* [ARM] Fix for verifier buildbotSam Parker2019-01-101-5/+5
| | | | | | | Copy the MachineOperand first and then change the flags instead of making a copy. llvm-svn: 350811
* [LoopUnroll] add parsing for unroll parameters in -passes pipelineFedor Sergeev2019-01-102-2/+104
| Allow specifying loop unrolling with optional parameters explicitly spelled out in the -passes pipeline specification. This introduces a somewhat generic way of specifying parameter parsing via FUNCTION_PASS_PARAMETRIZED pass registration.
|
| Syntax of the parametrized unroll pass name is as follows:
|     'unroll<' parameter-list '>'
|
| Where parameter-list is a ';'-separated list of parameter names and optlevel:
|     optlevel:  'O[0-3]'
|     parameter: { 'partial' | 'peeling' | 'runtime' | 'upperbound' }
|     negated:   'no-' parameter
|
| Example:
|     -passes=loop(unroll<O3;runtime;no-upperbound>)
|
| This invokes LoopUnrollPass configured with OptLevel=3, Runtime, no UpperBound, everything else by default.
|
| llvm-svn: 350808
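The grammar above is small enough to model directly. The following is only an illustrative sketch of that grammar, not the parser added by the patch; the function name `parse_unroll_pass` and the dictionary layout of the result are assumptions made for the example.

```python
import re

# Parameter names listed in the commit message.
PARAMETERS = {"partial", "peeling", "runtime", "upperbound"}


def parse_unroll_pass(name: str) -> dict:
    """Parse 'unroll<...>' into a dict of option name -> value."""
    match = re.fullmatch(r"unroll<(.*)>", name)
    if match is None:
        raise ValueError(f"not a parametrized unroll pass: {name!r}")

    options = {}
    for item in match.group(1).split(";"):
        if re.fullmatch(r"O[0-3]", item):
            options["optlevel"] = int(item[1])
            continue
        enabled = not item.startswith("no-")
        parameter = item if enabled else item[len("no-"):]
        if parameter not in PARAMETERS:
            raise ValueError(f"unknown unroll parameter: {item!r}")
        options[parameter] = enabled
    return options


example = parse_unroll_pass("unroll<O3;runtime;no-upperbound>")
```

Parameters that are not mentioned are simply absent from the result, mirroring the "everything else by default" wording of the commit message.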
* [ARM] Size reduce teq to eorsSam Parker2019-01-101-3/+29
| Add t2TEQrr to the map of instructions which can be reduced down into a T1 instruction. This is a special case because TEQ just sets the CPSR and doesn't write to a GPR, which is not the case for EOR. So, we need to ensure that the EOR can write to the first operand.
|
| Differential Revision: https://reviews.llvm.org/D56255
| llvm-svn: 350801
* [X86] Disable DomainReassignment pass when AVX512BW is disabled to avoid injecting VK32/VK64 references into the MachineIRCraig Topper2019-01-101-1/+4
| Summary:
| This pass replaces GR8/GR16/GR32/GR64 with their equivalent sized mask register classes. But VK32/VK64 aren't legal without AVX512BW. Apparently this mostly works if the register coalescer is able to remove the VK32/VK64 register class reference, or if we don't ever spill it. But there's no guarantee of that.
|
| Another Intel employee managed to trigger a crash due to this with ISPC. Unfortunately, I've lost the test case he sent me at the time. I'm trying to get him to reproduce it for me. I'd like to get this in before 8.0 branches since it's a little scary.
|
| The regressions here are unfortunate, but I think we can make some improvements to DAG combine, load folding, etc. to fix them. Just not sure if we can get that done for 8.0.
|
| Fixes PR39741
|
| Reviewers: RKSimon, spatel
| Reviewed By: RKSimon
| Subscribers: llvm-commits
|
| Differential Revision: https://reviews.llvm.org/D56460
| llvm-svn: 350800
* Recommit "[PowerPC] Fix assert from machine verify pass that unmatched register class about fcmp selection in fast-isel"Zi Xuan Wu2019-01-101-13/+24
| This re-commits r350685.
|
| Differential Revision: https://reviews.llvm.org/D55686
| llvm-svn: 350799
* [AArch64] Emit the correct MCExpr relocations specifiers like VK_ABS_G0, etcMandeep Singh Grang2019-01-103-4/+40
| | | | | | | | | | | | | | | | Summary: D55896 and D56029 add support to emit fixups for :abs_g0: , :abs_g1_s: , etc. This patch adds the necessary enums and MCExpr needed for lowering these. Reviewers: rnk, mstorsjo, efriedma Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D56037 llvm-svn: 350798
* Revert "[WebAssembly] Add simd128-unimplemented subtarget feature"Thomas Lively2019-01-107-42/+20
| | | | | | This reverts rL350791. llvm-svn: 350795
* [AMDGPU] Separate feature dot-instsStanislav Mekhanoshin2019-01-105-6/+22
| | | | | | Differential Revision: https://reviews.llvm.org/D56524 llvm-svn: 350793
* [WebAssembly] Add simd128-unimplemented subtarget featureThomas Lively2019-01-107-20/+42
| | | | | | | | | | This is a second attempt at r350778, which was reverted in r350789. The only change is that the unimplemented-simd128 feature has been renamed simd128-unimplemented, since naming it unimplemented-simd128 somehow made the simd128 feature flag enable the unimplemented-simd128 feature on Windows. llvm-svn: 350791
* Revert "[WebAssembly] Add unimplemented-simd128 subtarget feature"Thomas Lively2019-01-107-42/+20
| This reverts rL350778.
|
| llvm-svn: 350789
* [X86] After turning VSELECT into SHRUNKBLEND, make sure we push the VSELECT into the worklist so it can be deleted.Craig Topper2019-01-101-0/+1
| Found while trying to figure out why my second version of D56421 worked better than the first version. We weren't deleting the vselect in a timely fashion and that caused SimplifyDemandedBits to see an additional user.
|
| The new version doesn't have this problem so this fix isn't needed there, but it seemed like the right thing to do.
|
| llvm-svn: 350781
* [WebAssembly] Add unimplemented-simd128 subtarget featureThomas Lively2019-01-097-20/+42
| | | | | | | | | | | | | | | Summary: This replaces the old ad-hoc -wasm-enable-unimplemented-simd flag. Also makes the new unimplemented-simd128 feature imply the simd128 feature. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits, alexcrichton Differential Revision: https://reviews.llvm.org/D56501 llvm-svn: 350778
* [llvm-mca] Display masks in hexEvandro Menezes2019-01-092-5/+6
| | | | | | Display the resources masks as hexadecimal. Otherwise, NFC. llvm-svn: 350777
* [SimplifyLibCalls] Fix memchr expansion for constant strings.Eli Friedman2019-01-091-1/+4
| | | | | | | | | | | | The C standard says "The memchr function locates the first occurrence of c (converted to an unsigned char)[...]". The expansion was missing the conversion to unsigned char. Fixes https://bugs.llvm.org/show_bug.cgi?id=39041 . Differential Revision: https://reviews.llvm.org/D55947 llvm-svn: 350775
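A minimal model of the rule quoted from the C standard: the search value is converted to `unsigned char` before the comparison. This is a sketch of the library semantics only, not of the SimplifyLibCalls expansion; the name `memchr_model` is hypothetical, and it returns an index (or -1) where C returns a pointer (or NULL).

```python
def memchr_model(data: bytes, c: int, n: int) -> int:
    """Index of the first byte equal to (unsigned char)c in data[:n], or -1."""
    needle = c & 0xFF  # conversion to unsigned char
    return data[:n].find(needle)


# Without the conversion, a search value of -1 could never match a byte;
# with it, -1 becomes 0xFF and matches the last byte here.
found_negative = memchr_model(b"abc\xff", -1, 4)

# Values above 255 wrap as well: 0x165 becomes 0x65, which is 'e'.
found_wide = memchr_model(b"hello", 0x165, 5)
```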
* Don't require a null terminator when loading objectsDavid Major2019-01-091-1/+2
| | | | | | | | | | When a null terminator is required and the file size is a multiple of the system page size, MemoryBuffer will prefer pread() over mmap(), which can result in excessive memory usage. Patch by Mike Hommey! Differential Revision: https://reviews.llvm.org/D56475 llvm-svn: 350774
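The interaction between the null terminator and the page size can be sketched as follows. This is a simplified model of the single condition described above, under the assumption that a mapped file's final partial page is zero-filled and so provides a terminator for free; the real MemoryBuffer logic considers other factors too, and the function name is hypothetical.

```python
def can_mmap(file_size: int, page_size: int, requires_null_terminator: bool) -> bool:
    """Return True if mapping the file can satisfy the caller's request."""
    if not requires_null_terminator:
        return True
    # When the size is an exact multiple of the page size there is no
    # zero-filled slack after the last byte, so no terminator is available.
    return file_size % page_size != 0


PAGE = 4096
needs_copy = not can_mmap(8 * PAGE, PAGE, requires_null_terminator=True)
mappable_without_terminator = can_mmap(8 * PAGE, PAGE, requires_null_terminator=False)
```

Dropping the terminator requirement for object files therefore lets page-aligned files be mapped instead of being read into freshly allocated memory.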
* [WebAssembly] Print a debug message at the start of each passHeejin Ahn2019-01-0910-3/+28
| Summary:
| It looks like many passes print their pass description as a debug message at the start of each pass, so this adds that to the other (mostly newly added) passes as well.
|
| Reviewers: dschuff
| Subscribers: jgravelle-google, sbc100, sunfish, llvm-commits
|
| Differential Revision: https://reviews.llvm.org/D56142
| llvm-svn: 350771
* Refactor synthetic profile count computation. NFC.Easwaran Raman2019-01-093-37/+41
| | | | | | | | | | | | | | | | | Summary: Instead of using two separate callbacks to return the entry count and the relative block frequency, use a single callback to return callsite count. This would allow better supporting hybrid mode in the future as the count of callsite need not always be derived from entry count (as in sample PGO). Reviewers: davidxl Subscribers: mehdi_amini, steven_wu, dexonsmith, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D56464 llvm-svn: 350755
* [CodeGen] Ignore return sext/zext attributes of unused results for tail callsFrancis Visoiu Mistrih2019-01-091-0/+15
| | | | | | | | | | | | | If the caller's return type does not have a zeroext attribute but the callee does a tail call zeroext, we won't consider the tail call during CodeGenPrepare because the attributes don't match. However, if the result of the tail call has no uses, it makes sense to drop the sext/zext attributes. Differential Revision: https://reviews.llvm.org/D56486 llvm-svn: 350753
* [Inliner] Assert that the computed inline threshold is non-negative.Easwaran Raman2019-01-091-0/+7
| | | | | | | | | | Reviewers: chandlerc Subscribers: haicheng, llvm-commits Differential Revision: https://reviews.llvm.org/D56409 llvm-svn: 350751
* refactor BlockFrequencyInfo::view to take a title parameterDavid Callahan2019-01-091-2/+2
| Summary: Allow a non-default title for this debugging aid
|
| Reviewers: twoh, Kader, modocache
| Reviewed By: twoh
| Subscribers: llvm-commits
|
| Differential Revision: https://reviews.llvm.org/D56499
| llvm-svn: 350749
* [WebAssembly] Standardize order of SIMD bitselect argumentsThomas Lively2019-01-091-1/+1
| | | | | | | | | | | | | | | Summary: For some reason the backend assumed that the condition mask would be the first argument to the LLVM intrinsic, but everywhere else the condition mask is the third argument. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D56412 llvm-svn: 350746
* [mips][micromips] Emit 16bit NOPs by defaultAleksandar Beserminji2019-01-092-6/+19
| Emit 16bit NOPs by default. Use 32bit NOPs in delay slots where necessary.
|
| Differential Revision: https://reviews.llvm.org/D55323
| llvm-svn: 350733
* Revert "[AMDGPU] Fix DPP combiner"Valery Pykhtin2019-01-093-171/+68
| | | | | | This reverts commit e3e2923a39cbec3b3bc3a7d3f0e9a77a4115080e, svn revision rL350721 llvm-svn: 350730
* Initial AArch64 SLH implementation.Kristof Beyls2019-01-093-14/+300
| | | | | | | | | | | | | | This is an initial implementation for Speculative Load Hardening for AArch64. It builds on top of the recently introduced AArch64SpeculationHardening pass. This doesn't implement (yet) some of the optimizations implemented for the X86SpeculativeLoadHardening pass. I thought introducing the optimizations incrementally in follow-up patches should make this easier to review. Differential Revision: https://reviews.llvm.org/D55929 llvm-svn: 350729
* [AMDGPU] Fix DPP combinerValery Pykhtin2019-01-093-68/+171
| Fixed issue with identity values and other cases; f32/f16 identity values to be added later. fma/mac instructions are disabled for now. Test is fully reworked, added comments.
|
| Other fixes:
| 1. dpp move with uses and old reg initializer should be in the same BB.
| 2. bound_ctrl:0 is only considered when bank_mask and row_mask are fully enabled (0xF). Otherwise the old register value is checked for identity.
| 3. Added add, subrev, and, or instructions to the old folding function.
| 4. Kill flag is cleared for the src0 (DPP register) as it may be copied into more than one user.
|
| Differential revision: https://reviews.llvm.org/D55444
| llvm-svn: 350721
* Revert r350647: "[NewPM] Port tsan"Florian Hahn2019-01-094-60/+41
| | | | | | | This patch breaks thread sanitizer on some macOS builders, e.g. http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/52725/ llvm-svn: 350719
* [X86] Enable combining shuffles to PACKSS/PACKUS for 256/512-bit vectorsSimon Pilgrim2019-01-091-3/+4
| | | | llvm-svn: 350716
* [MSP430] Optimize 'shl x, 8[+ N] -> swpb(zext(x)) [<< N]' for i16Anton Korobeynikov2019-01-091-7/+18
| | | | | | | | | | Perform additional simplification to reduce shift amount. Patch by Kristina Bessonova! Differential Revision: https://reviews.llvm.org/D56016 llvm-svn: 350712
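The identity behind this optimization can be checked with a small model of 16-bit arithmetic. This is a sketch of the arithmetic only, assuming `swpb` swaps the two bytes of a 16-bit word and that `zext(x)` here means keeping only the low byte of `x`; the function names are hypothetical.

```python
def swpb(value: int) -> int:
    """Swap the high and low bytes of a 16-bit word."""
    return ((value << 8) | (value >> 8)) & 0xFFFF


def shl_via_swpb(x: int, amount: int) -> int:
    """Compute (x << amount) on i16 for 8 <= amount < 16 using a byte swap."""
    if not 8 <= amount < 16:
        raise ValueError("rewrite only applies to shift amounts 8..15")
    low_byte = x & 0xFF  # zero-extended low byte
    return (swpb(low_byte) << (amount - 8)) & 0xFFFF


reference = (0x12AB << 11) & 0xFFFF
rewritten = shl_via_swpb(0x12AB, 11)
```

The byte swap accounts for the first 8 positions of the shift, so only the remaining `amount - 8` positions need an actual shift.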
* [MSP430] Fix crash while lowering llvm.stacksave/stackrestoreAnton Korobeynikov2019-01-091-0/+2
| | | | | | | | | Perform the usual expansion of stacksave / restore intrinsics. Patch by Kristina Bessonova! Differential Revision: https://reviews.llvm.org/D54890 llvm-svn: 350710
* [AArch64] Move feature predctrl to predresDiogo N. Sampaio2019-01-095-9/+9
| | | | | | | | | | | Follow up patch of rL350385, for adding predres command line option. This patch renames the feature as to keep it aligned with the option passed by/to clang Differential Revision: https://reviews.llvm.org/D56484 llvm-svn: 350702
* [X86] Fix gcc7 -Wunused-but-set-variable warning. NFCI.Simon Pilgrim2019-01-091-2/+0
| | | | llvm-svn: 350701
* [DebugInfo] Omit location list entries with empty rangesDavid Stenberg2019-01-091-0/+13
| Summary:
| This fixes PR39710. In that case we emitted a location list looking like this:
|
|     .Ldebug_loc0:
|         .quad .Lfunc_begin0-.Lfunc_begin0
|         .quad .Lfunc_begin0-.Lfunc_begin0
|         .short 1  # Loc expr size
|         .byte 85  # DW_OP_reg5
|         .quad .Lfunc_begin0-.Lfunc_begin0
|         .quad .Lfunc_end0-.Lfunc_begin0
|         .short 1  # Loc expr size
|         .byte 85  # super-register DW_OP_reg5
|         .quad 0
|         .quad 0
|
| As seen, the first entry's beginning and ending addresses evaluate to 0, which meant that the entry inadvertently became an "end of list" entry, resulting in the location list ending sooner than expected.
|
| To fix this, omit all entries with empty ranges. Location list entries with empty ranges do not have any effect, as specified by DWARF, so we might as well drop them:
|
|     "A location list entry (but not a base address selection or end of list entry) whose beginning and ending addresses are equal has no effect because the size of the range covered by such an entry is zero."
|
| Reviewers: davide, aprantl, dblaikie
| Reviewed By: aprantl
| Subscribers: javed.absar, JDevlieghere, llvm-commits
| Tags: #debug-info
|
| Differential Revision: https://reviews.llvm.org/D55919
| llvm-svn: 350698
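The fix amounts to filtering out entries whose range is empty before emission. The sketch below models that idea on plain tuples; the `LocEntry` type and `drop_empty_ranges` name are hypothetical and do not correspond to the classes used in the patch.

```python
from typing import List, NamedTuple


class LocEntry(NamedTuple):
    begin: int   # offset of the range start from the function start
    end: int     # offset of the range end from the function start
    expr: bytes  # encoded DWARF location expression


def drop_empty_ranges(entries: List[LocEntry]) -> List[LocEntry]:
    """Keep only entries that cover at least one byte of code."""
    return [entry for entry in entries if entry.begin != entry.end]


# Mirrors the listing above: the first entry is (0, 0) and would be read
# as an end-of-list marker, hiding the second entry. 0x55 is DW_OP_reg5.
entries = [LocEntry(0, 0, b"\x55"), LocEntry(0, 0x20, b"\x55")]
emitted = drop_empty_ranges(entries)
```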
* GlobalISel: Implement fewerElements for implicit_defMatt Arsenault2019-01-092-3/+41
| | | | llvm-svn: 350697
* GlobalISel: Implement widenScalar for implicit_defMatt Arsenault2019-01-091-0/+6
| | | | llvm-svn: 350695
* [IPT] Drop cache less eagerly in GVN and LoopSafetyInfoMax Kazantsev2019-01-094-15/+21
| The current strategy of dropping the `InstructionPrecedenceTracking` cache is to invalidate the entire basic block whenever we change its contents. In fact, `InstructionPrecedenceTracking` has 2 internal structures: `OrderedInstructions`, which needs to be invalidated whenever the contents change, and the map with the first special instructions in each block. This second map does not need an update if we add/remove a non-special instruction because that cannot affect the contents of this map.
|
| This patch changes the API of `InstructionPrecedenceTracking` so that it now accounts for the reasons under which we invalidate blocks. This should lead to far fewer recalculations of the map and should save us some compile time because in practice we don't typically add/remove special instructions.
|
| Differential Revision: https://reviews.llvm.org/D54462
| Reviewed By: efriedma
| llvm-svn: 350694
* Revert "[PowerPC] Fix assert from machine verify pass that unmatched register class about fcmp selection in fast-isel"Zi Xuan Wu2019-01-091-20/+13
| This reverts commit r350685. See compile assert in compiler-rt.
|
| llvm-svn: 350693
* [NFC] fix trivial typos in commentsHiroshi Inoue2019-01-097-10/+10
| | | | llvm-svn: 350690
* [X86] Correct the MaskVT for avx512 gather/scatter intrinsics to use the min of the number of index and data elements.Craig Topper2019-01-091-4/+7
| When the result type is v2i64/v2f64 and the index element size is i32, the index vector has two unused elements making the type v4i32. The mask VT should match the number of memory accesses that will be made. This is consistent with the isel patterns used for the target independent gather/scatter intrinsic.
|
| llvm-svn: 350687
* [PowerPC] Fix assert from machine verify pass that unmatched register class about fcmp selection in fast-iselZi Xuan Wu2019-01-091-13/+20
| Bad machine code: Illegal virtual register for instruction
|     function:    TestULE
|     basic block: %bb.0 entry (0x1000a39b158)
|     instruction: %2:crrc = FCMPUD %1:vsfrc, %3:f8rc
|     operand 1:   %1:vsfrc
|
| Fix assert about missing match between fcmp instruction and register class. We should use the VSX-related cmp instruction xvcmpudp instead of fcmpu when VSX is enabled.
|
| Add the -verify-machineinstrs option to related test cases to enable the verify pass.
|
| Differential Revision: https://reviews.llvm.org/D55686
| llvm-svn: 350685
* Remove check for single use in ShrinkDemandedConstantStanislav Mekhanoshin2019-01-092-5/+1
| This moves the check for single use from the general ShrinkDemandedConstant to the BE because of the AArch64 regression after D56289/rL350475. After several hours of experiments I did not come up with a testcase failing on any other targets if the check is not performed.
|
| Moreover, the direct call to ShrinkDemandedConstant is not really needed and is superseded by SimplifyDemandedBits.
|
| Differential Revision: https://reviews.llvm.org/D56406
| llvm-svn: 350684
* RegisterCoalescer: Assume CR_Replace for SubRangeJoinMatt Arsenault2019-01-081-0/+6
| Currently it's possible for the following check on V.WriteLanes (which is not really meaningful during SubRangeJoin) to pass for one half of the pair, and then fall through to one of the impossible or unresolved states. This then fails as inconsistent on the other half.
|
| During the main range join, the check between V.WriteLanes and OtherV.ValidLanes must have passed, meaning this should be a CR_Replace.
|
| Fixes most of the testcases in bugs 39542 and 39602
|
| llvm-svn: 350678
* RegisterCoalescer: Defer clearing implicit_def lanesMatt Arsenault2019-01-081-16/+33
| | | | | | | | | | | | | | We can't go back and recover the lanes if it turns out the implicit_def really can't be erased. Assume all lanes are valid if an unresolved conflict is encountered. There aren't any tests where this seems to matter either way, but this seems like a safer option. Fixes bug 39602 llvm-svn: 350676
* [InstCombine] canonicalize another raw IR rotate pattern to funnel shiftSanjay Patel2019-01-082-3/+56
| | | | | | | | | This is matching the equivalent of the DAG expansion, so it should never end up with worse perf than the original code even if the target doesn't have a rotate instruction. llvm-svn: 350672
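For context, a rotate is the special case of a funnel shift in which both inputs are the same value. The sketch below models the funnel-shift-left semantics (concatenate two values, shift left, keep the high half) on Python integers and compares it with a hand-written rotate; it is an illustration of the semantics, not of the InstCombine pattern matching, and the function names are hypothetical.

```python
def fshl(a: int, b: int, shift: int, width: int = 32) -> int:
    """Funnel shift left: high `width` bits of (a:b) shifted left by shift."""
    mask = (1 << width) - 1
    shift %= width
    concat = ((a & mask) << width) | (b & mask)
    return (concat >> (width - shift)) & mask


def rotl(x: int, shift: int, width: int = 32) -> int:
    """Rotate left written as the raw shift/or pattern."""
    mask = (1 << width) - 1
    shift %= width
    return ((x << shift) | (x >> (width - shift))) & mask


value = 0x80000001
rotated = rotl(value, 1)
funnel = fshl(value, value, 1)
```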
* [PGO] Use SourceFileName rather module name in PGOFuncNameRong Xu2019-01-081-5/+6
| In LTO or ThinLTO mode (through the linker plugin), the module names are temp file names, which differ between compilations. Using SourceFileName avoids the issue. This should not change any functionality for current PGO as all the current callers of getPGOFuncName() are before LTO.
|
| Differential Revision: https://reviews.llvm.org/D56327
| llvm-svn: 350671