bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[AMDGPU] DAG combine to produce V_PERM_B32	Stanislav Mekhanoshin	2018-06-12	5	-1/+214
	Differential Revision: https://reviews.llvm.org/D48099 llvm-svn: 334559
*	[DAGCombiner] Recognize more patterns for ABS	Krzysztof Parzyszek	2018-06-12	3	-28/+11
	Differential Revision: https://reviews.llvm.org/D47831 llvm-svn: 334553
*	[AArch64] Support reserving x20 register	Petr Hosek	2018-06-12	4	-5/+25
	Register x20 is a callee-saved register which may be used for other purposes in certain contexts, for example to hold special variables within the kernel. This change adds support for reserving this register to both the frontend and the backend, making it usable for these purposes. Differential Revision: https://reviews.llvm.org/D46552 llvm-svn: 334531
*	[X86] Remove mayLoad flag from AVX512 truncating store instructions.	Craig Topper	2018-06-12	1	-2/+1
	llvm-svn: 334529
*	[MS][ARM64] Hoist __ImageBase handling into TargetLoweringObjectFileCOFF	Reid Kleckner	2018-06-12	3	-112/+0
	All COFF targets should use @IMGREL32 relocations for symbol differences against __ImageBase. Do the same for getSectionForConstant, so that immediates lowered to globals get merged across TUs. Patch by Chris January. Differential Revision: https://reviews.llvm.org/D47783 llvm-svn: 334523
*	AMDHSA/NFC: Code object v3 updates (additional):	Konstantin Zhuravlyov	2018-06-12	2	-13/+16
	- Move section selection and alignment to AMDGPUAsmPrinter llvm-svn: 334521
*	AMDHSA: Code object v3 updates	Konstantin Zhuravlyov	2018-06-12	6	-10/+184
	- Do not emit the following assembler directives:
	  - .hsa_code_object_version
	  - .hsa_code_object_isa
	  - .amd_amdgpu_isa
	  - .amd_amdgpu_hsa_metadata
	  - .amd_amdgpu_pal_metadata
	- Do not emit .note entries
	- Clean up the kernel descriptor header file and bring it in sync
	- Emit kernel descriptor into .rodata with appropriate relocations and alignments
	llvm-svn: 334519
*	[MC] [X86] Teach leaq _GLOBAL_OFFSET_TABLE(%rip), %r15 to use R_X86_64_GOTPC32 instead of R_X86_64_PC32	Fangrui Song	2018-06-12	1	-1/+7
	Summary: This is similar to D46319 (ARM). x86-64 psABI p40 gives an example:
	    leaq _GLOBAL_OFFSET_TABLE(%rip), %r15 # GOTPC32 reloc
	GNU as creates R_X86_64_GOTPC32. However, MC currently emits R_X86_64_PC32.
	Reviewers: javed.absar, echristo
	Subscribers: kristof.beyls, llvm-commits, peter.smith, grimar
	Differential Revision: https://reviews.llvm.org/D47507 llvm-svn: 334515
*	[CostModel] Replace ShuffleKind::SK_Alternate with ShuffleKind::SK_Select (PR33744)	Simon Pilgrim	2018-06-12	2	-30/+30
	As discussed on PR33744, this patch relaxes ShuffleKind::SK_Alternate, which requires shuffle masks to match only an alternating pattern from its two sources, e.g. v4f32: <0,5,2,7> or <4,1,6,3>. This seems far too restrictive, as most SIMD hardware will implement it using a general blend/bit-select instruction, so the patch replaces it with SK_Select, permitting elements from either source as long as they remain in their original lane, e.g. v4f32: <0,5,2,7>, <4,1,6,3>, <0,1,6,7>, <4,1,2,3> etc. This initial patch just updates the name and the cost model shuffle mask analysis; later patch reviews will update SLP to better utilise this, as it still limits itself to SK_Alternate-style patterns. Differential Revision: https://reviews.llvm.org/D47985 llvm-svn: 334513
*	[X86] Remove TB_ALIGN_16 from VEXTRACTF128/VEXTRACTI128 in the memory folding table.	Craig Topper	2018-06-12	1	-2/+2
	llvm-svn: 334511
*	[Hexagon] Make floating point operations expensive for vectorization	Krzysztof Parzyszek	2018-06-12	2	-6/+35
	llvm-svn: 334508
*	[x86] move shrunkblend transform to helper function; NFCI	Sanjay Patel	2018-06-12	1	-74/+76
	We should be able to obsolete D48043 by easing the constraints on this existing code. llvm-svn: 334504
*	[SelectionDAG] Provide default expansion for rotates	Krzysztof Parzyszek	2018-06-12	3	-0/+32
	Implement default legalization of rotates: either in terms of the rotation in the opposite direction (if legal), or in terms of shifts and ors. Implement generation of rotate instructions for Hexagon. Hexagon only supports rotates by an immediate value, so implement custom lowering of ROTL/ROTR on Hexagon. If a rotate is not legal, use the default expansion. Differential Revision: https://reviews.llvm.org/D47725 llvm-svn: 334497
*	[mips] Guard some floating point instructions correctly	Simon Dardis	2018-06-12	1	-31/+37
	Reviewers: smaksimovic, atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D47636 llvm-svn: 334491
*	[mips] Extend LONG_BRANCH_LUi/ADDiu with extra parameter	Aleksandar Beserminji	2018-06-12	3	-22/+67
	Extend the LONG_BRANCH_LUi and LONG_BRANCH_ADDiu pseudo instructions with an additional flag, so instead of always lowering to lui %hi(...), addiu %lo(...) or addiu %hi(...), they can now lower to either %lo, %hi, %higher or %highest depending on the added flag. Differential Revision: https://reviews.llvm.org/D47941 llvm-svn: 334490
*	[AArch64] Audit on rL333879 to fix FP16 64bit bitpatterns	Luke Geeson	2018-06-12	1	-2/+2
	llvm-svn: 334488
*	[X86] Add NotMemoryFoldable to the VPCOMPRESS instructions.	Craig Topper	2018-06-12	1	-4/+4
	llvm-svn: 334481
*	[X86] Add NotMemoryFoldable to more instructions.	Craig Topper	2018-06-12	1	-14/+22
	These include PUSH/POP instructions that don't match the manual table. This also includes CMPXCHG which we never emit in non-locked form. llvm-svn: 334479
*	[X86] Add NotMemoryFoldable to a bunch of instructions to suppress them from the autogenerated load folding table.	Craig Topper	2018-06-12	4	-46/+54
	Most of these are system instructions or other instructions we don't use in CodeGen. No point wasting space for them in the table. Removing them from the autogenerated table makes it easier to review the manual table. A few are real opcode collisions where the memory and register forms are completely different instructions. llvm-svn: 334474
*	[X86] Add isel patterns for folding loads when creating ROUND instructions from ffloor/fnearbyint/fceil/frint/ftrunc.	Craig Topper	2018-06-12	2	-16/+200
	We were missing packed isel folding patterns for all of sse41, avx, and avx512. For some reason avx512 had scalar load folding patterns under optsize (due to partial/undef reg update), but we didn't have the equivalent sse41 and avx patterns. Sometimes we would get load folding due to the peephole pass anyway, but we're also missing avx512 instructions from the load folding table. I'll try to fix that in another patch. Some of this was spotted in the review for D47993. This patch adds all the folds to isel, adds a few spot tests, and disables the peephole pass on a few tests to ensure we're testing some of these patterns. llvm-svn: 334460
*	[AMDGPU] prevent hitting Assertion `isReg() && "Wrong MachineOperand accessor"'	Mark Searles	2018-06-12	1	-2/+2
	The use iterator, used within findMaskOperands(), can return anything which is not a def. isUse() requires a register, so check isReg() before calling isUse(). Differential Revision: https://reviews.llvm.org/D48047 llvm-svn: 334459
*	Simplify; NFC	George Burgess IV	2018-06-11	1	-1/+1
	Not shown in the diff: AQ is a `vector<SUnit *>`, and SU is a `SUnit *` llvm-svn: 334451
*	AMDGPU: Add 64-bit relative variant kind	Konstantin Zhuravlyov	2018-06-11	1	-0/+2
	Differential Revision: https://reviews.llvm.org/D47601 llvm-svn: 334443
*	[X86] Push some variable declarations down into the individual switch cases that need them. NFC	Craig Topper	2018-06-11	1	-3/+4
	All of the cases are already wrapped in curly braces so declaring a variable there isn't an issue. And the variables aren't assigned or used in the larger scope. llvm-svn: 334436
*	[X86] Reorder some type constraints to force things to be vectors and integer/fp before forcing them to be the same size.	Craig Topper	2018-06-11	1	-4/+4
	This may be needed by another patch that I'm working on. It should have no effect on any of the generated outputs. llvm-svn: 334430
*	[Hexagon] Late predicate producers cannot be used as dot-new sources	Krzysztof Parzyszek	2018-06-11	1	-4/+23
	llvm-svn: 334426
*	[X86][AVX512] Tag AVX5124FMAPS/AVX5124VNNIW with missing scheduler classes	Simon Pilgrim	2018-06-11	1	-6/+12
	Necessary for D46276: even though btver2 doesn't use these instructions, it's now flagged as complete, so it complains if ANY instruction isn't tagged. UnsupportedFeatures wouldn't help here, as these instructions don't appear to have a feature predicate (like a lot of AVX512). llvm-svn: 334423
*	[AMDGPU] Do not consider indirect access through phi for wave limiter	Stanislav Mekhanoshin	2018-06-11	1	-6/+0
	Rationale: an indirect access is usually an issue because the load is not ready by the time of the use. However, if the use is inside a loop and the load is outside it, that is potentially an issue for the first iteration only. Differential Revision: https://reviews.llvm.org/D47740 llvm-svn: 334420
*	[mips] Fix spill slot for mips3, n64 abi	Aleksandar Beserminji	2018-06-11	1	-3/+4
	When a program is compiled for mips3 with the n64 ABI, the wrong register class is used for creating an emergency spill slot. This patch ensures the correct register class is chosen. This patch resolves PR35859. Thanks to John Baldwin for reporting the issue! Differential Revision: https://reviews.llvm.org/D47938 llvm-svn: 334419
*	[AVR] Set trackLivenessAfterRegAlloc	Dylan McKay	2018-06-11	1	-0/+5
	This sets trackLivenessAfterRegAlloc on AVRRegisterInfo. Most existing targets set this flag. Without it, specific IR inputs cause LLVM to fail with: Assertion failed: (getParent()->getProperties().hasProperty( MachineFunctionProperties::Property::TracksLiveness) && "Liveness information is accurate"), function livein_begin, file MachineBasicBlock.cpp, line 1354. With this commit, this no longer happens. Patch by Peter Nimmervoll. llvm-svn: 334409
*	[X86] Fix skylake server scheduling info.	Clement Courbet	2018-06-11	11	-310/+743
	Summary: This fixes most of the scheduling info for SKX vector operations. I had to split a lot of the YMM/ZMM classes into separate classes for YMM and ZMM. The before/after llvm-exegesis analyses are in the Phabricator diff. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47721 llvm-svn: 334407
*	[ExynosM1][Sched] Fix resource usage in scheduling model.	Clement Courbet	2018-06-11	1	-16/+16
	This is part of https://reviews.llvm.org/D46356. llvm-svn: 334391
*	[X86] Explicitly mark unsupported classes in scheduling models.	Clement Courbet	2018-06-11	9	-111/+131
	Summary: In preparation for D47721. HSW and SNB still define unsupported classes as they are used by KNL and generic models respectively. Reviewers: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47763 llvm-svn: 334389
*	[X86] Remove masking from dbpsadbw intrinsics, use select in IR instead.	Craig Topper	2018-06-11	2	-9/+13
	llvm-svn: 334384
*	[Sparc] Add support for 13-bit PIC	Daniel Cederman	2018-06-11	7	-7/+49
	Summary: When compiling with -fpic, in contrast to -fPIC, use only the immediate field to index into the GOT. This saves space if the GOT is known to be small. The linker will warn if the GOT is too large for this method. Reviewers: jyknight, venkatra Reviewed By: jyknight Subscribers: brad, fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D47136 llvm-svn: 334383
*	[X86] Remove and autoupgrade the expandload and compressstore intrinsics.	Craig Topper	2018-06-11	2	-131/+1
	We use the target independent intrinsics now. llvm-svn: 334381
*	[X86] Miscellaneous fixes to get the load folding table generator to work again.	Craig Topper	2018-06-10	3	-9/+9
	llvm-svn: 334377
*	[NEON] Support VST1xN intrinsics in AArch32 mode (LLVM part)	Ivan A. Kosarev	2018-06-10	4	-8/+165
	We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47447 llvm-svn: 334361
*	[X86] Remove masking from the 512-bit masked floating point add/sub/mul/div intrinsics. Use a select in IR instead.	Craig Topper	2018-06-10	2	-19/+24
	llvm-svn: 334358
*	[X86] NFC Use member initialization in X86Subtarget	Gabor Buella	2018-06-09	2	-215/+107
	The separate initializeEnvironment function was sort of useless since r217071. ARM did this move already with r273556. llvm-svn: 334345
*	[ARM] Allow CMPZ transforms even if the input has multiple uses.	Eli Friedman	2018-06-08	1	-1/+1
	It looks like this got left in by accident in r289794; I can't think of any reason this check would be necessary. (Maybe it was meant to be a check that the AND has one use? But we check that a few lines earlier.) Differential Revision: https://reviews.llvm.org/D47921 llvm-svn: 334322
*	[X86][SSE] Support v8i16/v16i16 rotations	Simon Pilgrim	2018-06-08	1	-14/+30
	An extension to D46954 (PR37426): this patch adds support for v8i16/v16i16 rotations in a similar manner, by converting the shift/rotate amount to a multiplication factor and using PMULLW to shift left and PMULHUW (ISD::MULHU) to shift the wrapped bits back around, with the two results ORed together. Differential Revision: https://reviews.llvm.org/D47822 llvm-svn: 334309
*	[X86][BtVer2] Add support for all SUB/XOR 32/64 scalar instructions that should match the dependency-breaking 'zero-idiom'	Simon Pilgrim	2018-06-08	1	-1/+8
	As detailed in Agner's Microarchitecture doc (21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions), these instructions are dependency breaking and fast-path zero the destination register (and appropriate EFLAGS bits). llvm-svn: 334303
*	[AMDGPU] Inline asm - added i16, half and i128 types support	Daniil Fukalov	2018-06-08	1	-16/+32
	The AMDGPU inline assembler supports i16, half and i128 typed variables in constraints, but they were reported as errors. Needed to fix https://github.com/RadeonOpenCompute/ROCm/issues/341, e.g. to be able to load with global_load_dwordx4 into a 128-bit integer variable. Differential Revision: https://reviews.llvm.org/D44920 llvm-svn: 334301
*	[X86][SSE] Simplify combineVectorTruncationWithPACKUS to reduce code duplication	Simon Pilgrim	2018-06-08	1	-37/+5
	Simplify combineVectorTruncationWithPACKUS to mask the upper bits and then call truncateVectorWithPACK instead of duplicating similar code. This results in the codegen using (V)PACKUSDW on SSE41+ targets for vXi64/vXi32 inputs where before it always used PACKUSWB (along with a lot more bitcasting). I've raised PR37749 because, until we avoid unnecessary concats back to 256-bit for bitwise ops, we can't avoid splitting the input value into 128-bit subvectors for masking. llvm-svn: 334289
*	[mips] Correct the predicates for a number of codegen only instructions	Simon Dardis	2018-06-08	1	-37/+52
	Reviewers: smaksimovic, atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D47638 llvm-svn: 334280
*	[RISCV] Implement MC layer support for the fence.tso instruction	Alex Bradbury	2018-06-08	1	-0/+6
	The instruction makes use of a previously ignored field in the fence instruction. It is introduced in the version 2.3 draft of the RISC-V specification after much work by the Memory Model Task Group. As clarified here <https://github.com/riscv/riscv-isa-manual/issues/186>, the fence.tso assembler mnemonic does not have operands. llvm-svn: 334278
*	[X86][SSE] Consistently prefer lowering to PACKUS over PACKSS	Simon Pilgrim	2018-06-08	1	-32/+32
	We have some combines/lowerings that attempt to use PACKSS-then-PACKUS and others that use PACKUS-then-PACKSS. PACKUS is much easier to combine with if we know the upper bits are zero as ComputeKnownBits can easily see through BITCASTs etc. especially now that rL333995 and rL334007 have landed. It also effectively works at byte level which further simplifies shuffle combines. The only (minor) annoyances are that ComputeKnownBits can sometimes take longer as it doesn't fail as quickly as ComputeNumSignBits (but I'm not seeing any actual regressions in tests) and PACKUSDW only became available after SSE41 so we have more codegen diffs between targets. llvm-svn: 334276
*	AMDGPU: Error on LDS global address in functions	Matt Arsenault	2018-06-08	1	-1/+9
	These won't work as expected now, so error on them to avoid wasting time debugging this in the future. llvm-svn: 334269
*	[NFC] fix formatting	Hiroshi Inoue	2018-06-08	1	-1/+1
	llvm-svn: 334263
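
Two entries in this log describe rotates in words only: the SelectionDAG default expansion ("in terms of the rotation in the opposite direction ... or in terms of shifts and ors") and the X86 v8i16/v16i16 lowering that turns the rotate amount into a multiplication factor. The sketch below models both on a single 16-bit lane in Python. It is an illustration under stated assumptions, not code from either patch: the function names are hypothetical, and the low/high halves of the product stand in for what PMULLW and PMULHUW would each keep per lane.

```python
MASK16 = 0xFFFF  # one 16-bit lane, as in v8i16/v16i16


def rotl16_shifts(x, n):
    """Rotate left written as two shifts and an OR (hypothetical helper)."""
    n %= 16
    x &= MASK16
    # For n == 0 the right shift is by 16, which yields 0 for a 16-bit value.
    return ((x << n) | (x >> (16 - n))) & MASK16


def rotr16_via_rotl(x, n):
    """Rotate right written as a rotate left in the opposite direction."""
    return rotl16_shifts(x, (16 - n % 16) % 16)


def rotl16_mul(x, n):
    """Rotate left via multiplication by 2**n (hypothetical helper).

    The low 16 bits of the product model the PMULLW result (the left shift);
    the high 16 bits model the PMULHUW result (the bits that wrapped around).
    """
    n %= 16
    x &= MASK16
    product = x * (1 << n)   # fits in 32 bits
    low = product & MASK16   # shifted-left part
    high = product >> 16     # wrapped-around part
    return low | high
```

For example, `rotl16_mul(0x8001, 1)` and `rotl16_shifts(0x8001, 1)` both give `0x0003`. The multiply form is useful when a target has per-lane multiplies but no per-lane variable shifts, since a vector of powers of two can encode a different rotate amount for each lane.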