bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[AVX-512] Remove masked packss/packus intrinsics and autoupgrade to unmasked intrinsics with select instructions.	Craig Topper	2017-02-16	1	-12/+4
\| \| \| \| \| \| \| \|	For 512-bit add new unmasked intrinsics. The new 512-bit unmasked intrinsics will make it easy to handle these with the SSE/AVX intrinsics in InstCombine where we currently have a TODO. llvm-svn: 295290
*	AMDGPU: Remove llvm.SI.sendmsg	Matt Arsenault	2017-02-16	2	-6/+3
\| \| \| \|	llvm-svn: 295270
*	AMDGPU: Remove SI_fs_constant and SI_fs_interp intrinsics	Matt Arsenault	2017-02-16	3	-50/+3
\| \| \| \| \| \|	Update test uses with expansion in terms of new intrinsics. llvm-svn: 295269
*	[X86] Re-enable conditional tail calls and fix PR31257.	Hans Wennborg	2017-02-16	5	-2/+156
\| \| \| \| \| \| \| \| \| \| \|	This reverts r294348, which removed support for conditional tail calls due to the PR above. It fixes the PR by marking live registers as implicitly used and defined by the now predicated tailcall. This is similar to how IfConversion predicates instructions. Differential Revision: https://reviews.llvm.org/D29856 llvm-svn: 295262
*	GlobalISel: legalize va_arg on AArch64.	Tim Northover	2017-02-15	2	-0/+85
\| \| \| \| \| \| \| \|	Uses a Custom implementation because the slot sizes being a multiple of the pointer size isn't really universal, even for the architectures that do have a simple "void *" va_list. llvm-svn: 295255
*	AMDGPU: Remove dead node definitions	Matt Arsenault	2017-02-15	1	-10/+0
\| \| \| \|	llvm-svn: 295247
*	AMDGPU: Consolidate sendmsg/sendmsghalt handling and tests	Matt Arsenault	2017-02-15	1	-7/+4
\| \| \| \|	llvm-svn: 295244
*	AMDGPU: Replace assert with report_fatal_error	Matt Arsenault	2017-02-15	1	-1/+2
\| \| \| \| \| \|	Also use a more refined condition. llvm-svn: 295239
*	[X86][SSE] Don't call EltsFromConsecutiveLoads if any element is missing.	Simon Pilgrim	2017-02-15	1	-4/+11
\| \| \| \| \| \|	Minor performance speedup - if any call to getShuffleScalarElt fails to get a result, don't bother calling for the remaining elements as EltsFromConsecutiveLoads will fail anyhow. llvm-svn: 295235
*	[AArch64] Make am_ldrlit an iPTR - not OtherVT - operand. NFC-ish.	Ahmed Bougacha	2017-02-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	am_ldrlit diverged from am_brcond in r207105, but kept the OtherVT operand type. It made sense for branch targets, as those are represented as MVT::Other in SDAG. But loads operate on pointers. This shouldn't have an observable effect on any in-tree code, but helps make the patterns consistent for external users. llvm-svn: 295229
*	[X86][SSE] Propagate undef upper elements from scalar_to_vector during shuffle combining	Simon Pilgrim	2017-02-15	1	-1/+7
\| \| \| \| \| \| \| \|	Only do this for integer types currently - float types (in particular insertps) load folding often fails with this. llvm-svn: 295208
*	[AMDGPU] Revert failed scheduling	Stanislav Mekhanoshin	2017-02-15	3	-37/+106
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch reverts a region's scheduling to the original untouched state if we have decreased occupancy. In addition it switches to use TargetRegisterInfo occupancy callback for pressure limits instead of gradually increasing limits which were just passed by. We are going to stay with the best schedule so we do not need to tolerate worsened scheduling anymore. Differential Revision: https://reviews.llvm.org/D29971 llvm-svn: 295206
*	[X86][SSE] Allow matchVectorShuffleWithUNPCK to recognise ZERO inputs	Simon Pilgrim	2017-02-15	1	-11/+46
\| \| \| \| \| \|	Add support for specifying an UNPCK input as ZERO, particularly improves ZEXT cases with non-zero offsets llvm-svn: 295169
*	[LLVM][XRAY][MIPS] Support xray on mips/mipsel/mips64/mips64el	Sagar Thakur	2017-02-15	3	-4/+180
\| \| \| \| \| \| \| \| \|	Summary: Adds support for xray instrumentation on mips for both 32-bit and 64-bit. Reviewed by sdardis, dberris Differential: D27697 llvm-svn: 295164
*	[X86][AVX] Remove REX_W from AVX instructions.	Ayman Musa	2017-02-15	1	-3/+3
\| \| \| \| \| \| \| \|	REX_W has no meaning in VEX-encoded AVX instructions. Differential Revision: https://reviews.llvm.org/D29894 llvm-svn: 295157
*	[X86] Don't create VBROADCAST nodes with 256-bit or 512-bit input types	Craig Topper	2017-02-15	1	-2/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We don't seem to have great rules on what a valid VBROADCAST node looks like. And as a consequence we end up with a lot of patterns to try to catch everything. We have patterns with scalar inputs, 128-bit vector inputs, 256-bit vector inputs, and 512-bit vector inputs. As you can see from the things improved here we are currently missing patterns for 128-bit loads being extended to 256-bit before the vbroadcast. I'd like to propose that VBROADCAST should always take a 128-bit vector type as input. As a first step towards that this patch adds an EXTRACT_SUBVECTOR in front of VBROADCAST when the input is 256 or 512-bits. In the future I would like to add scalar_to_vector around all the scalar operations. And maybe we should consider adding a VBROADCAST+load node to avoid separating loads from the broadcasting operation when the load itself isn't foldable. This requires an additional change in target shuffle combining to look for the extract subvector and look through it to find the original operand. I'm sure this change isn't perfect but was enough to fix a few test failures that were being caused. Another interesting thing I noticed is that the changes in masked_gather_scatter.ll show cases where we don't remove a useless insert into element 1 before broadcasting element 0. Reviewers: delena, RKSimon, zvi Reviewed By: zvi Subscribers: igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D28747 llvm-svn: 295155
*	[AVX-512] Add PACKSS/PACKUS instructions to load folding tables.	Craig Topper	2017-02-15	1	-0/+36
\| \| \| \|	llvm-svn: 295154
*	[AMDGPU] Fix MaxWorkGroupsPerCU for large workgroups	Stanislav Mekhanoshin	2017-02-15	1	-1/+5
\| \| \| \| \| \| \| \| \| \|	This patch corrects the maximum workgroups per CU if we have big workgroups (more than 128). This calculation contributes to the occupancy calculation with respect to LDS size. Differential Revision: https://reviews.llvm.org/D29974 llvm-svn: 295134
*	[mips] Correct mips16 return instructions definitions	Simon Dardis	2017-02-14	1	-0/+2
\| \| \| \| \| \| \|	Correct the definition of MIPS16 instructions that act as return instructions so that isReturn = 1 as expected. llvm-svn: 295109
*	GlobalISel: deal with new G_PTR_MASK instruction on AArch64.	Tim Northover	2017-02-14	2	-0/+13
\| \| \| \| \| \|	It's just an AND-immediate instruction for us, surprisingly simple to select. llvm-svn: 295104
*	[Hexagon] Remove leftover debugging code	Krzysztof Parzyszek	2017-02-14	1	-4/+0
\| \| \| \|	llvm-svn: 295078
*	Remove unused variable.	Diego Novillo	2017-02-14	1	-1/+0
\| \| \| \|	llvm-svn: 295065
*	[X86][SSE] Allow matchVectorShuffleWithUNPCK to recognise UNDEF inputs	Simon Pilgrim	2017-02-14	1	-7/+21
\| \| \| \| \| \|	Add support for specifying an UNPCK input as UNDEF llvm-svn: 295061
*	Revert "[AMDGPU] Fix for SIMachineScheduler crash. SI Scheduler should track"	Alexander Timofeev	2017-02-14	2	-3/+4
\| \| \| \| \| \|	This reverts commit ce06d9cb99298eb844b66e117f5108a06747c907. llvm-svn: 295054
*	[X86][SSE] Move unary inputs handling inside matchVectorShuffleWithUNPCK.	Simon Pilgrim	2017-02-14	1	-2/+3
\| \| \| \|	llvm-svn: 295053
*	[X86][SSE] Tidyup matchVectorShuffleWithUNPCK helper function call.	Simon Pilgrim	2017-02-14	1	-7/+3
\| \| \| \| \| \| \| \|	Don't bother setting the V1/V2 operands again for unary shuffles. Don't bother legalizing the value type unless the match succeeds. llvm-svn: 295051
*	[AVX-512] Add PAVGB/PAVGW to load folding tables.	Craig Topper	2017-02-14	1	-0/+18
\| \| \| \|	llvm-svn: 295035
*	[RISCV] Fix RV32 datalayout string and ensure initAsmInfo is called	Alex Bradbury	2017-02-14	1	-2/+4
\| \| \| \|	llvm-svn: 295028
*	[RISCV] Pseudo instructions are isCodeGenOnly, have blank asmstr	Alex Bradbury	2017-02-14	1	-1/+2
\| \| \| \|	llvm-svn: 295027
*	[RISCV] Fix unused variable in RISCVMCTargetDesc. NFC	Alex Bradbury	2017-02-14	1	-3/+2
\| \| \| \| \| \| \|	Also, for better uniformity use TargetRegistry::RegisterMCAsmInfo rather than RegisterMCAsmInfoFn. Again, no functional change. llvm-svn: 295026
*	[MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	Eugene Zelenko	2017-02-14	5	-21/+50
\| \| \| \| \| \| \| \|	Same changes in files affected by reduced MC headers dependencies. llvm-svn: 295009
*	[X86] Add MXCSR register	Andrew Kaylor	2017-02-13	3	-21/+33
\| \| \| \| \| \| \| \| \| \|	This adds MXCSR to the set of recognized registers for X86 targets and updates the instructions that read or write it. I do not intend for all of the various floating point instructions that implicitly use the control bits or update the status bits of this register to ever have that usage modeled by default. However, when constrained floating point modes (such as strict FP exception status modeling or dynamic rounding modes) are enabled, implicit use/def information for MXCSR will be added to those instructions. Until those additional updates are made this should cause (almost?) no functional changes. Theoretically, this will prevent instructions like LDMXCSR and STMXCSR from being moved past one another, but that should be prevented anyway and I haven't found a case where it is happening now. Differential Revision: https://reviews.llvm.org/D29903 llvm-svn: 295004
*	GlobalISel: represent atomic loads & stores via the MachineMemOperand.	Tim Northover	2017-02-13	1	-0/+6
\| \| \| \| \| \| \|	Also make sure the AArch64 backend doesn't try to convert them into normal loads and stores. llvm-svn: 294993
*	[ARM] Fix crash caused by r294945	James Molloy	2017-02-13	1	-2/+4
\| \| \| \| \| \| \| \|	I'd missed a creator of FCMP nodes - duplicateCmp(). Kindly and promptly reported by Gabor Ballabas, due to his CSiBE test suite. llvm-svn: 294968
*	[mips] divide macro instruction cleanup.	Simon Dardis	2017-02-13	5	-80/+223
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Clean up the implementation of divide macro expansion by getting rid of a FIXME regarding magic numbers and branch instructions. Match GAS' behaviour for expansion of ddiv / div in the two and three operand cases. Add the two operand alias for MIPSR6. Finally, optimize macro expansion cases where the divisor is the $zero register. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D29887 llvm-svn: 294960
*	Fix indentation. NFCI.	Simon Pilgrim	2017-02-13	1	-1/+1
\| \| \| \|	llvm-svn: 294959
*	[CodeGen] fix alignment of JUMPTABLE_INSTS on v8M.base	Sanne Wouda	2017-02-13	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The attached test case fails with "fatal error: error in backend: misaligned pc-relative fixup value" as the jump table is misaligned. The EmitAlignment existed already for ARM and Thumb-1 code, but was missing for Thumb-2. The test checks that the fatal error disappears when generating an obj file, as well as checking the align directive is there when producing an asm file. Reviewers: rengolin, grosbach, t.p.northover, jmolloy, SjoerdMeijer, samparker Reviewed By: samparker Subscribers: samparker, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D29650 llvm-svn: 294950
*	[Thumb-1] TBB generation: spot redefinitions of index register	James Molloy	2017-02-13	1	-1/+17
\| \| \| \| \| \| \| \| \| \| \| \| \|	We match a sequence of 3-4 instructions into a tTBB pseudo. One of our checks is that a particular register in that sequence is killed (so it can be clobbered by the pseudo). We weren't noticing if an errant MOV or other instruction had infiltrated the sequence we were walking. If it had, and it defined the register we've already identified as killed, it makes it live across the tBR_JT and thus unclobberable. Notice this case and bail out. llvm-svn: 294949
*	[ARM] Register ConstantIslands with the pass manager	James Molloy	2017-02-13	3	-1/+8
\| \| \| \| \| \| \|	This allows us to use -stop-before/-stop-after/-run-pass - we can now write .mir tests. llvm-svn: 294948
*	[ARM] Use VCMP, not VCMPE, for floating point equality comparisons	James Molloy	2017-02-13	5	-29/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When generating a floating point comparison we currently unconditionally generate VCMPE. This has the side effect of setting the cumulative Invalid bit in FPSCR if any of the operands are QNaN. It is expected that use of a relational predicate on a QNaN value should raise Invalid. Quoting from the C standard: "The relational and equality operators support the usual mathematical relationships between numeric values. For any ordered pair of numeric values exactly one of the relationships less, greater, and equal is true. Relational operators may raise the "invalid" floating-point exception when argument values are NaNs." The standard doesn't explicitly state the expectation for equality operators, but the implication and obvious expectation is that equality operators should not raise Invalid on a QNaN input, as those predicates are wholly defined on unordered inputs (to return not equal). Therefore, add a new operand to ARMISD::FPCMP and FPCMPZ indicating if QNaN should raise Invalid, and pipe that through to TableGen. llvm-svn: 294945
*	[X86][SSE] Create matchVectorShuffleWithUNPCK helper function.	Simon Pilgrim	2017-02-13	1	-46/+42
\| \| \| \| \| \|	Currently only used by target shuffle combining - will use it for lowering as well in a future patch. llvm-svn: 294943
*	[X86][AVX512] Fix operand classes for some AVX512 instructions to keep consistency between VEX/EVEX versions of the same instruction.	Ayman Musa	2017-02-13	1	-17/+20
\| \| \| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D29873 llvm-svn: 294937
*	[X86] Genericize the handling of INSERT_SUBVECTOR from an EXTRACT_SUBVECTOR to support 512-bit vectors with 128-bit or 256-bit subvectors.	Craig Topper	2017-02-13	1	-21/+18
\| \| \| \| \| \| \| \|	We now detect that both the extract and insert indices are non-zero and convert to a shuffle. This will be lowered as a blend for 256-bit vectors or as a vshuf operation for 512-bit vectors. llvm-svn: 294931
*	[X86] Don't let LowerEXTRACT_SUBVECTOR call getNode for EXTRACT_SUBVECTOR.	Craig Topper	2017-02-12	1	-5/+7
\| \| \| \| \| \|	This results in the simplifications inside of getNode running while we're legalizing nodes popped off the worklist during the final DAG combine. This basically makes a DAG combine like operation occur during this legalize step, but we don't handle something quite the same way. I think we don't recursively add the removed nodes to the DAG combiner worklist. llvm-svn: 294929
*	[X86] Fix typo in function name. NFCI.	Simon Pilgrim	2017-02-12	1	-2/+2
\| \| \| \| \| \|	convertBitVectorToUnsiged -> convertBitVectorToUnsigned llvm-svn: 294914
*	[AVX-512] Add various EVEX move instructions to load folding tables using the VEX equivalents as a guide.	Craig Topper	2017-02-12	1	-4/+10
\| \| \| \| \| \|	llvm-svn: 294908
*	[AVX-512] Add VMOV64toSDZrm CodeGenOnly instruction based on the same instruction from AVX/SSE.	Craig Topper	2017-02-12	1	-0/+4
\| \| \| \| \| \| \| \|	I can't prove that we can select this instruction or the AVX/SSE version, but I'm adding it for consistency for now so I can continue matching the load folding tables. llvm-svn: 294907
*	[X86] Fix a couple instruction names to use 'mr' instead of 'rm' to indicate they are stores.	Craig Topper	2017-02-12	1	-2/+2
\| \| \| \| \| \|	AVX-512 version was already named with 'mr'. llvm-svn: 294906
*	[AVX-512] Add VPEXTRD/Q to load folding tables.	Craig Topper	2017-02-12	1	-0/+2
\| \| \| \|	llvm-svn: 294905
*	[X86][SSE] Update argument names to match function name. NFCI.	Simon Pilgrim	2017-02-12	1	-12/+13
\| \| \| \| \| \|	The target shuffle match function arguments were using the term 'Ops' but the function names referred to them as 'Inputs' - use 'Inputs' consistently. llvm-svn: 294900