bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	A new test to demonstrate the current SHLD/SHRD code generation.	Andrew V. Tischenko	2018-01-18	1	-0/+464
\| \| \| \|	llvm-svn: 322828
*	[RISCV][NFC] Add nounwind to functions in div.ll and mul.ll	Alex Bradbury	2018-01-18	2	-16/+16
\| \| \| \| \| \| \|	Committing this separately to minimise irrelevant changes for an upcoming patch. llvm-svn: 322825
*	[X86] Use vmovdqu64/vmovdqa64 for unmasked integer vector stores for ↵	Craig Topper	2018-01-18	7	-29/+29
\| \| \| \| \| \| \| \|	consistency with loads. Previously we used 64 for vXi64 stores and 32 for everything else. This change uses 64 for everything, just like we do for loads. llvm-svn: 322820
*	[X86] Remove isel patterns for using unmasked vmovdqa32/vmovdqu32 for ↵	Craig Topper	2018-01-18	16	-106/+106
\| \| \| \| \| \| \| \|	integer vector loads. These patterns were just looking for a vXi64 bitcasted to vXi32, but there is no advantage to using vmovdqa32 over vmovdqa64. llvm-svn: 322819
*	[X86] Remove windows line endings from a test file. NFC	Craig Topper	2018-01-18	1	-93/+93
\| \| \| \|	llvm-svn: 322817
*	[DAGCombiner] Add a DAG combine to turn a splat build_vector where the splat ↵	Craig Topper	2018-01-18	1	-8/+0
\| \| \| \| \| \| \| \| \| \|	element is a bitcast from a vector type into a concat_vector. For example, a build_vector of i64 bitcasted from v2i32 can be turned into a concat_vectors of the v2i32 vectors with a bitcast to a vXi64 type. Differential Revision: https://reviews.llvm.org/D42090 llvm-svn: 322811
*	GlobalISel: Make MachineCSE runnable in the middle of the GlobalISel	Justin Bogner	2018-01-18	1	-0/+181
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Right now, it is not possible to run MachineCSE in the middle of the GlobalISel pipeline. Being able to run generic optimizations between the core passes of GlobalISel was one of the goals of the new ISel framework. This is the first attempt to do it. The problem is that the MachineCSE pass assumes all register operands have a register class, which, in the GlobalISel context, won't be true until after the InstructionSelect pass. The reason for this behaviour is that before replacing one virtual register with another, the MachineCSE pass (and most of the other machine optimization passes) must check if the virtual registers' constraints have a (sufficiently large) intersection, and constrain the resulting register appropriately if such an intersection exists. GlobalISel extends the representation of such constraints from just a register class to a triple (low-level type, register bank, register class). This commit adds the MachineRegisterInfo::constrainRegAttrs method, which extends MachineRegisterInfo::constrainRegClass to such a triple. The idea is that going forward we should use: - RegisterBankInfo::constrainGenericRegister within GlobalISel's InstructionSelect pass - MachineRegisterInfo::constrainRegClass within SelectionDAG ISel - MachineRegisterInfo::constrainRegAttrs everywhere else, regardless of the target and the instruction selector it uses. Patch by Roman Tereshin. Thanks! llvm-svn: 322805
*	Fix the failure caused by r322773	Volkan Keles	2018-01-18	1	-0/+3
\| \| \| \| \| \|	Do not run GlobalISel if `-fast-isel=0 -global-isel=false`. llvm-svn: 322800
*	[MachineOutliner] Add DISubprograms to outlined functions.	Jessica Paquette	2018-01-18	1	-0/+219
\| \| \| \| \| \| \| \| \| \|	Before, it wasn't possible to get backtraces inside outlined functions. This commit adds DISubprograms to the IR functions created by the outliner which makes this possible. Also attached a test that ensures that the produced debug information is correct. This is useful to users that want to debug outlined code. llvm-svn: 322789
*	[CodeGen] Hoist common AsmPrinter code out of X86, ARM, and AArch64	Reid Kleckner	2018-01-17	4	-124/+122
\| \| \| \| \| \| \| \| \| \| \|	Every known PE COFF target emits /EXPORT: linker flags into a .drectve section. The AsmPrinter should handle this. While we're at it, use global_values() and emit each export flag with its own .ascii directive. This should make the .s file output more readable. llvm-svn: 322788
*	Add support for emitting libcalls for x86_fp80 -> fp128 and vice-versa	Benjamin Kramer	2018-01-17	1	-0/+27
\| \| \| \| \| \|	compiler_rt doesn't provide them (yet), but libgcc does. PR34076. llvm-svn: 322772
*	[X86][MMX] Add PR35982 test cases	Simon Pilgrim	2018-01-17	1	-0/+123
\| \| \| \| \| \|	FEMMS has the same problem as EMMS llvm-svn: 322770
*	[LegalizeDAG] Fix ATOMIC_CMP_SWAP_WITH_SUCCESS legalization.	Eli Friedman	2018-01-17	4	-15/+45
\| \| \| \| \| \| \| \| \| \| \| \| \|	The code wasn't zero-extending correctly, so the comparison could spuriously fail. Adds some AArch64 tests to cover this case. Inspired by D41791. Differential Revision: https://reviews.llvm.org/D41798 llvm-svn: 322767
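To see how a missing zero-extension can make the success comparison fail spuriously, consider a narrow value held in a wider register. The following is a minimal Python model, not the LegalizeDAG code: the helper names and the 8-bit value in a 32-bit register are assumptions chosen only for illustration.

```python
def zext(value, bits):
    """Zero-extend: keep only the low `bits` bits."""
    return value & ((1 << bits) - 1)


def sext(value, bits, width=32):
    """Sign-extend the low `bits` bits into a `width`-bit register image."""
    value = zext(value, bits)
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & ((1 << width) - 1)


def cmpxchg_success(loaded_reg, expected_reg, bits, normalize):
    """Model the success flag of a promoted compare-and-swap.

    With normalize=False the wide registers are compared as they are.
    With normalize=True both sides are zero-extended from `bits` first.
    """
    if normalize:
        loaded_reg = zext(loaded_reg, bits)
        expected_reg = zext(expected_reg, bits)
    return loaded_reg == expected_reg


# Memory holds the byte 0xFF; the load zero-extends it into a 32-bit register.
loaded = zext(0xFF, 8)
# The expected value is the same byte, but its register was sign-extended.
expected = sext(0xFF, 8)

# Comparing the raw registers reports failure although the bytes match.
spurious_failure = not cmpxchg_success(loaded, expected, 8, normalize=False)
# Zero-extending both sides before comparing gives the right answer.
fixed = cmpxchg_success(loaded, expected, 8, normalize=True)
```

In this model the two registers hold the same byte but differ in their upper 24 bits, so only the normalized comparison reports success.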
*	[globalisel][tablegen] Honour priority order within nested instructions.	Daniel Sanders	2018-01-17	1	-0/+40
\| \| \| \| \| \| \| \| \| \| \|	It appears that we haven't been prioritizing rules that contain nested instructions properly. InstructionOperandMatcher didn't override isHigherPriorityThan so it never compared the instructions/operands/predicates inside nested instructions. Fixes PR35926. Thanks to Diana Picus for the bug report. llvm-svn: 322754
*	Revert [PowerPC] This reverts commit rL322721	Zaara Syeda	2018-01-17	2	-88/+0
\| \| \| \| \| \|	Failing build bots. Revert the commit now. llvm-svn: 322748
*	Use a got to access a hidden weak undefined on MachO.	Rafael Espindola	2018-01-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Trying to link __attribute__((weak, visibility("hidden"))) extern int foo; int *main(void) { return &foo; } on OS X fails with ld: 32-bit RIP relative reference out of range (-4294971318 max is +/-2GB): from _main (0x100000FAB) to _foo@0x00001000 (0x00000000) in '_main' from test.o for architecture x86_64 The problem being that 0 cannot be computed as a fixed difference from %rip. Exactly the same issue exists on ELF and we can use the same solution. llvm-svn: 322739
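The linker error above comes down to a range check: a RIP-relative reference carries a signed 32-bit displacement, and an undefined weak symbol resolves to address 0. The sketch below checks the numbers quoted in the message. The GOT slot address is a hypothetical value picked for illustration, not taken from the commit.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def fits_rel32(displacement):
    """True if the displacement fits in a signed 32-bit field."""
    return INT32_MIN <= displacement <= INT32_MAX


# Values quoted in the ld error message.
main_address = 0x100000FAB
weak_undefined_address = 0x0
reported_displacement = -4294971318

# A direct reference from _main to address 0 is far outside +/-2GB.
direct_displacement = weak_undefined_address - main_address

# Hypothetical GOT slot placed close to the code (address is an assumption).
got_slot_address = 0x100001000
got_displacement = got_slot_address - main_address
```

Going through a GOT slot works because the slot sits near the code, so the RIP-relative displacement to the slot is small, while the slot itself can hold any 64-bit address, including 0.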
*	[ARM] Optimize {s,u}mul.with.overflow.	Joel Galenson	2018-01-17	1	-0/+40
\| \| \| \| \| \| \| \|	This extends my previous patches to also optimize overflow-checked multiplies during SelectionDAG. Differential revision: https://reviews.llvm.org/D40922 llvm-svn: 322738
*	[ARM] Optimize {s,u}{add,sub}.with.overflow.	Joel Galenson	2018-01-17	2	-24/+54
\| \| \| \| \| \| \| \|	The ARM backend contains code that tries to optimize compares by replacing them with an existing instruction that sets the flags the same way. This allows it to replace a "cmp" with a "adds", generalizing the code that replaces "cmp" with "sub". It also heuristically disables sinking of instructions that could potentially be used to replace compares (currently only if they're next to each other). Differential revision: https://reviews.llvm.org/D38378 llvm-svn: 322737
*	[X86] Teach LowerBUILD_VECTOR to recognize pair-wise splats of 32-bit ↵	Craig Topper	2018-01-17	6	-93/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	elements and use a 64-bit broadcast. If we are splatting pairs of 32-bit elements, we can use a 64-bit broadcast to get the job done. We could probably do this with other sizes too, for example four 16-bit elements. Or we could broadcast pairs of 16-bit elements using a 32-bit element broadcast. But I've left that as a future improvement. I've also restricted this to AVX2 only because we can only broadcast loads under AVX. Differential Revision: https://reviews.llvm.org/D42086 llvm-svn: 322730
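The pair-wise splat check described above can be sketched as follows. This is a plain Python model of the idea, not the LowerBUILD_VECTOR code: the function name is hypothetical, and it assumes little-endian element order (element 0 lands in the low 32 bits of the 64-bit value).

```python
def pairwise_splat_as_i64(elements):
    """Return the 64-bit value to broadcast if `elements` repeats one pair
    of 32-bit values, otherwise None."""
    if len(elements) < 2 or len(elements) % 2 != 0:
        return None
    lo, hi = elements[0], elements[1]
    for i in range(0, len(elements), 2):
        if elements[i] != lo or elements[i + 1] != hi:
            return None
    # Pack the pair into one 64-bit element, low element first.
    return ((hi & 0xFFFFFFFF) << 32) | (lo & 0xFFFFFFFF)


def broadcast_i64(value, num_i32_elements):
    """Expand a 64-bit broadcast back into its 32-bit elements."""
    pair = [value & 0xFFFFFFFF, (value >> 32) & 0xFFFFFFFF]
    return pair * (num_i32_elements // 2)


v8i32 = [1, 2, 1, 2, 1, 2, 1, 2]
packed = pairwise_splat_as_i64(v8i32)
```

Broadcasting `packed` as four 64-bit elements reproduces the original eight 32-bit elements, which is why a single 64-bit broadcast is enough for this pattern.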
*	[X86] When legalizing (v64i1 select i8, v64i1, v64i1) make sure not to ↵	Craig Topper	2018-01-17	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \|	introduce bitcasts to i64 in 32-bit mode. We legalize selects of masks with scalar conditions using a bitcast to an integer type. But if we are in 32-bit mode we can't convert v64i1 to i64. So instead split the v64i1 into two v32i1 halves and concat them back together. Each half will then be legalized by bitcasting to i32, which is fine. The test case is a little indirect. If we have the v64i1 select in IR it will get legalized by legalize vector ops, which has a run of type legalization after it. That type legalization run is able to fix this i64 bitcast. So in order to avoid that we need a build_vector of a splat, which legalize vector ops will ignore. Legalize DAG will then turn that into a select via LowerBUILD_VECTORvXi1, and the select will get legalized. In this case there is no type legalizer run to clean up the bitcast. This fixes PR35972. llvm-svn: 322724
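The split-and-concat strategy can be modeled in a few lines. This is an illustrative Python sketch with hypothetical helper names, not the X86 lowering code: a mask is a list of booleans, a "bitcast" packs it into an integer, and the 64-element select is done as two 32-element selects so that no 64-bit integer is ever needed.

```python
def mask_to_int(mask):
    """Bitcast a list of booleans (element 0 = bit 0) to an integer."""
    result = 0
    for i, bit in enumerate(mask):
        if bit:
            result |= 1 << i
    return result


def int_to_mask(value, num_elements):
    """Bitcast an integer back to a list of booleans."""
    return [bool((value >> i) & 1) for i in range(num_elements)]


def select_via_int(cond, a, b):
    """Select between two masks by bitcasting each to an integer."""
    chosen = mask_to_int(a) if cond else mask_to_int(b)
    return int_to_mask(chosen, len(a))


def select_v64i1_split(cond, a, b):
    """Select on 64-element masks using only 32-bit integer bitcasts."""
    assert len(a) == 64 and len(b) == 64
    lo = select_via_int(cond, a[:32], b[:32])
    hi = select_via_int(cond, a[32:], b[32:])
    return lo + hi  # concat the two halves


a = [i % 3 == 0 for i in range(64)]
b = [i % 5 == 0 for i in range(64)]
```

Each half packs into a value below 2**32, which is the property that makes the approach legal when i64 is not available.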
*	[X86][SSE] Add v4i16 PMULLD tests	Simon Pilgrim	2018-01-17	1	-0/+72
\| \| \| \|	llvm-svn: 322723
*	[PowerPC] Add handling for ColdCC calling convention and a pass to mark	Zaara Syeda	2018-01-17	2	-0/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	candidates with coldcc attribute. This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 322721
*	[SystemZ] Handle BRCTH branches correctly in SystemZLongBranch.cpp.	Jonas Paulsson	2018-01-17	1	-0/+11953
\| \| \| \| \| \| \| \|	BRCTH is capable of a long branch which needs to be recognized during branch relaxation. This is done by checking for ExtraRelaxSize == 0. Review: Ulrich Weigand llvm-svn: 322688
*	[ARM GlobalISel] Add instselect tests for G_FPEXT and G_FPTRUNC	Diana Picus	2018-01-17	1	-0/+55
\| \| \| \| \| \| \|	G_FPEXT and G_FPTRUNC are handled by TableGen'erated code, just add tests. llvm-svn: 322665
*	[AArch64] Fix incorrect LD1 of 16-bit FP vectors in big endian	Pablo Barrio	2018-01-17	1	-0/+207
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Loading a vector of 4 half-precision FP sometimes results in an LD1 of 2 single-precision FP + a reversal. This results in an incorrect byte swap due to the conversion from little endian to big endian. In order to generate the correct byte swap, it is easier to generate the correct LD1 of 4 half-precision FP, thus avoiding the subsequent reversal. Reviewers: craig.topper, jmolloy, olista01 Reviewed By: olista01 Subscribers: efriedma, samparker, SjoerdMeijer, rogfer01, aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D41863 llvm-svn: 322663
*	[ARM GlobalISel] Map G_FPEXT and G_FPTRUNC to FPR	Diana Picus	2018-01-17	1	-0/+45
\| \| \| \|	llvm-svn: 322657
*	[AMDGPU] add LDS f32 intrinsics	Daniil Fukalov	2018-01-17	1	-0/+69
\| \| \| \| \| \| \| \| \| \| \| \|	Added llvm.amdgcn.atomic.{add\|min\|max}.f32 intrinsics to allow generating the ds_{add\|min\|max}[_rtn]_f32 instructions needed for OpenCL float atomics in LDS. Reviewed by: arsenm Differential Revision: https://reviews.llvm.org/D37985 llvm-svn: 322656
*	[ARM GlobalISel] Legalize G_FPEXT and G_FPTRUNC	Diana Picus	2018-01-17	1	-0/+79
\| \| \| \| \| \| \| \| \| \| \|	Mark G_FPEXT and G_FPTRUNC as legal or libcall, depending on hardware support, but only for conversions between float and double. Also add the necessary boilerplate so that the LegalizerHelper can introduce the required libcalls. This also works only for float and double, but isn't too difficult to extend when the need arises. llvm-svn: 322651
*	[X86] Don't mutate shuffle arguments after early-out for AVX512	Benjamin Kramer	2018-01-17	1	-0/+40
\| \| \| \| \| \| \| \| \| \|	The match* functions have the annoying behavior of modifying their inputs. Save and restore the inputs, just in case the early out for AVX512 is hit. This is still not great and it's only a matter of time before this kind of bug happens again, but I couldn't come up with a better pattern without rewriting significant chunks of this code. Fixes PR35977. llvm-svn: 322644
*	[X86][AVX] Add extra 'interleaved+lanepermute' shuffle test	Simon Pilgrim	2018-01-17	1	-0/+51
\| \| \| \| \| \|	Possible missed opportunity to use 64-bit lane permute on AVX1 in lowerShuffleAsRepeatedMaskAndLanePermute llvm-svn: 322628
*	Allow usage of X86-prefixes as separate instrs.	Andrew V. Tischenko	2018-01-17	1	-1/+2
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D42102 llvm-svn: 322623
*	[MC] Fix -stack-size-section on ARM	Sean Eveson	2018-01-17	1	-0/+30
\| \| \| \| \| \| \| \|	Change symbol values in the stack_size section from being 8 bytes, to being a target dependent size. Differential Revision: https://reviews.llvm.org/D42108 llvm-svn: 322619
*	[X86][BTVER2] Fix scheduling of VCMPSD/VCMPSS instructions	Simon Pilgrim	2018-01-16	2	-4/+4
\| \| \| \| \| \|	For some reason they don't have a trailing i like the packed equivalents. llvm-svn: 322600
*	[GlobalISel][TableGen] Add support for SDNodeXForm	Volkan Keles	2018-01-16	1	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds CustomRenderer which renders the matched operands to the specified instruction. Targets can enable the matching of SDNodeXForm by adding a definition that inherits from GICustomOperandRenderer and GISDNodeXFormEquiv as follows. def gi_imm8 : GICustomOperandRenderer<"renderImm8">, GISDNodeXFormEquiv<imm8_xform>; Custom renderer functions should be of the form: void render(MachineInstrBuilder &MIB, const MachineInstr &I); Reviewers: dsanders, ab, rovka Reviewed By: dsanders Subscribers: kristof.beyls, javed.absar, llvm-commits, mgrang, qcolombet Differential Revision: https://reviews.llvm.org/D42012 llvm-svn: 322582
*	[X86][MMX] Accept UNDEF upper bits for MOVD GR32->MMX	Simon Pilgrim	2018-01-16	1	-78/+48
\| \| \| \|	llvm-svn: 322574
*	[X86][MMX] Improve MMX constant generation	Simon Pilgrim	2018-01-16	2	-23/+12
\| \| \| \| \| \|	Extend the MMX zero code to take any constant with zero'd upper 32-bits llvm-svn: 322553
*	[DebugInfo] Unify dumping of address ranges	Jonas Devlieghere	2018-01-16	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch unifies the printing of address ranges as [0x0, 0x1). rdar://34822059 Reviewers: aprantl, dblaikie Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D42056 llvm-svn: 322543
*	[BPF] Teach DAG2DAG AND elimination about load intrinsics	Yonghong Song	2018-01-16	1	-0/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As commented on the existing code: // The Reg operand should be a virtual register, which is defined // outside the current basic block. DAG combiner has done a pretty // good job in removing truncating inside a single basic block. However, when the Reg operand comes from bpf_load_[byte \| half \| word] intrinsics, the generic optimizer doesn't understand their results are zero extended, so these single basic block elimination opportunities were missed. Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 322534
*	[X86][MMX] Add support for MMX zero vector creation	Simon Pilgrim	2018-01-15	2	-34/+22
\| \| \| \| \| \| \| \| \| \|	As mentioned on PR35869, (and came up recently on D41517) we don't create a MMX zero register via the PXOR but instead perform a spill to stack from a XMM zero register. This patch adds support for direct MMX zero vector creation and should make it easier to add better constant vector creation in the future as well. Differential Revision: https://reviews.llvm.org/D41908 llvm-svn: 322525
*	[X86][SSE] Add custom execution domain fixing for ↵	Simon Pilgrim	2018-01-15	49	-1156/+821
\| \| \| \| \| \| \| \| \| \|	BLENDPD/BLENDPS/PBLENDD/PBLENDW (PR34873) Add support for custom execution domain fixing and implement support for BLENDPD/BLENDPS/PBLENDD/PBLENDW. Differential Revision: https://reviews.llvm.org/D42042 llvm-svn: 322524
*	[x86] add tests to show missed constant shrinking (PR35907); NFC	Sanjay Patel	2018-01-15	1	-4/+81
\| \| \| \|	llvm-svn: 322523
*	[x86] regenerate test checks; NFC	Sanjay Patel	2018-01-15	1	-7/+21
\| \| \| \|	llvm-svn: 322522
*	[x86] regenerate test checks; NFC	Sanjay Patel	2018-01-15	1	-127/+249
\| \| \| \|	llvm-svn: 322521
*	[x86] regenerate test checks; NFC	Sanjay Patel	2018-01-15	1	-4/+7
\| \| \| \|	llvm-svn: 322519
*	[AMDGPU] Add HW_REG_SH_MEM_BASES symbolic name for s_getreg_b32	Stanislav Mekhanoshin	2018-01-15	1	-3/+3
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D41617 llvm-svn: 322500
*	[Hexagon] Rewrite LowerVECTOR_SHUFFLE for 32-/64-bit vectors	Krzysztof Parzyszek	2018-01-15	2	-0/+152
\| \| \| \| \| \| \|	The old implementation was not always correct. The new one recognizes more shuffles that match specific instructions. llvm-svn: 322498
*	[SystemZ] Check for legality before doing LOAD AND TEST transformations.	Jonas Paulsson	2018-01-15	1	-0/+52
\| \| \| \| \| \| \| \| \| \|	Since a load and test instruction treats its operands as signed, it can only replace a logical compare for EQ/NE uses. Review: Ulrich Weigand https://bugs.llvm.org/show_bug.cgi?id=35662 llvm-svn: 322488
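The EQ/NE restriction follows from how signed and unsigned (logical) comparisons against zero differ. The sketch below is a small Python model with hypothetical function names, assuming 32-bit operands. It does not reproduce the SystemZ transformation itself.

```python
def to_signed(value, bits=32):
    """Reinterpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def logical_compare_with_zero(value, bits=32):
    """Unsigned comparison against 0: the result is 'EQ' or 'GT', never 'LT'."""
    value &= (1 << bits) - 1
    return "EQ" if value == 0 else "GT"


def signed_compare_with_zero(value, bits=32):
    """Signed comparison against 0, as a load-and-test style instruction sees it."""
    signed = to_signed(value, bits)
    if signed == 0:
        return "EQ"
    return "LT" if signed < 0 else "GT"


samples = [0x0, 0x1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
# Both comparisons always agree on whether the value is zero ...
agree_on_equality = all(
    (logical_compare_with_zero(v) == "EQ") == (signed_compare_with_zero(v) == "EQ")
    for v in samples
)
# ... but they disagree on ordering once the top bit is set.
top_bit_logical = logical_compare_with_zero(0x80000000)
top_bit_signed = signed_compare_with_zero(0x80000000)
```

A user that only tests for equality gets the same answer from either comparison, while a user that tests ordering would see a different result for any value with the sign bit set.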
*	Update BTVER2 sched numbers for some AVX instructions (xmm version).	Andrew V. Tischenko	2018-01-15	3	-29/+29
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D40067 llvm-svn: 322485
*	Revert "[DAG] Elide overlapping stores"	Benjamin Kramer	2018-01-15	1	-2/+3
\| \| \| \| \| \| \|	This reverts commit r322085. Internal PPC testing is still showing the same symptoms as when this patch landed the last time. llvm-svn: 322474
*	[X86][SSE] Tag PR21137 test case	Simon Pilgrim	2018-01-14	1	-1/+2
\| \| \| \| \| \|	The test was added ages ago, but we didn't comment where it came from. llvm-svn: 322465