bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[x86] rename test file and auto-generate complete checks; NFC	Sanjay Patel	2017-06-23	2	-23/+35
	The command-line params override the target setting in the file itself, so delete that. Also, remove the cpu and arch because those don't matter and neither does the OS specification in the triple. llvm-svn: 306109
*	[X86][AVX] Extended vector average tests	Simon Pilgrim	2017-06-23	1	-411/+917
	Added AVX1 tests and merged AVX1/AVX2/AVX512 checks where possible. llvm-svn: 306107
*	[SystemZ] Fix trap issue and enable expensive checks.	Jonas Paulsson	2017-06-23	3	-10/+26
	The isBarrier/isTerminator flags have been removed from the SystemZ trap instructions, so that tests do not fail with EXPENSIVE_CHECKS. This was just an issue at -O0 and did not affect code output on benchmarks. (As Eli pointed out: "targets are split over whether they consider their "trap" a terminator; x86, AArch64, and NVPTX don't, but ARM, MIPS, PPC, and SystemZ do. We should probably try to be consistent here." This is still the case, although SystemZ has switched sides.) SystemZ now returns true in isMachineVerifierClean() :-) These Generic tests have been modified so that they can be run with or without EXPENSIVE_CHECKS: CodeGen/Generic/llc-start-stop.ll and CodeGen/Generic/print-machineinstrs.ll. Review: Ulrich Weigand, Simon Pilgrim, Eli Friedman https://bugs.llvm.org/show_bug.cgi?id=33047 https://reviews.llvm.org/D34143 llvm-svn: 306106
*	[X86][SSE] Dropped -mcpu from vector average tests	Simon Pilgrim	2017-06-23	1	-645/+686
	Use triple and attribute only for consistency. llvm-svn: 306104
*	[X86][SSE] Dropped -mcpu from scalar math tests	Simon Pilgrim	2017-06-23	1	-6/+4
	Use triple and attribute only for consistency. llvm-svn: 306097
*	[X86][SSE] Dropped -mcpu from insertps tests	Simon Pilgrim	2017-06-23	1	-3/+3
	Use triple and attribute only for consistency. llvm-svn: 306092
*	[mips][msa] Splat.d endianness check	Stefan Maksimovic	2017-06-23	1	-0/+21
	Before this change, it was always the first element of a vector that got splatted, since the lower 6 bits of vshf.d $wd were always zero for little endian. Additionally, masking is now performed for the vshf through which splat.d is created. Vshf has a property where, if an element of its first operand has either bit 6 or 7 set, the destination element is set to zero. The index was initially masked with 63 to avoid this property, which would result in generation of and.v + vshf.d in all cases. Masking with one results in generating a single splati.d instruction when possible. Differential Revision: https://reviews.llvm.org/D32216 llvm-svn: 306090
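The masking argument above can be sketched with a small value-level model. This is only an illustration under stated assumptions: `vshf_d` and `splat_d` are hypothetical helper names, the vector is modelled as a two-element Python list, and the order in which the two source vectors are concatenated is assumed (it does not matter here because both sources are the same vector). It models the zeroing property described in the commit message, not the instruction selection itself.

```python
def vshf_d(ctrl, ws, wt):
    """Value-level model of a two-lane vshf.d.

    ctrl holds one control value per destination lane. The low 6 bits select
    an element from the concatenation of the sources; if bit 6 or bit 7 is
    set, the destination lane is zero (the property the commit describes).
    """
    concat = wt + ws  # assumed order; irrelevant when ws == wt
    out = []
    for c in ctrl:
        if c & 0xC0:                      # bit 6 or bit 7 set
            out.append(0)
        else:
            out.append(concat[(c & 0x3F) % len(concat)])
    return out


def splat_d(vec, idx, mask):
    """Splat vec[idx] to both lanes, masking the index before the shuffle."""
    ctrl = [idx & mask] * len(vec)
    return vshf_d(ctrl, vec, vec)


vec = [10, 20]
for idx in range(256):
    # Masking with 63 and masking with 1 select the same element,
    # because a two-element vector only has indices 0 and 1.
    assert splat_d(vec, idx, 63) == splat_d(vec, idx, 1)
```

In this model the mask of one already clears bits 6 and 7, so the zeroing property can never trigger, and the masked index is always a valid lane number for a two-element vector.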
*	[x86] add/sub (X==0) --> sbb(cmp X, 1)	Sanjay Patel	2017-06-22	1	-6/+3
	This is very similar to the transform in: https://reviews.llvm.org/rL306040 ...but in this case, we use cmp X, 1 to set the carry bit as needed. Again, we can show that all of these are logically equivalent (although InstCombine currently canonicalizes to a form not seen here), and if we believe IACA, then this is the smallest/fastest code. E.g., with SNB:
	| Num Of |         Ports pressure in cycles         |    |
	|  Uops  | 0 - DV |  1  | 2 - D | 3 - D |  4  |  5  |    |
	----------------------------------------------------------
	|   1    |  1.0   |     |       |       |     |     |    | cmp edi, 0x1
	|   2    |        | 1.0 |       |       |     | 1.0 | CP | sbb eax, eax
	The larger motivation is to clean up all select-of-constants combining/lowering because we're missing some common cases. llvm-svn: 306072
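A minimal sketch of why the two-instruction sequence in the listing yields an all-ones mask exactly when X is zero. The helper names (`cmp_carry`, `sbb_same`, `is_zero_mask`) are hypothetical and registers are modelled as 32-bit unsigned Python integers; this checks the arithmetic only and says nothing about what the backend emits in other contexts.

```python
MASK = 0xFFFFFFFF  # model 32-bit registers


def cmp_carry(a, b):
    """Carry flag after `cmp a, b`: set iff a < b as unsigned values."""
    return int((a & MASK) < (b & MASK))


def sbb_same(cf):
    """`sbb eax, eax` computes eax - eax - CF, i.e. -CF, whatever eax held."""
    return (-cf) & MASK


def is_zero_mask(x):
    """cmp edi, 1 ; sbb eax, eax"""
    cf = cmp_carry(x, 1)  # x < 1 (unsigned) holds only for x == 0
    return sbb_same(cf)


for x in (0, 1, 2, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF):
    expected = MASK if x == 0 else 0
    assert is_zero_mask(x) == expected
```

Because the result is the sign-extended form of (X == 0), adding `zext(X == 0)` to a value is the same as subtracting this mask from it, modulo 2^32.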
*	Supported lowerInterleavedStore() in X86InterleavedAccess.	Farhana Aleen	2017-06-22	1	-60/+32
	Reviewers: RKSimon, DavidKreitzer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32658 llvm-svn: 306068
*	[x86] add more tests for select --> sbb transform; NFC	Sanjay Patel	2017-06-22	1	-4/+61
	These are siblings of the tests added with r306032. llvm-svn: 306064
*	[WebAssembly] WebAssemblyFastISel getelementptr variable index support	Jacob Gravelle	2017-06-22	1	-0/+100
	Summary: Previously -fast-isel getelementptr would constant-fold non-constant i8 load/stores. Reviewers: sunfish Subscribers: jfb, dschuff, sbc100, llvm-commits Differential Revision: https://reviews.llvm.org/D34044 llvm-svn: 306060
*	[Hexagon] Properly update kill flags in HexagonNewValueJump	Krzysztof Parzyszek	2017-06-22	1	-0/+53
	The feeder instruction will be moved to right before the compare, so the updating code should not be looking for kills past the compare. llvm-svn: 306059
*	[Hexagon] Use LivePhysRegs to fix up kills in HexagonGenMux	Krzysztof Parzyszek	2017-06-22	3	-2/+33
	Remove the previous, manual shuffling of the kill flags. llvm-svn: 306054
*	[AVX-512] Remove and autoupgrade the masked integer compare intrinsics	Craig Topper	2017-06-22	8	-2704/+4230
	Summary: These intrinsics aren't used by clang and haven't been for a while. There's some really terrible codegen in the 32-bit target for avx512bw due to i64 not being legal. But as I said, these intrinsics weren't used by clang even before this patch, so this codegen reflects our clang behavior today. Reviewers: spatel, RKSimon, zvi, igorb Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34389 llvm-svn: 306047
*	[x86] add/sub (X==0) --> sbb(neg X)	Sanjay Patel	2017-06-22	1	-9/+6
	Our handling of select-of-constants is lumpy in IR (https://reviews.llvm.org/D24480), lumpy in DAGCombiner, and lumpy in X86ISelLowering. That's why we only had the 'sbb' codegen in 1 out of the 4 tests. This is a step towards smoothing that out. First, show that all of these IR forms are equivalent: http://rise4fun.com/Alive/mx Second, show that the 'sbb' version is faster/smaller. IACA output for SandyBridge (later Intel and AMD chips are similar based on Agner's tables):
	This is the "obvious" x86 codegen (what gcc appears to produce currently):
	| Num Of |         Ports pressure in cycles         |    |
	|  Uops  | 0 - DV |  1  | 2 - D | 3 - D |  4  |  5  |    |
	----------------------------------------------------------
	|   1*   |        |     |       |       |     |     |    | xor eax, eax
	|   1    |  1.0   |     |       |       |     |     | CP | test edi, edi
	|   1    |        |     |       |       |     | 1.0 | CP | setnz al
	|   1    |        | 1.0 |       |       |     |     | CP | neg eax
	This is the adc version:
	|   1*   |        |     |       |       |     |     |    | xor eax, eax
	|   1    |  1.0   |     |       |       |     |     | CP | cmp edi, 0x1
	|   2    |        | 1.0 |       |       |     | 1.0 | CP | adc eax, 0xffffffff
	And this is sbb:
	|   1    |  1.0   |     |       |       |     |     |    | neg edi
	|   2    |        | 1.0 |       |       |     | 1.0 | CP | sbb eax, eax
	If IACA is trustworthy, then sbb became a single uop in Broadwell, so this will be clearly better than the alternatives going forward. llvm-svn: 306040
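The three instruction sequences listed in this message can be checked against each other with a small flag-level model. This is a sketch under assumptions: the function names are hypothetical, registers are modelled as 32-bit unsigned Python integers, and only the carry flag is tracked. It illustrates that the sequences agree on the result; it does not reproduce the Alive proof linked in the message.

```python
MASK = 0xFFFFFFFF  # model 32-bit registers


def neg(x):
    """x86 NEG: returns (result, CF); CF is set iff the source is nonzero."""
    return (-x) & MASK, int((x & MASK) != 0)


def cmp_carry(a, b):
    """Carry flag after `cmp a, b`: set iff a < b as unsigned values."""
    return int((a & MASK) < (b & MASK))


def adc(dst, src, cf):
    """x86 ADC result: dst + src + CF (output flags ignored)."""
    return (dst + src + cf) & MASK


def sbb(dst, src, cf):
    """x86 SBB result: dst - src - CF (output flags ignored)."""
    return (dst - src - cf) & MASK


def via_setnz(x):
    # xor eax, eax ; test edi, edi ; setnz al ; neg eax
    eax = int((x & MASK) != 0)
    eax, _ = neg(eax)
    return eax


def via_adc(x):
    # xor eax, eax ; cmp edi, 1 ; adc eax, 0xffffffff
    return adc(0, 0xFFFFFFFF, cmp_carry(x, 1))


def via_sbb(x, eax=0x12345678):
    # neg edi ; sbb eax, eax   (the incoming eax value is arbitrary)
    _, cf = neg(x)
    return sbb(eax, eax, cf)


for x in (0, 1, 2, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF):
    assert via_setnz(x) == via_adc(x) == via_sbb(x)
```

All three produce 0 for X == 0 and 0xFFFFFFFF otherwise. The sbb form needs no zeroing of eax beforehand because `sbb eax, eax` cancels the register against itself and leaves only the negated carry.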
*	[x86] add tests for select --> sbb transform; NFC	Sanjay Patel	2017-06-22	1	-0/+62
	llvm-svn: 306032
*	[AMDGPU] Add intrinsics for tbuffer load and store	David Stuttard	2017-06-22	8	-25/+275
	An intrinsic already existed for llvm.SI.tbuffer.store. Needed tbuffer.load, and also re-implemented the intrinsic as llvm.amdgcn.tbuffer.*. Added CodeGen tests for the 2 new variants added. Left the original llvm.SI.tbuffer.store implementation to avoid issues with existing code. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, tpr Differential Revision: https://reviews.llvm.org/D30687 llvm-svn: 306031
*	[Hexagon] Fix typo in a testcase	Krzysztof Parzyszek	2017-06-22	1	-1/+1
	llvm-svn: 306030
*	[Hexagon] Handle a global operand to A2_addi when creating duplexes	Krzysztof Parzyszek	2017-06-22	1	-0/+22
	llvm-svn: 306012
*	[X86] Add support for "probe-stack" attribute	whitequark	2017-06-22	2	-0/+50
	This commit adds prologue code emission for stack probe function calls. Reviewed By: majnemer Differential Revision: https://reviews.llvm.org/D34387 llvm-svn: 306010
*	[Hexagon] Recognize potential offset overflow for store-imm to stack	Krzysztof Parzyszek	2017-06-22	1	-0/+151
	Reserve an extra scavenging stack slot if the offset field in store-immediate instructions may overflow. llvm-svn: 306004
*	[AMDGPU] SDWA: remove support for VOP2 instructions that have only 64-bit encoding	Sam Kolton	2017-06-22	1	-0/+61
	Summary: Although these instructions are listed in VOP2, they are treated as VOP3 in specs. They should not support SDWA. There are no real instructions for them, but there are pseudo instructions. Reviewers: arsenm, vpykhtin, cfang Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34403 llvm-svn: 305999
*	Don't conditionalize Neon instructions, even in IT blocks.	Kristof Beyls	2017-06-22	4	-35/+59
	This has been deprecated since ARMARM v7-AR, release C.b, published back in 2012. This also removes test/CodeGen/Thumb2/ifcvt-neon.ll that originally was introduced to check that conditionalization of Neon instructions did happen when generating Thumb2. However, the test had evolved and was no longer testing that. Rather than trying to adapt that test, this commit introduces test/CodeGen/Thumb2/ifcvt-neon-deprecated.mir, since we can now use the MIR framework to write nicer/more maintainable tests. llvm-svn: 305998
*	[GlobalISel][X86] Support vector type G_INSERT legalization/selection.	Igor Breger	2017-06-22	4	-0/+543
	Summary: Support vector type G_INSERT legalization/selection. Split from https://reviews.llvm.org/D33665 Reviewers: qcolombet, t.p.northover, zvi, guyblank Reviewed By: guyblank Subscribers: guyblank, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D33956 llvm-svn: 305989
*	[ARM] Add macro fusion for AES instructions.	Florian Hahn	2017-06-22	1	-0/+203
	Summary: This patch adds a macro fusion using CodeGen/MacroFusion.cpp to pair AES instructions back to back and adds FeatureFuseAES to enable the feature. Reviewers: evandro, javed.absar, rengolin, t.p.northover Reviewed By: javed.absar Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34142 llvm-svn: 305988
*	AVX-512: Lowering Masked Gather intrinsic - fixed a bug	Elena Demikhovsky	2017-06-22	1	-8/+61
	Masked gather for vector length 2 is lowered incorrectly for element type i32. The type <2 x i32> was automatically extended to <2 x i64> and we generated VPGATHERQQ instead of VPGATHERQD. The type <2 x float> is extended to <4 x float>, so there is no bug for this type, but the sequence could be more optimal. In this patch I'm fixing the <2 x i32> bug and optimizing the <2 x float> sequence for GATHERs only. The same fix should be done for Scatters as well. Differential revision: https://reviews.llvm.org/D34343 llvm-svn: 305987
*	[AMDGPU] SDWA: add support for GFX9 in peephole pass	Sam Kolton	2017-06-22	6	-77/+220
	Summary: Added support based on merged SDWA pseudo instructions. The peephole now allows one scalar operand and the omod and clamp modifiers. Added several subtarget features for GFX9 SDWA. This diff also contains changes from D34026. Depends D34026 Reviewers: vpykhtin, rampitec, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34241 llvm-svn: 305986
*	Revert "[Target] Implement the ".rdata" MIPS assembly directive."	Davide Italiano	2017-06-22	1	-13/+0
	This reverts commits r305949 and r305950 as they didn't have the correct commit message. llvm-svn: 305973
*	[AMDGPU] Add FP_CLASS to the add/setcc combine	Stanislav Mekhanoshin	2017-06-21	1	-0/+36
	This is one of the nodes which also compile as v_cmp_*. Differential Revision: https://reviews.llvm.org/D34485 llvm-svn: 305970
*	[AMDGPU] Combine add and adde, sub and sube	Stanislav Mekhanoshin	2017-06-21	1	-0/+80
	If one of the arguments of adde/sube is zero, we can fold another add/sub into it. Differential Revision: https://reviews.llvm.org/D34374 llvm-svn: 305964
*	[AMDGPU] simplify add x, *ext (setcc) => addc|subb x, 0, setcc	Stanislav Mekhanoshin	2017-06-21	1	-0/+43
	This simplification avoids generating v_cndmask_b32 to serialize the condition code between compare and use. Differential Revision: https://reviews.llvm.org/D34300 llvm-svn: 305962
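The arithmetic identity behind this simplification can be sketched as follows. The helpers `addc` and `subb` here are hypothetical 32-bit models of an add-with-carry-in and a subtract-with-borrow-in, named after the nodes in the commit title; they are not the actual DAG nodes or AMDGPU instructions.

```python
MASK = 0xFFFFFFFF  # model 32-bit values


def addc(x, y, carry_in):
    """Add with carry-in: x + y + carry."""
    return (x + y + carry_in) & MASK


def subb(x, y, borrow_in):
    """Subtract with borrow-in: x - y - borrow."""
    return (x - y - borrow_in) & MASK


def add_zext(x, cond):
    """add x, zext(setcc): the condition contributes 0 or 1."""
    return (x + int(cond)) & MASK


def add_sext(x, cond):
    """add x, sext(setcc): the condition contributes 0 or -1 (all ones)."""
    return (x + (MASK if cond else 0)) & MASK


for x in (0, 1, 41, 0x7FFFFFFF, 0xFFFFFFFF):
    for cond in (False, True):
        assert add_zext(x, cond) == addc(x, 0, int(cond))
        assert add_sext(x, cond) == subb(x, 0, int(cond))
```

Feeding the condition in directly as the carry or borrow is what removes the need to first materialize it as a 0/1 or 0/-1 register value.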
*	Add Aarch64 ldst-opt test.	Nirav Dave	2017-06-21	1	-0/+60
	llvm-svn: 305951
*	[Target/Mips] Add test associated with r305949.	Davide Italiano	2017-06-21	1	-0/+13
	llvm-svn: 305950
*	[Solaris] emit .init_array instead of .ctors on Solaris (Sparc/x86)	Davide Italiano	2017-06-21	2	-0/+31
	Patch by Fedor Sergeev. Differential Revision: https://reviews.llvm.org/D33868 llvm-svn: 305948
*	[Hexagon] Handle more types of immediate operands in expand-condsets	Krzysztof Parzyszek	2017-06-21	1	-0/+22
	llvm-svn: 305943
*	[PowerPC] define target hook isReallyTriviallyReMaterializable()	Lei Huang	2017-06-21	1	-0/+179
	Define target hook isReallyTriviallyReMaterializable() to explicitly specify PowerPC instructions that are trivially rematerializable. This will allow the MachineLICM pass to accurately identify PPC instructions that should always be hoisted. Differential Revision: https://reviews.llvm.org/D34255 llvm-svn: 305932
*	[AARCH64][LSE] Preliminary support for ARMv8.1 LSE Atomics.	Christof Douma	2017-06-21	1	-0/+683
	Added test file for ARMv8.1 LSE Atomics that I forgot to include in commit r305893. Patch by Ananth Jasty. Differential Revision: https://reviews.llvm.org/D33586 Change-Id: Ic1ad8ed87c1b584c4c791b459a686c866a3c3087 llvm-svn: 305918
*	[X86][SSE] Dropped -mcpu from 256-bit vector shuffle tests	Simon Pilgrim	2017-06-21	4	-20/+12
	Use triple and attribute only for consistency. llvm-svn: 305916
*	[X86][SSE] Dropped -mcpu from 128-bit vector shuffle tests	Simon Pilgrim	2017-06-21	4	-38/+26
	Use triple and attribute only for consistency. llvm-svn: 305913
*	[X86][SSE] Regenerate merge store tests	Simon Pilgrim	2017-06-21	1	-15/+17
	llvm-svn: 305910
*	[X86][SSE] Dropped -mcpu from vector blend shuffle tests and regenerate	Simon Pilgrim	2017-06-21	1	-54/+20
	Use triple and attribute only for consistency. llvm-svn: 305909
*	[X86][SSE] Dropped -mcpu from vector shuffle tests	Simon Pilgrim	2017-06-21	4	-14/+24
	Use triple and attribute only for consistency. llvm-svn: 305908
*	[X86][SSE] Dropped -mcpu from vector zero extend tests	Simon Pilgrim	2017-06-21	1	-7/+5
	Use triple and attribute only for consistency. llvm-svn: 305907
*	[X86][SSE] Dropped -mcpu from variable shuffle tests	Simon Pilgrim	2017-06-21	2	-8/+7
	Use triple and attribute only for consistency. llvm-svn: 305906
*	[X86][AVX] Add AVX1 shuffle truncation tests	Simon Pilgrim	2017-06-21	1	-107/+234
	llvm-svn: 305905
*	[X86][SSE] Add SSE2/SSE42 shuffle truncation tests	Simon Pilgrim	2017-06-21	1	-0/+156
	llvm-svn: 305904
*	[X86] Rerun the update_llc_test_checks tool on test. NFC.	Zvi Rackover	2017-06-21	1	-0/+8
	llvm-svn: 305897
*	[MIPS] Fix for selecting of DINS/INS instruction	Strahinja Petrovic	2017-06-21	1	-6/+33
	This patch adds one more condition to the selection of the DINS/INS instruction, which fixes MultiSource/Applications/JM/ldecod/ for mips32r2 (and mips64r2 n32 abi). Differential Revision: https://reviews.llvm.org/D33725 llvm-svn: 305888
*	[AArch64] Preserve register flags when promoting a load from store.	Florian Hahn	2017-06-21	1	-1/+19
	Summary: This patch updates promoteLoadFromStore to use the store MachineOperand as the source operand of the new instruction instead of creating a new register MachineOperand. This way, the existing register flags are preserved. This fixes PR33468 (https://bugs.llvm.org/show_bug.cgi?id=33468). Reviewers: MatzeB, t.p.northover, junbuml Reviewed By: MatzeB Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34402 llvm-svn: 305885
*	[DAGCombiner] Add another combine from build vector to shuffle	Guy Blank	2017-06-21	2	-37/+13
	Add support for combining a build vector to a shuffle when the build vector is made of elements extracted from 2 vectors (vec1, vec2), where vec2 is half the size of vec1. llvm-svn: 305883