bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Adding a shufflevector and select LLVM IR instructions fuzz tool	Ayman Musa	2017-10-31	1	-0/+404
	Based on a similar Python tool, utils/shuffle-fuzz.py, this tool extends its predecessor by optionally attaching a select instruction to the generated shufflevector instructions. It was developed mainly to perform exhaustive testing of the X86 AVX512 masked shuffle instructions, but it can be used for various other targets as well. The general design of the implementation is much more modular than the original shuffle_fuzz.py tool, which makes it easier for anyone to extend it further. Differential Revision: https://reviews.llvm.org/D38031 Change-Id: I0efc2aaa091b61a8a9552311c21cc77916a97111 llvm-svn: 316989
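As a rough illustration of the idea described in this commit, the following sketch emits one shufflevector whose result feeds a select. It is not the tool added by the commit; the function name `random_masked_shuffle`, its parameters, and the fixed argument names are assumptions made for this example.

```python
import random


def random_masked_shuffle(width=4, elem="i32", seed=0):
    """Return LLVM IR text for one shufflevector followed by a select.

    Hypothetical helper for illustration only; the real fuzz tool is
    considerably more configurable.
    """
    rng = random.Random(seed)
    vec = f"<{width} x {elem}>"
    # Each mask index picks one of the 2 * width lanes of the two inputs.
    mask = ", ".join(f"i32 {rng.randrange(2 * width)}" for _ in range(width))
    return "\n".join([
        f"define {vec} @test({vec} %a, {vec} %b, {vec} %passthru, <{width} x i1> %k) {{",
        f"  %shuf = shufflevector {vec} %a, {vec} %b, <{width} x i32> <{mask}>",
        # The select models a masked shuffle: lanes with a false mask bit
        # keep the pass-through value.
        f"  %res = select <{width} x i1> %k, {vec} %shuf, {vec} %passthru",
        f"  ret {vec} %res",
        "}",
    ])


if __name__ == "__main__":
    print(random_masked_shuffle())
```

Seeding the generator keeps a failing case reproducible, which matters when the output is fed to a compiler under test.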
*	[LoopUnroll] Clean up remarks for unroll remainder	David Green	2017-10-31	4	-33/+59
	The optimisation remarks for loop unrolling with an unrolled remainder look something like:
	    test.c:7:18: remark: completely unrolled loop with 3 iterations [-Rpass=loop-unroll]
	    C[i] += A[i*N+j];
	    ^
	    test.c:6:9: remark: unrolled loop by a factor of 4 with run-time trip count [-Rpass=loop-unroll]
	    for(int j = 0; j < N; j++)
	    ^
	This removes the first of the two messages. Differential revision: https://reviews.llvm.org/D38725 llvm-svn: 316986
*	[AVX512] Adding new patterns for extract_subvector of vXi1	Michael Zuckerman	2017-10-31	1	-14/+42
	Extracting a subvector of vXi1 from vYi1 is poorly supported by LLVM and most of the time ends with an assertion. This patch fixes this issue by adding new patterns to the TD file. Reviewers: 1. guyblank 2. igorb 3. zvi 4. ayman 5. craig.topper Differential Revision: https://reviews.llvm.org/D39292 Change-Id: Ideb4d7e946c8d40cfce2920891f2d89fe64c58f8 llvm-svn: 316981
*	[CGP] Fix the detection of trivial case for addressing mode	Serguei Katkov	2017-10-31	1	-10/+9
	The address can be presented as a bitcast of baseReg. In this case it is still trivial but OriginalValue != baseReg. llvm-svn: 316980
*	[IRCE][NFC] Rename fields of InductiveRangeCheck	Max Kazantsev	2017-10-31	3	-25/+25
	Rename `Offset`, `Scale`, `Length` into `Begin`, `Step`, `End` respectively to make naming of similar entities for Ranges and Range Checks more consistent. Differential Revision: https://reviews.llvm.org/D39414 llvm-svn: 316979
*	[X86] Make AVX512_512_SET0 XMM16-31 lower to 128-bit XOR when AVX512VL is ↵	Craig Topper	2017-10-31	10	-331/+320
	enabled. Use 128-bit VLX instruction when VLX is enabled. Unfortunately, this weakens our ability to do domain fixing when AVX512DQ is not enabled, but it is consistent with our 256-bit behavior. Maybe we should add custom handling to domain fixing to allow EVEX integer XOR/AND/OR/ANDN to switch to VEX encoded fp instructions if the high registers aren't being used? llvm-svn: 316978
*	[NFC] Get rid of variables used in assert only	Max Kazantsev	2017-10-31	1	-6/+6
	llvm-svn: 316977
*	[IndVarSimplify] Simplify code using preheader assumption	Philip Reames	2017-10-31	2	-44/+28
	As noted in the nice block comment, the previous code didn't actually handle multi-entry loops correctly; it just assumed SCEV didn't analyze such loops. Given SCEV has comments to the contrary, that seems a bit suspect. More importantly, the pass actually requires loopsimplify form, which ensures a loop preheader is available. Remove the excessive generality and shorten the code greatly. Note that we do successfully analyze many multi-entry loops, but we do so by converting them to single-entry loops. See the added test case. llvm-svn: 316976
*	Reapply "[GVN] Prevent LoadPRE from hoisting across instructions that don't ↵	Max Kazantsev	2017-10-31	6	-0/+533
	pass control flow to successors" This patch fixes the miscompile that happens when PRE hoists loads across guards and other instructions that don't always pass control flow to their successors. PRE is now prohibited from hoisting across such instructions because there is no guarantee that a load located after such an instruction is still valid before it. For example, a load from under a guard may be invalid before the guard in the following case:
	    int array[LEN];
	    ...
	    guard(0 <= index && index < LEN);
	    use(array[index]);
	Differential Revision: https://reviews.llvm.org/D37460 llvm-svn: 316975
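The hazard can be modeled outside the compiler. The sketch below is an analogy in Python, not LLVM code: `GuardFailed`, `guard`, `original`, and `hoisted` are names invented here, and Python's `IndexError` stands in for the invalid load.

```python
class GuardFailed(Exception):
    """Models a guard that does not pass control to its successor."""


def guard(condition):
    if not condition:
        raise GuardFailed()


def original(array, index):
    guard(0 <= index < len(array))
    return array[index]  # the load runs only if the guard passed


def hoisted(array, index):
    value = array[index]  # load moved above the guard: may be out of bounds
    guard(0 <= index < len(array))
    return value


if __name__ == "__main__":
    data = [10, 20, 30]
    for fn in (original, hoisted):
        try:
            fn(data, len(data))
        except (GuardFailed, IndexError) as exc:
            print(fn.__name__, "->", type(exc).__name__)
```

With an in-range index both versions agree; with an out-of-range index the original stops cleanly at the guard while the hoisted version performs the invalid access first.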
*	[SimplifyIndVar] Extract out invariant expression handling	Philip Reames	2017-10-31	1	-82/+107
	Previously, the code returned early from the function when it couldn't find a free expansion; it should be returning from the transform. I don't have a test case; I noticed this via inspection. As a follow up, I'm going to revisit the logic in the extract function. I think that essentially the whole helper routine can be replaced with SCEVExpander, but I wanted to do that in a series of separate commits. llvm-svn: 316974
*	[X86] Clang-format some code. NFC	Craig Topper	2017-10-31	1	-2/+8
	llvm-svn: 316973
*	[cmake] Make check_linker_flags operate via linker flags	Shoaib Meenai	2017-10-31	2	-3/+5
	`check_linker_flags` currently sets the compiler flags (via `CMAKE_REQUIRED_FLAGS`), and thus implicitly relies on cmake's default behavior of passing the compiler flags to the linker. This breaks when cmake's build rules have been altered to not pollute the link line with compiler flags (which can be desirable for build cleanliness). Instead, set `CMAKE_EXE_LINKER_FLAGS` explicitly and use `CMP0056` to ensure the linker flags are passed along. Additionally, since we're inside a function, we can just alter the variable directly (as the alteration will be limited to the scope of the function) rather than saving and restoring the old value. Differential Revision: https://reviews.llvm.org/D39431 llvm-svn: 316972
*	Undo accidental commit	Philip Reames	2017-10-31	2	-245/+82
	These files shouldn't have been submitted in 316967 llvm-svn: 316968
*	[CGP] Fix crash on i96 bit multiply	Philip Reames	2017-10-30	4	-83/+256
	Issue found by llvm-isel-fuzzer on OSS fuzz, https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3725 If anyone actually cares about > 64 bit arithmetic, there's a lot more to do in this area. There's a bunch of obviously wrong code in the same function. I don't have the time to fix all of them and am just using this to understand what the workflow for fixing fuzzer cases might look like. llvm-svn: 316967
*	Fix unused variable warnings. NFCI.	Simon Pilgrim	2017-10-30	1	-3/+0
	llvm-svn: 316964
*	[SelectionDAG] Tidyup computeKnownBits extension/truncation cases. NFCI.	Simon Pilgrim	2017-10-30	1	-17/+4
	We don't need to extend/truncate the Known structure before calling computeKnownBits - it will reset at the start of the function. llvm-svn: 316962
*	[AArch64]: range loopify frame-lowering	Javed Absar	2017-10-30	1	-2/+2
	llvm-svn: 316960
*	Fix -fuse-ld feature detection error.	Rui Ueyama	2017-10-30	1	-1/+4
	check_cxx_compiler_flag doesn't seem to try to link a program, so the existing code doesn't correctly detect the availability of a given linker. This patch uses check_cxx_source_compiles instead. I confirmed that cmake now reports this error for -DLLVM_USE_LINKER=foo:
	    Host compiler does not support '-fuse-ld=foo'
	Differential Revision: https://reviews.llvm.org/D39274 llvm-svn: 316958
*	InferAddressSpaces: Fix bug about replacing addrspacecast	Yaxun Liu	2017-10-30	2	-0/+18
	InferAddressSpaces assumes the pointee type of addrspacecast is the same as the operand, which is not always true and causes invalid IR. This bug causes a build failure in HCC. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39432 llvm-svn: 316957
*	[CMake] Fix linker detection in AddLLVM.cmake	Tim Shen	2017-10-30	1	-1/+6
	Fix the linker not being correctly detected when a custom one is specified through the LLVM_USE_LINKER CMake variable. In particular,
	    cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_USE_LINKER=gold ../llvm
	resulted in
	    Linker detection: GNU ld
	instead of
	    Linker detection: GNU Gold
	because the construction did not account for that variable. This led to general confusion and prevented setting linker-specific flags inside functions defined in AddLLVM.cmake. Thanks Oleksii Vilchanskyi for the patch! llvm-svn: 316956
*	[X86] Add AVX512 support to fast isel's X86ChooseCmpOpcode.	Craig Topper	2017-10-30	3	-2/+5
	llvm-svn: 316955
*	[NewGVN] Stop assuming PHI args ordering when looking at phi-of-ops.	Davide Italiano	2017-10-30	2	-1/+71
	It's not guaranteed. There's a bug open to sort them in predecessor order, but it won't happen anytime soon. In the meantime, passes will have to do an O(#preds) scan. Such is life. llvm-svn: 316953
*	Revert "[PowerPC] Try to simplify a Swap if it feeds a Splat"	Stefan Pintilie	2017-10-30	3	-183/+2
	Revert r316478. A test case has failed. Will recommit this change once we find and fix the failure. This reverts commit 7c330fabaedaba3d02c58bc3cc1198896c895f34. llvm-svn: 316952
*	Create instruction classes for identifying any atomicity of memory ↵	Daniel Neilson	2017-10-30	5	-221/+253
	intrinsic. (NFC) Summary: For reference, see: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116589.html This patch fleshes out the instruction class hierarchy with respect to atomic and non-atomic memory intrinsics. With this change, the relevant part of the class hierarchy becomes:
	    IntrinsicInst
	      -> MemIntrinsicBase (methods-only class)
	        -> MemIntrinsic (non-atomic intrinsics)
	          -> MemSetInst
	          -> MemTransferInst
	            -> MemCpyInst
	            -> MemMoveInst
	        -> AtomicMemIntrinsic (atomic intrinsics)
	          -> AtomicMemSetInst
	          -> AtomicMemTransferInst
	            -> AtomicMemCpyInst
	            -> AtomicMemMoveInst
	        -> AnyMemIntrinsic (both atomicities)
	          -> AnyMemSetInst
	          -> AnyMemTransferInst
	            -> AnyMemCpyInst
	            -> AnyMemMoveInst
	This involves some class renaming:
	    ElementUnorderedAtomicMemCpyInst -> AtomicMemCpyInst
	    ElementUnorderedAtomicMemMoveInst -> AtomicMemMoveInst
	    ElementUnorderedAtomicMemSetInst -> AtomicMemSetInst
	A script for doing this renaming in downstream trees is included below. An example of where the Any* classes should be used in LLVM is when reasoning about the effects of an instruction (ex: aliasing).
	---
	Script for renaming AtomicMem* classes:
	    PREFIXES="[<,([:space:]]"
	    CLASSES="MemIntrinsic|MemTransferInst|MemSetInst|MemMoveInst|MemCpyInst"
	    SUFFIXES="[;)>,[:space:]]"
	    REGEX="(${PREFIXES})ElementUnorderedAtomic(${CLASSES})(${SUFFIXES})"
	    REGEX2="visitElementUnorderedAtomic(${CLASSES})"
	    FILES=$( grep -E "(${REGEX}|${REGEX2})" -r . | tr ':' ' ' | awk '{print $1}' | sort | uniq )
	    SED_SCRIPT="s~${REGEX}~\1Atomic\2\3~g"
	    SED_SCRIPT2="s~${REGEX2}~visitAtomic\1~g"
	    for f in $FILES; do
	      echo "Processing: $f"
	      sed -i ".bak" -E "${SED_SCRIPT};${SED_SCRIPT2};${EA_SED_SCRIPT};${EA_SED_SCRIPT2}" $f
	    done
	Reviewers: sanjoy, deadalnix, apilipenko, anna, skatkov, mkazantsev Reviewed By: sanjoy Subscribers: hfinkel, jholewinski, arsenm, sdardis, nhaehnle, JDevlieghere, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38419 llvm-svn: 316950
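For downstream trees where GNU/BSD sed differences are a concern, the same two substitutions can be approximated in Python. This is a sketch, not part of the commit: `rename` is a name chosen here, and `\s` is used as a stand-in for the POSIX `[:space:]` class.

```python
import re

CLASSES = r"MemIntrinsic|MemTransferInst|MemSetInst|MemMoveInst|MemCpyInst"

# Mirrors REGEX from the shell script: a delimiter, the old class name,
# then a delimiter. The delimiters are captured so they can be restored.
TYPE_RE = re.compile(
    r"([<,(\s])ElementUnorderedAtomic(" + CLASSES + r")([;)>,\s])"
)
# Mirrors REGEX2: InstVisitor-style visit methods.
VISIT_RE = re.compile(r"visitElementUnorderedAtomic(" + CLASSES + r")")


def rename(source):
    """Apply both renames to a string of source code."""
    source = TYPE_RE.sub(r"\1Atomic\2\3", source)
    return VISIT_RE.sub(r"visitAtomic\1", source)


if __name__ == "__main__":
    line = "void visitElementUnorderedAtomicMemCpyInst(ElementUnorderedAtomicMemCpyInst &I);"
    print(rename(line))
```

Because the type pattern requires a delimiter on both sides, identifiers that merely contain the old name as a substring are left alone.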
*	[GVNHoist] Fix non-deterministic sort order of PHIs for identical instructions	Mandeep Singh Grang	2017-10-30	1	-1/+1
	Summary: This fixes failure in Transforms/GVNHoist/hoist.ll uncovered by D39245. Reviewers: hiraditya, spop, dberlin Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39410 llvm-svn: 316949
*	[SelectionDAG] Add VSELECT demanded elts support to computeKnownBits	Simon Pilgrim	2017-10-30	2	-24/+8
	llvm-svn: 316947
*	X86 Tests: Update the variable-index permute tests with FP types. NFC.	Zvi Rackover	2017-10-30	3	-0/+456
	These cases will be addressed in a future update to D39126. llvm-svn: 316946
*	[X86][SSE] Add another computeKnownBits test showing missing VSELECT ↵	Simon Pilgrim	2017-10-30	1	-0/+44
	demandedelts support llvm-svn: 316945
*	[SelectionDAG] Add VSELECT support to computeKnownBits	Simon Pilgrim	2017-10-30	3	-29/+30
	llvm-svn: 316944
*	[X86][SSE] computeKnownBits tests showing missing VSELECT demandedelts support	Simon Pilgrim	2017-10-30	1	-0/+48
	llvm-svn: 316940
*	[X86][AVX512] Cleanup scheduler tests - split GENERIC and SKX targets	Simon Pilgrim	2017-10-30	2	-8923/+16710
	llvm-svn: 316938
*	[SelectionDAG] Add SELECT demanded elts support to ComputeNumSignBits	Simon Pilgrim	2017-10-30	2	-40/+12
	llvm-svn: 316933
*	[X86][SSE] ComputeNumSignBits tests showing missing VSELECT demandedelts ↵	Simon Pilgrim	2017-10-30	1	-0/+107
	support llvm-svn: 316932
*	[MC] Split out register def/use idx calls to make debugging simpler. NFCI.	Simon Pilgrim	2017-10-30	1	-3/+4
	llvm-svn: 316927
*	[X86][AVX] Add missing vcvtpd2dq/vcvtps2dq scheduling tests	Simon Pilgrim	2017-10-30	1	-14/+142
	llvm-svn: 316926
*	[X86][SSE] Add clflush scheduling test	Simon Pilgrim	2017-10-30	1	-0/+61
	llvm-svn: 316925
*	[X86][AVX512] Adding a pattern for broadcastm intrinsic.	Jina Nahias	2017-10-30	2	-57/+73
	Differential Revision: https://reviews.llvm.org/D38312 Change-Id: I71c8605a8e4c98013ef25289694afc5cfd46bb0b llvm-svn: 316921
*	Move isDSOLocal check and add a comment.	Rafael Espindola	2017-10-30	1	-2/+12
	llvm-svn: 316920
*	[PPC CodeGen] Fix the bitreverse.i64 intrinsic.	Fangrui Song	2017-10-30	3	-103/+66
	Summary: The two 32-bit words were swapped. Update a test omitted in reverted r316270. Reviewers: jtony, aaron.ballman Subscribers: nemanjai, kbarton Differential Revision: https://reviews.llvm.org/D39163 llvm-svn: 316916
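To see why word order matters, here is a small model of a 64-bit bit reversal built from two 32-bit reversals. It is only an arithmetic illustration, not the PowerPC lowering; the function names are invented, and the "swapped" variant is an assumed reading of the bug described above.

```python
MASK32 = 0xFFFFFFFF


def bitreverse32(x):
    """Reverse the bits of a 32-bit value."""
    return int(f"{x & MASK32:032b}"[::-1], 2)


def bitreverse64(x):
    """Reverse each half, then exchange the halves."""
    lo, hi = x & MASK32, (x >> 32) & MASK32
    # The reversed low word becomes the new high word, and vice versa.
    return (bitreverse32(lo) << 32) | bitreverse32(hi)


def bitreverse64_words_swapped(x):
    """Assumed model of the bug: halves reversed but left in place."""
    lo, hi = x & MASK32, (x >> 32) & MASK32
    return (bitreverse32(hi) << 32) | bitreverse32(lo)


if __name__ == "__main__":
    print(hex(bitreverse64(1)))
    print(hex(bitreverse64_words_swapped(1)))
```

Reversing bit 0 should produce bit 63; if the halves are not exchanged, it lands on bit 31 instead.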
*	[X86] Make sure we don't create locked inc/dec instructions when the carry ↵	Craig Topper	2017-10-30	5	-47/+126
	flag is being used. Summary: INC/DEC don't update the carry flag so we need to make sure we don't try to use it. This patch introduces new X86ISD opcodes for locked INC/DEC. Teaches lowerAtomicArithWithLOCK to emit these nodes if INC/DEC is not slow or the function is being optimized for size. An additional flag is added that allows the INC/DEC to be disabled if the caller determines that the carry flag is being requested. The test_sub_1_cmp_1_setcc_ugt test is currently showing this bug. The other test case changes are recovering cases that were regressed in r316860. This should fully fix PR35068 finishing the fix started in r316860. Reviewers: RKSimon, zvi, spatel Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39411 llvm-svn: 316913
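The flag difference behind this fix can be shown with a tiny model of the two instructions. This is a simplified sketch covering only the result and the carry flag for 32-bit operands; `sub1` and `dec` are names chosen for illustration.

```python
MASK32 = 0xFFFFFFFF


def sub1(value, carry_in):
    """Model of `sub $1, x`: CF is set when the subtraction borrows."""
    return (value - 1) & MASK32, value < 1


def dec(value, carry_in):
    """Model of `dec x`: same result, but CF keeps its previous value."""
    return (value - 1) & MASK32, carry_in


if __name__ == "__main__":
    print(sub1(0, False))
    print(dec(0, False))
```

An unsigned comparison that reads CF after the operation therefore sees the borrow with `sub` but a stale flag with `dec`, which is why the substitution is only safe when the carry flag is unused.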
*	[X86] Remove AVX512 early out from X86FastISel::X86SelectCmp.	Craig Topper	2017-10-30	5	-130/+152
	This shouldn't be needed anymore since i1 isn't a legal type. llvm-svn: 316912
*	[X86] Regenerate test using update_llc_test_checks.py	Craig Topper	2017-10-30	1	-32/+439
	llvm-svn: 316911
*	[PassManager, SimplifyCFG] add test for PR34603 / D38566; NFC	Sanjay Patel	2017-10-30	1	-1/+41
	Sinking common insts and converting to select early can inhibit better folds in other passes. llvm-svn: 316908
*	[AMDGPU] Emit metadata for hidden arguments for kernel enqueue	Yaxun Liu	2017-10-30	6	-9/+214
	Identify kernels which perform device-side kernel enqueues and emit metadata for the associated hidden kernel arguments. Such kernels are marked with the calls-enqueue-kernel function attribute by the AMDGPUOpenCLEnqueueKernelLowering pass, and later on the hidden kernel argument metadata HiddenDefaultQueue and HiddenCompletionAction are emitted for them. Differential Revision: https://reviews.llvm.org/D39255 llvm-svn: 316907
*	[CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).	Clement Courbet	2017-10-30	12	-295/+730
	- Targets that want to support memcmp expansions now return the list of supported load sizes.
	- Expansion codegen does not assume that all power-of-two load sizes smaller than the max load size are valid. For example, this is not the case for x86(32bit)+sse2.
	Fixes PR34887. llvm-svn: 316905
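The point of returning an explicit list can be illustrated with a greedy planner that only uses the sizes a target reports. This is a simplified sketch, not the ExpandMemCmp algorithm; `plan_loads` and the example size list (16, 4, 2, 1, with no 8-byte load) are assumptions loosely modeled on the case mentioned above.

```python
def plan_loads(size, supported_sizes):
    """Greedily cover `size` bytes using only the target's load sizes.

    Returns a list of (offset, load_size) pairs.
    """
    plan, offset = [], 0
    for load in sorted(supported_sizes, reverse=True):
        # Use the widest remaining load as often as it still fits.
        while size - offset >= load:
            plan.append((offset, load))
            offset += load
    if offset != size:
        raise ValueError("size cannot be covered by the supported loads")
    return plan


if __name__ == "__main__":
    # Hypothetical target: vector loads of 16 bytes, but no 8-byte load.
    print(plan_loads(15, [16, 4, 2, 1]))
```

A planner that assumed every power of two below 16 was available would have emitted an 8-byte load here, which this hypothetical target cannot perform.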
*	[Hexagon] Allow the RDF optimizations to be run in .mir testcases	Krzysztof Parzyszek	2017-10-30	2	-5/+7
	llvm-svn: 316904
*	[GlobalISel|ARM] : Allow legalizing G_FSUB	Javed Absar	2017-10-30	7	-8/+238
	Adding support for VSUB. Reviewed by: @rovka Differential Revision: https://reviews.llvm.org/D39261 llvm-svn: 316902
*	Invalid use of 'w' suffix on push and pop using 64-bit register.	Andrew V. Tischenko	2017-10-30	2	-5/+5
	Differential Revision: https://reviews.llvm.org/D38626 llvm-svn: 316898
*	[ARM GlobalISel] Fixup r316572. NFC	Diana Picus	2017-10-30	1	-9/+0
	Just missed a few spots... llvm-svn: 316897
*	Revert "[X86][AVX512] Adding a pattern for broadcastm intrinsic."	Jina Nahias	2017-10-30	2	-73/+57
	This reverts commit r316890. Change-Id: I683cceee9848ef309b452293086b1f26a941950d llvm-svn: 316894