bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[CodeGenPrepare] Check that erased sunken address are not reused	Simon Dardis	2017-11-13	1	-1/+6
	CodeGenPrepare sinks address computations from one basic block to another and attempts to reuse address computations that have already been sunk. If the same address computation appears twice, with the first instance as an operand of a load whose result is an operand of a simplifiable select, CodeGenPrepare simplifies the select and recursively erases the now-dead instructions. CodeGenPrepare then attempts to use the erased address computation for the second load.
	Fix this by erasing the cached address value if it has zero uses before looking for the address value in the sunken address map.
	This partially resolves PR35209.
	Thanks to Alexander Richardson for reporting the issue!
	Reviewers: john.brawn
	Differential Revision: https://reviews.llvm.org/D39841
	llvm-svn: 318032
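The fix described in this message can be pictured with a small model. This is only a sketch and not the CodeGenPrepare code: `SunkAddr`, `num_uses`, and `lookup_sunk_addr` are hypothetical names, and a plain dictionary stands in for the sunken address map.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SunkAddr:
    """Hypothetical stand-in for a sunk address computation."""
    name: str
    num_uses: int = 0


def lookup_sunk_addr(cache: Dict[str, SunkAddr], key: str) -> Optional[SunkAddr]:
    """Return a reusable cached value, dropping entries that have no uses left."""
    cached = cache.get(key)
    if cached is not None and cached.num_uses == 0:
        # A cached value with zero uses may already have been erased,
        # so forget it instead of handing it out for reuse.
        del cache[key]
        return None
    return cached
```

With this check in front of the lookup, a second load of the same address gets a fresh computation rather than a stale cached one.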
*	[CodeExtractor] Add missing AllowVarArgs initialization.	Florian Hahn	2017-11-13	1	-2/+3
	llvm-svn: 318029
*	[PartialInliner] Inline vararg functions that forward varargs.	Florian Hahn	2017-11-13	3	-24/+73
	Summary: This patch extends the partial inliner to support inlining parts of vararg functions, if the vararg handling is done in the outlined part.
	It adds a `ForwardVarArgsTo` argument to InlineFunction. If it is non-null, all varargs passed to the inlined function will be added to all calls to `ForwardVarArgsTo`.
	The partial inliner takes care to only pass `ForwardVarArgsTo` if the varargs handling is done in the outlined function. It checks that vastart is not part of the function to be inlined.
	`test/Transforms/CodeExtractor/PartialInlineNoInline.ll` (already part of the repo) checks we do not do partial inlining if vastart is used in a basic block that will be inlined.
	Reviewers: davide, davidxl, grosser
	Reviewed By: davide, davidxl, grosser
	Subscribers: gyiu, grosser, eraman, llvm-commits
	Differential Revision: https://reviews.llvm.org/D39607
	llvm-svn: 318028
*	Test commit	Sander de Smalen	2017-11-13	1	-1/+1
	llvm-svn: 318027
*	[x86][AVX512] Lowering shuffle i/f intrinsics to LLVM IR	Jina Nahias	2017-11-13	2	-17/+25
	This patch, together with a matching clang patch (https://reviews.llvm.org/D38672), implements the lowering of X86 shuffle i/f intrinsics to IR.
	Differential Revision: https://reviews.llvm.org/D38671
	Change-Id: I1e7d359a74743e995ec356237a85214ce55d3661
	llvm-svn: 318026
*	[X86][SKX] Adding scheduling info of non-intrinsic + commutable SKX opcodes.	Gadi Haber	2017-11-13	1	-102/+102
	Updated the scheduling information of the SKX subtarget in the file X86SchedSkylakeServer.td under lib/Target/X86 to:
	1. add regular opcodes in addition to the suffixed "_Int" opcodes
	2. add the (V)MAXCPD/MAXCPS/MAXCSD/MAXCSS/MINCPD/MINCPS/MINCSD/MINCSS instructions that are equivalent to their counterparts without the 'C', as they are part of a hack to make floating point min/max commutable under fast math.
	Reviewers: zvi, RKSimon, craig.topper
	Differential Revision: https://reviews.llvm.org/D39833
	Change-Id: Ie13702a5ce1b1a08af91ca637a52b6962881e7d6
	llvm-svn: 318024
*	[X86] Limit NOPs to 7 bytes when 'slm' is spelled 'silvermont'.	Craig Topper	2017-11-13	1	-1/+1
	We support two spellings for silvermont, and we should accept both here.
	llvm-svn: 318023
*	[X86] Use sse_load_f32/f64 to improve load folding of scalar vfscalefss/sd, vrcp14ss/sd, rsqrt14ss/sd instructions.	Craig Topper	2017-11-13	1	-5/+4
	llvm-svn: 318022
*	MI: Print ranges on MMO	Matt Arsenault	2017-11-13	1	-0/+15
	llvm-svn: 318020
*	[X86] Use sse_load_f32/f64 to improve load folding for scalar VFPCLASS intrinsics.	Craig Topper	2017-11-13	1	-4/+4
	llvm-svn: 318019
*	AMDGPU: Preserve nuw in shl add ptr combine	Matt Arsenault	2017-11-13	1	-1/+6
	llvm-svn: 318017
*	[X86] Fix SQRTSS/SQRTSD/RCPSS/RCPSD intrinsics to use sse_load_f32/sse_load_f64 to increase load folding opportunities.	Craig Topper	2017-11-13	2	-10/+13
	llvm-svn: 318016
*	AMDGPU: Fix multi-use shl/add combine	Matt Arsenault	2017-11-13	2	-31/+15
	This was using a custom function that didn't handle the addressing modes properly for private. Use isLegalAddressingMode to avoid duplicating this.
	Additionally, skip the combine if there is only one use since the standard combine will handle it.
	llvm-svn: 318013
*	[X86] Attempt to fix signed and unsigned comparison warning.	Craig Topper	2017-11-13	1	-2/+2
	llvm-svn: 318010
*	[X86] Use sse_load_f32/f64 in patterns for the memory forms of VRNDSCALESS/SD.	Craig Topper	2017-11-13	1	-3/+2
	llvm-svn: 318009
*	[X86] Use EVEX encoded VRNDSCALE instructions to implement the legacy round intrinsics.	Craig Topper	2017-11-13	4	-29/+55
	The VRNDSCALE instructions implement a superset of the (V)ROUND instructions. They are equivalent if the upper 4 bits of the immediate are 0. This patch lowers the legacy intrinsics to the VRNDSCALE ISD node and masks the upper bits of the immediate to 0. This allows us to take advantage of the larger register encoding space.
	We should maybe consider converting VRNDSCALE back to VROUND in the EVEX to VEX pass if the extended registers are not being used.
	I notice some load folding opportunities being missed for the VRNDSCALESS/SD instructions that I'll try to fix in future patches.
	llvm-svn: 318008
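The immediate masking mentioned above amounts to a single bitwise AND. The sketch below assumes an 8-bit immediate, and the function name is hypothetical. It illustrates the arithmetic only, not the lowering code in the X86 backend.

```python
def legacy_round_imm_to_vrndscale(imm: int) -> int:
    """Clear the upper 4 bits of an 8-bit rounding immediate.

    Per the commit message, VRNDSCALE behaves like (V)ROUND when the
    upper 4 bits of the immediate are 0, so only the low 4 bits are kept.
    """
    return imm & 0x0F
```

For example, an immediate of 0xF3 becomes 0x03, while 0x0B is left unchanged.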
*	[X86] Split VRNDSCALE/VREDUCE/VGETMANT/VRANGE ISD nodes into versions with and without the rounding operand. NFCI	Craig Topper	2017-11-13	5	-99/+157
	I want to reuse the VRNDSCALE node for the legacy SSE rounding intrinsics so that those intrinsics can use EVEX instructions. All of these nodes share tablegen multiclasses so I split them all so that they all remain similar in their implementations.
	llvm-svn: 318007
*	AMDGPU: Select d16 loads into low component of register	Matt Arsenault	2017-11-13	6	-5/+147
	llvm-svn: 318005
*	[X86] Add an X86ISD::RANGES opcode to use for the scalar intrinsics.	Craig Topper	2017-11-12	5	-6/+8
	This fixes a bug where we selected packed instructions for scalar intrinsics.
	llvm-svn: 317999
*	[X86] Remove some no longer needed intrinsic lowering code.	Craig Topper	2017-11-12	2	-18/+1
	llvm-svn: 317997
*	[llvm] Remove redundant return [NFC]	Mandeep Singh Grang	2017-11-12	3	-3/+0
	Reviewers: davidxl, olista01, Eugene.Zelenko
	Reviewed By: Eugene.Zelenko
	Subscribers: sdardis, javed.absar, llvm-commits
	Differential Revision: https://reviews.llvm.org/D39917
	llvm-svn: 317995
*	[InstCombine] Teach visitICmpInst to not break integer absolute value idioms	Craig Topper	2017-11-12	1	-6/+12
	Summary: This patch adds an early out to visitICmpInst if we are looking at a compare as part of an integer absolute value idiom. The same is already done for min/max.
	In the particular case I observed in a benchmark we had an absolute value of a load from an indexed global. We simplified the compare using foldCmpLoadFromIndexedGlobal into a magic bit vector, a shift, and an and. But the load result was still used for the select and the negate part of the absolute value idiom. So we overcomplicated the code and lost the ability to recognize it as an absolute value.
	I've chosen a simpler case for the test here.
	Reviewers: spatel, davide, majnemer
	Reviewed By: spatel
	Subscribers: llvm-commits
	Differential Revision: https://reviews.llvm.org/D39766
	llvm-svn: 317994
*	[X86] Use vrndscaleps/pd for 128/256 ffloor/ftrunc/fceil/fnearbyint/frint when avx512vl is enabled.	Craig Topper	2017-11-11	2	-1/+47
	This matches what we do for scalar and 512-bit types.
	llvm-svn: 317991
*	[X86] Attempt to match multiple binary reduction ops at once. NFCI	Simon Pilgrim	2017-11-11	1	-61/+67
	matchBinOpReduction currently matches against a single opcode, but we already have a case where we repeat calls to try to match against AND/OR and I'll be shortly adding another case for SMAX/SMIN/UMAX/UMIN (D39729).
	This NFCI patch alters matchBinOpReduction to try and pattern match against any of the provided list of candidate bin ops at once to save time.
	Differential Revision: https://reviews.llvm.org/D39726
	llvm-svn: 317985
*	[X86] Add scalar register class versions of VRNDSCALE instructions and rename the existing versions to _Int.	Craig Topper	2017-11-11	2	-36/+56
	This is consistent with our normal implementation of scalar instructions.
	While there, disable load folding for the patterns with IMPLICIT_DEF unless optimizing for size, which is also our standard practice.
	llvm-svn: 317977
*	[X86] Inline some SDNode operand multiclass operands that don't vary. NFC	Craig Topper	2017-11-11	1	-33/+28
	llvm-svn: 317975
*	[X86] Set the execution domain for VFPCLASS to SSEPackedSingle/Double.	Craig Topper	2017-11-11	1	-1/+3
	llvm-svn: 317974
*	[X86] Set the execution domain for vptest instruction to the integer domain.	Craig Topper	2017-11-11	1	-0/+3
	llvm-svn: 317973
*	[globalisel][tablegen] Import signextload and zeroextload.	Daniel Sanders	2017-11-11	1	-1/+7
	Allow a pattern rewriter to be installed in CodeGenDAGPatterns and use it to correct situations where SelectionDAG and GlobalISel disagree on representation. For example, it would rewrite:
	    (sextload:i32 $ptr)<<unindexedload>><<sextload>><<sextloadi16>>
	to:
	    (sext:i32 (load:i16 $ptr)<<unindexedload>>)
	I'd have preferred to replace the fragments and have the expansion happen naturally as part of PatFrag expansion, but the type inferencing system can't cope with loads of types narrower than those mentioned in register classes. This is because the SDTCisInt's on the sext constrain both the result and operand to the 'legal' integer types (where legal is defined as 'a register class can contain the type'), which immediately rules the narrower types out. Several targets (those with only one legal integer type) would then go on to crash on the SDTCisOpSmallerThanOp<> when it removes all the possible types for the result of the extend.
	Also, improve isObviouslySafeToFold() slightly to automatically return true for neighbouring instructions. There can't be any re-ordering problems if re-ordering isn't happening. We'll need to improve it further to handle sign/zero-extending loads when the extend and load aren't immediate neighbours though.
	llvm-svn: 317971
*	[X86] Correct the execution domain on ROUND/VROUND instructions.	Craig Topper	2017-11-11	1	-6/+12
	llvm-svn: 317968
*	[X86] Remove the default for one of the arguments to some tablegen multiclasses. NFC	Craig Topper	2017-11-11	1	-5/+3
	No one ever uses this default and probably shouldn't since it sets the execution domain to generic.
	llvm-svn: 317967
*	[SelectionDAG] Make getUniformBase in SelectionDAGBuilder fail if any of the middle GEP indices are non-constant.	Craig Topper	2017-11-10	1	-4/+5
	This is a fix for a bug in r317947. We were supposed to check that all the indices are constant 0, but instead we were only making sure that indices that are constant are 0. Non-constant indices were being ignored.
	llvm-svn: 317950
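The difference between the two checks can be shown with a small model. This is a sketch, not the SelectionDAGBuilder code: both function names are hypothetical, and `None` is used here to stand for a non-constant index.

```python
from typing import Optional, Sequence


def middle_indices_ok_buggy(indices: Sequence[Optional[int]]) -> bool:
    """Only constant indices are checked; non-constant ones slip through."""
    return all(i == 0 for i in indices if i is not None)


def middle_indices_ok_fixed(indices: Sequence[Optional[int]]) -> bool:
    """Every index must be a constant, and that constant must be 0."""
    return all(i is not None and i == 0 for i in indices)
```

For the index list `[0, None]`, the first version accepts the GEP and the second rejects it, which is the behavior change this message describes.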
*	[SelectionDAG] Teach SelectionDAGBuilder's getUniformBase for gather/scatter handling to accept GEPs with more than 2 operands if the middle operands are all 0s	Craig Topper	2017-11-10	1	-2/+9
	Currently we can only get a uniform base from a simple GEP with 2 operands. This causes us to miss address folding opportunities for simple global array accesses as the test case shows. This patch adds support for larger GEPs if the other indices are 0 since those don't require any additional computations to be inserted.
	We may also want to handle constant splats of zero here, but I'm leaving that for future work when I have a real world example.
	Differential Revision: https://reviews.llvm.org/D39911
	llvm-svn: 317947
*	[asan] Use dynamic shadow on 32-bit Android.	Evgeniy Stepanov	2017-11-10	1	-7/+39
	Summary: The following kernel change has moved ET_DYN base to 0x4000000 on arm32: https://marc.info/?l=linux-kernel&m=149825162606848&w=2
	Switch to dynamic shadow base to avoid such conflicts in the future.
	Reserve shadow memory in an ifunc resolver, but don't use it in the instrumentation until PR35221 is fixed. This will eventually let us save one load per function.
	Reviewers: kcc
	Subscribers: aemerson, srhines, kubamracek, kristof.beyls, hiraditya, llvm-commits
	Differential Revision: https://reviews.llvm.org/D39393
	llvm-svn: 317943
*	[llvm-cvtres] Add support for ARM64	Martin Storsjo	2017-11-10	1	-14/+5
	Also change some default cases into llvm_unreachable in WindowsResourceCOFFWriter, to make it easier to find if they are triggered from within e.g. lld, which supported ARM64 earlier than llvm-cvtres did.
	Differential Revision: https://reviews.llvm.org/D39892
	llvm-svn: 317942
*	[DAGcombine] Do not replace truncate node by itself when doing constant folding, as this triggers needless extra rounds of combine for nothing. NFC	Amaury Sechet	2017-11-10	1	-3/+9
	llvm-svn: 317926
*	[SimplifyCFG] Use auto * when the type is obvious. NFCI.	Davide Italiano	2017-11-10	1	-11/+8
	llvm-svn: 317923
*	Recommit r317904: [Hexagon] Create HexagonISelDAGToDAG.h, NFC	Krzysztof Parzyszek	2017-11-10	2	-109/+139
	The Windows builder did not reconstruct the HexagonGenDAGISel.inc file after the TableGen binary had changed.
	llvm-svn: 317921
*	AMDGPU/NFC: Split Processors.td into GCNProcessors.td and R600Processors.td	Konstantin Zhuravlyov	2017-11-10	4	-218/+258
	Differential Revision: https://reviews.llvm.org/D39880
	llvm-svn: 317920
*	Expand IRBuilder interface for atomic memcpy to require pointer alignments. (NFC)	Daniel Neilson	2017-11-10	2	-11/+16
	Summary: The specification of the @llvm.memcpy.element.unordered.atomic intrinsic requires that the pointer arguments have alignments of at least the element size. The existing IRBuilder interface to create a call to this intrinsic does not allow for providing the alignment of these pointer args. Having an interface that makes it easy to construct invalid intrinsic calls doesn't seem sensible, so this patch simply adds the requirement that one provide the argument alignments when using IRBuilder to create atomic memcpy calls.
	llvm-svn: 317918
*	Revert "[Hexagon] Create HexagonISelDAGToDAG.h, NFC"	Krzysztof Parzyszek	2017-11-10	2	-139/+109
	This reverts r317904: broke Windows build.
	llvm-svn: 317916
*	[X86] Merge the template method selectAddrOfGatherScatterNode into selectVectorAddr. NFCI	Craig Topper	2017-11-10	1	-25/+16
	Just need to initialize a couple of variables differently based on the node type. No need for a whole separate template method.
	llvm-svn: 317915
*	[CVP] Remove some {s|u}add.with.overflow checks.	Sanjoy Das	2017-11-10	1	-0/+55
	Summary: This adds logic to CVP to remove some overflow checks. It uses LVI to remove operations with at least one constant. Specifically, this can remove many overflow intrinsics immediately following an overflow check in the source code, such as:
	    if (x < INT_MAX) ... x + 1 ...
	Patch by Joel Galenson!
	Reviewers: sanjoy, regehr
	Reviewed By: sanjoy
	Subscribers: fhahn, pirama, srhines, llvm-commits
	Differential Revision: https://reviews.llvm.org/D39483
	llvm-svn: 317911
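The reasoning behind removing such a check can be sketched with simple interval arithmetic. The model below assumes 32-bit signed integers and a closed range `[lo, hi]` for `x`. The function name is hypothetical, and this is not how LVI represents ranges internally.

```python
INT_MIN = -2**31
INT_MAX = 2**31 - 1


def sadd_may_overflow(lo: int, hi: int, c: int) -> bool:
    """Can x + c overflow a 32-bit signed integer for some x in [lo, hi]?"""
    # Python integers are unbounded, so the sums below are exact.
    return lo + c < INT_MIN or hi + c > INT_MAX
```

After `if (x < INT_MAX)`, the range of `x` is `[INT_MIN, INT_MAX - 1]`, so `sadd_may_overflow(INT_MIN, INT_MAX - 1, 1)` is false and the overflow check on `x + 1` is redundant. Without the guard, `sadd_may_overflow(INT_MIN, INT_MAX, 1)` is true.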
*	[RISCV] Silence an unused variable warning in release builds [NFC]	Mandeep Singh Grang	2017-11-10	2	-5/+5
	Summary: Also minor cleanups:
	1. Avoided multiple calls to Fixup.getKind()
	2. Avoided multiple calls to getFixupKindInfo()
	3. Removed a redundant return.
	Reviewers: asb, apazos
	Reviewed By: asb
	Subscribers: rbar, johnrusso, llvm-commits
	Differential Revision: https://reviews.llvm.org/D39881
	llvm-svn: 317908
*	[Hexagon] Create HexagonISelDAGToDAG.h, NFC	Krzysztof Parzyszek	2017-11-10	2	-109/+139
	llvm-svn: 317904
*	[X86] Add a def file to CPU vendor, type, and subtype encodings used by Host.cpp	Craig Topper	2017-11-10	1	-271/+101
	Summary: I want to leverage this to clean up some of the code in clang. This will allow us to simplify D39521, which was trying to do some of the same. If we accurately keep the code in Host.cpp synced with new CPUs added to compiler-rt/libgcc we should be able to use this file as a proxy for what's implemented in the libraries.
	The entries for the CPUs recognized by the libraries use separate macros that define additional parameters like the name for __builtin_cpu_is and an alias string for the couple of cases where __builtin_cpu_is accepts two different names. All of the macros contain an ARCHNAME that is usually the same as the __builtin_cpu_is string, but sometimes isn't. This represents the name recognized by X86.td and -march.
	I'm following the precedent set by ARM and AArch64 and adding this information to lib/Support/TargetParser.cpp
	Reviewers: erichkeane, echristo, asbirlea
	Reviewed By: echristo
	Subscribers: llvm-commits, aemerson, kristof.beyls
	Differential Revision: https://reviews.llvm.org/D39782
	llvm-svn: 317900
*	LTO: don't fatal when value for cache key already exists	Bob Haarman	2017-11-10	1	-2/+15
	Summary: LTO/Caching.cpp uses file rename to atomically set the value for a cache key. On Windows, this fails when the destination file already exists. Previously, LLVM would report_fatal_error in such cases. However, because the old and the new value for the cache key are supposed to be equivalent, it actually doesn't matter which one we keep. This change makes it so that failing the rename when an openable file with the desired name already exists causes us to report success instead of raising a fatal error.
	Reviewers: pcc, hans
	Subscribers: mehdi_amini, llvm-commits, hiraditya
	Differential Revision: https://reviews.llvm.org/D39874
	llvm-svn: 317899
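The policy described here, treating a failed rename as success when an openable destination already exists, can be sketched as follows. This is an illustration, not the code in LTO/Caching.cpp: the function names are hypothetical, and removing the leftover temporary file is an assumption of this sketch.

```python
import os


def _openable(path: str) -> bool:
    """Report whether a file at `path` can be opened for reading."""
    try:
        with open(path, "rb"):
            return True
    except OSError:
        return False


def commit_cache_entry(tmp_path: str, final_path: str) -> None:
    """Publish a cache value by renaming it into place."""
    try:
        os.rename(tmp_path, final_path)
    except OSError:
        # On Windows, os.rename fails if the destination exists.
        # Old and new values are assumed equivalent, so an openable
        # destination counts as success; anything else is re-raised.
        if not _openable(final_path):
            raise
        os.remove(tmp_path)  # assumption: discard the redundant copy
```

On POSIX systems `os.rename` silently replaces an existing file, so the fallback branch only runs on platforms where rename refuses to overwrite.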
*	[WebAssembly] Fix stack offsets of return values from call lowering.	Jatin Bhateja	2017-11-10	1	-2/+2
	Summary: Fixes PR35220
	Reviewers: vadimcn, alexcrichton
	Reviewed By: alexcrichton
	Subscribers: pepyakin, alexcrichton, jfb, dschuff, sbc100, jgravelle-google, llvm-commits, aheejin
	Differential Revision: https://reviews.llvm.org/D39866
	llvm-svn: 317895
*	[AMDGPU] Prevent Machine Copy Propagation from replacing live copy with the dead one	Alexander Timofeev	2017-11-10	1	-0/+2
	Differential Revision: https://reviews.llvm.org/D38754
	llvm-svn: 317884
*	[llvm-opt-fuzzer] Introduce llvm-opt-fuzzer for fuzzing optimization passes	Igor Laevsky	2017-11-10	1	-0/+34
	This change adds generic fuzzing tools capable of running libFuzzer tests on any optimization pass or combination of them.
	Differential Revision: https://reviews.llvm.org/D39555
	llvm-svn: 317883