bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	Interprocedural Register Allocation (IPRA): add a Transformation Pass	Mehdi Amini	2016-06-10	3	-0/+135
\| \| \| \| \| \| \| \| \| \| \| \|	Adds a MachineFunctionPass that scans the body to find calls and updates the register mask with the one saved by the RegUsageInfoCollector analysis in PhysicalRegisterUsageInfo. Patch by Vivek Pandya <vivekvpandya@gmail.com> Differential Revision: http://reviews.llvm.org/D21180 llvm-svn: 272414
*	Add a period. NFC.	Chad Rosier	2016-06-10	1	-1/+1
\| \| \| \|	llvm-svn: 272410
*	Fix whitespace. NFC.	Chad Rosier	2016-06-10	1	-1/+1
\| \| \| \|	llvm-svn: 272409
*	[X86] Add costs for SSE zext/sext to v4i64 to TTI	Michael Kuperstein	2016-06-10	1	-0/+14
\| \| \| \| \| \| \| \| \|	The costs are somewhat hand-wavy, but should be much closer to the truth than what we get from BasicTTI. Differential Revision: http://reviews.llvm.org/D21156 llvm-svn: 272406
*	Interprocedural Register Allocation (IPRA) Analysis	Mehdi Amini	2016-06-10	5	-0/+244
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add an option to enable the analysis of MachineFunction register usage to extract the list of clobbered registers. When enabled, the CodeGen order is changed to be bottom up on the Call Graph. The analysis is split in two parts: RegUsageInfoCollector is the MachineFunction Pass that runs post-RA and collects the list of clobbered registers to produce a register mask. An immutable pass, RegisterUsageInfo, stores the RegMasks produced by RegUsageInfoCollector and keeps them available. A future transformation pass will use this information to update every call site after instruction selection. Patch by Vivek Pandya <vivekvpandya@gmail.com> Differential Revision: http://reviews.llvm.org/D20769 llvm-svn: 272403
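The bottom-up idea described in this commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code from the patch: `ToyFunction`, `RegSet`, `NumRegs` and `computePreservedMasks` are hypothetical names, a register mask is modeled as a plain bitset of preserved registers, and functions are assumed to be listed callees-first with no recursion.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical register file size for the toy model.
constexpr std::size_t NumRegs = 8;
using RegSet = std::bitset<NumRegs>;

// Hypothetical stand-in for a compiled function.
struct ToyFunction {
  RegSet OwnClobbers;               // registers written by the body itself
  std::vector<std::size_t> Callees; // indices of functions it calls
};

// Functions are visited callees-first (bottom-up on the call graph), so the
// mask of every callee is already known when its caller is processed.
// Returns, per function, the set of registers a call to it preserves.
std::vector<RegSet> computePreservedMasks(const std::vector<ToyFunction> &Fns) {
  std::vector<RegSet> Preserved;
  Preserved.reserve(Fns.size());
  for (const ToyFunction &F : Fns) {
    RegSet Clobbered = F.OwnClobbers;
    for (std::size_t Callee : F.Callees) {
      assert(Callee < Preserved.size() && "callees must come first");
      Clobbered |= ~Preserved[Callee]; // a call clobbers what the callee does
    }
    Preserved.push_back(~Clobbered);
  }
  return Preserved;
}
```

In this model a caller can keep a value live across a call in any register whose bit is set in the callee's mask, which is the kind of information a per-call-site register mask carries.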
*	[AArch64] Add preferred alignments for Exynos M1	Evandro Menezes	2016-06-10	3	-2/+13
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D21203 llvm-svn: 272400
*	[Hexagon] Remove incorrect offset scaling	Krzysztof Parzyszek	2016-06-10	1	-4/+2
\| \| \| \|	llvm-svn: 272399
*	Test commit	Roman Shirokiy	2016-06-10	1	-3/+3
\| \| \| \|	llvm-svn: 272393
*	[AMDGPU] AsmParser: Support for sext() modifier in SDWA. Some code cleaning ↵	Sam Kolton	2016-06-10	5	-257/+348
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	in AMDGPUOperand. Summary: The sext() modifier is supported in SDWA instructions only for integer operands. The spec is unclear on whether integer operands should support abs and neg modifiers with sext; for now they are not supported. Renamed InputModsWithNoDefault to FloatInputMods. Added SextInputMods for operands that support the sext() modifier. Added AMDGPUOperand::Modifier struct to handle register and immediate modifiers. Code cleaning in AMDGPUOperand class: organize methods in groups (render-, predicate-methods...). Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D20968 llvm-svn: 272384
*	test commit: remove trailing whitespaces in README.txt	Roger Ferrer Ibanez	2016-06-10	1	-8/+8
\| \| \| \|	llvm-svn: 272380
*	Bug fix remove another illegal char from prof symbol name	Xinliang David Li	2016-06-10	1	-1/+1
\| \| \| \| \| \| \| \|	An end-to-end test with no integrated assembler should be added at some point (not done now because some bots are not properly configured to support -no-integrated-as) llvm-svn: 272376
*	[LibFuzzer] Fix some unit test crashes on OSX.	Dan Liew	2016-06-10	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes the following unit tests: FuzzerDictionary.ParseOneDictionaryEntry FuzzerDictionary.ParseDictionaryFile The issue appears to be mixing non-ASan-ified code (LibFuzzer) and ASan-ified code (the unittest) as the tests would pass fine if everything was built with ASan enabled. I believe the issue is that different implementations of std::vector<> are being used in LibFuzzer and outside LibFuzzer (in the unittests). For Libcxx (I've not seen the issue manifest for libstdc++) we can disable the ASanified std::vector<> by defining the ``_LIBCPP_HAS_NO_ASAN`` macro. Doing this fixes the tests on OSX. Differential Revision: http://reviews.llvm.org/D21049 llvm-svn: 272374
*	[AVX512] Add shuffle comment printing for masked VPERMPD/VPERMQ.	Craig Topper	2016-06-10	1	-1/+9
\| \| \| \|	llvm-svn: 272371
*	Make PDBFile take a StreamInterface instead of a MemBuffer.	Zachary Turner	2016-06-10	2	-118/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the next step towards being able to write PDBs. MemoryBuffer is immutable, and StreamInterface is our replacement which can be any combination of read-only, read-write, or write-only depending on the particular implementation. The one place where we were creating a PDBFile (in RawSession) is updated to subclass ByteStream with a simple adapter that holds a MemoryBuffer, and initializes the superclass with the buffer's array, so that all the functionality of ByteStream works transparently. llvm-svn: 272370
*	Add support for writing through StreamInterface.	Zachary Turner	2016-06-10	9	-17/+302
\| \| \| \| \| \| \| \| \| \| \|	This adds method and tests for writing to a PDB stream. With this, even a PDB stream which is discontiguous can be treated as a sequential stream of bytes for the purposes of writing. Reviewed By: ruiu Differential Revision: http://reviews.llvm.org/D21157 llvm-svn: 272369
*	[AVX512] Fix shuffle comment printing to handle the masked versions of some ↵	Craig Topper	2016-06-10	1	-30/+46
\| \| \| \| \| \|	shuffles. Previously we were printing the mask operands as the register names. llvm-svn: 272367
*	AMDGPU: Fix trailing whitespace	Matt Arsenault	2016-06-10	10	-40/+40
\| \| \| \|	llvm-svn: 272364
*	[esan\|cfrag] Add the struct field offset array in StructInfo	Qin Zhao	2016-06-10	1	-11/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Adds the struct field offset array in struct StructInfo. Updates test struct_field_count_basic.ll. Reviewers: aizatsky Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka Differential Revision: http://reviews.llvm.org/D21192 llvm-svn: 272362
*	Add null checks before using a pointer.	Richard Trieu	2016-06-10	1	-0/+4
\| \| \| \|	llvm-svn: 272359
*	[esan\|cfrag] Disable load/store instrumentation for cfrag	Qin Zhao	2016-06-10	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Adds ClInstrumentFastpath option to control fastpath instrumentation. Avoids the load/store instrumentation for the cache fragmentation tool. Renames cache_frag_basic.ll to working_set_slow.ll for slowpath instrumentation test. Adds the __esan_init check in struct_field_count_basic.ll. Reviewers: aizatsky Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka Differential Revision: http://reviews.llvm.org/D21079 llvm-svn: 272355
*	AMDGPU: v_cndmask_b32 does not def vcc	Matt Arsenault	2016-06-10	1	-2/+2
\| \| \| \| \| \|	Fixes verifier errors after SIShrinkInstructions. llvm-svn: 272351
*	AMDGPU/SI: Make sure to emit TargetConstant nodes when matching ds_*permute	Tom Stellard	2016-06-10	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes a bug with ds_*permute instructions where if it was passed a constant address, then the offset operand would get assigned a register operand instead of an immediate. Reviewers: scchan, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19994 llvm-svn: 272349
*	AMDGPU/SI: Use common topological sort algorithm in SIScheduleDAGMI	Tom Stellard	2016-06-09	1	-64/+3
\| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm, axeldavy Subscribers: MatzeB, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19823 llvm-svn: 272346
*	AMDGPU: Fix flat atomics	Matt Arsenault	2016-06-09	4	-41/+90
\| \| \| \| \| \| \| \|	The flat atomics could already be selected, but only when using flat instructions for global memory. Add patterns for flat addresses. llvm-svn: 272345
*	AMDGPU: Fix i64 global cmpxchg	Matt Arsenault	2016-06-09	4	-40/+82
\| \| \| \| \| \| \| \| \| \|	This was using extract_subreg sub0 to extract the low register of the result instead of sub0_sub1, producing an invalid copy. There doesn't seem to be a way to use the compound subreg indices in tablegen since those are generated, so manually select it. llvm-svn: 272344
*	Make sure that not interesting allocas are not instrumented.	Vitaly Buka	2016-06-09	1	-4/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We failed to unpoison uninteresting allocas on return as unpoisoning is part of main instrumentation which skips such allocas. Added check -asan-instrument-allocas for dynamic allocas. If instrumentation of dynamic allocas is disabled, they will not be unpoisoned. PR27453 Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21207 llvm-svn: 272341
*	CodeGen: Allow verifier to run after MachineBlockPlacement	Matt Arsenault	2016-06-09	1	-1/+1
\| \| \| \| \| \|	No tests break with this enabled. llvm-svn: 272340
*	Add aliases for mfvrsave/mtvrsave.	Eric Christopher	2016-06-09	1	-0/+4
\| \| \| \| \| \| \|	Update a test as we're now going to emit it for easier reading of generated assembly as well. llvm-svn: 272339
*	AMDGPU: Run verifier after insert waits pass	Matt Arsenault	2016-06-09	1	-1/+1
\| \| \| \|	llvm-svn: 272338
*	AMDGPU: Remove incorrect assertion	Matt Arsenault	2016-06-09	1	-4/+0
\| \| \| \| \| \| \|	I'm still not sure under what circumstances the offset here is non-0, but private memory is not limited to 27-bits. llvm-svn: 272337
*	AMDGPU: Properly initialize SIShrinkInstructions	Matt Arsenault	2016-06-09	3	-8/+6
\| \| \| \|	llvm-svn: 272336
*	[CFLAA] Handle global/arg attrs more sanely.	George Burgess IV	2016-06-09	2	-20/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prior to this patch, we used argument/global stratified attributes in order to note that a value could have come from either dereferencing a global/arg, or from the assignment from a global/arg. Now, AttrUnknown is placed on sets when we see a dereference, instead of the global/arg attributes. This allows us to be more aggressive in the future when we see global/arg attributes without AttrUnknown. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21110 llvm-svn: 272335
*	Unpoison stack memory in use-after-return + use-after-scope mode	Vitaly Buka	2016-06-09	1	-12/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We still want to unpoison full stack even in use-after-return as it can be disabled at runtime. PR27453 Reviewers: eugenis, kcc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21202 llvm-svn: 272334
*	Reapply 272328 and 272329 as a single patch.	Alina Sbirlea	2016-06-09	1	-10/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[cpu-detection] [amdfam10] Return barcelona, and amdfam10 for all other subtypes. Address Bug 28067. Along with the refactoring of Host.cpp, getHostCPUName() was modified to return more precise types for CPUs in amdfam10. However, callers of getHostCPUName() do string matching on type, so this cannot be modified. Currently there is support in the x86 backend for barcelona. For all other subtypes the assumed return value is amdfam10. Fix: getHostCPUName() returns barcelona subtype and amdfam10 for all others. This can be extended further when support for the other subtypes is added. Differential revision: http://reviews.llvm.org/D21193 llvm-svn: 272333
*	Revert 272328 and 272329 to recommit as a single patch.	Alina Sbirlea	2016-06-09	1	-3/+10
\| \| \| \|	llvm-svn: 272332
*	Keep barcelona subtype for amdfam10	Alina Sbirlea	2016-06-09	1	-1/+3
\| \| \| \|	llvm-svn: 272329
*	[cpu-detection] Return amdfam10 for all subtypes. Address Bug 28067.	Alina Sbirlea	2016-06-09	1	-9/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Remove architecture subtype from the string returned by getHostCPUName(). String matching done on type. Reviewers: llvm-commits, echristo Subscribers: mehdi_amini Differential Revision: http://reviews.llvm.org/D21193 llvm-svn: 272328
*	Use ProfileSummaryInfo in inline cost analysis.	Easwaran Raman	2016-06-09	4	-40/+36
\| \| \| \| \| \| \| \|	Instead of directly using MaxFunctionCount and function entry count to determine callee hotness, use the isHotFunction/isColdFunction methods provided by ProfileSummaryInfo. Differential revision: http://reviews.llvm.org/D21045 llvm-svn: 272321
*	[X86][AVX512] Added avx512 VPSLLDQ/VPSRLDQ instruction comments	Simon Pilgrim	2016-06-09	1	-0/+12
\| \| \| \|	llvm-svn: 272319
*	[LiveRangeEdit] Fix a crash in eliminateDeadDef.	Quentin Colombet	2016-06-09	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we delete a live-range, we check if that live-range is the origin of others to keep it around for rematerialization. For that we check that the instruction we are about to remove is the same as the definition of the VNI of the original live-range. If this is the case, we just shrink the live-range to an empty one. Now, when we try to delete one of the children of such live-range (product of splitting), we do the same check. However, now the original live-range is empty and there is no way we can access the VNI to check its definition, and we crash. When we cannot get the VNI for the original live-range, that means we are not in the presence of the original definition. Thus, this check does not need to happen in that case and the crash is solved! This bug was introduced in r266162 \| wmi \| 2016-04-12 20:08:27. It affects every target that uses the greedy register allocator. To happen, we need to delete both the original instruction and its split products, in that order. This is likely to happen when rematerialization comes into play. Trying to produce a more robust test case. Will follow in a coming commit. This fixes llvm.org/PR27983. rdar://problem/26651519 llvm-svn: 272314
*	[X86][AVX512] Dropped avx512 VPSLLDQ/VPSRLDQ intrinsics	Simon Pilgrim	2016-06-09	2	-12/+14
\| \| \| \| \| \|	Auto-upgrade to generic shuffles like sse/avx2 implementations now that we can lower to VPSLLDQ/VPSRLDQ llvm-svn: 272308
*	[X86][AVX512] Fixed issue with v16i32 shuffles lowering to VPALIGNR	Simon Pilgrim	2016-06-09	1	-1/+1
\| \| \| \|	llvm-svn: 272307
*	BitcodeReader: Use std::piecewise_construct when upgrading type refs	Duncan P. N. Exon Smith	2016-06-09	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	r267296 used std::piecewise_construct without using std::forward_as_tuple, and r267298 hacked it out (using an emplace_back followed by a couple of reset() calls) because of a problem on a bot. I'm finally circling back to call forward_as_tuple as I should have to begin with (thanks to David Blaikie for pointing out the missing piece). Note that this code uses emplace_back() instead of push_back(make_pair()) because the move constructor for TrackingMDRef is expensive (cheaper than a copy, but still expensive). llvm-svn: 272306
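The idiom named in this commit message can be sketched outside LLVM as follows. This is a minimal illustration, not the BitcodeReader code: `Tracked` is a hypothetical stand-in for a type whose move constructor is costly (the role TrackingMDRef plays in the message), and the two helper functions exist only to count moves.

```cpp
#include <cassert>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical move-only type that counts how often it is moved.
struct Tracked {
  static int Moves;
  int Value;
  explicit Tracked(int V) : Value(V) {}
  Tracked(Tracked &&Other) noexcept : Value(Other.Value) { ++Moves; }
  Tracked(const Tracked &) = delete;
};
int Tracked::Moves = 0;

using RefPair = std::pair<Tracked, Tracked>;

// Constructs both pair members directly inside the vector's storage:
// piecewise_construct forwards each tuple to the matching member constructor.
int movesWithPiecewise() {
  Tracked::Moves = 0;
  std::vector<RefPair> Refs;
  Refs.reserve(1); // avoid a reallocation that would add moves
  Refs.emplace_back(std::piecewise_construct, std::forward_as_tuple(1),
                    std::forward_as_tuple(2));
  assert(Refs.back().second.Value == 2);
  return Tracked::Moves;
}

// Builds temporaries first, then moves them into the pair and the vector.
int movesWithMakePair() {
  Tracked::Moves = 0;
  std::vector<RefPair> Refs;
  Refs.reserve(1);
  Refs.push_back(std::make_pair(Tracked(1), Tracked(2)));
  return Tracked::Moves;
}
```

The piecewise form performs no moves of `Tracked` at all, whereas the `make_pair` form moves each member at least once on the way into the vector.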
*	[X86][AVX512] Added support for lowering 512-bit vector shuffles to bit/byte ↵	Simon Pilgrim	2016-06-09	1	-19/+41
\| \| \| \| \| \| \| \|	shifts 512-bit VPSLLDQ/VPSRLDQ can only be used for avx512bw targets so lowerVectorShuffleAsShift had to be adjusted to include the subtarget llvm-svn: 272300
*	[NVPTX] Add intrinsics for shfl instructions.	Justin Lebar	2016-06-09	1	-1/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently clang emits these instructions via inline (volatile) asm in the CUDA headers. Switching to intrinsics will let the optimizer reason across calls to these intrinsics. Reviewers: tra Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D21160 llvm-svn: 272298
*	[PM] Port LCSSA to the new PM.	Easwaran Raman	2016-06-09	8	-23/+47
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D21090 llvm-svn: 272294
*	AMDGPU/SI: Fix 32-bit fdiv lowering	Wei Ding	2016-06-09	1	-16/+53
\| \| \| \| \| \| \| \| \|	We were using the fast fdiv lowering for all divisions; an implementation of IEEE754 fdiv is added. http://reviews.llvm.org/D20557 llvm-svn: 272292
*	[LV] Use vector phis for some secondary induction variables	Michael Kuperstein	2016-06-09	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, we materialized secondary vector IVs from the primary scalar IV, by offsetting the primary to match the correct start value, and then broadcasting it - inside the loop body. Instead, we can use a real vector IV, like we do for the primary. This enables using vector IVs for secondary integer IVs whose type matches the type of the primary. Differential Revision: http://reviews.llvm.org/D20932 llvm-svn: 272283
*	SelectionDAG: Implement expansion of {S,U}MIN/MAX in integer legalization	Jan Vesely	2016-06-09	2	-0/+55
\| \| \| \| \| \| \| \| \|	Fixes {u,}long_{min,max,clamp} opencl piglit regressions on EG. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D17898 llvm-svn: 272272
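As a rough illustration of what expanding a wide integer min/max means, the sketch below computes 64-bit results using only 32-bit comparisons on the two halves. It is an assumption-laden model of the general technique, not the SelectionDAG code from this patch: `Halves`, `split`, `join` and the `expand*` functions are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 64-bit value held as two 32-bit halves.
struct Halves {
  std::uint32_t Lo;
  std::uint32_t Hi;
};

Halves split(std::uint64_t V) {
  return {static_cast<std::uint32_t>(V), static_cast<std::uint32_t>(V >> 32)};
}

std::uint64_t join(Halves H) {
  return (static_cast<std::uint64_t>(H.Hi) << 32) | H.Lo;
}

// The high halves decide unless they are equal; the low halves break the tie.
bool lessUnsigned(Halves A, Halves B) {
  return A.Hi != B.Hi ? A.Hi < B.Hi : A.Lo < B.Lo;
}

// Signed order: high halves compare as signed, low halves as unsigned.
bool lessSigned(Halves A, Halves B) {
  if (A.Hi != B.Hi)
    return static_cast<std::int32_t>(A.Hi) < static_cast<std::int32_t>(B.Hi);
  return A.Lo < B.Lo;
}

std::uint64_t expandUMin(std::uint64_t A, std::uint64_t B) {
  assert(join(split(A)) == A && "splitting must be lossless");
  Halves HA = split(A), HB = split(B);
  return join(lessUnsigned(HA, HB) ? HA : HB);
}

std::uint64_t expandUMax(std::uint64_t A, std::uint64_t B) {
  Halves HA = split(A), HB = split(B);
  return join(lessUnsigned(HA, HB) ? HB : HA);
}

std::int64_t expandSMin(std::int64_t A, std::int64_t B) {
  Halves HA = split(static_cast<std::uint64_t>(A));
  Halves HB = split(static_cast<std::uint64_t>(B));
  return static_cast<std::int64_t>(join(lessSigned(HA, HB) ? HA : HB));
}

std::int64_t expandSMax(std::int64_t A, std::int64_t B) {
  Halves HA = split(static_cast<std::uint64_t>(A));
  Halves HB = split(static_cast<std::uint64_t>(B));
  return static_cast<std::int64_t>(join(lessSigned(HA, HB) ? HB : HA));
}
```

Each result is a compare followed by a select of both halves, which is the shape such an expansion takes when the target has no native operation at the wide type.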
*	Reapply "[MBP] Reduce code size by running tail merging in MBP.""	Haicheng Wu	2016-06-09	4	-34/+121
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reapplies commit r271930, r271915, r271923. They hit a bug in Thumb which is fixed in r272258 now. The original message: The code layout that TailMerging (inside BranchFolding) works on is not the final layout optimized based on the branch probability. Generally, after BlockPlacement, many new merging opportunities emerge. This patch calls Tail Merging after MBP and calls MBP again if Tail Merging merges anything. llvm-svn: 272267