bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[X86][SSE] Add slow-pmulld attribute (silvermont-style) test	Simon Pilgrim	2018-01-24	1	-247/+505
\| \| \| \| \| \|	Requested by @zvi on D42258 llvm-svn: 323364
*	Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements ↵	Alexey Bataev	2018-01-24	3	-24/+24
\| \| \| \| \| \| \| \|	as shuffle." This reverts commit r323348 because of the broken buildbots. llvm-svn: 323359
*	Revert "[ThinLTO] Add call edges' relative block frequency to per-module ↵	Easwaran Raman	2018-01-24	1	-35/+0
\| \| \| \| \| \| \| \|	summary." Causes buildbot regressions. llvm-svn: 323358
*	Revert r321751, "StructurizeCFG: Fix broken backedge detection"	Nicolai Haehnle	2018-01-24	5	-144/+123
\| \| \| \| \| \| \| \| \| \| \|	It causes regressions in various OpenGL test suites. Keep the test cases introduced by r321751 as XFAIL, and add a test case for the regression. Change-Id: I90b4cc354f68cebe5fcef1f2422dc8fe1c6d3514 Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36015 llvm-svn: 323355
*	[ARM] Expand long shifts for Thumb1 to __aeabi_ calls	Weiming Zhao	2018-01-24	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: For long shifts, the inlined version takes about 20 instructions on Thumb1. To avoid the code bloat, expand to __aeabi_ calls if target is Thumb1. Reviewers: samparker Reviewed By: samparker Subscribers: samparker, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42401 llvm-svn: 323354
*	[X86] Fix some inconsistencies in the itineraries and Sched for ↵	Craig Topper	2018-01-24	2	-2/+2
\| \| \| \| \| \| \| \|	(V)PEXTRW/(V)PINSRW The weirdest being that PEXTRWrr was tagged as a memory operation. llvm-svn: 323353
*	[X86] Adjust names of PINSRW/PEXTRW instructions between MMX/SSE/AVX/AVX512 ↵	Craig Topper	2018-01-24	2	-5/+5
\| \| \| \| \| \|	for consistency and to maybe enable more regular expression compaction in the scheduler models. NFCI llvm-svn: 323352
*	[ThinLTO] Add call edges' relative block frequency to per-module summary.	Easwaran Raman	2018-01-24	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This allows relative block frequency of call edges to be passed to the thinlink stage where it will be used to compute synthetic entry counts of functions. Reviewers: tejohnson, pcc Subscribers: mehdi_amini, llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D42212 llvm-svn: 323349
*	[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle.	Alexey Bataev	2018-01-24	3	-24/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as a ShuffleInstr with the SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323348
*	[Hexagon] Run late copy propagation and dead code elimination passes	Krzysztof Parzyszek	2018-01-24	10	-37/+43
\| \| \| \|	llvm-svn: 323346
*	InstSimplify: If divisor element is undef simplify to undef	Zvi Rackover	2018-01-24	2	-0/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If any vector divisor element is undef, we can arbitrarily choose it to be zero, which would make the div/rem an undef value by definition. Reviewers: spatel, reames Reviewed By: spatel Subscribers: magabari, llvm-commits Differential Revision: https://reviews.llvm.org/D42485 llvm-svn: 323343
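The rule described above can be illustrated with a small Python model. This is only a sketch of the reasoning, not the InstSimplify code: the `UNDEF` sentinel and the `simplify_vector_div` helper are hypothetical names invented for this example, and vectors are modeled as plain lists.

```python
# Hypothetical model of the simplification: names are illustrative only.
UNDEF = object()  # stands in for an undef vector element / undef result


def simplify_vector_div(dividend, divisor):
    """Return UNDEF if the div/rem can be folded to undef, else None.

    Any undef lane in the divisor may be chosen to be zero, and a division
    by zero makes the whole operation undefined, so the result folds to undef.
    """
    if len(dividend) != len(divisor):
        raise ValueError("vector operands must have the same length")
    if any(elt is UNDEF for elt in divisor):
        return UNDEF
    return None  # no simplification applies in this sketch


# One undef lane in the divisor is enough to fold the whole operation.
folded = simplify_vector_div([8, 6, 4, 2], [2, UNDEF, 2, 2])
not_folded = simplify_vector_div([8, 6, 4, 2], [2, 3, 2, 1])
```

Returning `None` for "no simplification" mirrors the common convention of a simplifier that either produces a replacement value or declines.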
*	[ValueTracking] add recursion depth param to matchSelectPattern	Sanjay Patel	2018-01-24	1	-0/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We're getting bug reports: https://bugs.llvm.org/show_bug.cgi?id=35807 https://bugs.llvm.org/show_bug.cgi?id=35840 https://bugs.llvm.org/show_bug.cgi?id=36045 ...where we blow up the stack in value tracking because other passes are sending in selects that have an operand that is itself the select. We don't currently have a reliable way to avoid analyzing dead code that may take non-standard forms, so bail out when things go too far. This mimics the recursion depth limitations in other parts of value tracking. Unfortunately, this pushes the underlying problems for other passes (jump-threading, simplifycfg, correlated-propagation) into hiding. If someone wants to uncover those again, the first draft of this patch on Phab would do that (it would assert rather than bail out). Differential Revision: https://reviews.llvm.org/D42442 llvm-svn: 323331
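The bail-out idea above can be sketched in Python. This is a toy model under stated assumptions, not the ValueTracking implementation: the `Node` class, the `analyze` function, and the `MAX_DEPTH` value of 6 are all hypothetical and chosen only for illustration.

```python
# Toy model of a recursion-depth limit; every name here is hypothetical.
MAX_DEPTH = 6  # assumed limit for this sketch


class Node:
    """A value with operands; operands may (incorrectly) include the node itself."""

    def __init__(self, name, operands=None):
        self.name = name
        self.operands = operands if operands is not None else []


def analyze(node, depth=0):
    """Return True if the whole operand tree was analyzed, False on bail-out."""
    if depth >= MAX_DEPTH:
        return False  # give up instead of recursing without bound
    for op in node.operands:
        if not analyze(op, depth + 1):
            return False
    return True


# A well-formed select analyzes normally.
ok = analyze(Node("select", [Node("cond"), Node("a"), Node("b")]))

# A select that has itself as an operand would recurse forever without the limit.
bad = Node("select")
bad.operands = [Node("cond"), bad, Node("b")]
gave_up = not analyze(bad)
```

As the commit message notes, bailing out hides the malformed input rather than reporting it; a variant that raised an error at the limit would surface the problem instead.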
*	X86 Tests: Add more sdiv combine cases. NFC	Zvi Rackover	2018-01-24	1	-9/+3160
\| \| \| \| \| \|	Add cases with vector non-splat pow2 constant divisor. llvm-svn: 323329
*	Regenerate shuffle sink test	Simon Pilgrim	2018-01-24	1	-28/+39
\| \| \| \|	llvm-svn: 323328
*	Reverted 323321.	Amjad Aboud	2018-01-24	4	-220/+2
\| \| \| \|	llvm-svn: 323326
*	[AArch64] Avoid unnecessary vector byte-swapping in big-endian	Pablo Barrio	2018-01-24	1	-55/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Loads/stores of some NEON vector types are promoted to other vector types with different lane sizes but the same vector size. This is not a problem in little-endian but, in big-endian, it requires additional byte reversals to preserve the lane ordering while keeping the right endianness of the data inside each lane. For example: %1 = load <4 x half>, <4 x half>* %p results in the following assembly: ld1 { v0.2s }, [x1] rev32 v0.4h, v0.4h This patch changes the promotion of these loads/stores so that the actual vector load/store (LD1/ST1) takes care of the endianness correctly and there is no need for further byte reversals. The previous code now results in the following assembly: ld1 { v0.4h }, [x1] Reviewers: olista01, SjoerdMeijer, efriedma Reviewed By: efriedma Subscribers: aemerson, rengolin, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D42235 llvm-svn: 323325
*	[DebugInfo] Emit DWARF reference for DIVariable 'count' in DISubrange	Sander de Smalen	2018-01-24	1	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch implements the codegen of DWARF debug info for non-constant 'count' fields for DISubrange. This is patch [2/3] in a series to extend LLVM's DISubrange Metadata node to support debugging of C99 variable length arrays and vectors with runtime length like the Scalable Vector Extension for AArch64. It is also a first step towards representing more complex cases like arrays in Fortran. Reviewers: echristo, pcc, aprantl, dexonsmith, clayborg, kristof.beyls, dblaikie Reviewed By: aprantl Subscribers: fhahn, aemerson, rengolin, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D41696 llvm-svn: 323323
*	[InstCombine] Introducing Aggressive Instruction Combine pass ↵	Amjad Aboud	2018-01-24	4	-2/+220
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(-aggressive-instcombine). Combine expression patterns to form expressions with fewer, simple instructions. This pass does not modify the CFG. For example, this pass reduces the width of expressions post-dominated by TruncInst to a smaller width when applicable. It differs from the instcombine pass in that it contains pattern optimizations that require higher complexity than O(1); thus, it should run fewer times than the instcombine pass. Differential Revision: https://reviews.llvm.org/D38313 llvm-svn: 323321
*	[Metadata] Extend 'count' field of DISubrange to take a metadata node	Sander de Smalen	2018-01-24	5	-0/+131
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch extends the DISubrange 'count' field to take either a (signed) constant integer value or a reference to a DILocalVariable or DIGlobalVariable. This is patch [1/3] in a series to extend LLVM's DISubrange Metadata node to support debugging of C99 variable length arrays and vectors with runtime length like the Scalable Vector Extension for AArch64. It is also a first step towards representing more complex cases like arrays in Fortran. Reviewers: echristo, pcc, aprantl, dexonsmith, clayborg, kristof.beyls, dblaikie Reviewed By: aprantl Subscribers: rnk, probinson, fhahn, aemerson, rengolin, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D41695 llvm-svn: 323313
*	[DAGCombiner] Bail out if vector size is not a multiple	Sven van Haastregt	2018-01-24	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \|	For the included test case, the DAG transformation concat_vectors(scalar, undef) -> scalar_to_vector(sclr) would attempt to create a v2i32 vector for a v9i8 concat_vector. Bail out to avoid creating a bitcast with mismatching sizes later on. Differential Revision: https://reviews.llvm.org/D42379 llvm-svn: 323312
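The size mismatch described above is simple arithmetic, sketched here in Python. The helper name `scalar_to_vector_type` is hypothetical, and the 32-bit scalar width is inferred from the v2i32 type mentioned in the message; this is not the DAGCombiner code.

```python
# Hypothetical helper illustrating the "multiple of the scalar size" check.
def scalar_to_vector_type(vec_elts, elt_bits, scalar_bits):
    """Return (num_elts, scalar_bits) for the replacement vector, or None to bail out."""
    vec_bits = vec_elts * elt_bits
    if vec_bits % scalar_bits != 0:
        # No whole number of scalars fills the vector, so any bitcast between
        # the two types would have mismatching sizes.
        return None
    return (vec_bits // scalar_bits, scalar_bits)


# v9i8 is 72 bits; v2i32 is only 64 bits, so the transformation must bail out.
v9i8_result = scalar_to_vector_type(9, 8, 32)

# v8i8 is 64 bits, which is exactly two i32 scalars.
v8i8_result = scalar_to_vector_type(8, 8, 32)
```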
*	[NFC] Remove overconfident assert from IRCE	Max Kazantsev	2018-01-24	1	-0/+42
\| \| \| \| \| \| \| \| \|	This patch removes the assert that SCEV is able to prove that a value is non-negative. In fact, SCEV can sometimes be unable to do this because its cache does not update properly. This assert will be restored once this problem is resolved. llvm-svn: 323309
*	[ARM] Call __chkstk for dynamic stack allocation in all windows environments	Martin Storsjo	2018-01-24	2	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This matches what MSVC does for alloca() function calls on ARM. Even if MSVC doesn't support VLAs at the language level, it does support the alloca function. On the clang level, both the _alloca() (when emulating MSVC, which is what the alloca() function expands to) and __builtin_alloca() builtin functions, and VLAs, map to the same LLVM IR "alloca" function - so within LLVM they're not distinguishable from each other. Differential Revision: https://reviews.llvm.org/D42292 llvm-svn: 323308
*	[GlobalMerge] Don't merge dllexport globals	Martin Storsjo	2018-01-24	2	-0/+30
\| \| \| \| \| \| \| \| \|	Merging such globals loses the dllexport attribute. Add a test to check that normal globals still are merged. Differential Revision: https://reviews.llvm.org/D42127 llvm-svn: 323307
*	[NFC] fix trivial typos in comments	Hiroshi Inoue	2018-01-24	5	-6/+6
\| \| \| \| \| \|	"the the" -> "the" llvm-svn: 323302
*	Don't assume a null GV is local for ELF and MachO.	Rafael Espindola	2018-01-24	6	-20/+20
\| \| \| \| \| \| \| \| \| \| \|	This is already a simplification, and should help with avoiding a plt reference when calling an intrinsic with -fno-plt. With this change we return false for null GVs, so the caller only needs to check the new metadata to decide if it should use foo@plt or *foo@got. llvm-svn: 323297
*	X86: Update isVectorShiftByScalarCheap with cases covered by AVX512BW	Zvi Rackover	2018-01-24	1	-10/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: AVX512BW adds support for variable shift amount for 16-bit element vectors. Reviewers: craig.topper, RKSimon, spatel Reviewed By: RKSimon Subscribers: rengolin, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D42437 llvm-svn: 323292
*	[GISel]: Remove redundant copies at the end of ISel	Aditya Nandakumar	2018-01-24	14	-83/+43
\| \| \| \| \| \| \| \| \|	https://reviews.llvm.org/D42402 A lot of these copies are useless (copies b/w VRegs having the same regclass) and should be cleaned up. llvm-svn: 323291
*	AArch64: Cyclone: Remove SlowMisaligned128Store tuning flag	Matthias Braun	2018-01-24	4	-14/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove FeatureSlowMisaligned128Store from cyclone flags. This flag causes splitting of 16-byte-wide stores into 2 stores of 8 bytes. This was useful on older Apple CPUs, which were slow for 16-byte stores that were not aligned on 16 bytes. As the compiler often cannot predict the actual alignment, the splitting was chosen. This has been a topic of much debate, as the splitting also decreases performance for some benchmarks. Measuring the effects on newer Apple chips (rdar://35525421) shows that it harms more cases than it helps. So it is time to retire this workaround. llvm-svn: 323289
*	[WebAssembly] MC: Use inline triple in test bitcode files	Sam Clegg	2018-01-23	17	-18/+54
\| \| \| \| \| \| \| \| \|	This matches the CodeGen tests and makes it a little easier to run these from the command line manually. Differential Revision: https://reviews.llvm.org/D42440 llvm-svn: 323275
*	[WebAssembly] Add to test expectations for ↵	Sam Clegg	2018-01-23	1	-2/+7
\| \| \| \| \| \| \| \|	test/MC/WebAssembly/weak-alias.ll. NFC. Split out from D42095 llvm-svn: 323272
*	[PPC] Avoid incorrect fp-i128-fp lowering.	Tim Shen	2018-01-23	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Fix an issue that's similar to what D41411 fixed: float(__int128(float_var)) shouldn't be optimized to xscvdpsxds + xscvsxdsp, as they mean (float)(int64_t)float_var. Reviewers: jtony, hfinkel, echristo Subscribers: sanjoy, nemanjai, hiraditya, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D42400 llvm-svn: 323270
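A small Python sketch shows why the two conversion paths above are not interchangeable. It rests on assumptions: Python's arbitrary-precision `int` stands in for `__int128`, Python's `float` is double precision rather than C `float`, and the 64-bit path is modeled as a saturating conversion. The function names are hypothetical.

```python
# Sketch only: models the two conversion paths with plain Python arithmetic.
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def via_wide_int(x):
    """Model of float(__int128(x)): truncate toward zero in a wide integer."""
    return float(int(x))


def via_int64_saturating(x):
    """Model of (float)(int64_t)x, assuming the 64-bit conversion saturates."""
    n = int(x)
    n = max(INT64_MIN, min(INT64_MAX, n))  # assumed saturating behavior
    return float(n)


small = 3.7       # fits in 64 bits: both paths agree
large = 2.0**70   # fits in 128 bits but not in 64 bits: the paths diverge

same_for_small = via_wide_int(small) == via_int64_saturating(small)
differs_for_large = via_wide_int(large) != via_int64_saturating(large)
```

Values between the 64-bit and 128-bit ranges are exactly where the narrower instruction sequence would give a different answer.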
*	[SLPVectorizer] add test for PR13837; NFC	Sanjay Patel	2018-01-23	1	-0/+31
\| \| \| \| \| \| \| \| \|	This was probably fixed long ago, but I don't see a test that lines up with the example and target in the bug report: https://bugs.llvm.org/show_bug.cgi?id=13837 ...so adding it here. llvm-svn: 323269
*	Add bdver shuffle sink tests.	Simon Pilgrim	2018-01-23	1	-0/+21
\| \| \| \|	llvm-svn: 323268
*	[llvm-extract] Support extracting basic blocks	Volkan Keles	2018-01-23	6	-0/+147
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently, there is no way to extract a basic block from a function easily. This patch extends llvm-extract to extract the specified basic block(s). Reviewers: loladiro, rafael, bogner Reviewed By: bogner Subscribers: hintonda, mgorny, qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D41638 llvm-svn: 323266
*	Regenerate select test. NFCI.	Simon Pilgrim	2018-01-23	1	-53/+74
\| \| \| \|	llvm-svn: 323265
*	Regenerate shuffle sink test. NFCI.	Simon Pilgrim	2018-01-23	1	-42/+69
\| \| \| \|	llvm-svn: 323264
*	[X86] Move 'Int_' to the end of the name of the VCOMISS/VUCOMISS and ↵	Craig Topper	2018-01-23	3	-64/+64
\| \| \| \| \| \| \| \|	instructions to get them picked up by the scheduler model regexs. All other intrinsic instructions put the _Int on the end. This makes these instructions consistent and gets the prefix instregexs in the scheduler models to pick them up. llvm-svn: 323261
*	[X86][AVX] LowerBUILD_VECTORAsVariablePermute - add support for VPERMILPV to ↵	Simon Pilgrim	2018-01-23	1	-15/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	v2i64/v2f64 Minor refactor to make it possible for LowerBUILD_VECTORAsVariablePermute to be used with a wider variety of shuffle ops and types. I'd have liked to add v4i32/v4f32 support as well but we don't see v4i32 index extractions at the moment (which is why I created D42308). After this I intend to begin adding scaling support for PSHUFB (v8i16, v4i32, v2i64) and VPERMPS (v4f64, v4i64). Differential Revision: https://reviews.llvm.org/D42431 llvm-svn: 323260
*	[safestack] Inline safestack pointer access when possible.	Evgeniy Stepanov	2018-01-23	2	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This adds an -mllvm flag that forces the use of a runtime function call to get the unsafe stack pointer, the same that is currently used on non-x86, non-aarch64 android. The call may be inlined. Reviewers: pcc Subscribers: aemerson, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37405 llvm-svn: 323259
*	[Debugify] Add a mode to opt to enable faster testing	Vedant Kumar	2018-01-23	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Opt's "-enable-debugify" mode adds an instance of Debugify at the beginning of the pass pipeline, and an instance of CheckDebugify at the end. You can enable this mode with lit using: -Dopt="opt -enable-debugify". Note that running test suites in this mode will result in many failures due to strict FileCheck commands, etc. It can be more useful to look for assertion failures which arise only when Debugify is enabled, e.g. to prove that we have (or do not have) test coverage for some code path with debug info present. Differential Revision: https://reviews.llvm.org/D41793 llvm-svn: 323256
*	Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements ↵	Alexey Bataev	2018-01-23	3	-24/+24
\| \| \| \| \| \| \| \|	as shuffle." This reverts commit r323246 because of the broken buildbots. llvm-svn: 323252
*	[Hexagon] Add patterns for sext_inreg of HVX vector types	Krzysztof Parzyszek	2018-01-23	1	-0/+54
\| \| \| \|	llvm-svn: 323250
*	[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle.	Alexey Bataev	2018-01-23	3	-24/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as a ShuffleInstr with the SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323246
*	X86 Tests: Add AVX512BW config to CodeGenPrepare test. NFC	Zvi Rackover	2018-01-23	1	-9/+10
\| \| \| \| \| \| \|	Case points out that we don't consider shifts supported by AVX512BW in isVectorShiftByScalarCheap() llvm-svn: 323242
*	[WebAssembly] Remove "name" section from wasm object files	Sam Clegg	2018-01-23	6	-58/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	LLD is unaffected, no changes needed there. LLD continues to write out a name section, using the symbol names. Fixes: https://github.com/WebAssembly/tool-conventions/issues/37 Patch by Nicholas Wilson! Differential Revision: https://reviews.llvm.org/D42425 llvm-svn: 323234
*	[Hexagon] Implement basic vector operations on vectors vNi1	Krzysztof Parzyszek	2018-01-23	4	-0/+155
\| \| \| \| \| \| \| \| \| \| \|	In addition to that, make sure that there are no boolean vector types that are associated with multiple register classes. Specifically, remove v32i1 and v64i1 from integer register classes. These types will correspond to results of vector comparisons, and as such should belong to the vector predicate class. Having them in scalar registers as well makes legalization ambiguous. llvm-svn: 323229
*	[X86][SSE] LowerBUILD_VECTORAsVariablePermute - extract subvector from ↵	Simon Pilgrim	2018-01-23	1	-50/+1
\| \| \| \| \| \|	oversized index vectors llvm-svn: 323223
*	[WebAssembly] Add mem.* intrinsics.	Dan Gohman	2018-01-23	1	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \|	The grow_memory and current_memory instructions are expected to be officially renamed to mem.grow and mem.size. Introduce new intrinsics with the new names. These new names aren't yet official, so for now, use them at your own risk. Also, take this opportunity to add arguments for the currently unused immediate field in those instructions. llvm-svn: 323222
*	[WebAssembly] Switch to *-wasm as the default target triple.	Dan Gohman	2018-01-23	4	-11/+11
\| \| \| \| \| \| \| \|	This makes wasm32-unknown-unknown-wasm the default, which supports the .o file writer and the new linking ABI. To enable s2wasm-compatible output, use the wasm32-unknown-unknown-elf triple. llvm-svn: 323220
*	Verifier: fix bug treating debug info issue as non-debug info issue	Yaxun Liu	2018-01-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Normally, when llvm-as sees only debug info errors in LLVM assembly, it simply drops the debug info, outputs valid LLVM bitcode, and returns 0. There is a bug in the LLVM verifier which incorrectly treats a debug info error as a non-debug info error, which causes llvm-as to return 1 even though llvm-as can drop the invalid debug info and output valid LLVM bitcode. This patch fixes that. Differential Revision: https://reviews.llvm.org/D42391 llvm-svn: 323216