bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	AMDGPU: Fix folding immediates into mac src2	Matt Arsenault	2017-01-11	1	-0/+66
\| \| \| \| \| \| \|	Whether it is legal or not needs to check for the instruction it will be replaced with. llvm-svn: 291711
*	Add test that verifies we don't peel loops in optsize functions. NFC.	Michael Kuperstein	2017-01-11	1	-0/+39
\| \| \| \|	llvm-svn: 291708
*	LowerTypeTests: Represent the memory region size with the constant size-1.	Peter Collingbourne	2017-01-11	4	-6/+6
\| \| \| \| \| \| \| \| \|	This means that we can use a shorter instruction sequence in the case where the size is a power of two and on the boundary between two representations. Differential Revision: https://reviews.llvm.org/D28421 llvm-svn: 291706
*	[SCEV] Make howFarToZero max backedge-taken count check for precondition.	Eli Friedman	2017-01-11	1	-4/+2
\| \| \| \| \| \| \| \| \|	Refines max backedge-taken count if a loop like "for (int i = 0; i != n; ++i) { /* body */ }" is rotated. Differential Revision: https://reviews.llvm.org/D28536 llvm-svn: 291704
*	[SCEV] Make howFarToZero use a simpler formula for max backedge-taken count.	Eli Friedman	2017-01-11	1	-0/+83
\| \| \| \| \| \| \| \| \|	This is both easier to understand, and produces a tighter bound in certain cases. Differential Revision: https://reviews.llvm.org/D28393 llvm-svn: 291701
*	Re-apply r291205, "LowerTypeTests: Split the pass in two: a resolution phase ↵	Peter Collingbourne	2017-01-11	7	-12/+9
\| \| \| \| \| \|	and a lowering phase.", with a fix for an off-by-one error. llvm-svn: 291699
*	NewGVN: Fix PR31594, by tracking the store count of congruence	Daniel Berlin	2017-01-11	1	-0/+119
\| \| \| \| \| \| \| \| \| \| \|	classes, and updating checking to allow for equivalence through reachability. (Sadly, the checking here is not perfect, and can't be made perfect, so we'll have to disable it after we are satisfied with correctness. Right now it is just "very unlikely" to happen.) llvm-svn: 291698
*	Resubmit "[PGO] Turn off comdat renaming in IR PGO by default"	Rong Xu	2017-01-11	5	-21/+81
\| \| \| \| \| \|	This patch resubmits the changes in r291588. llvm-svn: 291696
*	Revert "CodeGen: Allow small copyable blocks to "break" the CFG."	Kyle Butt	2017-01-11	57	-420/+205
\| \| \| \| \| \| \| \| \|	This reverts commit ada6595a526d71df04988eb0a4b4fe84df398ded. This needs a simple probability check because there are some cases where it is not profitable. llvm-svn: 291695
*	[ARM] More aggressive matching for vpadd and vpaddl.	Eli Friedman	2017-01-11	2	-18/+234
\| \| \| \| \| \| \| \| \|	The new matchers work after legalization to make them simpler, and to avoid blocking other optimizations. Differential Revision: https://reviews.llvm.org/D27779 llvm-svn: 291693
*	[SLP] Remove bogus assert.	Michael Kuperstein	2017-01-11	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \|	The removed assert seems bogus - it's perfectly legal for the roots of the vectorized subtrees to be equal even if the original scalar values aren't, if the original scalars happen to be equivalent. This fixes PR31599. Differential Revision: https://reviews.llvm.org/D28539 llvm-svn: 291692
*	[X86][XOP] Add vpermil2ps target shuffle -> insertps combine test	Simon Pilgrim	2017-01-11	1	-0/+14
\| \| \| \|	llvm-svn: 291690
*	Revert rL291205 because it breaks Chrome tests under CFI.	Ivan Krasin	2017-01-11	7	-9/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Revert LowerTypeTests: Split the pass in two: a resolution phase and a lowering phase. This change separates how type identifiers are resolved from how intrinsic calls are lowered. All information required to lower an intrinsic call is stored in a new TypeIdLowering data structure. The idea is that this data structure can either be initialized using the module itself during regular LTO, or using the module summary in ThinLTO backends. Original URL: https://reviews.llvm.org/D28341 Reviewers: pcc Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D28532 llvm-svn: 291684
*	[ARM] Fix test CodeGen/ARM/fpcmp_ueq.ll broken by rL290616	Evgeny Astigeevich	2017-01-11	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \|	Commit rL290616 (https://reviews.llvm.org/rL290616) changed a checking command for the triple arm-apple-darwin in LLVM::CodeGen/ARM/fpcmp_ueq.ll. As a result of the changes the test could fail for valid generated code. These changes fix the test to check only the instructions we would expect. Differential Revision: https://reviews.llvm.org/D28159 llvm-svn: 291678
*	X86 CodeGen: Optimized pattern for truncate with unsigned saturation.	Elena Demikhovsky	2017-01-11	2	-0/+231
\| \| \| \| \| \| \| \| \|	DAG patterns optimization: truncate + unsigned saturation is supported by VPMOVUS* instructions in AVX-512 and by VPACKUS* instructions on SSE* targets. Differential Revision: https://reviews.llvm.org/D28216 llvm-svn: 291670
*	[AMDGPU] Assembler: SDWA/DPP should not accept scalar registers and ↵	Sam Kolton	2017-01-11	2	-1/+67
\| \| \| \| \| \| \| \| \| \| \| \|	immediate operands Reviewers: artem.tamazov, nhaustov, vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28157 llvm-svn: 291668
*	[X86][AVX512BW] Vectorize v64i8 vector shifts	Simon Pilgrim	2017-01-11	6	-3114/+192
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D28447 llvm-svn: 291665
*	Fix line endings	Simon Pilgrim	2017-01-11	4	-421/+421
\| \| \| \|	llvm-svn: 291663
*	[X86] Fix PR30926 - Add patterns for (v)cvtsi2s{s,d} and (v)cvtsd2s{s,d}	Elad Cohen	2017-01-11	4	-9/+108
\| \| \| \| \| \| \| \| \| \| \|	The code emitted by Clang's intrinsics for (v)cvtsi2ss, (v)cvtsi2sd, (v)cvtsd2ss and (v)cvtss2sd is lowered to a code sequence that includes redundant (v)movss/(v)movsd instructions. This patch adds patterns for optimizing these sequences. Differential revision: https://reviews.llvm.org/D28455 llvm-svn: 291660
*	[X86] fixing failed test in commit: r291657	Mohammed Agabaria	2017-01-11	1	-0/+1
\| \| \| \| \| \|	Missing Requires asserts. llvm-svn: 291659
*	[X86] updating TTI costs for arithmetic instructions on X86\SLM arch.	Mohammed Agabaria	2017-01-11	2	-0/+461
\| \| \| \| \| \| \| \| \| \| \| \|	Updated instructions: pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd. Special optimization case which replaces pmulld with a pmullw\pmulhw\pshuf sequence if the real operand bitwidth is <= 16. Differential Revision: https://reviews.llvm.org/D28104 llvm-svn: 291657
*	[XRay] Define the library for XRay trace logs	Dean Michael Berris	2017-01-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In this change we move the definition of the log reading routines from the tools directory in LLVM to {include/llvm,lib}/XRay. We improve the documentation a little bit for the publicly accessible headers, and adjust the top-matter. This also leads to some refactoring and cleanup in the tooling code. In particular, we do the following: - Rename the class from LogReader to Trace, as it better represents the logical set of records as opposed to a log. - Use file type detection instead of asking the user to say what format the input file is. This allows us to keep the interface simple and encapsulate the logic of loading the data appropriately. In future changes we increase the API surface and write dedicated unit tests for the XRay library. Depends on D24376. Reviewers: dblaikie, echristo Subscribers: mehdi_amini, mgorny, llvm-commits, varno Differential Revision: https://reviews.llvm.org/D28345 llvm-svn: 291652
*	[PM] Rewrite the loop pass manager to use a worklist and augmented run	Chandler Carruth	2017-01-11	3	-22/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	arguments much like the CGSCC pass manager. This is a major redesign following the pattern established for the CGSCC layer to support updates to the set of loops during the traversal of the loop nest and to support invalidation of analyses. An additional significant burden in the loop PM is that so many passes require access to a large number of function analyses. Manually ensuring these are cached, available, and preserved has been a long-standing burden in LLVM even with the help of the automatic scheduling in the old pass manager. And it made the new pass manager extremely unwieldy. With this design, we can package the common analyses up while in a function pass and make them immediately available to all the loop passes. While in some cases this is unnecessary, I think the simplicity afforded is worth it. This does not (yet) address loop simplified form or LCSSA form, but those are the next things on my radar and I have a clear plan for them. While the patch is very large, most of it is either mechanically updating loop passes to the new API or the new testing for the loop PM. The code for it is reasonably compact. I have not yet updated all of the loop passes to correctly leverage the update mechanisms demonstrated in the unittests. I'll do that in follow-up patches along with improved FileCheck tests for those passes that ensure things work in more realistic scenarios. In many cases, there isn't much we can do with these until the loop simplified form and LCSSA form are in place. Differential Revision: https://reviews.llvm.org/D28292 llvm-svn: 291651
*	Revert r291645 "[DAGCombiner] Teach DAG combiner to fold (vselect (N0 xor ↵	Craig Topper	2017-01-11	3	-339/+560
\| \| \| \| \| \| \| \|	AllOnes), N1, N2) -> (vselect N0, N2, N1). Only do this if the target indicates its vector boolean type is ZeroOrNegativeOneBooleanContent." Some test appears to be hanging on the build bots. llvm-svn: 291650
*	[LICM] Report failing to hoist conditionally-executed loads	Adam Nemet	2017-01-11	1	-0/+47
\| \| \| \| \| \| \| \| \| \| \| \|	These are interesting again because the user may not be aware that this is a common reason preventing LICM. A const is removed from an instruction pointer declaration in order to pass it to ORE. Differential Revision: https://reviews.llvm.org/D27940 llvm-svn: 291649
*	[LICM] Report failing to hoist a load with an invariant address	Adam Nemet	2017-01-11	1	-0/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These are interesting because lack of precision in alias information could be standing in the way of this optimization. An example is the case in the test suite that I showed in the DevMeeting talk: http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/MultiSource/Benchmarks/FreeBench/distray/CMakeFiles/distray.dir/html/_org_test-suite_MultiSource_Benchmarks_FreeBench_distray_distray.c.html#L236 canSinkOrHoistInst is also used from LoopSink, which does not use opt-remarks so we need to take ORE as an optional argument. Differential Revision: https://reviews.llvm.org/D27939 llvm-svn: 291648
*	[LICM] Report successful hoist/sink/promotion	Adam Nemet	2017-01-11	25	-24/+105
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D27938 llvm-svn: 291646
*	[DAGCombiner] Teach DAG combiner to fold (vselect (N0 xor AllOnes), N1, N2) ↵	Craig Topper	2017-01-11	3	-560/+339
\| \| \| \| \| \|	-> (vselect N0, N2, N1). Only do this if the target indicates its vector boolean type is ZeroOrNegativeOneBooleanContent. llvm-svn: 291645
*	DAGCombiner: Add hasOneUse checks to fadd/fma combine	Matt Arsenault	2017-01-11	1	-0/+262
\| \| \| \| \| \| \| \|	Even with aggressive fusion enabled, this requires duplicating the fmul, or increases an fadd to another fma which is not an improvement. llvm-svn: 291642
*	Re-commit r289955: [X86] Fold (setcc (cmp (atomic_load_add x, -C) C), COND) ↵	Hans Wennborg	2017-01-11	1	-0/+64
\| \| \| \| \| \| \| \| \| \| \| \| \|	to (setcc (LADD x, -C), COND) (PR31367) This was reverted because it would miscompile code where the cmp had multiple uses. That was due to a deficiency in the existing code, which was fixed in r291630 (see the PR for details). This re-commit includes an extra test for the kind of code that got miscompiled: @test_sub_1_setcc_jcc. llvm-svn: 291640
*	[X86] Dont run combineSetCCAtomicArith() when the cmp has multiple uses	Hans Wennborg	2017-01-11	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We would miscompile the following: void g(int); int f(volatile long long *p) { bool b = __atomic_fetch_add(p, 1, __ATOMIC_SEQ_CST) < 0; g(b ? 12 : 34); return b ? 56 : 78; } into pushq %rax lock incq (%rdi) movl $12, %eax movl $34, %edi cmovlel %eax, %edi callq g(int) testq %rax, %rax <---- Bad. movl $56, %ecx movl $78, %eax cmovsl %ecx, %eax popq %rcx retq because the code failed to take into account that the cmp has multiple uses, replaced one of them, and left the other one comparing garbage. llvm-svn: 291630
*	InstSimplify: Eliminate fabs on known positive	Matt Arsenault	2017-01-11	3	-8/+142
\| \| \| \|	llvm-svn: 291624
*	AMDGPU/EG,CM: Add fp16 conversion instructions	Jan Vesely	2017-01-11	4	-35/+49
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D28164 llvm-svn: 291622
*	Revert "[PGO] Turn off comdat renaming in IR PGO by default"	Rong Xu	2017-01-10	5	-81/+21
\| \| \| \| \| \| \|	This patch reverts r291588: [PGO] Turn off comdat renaming in IR PGO by default, as we are seeing some hash mismatches in our internal tests. llvm-svn: 291621
*	[TM] Restore default TargetOptions in TargetMachine::resetTargetOptions.	Justin Lebar	2017-01-10	3	-2/+125
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Previously if you had * a function with the fast-math-enabled attr, followed by * a function without the fast-math attr, the second function would inherit the first function's fast-math-ness. This means that mixing fast-math and non-fast-math functions in a module was completely broken unless you explicitly annotated every non-fast-math function with "unsafe-fp-math"="false". This appears to have been broken since r176986 (March 2013), when the resetTargetOptions function was introduced. This patch tests the correct behavior as best we can. I don't think I can test FPDenormalMode and NoTrappingFPMath, because they aren't used in any backends during function lowering. Surprisingly, I also can't find any uses at all of LessPreciseFPMAD affecting generated code. The NVPTX/fast-math.ll test changes are an expected result of fixing this bug. When FMA is disabled, we emit add as "add.rn.f32", which prevents fma combining. Before this patch, fast-math was enabled in all functions following the one which explicitly enabled it on itself, so we were emitting plain "add.f32" where we should have generated "add.rn.f32". Reviewers: mkuper Subscribers: hfinkel, majnemer, jholewinski, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D28507 llvm-svn: 291618
*	[NVPTX] Add CHECK-LABEL where appropriate to fast-math.ll test.	Justin Lebar	2017-01-10	1	-9/+4
\| \| \| \| \| \| \| \|	Also fix up whitespace. Test-only change. llvm-svn: 291617
*	[AArch64] Consider all vector types for FeatureSlowMisaligned128Store	Evandro Menezes	2017-01-10	1	-8/+50
\| \| \| \| \| \| \| \| \| \| \| \|	The original code considered only v2i64 as slow for this feature. This patch considers all 128-bit long vector types as slow candidates. In internal tests, extending this feature to all 128-bit vector types resulted in an overall improvement of 1% on Exynos M1. Differential revision: https://reviews.llvm.org/D27998 llvm-svn: 291616
*	AMDGPU: Constant fold when immediate is materialized	Matt Arsenault	2017-01-10	1	-0/+858
\| \| \| \| \| \|	In future commits these patterns will appear after moveToVALU changes. llvm-svn: 291615
*	InstCombine: fdiv -x, -y -> fdiv x, y	Matt Arsenault	2017-01-10	1	-0/+18
\| \| \| \|	llvm-svn: 291611
*	CodeGen: Allow small copyable blocks to "break" the CFG.	Kyle Butt	2017-01-10	57	-205/+420
\| \| \| \| \| \| \| \| \| \| \|	When choosing the best successor for a block, ordinarily we would have preferred a block that preserves the CFG unless there is a strong probability in the other direction. For small blocks that can be duplicated we now skip that requirement as well. Differential revision: https://reviews.llvm.org/D27742 llvm-svn: 291609
*	Make the test accept different OpCode values since it doesn't really care ↵	Douglas Yung	2017-01-10	1	-1/+1
\| \| \| \| \| \| \| \|	about the value. Differential Revision: https://reviews.llvm.org/D28487 llvm-svn: 291605
*	DAG: Avoid OOB when legalizing vector indexing	Matt Arsenault	2017-01-10	16	-654/+999
\| \| \| \| \| \| \| \| \|	If a vector index is out of bounds, the result is supposed to be undefined but is not undefined behavior. Change the legalization for indexing the vector on the stack so that an out of bounds index does not create an out of bounds memory access. llvm-svn: 291604
*	[WebAssembly] Only RAUW a constant once in FixFunctionBitcasts	Derek Schuff	2017-01-10	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \|	When we collect 2 uses of a function in FindUses and then RAUW when we visit the first, we end up visiting the wrapper (because the second was RAUW'd). We still want to use RAUW instead of just Use->set() because it has special handling for Constants, so this patch just ensures that only one use of each constant is added to the work list. Differential Revision: https://reviews.llvm.org/D28504 llvm-svn: 291603
*	Correct object file for implicit const test	Victor Leschuk	2017-01-10	1	-0/+0
\| \| \| \|	llvm-svn: 291601
*	DebugInfo: support for DW_FORM_implicit_const	Victor Leschuk	2017-01-10	2	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Support for the DW_FORM_implicit_const DWARFv5 feature. When this form is used, the attribute value goes to the .debug_abbrev section (as SLEB). As this form would break any debug tool which doesn't support DWARFv5, it is guarded by a dwarf version check. An attempt to use this form with dwarf version <= 4 is considered a fatal error. Differential Revision: https://reviews.llvm.org/D28456 llvm-svn: 291599
*	[llvm-config] Canonicalize CMake booleans to 0/1	Michal Gorny	2017-01-10	1	-0/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Following the similar change to lit configuration, ensure that all CMake booleans are canonicalized to 0/1 when being passed to llvm-config. This fixes the incorrect interpretation of values when the user passes a value other than ON/OFF, and simplifies the code by removing unnecessary string matching. Furthermore, the code for --has-rtti and --has-global-isel has been modified to print consistent values independently of the boolean value passed by the user to CMake. Sadly, the code already implicitly used different values for the two (YES/NO for --has-rtti, ON/OFF for --has-global-isel). Include tests for all booleans and multi-value options in llvm-config. Differential Revision: https://reviews.llvm.org/D28366 llvm-svn: 291593
*	[LV] Don't panic when encountering the IV of an outer loop.	Michael Kuperstein	2017-01-10	1	-0/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bail out instead of asserting when we encounter this situation, which can actually happen. The reason the test uses the new PM is that the "bad" phi, incidentally, gets cleaned up by LoopSimplify. But LICM can create this kind of phi and preserve loop simplify form, so the cleanup has no chance to run. This fixes PR31190. We may want to solve this in a less conservative manner, since this phi is actually uniform within the inner loop (or we may want LICM to output a cleaner promotion to begin with). Differential Revision: https://reviews.llvm.org/D28490 llvm-svn: 291589
*	[PGO] Turn off comdat renaming in IR PGO by default	Rong Xu	2017-01-10	5	-21/+81
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In IR PGO we append the function hash to comdat functions to avoid the potential hash mismatch. This turns out not to be legal in some cases: if the comdat function is address-taken and used in comparison, renaming changes the semantics. This patch turns off comdat renaming by default. To alleviate the hash mismatch issue, we now rename the profile variable for comdat functions. The profile allows multiple versions of profiles with different hash values to coexist. The inlined copy will always have the correct profile counter. The out-of-line copy might not have the correct count, but we will not have the bogus mismatch warning. Reviewers: davidxl Subscribers: llvm-commits, xur Differential Revision: https://reviews.llvm.org/D28416 llvm-svn: 291588
*	AMDGPU: Add tests for HasMultipleConditionRegisters	Matt Arsenault	2017-01-10	1	-0/+161
\| \| \| \| \| \|	This was enabled without many specific tests or the comment. llvm-svn: 291586
*	[CostModel][X86] Add AVX512VL vector shift cost tests.	Simon Pilgrim	2017-01-10	3	-0/+57
\| \| \| \|	llvm-svn: 291585