bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[GISel]: Add G_FEXP, G_FEXP2 opcodes	Aditya Nandakumar	2017-06-27	1	-0/+20
\| \| \| \| \| \| \|	Also add IRTranslator support. https://reviews.llvm.org/D34710 llvm-svn: 306475
*	re-commit r306336: Enable vectorizer-maximize-bandwidth by default.	Dehao Chen	2017-06-27	11	-67/+76
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 306473
*	[CGP] eliminate a sub instruction in memcmp expansion	Sanjay Patel	2017-06-27	5	-141/+118
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As noted in D34071, there are some IR optimization opportunities that could be handled by normal IR passes if this expansion wasn't happening so late in CGP. Regardless of that, it seems wasteful to knowingly produce suboptimal IR here, so I'm proposing this change: %s = sub i32 %x, %y %r = icmp ne %s, 0 => %r = icmp ne %x, %y Changing the predicate to 'eq' mimics what InstCombine would do, so that's just an efficiency improvement if we decide this expansion should happen sooner. The fact that the PowerPC backend doesn't eliminate the 'subf.' might be something for PPC folks to investigate separately. Differential Revision: https://reviews.llvm.org/D34416 llvm-svn: 306471
*	GlobalISel: verify that a COPY is trivial when created.	Tim Northover	2017-06-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Without this check, COPY instructions can actually be one of the generic casts in disguise. That's confusing and bad. At some point during ISel this restriction has to be relaxed since the fully selected instructions will usually use COPY for those purposes. Right now I think it's possible that relaxation occurs during RegBankSelect (hence the change there). I'm not convinced that's where it belongs long-term though. llvm-svn: 306470
*	Clean up a test case	Xinliang David Li	2017-06-27	1	-34/+41
\| \| \| \|	llvm-svn: 306468
*	Create a PHI value when merging with a known undef live-in	Krzysztof Parzyszek	2017-06-27	1	-0/+35
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D34640 llvm-svn: 306466
*	[WebAssembly] Only run WebAssembly objdump tests if it is enabled as a target	Sam Clegg	2017-06-27	1	-0/+2
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D34712 llvm-svn: 306464
*	[WebAssembly] Add support for printing relocations with llvm-objdump	Sam Clegg	2017-06-27	1	-0/+8
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D34658 llvm-svn: 306461
*	[WebAssembly] Add data size and alignment to linking section	Sam Clegg	2017-06-27	5	-46/+75
\| \| \| \| \| \| \| \| \|	The overall size of the data section (including BSS) is otherwise not included in the wasm binary. Differential Revision: https://reviews.llvm.org/D34657 llvm-svn: 306459
*	[Hexagon] Use proper predicate register state when expanding PS_vselect	Krzysztof Parzyszek	2017-06-27	1	-0/+53
\| \| \| \|	llvm-svn: 306458
*	[InstCombine] Propagate nsw flag when turning mul by pow2 into shift when ↵	Craig Topper	2017-06-27	1	-5/+3
\| \| \| \| \| \| \| \| \| \| \| \|	the constant is a vector splat or the scalar bit width is larger than 64 bits The check to see if we can propagate the nsw flag used m_ConstantInt(uint64_t*&), which doesn't work with splat vectors and has a restriction that the bitwidth of the ConstantInt must be 64 bits or less. This patch changes it to use m_APInt to remove both these issues. Differential Revision: https://reviews.llvm.org/D34699 llvm-svn: 306457
*	[AMDGPU] Add 2 new alignbit patterns	Stanislav Mekhanoshin	2017-06-27	1	-0/+142
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D34655 llvm-svn: 306449
*	[CodeExtractor] Prevent extraction of block involving blockaddress	Serge Guelton	2017-06-27	2	-0/+79
\| \| \| \| \| \| \| \| \|	BlockAddress are only valid within their function context, which does not interact well with CodeExtractor. Detect this case and prevent it. Differential Revision: https://reviews.llvm.org/D33839 llvm-svn: 306448
*	[AMDGPU] Simplify setcc (sext from i1 b), -1\|0, cc	Stanislav Mekhanoshin	2017-06-27	1	-0/+292
\| \| \| \| \| \| \| \| \| \| \|	Depending on the compare code, the result can be either the argument of sext or its negation. This helps to avoid the v_cndmask_b64 instruction for sext. A reversed value can be further simplified and folded into its parent comparison if possible. Differential Revision: https://reviews.llvm.org/D34545 llvm-svn: 306446
*	[Hexagon] Update kills in hexagon-nvj even more properly than before	Krzysztof Parzyszek	2017-06-27	1	-0/+18
\| \| \| \| \| \| \|	Account for the fact that both the feeder and the compare can be moved over instructions that kill registers. llvm-svn: 306443
*	RenameIndependentSubregs: Fix infinite loop	Matt Arsenault	2017-06-27	1	-0/+19
\| \| \| \| \| \| \| \|	Apparently this replacement can end up substituting the same register as the original. Avoid restarting the loop when there's been no change in the register uses. llvm-svn: 306441
*	[SROA] Fix APInt size when alloca address space is not 0	Yaxun Liu	2017-06-27	1	-1/+30
\| \| \| \| \| \| \| \|	SROA assumes the alloca address space is 0, which causes an assertion failure. This patch fixes that. Differential Revision: https://reviews.llvm.org/D34104 llvm-svn: 306440
*	[AMDGPU] Combine and x, (sext cc from i1) => select cc, x, 0	Stanislav Mekhanoshin	2017-06-27	2	-0/+47
\| \| \| \| \| \| \| \| \| \|	Also factored out function to check if a boolean is an already deserialized value which does not require v_cndmask_b32 to be loaded. Added binary logical operators to its check. Differential Revision: https://reviews.llvm.org/D34500 llvm-svn: 306439
*	[InstCombine] canonicalize icmp predicate feeding select	Sanjay Patel	2017-06-27	8	-63/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This canonicalization was suggested in D33172 as a way to make InstCombine behavior more uniform. We have this transform for icmp+br, so unless there's some reason that icmp+select should be treated differently, we should do the same thing here. The benefit comes from increasing the chances of creating identical instructions. This is shown in the tests in logical-select.ll (PR32791). InstCombine doesn't fold those directly, but EarlyCSE can simplify the identical cmps, and then InstCombine can fold the selects together. The possible regression for the tests in select.ll raises questions about poison/undef: http://lists.llvm.org/pipermail/llvm-dev/2017-May/113261.html ...but that transform is just as likely to be triggered by this canonicalization as it is to be missed, so we're just pointing out a commutation deficiency in the pattern matching: https://reviews.llvm.org/rL228409 Differential Revision: https://reviews.llvm.org/D34242 llvm-svn: 306435
*	[InstCombine] Add test case demonstrating that we don't propagate nsw flag ↵	Craig Topper	2017-06-27	1	-0/+10
\| \| \| \| \| \|	when converting mul by pow2 to shl when the type is larger than 64-bits. NFC llvm-svn: 306427
*	[InstCombine] Add test cases to show that we don't propagate 'nsw' flags ↵	Craig Topper	2017-06-27	1	-0/+20
\| \| \| \| \| \|	when converting mul by pow2 constant to shl for splat vectors. NFC llvm-svn: 306426
*	[X86][AsmParser][MS-compatibility] Binary/Unary operators enhancements	Coby Tayree	2017-06-27	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \|	Introducing MOD binary operator https://msdn.microsoft.com/en-us/library/hha180wt.aspx Enhancing unary operators NEG and NOT, to support more complex patterns Differential Revision: https://reviews.llvm.org/D33876 llvm-svn: 306425
*	Another test commit	Chih-Hung Hsieh	2017-06-27	1	-4/+4
\| \| \| \|	llvm-svn: 306420
*	[PatternMatch] Remove 64-bit or less restriction from m_SpecificInt	Craig Topper	2017-06-27	1	-8/+4
\| \| \| \| \| \| \| \| \| \|	Not sure why this restriction existed, but it seems like we should support any size Constant here. The particular pattern in the tests is not the only use of this matcher in the tree. There's one in CodeGenPrepare and one in InstSimplify as well. Differential Revision: https://reviews.llvm.org/D34666 llvm-svn: 306417
*	[JumpThreading] Add test case that was supposed to go with r306085.	Craig Topper	2017-06-27	1	-0/+125
\| \| \| \| \| \|	Looks like I forgot to 'git add' when I submitted the commit. Thanks to Chandler for noticing. llvm-svn: 306416
*	Updated and extended the information about each instruction in HSW and SNB ↵	Gadi Haber	2017-06-27	31	-3632/+4339
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to include the following data: • static latency • number of uOps of which the instruction consists • all ports used by the instruction Reviewers: RKSimon, zvi, aymanmus, m_zuckerman Differential Revision: https://reviews.llvm.org/D33897 llvm-svn: 306414
*	[AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions	Sam Kolton	2017-06-27	2	-1/+447
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: 1. Instruction V_CVT_U32_F32 allows an omod operand (see SIInstrInfo.td:1435). In fact this operand shouldn't be allowed here. This fix checks whether the SDWA pseudo instruction has an OMod operand before copying it. 2. There were several problems with support of VOPC instructions in the SDWA peephole pass. Reviewers: tstellar, arsenm, vpykhtin, airlied, kzhuravl Subscribers: wdng, nhaehnle, yaxunl, dstuttard, tpr, sarnex, t-tye Differential Revision: https://reviews.llvm.org/D34626 llvm-svn: 306413
*	[AArch64] Update successor probabilities after ccmp-conversion	Matthew Simpson	2017-06-27	2	-3/+49
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch modifies the conditional compares pass so that it keeps successor probabilities up-to-date after the conversion. Previously, successor probabilities were being normalized to a uniform distribution, even though they may have been heavily biased prior to the conversion (e.g., if one of the edges was the back edge of a loop). This loss of information affected passes later in the pipeline. Differential Revision: https://reviews.llvm.org/D34109 llvm-svn: 306412
*	[mips] Add instruction aliases for ds(r\|l)l.	Simon Dardis	2017-06-27	9	-3/+39
\| \| \| \| \| \| \|	Add the instruction aliases for ds(r\|l)l for the two operand alias of ds(r\|l)lv and the aliases ds(r\|l)l with the three register operands. llvm-svn: 306405
*	[SelectionDAG] set dereferenceable flag in MergeConsecutiveStores to fix ↵	Hiroshi Inoue	2017-06-27	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \|	assertion failure When SelectionDAG merges consecutive stores and loads in MergeConsecutiveStores, it does not set the dereferenceable flag for the load instruction it creates. This results in an assertion failure if SelectionDAG commonizes this load instruction with other load instructions, and it may also miss optimization opportunities. This patch sets the dereferenceable flag for the newly created load instruction if all the load instructions to be merged are dereferenceable. Differential Revision: https://reviews.llvm.org/D34679 llvm-svn: 306404
*	Recommitting rL305465 after fixing bug in TableGen in rL306251 & rL306371	Ayman Musa	2017-06-27	4	-332/+13581
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[X86][AVX512] Improve lowering of AVX512 compare intrinsics (remove redundant shift left+right instructions). AVX512 compare instructions return v*i1 types. In cases where the number of elements in the returned value are less than 8, clang adds zeroes to get a mask of v8i1 type. Later on it's replaced with CONCAT_VECTORS, which then is lowered to many DAG nodes including insert/extract element and shift right/left nodes. The fact that AVX512 compare instructions put the result in a k register and zeroes all its upper bits allows us to remove the extra nodes simply by copying the result to the required register class. When lowering, identify these cases and transform them into an INSERT_SUBVECTOR node (marked legal), then catch this pattern in instructions selection phase and transform it into one avx512 cmp instruction. Differential Revision: https://reviews.llvm.org/D33188 llvm-svn: 306402
*	[ARM] GlobalISel: Support G_SELECT for pointers	Diana Picus	2017-06-27	3	-3/+76
\| \| \| \| \| \|	All we need to do is mark it as legal, otherwise it's just like s32. llvm-svn: 306390
*	[X86][AVX512] Regenerate avx512 arithmetic tests	Simon Pilgrim	2017-06-27	1	-129/+129
\| \| \| \|	llvm-svn: 306389
*	[globalisel][tablegen] Add support for EXTRACT_SUBREG.	Daniel Sanders	2017-06-27	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: After this patch, we finally have test cases that require multiple instruction emission. Depends on D33590 Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls Subscribers: javed.absar, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D33596 llvm-svn: 306388
*	[ARM] GlobalISel: Support G_SELECT for i32	Diana Picus	2017-06-27	4	-0/+106
\| \| \| \| \| \| \| \| \| \|	* Mark as legal for (s32, i1, s32, s32) * Map everything into GPRs * Select to two instructions: a CMP of the condition against 0, to set the flags, and a MOVCCr to select between the two inputs based on the flags that we've just set llvm-svn: 306382
*	[PowerPC] fix incorrect processor name for -mcpu in a test case	Hiroshi Inoue	2017-06-27	1	-1/+1
\| \| \| \| \| \|	to suppress warnings. ppc970 should be 970 (or g5) llvm-svn: 306380
*	[SROA] Fix PR32902 by more carefully propagating !nonnull metadata.	Chandler Carruth	2017-06-27	1	-0/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is based heavily on the work done in D34285. I mostly wanted to do test cleanup for the author to save them some time, but I had a really hard time understanding why it was so hard to write better test cases for these issues. The problem is that because SROA does a second rewrite of the loads and because we don't propagate !nonnull for non-pointer loads, we first introduced invalid !nonnull metadata and then stripped it back off just in time to avoid most ways of this PR manifesting. Moving to the more careful utility only fixes this by changing the predicate to look at the new load's type rather than the target type. However, that does fix the bug, and the utility is much nicer including adding range metadata to model the nonnull property after a conversion to an integer. However, we have bigger problems because we don't actually propagate range metadata, and the utility to do this extracted from instcombine isn't really in good shape to do this currently. It only handles the case of copying range metadata from an integer load to a pointer load. It doesn't even handle the trivial cases of propagating from one integer load to another when they are the same width! This utility will need to be beefed up prior to using in this location to get the metadata to fully survive. And even then, we need to go and teach things to turn the range metadata into an assume the way we do with nonnull so that when we promote an integer we don't lose the information. All of this will require a new test case that looks kind-of like `preserve-nonnull.ll` does here but focuses on range metadata. It will also likely require more testing because it needs to correctly handle changes to the integer width, especially as SROA actively tries to change the integer width! 
Last but not least, I'm a little worried about hooking the range metadata up here because the instcombine logic for converting from a range metadata to a nonnull metadata node seems broken in the face of non-zero address spaces where null is not mapped to the integer `0`. So that probably needs to get fixed with test cases both in SROA and in instcombine to cover it. But this does extract the core PR fix from D34285 of preventing the !nonnull metadata from being propagated in a broken state just long enough to feed into promotion and crash value tracking. On D34285 there is some discussion of zero-extend handling because it isn't necessary. First, the new load size covers all of the non-undef (i.e., possibly initialized) bits. This may even extend past the original alloca if loading those bits could produce valid data. The only way it's valid for us to zero-extend an integer load in SROA is if the original code had a zero extend or those bits were undef. And we get to assume things like undef never satisfies nonnull, so non undef bits can participate here. No need to special case the zero-extend handling, it just falls out correctly. The original credit goes to Ariel Ben-Yehuda! I'm mostly landing this to save a few rounds of trivial edits fixing style issues and test case formulation. Differential Revision: D34285 llvm-svn: 306379
*	AMDGPU: M0 operands to spill/restore opcodes are dead	Nicolai Haehnle	2017-06-27	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: With scalar stores, M0 is clobbered and therefore marked as implicitly defined. However, it is also dead. This fixes an assertion when the Greedy Register Allocator decides to optimize a spill/restore pair away again (via tryHintsRecoloring). Reviewers: arsenm Subscribers: qcolombet, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33319 llvm-svn: 306375
*	[GlobalISel][X86] Add fp32/64 legalizer, regbank-select, selection tests for ↵	Igor Breger	2017-06-27	15	-157/+970
\| \| \| \| \| \|	G_FADD, G_FSUB, G_FMUL, G_FDIV. NFC. llvm-svn: 306370
*	[Reassociate] Make sure EraseInst sets MadeChange	Mikael Holmen	2017-06-27	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: EraseInst didn't report that it made IR changes through MadeChange. It is essential that changes to the IR are reported correctly, since for example ReassociatePass::run() will indicate that all analyses are preserved otherwise. And the CGPassManager determines if the CallGraph is up-to-date based on status from InstructionCombiningPass::runOnFunction(). Reviewers: craig.topper, rnk, davide Reviewed By: rnk, davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34616 llvm-svn: 306368
*	[PowerPC] set optimization level in SelectionDAGISel	Hiroshi Inoue	2017-06-27	4	-45/+45
\| \| \| \| \| \| \| \| \|	PowerPC backend does not pass the current optimization level to SelectionDAGISel and so SelectionDAGISel works with the default optimization level regardless of the current optimization level. This patch makes the PowerPC backend set the optimization level correctly. Differential Revision: https://reviews.llvm.org/D34615 llvm-svn: 306367
*	[InstCombine] Add test cases demonstrating that we don't optimize ↵	Craig Topper	2017-06-27	1	-0/+27
\| \| \| \| \| \|	select+cmp+cttz/ctlz when the bitwidth is larger than 64 bits. llvm-svn: 306365
*	[SROA] Further test cleanup and add a test for the actual propagation of	Chandler Carruth	2017-06-27	1	-4/+25
\| \| \| \| \| \|	the nonnull attribute distinct from rewriting it into an assume. llvm-svn: 306358
*	[SROA] Clean up a test case a bit prior to adding more testing for	Chandler Carruth	2017-06-27	1	-15/+13
\| \| \| \| \| \|	nonnull as part of fixing PR32902. llvm-svn: 306353
*	ScheduleDAGInstrs: Fix fixupKills() adding too many kill flags.	Matthias Braun	2017-06-27	1	-0/+45
\| \| \| \| \| \| \| \| \|	Remove invalid shortcut in fixupKills(): A register needs to be marked live even when we are not adding a kill flag. This is because a partially live register must not get a kill flag, but it still needs to be fully marked live when walking backwards. llvm-svn: 306352
*	DAGCombine: Make sure we only eliminate trunc/extend when the scales of ↵	Wolfgang Pieb	2017-06-26	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \| \|	truncation and extension match. This fixes PR33368. Reviewer: rksimon Differential Revision: https://reviews.llvm.org/D34069 llvm-svn: 306345
*	revert r306336 for breaking ppc test.	Dehao Chen	2017-06-26	11	-76/+67
\| \| \| \|	llvm-svn: 306344
*	[x86] add tests for missing sbb transforms; NFC	Sanjay Patel	2017-06-26	1	-0/+80
\| \| \| \|	llvm-svn: 306337
*	Enable vectorizer-maximize-bandwidth by default.	Dehao Chen	2017-06-26	11	-67/+76
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact: spec/2006/fp/C++/444.namd 26.84 -0.31% spec/2006/fp/C++/447.dealII 46.19 +0.89% spec/2006/fp/C++/450.soplex 42.92 -0.44% spec/2006/fp/C++/453.povray 38.57 -2.25% spec/2006/fp/C/433.milc 24.54 -0.76% spec/2006/fp/C/470.lbm 41.08 +0.26% spec/2006/fp/C/482.sphinx3 47.58 -0.99% spec/2006/int/C++/471.omnetpp 22.06 +1.87% spec/2006/int/C++/473.astar 22.65 -0.12% spec/2006/int/C++/483.xalancbmk 33.69 +4.97% spec/2006/int/C/400.perlbench 33.43 +1.70% spec/2006/int/C/401.bzip2 23.02 -0.19% spec/2006/int/C/403.gcc 32.57 -0.43% spec/2006/int/C/429.mcf 40.35 +0.27% spec/2006/int/C/445.gobmk 26.96 +0.06% spec/2006/int/C/456.hmmer 24.4 +0.19% spec/2006/int/C/458.sjeng 27.91 -0.08% spec/2006/int/C/462.libquantum 57.47 -0.20% spec/2006/int/C/464.h264ref 46.52 +1.35% geometric mean +0.29% The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag. I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decide if this should be target-dependent. Reviewers: hfinkel, mkuper, davidxl, chandlerc Reviewed By: chandlerc Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 306336
*	Fix the bug when handling shufflevector for aarch64.	Dehao Chen	2017-06-26	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes https://bugs.llvm.org/show_bug.cgi?id=33600 Reviewers: mssimpso, davidxl, Carrot Reviewed By: mssimpso Subscribers: aemerson, rengolin, sanjoy, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D34641 llvm-svn: 306334
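
Note on llvm-svn 306471 ([CGP] eliminate a sub instruction in memcmp expansion): the rewrite quoted in that message replaces `%s = sub %x, %y; %r = icmp ne %s, 0` with `%r = icmp ne %x, %y`. The sketch below is not code from the patch. It is a small, hypothetical C++ check (the function names are made up for illustration) that the two forms agree under wrapping arithmetic, using 8-bit values so that every input pair can be tried; the commit itself operates on i32.

```cpp
#include <cassert>
#include <cstdint>

// Original form from the commit message:
//   %s = sub %x, %y
//   %r = icmp ne %s, 0
bool ne_via_sub(std::uint8_t x, std::uint8_t y) {
    // The cast makes the subtraction wrap modulo 2^8, like an IR 'sub'
    // without nsw/nuw flags.
    std::uint8_t s = static_cast<std::uint8_t>(x - y);
    return s != 0;
}

// Rewritten form:
//   %r = icmp ne %x, %y
bool ne_direct(std::uint8_t x, std::uint8_t y) {
    return x != y;
}

// Returns true when both forms agree for every pair of 8-bit inputs.
bool forms_agree_exhaustively() {
    for (unsigned x = 0; x < 256; ++x) {
        for (unsigned y = 0; y < 256; ++y) {
            auto a = static_cast<std::uint8_t>(x);
            auto b = static_cast<std::uint8_t>(y);
            if (ne_via_sub(a, b) != ne_direct(a, b)) {
                return false;
            }
        }
    }
    return true;
}
```

The equivalence holds because wrapping subtraction yields zero exactly when both operands are equal, which is why the `sub` can be dropped without changing the comparison result.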