bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Analysis: Move llvm::getConstantRangeFromMetadata to IR library.	Peter Collingbourne	2016-10-21	4	-22/+24
	We're about to start using it there. Differential Revision: https://reviews.llvm.org/D25877 llvm-svn: 284865
*	X86: Improve BT instruction selection for 64-bit values.	Peter Collingbourne	2016-10-21	1	-0/+8
	If a 64-bit value is tested against a bit which is known to be in the range [0..31) (modulo 64), we can use the 32-bit BT instruction, which has a slightly shorter encoding. Differential Revision: https://reviews.llvm.org/D25862 llvm-svn: 284864
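The reasoning behind the BT change can be checked with a small model. This is a sketch, not LLVM code: `bt64` and `bt32` are hypothetical helpers that model the register forms of BT, where the bit index is taken modulo the operand width. It checks every index whose value modulo 64 is below 32, which covers the range named in the commit message.

```python
# Hypothetical model of the x86 BT register forms: the bit index is
# reduced modulo the operand width and the selected bit is returned.
def bt64(value, index):
    return (value >> (index % 64)) & 1


def bt32(value, index):
    # Only the low 32 bits of the value are visible to the 32-bit form.
    return ((value & 0xFFFFFFFF) >> (index % 32)) & 1


def agree_for_low_indices(value):
    """True if bt32 matches bt64 for every index whose value mod 64 is below 32."""
    return all(
        bt64(value, i) == bt32(value, i)
        for i in range(256)
        if i % 64 < 32
    )
```

For an index in the upper half, such as bit 40, the two forms can disagree, which is why the known range of the index matters.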
*	[X86][AVX512BWVL] Added support for lowering v16i16 shuffles to AVX512BWVL vpermw	Simon Pilgrim	2016-10-21	1	-16/+20
	llvm-svn: 284863
*	[pdb] added support for dumping globals stream	Bob Haarman	2016-10-21	6	-46/+236
	Summary: This adds support for dumping the globals stream from PDB files using llvm-pdbdump, similar to the support we have for the publics stream. Reviewers: ruiu, zturner Subscribers: beanz, mgorny, modocache Differential Revision: https://reviews.llvm.org/D25801 llvm-svn: 284861
*	[X86][AVX512BWVL] Added support for combining target v16i16 shuffles to AVX512BWVL vpermw	Simon Pilgrim	2016-10-21	1	-2/+2
	llvm-svn: 284860
*	[X86][AVX512] Added support for combining target shuffles to AVX512 vpermpd/vpermq/vpermps/vpermd/vpermw	Simon Pilgrim	2016-10-21	1	-7/+11
	llvm-svn: 284858
*	[RDF] Use RegisterId typedef more consistently, NFC	Krzysztof Parzyszek	2016-10-21	3	-11/+12
	llvm-svn: 284857
*	[StripGCRelocates] New pass to remove gc.relocates added by RS4GC	Anna Thomas	2016-10-21	3	-0/+82
	Summary: Utility pass to remove gc.relocates created by rewrite statepoints for GC. With respect to safepoint verification, the IR generated would be incorrect, and cannot run as such. This would be a single transformation on the final optimized IR. The benefit of the pass is for easy analysis when the IRs are 'polluted' by too many gc.relocates. Added tests. test run: All RS4GC tests with -verify option. Local downstream tests on large IR files. This also works when the pointer being gc.relocated is another gc.relocate. Reviewers: sanjoy, reames Subscribers: beanz, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D25096 llvm-svn: 284855
*	[DAG] fold negation of sign-bit	Sanjay Patel	2016-10-21	1	-11/+27
	0 - X --> 0, if the sub is NUW
	0 - X --> 0, if X is 0 or the minimum signed value and the sub is NSW
	0 - X --> X, if X is 0 or the minimum signed value
	This is the DAG equivalent of: https://reviews.llvm.org/rL284649 plus the fold for the NUW case which already existed in InstSimplify. Note that we miss a vector fold because of a deficiency in the DAG version of computeKnownBits(). llvm-svn: 284844
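The three negation folds can be confirmed by brute force on a small bit width. This is an illustrative sketch on 8-bit values, not DAG combiner code; `neg`, `unsigned_wrap` and `signed_wrap` are hypothetical helpers modelling wrapping subtraction and the conditions that NUW and NSW rule out.

```python
BITS = 8
MASK = (1 << BITS) - 1
SIGN_BIT = 1 << (BITS - 1)  # bit pattern of the minimum signed value


def neg(x):
    """Wrapping two's-complement 0 - x on BITS-bit values."""
    return (0 - x) & MASK


def to_signed(x):
    return x - (1 << BITS) if x & SIGN_BIT else x


def unsigned_wrap(x):
    """0 - x wraps as an unsigned subtraction whenever x is nonzero."""
    return 0 - x < 0


def signed_wrap(x):
    """0 - x overflows as a signed subtraction only for the minimum value."""
    result = 0 - to_signed(x)
    return not (-SIGN_BIT <= result <= SIGN_BIT - 1)


# Values that are their own negation: 0 and the minimum signed value.
self_negating = [x for x in range(1 << BITS) if neg(x) == x]
# With NUW, the only input that does not wrap is 0, so the result is 0.
nuw_ok = [x for x in range(1 << BITS) if not unsigned_wrap(x)]
# With NSW, of the two self-negating values only 0 does not overflow.
nsw_ok = [x for x in self_negating if not signed_wrap(x)]
```

Under these assumptions the only input that survives either no-wrap flag is 0, which is why both flagged folds produce 0.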
*	[Hexagon] Handle spills of partially defined double vector registers	Krzysztof Parzyszek	2016-10-21	1	-21/+36
	After register allocation it is possible to have a spill of a register that is only partially defined. That in itself is fine, but it creates a problem for double vector registers. Stores of such registers are pseudo instructions that are expanded into pairs of individual vector stores, and in case of a partially defined source, one of the stores may use an entirely undefined register. To avoid this, track the defined parts and only generate actual stores for those. llvm-svn: 284841
*	[WebAssembly] Fix for 0xc call_indirect changes	Derek Schuff	2016-10-21	6	-20/+159
	Summary: Need to reorder the operands to have the callee as the last argument. Adds a pseudo-instruction, and a pass to lower it into a real call_indirect. This is the first of two options for how to fix the problem. Reviewers: dschuff, sunfish Subscribers: jfb, beanz, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D25708 llvm-svn: 284840
*	Set the vectorizer MaxInterleaveFactor for Exynos.	Abderrazek Zaafrani	2016-10-21	1	-0/+1
	llvm-svn: 284839
*	Fix -Wunused-variable warning in libFuzzer	Reid Kleckner	2016-10-21	1	-1/+1
	llvm-svn: 284838
*	[X86] Use DAG::getBuildVector helper wrapper where possible. NFCI.	Simon Pilgrim	2016-10-21	1	-4/+4
	llvm-svn: 284835
*	Test commit	Abderrazek Zaafrani	2016-10-21	1	-1/+0
	llvm-svn: 284832
*	[LVI] Fix a bug with a guard being the very first instruction in a BB not taken into account	Artur Pilipenko	2016-10-21	1	-5/+4
	While looking for guards, use a reverse iterator and scan up to rend(), not to begin(). llvm-svn: 284827
*	fix variable names; NFCI	Sanjay Patel	2016-10-21	1	-2/+2
	Because we're just 'or-ing' these 2 variables later in the code, I don't think there's a logical bug here, but of course the string with "no size" is the one that should have the size suffix stripped off. llvm-svn: 284826
*	[AMDGPU][mc] Fix ds_min/max[_rtn]_f32 - extra source operand removed.	Artem Tamazov	2016-10-21	1	-4/+4
	Fixes Bug 28215. Lit tests updated. Differential Revision: https://reviews.llvm.org/D25837 llvm-svn: 284825
*	[DAG] use SDNode flags 'nsz' to enable fadd/fsub with zero folds	Sanjay Patel	2016-10-21	1	-16/+20
	As discussed in D24815, let's start the process of killing off the broken fast-math global state housed in TargetOptions and eliminate the need for function-level fast-math attributes. Here we enable two similar folds that are possible when we don't care about signed-zero:
	fadd nsz x, 0 --> x
	fsub nsz 0, x --> -x
	Note that although the test cases include a 'sin' function call, I'm side-stepping the FMF-on-calls question (and lack of support in the DAG) for now. It's not needed for these tests - isNegatibleForFree/GetNegatedExpression just look through an ISD::FSIN node. Also, when we create an FNEG node and propagate the Flags of the FSUB to it, this doesn't actually do anything today because Flags are silently dropped for any node that is not a binary operator. Differential Revision: https://reviews.llvm.org/D25297 llvm-svn: 284824
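Why the two floating-point folds need 'nsz' can be seen with plain IEEE-754 doubles. This is a minimal sketch using Python floats, assuming the default round-to-nearest mode; `is_negative_zero` is a hypothetical helper, and nothing here is DAG code.

```python
import math


def is_negative_zero(v):
    """True only for the IEEE-754 value -0.0."""
    return v == 0.0 and math.copysign(1.0, v) < 0


# fadd x, 0 --> x changes the sign of zero when x is -0.0:
x = -0.0
fadd_result = x + 0.0  # (-0.0) + (+0.0) evaluates to +0.0, not x

# fsub 0, x --> -x changes the sign of zero when x is +0.0:
y = 0.0
fsub_result = 0.0 - y  # (+0.0) - (+0.0) evaluates to +0.0
fneg_result = -y       # negation gives -0.0
```

In both cases the folded and unfolded forms differ only in the sign of a zero result, which is exactly the difference 'nsz' permits the optimizer to ignore.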
*	[X86][AVX2] Begun generalizing lowering to VPERMD/VPERMPS in preparation for AVX512 support.	Simon Pilgrim	2016-10-21	1	-8/+9
	llvm-svn: 284823
*	[X86][AVX512] Add mask/maskz writemask support to subvector broadcast shuffle decode comments	Simon Pilgrim	2016-10-21	1	-0/+40
	llvm-svn: 284821
*	[LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops	John Brawn	2016-10-21	3	-33/+65
	When we have a loop with a known upper bound on the number of iterations, and furthermore know that the number of iterations will be either exactly that upper bound or zero, then we can fully unroll up to that upper bound keeping only the first loop test to check for the zero iteration case. Most of the work here is in plumbing this 'max-or-zero' information from the part of scalar evolution where it's detected through to loop unrolling. I've also gone for the safe default of 'false' everywhere but howManyLessThans which could probably be improved. Differential Revision: https://reviews.llvm.org/D25682 llvm-svn: 284818
*	[AArch64] Corrected spill size for DDD register class. NFCI	Bjorn Pettersson	2016-10-21	1	-1/+1
	Summary: The spill size was incorrectly set to 196 bits, which isn't a multiple of 8. This problem was detected when experimenting with asserts that the spill size should be a multiple of the byte size. New corrected value for the spill size is set to 192 bits. Note that tablegen (RegisterInfoEmitter) will divide the size set in the RegisterClass definition by 8. So this change should not have any impact on the tablegen output (trunc(192/8) == trunc(196/8) == 24 bytes). Reviewers: t.p.northover Subscribers: llvm-commits, aemerson, rengolin Differential Revision: https://reviews.llvm.org/D25818 llvm-svn: 284814
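The arithmetic in the spill size note is quick to verify. In this sketch, `spill_bytes` is a hypothetical stand-in for the truncating division by 8 that the message attributes to tablegen; it is not taken from RegisterInfoEmitter.

```python
def spill_bytes(size_in_bits):
    """Truncating division of a spill size in bits by 8 (hypothetical helper)."""
    return size_in_bits // 8


old_bits, new_bits = 196, 192

# Both values truncate to the same byte count, so the emitted size is unchanged.
same_output = spill_bytes(old_bits) == spill_bytes(new_bits)

# Only the corrected value is a whole number of bytes.
old_remainder = old_bits % 8
new_remainder = new_bits % 8
```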
*	Revert "[GVN/PRE] Hoist global values outside of loops."	Davide Italiano	2016-10-21	1	-58/+26
	There's no agreement about this patch. I personally find the PRE machinery of the current GVN hard enough to reason about that I'm not sure I'll try to land this again (instead of working on the rewrite). llvm-svn: 284796
*	Fix cross-endianness RuntimeDyld relocation for ARM	Keno Fischer	2016-10-20	1	-9/+10
	rL284780 fixed the PREL31 relocation and added a test for it. Being the first such test for ARM relocations, it exposed incorrect endianness assumptions (causing buildbot failures on big-endian hosts). Fix that by using the same helpers used for the x86 case. llvm-svn: 284789
*	[SCEV] Add a threshold to restrict number of mul operands to be inlined into SCEV	Li Huang	2016-10-20	1	-0/+7
	This is to avoid inlining too many multiplication operands into a SCEV, which could take exponential time in the worst case. Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin Differential Revision: https://reviews.llvm.org/D25794 llvm-svn: 284784
*	Fix PREL31 relocation on ARM	Keno Fischer	2016-10-20	1	-0/+4
	Summary: This is a 31-bit relative relocation instead of a 32-bit absolute relocation. Reviewers: t.p.northover, peter.smith, rengolin Subscribers: aemerson, llvm-commits, samparker Differential Revision: https://reviews.llvm.org/D25069 llvm-svn: 284780
*	[X86] Enable interleaved memory access by default	Michael Kuperstein	2016-10-20	2	-0/+9
	This lets the loop vectorizer generate interleaved memory accesses on x86. Differential Revision: https://reviews.llvm.org/D25350 llvm-svn: 284779
*	[MSSA] Avoid unnecessary use walks when calling getClobberingMemoryAccess	Daniel Berlin	2016-10-20	1	-6/+37
	Summary: This allows us to mark when uses have been optimized. This lets us avoid rewalking (i.e. when people call getClobberingAccess on everything), and also enables us to later relax the requirement of use optimization during updates with less cost. Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25172 llvm-svn: 284771
*	Another additional error check for invalid Mach-O files for the	Kevin Enderby	2016-10-20	1	-0/+34
	load commands that use the MachO::twolevel_hints_command type which includes only the LC_TWOLEVEL_HINTS load command. This is not used in llvm libObject code or in llvm tool code, but it does appear in one of the binary test files. While this load command is obsolete, it is easier to add code for it in libObject than to edit or change the binary test case. llvm-svn: 284769
*	[CodeView] Refactor serialization to use StreamInterface.	Zachary Turner	2016-10-20	4	-217/+242
	This was all using ArrayRef<>s before, which presents a problem when you want to serialize to or deserialize from an actual PDB stream. An ArrayRef<> is really just a special case of what can be handled with StreamInterface though (e.g. by using a ByteStream), so changing this to use StreamInterface allows us to plug in a PDB stream and get all the record serialization and deserialization for free on a MappedBlockStream. Subsequent patches will try to remove TypeTableBuilder and TypeRecordBuilder in favor of classes that operate on Streams as well, which should allow us to completely merge the reading and writing codepaths for both types and symbols. Differential Revision: https://reviews.llvm.org/D25831 llvm-svn: 284762
*	[AMDGPU] Make note record name a static const member of target streamer	Konstantin Zhuravlyov	2016-10-20	2	-13/+15
	Differential Revision: https://reviews.llvm.org/D25746 llvm-svn: 284760
*	[AMDGPU] Emit constant address space data in .rodata section and use relocations instead of fixups (amdhsa only)	Konstantin Zhuravlyov	2016-10-20	5	-24/+46
	Differential Revision: https://reviews.llvm.org/D25693 llvm-svn: 284759
*	Using branch probability to guide critical edge splitting.	Dehao Chen	2016-10-20	1	-0/+18
	Summary: The original heuristic to break critical edge during machine sink is relatively conservative: when there is only one instruction sinkable to the critical edge, it is likely that the machine sink pass will not break the critical edge. This leads to many speculative instructions executed at runtime. However, with profile info, we could model the splitting benefits: if the critical edge has 50% taken rate, it would always be beneficial to split the critical edge to avoid the speculated runtime instructions. This patch uses profile to guide critical edge splitting in machine sink pass.
	The performance impact on speccpu2006 on Intel sandybridge machines:
	spec/2006/fp/C++/444.namd 25.3 +0.26%
	spec/2006/fp/C++/447.dealII 45.96 -0.10%
	spec/2006/fp/C++/450.soplex 41.97 +1.49%
	spec/2006/fp/C++/453.povray 36.83 -0.96%
	spec/2006/fp/C/433.milc 23.81 +0.32%
	spec/2006/fp/C/470.lbm 41.17 +0.34%
	spec/2006/fp/C/482.sphinx3 48.13 +0.69%
	spec/2006/int/C++/471.omnetpp 22.45 +3.25%
	spec/2006/int/C++/473.astar 21.35 -2.06%
	spec/2006/int/C++/483.xalancbmk 36.02 -2.39%
	spec/2006/int/C/400.perlbench 33.7 -0.17%
	spec/2006/int/C/401.bzip2 22.9 +0.52%
	spec/2006/int/C/403.gcc 32.42 -0.54%
	spec/2006/int/C/429.mcf 39.59 +0.19%
	spec/2006/int/C/445.gobmk 26.98 -0.00%
	spec/2006/int/C/456.hmmer 24.52 -0.18%
	spec/2006/int/C/458.sjeng 28.26 +0.02%
	spec/2006/int/C/462.libquantum 55.44 +3.74%
	spec/2006/int/C/464.h264ref 46.67 -0.39%
	geometric mean +0.20%
	Manually checked 473 and 471 to verify the diff is in the noise range. Reviewers: rengolin, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24818 llvm-svn: 284757
*	[CostModel][X86] Fixed AVX1/AVX512 sdiv/udiv uniformconst costs for 256/512 bit integer vectors	Simon Pilgrim	2016-10-20	1	-17/+48
	We weren't checking for uniform const costs before the general cost, resulting in very high estimates. llvm-svn: 284755
*	Fix *_EXTEND_VECTOR_INREG legalization	Pirama Arumuga Nainar	2016-10-20	1	-3/+19
	Summary: While promoting _EXTEND_VECTOR_INREG nodes whose inputs are already promoted, perform the appropriate sign extension for the promoted node before doing the _EXTEND_VECTOR_INREG operation. If not, the undefined high-order bits of the promoted operand may (a) be garbage (in case of zext) or (b) contribute the wrong sign-bit (in case of sext). Updated the promote-vec3.ll test after this change. The diff shows explicit zeroing in case of zext and intermediate sign extension in case of sext. Reviewers: RKSimon Subscribers: llvm-commits, srhines Differential Revision: https://reviews.llvm.org/D25790 llvm-svn: 284752
*	[Target] remove TargetRecip class; 2nd try	Sanjay Patel	2016-10-20	10	-348/+286
	This is a retry of r284495 which was reverted at r284513 due to use-after-scope bugs caused by faulty usage of StringRef. This version also renames a pair of functions:
	getRecipEstimateDivEnabled()
	getRecipEstimateSqrtEnabled()
	as suggested by Eric Christopher.
	original commit msg:
	[Target] remove TargetRecip class; move reciprocal estimate isel functionality to TargetLowering
	This is a follow-up to https://reviews.llvm.org/D24816 - where we changed reciprocal estimates to be function attributes rather than TargetOptions. This patch is intended to be a structural, but not functional change. By moving all of the TargetRecip functionality into TargetLowering, we can remove all of the reciprocal estimate state, shield the callers from the string format implementation, and simplify/localize the logic needed for a target to enable this.
	If a function has a "reciprocal-estimates" attribute, those settings may override the target's default reciprocal preferences for whatever operation and data type we're trying to optimize. If there's no attribute string or specific setting for the op/type pair, just use the target default settings.
	As noted earlier, a better solution would be to move the reciprocal estimate settings to IR instructions and SDNodes rather than function attributes, but that's a multi-step job that requires infrastructure improvements. I intend to work on that, but it's not clear how long it will take to get all the pieces in place.
	Differential Revision: https://reviews.llvm.org/D25440 llvm-svn: 284746
*	[CostModel][X86] Fixed AVX1/AVX512 sdiv/udiv general costs for 256/512 bit integer vectors	Simon Pilgrim	2016-10-20	1	-2/+30
	We weren't accounting for legal types on every subtarget, meaning that many of the costs were using defaults. We still don't correctly cost (or test) the 512-bit sdiv/udiv by uniform const cases, nor the power-of-2 cases. llvm-svn: 284744
*	[AMDGPU] add fcopysign(f64, f32) pattern	Valery Pykhtin	2016-10-20	1	-0/+9
	Differential Revision: https://reviews.llvm.org/D25827 llvm-svn: 284743
*	Retire llvm::alignOf in favor of C++11 alignof.	Benjamin Kramer	2016-10-20	7	-15/+12
	No functionality change intended. llvm-svn: 284733
*	[GVN] Use defaulted members. No functional change.	Benjamin Kramer	2016-10-20	1	-10/+3
	llvm-svn: 284726
*	[mips][mcjit] Add the majority of N32 support.	Simon Dardis	2016-10-20	3	-17/+36
	The missing piece is relocation composition for %hi(%neg(%gp_rel(x))) and similar. Patch by: Daniel Sanders llvm-svn: 284724
*	Do a sweep over move ctors and remove those that are identical to the default.	Benjamin Kramer	2016-10-20	16	-142/+5
	All of these existed because MSVC 2013 was unable to synthesize default move ctors. We recently dropped support for it, so all that error-prone boilerplate can go. No functionality change intended. llvm-svn: 284721
*	Reapply "Add Chrono.h - std::chrono support header"	Pavel Labath	2016-10-20	5	-124/+56
	This is a resubmission of r284590. The mingw build should be fixed now. The problem was that we were matching time_t with _localtime_64s, which was incorrect on _USE_32BIT_TIME_T systems. Instead I use localtime_s, which should always evaluate to the correct function. llvm-svn: 284720
*	[DAGCombiner] Add general constant vector support to (srl (shl x, c), c) -> (and x, cst2)	Simon Pilgrim	2016-10-20	1	-8/+8
	We already supported scalar constant / splatted constant vector - this now accepts any (non opaque) constant scalar / vector. llvm-svn: 284717
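The fold (srl (shl x, c), c) -> (and x, cst2) can be illustrated lane by lane. This is a sketch on 8-bit lanes with hypothetical helpers (`shl`, `srl`, `fold_mask`), and it assumes cst2 is the all-ones value logically shifted right by c; it is not DAGCombiner code.

```python
BITS = 8
MASK = (1 << BITS) - 1


def shl(x, c):
    """Left shift that discards bits shifted out of the lane."""
    return (x << c) & MASK


def srl(x, c):
    """Logical right shift on an unsigned lane."""
    return x >> c


def fold_mask(c):
    """Assumed form of cst2: all-ones shifted right by the same amount."""
    return MASK >> c


def srl_of_shl(lanes, amounts):
    return [srl(shl(x, c), c) for x, c in zip(lanes, amounts)]


def and_with_mask(lanes, amounts):
    return [x & fold_mask(c) for x, c in zip(lanes, amounts)]
```

A non-splat example such as `srl_of_shl([0xFF, 0xAB, 0x80, 0x01], [1, 3, 0, 7])` gives the same lanes as `and_with_mask` with the same arguments, because each lane uses its own shift amount and its own mask.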
*	Fix spelling mistake in comment.	Simon Pilgrim	2016-10-20	1	-1/+1
	llvm-svn: 284714
*	Fix MSVC bool -> uint64_t promotion warning	Simon Pilgrim	2016-10-20	1	-1/+1
	llvm-svn: 284713
*	[SystemZ] Post-RA scheduler implementation	Jonas Paulsson	2016-10-20	17	-18/+3323
	Post-RA sched strategy and scheduling instruction annotations for z196, zEC12 and z13. This scheduler optimizes decoder grouping and balances processor resources (including side steering the FPd unit instructions). The SystemZHazardRecognizer keeps track of the scheduling state, which can be dumped with -debug-only=misched. Reviewers: Ulrich Weigand, Andrew Trick. https://reviews.llvm.org/D17260 llvm-svn: 284704
*	X86: Allow expressions to appear as u8imm operands.	Peter Collingbourne	2016-10-20	3	-1/+9
	llvm-svn: 284688
*	X86: Deduplicate some lowering code. NFCI.	Peter Collingbourne	2016-10-20	2	-34/+18
	llvm-svn: 284686