bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[codeview] Wire up the .cv_inline_linetable directive	Reid Kleckner	2016-02-02	6	-64/+120
\| \| \| \| \| \| \| \|	This directive emits the binary annotations that describe line and code deltas in inlined call sites. Single-stepping through inlined frames in windbg now works. llvm-svn: 259535
*	[MC] Enable eip-relative addressing on x86-64 for X32 ABI	Derek Schuff	2016-02-02	1	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Enables eip-based addressing, e.g., lea constant(%eip), %rax lea constant(%eip), %eax in MC, (used for the x32 ABI). EIP-base addressing is also valid in x86_64, it is left enabled for that architecture as well. Patch by João Porto Differential Revision: http://reviews.llvm.org/D16581 llvm-svn: 259528
*	Refactor backend diagnostics for unsupported features	Oliver Stannard	2016-02-02	9	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Re-commit of r258951 after fixing layering violation. The BPF and WebAssembly backends had identical code for emitting errors for unsupported features, and AMDGPU had very similar code. This merges them all into one DiagnosticInfo subclass, that can be used by any backend. There should be minimal functional changes here, but some AMDGPU tests have been updated for the new format of errors (it used a slightly different format to BPF and WebAssembly). The AMDGPU error messages will now benefit from having precise source locations when debug info is available. llvm-svn: 259498
*	[X86][AVX512] Add support for AVX512 VMOVQ (load) shuffle decoding	Simon Pilgrim	2016-02-02	4	-108/+36
\| \| \| \|	llvm-svn: 259496
*	Removed FeatureVFPOnlySP from the Cortex-R7 processor model	Sjoerd Meijer	2016-02-02	1	-2/+1
\| \| \| \| \| \| \| \| \|	description and changed the regression test accordingly. The default configuration of a Cortex-R7 is to implement the VFPv3-D16 architecture and the feature line as it was is too restrictive. llvm-svn: 259480
*	[RegisterCoalescer] Better DebugLoc for reMaterializeTrivialDef	David Majnemer	2016-02-02	1	-0/+56
\| \| \| \| \| \| \| \| \| \|	When rematerializing a computation by replacing the copy, use the copy's location. The location of the copy is more representative of the original program. This partially fixes PR10003. llvm-svn: 259469
*	[LCG] Build an edge abstraction for the LazyCallGraph and use it to	Chandler Carruth	2016-02-02	2	-29/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	differentiate between indirect references to functions and direct calls. This doesn't do a whole lot yet other than change the print out produced by the analysis, but it lays the groundwork for a very major change I'm working on next: teaching the call graph to actually be a call graph, modeling both the indirect reference graph and the call graph simultaneously. More details on that in the next patch though. The rest of this is essentially a bunch of over-engineering that won't be interesting until the next patch. But this also isolates essentially all of the churn necessary to introduce the edge abstraction from the very important behavior change necessary in order to separately model the two graphs. So it should make review of the subsequent patch a bit easier at the cost of making this patch seem poorly motivated. ;] Differential Revision: http://reviews.llvm.org/D16038 llvm-svn: 259463
*	[LVI] Introduce an intersect operation on lattice values	Philip Reames	2016-02-02	2	-0/+75
\| \| \| \| \| \| \| \| \| \| \| \|	LVI has several separate sources of facts - edge local conditions, recursive queries, assumes, and control independent value facts - which all apply to the same value at the same location. The existing implementation was very conservative about exploiting all of these facts at once. This change introduces an "intersect" function specifically to abstract the action of picking a good set of facts from all of the separate facts given. At the moment, this function is relatively simple (i.e. mostly just reuses the bits which were already there), but even the minor additions reveal the inherent power. For example, JumpThreading is now capable of doing an inductive proof that a particular value is always positive and removing a half range check. I'm currently only using the new intersect function in one place. If folks are happy with the direction of the work, I plan on making a series of small changes without review to replace mergeIn with intersect at all the appropriate places. Differential Revision: http://reviews.llvm.org/D14476 llvm-svn: 259461
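The idea of combining several independent facts about one value can be sketched with plain integer ranges. This is only an illustration under simplifying assumptions, not LVI's actual lattice: the `Range` type and the `intersect` helper are hypothetical names, and ranges are modeled as half-open `[lo, hi)` pairs.

```python
from typing import NamedTuple, Optional


class Range(NamedTuple):
    lo: int  # inclusive lower bound
    hi: int  # exclusive upper bound


def intersect(a: Range, b: Range) -> Optional[Range]:
    """Return the values allowed by both facts, or None if they contradict."""
    lo = max(a.lo, b.lo)
    hi = min(a.hi, b.hi)
    return Range(lo, hi) if lo < hi else None


# Two facts about the same value at the same location (hypothetical inputs).
edge_fact = Range(0, 100)      # e.g. learned from a branch condition
assume_fact = Range(10, 1000)  # e.g. learned from an assume
combined = intersect(edge_fact, assume_fact)
```

Here the combined fact is tighter than either input, which is the benefit the commit message describes.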
*	[X86] Fix a bug in getMemOpBaseRegImmOfs	Sanjoy Das	2016-02-02	1	-1/+35
\| \| \| \| \| \| \| \| \|	Fix a crash in `getMemOpBaseRegImmOfs` that happens if the base of `MemOp` is a frame index memory operand. The fix is to have `getMemOpBaseRegImmOfs` bail out in such cases. We can possibly be more clever here, if needed. llvm-svn: 259456
*	[X86][FastISel] Don't force Nearest-Even rounding for VCVTPS2PH, use MXCSR.	Ahmed Bougacha	2016-02-02	1	-1/+1
\| \| \| \| \| \|	FastISel counterpart to r259448. llvm-svn: 259449
*	[X86] Don't force Nearest-Even rounding for VCVTPS2PH, use MXCSR.	Ahmed Bougacha	2016-02-02	2	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Officially, we don't acknowledge non-default configurations of MXCSR, as getting there would require usage of the FENV_ACCESS pragma (at least insofar as rounding mode is concerned). We don't support the pragma, so we can assume that the default rounding mode - round to nearest, ties to even - is always used. However, it's inconsistent with the rest of the instruction set, where MXCSR is always effective (unless otherwise specified). Also, it's an unnecessary obstacle to the few brave souls that use fenv.h with LLVM. Avoid the hard-coded rounding mode for fp_to_f16; use MXCSR instead. llvm-svn: 259448
*	[safestack] Make sure the unsafe stack pointer is popped in all cases	Anna Zaks	2016-02-02	3	-4/+8
\| \| \| \| \| \| \| \| \| \|	The unsafe stack pointer is only popped in moveStaticAllocasToUnsafeStack so it won't happen if there are no static allocas. Fixes https://llvm.org/bugs/show_bug.cgi?id=26122 Differential Revision: http://reviews.llvm.org/D16339 llvm-svn: 259447
*	[LVI] Missing test case from 259432	Philip Reames	2016-02-01	1	-0/+25
\| \| \| \|	llvm-svn: 259437
*	Add test for PR26419 (stable function summary ordering)	Teresa Johnson	2016-02-01	1	-3/+10
\| \| \| \| \| \| \|	Enhance an existing test to also check that the ordering of the function summary entries is stable. llvm-svn: 259434
*	[X86][AVX512] Add support for AVX512 VMOVD (load) shuffle decoding	Simon Pilgrim	2016-02-01	3	-19/+9
\| \| \| \|	llvm-svn: 259430
*	[LVI] Add select handling	Philip Reames	2016-02-01	1	-0/+69
\| \| \| \| \| \| \| \|	Teach LVI to handle select instructions in the exact same way it handles PHI nodes. This is useful since various parts of the optimizer convert PHI nodes into selects and we don't want these transformations to cause inferior optimization. Note that this patch does nothing to exploit the implied constraint on the inputs represented by the select condition itself. That will be a later patch and is blocked on http://reviews.llvm.org/D14476 llvm-svn: 259429
*	[X86][AVX512] Add support for AVX512 VMOVSD/VMOVSS shuffle decoding	Simon Pilgrim	2016-02-01	3	-56/+22
\| \| \| \|	llvm-svn: 259427
*	[InstCombine] simplify masked scatter/gather intrinsics with zero masks	Sanjay Patel	2016-02-01	1	-3/+20
\| \| \| \| \| \| \| \| \| \| \|	A masked scatter with a zero mask means there's no store. A masked gather with a zero mask means the passthru arg is returned. This is a continuation of: http://reviews.llvm.org/rL259369 http://reviews.llvm.org/rL259392 llvm-svn: 259421
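A rough model of the zero-mask semantics described above, written as plain Python lists. The helper names `masked_gather` and `masked_scatter` and their parameters are illustrative assumptions and do not mirror the real intrinsic signatures.

```python
def masked_gather(memory, indices, mask, passthru):
    """Read memory[i] for enabled lanes; disabled lanes take the passthru value."""
    return [memory[i] if m else p for i, m, p in zip(indices, mask, passthru)]


def masked_scatter(memory, indices, values, mask):
    """Write values to memory[i] only for enabled lanes."""
    for i, v, m in zip(indices, values, mask):
        if m:
            memory[i] = v


memory = [10, 20, 30, 40]
zero_mask = [False] * 4
passthru = [-1, -2, -3, -4]

# With every lane disabled, the gather result is exactly the passthru vector.
gathered = masked_gather(memory, [3, 2, 1, 0], zero_mask, passthru)

# With every lane disabled, the scatter leaves memory untouched.
before = list(memory)
masked_scatter(memory, [0, 1, 2, 3], [7, 7, 7, 7], zero_mask)
```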
*	[X86][AVX512] Add support for AVX512 VINSERTPS shuffle decoding	Simon Pilgrim	2016-02-01	2	-35/+11
\| \| \| \|	llvm-svn: 259420
*	[X86][SSE] Regenerated load vector + element extraction tests.	Simon Pilgrim	2016-02-01	1	-22/+69
\| \| \| \|	llvm-svn: 259416
*	[X86][SSE] Add AVX512 merge consecutive load tests	Simon Pilgrim	2016-02-01	3	-57/+759
\| \| \| \| \| \| \| \|	Add AVX512F/AVX512BW 512-bit tests. Add AVX512F tests to existing 128/256-bit tests. llvm-svn: 259410
*	Regenerate vector blend tests.	Simon Pilgrim	2016-02-01	1	-8/+9
\| \| \| \|	llvm-svn: 259406
*	Regenerate vector sext/zext constant folding tests.	Simon Pilgrim	2016-02-01	1	-72/+81
\| \| \| \|	llvm-svn: 259405
*	Avoid inlining call sites in unreachable-terminated block	Jun Bum Lim	2016-02-01	2	-3/+138
\| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If the normal destination of the invoke or the parent block of the call site is unreachable-terminated, there is little point in inlining the call site unless there is literally zero cost. Unlike my previous change (D15289), this change specifically handles the call sites followed by unreachable in the same basic block for call or in the normal destination for the invoke. This change could be a reasonable first step to conservatively inline call sites leading to an unreachable-terminated block while BFI / BPI is not yet available in inliner. Reviewers: manmanren, majnemer, hfinkel, davidxl, mcrosier, dblaikie, eraman Subscribers: dblaikie, davidxl, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16616 llvm-svn: 259403
*	Add a test for r258362.	Rafael Espindola	2016-02-01	2	-0/+25
\| \| \| \| \| \|	Thanks to Mehdi for finding it. llvm-svn: 259394
*	[InstCombine] simplify masked store intrinsics with all ones or zeros masks	Sanjay Patel	2016-02-01	1	-0/+18
\| \| \| \| \| \| \| \| \| \|	A masked store with a zero mask means there's no store. A masked store with an allOnes mask means it's a normal vector store. This is a continuation of: http://reviews.llvm.org/rL259369 llvm-svn: 259392
*	AArch64: Implement missed conditional compare sequences.	Balaram Makam	2016-02-01	1	-12/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is an extension to the existing implementation of r242436 which restricts to only select inputs. This version fixes missed opportunities in pr26084 by attempting to lower conditional compare sequences of and/or trees with setcc leafs. This will additionally handle the case when a tree with select input is not a conjunction-disjunction tree but some of the sub trees are conjunction-disjunction trees. Reviewers: jmolloy, t.p.northover, mcrosier, MatzeB Subscribers: mcrosier, llvm-commits, junbuml, haicheng, mssimpso, gberry Differential Revision: http://reviews.llvm.org/D16291 llvm-svn: 259387
*	Add test case missing from r259357 (NFC)	Matthew Simpson	2016-02-01	1	-0/+26
\| \| \| \|	llvm-svn: 259385
*	[SystemZ] Fix wrong-code generation for certain always-false conditions	Ulrich Weigand	2016-02-01	1	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We've found another bug in the code generation logic conditions for a certain class of always-false conditions, those of the form if ((a & 1) < 0) These only reach the back end when compiling without optimization. The bug was introduced by the choice of using TEST UNDER MASK to implement a check for if ((a & MASK) < VAL) as if ((a & MASK) == 0) where VAL is less than the lowest bit of MASK. This is correct in all cases except for VAL == 0, in which case the original condition is always false, but the replacement isn't. Fixed by excluding that particular case. llvm-svn: 259381
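The VAL == 0 corner case can be checked by brute force over a small bit width. This is a sketch with hypothetical helper names and an arbitrarily chosen 8-bit mask; values are treated as non-negative integers. The loop also covers VAL equal to the lowest bit of MASK, where the equivalence still holds.

```python
def original(a, mask, val):
    """The source-level condition: (a & MASK) < VAL."""
    return (a & mask) < val


def replacement(a, mask):
    """The rewritten condition: (a & MASK) == 0."""
    return (a & mask) == 0


def lowest_bit(mask):
    """Isolate the lowest set bit of mask."""
    return mask & -mask


def mismatches(mask, val, width=8):
    """All inputs a where the two conditions disagree."""
    return [a for a in range(1 << width)
            if original(a, mask, val) != replacement(a, mask)]


mask = 0b01101000
low = lowest_bit(mask)

# For 0 < VAL <= lowest bit of MASK, the rewrite agrees on every input.
safe = all(not mismatches(mask, val) for val in range(1, low + 1))

# For VAL == 0 the original is always false, but the rewrite is true
# whenever none of the masked bits are set.
broken = mismatches(mask, 0)
```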
*	fix broken check lines	Sanjay Patel	2016-02-01	1	-2/+2
\| \| \| \| \| \|	Without the colon, it doesn't mean anything! llvm-svn: 259377
*	[InstCombine] Don't transform (X+INT_MAX)>=(Y+INT_MAX) -> (X<=Y)	David Majnemer	2016-02-01	1	-0/+12
\| \| \| \| \| \| \| \| \|	This miscompile came about because we tried to use a transform which was only appropriate for xor operators when addition was present. This fixes PR26407. llvm-svn: 259375
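Why the fold holds for xor but not for addition can be checked exhaustively at a small bit width. The sketch below assumes 8-bit values, an unsigned `>=` on the left-hand side, and a signed `<=` on the folded side; those predicate choices and all helper names are assumptions made for illustration, not taken from the patch.

```python
BITS = 8
MASK = (1 << BITS) - 1
INT_MAX = (1 << (BITS - 1)) - 1  # 0x7F for 8 bits


def to_signed(v):
    """Reinterpret an unsigned BITS-wide value as two's complement."""
    v &= MASK
    return v - (1 << BITS) if v >> (BITS - 1) else v


def xor_form(x, y):
    return ((x ^ INT_MAX) & MASK) >= ((y ^ INT_MAX) & MASK)


def add_form(x, y):
    # Addition wraps around modulo 2**BITS.
    return ((x + INT_MAX) & MASK) >= ((y + INT_MAX) & MASK)


def folded(x, y):
    return to_signed(x) <= to_signed(y)


values = range(1 << BITS)
xor_ok = all(xor_form(x, y) == folded(x, y) for x in values for y in values)
add_mismatches = [(x, y) for x in values for y in values
                  if add_form(x, y) != folded(x, y)]
```

With xor the fold is exact for every pair, while with addition a pair as small as `(0, 1)` already disagrees.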
*	[ValueTracking] Improve isKnownNonZero for PHI of non-zero constants	Jun Bum Lim	2016-02-01	1	-0/+24
\| \| \| \| \| \|	It is clear that a PHI is non-zero if all incoming values are non-zero constants. llvm-svn: 259370
*	[InstCombine] simplify masked load intrinsics with all ones or zeros masks	Sanjay Patel	2016-02-01	1	-21/+2
\| \| \| \| \| \| \| \| \|	A masked load with a zero mask means there's no load. A masked load with an allOnes mask means it's a normal vector load. Differential Revision: http://reviews.llvm.org/D16691 llvm-svn: 259369
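A small model of the two masked-load cases named above. The `masked_load` helper and its parameters are hypothetical and only approximate the intrinsic's behavior on a flat Python list.

```python
def masked_load(memory, base, mask, passthru):
    """Load lane i from memory[base + i] if enabled, else use passthru[i]."""
    return [memory[base + lane] if enabled else passthru[lane]
            for lane, enabled in enumerate(mask)]


memory = [1, 2, 3, 4, 5, 6]
passthru = [0, 0, 0, 0]

# All-ones mask: equivalent to an ordinary 4-wide vector load.
full = masked_load(memory, 1, [True] * 4, passthru)

# Zero mask: no element is read, so the result is the passthru vector.
none = masked_load(memory, 1, [False] * 4, passthru)
```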
*	[X86][AVX512VBMI] add encoding and intrinsics for Multishift	Asaf Badouh	2016-02-01	3	-49/+281
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D16399 llvm-svn: 259363
*	[mips] Split large test file into 3 smaller ones.	Vasileios Kalintiris	2016-02-01	4	-712/+772
\| \| \| \| \| \| \|	Remove the old select.ll file and use select-int.ll, select-flt.ll, select-dbl.ll for testing selects on integers, floats & doubles respectively. llvm-svn: 259361
*	[mips] Range check uimm16 and fix several bugs this revealed.	Daniel Sanders	2016-02-01	22	-103/+153
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The bugs were: * teq and similar take 4-bit unsigned immediates on microMIPS. * teqi and similar have side-effects like teq do. * shll_s.w and shra_r.w take 5-bit unsigned immediates. * The various DSP ext* instructions take a 5-bit immediate. * repl.qh takes an 8-bit unsigned immediate. * repl.ph takes a 10-bit unsigned immediate. * rddsp/wrdsp take a 10-bit unsigned immediate. * teqi and similar take signed 16-bit immediates (10-bit for microMIPS). * Out-of-range immediate macros for or/xor take a simm32/simm64 depending on architecture. I'll fix the simm64 case properly when I reach simm32. lui is a bit more lenient than GAS and accepts signed immediates in addition to unsigned. This is because MipsMCExpr can produce signed values when constant folding and it currently lacks a way of knowing it should fold to an unsigned value. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D15446 llvm-svn: 259360
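The immediate widths listed above all reduce to one kind of range check. A minimal sketch follows, using hypothetical Python helpers `is_uint` and `is_int` rather than the backend's actual operand predicates.

```python
def is_uint(bits, value):
    """True if value fits in an unsigned immediate of the given width."""
    return 0 <= value < (1 << bits)


def is_int(bits, value):
    """True if value fits in a signed two's complement immediate of the given width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


# Examples for widths mentioned in the commit message.
uimm16_ok = is_uint(16, 0xFFFF)       # largest uimm16
uimm16_bad = is_uint(16, 0x10000)     # one past the largest uimm16
uimm4_ok = is_uint(4, 15)             # 4-bit unsigned immediate
simm16_ok = is_int(16, -32768)        # smallest simm16
```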
*	Reapply commit r258404 with fix.	Matthew Simpson	2016-02-01	1	-13/+18
\| \| \| \| \| \| \|	The previous patch caused PR26364. The fix is to ensure that we don't enter a cycle when iterating over use-def chains. llvm-svn: 259357
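The usual way to avoid looping forever on a cyclic use-def chain is to track visited values. The sketch below is a generic worklist traversal over a hypothetical operand map; it is not the code from the patch.

```python
def reachable_defs(value, operands):
    """Collect every value reachable through operand edges, visiting each once."""
    seen = set()
    worklist = [value]
    while worklist:
        current = worklist.pop()
        if current in seen:
            continue  # already visited: this check is what breaks cycles
        seen.add(current)
        worklist.extend(operands.get(current, []))
    return seen


# A PHI that feeds an add which feeds back into the PHI (a loop-carried cycle).
operands = {
    "phi": ["add", "init"],
    "add": ["phi", "one"],
}
visited = reachable_defs("phi", operands)
```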
*	AVX512: fix mask handling for gather/scatter/prefetch intrinsics.	Igor Breger	2016-02-01	1	-4/+59
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D16755 llvm-svn: 259346
*	[X86][SSE] Find source of the inserted element of INSERTPS	Simon Pilgrim	2016-02-01	1	-15/+4
\| \| \| \| \| \| \| \|	Minor patch to trace back through target shuffles to the source of the inserted element in a (V)INSERTPS shuffle. Differential Revision: http://reviews.llvm.org/D16652 llvm-svn: 259343
*	AVX512 : Fix SETCCE lowering for KNL 32 bit.	Igor Breger	2016-02-01	1	-17/+62
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D16752 llvm-svn: 259342
*	[dsymutil] Support scattered relocs.	Frederic Riss	2016-02-01	4	-0/+213
\| \| \| \| \| \| \| \| \| \|	Although it seems like clang will never emit scattered relocations in the debug information (at least I couldn't find a way), we have to support them for the benefit of other compilers. As clang doesn't generate them, the included testcase was produced from hacked up assembly. llvm-svn: 259339
*	Revert r258580 and r258581.	David Majnemer	2016-02-01	4	-100/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Those commits created an artificial edge from a cleanup to a synthesized catchswitch in order to get the MSVC personality routine to execute cleanups which don't cleanupret and are not wrapped by a catchswitch. This worked well enough but is not a complete solution in situations where the cleanup loops infinitely. However, the real deal breaker behind this approach comes about from a degenerate case where the cleanup is post-dominated by unreachable and throws an exception. This ends poorly because the catchswitch will inadvertently catch the exception. Because of this we should go back to our previous behavior of not executing certain cleanups (identical behavior with the Itanium ABI implementation in clang, GCC and ICC). N.B. I think this could be salvaged by making the catchpad rethrow the exception and properly transforming throwing calls in the cleanup into invokes. llvm-svn: 259338
*	[dsymutil] Fix FileCheck command.	Frederic Riss	2016-01-31	1	-1/+1
\| \| \| \| \| \|	Damn case-insensitive filesystem... llvm-svn: 259319
*	[dsymutil] Fix handling of common symbols.	Frederic Riss	2016-01-31	7	-7/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	llvm-dsymutil was misinterpreting the value of common symbols as their address when it actually contains their size. This didn't impact llvm-dsymutil's ability to link the debug information for common symbols because these are always found by name and not by address. Things could however go wrong when the size of a common object matched the object file address of another symbol. Depending on the link order of the symbols the common object might incorrectly evict this other object from the address to symbol mapping, and then link the evicted symbol with a wrong binary address. Use the new ability to have symbols without an object file address to fix this. llvm-svn: 259318
*	[WebAssembly] Fix uses of FrameIndex as store values	Derek Schuff	2016-01-30	1	-3/+12
\| \| \| \| \| \| \| \|	Previously the code assumed all uses of FI on loads and stores were as addresses. This checks whether the use is the address or a value and handles the latter case as it does for non-memory instructions. llvm-svn: 259306
*	WebAssembly: don't optimize frameindex store	JF Bastien	2016-01-30	1	-0/+10
\| \| \| \| \| \| \| \|	The previous code was incorrect (can't getReg a frameindex). We could instead optimize it to reduce tree height, but I'm not sure that's worthwhile yet because we then try to eliminate the frameindex. This patch also fixes frame index elimination for operations which may load or store: it used to assume the base was operand 2 and immediate offset operand 1. That's not true for stores, where they're 4 and 3. llvm-svn: 259305
*	[BasicAA] Fix for missing must alias (D16343)	Gerolf Hoflehner	2016-01-30	1	-0/+24
\| \| \| \|	llvm-svn: 259299
*	AMDGPU: Fix emitting invalid workitem intrinsics for HSA	Matt Arsenault	2016-01-30	2	-3/+331
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The AMDGPUPromoteAlloca pass was emitting the read.local.size calls, which with HSA was incorrectly selected to reading from the offset mesa uses off of the kernarg pointer. Error on intrinsics which aren't supported by HSA, and start emitting the correct IR to read the workgroup size out of the dispatch pointer. Also initialize the pass so it can be tested with opt, and start moving towards not depending on the subtarget as an argument. Start emitting errors for the intrinsics not handled with HSA. llvm-svn: 259297
*	AMDGPU: Stop checking intrinsics not used by HSA for dispatch-ptr	Matt Arsenault	2016-01-30	2	-7/+171
\| \| \| \| \| \| \| \|	Only the dispatch.ptr intrinsic is supposed to be used now to get the workgroup size, and the read.local.size intrinsics do not work correctly. llvm-svn: 259296
*	InstCombine: fabs(x) * fabs(x) -> x * x	Matt Arsenault	2016-01-30	1	-0/+29
\| \| \| \|	llvm-svn: 259295
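The fold named in the commit title rests on the identity |x| * |x| = x * x. A quick numeric check over a few sample values, including signed zero, infinities and NaN, can be sketched as follows; the sample list and helper names are illustrative choices only.

```python
import math


def before(x):
    return math.fabs(x) * math.fabs(x)


def after(x):
    return x * x


def same(a, b):
    """Equality that treats NaN as equal to NaN and distinguishes +0.0 from -0.0."""
    if math.isnan(a) or math.isnan(b):
        return math.isnan(a) and math.isnan(b)
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)


samples = [0.0, -0.0, 1.5, -1.5, 1e308, -1e308, math.inf, -math.inf, math.nan]
identical = all(same(before(x), after(x)) for x in samples)
```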