bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[AVX512] Adding PMOVZXBD/W/Q, PMOVZXDQ and PMOVZXWD/Q Intrinsics	Michael Zuckerman	2016-01-13	4	-1/+378
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D16071 llvm-svn: 257601
*	[PowerPC] Fix large code model with the ELFv2 ABI	Ulrich Weigand	2016-01-13	2	-10/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The global entry point prologue currently assumes that the TOC associated with a function is less than 2GB away from the function entry point. This is always true when using the medium or small code model, but may not be the case when using the large code model. This patch adds a new variant of the ELFv2 global entry point prologue that lifts the 2GB restriction when building with -mcmodel=large. This works by emitting a quadword containing the distance from the function entry point to its associated TOC immediately before the entry point, and then using a prologue like: ld r2,-8(r12) add r2,r2,r12 Since creation of the entry point prologue is now split across two separate routines (PPCLinuxAsmPrinter::EmitFunctionEntryLabel emits the data word, PPCLinuxAsmPrinter::EmitFunctionBodyStart the prolog code), I've switched to using named labels instead of just temporaries to indicate the locations of the global and local entry points and the new TOC offset data word. These names are provided by new routines in PPCFunctionInfo modeled after the existing PPCFunctionInfo::getPICOffsetSymbol. Note that a corresponding change was committed to GCC here: https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00355.html Reviewers: hfinkel Differential Revision: http://reviews.llvm.org/D15500 llvm-svn: 257597
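The prologue arithmetic in the message above can be modelled in a few lines. This is a minimal sketch, not the AsmPrinter code: the addresses are hypothetical, memory is a plain dictionary, and the helper names (`emit_toc_offset`, `run_prologue`, `to_signed64`) are invented for illustration.

```python
# Model of the large-code-model ELFv2 global entry prologue:
#   ld  r2,-8(r12)
#   add r2,r2,r12
# All addresses below are hypothetical.

MASK64 = (1 << 64) - 1


def to_signed64(value):
    """Interpret a 64-bit pattern as a signed two's-complement integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def emit_toc_offset(memory, entry, toc_base):
    """Store the quadword (toc_base - entry) in the 8 bytes before the entry point."""
    memory[entry - 8] = (toc_base - entry) & MASK64


def run_prologue(memory, r12):
    """Recover the TOC pointer from r12, which holds the function entry address."""
    r2 = memory[r12 - 8]        # ld  r2,-8(r12)
    r2 = (r2 + r12) & MASK64    # add r2,r2,r12
    return r2


memory = {}
entry = 0x1_0000_0000
toc_base = entry + 5 * (1 << 30)  # 5 GiB away, beyond the 2GB limit

emit_toc_offset(memory, entry, toc_base)
assert to_signed64(memory[entry - 8]) > (1 << 31)
assert run_prologue(memory, entry) == toc_base
```

Because the stored quadword is a full 64-bit distance, the same two instructions work whether the TOC lies before or after the entry point.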
*	[AVX512] adding PRORQ, PRORD, PRORLVQ and PRORLVD Intrinsics	Michael Zuckerman	2016-01-13	1	-0/+168
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D16052 llvm-svn: 257594
*	LEA code size optimization pass (Part 2): Remove redundant LEA instructions.	Andrey Turetskiy	2016-01-13	1	-0/+38
\| \| \| \| \| \| \| \| \| \|	Make x86 OptimizeLEAs pass remove LEA instruction if there is another LEA (in the same basic block) which calculates address differing only by a displacement. Works only for -Oz. Differential Revision: http://reviews.llvm.org/D13295 llvm-svn: 257589
*	Add test cases that will show the bug that was fixed in r256725.	Craig Topper	2016-01-13	1	-0/+24
\| \| \| \|	llvm-svn: 257584
*	[Inliner] Merge the attributes of the caller and callee functions	Akira Hatanaka	2016-01-13	1	-0/+84
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch turns off the fast-math optimization attribute on the caller if the callee's fast-math attribute is not turned on. For example, - before inlining caller: "less-precise-fpmad"="true" callee: "less-precise-fpmad"="false" - after inlining caller: "less-precise-fpmad"="false" Alternatively, it's possible to block inlining if the caller's and callee's attributes don't match. If this approach is preferable to the one in this patch, we can discuss post-commit. rdar://problem/19836465 Differential Revision: http://reviews.llvm.org/D7802 llvm-svn: 257575
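The merge rule from the message above can be sketched as follows. This is an illustration over plain dictionaries, not the inliner's implementation; the function name `merge_fast_math_attrs` and the `names` parameter are hypothetical, and only "less-precise-fpmad" is taken from the commit message.

```python
def merge_fast_math_attrs(caller_attrs, callee_attrs, names=("less-precise-fpmad",)):
    """Return the caller's attributes after inlining the callee.

    A fast-math style attribute stays "true" on the caller only if it is
    "true" on both functions; a missing attribute is treated as "false".
    """
    merged = dict(caller_attrs)
    for name in names:
        caller_on = caller_attrs.get(name, "false") == "true"
        callee_on = callee_attrs.get(name, "false") == "true"
        merged[name] = "true" if (caller_on and callee_on) else "false"
    return merged


# The example from the commit message: caller "true", callee "false".
before = {"less-precise-fpmad": "true"}
callee = {"less-precise-fpmad": "false"}
after = merge_fast_math_attrs(before, callee)
assert after == {"less-precise-fpmad": "false"}
```

The alternative mentioned in the message, blocking inlining on a mismatch, would replace the merge with an equality check on the same attributes.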
*	[SPARC] Revamp AnalyzeBranch and add ReverseBranchCondition.	James Y Knight	2016-01-13	2	-1/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	AnalyzeBranch on X86 (and, previously, SPARC, whose implementation was copied from X86) tries to modify the branches based on block layout (e.g. checking isLayoutSuccessor), when AllowModify is true. The rest of the architectures leave that up to the caller, which can call InsertBranch, RemoveBranch, and ReverseBranchCondition as appropriate. That appears to be the preferred way to do it nowadays. This commit makes SPARC like the rest: replaces AnalyzeBranch with an implementation cribbed from AArch64, and adds a ReverseBranchCondition implementation. Additionally, a test-case has been added (also cribbed from AArch64) demonstrating that redundant branch sequences no longer get emitted. E.g., it used to emit code like this: bne .LBB1_2 nop ba .LBB1_1 nop .LBB1_2: And now emits: cmp %i0, 42 be .LBB1_1 nop llvm-svn: 257572
*	Re-Revert r257105 (Verifier debug info changes)	Keno Fischer	2016-01-13	36	-213/+191
\| \| \| \| \| \| \|	While I investigate some new buildbot failures. This was originally reapplied as r257550 and r257558. llvm-svn: 257563
*	[llvm-objdump] Use report_error() and improve error coverage.	Davide Italiano	2016-01-13	3	-1/+3
\| \| \| \|	llvm-svn: 257561
*	AsmPrinter: Fix wrong OS X versions being emitted for darwin triples	Matthias Braun	2016-01-13	2	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \|	The version numbers of the darwin kernel are different from the version numbers of OS X, so we need adjustments if we had "--darwin" triples. Use the existing utility functions in TargetTriple for this. Fixes rdar://22056966 Differential Revision: http://reviews.llvm.org/D14601 llvm-svn: 257555
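The version adjustment described above can be illustrated with a small sketch. It assumes the conventional offset of four between the Darwin kernel major version and the OS X 10.x minor version, which held for the releases current when this commit was made; `darwin_to_macosx` is a hypothetical helper, not the utility function the patch calls.

```python
def darwin_to_macosx(darwin_major):
    """Map a Darwin kernel major version to an OS X (major, minor) pair.

    Assumes Darwin N corresponds to OS X 10.(N - 4); Darwin versions
    below 4 have no OS X counterpart under this assumption.
    """
    if darwin_major < 4:
        raise ValueError("no OS X release corresponds to this Darwin version")
    return (10, darwin_major - 4)


# Darwin 15 is the kernel shipped with OS X 10.11.
assert darwin_to_macosx(15) == (10, 11)
```

Emitting the kernel version unchanged would therefore claim an OS X version that does not exist, which is the bug the patch addresses.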
*	[CodeView] Mark our lines as statements, not expressions	David Majnemer	2016-01-13	5	-81/+81
\| \| \| \| \| \| \| \| \|	The line tables for CodeView make a distinction between expressions and statements. As it turns out, MSVC always emits them as statements and we always emit them as expressions. Let's switch to statements to match the CodeView that they emit. llvm-svn: 257553
*	[CodeView] Improve the line table dumper	David Majnemer	2016-01-13	5	-199/+510
\| \| \| \| \| \| \| \|	This change has us print out fields we didn't previously understand. To improve readability, we now group column information with its respective line. llvm-svn: 257552
*	Reapply r257105 "[Verifier] Check that debug values have proper size"	Keno Fischer	2016-01-13	36	-191/+213
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The following extra changes were made to test cases: Manually making the variable be the actual type instead of a pointer to avoid pointer-size differences in generic code: LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll LLVM :: DebugInfo/Generic/varargs.ll Delete sizing information from debug info for the same reason (but the presence of the pointer was important to the test case): LLVM :: DebugInfo/Generic/restrict.ll LLVM :: DebugInfo/Generic/tu-composite.ll LLVM :: Linker/type-unique-type-array-a.ll LLVM :: Linker/type-unique-simple2.ll Fixing an incorrect DW_OP_deref LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll Fixing a missing DW_OP_deref LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll Additionally, clang should no longer complain during bootstrap after r257534. The original commit message was: ``` Summary: Teach the Verifier to make sure that the storage size given to llvm.dbg.declare or the value size given to llvm.dbg.value agree with what is declared in DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA). Additionally this catches a number of common mistakes, such as passing a pointer when a value was intended or vice versa. One complication comes from stack coloring which modifies the original IR when it merges allocas in order to make sure that if AA falls back to the IR it gets the correct result. However, given this new invariant, indiscriminately replacing one alloca by a different (differently sized) one is no longer valid. Fix this by just undefing out any use of the alloca in a dbg.declare in this case. Additionally, I had to fix a number of test cases. 
Of particular note: - I regenerated dbg-changes-codegen-branch-folding.ll from the given source as it was affected by the bug fixed in r256077 - two-cus-from-same-file.ll was changed to avoid having a variable-typed debug variable as that would depend on the target, even though this test is supposed to be generic - I had to manually declare size/align for reference type. See also the discussion for D14275/r253186. - fpstack-debuginstr-kill.ll required changing `double` to `long double` - most others were just a question of adding OP_deref ``` llvm-svn: 257550
*	For llvm-objdump, add the option -private-header (without the trailing ’s’)	Kevin Enderby	2016-01-13	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to only print the first private header. For Mach-O files this prints only the Mach header and not the subsequent load commands. Scripts use it to match what the darwin otool(1) does with the -h flag and without the -l flag. For non-Mach-O files it has the same functionality as -private-headers (with the trailing ’s’). rdar://24158331 llvm-svn: 257548
*	Guard fabs to bfc convert with V6T2 flag	Ana Pazos	2016-01-13	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: BFC instructions are available in ARMv6T2 and above. Reviewers: t.p.northover Subscribers: aemerson Differential Revision: http://reviews.llvm.org/D16076 llvm-svn: 257546
*	[ARM] Mark VMOV with immediate: isAsCheapAsMove.	Quentin Colombet	2016-01-13	1	-37/+21
\| \| \| \| \| \| \| \| \| \|	VMOVs are not strictly speaking cheap, but they are as expensive as a vector copy (VORR), so we should prefer rematerialization over splitting when it applies. rdar://problem/23754176 llvm-svn: 257545
*	CannotBeOrderedLessThanZero: add some missing cases	Fiona Glaser	2016-01-12	1	-0/+41
\| \| \| \|	llvm-svn: 257542
*	[Utils] Insert DW_OP_bit_piece when only describing part of the variable	Keno Fischer	2016-01-12	1	-0/+45
\| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The dbg.declare -> dbg.value conversion looks through any zext/sext to find a value to describe the variable (in the expectation that those zext/sext instructions will go away later). However, those values do not cover the entire variable and thus need a DW_OP_bit_piece. Reviewers: aprantl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16061 llvm-svn: 257534
*	[CodeView] Initialize column-end to zero	David Majnemer	2016-01-12	1	-32/+32
\| \| \| \| \| \| \| \| \| \| \| \| \|	CodeView, unlike DWARF, can associate code with a range of columns. However, LLVM can only represent a single column position internally. We used to claim that the end column and start column were the same which yielded less than satisfactory results: we would stop printing at the _beginning_ of the source expression! Instead, mark the column-end as 'zero' to indicate that we don't have one (as per the documentation for IDiaLineNumber::get_lineNumberEnd). llvm-svn: 257528
*	[AVX512] adding PROLQ and PROLD Intrinsics	Michael Zuckerman	2016-01-12	2	-0/+126
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D16048 llvm-svn: 257523
*	[WebAssembly] Fix a test to work even when the integrated assembler is enabled.	Dan Gohman	2016-01-12	1	-2/+3
\| \| \| \| \| \| \|	Add -no-integrated-as to this test, since it's testing inline asm strings that aren't actually valid assembly syntax. llvm-svn: 257519
*	Codegen: [PPC] Handle weighted comparisons when inserting selects.	Kyle Butt	2016-01-12	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \| \| \|	Only non-weighted predicates were handled in PPCInstrInfo::insertSelect. Handle the weighted predicates as well. This latent bug was triggered by r255398, because it added use of the branch-weighted predicates. While here, switch over an enum instead of an int to get the compiler to enforce totality in the future. llvm-svn: 257518
*	[WebAssembly] Add a EM_WEBASSEMBLY value, and several bits of code that use it.	Dan Gohman	2016-01-12	1	-0/+1
\| \| \| \| \| \| \| \| \|	A request has been made to the official registry, but an official value is not yet available. This patch uses a temporary value in order to support development. When an official value is received, the value of EM_WEBASSEMBLY will be updated. llvm-svn: 257517
*	[WebAssembly] Make CFG stackification independent of basic-block labels.	Dan Gohman	2016-01-12	3	-264/+380
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch changes the way labels are referenced. Instead of referencing the basic-block label name (e.g. .LBB0_0), instructions now just have an immediate which indicates the depth in the control-flow stack to find a label to jump to. This makes them much closer to what we expect to have in the binary encoding, and avoids the problem of basic-block label names not being explicit in the binary encoding. Also, it terminates blocks and loops with end_block and end_loop instructions, rather than basic-block label names, for similar reasons. This will also fix problems where two constructs appear to have the same label, because we no longer explicitly use labels, so consumers that need labels will presumably create their own labels, and presumably they won't reuse labels when they do. This patch does make the code a little more awkward to read; as a partial mitigation, this patch also introduces comments showing where the labels are, and comments on each branch showing where it's branching to. llvm-svn: 257505
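A depth immediate of this kind can be resolved with a simple stack walk. The following is a minimal sketch, not the backend's logic: the `end_block` and `end_loop` names come from the message, while the `block`, `loop` and `br` opcode names, the tuple encoding, and the construct names are assumptions made for illustration.

```python
def resolve_branch(control_stack, depth):
    """Return the construct a branch targets; depth 0 is the innermost one."""
    if not 0 <= depth < len(control_stack):
        raise IndexError("branch depth exceeds the control-flow nesting")
    return control_stack[-1 - depth]


def branch_targets(instructions):
    """Walk (opcode, argument) pairs and report what each branch targets."""
    stack = []
    targets = []
    for opcode, arg in instructions:
        if opcode in ("block", "loop"):
            stack.append((opcode, arg))  # arg is a name used only in this sketch
        elif opcode in ("end_block", "end_loop"):
            stack.pop()
        elif opcode == "br":
            targets.append(resolve_branch(stack, arg))  # arg is the depth immediate
    return targets


program = [
    ("block", "outer"),
    ("loop", "inner"),
    ("br", 0),           # innermost construct: the loop
    ("br", 1),           # one level out: the block
    ("end_loop", None),
    ("br", 0),           # the loop has ended, so depth 0 is now the block
    ("end_block", None),
]
assert branch_targets(program) == [("loop", "inner"), ("block", "outer"), ("block", "outer")]
```

The same immediate can name different constructs at different points, which is why the patch adds comments showing where each branch goes.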
*	[Hexagon] Implement RDF-based post-RA optimizations	Krzysztof Parzyszek	2016-01-12	3	-1/+87
\| \| \| \| \| \| \| \|	- Handle simple cases of register copies (what current RDF CP allows). - Hexagon-specific dead code elimination: handles dead address updates in post-increment instructions. llvm-svn: 257504
*	[LibCallSimplifier] use instruction-level fast-math-flags to transform pow(x, 0.5) calls	Sanjay Patel	2016-01-12	1	-8/+6
\| \| \| \| \| \| \| \|	Also, propagate the FMF to the newly created sqrt() call. llvm-svn: 257503
*	Fix bot failure from r257493: remove extraneous temp file read	Teresa Johnson	2016-01-12	1	-1/+1
\| \| \| \| \| \|	This was left from an earlier version of the test. llvm-svn: 257494
*	[ThinLTO] Handle an external call from an import to an alias in dest	Teresa Johnson	2016-01-12	2	-0/+32
\| \| \| \| \| \| \| \| \|	The findExternalCalls routine ignores calls to functions already defined in the dest module. This was not handling the case where the definition in the current module is actually an alias to a function call. llvm-svn: 257493
*	[LibCallSimplifier] use instruction-level fast-math-flags to transform pow(exp(x)) calls	Sanjay Patel	2016-01-12	1	-11/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	See also: http://reviews.llvm.org/rL255555 http://reviews.llvm.org/rL256871 http://reviews.llvm.org/rL256964 http://reviews.llvm.org/rL257400 http://reviews.llvm.org/rL257404 http://reviews.llvm.org/rL257414 llvm-svn: 257491
*	AMDGPU: Emit note directive for HSA even if there are no functions	Tom Stellard	2016-01-12	1	-0/+6
\| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm, echristo Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16010 llvm-svn: 257488
*	consolidate exp/exp2 tests	Sanjay Patel	2016-01-12	2	-19/+12
\| \| \| \| \| \|	The transform is identical, so keep the tests together and save some overhead. llvm-svn: 257484
*	Add/edit tests to include instruction-level FMF on calls	Sanjay Patel	2016-01-12	1	-16/+28
\| \| \| \| \| \| \| \| \|	Preparatory patch before changing LibCallSimplifier to use the FMF. Also, tighten the CHECK lines and give the tests more meaningful names. Similar changes to: http://reviews.llvm.org/rL257414 llvm-svn: 257481
*	[mips] Correct operand order in DSP's mthi/mtlo	Daniel Sanders	2016-01-12	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: The result register is the second operand as per the other mt* instructions. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D15993 llvm-svn: 257478
*	Fix test on windows.	Rafael Espindola	2016-01-12	1	-1/+1
\| \| \| \|	llvm-svn: 257475
*	[ARM] Fix several state persistence bugs	Keno Fischer	2016-01-12	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes three bugs, in all of which state is not reset, or is incorrectly reset, between objects (i.e. when reusing the same pass manager to create multiple object files): 1) AttributeSection needs to be reset to nullptr, because otherwise the backend will try to emit into the old object file's attribute section causing a segmentation fault. 2) MappingSymbolCounter needs to be reset, otherwise the second object file will start where the first one left off. 3) The MCStreamer base class resets the Streamer's e_flags settings. Since EF_ARM_EABI_VER5 is set on streamer creation, we need to set it again after the MCStreamer was reset. Also rename Reset (upper case) to EHReset to avoid confusion with reset (lower case). Reviewers: rengolin Differential Revision: http://reviews.llvm.org/D15950 llvm-svn: 257473
*	The isel pattern that selects the memory-register form of VCVTPH2PS	Robert Lougher	2016-01-12	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(64 to 128-bit) matches against the pattern fragment 'vzmovl_v2i64' (a zero-extended 64-bit load). However, a change in r248784 teaches the instruction combiner that only the lower 64 bits of the input to a 128-bit vcvtph2ps are used. This means the instruction combiner will ordinarily optimize away the upper 64-bit insertelement instruction in the zero-extension and so we no longer select the memory-register form. To fix this a new pattern has been added. Differential Revision: http://reviews.llvm.org/D16067 llvm-svn: 257470
*	AVX512: VPMOVAPS/PD and VPMOVUPS/PD (load) intrinsic implementation.	Igor Breger	2016-01-12	2	-22/+204
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D16042 llvm-svn: 257463
*	CXX_FAST_TLS calling convention: performance improvement for x86-64.	Manman Ren	2016-01-12	1	-42/+19
\| \| \| \| \| \| \|	This is the same change on x86-64 as r255821 on AArch64. rdar://9001553 llvm-svn: 257428
*	CXX_FAST_TLS calling convention: performance improvement for ARM.	Manman Ren	2016-01-12	1	-8/+10
\| \| \| \| \| \| \|	This is the same change on ARM as r255821 on AArch64. rdar://9001553 llvm-svn: 257424
*	[IRMover] Don't copy personality, etc unless creating def	Teresa Johnson	2016-01-12	2	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Function::copyAttributesFrom will copy the personality function, prefix data and prolog data from the source function to the new function, and is invoked when the IRMover copies the function prototype. This puts a reference to a constant in the source module on a function in the dest module, which causes an error when deleting the source module after importing, since the personality function in the source module still has uses (this would presumably also be an issue for the prologue and prefix data). Remove the copies added to the dest copy when creating the new prototype, as they are mapped properly when/if we link the function body. llvm-svn: 257420
*	[Orc] XFAIL a few remote-jit test cases that I missed in r257391.	Lang Hames	2016-01-11	3	-2/+3
\| \| \| \|	llvm-svn: 257419
*	CXX_FAST_TLS calling convention: Add support for ARM on Darwin.	Manman Ren	2016-01-11	1	-0/+44
\| \| \| \| \| \|	rdar://9001553 llvm-svn: 257417
*	[WebAssembly] Define WebAssembly-specific relocation codes.	Dan Gohman	2016-01-11	11	-36/+36
\| \| \| \| \| \| \| \|	Currently WebAssembly has two kinds of relocations; data addresses and function addresses. This adds ELF relocations for them, as well as an MC symbol kind to indicate which type of relocation is needed. llvm-svn: 257416
*	[LibCallSimplifier] use instruction-level fast-math-flags to transform log calls	Sanjay Patel	2016-01-11	1	-22/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Also, add tests to verify that we're checking 'fast' on both calls of each transform pair, tighten the CHECK lines, and give the tests more meaningful names. This is a continuation of: http://reviews.llvm.org/rL255555 http://reviews.llvm.org/rL256871 http://reviews.llvm.org/rL256964 http://reviews.llvm.org/rL257400 http://reviews.llvm.org/rL257404 llvm-svn: 257414
*	Remove a bogus assert.	Rafael Espindola	2016-01-11	1	-0/+52
\| \| \| \| \| \| \|	There is no reason the value being printed has to be positive. Fixes pr25802. llvm-svn: 257412
*	[LibCallSimplifier] don't allow sqrt transform unless all ops are unsafe	Sanjay Patel	2016-01-11	1	-0/+15
\| \| \| \| \| \| \|	Fix the FIXME added with: http://reviews.llvm.org/rL257400 llvm-svn: 257404
*	LoopUnroll: Use the optsize threshold for minsize as well	Justin Bogner	2016-01-11	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Currently we're unrolling loops more in minsize than in optsize, which means -Oz will have a larger code size than -Os. That doesn't make any sense. This resolves the FIXME about this in LoopUnrollPass and extends the optsize test to make sure we use the smaller threshold for minsize as well. llvm-svn: 257402
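The threshold selection described above reduces to one condition. This is a sketch under stated assumptions: the numeric thresholds are hypothetical placeholders, not the values used by LoopUnrollPass, and `unroll_threshold` is an invented name.

```python
DEFAULT_THRESHOLD = 150   # hypothetical value, for illustration only
OPT_SIZE_THRESHOLD = 50   # hypothetical value, for illustration only


def unroll_threshold(optsize, minsize):
    """Pick the unroll threshold; minsize now shares the optsize threshold."""
    if optsize or minsize:
        return OPT_SIZE_THRESHOLD
    return DEFAULT_THRESHOLD


# Before the fix, a minsize-only function fell through to the default,
# so -Oz could unroll more than -Os.
assert unroll_threshold(optsize=False, minsize=True) == unroll_threshold(optsize=True, minsize=False)
```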
*	[LibCallSimplifier] use instruction-level fast-math-flags to transform sqrt calls	Sanjay Patel	2016-01-11	3	-40/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a continuation of adding FMF to call instructions: http://reviews.llvm.org/rL255555 The intent of the patch is to preserve the current behavior of the transform except that we use the sqrt instruction's 'fast' attribute as a trigger rather than the function-level attribute. But this raises a bug noted by the new FIXME comment. In order to do this transform: sqrt((x * x) * y) ---> fabs(x) * sqrt(y) ...we need all of the sqrt, the first fmul, and the second fmul to be 'fast'. If any of those ops is strict, we should bail out. Differential Revision: http://reviews.llvm.org/D15937 llvm-svn: 257400
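A numeric sketch shows why the rewrite is only safe when every operation involved is 'fast'. The function names below are invented for illustration, and Python doubles stand in for the IR's floating-point values; this is not the LibCallSimplifier code.

```python
import math


def strict_form(x, y):
    """sqrt((x * x) * y), evaluated exactly as written."""
    return math.sqrt((x * x) * y)


def transformed_form(x, y):
    """fabs(x) * sqrt(y), the rewritten expression."""
    return abs(x) * math.sqrt(y)


def may_transform(sqrt_fast, inner_fmul_fast, outer_fmul_fast):
    """The rewrite is allowed only if the sqrt and both fmuls are 'fast'."""
    return sqrt_fast and inner_fmul_fast and outer_fmul_fast


# For ordinary inputs the two forms agree.
assert math.isclose(strict_form(-3.0, 4.0), transformed_form(-3.0, 4.0))

# When x * x overflows, the strict form yields infinity while the
# transformed form stays finite, so the rewrite changes the result.
x, y = 1e200, 1e-300
assert math.isinf(strict_form(x, y))
assert math.isclose(transformed_form(x, y), 1e50)

# A single strict operation is enough to block the transform.
assert not may_transform(True, True, False)
```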
*	Add missing error handling to llvm-lto.	Rafael Espindola	2016-01-11	1	-0/+2
\| \| \| \|	llvm-svn: 257395
*	AMDGPU: Implement {{s\|u}}int_to_fp i64 -> f32	Matt Arsenault	2016-01-11	3	-7/+128
\| \| \| \| \| \| \|	The old lowering for uint_to_fp failed OpenCL conformance. It might be OK for fast math mode, but I'm not sure. llvm-svn: 257393