bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC.	Ahmed Bougacha	2016-04-07	1	-24/+12
	Re-apply r265450 which caused PR27245 and was reverted in r265559 because of a wrong generalization: the fetch_and_add->add_and_fetch combine only works in specific, but pretty common, cases:
	  (icmp slt x, 0) -> (icmp sle (add x, 1), 0)
	  (icmp sge x, 0) -> (icmp sgt (add x, 1), 0)
	  (icmp sle x, 0) -> (icmp slt (sub x, 1), 0)
	  (icmp sgt x, 0) -> (icmp sge (sub x, 1), 0)
	Original Message: We only generate LOCKed versions of add/sub when the result is unused. It often happens that the result is used, but only by a comparison. We can optimize those out by reusing EFLAGS, which lets us use the proper instructions, instead of having to fall back to LXADD. Instead of doing this as an MI peephole (as we do for the other non-LOCKed (really, non-MR) forms), do it in ISel. It becomes quite tricky later. This also makes it eventually possible to stop expanding and/or/xor if the only user is an icmp (also see D18141). This uses the LOCK ISD opcodes added by r262244. Differential Revision: http://reviews.llvm.org/D17633 llvm-svn: 265636
*	[X86] Refresh and tweak EFLAGS reuse tests. NFC.	Ahmed Bougacha	2016-04-07	1	-51/+82
	The non-1 and EQ/NE tests were misguided. llvm-svn: 265635
*	Re-commit r265039 "[X86] Merge adjacent stack adjustments in ↵	Hans Wennborg	2016-04-07	8	-17/+75
	eliminateCallFramePseudoInstr (PR27140)"
	Third time's the charm? The previous attempt (r265345) caused ASan test failures on X86, as broken CFI caused stack traces to not work. This version of the patch makes sure not to merge with stack adjustments that have CFI, and to not add merged instructions' offsets to the CFI about to be generated. This is already covered by the lit tests; I just got the expectations wrong previously. llvm-svn: 265623
*	[sancov] enabling coverage edge pruning by default.	Mike Aizatsky	2016-04-06	3	-9/+2
	Differential Revision: http://reviews.llvm.org/D18844 llvm-svn: 265615
*	Thread Expected<...> up from createMachOObjectFile() to allow llvm-objdump ↵	Kevin Enderby	2016-04-06	2	-3/+3
	to produce a real error message
	Produce the first specific error message for a malformed Mach-O file describing the problem instead of the generic message for object_error::parse_failed of "Invalid data was encountered while parsing the file". Many more good error messages will follow after this first one. This is built on Lang Hames' great work of adding the 'Error' class for structured error handling and threading Error through MachOObjectFile construction, and making createMachOObjectFile return Expected<...>. So to get the error to the llvm-objdump tool, I changed the stack of these methods to also return Expected<...>:
	  object::ObjectFile::createObjectFile()
	  object::SymbolicFile::createSymbolicFile()
	  object::createBinary()
	Then finally in ParseInputMachO() in MachODump.cpp the error can be reported and the specific error message can be printed in llvm-objdump and can be seen in the existing test case for the existing malformed binary but with the updated error message. Converting these interfaces to Expected<> from ErrorOr<> does involve touching a number of places. To contain the changes for now, errorToErrorCode() and errorOrToExpected() are used where the callers are yet to be converted. Also, there were some bugs in the existing code that did not deal with the old ErrorOr<> return values. So now with Expected<>, since they must be checked and the error handled, I added a TODO and a comment: "// TODO: Actually report errors helpfully" and a call something like consumeError(ObjOrErr.takeError()) so the buggy code will not crash, since the Error must be dealt with. Note there is one fix also needed to lld/COFF/InputFiles.cpp that goes along with this that I will commit right after this. So expect lld not to build after this commit and before the next one. llvm-svn: 265606
*	[LoopUnroll] Fix the way we update DT after complete unrolling.	Michael Zolotukhin	2016-04-06	1	-0/+53
	Updating dominators for exit-blocks of the unrolled loops is not enough, as shown in PR27157. The proper way is to update dominators for all dominance-children of original loop blocks. llvm-svn: 265605
*	[PPC] Use VSX/FP Facility integer load when an integer load's only users are ↵	Ehsan Amiri	2016-04-06	1	-0/+83
	conversion to FP
	http://reviews.llvm.org/D18405 When the integer value loaded is never used directly as an integer, we should use VSX or Floating Point Facility integer loads and avoid an extra direct move. llvm-svn: 265593
*	regenerate checks	Sanjay Patel	2016-04-06	1	-556/+828
	llvm-svn: 265591
*	AMDGPU: Add a shader calling convention	Nicolai Haehnle	2016-04-06	81	-511/+431
	This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 llvm-svn: 265589
*	[IRVerifier] Don't crash on invalid DIFile inside DISubprogram.	Davide Italiano	2016-04-06	1	-0/+10
	r265515, this time with the correct fix. The file inside DISubprogram is not mandatory. llvm-svn: 265586
*	[gold] Save bitcode for module partitions (save-temps + split codegen).	Evgeniy Stepanov	2016-04-06	1	-0/+6
	llvm-svn: 265583
*	Revert r265450 "[X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC."	Hans Wennborg	2016-04-06	1	-5/+15
	It caused ASan 32-bit tests to hang (PR27245). llvm-svn: 265559
*	LoopUnroll: only allow non-modulo Partial unrolling when Runtime=true	Fiona Glaser	2016-04-06	1	-1/+1
	Patch by Evgeny Stupachenko <evstupac@gmail.com>. llvm-svn: 265558
*	Revert "[AMDGPU] llvm-objdump: Minimal HSA Code Object disassembler support."	Valery Pykhtin	2016-04-06	1	-75/+0
	This reverts commit r265550. There are problems with endianness when dumping instruction bytes. Need to find out how to use the support::ulittle32_t type properly. llvm-svn: 265554
*	Revert "Re-commit r265039 "[X86] Merge adjacent stack adjustments in ↵	Hans Wennborg	2016-04-06	8	-77/+19
	eliminateCallFramePseudoInstr (PR27140)"" It seems to be causing ASan tests to crash, probably due to miscompiling the run-time somehow. llvm-svn: 265551
*	[AMDGPU] llvm-objdump: Minimal HSA Code Object disassembler support.	Valery Pykhtin	2016-04-06	1	-0/+75
	Differential revision: http://reviews.llvm.org/D16998 llvm-svn: 265550
*	Recommit r265309 after fixed an invalid memory reference bug happened	Wei Mi	2016-04-06	5	-518/+199
	when DenseMap grew and moved memory. I verified it fixed the bootstrap problem on x86_64-linux-gnu but I cannot verify whether it fixes the bootstrap error on clang-ppc64be-linux. I will watch the build-bot result closely. Replace analyzeSiblingValues with new algorithm to fix its compile time issue. The patch is to solve PR17409 and its duplicates. analyzeSiblingValues is an N x N complexity algorithm where N is the number of siblings generated by reg splitting. Although it causes a significant compile time issue when N is large, it is also important for performance since it removes redundant spills and enables rematerialization. To solve the compile time issue, the patch removes analyzeSiblingValues and replaces it with lower cost alternatives containing two parts. The first part creates a new spill hoisting method in postOptimization of register allocation. It does spill hoisting at once after all the spills are generated instead of inside every instance of selectOrSplit. The second part queries the define expr of the original register for rematerialization and keeps it always available during register allocation even if it is already dead. It deletes those dead instructions only in postOptimization. With the two parts in the patch, it can remove analyzeSiblingValues without sacrificing performance. Differential Revision: http://reviews.llvm.org/D15302 llvm-svn: 265547
*	Revert r265535 until we know how we can fix the bots	Silviu Baranga	2016-04-06	3	-277/+1
	llvm-svn: 265541
*	[AMDGPU] AsmParser: disable DPP for unsupported instructions. New dpp tests. ↵	Sam Kolton	2016-04-06	1	-12/+363
	Fix v_nop_dpp.
	Summary:
	1. Disable DPP encoding for instructions that do not support it:
	   - VOP1:
	     - v_readfirstlane_b32
	     - v_clrexcp
	     - v_movreld_b32
	     - v_movrels_b32
	     - v_movrelsd_b32
	   - VOP2:
	     - v_madmk_f16/32
	     - v_madak_f16/32
	   - VOPC, VINTRP, VOP3
	2. Fix DPP for v_nop
	3. New DPP tests for VOP1 and VOP2 instructions
	Reviewers: nhaustov, tstellarAMD, vpykhtin Subscribers: tstellarAMD, arsenm Differential Revision: http://reviews.llvm.org/D18552 llvm-svn: 265538
*	[SCEV] Introduce a guarded backedge taken count and use it in LAA and LV	Silviu Baranga	2016-04-06	3	-1/+277
	Summary: When the backedge taken condition is computed from an icmp, SCEV can deduce the backedge taken count only if one of the sides of the icmp is an AddRecExpr. However, due to sign/zero extensions, we sometimes end up with something that is not an AddRecExpr. However, we can use SCEV predicates to produce a 'guarded' expression. This change adds a method to SCEV to get this expression, and the SCEV predicate associated with it. In HowManyGreaterThans and HowManyLessThans we will now add a SCEV predicate associated with the guarded backedge taken count when the analyzed SCEV expression is not an AddRecExpr. Note that we only do this as an alternative to returning a 'CouldNotCompute'. We use the new feature in Loop Access Analysis and LoopVectorize to analyze and transform more loops. Reviewers: anemet, mzolotukhin, hfinkel, sanjoy Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17201 llvm-svn: 265535
*	[ppc64] Temporary disable sibling call optimization on ppc64 due to breaking ↵	Chuang-Yu Cheng	2016-04-06	2	-7/+7
	test case
	r265506 breaks print-stack-trace.cc test case of compiler-rt in bootstrap test. http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/1708 llvm-svn: 265528
*	[SLPVectorizer] Vectorizing the libm sqrt to llvm's sqrt intrinsic requires nnan	David Majnemer	2016-04-06	2	-3/+27
	To quote the langref "Unlike sqrt in libm, however, llvm.sqrt has undefined behavior for negative numbers other than -0.0 (which allows for better optimization, because there is no need to worry about errno being set). llvm.sqrt(-0.0) is defined to return -0.0 like IEEE sqrt." This means that it's unsafe to replace sqrt with llvm.sqrt unless the call is annotated with nnan. Thanks to Hal Finkel for pointing this out! llvm-svn: 265521
*	Revert "[IRVerifier] Don't crash on invalid DIFile inside DISubprogram."	Davide Italiano	2016-04-06	1	-10/+0
	This reverts commit r265515 as lots of tests need to be fixed before this actually can go in. llvm-svn: 265517
*	[IRVerifier] Don't crash on invalid DIFile inside DISubprogram.	Davide Italiano	2016-04-06	1	-0/+10
	llvm-svn: 265515
*	[IRVerifier] Avoid crashing on an invalid compile unit.	Davide Italiano	2016-04-06	1	-0/+8
	llvm-svn: 265514
*	ValueMapper: Fix delayed blockaddress handling after r265273	Duncan P. N. Exon Smith	2016-04-06	1	-0/+22
	r265273 added Mapper::mapBlockAddress, which delays mapping a blockaddress value until the function has a body. The condition was backwards, and should be checking Function::empty instead of GlobalValue::isDeclaration. llvm-svn: 265508
*	AsmParser: Don't crash on unresolved !tbaa	Duncan P. N. Exon Smith	2016-04-06	2	-1/+12
	Instead of crashing, give a nice error. As a drive-by, fix the location associated with the errors for unresolved metadata (the location was off by one token). llvm-svn: 265507
*	[ppc64] Enable sibling call optimization on ppc64 ELFv1/ELFv2 abi	Chuang-Yu Cheng	2016-04-06	3	-1/+239
	This patch enables sibling call optimization on ppc64 ELFv1/ELFv2 abi, and adds a couple of test cases. This patch also passed llvm/clang bootstrap test, and spec2006 build/run/result validation. Original issue: https://llvm.org/bugs/show_bug.cgi?id=25617 Great thanks to Tom (tjablin) for his help, he contributed a lot to this patch. Thanks to Hal and Kit for their invaluable opinions! Reviewers: hfinkel kbarton http://reviews.llvm.org/D16315 llvm-svn: 265506
*	[Power9] Implement add-pc, multiply-add, modulo, extend-sign-shift, random ↵	Chuang-Yu Cheng	2016-04-06	3	-0/+111
	number, set bool, and dfp test significance
	This patch implements the following instructions:
	- addpcis subpcis
	- maddhd maddhdu maddld
	- modsw moduw modsd modud
	- darn
	- extswsli extswsli.
	- setb
	- dtstsfi dtstsfiq
	Total 15 instructions
	Reviewers: nemanjai hfinkel tjablin amehsan kbarton http://reviews.llvm.org/D17885 llvm-svn: 265505
*	[Power9] Implement copy-paste, msgsync, slb, and stop instructions	Chuang-Yu Cheng	2016-04-06	3	-0/+65
	This patch implements the following Book II and Book III instructions:
	- copy copy_first cp_abort paste paste. paste_last
	- msgsync
	- slbieg slbsync
	- stop
	Total 10 instructions
	Reviewers: nemanjai hfinkel tjablin amehsan kbarton llvm-svn: 265504
*	Lower @llvm.experimental.deoptimize as a noreturn call	Sanjoy Das	2016-04-06	1	-6/+0
	While preserving the return value for @llvm.experimental.deoptimize at the IR level is useful during mid-level optimization, doing so at the machine instruction level requires generating some extra code and a return that is non-ideal. This change has LLVM lower
	```
	%val = call @llvm.experimental.deoptimize
	ret %val
	```
	to effectively
	```
	call @__llvm_deoptimize()
	unreachable
	```
	instead. llvm-svn: 265502
*	[SLPVectorizer] Vectorize libcalls of sqrt	David Majnemer	2016-04-06	2	-25/+24
	We didn't realize that we could transform the libcall into a vectorized intrinsic. llvm-svn: 265493
*	[DebugInfo] Fix tests so that each subprogram belongs to a CU.	Davide Italiano	2016-04-05	14	-11/+89
	llvm-svn: 265490
*	[RS4GC] Better codegen for deoptimize calls	Sanjoy Das	2016-04-05	1	-2/+14
	Don't emit a gc.result for a statepoint lowered from @llvm.experimental.deoptimize since the call into __llvm_deoptimize is effectively noreturn. Instead follow the corresponding gc.statepoint with an "unreachable". llvm-svn: 265485
*	Swift Calling Convention: swiftcc for ARM.	Manman Ren	2016-04-05	2	-0/+201
	Differential Revision: http://reviews.llvm.org/D18769 llvm-svn: 265482
*	Faster stack-protector for Android/AArch64.	Evgeniy Stepanov	2016-04-05	2	-0/+44
	Bionic has a defined thread-local location for the stack protector cookie. Emit a direct load instead of going through __stack_chk_guard. llvm-svn: 265481
*	Swift Calling Convention: add swiftcc.	Manman Ren	2016-04-05	1	-0/+201
	Differential Revision: http://reviews.llvm.org/D17863 llvm-svn: 265480
*	[CFLAA] Fix PR27213; incorrect tagging of args/globals	George Burgess IV	2016-04-05	2	-5/+41
	Prior to this patch, CFLAA wouldn't tag arguments/globals properly if it didn't find any "interesting" edges on them. This means that, if all you do is store constants to a global or argument, we would never actually treat it as a global/argument. Test case:
	  define void @foo(i32* %A, i32* %B) #0 {
	  entry:
	    store i32 0, i32* %A, align 4
	    store i32 0, i32* %B, align 4
	    ret void
	  }
	CFLAA would say that %A can't alias %B, because neither pointer was used in an interesting way. This patch makes us note whether something is an argument, global, ... regardless of how interesting CFLAA thinks its uses are. (For the record, using a value in an interesting way means loading from it, using it in a GEP, ...) llvm-svn: 265474
*	llvm-dwp: Handle GCC's use of multiple debug_types.dwo sections in a single ↵	David Blaikie	2016-04-05	1	-0/+9
	.dwo file
	(also includes the .test file missing from my previous commit, r265452) llvm-svn: 265457
*	llvm-dwp: Handle dwo files produced by GCC	David Blaikie	2016-04-05	1	-0/+0
	To start with, handle DW_FORM_string names. A follow-up commit will handle the interesting quirk with type units I was originally aiming for here. llvm-svn: 265452
*	[X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC.	Ahmed Bougacha	2016-04-05	1	-15/+5
	We only generate LOCKed versions of add/sub when the result is unused. It often happens that the result is used, but only by a comparison. We can optimize those out by reusing EFLAGS, which lets us use the proper instructions, instead of having to fall back to LXADD. Instead of doing this as an MI peephole (as we do for the other non-LOCKed (really, non-MR) forms), do it in ISel. It becomes quite tricky later. This also makes it eventually possible to stop expanding and/or/xor if the only user is an icmp (also see D18141). This uses the LOCK ISD opcodes added by r262244. Differential Revision: http://reviews.llvm.org/D17633 llvm-svn: 265450
*	[X86] Add tests for ATOMIC_LOAD_OP EFLAGS reuse. NFC.	Ahmed Bougacha	2016-04-05	1	-0/+159
	llvm-svn: 265448
*	[AArch64][Test] Do not override the suffixes for test cases.	Quentin Colombet	2016-04-05	2	-3/+1
	llvm-svn: 265441
*	add tests to show missing optimization from D18230	Sanjay Patel	2016-04-05	1	-0/+59
	llvm-svn: 265431
*	[InstCombine] regenerate checks	Sanjay Patel	2016-04-05	10	-81/+89
	utils/update_test_checks.py was improved with: http://reviews.llvm.org/rL265414 to CHECK-NEXT the first line of the IR function. This ensures that nothing bad has happened before that. llvm-svn: 265417
*	[x86] regenerate checks	Sanjay Patel	2016-04-05	3	-43/+86
	utils/update_test_checks.py was improved with: http://reviews.llvm.org/rL265414 to include the first line of the function (expected to be a comment line). This ensures that nothing bad has happened before the first actual line of checked asm. It also matches the existing behavior of the old script. llvm-svn: 265416
*	WebAssembly: fix cfg-stackify test	JF Bastien	2016-04-05	1	-10/+10
	It was broken by reshuffling induced by r265397 'Don't delete empty preheaders in CodeGenPrepare if it would create a critical edge'. llvm-svn: 265415
*	[lanai] LanaiSetflagAluCombiner more conservative	Jacques Pienaar	2016-04-05	1	-3/+47
	Summary: LanaiSetflagAluCombiner could previously combine instructions across basic building blocks even when not legal. Make the LanaiSetflagAluCombiner more conservative to avoid this. Reviewers: eliben Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18746 llvm-svn: 265411
*	[AMDGPU] Emit linkonce and linkonce_odr symbols	Konstantin Zhuravlyov	2016-04-05	1	-0/+56
	Differential Revision: http://reviews.llvm.org/D18726 llvm-svn: 265408
*	Add missing test for the "Don't delete empty preheaders" added in r265397	Chuang-Yu Cheng	2016-04-05	1	-0/+39
	Author: Tom Jablin (tjablin) llvm-svn: 265402