bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	Revert "Output optimization remarks in YAML"	Adam Nemet	2016-09-27	5	-131/+6
	This reverts commit r282499. The GCC bots are failing llvm-svn: 282503
*	Output optimization remarks in YAML	Adam Nemet	2016-09-27	5	-6/+131
	This allows various presentations of this data using an external tool. This was first recommended here[1].

	As an example, consider this module:

	    1 int foo();
	    2 int bar();
	    3
	    4 int baz() {
	    5   return foo() + bar();
	    6 }

	The inliner generates these missed-optimization remarks today (the hotness information is pulled from PGO):

	    remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
	    remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)

	Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:

	    --- !Missed
	    Pass: inline
	    Name: NotInlined
	    DebugLoc: { File: /tmp/s.c, Line: 5, Column: 10 }
	    Function: baz
	    Hotness: 30
	    Args:
	      - Callee: foo
	      - String: will not be inlined into
	      - Caller: baz
	    ...
	    --- !Missed
	    Pass: inline
	    Name: NotInlined
	    DebugLoc: { File: /tmp/s.c, Line: 5, Column: 18 }
	    Function: baz
	    Hotness: 30
	    Args:
	      - Callee: bar
	      - String: will not be inlined into
	      - Caller: baz
	    ...

	This is a summary of the high-level decisions:

	* There is a new streaming interface to emit optimization remarks. E.g. for the inliner remark above:

	    ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
	                 DEBUG_TYPE, "NotInlined", &I)
	             << NV("Callee", Callee) << " will not be inlined into "
	             << NV("Caller", CS.getCaller()) << setIsVerbose());

	  NV stands for named value and allows the YAML client to process a remark using its name (NotInlined) and the named arguments (Callee and Caller) without parsing the text of the message. Subsequent patches will update ORE users to use the new streaming API.

	* I am using YAML I/O for writing the YAML file. YAML I/O requires you to specify reading and writing at once but reading is highly non-trivial for some of the more complex LLVM types. Since it's not clear that we (ever) want to use LLVM to parse this YAML file, the code supports and asserts that we're writing only. On the other hand, I did verify by experiment that the class hierarchy starting at DiagnosticInfoOptimizationBase can be mapped back from YAML generated here (see D24479).

	* The YAML stream is stored in the LLVM context.

	* In the example, we can probably further specify the IR value used, i.e. print "Function" rather than "Value".

	* As before hotness is computed in the analysis pass instead of DiagnosticInfo. This avoids the layering problem since BFI is in Analysis while DiagnosticInfo is in IR.

	[1] https://reviews.llvm.org/D19678#419445

	Differential Revision: https://reviews.llvm.org/D24587
	llvm-svn: 282499
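	The streaming interface described above can be illustrated with a minimal, self-contained sketch. The types `NamedValue` and `Remark` below are hypothetical stand-ins, not the LLVM classes, and the YAML writer is hand-rolled rather than using YAML I/O; it does no quoting or escaping. The point is only that named arguments keep their keys, so a consumer never has to parse the message text.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a named argument (not LLVM's NV).
struct NamedValue {
  std::string Key;
  std::string Val;
};

// Hypothetical stand-in for a missed-optimization remark.
class Remark {
public:
  Remark(std::string Pass, std::string Name, std::string Function)
      : Pass(std::move(Pass)), Name(std::move(Name)),
        Function(std::move(Function)) {}

  // A named value is stored under its own key.
  Remark &operator<<(const NamedValue &NV) {
    Args.emplace_back(NV.Key, NV.Val);
    return *this;
  }

  // Free text between named values is stored under the key "String".
  Remark &operator<<(const std::string &Text) {
    Args.emplace_back("String", Text);
    return *this;
  }

  // Human-readable message: the argument values concatenated in order.
  std::string message() const {
    std::string M;
    for (const auto &A : Args)
      M += A.second;
    return M;
  }

  // One YAML document per remark, shaped like the example in the message.
  std::string toYAML() const {
    std::ostringstream OS;
    OS << "--- !Missed\n"
       << "Pass: " << Pass << "\n"
       << "Name: " << Name << "\n"
       << "Function: " << Function << "\n"
       << "Args:\n";
    for (const auto &A : Args)
      OS << "  - " << A.first << ": " << A.second << "\n";
    OS << "...\n";
    return OS.str();
  }

private:
  std::string Pass, Name, Function;
  std::vector<std::pair<std::string, std::string>> Args;
};
```

	A client of such a file can then match on `Name: NotInlined` and read `Callee` and `Caller` directly.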
*	Add xxhash to llvm.	Rafael Espindola	2016-09-27	2	-0/+135
	It will be used for fast fingerprinting in lld at least. llvm-svn: 282493
*	[AMDGPU] Enable changing instprinter's behavior based on the per-function	Konstantin Zhuravlyov	2016-09-27	3	-132/+214
	subtarget This is a prerequisite for coming waitcnt changes Differential Revision: https://reviews.llvm.org/D24939 llvm-svn: 282489
*	[mips] Disable tail calls temporarily	Simon Dardis	2016-09-27	1	-1/+1
	Disable tail calls while the remaining bugs are fixed. Enable only for tests. Reviewers: vkalintiris Differential Review: https://reviews.llvm.org/D24912 llvm-svn: 282487
*	[mips] Add rsqrt, recip for MIPS	Simon Dardis	2016-09-27	8	-25/+77
	Add rsqrt.[ds], recip.[ds] for MIPS. Correct the microMIPS definitions for architecture support and register usage. Reviewers: vkalintiris, zoran.jovanoic Differential Review: https://reviews.llvm.org/D24499 llvm-svn: 282485
*	[Power9] Builtins for ELF v.2 API conformance - back end portion	Nemanja Ivanovic	2016-09-27	3	-31/+78
	This patch corresponds to review: https://reviews.llvm.org/D24396 This patch adds support for the "vector count trailing zeroes", "vector compare not equal" and "vector compare not equal or zero" instructions as well as the "scalar count trailing zeroes" instructions. It also changes the vector negation to use XXLNOR (when VSX is enabled) so as not to increase register pressure (previously this was done with a splat immediate of all ones followed by an XXLXOR). This was done because the altivec.h builtins (patch to follow) use vector negation and the use of an additional register for the splat immediate is not optimal. llvm-svn: 282478
*	[X86] Use std::max to calculate alignment instead of assuming RC->getSize() ↵	Craig Topper	2016-09-27	1	-2/+2
	will not return a value greater than 32. I think it theoretically could be 64 for AVX-512. llvm-svn: 282471
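	As a rough illustration of the idea (not the actual X86 code), the sketch below contrasts an alignment computation that assumes the register size never exceeds 32 bytes with one based on std::max. Both function names and the 16-byte floor are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical "before" shape: silently falls back to 16 for any size
// other than 32, so a 64-byte register would get an alignment of 16.
inline uint32_t alignmentAssumingAtMost32(uint32_t RegSizeInBytes) {
  return RegSizeInBytes == 32 ? 32 : 16;
}

// Hypothetical "after" shape: std::max keeps the assumed 16-byte floor
// but scales with the register size, so 64 bytes yields 64.
inline uint32_t alignmentWithMax(uint32_t RegSizeInBytes) {
  return std::max<uint32_t>(RegSizeInBytes, 16);
}
```

	The two versions agree for sizes up to 32 bytes and differ only for larger registers, which is the case the commit message is worried about.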
*	[libFuzzer] run re2 test in 8 threads by default	Kostya Serebryany	2016-09-27	1	-1/+1
	llvm-svn: 282469
*	[sanitizer-coverage] fix a bug in trace-gep	Kostya Serebryany	2016-09-27	2	-2/+2
	llvm-svn: 282467
*	[sanitizer-coverage] don't emit the CTOR function if nothing has been ↵	Kostya Serebryany	2016-09-27	1	-17/+21
	instrumented llvm-svn: 282465
*	Revert r277556. Add -lowertypetests-bitsets-level to control bitsets generation	Ivan Krasin	2016-09-27	1	-9/+2
	Summary: We don't currently need this facility for CFI. Disabling individual hot methods proved to be a better strategy in Chrome. Also, the design of the feature is suboptimal, as pointed out by Peter Collingbourne. Reviewers: pcc Subscribers: kcc Differential Revision: https://reviews.llvm.org/D24948 llvm-svn: 282461
*	[libFuzzer] add a test based on openssl-1.0.1f (finds heartbleed)	Kostya Serebryany	2016-09-27	5	-0/+89
	llvm-svn: 282460
*	[libFuzzer] add -exit_on_src_pos to test libFuzzer itself, add a test script ↵	Kostya Serebryany	2016-09-27	12	-13/+55
	for RE2 that uses this flag llvm-svn: 282458
*	LowerTypeTests: Remove unused variable.	Peter Collingbourne	2016-09-26	1	-1/+0
	llvm-svn: 282456
*	LowerTypeTests: Create LowerTypeTestsModule class and move implementation ↵	Peter Collingbourne	2016-09-26	1	-82/+74
	there. Related simplifications. llvm-svn: 282455
*	[CodeGen] Add support for emitting .init_array instead of .ctors on FreeBSD.	Davide Italiano	2016-09-26	3	-0/+15
	PR: 30494 llvm-svn: 282451
*	[WebAssembly] Use the frame pointer instead of the stack pointer	Derek Schuff	2016-09-26	1	-4/+9
	When we have dynamic allocas we have a frame pointer, and when we're lowering frame indexes we should make sure we use it. Patch by Jacob Gravelle Differential Revision: https://reviews.llvm.org/D24889 llvm-svn: 282442
*	Next set of additional error checks for invalid Mach-O files for the	Kevin Enderby	2016-09-26	1	-0/+15
	other load commands that use the MachO::linkedit_data_command type and are not used in llvm libObject code but are used in llvm tool code. This includes LC_FUNCTION_STARTS, LC_SEGMENT_SPLIT_INFO and LC_DYLIB_CODE_SIGN_DRS load commands. llvm-svn: 282441
*	Move computation past early return	Aditya Kumar	2016-09-26	1	-3/+2
	Reviewers: rafael spatel Differential Revision: https://reviews.llvm.org/D24843 llvm-svn: 282440
*	[thinlto] Basic thinlto fdo heuristic	Piotr Padlewski	2016-09-26	5	-48/+102
	Summary: This patch improves the thinlto importer by importing 3x larger functions that are called from a hot block.

	I compared performance with the trunk on spec, and there were improvements of about 2% on povray and 3.33% on milc. These results seem to be consistent and match the results Teresa got with her simple heuristic. Some benchmarks got slower but I think they are just noisy (mcf, xalancbmk, omnetpp); running the benchmarks again with more iterations to confirm. Geomean of all benchmarks including the noisy ones was about +0.02%.

	I see much better improvement on the google branch with Easwaran's patch for pgo callsite inlining (the inliner actually inlines those big functions). Overall I see +0.5% improvement, and I get +8.65% on povray. So I guess we will see a much bigger change when Easwaran's patch lands (it depends on the new pass manager), but it is still worth putting this to trunk before it.

	Implementation details changes:
	- Removed CallsiteCount.
	- ProfileCount got replaced by Hotness
	- hot-import-multiplier is set to 3.0 for now; I didn't have time to tune it, but I see that we get most of the interesting functions with 3, so there is not much performance difference with higher values, and binary size doesn't grow as much as with 10.0.

	Reviewers: eraman, mehdi_amini, tejohnson Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24638 llvm-svn: 282437
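	The heuristic summarized above amounts to scaling the import size threshold for callees reached from hot call sites. A minimal sketch follows; the enum `CallHotness`, the function `shouldImport`, and the way the threshold is compared against an instruction count are all assumptions for illustration, and only the multiplier value 3.0 comes from the message.

```cpp
#include <cassert>

// Hypothetical hotness classification of a call site.
enum class CallHotness { Unknown, Cold, Hot };

// Hypothetical import decision: a callee is imported when its size fits
// the threshold, and the threshold is multiplied for hot call sites.
inline bool shouldImport(unsigned CalleeInstCount, unsigned BaseThreshold,
                         CallHotness Hotness, double HotMultiplier = 3.0) {
  double Threshold = BaseThreshold;
  if (Hotness == CallHotness::Hot)
    Threshold *= HotMultiplier;
  return CalleeInstCount <= Threshold;
}
```

	With a base threshold of 100, a 250-instruction callee is imported only from a hot call site; a larger multiplier such as 10.0 would admit far bigger callees, which matches the binary size concern in the message.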
*	Add support for Code16GCC	Nirav Dave	2016-09-26	1	-20/+42
	[X86] The .code16gcc directive parses X86 assembly input in 32-bit mode and outputs in 16-bit mode. Teach parser to switch modes appropriately. Reviewers: dwmw2, craig.topper Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20109 llvm-svn: 282430
*	Add optimization bisect support to an optional Mips pass	Andrew Kaylor	2016-09-26	1	-0/+3
	Differential Revision: https://reviews.llvm.org/D19513 llvm-svn: 282428
*	Statistic: Only print statistics on exit for -stats	Matthias Braun	2016-09-26	1	-8/+9
	Previously enabling the statistics with EnableStatistics() would lead to them getting printed to stderr/-info-output-file on exit. However frontends may want a way to enable statistics and do the printing on their own instead of the forced printing on exit. This changes the code so that only the -stats option enables printing on exit, EnableStatistics() only enables the tracking but requires invoking one of the PrintStatistics() variants. Differential Revision: https://reviews.llvm.org/D24819 llvm-svn: 282425
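	The separation of "tracking" from "printing on exit" can be sketched with a small stand-alone registry. `StatRegistry` and its member names are hypothetical and do not mirror LLVM's Statistic implementation; `enablePrintOnExit` plays the role of the -stats option and `enableTracking` the role of EnableStatistics().

```cpp
#include <cassert>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical registry that separates counting from exit-time printing.
class StatRegistry {
public:
  // Analogue of EnableStatistics(): count, but never print implicitly.
  void enableTracking() { Tracking = true; }

  // Analogue of -stats: count and also print at shutdown.
  void enablePrintOnExit() {
    Tracking = true;
    PrintOnExit = true;
  }

  void bump(const std::string &Name) {
    if (Tracking)
      ++Counts[Name];
  }

  // Explicit printing, for frontends that want to control the output.
  void print(std::ostream &OS) const {
    for (const auto &KV : Counts)
      OS << KV.second << " " << KV.first << "\n";
  }

  // Called at exit: prints only when exit-time printing was requested.
  void shutdown(std::ostream &OS) const {
    if (PrintOnExit)
      print(OS);
  }

private:
  bool Tracking = false;
  bool PrintOnExit = false;
  std::map<std::string, unsigned> Counts;
};
```

	A frontend that calls only `enableTracking()` gets no output at shutdown and must call `print` itself.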
*	AMDGPU/SI: Don't crash on anonymous GlobalValues	Tom Stellard	2016-09-26	3	-7/+14
	Summary: We need to call AsmPrinter::getNameWithPrefix() in order to handle anonymous GlobalValues (e.g. @0, @1). Reviewers: arsenm, b-sumner Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D24865 llvm-svn: 282420
*	Remove pruning of phi nodes in MemorySSA - it makes updating harder	Daniel Berlin	2016-09-26	1	-40/+5
	Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24923 llvm-svn: 282419
*	[LV] Scalarize instructions marked scalar after vectorization	Matthew Simpson	2016-09-26	1	-0/+9
	This patch ensures that we actually scalarize instructions marked scalar after vectorization. Previously, such instructions may have been vectorized instead. Differential Revision: https://reviews.llvm.org/D23889 llvm-svn: 282418
*	[Coroutines] Part14: Handle coroutines with no suspend points.	Gor Nishanov	2016-09-26	2	-0/+120
	Summary: If a coroutine has no suspend points, remove heap allocation and turn the coroutine into a normal function. Also, if a pattern is detected that a coroutine resumes or destroys itself prior to the coro.suspend call, turn the suspend point into a simple jump to the resume or cleanup label. This pattern occurs when coroutines are used to propagate errors in functions that return expected<T>. Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24408 llvm-svn: 282414
*	[AArch64] Improve add/sub/cmp isel of uxtw forms.	Geoff Berry	2016-09-26	3	-8/+21
	Don't match the UXTW extended reg forms of ADD/ADDS/SUB/SUBS if the 32-bit to 64-bit zero-extend can be done for free by taking advantage of the 32-bit defining instruction zeroing the upper 32-bits of the X register destination. This enables better instruction selection in a few cases, such as:

	    sub x0, xzr, x8
	instead of:
	    mov x8, xzr
	    sub x0, x8, w9, uxtw

	    madd x0, x1, x1, x8
	instead of:
	    mul x9, x1, x1
	    add x0, x9, w8, uxtw

	    cmp x2, x8
	instead of:
	    sub x8, x2, w8, uxtw
	    cmp x8, #0

	    add x0, x8, x1, lsl #3
	instead of:
	    lsl x9, x1, #3
	    add x0, x9, w8, uxtw

	Reviewers: t.p.northover, jmolloy Subscribers: mcrosier, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D24747 llvm-svn: 282413
*	Add support to optionally limit the size of jump tables.	Evandro Menezes	2016-09-26	5	-12/+48
	Many high-performance processors have a dedicated branch predictor for indirect branches, commonly used with jump tables. As sophisticated as such branch predictors are, they tend to have well defined limits beyond which their effectiveness is hampered or even nullified. One such limit is the number of possible destinations for a given indirect branch that such branch predictors can handle. This patch considers a limit that a target may set to the number of destination addresses in a jump table. Patch by: Evandro Menezes <e.menezes@samsung.com>, Aditya Kumar <aditya.k7@samsung.com>, Sebastian Pop <s.pop@samsung.com>. Differential revision: https://reviews.llvm.org/D21940 llvm-svn: 282412
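	One way to picture a target-imposed limit on jump table size is a helper that splits a set of case destinations into tables no larger than the limit. This is only a sketch under assumptions: the function `splitJumpTable`, the convention that a limit of 0 means "no limit", and the greedy chunking are all invented for illustration and are not taken from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: return the sizes of the jump tables needed to cover
// NumDests destinations when each table may hold at most MaxEntries.
// MaxEntries == 0 is treated here as "no limit" (an assumption).
inline std::vector<unsigned> splitJumpTable(unsigned NumDests,
                                            unsigned MaxEntries) {
  std::vector<unsigned> Sizes;
  if (NumDests == 0)
    return Sizes;
  if (MaxEntries == 0) {
    Sizes.push_back(NumDests);
    return Sizes;
  }
  // Greedily fill tables up to the limit; the last one takes the remainder.
  for (unsigned Left = NumDests; Left > 0;) {
    unsigned N = std::min(Left, MaxEntries);
    Sizes.push_back(N);
    Left -= N;
  }
  return Sizes;
}
```

	Keeping each table within the limit is what lets the indirect branch predictor described above stay effective for every table.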
*	[InstCombine] Fixed bug introduced in r282237	Alexey Bataev	2016-09-26	1	-6/+8
	The index of the new insertelement instruction was evaluated in the wrong way: it was treated as the index of the inserted value instead of the index of the position where the value should be inserted. llvm-svn: 282401
*	[InstCombine] Teach the udiv folding logic how to handle constant expressions.	Andrea Di Biagio	2016-09-26	1	-11/+14
	This patch fixes PR30366. Function foldUDivShl() worked under the assumption that one of the values in input to the function was always an instance of llvm::Instruction. However, function visitUDivOperand() (the only user of foldUDivShl) was clearly violating that precondition; internally, visitUDivOperand() uses pattern matches to check the operands of a udiv. Pattern matchers for binary operators know how to handle both Instruction and ConstantExpr values. This patch fixes the problem in foldUDivShl(). Now we use pattern matchers instead of explicit casts to Instruction. The reduced test case from PR30366 has been added to test file InstCombine/udiv-simplify.ll. Differential Revision: https://reviews.llvm.org/D24565 llvm-svn: 282398
*	[AVR] Add AVRMCExpr	Dylan McKay	2016-09-26	4	-0/+427
	Summary: This adds the AVRMCExpr headers and implementation. Reviewers: arsenm, ruiu, grosbach, kparzysz Subscribers: wdng, beanz, mgorny, kparzysz, jtbandes, llvm-commits Differential Revision: https://reviews.llvm.org/D20503 llvm-svn: 282397
*	Revert "[AMDGPU] Disassembler: print label names in branch instructions"	Sam Kolton	2016-09-26	3	-156/+66
	This reverts commit 6c6dbe625263ec9fcf8de0df27263cf147cde550. llvm-svn: 282396
*	[AMDGPU] Disassembler: print label names in branch instructions	Sam Kolton	2016-09-26	3	-66/+156
	Summary: Add AMDGPUSymbolizer for finding names for labels from ELF symbol table. Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D24802 llvm-svn: 282394
*	[ARM] Promote small global constants to constant pools	James Molloy	2016-09-26	9	-9/+285
	If a constant is unnamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of:

	    ldr r0, .CPI0
	    bl printf
	    bx lr
	    .CPI0: &format_string
	    format_string: .asciz "hello, world!\n"

	We can emit:

	    adr r0, .CPI0
	    bl printf
	    bx lr
	    .CPI0: .asciz "hello, world!\n"

	This can cause significant code size savings when many small strings are used in one function (4 bytes per string).

	This recommit contains fixes for a nasty bug related to fast-isel fallback - because fast-isel doesn't know about this optimization, if it runs and emits references to a string that we inline (because fast-isel fell back to SDAG) we will end up with an inlined string and also an out-of-line string, and we won't emit the out-of-line string, causing backend failures. It also contains fixes for emitting .text relocations which made the sanitizer bots unhappy.

	llvm-svn: 282387
*	[X86] Optimization for replacing LEA with MOV at frame index elimination time	Zvi Rackover	2016-09-26	1	-1/+31
	Summary: Replace a LEA instruction of the form 'lea (%esp), %ebx' --> 'mov %esp, %ebx' MOV is preferable over LEA because usually there are more issue-slots available to execute MOVs than LEAs. Latest processors also support zero-latency MOVs. Fixes pr29022. Reviewers: hfinkel, delena, igorb, myatsina, mkuper Differential Revision: https://reviews.llvm.org/D24705 llvm-svn: 282385
*	[X86][avx512] Fix bug in masked compress store.	Ayman Musa	2016-09-26	4	-16/+31
	Differential Revision: https://reviews.llvm.org/D23984 llvm-svn: 282381
*	[SCEV] Fix the order of members in the initializer list.	Chandler Carruth	2016-09-26	1	-1/+1
	Noticed due to the warning on this line. Sanjoy is on a less-than-awesome internet connection, so committing on his behalf. llvm-svn: 282380
*	[SCEV] Assign LoopPropertiesCache in the move constructor	Sanjoy Das	2016-09-26	1	-0/+1
	In a previous change I collapsed two different caches into one. When doing that I noticed that ScalarEvolution's move constructor was not moving those caches. To keep the previous change simple, I've moved that bugfix into this separate change. llvm-svn: 282376
*	[SCEV] Combine two predicates into one; NFC	Sanjoy Das	2016-09-26	1	-31/+24
	Both `loopHasNoSideEffects` and `loopHasNoAbnormalExits` involve walking the loop and maintaining similar sorts of caches. This commit changes SCEV to compute both the predicates via a single walk, and maintain a single cache instead of two. llvm-svn: 282375
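	The "single walk, single cache" idea can be sketched independently of SCEV. Everything below is a hypothetical model: `Inst`, `Loop`, `LoopProperties`, and `LoopPropertyCache` are invented for this example, and the per-instruction flags stand in for whatever checks the real predicates perform.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical instruction model with the two facts the predicates need.
struct Inst {
  bool MayExitAbnormally;
  bool HasSideEffects;
};
using Loop = std::vector<Inst>;

// Both answers are stored together so one cache entry serves both queries.
struct LoopProperties {
  bool HasNoAbnormalExits;
  bool HasNoSideEffects;
};

class LoopPropertyCache {
public:
  bool hasNoAbnormalExits(const Loop &L) { return get(L).HasNoAbnormalExits; }
  bool hasNoSideEffects(const Loop &L) { return get(L).HasNoSideEffects; }
  unsigned walks() const { return Walks; }

private:
  LoopProperties get(const Loop &L) {
    auto It = Cache.find(&L);
    if (It != Cache.end())
      return It->second;
    ++Walks; // One walk computes both properties.
    LoopProperties P{true, true};
    for (const Inst &I : L) {
      if (I.MayExitAbnormally)
        P.HasNoAbnormalExits = false;
      if (I.HasSideEffects)
        P.HasNoSideEffects = false;
      if (!P.HasNoAbnormalExits && !P.HasNoSideEffects)
        break; // Nothing left to learn.
    }
    Cache.emplace(&L, P);
    return P;
  }

  std::unordered_map<const Loop *, LoopProperties> Cache;
  unsigned Walks = 0;
};
```

	Querying both predicates on the same loop then costs one traversal instead of two, and a move constructor for such a class has only one cache to carry over.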
*	[SCEV] Make it obvious BackedgeTakenInfo's constructor steals storage	Sanjoy Das	2016-09-26	1	-2/+4
	Specifically, it moves SCEVUnionPredicates from its input into its own storage. Make this obvious at the type level. llvm-svn: 282374
*	[SCEV] Further isolate incidental data structure; NFC	Sanjoy Das	2016-09-26	1	-4/+7
	llvm-svn: 282373
*	[SCEV] Simplify BackedgeTakenInfo::getMax; NFC	Sanjoy Das	2016-09-26	1	-7/+7
	llvm-svn: 282372
*	[SCEV] Reserve space in SmallVector; NFC	Sanjoy Das	2016-09-25	1	-0/+1
	llvm-svn: 282368
*	[SCEV] Have ExitNotTakenInfo keep a pointer to its predicate; NFC	Sanjoy Das	2016-09-25	1	-11/+15
	SCEVUnionPredicate is a "heavyweight" structure, so it is beneficial to store the (optional) data out of line. llvm-svn: 282366
*	[SCEV] Simplify tracking ExitNotTakenInfo instances; NFC	Sanjoy Das	2016-09-25	1	-72/+24
	This change simplifies a data structure optimization in the `BackedgeTakenInfo` class for loops with exactly one computable exit. I've sanity checked that this does not regress compile time performance, using sqlite3's amalgamated build. llvm-svn: 282365
*	[SCEV] Rename a couple of fields; NFC	Sanjoy Das	2016-09-25	1	-48/+55
	llvm-svn: 282364
*	[SCEV] Remove incidental data structure; NFC	Sanjoy Das	2016-09-25	1	-15/+19
	llvm-svn: 282363
*	[X86] Remove what appears to be leftover MMX code involving (v1i64 ↵	Craig Topper	2016-09-25	1	-4/+0
	scalar_to_vector). llvm-svn: 282361