bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[PowerPC] Add a regression test for r225251	Hal Finkel	2015-01-06	1	-0/+23
	In r225251, I removed an old entry from the README.txt file. While there are several contributing factors (including pieces in Clang's ABI code), upon further reflection, the backend part deserves a regression test. llvm-svn: 225268
*	[Hexagon] Adding dealloc_return encoding and absolute address stores.	Colin LeMahieu	2015-01-06	3	-1/+29
	llvm-svn: 225267
*	Convert fcmp with 0.0 from casted integers to icmp	Matt Arsenault	2015-01-06	1	-0/+454
	This is already handled in general when it is known the conversion can't lose bits with smaller integer types casted into wider floating point types. This pattern happens somewhat often in GPU programs that cast workitem intrinsics to float, which are often compared with 0. Specifically handle the special case of compares with zero which should also be known to not lose information. I had a more general version of this which allows equality compares if the casted float is exactly representable in the integer, but I'm not 100% confident that is always correct. Also fold cases that aren't integers to true / false. llvm-svn: 225265
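As a rough illustration of why the compare-with-zero case is safe, the following Python sketch (a model, not LLVM code; `to_f32` and `compares_agree` are hypothetical helper names) rounds integers to single precision and checks that every ordered comparison against 0.0 agrees with the integer comparison against 0. Rounding may change the magnitude of a large integer, but it should never change its sign or turn a non-zero value into zero.

```python
import struct

def to_f32(x: int) -> float:
    """Round an integer to the nearest IEEE-754 single (a model of sitofp to float)."""
    return struct.unpack("f", struct.pack("f", float(x)))[0]

def compares_agree(x: int) -> bool:
    """True if comparing the converted float with 0.0 matches comparing x with 0."""
    f = to_f32(x)
    return (
        (f == 0.0) == (x == 0)
        and (f < 0.0) == (x < 0)
        and (f > 0.0) == (x > 0)
        and (f <= 0.0) == (x <= 0)
        and (f >= 0.0) == (x >= 0)
    )

# Values chosen to include integers that are not exactly representable as float.
samples = [0, 1, -1, 2**25 + 1, -(2**25 + 1), 2**31 - 1, -(2**31), 2**63 - 1, -(2**63)]
all_agree = all(compares_agree(x) for x in samples)
```

Note that `to_f32(2**25 + 1)` is not equal to `2**25 + 1`, so the conversion does lose bits here; only the comparison with zero is unaffected.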
*	[PM] Introduce a utility pass that preserves no analyses.	Chandler Carruth	2015-01-06	1	-0/+33
	Use this to test that path of invalidation. This test actually shows redundant invalidation here that is really bad. I'm going to work on fixing that next, but wanted to commit the test harness now that it's all working. llvm-svn: 225257
*	[X86] Add OpSize32 to XBEGIN_4. Add XBEGIN_2 with OpSize16.	Craig Topper	2015-01-06	1	-0/+3
	Requires new AsmParserOperand types that detect 16-bit and 32/64-bit mode so that we choose the right instruction based on default sizing without predicates. This is necessary since predicates mess up the disassembler table building. llvm-svn: 225256
*	InstCombine: Bitcast call arguments from/to pointer/integer type	David Majnemer	2015-01-06	2	-4/+43
	Try harder to get rid of bitcast'd calls by ptrtoint/inttoptr'ing arguments and return values when DataLayout says it is safe to do so. llvm-svn: 225254
*	[PM] Simplify how we parse the outer layer of the pass pipeline text and	Chandler Carruth	2015-01-06	2	-3/+36
	remove an extra, redundant pass manager wrapping every run. I had kept seeing these when manually testing, but it was getting really annoying and was going to cause problems with overly eager invalidation. The root cause was an overly complex and unnecessary pile of code for parsing the outer layer of the pass pipeline. We can instead delegate most of this to the recursive pipeline parsing. I've added some somewhat more basic and precise tests to catch this. llvm-svn: 225253
*	X86: Don't make illegal GOTTPOFF relocations	David Majnemer	2015-01-06	1	-0/+8
	"ELF Handling for Thread-Local Storage" specifies that R_X86_64_GOTTPOFF relocations target a movq or addq instruction. Prohibit the truncation of such loads to movl or addl. This fixes PR22083. Differential Revision: http://reviews.llvm.org/D6839 llvm-svn: 225250
*	[PowerPC] Improve int_to_fp(fp_to_int(x)) combining	Hal Finkel	2015-01-06	1	-0/+70
	The old target DAG combine that allowed for performing int_to_fp(fp_to_int(x)) without a load/store pair is updated here with support for unsigned integers, and to support single-precision values without a third rounding step, on newer cores with the appropriate instructions. llvm-svn: 225248
*	[PM] Add a utility pass template that synthesizes the invalidation of	Chandler Carruth	2015-01-06	1	-0/+51
	a specific analysis result. This is quite handy to test things, and will also likely be very useful for debugging issues. You could narrow down pass validation failures by walking these invalidate pass runs up and down the pass pipeline, etc. I've added support to the pass pipeline parsing to be able to create one of these for any analysis pass desired. Just adding this class uncovered one latent bug where the AnalysisManager CRTP base class had a hard-coded Module type rather than using IRUnitT. I've also added tests for invalidation and caching of analyses in a basic way across all the pass managers. These in turn uncovered two more bugs where we failed to correctly invalidate an analysis -- its results were invalidated but the key for re-running the pass was never cleared and so it was never re-run. Quite nasty. I'm very glad to debug this here rather than with a full system. Also, yes, the naming here is horrid. I'm going to update some of the names to be slightly less awful shortly. But really, I've no "good" ideas for naming. I'll be satisfied if I can get it to "not bad". llvm-svn: 225246
*	[PM] Add a collection of no-op analysis passes and switch the new pass	Chandler Carruth	2015-01-06	1	-6/+14
	manager tests to use them and be significantly more comprehensive. This, naturally, uncovered a bug where the CGSCC pass manager wasn't printing analyses when they were run. The only remaining core manipulator is I think an invalidate pass similar to the require pass. That'll be next. =] llvm-svn: 225240
*	[PM] Add a utility to the new pass manager for generating a pass which	Chandler Carruth	2015-01-06	1	-4/+2
	is a no-op other than requiring some analysis results be available. This can be used in real pass pipelines to force the usually lazy analysis running to eagerly compute something at a specific point, and it can be used to test the pass manager infrastructure (my primary use at the moment).
	I've also added a bit of pipeline parsing magic to support generating these directly from the opt command so that you can directly use these when debugging your analysis. The syntax is:
	    require<analysis-name>
	This can be used at any level of the pass manager. For example:
	    cgscc(function(require<my-analysis>,no-op-function))
	This would produce a no-op function pass requiring my-analysis, followed by a fully no-op function pass, both of these in a function pass manager which is nested inside of a bottom-up CGSCC pass manager which is in the top-level (implicit) module pass manager.
	I have zero attachment to the particular syntax I'm using here. Consider it a straw man for use while I'm testing and fleshing things out. Suggestions for better syntax welcome, and I'll update everything based on any consensus that develops.
	I've used this new functionality to more directly test the analysis printing rather than relying on the cgscc pass manager running an analysis for me. This is still minimally tested because I need to have analyses to run first! ;] That patch is next, but wanted to keep this one separate for easier review and discussion. llvm-svn: 225236
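To make the nesting in the example pipeline concrete, here is a small Python sketch of a recursive parser for text of that shape. It is a toy model under stated assumptions, not the parser in opt: `parse_pipeline` and `required_analysis` are hypothetical names, and the only grammar assumed is "name, optionally followed by a parenthesized, comma-separated list".

```python
def parse_pipeline(text: str) -> list:
    """Parse 'a(b(c,d),e)' into a list of (name, children) tuples."""
    pos = 0

    def parse_list() -> list:
        nonlocal pos
        items = []
        while True:
            start = pos
            # A name runs until the next structural character.
            while pos < len(text) and text[pos] not in "(),":
                pos += 1
            name = text[start:pos]
            if not name:
                raise ValueError(f"expected a pass name at offset {start}")
            children = []
            if pos < len(text) and text[pos] == "(":
                pos += 1  # consume '('
                children = parse_list()
                if pos >= len(text) or text[pos] != ")":
                    raise ValueError("missing ')'")
                pos += 1  # consume ')'
            items.append((name, children))
            if pos < len(text) and text[pos] == ",":
                pos += 1  # consume ',' and parse the next sibling
                continue
            return items

    result = parse_list()
    if pos != len(text):
        raise ValueError(f"unexpected character at offset {pos}")
    return result

def required_analysis(name: str):
    """Return the analysis name inside 'require<...>', or None for other passes."""
    if name.startswith("require<") and name.endswith(">"):
        return name[len("require<"):-1]
    return None

tree = parse_pipeline("cgscc(function(require<my-analysis>,no-op-function))")
```

Here `tree` holds one `cgscc` entry containing one `function` entry, which in turn contains the two leaf passes, mirroring the nesting described in the commit message.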
*	Add a testcase that would have found the problem in r225048.	Rafael Espindola	2015-01-06	1	-0/+27
	llvm-svn: 225235
*	Revert r225048: It broke ObjC on AArch64.	Lang Hames	2015-01-06	3	-124/+0
	I've filed http://llvm.org/PR22100 to track this issue. llvm-svn: 225228
*	[PowerPC] Fix test to pass on Darwin hosts	Hal Finkel	2015-01-05	1	-1/+3
	llvm-svn: 225220
*	[PowerPC] Convert a README.txt entry into a better test	Hal Finkel	2015-01-05	1	-1/+7
	We now produce the desired code as noted in the README.txt file (no spurious or). Remove the README entry and improve the regression test. llvm-svn: 225214
*	[Hexagon] Adding add/sub with carry, logical shift left by immediate and ↵	Colin LeMahieu	2015-01-05	3	-0/+56
	memop instructions. Removing old defs without bits and updating references. llvm-svn: 225210
*	[PowerPC] Add a test for truncating a shifted load	Hal Finkel	2015-01-05	1	-0/+18
	We now produce the desired code as noted in the README.txt file. Remove the README entry and add a regression test. llvm-svn: 225209
*	[dsymutil] Implement the BinaryHolder object and gain archive support.	Frederic Riss	2015-01-05	4	-0/+32
	This object is meant to own the ObjectFiles and their underlying MemoryBuffer. It is basically the equivalent of an OwningBinary except that it efficiently handles Archives. It is optimized for efficiently providing mappings of members of the same archive when they are opened successively (which is standard in Darwin debug maps, objects from the same archive will be contiguous). Of course, the BinaryHolder will also be used by the DWARF linker once it is committed, but for now only the debug map parser uses it. With this change, you can run llvm-dsymutil on your Darwin debug build of clang and get a complete debug map for it. Differential Revision: http://reviews.llvm.org/D6690 llvm-svn: 225207
*	[PowerPC] Add another test for load/store with update	Hal Finkel	2015-01-05	1	-0/+19
	We now produce the desired code as noted in the README.txt file. Remove the README entry and add a regression test. llvm-svn: 225205
*	[PowerPC] Fold i1 extensions with other ops	Hal Finkel	2015-01-05	1	-0/+54
	Consider this function from our README.txt file:
	    int foo(int a, int b) { return (a < b) << 4; }
	We now explicitly track CR bits by default, so the comment in the README.txt about not really having a SETCC is no longer accurate, but we did generate this somewhat silly code:
	    cmpw 0, 3, 4
	    li 3, 0
	    li 12, 1
	    isel 3, 12, 3, 0
	    sldi 3, 3, 4
	    blr
	which generates the zext as a select between 0 and 1, and then shifts the result by a constant amount. Here we preprocess the DAG in order to fold the results of operations on an extension of an i1 value into the SELECT_I[48] pseudo instruction when the resulting constant can be materialized using one instruction (just like the 0 and 1). This was not implemented as a DAGCombine because the resulting code would have been anti-canonical and depends on replacing chained user nodes, which does not fit well into the lowering paradigm. Now we generate:
	    cmpw 0, 3, 4
	    li 3, 0
	    li 12, 16
	    isel 3, 12, 3, 0
	    blr
	which is less silly. llvm-svn: 225203
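The fold can be pictured as applying the operation to both arms of the select ahead of time. The Python sketch below is only a model of that idea: `fold_into_select` and `fits_li` are hypothetical names, and `fits_li` assumes a 16-bit signed immediate as a simplified stand-in for "can be materialized using one instruction".

```python
def shift_of_zext(a: int, b: int) -> int:
    """The original form: zero-extend the compare result, then shift."""
    bit = 1 if a < b else 0
    return bit << 4

def fold_into_select(op):
    """Apply op to both select arms (1 and 0) and return the new constants."""
    return op(1), op(0)

def fits_li(value: int) -> bool:
    """Simplified one-instruction test: value fits a 16-bit signed immediate."""
    return -0x8000 <= value <= 0x7FFF

# Fold 'x << 4' into the select: the arms become 16 and 0.
true_val, false_val = fold_into_select(lambda x: x << 4)

def folded_select(a: int, b: int) -> int:
    """The folded form: select directly between the precomputed constants."""
    return true_val if a < b else false_val

pairs = [(-5, 3), (3, -5), (0, 0), (7, 8), (8, 7)]
same = all(shift_of_zext(a, b) == folded_select(a, b) for a, b in pairs)
```

Both arms (16 and 0) pass `fits_li`, which corresponds to the `li 12, 16` and `li 3, 0` in the improved output and is why the separate `sldi` is no longer needed.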
*	[Hexagon] Adding rounding reg/reg variants, accumulating multiplies, and ↵	Colin LeMahieu	2015-01-05	3	-0/+32
	accumulating shifts. llvm-svn: 225201
*	[Hexagon] Adding V4 bit manipulating instructions, removing ALU defs without ↵	Colin LeMahieu	2015-01-05	2	-0/+22
	encoding bits. llvm-svn: 225199
*	[Hexagon] Adding V4 logic-logic instructions and tests.	Colin LeMahieu	2015-01-05	1	-0/+26
	llvm-svn: 225198
*	[Hexagon] Adding orand, bitsplit reg/reg, and modwrap instructions.	Colin LeMahieu	2015-01-05	2	-0/+6
	llvm-svn: 225197
*	[PowerPC] Remove zexts after i32 ctlz	Hal Finkel	2015-01-05	1	-4/+20
	The 64-bit semantics of cntlzw are not special: the 32-bit leading-zero count is stored as a 64-bit value in the range [0,32]. As a result, it is always zero extended, and it can be added to the PPCISelDAGToDAG peephole optimization as a frontier instruction for the removal of unnecessary zero extensions. llvm-svn: 225192
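The range argument can be checked with a short Python model. This is a sketch only: `cntlzw` and `zext32` here are Python stand-ins for the instruction and for a 32-to-64-bit zero extension, not LLVM or hardware code.

```python
def cntlzw(reg64: int) -> int:
    """Model: count leading zeros in the low 32 bits of a 64-bit register."""
    low = reg64 & 0xFFFFFFFF
    return 32 - low.bit_length()

def zext32(value: int) -> int:
    """Model of zero-extending the low 32 bits to 64 bits."""
    return value & 0xFFFFFFFF

inputs = [0, 1, 0x80000000, 0xFFFFFFFF, 0xFFFFFFFF00000000, 0x0000000100000010]
# The result always lies in [0, 32], so a following zero extension changes nothing.
zext_is_noop = all(zext32(cntlzw(x)) == cntlzw(x) for x in inputs)
in_range = all(0 <= cntlzw(x) <= 32 for x in inputs)
```

Because the result never has bits set above bit 5, an explicit zero extension after the count is redundant, which is what lets the peephole treat the instruction as a frontier.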
*	[PowerPC] Remove zexts after byte-swapping loads	Hal Finkel	2015-01-05	1	-0/+30
	lhbrx and lwbrx not only load their data with byte swapping, but also clear the upper 32 bits (at least). As a result, they can be added to the PPCISelDAGToDAG peephole optimization as frontier instructions for the removal of unnecessary zero extensions. llvm-svn: 225189
*	[Hexagon] Adding round reg/imm and bitsplit instructions.	Colin LeMahieu	2015-01-05	2	-0/+8
	llvm-svn: 225188
*	[AArch64] Improve codegen of store lane instructions by avoiding GPR usage.	Ahmed Bougacha	2015-01-05	1	-4/+104
	We used to generate code similar to:
	    umov.b w8, v0[2]
	    strb w8, [x0, x1]
	because the STRro patterns were preferred to ST1. Instead, we can avoid going through GPRs, and generate:
	    add x8, x0, x1
	    st1.b { v0 }[2], [x8]
	This patch increases the ST1 AddedComplexity to achieve that. rdar://16372710 Differential Revision: http://reviews.llvm.org/D6202 llvm-svn: 225183
*	[AArch64] Improve codegen of store lane 0 instructions by directly storing ↵	Ahmed Bougacha	2015-01-05	1	-0/+92
	the subregister. For 0-lane stores, we used to generate code similar to:
	    fmov w8, s0
	    str w8, [x0, x1, lsl #2]
	instead of:
	    str s0, [x0, x1, lsl #2]
	To correct that: for store lane 0 patterns, directly match to STR <subreg>0. Byte-sized instructions don't have the special case for a 0 index, because FPR8s are defined to have untyped content. rdar://16372710 Differential Revision: http://reviews.llvm.org/D6772 llvm-svn: 225181
*	llvm/test/lit.cfg: have_ld_plugin_support(): Use decode() for stdout.	NAKAMURA Takumi	2015-01-05	1	-1/+1
	llvm-svn: 225171
*	Select lower fsub,fabs pattern to fabd on AArch64	Karthik Bhat	2015-01-05	1	-0/+69
	This patch lowers patterns such as:
	    fsub v0.4s, v0.4s, v1.4s
	    fabs v0.4s, v0.4s
	to
	    fabd v0.4s, v0.4s, v1.4s
	on AArch64. Review: http://reviews.llvm.org/D6791 llvm-svn: 225169
*	Parse Tag_compatibility correctly.	Charlie Turner	2015-01-05	3	-4/+5
	Tag_compatibility takes two arguments, but before this patch it would erroneously accept just one. It now produces an error in that case. Change-Id: I530f918587620d0d5dfebf639944d6083871ef7d llvm-svn: 225167
*	Emit the build attribute Tag_conformance.	Charlie Turner	2015-01-05	4	-5/+25
	Claim conformance to version 2.09 of the ARM ABI. This build attribute must be emitted first amongst the build attributes when written to an object file. This is to simplify conformance detection by consumers. Change-Id: If9eddcfc416bc9ad6e5cc8cdcb05d0031af7657e llvm-svn: 225166
*	Select lower sub,abs pattern to sabd on AArch64	Karthik Bhat	2015-01-05	1	-0/+101
	This patch lowers patterns such as:
	    sub v0.4s, v0.4s, v1.4s
	    abs v0.4s, v0.4s
	to
	    sabd v0.4s, v0.4s, v1.4s
	on AArch64. Review: http://reviews.llvm.org/D6781 llvm-svn: 225165
*	Fix broken test from r225159.	Michael Kuperstein	2015-01-05	1	-1/+1
	llvm-svn: 225164
*	[PM] Don't run the machinery of invalidating all the analysis passes	Chandler Carruth	2015-01-05	1	-0/+7
	when all are being preserved. We want to short-circuit this for a couple of reasons. One, I don't really want passes to grow a dependency on actually receiving their invalidate call when they've been preserved. I'm thinking about removing this entirely. But more importantly, preserving everything is likely to be the common case in a lot of scenarios, and it would be really good to bypass all of the invalidation and preservation machinery there. Avoiding calling N opaque functions to try to invalidate things that are by definition still valid seems important. =] This wasn't really inspired by much other than seeing the spam in the logging for analyses, but it seems better to get it checked in rather than forgetting about it. llvm-svn: 225163
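The short-circuit can be sketched with a toy analysis cache in Python. Everything here is a hypothetical model (the class names only echo the concepts in the commit message and are not the LLVM C++ API): when a pass reports that it preserved everything, the manager returns before touching any cached result.

```python
class PreservedAnalyses:
    """Toy record of which analyses a pass left valid."""
    def __init__(self, names=(), all_preserved=False):
        self.names = set(names)
        self.all_preserved = all_preserved

    def preserved(self, name: str) -> bool:
        return self.all_preserved or name in self.names

class AnalysisManager:
    """Toy cache of analysis results keyed by analysis name."""
    def __init__(self):
        self.results = {}
        self.invalidate_checks = 0  # counts per-analysis invalidation queries

    def get_result(self, name, compute):
        # Analyses run lazily and are cached until invalidated.
        if name not in self.results:
            self.results[name] = compute()
        return self.results[name]

    def invalidate(self, pa: PreservedAnalyses) -> None:
        if pa.all_preserved:
            return  # short-circuit: nothing can be stale, skip the machinery
        for name in list(self.results):
            self.invalidate_checks += 1
            if not pa.preserved(name):
                # Drop the cached entry so the analysis is recomputed on demand.
                del self.results[name]

am = AnalysisManager()
am.get_result("a", lambda: 1)
am.get_result("b", lambda: 2)
am.invalidate(PreservedAnalyses(all_preserved=True))
checks_after_all = am.invalidate_checks
am.invalidate(PreservedAnalyses(names={"a"}))
```

After the first call no per-analysis check has run; after the second, only `a` remains cached and `b` would be recomputed on the next request.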
*	[PM] Add names and debug logging for analysis passes to the new pass	Chandler Carruth	2015-01-05	1	-0/+10
	manager. This starts to allow us to test analyses more easily, but it's really only the beginning. Some of the code here is still untestable without manual changes to create analysis passes, but I wanted to factor it into as small chunks as possible. Next up in order to be able to test things are, in no particular order:
	- No-op analysis passes so we don't have to use real ones to exercise the pass manager itself.
	- Automatic way of generating dummy passes that require an analysis be run, including a variant that calls a 'print' method on a pass to make it even easier to print out the results of an analysis.
	- Dummy passes that invalidate all analyses for their IR unit so we can test invalidation and re-runs.
	- Automatic way to print each analysis pass as it is re-run.
	- Automatic but optional verification of analysis passes everywhere possible.
	I'm not claiming I'll get to all of these immediately, but that's what is in the pipeline at some stage. I'm fleshing out exactly what I need and what to prioritize by working on converting analyses and then trying to test the conversion. =] llvm-svn: 225162
*	Fixed a bug in memory dependence checking module of loop vectorization. The ↵	Jiangning Liu	2015-01-05	1	-0/+26
	following loop should not be vectorized with the current algorithm.
	{code}
	// loop body
	... = a[i]     (1)
	... = a[i+1]   (2)
	.......
	a[i+1] = ....  (3)
	a[i] = ...     (4)
	{code}
	The algorithm tries to collect memory access candidates from AliasSetTracker, and then checks memory dependences against one another. The memory accesses are unique in AliasSetTracker, and a single memory access in AliasSetTracker may map to multiple entries in AccessAnalysis, which could cover both 'read' and 'write'. Originally the algorithm only checked the 'write' entry in Accesses if only 'write' exists. This is incorrect, and the consequence is that it ignored all read accesses, so some RAW and WAR dependences were missed. For the case given above, if we ignore the two reads, the dependence between (1) and (3) cannot be captured, and the loop will be incorrectly vectorized. The fix simply inserts a new loop to find all entries in Accesses. Since it skips most other memory accesses by checking the Value pointer at the very beginning of the loop, it should not increase compile time visibly. llvm-svn: 225159
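To see why the dependence between (1) and (3) matters, the Python sketch below runs the loop shape from the message both sequentially and in a naive "vectorized" order, where all loads of a group of iterations happen before any of their stores. The elided computations are not given in the commit, so `x + 1` and `y + 1` are placeholder assumptions, and `run_scalar` and `run_vectorized` are hypothetical names.

```python
def run_scalar(a):
    """Execute the loop one iteration at a time, in program order."""
    a = list(a)
    for i in range(len(a) - 1):
        x = a[i]          # (1)
        y = a[i + 1]      # (2)
        a[i + 1] = x + 1  # (3) placeholder computation
        a[i] = y + 1      # (4) placeholder computation
    return a

def run_vectorized(a, vf):
    """Execute vf iterations at once: all loads first, then all stores."""
    a = list(a)
    n = len(a) - 1
    for base in range(0, n, vf):
        lanes = range(base, min(base + vf, n))
        xs = [a[i] for i in lanes]       # all (1) loads
        ys = [a[i + 1] for i in lanes]   # all (2) loads
        for i, x in zip(lanes, xs):
            a[i + 1] = x + 1             # all (3) stores
        for i, y in zip(lanes, ys):
            a[i] = y + 1                 # all (4) stores
    return a

data = [0, 10, 20]
scalar_result = run_scalar(data)
vector_result = run_vectorized(data, 2)
```

With a vector width of 2 the second iteration's read (1) no longer sees the value written by the first iteration's store (3), so the two results differ; with a width of 1 the model reduces to the scalar loop.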
*	[PowerPC] Enable speculation of cttz/ctlz	Hal Finkel	2015-01-05	1	-0/+41
	PPC has an instruction for ctlz with defined zero behavior, and our lowering of cttz (provided by DAGCombine) is also efficient and branchless, so speculating these makes sense. llvm-svn: 225150
*	[SROA] Apply a somewhat heavy and unpleasant hammer to fix PR22093, an	Chandler Carruth	2015-01-05	1	-0/+54
	assert out of the new pre-splitting in SROA. This fix makes the code do what was originally intended -- when we have a store of a load both dealing in the same alloca, we force them to both be pre-split with identical offsets. This is really quite hard to do because we can keep discovering problems as we go along. We have to track every load over the current alloca which for any reason becomes invalid for pre-splitting, and go back to remove all stores of those loads. I've included a couple of test cases derived from PR22093 that cover the different ways this can happen. While that PR only really triggered the first of these two, it's the same fundamental issue. The other challenge here is documented in a FIXME now. We end up being quite a bit more aggressive for pre-splitting when loads and stores don't refer to the same alloca. This aggressiveness comes at the cost of introducing potentially redundant loads. It isn't clear that this is the right balance. It might be considerably better to require that we only do pre-splitting when we can presplit every load and store involved in the entire operation. That would give more consistent if conservative results. Unfortunately, it requires a non-trivial change to the actual pre-splitting operation in order to correctly handle cases where we end up pre-splitting stores out-of-order. And it isn't 100% clear that this is the right direction, although I'm starting to suspect that it is. llvm-svn: 225149
*	[PowerPC] Materialize i64 constants using rotation with masking	Hal Finkel	2015-01-05	1	-7/+27
	r225135 added the ability to materialize i64 constants using rotations in order to reduce the instruction count. Sometimes we can use a rotation only with some extra masking, so that we take advantage of the fact that generating a bunch of extra higher-order 1 bits is easy using li/lis. llvm-svn: 225147
*	[PM] Wire up support for explicitly running the verifier pass.	Chandler Carruth	2015-01-05	1	-0/+20
	The required functionality has been there for some time, but I never managed to actually wire it into the command line registry of passes. Let's do that. llvm-svn: 225144
*	[X86][SSE] Added vector packing test for pr12412	Simon Pilgrim	2015-01-04	1	-0/+34
	llvm-svn: 225138
*	[X86][SSE] Added vector integer truncation tests - based off pr15524	Simon Pilgrim	2015-01-04	1	-0/+90
	llvm-svn: 225137
*	[PowerPC] Materialize i64 constants using rotation	Hal Finkel	2015-01-04	1	-4/+48
	Materializing full 64-bit constants on PPC64 can be expensive, requiring up to 5 instructions depending on the locations of the non-zero bits. Sometimes materializing a rotated constant, and then applying the inverse rotation, requires fewer instructions than the direct method. If so, do that instead. In r225132, I added support for forming constants using bit inversion. In effect, this reverts that commit and replaces it with rotation support. The bit inversion is useful for turning constants that are mostly ones into ones that are mostly zeros (thus enabling a more-efficient shift-based materialization), but the same effect can be obtained by using negative constants and a rotate, and that is at least as efficient, if not more. llvm-svn: 225135
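A simplified Python model of the rotation idea follows. It is a sketch, not the backend's selection logic: it only asks whether some rotation of the constant fits a 16-bit signed immediate (the assumption being that such a value can be loaded in one instruction and then rotated back), whereas the real code compares instruction counts. `find_li_rotation` and the other helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1

def rotl64(x: int, r: int) -> int:
    """Rotate a 64-bit value left by r bits."""
    r %= 64
    return ((x << r) | (x >> (64 - r))) & MASK64

def as_signed64(x: int) -> int:
    """Interpret a 64-bit pattern as a two's-complement signed integer."""
    return x - (1 << 64) if x >> 63 else x

def find_li_rotation(c: int):
    """Return (imm, r) such that rotating imm left by r gives c, or None.

    imm must fit a 16-bit signed immediate; the search tries r = 0..63.
    """
    for r in range(64):
        imm = as_signed64(rotl64(c, (64 - r) % 64))  # rotate c right by r
        if -0x8000 <= imm <= 0x7FFF:
            return imm, r
    return None

# A mostly-ones constant: a negative immediate plus one rotate is enough.
mostly_ones = 0xFFFFFFFFFFFF7FFF
imm, r = find_li_rotation(mostly_ones)
round_trip = rotl64(imm & MASK64, r)

# A constant with bits spread throughout has no such rotation in this model.
spread = find_li_rotation(0x0123456789ABCDEF)
```

The `mostly_ones` case illustrates the remark about negative constants: the rotated value is a small negative number, so no separate bit inversion step is needed.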
*	[PowerPC] Materialize i64 constants using bit inversion	Hal Finkel	2015-01-04	1	-0/+20
	Materializing full 64-bit constants on PPC64 can be expensive, requiring up to 5 instructions depending on the locations of the non-zero bits. Sometimes materializing the bit-inverted constant, and then flipping the bits, requires fewer instructions than the direct method. If so, do that instead. llvm-svn: 225132
*	InstCombine: match can find ConstantExprs, don't assume we have a Value	David Majnemer	2015-01-04	1	-0/+9
	We assumed the output of a match was a Value. This would cause us to assert because we would fail a cast<>. Instead, use a helper in the Operator family to hide the distinction between Value and Constant. This fixes PR22087. llvm-svn: 225127
*	ValueTracking: ComputeNumSignBits should tolerate misshapen phi nodes	David Majnemer	2015-01-04	1	-0/+28
	PHI nodes can have zero operands in the middle of a transform. It is expected that utilities in Analysis don't freak out when this happens. Note that it is considered invalid to allow these misshapen phi nodes to make it to another pass. This fixes PR22086. llvm-svn: 225126
*	llvm-readobj: add support to dump COFF export tables	Saleem Abdulrasool	2015-01-03	4	-0/+11
	This enhances llvm-readobj to print out the COFF export table, similar to the -coff-import option. This is useful for testing in lld. llvm-svn: 225120