bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[Powerpc darwin] AsmParser Base implementation.	Iain Sandoe	2013-12-14	1	-27/+23
	This is a base implementation of the powerpc-apple-darwin asm parser dialect.
	* Enables infrastructure (essentially isDarwin()) and fixes up the parsing of asm directives to separate out ELF and MachO/Darwin additions.
	* Enables parsing of {r,f,v}XX as register identifiers.
	* Enables parsing of lo16(), hi16() and ha16() as modifiers.
	The changes to the test case are from David Fang (fangism).
	llvm-svn: 197324
*	PowerPC: add Linux triple to TLS tests	Tim Northover	2013-12-12	2	-0/+3
\| \| \| \| \| \|	The tests were failing on OS X. llvm-svn: 197146
*	Improve instruction scheduling for the PPC POWER7	Hal Finkel	2013-12-12	1	-0/+31
	Aside from a few minor latency corrections, the major change here is a new hazard recognizer which focuses on better dispatch-group formation on the POWER7. As with the PPC970's hazard recognizer, the most important thing it does is avoid load-after-store hazards within the same dispatch group. It uses the POWER7's special dispatch-group-terminating nop instruction (instead of inserting multiple regular nop instructions). This new hazard recognizer makes use of the scheduling dependency graph itself, built using AA information, to robustly detect the possibility of load-after-store hazards.
	Significant test-suite performance changes (the error bars are 99.5% confidence intervals based on 5 test-suite runs both with and without the change; speedups are negative):
	speedups:
	  MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2 -0.55171% +/- 0.333168%
	  MultiSource/Benchmarks/TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl -17.5576% +/- 14.598%
	  MultiSource/Benchmarks/TSVC/Reductions-dbl/Reductions-dbl -29.5708% +/- 7.09058%
	  MultiSource/Benchmarks/TSVC/Reductions-flt/Reductions-flt -34.9471% +/- 11.4391%
	  SingleSource/Benchmarks/BenchmarkGame/puzzle -25.1347% +/- 11.0104%
	  SingleSource/Benchmarks/Misc/flops-8 -17.7297% +/- 9.79061%
	  SingleSource/Benchmarks/Shootout-C++/ary3 -35.5018% +/- 23.9458%
	  SingleSource/Regression/C/uint64_to_float -56.3165% +/- 25.4234%
	  SingleSource/UnitTests/Vectorizer/gcc-loops -18.5309% +/- 6.8496%
	regressions:
	  MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000 18.351% +/- 12.156%
	  SingleSource/Benchmarks/Shootout-C++/methcall 27.3086% +/- 14.4733%
	llvm-svn: 197099
*	Fix the PPC subsumes-predicate check	Hal Finkel	2013-12-11	1	-0/+65
\| \| \| \| \| \| \| \| \|	For one predicate to subsume another, they must both check the same condition register. Failure to check this prerequisite was causing miscompiles. Fixes PR18003. llvm-svn: 197089
*	Merge all tls tests to two files. One for normal codegen (initial and local	Roman Divacky	2013-12-11	6	-95/+73
\| \| \| \| \| \|	exec) and one for PIC codegen (local and general dynamic). llvm-svn: 197081
*	Remove test that's testing the same thing as tls.ll.	Roman Divacky	2013-12-11	1	-15/+0
\| \| \| \|	llvm-svn: 197074
*	on darwin<10, fallback to .weak_definition (PPC,X86)	David Fang	2013-12-10	1	-0/+38
\| \| \| \| \| \|	.weak_def_can_be_hidden was not yet supported by the system assembler llvm-svn: 196970
*	Correct word hyphenations	Alp Toker	2013-12-05	1	-1/+1
\| \| \| \| \| \| \|	This patch tries to avoid unrelated changes other than fixing a few hyphen-related ambiguities and contractions in nearby lines. llvm-svn: 196471
*	Convert a PPC test from grep to FileCheck	Hal Finkel	2013-11-30	1	-8/+27
\| \| \| \| \| \| \| \|	Convert this test to FileCheck, and improve it to check for the instructions it is trying to exclude instead of checking for register use (especially because grepping for r1 can be thrown off, for example, by a use of r12). llvm-svn: 195979
*	Desensitize a couple of PPC regression tests	Hal Finkel	2013-11-30	2	-6/+6
\| \| \| \| \| \| \|	Use CHECK-DAG to make these regression tests more resilient against changes in instruction scheduling. llvm-svn: 195978
*	Update the cpu specified on some PPC regression tests	Hal Finkel	2013-11-30	11	-12/+12
	Some of these tests did not specify a cpu but were also sensitive to instruction scheduling and/or register assignment choices. A few other similarly sensitive tests specified a cpu (often the POWER7), and while the P7 currently uses the default model for PPC64, this will soon change. For those tests which should not really be cpu-dependent anyway, the cpu is set to the generic 'ppc64'.
	llvm-svn: 195977
*	Debug Info: update testing cases to specify the debug info version number.	Manman Ren	2013-11-22	3	-2/+6
\| \| \| \| \| \| \| \|	We are going to drop debug info without a version number or with a different version number, to make sure we don't crash when we see bitcode files with different debug info metadata format. llvm-svn: 195504
*	PPC popcnt[dw] do not have record forms	Hal Finkel	2013-11-20	1	-0/+16
\| \| \| \| \| \| \|	The instruction definitions incorrectly specified that popcntd and popcntw have record forms; they do not. This mistake was causing invalid code generation. llvm-svn: 195272
*	PPC: Optimize rldicl generation for masked shifts	Hal Finkel	2013-11-20	1	-0/+16
	Masking operations (where only some number of the low bits are being kept) are selected to rldicl(x, 0, mb). If x is a logical right shift (which would become rldicl(y, 64-n, n)), we might be able to fold the two instructions together:
	  rldicl(rldicl(x, 64-n, n), 0, mb) -> rldicl(x, 64-n, mb) for n <= mb
	The right shift is really a left rotate followed by a mask, and if the explicit mask is a more-restrictive sub-mask of the mask implied by the shift, only one rldicl is needed.
	llvm-svn: 195185
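The fold described in this commit message can be checked with a small model. The following is a minimal sketch in Python, not code from the commit: `rotl64`, `rldicl`, `two_step` and `folded` are hypothetical helper names, and `rldicl` is modeled as "rotate left by sh, then clear the mb high-order bits".

```python
MASK64 = (1 << 64) - 1

def rotl64(x, sh):
    """Rotate a 64-bit value left by sh bits (sh is taken modulo 64)."""
    sh %= 64
    return ((x << sh) | (x >> (64 - sh))) & MASK64

def rldicl(x, sh, mb):
    """Model of rldicl: rotate left by sh, then clear the mb high-order bits."""
    return rotl64(x, sh) & (MASK64 >> mb)

def two_step(x, n, mb):
    """Logical right shift by n followed by a separate low-bit mask."""
    return rldicl(rldicl(x, 64 - n, n), 0, mb)

def folded(x, n, mb):
    """The single-instruction form; only equivalent when n <= mb."""
    return rldicl(x, 64 - n, mb)

if __name__ == "__main__":
    x = 0x0123456789ABCDEF
    print(hex(two_step(x, 8, 40)), hex(folded(x, 8, 40)))
```

The `n <= mb` condition matters: when the explicit mask clears fewer bits than the shift does, the folded form keeps rotated-in bits that the two-step form would have cleared.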
*	Avoid illegal integer promotion in fastisel	Bob Wilson	2013-11-15	1	-0/+17
	Stop folding constant adds into GEP when the type size doesn't match. Otherwise, the adds' operands are effectively being promoted, changing the conditions of an overflow. Results are different when:
	  sext(a) + sext(b) != sext(a + b)
	Problem originally found on x86-64, but also fixed issues with ARM and PPC, which used similar code. <rdar://problem/15292280>
	Patch by Duncan Exon Smith!
	llvm-svn: 194840
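A small numeric illustration of the inequality in this commit message, as a sketch only. The 8-bit narrow width and the helper names `wrap`, `add_then_extend` and `extend_then_add` are assumptions chosen for readability; the commit does not name a specific width.

```python
def wrap(value, bits):
    """Interpret value modulo 2**bits as a signed two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def add_then_extend(a, b, narrow=8, wide=64):
    """sext(a + b): the add happens in the narrow type and may wrap."""
    return wrap(wrap(a + b, narrow), wide)

def extend_then_add(a, b, narrow=8, wide=64):
    """sext(a) + sext(b): the operands are promoted first, so no narrow wrap."""
    return wrap(wrap(a, narrow) + wrap(b, narrow), wide)

if __name__ == "__main__":
    print(add_then_extend(100, 100))  # narrow add overflows
    print(extend_then_add(100, 100))  # promoted add does not
```

With a = b = 100 in an 8-bit type, the narrow add wraps to -56 while the promoted add yields 200, which is the kind of mismatch the fold could introduce.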
*	Error if we see an alias to a declaration.	Rafael Espindola	2013-11-14	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In ELF and COFF an alias is just another offset in a section. There is no way to represent an alias to something in another file. In MachO, the spec has the N_INDR type which should allow for exactly that, but is not currently implemented. Given that it is specified but not implemented, we error in codegen to avoid miscompiling but don't reject aliases to declarations in the verifier to leave the option open of implementing it. In the past we have used alias to declarations as a way of implementing weakref, which is why it exists in some old tests which this patch updates. llvm-svn: 194705
*	Add PPC option for full register names in asm	Hal Finkel	2013-11-11	1	-0/+17
	On non-Darwin PPC systems, we currently strip off the register name prefix prior to instruction printing. So instead of something like this:
	  mr r3, r4
	we print this:
	  mr 3, 4
	The first form is the default on Darwin, and is understood by binutils, but not yet understood by our integrated assembler. Once our integrated-as understands full register names as well, this temporary option will be replaced by tying this functionality to the verbose-asm option. The numeric-only form is compatible with legacy assemblers and tools, and is also gcc's default on most PPC systems. On the other hand, it is harder to read, and there are some analysis tools that expect full register names.
	llvm-svn: 194384
*	Convert another llc -filetype=obj test.	Rafael Espindola	2013-10-28	1	-29/+0
\| \| \| \|	llvm-svn: 193548
*	Convert another llc -filetype=obj test.	Rafael Espindola	2013-10-28	1	-34/+0
\| \| \| \|	llvm-svn: 193547
*	Convert another llc -filetype=obj test.	Rafael Espindola	2013-10-28	1	-31/+0
\| \| \| \|	llvm-svn: 193546
*	Update PPC loop tests after SCEV non-unit-stride checkin r193015.	Andrew Trick	2013-10-19	2	-24/+12
\| \| \| \|	llvm-svn: 193021
*	[PATCH] Fix PR17168 (DAG scheduler inserts DBG_VALUE before PHI with fast-isel)	Bill Schmidt	2013-10-18	1	-0/+520
	PR17168 describes a test case that fails when compiling for debug with fast-isel. Investigation showed that the test was failing because a DBG_VALUE machine instruction was placed prior to a PHI. For this problem to occur requires the following:
	* Compile for debug
	* Compile with fast-isel
	* In a block B, fast-isel must partially succeed before punting to DAG-isel
	* B must start with a PHI
	* The first unhandled node in the DAG must not generate a machine instruction
	* A debug value with an order less than that of that first node exists
	When all of these circumstances apply, the existing test that an instruction was not inserted won't fire. Currently it tests whether the block is empty, or whether the last instruction generated is a phi. When fast-isel has partially succeeded, the last instruction generated will not be a phi. Instead, we need to check whether the current insert position is immediately following a phi. This patch adds that check, and adds the test case from the PR as a regression test.
	llvm-svn: 192976
*	Replace sra with srl if a single sign bit is required	Richard Sandiford	2013-10-17	1	-5/+4
\| \| \| \| \| \|	E.g. (and (sra (i32 x) 31) 2) -> (and (srl (i32 x) 30) 2). llvm-svn: 192884
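The example in this commit message can be checked on 32-bit values. This is a minimal sketch, not the DAG combine itself; `srl32` and `sra32` are hypothetical helpers that model logical and arithmetic right shifts on an i32.

```python
def srl32(x, n):
    """Logical right shift of a 32-bit value."""
    return (x & 0xFFFFFFFF) >> n

def sra32(x, n):
    """Arithmetic right shift of a 32-bit value, result as an unsigned i32."""
    x &= 0xFFFFFFFF
    if x & 0x80000000:
        x -= 1 << 32  # reinterpret as a negative two's-complement value
    return (x >> n) & 0xFFFFFFFF

if __name__ == "__main__":
    for x in (0x00000000, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF):
        # Both sides keep only bit 1, which holds a copy of the sign bit.
        print(hex(x), sra32(x, 31) & 2, srl32(x, 30) & 2)
```

The rewrite relies on the mask selecting a single bit: with a mask of 3 instead of 2, the two forms differ, because the logical shift also brings bit 30 of x into bit 0.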
*	TBAA: update tbaa format from scalar format to struct-path aware format.	Manman Ren	2013-09-30	1	-6/+8
\| \| \| \|	llvm-svn: 191690
*	TBAA: remove !tbaa from testing cases when they are not needed.	Manman Ren	2013-09-30	3	-29/+17
\| \| \| \|	llvm-svn: 191689
*	Fix spelling intruction -> instruction.	Robert Wilhelm	2013-09-28	1	-1/+1
\| \| \| \|	llvm-svn: 191610
*	[PowerPC] Fix PR17354: Generate nop after local calls for PIC code.	Bill Schmidt	2013-09-26	1	-0/+39
\| \| \| \| \| \| \| \|	When generating code for shared libraries, even local calls may be intercepted, so we need a nop after the call for the linker to fix up the TOC. Test case adapted from the one provided in PR17354. llvm-svn: 191440
*	[PowerPC] Fix problems with large code model (PR17169).	Bill Schmidt	2013-09-17	3	-8/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Large code model on PPC64 requires creating and referencing TOC entries when using the addis/ld form of addressing. This was not being done in all cases. The changes in this patch to PPCAsmPrinter::EmitInstruction() fix this. Two test cases are also modified to reflect this requirement. Fast-isel was not creating correct code for loading floating-point constants using large code model. This also requires the addis/ld form of addressing. Previously we were using the addis/lfd shortcut which is only applicable to medium code model. One test case is modified to reflect this requirement. llvm-svn: 190882
*	PPC: Don't restrict lvsl generation to after type legalization	Hal Finkel	2013-09-15	1	-0/+166
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a re-commit of r190764, with an extra check to make sure that we're not performing the transformation on illegal types (a small test case has been added for this as well). Original commit message: The PPC backend uses a target-specific DAG combine to turn unaligned Altivec loads into a permutation-based sequence when possible. Unfortunately, the target-specific DAG combine is not always called on all loads of interest (sometimes the routines in DAGCombine call CombineTo such that the new node and users are not added to the worklist); allowing the combine to trigger early (before type legalization) mitigates this problem. Because the autovectorizers only create legal vector types, I don't expect a lot of cases where this optimization is enabled by type legalization in practice. llvm-svn: 190771
*	Revert r190764: PPC: Don't restrict lvsl generation to after type legalization	Hal Finkel	2013-09-15	1	-154/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is causing test-suite failures. Original commit message: The PPC backend uses a target-specific DAG combine to turn unaligned Altivec loads into a permutation-based sequence when possible. Unfortunately, the target-specific DAG combine is not always called on all loads of interest (sometimes the routines in DAGCombine call CombineTo such that the new node and users are not added to the worklist); allowing the combine to trigger early (before type legalization) mitigates this problem. Because the autovectorizers only create legal vector types, I don't expect a lot of cases where this optimization is enabled by type legalization in practice. llvm-svn: 190765
*	PPC: Don't restrict lvsl generation to after type legalization	Hal Finkel	2013-09-15	1	-0/+154
\| \| \| \| \| \| \| \| \| \| \| \| \|	The PPC backend uses a target-specific DAG combine to turn unaligned Altivec loads into a permutation-based sequence when possible. Unfortunately, the target-specific DAG combine is not always called on all loads of interest (sometimes the routines in DAGCombine call CombineTo such that the new node and users are not added to the worklist); allowing the combine to trigger early (before type legalization) mitigates this problem. Because the autovectorizers only create legal vector types, I don't expect a lot of cases where this optimization is enabled by type legalization in practice. llvm-svn: 190764
*	Prevent assert in CombinerGlobalAA with null values	Hal Finkel	2013-09-15	1	-0/+137
\| \| \| \| \| \| \|	DAGCombiner::isAlias can be called with SrcValue1 or SrcValue2 null, and we can't use AA in this case (if we try, then the casting code in AA will assert). llvm-svn: 190763
*	Remove unnecessary TBAA metadata from r190636's test case	Hal Finkel	2013-09-12	1	-13/+9
\| \| \| \|	llvm-svn: 190637
*	Fix PPC ABI for ByVal structs with vector members	Hal Finkel	2013-09-12	1	-0/+64
\| \| \| \| \| \| \| \| \| \|	When a structure is passed by value, and that structure contains a vector member, according to the PPC ABI, the structure will receive enhanced alignment (so that the vector within the structure will always be aligned). This should resolve PR16641. llvm-svn: 190636
*	Make the PPC fast-math sqrt expansion safe at 0	Hal Finkel	2013-09-12	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \|	In fast-math mode sqrt(x) is calculated using the fast expansion of the reciprocal of the reciprocal sqrt expansion. The reciprocal and reciprocal sqrt expansions use the associated estimate instructions along with some Newton iterations. Unfortunately, as a result, sqrt(0) was being calculated as NaN, which is not correct. Now we explicitly return a result of zero if the input is zero. llvm-svn: 190624
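Why the unguarded expansion produces NaN at zero can be seen in a numeric model. The sketch below is an assumption-laden illustration, not the backend code: the hardware estimate instructions are replaced by hypothetical stand-ins (`rsqrt_estimate`, `recip_estimate`) that are about 1% off, the reciprocal-sqrt estimate of 0 is taken to be +inf, and two Newton steps are used for each expansion.

```python
import math

def rsqrt_estimate(x):
    """Stand-in for a reciprocal-sqrt estimate: about 1% off, +inf at 0."""
    return math.inf if x == 0.0 else 1.01 / math.sqrt(x)

def recip_estimate(y):
    """Stand-in for a reciprocal estimate: about 1% off."""
    return 1.01 / y

def fast_rsqrt(x, steps=2):
    """Refine the 1/sqrt(x) estimate with Newton iterations."""
    y = rsqrt_estimate(x)
    for _ in range(steps):
        y = y * (1.5 - 0.5 * x * y * y)  # at x == 0 this computes 0 * inf -> NaN
    return y

def fast_recip(y, steps=2):
    """Refine the 1/y estimate with Newton iterations."""
    r = recip_estimate(y)
    for _ in range(steps):
        r = r * (2.0 - y * r)
    return r

def fast_sqrt_unguarded(x):
    """sqrt(x) as the reciprocal of the reciprocal sqrt, with no zero check."""
    return fast_recip(fast_rsqrt(x))

def fast_sqrt_guarded(x):
    """Same expansion, but explicitly returning zero for a zero input."""
    return 0.0 if x == 0.0 else fast_sqrt_unguarded(x)

if __name__ == "__main__":
    print(fast_sqrt_unguarded(0.0), fast_sqrt_guarded(0.0), fast_sqrt_guarded(2.0))
```

In this model the NaN comes from the first Newton step multiplying zero by infinity; once it appears, every later step propagates it, so the only remedy is to select zero explicitly for a zero input.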
*	PPC: Enable aggressive anti-dependency breaking	Hal Finkel	2013-09-12	2	-14/+15
	Aggressive anti-dependency breaking is enabled by default for all PPC cores. This provides a general speedup on the P7 and other platforms (among other factors, the instruction group formation for the non-embedded PPC cores is done during post-RA scheduling). In order to do this safely, the incompatibility between uses of the MFOCRF instruction and anti-dependency breaking is resolved by marking MFOCRF with hasExtraSrcRegAllocReq. As noted in the removed FIXME, the problem was that MFOCRF's output is sensitive to the identity of the source register, and always paired with a shift to undo this effect. Because anti-dependency breaking is unaware of this hidden dependency of the shift amount on the source register of the MFOCRF instruction, changing that register must be inhibited. Two test cases were adjusted: The SjLj test was made more insensitive to register choices and scheduling; the saveCR test disabled anti-dependency breaking because part of what it is testing is proper register reuse.
	llvm-svn: 190587
*	Debug Info Testing: updated to use NULL instead of "i32 0" in a few fields.	Manman Ren	2013-09-06	2	-2/+2
\| \| \| \| \| \| \| \|	Field 2 of DIType (Context), field 9 of DIDerivedType (TypeDerivedFrom), field 12 of DICompositeType (ContainingType), fields 2, 7, 12 of DISubprogram (Context, Type, ContainingType). llvm-svn: 190205
*	[PowerPC] Call support for fast-isel.	Bill Schmidt	2013-08-30	2	-0/+166
\| \| \| \| \| \| \| \| \|	This patch adds fast-isel support for calls (but not intrinsic calls or varargs calls). It also removes a badly-formed assert. There are some new tests just for calls, and also for folding loads into arguments on calls to avoid extra extends. llvm-svn: 189701
*	[PowerPC] Add handling for conversions to fast-isel.	Bill Schmidt	2013-08-30	1	-0/+305
\| \| \| \| \| \| \| \| \|	Yet another chunk of fast-isel code. This one handles various conversions involving floating-point. (It also includes some miscellaneous handling throughout the back end for LWA_32 and LWAX_32 that should have been part of the load-store patch.) llvm-svn: 189677
*	[PowerPC] Handle selection of compare instructions in fast-isel.	Bill Schmidt	2013-08-30	1	-0/+289
\| \| \| \| \| \| \|	Mostly trivial patch adding support for compares. The meat of the work was added with the branch support. llvm-svn: 189639
*	[PowerPC] Miscellaneous fast-isel test cases.	Bill Schmidt	2013-08-30	4	-0/+131
\| \| \| \| \| \| \|	Here are a few more tests that now pass after the recent fast-isel commits. llvm-svn: 189637
*	[PowerPC] Add loads, stores, and related things to fast-isel.	Bill Schmidt	2013-08-30	3	-0/+434
	This is the next big chunk of fast-isel code. The primary purpose is to implement selection of loads and stores, but there is a lot of drag-along to support this. The common code to analyze addresses for both loads and stores is substantial. It's also necessary to add the materialization code for global values. Related to load-store processing is the code to fold loads into integer extends, since otherwise we generate lots of redundant instructions. We also need to add some overrides to some FastEmit routines to ensure we don't assign GPR 0 to a virtual register when this would change the meaning of an instruction. I added handling selection of a few binary arithmetic instructions, to enable committing some test cases I wrote a while back. Finally, a couple of miscellaneous changes:
	* I cleaned up some poor style from a previous patch in PPCISelLowering.cpp, pointed out by David Blaikie.
	* I enlarged the Addr.Offset field to avoid sign problems with 32-bit offsets.
	llvm-svn: 189636
*	Debug Info: add an identifier field to DICompositeType.	Manman Ren	2013-08-26	2	-2/+2
	DICompositeType will have an identifier field at position 14. For now, the field is set to null in DIBuilder. For DICompositeTypes where the template argument field (the 13th field) was optional, modify DIBuilder to make sure the template argument field is set. Now DICompositeType has 15 fields. Update DIBuilder to use NULL instead of "i32 0" for null value of a MDNode. Update verifier to check that DICompositeType has 15 fields and the last field is null or a MDString. Update testing cases to include an extra field for DICompositeType. The identifier field will be used by type uniquing so a front end can generate a DICompositeType with a unique identifier.
	llvm-svn: 189282
*	[PowerPC] More fast-isel chunks (returns and integer extends)	Bill Schmidt	2013-08-26	2	-0/+217
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Incremental improvement to fast-isel for PPC64. This allows us to select on ret, sext, and zext. Filling in sext/zext improves some of the existing logic in handling compare-immediates that needed extends. A simplified return convention for fast-isel is also added to the PPC64 calling conventions. All call/return processing for DAG selection is handled with custom code, so there isn't an existing CC to rely on here. The include of PPCGenCallingConv.inc causes compiler warnings due to the 32-bit calling conventions that are not used, so the dummy function "usePPC32CCs()" is added here to silence those. Test cases for the return and extend logic are added. llvm-svn: 189266
*	[PowerPC] Add fast-isel branch and compare selection.	Bill Schmidt	2013-08-25	2	-0/+58
\| \| \| \| \| \| \| \| \| \| \|	First chunk of actual fast-isel selection code. This handles direct and indirect branches, as well as feeding compares for direct branches. PPCFastISel::PPCEmitIntExt() is just roughed in and will be expanded in a future patch. This also corrects a problem with selection for constant pool entries in JIT mode or with small code model. llvm-svn: 189202
*	Update to remove the no-frame-pointer-elim-non-leaf flag if it was set to 'false'.	Bill Wendling	2013-08-22	12	-14/+14
	llvm-svn: 189068
*	TBAA: remove !tbaa from testing cases when they are not needed.	Manman Ren	2013-08-21	4	-99/+70
\| \| \| \| \| \| \|	This will make it easier to turn on struct-path aware TBAA since the metadata format will change. llvm-svn: 188944
*	Don't form PPC CTR-based loops around a copysignl call	Hal Finkel	2013-08-19	1	-0/+28
	copysign/copysignf never become function calls (because the SDAG expansion code does not lower to the corresponding function call, but rather directly implements the associated logic), but copysignl almost always is lowered into a call to the requested libm function (and, thus, might clobber CTR).
	llvm-svn: 188727
*	Add ExpandFloatOp_FCOPYSIGN to handle ppcf128-related expansions	Hal Finkel	2013-08-19	1	-0/+67
\| \| \| \| \| \| \| \| \| \|	We had previously been asserting when faced with a FCOPYSIGN f64, ppcf128 node because there was no way to expand the FCOPYSIGN node. Because ppcf128 is the sum of two doubles, and the first double must have the larger magnitude, we can take the sign from the first double. As a result, in addition to fixing the crash, this is also an optimization. llvm-svn: 188655
*	Add the PPC fcpsgn instruction	Hal Finkel	2013-08-19	1	-0/+52
\| \| \| \| \| \| \| \| \|	Modern PPC cores support a floating-point copysign instruction, and we can use this to lower the FCOPYSIGN node (which is created from calls to the libm copysign function). A couple of extra patterns are necessary because the operand types of FCOPYSIGN need not agree. llvm-svn: 188653