bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Debug info: Factor out the creation of DWARF expressions from AsmPrinter	Adrian Prantl	2015-01-12	3	-8/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	into a new class DwarfExpression that can be shared between AsmPrinter and DwarfUnit. This is the first step towards unifying the two entirely redundant implementations of dwarf expression emission in DwarfUnit and AsmPrinter. Almost no functional change — Testcases were updated because asm comments that used to be on two lines now appear on the same line, which is actually preferable. llvm-svn: 225706
*	[X86] Also create+widen FMIN/FMAX nodes for v2f32.	Ahmed Bougacha	2015-01-12	1	-0/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This happens in the HINT benchmark, where the SLP-vectorizer created v2f32 fcmp/select code. The "correct" solution would have been to teach the vectorizer cost model that v2f32 isn't legal (because really, it isn't), but if we can vectorize we might as well do so. We legalize these v2f32 FMIN/FMAX nodes by widening to v4f32 later on. v3f32 were already widened to v4f32 by the generic unroll-and-build-vector legalization. rdar://15763436 Differential Revision: http://reviews.llvm.org/D6557 llvm-svn: 225691
*	[X86] Make SSE min/max testcases more explicit. NFC.	Ahmed Bougacha	2015-01-12	1	-32/+84
\| \| \| \|	llvm-svn: 225687
*	R600/SI: Use RegisterOperands to specify which operands can accept immediates	Tom Stellard	2015-01-12	5	-18/+18
\| \| \| \| \| \| \| \| \| \| \| \|	There are some operands which can take either immediates or registers and we were previously using different register class to distinguish between operands that could take immediates and those that could not. This patch switches to using RegisterOperands which should simplify the backend by reducing the number of register classes and also make it easier to implement the assembler. llvm-svn: 225662
*	GVN: propagate equalities for floating point compares	Sanjay Patel	2015-01-12	2	-0/+83
\| \| \| \| \| \| \| \| \| \| \| \|	Allow optimizations based on FP comparison values in the same way as integers. This resolves PR17713: http://llvm.org/bugs/show_bug.cgi?id=17713 Differential Revision: http://reviews.llvm.org/D6911 llvm-svn: 225660
*	Add r224985 back with two fixes.	Rafael Espindola	2015-01-12	3	-0/+124
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	One is that AArch64 has additional restrictions on when local relocations can be used. We have to take those into consideration when deciding to put an L symbol in the symbol table or not. The other is that ld64 requires the relocations to cstring to use linker visible symbols on AArch64. Thanks to Michael Zolotukhin for testing this! Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocations with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 225644
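The ambiguity described above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the MachO or ELF writer code: `sectionOffset` and `needsSymbol` are hypothetical names, the section is modelled as NUL-terminated strings stored back to back, and the AArch64 and ld64 restrictions mentioned in the message are ignored.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: a section that stores NUL-terminated strings back to
// back. Returns the section-relative offset of (string #Index + Addend).
unsigned sectionOffset(const std::vector<std::string> &Strings, unsigned Index,
                       unsigned Addend) {
  assert(Index < Strings.size() && "string index out of range");
  unsigned Offset = 0;
  for (unsigned I = 0; I < Index; ++I)
    Offset += static_cast<unsigned>(Strings[I].size()) + 1; // +1 for the NUL
  return Offset + Addend;
}

// Simplified version of the rule quoted in the message: in a section that is
// merged by content, keep the symbol whenever the addend is non-zero, because
// a bare section offset cannot say which entry the reference belongs to.
bool needsSymbol(bool MergedByContent, long Addend) {
  return MergedByContent && Addend != 0;
}
```

With the strings "ab" and "cd", the position one past the end of the first string and the start of the second are the same section offset (3), so a linker that merges or reorders entries cannot tell the two references apart without the symbol.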
*	[mips][microMIPS] Implement BEQZ16 and BNEZ16 instructions	Jozef Kolek	2015-01-12	6	-0/+59
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D5271 llvm-svn: 225627
*	Put this test's input in the Inputs directory where it belongs, rather than	Richard Smith	2015-01-12	2	-2/+2
\| \| \| \| \| \|	reusing a file from a different test directory. llvm-svn: 225621
*	[PowerPC] Fix calls to non-function objects	Hal Finkel	2015-01-12	1	-0/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Looking at r225438 inspired me to see how the PowerPC backend handled the situation (calling a bitcasted TLS global), and it turns out we also produced an error (cannot select ...). What it means to "call" something that is not a function is implementation and platform specific, but in the name of doing something (besides crashing), this makes sure we do what GCC does (treat all such calls as calls through a function pointer -- meaning that the pointer is assumed, as is the convention on PPC, to point to a function descriptor structure holding the actual code address along with the function's TOC pointer and environment pointer). As GCC does, we now do the same for calling regular (non-TLS) non-function globals too. I'm not sure whether this is the most useful way to define the behavior, but at least we won't be alone. llvm-svn: 225617
*	Revert most of r225597	David Majnemer	2015-01-11	1	-1/+1
\| \| \| \| \| \|	We can't rely on a DataLayout enlightened constant folder. llvm-svn: 225599
*	X86: Properly decode shuffle masks when the constant pool type is weird	David Majnemer	2015-01-11	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	It's possible for the constant pool entry for the shuffle mask to come from a completely different operation. This occurs when Constants have the same bit pattern but have different types. Make DecodePSHUFBMask tolerant of types which, after a bitcast, are appropriately sized vector types. This fixes PR22188. llvm-svn: 225597
*	X86: teach X86TargetLowering about L,M,O constraints	Saleem Abdulrasool	2015-01-11	1	-0/+34
\| \| \| \| \| \| \| \| \|	Teach the ISelLowering for X86 about the L,M,O target specific constraints. Although, for the moment, clang performs constraint validation and prevents passing along inline asm which may have immediate constant constraints violated, the backend should be able to cope with the invalid inline asm a bit better. llvm-svn: 225596
*	ARM: add support for segment base relocations (SBREL)	Saleem Abdulrasool	2015-01-11	2	-0/+37
\| \| \| \| \| \| \| \|	This adds support for parsing and emitting the SBREL relocation variant for the ARM target. Handling this relocation variant is necessary for supporting the full ARM ELF specification. Addresses PR22128. llvm-svn: 225595
*	[x86] Remove some windows line endings that snuck into the tests here.	Chandler Carruth	2015-01-11	8	-463/+463
\| \| \| \| \| \| \|	Folks on Windows, remember to set up your subversion to strip these when submitting... llvm-svn: 225593
*	Fix PR22179.	Sanjoy Das	2015-01-10	3	-2/+30
\| \| \| \| \| \| \| \| \| \| \|	We were incorrectly inferring nsw for certain SCEVs. We can be more aggressive here (see Richard Smith's comment on http://llvm.org/bugs/show_bug.cgi?id=22179) but this change just focuses on correctness. Differential Revision: http://reviews.llvm.org/D6914 llvm-svn: 225591
*	[X86][SSE] Improved (v)insertps shuffle matching	Simon Pilgrim	2015-01-10	3	-19/+29
\| \| \| \| \| \| \| \| \| \| \| \|	In the current code we only attempt to match against insertps if we have exactly one element from the second input vector, irrespective of how much of the shuffle result is zeroable. This patch checks to see if there is a single non-zeroable element from either input that requires insertion. It also supports matching of cases where only one of the inputs needs to be referenced. We also split insertps shuffle matching off into a new lowerVectorShuffleAsInsertPS function. Differential Revision: http://reviews.llvm.org/D6879 llvm-svn: 225589
*	[PowerPC] Mark zext of a small scalar load as free	Hal Finkel	2015-01-10	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \|	This initial implementation of PPCTargetLowering::isZExtFree marks as free zexts of small scalar loads (that are not sign-extending). This callback is used by SelectionDAGBuilder's RegsForValue::getCopyToRegs, and thus to determine whether a zext or an anyext is used to lower illegally-typed PHIs. Because later truncates of zero-extended values are nops, this allows for the elimination of later unnecessary truncations. Fixes the initial complaint associated with PR22120. llvm-svn: 225584
*	tests: fix previous commit	Saleem Abdulrasool	2015-01-10	1	-10/+6
\| \| \| \| \| \| \|	The previous commit accidentally missed changes to the test output checking, resulting in an errant failure. llvm-svn: 225577
*	test: merge ARM relocations test	Saleem Abdulrasool	2015-01-10	2	-17/+15
\| \| \| \| \| \| \| \|	There is a fair number of relocations that are part of the AAELF specification. Simply merge the tests into a single test file, otherwise, we will end up with far too many test files to test each relocation type. NFC. llvm-svn: 225576
*	tests: convert a couple of ARM relocation tests to readobj	Saleem Abdulrasool	2015-01-10	2	-8/+16
\| \| \| \| \| \| \|	These tests are checking the relocation generation. Use the readobj output as it is much easier to follow when glancing over the tests. llvm-svn: 225575
*	Fully fix Bug #22115.	Justin Hibbits	2015-01-10	2	-5/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In the previous commit, the register was saved, but space was not allocated. This resulted in the parameter save area potentially clobbering r30, leading to nasty results. Test Plan: Tests updated Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6906 llvm-svn: 225573
*	[PowerPC] Readjust the loop unrolling threshold	Hal Finkel	2015-01-10	1	-2/+50
\| \| \| \| \| \| \| \|	Now that the way that the partial unrolling threshold for small loops is used to compute the unrolling factor has been corrected, a slightly smaller threshold is preferable. This is expected; other targets may need to re-tune as well. llvm-svn: 225566
*	[LoopUnroll] Fix the partial unrolling threshold for small loop sizes	Hal Finkel	2015-01-10	2	-2/+19
\| \| \| \| \| \| \| \| \| \| \| \| \|	When we compute the size of a loop, we include the branch on the backedge and the comparison feeding the conditional branch. Under normal circumstances, these don't get replicated with the rest of the loop body when we unroll. This led to the somewhat surprising behavior that really small loops would not get unrolled enough -- they could be unrolled more and the resulting loop would be below the threshold, because we were assuming they'd take (LoopSize * UnrollingFactor) instructions after unrolling, instead of (((LoopSize-2) * UnrollingFactor)+2) instructions. This fixes that computation. llvm-svn: 225565
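The two size estimates in the message can be compared directly. The sketch below is not the LoopUnroll pass itself. The function names are hypothetical, and the constant 2 stands for the backedge branch and the compare feeding it, as described above.

```cpp
#include <cassert>

// Instructions that are not replicated when a loop is unrolled: the branch on
// the backedge and the comparison feeding it (per the commit message).
constexpr unsigned kLoopOverhead = 2;

// Old estimate: every instruction in the loop is replicated.
constexpr unsigned naiveUnrolledSize(unsigned LoopSize, unsigned Factor) {
  return LoopSize * Factor;
}

// Corrected estimate: ((LoopSize - 2) * UnrollingFactor) + 2.
constexpr unsigned unrolledSize(unsigned LoopSize, unsigned Factor) {
  return (LoopSize - kLoopOverhead) * Factor + kLoopOverhead;
}

// Hypothetical helper: the largest factor whose corrected size estimate still
// fits within Threshold. Returns 1 (no unrolling) if the loop is already too big.
unsigned maxUnrollFactor(unsigned LoopSize, unsigned Threshold) {
  assert(LoopSize > kLoopOverhead && "loop needs a body besides branch+compare");
  if (Threshold < LoopSize)
    return 1;
  return (Threshold - kLoopOverhead) / (LoopSize - kLoopOverhead);
}
```

For a 4-instruction loop and a threshold of 16, the old estimate allows a factor of 4 (4 * 4 = 16), while the corrected one allows a factor of 7 ((4 - 2) * 7 + 2 = 16).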
*	Use the DiagnosticHandler to print diagnostics when reading bitcode.	Rafael Espindola	2015-01-10	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The bitcode reading interface used std::error_code to report an error to the callers and it is the caller's job to print diagnostics. This is not ideal for error handling or diagnostic reporting: * For error handling, all that the callers care about is 3 possibilities: * It worked * The bitcode file is corrupted/invalid. * The file is not bitcode at all. * For diagnostics, it is user friendly to include far more information about the invalid case so the user can find out what is wrong with the bitcode file. This comes up, for example, when a developer introduces a bug while extending the format. The compromise we had was to have a lot of error codes. With this patch we use the DiagnosticHandler to communicate with the human and std::error_code to communicate with the caller. This allows us to have far fewer error codes and adds the infrastructure to print better diagnostics. This is so because the diagnostics are printed when the issue is found. The code that detected the problem is alive in the stack and can pass down as much context as needed. As an example the patch updates test/Bitcode/invalid.ll. Using a DiagnosticHandler also moves the fatal/non-fatal error decision to the caller. A simple one like llvm-dis can just use fatal errors. The gold plugin needs a bit more complex treatment because of being passed non-bitcode files. A hypothetical interactive tool would make all bitcode errors non-fatal. llvm-svn: 225562
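The split between "a coarse code for the caller" and "a detailed message for the human" can be sketched with the standard library alone. This is an illustration of the pattern, not LLVM's bitcode reader or its DiagnosticHandler API: the `DiagHandler` alias, the `readToyFile` function, the 'TOY' file format and the choice of `std::errc` values are all assumptions made up for the example.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <system_error>

// Hypothetical stand-in for a diagnostic handler: receives the human-readable
// message at the point where the problem is detected.
using DiagHandler = std::function<void(const std::string &)>;

// Hypothetical reader for a toy format: the magic "TOY", one length byte, then
// exactly that many payload bytes. The caller only sees three outcomes:
// success, "not this format at all", or "corrupted".
std::error_code readToyFile(const std::string &Buf, const DiagHandler &Diag) {
  assert(Diag && "a diagnostic handler must be provided");
  if (Buf.compare(0, 3, "TOY") != 0) {
    Diag("input does not start with the 'TOY' magic");
    return std::make_error_code(std::errc::invalid_argument);
  }
  if (Buf.size() < 4) {
    Diag("truncated header: missing length byte");
    return std::make_error_code(std::errc::illegal_byte_sequence);
  }
  const std::size_t Declared = static_cast<unsigned char>(Buf[3]);
  const std::size_t Actual = Buf.size() - 4;
  if (Declared != Actual) {
    // The detecting code still has all the context, so the message can be
    // specific even though the returned code is coarse.
    Diag("payload length mismatch: header says " + std::to_string(Declared) +
         ", found " + std::to_string(Actual));
    return std::make_error_code(std::errc::illegal_byte_sequence);
  }
  return std::error_code();
}
```

Because the handler is supplied by the caller, a batch tool could pass one that aborts on the first message, while an interactive tool could collect the messages and continue.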
*	Disable Go bindings test under UBSan.	Alexey Samsonov	2015-01-09	1	-1/+1
\| \| \| \|	llvm-svn: 225557
*	Fix the JIT event listeners and replace the associated tests.	Andrew Kaylor	2015-01-09	6	-537/+222
\| \| \| \| \| \| \| \| \| \|	The changes to EventListenerCommon.h were contributed by Arch Robison. This fixes bug 22095. http://reviews.llvm.org/D6905 llvm-svn: 225554
*	SimplifyCFG: check uses of constant-foldable instrs in switch destinations ↵	Hans Wennborg	2015-01-09	1	-0/+40
\| \| \| \| \| \| \| \| \| \| \|	(PR20210) The previous code assumed that such instructions could not have any uses outside CaseDest, with the motivation that the instruction could not dominate CommonDest because CommonDest has phi nodes in it. That simply isn't true; e.g., CommonDest could have an edge back to itself. llvm-svn: 225552
*	[X86][SSE] Avoid vector byte shuffles with zero by using pshufb to create zeros	Simon Pilgrim	2015-01-09	1	-112/+76
\| \| \| \| \| \| \| \|	pshufb can shuffle in zero bytes as well as bytes from a source vector - we can use this to avoid having to shuffle 2 vectors and ORing the result when the used inputs from a vector are all zeroable. Differential Revision: http://reviews.llvm.org/D6878 llvm-svn: 225551
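The property this change relies on is that a pshufb control byte with its high bit set produces a zero byte instead of selecting from the source. A scalar model of the 128-bit form makes that concrete. The names `Bytes16`, `controlByte` and `pshufbModel` are hypothetical and are not taken from the X86 backend.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Bytes16 = std::array<std::uint8_t, 16>;

// Hypothetical helper: build one control byte. A negative Index requests a
// zero byte (high bit set); otherwise the low four bits select a source byte.
std::uint8_t controlByte(int Index) {
  assert(Index < 16 && "a 128-bit pshufb selects among 16 source bytes");
  return Index < 0 ? std::uint8_t{0x80} : static_cast<std::uint8_t>(Index);
}

// Scalar model of 128-bit pshufb: each output byte is either zero or a byte
// picked from Src, decided per lane by the control vector.
Bytes16 pshufbModel(const Bytes16 &Src, const Bytes16 &Control) {
  Bytes16 Out{};
  for (unsigned I = 0; I < 16; ++I)
    Out[I] = (Control[I] & 0x80) ? std::uint8_t{0} : Src[Control[I] & 0x0F];
  return Out;
}
```

A shuffle that interleaves bytes of one vector with zeros therefore needs a single pshufb on that vector, with the zero lanes encoded in the control bytes, rather than two shuffles and an OR.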
*	Add a testcase of llvm-lto error handling.	Rafael Espindola	2015-01-09	1	-0/+4
\| \| \| \|	llvm-svn: 225545
*	Add the option, -universal-headers, used with -macho to print the Mach-O ↵	Kevin Enderby	2015-01-09	1	-0/+19
\| \| \| \| \| \|	universal headers to llvm-objdump. llvm-svn: 225537
*	Re-reapply r221924: "[GVN] Perform Scalar PRE on gep indices that feed loads ↵	Tim Northover	2015-01-09	2	-7/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	before doing Load PRE" It's not really expected to stick around, last time it provoked a weird LTO build failure that I can't reproduce now, and the bot logs are long gone. I'll re-revert it if the failures recur. Original description: Perform Scalar PRE on gep indices that feed loads before doing Load PRE. llvm-svn: 225536
*	[mips] Add support for accessing $gp as a named register.	Daniel Sanders	2015-01-09	3	-0/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Mips Linux uses $gp to hold a pointer to thread info structure and accesses it with a named register. This makes this work for LLVM. The N32 ABI doesn't quite work yet since the frontend generates incorrect IR for this case. It neglects to truncate the 64-bit GPR to a 32-bit value before converting to a pointer. Given correct IR (as in the testcase in this patch), it works correctly. Reviewers: sstankovic, vmedic, atanasyan Reviewed By: atanasyan Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6893 llvm-svn: 225529
*	[PowerPC] Enable late partial unrolling on the POWER7	Hal Finkel	2015-01-09	1	-0/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The P7 benefits from not having really-small loops so that we either have multiple dispatch groups in the loop and/or the ability to form more-full dispatch groups during scheduling. Setting the partial unrolling threshold to 44 seems good, empirically, for the P7. Compared to using no late partial unrolling, this yields the following test-suite speedups: SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding -66.3253% +/- 24.1975% SingleSource/Benchmarks/Misc-C++/oopack_v1p8 -44.0169% +/- 29.4881% SingleSource/Benchmarks/Misc/pi -27.8351% +/- 12.2712% SingleSource/Benchmarks/Stanford/Bubblesort -30.9898% +/- 22.4647% I've speculatively added a similar setting for the P8. Also, I've noticed that the unroller does not quite calculate the unrolling factor correctly for really tiny loops because it neglects to account for the fact that not every loop body replicant contains an ending branch and counter increment. I'll fix that later. llvm-svn: 225522
*	ARM: add support for R_ARM_ABS16	Saleem Abdulrasool	2015-01-09	1	-0/+13
\| \| \| \| \| \|	Add support for R_ARM_ABS16 relocation mapping. Addresses PR22156. llvm-svn: 225510
*	test: add additional test for SVN r225507	Saleem Abdulrasool	2015-01-09	1	-0/+2
\| \| \| \| \| \| \|	Add an additional test case to ensure that we generate the relocation even if the thumb target is used. llvm-svn: 225509
*	ARM: add support for R_ARM_ABS8 relocations	Saleem Abdulrasool	2015-01-09	1	-0/+10
\| \| \| \| \| \|	Add support for R_ARM_ABS8 relocation. Addresses PR22126. llvm-svn: 225507
*	RegisterCoalescer: Fix removeCopyByCommutingDef with subreg liveness	Matthias Braun	2015-01-09	1	-0/+51
\| \| \| \| \| \| \| \| \|	The code that eliminated additional coalescable copies in removeCopyByCommutingDef() used MergeValueNumberInto() which internally may merge A into B or B into A. In this case A and B had different Def points, so we have to reset ValNo.Def to the intended one after merging. llvm-svn: 225503
*	[PowerPC] Fold [sz]ext with fp_to_int lowering where possible	Hal Finkel	2015-01-09	1	-0/+69
\| \| \| \| \| \| \|	On modern cores with lfiw[az]x, we can fold a sign or zero extension from i32 to i64 into the load necessary for an i64 -> fp conversion. llvm-svn: 225493
*	Utils: Keep distinct MDNodes distinct in MapMetadata()	Duncan P. N. Exon Smith	2015-01-08	2	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \| \|	Create new copies of distinct `MDNode`s instead of following the uniquing `MDNode` logic. Just like self-references (or other cycles), `MapMetadata()` creates a new node. In practice most calls use `RF_NoModuleLevelChanges`, in which case nothing is duplicated anyway. Part of PR22111. llvm-svn: 225476
*	IR: Add 'distinct' MDNodes to bitcode and assembly	Duncan P. N. Exon Smith	2015-01-08	9	-46/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Propagate whether `MDNode`s are 'distinct' through the other types of IR (assembly and bitcode). This adds the `distinct` keyword to assembly. Currently, no one actually calls `MDNode::getDistinct()`, so these nodes only get created for: - self-references, which are never uniqued, and - nodes whose operands are replaced that hit a uniquing collision. The concept of distinct nodes is still not quite first-class, since distinct-ness doesn't yet survive across `MapMetadata()`. Part of PR22111. llvm-svn: 225474
*	[PowerPC] Mark all instructions as non-cheap for MachineLICM	Hal Finkel	2015-01-08	1	-0/+55
\| \| \| \| \| \| \| \| \| \| \| \|	MachineLICM uses a callback named hasLowDefLatency to determine if an instruction def operand has a 'low' latency. If all relevant operands have a 'low' latency, the instruction is considered too cheap to hoist out of loops even in low-register-pressure situations. On PowerPC cores, both the embedded cores and the others, there is no reason to believe that this is a good choice: all instructions have a cost inside a loop, and hoisting them when not limited by register pressure is a reasonable default. llvm-svn: 225471
*	[ARM] Fix a bug in constant island pass that was triggering an assertion.	Akira Hatanaka	2015-01-08	1	-0/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The assert was being triggered when the distance between a constant pool entry and its user exceeded the maximally allowed distance after thumb2 branch shortening. A padding was inserted after a thumb2 branch instruction was shrunk, which caused the user to be out of range. This is wrong as the padding should have been inserted by the layout algorithm so that the distance between two instructions doesn't grow later during thumb2 instruction optimization. This commit fixes the code in ARMConstantIslands::createNewWater to call computeBlockSize and set BasicBlock::Unalign when a branch instruction is inserted to create new water after a basic block. A non-zero Unalign causes the worst-case padding to be inserted when adjustBBOffsetsAfter is called to recompute the basic block offsets. rdar://problem/19130476 llvm-svn: 225467
*	Fix fcmp + fabs instcombines when using the intrinsic	Matt Arsenault	2015-01-08	1	-0/+82
\| \| \| \| \| \| \|	This was only handling the libcall. This is another example of why only the intrinsic should ever be used when it exists. llvm-svn: 225465
*	[MCJIT] Remove a few redundant MCJIT tests, and drop the extraneous datalayout	Lang Hames	2015-01-08	6	-57/+0
\| \| \| \| \| \|	strings from the copies that remain. llvm-svn: 225460
*	Make this test a bit stricter.	Rafael Espindola	2015-01-08	1	-41/+41
\| \| \| \| \| \| \|	It now checks for the end of the line or the opening '{'. While at it, remove empty comments. llvm-svn: 225451
*	Add saving and restoring of r30 to the prologue and epilogue, respectively	Justin Hibbits	2015-01-08	2	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The PIC additions didn't update the prologue and epilogue code to save and restore r30 (PIC base register). This does that. Test Plan: Tests updated. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6876 llvm-svn: 225450
*	Fix large stack alignment codegen for ARM and Thumb2 targets	Kristof Beyls	2015-01-08	7	-15/+179
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This partially fixes PR13007 (ARM CodeGen fails with large stack alignment): for ARM and Thumb2 targets, but not for Thumb1, as it seems stack alignment for Thumb1 targets hasn't been supported at all. Producing an aligned stack pointer is done by zero-ing out the lower bits of the stack pointer. The BIC instruction was used for this. However, the immediate field of the BIC instruction can only encode an immediate that zeroes out up to a maximum of the 8 lower bits. When a larger alignment is requested, a BIC instruction cannot be used; llvm was silently producing incorrect code in this case. This commit fixes code generation for large stack alignments by using the BFC instruction instead, when the BFC instruction is available. When not, it uses 2 instructions: a right shift, followed by a left shift to zero out the lower bits. The lowering of ARM::Int_eh_sjlj_dispatchsetup still has code that unconditionally uses BIC to realign the stack pointer, so it very likely has the same problem. However, I wasn't able to produce a test case for that. This commit adds an assert so that the compiler will fail the assert instead of silently generating wrong code if this is ever reached. llvm-svn: 225446
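The two ways of realigning the stack pointer named above, clearing the low bits with a mask and shifting right then left, compute the same value. The sketch below shows only that arithmetic on a 32-bit value. It does not model instruction encodings, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Align SP down by clearing its low Log2Align bits with a mask. This is the
// arithmetic effect of a bit-clear style instruction.
std::uint32_t alignDownByMask(std::uint32_t SP, unsigned Log2Align) {
  assert(Log2Align < 32 && "shift amount must be below the register width");
  const std::uint32_t LowBits = (std::uint32_t{1} << Log2Align) - 1;
  return SP & ~LowBits;
}

// Same result using the two-instruction fallback from the message: a right
// shift followed by a left shift discards the low bits.
std::uint32_t alignDownByShifts(std::uint32_t SP, unsigned Log2Align) {
  assert(Log2Align < 32 && "shift amount must be below the register width");
  return (SP >> Log2Align) << Log2Align;
}
```

For example, aligning 0x7FFF1234 to 4096 bytes (Log2Align = 12) gives 0x7FFF1000 with either function.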
*	R600/SI: Remove SIISelLowering::legalizeOperands()	Tom Stellard	2015-01-08	10	-12/+33
\| \| \| \| \| \| \| \| \|	Its functionality has been replaced by calling SIInstrInfo::legalizeOperands() from SIISelLowering::AdjustInstrPostInstrSelection() and running the SIFoldOperands and SIShrinkInstructions passes. llvm-svn: 225445
*	Masked Load/Store - fixed a bug in type legalization.	Elena Demikhovsky	2015-01-08	1	-0/+49
\| \| \| \|	llvm-svn: 225441
*	Fix a think-o in the test for r225438.	Michael Kuperstein	2015-01-08	1	-1/+1
\| \| \| \|	llvm-svn: 225440