bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[Hexagon] Adding functionality for searching for compound instruction pairs. ↵	Colin LeMahieu	2015-06-08	1	-0/+17
\| \| \| \| \| \|	Compound instructions reduce slot resource requirements freeing those packet slots up for more instructions. llvm-svn: 239307
*	[DAGCombiner] Added CTLZ vector constant folding support.	Simon Pilgrim	2015-06-08	2	-0/+210
\| \| \| \|	llvm-svn: 239305
*	[ARM]: Add support for MMFR4_EL1 in assembler	Javed Absar	2015-06-08	3	-0/+8
\| \| \| \| \| \| \|	This patch adds support for system register MMFR4_EL1 (memory model feature register) in the assembler. This register provides information about the implemented memory model and memory management support. llvm-svn: 239302
*	[Mips64][mcjit] Add R_MIPS_PC32 relocation	Petar Jovanovic	2015-06-08	1	-0/+9
\| \| \| \| \| \| \| \| \| \|	This patch adds R_MIPS_PC32 relocation for Mips64. Patch by Vladimir Radosavljevic. Differential Revision: http://reviews.llvm.org/D10235 llvm-svn: 239301
*	AVX-512: Implemented 256/128bit VALIGND/Q instructions for SKX and KNL	Igor Breger	2015-06-08	3	-0/+263
\| \| \| \| \| \| \| \| \|	Implemented DAG lowering for all these forms. Added tests for DAG lowering and encoding. Differential Revision: http://reviews.llvm.org/D10310 llvm-svn: 239300
*	Minor refactoring of GEP handling in isDereferenceablePointer	Artur Pilipenko	2015-06-08	1	-0/+13
\| \| \| \| \| \| \| \| \| \|	For GEP instructions isDereferenceablePointer checks that all indices are constant and within bounds. Replace this index calculation logic with a call to accumulateConstantOffset. Separated from the http://reviews.llvm.org/D9791 Reviewed By: sanjoy Differential Revision: http://reviews.llvm.org/D9874 llvm-svn: 239299
*	[LAA] Fix estimation of number of memchecks	Silviu Baranga	2015-06-08	1	-0/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We need to add a runtime memcheck for each pair of accesses (x, y) where at least one of x and y is a write. Assuming we have w writes and r reads, currently this number is estimated as being w * (w + r - 1). This estimation will count (write, write) pairs twice and will overestimate the number of checks required. This change adds a getNumberOfChecks method to RuntimePointerCheck, which will count the number of runtime checks needed (similar in implementation to needsAnyChecking) and uses it to produce the correct number of runtime checks. Test Plan: llvm test suite spec2k spec2k6 Performance results: no changes observed (not surprising since the formula for 1 writer is basically the same, which covers most cases, at least with the current check limit). Reviewers: anemet Reviewed By: anemet Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D10217 llvm-svn: 239295
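The double counting described in the message above can be checked with a little arithmetic. The sketch below is only a pair-counting model: the function names are hypothetical, and it assumes every (write, read) and (write, write) pair needs a check, whereas the real getNumberOfChecks may skip pairs in the way needsAnyChecking does.

```python
def estimated_checks(writes: int, reads: int) -> int:
    """Old estimate quoted in the commit message: w * (w + r - 1)."""
    return writes * (writes + reads - 1)


def pairwise_checks(writes: int, reads: int) -> int:
    """Count unordered access pairs in which at least one access is a write.

    Assumption: every such pair needs a runtime check (an upper bound).
    """
    write_read_pairs = writes * reads
    write_write_pairs = writes * (writes - 1) // 2  # each pair counted once
    return write_read_pairs + write_write_pairs


if __name__ == "__main__":
    for w, r in [(1, 3), (2, 2), (3, 0)]:
        print(w, r, estimated_checks(w, r), pairwise_checks(w, r))
```

With a single writer the two formulas agree, which matches the note in the message that no performance change was observed; the gap is exactly w * (w - 1) / 2.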
*	[DAGCombiner] Added CTTZ vector constant folding support.	Simon Pilgrim	2015-06-08	2	-0/+188
\| \| \| \|	llvm-svn: 239293
*	[LoopVectorize] Teach Loop Vectorizer about interleaved memory accesses.	Hao Liu	2015-06-08	2	-20/+484
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector. E.g.
	  for (i = 0; i < N; i+=2) {
	    a = A[i];      // load of even element
	    b = A[i+1];    // load of odd element
	    ...            // operations on a, b, c, d
	    A[i] = c;      // store of even element
	    A[i+1] = d;    // store of odd element
	  }
	The loads of even and odd elements are identified as an interleave load group, which will be transformed into vectorized IR like:
	  %wide.vec = load <8 x i32>, <8 x i32>* %ptr
	  %vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
	  %vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
	The stores of even and odd elements are identified as an interleave store group, which will be transformed into vectorized IR like:
	  %interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7>
	  store <8 x i32> %interleaved.vec, <8 x i32>* %ptr
	This optimization is currently disabled by default. To try it, add '-enable-interleaved-mem-accesses=true'.
	llvm-svn: 239291
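The shuffle masks quoted in the message above can be sanity-checked outside the compiler. The following is a small model on Python lists, not LLVM code: `shuffle` is a hypothetical helper that imitates how shufflevector indexes into the concatenation of its two operands, and `None` stands in for `undef`.

```python
def shuffle(a, b, mask):
    """Model of shufflevector: index into the concatenation of a and b."""
    combined = list(a) + list(b)
    return [combined[i] for i in mask]


# One wide load of 8 elements: A[0], A[1], ..., A[7].
wide_vec = [10, 11, 12, 13, 14, 15, 16, 17]
undef = [None] * 8  # second operand is never selected by the masks below

# Interleave load group: split into even and odd elements.
vec_even = shuffle(wide_vec, undef, [0, 2, 4, 6])
vec_odd = shuffle(wide_vec, undef, [1, 3, 5, 7])

# Interleave store group: merge the two halves back into memory order.
interleaved_vec = shuffle(vec_even, vec_odd, [0, 4, 1, 5, 2, 6, 3, 7])

if __name__ == "__main__":
    print(vec_even, vec_odd, interleaved_vec)
```

Re-interleaving the two halves with the store mask reproduces the original element order, which is why one wide load plus two shuffles can replace the strided scalar accesses.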
*	[LoopAccessAnalysis] Teach LAA to check the memory dependence between ↵	Hao Liu	2015-06-08	1	-0/+540
\| \| \| \| \| \| \| \|	strided accesses. Differential Revision: http://reviews.llvm.org/D9368 llvm-svn: 239285
*	[objdump] Moving PrintImmHex out of MachODump and in to llvm-objdump and ↵	Colin LeMahieu	2015-06-07	1	-0/+10
\| \| \| \| \| \|	setting instprinter appropriately. llvm-svn: 239265
*	[X86] Added tzcnt vector tests.	Simon Pilgrim	2015-06-07	2	-0/+2795
\| \| \| \|	llvm-svn: 239264
*	SeparateConstOffsetFromGEP: Pass address space to isLegalAddressingMode	Matt Arsenault	2015-06-07	2	-0/+97
\| \| \| \|	llvm-svn: 239262
*	[X86] Added BitScanForward/BitScanReverse memory folding + tests	Simon Pilgrim	2015-06-07	1	-0/+51
\| \| \| \|	llvm-svn: 239257
*	Fixed line endings	Simon Pilgrim	2015-06-07	2	-6/+6
\| \| \| \|	llvm-svn: 239253
*	[DAGCombiner] Added CTPOP vector constant folding support.	Simon Pilgrim	2015-06-07	2	-0/+92
\| \| \| \| \| \|	Added tests to the existing SSE/AVX test files. llvm-svn: 239252
*	Teaching llvm-mc how to understand the defsym command line option. This ↵	Colin LeMahieu	2015-06-07	3	-0/+24
\| \| \| \| \| \|	allows integer-constant symbols to be defined on the command line and used during assembly. llvm-svn: 239240
*	[MC] Common symbols weren't being checked for redeclaration which allowed an ↵	Colin LeMahieu	2015-06-06	2	-0/+10
\| \| \| \| \| \|	assembly file to generate an assertion in setCommon(): !isCommon(). This change allows redeclaration as long as the size and alignment match exactly, otherwise report a fatal error. llvm-svn: 239227
*	[LoopUnroll] Fix truncation bug in canUnrollCompletely.	Sanjoy Das	2015-06-06	1	-0/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: canUnrollCompletely takes `unsigned` values for `UnrolledCost` and `RolledDynamicCost` but is passed in `uint64_t`s that are silently truncated. Because of this, when `UnrolledSize` is a large integer that has a small remainder with UINT32_MAX, LLVM tries to completely unroll loops with high trip counts. Reviewers: mzolotukhin, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10293 llvm-svn: 239218
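The truncation described above is easy to reproduce numerically. This is a minimal sketch, assuming a 32-bit `unsigned` and a purely hypothetical threshold of 150; a 64-to-32-bit conversion keeps the value modulo 2**32, so a huge cost can masquerade as a tiny one.

```python
UINT32_MASK = 0xFFFFFFFF  # assumes `unsigned` is 32 bits wide


def truncate_to_u32(value: int) -> int:
    """Model the implicit uint64_t -> unsigned conversion (value mod 2**32)."""
    return value & UINT32_MASK


def within_threshold(cost: int, threshold: int) -> bool:
    """Hypothetical stand-in for the cost comparison made before unrolling."""
    return cost <= threshold


HYPOTHETICAL_THRESHOLD = 150      # illustrative value only
huge_cost = 2**32 + 10            # far above any sensible threshold
truncated_cost = truncate_to_u32(huge_cost)

if __name__ == "__main__":
    print(truncated_cost, within_threshold(truncated_cost, HYPOTHETICAL_THRESHOLD))
```

Compared at full width the cost is rejected; after truncation it looks like 10 and passes, which is how a loop with a high trip count could end up being fully unrolled.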
*	[CVP] Don't assume Constants of type i1 can be known to be true or false	David Majnemer	2015-06-06	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \|	CVP wants to analyze the condition operand of a select along an edge. It succeeds in getting back a Constant but not a ConstantInt. Instead, it gets a ConstantExpr. It then assumes that the Constant must be equal to false because it isn't equal to true. Instead, perform an additional comparison. This fixes PR23752. llvm-svn: 239217
*	[InstCombine] Don't miscompile select to poison	David Majnemer	2015-06-06	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we have (select a, b, c), it is sometimes valid to simplify this to a single select operand. However, doing so is only valid if the computation doesn't inject poison into the computation. It might be helpful to consider the following example: (select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN) The select is equivalent to (add %i, 1) but not (add nsw %i, 1). Self hosting on x86_64 revealed that this occurs very, very rarely so bailing out is hopefully pretty reasonable. llvm-svn: 239215
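The select example above can be modelled with 32-bit arithmetic. This is a toy model, not LLVM semantics in full: `POISON` is a hypothetical sentinel for a poison value, and the helper names are made up for illustration.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1
POISON = object()  # hypothetical marker for an LLVM poison value


def add_wrap(a: int, b: int) -> int:
    """Plain `add`: two's-complement wraparound on 32 bits."""
    total = (a + b) & 0xFFFFFFFF
    return total - 2**32 if total >= 2**31 else total


def add_nsw(a: int, b: int):
    """`add nsw`: yields poison when the signed addition overflows."""
    exact = a + b
    return exact if INT_MIN <= exact <= INT_MAX else POISON


def select_form(i: int):
    """(select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN)"""
    return add_nsw(i, 1) if i != INT_MAX else INT_MIN


if __name__ == "__main__":
    for i in (5, -1, INT_MAX):
        print(i, select_form(i), add_wrap(i, 1), add_nsw(i, 1) is POISON)
```

For every input the select agrees with the wrapping add, but at INT_MAX the nsw add produces poison while the select produces INT_MIN, so replacing the select with its nsw operand would be a miscompile.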
*	Handle 16 bit PC relative relocations.	Rafael Espindola	2015-06-06	1	-0/+5
\| \| \| \| \| \|	Fixes pr23771. llvm-svn: 239214
*	[dsymutil] Fix misspelled CHECK line.	Frederic Riss	2015-06-05	1	-1/+1
\| \| \| \|	llvm-svn: 239200
*	[dsymutil] Add support for linking the debug_frame section.	Frederic Riss	2015-06-05	5	-0/+231
\| \| \| \| \| \| \| \| \| \| \|	Linking the debug frame section is actually very easy as we just have to patch the start address in the FDE header and then copy the rest of the FDE without even looking at it. The only small complexity comes from the handling of the CIEs that we should unique across object files. This is also really easy by using a StringMap keyed on the raw contents of the CIE. llvm-svn: 239198
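The CIE uniquing idea above amounts to a map from raw bytes to an output offset. The sketch below uses a Python dict in place of the StringMap; the function name, the byte strings, and the flat output layout are all assumptions made for illustration, not dsymutil's actual data structures.

```python
def unique_cies(cies_per_object):
    """Emit each distinct CIE once, keyed on its raw contents.

    cies_per_object: one list of raw CIE byte strings per object file.
    Returns (offsets, output): the offset of each unique CIE in the output
    section, and the concatenated output bytes.
    """
    offsets = {}          # raw CIE bytes -> offset in the linked section
    output = bytearray()
    for cies in cies_per_object:
        for raw in cies:
            if raw not in offsets:      # first time this CIE is seen
                offsets[raw] = len(output)
                output += raw
    return offsets, bytes(output)


if __name__ == "__main__":
    shared = b"\x01CIE-A"
    other = b"\x01CIE-BB"
    offsets, section = unique_cies([[shared], [shared, other]])
    print(offsets, section)
```

An FDE copied from any object file can then point at `offsets[raw_cie]` for its CIE, regardless of which object first contributed it.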
*	[dsymutil] Have the YAML deserialization rewrite the object address of symbols.	Frederic Riss	2015-06-05	1	-0/+44
\| \| \| \| \| \| \| \| \| \| \|	The main use of the YAML debug map format is for testing inside LLVM. If we have IR files in the tests used to generate object files, then we obviously don't know the addresses of the symbols inside the object files beforehand. This change lets the YAML import look up the addresses in the object files and rewrite them. This will allow tests that really don't need any binary input. llvm-svn: 239189
*	Revert "[InstCombine] Rephrase fix to SimplifyWithOpReplaced"	Renato Golin	2015-06-05	1	-10/+0
\| \| \| \| \| \| \| \| \|	This reverts commit r239141. This commit was an attempt to reintroduce a previous patch that broke many self-hosting bots with clang timeouts, but it still has slowdown issues, at least on ARM, increasing the compilation time (stage 2, clang's) by 5x. llvm-svn: 239175
*	[InstCombine] Fix PR23751.	Sanjoy Das	2015-06-05	1	-0/+13
\| \| \| \| \| \|	PR23751 was caused by a missing ``break;`` in r234388. llvm-svn: 239171
*	Revert r238473, "Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM."	Peter Collingbourne	2015-06-05	2	-74/+7
\| \| \| \| \| \| \|	as it caused miscompilations and assertion failures (PR23768, http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150601/280380.html). llvm-svn: 239169
*	DAGCombiner: don't duplicate (fmul x, c) in visitFNEG if fneg is free	Fiona Glaser	2015-06-05	1	-0/+16
\| \| \| \| \| \| \| \| \| \|	For targets with a free fneg, this fold is always a net loss if it ends up duplicating the multiply, so definitely avoid it. This might be true for some targets without a free fneg too, but I'll leave that for future investigation. llvm-svn: 239167
*	[Unroll] Rework the naming and structure of the new unroll heuristics.	Chandler Carruth	2015-06-05	2	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The new naming is (to me) much easier to understand. Here is a summary of the new state of the world:
	- 'Threshold' is the threshold for full unrolling. It is measured against the estimated unrolled cost as computed by getUserCost in TTI (or CodeMetrics, etc). We will exceed this threshold when unrolling loops where unrolling exposes a significant degree of simplification of the logic within the loop.
	- 'PercentDynamicCostSavedThreshold' is the percentage of the loop's estimated dynamic execution cost which needs to be saved by unrolling to apply a discount to the estimated unrolled cost.
	- 'DynamicCostSavingsDiscount' is the discount applied to the estimated unrolling cost when the dynamic savings are expected to be high.
	When actually analyzing the loop, we now produce both an estimated unrolled cost, and an estimated rolled cost. The rolled cost is notably a dynamic estimate based on our analysis of the expected execution of each iteration.
	While we're still working to build up the infrastructure for making these estimates, to me it is much more clear how to make them better when they have reasonably descriptive names. For example, we may want to apply estimated (from heuristics or profiles) dynamic execution weights to the dynamic cost estimates. If we start doing that, we would also need to track the static unrolled cost and the dynamic unrolled cost, as only the latter could reasonably be weighted by profile information.
	This patch is sadly not without functionality change for the new unroll analysis logic. Buried in the heuristic management were several things that surprised me. For example, we never subtracted the optimized instruction count off when comparing against the unroll heuristics! I don't know if this just got lost somewhere along the way or what, but with the new accounting of things, this is much easier to keep track of and we use the post-simplification cost estimate to compare to the thresholds, and use the dynamic cost reduction ratio to select whether we can exceed the baseline threshold.
	The old values of these flags also don't necessarily make sense. My impression is that none of these thresholds or discounts have been tuned yet, and so they're just arbitrary placeholder numbers. As such, I've not bothered to adjust for the fact that this is now a discount and not a two-tier threshold model. We need to tune all these values once the logic is ready to be enabled.
	Differential Revision: http://reviews.llvm.org/D9966 llvm-svn: 239164
*	[dsymutil] Handle the -oso-prepend-path option when the input is a YAML ↵	Frederic Riss	2015-06-05	2	-2/+2
\| \| \| \| \| \| \| \|	debug map All the tests using a YAML debug map will need this. llvm-svn: 239163
*	[bpf] rename triple names bpf_be -> bpfeb	Alexei Starovoitov	2015-06-05	14	-14/+14
\| \| \| \|	llvm-svn: 239162
*	[Hexagon] Reapply r239097 with tests corrected for shuffling and duplexing.	Colin LeMahieu	2015-06-05	11	-105/+41
\| \| \| \|	llvm-svn: 239161
*	[ARM] Add support for -sp- FPUs and FPU none to TargetParser	John Brawn	2015-06-05	2	-4/+20
\| \| \| \| \| \| \| \| \| \|	These are added mainly for the benefit of clang, but this also means that they are now allowed in .fpu directives and we emit the correct .fpu directive when single-precision-only is used. Differential Revision: http://reviews.llvm.org/D10238 llvm-svn: 239151
*	[X86][AVX2] Added tests for v32i8 vector shifts	Simon Pilgrim	2015-06-05	1	-0/+390
\| \| \| \| \| \|	Currently still scalarized, but D9474 should remedy that. llvm-svn: 239146
*	Revert "[mips] [IAS] Restore STI.FeatureBits in .set pop." (r239144).	Toma Tabacu	2015-06-05	2	-26/+0
\| \| \| \| \| \|	This is breaking the Windows buildbots. llvm-svn: 239145
*	[mips] [IAS] Restore STI.FeatureBits in .set pop.	Toma Tabacu	2015-06-05	2	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Only restoring AvailableFeatures is not enough and will lead to buggy behaviour. For example, if we have a feature enabled and we ".set pop", the next time we try to ".set" that feature nothing will happen because the "!(STI.getFeatureBits()[Feature])" check will be false, because we didn't restore STI.FeatureBits. In order to fix this, we need to make MipsAssemblerOptions remember the STI.FeatureBits instead of the AvailableFeatures and then regenerate AvailableFeatures each time we ".set pop". This is because, AFAIK, there is no way to convert from AvailableFeatures back to STI.FeatureBits, but the reverse is possible by using ComputeAvailableFeatures(STI.FeatureBits). I also moved the updating of AssemblerOptions inside the "if" statement in setFeatureBits() and clearFeatureBits(), as there is no reason to update if nothing changes. Reviewers: dsanders, mkuper Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9156 llvm-svn: 239144
*	[LoopVectorize] Don't crash on zero-sized types in isInductionPHI	David Majnemer	2015-06-05	1	-0/+27
\| \| \| \| \| \| \| \| \|	isInductionPHI wants to calculate the stride based on the pointee size. However, this is not possible when the pointee is zero sized. This fixes PR23763. llvm-svn: 239143
*	Simplify code; NFC.	Andrea Di Biagio	2015-06-05	2	-47/+48
\| \| \| \| \| \| \| \|	Also, moved test cases from CodeGen/X86/fold-buildvector-bug.ll into CodeGen/X86/buildvec-insertvec.ll and regenerated CHECK lines using update_llc_test_checks.py. llvm-svn: 239142
*	[InstCombine] Rephrase fix to SimplifyWithOpReplaced	David Majnemer	2015-06-05	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I don't have the IR which is causing the build bot breakage but I can postulate as to why they are timing out:
	1. SimplifyWithOpReplaced was stripping flags from the simplified value.
	2. visitSelectInstWithICmp was overriding SimplifyWithOpReplaced because its simplification wasn't correct.
	3. InstCombine would revisit the add instruction and note that it can rederive the flags.
	4. By modifying the value, we chose to revisit instructions which reuse the value. One of the instructions is the original select, causing LLVM to never reach fixpoint.
	Instead, strip the flags only when we are sure we are going to perform the simplification. llvm-svn: 239141
*	Revert "[InstCombine] Don't miscompile safe increment idiom"	Daniel Jasper	2015-06-05	1	-10/+0
\| \| \| \| \| \| \| \| \|	This is breaking a lot of build bots and is causing very long-running compiles (infinite loops)? Likely, we shouldn't return nullptr? llvm-svn: 239139
*	[X86][SSE] Added tests for i8/i16 vector shifts	Simon Pilgrim	2015-06-05	1	-0/+1016
\| \| \| \| \| \|	Currently still scalarized, but D9474 should remedy that. llvm-svn: 239136
*	Revert "[Object, ELF] Fix segmentation fault in ELFFile::getSectionName()."	Alexey Samsonov	2015-06-04	2	-3/+0
\| \| \| \| \| \|	This reverts commit r239124. llvm-svn: 239125
*	[Object, ELF] Fix segmentation fault in ELFFile::getSectionName().	Alexey Samsonov	2015-06-04	2	-0/+3
\| \| \| \| \| \|	Don't do a null dereference if .shstrtab section is missing. llvm-svn: 239124
*	[Object, ELF] Don't assert on invalid magic in createELFObjectFile.	Alexey Samsonov	2015-06-04	2	-0/+2
\| \| \| \| \| \|	Instead, return a proper error code from factory. llvm-svn: 239116
*	[InstCombine] Don't miscompile safe increment idiom	David Majnemer	2015-06-04	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \|	We cleverly handle cases where computation done in one argument of a select instruction is suitable for the other operand, thus obviating the need of the select and the comparison. However, the other operand cannot have flags. This fixes PR23757. llvm-svn: 239115
*	Statepoint: Fix handling of Far Immediate calls	Swaroop Sridhar	2015-06-04	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gc.statepoint intrinsics with a far immediate call target were lowered incorrectly as pc-rel32 calls. This change fixes the problem, and generates an indirect call via a scratch register. For example:
	Intrinsic:
	  %safepoint_token = call i32 (i64, i32, void (), i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void () inttoptr (i64 140727162896504 to void ()), i32 0, i32 0, i32 0, i32 0)
	Old Incorrect Lowering:
	  callq 140727162896504
	New Correct Lowering:
	  movabsq $140727162896504, %rax
	  callq %rax
	In lowerCallFromStatepoint(), the callee-target was modified and represented as a "TargetConstant" node, rather than a "Constant" node. Undoing this modification enabled LowerCall() to generate the correct CALL instruction. llvm-svn: 239114
*	[Object, ELF] Don't call llvm_unreachable() from createELFObjectFile.	Alexey Samsonov	2015-06-04	2	-0/+2
\| \| \| \| \| \|	Instead, return a proper error code from factory. llvm-svn: 239113
*	[Target/X86] Don't use callee-saved registers in a Win64 tail call on ↵	Charles Davis	2015-06-04	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	non-Windows. Summary: A small bit that I missed when I updated the X86 backend to account for the Win64 calling convention on non-Windows. Now we don't use dead non-volatile registers when emitting a Win64 indirect tail call on non-Windows. Should fix PR23710. Test Plan: Added test for the correct behavior based on the case I posted to PR23710. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10258 llvm-svn: 239111
*	[Object, MachO] Don't crash on incomplete MachO segment load commands.	Alexey Samsonov	2015-06-04	2	-0/+3
\| \| \| \| \| \| \| \|	Report a proper error code from the MachOObjectFile constructor if we can't parse another segment load command (we already return a proper error if the segment load command contents are suspicious). llvm-svn: 239109