bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[AVX512] Add DQ subvector inserts	Adam Nemet	2014-10-15	4	-14/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In AVX512f we support 64x2 and 32x8 inserts via matching them to 32x4 and 64x4 respectively. These are matched by "Alt" Pat<>'s (Alt stands for alternative VTs). Since DQ has native support for these instructions, I peeled off the non-"Alt" part of the baseclass into vinsert_for_size_no_alt. The DQ instructions are derived from this multiclass. The "Alt" Pat<>'s are disabled with DQ. Fixes <rdar://problem/18426089> llvm-svn: 219874
*	[AVX512] Add SKX testing to avx512-insert-extract.ll	Adam Nemet	2014-10-15	1	-2/+4
\| \| \| \| \| \|	This is in preparation to adding DQ subvector inserts to this testcase. llvm-svn: 219873
*	[AVX512] Fix test to produce a defined value	Adam Nemet	2014-10-15	1	-1/+1
\| \| \| \| \| \|	We're inserting into an 8-wide vector, so the index should be < 8. llvm-svn: 219872
*	[AVX512] Two new attributes in X86VectorVTInfo for subvector insert	Adam Nemet	2014-10-15	2	-4/+14
\| \| \| \| \| \| \| \| \|	The new attributes are NumElts and the CD8TupleForm. This prepares the code to enable x8 and x2 inserts. NFC, no change in X86.td.expanded except for the new attributes. llvm-svn: 219871
*	[AVX512] Rename arg from Opcode32/64 to Opcode128/256 in vinsert_for_size	Adam Nemet	2014-10-15	1	-4/+4
\| \| \| \| \| \| \|	It's the W bit that selects between 32 or 64 elt type and not the opcode. The opcode selects between the width of the insert (128 or 256). llvm-svn: 219870
*	R600: Remove unnecessary part of computeKnownBitsForTargetNode	Matt Arsenault	2014-10-15	1	-5/+0
\| \| \| \| \| \| \|	Zero-width BFEs are combined away already, so there's no point in handling them. llvm-svn: 219868
*	Move variable down to use	Matt Arsenault	2014-10-15	1	-4/+4
\| \| \| \|	llvm-svn: 219867
*	Add MachOObjectFile::getUuid()	Alexander Potapenko	2014-10-15	2	-1/+12
\| \| \| \| \| \| \|	This CL introduces MachOObjectFile::getUuid(). This function returns an ArrayRef to the object file's UUID, or an empty ArrayRef if the object file doesn't contain an LC_UUID load command. The new function will be used by llvm-symbolizer. llvm-svn: 219866
*	Updating documentation based on my change to remove the template disambiguation.	Chris Bieneman	2014-10-15	1	-3/+3
\| \| \| \|	llvm-svn: 219862
*	Fixing the build failure due to compiler warnings and unnecessary disambiguation.	Chris Bieneman	2014-10-15	1	-3/+2
\| \| \| \| \| \|	llvm-svn: 219861
*	Defining a new API for debug options that doesn't rely on static global cl::opts.	Chris Bieneman	2014-10-15	7	-12/+193
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is based on the discussions from the LLVMDev thread: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075886.html Reviewers: chandlerc Reviewed By: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5389 llvm-svn: 219854
*	R600/SI: Fix bug where immediates were being used in DS addr operands	Tom Stellard	2014-10-15	2	-1/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The SelectDS1Addr1Offset complex pattern always tries to store constant lds pointers in the offset operand and store a zero value in the addr operand. Since the addr operand does not accept immediates, the zero value needs to first be copied to a register. This newly created zero value will not go through normal instruction selection, so we need to manually insert a V_MOV_B32_e32 in the complex pattern. This bug was hidden by the fact that if there was another zero value in the DAG that had not been selected yet, then the CSE done by the DAG would use the unselected node for the addr operand rather than the one that was just created. This would lead to the zero value being selected and the DAG automatically inserting a V_MOV_B32_e32 instruction. llvm-svn: 219848
*	Avoid caching the MachineFunction, we don't use it outside of	Eric Christopher	2014-10-15	1	-9/+7
\| \| \| \| \| \|	runOnMachineFunction. llvm-svn: 219847
*	Wrong attribute. LLVM_ATTRIBUTE_UNUSED not LLVM_ATTRIBUTE_USED	Sid Manning	2014-10-15	1	-1/+1
\| \| \| \| \| \| \| \| \|	The original fix for the build break was correct. LLVM_ATTRIBUTE_USED removes the warning message because it keeps the function in the object file. LLVM_ATTRIBUTE_UNUSED indicates that it may or may not be used depending on build settings. llvm-svn: 219846
*	IR: Move NumOperands from User to Value, NFC	Duncan P. N. Exon Smith	2014-10-15	4	-8/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Store `User::NumOperands` (and `MDNode::NumOperands`) in `Value`. On 64-bit host architectures, this reduces `sizeof(User)` and all subclasses by 8, and has no effect on `sizeof(Value)` (or, incidentally, on `sizeof(MDNode)`). On 32-bit host architectures, this increases `sizeof(Value)` by 4. However, it has no effect on `sizeof(User)` and `sizeof(MDNode)`, so the only concrete subclasses of `Value` that actually see the increase are `BasicBlock`, `Argument`, `InlineAsm`, and `MDString`. Moreover, I'll be shocked and confused if this causes a tangible memory regression. This has no functionality change (other than memory footprint). llvm-svn: 219845
*	IR: Cleanup comments for Value, User, and MDNode	Duncan P. N. Exon Smith	2014-10-15	7	-233/+212
\| \| \| \| \| \| \| \| \| \| \| \| \|	A follow-up commit will modify the memory-layout of `Value`, `User`, and `MDNode`. First fix the comments to be doxygen-friendly (and to follow the coding standards). - Use "\brief" instead of "repeatedName -". - Add a brief intro where it was missing. - Remove duplicated comments from source files (and a couple of noisy/trivial comments altogether). llvm-svn: 219844
*	Wrong attribute. LLVM_ATTRIBUTE_USED not LLVM_ATTRIBUTE_UNUSED	Sid Manning	2014-10-15	1	-1/+1
\| \| \| \|	llvm-svn: 219837
*	Allow forward references to section symbols.	Rafael Espindola	2014-10-15	2	-1/+36
\| \| \| \|	llvm-svn: 219835
*	Teach ScalarEvolution to sharpen range information.	Sanjoy Das	2014-10-15	3	-2/+78
\| \| \| \| \| \| \| \| \| \| \| \|	If x is known to have the range [a, b) in a loop predicated by (icmp ne x, a), its range can be sharpened to [a + 1, b). Get ScalarEvolution and hence IndVars to exploit this fact. This change triggers an optimization to widen-loop-comp.ll, so it had to be edited to get it to pass. phabricator: http://reviews.llvm.org/D5639 llvm-svn: 219834
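The range sharpening described above can be illustrated with a minimal sketch. It models ranges as plain half-open integer pairs and ignores the wrap-around behaviour of fixed-width integers, so it is not how ScalarEvolution represents ranges; the function name `sharpen_range_ne` is hypothetical.

```python
from typing import Tuple

# A half-open integer range [lo, hi). This is a simplification:
# it does not model wrapping fixed-width arithmetic.
Range = Tuple[int, int]


def sharpen_range_ne(rng: Range, excluded: int) -> Range:
    """Tighten rng given the extra fact that the value != excluded.

    Only the case from the commit message is handled: when the
    excluded value is the lower bound a, [a, b) becomes [a + 1, b).
    Any other excluded value leaves the range unchanged.
    """
    lo, hi = rng
    if lo < hi and excluded == lo:
        return (lo + 1, hi)
    return rng


# Inside a loop guarded by (icmp ne x, 0), x in [0, 10) sharpens to [1, 10).
print(sharpen_range_ne((0, 10), 0))
```

Excluding a value from the middle of the range cannot be expressed as a single interval, which is why the sketch leaves such ranges untouched.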
*	Add LLVM_ATTRIBUTE_UNUSED to function currently just used in an assert	Sid Manning	2014-10-15	1	-0/+2
\| \| \| \| \| \|	Fixes break when -Wunused-function is used. llvm-svn: 219833
*	InstCombine: Narrow switch instructions using known bits.	Akira Hatanaka	2014-10-15	2	-0/+92
\| \| \| \| \| \| \| \| \|	Truncate the operands of a switch instruction to a narrower type if the upper bits are known to be all ones or zeros. rdar://problem/17720004 llvm-svn: 219832
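As a rough illustration of the idea above, the sketch below computes how narrow a switch condition could become when its upper bits are known. It is an assumption-laden model, not the InstCombine code: the known-bits information is passed in as two hypothetical masks (`known_zero`, `known_one`), every case value must share the same known upper bits, and the result is clamped to at least one bit.

```python
def count_leading_ones(value: int, width: int) -> int:
    """Number of consecutive 1 bits starting from the top of a width-bit value."""
    count = 0
    for bit in range(width - 1, -1, -1):
        if (value >> bit) & 1:
            count += 1
        else:
            break
    return count


def count_leading_zeros(value: int, width: int) -> int:
    """Number of consecutive 0 bits starting from the top of a width-bit value."""
    mask = (1 << width) - 1
    return count_leading_ones(~value & mask, width)


def narrowed_switch_width(width, known_zero, known_one, case_values):
    """Smallest width that still distinguishes the condition and all cases.

    known_zero / known_one are masks of condition bits known to be 0 / 1.
    Upper bits can be dropped only if the condition and every case value
    agree on them (all zeros, or all ones).
    """
    lead_zeros = count_leading_ones(known_zero, width)
    lead_ones = count_leading_ones(known_one, width)
    for case in case_values:
        lead_zeros = min(lead_zeros, count_leading_zeros(case, width))
        lead_ones = min(lead_ones, count_leading_ones(case, width))
    return max(1, width - max(lead_zeros, lead_ones))


# A 32-bit condition whose top 24 bits are known zero, with small case values,
# can be switched on as an 8-bit value.
print(narrowed_switch_width(32, 0xFFFFFF00, 0, [1, 2, 200]))
```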
*	Reapply "[FastISel][AArch64] Add custom lowering for GEPs."	Juergen Ributzka	2014-10-15	3	-3/+119
\| \| \| \| \| \| \| \| \| \| \|	This is mostly a copy of the existing FastISel GEP code, but we have to duplicate it for AArch64, because otherwise we would bail out even for simple cases. This is because the standard fastEmit functions don't cover MUL at all and ADD is lowered very inefficiently. The original commit had a bug in the add emit logic, which has been fixed. llvm-svn: 219831
*	[FastISel][AArch64] Factor out add with immediate emission into a helper function. NFC.	Juergen Ributzka	2014-10-15	1	-13/+28
\| \| \| \| \| \| \| \|	Simplify add with immediate emission by factoring it out into a helper function. llvm-svn: 219830
*	Correctly handle references to section symbols.	Rafael Espindola	2014-10-15	8	-52/+160
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When processing assembly like .long .text we were creating a new undefined symbol .text. GAS on the other hand would handle that as a reference to the .text section. This patch implements that by creating the section symbols earlier so that they are visible during asm parsing. The patch also updates llvm-readobj to print the symbol number in the relocation dump so that the test can differentiate between two sections with the same name. llvm-svn: 219829
*	Enable the instruction printer in HexagonMCTargetDesc	Sid Manning	2014-10-15	4	-4/+64
\| \| \| \| \| \| \| \| \| \|	This adds the MCInstPrinter to the LLVMHexagonDesc library and removes the dependency LLVMHexagonAsmPrinter had on LLVMHexagonDesc. This is a prerequisite needed by the disassembler. Phabricator Revision: http://reviews.llvm.org/D5734 llvm-svn: 219826
*	R600/SI: Also try to use 0 base for misaligned 8-byte DS loads.	Matt Arsenault	2014-10-15	3	-0/+73
\| \| \| \|	llvm-svn: 219823
*	R600: Fix miscompiles when BFE has multiple uses	Matt Arsenault	2014-10-15	2	-7/+32
\| \| \| \| \| \|	SimplifyDemandedBits would break the other uses of the operand. llvm-svn: 219819
*	correct const-ness with auto and dyn_cast	Sanjay Patel	2014-10-15	1	-3/+3
\| \| \| \| \| \| \| \| \|	1. Use const with autos. 2. Don't bother with explicit const in cast ops because they do it automagically. Thanks, David B. / Aaron B. / Reid K. llvm-svn: 219817
*	[SLPVectorize] Basic ephemeral-value awareness	Hal Finkel	2014-10-15	2	-3/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The SLP vectorizer should not vectorize ephemeral values. These are used to express information to the optimizer, and vectorizing them does not lead to faster code (because the ephemeral values are dropped prior to code generation, vectorized or not), and obscures the information the instructions are attempting to communicate (the logic that interprets the arguments to @llvm.assume generically does not understand vectorized conditions). Also, uses by ephemeral values are free (because they, and the necessary extractelement instructions, will be dropped prior to code generation). llvm-svn: 219816
*	Treat the WorkSet used to find ephemeral values as double-ended	Hal Finkel	2014-10-15	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to make sure that we visit all operands of an instruction before moving deeper in the operand graph. We had been pushing operands onto the back of the work set, and popping them off the back as well, meaning that we might visit an instruction before visiting all of its uses that sit in between it and the call to @llvm.assume. To provide an explicit example, given the following:
	%q0 = extractelement <4 x float> %rd, i32 0
	%q1 = extractelement <4 x float> %rd, i32 1
	%q2 = extractelement <4 x float> %rd, i32 2
	%q3 = extractelement <4 x float> %rd, i32 3
	%q4 = fadd float %q0, %q1
	%q5 = fadd float %q2, %q3
	%q6 = fadd float %q4, %q5
	%qi = fcmp olt float %q6, %q5
	call void @llvm.assume(i1 %qi)
	%q5 is used by both %qi and %q6. When we visit %qi, it will be marked as ephemeral, and we'll queue %q6 and %q5. %q6 will be marked as ephemeral and we'll queue %q4 and %q5. Under the old system, we'd then visit %q4, which would become ephemeral, %q1 and then %q0, which would become ephemeral as well, and now we have a problem. We'd visit %rd, but it would not be marked as ephemeral because we've not yet visited %q2 and %q3 (because we've not yet visited %q5). This will be covered by a test case in a follow-up commit that enables ephemeral-value awareness in the SLP vectorizer. llvm-svn: 219815
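The effect of the visiting order can be reproduced with a small model of the example above. This is a sketch under stated assumptions, not the LLVM implementation: values are plain strings, a value is treated as ephemeral when all of its users are ephemeral, each value is examined at most once, and the names `OPERANDS`, `users_of`, and `find_ephemeral` are hypothetical. The exact visit order in the last-in-first-out run depends on the order operands are pushed, so it may differ in detail from the narrative in the commit message.

```python
from collections import deque

# Operand lists for the example: each value maps to the values it uses.
OPERANDS = {
    "assume": ["qi"],
    "qi": ["q6", "q5"],
    "q6": ["q4", "q5"],
    "q5": ["q2", "q3"],
    "q4": ["q0", "q1"],
    "q3": ["rd"],
    "q2": ["rd"],
    "q1": ["rd"],
    "q0": ["rd"],
    "rd": [],
}


def users_of(operands):
    """Invert the operand map: value -> set of values that use it."""
    users = {value: set() for value in operands}
    for user, ops in operands.items():
        for op in ops:
            users[op].add(user)
    return users


def find_ephemeral(operands, root, fifo):
    """Collect ephemeral values reachable from root.

    fifo=True pops from the front of the work set (push back, pop front);
    fifo=False pops from the back (push back, pop back).
    """
    users = users_of(operands)
    ephemeral = {root}
    visited = set()
    work = deque([root])
    while work:
        value = work.popleft() if fifo else work.pop()
        if value in visited:
            continue
        visited.add(value)
        # A value with any non-ephemeral user is not ephemeral, and it
        # will not be examined again.
        if any(user not in ephemeral for user in users[value]):
            continue
        ephemeral.add(value)
        work.extend(operands[value])
    return ephemeral


print(sorted(find_ephemeral(OPERANDS, "assume", fifo=True)))
print(sorted(find_ephemeral(OPERANDS, "assume", fifo=False)))
```

In this model the front-popping order marks every value, including `rd`, while the back-popping order reaches `rd` and `q5` before all of their users have been marked and therefore leaves them out.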
*	[MC] Make bundle alignment mode setting idempotent and support nested bundles	Derek Schuff	2014-10-15	4	-16/+108
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently an error is thrown if bundle alignment mode is set more than once per module (either via the API or the .bundle_align_mode directive). This change allows setting it multiple times as long as the alignment doesn't change. Also nested bundle_lock groups are currently not allowed. This change allows them, with the effect that the group stays open until all nests are exited, and if any of the bundle_lock directives has the align_to_end flag, the group becomes align_to_end. These changes make the bundle alignment simpler to use in the compiler, and also better match the corresponding support in GNU as. Reviewers: jvoung, eliben Differential Revision: http://reviews.llvm.org/D5801 llvm-svn: 219811
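The two rules described above (idempotent alignment setting and nested lock groups) can be modelled with a small state machine. The class and method names below are hypothetical and chosen only for illustration; this is not the MC layer's API, and it tracks nothing beyond the state the summary mentions.

```python
class BundleState:
    """Toy model of bundle alignment mode and nested bundle_lock groups."""

    def __init__(self):
        self.align_pow2 = None      # alignment, once set
        self.lock_depth = 0         # number of open bundle_lock nests
        self.align_to_end = False   # sticky flag for the current group

    @property
    def in_group(self):
        return self.lock_depth > 0

    def set_bundle_align_mode(self, pow2):
        # Setting the mode again is fine as long as the value is unchanged.
        if self.align_pow2 is not None and self.align_pow2 != pow2:
            raise ValueError("bundle alignment cannot change once set")
        self.align_pow2 = pow2

    def bundle_lock(self, align_to_end=False):
        # Nested locks extend the open group; align_to_end from any
        # nest applies to the whole group.
        self.lock_depth += 1
        self.align_to_end = self.align_to_end or align_to_end

    def bundle_unlock(self):
        if self.lock_depth == 0:
            raise ValueError("bundle_unlock without matching bundle_lock")
        self.lock_depth -= 1
        if self.lock_depth == 0:
            # The group closes only when the outermost nest is exited.
            self.align_to_end = False


state = BundleState()
state.set_bundle_align_mode(5)
state.set_bundle_align_mode(5)          # allowed: same alignment
state.bundle_lock()
state.bundle_lock(align_to_end=True)    # nested; group becomes align_to_end
state.bundle_unlock()
print(state.in_group, state.align_to_end)
state.bundle_unlock()
print(state.in_group)
```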
*	DI: Make comments "brief"-er, NFC	Duncan P. N. Exon Smith	2014-10-15	2	-85/+109
\| \| \| \| \| \| \| \| \| \| \|	Follow-up to r219801. Post-commit review pointed out that all comments require a `\brief` description [1], so I converted many and recrafted a few to be briefer or to include a brief intro. (If I'm going to clean them up, I should do it right!) [1]: http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments llvm-svn: 219808
*	Use 'auto' for easier reading; no functional change intended.	Sanjay Patel	2014-10-15	1	-6/+3
\| \| \| \|	llvm-svn: 219804
*	remove function names from comments; NFC	Sanjay Patel	2014-10-15	1	-35/+24
\| \| \| \|	llvm-svn: 219803
*	DI: Cleanup comments, NFC	Duncan P. N. Exon Smith	2014-10-15	2	-212/+85
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A number of comment cleanups: - Remove duplicated function and class names from comments. - Remove duplicated comments from source file (some of which were out-of-sync). - Move any unduplicated comments from source file to header. - Remove some noisy comments entirely (e.g., a comment for `DIDescriptor::print()` saying "print descriptor" just gets in the way of reading the code). llvm-svn: 219801
*	Simplify handling of --noexecstack by using getNonexecutableStackSection.	Rafael Espindola	2014-10-15	31	-127/+81
\| \| \| \|	llvm-svn: 219799
*	DI: Use a `DenseMap` instead of named metadata, NFC	Duncan P. N. Exon Smith	2014-10-15	4	-57/+8
\| \| \| \| \| \| \|	Remove a strange round-trip through named metadata to assign preserved local variables to their subprograms. llvm-svn: 219798
*	Move getNonexecutableStackSection up to the base ELF class.	Rafael Espindola	2014-10-15	8	-23/+12
\| \| \| \| \| \|	The .note.GNU-stack section is not SystemZ/X86 specific. llvm-svn: 219796
*	R600: Use existing variable	Matt Arsenault	2014-10-15	1	-1/+1
\| \| \| \|	llvm-svn: 219778
*	R600: Remove outdated comment	Matt Arsenault	2014-10-15	1	-3/+0
\| \| \| \|	llvm-svn: 219777
*	Revert "[FastISel][AArch64] Add custom lowering for GEPs."	Juergen Ributzka	2014-10-15	3	-105/+3
\| \| \| \| \| \|	This breaks our internal build bots. Reverting it to get the bots green again. llvm-svn: 219776
*	[MachineSink] Use the real post dominator tree	Jingyue Wu	2014-10-15	4	-23/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in isPostDominatedBy, use the real MachinePostDominatorTree and MachineLoopInfo. The old heuristics caused instructions to sink unnecessarily, and might create register pressure. This is the second try of the fix. The first one (D4814) caused a performance regression due to failing to sink instructions out of loops (PR21115). This patch fixes PR21115 by sinking an instruction from a deeper loop to a shallower one regardless of whether the target block post-dominates the source. Thanks Alexey Volkov for reporting PR21115! Test Plan: Added a NVPTX codegen test to verify that our change prevents the backend from over-sinking. It also shows the unnecessary register pressure caused by over-sinking. Added an X86 test to verify we can sink instructions out of loops regardless of the dominance relationship. This test is reduced from Alexey's test in PR21115. Updated an affected test in X86. Also ran SPEC CINT2006 and llvm-test-suite for compilation time and runtime performance. Results are attached separately in the review thread. Reviewers: Jiangning, resistor, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, bruno, volkalexey, llvm-commits, meheff, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D5633 llvm-svn: 219773
*	ARM: drop check for triple that's no longer used.	Tim Northover	2014-10-15	1	-3/+2
\| \| \| \| \| \| \| \| \| \|	Early attempts to support AAPCS bare metal MachO targets based the decision on the CPU being compiled for. This was not a particularly great idea and we've got a better option now, but this check remained. No functional change for any target we care about. llvm-svn: 219767
*	Remove unused variable.	Eric Christopher	2014-10-15	1	-1/+0
\| \| \| \|	llvm-svn: 219750
*	No need to cache this unused variable.	Eric Christopher	2014-10-14	1	-3/+1
\| \| \| \| \| \|	Patch by Ehsan Akhgari. llvm-svn: 219749
*	[AArch64] Wrong CC access in CSINC-conditional branch sequence	Gerolf Hoflehner	2014-10-14	1	-5/+1
\| \| \| \| \| \| \| \|	This is a follow-up to commit r219742. It removes the CCInMI variable and accesses the CC in CSINC directly. In the case of a conditional branch accessing the CC with CCInMI was wrong. llvm-svn: 219748
*	[llvm-objdump] Update error message and add test case for mach-o file with bad library ordinals	Nick Kledzik	2014-10-14	3	-1/+7
\| \| \| \| \| \|	llvm-svn: 219746
*	[AArch64] Optimize CSINC-branch sequence	Gerolf Hoflehner	2014-10-14	5	-29/+176
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Peephole optimization that generates a single conditional branch for csinc-branch sequences like in the examples below. This is possible when the csinc sets or clears a register based on a condition code and the branch checks that register. Also the condition code may not be modified between the csinc and the original branch. Examples: 1. Convert csinc w9, wzr, wzr, <CC>;tbnz w9, #0, 0x44 to b.<invCC> 2. Convert csinc w9, wzr, wzr, <CC>; tbz w9, #0, 0x44 to b.<CC> rdar://problem/18506500 llvm-svn: 219742
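The two rewrites listed above follow from the semantics of `csinc`, which selects its first source when the condition holds and otherwise its second source plus one. The sketch below models only that reasoning, with instructions reduced to strings and booleans; the helper names are hypothetical and this is not the peephole pass itself.

```python
# AArch64 condition codes and their inverses.
INVERT = {
    "eq": "ne", "ne": "eq",
    "cs": "cc", "cc": "cs",
    "mi": "pl", "pl": "mi",
    "vs": "vc", "vc": "vs",
    "hi": "ls", "ls": "hi",
    "ge": "lt", "lt": "ge",
    "gt": "le", "le": "gt",
}


def csinc(cond_holds: bool, rn: int, rm: int) -> int:
    """csinc Rd, Rn, Rm, cond: Rd = Rn if cond holds, else Rm + 1."""
    return rn if cond_holds else rm + 1


def branch_taken(branch: str, reg: int) -> bool:
    """tbnz/tbz on bit 0: branch if the bit is nonzero / zero."""
    bit = reg & 1
    return bit != 0 if branch == "tbnz" else bit == 0


def fold_csinc_branch(cc: str, branch: str) -> str:
    """Single conditional branch equivalent to
    'csinc w9, wzr, wzr, <cc>' followed by tbnz/tbz on bit 0 of w9."""
    if branch == "tbnz":
        return "b." + INVERT[cc]
    if branch == "tbz":
        return "b." + cc
    raise ValueError("expected tbnz or tbz")


# With both sources zero, w9 is 0 when the condition holds and 1 otherwise,
# so tbnz fires exactly when the condition is false.
for holds in (True, False):
    w9 = csinc(holds, 0, 0)
    print(holds, w9, branch_taken("tbnz", w9), branch_taken("tbz", w9))
print(fold_csinc_branch("eq", "tbnz"), fold_csinc_branch("eq", "tbz"))
```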
*	[LoopVectorize] Ignore @llvm.assume for cost estimates and legality	Hal Finkel	2014-10-14	3	-4/+133
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	A few minor changes to prevent @llvm.assume from interfering with loop vectorization. First, treat @llvm.assume like the lifetime intrinsics, which are scalarized (but don't otherwise interfere with the legality checking). Second, ignore the cost of ephemeral instructions in the loop (these will go away anyway during CodeGen). Alignment assumptions and other uses of @llvm.assume can often end up inside of loops that should be vectorized (this is not uncommon for assumptions generated by __attribute__((align_value(n))), for example). llvm-svn: 219741
*	MC, COFF: Make bigobj test compatible with python3	David Majnemer	2014-10-14	1	-3/+3
\| \| \| \| \| \|	No functionality change intended. llvm-svn: 219739