bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Create llvm.addressofreturnaddress intrinsic	Albert Gutowski	2016-10-12	2	-0/+38
	Summary: We need a new LLVM intrinsic to implement MS _AddressOfReturnAddress builtin on 64-bit Windows. Reviewers: majnemer, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25293 llvm-svn: 284061
*	Reapply "[LoopUnroll] Use the upper bound of the loop trip count to fully unroll a loop"	Haicheng Wu	2016-10-12	1	-0/+43
	Reapply r284044 after revert in r284051. Krzysztof fixed the error in r284049. The original summary:
	This patch tries to fully unroll loops having a break statement like this:
	  for (int i = 0; i < 8; i++) {
	    if (a[i] == value) {
	      found = true;
	      break;
	    }
	  }
	GCC can fully unroll such loops, but currently LLVM cannot because LLVM only supports loops having exact constant trip counts. The upper bound of the trip count can be obtained from calling ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the refactoring work in SCEV to prevent duplicating code.
	The feature of using the upper bound is enabled under the same circumstance when runtime unrolling is enabled since both are used to unroll loops without knowing the exact constant trip count.
	llvm-svn: 284053
*	[MIRParser] Parse lane masks for register live-ins	Krzysztof Parzyszek	2016-10-12	1	-0/+23
	Differential Revision: https://reviews.llvm.org/D25530 llvm-svn: 284052
*	Revert "[LoopUnroll] Use the upper bound of the loop trip count to fully unroll a loop"	Haicheng Wu	2016-10-12	1	-43/+0
	This reverts commit r284044. llvm-svn: 284051
*	Fix testcases failing after r284036	Krzysztof Parzyszek	2016-10-12	2	-4/+2
	The codegen has changed slightly between my tests and the commit. llvm-svn: 284049
*	[LoopUnroll] Use the upper bound of the loop trip count to fully unroll a loop	Haicheng Wu	2016-10-12	1	-0/+43
	This patch tries to fully unroll loops having a break statement like this:
	  for (int i = 0; i < 8; i++) {
	    if (a[i] == value) {
	      found = true;
	      break;
	    }
	  }
	GCC can fully unroll such loops, but currently LLVM cannot because LLVM only supports loops having exact constant trip counts. The upper bound of the trip count can be obtained from calling ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the refactoring work in SCEV to prevent duplicating code.
	The feature of using the upper bound is enabled under the same circumstance when runtime unrolling is enabled since both are used to unroll loops without knowing the exact constant trip count.
	Differential Revision: https://reviews.llvm.org/D24790 llvm-svn: 284044
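As a rough illustration of what unrolling by the upper bound means for the loop quoted in the message, the sketch below places the rolled loop next to a hand-unrolled equivalent. The function names are made up for this example, and the unrolled form is only a picture of the idea, not the IR that LoopUnroll actually emits.

```cpp
#include <array>
#include <cassert>

// Rolled form, as in the commit message: at most 8 iterations, but the
// break means the exact trip count is not known at compile time.
inline bool containsRolled(const std::array<int, 8> &a, int value) {
  bool found = false;
  for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
      found = true;
      break;
    }
  }
  return found;
}

// Hand-written picture of a full unroll by the upper bound of 8: one copy
// of the body per possible iteration, each keeping its early exit.
inline bool containsUnrolled(const std::array<int, 8> &a, int value) {
  if (a[0] == value) return true;
  if (a[1] == value) return true;
  if (a[2] == value) return true;
  if (a[3] == value) return true;
  if (a[4] == value) return true;
  if (a[5] == value) return true;
  if (a[6] == value) return true;
  if (a[7] == value) return true;
  return false;
}
```

Both functions return the same result for every input; the second simply has no back edge left.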
*	LTO: Use the correct mangler function in LTOCodeGenerator::applyScopeRestrictions().	Peter Collingbourne	2016-10-12	1	-0/+10
	We need to use the overload of Mangler::getNameWithPrefix that takes a GlobalValue in order to mangle in the stdcall stack byte count for Windows targets. Differential Revision: https://reviews.llvm.org/D25529 llvm-svn: 284040
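For readers unfamiliar with the "stdcall stack byte count", the sketch below shows the usual 32-bit x86 Windows decoration, where a stdcall function `name` taking N bytes of arguments is emitted as `_name@N`. The helper `stdcallDecorate` is hypothetical and is not part of LLVM's Mangler; it only illustrates why the mangler needs the GlobalValue (to learn the argument size) rather than just the name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not LLVM's API: builds the conventional 32-bit x86
// Windows stdcall symbol name from a plain name and the total size of the
// arguments in bytes.
inline std::string stdcallDecorate(const std::string &name,
                                   std::size_t argBytes) {
  return "_" + name + "@" + std::to_string(argBytes);
}
```

A function taking two 4-byte arguments would therefore be looked up as `_foo@8`, which a name-only mangling cannot produce.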
*	Do not remove implicit defs in BranchFolder	Krzysztof Parzyszek	2016-10-12	2	-0/+30
	Branch folder removes implicit defs if they are the only non-branching instructions in a block, and the branches do not use the defined registers. The problem is that in some cases these implicit defs are required for the liveness information to be correct. Differential Revision: https://reviews.llvm.org/D25478 llvm-svn: 284036
*	AMDGPU: Initial implementation of VGPR indexing mode	Matt Arsenault	2016-10-12	1	-62/+225
	This is the most basic handling of the indirect access pseudos using GPR indexing mode. This currently only enables the mode for a single v_mov_b32 and then disables it. This is much more complicated to use than the movrel instructions, so a new optimization pass is probably needed to fold the access into the uses and keep the mode enabled for them. llvm-svn: 284031
*	[ThinLTO] Don't link module level assembly when importing	Teresa Johnson	2016-10-12	2	-0/+43
	Module inline asm was always being linked/concatenated when running the IRLinker. This is correct for full LTO but not when we are importing for ThinLTO, as it can result in multiply defined symbols when the module asm defines a global symbol. In order to test with llvm-lto2, I had to work around PR30396, where a symbol that is defined in module assembly but defined in the LLVM IR appears twice. Added workaround to llvm-lto2 with a FIXME. Fixes PR30610. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25359 llvm-svn: 284030
*	[SimplifyCFG] Don't create PHI nodes for constant bundle operands	Sanjoy Das	2016-10-12	1	-0/+26
	Summary: Constant bundle operands may need to retain their constant-ness for correctness. I'll admit that this is slightly odd, but it looks like SimplifyCFG already does this for things like @llvm.frameaddress and @llvm.stackmap, so I suppose adding one more case is not a big deal. It is possible to add a mechanism to denote bundle operands that need to remain constants, but that's probably too complicated for the time being. Reviewers: jmolloy Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D25502 llvm-svn: 284028
*	AMDGPU: Add instruction definitions for VGPR indexing	Matt Arsenault	2016-10-12	4	-4/+46
	VI added a second method of indexing into VGPRs besides using v_movrel* llvm-svn: 284027
*	[X86] Add the v4i32 flavor test-case for pr30371	Zvi Rackover	2016-10-12	1	-5/+60
	llvm-svn: 284025
*	AMDGPU/SI: Change mimg intrinsic signatures	Tom Stellard	2016-10-12	3	-31/+49
	This makes more fields overridable and removes redundant bits. Patch by: Changpeng Fang llvm-svn: 284024
*	[ValueTracking] An improvement to IR ValueTracking on Non-negative Integers	Artur Pilipenko	2016-10-12	1	-2/+2
	Since this change is known to cause performance degradations in some cases, it is committed under a temporary flag which is turned off by default. Patch by Li Huang Differential Revision: https://reviews.llvm.org/D18777 llvm-svn: 284022
*	[MC] Fix Error Location for ParseIdentifier	Nirav Dave	2016-10-12	2	-2/+2
	Prevent partial parsing of '$' or '@' of invalid identifiers and fixup workaround points. NFC Intended. llvm-svn: 284017
*	[DAGCombiner] Update most ADD combines to support general vector combines	Simon Pilgrim	2016-10-12	1	-50/+27
	Add a number of helper functions to match scalar or vector equivalent constant/splat values to allow most of the combine patterns to be used by vectors. Differential Revision: https://reviews.llvm.org/D25374 llvm-svn: 284015
*	[DAGCombiner] Do not remove the load of stored values when optimizations are disabled	Konstantin Zhuravlyov	2016-10-12	5	-8/+52
	This combiner breaks debug experience and should not be run when optimizations are disabled. For example:
	  int main() {
	    int j = 0;
	    j += 2;
	    if (j == 2)
	      return 0;
	    return 5;
	  }
	When debugging this code compiled in /O0, it should be valid to break at line "j+=2;" and edit the value of j. It should change the return value of the function.
	Differential Revision: https://reviews.llvm.org/D19268 llvm-svn: 284014
*	[CVP] Convert an AShr to a LShr if 1st operand is known to be nonnegative.	Chad Rosier	2016-10-12	1	-0/+56
	An arithmetic shift can be safely changed to a logical shift if the first operand is known to be nonnegative. This allows ComputeKnownBits (and similar analyses) to determine the sign bit of the shifted value in some cases. In turn, this allows InstCombine to canonicalize a signed comparison (a > 0) into an equality check (a != 0). PR30577 Differential Revision: https://reviews.llvm.org/D25119 llvm-svn: 284013
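The equivalence the pass relies on can be checked with a small sketch. The helpers `ashr` and `lshr` below are stand-ins written for this example (they are not LLVM functions) and model 32-bit arithmetic and logical right shifts; shift amounts are assumed to be less than 32.

```cpp
#include <cassert>
#include <cstdint>

// Logical shift right: vacated high bits are filled with zeros.
inline std::uint32_t lshr(std::int32_t x, unsigned s) {
  return static_cast<std::uint32_t>(x) >> s;
}

// Arithmetic shift right: vacated high bits are filled with the sign bit.
// Written so that a negative value is never shifted directly, which keeps
// the result independent of pre-C++20 implementation-defined behaviour.
inline std::int32_t ashr(std::int32_t x, unsigned s) {
  return x >= 0 ? (x >> s) : ~((~x) >> s);
}
```

For a nonnegative first operand the sign bit is zero, so both shifts fill with zeros and agree; for a negative operand they differ, which is why the nonnegativity proof is required.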
*	[InstCombine] Fix constexpr issue in select combining	Simon Pilgrim	2016-10-12	1	-0/+44
	As discussed by Andrea on PR30486, we have an unsafe cast to an Instruction type in the select combine which doesn't take into account that it could be a ConstantExpr instead. Differential Revision: https://reviews.llvm.org/D25466 llvm-svn: 284000
*	Add AArch64 unit tests	Diana Picus	2016-10-12	3	-208/+0
	Add unit tests for checking a few tricky instruction sizes. Also remove the old tests for the instruction sizes, which were clunky and brittle. Since this is the first set of target-specific unit tests, we need to add some CMake plumbing. In the future, adding unit tests for a given target will be as simple as creating a directory with the same name as the target under unittests/Target. The tests are only run if the target is enabled in LLVM_TARGETS_TO_BUILD. Differential Revision: https://reviews.llvm.org/D24548 llvm-svn: 283990
*	[AArch64][InstructionSelector] Fix unintended test changes in r283973.	Quentin Colombet	2016-10-12	1	-3/+6
	I screwed up my merge conflict and lost some of the CHECK lines. llvm-svn: 283974
*	[AArch64][InstructionSelector] Teach the selector about G_BITCAST.	Quentin Colombet	2016-10-12	1	-9/+207
	llvm-svn: 283973
*	[AArch64][InstructionSelector] Refactor the handling of copies.	Quentin Colombet	2016-10-12	1	-5/+7
	Although Copies are not specific to preISel, we still have to assign them a proper register class. However, given they are not constrained to anything we do not have to handle the source register at the copy. It will be properly mapped when reaching the related definition. In the process, the handling of G_ANYEXT is slightly modified as those end up being selected as copy. The difference is that when register sizes do not match on both sides, we need to insert a SUBREG_TO_REG operation, otherwise the post RA copy expansion will not be happy! llvm-svn: 283972
*	[AArch64][InstructionSelector] Fix typos in the related mir file. NFC.	Quentin Colombet	2016-10-12	1	-3/+4
	llvm-svn: 283971
*	[AArch64][MachineLegalizer] Mark more bitcasts as legal.	Quentin Colombet	2016-10-12	1	-0/+10
	Those are copies, we do not have to do any legalization action for them. llvm-svn: 283970
*	GVN-hoist: fix store past load dependence analysis (PR30216, PR30499)	Sebastian Pop	2016-10-12	2	-0/+82
	This is a refreshed version of a patch that was reverted: it fixes the problems reported in both PR30216 and PR30499, and contains all the test-cases from both bugs.
	To hoist stores past loads, we used to search for potential conflicting loads on the hoisting path by following a MemorySSA def-def link from the store to be hoisted to the previous defining memory access, and from there we followed the def-use chains to all the uses that occur on the hoisting path. The problem is that the def-def link may point to a store that does not alias with the store to be hoisted, and so the loads that are walked may not alias with the store to be hoisted, and even as in the testcase of PR30216, the loads that may alias with the store to be hoisted are not visited.
	The current patch visits all loads on the path from the store to be hoisted to the hoisting position and uses the alias analysis to ask whether the store may alias the load. I was not able to use the MemorySSA functionality to ask whether load and store are clobbered: I'm not sure which function to call, so I used a call to AA->isNoAlias().
	Store past store is still working as before using a MemorySSA query: I added an extra test to pr30216.ll to make sure store past store does not regress.
	Tested on x86_64-linux with check and a test-suite run.
	Differential Revision: https://reviews.llvm.org/D25476 llvm-svn: 283965
*	[PPCMIPeephole] Fix splat elimination	Tim Shen	2016-10-12	1	-0/+24
	Summary: In PPCMIPeephole, when we see two splat instructions, we can't simply do the following transformation:
	  B = Splat A
	  C = Splat B
	=>
	  C = Splat A
	because B may still be used between these two instructions. Instead, we should make the second Splat a PPC::COPY and let later passes decide whether to remove it or not:
	  B = Splat A
	  C = Splat B
	=>
	  B = Splat A
	  C = COPY B
	Fixes PR30663.
	Reviewers: echristo, iteratee, kbarton, nemanjai Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D25493 llvm-svn: 283961
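The safer rewrite can be modelled on a toy instruction list. Everything in the sketch below (`ToyInst`, `demoteRedundantSplats`, the string opcodes) is invented for illustration and bears no relation to the real MachineInstr-based pass; it only shows the shape of the transformation, where the first splat is kept and the second becomes a copy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction of the form `dst = op src`. Not LLVM's MachineInstr.
struct ToyInst {
  std::string op;  // "SPLAT", "COPY", or any other opcode name
  std::string dst;
  std::string src;
};

// If a SPLAT reads a register whose most recent definition is itself a
// SPLAT, splatting again is redundant, so demote it to a COPY. The earlier
// SPLAT is left in place because other instructions may still read it.
inline void demoteRedundantSplats(std::vector<ToyInst> &block) {
  for (std::size_t i = 0; i < block.size(); ++i) {
    if (block[i].op != "SPLAT")
      continue;
    // Walk backwards to the most recent definition of the source register.
    for (std::size_t j = i; j-- > 0;) {
      if (block[j].dst == block[i].src) {
        if (block[j].op == "SPLAT")
          block[i].op = "COPY";
        break;
      }
    }
  }
}
```

Because nothing is deleted, an instruction that reads B between the two splats keeps seeing a valid definition, and a later cleanup pass can still remove the copy if it turns out to be dead.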
*	[DAG] Fix crash in build_vector -> vector_shuffle combine	Michael Kuperstein	2016-10-11	1	-0/+45
	Fixes a crash in the build_vector -> vector_shuffle combine when the first vector input is twice as wide as the output, and the second input vector is even wider. llvm-svn: 283953
*	GlobalISel: support same-size casts on AArch64.	Tim Northover	2016-10-11	1	-0/+31
	Mostly Ahmed's work again, I'm just sprucing things up slightly before committing. llvm-svn: 283952
*	[InstrProf] Add support for dead_strip+live_support functionality	Vedant Kumar	2016-10-11	3	-6/+6
	On Darwin, marking a section as "regular,live_support" means that a symbol in the section should only be kept live if it has a reference to something that is live. Otherwise, the linker is free to dead-strip it. Turn this functionality on for the __llvm_prf_data section. This means that counters and data associated with dead functions will be removed from dead-stripped binaries. This will result in smaller profiles and binaries, and should speed up profile collection. Tested with check-profile, llvm-lit test/tools/llvm-{cov,profdata}, and check-llvm. Differential Revision: https://reviews.llvm.org/D25456 llvm-svn: 283947
*	Re-land "[Thumb] Save/restore high registers in Thumb1 pro/epilogues"	Reid Kleckner	2016-10-11	3	-12/+252
	Reverts r283938 to reinstate r283867 with a fix. The original change had an ArrayRef referring to a destroyed temporary initializer list. Use plain C arrays instead. llvm-svn: 283942
*	Next set of additional error checks for invalid Mach-O files for the	Kevin Enderby	2016-10-11	3	-0/+6
	load commands that use the MachO::linker_option_command type, which is not used in llvm libObject code but is used in llvm tool code. This includes just the LC_LINKER_OPTION load command. llvm-svn: 283939
*	Revert "[Thumb] Save/restore high registers in Thumb1 pro/epilogues"	Reid Kleckner	2016-10-11	3	-252/+12
	This reverts r283867. This appears to be an infinite loop:
	  while (HiRegToSave != AllHighRegs.end() && CopyReg != AllCopyRegs.end()) {
	    if (HiRegsToSave.count(*HiRegToSave)) {
	      ...
	      CopyReg = findNextOrderedReg(++CopyReg, CopyRegs, AllCopyRegs.end());
	      HiRegToSave = findNextOrderedReg(++HiRegToSave, HiRegsToSave, AllHighRegs.end());
	    }
	  }
	llvm-svn: 283938
*	GlobalISel: support selection of extend operations.	Tim Northover	2016-10-11	1	-0/+104
	Patch mostly by Ahmed Bougacha. llvm-svn: 283937
*	Codegen: Tail-duplicate during placement.	Kyle Butt	2016-10-11	20	-47/+595
	The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement.
	In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places.
	This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, especially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough.
	Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout.
	Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block.
	Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll
	Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283934
*	[x86] add tests for negate bool	Sanjay Patel	2016-10-11	1	-0/+101
	llvm-svn: 283930
*	[sanitizer-coverage] use private linkage for coverage guards, delete old commented-out code.	Kostya Serebryany	2016-10-11	1	-1/+1
	llvm-svn: 283924
*	[AMDGPU] Fix test that was broken by rL283893	Konstantin Zhuravlyov	2016-10-11	1	-1/+3
	llvm-svn: 283911
*	[DAG] add fold for masked negated sign-extended bool	Sanjay Patel	2016-10-11	1	-9/+1
	This enhances the fold added with: https://reviews.llvm.org/rL283900 llvm-svn: 283905
*	[x86] add sext variants of tests added with r283894	Sanjay Patel	2016-10-11	1	-6/+51
	llvm-svn: 283903
*	Let test pass for builds that support X86, but do not default to it	Bernard Ogden	2016-10-11	1	-1/+1
	Differential Revision: https://reviews.llvm.org/D25471 llvm-svn: 283902
*	Fix test on non-x86 hosts	Bernard Ogden	2016-10-11	1	-1/+1
	Summary: This test is allowed to run on non-x86 hosts and thus must use llvm-nm rather than nm. Differential Revision: https://reviews.llvm.org/D25473 llvm-svn: 283901
*	[DAG] add fold for masked negated extended bool	Sanjay Patel	2016-10-11	1	-9/+1
	The non-obvious motivation for adding this fold (which already happens in InstCombine) is that we want to canonicalize IR towards select instructions and canonicalize DAG nodes towards boolean math. So we need to recreate some folds in the DAG to handle that change in direction. An interesting implementation difference for cases like this is that InstCombine generally works top-down while the DAG goes bottom-up. That means we need to detect different patterns. In this case, the SimplifyDemandedBits fold prevents us from performing a zext to sext fold that would then be recognized as a negation of a sext. llvm-svn: 283900
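The message does not spell out the pattern being folded. Assuming from the title that it has the shape "negate a zero-extended bool, then mask with 1", the identity behind such a fold can be sketched as follows. The function names are invented for this example and the 32-bit width is an arbitrary choice.

```cpp
#include <cassert>
#include <cstdint>

// Assumed input pattern: and(sub(0, zext(b)), 1).
inline std::uint32_t maskedNegatedZext(bool b) {
  std::uint32_t z = b ? 1u : 0u;  // zext(bool): 0 or 1
  return (0u - z) & 1u;           // negation gives 0 or all-ones; mask bit 0
}

// Simplified form: just zext(b).
inline std::uint32_t plainZext(bool b) { return b ? 1u : 0u; }
```

Negating 0 or 1 leaves the low bit unchanged, so masking with 1 recovers the original zero-extended value and the negation can be dropped.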
*	[x86] add tests to show missed folds for masked bools	Sanjay Patel	2016-10-11	1	-0/+48
	llvm-svn: 283894
*	AMDGPU/SI: Update ISA version numbers for Tonga and Polaris10/11.	Changpeng Fang	2016-10-11	1	-4/+8
	Differential Revision: http://reviews.llvm.org/D25454 Reviewers: tstellarAMD llvm-svn: 283893
*	[X86][SSE] Regenerate scalar i64 uitofp test	Simon Pilgrim	2016-10-11	1	-16/+41
	Added 32-bit target test llvm-svn: 283883
*	[X86][SSE] Regenerate vector load-trunc test	Simon Pilgrim	2016-10-11	1	-1/+20
	llvm-svn: 283881
*	[X86][SSE] Regenerate vsplit and tests	Simon Pilgrim	2016-10-11	1	-6/+53
	To make it more obvious how bad some of that truncation code is.... llvm-svn: 283880
*	[x86] update test to use FileCheck and auto-generate checks	Sanjay Patel	2016-10-11	1	-5/+15
	llvm-svn: 283876