bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	Introduce RegionInfoAnalysis, which compute Region Tree in the new ↵	Hongbin Zheng	2016-02-25	22	-0/+149
\| \| \| \| \| \| \| \|	PassManager. NFC Differential Revision: http://reviews.llvm.org/D17571 llvm-svn: 261904
*	Introduce DominanceFrontierAnalysis to the new PassManager to compute ↵	Hongbin Zheng	2016-02-25	10	-68/+137
\| \| \| \| \| \| \| \|	DominanceFrontier. NFC Differential Revision: http://reviews.llvm.org/D17570 llvm-svn: 261903
*	Introduce analysis pass to compute PostDominators in the new pass manager. NFC	Hongbin Zheng	2016-02-25	13	-85/+124
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D17537 llvm-svn: 261902
*	ARM: disallow pc as a base register in Thumb2 memory ops.	Tim Northover	2016-02-25	3	-2/+18
\| \| \| \| \| \| \|	These should all be deferring to the "OP (literal)" variant according to the ARM ARM. llvm-svn: 261895
*	Revert "Introduce analysis pass to compute PostDominators in the new pass ↵	Hongbin Zheng	2016-02-25	12	-120/+81
\| \| \| \| \| \| \| \|	manager. NFC" This reverts commit a3e5cc6a51ab5ad88d1760c63284294a4e34c018. llvm-svn: 261891
*	Revert "Introduce DominanceFrontierAnalysis to the new PassManager to ↵	Hongbin Zheng	2016-02-25	10	-137/+68
\| \| \| \| \| \| \| \|	compute DominanceFrontier. NFC" This reverts commit 109c38b2226a87b0be73fa7a0a8c1a81df20aeb2. llvm-svn: 261890
*	Revert "Introduce RegionInfoAnalysis, which compute Region Tree in the new ↵	Hongbin Zheng	2016-02-25	22	-149/+0
\| \| \| \| \| \| \| \|	PassManager. NFC" This reverts commit 8228b4d374edeb4cc0c5fddf6e1ab876918ee126. llvm-svn: 261889
*	rangify; NFCI	Sanjay Patel	2016-02-25	1	-54/+43
\| \| \| \|	llvm-svn: 261888
*	[AArch64] Clean up callee-save CFI emission. NFC.	Geoff Berry	2016-02-25	2	-46/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Avoid special case for FP, LR CFI emission and just allow general AArch64FrameLowering::emitCalleeSavedFrameMoves() to handle them. Also, stop recalculating the stack offsets in emitCalleeSavedFrameMoves() since we can just reuse the previously calculated offset stored in the MachineFrameInfo. Depends on D17000 Reviewers: t.p.northover, rengolin, mcrosier, jmolloy Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17004 llvm-svn: 261885
*	Introduce RegionInfoAnalysis, which compute Region Tree in the new ↵	Hongbin Zheng	2016-02-25	22	-0/+149
\| \| \| \| \| \| \| \|	PassManager. NFC Differential Revision: http://reviews.llvm.org/D17571 llvm-svn: 261884
*	Introduce DominanceFrontierAnalysis to the new PassManager to compute ↵	Hongbin Zheng	2016-02-25	10	-68/+137
\| \| \| \| \| \| \| \|	DominanceFrontier. NFC Differential Revision: http://reviews.llvm.org/D17570 llvm-svn: 261883
*	Introduce analysis pass to compute PostDominators in the new pass manager. NFC	Hongbin Zheng	2016-02-25	12	-81/+120
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D17537 llvm-svn: 261882
*	[AMDGPU] Disassembler: Support for all VOP1 instructions.	Nikolay Haustov	2016-02-25	4	-62/+492
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Support all instructions with VOP1 encoding with 32 or 64-bit operands for VI subtarget: VGPR_32 and VReg_64 operand register classes VS_32 and VS_64 operand register classes with inline and literal constants Tests for VOP1 instructions. Patch by: skolton Reviewers: arsenm, tstellarAMD Review: http://reviews.llvm.org/D17194 llvm-svn: 261878
*	don't repeat names in documentation comments; NFC	Sanjay Patel	2016-02-25	3	-222/+157
\| \| \| \|	llvm-svn: 261877
*	AVX512F: Add GATHER/SCATTER assembler Intel syntax tests for knl/skx/avx . ↵	Igor Breger	2016-02-25	10	-117/+1536
\| \| \| \| \| \| \| \|	Change memory operand parser handling. Differential Revision: http://reviews.llvm.org/D17564 llvm-svn: 261862
*	[mips][microMIPS] Implement DINSU, DINSM, DINS instructions	Hrvoje Varga	2016-02-25	12	-13/+94
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D16181 llvm-svn: 261860
*	[AMDGPU] Assembler: Simplify handling of optional operands	Nikolay Haustov	2016-02-25	3	-75/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Resubmit with index problem fixed. Verified with valgrind. Prepare to support DPP encodings. For DPP encodings, we want row_mask/bank_mask/bound_ctrl to be optional operands. However this means that when parsing instruction which has no mnemonic prefix, we cannot add both default values for VOP3 and for DPP optional operands to OperandVector - neither instructions would match. So add default values for optional operands to MCInst during conversion instead. Mark more operands as IsOptional = 1 in .td files. Do not add default values for optional operands to OperandVector in AMDGPUAsmParser. Add default values for optional operands during conversion using new helper addOptionalImmOperand. Change to cvtVOP3_2_mod to check instruction flag instead of presence of modifiers. In the future, cvtVOP3* functions can be combined into one. Separate cvtFlat and cvtFlatAtomic. Fix CNDMASK_B32 definition to have no modifiers. Review: http://reviews.llvm.org/D17445 llvm-svn: 261856
*	[PM] Add the IR unit type to the pass manager's logging and make all of	Chandler Carruth	2016-02-25	3	-112/+113
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	the testing more more explicit. This will currently fail on platforms without support for getTypeName. While an assert failure seems too harsh, I'm hoping we're OK with the regression test failure, and I'd like to find out about what platforms actually exist in this state if there are any so we can get implementations in place for them. But if we just can't fix all the host compilers to have a reasonably portable variant of getTypeName and are worried about xfailing this test on those platforms, I can add the horrible regular expression magic to make the tests support "unknown" here as well. llvm-svn: 261853
*	[X86][SSE3] Added combine support for MOVDDUP/MOVSHDUP/MOVSLDUP target shuffles	Simon Pilgrim	2016-02-25	2	-12/+9
\| \| \| \| \| \|	Now that PerformShuffleCombine can handle unary shuffles. llvm-svn: 261843
*	Revert r260064, "Disable llvm/test/tools/llvm-profdata/value-prof.proftext ↵	NAKAMURA Takumi	2016-02-25	1	-2/+0
\| \| \| \| \| \| \| \|	on win32 for now. Investigating." It seems unreproducible any more for me. llvm-svn: 261842
*	Revert r261742, "[AMDGPU] Assembler: Simplify handling of optional operands"	NAKAMURA Takumi	2016-02-25	3	-79/+75
\| \| \| \| \| \|	It brought undefined behavior. llvm-svn: 261839
*	PM: Implement a basic loop pass manager	Justin Bogner	2016-02-25	12	-4/+757
\| \| \| \| \| \| \| \| \| \| \|	This creates the new-style LoopPassManager and wires it up with dummy and print passes. This version doesn't support modifying the loop nest at all. It will be far easier to discuss and evaluate the approaches to that with this in place so that the boilerplate is out of the way. llvm-svn: 261831
*	Optimized loading (zextload) of i1 value from memory.	Elena Demikhovsky	2016-02-25	7	-43/+42
\| \| \| \| \| \| \| \| \| \| \|	This patch is a partial revert of https://llvm.org/svn/llvm-project/llvm/trunk@237793. Extra "and" causes performance degradation. We assume that i1 is stored in zero-extended form. And store operation is responsible for zeroing upper bits. Differential Revision: http://reviews.llvm.org/D17541 llvm-svn: 261828
*	[Support] Don't check for ICC directly and rely on the __GNUC__ check	Chandler Carruth	2016-02-25	1	-1/+1
\| \| \| \| \| \| \| \| \|	(which they emulate). This way we don't use that path when compiled with ICC on Windows where it mimics MSVC's behavior and supports __FUNCSIG__. Thanks for David Majnemer again for spotting this better pattern! llvm-svn: 261827
*	[Support] Add a fancy helper function to get a static name for a type.	Chandler Carruth	2016-02-25	3	-0/+115
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This extracts the type name from __PRETTY_FUNCTION__ for compilers that support it (I've opted Clang, GCC, and ICC into this as I've tested that they work) and from __FUNCSIG__ which is very similar on MSVC. The routine falls back gracefully on a stub "UNKNOWN_TYPE" string with compilers or formats it doesn't understand. This should be enough for a lot of common cases in LLVM where the real goal is just to log or print a type name as a debugging aid, and save a ton of boilerplate in the process. Notably, I'm planning to use this to remove all the getName() boiler plate from the new pass manager. The design and implementation is based on a bunch of advice and discussion with Richard Smith and experimenting with most versions of Clang and GCC. David Majnemer also provided excellent advice on how best to do this with MSVC. Richard also checked that ICC does something reasonable and I'll watch the build bots for other compilers. It'd be great if someone could contribute logic for xlC and/or other toolchains. Differential Revision: http://reviews.llvm.org/D17565 llvm-svn: 261819
*	IR: Make the X / undef -> undef fold match the comment	Justin Bogner	2016-02-25	2	-1/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The constant folding for sdiv and udiv has a big discrepancy between the comments and the code, which looks like a typo. Currently, we're folding X / undef pretty inconsistently: 0 / undef -> undef C / undef -> 0 undef / undef -> 0 Whereas the comments state we do X / undef -> undef. The logic that returns zero is actually commented as doing undef / X -> 0, despite that the LHS isn't undef in many of the cases that hit it. llvm-svn: 261813
*	[CodeGenPrepare] Remove load-based heuristic	Junmo Park	2016-02-25	4	-37/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Both the hardware and LLVM have changed since 2012. Now, load-based heuristic don't show big differences any more on OoO cores. There is no notable regressons and improvements on spec2000/2006. (Cortex-A57, Core i5). Reviewers: spatel, zansari Differential Revision: http://reviews.llvm.org/D16836 llvm-svn: 261809
*	Move test/CodeGen/Generic/pr26652.ll to test/CodeGen/X86/pr26652.ll and test ↵	Cong Hou	2016-02-25	1	-1/+2
\| \| \| \| \| \|	it only on X86. llvm-svn: 261807
*	fix typo	Sanjay Patel	2016-02-24	1	-1/+1
\| \| \| \|	llvm-svn: 261805
*	Detecte vector reduction operations just before instruction selection.	Cong Hou	2016-02-24	4	-0/+379
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(This is the second attemp to commit this patch, after fixing pr26652 & pr26653). This patch detects vector reductions before instruction selection. Vector reductions are vectorized reduction operations, and for such operations we have freedom to reorganize the elements of the result as long as the reduction of them stay unchanged. This will enable some reduction pattern recognition during instruction combine such as SAD/dot-product on X86. A flag is added to SDNodeFlags to mark those vector reduction nodes to be checked during instruction combine. To detect those vector reductions, we search def-use chains starting from the given instruction, and check if all uses fall into two categories: 1. Reduction with another vector. 2. Reduction on all elements. in which 2 is detected by recognizing the pattern that the loop vectorizer generates to reduce all elements in the vector outside of the loop, which includes several ShuffleVector and one ExtractElement instructions. Differential revision: http://reviews.llvm.org/D15250 llvm-svn: 261804
*	add tests to show missing bitcasted logic transform	Sanjay Patel	2016-02-24	2	-0/+83
\| \| \| \|	llvm-svn: 261799
*	Add capability to push/pop DFI in MCStreamer. NFC	Amaury Sechet	2016-02-24	2	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is extracted from D17555 Reviewers: davidxl, reames, sanjoy, MatzeB, pete Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17579 llvm-svn: 261796
*	[asan] Do not instrument globals in the special "LLVM" sections	Anna Zaks	2016-02-24	2	-1/+3
\| \| \| \|	llvm-svn: 261794
*	MachineInstr: Respect register aliases in clearRegiserKills()	Matthias Braun	2016-02-24	3	-3/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes bugs in copy elimination code in llvm. It slightly changes the semantics of clearRegisterKills(). This is appropriate because: - Users in lib/CodeGen/MachineCopyPropagation.cpp and lib/Target/AArch64RedundantCopyElimination.cpp and lib/Target/SystemZ/SystemZElimCompare.cpp are incorrect without it (see included testcase). - All other users in llvm are unaffected (they pass TRI==nullptr) - (Kill flags are optional anyway so removing too many shouldn't hurt.) Differential Revision: http://reviews.llvm.org/D17554 llvm-svn: 261763
*	AArch64: remove CRC feature from Cyclone.	Tim Northover	2016-02-24	2	-1/+27
\| \| \| \| \| \|	Turns out we don't actually support those instructions. llvm-svn: 261759
*	[ThinLTO] Add missing breaks when parsing summaries (NFC)	Teresa Johnson	2016-02-24	1	-0/+2
\| \| \| \| \| \| \|	This wasn't causing a correctness issue, but was causing extra duplicate entries to be added to the SummaryMap. llvm-svn: 261757
*	[SimplifyCFG] Use a more elegant solution than r261731	David Majnemer	2016-02-24	1	-11/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cleanupret instruction has an invariant that it's 'from' operand be a cleanuppad. This invariant was violated when we removed a dead block which removed a cleanuppad leaving behind a cleanupret with an undef 'from' operand. This was solved in r261731 by staving off the removal of the dead block to a later pass. However, it occured to me that we do not need to do this. Instead, we can simply avoid processing the cleanupret if it has an undef 'from' operand because we know that it will be removed soon. llvm-svn: 261754
*	[X86][SSSE3] Added target shuffle combine tests for SSE3/SSSE3 specific ↵	Simon Pilgrim	2016-02-24	1	-0/+67
\| \| \| \| \| \| \| \|	shuffles. Allows us to test SSSE3 PSHUFB intrinsic. llvm-svn: 261753
*	remove fixme comment that was fixed with r261750	Sanjay Patel	2016-02-24	1	-1/+1
\| \| \| \|	llvm-svn: 261752
*	[InstCombine] enable optimization of casted vector xor instructions	Sanjay Patel	2016-02-24	2	-21/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is part of the payoff for the refactoring in: http://reviews.llvm.org/rL261649 http://reviews.llvm.org/rL261707 In addition to removing a pile of duplicated code, the xor case was missing the optimization for vector types because it checked "SrcTy->isIntegerTy()" rather than "SrcTy->isIntOrIntVectorTy()" like 'and' and 'or' were already doing. This solves part of: https://llvm.org/bugs/show_bug.cgi?id=26702 llvm-svn: 261750
*	add test to show missing bitcasted vector xor fold	Sanjay Patel	2016-02-24	1	-0/+15
\| \| \| \|	llvm-svn: 261748
*	`MSP430InstrInfo::loadRegFromStackSlot` forgets to set register def.	Anton Korobeynikov	2016-02-24	2	-2/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: For instance, compiling the below results in a panic: ``` llc: ../lib/CodeGen/InlineSpiller.cpp:1140: bool (anonymous namespace)::InlineSpiller::foldMemoryOperand(ArrayRef<std::pair<MachineInstr , unsigned int> >, llvm::MachineInstr ): Assertion `MO->isDead() && "Cannot fold physreg def"' failed. #0 0x00007f50fbcf353e llvm::sys::PrintStackTrace(llvm::raw_ostream&) /home/h/3rd/llvm/build/../lib/Support/Unix/Signals.inc:321:15 #1 0x00007f50fbcf3929 PrintStackTraceSignalHandler(void) /home/h/3rd/llvm/build/../lib/Support/Unix/Signals.inc:380:1 #2 0x00007f50fbcf22a3 llvm::sys::RunSignalHandlers() /home/h/3rd/llvm/build/../lib/Support/Signals.cpp:45:5 #3 0x00007f50fbcf3bb4 SignalHandler(int) /home/h/3rd/llvm/build/../lib/Support/Unix/Signals.inc:210:1 #4 0x00007f50fa87a180 (/lib/x86_64-linux-gnu/libc.so.6+0x35180) #5 0x00007f50fa87a107 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x35107) #6 0x00007f50fa87b4e8 abort (/lib/x86_64-linux-gnu/libc.so.6+0x364e8) #7 0x00007f50fa873226 (/lib/x86_64-linux-gnu/libc.so.6+0x2e226) #8 0x00007f50fa8732d2 (/lib/x86_64-linux-gnu/libc.so.6+0x2e2d2) #9 0x00007f50fddd9287 (anonymous namespace)::InlineSpiller::foldMemoryOperand(llvm::ArrayRef<std::pair<llvm::MachineInstr, unsigned int> >, llvm::MachineInstr) /home/h/3rd/llvm/build/../lib/CodeGen/InlineSpiller.cpp:1141:21 #10 0x00007f50fddd9ee9 (anonymous namespace)::InlineSpiller::spillAroundUses(unsigned int) /home/h/3rd/llvm/build/../lib/CodeGen/InlineSpiller.cpp:1286:9 #11 0x00007f50fddd388b (anonymous namespace)::InlineSpiller::spillAll() /home/h/3rd/llvm/build/../lib/CodeGen/InlineSpiller.cpp:1338:21 #12 0x00007f50fddd221d (anonymous namespace)::InlineSpiller::spill(llvm::LiveRangeEdit&) /home/h/3rd/llvm/build/../lib/CodeGen/InlineSpiller.cpp:1391:3 #13 0x00007f50fdfd921b (anonymous namespace)::RAGreedy::selectOrSplitImpl(llvm::LiveInterval&, llvm::SmallVectorImpl<unsigned int>&, llvm::SmallSet<unsigned int, 16u, std::less<unsigned int> >&, unsigned int) /home/h/3rd/llvm/build/../lib/CodeGen/RegAllocGreedy.cpp:2555:5 #14 0x00007f50fdfd647b (anonymous namespace)::RAGreedy::selectOrSplit(llvm::LiveInterval&, llvm::SmallVectorImpl<unsigned int>&) /home/h/3rd/llvm/build/../lib/CodeGen/RegAllocGreedy.cpp:2221:12 #15 0x00007f50fdfc89f9 llvm::RegAllocBase::allocatePhysRegs() /home/h/3rd/llvm/build/../lib/CodeGen/RegAllocBase.cpp:110:14 #16 0x00007f50fdfd6337 (anonymous namespace)::RAGreedy::runOnMachineFunction(llvm::MachineFunction&) /home/h/3rd/llvm/build/../lib/CodeGen/RegAllocGreedy.cpp:2611:3 #17 0x00007f50fded33ee llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /home/h/3rd/llvm/build/../lib/CodeGen/MachineFunctionPass.cpp:43:3 #18 0x00007f50fd6cdc6f llvm::FPPassManager::runOnFunction(llvm::Function&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1550:23 #19 0x00007f50fd6cdf85 llvm::FPPassManager::runOnModule(llvm::Module&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1571:16 #20 0x00007f50fd6ce71a (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1627:23 #21 0x00007f50fd6ce246 llvm::legacy::PassManagerImpl::run(llvm::Module&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1730:16 #22 0x00007f50fd6cec31 llvm::legacy::PassManager::run(llvm::Module&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1761:3 #23 0x0000000000415bdc compileModule(char, llvm::LLVMContext&) /home/h/3rd/llvm/build/../tools/llc/llc.cpp:405:5 #24 0x0000000000414571 main /home/h/3rd/llvm/build/../tools/llc/llc.cpp:211:13 #25 0x00007f50fa866b45 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b45) #26 0x0000000000414296 _start (/home/h/3rd/llvm/build/bin/llc+0x414296) Stack dump: 0. Program arguments: ./bin/llc -mtriple msp430 loadstore.ll 1. Running pass 'Function Pass Manager' on module 'loadstore.ll'. 2. Running pass 'Greedy Register Allocator' on function '@inc' ``` Original IR: ```llvm %struct.VeryLarge = type { i8, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 } ; Function Attrs: norecurse nounwind define void @inc(%struct.VeryLarge noalias nocapture sret %agg.result, %struct.VeryLarge* byval align 1 %s) #0 { entry: %p0 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 0 %0 = load i8, i8* %p0, align 1, !tbaa !1 %p1 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 1 %1 = load i32, i32* %p1, align 1, !tbaa !6 %p2 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 2 %2 = load i32, i32* %p2, align 1, !tbaa !7 %p3 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 3 %3 = load i32, i32* %p3, align 1, !tbaa !8 %p4 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 4 %4 = load i32, i32* %p4, align 1, !tbaa !9 %p5 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 5 %5 = load i32, i32* %p5, align 1, !tbaa !10 %p6 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 6 %6 = load i32, i32* %p6, align 1, !tbaa !11 %p7 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 7 %7 = load i32, i32* %p7, align 1, !tbaa !12 %p8 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 8 %8 = load i32, i32* %p8, align 1, !tbaa !13 %p9 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 9 %9 = load i32, i32* %p9, align 1, !tbaa !14 %p10 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 10 %10 = load i32, i32* %p10, align 1, !tbaa !15 %p11 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 11 %11 = load i32, i32* %p11, align 1, !tbaa !16 %p12 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 12 %12 = load i32, i32* %p12, align 1, !tbaa !17 %p13 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 13 %13 = load i32, i32* %p13, align 1, !tbaa !18 %p14 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 14 %14 = load i32, i32* %p14, align 1, !tbaa !19 %p15 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 15 %15 = load i32, i32* %p15, align 1, !tbaa !20 %p16 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 16 %16 = load i32, i32* %p16, align 1, !tbaa !21 %p17 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 17 %17 = load i32, i32* %p17, align 1, !tbaa !22 %p18 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 18 %18 = load i32, i32* %p18, align 1, !tbaa !23 %p19 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 19 %19 = load i32, i32* %p19, align 1, !tbaa !24 %p20 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 20 %20 = load i32, i32* %p20, align 1, !tbaa !25 %p21 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 21 %21 = load i32, i32* %p21, align 1, !tbaa !26 %p22 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 22 %22 = load i32, i32* %p22, align 1, !tbaa !27 %p23 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 23 %23 = load i32, i32* %p23, align 1, !tbaa !28 %p24 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 24 %24 = load i32, i32* %p24, align 1, !tbaa !29 %p25 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 25 %25 = load i32, i32* %p25, align 1, !tbaa !30 %p26 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 26 %26 = load i32, i32* %p26, align 1, !tbaa !31 %p27 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 27 %27 = load i32, i32* %p27, align 1, !tbaa !32 %p28 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 28 %28 = load i32, i32* %p28, align 1, !tbaa !33 %p29 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 29 %29 = load i32, i32* %p29, align 1, !tbaa !34 %p30 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 30 %30 = load i32, i32* %p30, align 1, !tbaa !35 %p31 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 31 %31 = load i32, i32* %p31, align 1, !tbaa !36 %p32 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 32 %32 = load i32, i32* %p32, align 1, !tbaa !37 %add = add i8 %0, 1 store i8 %add, i8* %p0, align 1, !tbaa !1 %add2 = add i32 %1, 2 store i32 %add2, i32* %p1, align 1, !tbaa !6 %add3 = add i32 %2, 3 store i32 %add3, i32* %p2, align 1, !tbaa !7 %add4 = add i32 %3, 4 store i32 %add4, i32* %p3, align 1, !tbaa !8 %add5 = add i32 %4, 5 store i32 %add5, i32* %p4, align 1, !tbaa !9 %add6 = add i32 %5, 6 store i32 %add6, i32* %p5, align 1, !tbaa !10 %add7 = add i32 %6, 7 store i32 %add7, i32* %p6, align 1, !tbaa !11 %add8 = add i32 %7, 8 store i32 %add8, i32* %p7, align 1, !tbaa !12 %add9 = add i32 %8, 9 store i32 %add9, i32* %p8, align 1, !tbaa !13 %add10 = add i32 %9, 10 store i32 %add10, i32* %p9, align 1, !tbaa !14 %add11 = add i32 %10, 11 store i32 %add11, i32* %p10, align 1, !tbaa !15 %add12 = add i32 %11, 12 store i32 %add12, i32* %p11, align 1, !tbaa !16 %add13 = add i32 %12, 13 store i32 %add13, i32* %p12, align 1, !tbaa !17 %add14 = add i32 %13, 14 store i32 %add14, i32* %p13, align 1, !tbaa !18 %add15 = add i32 %14, 15 store i32 %add15, i32* %p14, align 1, !tbaa !19 %add16 = add i32 %15, 16 store i32 %add16, i32* %p15, align 1, !tbaa !20 %add17 = add i32 %16, 17 store i32 %add17, i32* %p16, align 1, !tbaa !21 %add18 = add i32 %17, 18 store i32 %add18, i32* %p17, align 1, !tbaa !22 %add19 = add i32 %18, 19 store i32 %add19, i32* %p18, align 1, !tbaa !23 %add20 = add i32 %19, 20 store i32 %add20, i32* %p19, align 1, !tbaa !24 %add21 = add i32 %20, 21 store i32 %add21, i32* %p20, align 1, !tbaa !25 %add22 = add i32 %21, 22 store i32 %add22, i32* %p21, align 1, !tbaa !26 %add23 = add i32 %22, 23 store i32 %add23, i32* %p22, align 1, !tbaa !27 %add24 = add i32 %23, 24 store i32 %add24, i32* %p23, align 1, !tbaa !28 %add25 = add i32 %24, 25 store i32 %add25, i32* %p24, align 1, !tbaa !29 %add26 = add i32 %25, 26 store i32 %add26, i32* %p25, align 1, !tbaa !30 %add27 = add i32 %26, 27 store i32 %add27, i32* %p26, align 1, !tbaa !31 %add28 = add i32 %27, 28 store i32 %add28, i32* %p27, align 1, !tbaa !32 %add29 = add i32 %28, 29 store i32 %add29, i32* %p28, align 1, !tbaa !33 %add30 = add i32 %29, 30 store i32 %add30, i32* %p29, align 1, !tbaa !34 %add31 = add i32 %30, 31 store i32 %add31, i32* %p30, align 1, !tbaa !35 %add32 = add i32 %31, 32 store i32 %add32, i32* %p31, align 1, !tbaa !36 %add33 = add i32 %32, 33 store i32 %add33, i32* %p32, align 1, !tbaa !37 %33 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %agg.result, i32 0, i32 0 call void @llvm.memcpy.p0i8.p0i8.i32(i8* %33, i8* %p0, i32 129, i32 1, i1 false), !tbaa.struct !38 ret void } ; Function Attrs: argmemonly nounwind declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture readonly, i32, i32, i1) #1 attributes #0 = { norecurse nounwind "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } attributes #1 = { argmemonly nounwind } !llvm.ident = !{!0} !0 = !{!"clang version 3.8.0 (git://github.com/llvm-mirror/clang 40ef2b7531472c41212c4719a9294aeb7bddebbc) (git://github.com/llvm-mirror/llvm c601eaf55606dfb9ad372b514b77aa00d1409be1)"} !1 = !{!2, !3, i64 0} !2 = !{!"", !3, i64 0, !5, i64 1, !5, i64 5, !5, i64 9, !5, i64 13, !5, i64 17, !5, i64 21, !5, i64 25, !5, i64 29, !5, i64 33, !5, i64 37, !5, i64 41, !5, i64 45, !5, i64 49, !5, i64 53, !5, i64 57, !5, i64 61, !5, i64 65, !5, i64 69, !5, i64 73, !5, i64 77, !5, i64 81, !5, i64 85, !5, i64 89, !5, i64 93, !5, i64 97, !5, i64 101, !5, i64 105, !5, i64 109, !5, i64 113, !5, i64 117, !5, i64 121, !5, i64 125} !3 = !{!"omnipotent char", !4, i64 0} !4 = !{!"Simple C/C++ TBAA"} !5 = !{!"int", !3, i64 0} !6 = !{!2, !5, i64 1} !7 = !{!2, !5, i64 5} !8 = !{!2, !5, i64 9} !9 = !{!2, !5, i64 13} !10 = !{!2, !5, i64 17} !11 = !{!2, !5, i64 21} !12 = !{!2, !5, i64 25} !13 = !{!2, !5, i64 29} !14 = !{!2, !5, i64 33} !15 = !{!2, !5, i64 37} !16 = !{!2, !5, i64 41} !17 = !{!2, !5, i64 45} !18 = !{!2, !5, i64 49} !19 = !{!2, !5, i64 53} !20 = !{!2, !5, i64 57} !21 = !{!2, !5, i64 61} !22 = !{!2, !5, i64 65} !23 = !{!2, !5, i64 69} !24 = !{!2, !5, i64 73} !25 = !{!2, !5, i64 77} !26 = !{!2, !5, i64 81} !27 = !{!2, !5, i64 85} !28 = !{!2, !5, i64 89} !29 = !{!2, !5, i64 93} !30 = !{!2, !5, i64 97} !31 = !{!2, !5, i64 101} !32 = !{!2, !5, i64 105} !33 = !{!2, !5, i64 109} !34 = !{!2, !5, i64 113} !35 = !{!2, !5, i64 117} !36 = !{!2, !5, i64 121} !37 = !{!2, !5, i64 125} !38 = !{i64 0, i64 1, !39, i64 1, i64 4, !40, i64 5, i64 4, !40, i64 9, i64 4, !40, i64 13, i64 4, !40, i64 17, i64 4, !40, i64 21, i64 4, !40, i64 25, i64 4, !40, i64 29, i64 4, !40, i64 33, i64 4, !40, i64 37, i64 4, !40, i64 41, i64 4, !40, i64 45, i64 4, !40, i64 49, i64 4, !40, i64 53, i64 4, !40, i64 57, i64 4, !40, i64 61, i64 4, !40, i64 65, i64 4, !40, i64 69, i64 4, !40, i64 73, i64 4, !40, i64 77, i64 4, !40, i64 81, i64 4, !40, i64 85, i64 4, !40, i64 89, i64 4, !40, i64 93, i64 4, !40, i64 97, i64 4, !40, i64 101, i64 4, !40, i64 105, i64 4, !40, i64 109, i64 4, !40, i64 113, i64 4, !40, i64 117, i64 4, !40, i64 121, i64 4, !40, i64 125, i64 4, !40} !39 = !{!3, !3, i64 0} !40 = !{!5, !5, i64 0} ``` Reviewers: asl Subscribers: qcolombet Differential Revision: http://reviews.llvm.org/D17441 llvm-svn: 261746
*	[X86][SSE41] Combine vector blends with zero	Simon Pilgrim	2016-02-24	7	-26/+80
\| \| \| \| \| \| \| \| \|	Part 2 of 2 This patch add support for combining target shuffles into blends-with-zero. Differential Revision: http://reviews.llvm.org/D17483 llvm-svn: 261745
*	[X86][SSE41] Combine insertion of zero scalars into vector blends with zero	Simon Pilgrim	2016-02-24	3	-98/+189
\| \| \| \| \| \| \| \| \| \|	Part 1 of 2 This patch attempts to replace the insertion of zero scalars with a vector blend with zero, avoiding the use of the integer insertion instructions (which are particularly slow on many targets). (Part 2 will add support for combining multiple blends-with-zero). Differential Revision: http://reviews.llvm.org/D17483 llvm-svn: 261743
*	[AMDGPU] Assembler: Simplify handling of optional operands	Nikolay Haustov	2016-02-24	3	-75/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prepare to support DPP encodings. For DPP encodings, we want row_mask/bank_mask/bound_ctrl to be optional operands. However this means that when parsing instruction which has no mnemonic prefix, we cannot add both default values for VOP3 and for DPP optional operands to OperandVector - neither instructions would match. So add default values for optional operands to MCInst during conversion instead. Mark more operands as IsOptional = 1 in .td files. Do not add default values for optional operands to OperandVector in AMDGPUAsmParser. Add default values for optional operands during conversion using new helper addOptionalImmOperand. Change to cvtVOP3_2_mod to check instruction flag instead of presence of modifiers. In the future, cvtVOP3* functions can be combined into one. Separate cvtFlat and cvtFlatAtomic. Fix CNDMASK_B32 definition to have no modifiers. Review: http://reviews.llvm.org/D17445 Reviewers: tstellarAMD llvm-svn: 261742
*	NFC. Move isDereferenceable to Loads.h/cpp	Artur Pilipenko	2016-02-24	8	-222/+228
\| \| \| \| \| \| \| \| \| \|	This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In subsequent change I'm going to eliminate isDerferenceableAndAlignedPointer from Loads API, leaving isSafeToLoadSpecualtively the only function to check is load instruction can be speculated. Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D16180 llvm-svn: 261736
*	NFC. Move getAlignment helper function from ValueTracking to Value class.	Artur Pilipenko	2016-02-24	3	-42/+50
\| \| \| \| \| \| \| \|	Reviewed By: reames, hfinkel Differential Revision: http://reviews.llvm.org/D16144 llvm-svn: 261735
*	[X86][SSE] Fixed vector rotation test name typo	Simon Pilgrim	2016-02-24	1	-5/+5
\| \| \| \| \| \|	Rotation of 16i6 vector not 8i16 vector - copy+paste is not your friend llvm-svn: 261733
*	[AMDGPU] fix amd_kernel_code_t bit field position as per spec (added missing ↵	Nikolay Haustov	2016-02-24	1	-7/+15
\| \| \| \| \| \| \| \| \| \| \|	reserved fields) lit tests passed before and after because it doesn't test the binary representation of amd_kernel_code_t. Patch by: Valery Pykhtin (Valery.Pykhtin@amd.com) Reviewers: arsenm llvm-svn: 261732
*	[SimplifyCFG] Do not blindly remove unreachable blocks	David Majnemer	2016-02-24	2	-3/+51
\| \| \| \| \| \| \| \| \| \| \|	DeleteDeadBlock was called indiscriminately, leading to cleanuprets with undef cleanuppad references. Instead, try to drain the BB of most of it's instructions if it is unreachable. We can then remove the BB if it solely consists of a terminator (and maybe some phis). llvm-svn: 261731