bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	Make isLegalAddressingMode() taking DataLayout as an argument	Mehdi Amini	2015-07-09	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11040 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241778
*	Make TargetLowering::getShiftAmountTy() taking DataLayout as an argument	Mehdi Amini	2015-07-09	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11037 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241776
*	Make TargetLowering::getPointerTy() taking DataLayout as an argument	Mehdi Amini	2015-07-09	4	-44/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11028 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241775
*	Make TargetTransformInfo keeping a reference to the Module DataLayout	Mehdi Amini	2015-07-09	2	-16/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DataLayout is no longer optional. It was initialized with or without a DataLayout, and the DataLayout when supplied could have been the one from the TargetMachine. Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11021 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241774
*	Redirect DataLayout from TargetMachine to Module in ComputeValueVTs()	Mehdi Amini	2015-07-09	2	-43/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Avoid using the TargetMachine owned DataLayout and use the Module owned one instead. This requires passing the DataLayout up the stack to ComputeValueVTs(). This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11019 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241773
*	Cosmetic cleanups - NFC	Eli Bendersky	2015-07-08	2	-6/+3
\| \| \| \| \| \|	Remove commented lines, trailing whitespace, etc. llvm-svn: 241687
*	Change the last few internal StringRef triples into Triple objects.	Daniel Sanders	2015-07-06	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This concludes the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. At this point, the StringRef-form of GNU Triples should only be used in the public API (including IR serialization) and a couple objects that directly interact with the API (most notably the Module class). The next step is to replace these Triple objects with the TargetTuple object that will represent our authoratative/unambiguous internal equivalent to GNU Triples. Reviewers: rengolin Subscribers: llvm-commits, jholewinski, ted, rengolin Differential Revision: http://reviews.llvm.org/D10962 llvm-svn: 241472
*	[TargetLowering] StringRefize asm constraint getters.	Benjamin Kramer	2015-07-05	2	-6/+4
\| \| \| \| \| \| \| \|	There is some functional change here because it changes target code from atoi(3) to StringRef::getAsInteger which has error checking. For valid constraints there should be no difference. llvm-svn: 241411
*	[NVPTX] expand extload/truncstore for vectors of floats	Jingyue Wu	2015-07-01	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: According to PTX ISA: For convenience, ld, st, and cvt instructions permit source and destination data operands to be wider than the instruction-type size, so that narrow values may be loaded, stored, and converted using regular-width registers. For example, 8-bit or 16-bit values may be held directly in 32-bit or 64-bit registers when being loaded, stored, or converted to other types and sizes. The operand type checking rules are relaxed for bit-size and integer (signed and unsigned) instruction types; floating-point instruction types still require that the operand type-size matches exactly, unless the operand is of bit-size type. So, the ISA does not support load with extending/store with truncatation for floating numbers. This is reflected in setting the loadext/truncstore actions to expand in the code for floating numbers, but vectors of floating numbers are not taken care of. As a result, loading a vector of floats followed by a fp_extend may be combined by DAGCombiner to a extload, and the extload may be lowered to NVPTXISD::LoadV2 with extending information. However, NVPTXISD::LoadV2 does not perform extending, and no extending instructions are inserted. Finally, PTX instructions with mismatched types are generated, like ld.v2.f32 {%fd3, %fd4}, [%rd2] This patch adds the correct actions for vectors of floats, so DAGCombiner would not create loads with extending, and correct code is generated. Patched by Gang Hu. Test Plan: Test case attached. Reviewers: jingyue Reviewed By: jingyue Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D10876 llvm-svn: 241191
*	[NVPTX] Move NVPTXPeephole after NVPTXPrologEpilogPass	Jingyue Wu	2015-07-01	2	-11/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Offset of frame index is calculated by NVPTXPrologEpilogPass. Before that the correct offset of stack objects cannot be obtained, which leads to wrong offset if there are more than 2 frame objects. This patch move NVPTXPeephole after NVPTXPrologEpilogPass. Because the frame index is already replaced by %VRFrame in NVPTXPrologEpilogPass, we check VRFrame register instead, and try to remove the VRFrame if there is no usage after NVPTXPeephole pass. Patched by Xuetian Weng. Test Plan: Strengthened test/CodeGen/NVPTX/local-stack-frame.ll to check the offset calculation based on SP and SPL. Reviewers: jholewinski, jingyue Reviewed By: jingyue Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10853 llvm-svn: 241185
*	[NVPTX] cleanups and refacotring in NVPTXFrameLowering.cpp	Jingyue Wu	2015-06-30	1	-27/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: NFC Test Plan: no regression Reviewers: wengxt Reviewed By: wengxt Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10849 llvm-svn: 241118
*	[NVPTX] Fix issue introduced in D10321	Jingyue Wu	2015-06-30	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Really check if %SP is not used in other places, instead of checking only exact one non-dbg use. Patched by Xuetian Weng. Test Plan: @foo4 in test/CodeGen/NVPTX/local-stack-frame.ll, create a case that SP will appear twice. Reviewers: jholewinski, jingyue Reviewed By: jingyue Subscribers: llvm-commits, sfantao, jholewinski Differential Revision: http://reviews.llvm.org/D10844 llvm-svn: 241099
*	Force relocation mode to be default, regardless of what is passed to the ↵	Samuel Antao	2015-06-30	1	-1/+4
\| \| \| \| \| \|	backend. llvm-svn: 241081
*	[NVPTX] noop when kernel pointers are already global	Jingyue Wu	2015-06-26	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Some front ends make kernel pointers global already. In that case, handlePointerParams does nothing. Test Plan: more tests in lower-kernel-ptr-arg.ll Reviewers: grosser Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10779 llvm-svn: 240849
*	Add NVPTXPeephole pass to reduce unnecessary address cast	Jingyue Wu	2015-06-24	6	-16/+178
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch first change the register that holds local address for stack frame to %SPL. Then the new NVPTXPeephole pass will try to scan the following pattern %vreg0<def> = LEA_ADDRi64 <fi#0>, 4 %vreg1<def> = cvta_to_local %vreg0 and transform it into %vreg1<def> = LEA_ADDRi64 %VRFrameLocal, 4 Patched by Xuetian Weng Test Plan: test/CodeGen/NVPTX/local-stack-frame.ll Reviewers: jholewinski, jingyue Reviewed By: jingyue Subscribers: eliben, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10549 llvm-svn: 240587
*	Simplify the Mangler interface now that DataLayout is mandatory.	Rafael Espindola	2015-06-23	1	-1/+1
\| \| \| \| \| \| \|	We only need to pass in a DataLayout when mangling a raw string, not when constructing the mangler. llvm-svn: 240405
*	Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)	Alexander Kornienko	2015-06-23	19	-24/+24
\| \| \| \| \| \|	Apparently, the style needs to be agreed upon first. llvm-svn: 240390
*	Fixed/added namespace ending comments using clang-tidy. NFC	Alexander Kornienko	2015-06-19	19	-24/+24
\| \| \| \| \| \| \| \| \| \| \| \| \|	The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-,llvm-namespace-comment -header-filter='llvm/.\|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
*	Add NVPTXLowerAlloca pass to convert alloca'ed memory to local address	Jingyue Wu	2015-06-17	4	-4/+122
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is done by first adding two additional instructions to convert the alloca returned address to local and convert it back to generic. Then replace all uses of alloca instruction with the converted generic address. Then we can rely NVPTXFavorNonGenericAddrSpace pass to combine the generic addresscast and the corresponding Load, Store, Bitcast, GEP Instruction together. Patched by Xuetian Weng (xweng@google.com). Test Plan: test/CodeGen/NVPTX/lower-alloca.ll Reviewers: jholewinski, jingyue Reviewed By: jingyue Subscribers: meheff, broune, eliben, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10483 llvm-svn: 239964
*	Clean up redundant copies of Triple objects. NFC	Daniel Sanders	2015-06-16	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Reviewers: rengolin Reviewed By: rengolin Subscribers: llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10382 llvm-svn: 239823
*	Replace string GNU Triples with llvm::Triple in TargetMachine. NFC.	Daniel Sanders	2015-06-11	2	-16/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: For the moment, TargetMachine::getTargetTriple() still returns a StringRef. This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rengolin Reviewed By: rengolin Subscribers: ted, llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10362 llvm-svn: 239554
*	[CodeGen] ArrayRef'ize cond/pred in various TII APIs. NFC.	Ahmed Bougacha	2015-06-11	2	-2/+2
\| \| \| \|	llvm-svn: 239553
*	Remove MachineModuleInfo::UsedFunctions as it has no users.	Pete Cooper	2015-06-11	1	-1/+0
\| \| \| \| \| \| \| \| \| \|	It hasn't been used since r130964. This also removes MachineModuleInfo::isUsedFunction and MachineModuleInfo::AnalyzeModule, both of which were only there to support UsedFunctions. llvm-svn: 239501
*	Replace string GNU Triples with llvm::Triple in MCSubtargetInfo and ↵	Daniel Sanders	2015-06-10	5	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	create*MCSubtargetInfo(). NFC. Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rafael Reviewed By: rafael Subscribers: rafael, ted, jfb, llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10311 llvm-svn: 239467
*	[NVPTX] fix a crash bug in NVPTXFavorNonGenericAddrSpaces	Jingyue Wu	2015-06-09	1	-54/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We used to assume V->RAUW only modifies the operand list of V's user. However, if V and V's user are Constants, RAUW may replace and invalidate V's user entirely. This patch fixes the above issue by letting the caller replace the operand instead of calling RAUW on Constants. Test Plan: @nested_const_expr and @rauw in access-non-generic.ll Reviewers: broune, jholewinski Reviewed By: broune, jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10345 llvm-svn: 239435
*	The constant initialization for globals in NVPTX is generated as an	Samuel Antao	2015-06-09	1	-9/+33
\| \| \| \| \| \| \| \| \| \|	array of bytes. The generation of this byte arrays was expecting the host to be little endian, which prevents big endian hosts to be used in the generation of the PTX code. This patch fixes the problem by changing the way the bytes are extracted so that it works for either little and big endian. llvm-svn: 239412
*	[nvptx] Only support the 'm' inline assembly memory constraint. NFC.	Daniel Sanders	2015-06-09	1	-6/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: NVPTX doesn't seem to support any additional constraints. Therefore remove the target hook. No functional change intended. Reviewers: jholewinski Reviewed By: jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D8209 llvm-svn: 239395
*	MC: Add target hook to control symbol quoting	Matt Arsenault	2015-06-09	5	-32/+51
\| \| \| \|	llvm-svn: 239370
*	[NVPTX] run SROA after NVPTXFavorNonGenericAddrSpaces	Jingyue Wu	2015-06-09	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This cleans up most allocas NVPTXLowerKernelArgs emits for byval parameters. Test Plan: makes bug21465.ll more stronger to verify no redundant local load/store. Reviewers: eliben, jholewinski Reviewed By: eliben, jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10322 llvm-svn: 239368
*	[NVPTX] roll forward r239082	Jingyue Wu	2015-06-04	6	-140/+179
\| \| \| \| \| \| \| \| \|	NVPTXISelDAGToDAG translates "addrspacecast to param" to NVPTX::nvvm_ptr_gen_to_param Added an llc test in bug21465. llvm-svn: 239100
*	Revert r239082	Jingyue Wu	2015-06-04	5	-175/+140
\| \| \| \| \| \|	llc crashed for NVPTX backend llvm-svn: 239094
*	[NVPTX] kernel pointer arguments point to the global address space	Jingyue Wu	2015-06-04	5	-140/+175
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: With this patch, NVPTXLowerKernelArgs converts a kernel pointer argument to a pointer in the global address space. This change, along with NVPTXFavorNonGenericAddrSpaces, allows the NVPTX backend to emit ld.global.* and st.global.* for accessing kernel pointer arguments. Minor changes: 1. refactor: extract function convertToPointerInAddrSpace 2. fix a bug in the test case in bug21465.ll Test Plan: lower-kernel-ptr-arg.ll Reviewers: eliben, meheff, jholewinski Reviewed By: jholewinski Subscribers: wengxt, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10154 llvm-svn: 239082
*	Replace string GNU Triples with llvm::Triple in MCAsmInfo subclasses and ↵	Daniel Sanders	2015-06-04	2	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	create*AsmInfo(). NFC. Summary: This is the first of several patches to eliminate StringRef forms of GNU triples from the internals of LLVM. After this is complete, GNU triples will be replaced by a more authoratitive representation in the form of an LLVM TargetTuple. Reviewers: rengolin Reviewed By: rengolin Subscribers: ted, llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10236 llvm-svn: 239036
*	Push constness through LoopInfo::isLoopHeader and clean it up a bit.	Benjamin Kramer	2015-06-02	1	-2/+1
\| \| \| \| \| \|	NFC. llvm-svn: 238843
*	Add address space argument to isLegalAddressingMode	Matt Arsenault	2015-06-01	2	-2/+4
\| \| \| \| \| \| \| \| \| \|	This is important because of different addressing modes depending on the address space for GPU targets. This only adds the argument, and does not update any of the uses to provide the correct address space. llvm-svn: 238723
*	MC: Clean up MCExpr naming. NFC.	Jim Grosbach	2015-05-30	3	-29/+29
\| \| \| \|	llvm-svn: 238634
*	[NVPTXFavorNonGenericAddrSpaces] recursively trace into GEP and BitCast	Jingyue Wu	2015-05-29	1	-60/+147
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch allows NVPTXFavorNonGenericAddrSpaces to remove addrspacecast from longer chains consisting of GEPs and BitCasts. For example, it can now optimize %0 = addrspacecast [10 x float] addrspace(3)* @a to [10 x float]* %1 = gep [10 x float]* %0, i64 0, i64 %i %2 = bitcast float* %1 to i32* %3 = load i32* %2 ; emits ld.u32 to %0 = gep [10 x float] addrspace(3)* @a, i64 0, i64 %i %1 = bitcast float addrspace(3)* %0 to i32 addrspace(3)* %3 = load i32 addrspace(3)* %1 ; emits ld.shared.f32 Test Plan: @ld_int_from_global_float in access-non-generic.ll Reviewers: broune, eliben, jholewinski, meheff Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10074 llvm-svn: 238574
*	Don't call utostr in Twine/raw_ostream contexts.	Benjamin Kramer	2015-05-28	1	-6/+4
\| \| \| \| \| \|	Creating temporary std::strings there is unnecessary. llvm-svn: 238412
*	[NaryReassociate] Run EarlyCSE after NaryReassociate	Jingyue Wu	2015-05-28	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch made two improvements to NaryReassociate and the NVPTX pipeline 1. Run EarlyCSE/GVN after NaryReassociate to get rid of redundant common expressions. 2. When adding an instruction to SeenExprs, maps both the SCEV before and after reassociation to that instruction. Test Plan: updated @reassociate_gep_nsw in nary-gep.ll Reviewers: meheff, broune Reviewed By: broune Subscribers: dberlin, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9947 llvm-svn: 238396
*	Move alignment from MCSectionData to MCSection.	Rafael Espindola	2015-05-21	3	-15/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This starts merging MCSection and MCSectionData. There are a few issues with the current split between MCSection and MCSectionData. * It optimizes the the not as important case. We want the production of .o files to be really fast, but the split puts the information used for .o emission in a separate data structure. * The ELF/COFF/MachO hierarchy is not represented in MCSectionData, leading to some ad-hoc ways to represent the various flags. * It makes it harder to remember where each item is. The attached patch starts merging the two by moving the alignment from MCSectionData to MCSection. Most of the patch is actually just dropping 'const', since MCSectionData is mutable, but MCSection was not. llvm-svn: 237936
*	MC: Clean up method names in MCContext.	Jim Grosbach	2015-05-18	1	-2/+2
\| \| \| \| \| \| \|	The naming was a mish-mash of old and new style. Update to be consistent with the new. NFC. llvm-svn: 237594
*	Remove MCAssembler.h include from MCStreamer.h and fix users of MCStreamer.h	Pete Cooper	2015-05-15	1	-0/+1
\| \| \| \|	llvm-svn: 237483
*	Remove 3 includes from MCInstrDesc.h and explicitly include them where needed	Pete Cooper	2015-05-15	1	-0/+1
\| \| \| \|	llvm-svn: 237481
*	MC: MCCodeGenInfo naming update. NFC.	Jim Grosbach	2015-05-15	1	-1/+1
\| \| \| \| \| \|	s/InitMCCodeGenInfo/initMCCodeGenInfo/ llvm-svn: 237471
*	MC: Modernize MCOperand API naming. NFC.	Jim Grosbach	2015-05-13	1	-6/+6
\| \| \| \| \| \|	MCOperand::Create() methods renamed to MCOperand::create(). llvm-svn: 237275
*	Change getTargetNodeName() to produce compiler warnings for missing cases, ↵	Matthias Braun	2015-05-07	2	-4/+13
\| \| \| \| \| \|	fix them llvm-svn: 236775
*	[ShrinkWrap] Add (a simplified version) of shrink-wrapping.	Quentin Colombet	2015-05-05	3	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch introduces a new pass that computes the safe point to insert the prologue and epilogue of the function. The interest is to find safe points that are cheaper than the entry and exits blocks. As an example and to avoid regressions to be introduce, this patch also implements the required bits to enable the shrink-wrapping pass for AArch64. Context Currently we insert the prologue and epilogue of the method/function in the entry and exits blocks. Although this is correct, we can do a better job when those are not immediately required and insert them at less frequently executed places. The job of the shrink-wrapping pass is to identify such places. Motivating example Let us consider the following function that perform a call only in one branch of a if: define i32 @f(i32 %a, i32 %b) { %tmp = alloca i32, align 4 %tmp2 = icmp slt i32 %a, %b br i1 %tmp2, label %true, label %false true: store i32 %a, i32* %tmp, align 4 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp) br label %false false: %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ] ret i32 %tmp.0 } On AArch64 this code generates (removing the cfi directives to ease readabilities): _f: ; @f ; BB#0: stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething LBB0_2: ; %false mov sp, x29 ldp x29, x30, [sp], #16 ret With shrink-wrapping we could generate: _f: ; @f ; BB#0: cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething add sp, x29, #16 ; =16 ldp x29, x30, [sp], #16 LBB0_2: ; %false ret Therefore, we would pay the overhead of setting up/destroying the frame only if we actually do the call. Proposed Solution This patch introduces a new machine pass that perform the shrink-wrapping analysis (See the comments at the beginning of ShrinkWrap.cpp for more details). It then stores the safe save and restore point into the MachineFrameInfo attached to the MachineFunction. This information is then used by the PrologEpilogInserter (PEI) to place the related code at the right place. This pass runs right before the PEI. Unlike the original paper of Chow from PLDI’88, this implementation of shrink-wrapping does not use expensive data-flow analysis and does not need hack to properly avoid frequently executed point. Instead, it relies on dominance and loop properties. The pass is off by default and each target can opt-in by setting the EnableShrinkWrap boolean to true in their derived class of TargetPassConfig. This setting can also be overwritten on the command line by using -enable-shrink-wrap. Before you try out the pass for your target, make sure you properly fix your emitProlog/emitEpilog/adjustForXXX method to cope with basic blocks that are not necessarily the entry block. Design Decisions 1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but for debugging and clarity I thought it was best to have its own file. 2. Right now, we only support one save point and one restore point. At some point we can expand this to several save point and restore point, the impacted component would then be: - The pass itself: New algorithm needed. - MachineFrameInfo: Hold a list or set of Save/Restore point instead of one pointer. - PEI: Should loop over the save point and restore point. Anyhow, at least for this first iteration, I do not believe this is interesting to support the complex cases. We should revisit that when we motivating examples. Differential Revision: http://reviews.llvm.org/D9210 <rdar://problem/3201744> llvm-svn: 236507
*	IR: Give 'DI' prefix to debug info metadata	Duncan P. N. Exon Smith	2015-04-29	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Finish off PR23080 by renaming the debug info IR constructs from `MD` to `DI`. The last of the `DIDescriptor` classes were deleted in r235356, and the last of the related typedefs removed in r235413, so this has all baked for about a week. Note: If you have out-of-tree code (like a frontend), I recommend that you get everything compiling and tests passing with the previous commit before updating to this one. It'll be easier to keep track of what code is using the `DIDescriptor` hierarchy and what you've already updated, and I think you're extremely unlikely to insert bugs. YMMV of course. Back to this commit: I did this using the rename-md-di-nodes.sh upgrade script I've attached to PR23080 (both code and testcases) and filtered through clang-format-diff.py. I edited the tests for test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns were off-by-three. It should work on your out-of-tree testcases (and code, if you've followed the advice in the previous paragraph). Some of the tests are in badly named files now (e.g., test/Assembler/invalid-mdcompositetype-missing-tag.ll should be 'dicompositetype'); I'll come back and move the files in a follow-up commit. llvm-svn: 236120
*	Fix a [-Werror,-Winconsistent-missing-override] problem in the	Eric Christopher	2015-04-28	1	-1/+1
\| \| \| \| \| \|	NVPTX overrides. llvm-svn: 236007
*	[NVPTX] Handle addrspacecast constant expressions in aggregate initializers	Justin Holewinski	2015-04-28	4	-2/+268
\| \| \| \| \| \| \| \| \| \| \|	We need to track if an AddrSpaceCast expression was seen when generating an MCExpr for a ConstantExpr. This change introduces a custom lowerConstant method to the NVPTX asm printer that will create NVPTXGenericMCSymbolRefExpr nodes at the appropriate places to encode the information that a given symbol needs to be casted to a generic address. llvm-svn: 236000