bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[X86][GlobalISel] Initial implementation, select G_ADD gpr, gpr	Igor Breger	2017-02-22	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Initial implementation for X86InstructionSelector. Handle selection of COPY and G_ADD/G_SUB gpr, gpr. Reviewers: qcolombet, rovka, zvi, ab Reviewed By: rovka Subscribers: mgorny, dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D29816 llvm-svn: 295824
*	[WebAssembly] Add skeleton MC support for the Wasm container format	Dan Gohman	2017-02-22	1	-0/+52
\| \| \| \| \| \| \| \| \|	This just adds the basic skeleton for supporting a new object file format. All of the actual encoding will be implemented in followup patches. Differential Revision: https://reviews.llvm.org/D26722 llvm-svn: 295803
*	DAG: Check if extract_vector_elt is legal or custom	Matt Arsenault	2017-02-21	1	-1/+1
\| \| \| \| \| \| \|	Avoids test regressions in future AMDGPU commits when more vector types are custom lowered. llvm-svn: 295782
*	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; ↵	Eugene Zelenko	2017-02-21	5	-61/+141
\| \| \| \| \| \|	other minor fixes (NFC). llvm-svn: 295773
*	[CodeGenPrepare] Sink and duplicate more 'and' instructions.	Geoff Berry	2017-02-21	2	-79/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Rework the code that was sinking/duplicating (icmp and, 0) sequences into blocks where they were being used by conditional branches to form more tbz instructions on AArch64. The new code is more general in that it just looks for 'and's that have all icmp 0's as users, with a target hook used to select which subset of 'and' instructions to consider. This change also enables 'and' sinking for X86, where it is more widely beneficial than on AArch64. The 'and' sinking/duplicating code is moved into the optimizeInst phase of CodeGenPrepare, where it can take advantage of the fact that OptimizeCmpExpression has already sunk/duplicated any icmps into the blocks where they are used. One minor complication from this change is that optimizeLoadExt needed to be updated to always mark 'and's it has determined should be in the same block as their feeding load in the InsertedInsts set, to avoid an infinite loop of hoisting and sinking the same 'and'. This change fixes a regression on X86 in the tsan runtime caused by moving GVNHoist to a later place in the optimization pipeline (see PR31382). Reviewers: t.p.northover, qcolombet, MatzeB Subscribers: aemerson, mcrosier, sebpop, llvm-commits Differential Revision: https://reviews.llvm.org/D28813 llvm-svn: 295746
*	ScheduleDAG: Cleanup; NFC	Matthias Braun	2017-02-21	1	-184/+133
\| \| \| \| \| \| \| \| \|	- Fix doxygen comments (do not repeat documented name, remove definition comment if there is already one at the declaration, add \p, ...) - Add some const modifiers - Use range based for llvm-svn: 295688
*	[BranchFolding] Update debug location along with the update of branch ↵	Taewook Oh	2017-02-21	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	instruction. Summary: Currently, BranchFolder drops DebugLoc for branch instructions in some places. For example, for the test code attached, the branch instruction of the 'entry' block has a DILocation of ``` !12 = !DILocation(line: 6, column: 3, scope: !11) ```, but this information is gone when the block is lowered because BranchFolder misses it. This patch is a fix for this issue. Reviewers: qcolombet, aprantl, craig.topper, MatzeB Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29902 llvm-svn: 295684
*	Strip trailing whitespace.	Simon Pilgrim	2017-02-20	1	-1/+1
\| \| \| \|	llvm-svn: 295653
*	[SelectionDAG] Add scalarization support for ISD::*_EXTEND_VECTOR_INREG opcodes.	Simon Pilgrim	2017-02-20	2	-0/+34
\| \| \| \| \| \|	Thanks to Mikael Holmén for the initial test case llvm-svn: 295652
*	Remove redundant call to GluedNodes.back() [NFC]	Artyom Skrobov	2017-02-19	1	-2/+1
\| \| \| \|	llvm-svn: 295607
*	MachineRegionInfo: Fix pass initialization	Matthias Braun	2017-02-18	3	-9/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Adapt MachineBasicBlock::getName() to have the same behavior as the IR BasicBlock (Value::getName()). - Add it to lib/CodeGen/CodeGen.cpp::initializeCodeGen so that it is linked in the CodeGen library. - MachineRegionInfoPass's name conflicts with RegionInfoPass's name ("region"). - MachineRegionInfo should depend on MachineDominatorTree, MachinePostDominatorTree and MachineDominanceFrontier instead of their respective IR versions. - Since there were no tests for this, add an X86 MIR test. Patch by Francis Visoiu Mistrih <fvisoiumistrih@apple.com> llvm-svn: 295518
*	[CodeGen] Revert changes in LowLevelType to pre-r295499 to fix broken buildbots.	Eugene Zelenko	2017-02-17	1	-7/+1
\| \| \| \|	llvm-svn: 295505
*	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; ↵	Eugene Zelenko	2017-02-17	6	-26/+63
\| \| \| \| \| \|	other minor fixes (NFC). llvm-svn: 295499
*	Debug Info: Sort frame index expressions before emitting them.	Adrian Prantl	2017-02-17	3	-36/+44
\| \| \| \| \| \| \| \| \| \|	This fixes PR31381, which caused an assertion and/or invalid debug info. This affects debug variables that have multiple fragments in the MMI side (i.e.: in the stack frame) table. rdar://problem/30571676 llvm-svn: 295486
*	GlobalISel: verify that generic loads & stores have a mem operand.	Tim Northover	2017-02-17	1	-0/+8
\| \| \| \| \| \| \|	The mem operand is used by GlobalISel to convey atomic constraints so dropping it is invalid. llvm-svn: 295476
*	[DAGCombiner] split i1 select-of-constants from non-i1 case; NFCI	Sanjay Patel	2017-02-17	1	-9/+25
\| \| \| \| \| \|	I can't find any tests of the non-i1 code path, so it may be unnecessary at this point. llvm-svn: 295463
*	Fix signed/unsigned comparison warning.	Simon Pilgrim	2017-02-17	1	-2/+2
\| \| \| \|	llvm-svn: 295453
*	[DAGCombine] Recognise any_extend_vector_inreg and truncation style shuffle ↵	Simon Pilgrim	2017-02-17	1	-0/+125
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	masks During legalization we are often creating shuffles (via a build_vector scalarization stage) that are "any_extend_vector_inreg" style masks, and also other masks that are the equivalent of "truncate_vector_inreg" (if we had such a thing). This patch is an attempt to match these cases to help undo the effects of just leaving shuffle lowering to handle it - which typically means we lose track of the undefined elements of the shuffles resulting in an unnecessary extension+truncation stage for widened illegal types. The 2011-10-21-widen-cmp.ll regression will be fixed by making SIGN_EXTEND_VECTOR_IN_REG legal in SSE instead of lowering them to X86ISD::VSEXT (PR31712). Differential Revision: https://reviews.llvm.org/D29454 llvm-svn: 295451
*	[DAGCombiner] improve readability; NFCI	Sanjay Patel	2017-02-17	1	-44/+32
\| \| \| \|	llvm-svn: 295447
*	Handle link of NoDebug CU with a CU that has debug emission enabled	Teresa Johnson	2017-02-17	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is an issue both with regular and Thin LTO. When we link together a DICompileUnit that is marked NoDebug (e.g when compiling with -g0 but applying an AutoFDO profile, which requires location tracking in the compiler) and a DICompileUnit with debug emission enabled, we can have failures during dwarf debug generation. Specifically, when we have inlined from the NoDebug compile unit into the debug compile unit, we can fail during construction of the abstract and inlined scope DIEs. This is because the SPMap does not include NoDebug CUs (they are skipped in the debug_compile_units_iterator). This patch fixes the failures by skipping locations from NoDebug CUs when extracting lexical scopes. Reviewers: dblaikie, aprantl Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D29765 llvm-svn: 295384
*	[MachinePipeliner] Remove redundant destructor. NFC.	Benjamin Kramer	2017-02-16	1	-8/+1
\| \| \| \|	llvm-svn: 295372
*	Refactor DebugHandlerBase a bit to common non-debug-having-function filtering	David Blaikie	2017-02-16	6	-54/+60
\| \| \| \|	llvm-svn: 295354
*	[DAGCombiner] Support {a\|s}ext, {a\|z\|s}ext load nodes in load combine	Artur Pilipenko	2017-02-16	1	-8/+19
\| \| \| \| \| \| \| \| \| \| \| \|	Resubmit -r295314 with PowerPC and AMDGPU tests updated. Support {a\|s}ext, {a\|z\|s}ext load nodes as a part of load combine patterns. Reviewed By: filcab Differential Revision: https://reviews.llvm.org/D29591 llvm-svn: 295336
*	Revert -r295314 "[DAGCombiner] Support {a\|s}ext, {a\|z\|s}ext load nodes in ↵	Artur Pilipenko	2017-02-16	1	-19/+8
\| \| \| \| \| \| \| \|	load combine" This change causes some of AMDGPU and PowerPC tests to fail. llvm-svn: 295316
*	[DAGCombiner] Support {a\|s}ext, {a\|z\|s}ext load nodes in load combine	Artur Pilipenko	2017-02-16	1	-8/+19
\| \| \| \| \| \| \| \| \| \|	Support {a\|s}ext, {a\|z\|s}ext load nodes as a part of load combine patterns. Reviewed By: filcab Differential Revision: https://reviews.llvm.org/D29591 llvm-svn: 295314
*	[ARM] GlobalISel: Lower double precision FP args	Diana Picus	2017-02-16	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	For the hard float calling convention, we just use the D registers. For the soft-fp calling convention, we use the R registers and move values to/from the D registers by means of G_SEQUENCE/G_EXTRACT. While doing so, we make sure to honor the endianness of the target, since the CCAssignFn doesn't do that for us. For pure soft float targets, we still bail out because we don't support the libcalls yet. llvm-svn: 295295
*	[X86] Re-enable conditional tail calls and fix PR31257.	Hans Wennborg	2017-02-16	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \|	This reverts r294348, which removed support for conditional tail calls due to the PR above. It fixes the PR by marking live registers as implicitly used and defined by the now predicated tailcall. This is similar to how IfConversion predicates instructions. Differential Revision: https://reviews.llvm.org/D29856 llvm-svn: 295262
*	GlobalISel: legalize va_arg on AArch64.	Tim Northover	2017-02-15	2	-0/+10
\| \| \| \| \| \| \| \|	Uses a Custom implementation because the slot sizes being a multiple of the pointer size isn't really universal, even for the architectures that do have a simple "void *" va_list. llvm-svn: 295255
*	GlobalISel: support translating va_arg	Tim Northover	2017-02-15	1	-0/+12
\| \| \| \| \| \| \|	Since (say) i128 and [16 x i8] map to the same type in generic MIR, we also need to attach the required alignment info. llvm-svn: 295254
*	Fix typos	Matt Arsenault	2017-02-15	2	-2/+2
\| \| \| \|	llvm-svn: 295246
*	DAG: Do not scalarize fsub if fneg is legal	Matt Arsenault	2017-02-15	1	-0/+15
\| \| \| \| \| \|	Tests will be included with future commit. llvm-svn: 295242
*	Codegen: Make chains from trellis-shaped CFGs	Kyle Butt	2017-02-15	1	-17/+293
	Lay out trellis-shaped CFGs optimally. A trellis of the shape below:

      A     B
      |\   /|
      | \ / |
      |  X  |
      | / \ |
      |/   \|
      C     D

would be laid out A; B->C ; D by the current layout algorithm. Now we identify trellises and lay them out either A->C; B->D or A->D; B->C. This scales with an increasing number of predecessors. A trellis is a group of 2 or more predecessor blocks that all have the same successors.

Because of this we can tail duplicate to extend existing trellises. As an example consider the following CFG:

        B   D   F   H
       / \ / \ / \ / \
      A---C---E---G---Ret

Where A,C,E,G are all small (Currently 2 instructions). The CFG preserving layout is then A,B,C,D,E,F,G,H,Ret.

The current code will copy C into B, E into D and G into F and yield the layout A,C,B(C),E,D(E),F(G),G,H,ret

    define void @straight_test(i32 %tag) {
    entry:
      br label %test1
    test1: ; A
      %tagbit1 = and i32 %tag, 1
      %tagbit1eq0 = icmp eq i32 %tagbit1, 0
      br i1 %tagbit1eq0, label %test2, label %optional1
    optional1: ; B
      call void @a()
      br label %test2
    test2: ; C
      %tagbit2 = and i32 %tag, 2
      %tagbit2eq0 = icmp eq i32 %tagbit2, 0
      br i1 %tagbit2eq0, label %test3, label %optional2
    optional2: ; D
      call void @b()
      br label %test3
    test3: ; E
      %tagbit3 = and i32 %tag, 4
      %tagbit3eq0 = icmp eq i32 %tagbit3, 0
      br i1 %tagbit3eq0, label %test4, label %optional3
    optional3: ; F
      call void @c()
      br label %test4
    test4: ; G
      %tagbit4 = and i32 %tag, 8
      %tagbit4eq0 = icmp eq i32 %tagbit4, 0
      br i1 %tagbit4eq0, label %exit, label %optional4
    optional4: ; H
      call void @d()
      br label %exit
    exit:
      ret void
    }

here is the layout after D27742:

    straight_test: # @straight_test
    ; ... Prologue elided
    ; BB#0: # %entry ; A (merged with test1)
    ; ... More prologue elided
      mr 30, 3
      andi. 3, 30, 1
      bc 12, 1, .LBB0_2
    ; BB#1: # %test2 ; C
      rlwinm. 3, 30, 0, 30, 30
      beq 0, .LBB0_3
      b .LBB0_4
    .LBB0_2: # %optional1 ; B (copy of C)
      bl a
      nop
      rlwinm. 3, 30, 0, 30, 30
      bne 0, .LBB0_4
    .LBB0_3: # %test3 ; E
      rlwinm. 3, 30, 0, 29, 29
      beq 0, .LBB0_5
      b .LBB0_6
    .LBB0_4: # %optional2 ; D (copy of E)
      bl b
      nop
      rlwinm. 3, 30, 0, 29, 29
      bne 0, .LBB0_6
    .LBB0_5: # %test4 ; G
      rlwinm. 3, 30, 0, 28, 28
      beq 0, .LBB0_8
      b .LBB0_7
    .LBB0_6: # %optional3 ; F (copy of G)
      bl c
      nop
      rlwinm. 3, 30, 0, 28, 28
      beq 0, .LBB0_8
    .LBB0_7: # %optional4 ; H
      bl d
      nop
    .LBB0_8: # %exit ; Ret
      ld 30, 96(1) # 8-byte Folded Reload
      addi 1, 1, 112
      ld 0, 16(1)
      mtlr 0
      blr

The tail-duplication has produced some benefit, but it has also produced a trellis which is not laid out optimally. With this patch, we improve the layouts of such trellises, and decrease the cost calculation for tail-duplication accordingly. This patch produces the layout A,C,E,G,B,D,F,H,Ret.

This layout does have back edges, which is a negative, but it has a bigger compensating positive, which is that it handles the case where there are long strings of skipped blocks much better than the original layout. Both layouts handle runs of executed blocks equally well. Branch prediction also improves if there is any correlation between subsequent optional blocks.

Here is the resulting concrete layout:

    straight_test: # @straight_test
    ; BB#0: # %entry ; A (merged with test1)
      mr 30, 3
      andi. 3, 30, 1
      bc 12, 1, .LBB0_4
    ; BB#1: # %test2 ; C
      rlwinm. 3, 30, 0, 30, 30
      bne 0, .LBB0_5
    .LBB0_2: # %test3 ; E
      rlwinm. 3, 30, 0, 29, 29
      bne 0, .LBB0_6
    .LBB0_3: # %test4 ; G
      rlwinm. 3, 30, 0, 28, 28
      bne 0, .LBB0_7
      b .LBB0_8
    .LBB0_4: # %optional1 ; B (Copy of C)
      bl a
      nop
      rlwinm. 3, 30, 0, 30, 30
      beq 0, .LBB0_2
    .LBB0_5: # %optional2 ; D (Copy of E)
      bl b
      nop
      rlwinm. 3, 30, 0, 29, 29
      beq 0, .LBB0_3
    .LBB0_6: # %optional3 ; F (Copy of G)
      bl c
      nop
      rlwinm. 3, 30, 0, 28, 28
      beq 0, .LBB0_8
    .LBB0_7: # %optional4 ; H
      bl d
      nop
    .LBB0_8: # %exit

Differential Revision: https://reviews.llvm.org/D28522
llvm-svn: 295223
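The trellis definition in the commit message above (two or more predecessor blocks that all share the same successors) can be illustrated with a small stand-alone sketch. This is not the MachineBlockPlacement code: `Block`, `Cfg` and `findTrellises` are hypothetical names, blocks are plain strings, and the requirement of at least two shared successors is an assumption taken from the shape of the diagrams rather than from the patch.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy CFG: each block name maps to the set of its successor names.
// These types are illustrative only, not LLVM data structures.
using Block = std::string;
using Cfg = std::map<Block, std::set<Block>>;

// Group blocks by identical successor sets and report every group that has
// at least two members. Assumption: a trellis needs at least two shared
// successors, as in the A/B -> C/D diagram.
std::vector<std::set<Block>> findTrellises(const Cfg &cfg) {
  std::map<std::set<Block>, std::set<Block>> predsBySuccessors;
  for (const auto &entry : cfg) {
    for (const Block &succ : entry.second)
      assert(cfg.count(succ) == 1 && "every successor must be a block in the CFG");
    if (entry.second.size() >= 2)
      predsBySuccessors[entry.second].insert(entry.first);
  }

  std::vector<std::set<Block>> trellises;
  for (const auto &group : predsBySuccessors)
    if (group.second.size() >= 2)
      trellises.push_back(group.second);
  return trellises;
}
```

Fed the CFG that results from the tail duplication described above, where B, D and F each carry a copy of the following test block, this sketch reports the groups {B, C}, {D, E} and {F, G}, which are the trellises the patch then lays out.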
*	include function name in dot filename	Xinliang David Li	2017-02-15	4	-8/+9
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D29975 llvm-svn: 295220
*	[DAG] Don't try to create an INSERT_SUBVECTOR with an illegal source	Michael Kuperstein	2017-02-15	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \|	We currently can't legalize those, but we should really not be creating them in the first place, since legalization would probably look similar to the way we legalize CONCAT_VECTORS - basically replace the INSERT with a BUILD. This fixes PR311956. Differential Revision: https://reviews.llvm.org/D29961 llvm-svn: 295213
*	[LLVM][XRAY][MIPS] Support xray on mips/mipsel/mips64/mips64el	Sagar Thakur	2017-02-15	1	-0/+4
\| \| \| \| \| \| \| \| \|	Summary: Adds support for xray instrumentation on mips for both 32-bit and 64-bit. Reviewed by sdardis, dberris Differential: D27697 llvm-svn: 295164
*	[SelectionDAGBuilder] Simplify creation of shufflevector DAG nodes where ↵	Craig Topper	2017-02-15	1	-46/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	inputs are larger than the mask Summary: The current code loops over all elements to calculate a used range. Then a second short loop looks at the ranges and determines if they can be used in an extract and creates a properly aligned start index for the extract. This range finding is unnecessary; we can just calculate a properly aligned start index for an extract for each input during the first loop. If we don't find the same start index for each index we can't use an extract. Reviewers: zvi, RKSimon Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29926 llvm-svn: 295152
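The single-loop idea in the summary above can be sketched as follows. This is a simplified model, not the SelectionDAGBuilder code: it looks at one input only, takes "properly aligned" to mean aligned down to a multiple of the mask length, and uses the hypothetical names `findExtractStart`, `kNoElementsUsed` and `kCannotExtract`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sentinel results for this sketch.
constexpr int kNoElementsUsed = -1; // every lane is undef
constexpr int kCannotExtract = -2;  // lanes fall into different windows

// For one shuffle input: each used mask index is aligned down to a multiple
// of the mask length. If every used index yields the same aligned start, a
// single subvector extract beginning there covers them all.
// Negative mask entries model undef lanes and are skipped.
int findExtractStart(const std::vector<int> &mask) {
  assert(!mask.empty() && "mask must have at least one element");
  const int maskSize = static_cast<int>(mask.size());
  int start = kNoElementsUsed;
  for (int idx : mask) {
    if (idx < 0)
      continue;
    const int aligned = idx - idx % maskSize; // align down to a window boundary
    if (start != kNoElementsUsed && start != aligned)
      return kCannotExtract;
    start = aligned;
  }
  return start;
}
```

With a four-element mask, the indices 4, 5, 6, 7 all align down to 4, so one extract suffices; the indices 3, 4, 5, 6 straddle two windows and the sketch gives up, matching the "can't use an extract" case in the summary.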
*	[BranchFolding] Tail common all identical unreachable blocks	Reid Kleckner	2017-02-14	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Blocks ending in unreachable are typically cold because they end the program or throw an exception, so merging them with other identical blocks is usually profitable because it reduces the size of cold code. MachineBlockPlacement generally does not arrange to fall through to such blocks, so commoning these blocks will not introduce additional unconditional branches. Reviewers: hans, iteratee, haicheng Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29153 llvm-svn: 295105
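As a rough illustration of commoning identical unreachable-terminated blocks, the sketch below maps every such block to the first block with the same instruction sequence. It is a toy model under stated assumptions: `ToyBlock` and `commonUnreachableBlocks` are hypothetical, instructions are compared as strings, and none of this reflects how BranchFolding actually hashes or merges tails.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a machine basic block (not an LLVM type).
struct ToyBlock {
  std::vector<std::string> insts; // instructions, compared textually
  bool endsInUnreachable;         // true if control never leaves the block
};

// For each block, return the index of the block that branches to it could
// target instead. Blocks that end in unreachable and have identical
// instructions all map to the first such block; other blocks map to
// themselves.
std::vector<std::size_t>
commonUnreachableBlocks(const std::vector<ToyBlock> &blocks) {
  std::map<std::vector<std::string>, std::size_t> firstWithBody;
  std::vector<std::size_t> target(blocks.size());
  for (std::size_t i = 0; i < blocks.size(); ++i) {
    target[i] = i;
    if (!blocks[i].endsInUnreachable)
      continue;
    // emplace keeps the earlier entry when the same body was already seen.
    target[i] = firstWithBody.emplace(blocks[i].insts, i).first->second;
    assert(target[i] <= i && "representative is never a later block");
  }
  return target;
}
```

Only blocks that end in unreachable are candidates here, which mirrors the argument in the summary: such blocks are cold and are not normally fall-through targets, so redirecting branches to a shared copy should not add unconditional branches.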
*	GlobalISel: introduce G_PTR_MASK to simplify alloca handling.	Tim Northover	2017-02-14	2	-23/+19
\| \| \| \| \| \| \| \| \|	This instruction clears the low bits of a pointer without requiring (possibly dodgy if pointers aren't ints) conversions to and from an integer. Since (as far as I'm aware) all masks are statically known, the instruction takes an immediate operand rather than a register to specify the mask. llvm-svn: 295103
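The effect of clearing the low bits of a pointer, as used for aligning an alloca, can be shown on plain integers. This sketch is not GlobalISel code: `clearLowBits` and `allocaAlignedDown` are hypothetical helpers, and the choice to express the immediate as a count of low bits is an assumption, since the commit message only says that an immediate specifies the mask.

```cpp
#include <cassert>
#include <cstdint>

// Clear the lowest `numLowBits` bits of an address value.
// Assumption: the immediate is a bit count; the commit message does not
// spell out the encoding.
constexpr std::uint64_t clearLowBits(std::uint64_t addr, unsigned numLowBits) {
  return numLowBits >= 64 ? 0
                          : addr & ~((std::uint64_t{1} << numLowBits) - 1);
}

// Model of a dynamic alloca on a downward-growing stack: subtract the size,
// then round the result down to a 2^log2Align boundary.
std::uint64_t allocaAlignedDown(std::uint64_t sp, std::uint64_t size,
                                unsigned log2Align) {
  assert(log2Align < 64 && "alignment exponent out of range");
  return clearLowBits(sp - size, log2Align);
}
```

Because the alignment of an alloca is known statically, the mask can be a constant, which is consistent with the commit's choice of an immediate operand over a register.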
*	Reformat slightly.	Eric Christopher	2017-02-14	1	-4/+3
\| \| \| \|	llvm-svn: 295096
*	Reapply r294532, reverted in r294787.	Wolfgang Pieb	2017-02-14	1	-9/+147
\| \| \| \| \| \| \| \| \| \| \| \| \|	Store instructions can have more than one memory operand as a result of optimizations that fold different stores into one. When we identify spill instructions to generate DBG_VALUE instructions to record the spilling of a variable, we disregard stores with multiple memory operands for now. We may miss some relevant spills but the handling is a bit more complex, so we'll do it in a different patch. This fixes PR31935. llvm-svn: 295093
*	[Tablegen] Instrumenting table gen DAGGenISelDAG	Aditya Nandakumar	2017-02-14	1	-0/+9
\| \| \| \| \| \| \| \| \| \|	To help assist in debugging ISEL or to prioritize GlobalISel backend work, this patch adds two more tables to <Target>GenISelDAGISel.inc - one which contains the patterns that are used during selection and the other containing the include source location of the patterns. Enabled through the CMake variable LLVM_ENABLE_DAGISEL_COV llvm-svn: 295081
*	Add new pass LazyMachineBlockFrequencyInfo	Adam Nemet	2017-02-14	4	-8/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	And use it in MachineOptimizationRemarkEmitter. A test will follow on top of Justin's changes to enable MachineORE in AsmPrinter. The approach is similar to the IR-level pass. It's a bit simpler because BPI is immutable at the Machine level so we don't need to make that lazy. Because of this, a new function mapping is introduced (BPIPassTrait::getBPI). This function extracts BPI from the pass. In case of the lazy pass, this is when the calculation of the BFI occurs. For Machine-level, this is the identity function. Differential Revision: https://reviews.llvm.org/D29836 llvm-svn: 295072
*	Removing a redundant assignment	Artyom Skrobov	2017-02-14	1	-1/+0
\| \| \| \|	llvm-svn: 295055
*	[MC] Fix some Clang-tidy modernize and Include What You Use warnings; other ↵	Eugene Zelenko	2017-02-14	2	-42/+80
\| \| \| \| \| \| \| \|	minor fixes (NFC). Same changes in files affected by reduced MC headers dependencies. llvm-svn: 295009
*	GlobalISel: represent atomic loads & stores via the MachineMemOperand.	Tim Northover	2017-02-13	1	-11/+4
\| \| \| \| \| \| \|	Also make sure the AArch64 backend doesn't try to convert them into normal loads and stores. llvm-svn: 294993
*	MIR: parse & print the atomic parts of a MachineMemOperand.	Tim Northover	2017-02-13	2	-2/+49
\| \| \| \| \| \|	We're going to need them very soon for GlobalISel. llvm-svn: 294992
*	Address post-commit comments for https://reviews.llvm.org/D29596. NFCI.	Taewook Oh	2017-02-13	1	-1/+1
\| \| \| \|	llvm-svn: 294985
*	swiftcc: Don't emit tail calls from callers with swifterror parameters	Arnold Schwaighofer	2017-02-13	1	-0/+9
\| \| \| \| \| \| \| \| \|	Backends don't support this yet. They would have to move to the swifterror register before the tail call to make sure it is live-in to the call. rdar://30495920 llvm-svn: 294982
*	Make MachineBasicBlock::updateTerminator to update DebugLoc as well	Taewook Oh	2017-02-13	1	-2/+21
	Summary: Currently MachineBasicBlock::updateTerminator simply drops DebugLoc for newly created branch instructions, which may cause incorrect stepping and/or imprecise sample profile data. Below is an example:

```
1 extern int bar(int x);
2
3 int foo(int begin, int end) {
4   int i;
5   int ret = 0;
6   for (
7     i = begin ;
8     i != end ;
9     i++)
10  {
11    ret += bar(i);
12  }
13  return ret;
14 }
```

Below is a bitcode of 'foo' at the end of LLVM-IR level optimizations with -O3:

```
define i32 @foo(i32* readonly %begin, i32* readnone %end) !dbg !4 {
entry:
  %cmp6 = icmp eq i32* %begin, %end, !dbg !9
  br i1 %cmp6, label %for.end, label %for.body.preheader, !dbg !12

for.body.preheader: ; preds = %entry
  br label %for.body, !dbg !13

for.body: ; preds = %for.body.preheader, %for.body
  %ret.08 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ]
  %i.07 = phi i32* [ %incdec.ptr, %for.body ], [ %begin, %for.body.preheader ]
  %0 = load i32, i32* %i.07, align 4, !dbg !13, !tbaa !15
  %call = tail call i32 @bar(i32 %0), !dbg !19
  %add = add nsw i32 %call, %ret.08, !dbg !20
  %incdec.ptr = getelementptr inbounds i32, i32* %i.07, i64 1, !dbg !21
  %cmp = icmp eq i32* %incdec.ptr, %end, !dbg !9
  br i1 %cmp, label %for.end.loopexit, label %for.body, !dbg !12, !llvm.loop !22

for.end.loopexit: ; preds = %for.body
  br label %for.end, !dbg !24

for.end: ; preds = %for.end.loopexit, %entry
  %ret.0.lcssa = phi i32 [ 0, %entry ], [ %add, %for.end.loopexit ]
  ret i32 %ret.0.lcssa, !dbg !24
}
```

where

```
!12 = !DILocation(line: 6, column: 3, scope: !11)
```

As you can see, the terminator of the 'entry' block, which is a loop control branch, has a DebugLoc of line 6, column 3.

However, after the execution of the 'MachineBasicBlock::updateTerminator' function, which is triggered by the MachineSinking pass, the DebugLoc info is dropped as below (note that there is no debug-location for JNE_1):

```
bb.0.entry:
  successors: %bb.4(0x30000000), %bb.1.for.body.preheader(0x50000000)
  liveins: %rdi, %rsi

  %6 = COPY %rsi
  %5 = COPY %rdi
  %8 = SUB64rr %5, %6, implicit-def %eflags, debug-location !9
  JNE_1 %bb.1.for.body.preheader, implicit %eflags
```

This patch addresses this issue and makes newly created branch instructions keep their debug-location info.

Reviewers: aprantl, MatzeB, craig.topper, qcolombet
Reviewed By: qcolombet
Subscribers: qcolombet, llvm-commits
Differential Revision: https://reviews.llvm.org/D29596
llvm-svn: 294976
*	[FastISel] Add a diagnostic to warn on fallback.	Quentin Colombet	2017-02-13	1	-0/+13
\| \| \| \| \| \| \| \|	This is consistent with what we do for GlobalISel. That way, it is easy to see whether or not FastISel is able to fully select a function. At some point we may want to switch that to an optimization remark. llvm-svn: 294970