bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	CodeGen, Target: Move Mach-O-specific symbol name logic to Mach-O lowering.	Peter Collingbourne	2015-11-03	3	-23/+32
	A profile of an LTO link of Chrome revealed that we were spending some ~30-50% of execution time in the function Constant::getRelocationInfo(), which is called from TargetLoweringObjectFile::getKindForGlobal() and in turn from TargetMachine::getNameWithPrefix(). It turns out that we only need the result of getKindForGlobal() when targeting Mach-O, so this change moves the relevant part of the logic to TargetLoweringObjectFileMachO. NFCI.
	Differential Revision: http://reviews.llvm.org/D14168
	llvm-svn: 252014
*	AMDGPU: Make flat_scratch name consistent	Matt Arsenault	2015-11-03	1	-3/+3
	The printed name and the parsed assembler names weren't the same. I'm not sure which name SC prints these as, but I think it's this one.
	llvm-svn: 252010
*	AMDGPU: Fix asserts on invalid register ranges	Matt Arsenault	2015-11-03	1	-5/+13
	If the requested SGPR was not actually aligned, it was accepted and rounded down instead of rejected. Also fix an assert if the range is an invalid size.
	llvm-svn: 252009
*	AMDGPU: Fix off by one error in register parsing	Matt Arsenault	2015-11-03	1	-4/+5
	If trying to use one past the end, this would assert.
	llvm-svn: 252008
*	Align whitespace	Derek Schuff	2015-11-03	2	-4/+4
	llvm-svn: 252003
*	[WebAssembly] Support wasm select operator	Derek Schuff	2015-11-03	2	-0/+10
	Summary: Add support for wasm's select operator, and lower LLVM's select DAG node to it.
	Reviewers: sunfish
	Subscribers: dschuff, llvm-commits, jfb
	Differential Revision: http://reviews.llvm.org/D14295
	llvm-svn: 252002
*	AMDGPU: s[102:103] is unavailable on VI	Matt Arsenault	2015-11-03	1	-1/+10
	llvm-svn: 252000
*	AMDGPU: Define correct number of SGPRs	Matt Arsenault	2015-11-03	2	-6/+10
	There are actually 104 so 2 were missing. More assembler tests with high register number tuples will be included in later patches.
	llvm-svn: 251999
*	AMDGPU: Make findUsedSGPR more readable	Matt Arsenault	2015-11-03	1	-7/+18
	Add more comments etc.
	llvm-svn: 251996
*	AMDGPU: Initialize SIFixSGPRCopies so -print-after works	Matt Arsenault	2015-11-03	3	-8/+15
	llvm-svn: 251995
*	AMDGPU: Alphabetize includes	Matt Arsenault	2015-11-03	1	-1/+1
	llvm-svn: 251994
*	InstCombine: fix sinking of convergent calls	Fiona Glaser	2015-11-03	1	-0/+6
	llvm-svn: 251991
*	[SelectionDAG] Use existing constant nodes instead of recreating them. NFC.	Simon Pilgrim	2015-11-03	1	-9/+6
	llvm-svn: 251990
*	[LLVMSymbolize] Factor out the logic for printing structs from DIContext. NFC.	Alexey Samsonov	2015-11-03	3	-61/+76
	Introduce DIPrinter which takes care of rendering DILineInfo and friends. This allows the LLVMSymbolizer class to return structured data instead of plain std::strings.
	llvm-svn: 251989
*	[LAA] LLE 3/6: Rename InterestingDependence to Dependences, NFC	Adam Nemet	2015-11-03	2	-33/+26
	Summary: We now collect all types of dependences including lexically forward deps, not just "interesting" ones.
	Reviewers: hfinkel
	Subscribers: rengolin, llvm-commits
	Differential Revision: http://reviews.llvm.org/D13256
	llvm-svn: 251985
*	[LLVMSymbolize] Move demangling away from printing routines. NFC.	Alexey Samsonov	2015-11-03	1	-28/+33
	Make printDILineInfo and friends responsible for just rendering the contents of the structures; demangling should actually be performed earlier, when we have the information about the originating SymbolizableModule at hand.
	llvm-svn: 251981
*	[SimplifyLibCalls] Add a new transformation: pow(exp(x), y) -> exp(x*y)	Davide Italiano	2015-11-03	1	-0/+26
	This one is enabled only under -ffast-math (due to rounding/overflows) but allows us to emit shorter code.
	Before (on FreeBSD x86-64):
	  4007f0: 50                      push   %rax
	  4007f1: f2 0f 11 0c 24          movsd  %xmm1,(%rsp)
	  4007f6: e8 75 fd ff ff          callq  400570 <exp2@plt>
	  4007fb: f2 0f 10 0c 24          movsd  (%rsp),%xmm1
	  400800: 58                      pop    %rax
	  400801: e9 7a fd ff ff          jmpq   400580 <pow@plt>
	  400806: 66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
	  40080d: 00 00 00
	After:
	  4007b0: f2 0f 59 c1             mulsd  %xmm1,%xmm0
	  4007b4: e9 87 fd ff ff          jmpq   400540 <exp2@plt>
	  4007b9: 0f 1f 80 00 00 00 00    nopl   0x0(%rax)
	Differential Revision: http://reviews.llvm.org/D14045
	llvm-svn: 251976
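The "rounding/overflows" caveat in the message above can be illustrated with a small sketch. This is not code from the commit: the helper names powOfExp, expOfProduct and nearlyEqual are hypothetical, and the snippet only compares the two mathematical forms by hand rather than exercising the SimplifyLibCalls transformation itself.

```cpp
#include <cassert>
#include <cmath>

// The form before the rewrite: pow(exp(x), y).
double powOfExp(double x, double y) { return std::pow(std::exp(x), y); }

// The form after the rewrite: exp(x * y).
double expOfProduct(double x, double y) { return std::exp(x * y); }

// Relative comparison; the two forms usually agree only up to rounding.
bool nearlyEqual(double a, double b, double relTol) {
  assert(relTol > 0.0);
  return std::fabs(a - b) <= relTol * std::fmax(std::fabs(a), std::fabs(b));
}
```

For moderate inputs such as x = 1.5, y = 2.0 the two functions agree to within rounding. For x = 800.0, y = 0.5 the intermediate exp(800) overflows to infinity in the first form while exp(400) stays finite in the second, which is one reason a rewrite like this is only valid under relaxed floating-point rules.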
*	[X86][XOP] Add support for the matching of the VPCMOV bit select instruction	Simon Pilgrim	2015-11-03	2	-0/+21
	XOP has the VPCMOV instruction that performs the common vector bit select operation OR( AND( SRC1, SRC3 ), AND( SRC2, ~SRC3 ) ).
	This patch adds tablegen pattern matching for this instruction.
	Differential Revision: http://reviews.llvm.org/D8841
	llvm-svn: 251975
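The bit select formula quoted above can be written as a scalar sketch. The function name bitSelect and the use of a 64-bit scalar are assumptions made for illustration; the commit itself adds tablegen patterns for vector types, not C++ code like this.

```cpp
#include <cassert>
#include <cstdint>

// Bitwise select: each result bit comes from src1 where the mask bit is 1
// and from src2 where the mask bit is 0, i.e. (src1 & mask) | (src2 & ~mask).
constexpr std::uint64_t bitSelect(std::uint64_t src1, std::uint64_t src2,
                                  std::uint64_t mask) {
  return (src1 & mask) | (src2 & ~mask);
}

// Hypothetical self-check: upper 32 bits from src1, lower 32 bits from src2.
inline void checkBitSelect() {
  assert(bitSelect(0xFFFF0000FFFF0000ULL, 0x0000FFFF0000FFFFULL,
                   0xFFFFFFFF00000000ULL) == 0xFFFF00000000FFFFULL);
}
```

An all-ones mask returns src1 unchanged and an all-zeros mask returns src2 unchanged, which is a quick sanity check on the operand order.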
*	[LAA] LLE 2/6: Fix a NoDep case that should be a Forward dependence	Adam Nemet	2015-11-03	1	-1/+1
	Summary: When the dependence distance is zero then we have a loop-independent dependence from the earlier to the later access.
	No current client of LAA uses forward dependences so other than potentially hitting the MaxDependences threshold earlier, this change shouldn't affect anything right now.
	This and the previous patch were tested together for compile-time regression. None found in LNT/SPEC.
	Reviewers: hfinkel
	Subscribers: rengolin, llvm-commits
	Differential Revision: http://reviews.llvm.org/D13255
	llvm-svn: 251973
*	[LAA] LLE 1/6: Expose Forward dependences	Adam Nemet	2015-11-03	1	-13/+1
	Summary: Before this change, we did not collect forward dependences since none of the current clients (LV, LDist) required them.
	The motivation to also collect forward dependences is a new pass LoopLoadElimination (LLE) which discovers store-to-load forwarding opportunities across the loop's backedge. The pass uses both lexically forward and backward loop-carried dependences to detect these opportunities.
	The new pass also analyzes loop-independent (forward) dependences since they can conflict with the loop-carried dependences in terms of how the data flows through memory.
	The newly added test only covers loop-carried forward dependences because loop-independent ones are currently categorized as NoDep. The next patch will fix this.
	The two patches were tested together for compile-time regression. None found in LNT/SPEC.
	Note that with this change LAA provides all dependences rather than just "interesting" ones. A subsequent NFC patch will remove the now trivial isInterestingDependence and rename the APIs.
	Reviewers: hfinkel
	Subscribers: jmolloy, rengolin, llvm-commits
	Differential Revision: http://reviews.llvm.org/D13254
	llvm-svn: 251972
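As a rough illustration of the store-to-load forwarding opportunity across a backedge that the message above describes, the following hand-written sketch shows a loop in which each iteration reloads the value stored by the previous iteration, next to a version that carries that value in a scalar. The function names are hypothetical, and this is not the pass itself (which is not part of this commit), only the kind of source-level pattern such a pass could target.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Loop with a loop-carried dependence: the load of a[i] in iteration i
// reads the value that iteration i - 1 stored to a[i].
void loopWithReload(std::vector<int>& a, const std::vector<int>& b) {
  assert(a.size() == b.size() + 1);
  for (std::size_t i = 0; i < b.size(); ++i)
    a[i + 1] = a[i] + b[i];
}

// Same computation with the stored value forwarded in a scalar,
// so the loop body no longer loads from a.
void loopWithForwarding(std::vector<int>& a, const std::vector<int>& b) {
  assert(a.size() == b.size() + 1);
  int carried = a[0];  // initial load hoisted out of the loop
  for (std::size_t i = 0; i < b.size(); ++i) {
    carried = carried + b[i];
    a[i + 1] = carried;
  }
}
```

Both functions leave a with the same contents. The second form is only valid when no other store in the loop can overwrite a[i] between the store and the load, which is why the dependence information collected by LAA matters for a transformation like this.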
*	Don't create empty sections just to look like gas.	Rafael Espindola	2015-11-03	1	-10/+0
	We are long past the time when this much bug-for-bug compatibility was useful.
	llvm-svn: 251970
*	Revert "Move metadata linking after lazy global materialization/linking."	Teresa Johnson	2015-11-03	1	-9/+9
	This reverts commit r251926. I believe this is causing an LTO bootstrapping bot failure (http://lab.llvm.org:8080/green/job/llvm-stage2-cmake-RgLTO_build/3669/).
	Haven't been able to repro it yet, but after looking at the metadata I am pretty sure I know what is going on.
	llvm-svn: 251965
*	[libFuzzer] make -test_single_input more reliable: make sure the input's size is equal to its capacity	Kostya Serebryany	2015-11-03	1	-1/+3
	llvm-svn: 251961
*	Delete dead code.	Rafael Espindola	2015-11-03	2	-9/+0
	llvm-svn: 251960
*	Simplify local common output.	Rafael Espindola	2015-11-03	1	-20/+14
	We now create them as they are found and use higher level APIs. This is a step in avoiding creating unnecessary sections.
	llvm-svn: 251958
*	[CodegenPrepare] Do not rematerialize gc.relocates across different basic blocks	Igor Laevsky	2015-11-03	1	-0/+8
	Differential Revision: http://reviews.llvm.org/D14258
	llvm-svn: 251957
*	Move code out of a loop and use a range loop.	Rafael Espindola	2015-11-03	1	-10/+8
	llvm-svn: 251952
*	Revert "Revert "[Orc] Directly emit machine code for the x86 resolver block and trampolines.""	Rafael Espindola	2015-11-03	4	-150/+90
	This reverts commit r251937.
	The test was updated to the new API, so bring the API back.
	llvm-svn: 251944
*	Fix PR25372 - teach replaceCongruentPHIs to handle cases where SE evaluates a PHI to a SCEVConstant	Silviu Baranga	2015-11-03	1	-1/+14
	Summary: Since now Scalar Evolution can create non-add rec expressions for PHI nodes, it can also create SCEVConstant expressions.
	This will confuse replaceCongruentPHIs, which previously relied on the fact that SCEV could not produce constants in this case.
	We will now replace the node with a constant in these cases - or avoid processing the Phi in case of a type mismatch.
	Reviewers: sanjoy
	Subscribers: llvm-commits, majnemer
	Differential Revision: http://reviews.llvm.org/D14230
	llvm-svn: 251938
*	Revert "[Orc] Directly emit machine code for the x86 resolver block and trampolines."	Rafael Espindola	2015-11-03	4	-90/+150
	This reverts commit r251933.
	It broke the build of examples/Kaleidoscope/Orc/fully_lazy/toy.cpp.
	llvm-svn: 251937
*	[Orc] Directly emit machine code for the x86 resolver block and trampolines.	Lang Hames	2015-11-03	4	-150/+90
	Bypassing LLVM for this has a number of benefits:
	1) Laziness support becomes asm-syntax agnostic (previously lazy jitting didn't work on Windows as the resolver block was in Darwin asm).
	2) For cross-process JITs, it allows resolver blocks and trampolines to be emitted directly in the target process, reducing cross process traffic.
	3) It should be marginally faster.
	llvm-svn: 251933
*	Move metadata linking after lazy global materialization/linking.	Teresa Johnson	2015-11-03	1	-9/+9
	Summary: Currently, named metadata is linked before the LazilyLinkGlobalValues list is walked and materialized/linked. As a result, references from DISubprogram and DIGlobalVariable metadata to yet unmaterialized functions and variables cause them to be added to the lazy linking list and their definitions are materialized and linked.
	This makes the llvm-link -only-needed option not have the intended effect when debug information is present, as the otherwise unneeded functions/variables are still linked in.
	Additionally, for ThinLTO I have implemented a mechanism to only link in debug metadata needed by imported functions. Moving named metadata linking after lazy GV linking will facilitate applying this mechanism to the LTO and "llvm-link -only-needed" cases as well.
	Reviewers: dexonsmith, tra, dblaikie
	Subscribers: llvm-commits
	Differential Revision: http://reviews.llvm.org/D14195
	llvm-svn: 251926
*	Don't assert if materializing before seeing any function bodies	Filipe Cabecinhas	2015-11-03	1	-1/+3
	This assert was reachable from user input. A minimized test case (no FUNCTION_BLOCK_ID record) is attached.
	Bug found with afl-fuzz
	llvm-svn: 251910
*	Don't use Twine objects after their lifetimes end.	Filipe Cabecinhas	2015-11-03	1	-6/+6
	No test, since it would depend on what the compiler can optimize/reuse. My next commit made this bug visible on Linux Release compiles with some versions of gcc.
	llvm-svn: 251909
*	LoopVectorizer - skip 'bitcast' between GEP and load.	Elena Demikhovsky	2015-11-03	1	-2/+28
	Skipping 'bitcast' in this case allows the load to be vectorized:
	  %arrayidx = getelementptr inbounds double, double* %in, i64 %indvars.iv
	  %tmp53 = bitcast double** %arrayidx to i64*
	  %tmp54 = load i64, i64* %tmp53, align 8
	Differential Revision: http://reviews.llvm.org/D14112
	llvm-svn: 251907
*	[X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments	Michael Kuperstein	2015-11-03	4	-27/+67
	When push instructions are being used to pass function arguments on the stack, and either EH or debugging is enabled, we need to generate .cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is enough for the CFA offset to be correct at every call site, while for debugging we want to be correct after every push.
	Darwin does not support this well, so don't use pushes whenever it would be required.
	Differential Revision: http://reviews.llvm.org/D13767
	llvm-svn: 251904
*	AVX512: add encoding tests for vmovq/d instructions.	Igor Breger	2015-11-03	1	-1/+1
	llvm-svn: 251903
*	Revert "[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader"	Tobias Grosser	2015-11-03	1	-73/+0
	Commit 251839 triggers miscompiles on some bots:
	http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-fast/builds/13723
	(The commit is listed in 13722, but due to an existing failure introduced in 13721 and reverted in 13723 the failure is only visible in 13723)
	To verify r251839 is indeed the only change that triggered the buildbot failures and to ensure the buildbots remain green while investigating, I temporarily revert this commit. At the current state it is unclear if this commit introduced some miscompile or if it only exposed code to Polly that is subsequently miscompiled by Polly.
	llvm-svn: 251901
*	Fix build problem introduced in r251883	Matthias Braun	2015-11-03	1	-1/+1
	llvm-svn: 251888
*	RegisterPressure: Improve assert message	Matthias Braun	2015-11-03	1	-1/+2
	llvm-svn: 251885
*	RegisterPressure: Slightly nicer pressure diff dumping	Matthias Braun	2015-11-03	1	-1/+3
	llvm-svn: 251884
*	ScheduleDAGInstrs: Remove IsPostRA flag; NFC	Matthias Braun	2015-11-03	5	-39/+28
	ScheduleDAGInstrs doesn't behave differently before or after register allocation. It was only used in a method of MachineSchedulerBase which behaved differently in MachineScheduler/PostMachineScheduler. Change this to let MachineScheduler/PostMachineScheduler just pass in a parameter to that function.
	The order of the LiveIntervals* and bool RemoveKillFlags parameters has been switched to make out-of-tree code fail instead of unintentionally passing a value intended for the IsPostRA flag to the (previously following and default initialized) RemoveKillFlags.
	Differential Revision: http://reviews.llvm.org/D14245
	llvm-svn: 251883
*	This never returns end(), simplify to use Child instead of iterator. NFC.	Rafael Espindola	2015-11-03	1	-3/+2
	llvm-svn: 251876
*	[Hexagon] Fixing mistaken case fallthrough.	Colin LeMahieu	2015-11-03	1	-0/+1
	llvm-svn: 251867
*	Restore "Support for ThinLTO function importing and symbol linking."	Teresa Johnson	2015-11-03	4	-44/+401
	This restores commit r251837, with the new library dependence added to llvm-link/Makefile to address bot failures.
	llvm-svn: 251866
*	AMDGPU: Stop assuming vreg for build_vector	Matt Arsenault	2015-11-02	2	-20/+40
	This was causing a variety of test failures when v2i64 is added as a legal type.
	SIFixSGPRCopies should correctly handle the case of vector inputs to a scalar reg_sequence, so this isn't necessary anymore. This was hiding some deficiencies in how reg_sequence is handled later, but this shouldn't be a problem anymore since the register class copy of a reg_sequence is now done before the reg_sequence.
	llvm-svn: 251860
*	[WebAssembly] Make WebAssemblyCodeGen depend on WebAssemblyAsmPrinter	Derek Schuff	2015-11-02	1	-1/+1
	llvm-svn: 251859
*	AMDGPU: Error on graphics shaders with HSA	Matt Arsenault	2015-11-02	1	-0/+8
	I've found myself pointlessly debugging problems from running graphics tests with an HSA triple a few times, so stop this from happening again.
	llvm-svn: 251858
*	[CGP] widen switch condition and case constants to target's register width (2nd try)	Sanjay Patel	2015-11-02	1	-0/+47
	This is a redo of r251849 except the tests have been split into arch-specific folders to hopefully make the bots happy.
	This is a follow-up from the discussion in D12965. The block-at-a-time limitation of SelectionDAG also came up in D13297.
	Without the InstCombine change from D12965, I don't expect this patch to make any difference in the real world because InstCombine does not shrink cases like this in visitSwitchInst(). But we need to have this CGP safety harness in place before proceeding with any shrinkage in D12965, so we won't generate extra extends for compares.
	I've opted for IR regression tests in the patch because that seems like a clearer way to test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86 will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473
	Before:
	  BB#0:
	    mr 4, 3
	    extsh. 3, 4
	    ble 0, .LBB0_5
	  BB#1:
	    cmpwi 3, 99
	    bgt 0, .LBB0_9
	  BB#2:
	    rlwinm 4, 4, 0, 16, 31   <--- 32-bit mask/extend
	    li 3, 0
	    cmplwi 4, 1
	    beqlr 0
	  BB#3:
	    cmplwi 4, 10
	    bne 0, .LBB0_12
	  BB#4:
	    li 3, 1
	    blr
	  .LBB0_5:
	    rlwinm 3, 4, 0, 16, 31   <--- 32-bit mask/extend
	    cmplwi 3, 65436
	    beq 0, .LBB0_13
	  BB#6:
	    cmplwi 3, 65526
	    beq 0, .LBB0_15
	  BB#7:
	    cmplwi 3, 65535
	    bne 0, .LBB0_12
	  BB#8:
	    li 3, 4
	    blr
	  .LBB0_9:
	    rlwinm 3, 4, 0, 16, 31   <--- 32-bit mask/extend
	    cmplwi 3, 100
	    beq 0, .LBB0_14
	    ...
	After:
	  BB#0:
	    rlwinm 4, 3, 0, 16, 31   <--- mask/extend to 32-bit and then use that for comparisons
	    cmpwi 4, 999
	    ble 0, .LBB0_5
	  BB#1:
	    lis 3, 0
	    ori 3, 3, 65525
	    cmpw 4, 3
	    bgt 0, .LBB0_9
	  BB#2:
	    cmplwi 4, 1000
	    beq 0, .LBB0_14
	  BB#3:
	    cmplwi 4, 65436
	    bne 0, .LBB0_13
	  BB#4:
	    li 3, 6
	    blr
	  .LBB0_5:
	    li 3, 0
	    cmplwi 4, 1
	    beqlr 0
	  BB#6:
	    cmplwi 4, 10
	    beq 0, .LBB0_12
	  BB#7:
	    cmplwi 4, 100
	    bne 0, .LBB0_13
	  BB#8:
	    li 3, 2
	    blr
	  .LBB0_9:
	    cmplwi 4, 65526
	    beq 0, .LBB0_15
	  BB#10:
	    cmplwi 4, 65535
	    bne 0, .LBB0_13
	    ...
	Differential Revision: http://reviews.llvm.org/D13532
	llvm-svn: 251857
*	AMDGPU: Distribute SGPR->VGPR copies of REG_SEQUENCE	Matt Arsenault	2015-11-02	1	-23/+89
	Make the REG_SEQUENCE be a VGPR, and do the register class copy first.
	llvm-svn: 251855