bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[HotColdSplit] Calculate domtrees lazily to reduce compile-time, NFC	Vedant Kumar	2019-01-22	1	-25/+21
\| \| \| \| \| \| \| \| \| \|	The splitting pass does not need (post)domtrees until after it's found a cold block. Defer domtree calculation until a cold block is found. For the sqlite3 amalgamation, this reduces time spent in the splitting pass from 0.8% of the total to 0.4%. llvm-svn: 351892
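The deferral described above can be sketched in isolation. This is a minimal illustration, not the pass's actual code: `Analysis`, `Lazy`, and `countAnalysisBuilds` are hypothetical names, and the "cold block" flags stand in for whatever the pass uses to detect coldness.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an expensive analysis such as a dominator tree.
struct Analysis {
  int numBlocks;
};

// Builds the wrapped value on first access and caches it afterwards.
template <typename T>
class Lazy {
public:
  explicit Lazy(std::function<T()> build) : build_(std::move(build)) {
    assert(build_ && "builder must be callable");
  }

  T &get() {
    if (!value_)
      value_ = std::make_unique<T>(build_());
    return *value_;
  }

private:
  std::function<T()> build_;
  std::unique_ptr<T> value_;
};

// Scans the blocks and returns how many times the analysis was built.
// isCold[i] marks block i as cold (a hypothetical input for this sketch).
int countAnalysisBuilds(const std::vector<bool> &isCold) {
  int builds = 0;
  Lazy<Analysis> domTree([&] {
    ++builds;
    return Analysis{static_cast<int>(isCold.size())};
  });
  for (bool cold : isCold)
    if (cold)
      (void)domTree.get(); // the first cold block triggers the only build
  return builds;
}
```

A function with no cold blocks never pays for the analysis, which is where the compile-time saving would come from under these assumptions.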
*	[LegalizeTypes] Add debug prints to the top of PromoteFloatOperand and PromoteFloatResult.	Craig Topper	2019-01-22	1	-0/+12
\| \| \| \| \| \| \| \| \| \|	Also add debug prints in the default case of the switches in these routines. Most if not all of the type legalization handlers already do this, so this makes promoting floats consistent. llvm-svn: 351890
*	AMDGPU/GlobalISel: Start selectively legalizing 16-bit operations	Matt Arsenault	2019-01-22	1	-4/+9
\| \| \| \| \| \| \| \|	It might be a bit nicer to use the fancy .legalIf and co. predicates, but that requires more boilerplate and disables the coverage assertions. llvm-svn: 351886
*	AMDGPU/GlobalISel: Handle legality/regbanks for 32/64-bit shifts	Matt Arsenault	2019-01-22	2	-2/+5
\| \| \| \|	llvm-svn: 351884
*	FileOutputBuffer: handle mmap(2) failure	Rui Ueyama	2019-01-22	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the underlying filesystem does not support the mmap system call, FileOutputBuffer may fail when it attempts to mmap an output temporary file. This patch handles that situation. Unfortunately, it looks like it is very hard to test this functionality using llvm-lit without a filesystem that doesn't support mmap. I tested this locally by passing an invalid parameter to mmap so that it fails and falls back to the in-memory buffer. Maybe that's all we can do. I believe it is reasonable to submit this without a test. Differential Revision: https://reviews.llvm.org/D56949 llvm-svn: 351883
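The fallback can be sketched with plain POSIX calls. This is an assumption-laden illustration, not the FileOutputBuffer implementation: the `OutputBuffer` class is hypothetical, and the step that would later write the in-memory bytes to the file is omitted.

```cpp
#include <sys/mman.h>

#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical output buffer: backed by mmap when possible, else by heap memory.
class OutputBuffer {
public:
  OutputBuffer(int fd, std::size_t size) : size_(size) {
    assert(size > 0 && "mmap rejects zero-length mappings");
    void *p = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p != MAP_FAILED)
      mapped_ = p;
    else
      memory_.resize(size); // mmap failed: fall back to an in-memory buffer
  }

  ~OutputBuffer() {
    if (mapped_)
      munmap(mapped_, size_);
  }

  OutputBuffer(const OutputBuffer &) = delete;
  OutputBuffer &operator=(const OutputBuffer &) = delete;

  bool isMapped() const { return mapped_ != nullptr; }
  char *data() { return mapped_ ? static_cast<char *>(mapped_) : memory_.data(); }
  std::size_t size() const { return size_; }

private:
  void *mapped_ = nullptr;
  std::size_t size_;
  std::vector<char> memory_;
};
```

Passing an invalid file descriptor such as `-1` makes `mmap` fail, which exercises the fallback path in roughly the way the message describes testing it locally.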
*	GlobalISel: Allow shift amount to be a different type	Matt Arsenault	2019-01-22	8	-48/+115
\| \| \| \| \| \| \| \| \|	For AMDGPU the shift amount is never 64-bit, and this needs to use a 32-bit shift. X86 uses i8, but seemed to be hacking around this before. llvm-svn: 351882
*	[FileCheck] Suppress old -v/-vv diags if dumping input	Joel E. Denny	2019-01-22	1	-17/+39
	The old diagnostic form of the trace produced by -v and -vv looks like:
	```
	check1:1:8: remark: CHECK: expected string found in input
	CHECK: abc
	       ^
	<stdin>:1:3: note: found here
	; abc def
	  ^~~
	```
	When dumping annotated input is requested (via -dump-input), I find that this old trace is not useful and is sometimes harmful:
	1. The old trace is mostly redundant because the same basic information also appears in the input dump's annotations.
	2. The old trace buries any error diagnostic between it and the input dump, but I find it useful to see any error diagnostic up front.
	3. FILECHECK_OPTS=-dump-input=fail requests annotated input dumps only for failed FileCheck calls. However, I have to also add -v or -vv to get a full set of annotations, and that can produce massive output from all FileCheck calls in all tests. That's a real problem when I run this in the IDE I use, which grinds to a halt as it tries to capture all that output.
	When -dump-input=fail|always, this patch suppresses the old trace from -v or -vv. Error diagnostics still print as usual. If you want the old trace, perhaps to see variable expansions, you can set -dump-input=none (the default).
	Reviewed By: probinson
	Differential Revision: https://reviews.llvm.org/D55825
	llvm-svn: 351881
*	GlobalISel: Make buildConstant handle vectors	Matt Arsenault	2019-01-22	1	-4/+38
\| \| \| \| \| \| \|	Produce a splat build_vector similar to how SelectionDAG::getConstant does. llvm-svn: 351880
*	GlobalISel: Implement widen for extract_vector_elt elt type	Matt Arsenault	2019-01-22	2	-4/+32
\| \| \| \|	llvm-svn: 351871
*	GlobalISel: Implement fewerElementsVector for basic FP ops	Matt Arsenault	2019-01-22	2	-27/+65
\| \| \| \|	llvm-svn: 351866
*	Add missing include (cstdlib) to Demangle.h	Konstantin Zhuravlyov	2019-01-22	1	-0/+1
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D57035 llvm-svn: 351861
*	AMDGPU/GlobalISel: Remove vectors from legal constant types	Matt Arsenault	2019-01-22	1	-1/+1
\| \| \| \|	llvm-svn: 351859
*	GlobalISel: Support narrowing zextload/sextload	Matt Arsenault	2019-01-22	2	-0/+45
\| \| \| \|	llvm-svn: 351856
*	[SelectionDAGBuilder] Defer C_Register Assignments to be in line with those of C_RegisterClass. NFCI.	Nirav Dave	2019-01-22	1	-13/+3
\| \| \| \| \| \|	llvm-svn: 351854
*	GlobalISel: Disallow vectors for G_CONSTANT/G_FCONSTANT	Matt Arsenault	2019-01-22	2	-4/+12
\| \| \| \|	llvm-svn: 351853
*	FileOutputBuffer: Handle "-" as stdout.	Rui Ueyama	2019-01-22	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I was honestly a bit surprised that we didn't do this before. This patch is to handle "-" as the stdout so that if you pass `-o -` to lld, for example, it writes an output to stdout instead of file `-`. I thought that we might want to handle this at a higher level than FileOutputBuffer, because if we land this patch, we can no longer create a file whose name is `-` (there's a workaround though; you can pass `./-` instead of `-`). However, because raw_fd_ostream already handles `-` as a special file name, I think it's okay and actually consistent to handle `-` as a special name in FileOutputBuffer. Differential Revision: https://reviews.llvm.org/D56940 llvm-svn: 351852
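The naming rule can be stated as a tiny predicate. This is only a sketch of the rule as described in the message; `OutputTarget` and `classifyOutputPath` are hypothetical names, not part of FileOutputBuffer.

```cpp
#include <cassert>
#include <string>

enum class OutputTarget { Stdout, File };

// Treats exactly "-" as standard output; every other name is a regular path.
OutputTarget classifyOutputPath(const std::string &path) {
  return path == "-" ? OutputTarget::Stdout : OutputTarget::File;
}
```

Because only the exact string `-` is special, the `./-` workaround from the message still names an ordinary file.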
*	Codegen support for atomicrmw fadd/fsub	Matt Arsenault	2019-01-22	11	-17/+63
\| \| \| \|	llvm-svn: 351851
*	Reapply "IR: Add fp operations to atomicrmw"	Matt Arsenault	2019-01-22	10	-14/+76
\| \| \| \| \| \| \|	This reapplies commits r351778 and r351782 with RISCV test fixes. llvm-svn: 351850
*	[DEBUGINFO, NVPTX] Enable support for the debug info on NVPTX target.	Alexey Bataev	2019-01-22	3	-12/+8
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Enable full support for the debug info. Reviewers: echristo Subscribers: jholewinski, aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D46189 llvm-svn: 351846
*	Revert r351520, "Re-enable terminator folding in LoopSimplifyCFG"	Jordan Rupprecht	2019-01-22	1	-1/+1
\| \| \| \| \| \|	This is still causing compilation crashes in some targets. Will follow up shortly with a repro. llvm-svn: 351845
*	[DEBUG_INFO, NVPTX] Fix relocation info.	Alexey Bataev	2019-01-22	4	-22/+57
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Initial function labels must follow the debug location for the correct relocation info generation. Reviewers: tra, jlebar, echristo Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D45784 llvm-svn: 351843
*	[DAGCombiner] narrow vector binop with 2 insert subvector operands	Sanjay Patel	2019-01-22	1	-1/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	vecbo (insertsubv undef, X, Z), (insertsubv undef, Y, Z) --> insertsubv VecC, (vecbo X, Y), Z This is another step in generic vector narrowing. It's also a step towards more horizontal op formation specifically for x86 (although we still failed to match those in the affected tests). The scalarization cases are also not optimal (we should be scalarizing those), but it's still an improvement to use a narrower vector op when we know part of the result must be constant because both inputs are undef in some vector lanes. I think a similar match but checking for a constant operand might help some of the cases in D51553. Differential Revision: https://reviews.llvm.org/D56875 llvm-svn: 351825
*	[RISCV][NFC] Change naming scheme for RISC-V specific DAG nodes	Alex Bradbury	2019-01-22	1	-43/+50
\| \| \| \| \| \| \| \| \| \|	Previously we had names like 'Call' or 'Tail'. This potentially clashes with the naming scheme used elsewhere in RISCVInstrInfo.td. Many other backends would use names like AArch64call or PPCtail. I prefer the SystemZ approach, which uses prefixed all-lowercase names. This matches the naming scheme used for target-independent SelectionDAG nodes. llvm-svn: 351823
*	[X86][SSE] Canonicalize OR(AND(X,C),AND(Y,~C)) -> OR(AND(X,C),ANDNP(C,Y))	Simon Pilgrim	2019-01-22	1	-0/+70
\| \| \| \| \| \| \| \| \| \|	For constant bit select patterns, replace one AND with an ANDNP, allowing us to reuse the constant mask. Only do this if the mask has multiple uses (to avoid losing load folding) or if we have XOP as its VPCMOV can handle most folding commutations. This also requires computeKnownBitsForTargetNode support for X86ISD::ANDNP and X86ISD::FOR to prevent regressions in fabs/fcopysign patterns. Differential Revision: https://reviews.llvm.org/D55935 llvm-svn: 351819
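The equivalence behind this canonicalization can be checked on scalars. The sketch below models ANDNP as `~a & b` on 32-bit integers; the function names are hypothetical and nothing here reflects the actual DAG combine code.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of the ANDNP operation: (NOT a) AND b.
constexpr std::uint32_t andnp(std::uint32_t a, std::uint32_t b) {
  return ~a & b;
}

// Original form: needs both C and ~C as constants.
constexpr std::uint32_t bitSelect(std::uint32_t x, std::uint32_t y,
                                  std::uint32_t c) {
  return (x & c) | (y & ~c);
}

// Canonical form: the single mask C feeds both the AND and the ANDNP.
constexpr std::uint32_t bitSelectAndnp(std::uint32_t x, std::uint32_t y,
                                       std::uint32_t c) {
  return (x & c) | andnp(c, y);
}

static_assert(bitSelect(0xAAAAAAAAu, 0x55555555u, 0xFFFF0000u) ==
                  bitSelectAndnp(0xAAAAAAAAu, 0x55555555u, 0xFFFF0000u),
              "both forms select the same bits");
```

In the second form only one constant is referenced, which is the mask reuse the message refers to.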
*	[X86][BtVer2] SSE2 vector shifts have local forwarding disabled	Simon Pilgrim	2019-01-22	1	-2/+2
\| \| \| \| \| \| \| \|	Similar to horizontal ops on D56777, the sse2 (but not mmx) bit shift ops have local forwarding disabled, adding +1cy to the use latency for the result. Differential Revision: https://reviews.llvm.org/D57026 llvm-svn: 351817
*	Fix "comparison of unsigned expression >= 0 is always true" warning. NFCI.	Simon Pilgrim	2019-01-22	1	-1/+1
\| \| \| \|	llvm-svn: 351816
*	[X86][BtVer2] X86ISD::VPERMILPV has local forwarding disabled	Simon Pilgrim	2019-01-22	1	-2/+2
\| \| \| \| \| \| \| \|	Similar to horizontal ops on D56777, the vpermilpd/vpermilps variable mask ops have local forwarding disabled, adding +1cy to the use latency for the result. Differential Revision: https://reviews.llvm.org/D57022 llvm-svn: 351815
*	[CostModel][X86] Add ICMP Predicate specific costs	Simon Pilgrim	2019-01-22	1	-8/+49
\| \| \| \| \| \| \| \|	First step towards PR40376, this patch adds support for getCmpSelInstrCost to use the (optional) Instruction CmpInst predicate to indicate the type of integer comparison we're performing and alter the costs accordingly. Differential Revision: https://reviews.llvm.org/D57013 llvm-svn: 351810
*	[X86][SSE] Add selective commutation support for insertps (PR40340)	Simon Pilgrim	2019-01-22	3	-0/+24
\| \| \| \| \| \| \| \|	When we are inserting 1 "inline" element, and zeroing 2 of the other elements then we can safely commute the insertps source inputs to improve memory folding. Differential Revision: https://reviews.llvm.org/D56843 llvm-svn: 351807
*	[RISCV] Quick fix for PR40333	Alex Bradbury	2019-01-22	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Avoid the infinite loop caused by the target DAG combine converting ANYEXT to SIGNEXT and the target-independent DAG combine logic converting back to ANYEXT. Do this by not adding the new node to the worklist. Committing directly as this definitely doesn't make the problem any worse, and I intend to follow up with a patch that avoids this custom combiner logic altogether and just lowers the i32 operations to a target-specific SelectionDAG node. This should be easier to reason about and improve codegen quality in some cases (though it may miss out on some later DAG combines). llvm-svn: 351806
*	[LoopPredication] Support guards expressed as branches by widenable condition	Max Kazantsev	2019-01-22	1	-4/+60
\| \| \| \| \| \| \| \| \| \|	This patch adds support for guards expressed as branches by widenable conditions in Loop Predication. Differential Revision: https://reviews.llvm.org/D56081 Reviewed By: reames llvm-svn: 351805
*	[NFC] Add function to parse widenable conditional branches	Max Kazantsev	2019-01-22	1	-17/+14
\| \| \| \|	llvm-svn: 351803
*	[X86] HADDPS/HADDPD scalar lowering was added at rL350421	Simon Pilgrim	2019-01-22	1	-12/+0
\| \| \| \|	llvm-svn: 351797
*	Revert r351778: IR: Add fp operations to atomicrmw	Chandler Carruth	2019-01-22	10	-76/+14
\| \| \| \| \| \| \| \| \| \| \| \| \|	This broke the RISCV build, and even with that fixed, one of the RISCV tests behaves surprisingly differently with asserts than without, leaving no clear test pattern to use. Generally it seems bad for the IR to differ substantially due to asserts (as in, an alloca is used with asserts that isn't needed without!) and nothing simple I tried would fix it, so I'm reverting back to green. This also required reverting the RISCV build fix in r351782. llvm-svn: 351796
*	[llvm-symbolizer] Add support for --basenames/-s	James Henderson	2019-01-22	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes https://bugs.llvm.org/show_bug.cgi?id=40068. --basenames is a GNU addr2line switch which strips the directory names from the file path in the output. Reviewed by: ruiu Differential Revision: https://reviews.llvm.org/D56919 llvm-svn: 351795
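The effect of the switch on a single path can be sketched as follows. This is an illustration only: `stripDirectory` is a hypothetical helper, it assumes `/` as the only separator, and it is not the llvm-symbolizer implementation.

```cpp
#include <cassert>
#include <string>

// Returns the part of `path` after the last '/', or the whole path if there
// is no '/'. Assumes POSIX-style separators only.
std::string stripDirectory(const std::string &path) {
  const std::string::size_type slash = path.find_last_of('/');
  return slash == std::string::npos ? path : path.substr(slash + 1);
}
```

Under this sketch a reported location such as `/tmp/src/foo.cpp` would be shown as `foo.cpp`, while a bare file name is left unchanged.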
*	[NFC] Factor out some reusable logic	Max Kazantsev	2019-01-22	1	-15/+21
\| \| \| \|	llvm-svn: 351794
*	[NFC] Add detector for guards expressed as branch by widenable conditions	Max Kazantsev	2019-01-22	1	-0/+27
	This patch adds a function to detect guards expressed in explicit control flow form as branch by `and` with widenable condition intrinsic call:
	    %wc = call i1 @llvm.experimental.widenable.condition()
	    %guard_cond = and i1 %some_cond, %wc
	    br i1 %guard_cond, label %guarded, label %deopt
	  deopt:
	    <maybe some non-side-effecting instructions>
	    deoptimize()
	This form can be used as an alternative to the implicit control flow guard representation expressed by the `experimental_guard` intrinsic.
	Differential Revision: https://reviews.llvm.org/D56074
	Reviewed By: reames
	llvm-svn: 351791
*	[RISCV][NFC] Add break to case statement in RISCVDAGToDAGISel::Select	Alex Bradbury	2019-01-22	1	-0/+1
\| \| \| \| \| \| \|	The break isn't strictly needed yet as there is no subsequent entry in the case. But adding to prevent mistakes further down the road. llvm-svn: 351785
*	[RISCV] Fix build after r351778	Alex Bradbury	2019-01-22	1	-3/+6
\| \| \| \| \| \| \|	Also add a comment to explain the expansion strategy for atomicrmw {fadd,fsub}. llvm-svn: 351782
*	IR: Add fp operations to atomicrmw	Matt Arsenault	2019-01-22	10	-14/+73
\| \| \| \| \| \|	Add just fadd/fsub for now. llvm-svn: 351778
*	[ARM] Combine ands+lsls to lsls+lsrs for Thumb1.	Eli Friedman	2019-01-22	1	-4/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch may seem familiar... but my previous patch handled the equivalent lsls+and, not this case. Usually instcombine puts the "and" after the shift, so this case doesn't come up. However, if the shift comes out of a GEP, it won't get canonicalized by instcombine, and DAGCombine doesn't have an equivalent transform. This also modifies isDesirableToCommuteWithShift to suppress DAGCombine transforms which would make the overall code worse. I'm not really happy adding a bunch of code to handle this, but it would probably be tricky to substantially improve the behavior of DAGCombine here. Differential Revision: https://reviews.llvm.org/D56032 llvm-svn: 351776
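The arithmetic behind replacing a mask-then-shift with two shifts can be checked on 32-bit integers. This sketch covers only the simple case of a mask of the low `n` bits and a left shift `c` with `c <= 32 - n`; the function names are hypothetical and the patch's real matching conditions may be broader or narrower.

```cpp
#include <cassert>
#include <cstdint>

// The ands+lsls form: mask the low n bits, then shift left by c.
std::uint32_t maskThenShift(std::uint32_t x, unsigned n, unsigned c) {
  assert(n >= 1 && n <= 31 && c <= 32 - n);
  const std::uint32_t mask = (std::uint32_t{1} << n) - 1;
  return (x & mask) << c;
}

// The lsls+lsrs form: shift the low n bits to the top, then shift them back
// down (logically) to their final position. No mask constant is needed.
std::uint32_t shiftPair(std::uint32_t x, unsigned n, unsigned c) {
  assert(n >= 1 && n <= 31 && c <= 32 - n);
  return (x << (32 - n)) >> (32 - n - c);
}
```

For example, with `x = 0x12345678`, `n = 8`, and `c = 4`, both forms keep the low byte `0x78` and place it four bits up.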
*	[CVP] Use LVI to constant fold deopt operands	Philip Reames	2019-01-22	2	-1/+27
\| \| \| \| \| \| \| \|	Deopt operands are generally intended to record information about a site in code with minimal perturbation of the surrounding code. Idiomatically, they also tend to appear down rare paths. Putting these together, we have an obvious case for extending CVP w/deopt operand constant folding. Arguably, we should be doing this for all operands on all instructions, but that's definitely a much larger and riskier change. Differential Revision: https://reviews.llvm.org/D55678 llvm-svn: 351774
*	GlobalISel: Fix out of bounds crashes in verifier	Matt Arsenault	2019-01-22	1	-3/+8
\| \| \| \|	llvm-svn: 351769
*	[AArch64] Add patterns for zext/sext of shift amount.	Eli Friedman	2019-01-22	1	-0/+8
\| \| \| \| \| \| \| \| \|	Not sure this is the best fix, but it saves an instruction for certain constructs involving variable shifts. Differential Revision: https://reviews.llvm.org/D55572 llvm-svn: 351768
*	AMDGPU/GlobalISel: Legalize more fp<->int conversions	Matt Arsenault	2019-01-22	1	-10/+4
\| \| \| \|	llvm-svn: 351767
*	[X86] Use X86ISD::VFPROUND instead of ISD::FP_ROUND for 256 and 512 bit cvtpd2ps intrinsics.	Craig Topper	2019-01-21	5	-47/+76
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Use X86ISD::VFPROUND in the instruction isel patterns. Add new patterns for ISD::FP_ROUND to maintain support for fptrunc in IR. In the process I found a couple of duplicate isel patterns, which I also deleted in this patch. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56991 llvm-svn: 351762
*	[X86] Change avx512 COMPRESS and EXPAND lowering to use a single masked node instead of expand/compress+select.	Craig Topper	2019-01-21	3	-14/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: For compress, a select node doesn't semantically reflect the behavior of the instruction. The mask would have holes in it, but the resulting write is to contiguous elements at the bottom of the vector. Furthermore, as far as the compressing and expanding is concerned, the behavior is dependent on the mask. You can't just have an expand/compress node that only reads the input vector. That node would have no meaning by itself. This all only works because we pattern match the compress/expand+select back to the instruction. But conceivably an optimization of the select could break the pattern and leave something meaningless. This patch modifies the expand and compress node to take the mask and passthru as additional inputs and gets rid of the select altogether. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D57002 llvm-svn: 351761
*	[AMDGPU] Fixed hazard recognizer to walk predecessors	Stanislav Mekhanoshin	2019-01-21	4	-32/+126
	Fixes two problems with GCNHazardRecognizer:
	1. It only scans up to 5 instructions emitted earlier.
	2. It does not take control flow into account. An earlier instruction from the previous basic block is not necessarily a predecessor. At the same time a real predecessor block is not scanned.
	The patch provides a way to distinguish between scheduler and hazard recognizer mode. It is OK to work with emitted instructions in the scheduler because we do not really know what will be emitted later and in what order. However, when the pass works as a hazard recognizer the schedule is already finalized, and we have full access to the instructions for the whole function, so we can properly traverse predecessors and their instructions.
	Differential Revision: https://reviews.llvm.org/D56923
	llvm-svn: 351759
*	[X86][BtVer2] Update latency of mmx horizontal operations	Simon Pilgrim	2019-01-21	1	-1/+1
\| \| \| \| \| \| \| \|	D56777 added +1cy local forwarding penalty for horizontal operations, but this penalty only affects sse2/xmm variants; the mmx variants don't suffer the penalty. Confirmed with @andreadb. llvm-svn: 351755
*	[DAGCombiner] fix crash when converting build vector to shuffle	Sanjay Patel	2019-01-21	1	-5/+11
\| \| \| \| \| \| \| \| \| \|	The regression test is reduced from the example shown in D56281. This does raise a question as noted in the test file: do we want to handle this pattern? I don't have a motivating example for that on x86 yet, but it seems like we could have that pattern there too, so we could avoid the back-and-forth using a shuffle. llvm-svn: 351753