bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[KnownBits] Add computeForAddCarry()	Nikita Popov	2019-04-12	1	-12/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is for D60460. computeForAddSub() essentially already supports carries because it has to deal with subtractions. This revision extracts a lower-level computeForAddCarry() function, which allows computing the known bits for add (carry known zero), sub (carry known one) and addcarry (carry unknown). As we don't seem to have any yet, I've added a unit test file for KnownBits and exhaustive tests for the new computeForAddCarry() functionality, as well as the existing computeForAddSub() function. Differential Revision: https://reviews.llvm.org/D60522 llvm-svn: 358297
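The carry handling described above can be sketched on plain 32-bit masks. This is only an illustration of the idea, not the code from the revision: the `KnownBits32` struct, the `makeConstant` helper and the `addCarryKnownBits` name are assumptions made for this example, and fixed-width integers stand in for LLVM's `APInt`.

```cpp
#include <cstdint>

// Zero: bits known to be 0. One: bits known to be 1.
// A bit set in neither mask is unknown. (Hypothetical stand-in type.)
struct KnownBits32 {
  uint32_t Zero;
  uint32_t One;
};

// Hypothetical helper: a value whose bits are all known.
inline KnownBits32 makeConstant(uint32_t V) { return {~V, V}; }

// Known bits of LHS + RHS + Carry.
//   add:      CarryZero = true,  CarryOne = false
//   sub:      pass the complement of the subtrahend, CarryOne = true
//   addcarry: both flags false (carry unknown)
inline KnownBits32 addCarryKnownBits(KnownBits32 LHS, KnownBits32 RHS,
                                     bool CarryZero, bool CarryOne) {
  // Sum with every unknown bit (and an unknown carry) set to 1 ...
  uint32_t PossibleSumZero = ~LHS.Zero + ~RHS.Zero + (CarryZero ? 0u : 1u);
  // ... and with every unknown bit (and an unknown carry) set to 0.
  uint32_t PossibleSumOne = LHS.One + RHS.One + (CarryOne ? 1u : 0u);

  // sum ^ lhs ^ rhs recovers the carry into each bit position. A carry is
  // known zero if it is 0 even in the maximal sum, and known one if it is
  // 1 even in the minimal sum.
  uint32_t CarryKnownZero = ~(PossibleSumZero ^ LHS.Zero ^ RHS.Zero);
  uint32_t CarryKnownOne = PossibleSumOne ^ LHS.One ^ RHS.One;

  // A result bit is known only if both operand bits and the incoming
  // carry bit are known.
  uint32_t Known = (LHS.Zero | LHS.One) & (RHS.Zero | RHS.One) &
                   (CarryKnownZero | CarryKnownOne);

  return {~PossibleSumZero & Known, PossibleSumOne & Known};
}
```

With fully known inputs the result is fully known; with an unknown carry and two zero operands, only the lowest bit of the result becomes unknown.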
*	Simplify decoupling between RuntimeDyld/RuntimeDyldChecker, add 'got_addr' util.	Lang Hames	2019-04-12	12	-134/+101
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch reduces the number of functions in the interface between RuntimeDyld and RuntimeDyldChecker by combining "GetXAddress" and "GetXContent" functions into "GetXInfo" functions that return a struct describing both the address and content. The GetStubOffset function is also replaced with a pair of utilities, GetStubInfo and GetGOTInfo, that fit the new scheme. For RuntimeDyld both of these functions will return the same result, but for the new JITLink linker (https://reviews.llvm.org/D58704) these will provide the addresses of PLT stubs and GOT entries respectively. For JITLink's use, a 'got_addr' utility has been added to the rtdyld-check language, and the syntax of 'got_addr' and 'stub_addr' has been changed: both functions now take two arguments, a 'stub container name' and a target symbol name. For llvm-rtdyld/RuntimeDyld the stub container name is the object file name and section name, separated by a slash. E.g.: rtdyld-check: {8}(stub_addr(foo.o/__text, y)) = y For the upcoming llvm-jitlink utility, which creates stubs on a per-file basis rather than a per-section basis, the container name is just the file name. E.g.: jitlink-check: {8}(got_addr(foo.o, y)) = y llvm-svn: 358295
*	[Hexagon] Fix reuse bug in Vector Loop Carried Reuse pass	Brendon Cahoon	2019-04-12	1	-3/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Hexagon Vector Loop Carried Reuse pass was allowing reuse between two shufflevectors with different masks. The reason is that the masks are not instruction objects, so the code that checks each operand just skipped over the operands. This patch fixes the bug by checking if the operands are the same when they are not instruction objects. If the objects are not the same, then the code assumes that reuse cannot occur. Differential Revision: https://reviews.llvm.org/D60019 llvm-svn: 358292
*	[DAGCombiner] narrow shuffle of concatenated vectors	Sanjay Patel	2019-04-12	1	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	// shuffle (concat X, undef), (concat Y, undef), Mask --> // concat (shuffle X, Y, Mask0), (shuffle X, Y, Mask1) The ARM changes with 'vtrn' and narrowed 'vuzp' are improvements. The x86 changes look neutral or better. There's one test with an extra instruction, but that could be reversed for a subtarget with the right attributes. But by default, we want to avoid the 256-bit op when possible (in my motivating benchmark, a handful of ymm ops sprinkled into a sequence of xmm ops are triggering frequency throttling on Haswell resulting in significantly worse perf). Differential Revision: https://reviews.llvm.org/D60545 llvm-svn: 358291
*	Add options for MaxLoadsPerMemcmp(OptSize).	Hiroshi Yamauchi	2019-04-12	1	-2/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60587 llvm-svn: 358287
*	[X86][SSE] Recognise vXi1 boolean anyof/allof reduction patterns	Simon Pilgrim	2019-04-12	1	-33/+56
\| \| \| \| \| \| \| \| \| \| \| \|	Currently combineHorizontalPredicateResult only handles anyof/allof reduction patterns of legal types, which can be tricky to match as type legalization of bools can introduce bitcasts/truncs/extensions. This patch extends combineHorizontalPredicateResult to recognise vXi1 bool reductions as well and uses the existing combineBitcastvxi1 helper to create the MOVMSK necessary to then compare the signmask result. This ensures the accuracy of the reduction costs added in D60403 which assume the MOVMSK generation. Differential Revision: https://reviews.llvm.org/D60610 llvm-svn: 358286
*	Revert r358268 "[DebugInfo] DW_OP_deref_size in PrologEpilogInserter."	Hans Wennborg	2019-04-12	4	-37/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It causes clang to crash while building Chromium. See https://crbug.com/952230 for reproducer. > The PrologEpilogInserter needs to insert a DW_OP_deref_size before > prepending a memory location expression to an already implicit > expression to avoid having the existing expression act on the memory > address instead of the value behind it. > > The reason for using DW_OP_deref_size and not plain DW_OP_deref is that > big-endian targets need to read the right size as simply truncating a > larger read would yield the wrong result (LSB bytes are not at the lower > address). > > Differential Revision: https://reviews.llvm.org/D59687 llvm-svn: 358281
*	Use llvm::upper_bound. NFC	Fangrui Song	2019-04-12	4	-13/+8
\| \| \| \|	llvm-svn: 358277
*	[PowerPC] Add initialization for some ppc passes	Kang Zhang	2019-04-12	13	-49/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Some llc debug options take a pass name as a parameter. But if we use the pass name ppc-early-ret, we get the error below: llc test.ll -stop-after ppc-early-ret LLVM ERROR: "ppc-early-ret" pass is not registered. The following pass names hit the "pass is not registered" error: ppc-ctr-loops ppc-ctr-loops-verify ppc-loop-preinc-prep ppc-toc-reg-deps ppc-vsx-copy ppc-early-ret ppc-vsx-fma-mutate ppc-vsx-swaps ppc-reduce-cr-ops ppc-qpx-load-splat ppc-branch-coalescing ppc-branch-select Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D60248 llvm-svn: 358271
*	[DebugInfo] Fix pr41175 Dead Store Elimination missing debug loc	Jeremy Morse	2019-04-12	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bug: https://bugs.llvm.org/show_bug.cgi?id=41175 In the bug test case the DSE pass is shortening the range of memory that a memset is working on. A getelementptr is generated so that the new starting address can be passed to memset. This instruction was not given a DebugLoc. To fix the bug, copy the DebugLoc from the memset instruction. Patch by Orlando Cazalet-Hyams! Differential Revision: https://reviews.llvm.org/D60556 llvm-svn: 358270
*	[DebugInfo] DW_OP_deref_size in PrologEpilogInserter.	Markus Lavin	2019-04-12	4	-3/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The PrologEpilogInserter needs to insert a DW_OP_deref_size before prepending a memory location expression to an already implicit expression to avoid having the existing expression act on the memory address instead of the value behind it. The reason for using DW_OP_deref_size and not plain DW_OP_deref is that big-endian targets need to read the right size as simply truncating a larger read would yield the wrong result (LSB bytes are not at the lower address). Differential Revision: https://reviews.llvm.org/D59687 llvm-svn: 358268
*	Move getNumFrameInfos and getDwarfFrameInfos out of line and remove	Eric Christopher	2019-04-12	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \|	the MCDwarf.h include. This removes 50 transitive dependencies for a modification of MCDwarf.h in a build of llc for a pair of out of line functions and reduces the build overhead of 'touch MCDwarf.h' by 15% without impacting test time of check-llvm. llvm-svn: 358264
*	Add explicit dependencies on MCSection.h and MCDwarf.h to the .cpp	Eric Christopher	2019-04-12	5	-0/+5
\| \| \| \| \| \|	files rather than rely on transitive includes from MCStreamer.h. llvm-svn: 358263
*	[ConstantFold] Don't evaluate FP or FP vector casts or truncations when ↵	Fangrui Song	2019-04-12	1	-1/+1
\| \| \| \| \| \| \| \|	simplifying icmp Fix PR41476 llvm-svn: 358262
*	Revert "[PowerPC] Add initialization for some ppc passes"	Eric Christopher	2019-04-12	13	-28/+49
\| \| \| \| \| \| \|	This reverts commit 6f8f98ce8de7c0e4ebd7fa2e1fd9507fe8d1c317 as it is breaking nearly every bot. llvm-svn: 358260
*	Move addInitialFrameState out of line and remove the MCDwarf.h include.	Eric Christopher	2019-04-12	1	-0/+4
\| \| \| \| \| \| \| \| \|	This removes 50 transitive dependencies for a modification of MCDwarf.h in a build of llc for a single out of line function and reduces the build overhead by 20% without impacting test time of check-llvm. llvm-svn: 358258
*	[TargetLowering][X86] Teach SimplifyDemandedBits to use ShrinkDemandedOp on ↵	Craig Topper	2019-04-12	1	-0/+6
\| \| \| \| \| \| \| \| \| \|	ISD::SHL nodes. If the upper bits of the SHL result aren't used, we might be able to use a narrower shift. For example, on X86 this can turn a 64-bit into 32-bit enabling a smaller encoding. Differential Revision: https://reviews.llvm.org/D60358 llvm-svn: 358257
*	[PowerPC] Add initialization for some ppc passes	Kang Zhang	2019-04-12	13	-49/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Some llc debug options take a pass name as a parameter. But if we use the pass name ppc-early-ret, we get the error below: llc test.ll -stop-after ppc-early-ret LLVM ERROR: "ppc-early-ret" pass is not registered. The following pass names hit the "pass is not registered" error: ppc-ctr-loops ppc-ctr-loops-verify ppc-loop-preinc-prep ppc-toc-reg-deps ppc-vsx-copy ppc-early-ret ppc-vsx-fma-mutate ppc-vsx-swaps ppc-reduce-cr-ops ppc-qpx-load-splat ppc-branch-coalescing ppc-branch-select Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D60248 llvm-svn: 358256
*	Move addFrameInst out of line and remove the MCDwarf.h include.	Eric Christopher	2019-04-12	1	-0/+6
\| \| \| \| \| \| \| \| \|	This removes 500 transitive dependencies for a modification of MCDwarf.h in a build of llc for a single out of line function and reduces the build overhead by more than half without impacting test time of check-llvm. llvm-svn: 358255
*	Include what's used in a few cpp files - these were getting transitive	Eric Christopher	2019-04-12	4	-0/+4
\| \| \| \| \| \|	includes from MCDwarf.h. llvm-svn: 358254
*	[PowerPC] More precise exploitation of P9 maddld instruction when operands ↵	Zi Xuan Wu	2019-04-12	2	-2/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	are constant There are 3 operands of maddld, (add (mul %1, %2), %3), and sometimes they are constant. If there is a constant operand, it takes an extra li to materialize the operand, and one more register too. So it's not profitable to use maddld to optimize the mul-add pattern in that case. Differential Revision: https://reviews.llvm.org/D60181 llvm-svn: 358253
*	MCDwarfLineTableHeader::tryGetFile : replace a loop with llvm::find	Fangrui Song	2019-04-12	1	-5/+1
\| \| \| \| \| \| \|	Note, `DirIndex++` below is incorrect for DWARF 5, but it can be fixed later after the file index is fixed. llvm-svn: 358251
*	Move a couple of optional references to just optional to make the	Eric Christopher	2019-04-12	1	-1/+1
\| \| \| \| \| \|	forwarding APIs look similar. llvm-svn: 358250
*	[MC] Fix typo: .symtab_shndxr -> .symtab_shndx	Fangrui Song	2019-04-12	1	-1/+1
\| \| \| \| \| \| \| \| \|	This special section is named .symtab_shndx, according to gABI Chapter 4 Sections, and the name is used by some other tools. Though the section type SHT_SYMTAB_SHNDX is what really matters, let's fix the typo introduced in rL204769 :) llvm-svn: 358247
*	Use llvm::lower_bound. NFC	Fangrui Song	2019-04-12	7	-28/+23
\| \| \| \| \| \|	This reapplies rL358161. That commit inadvertently reverted an exegesis file to an old version. llvm-svn: 358246
*	Remove a parameter that was being passed around that we had at the	Eric Christopher	2019-04-12	1	-5/+5
\| \| \| \| \| \| \| \|	local callsite. NFC. llvm-svn: 358244
*	llvm-undname: Use UNREACHABLE after exhaustive switch returning everywhere	Nico Weber	2019-04-11	1	-1/+1
\| \| \| \| \| \|	No behavior change. llvm-svn: 358241
*	llvm-undname: Name a bool param, no behavior change	Nico Weber	2019-04-11	1	-5/+6
\| \| \| \|	llvm-svn: 358240
*	llvm-undname: Fix out-of-bounds read on invalid intrinsic function code	Nico Weber	2019-04-11	1	-3/+9
\| \| \| \| \| \|	Found by inspection. llvm-svn: 358239
*	llvm-undname: Don't crash on incomplete enum tag manglings	Nico Weber	2019-04-11	1	-1/+1
\| \| \| \| \| \|	Found by inspection. llvm-svn: 358238
*	llvm-undname: Fix crash on incomplete virtual this adjusts	Nico Weber	2019-04-11	1	-2/+3
\| \| \| \| \| \| \| \|	Found by oss-fuzz. Also remove an else-after-return, this part has no behavior change. llvm-svn: 358237
*	[X86AsmPrinter] refactor static functions into private methods. NFC	Nick Desaulniers	2019-04-11	2	-77/+84
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: A lot of the code for printing special cases of operands in this translation unit are static functions. While I too have suffered many years of abuse at the hands of C, we should prefer private methods, particularly when you start passing around *this as your first argument, which is a code smell. This will help make generic vs arch specific asm printing easier, as it brings X86AsmPrinter more in line with other arch's derived AsmPrinters. We will then be able to more easily move architecture generic code to the base class, and architecture specific code to the derived classes. Some other small refactorings while we're here: - the parameter Op is now consistently OpNo - add spaces around binary expressions. I know we're not millionaires but c'mon. Reviewers: echristo Reviewed By: echristo Subscribers: smeenai, hiraditya, llvm-commits, srhines, craig.topper Tags: #llvm Differential Revision: https://reviews.llvm.org/D60577 llvm-svn: 358236
*	llvm-undname: Fix crash on invalid name in a template parameter pointer to ↵	Nico Weber	2019-04-11	1	-0/+2
\| \| \| \| \| \| \| \|	member arg Found by oss-fuzz. llvm-svn: 358234
*	[Pipeliner] Fix incorrect loop carried dependence calculation	Brendon Cahoon	2019-04-11	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \|	The isLoopCarriedDep function does not correctly compute loop carried dependences when the array index offset is negative or the stride is smaller than the access size. Patch by Denis Antrushin. Differential Revision: https://reviews.llvm.org/D60135 llvm-svn: 358233
*	[ConstantRange] Add unsignedMulMayOverflow()	Nikita Popov	2019-04-11	1	-0/+20
\| \| \| \| \| \| \| \| \| \|	Same as the other ConstantRange overflow checking methods, but for unsigned mul. In this case there is no cheap overflow criterion, so using umul_ov for the implementation. Differential Revision: https://reviews.llvm.org/D60574 llvm-svn: 358228
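A rough sketch of a range-based unsigned multiply overflow check follows. It is an illustration under stated assumptions rather than the committed code: the `URange32` struct, the `umulOverflows` helper and the three result names are hypothetical, the ranges are assumed to be closed and non-wrapping, and a 64-bit product stands in for an overflow-reporting multiply such as umul_ov.

```cpp
#include <cstdint>

// Hypothetical result type for this sketch.
enum class OverflowResult { AlwaysOverflows, MayOverflow, NeverOverflows };

// Hypothetical closed, non-wrapping unsigned range [Min, Max].
struct URange32 {
  uint32_t Min;
  uint32_t Max;
};

// True if A * B does not fit in 32 bits. The product is formed in 64 bits
// so the multiplication itself cannot wrap.
inline bool umulOverflows(uint32_t A, uint32_t B) {
  return static_cast<uint64_t>(A) * B > UINT32_MAX;
}

inline OverflowResult unsignedMulMayOverflow(URange32 L, URange32 R) {
  // If even the smallest product overflows, every product does.
  if (umulOverflows(L.Min, R.Min))
    return OverflowResult::AlwaysOverflows;
  // If the largest product overflows, some products do.
  if (umulOverflows(L.Max, R.Max))
    return OverflowResult::MayOverflow;
  return OverflowResult::NeverOverflows;
}
```

Because unsigned multiplication is monotonic in both operands, checking only the two corner products is enough for non-wrapping ranges.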
*	[PGO] Better handling of profile hash mismatch	Rong Xu	2019-04-11	1	-6/+20
\| \| \| \| \| \| \| \| \| \| \|	We currently assume profile hash conflicts will be caught by an upfront check and we assert for the cases that escape the check. The assumption is not always true as there are chances of conflict. This patch prints a warning and skips annotating the function for the escaped cases. Differential Revision: https://reviews.llvm.org/D60154 llvm-svn: 358225
*	[AArch64][GlobalISel] Flesh out vector load/store support for more types.	Amara Emerson	2019-04-11	1	-0/+8
\| \| \| \| \| \| \|	Some of these were legalizing into smaller vector types unnecessarily, others were simply not supported yet. llvm-svn: 358223
*	[AArch64][GlobalISel] Legalization and ISel support for load/stores of ↵	Amara Emerson	2019-04-11	3	-9/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	vectors of pointers. Loads and stores of values with types like <2 x p0> currently don't get imported because SelectionDAG has no knowledge of pointer types. To leverage the existing support for vector load/stores, we can bitcast the value to have s64 element types instead. We do this as a custom legalization. This patch also adds support for general loads of <2 x s64>, and relaxes some type conditions on selecting G_BITCAST. Differential Revision: https://reviews.llvm.org/D60534 llvm-svn: 358221
*	[X86] Restrict vselect handling in scalarizeExtEltFP to only case to pre ↵	Craig Topper	2019-04-11	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	type legalization where the setcc result type is vXi1. If the vector setcc has been legalized then we will need to convert a vector boolean of 0 or -1 to a scalar boolean of 0 or 1. The added test case previously crashed in 32-bit mode by creating a setcc with an i64 condition that type legalization couldn't expand. llvm-svn: 358218
*	[X86] Add patterns for using movss/movsd for atomic load/store of f32/64. ↵	Craig Topper	2019-04-11	2	-70/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove atomic fadd pseudos use isel patterns instead. This patch adds patterns for turning bitcasted atomic load/store into movss/sd. It also removes the pseudo instructions for atomic RMW fadd. Instead just adding isel patterns for folding an atomic load into addss/sd. And relying on the new movss/sd store pattern to handle the write part. This also makes the fadd patterns use VEX and EVEX instructions when AVX or AVX512F are enabled. Differential Revision: https://reviews.llvm.org/D60394 llvm-svn: 358215
*	Recommit r358211 "[X86] Use FILD/FIST to implement i64 atomic load on 32-bit ↵	Craig Topper	2019-04-11	3	-20/+75
\| \| \| \| \| \| \| \| \| \| \| \|	targets with X87, but no SSE2" With correct test checks this time. If we have X87, but not SSE2 we can atomically load an i64 value into the significand of an 80-bit extended precision x87 register using fild. We can then use a fist instruction to convert it back to an i64 integer. This matches what gcc and icc do for this case and removes an existing FIXME. llvm-svn: 358214
*	Revert r358211 "[X86] Use FILD/FIST to implement i64 atomic load on 32-bit ↵	Craig Topper	2019-04-11	3	-75/+20
\| \| \| \| \| \| \| \|	targets with X87, but no SSE2" I seem to have messed up the test checks. llvm-svn: 358212
*	[X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, ↵	Craig Topper	2019-04-11	3	-20/+75
\| \| \| \| \| \| \| \| \| \| \| \|	but no SSE2 If we have X87, but not SSE2 we can atomically load an i64 value into the significand of an 80-bit extended precision x87 register using fild. We can then use a fist instruction to convert it back to an i64 integer and store it to a stack temporary. From there we can do two 32-bit loads to get the value into integer registers without worrying about atomicness. This matches what gcc and icc do for this case and removes an existing FIXME. Differential Revision: https://reviews.llvm.org/D60156 llvm-svn: 358211
*	Revert "Use llvm::lower_bound. NFC"	Ali Tamur	2019-04-11	7	-23/+28
\| \| \| \| \| \| \| \| \|	This reverts commit rL358161. This patch has broken the test: llvm/test/tools/llvm-exegesis/X86/uops-CMOV16rm-noreg.s llvm-svn: 358199
*	[ConstantFold] ExtractConstantBytes - handle shifts on large integer types	Simon Pilgrim	2019-04-11	1	-14/+16
\| \| \| \| \| \| \| \|	Use APInt instead of getZExtValue from the ConstantInt until we can confirm that the shift amount is in range. Reduced from OSS-Fuzz #14169 - https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14169 llvm-svn: 358192
*	[DAGCombiner] refactor narrowing of extracted vector binop; NFC	Sanjay Patel	2019-04-11	1	-20/+19
\| \| \| \| \| \| \|	There's a TODO comment about handling patterns with insert_subvector, and we do want to match that. llvm-svn: 358187
*	[X86] SimplifyDemandedVectorElts - add X86ISD::VPERMV3 mask support	Simon Pilgrim	2019-04-11	1	-1/+1
\| \| \| \| \| \|	Completes SimplifyDemandedVectorElts's basic variable shuffle mask support which should help D60512 + D60562 llvm-svn: 358186
*	[RISCV] Diagnose invalid second input register operand when using %tprel_add	Roger Ferrer Ibanez	2019-04-11	1	-2/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	RISCVMCCodeEmitter::expandAddTPRel asserts that the second operand must be x4/tp. As we are not currently checking this in the RISCVAsmParser, the assert is easy to trigger due to wrong assembly input. This patch does a late check of this constraint. An alternative could be using a singleton register class for x4/tp similar to the current one for sp. Unfortunately it does not result in a good diagnostic. Because add is an overloaded mnemonic, if no matching is possible, the diagnostic of the first failing alternative seems to be used as the diagnostic itself. This means that in this case the %tprel_add is diagnosed as an invalid operand (because the real add instruction only has 3 operands). Differential Revision: https://reviews.llvm.org/D60528 llvm-svn: 358183
*	[X86] Add MM register mapping from CodeView to MC register id	Luo, Yuanke	2019-04-11	1	-0/+9
\| \| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D60437 Change-Id: I2183a6d825d0284b22705d423b88882992b236c5 llvm-svn: 358179
*	YAMLIO: Fix serialization of strings with embedded nuls	Pavel Labath	2019-04-11	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: A bug/typo in Output::scalarString caused us to round-trip a StringRef through a const char *. This meant that any strings with embedded nuls were unintentionally cut short at the first such character. (It also could have caused accidental buffer overruns, but it seems that all StringRefs coming into this function were formed from null-terminated strings.) This patch fixes the bug and adds an appropriate test. Reviewers: sammccall, jhenderson Subscribers: kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60505 llvm-svn: 358176
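The truncation described above is easy to reproduce outside of YAMLIO. The sketch below uses `std::string` in place of `StringRef`, and both function names are hypothetical; it only illustrates why a round trip through `const char *` loses everything after the first embedded nul.

```cpp
#include <string>

// Buggy pattern: the length is dropped, so constructing a string from the
// bare pointer stops at the first nul character.
inline std::string copyViaCString(const std::string &S) {
  const char *P = S.c_str();
  return std::string(P);
}

// Length-preserving copy: pointer and size travel together, so embedded
// nuls survive.
inline std::string copyWithLength(const std::string &S) {
  return std::string(S.data(), S.size());
}
```

For a five-character input such as `std::string("ab\0cd", 5)`, the first function yields a two-character string while the second preserves all five characters.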