bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[XCore] CombineSTORE - Use allowsMemoryAccess wrapper. NFCI.	Simon Pilgrim	2019-06-12	1	-9/+3
\| \| \| \| \| \|	Noticed in D63075 - there was a allowsMisalignedMemoryAccesses call to check for unaligned loads and a check for aligned legal type loads - which is exactly what allowsMemoryAccess does. llvm-svn: 363141
*	[ThinLTO]LTO]Legacy] Fix dependent libraries support by adding querying of ↵	Ben Dunbobbin	2019-06-12	1	-5/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	the IRSymtab Dependent libraries support for the legacy api was committed in a broken state (see: https://reviews.llvm.org/D60274). This was missed due to the painful nature of having to integrate the changes into a linker in order to test. This change implements support for dependent libraries in the legacy LTO api: - I have removed the current api function, which returns a single string, and added functions to access each dependent library specifier individually. - To reduce the testing pain, I have made the api functions as thin as possible to maximize coverage from llvm-lto. - When doing ThinLTO the system linker will load the modules lazily when scanning the input files. Unfortunately, when modules are lazily loaded there is no access to module level named metadata. To fix this I have added api functions that allow querying the IRSymtab for the dependent libraries. I hope to expand the api in the future so that, eventually, all the information needed by a client linker during scan can be retrieved from the IRSymtab. Differential Revision: https://reviews.llvm.org/D62935 llvm-svn: 363140
*	[XCore] LowerLOAD/LowerSTORE - Use allowsMemoryAccess wrapper. NFCI.	Simon Pilgrim	2019-06-12	1	-28/+14
\| \| \| \| \| \|	Noticed in D63075 - there was a allowsMisalignedMemoryAccesses call to check for unaligned loads and a check for aligned legal type loads - which is exactly what allowsMemoryAccess does. llvm-svn: 363137
*	Revert "[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step ↵	Orlando Cazalet-Hyams	2019-06-12	2	-21/+7
\| \| \| \| \| \| \| \| \|	through loop even after completion" This reverts commit 1a0f7a2077b70c9864faa476e15b048686cf1ca7. See phabricator thread for D60831. llvm-svn: 363132
*	[AArch64] Merge globals when optimising for size	Sjoerd Meijer	2019-06-12	1	-1/+14
\| \| \| \| \| \| \| \| \| \| \| \| \|	Extern global merging is good for code-size. There's definitely potential for performance too, but there's one regression in a benchmark that needs investigating, so that's why we enable it only when we optimise for size for now. Patch by Ramakota Reddy and Sjoerd Meijer. Differential Revision: https://reviews.llvm.org/D61947 llvm-svn: 363130
*	[X86] Add VCMPSSZrr_Intk and VCMPSDZrr_Intk to isNonFoldablePartialRegisterLoad.	Craig Topper	2019-06-12	1	-0/+2
\| \| \| \| \| \| \| \| \|	The non-masked versions are already in there. I'm having some trouble coming up with a way to test this right now. Most load folding should happen during isel so I'm not sure how to get peephole pass to do it. llvm-svn: 363125
*	[RISCV] Add CFI directives for RISCV prologue/epilog.	Hsiangkai Wang	2019-06-12	3	-4/+78
\| \| \| \| \| \| \| \| \|	In order to generate correct debug frame information, it needs to generate CFI information in prologue and epilog. Differential Revision: https://reviews.llvm.org/D61773 llvm-svn: 363120
*	[NFC] Correct comments in RegisterCoalescer.	Hsiangkai Wang	2019-06-12	1	-6/+6
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D63124 llvm-svn: 363119
*	Fix a bug in getSCEVAtScope w.r.t. non-canonical loops	Philip Reames	2019-06-11	1	-2/+2
\| \| \| \| \| \| \| \|	The issue is that if we have a loop with multiple predecessors outside the loop, the code was expecting to merge them and only return if equal, but instead returned the first one seen. I have no idea if this actually tripped anywhere. I noticed it by accident when reading the code and have no idea how to go about constructing a test case. llvm-svn: 363112
*	Generalize icmp matching in IndVars' eliminateTrunc	Philip Reames	2019-06-11	1	-14/+15
\| \| \| \| \| \| \| \|	We were only matching RHS being a loop invariant value, not the inverse. Since there's nothing which appears to canonicalize loop invariant values to RHS, this means we missed cases. Differential Revision: https://reviews.llvm.org/D63112 llvm-svn: 363108
*	[Analysis] add isSplatValue() for vectors in IR	Sanjay Patel	2019-06-11	1	-0/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have the related getSplatValue() already in IR (see code just above the proposed addition). But sometimes we only need to know that the value is a splat rather than capture the splatted scalar value. Also, we have an isSplatValue() function already in SDAG. Motivation - recent bugs that would potentially benefit from improved splat analysis in IR: https://bugs.llvm.org/show_bug.cgi?id=37428 https://bugs.llvm.org/show_bug.cgi?id=42174 Differential Revision: https://reviews.llvm.org/D63138 llvm-svn: 363106
*	[GlobalISel] Add a G_JUMP_TABLE opcode.	Amara Emerson	2019-06-11	2	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \|	This opcode generates a pointer to the address of the jump table specified by the source operand, which is a jump table index. It will be used in conjunction with an upcoming G_BRJT opcode to support jump table codegen with GlobalISel. Differential Revision: https://reviews.llvm.org/D63111 llvm-svn: 363096
*	[MemorySSA] When applying updates, clean unnecessary Phis.	Alina Sbirlea	2019-06-11	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: After applying a set of insert updates, there may be trivial Phis left over. Clean them up. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63033 llvm-svn: 363094
*	Only passes that preserve MemorySSA must mark it as preserved.	Alina Sbirlea	2019-06-11	6	-5/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The method `getLoopPassPreservedAnalyses` should not mark MemorySSA as preserved, because it's being called in a lot of passes that do not preserve MemorySSA. Instead, mark the MemorySSA analysis as preserved by each pass that does preserve it. These changes only affect the new pass mananger. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62536 llvm-svn: 363091
*	Deduplicate S_CONSTANTs in LLD.	Amy Huang	2019-06-11	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Deduplicate S_CONSTANTS when linking, if they have the same value. Reviewers: rnk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63151 llvm-svn: 363089
*	[PowerPC] Enable MachinePipeliner for P9 with -ppc-enable-pipeliner	Jinsong Ji	2019-06-11	8	-14/+140
\| \| \| \| \| \| \| \| \|	Implement necessary target hooks to enable MachinePipeliner for P9 only. The pass is off by default, can be enabled with -ppc-enable-pipeliner for P9. Differential Revision: https://reviews.llvm.org/D62164 llvm-svn: 363085
*	[Path] Set FD to -1 in moved-from TempFile	Jonas Devlieghere	2019-06-11	1	-0/+1
\| \| \| \| \| \| \| \| \|	When moving a temp file, explicitly set the file descriptor to -1 so we can never accidentally close the moved-from TempFile. Differential revision: https://reviews.llvm.org/D63087 llvm-svn: 363083
*	[InstCombine] Handle -(X-Y) --> (Y-X) for unary fneg when NSZ	Cameron McInally	2019-06-11	1	-1/+10
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D62612 llvm-svn: 363082
*	[InstCombine] Update fptrunc (fneg x)) -> (fneg (fptrunc x) for unary FNeg	Cameron McInally	2019-06-11	1	-4/+12
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D62629 llvm-svn: 363080
*	lld-link: Reject more than one resource .obj file	Nico Weber	2019-06-11	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Users are exepcted to pass all .res files to the linker, which then merges all the resource in all .res files into a tree structure and then converts the final tree structure to a .obj file with .rsrc$01 and .rsrc$02 sections and then links that. If the user instead passes several .obj files containing such resources, the correct thing to do would be to have custom code to merge the trees in the resource sections instead of doing normal section merging -- but link.exe rejects if multiple resource obj files are passed in with LNK4078, so let lld-link do that too instead of silently writing broken .rsrc sections in that case. The only real way to run into this is if users manually convert .res files to .obj files by running cvtres and then handing the resulting .obj files to lld-link instead, which in practice likely never happens. (lld-link is slightly stricter than link.exe now: If link.exe is passed one .obj file created by cvtres, and a .res file, for some reason it just emits a warning instead of an error and outputs strange looking data. lld-link now errors out on mixed input like this.) One way users could accidentally run into this is the following scenario: If a .res file is passed to lib.exe, then lib.exe calls cvtres.exe on the .res file before putting it in the output .lib. (llvm-lib currently doesn't do this.) link.exe's /wholearchive seems to only add obj files referenced from the static library index, but lld-link current really adds all files in the archive. So if lld-link /wholearchive is used with .lib files produced by lib.exe and .res files were among the files handed to lib.exe, we previously silently produced invalid output, but now we error out. link.exe's /wholearchive semantics on the other hand mean that it wouldn't load the resource object files from the .lib file at all. Since this scenario is probably still an unlikely corner case, the difference in behavior here seems fine -- and lld-link might have to change to use link.exe's /wholearchive semantics in the future anyways. Vaguely related to PR42180. Differential Revision: https://reviews.llvm.org/D63109 llvm-svn: 363078
*	[RISCV] Add lowering of addressing sequences for PIC	Lewis Revill	2019-06-11	6	-16/+64
\| \| \| \| \| \| \| \| \| \|	This patch allows lowering of PIC addresses by using PC-relative addressing for DSO-local symbols and accessing the address through the global offset table for non-DSO-local symbols. Differential Revision: https://reviews.llvm.org/D55303 llvm-svn: 363058
*	[RISCV] Lower inline asm constraints I, J & K for RISC-V	Lewis Revill	2019-06-11	2	-0/+42
\| \| \| \| \| \| \| \| \| \| \| \| \|	This validates and lowers arguments to inline asm nodes which have the constraints I, J & K, with the following semantics (equivalent to GCC): I: Any 12-bit signed immediate. J: Immediate integer zero only. K: Any 5-bit unsigned immediate. Differential Revision: https://reviews.llvm.org/D54093 llvm-svn: 363054
*	[ARM] First MVE instructions: scalar shifts.	Mikhail Maltsev	2019-06-11	7	-0/+294
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This introduces a new decoding table for MVE instructions, and starts by adding the family of scalar shift instructions that are part of the MVE architecture extension: saturating shifts within a single GPR, and long shifts across a pair of GPRs (both saturating and normal). Some of these shift instructions have only 3-bit register fields in the encoding, with the low bit fixed. So they can only address an odd or even numbered GPR (depending on the operand), and therefore I add two new register classes, GPREven and GPROdd. Differential Revision: https://reviews.llvm.org/D62668 Change-Id: Iad95d5f83d26aef70c674027a184a6b1e0098d33 llvm-svn: 363051
*	Let writeWindowsResourceCOFF() take a TimeStamp parameter	Nico Weber	2019-06-11	1	-15/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For lld, pass in Config->Timestamp (which is set based on lld's /timestamp: and /Brepro flags). Since the writeWindowsResourceCOFF() data is only used in-memory by LLD and the obj's timestamp isn't used for anything in the output, this doesn't change behavior. For llvm-cvtres, add an optional /timestamp: parameter, and use the current behavior of calling time() if the parameter is not passed in. This doesn't really change observable behavior (unless someone passes /timestamp: to llvm-cvtres, which wasn't possible before), but it removes the last unqualified call to time() from llvm/lib, which seems like a good thing. Differential Revision: https://reviews.llvm.org/D63116 llvm-svn: 363050
*	[TargetLowering] Add allowsMemoryAccess(MachineMemOperand) helper wrapper. NFCI.	Simon Pilgrim	2019-06-11	7	-66/+63
\| \| \| \| \| \|	As suggested by @arsenm on D63075 - this adds a TargetLowering::allowsMemoryAccess wrapper that takes a Load/Store node's MachineMemOperand to handle the AddressSpace/Alignment arguments and will also implicitly handle the MachineMemOperand::Flags change in D63075. llvm-svn: 363048
*	[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through ↵	Orlando Cazalet-Hyams	2019-06-11	2	-7/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	loop even after completion Summary: Bug: https://bugs.llvm.org/show_bug.cgi?id=39024 The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here: A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins. B) Instructions in the middle block have different line numbers which give the impression of another iteration. In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks. I have set up a separate review D61933 for a fix which is required for this patch. Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse Reviewed By: hfinkel, jmorse Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D60831 llvm-svn: 363046
*	[ARM] Fix unused-variable warning in rL363039.	Simon Tatham	2019-06-11	1	-0/+1
\| \| \| \| \| \| \| \|	The variable `OffsetMask` is currently only used in an assertion, so if assertions are compiled out and -Werror is enabled, it becomes a build failure. llvm-svn: 363043
*	[DAGCombine] GetNegatedExpression - constant float vector support (PR42105)	Simon Pilgrim	2019-06-11	1	-9/+40
\| \| \| \| \| \| \| \|	Add support for negation of constant build vectors. Differential Revision: https://reviews.llvm.org/D62963 llvm-svn: 363040
*	[ARM] Add the non-MVE instructions in Arm v8.1-M.	Simon Tatham	2019-06-11	23	-90/+1587
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds support for the new family of conditional selection / increment / negation instructions; the low-overhead branch instructions (e.g. BF, WLS, DLS); the CLRM instruction to zero a whole list of registers at once; the new VMRS/VMSR and VLDR/VSTR instructions to get data in and out of 8.1-M system registers, particularly including the new VPR register used by MVE vector predication. To support this, we also add a register name 'zr' (used by the CSEL family to force one of the inputs to the constant 0), and operand types for lists of registers that are also allowed to include APSR or VPR (used by CLRM). The VLDR/VSTR instructions also need a new addressing mode. The low-overhead branch instructions exist in their own separate architecture extension, which we treat as enabled by default, but you can say -mattr=-lob or equivalent to turn it off. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Reviewed By: samparker Subscribers: miyuki, javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62667 llvm-svn: 363039
*	Change semantics of fadd/fmul vector reductions.	Sander de Smalen	2019-06-11	5	-47/+87
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch changes how LLVM handles the accumulator/start value in the reduction, by never ignoring it regardless of the presence of fast-math flags on callsites. This change introduces the following new intrinsics to replace the existing ones: llvm.experimental.vector.reduce.fadd -> llvm.experimental.vector.reduce.v2.fadd llvm.experimental.vector.reduce.fmul -> llvm.experimental.vector.reduce.v2.fmul and adds functionality to auto-upgrade existing LLVM IR and bitcode. Reviewers: RKSimon, greened, dmgreen, nikic, simoll, aemerson Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D60261 llvm-svn: 363035
*	[X86] Add load folding isel patterns to scalar_math_patterns and ↵	Craig Topper	2019-06-11	3	-9/+44
\| \| \| \| \| \| \| \|	AVX512_scalar_math_fp_patterns. Also add a FIXME for the peephole pass not being able to handle this. llvm-svn: 363032
*	Revert CMake: Make most target symbols hidden by default	Tom Stellard	2019-06-11	105	-113/+105
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts r362990 (git commit 374571301dc8e9bc9fdd1d70f86015de198673bd) This was causing linker warnings on Darwin: ld: warning: direct access in function 'llvm::initializeEvexToVexInstPassPass(llvm::PassRegistry&)' from file '../../lib/libLLVMX86CodeGen.a(X86EvexToVex.cpp.o)' to global weak symbol 'void std::__1::__call_once_proxy<std::__1::tuple<void* (&)(llvm::PassRegistry&), std::__1::reference_wrapper<llvm::PassRegistry>&&> >(void*)' from file '../../lib/libLLVMCore.a(Verifier.cpp.o)' means the weak symbol cannot be overridden at runtime. This was likely caused by different translation units being compiled with different visibility settings. llvm-svn: 363028
*	Symbolize: Make DWPName a symbolizer option instead of an argument to ↵	Peter Collingbourne	2019-06-11	1	-10/+8
\| \| \| \| \| \| \| \| \| \| \| \|	symbolize{,Inlined}Code. This makes the interface simpler and more consistent with the interface for .dSYM files and fixes a bug where llvm-symbolizer would not read the dwp if it was asked to symbolize data before symbolizing code. Differential Revision: https://reviews.llvm.org/D63114 llvm-svn: 363025
*	AtomicExpand: Don't crash on non-0 alloca	Matt Arsenault	2019-06-11	2	-2/+7
\| \| \| \| \| \| \|	This now produces garbage on AMDGPU with a call to an nonexistent, anonymous libcall but won't assert. llvm-svn: 363022
*	AMDGPU: Expand < 32-bit atomics	Matt Arsenault	2019-06-11	2	-0/+6
\| \| \| \| \| \|	Also fix AtomicExpand asserting on atomicrmw fadd/fsub. llvm-svn: 363021
*	llvm-lib: Implement /machine: argument	Nico Weber	2019-06-11	2	-10/+40
\| \| \| \| \| \| \| \| \| \|	And share some code with lld-link. While here, also add a FIXME about PR42180 and merge r360150 to llvm-lib. Differential Revision: https://reviews.llvm.org/D63021 llvm-svn: 363016
*	[AArch64] Add more CPUs to host detection	Yi Kong	2019-06-11	1	-0/+6
\| \| \| \| \| \| \| \| \|	Returns "cortex-a73" for 3rd and 4th gen Kryo; not precisely correct, but close enough. Differential Revision: https://reviews.llvm.org/D63099 llvm-svn: 363013
*	[MIR-Canon] Fixing non-determinism that was breaking bots (NFC).	Puyan Lotfi	2019-06-11	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \|	An earlier fix of a subtle iterator invalidation bug had uncovered a nondeterminism that was present in the MultiUsers bag. Problem was that MultiUsers was being looked up using pointers. This patch is an NFC change that numbers each multiuser and processes each in numbered order. This fixes the test failure on netbsd and will likely fix the green-dragon bot too. llvm-svn: 363012
*	[Support] Explicitly detect recursive response files	Shoaib Meenai	2019-06-10	1	-9/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previous detection relied upon an arbitrary hard coded limit of 21 response files, which some code bases were running up against. The new detection maintains a stack of processing response files and explicitly checks if a newly encountered file is in the current stack. Some bookkeeping data is necessary in order to detect when to pop the stack. Patch by Chris Glover. Differential Revision: https://reviews.llvm.org/D62798 llvm-svn: 363005
*	[PGO] Fix the buildbot failure in r362995	Rong Xu	2019-06-10	1	-5/+2
\| \| \| \| \| \|	Fixed one unused variable warning. llvm-svn: 363004
*	[PGO] Handle cases of non-instrument BBs	Rong Xu	2019-06-10	1	-38/+85
\| \| \| \| \| \| \| \| \| \| \|	As shown in PR41279, some basic blocks (such as catchswitch) cannot be instrumented. This patch filters out these BBs in PGO instrumentation. It also sets the profile count to the fail-to-instrument edge, so that we can propagate the counts in the CFG. Differential Revision: https://reviews.llvm.org/D62700 llvm-svn: 362995
*	CMake: Make most target symbols hidden by default	Tom Stellard	2019-06-10	105	-105/+113
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF this change makes all symbols in the target specific libraries hidden by default. A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these libraries public, which is mainly needed for the definitions of the LLVMInitialize* functions. This patch reduces the number of public symbols in libLLVM.so by about 25%. This should improve load times for the dynamic library and also make abi checker tools, like abidiff require less memory when analyzing libLLVM.so One side-effect of this change is that for builds with LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that access symbols that are no longer public will need to be statically linked. Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1): nm before/libLLVM-9svn.so \| grep ' [A-Zuvw] ' \| wc -l 36221 nm after/libLLVM-9svn.so \| grep ' [A-Zuvw] ' \| wc -l 26278 Reviewers: chandlerc, beanz, mgorny, rnk, hans Reviewed By: rnk, hans Subscribers: Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D54439 llvm-svn: 362990
*	[GlobalISel] Translate memset/memmove/memcpy from undef ptrs into nops	Jessica Paquette	2019-06-10	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	If the source is undef, then just don't do anything. This matches SelectionDAG's behaviour in SelectionDAG.cpp. Also add a test showing that we do the right thing here. (irtranslator-memfunc-undef.ll) Differential Revision: https://reviews.llvm.org/D63095 llvm-svn: 362989
*	Factor out a helper function for readability and reuse in a future patch [NFC]	Philip Reames	2019-06-10	1	-2/+8
\| \| \| \|	llvm-svn: 362980
*	[LFTR] Use recomputed BE count	Philip Reames	2019-06-10	1	-9/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This was discussed as part of D62880. The basic thought is that computing BE taken count after widening should produce (on average) an equally good backedge taken count as the one before widening. Since there's only one test in the suite which is impacted by this change, and it's essentially equivelent codegen, that seems to be a reasonable assertion. This change was separated from r362971 so that if this turns out to be problematic, the triggering piece is obvious and easily revertable. For the nestedIV example from elim-extend.ll, we end up with the following BE counts: BEFORE: (-2 + (-1 * %innercount) + %limit) AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>) Note that before is an i32 type, and the after is an i64. Truncating the i64 produces the i32. llvm-svn: 362975
*	[PowerPC][HTM]Fix $zero is not a GPRC register for builtin_ttest	Jinsong Ji	2019-06-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was found during HTM cleanup. Adding a test for builtin_ttest would expose following issue. * Bad machine code: Illegal physical register for instruction * - function: test10 - basic block: %bb.0 entry (0xf0e57497b58) - instruction: %5:crrc0 = TABORTWCI 0, $zero, 0 - operand 2: $zero $zero is not a GPRC register. LLVM ERROR: Found 1 machine code errors. Differential Revision: https://reviews.llvm.org/D63079 llvm-svn: 362974
*	Prepare for multi-exit LFTR [NFC]	Philip Reames	2019-06-10	1	-65/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change does the plumbing to wire an ExitingBB parameter through the LFTR implementation, and reorganizes the code to work in terms of a set of individual loop exits. Most of it is fairly obvious, but there's one key complexity which makes it worthy of consideration. The actual multi-exit LFTR patch is in D62625 for context. Specifically, it turns out the existing code uses the backedge taken count from before a IV is widened. Oddly, we can end up with a different (more expensive, but semantically equivelent) BE count for the loop when requerying after widening. For the nestedIV example from elim-extend, we end up with the following BE counts: BEFORE: (-2 + (-1 * %innercount) + %limit) AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>) This is the only test in tree which seems sensitive to this difference. The actual result of using the wider BETC on this example is that we actually produce slightly better code. :) In review, we decided to accept that test change. This patch is structured to preserve the old behavior, but a separate change will immediate follow with the behavior change. (I wanted it separate for problem attribution purposes.) Differential Revision: https://reviews.llvm.org/D62880 llvm-svn: 362971
*	[FastISel] Skip creating unnecessary vregs for arguments	Francis Visoiu Mistrih	2019-06-10	2	-33/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This behavior was added in r130928 for both FastISel and SD, and then disabled in r131156 for FastISel. This re-enables it for FastISel with the corresponding fix. This is triggered only when FastISel can't lower the arguments and falls back to SelectionDAG for it. FastISel contains a map of "register fixups" where at the end of the selection phase it replaces all uses of a register with another register that FastISel sometimes pre-assigned. Code at the end of SelectionDAGISel::runOnMachineFunction is doing the replacement at the very end of the function, while other pieces that come in before that look through the MachineFunction and assume everything is done. In this case, the real issue is that the code emitting COPY instructions for the liveins (physreg to vreg) (EmitLiveInCopies) is checking if the vreg assigned to the physreg is used, and if it's not, it will skip the COPY. If a register wasn't replaced with its assigned fixup yet, the copy will be skipped and we'll end up with uses of undefined registers. This fix moves the replacement of registers before the emission of copies for the live-ins. The initial motivation for this fix is to enable tail calls for swiftself functions, which were blocked because we couldn't prove that the swiftself argument (which is callee-save) comes from a function argument (live-in), because there was an extra copy (vreg to vreg). A few tests are affected by this: * llvm/test/CodeGen/AArch64/swifterror.ll: we used to spill x21 (callee-save) but never reload it because it's attached to the return. We now don't even spill it anymore. * llvm/test/CodeGen//swiftself.ll: we tail-call now. llvm/test/CodeGen/AMDGPU/mubuf-legalize-operands.ll: I believe this test was not really testing the right thing, but it worked because the same registers were re-used. * llvm/test/CodeGen/ARM/cmpxchg-O0.ll: regalloc changes * llvm/test/CodeGen/ARM/swifterror.ll: get rid of a copy * llvm/test/CodeGen/Mips/: get rid of spills and copies llvm/test/CodeGen/SystemZ/swift-return.ll: smaller stack * llvm/test/CodeGen/X86/atomic-unordered.ll: smaller stack * llvm/test/CodeGen/X86/swifterror.ll: same as AArch64 * llvm/test/DebugInfo/X86/dbg-declare-arg.ll: stack size changed Differential Revision: https://reviews.llvm.org/D62361 llvm-svn: 362963
*	[ExecutionEngine] Fix rL362941: Add UnaryOperator visitor to the interpreter	Cameron McInally	2019-06-10	1	-0/+2
\| \| \| \| \| \|	Missed break statements. This was D62881. llvm-svn: 362958
*	[AMDGPU] Optimize image_[load\|store]_mip	Piotr Sobczak	2019-06-10	4	-0/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Replace image_load_mip/image_store_mip with image_load/image_store if lod is 0. Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63073 llvm-svn: 362957