path: root/llvm/test
Commit message | Author | Age | Files | Lines
* CodeGen: handle llvm.used properly for COFF | Saleem Abdulrasool | 2018-01-20 | 1 | -0/+20
`llvm.used` contains a list of pointers to named values which the compiler, assembler, and linker are required to treat as if there is a reference that they cannot see. Ensure that the symbols are preserved by adding an explicit `-include` reference to the linker command.
llvm-svn: 323017
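As a point of reference, a minimal IR sketch of an `llvm.used` entry (the global name here is illustrative, not taken from the commit):

    @foo = internal global i32 42
    @llvm.used = appending global [1 x i8*] [i8* bitcast (i32* @foo to i8*)], section "llvm.metadata"

Symbols listed this way must survive even when nothing visible references them, which is exactly what the COFF handling above has to guarantee.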
* [X86] Teach X86 codegen to use vector width preference to avoid promoting to 512-bit types when VLX is enabled and the preference is for a smaller size | Craig Topper | 2018-01-20 | 7 | -0/+1258
This change applies to places where we would turn 128/256-bit code into 512-bit in order to get a wider element type through sext/zext. Any 512-bit types that already existed in the IR/DAG will be left that way.
The width preference has no effect on codegen behavior when the target does not have AVX512 enabled. So AVX/AVX2 codegen cannot be limited via this mechanism yet.
If the preference is lower than 256 we may still use a 256-bit type to do the operation. Constraining to 128 bits makes it much more difficult to support some operations. For many of these cases we need to change element width while keeping element count constant, which is easiest done by switching between 256 and 128 bit.
The preference is only obeyed when AVX512 and VLX are available. This means the preference is not obeyed for KNL, but is obeyed for SKX, Cannonlake, and Icelake. For KNL, the only way to do masked operations is on 512-bit registers, so we would have to completely disable masking to obey the preference. We would also lose support for gather, scatter, ctlz, vXi64 multiplies, etc. This may change in the future, but this simplifies the initial implementation.
Differential Revision: https://reviews.llvm.org/D41895
llvm-svn: 323016
* [X86] Add support for passing 'prefer-vector-width' function attribute into X86Subtarget and exposing via X86's getRegisterWidth TTI interface | Craig Topper | 2018-01-20 | 1 | -0/+77
This will cause the vectorizers to do some limiting of the vector widths they create. This is not a strict limit. There are reasons I know of that the loop vectorizer will generate larger vectors for.
I've written this in such a way that the interface will only return a properly supported width (0/128/256/512) even if the attribute says something funny like 384 or 10.
This has been split from D41895 with the remainder in a follow up commit.
llvm-svn: 323015
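A short sketch of how the attribute appears in IR (the function body and attribute-group number are illustrative):

    define void @f(<8 x float>* %p) #0 {
      ret void
    }
    attributes #0 = { "prefer-vector-width"="256" }

With an attribute like this in place, the register-width query described above can report 256 bits to the vectorizers rather than the full 512-bit register width.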
* [ObjCARC] Do not turn a call to @objc_autoreleaseReturnValue into a call to @objc_autorelease if its operand is a PHI and the PHI has an equivalent value that is used by a return instruction | Akira Hatanaka | 2018-01-19 | 2 | -0/+59
For example, ARC optimizer shouldn't replace the call in the following example, as doing so breaks the AutoreleaseRV/RetainRV optimization:

      %v1 = bitcast i32* %v0 to i8*
      br label %bb3
    bb2:
      %v3 = bitcast i32* %v2 to i8*
      br label %bb3
    bb3:
      %p = phi i8* [ %v1, %bb1 ], [ %v3, %bb2 ]
      %retval = phi i32* [ %v0, %bb1 ], [ %v2, %bb2 ] ; equivalent to %p
      %v4 = tail call i8* @objc_autoreleaseReturnValue(i8* %p)
      ret i32* %retval

Also, make sure ObjCARCContract replaces @objc_autoreleaseReturnValue's operand uses with its value so that the call gets tail-called.
rdar://problem/15894705
llvm-svn: 323009
* [x86] add tests for sqrt estimate that should respect denorms; NFC (PR34994) | Sanjay Patel | 2018-01-19 | 1 | -0/+67
llvm-svn: 323003
* [X86] Autogenerate complete checks on a couple tests. NFC | Craig Topper | 2018-01-19 | 2 | -46/+110
llvm-svn: 322997
* [Dominators] Visit affected node candidates found at different root levels | Jakub Kuderski | 2018-01-19 | 2 | -0/+118
Summary: This patch attempts to fix the DomTree incremental insertion bug found in PR35969 (https://bugs.llvm.org/show_bug.cgi?id=35969).
When performing an insertion into a piece of unreachable CFG, we may find the same node at different levels. When this happens, the node can turn out to be affected when we find it starting from a node with a lower level in the tree. The level at which we start visitation affects if we consider a node affected or not.
This patch tracks the lowest level at which each node was visited during insertion and allows it to be visited multiple times, if it can cause it to be considered affected.
Reviewers: brzycki, davide, dberlin, grosser
Reviewed By: brzycki
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42231
llvm-svn: 322993
* Add optional DICompileUnit to DIBuilder + make outliner debug info use it | Jessica Paquette | 2018-01-19 | 1 | -12/+7
Previously, the DIBuilder didn't expose functionality to set its compile unit in any other way than calling createCompileUnit. This meant that the outliner, which creates new functions, had to create a new compile unit for its debug info.
This commit adds an optional parameter in the DIBuilder's constructor which lets you set its CU at construction.
It also changes the MachineOutliner so that it keeps track of the DISubprograms for each outlined sequence. If debugging information is requested, then it uses one of the outlined sequence's DISubprograms to grab a CU. It then uses that CU to construct the DISubprogram for the new outlined function.
The test has also been updated to reflect this change.
See https://reviews.llvm.org/D42254 for more information. Also see the e-mail discussion on D42254 in llvm-commits for more context.
llvm-svn: 322992
* [SystemZ] Prefer LOCHI over generating IPM sequences | Ulrich Weigand | 2018-01-19 | 3 | -50/+42
On current machines we have load-on-condition instructions that can be used to directly implement the SETCC semantics. If we have those, it is always preferable to use them instead of generating the IPM sequence.
llvm-svn: 322989
* [SystemZ] Directly use CC result of compare-and-swap | Ulrich Weigand | 2018-01-19 | 5 | -0/+287
In order to implement a test whether a compare-and-swap succeeded, the SystemZ back-end currently emits a rather inefficient sequence of first converting the CC result into an integer, and then testing that integer against zero. This commit changes the back-end to simply directly test the CC value set by the compare-and-swap instruction.
llvm-svn: 322988
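For context, the IR pattern whose lowering this improves looks roughly like the following sketch (types and value names are illustrative):

    %pair    = cmpxchg i32* %ptr, i32 %expected, i32 %new seq_cst seq_cst
    %success = extractvalue { i32, i1 } %pair, 1
    br i1 %success, label %done, label %retry

The change lets the i1 success flag be tested directly from the CC set by the compare-and-swap instruction instead of materializing an integer first.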
* [SystemZ] Rework IPM sequence generation | Ulrich Weigand | 2018-01-19 | 3 | -4/+250
The SystemZ back-end uses a sequence of IPM followed by arithmetic operations to implement the SETCC primitive. This is currently done early during SelectionDAG. This patch moves generating those sequences to much later in SelectionDAG (during PreprocessISelDAG).
This doesn't change much in generated code by itself, but it allows further enhancements that will be checked-in as follow-on commits.
llvm-svn: 322987
* [SystemZ] Run branch-12.ll test only if long tests enabled | Ulrich Weigand | 2018-01-19 | 2 | -1/+1
This avoids excessive test run times e.g. with expensive checks enabled.
llvm-svn: 322983
* [WebAssembly] MC: Start table at offset 1 rather than 0 | Sam Clegg | 2018-01-19 | 4 | -12/+12
Summary: For consistency with the output of lld. This is useful in runnable binaries as they can then be sure the null function pointer will never be a valid argument to call_indirect.
Subscribers: jfb, dschuff, jgravelle-google, aheejin, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D42284
llvm-svn: 322978
* [X86][SSE] Add SSE2 gather tests | Simon Pilgrim | 2018-01-19 | 1 | -74/+156
Check codegen without PEXTRD.
llvm-svn: 322974
* [ARM] Fix perf regression in compare optimization. | Joel Galenson | 2018-01-19 | 1 | -0/+32
Fix a performance regression caused by r322737. While trying to make it easier to replace compares with existing adds and subtracts, I accidentally stopped it from doing so in some cases. This should fix that. I'm also fixing another potential bug in that commit.
Differential Revision: https://reviews.llvm.org/D42263
llvm-svn: 322972
* [WebAssembly] Fix libcall signature lookup | Derek Schuff | 2018-01-19 | 1 | -0/+107
RuntimeLibcallSignatures previously manually initialized all the libcall names into an array and searched it linearly for the first match to lookup the corresponding index. r322802 switched that to initializing a map keyed by the libcall name.
Neither of these approaches works correctly because some libcall numbers use the same name on different platforms (e.g. the "l" suffixed functions use f80 or f128 or ppcf128). This change fixes that by ensuring that each name only goes into the map once. It also adds tests.
Differential Revision: https://reviews.llvm.org/D42271
llvm-svn: 322971
* [WebAssembly] Make sign-extension opcodes a distinct feature. | Dan Gohman | 2018-01-19 | 3 | -12/+12
Sign-extension opcodes have been split into a separate proposal from the main threads proposal, so switch them to their own target feature.
See: https://github.com/WebAssembly/sign-extension-ops
llvm-svn: 322966
* Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1) | Daniel Neilson | 2018-01-19 | 382 | -2195/+2193
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | attributes (Step 1) Summary: This is a resurrection of work first proposed and discussed in Aug 2015: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html and initially landed (but then backed out) in Nov 2015: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments. In this change we: 1) Remove the alignment argument. 2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false) will now read call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false) Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns so some manual checking and updating will be required. s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g 
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g The remaining changes in the series will: Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. Step 3) Update Clang to use the new IRBuilder API. Step 4) Update Polly to use the new IRBuilder API. Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead. Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reviewers: pete, hfinkel, lhames, reames, bollu Reviewed By: reames Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits Differential Revision: https://reviews.llvm.org/D41675 llvm-svn: 322965
* [x86] add RUN line and auto-generate checks | Sanjay Patel | 2018-01-19 | 1 | -55/+182
There were checks for a 32-bit target here, but no RUN line corresponding to that prefix. I don't know what the intent of these tests is, but at least now we can see what happens for both targets.
llvm-svn: 322961
* [x86] regenerate complete checks; NFC | Sanjay Patel | 2018-01-19 | 1 | -30/+51
D42265 will improve something here, but it's not obvious how without more checks.
llvm-svn: 322960
* [x86] shrink 'and' immediate values by setting the high bits (PR35907) | Sanjay Patel | 2018-01-19 | 11 | -148/+127
Try to reverse the constant-shrinking that happens in SimplifyDemandedBits() for 'and' masks when it results in a smaller sign-extended immediate. We are also able to detect dead 'and' ops here (the mask is all ones). In that case, we replace and return without selecting the 'and'.
Other targets might want to share some of this logic by enabling this under a target hook, but I didn't see diffs for simple cases with PowerPC or AArch64, so they may already have some specialized logic for this kind of thing or have different needs.
This should solve PR35907: https://bugs.llvm.org/show_bug.cgi?id=35907
Differential Revision: https://reviews.llvm.org/D42088
llvm-svn: 322957
* [X86] Extend load-op-store fusion merge to ADC/SBB. | Nirav Dave | 2018-01-19 | 2 | -2/+297
Summary: Add handling of EFLAG input to X86 Load-op-store fusion checking.
Reviewers: craig.topper, RKSimon
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D42128
llvm-svn: 322952
* [AArch64][SVE] Asm: Add support for RDVL/ADDVL/ADDPL instructions | Sander de Smalen | 2018-01-19 | 6 | -0/+135
Reviewers: fhahn, rengolin, t.p.northover, echristo, olista01, SjoerdMeijer
Reviewed By: SjoerdMeijer
Subscribers: SjoerdMeijer, aemerson, javed.absar, tschuett, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D41900
llvm-svn: 322951
* [X86][AVX] Add more variable permute tests for source vectors smaller than destination | Simon Pilgrim | 2018-01-19 | 1 | -8/+1060
llvm-svn: 322948
* [SLP] Fix vectorization for tree with trunc to minimum required bit width. | Alexey Bataev | 2018-01-19 | 2 | -19/+13
Summary: If the vectorized tree has truncate to minimum required bit width and the vector type of the cast operation after the truncation is the same as the vector type of the cast operands, count cost of the vector cast operation as 0, because this cast will be later removed. Also, if the vectorization tree root operations are integer cast operations, do not consider them as candidates for truncation. It will just create extra number of the same vector/scalar operations, which will be removed by instcombiner.
Reviewers: RKSimon, spatel, mkuper, hfinkel, mssimpso
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41948
llvm-svn: 322946
* [AMDGPU][MC] Corrected parsing of image modifiers and encoding of image atomics | Dmitry Preobrazhensky | 2018-01-19 | 1 | -8/+52
See bugs:
35962: https://bugs.llvm.org/show_bug.cgi?id=35962
35963: https://bugs.llvm.org/show_bug.cgi?id=35963
Differential Revision: https://reviews.llvm.org/D42184
Reviewers: vpykhtin, artem.tamazov, arsenm
llvm-svn: 322942
* Fix line endings. NFCI. | Simon Pilgrim | 2018-01-19 | 1 | -30/+30
llvm-svn: 322940
* [X86] Add KNL target to slow PMULLD tests | Simon Pilgrim | 2018-01-19 | 1 | -0/+2
llvm-svn: 322939
* [X86] Add RDPID schedule test | Simon Pilgrim | 2018-01-19 | 1 | -0/+20
llvm-svn: 322938
* [X86] Regenerate RDPMC intrinsic test | Simon Pilgrim | 2018-01-19 | 1 | -12/+16
llvm-svn: 322937
* [InstCombine] Make foldSelectOpOp able to handle two-operand getelementptr | John Brawn | 2018-01-19 | 1 | -20/+15
Three (or more) operand getelementptrs could plausibly also be handled, but handling only two-operand fits in easily with the existing BinaryOperator handling.
Differential Revision: https://reviews.llvm.org/D39958
llvm-svn: 322930
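A rough illustration, not taken from the patch, of the kind of fold a two-operand getelementptr can now participate in, assuming both GEPs share a base pointer and differ only in the index:

    %g1  = getelementptr i32, i32* %base, i64 %i
    %g2  = getelementptr i32, i32* %base, i64 %j
    %sel = select i1 %c, i32* %g1, i32* %g2

    ; can become
    %idx = select i1 %c, i64 %i, i64 %j
    %sel = getelementptr i32, i32* %base, i64 %idx

Treating the two-operand GEP like a binary operator is what lets the existing foldSelectOpOp machinery apply here.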
* Split TailDuplicatePass into pre- and post-RA variant; NFC | Matthias Braun | 2018-01-19 | 1 | -1/+1
Split TailDuplicatePass into EarlyTailDuplicate and TailDuplicate. This avoids playing games with fake pass IDs and using MRI::isSSA() to determine pre-/post-RA state.
llvm-svn: 322926
* Move tests to the correct place | Matthias Braun | 2018-01-19 | 19 | -0/+0
test/CodeGen/MIR is for testing the MIR parser/printer. Tests for passes and targets belong to test/CodeGen/TARGETNAME.
llvm-svn: 322925
* Revert [CGP] Re-enable Select in complex addressing mode | Serguei Katkov | 2018-01-19 | 1 | -1/+1
One of the buildbots failed. Reverting for now until the issue is fixed.
llvm-svn: 322923
* AArch64: Fix emergency spillslot being out of reach for large callframes | Matthias Braun | 2018-01-19 | 1 | -0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Re-commit of r322200: The testcase shouldn't hit machineverifiers anymore with r322917 in place. Large callframes (calls with several hundreds or thousands or parameters) could lead to situations in which the emergency spillslot is out of range to be addressed relative to the stack pointer. This commit forces the use of a frame pointer in the presence of large callframes. This commit does several things: - Compute max callframe size at the end of instruction selection. - Add mirFileLoaded target callback. Use it to compute the max callframe size after loading a .mir file when the size wasn't specified in the file. - Let TargetFrameLowering::hasFP() return true if there exists a callframe > 255 bytes. - Always place the emergency spillslot close to FP if we have a frame pointer. - Note that `useFPForScavengingIndex()` would previously return false when a base pointer was available leading to the emergency spillslot getting allocated late (that's the whole effect of this callback). Which made no sense to me so I took this case out: Even though the emergency spillslot is technically not referenced by FP in this case we still want it allocated early. Differential Revision: https://reviews.llvm.org/D40876 llvm-svn: 322919
* AArch64: Omit callframe setup/destroy when not necessary | Matthias Braun | 2018-01-19 | 7 | -62/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | Do not create CALLSEQ_START/CALLSEQ_END when there is no callframe to setup and the callframe size is 0. - Fixes an invalid callframe nesting for byval arguments, which would look like this before this patch (as in `big-byval.ll`): ... ADJCALLSTACKDOWN 32768, 0, ... # Setup for extfunc ... ADJCALLSTACKDOWN 0, 0, ... # setup for memcpy ... BL &memcpy ... ADJCALLSTACKUP 0, 0, ... # destroy for memcpy ... BL &extfunc ADJCALLSTACKUP 32768, 0, ... # destroy for extfunc - Saves us two instructions in the common case of zero-sized stackframes. - Remove an unnecessary scheduling barrier (hence the small unittest changes). Differential Revision: https://reviews.llvm.org/D42006 llvm-svn: 322917
* [X86] Add intrinsic support for the RDPID instruction | Craig Topper | 2018-01-18 | 1 | -0/+21
This adds a new intrinsic to support the rdpid instruction. The implementation is a bit weird because the intrinsic is defined as always returning 32 bits, but the assembler support thinks the instruction produces a 64-bit register in 64-bit mode. But really it zeros the upper 32 bits. So I had to add separate patterns where 64-bit mode uses an extract_subreg.
Differential Revision: https://reviews.llvm.org/D42205
llvm-svn: 322910
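A sketch of the IR-level usage, assuming the intrinsic is exposed as llvm.x86.rdpid in line with the usual x86 intrinsic naming (verify against IntrinsicsX86.td):

    declare i32 @llvm.x86.rdpid()

    define i32 @get_processor_id() {
      %id = call i32 @llvm.x86.rdpid()
      ret i32 %id
    }

On 64-bit targets the instruction writes a 64-bit register, so the isel pattern described above extracts the low 32 bits to match the i32 return type.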
* [InstSimplify] regenerate checks and add tests for commutes; NFC | Sanjay Patel | 2018-01-18 | 1 | -49/+92
llvm-svn: 322907
* [CodeView] Add line numbers for inlined call sites | Reid Kleckner | 2018-01-18 | 1 | -0/+26
We did this for inline call site line tables, but we hadn't done it for regular function line tables yet. This patch copies that logic from encodeInlineLineTable.
llvm-svn: 322905
* AMDGPU/SI: Add d16 support for image intrinsics. | Changpeng Fang | 2018-01-18 | 3 | -0/+397
Summary: This patch implements d16 support for image load, image store and image sample intrinsics.
Reviewers: Matt, Brian.
Differential Revision: https://reviews.llvm.org/D3991
llvm-svn: 322903
* [test] Actually check the common parts in CodeGen/ARM/global-merge-external.ll. NFC. | Martin Storsjo | 2018-01-18 | 1 | -7/+7
Previously, these parts weren't ever checked. The label patterns need to be extended to match successfully on macho.
Differential Revision: https://reviews.llvm.org/D42126
llvm-svn: 322900
* [DWARFv5] Number the line-table's directory array correctly. | Paul Robinson | 2018-01-18 | 5 | -28/+28
The compilation directory has always been #0, but as of DWARF v5 it is explicitly listed in the line-table section instead of implicitly being a reference to the compile_unit DIE's DW_AT_comp_dir attribute. This means the dumper should number the dumped array starting with 0 or 1 depending on the DWARF version of the line table. References in the generated DWARF are correct, it's just the dumper that was wrong.
Also some assembler-coded tests were similarly confused about directory numbers.
llvm-svn: 322884
* [AArch64][GlobalISel] Add isel support for global values in the large code model | Amara Emerson | 2018-01-18 | 1 | -0/+61
Fixes PR35958.
Differential Revision: https://reviews.llvm.org/D42175
llvm-svn: 322878
* [X86][SSE] Regenerate vector promotion tests | Simon Pilgrim | 2018-01-18 | 1 | -19/+46
llvm-svn: 322877
* [RISCV] Fixed setting predicates for compressed instructions. | Ana Pazos | 2018-01-18 | 5 | -0/+74
Summary: Fixed setting predicates for compressed instructions. Some instructions were being generated with C extension enabled only, without proper checks for the other required extensions like F, D and 32 and 64-bit target checks.
Affected instructions: C_FLD, C_FLW, C_LD, C_FSD, C_FSW, C_SD, C_JAL, C_ADDIW, C_SUBW, C_ADDW, C_FLDSP, C_FLWSP, C_LDSP, C_FSDSP, C_FSWSP, C_SDSP
Reviewers: asb, shiva0217
Reviewed By: asb
Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, llvm-commits
Differential Revision: https://reviews.llvm.org/D42132
llvm-svn: 322876
* [X86][AVX] Add 256/512-bit slow PMULLD tests | Simon Pilgrim | 2018-01-18 | 1 | -24/+768
llvm-svn: 322874
* [CodeGen] Print RegClasses on MI in verbose mode | Francis Visoiu Mistrih | 2018-01-18 | 9 | -66/+66
r322086 removed the trailing information describing reg classes for each register. This patch adds printing reg classes next to every register when individual operands/instructions/basic blocks are printed. In the case of dumping MIR or printing a full function, by default don't print it.
Differential Revision: https://reviews.llvm.org/D42239
llvm-svn: 322867
* [SLP] Fix test checks, NFC. | Alexey Bataev | 2018-01-18 | 1 | -12/+22
llvm-svn: 322865
* [MachineOutliner] Fix r322788 - don't write to working directory | Sam McCall | 2018-01-18 | 1 | -1/+1
llvm-svn: 322850
* [X86] Add PR35918 test case | Simon Pilgrim | 2018-01-18 | 1 | -0/+108
llvm-svn: 322846