bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[NVPTX] Allow libcalls that are defined in the current module.	Justin Lebar	2018-12-26	8	-4/+202
	The patch adds the possibility of making library calls on NVPTX. An important point about library functions: they must be defined within the current module. This should guarantee that we produce valid PTX assembly (without calls to undefined functions). Anyone who wants to use the libcalls will probably have to link against compiler-rt or another implementation. Currently, it's completely impossible to make library calls because of the error LLVM ERROR: Cannot select: i32 = ExternalSymbol '...'. But we can lower ExternalSymbol to TargetExternalSymbol and verify that the function definition is available. There was also an issue with the DAG during legalisation. When we expand an instruction into a libcall, the inner call chain isn't "integrated" into the outer chain. Since the last "data-flow" node (the call retval load) is located earlier in the call chain than the CALLSEQ_END node, the latter becomes a leaf and therefore a dead node (and is removed quite quickly). The solution proposed here relies on additional data-flow pseudo nodes (ProxyReg) whose only purpose is to keep CALLSEQ_END alive during the legalisation and instruction selection phases; we remove the pseudo instructions before the register scheduling phase. Patch by Denys Zariaiev! Differential Revision: https://reviews.llvm.org/D34708 llvm-svn: 350069
*	[NVPTX] Reduce stack size in NVPTXAsmPrinter::doInitialization().	Justin Lebar	2018-12-22	1	-5/+2
	NVPTXAsmPrinter::doInitialization() was creating an NVPTXSubtarget on the stack. This object is huge, about 80kb. Also it's slow to create. And it's all redundant; we have one in NVPTXTargetMachine anyway! llvm-svn: 349982
*	[NVPTX] Lower instructions that expand into libcalls.	Artem Belevich	2018-12-14	1	-4/+9
	The change is an effort to split and refactor the abandoned D34708 into smaller parts. Here the behaviour of unsupported instructions is changed to match the behaviour of explicit intrinsic calls. Currently LLVM crashes with: > Assertion getInstruction() && "Not a call or invoke instruction!" failed. With this patch LLVM produces a more sensible error message: > Cannot select: ... i32 = ExternalSymbol'__foobar' Author: Denys Zariaiev <denys.zariaiev@gmail.com> Differential Revision: https://reviews.llvm.org/D55145 llvm-svn: 349213
*	[NVPTX] do not rely on cached subtarget info.	Artem Belevich	2018-12-12	2	-13/+14
	If a module has function references, but no functions themselves, we may end up never calling runOnMachineFunction and therefore would never initialize the nvptxSubtarget field, which would eventually cause a crash. Instead of relying on nvptxSubtarget being initialized by one of the methods, retrieve subtarget info directly. Differential Revision: https://reviews.llvm.org/D55580 llvm-svn: 348952
*	[Targets] Add errors for tiny and kernel codemodel on targets that don't support them	David Green	2018-12-07	1	-7/+1
	Adds fatal errors for any target that does not support the Tiny or Kernel codemodels by rejigging the getEffectiveCodeModel calls. Differential Revision: https://reviews.llvm.org/D50141 llvm-svn: 348585
*	[DEBUGINFO, NVPTX] Disable emission of ',debug' option if only debug directives are allowed.	Alexey Bataev	2018-12-06	1	-1/+15
	Summary: If the output of debug directives only is requested, we should drop emission of the ',debug' option from the target directive. Required for supporting the nvprof profiler. Reviewers: echristo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46061 llvm-svn: 348497
*	[DEBUGINFO, NVPTX]Emit last debugging directives.	Alexey Bataev	2018-12-06	3	-3/+15
	Summary: We may end up with unemitted debug directives at the end of module emission. The patch fixes this problem by emitting those last directives at the end of module emission. Reviewers: echristo Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D54320 llvm-svn: 348495
*	[NVPTX] Add lowering of i128 numbers as struct fields	Artem Belevich	2018-12-01	1	-0/+12
	Addition to D34555: override the VTs computation with ComputePTXValueVTs for struct fields. Author: Denys Zariaiev <denys.zariaiev@gmail.com> Differential Revision: https://reviews.llvm.org/D55144 llvm-svn: 348057
*	[SelectionDAG] Move (repeated) SDTIntShiftDOp double shift node def to common code. NFCI.	Simon Pilgrim	2018-11-16	1	-3/+0
	Prep work for PR39467. llvm-svn: 347067
*	Revert "[DEBUGINFO, NVPTX]DO not emit ',debug' option if no debug info or only debug directives are requested."	Alexey Bataev	2018-11-09	3	-30/+4
	This reverts commit r345972. Need to update the description and possibly the patch itself after discussion with Eric Christopher. llvm-svn: 346508
*	MachineFunction: Store more specific reference to LLVMTargetMachine; NFC	Matthias Braun	2018-11-05	1	-1/+1
	MachineFunction can only be used in code using lib/CodeGen, hence we can keep a more specific reference to LLVMTargetMachine rather than just TargetMachine around. Do the same for references in ScheduleDAG and RegUsageInfoCollector. llvm-svn: 346183
*	[TargetLowering] Change TargetLoweringBase::getPreferredVectorAction to take an MVT instead of an EVT. NFC	Craig Topper	2018-11-05	2	-2/+2
	The main caller of this already has an MVT and several targets called getSimpleVT inside without checking isSimple. This makes the simpleness explicit. llvm-svn: 346180
*	[DEBUGINFO, NVPTX]DO not emit ',debug' option if no debug info or only debug directives are requested.	Alexey Bataev	2018-11-02	3	-4/+30
	Summary: If the output of debug directives only is requested, we should drop emission of the ',debug' option from the target directive. Required for supporting the nvprof profiler. Reviewers: probinson, echristo, dblaikie Subscribers: Hahnfeld, jholewinski, llvm-commits, JDevlieghere, aprantl Differential Revision: https://reviews.llvm.org/D46061 llvm-svn: 345972
*	[DEBUG_INFO][NVPTX]Fix processing of DBG_VALUES.	Alexey Bataev	2018-10-25	1	-0/+19
	Summary: If the instruction in the eliminateFrameIndex function is a DBG_VALUE instruction, it requires special processing. The frame register is set to VRFrame and the offset is based on the object offset. The code is similar to the code used in lib/CodeGen/PrologEpilogInserter.cpp. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D53657 llvm-svn: 345269
*	[NFC] Rename minnan and maxnan to minimum and maximum	Thomas Lively	2018-10-24	1	-2/+2
	Summary: Changes all uses of minnan/maxnan to minimum/maximum globally. These names emphasize that the semantic difference between these operations is more than just NaN-propagation. Reviewers: arsenm, aheejin, dschuff, javed.absar Subscribers: jholewinski, sdardis, wdng, sbc100, jgravelle-google, jrtc27, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D53112 llvm-svn: 345218
*	[DEBUGINFO, NVPTX] Try to pack bytes data into a single string.	Alexey Bataev	2018-10-24	2	-0/+31
	Summary: If the target does not support the `.asciz` and `.ascii` directives, strings are represented as bytes and each byte is placed on a new line as a separate byte directive `.b8 <data>`. The NVPTX target allows data of the same type to be represented as a vector, with values separated by the `,` symbol: `.b8 <data1>,<data2>,...`. This reduces the size of the final PTX file. The ptxas tool includes PTX files in the resulting binary object, so reducing the size of the PTX file is important. Reviewers: tra, jlebar, echristo Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D45822 llvm-svn: 345142
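The size saving described in the message above can be illustrated with a small Python sketch. This is not the NVPTXAsmPrinter code: the function names and the `per_line` limit are hypothetical and chosen only to show the difference between one `.b8` directive per byte and a comma-separated list.

```python
def emit_bytes_unpacked(data: bytes) -> list[str]:
    """One `.b8` directive per byte, one byte per line."""
    return [f".b8 {b}" for b in data]


def emit_bytes_packed(data: bytes, per_line: int = 40) -> list[str]:
    """Comma-separated `.b8` directives.

    `per_line` is a hypothetical cap on values per directive, used here
    only to keep the emitted lines short.
    """
    lines = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        lines.append(".b8 " + ",".join(str(b) for b in chunk))
    return lines


if __name__ == "__main__":
    text = b"hello\0"
    unpacked = "\n".join(emit_bytes_unpacked(text))
    packed = "\n".join(emit_bytes_packed(text))
    print(packed)
    print(len(unpacked), "characters unpacked vs", len(packed), "packed")
```

The packed form drops the repeated `.b8 ` prefix and the newline for every byte after the first in each directive, which is where the reduction comes from.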
*	[TI removal] Make variables declared as `TerminatorInst` and initialized	Chandler Carruth	2018-10-15	1	-1/+1
	by `getTerminator()` calls instead be declared as `Instruction`. This is the biggest remaining chunk of the usage of `getTerminator()` that insists on the narrow type and so is an easy batch of updates. Several files saw more extensive updates where this would cascade to requiring API updates within the file to use `Instruction` instead of `TerminatorInst`. All of these were trivial in nature (pervasively using `Instruction` instead just worked). llvm-svn: 344502
*	[CUDA] Added basic support for compiling with CUDA-10.0	Artem Belevich	2018-09-24	1	-0/+5
	llvm-svn: 342924
*	[NVPTX] Implement isLegalToVectorizeLoadChain	Benjamin Kramer	2018-08-27	1	-0/+13
	This lets LSV nicely split up underaligned chains. Differential Revision: https://reviews.llvm.org/D51306 llvm-svn: 340760
*	[NVPTX] Remove ftz variants of cvt with rounding mode	Benjamin Kramer	2018-08-21	1	-36/+6
	These do not exist in ptxas; it refuses to compile them. Differential Revision: https://reviews.llvm.org/D51042 llvm-svn: 340317
*	[SDAG] Remove the reliance on MI's allocation strategy for	Chandler Carruth	2018-08-14	1	-21/+14
	`MachineMemOperand` pointers attached to `MachineSDNodes` and instead have the `SelectionDAG` fully manage the memory for this array. Prior to this change, the memory management was deeply confusing here -- The way the MI was built relied on the `SelectionDAG` allocating memory for these arrays of pointers using the `MachineFunction`'s allocator so that the raw pointer to the array could be blindly copied into an eventual `MachineInstr`. This creates a hard coupling between how `MachineInstr`s allocate their array of `MachineMemOperand` pointers and how the `MachineSDNode` does. This change is motivated in large part by a change I am making to how `MachineFunction` allocates these pointers, but it seems like a layering improvement as well. This would run the risk of increasing allocations overall, but I've implemented an optimization that should avoid that by storing a single `MachineMemOperand` pointer directly instead of allocating anything. This is expected to be a net win because the vast majority of uses of these only need a single pointer. As a side-effect, this makes the API for updating a `MachineSDNode` and a `MachineInstr` reasonably different which seems nice to avoid unexpected coupling of these two layers. We can map between them, but we shouldn't be surprised at where that occurs. =] Differential Revision: https://reviews.llvm.org/D50680 llvm-svn: 339740
*	[NVPTX] Select atomic loads and stores	Jonas Hahnfeld	2018-08-09	1	-34/+82
	According to the PTX ISA, .volatile has the same memory synchronization semantics as .relaxed.sys, so it can be used to implement monotonic atomic loads and stores. This is important for OpenMP's atomic construct where - 'read's and 'write's are lowered to atomic loads and stores, and - an update of float or double types is lowered into a cmpxchg loop. (Note that PTX could do better because it has atom.add.f{32,64} but LLVM's atomicrmw instruction only allows integer types.) Higher levels of atomicity (like acquire and release) need additional synchronization properties which were added with PTX ISA 6.0 / sm_70. So using these instructions still results in an error. Differential Revision: https://reviews.llvm.org/D50391 llvm-svn: 339316
*	[NVPTX] Handle __nvvm_reflect("__CUDA_ARCH").	Artem Belevich	2018-08-03	3	-5/+12
	Summary: libdevice in recent CUDA versions relies on __nvvm_reflect() to select GPU-specific bitcode. This patch addresses the requirement. Reviewers: jlebar Subscribers: jholewinski, sanjoy, hiraditya, bixia, llvm-commits Differential Revision: https://reviews.llvm.org/D50207 llvm-svn: 338908
*	Remove trailing space	Fangrui Song	2018-07-30	3	-3/+3
	sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h} llvm-svn: 338293
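For readers who prefer to see what the sed expression above does on a whole buffer rather than line by line, here is a rough Python equivalent. It is only a sketch: Python has no `[[:space:]]` class, so `[^\S\n]` (any whitespace except newline) is used as an approximation, and the function name is hypothetical.

```python
import re

# Whitespace (but not the newline itself) immediately before a line end.
# With re.MULTILINE, `$` matches before every "\n" and at the end of the text.
TRAILING_WS = re.compile(r"[^\S\n]+$", re.MULTILINE)


def strip_trailing_space(text: str) -> str:
    """Remove trailing spaces and tabs from every line, keeping line breaks."""
    return TRAILING_WS.sub("", text)


if __name__ == "__main__":
    sample = "int x;  \n\treturn;\t\n\nend"
    print(repr(strip_trailing_space(sample)))
```

Leading indentation is untouched because the pattern only matches whitespace that runs up to the end of a line.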
*	[DEBUGINFO, NVPTX] Emit correct debug information for local variables.	Alexey Bataev	2018-07-26	4	-0/+18
	Summary: The NVPTX target does not use register-based frame information. Instead it relies on the artificial local_depot that is used instead of the frame, and the data for variables must be emitted relative to this local_depot. Reviewers: tra, jlebar, echristo Subscribers: jholewinski, aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D45963 llvm-svn: 338039
*	[TableGen] Support multi-alternative pattern fragments	Ulrich Weigand	2018-07-13	1	-2/+2
	A TableGen instruction record usually contains a DAG pattern that will describe the SelectionDAG operation that can be implemented by this instruction. However, there will be cases where several different DAG patterns can all be implemented by the same instruction. The way to represent this today is to write additional patterns in the Pattern (or usually Pat) class that map those extra DAG patterns to the instruction. This usually also works fine. However, I've noticed cases where the current setup seems to require quite a bit of extra (and duplicated) text in the target .td files. For example, in the SystemZ back-end, there are quite a number of instructions that can implement an "add-with-overflow" operation. The same instructions also need to be used to implement just plain addition (simply ignoring the extra overflow output). The current solution requires creating an extra Pat pattern for every instruction, duplicating the information about which particular add operands map best to which particular instruction. This patch enhances TableGen to support a new PatFrags class, which can be used to encapsulate multiple alternative patterns that may all match to the same instruction. It operates the same way as the existing PatFrag class, except that it accepts a list of DAG patterns to match instead of just a single one. As an example, we can now define a PatFrags to match either an "add-with-overflow" or a regular add operation: def z_sadd : PatFrags<(ops node:$src1, node:$src2), [(z_saddo node:$src1, node:$src2), (add node:$src1, node:$src2)]>; and then use this in the add instruction pattern: defm AR : BinaryRRAndK<"ar", 0x1A, 0xB9F8, z_sadd, GR32, GR32>; These SystemZ target changes are implemented here as well.
Note that PatFrag is now defined as a subclass of PatFrags, which means that some users of internals of PatFrag need to be updated. (E.g. instead of using PatFrag.Fragment you now need to use !head(PatFrag.Fragments).) The implementation is based on the following main ideas: - InlinePatternFragments may now replace each original pattern with several result patterns, not just one. - parseInstructionPattern delays calling InlinePatternFragments and InferAllTypes. Instead, it extracts a single DAG match pattern from the main instruction pattern. - Processing of the DAG match pattern part of the main instruction pattern now shares most code with processing match patterns from the Pattern class. - Direct use of main instruction patterns in InferFromPattern and EmitResultInstructionAsOperand is removed; everything now operates solely on DAG match patterns. Reviewed by: hfinkel Differential Revision: https://reviews.llvm.org/D48545 llvm-svn: 336999
*	Use Type::isIntOrPtrTy where possible, NFC	Vedant Kumar	2018-07-06	1	-1/+1
	It's a bit neater to write T.isIntOrPtrTy() over `T.isIntegerTy() || T.isPointerTy()`. I used Python's re.sub with this regex to update users: r'([\w.\->()]+)isIntegerTy\(\)\s*\|\|\s*\1isPointerTy\(\)' llvm-svn: 336462
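A short Python sketch of the kind of re.sub rewrite the message describes. The pattern below is a best-effort reading of the regex quoted in the message, so the exact escapes (the `\(\)` call parentheses and the `\s*` quantifiers) should be treated as an assumption, and the function name is hypothetical.

```python
import re

# Group 1 captures the receiver expression (e.g. "T." or "Ty->"); the \1
# backreference requires the same receiver on both sides of `||`.
PATTERN = re.compile(
    r"([\w.\->()]+)isIntegerTy\(\)\s*\|\|\s*\1isPointerTy\(\)"
)


def use_is_int_or_ptr_ty(source: str) -> str:
    """Rewrite `X.isIntegerTy() || X.isPointerTy()` to `X.isIntOrPtrTy()`."""
    return PATTERN.sub(r"\1isIntOrPtrTy()", source)


if __name__ == "__main__":
    print(use_is_int_or_ptr_ty("if (T.isIntegerTy() || T.isPointerTy())"))
    print(use_is_int_or_ptr_ty("Ty->isIntegerTy() || Ty->isPointerTy()"))
    # Different receivers on each side are left alone.
    print(use_is_int_or_ptr_ty("A.isIntegerTy() || B.isPointerTy()"))
```

The backreference is the important design choice: without it, expressions that test two different types would be collapsed incorrectly.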
*	[NVPTX] Expand v2f16 INSERT_VECTOR_ELT	Benjamin Kramer	2018-07-03	1	-0/+1
	Vectorization can create them. llvm-svn: 336227
*	[NVPTX] Delete dead code	Benjamin Kramer	2018-06-28	5	-63/+0
	No functionality change. llvm-svn: 335913
*	[NVPTX] Ignore target-cpu and -features for inlining	Jonas Hahnfeld	2018-06-17	1	-0/+8
	We don't want to prevent inlining because of target-cpu and -features attributes that were added to newer versions of LLVM/Clang: There are no incompatible functions in PTX, ptxas will throw errors in such cases. Differential Revision: https://reviews.llvm.org/D47691 llvm-svn: 334904
*	[NVPTX] Delete dead code from the AsmPrinter.	Benjamin Kramer	2018-06-04	2	-142/+0
	llvm-svn: 333924
*	Set ADDE/ADDC/SUBE/SUBC to expand by default	Amaury Sechet	2018-06-01	1	-3/+0
	Summary: They've been deprecated in favor of UADDO/ADDCARRY or USUBO/SUBCARRY for a while. Targets that use these opcodes are changed in order to ensure their behavior doesn't change. Reviewers: efriedma, craig.topper, dblaikie, bkramer Subscribers: jholewinski, arsenm, jyknight, sdardis, nemanjai, nhaehnle, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D47422 llvm-svn: 333748
*	Revert "Temporarily revert "[DEBUG] Initial adaptation of NVPTX target for debug info emission.""	Eric Christopher	2018-05-18	11	-273/+207
	This reapplies commits: r330271, r330592, r330779. [DEBUG] Initial adaptation of NVPTX target for debug info emission. Summary: Patch adds initial emission of the debug info for NVPTX target. Currently, only .file and .loc directives are emitted, everything else is commented out to not break the compilation of Cuda. llvm-svn: 332689
*	Rename DEBUG macro to LLVM_DEBUG.	Nicola Zaghen	2018-05-14	2	-6/+8
	The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually change DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
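To see which occurrences the sed substitution in the message above does and does not touch, here is a hedged Python rendering of the same expression. The function name is hypothetical; the pattern `\bDEBUG\s?\(` is a direct translation of the sed pattern `\bDEBUG\s\?(`.

```python
import re

# A word boundary, the literal DEBUG, at most one whitespace character,
# then an opening parenthesis.
DEBUG_CALL = re.compile(r"\bDEBUG\s?\(")


def rename_debug_macro(source: str) -> str:
    """Rename DEBUG(...) invocations to LLVM_DEBUG(...)."""
    return DEBUG_CALL.sub("LLVM_DEBUG(", source)


if __name__ == "__main__":
    for line in (
        'DEBUG(dbgs() << "x");',
        "LLVM_DEBUG(x);",        # already renamed: "_" is a word character, so no \b
        "#ifndef NDEBUG",        # no parenthesis follows, so no match
        '#define DEBUG_TYPE "nvptx"',
    ):
        print(rename_debug_macro(line))
```

Because `_` counts as a word character, there is no word boundary inside `LLVM_DEBUG`, so running the substitution twice leaves already renamed code unchanged.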
*	[NVPTX] Added a feature to use short pointers for const/local/shared AS.	Artem Belevich	2018-05-09	8	-61/+108
	Const/local/shared address spaces are all < 4GB and we can always use 32-bit pointers to access them. This has a substantial performance impact on kernels that use shared memory for intermediary results. The feature is disabled by default. Differential Revision: https://reviews.llvm.org/D46147 llvm-svn: 331941
*	Remove \brief commands from doxygen comments.	Adrian Prantl	2018-05-01	1	-2/+2
	We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers in our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272
*	Temporarily revert "[DEBUG] Initial adaptation of NVPTX target for debug info emission."	Eric Christopher	2018-05-01	11	-207/+273
	This appears to have some issues associated with the file directive output causing multiple global symbols with the name "file" to be emitted into a startup section. I'm investigating more specific causes and working with the original author. This reverts commit r330271. Also Revert "[DEBUGINFO, NVPTX] Add the test for the debug info of the local" This reverts commit r330592 and the follow up of 330779 as the testcase is dependent upon r330271. llvm-svn: 331237
*	[NVPTX] Turn on Loop/SLP vectorization	Benjamin Kramer	2018-04-27	1	-0/+12
	Since PTX has grown a <2 x half> datatype, vectorization has become more important. The late LoadStoreVectorizer intentionally only does loads and stores, but now arithmetic has to be vectorized for optimal throughput too. This is still very limited: SLP vectorization happily creates <2 x half> if it's a legal type, but there's still a lot of register moving happening to get that fed into a vectorized store. Overall it's a small performance win by reducing the number of arithmetic instructions. I haven't really checked what the loop vectorizer does to PTX code; the cost model there might need some more tweaks. I didn't see it causing harm though. Differential Revision: https://reviews.llvm.org/D46130 llvm-svn: 331035
*	[NVPTX] Make the legalizer expand shufflevector of <2 x half>	Benjamin Kramer	2018-04-26	1	-0/+1
	There's no direct instruction for this, but it's trivially implemented with two movs. Without this the code generator just dies when encountering a shufflevector. Differential Revision: https://reviews.llvm.org/D46116 llvm-svn: 330948
*	[NVPTX] Deduplicate code. No functionality change.	Benjamin Kramer	2018-04-26	1	-18/+6
	llvm-svn: 330933
*	Consistently sort add_subdirectory calls in lib/Target/*/CMakeLists.txt	Nico Weber	2018-04-23	1	-1/+1
	llvm-svn: 330584
*	[NVPTX, CUDA] Added support for m8n32k16 and m32n8k16 variants of wmma instructions.	Artem Belevich	2018-04-18	4	-17/+85
	The new instructions were added for sm_70+ GPUs in CUDA-9.1. Differential Revision: https://reviews.llvm.org/D45068 llvm-svn: 330296
*	[DEBUG] Initial adaptation of NVPTX target for debug info emission.	Alexey Bataev	2018-04-18	11	-273/+207
	Summary: Patch adds initial emission of the debug info for NVPTX target. Currently, only .file and .loc directives are emitted, everything else is commented out to not break the compilation of Cuda. Reviewers: echristo, jlebar, tra, jholewinski Subscribers: mgorny, aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D41827 llvm-svn: 330271
*	[NVPTX] Removed 'satom' feature which is no longer used.	Artem Belevich	2018-04-11	2	-11/+4
	Differential Revision: https://reviews.llvm.org/D45061 llvm-svn: 329830
*	[NVPTX, CUDA] Improved feature constraints on NVPTX target builtins.	Artem Belevich	2018-04-11	1	-1/+1
	When an NVPTX TARGET_BUILTIN specifies sm_XX or ptxYY as a required feature, consider those features available if we're compiling for GPU >= sm_XX or have enabled PTX version >= ptxYY. Differential Revision: https://reviews.llvm.org/D45061 llvm-svn: 329829
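The availability rule stated in the message above can be written down as a small predicate. The Python sketch below only illustrates that rule; it is not how Clang or LLVM implement the check, and the function name, argument names, and feature-string parsing are all assumptions.

```python
import re


def feature_available(required: str, gpu_sm: int, ptx_version: int) -> bool:
    """Return True if a builtin's required feature is satisfied.

    `required` is a string such as "sm_70" or "ptx60" (hypothetical format);
    `gpu_sm` is the SM version being compiled for and `ptx_version` is the
    enabled PTX version, both as plain integers.
    """
    match = re.fullmatch(r"(sm_|ptx)(\d+)", required)
    if match is None:
        raise ValueError(f"unrecognized feature: {required!r}")
    kind, minimum = match.group(1), int(match.group(2))
    current = gpu_sm if kind == "sm_" else ptx_version
    # A newer GPU or PTX version satisfies an older requirement.
    return current >= minimum


if __name__ == "__main__":
    print(feature_available("sm_60", gpu_sm=70, ptx_version=60))
    print(feature_available("ptx61", gpu_sm=70, ptx_version=60))
```

The point of the rule is the `>=` comparison: a builtin that requires sm_60 stays usable when compiling for sm_70, instead of requiring an exact feature match.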
*	[NVPTX] add support for initializing fp16 arrays.	Artem Belevich	2018-04-06	1	-1/+7
	Previously HalfTy was not handled, which would either trigger an assertion or result in an array initialized with garbage. Differential Revision: https://reviews.llvm.org/D45391 llvm-svn: 329463
*	[NVPTX] Fixed vectorized LDG for f16.	Artem Belevich	2018-04-06	1	-0/+6
	v2f16 is a special case in NVPTX. v4f16 may be loaded as a pair of v2f16 and that was not previously handled correctly by tryLDGLDU(). Differential Revision: https://reviews.llvm.org/D45339 llvm-svn: 329456
*	Sort targetgen calls in lib/Target/*/CMakeLists.	Nico Weber	2018-04-04	1	-3/+3
	Makes it easier to see mistakes such as the one fixed in r329178 and makes the different target CMakeLists more consistent. Also remove some stale-looking comments from the Nios2 target cmakefile. No intended behavior change. llvm-svn: 329181
*	[CodeGen]Add NoVRegs property on PostRASink and ShrinkWrap	Jun Bum Lim	2018-04-03	1	-0/+2
	Summary: This change declares that PostRAMachineSinking and ShrinkWrap require the NoVRegs property, so now the MachineFunctionPass can enforce this check. These passes are disabled in NVPTX & WebAssembly. Reviewers: dschuff, jlebar, tra, jgravelle-google, MatzeB, sebpop, thegameg, mcrosier Reviewed By: dschuff, thegameg Subscribers: jholewinski, jfb, sbc100, aheejin, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D45183 llvm-svn: 329095
*	[NVPTX] Enable StructuredCFG for NVPTX	Tim Shen	2018-03-30	1	-0/+10
	Summary: Make NVPTX require structured CFG. Added a temporary flag to "roll back" the behavior for easy deployment. Combined with D45008, this fixes several internal Nvidia GPU test failures that we suspect to be ptxas miscompiles (PR27738). Reviewers: jlebar Subscribers: jholewinski, sanjoy, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D45070 llvm-svn: 328885