bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[AMDGPU] Implemented dwordx3 variants of buffer/tbuffer load/store intrinsics	Tim Renouf	2019-03-22	6	-24/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now we have vec3 MVTs, this commit implements dwordx3 variants of the buffer intrinsics. On gfx6, a dwordx3 buffer load intrinsic is implemented as a dwordx4 instruction, and a dwordx3 buffer store intrinsic is not supported. We need to support the dwordx3 load intrinsic because it is generated by subtarget-unaware code in InstCombine. Differential Revision: https://reviews.llvm.org/D58904 Change-Id: I016729d8557b98a52f529638ae97c340a5922a4e llvm-svn: 356755
*	[ObjectYAML] Add basic minidump generation support	Pavel Labath	2019-03-22	3	-0/+389
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds the ability to read a yaml form of a minidump file and write it out as binary. Apart from the minidump header and the stream directory, only three basic stream kinds are supported: - Text: This kind is used for streams which contain textual data. This is typically the contents of a /proc file on linux (e.g. /proc/PID/maps). In this case, we just put the raw stream contents into the yaml. - SystemInfo: This stream contains various bits of information about the host system in binary form. We expose the data in a structured form. - Raw: This kind is used as a fallback when we don't have any special knowledge about the stream. In this case, we just print the stream contents in hex. For this code to be really useful, more stream kinds will need to be added (particularly for things like lists of memory regions and loaded modules). However, these can be added incrementally. Reviewers: jhenderson, zturner, clayborg, aprantl Subscribers: mgorny, lemo, llvm-commits, lldb-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59482 llvm-svn: 356753
*	[RISCV] Add basic RV32E definitions and MC layer support	Alex Bradbury	2019-03-22	9	-15/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The RISC-V ISA defines RV32E as an alternative "base" instruction set encoding, that differs from RV32I by having only 16 rather than 32 registers. This patch adds basic definitions for RV32E as well as MC layer support (assembling, disassembling) and tests. The only supported ABI on RV32E is ILP32E. Add a new RISCVFeatures::validate() helper to RISCVUtils which can be called from codegen or MC layer libraries to validate the combination of TargetTriple and FeatureBitSet. Other targets have similar checks (e.g. erroring if SPE is enabled on PPC64 or oddspreg + o32 ABI on Mips), but they either duplicate the checks (Mips), or fail to check for both codegen and MC codepaths (PPC). Codegen for the ILP32E ABI support and RV32E codegen are left for a future patch/patches. Differential Revision: https://reviews.llvm.org/D59470 llvm-svn: 356744
*	[RISCV] Optimize emission of SELECT sequences	Alex Bradbury	2019-03-22	1	-17/+90
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch optimizes the emission of a sequence of SELECTs with the same condition, avoiding the insertion of unnecessary control flow. Such a sequence often occurs when a SELECT of values wider than XLEN is legalized into two SELECTs with legal types. We have identified several use cases where the SELECTs could be interleaved with other instructions. Therefore, we extend the sequence to include non-SELECT instructions if we are able to detect that the non-SELECT instructions do not impact the optimization. This patch supersedes https://reviews.llvm.org/D59096, which attempted to address this issue by introducing a new SelectionDAG node. Hat tip to Eli Friedman for his feedback on how to best handle this issue. Differential Revision: https://reviews.llvm.org/D59355 Patch by Luís Marques. llvm-svn: 356741
*	[RISCV] Allow conversion of CC logic to bitwise logic	Alex Bradbury	2019-03-22	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	Indicates in the TargetLowering interface that conversions from CC logic to bitwise logic are allowed. Adds tests that show the benefit when optimization opportunities are detected. Also adds tests that show that when the optimization is not applied correct code is generated (but opportunities for other optimizations remain). Differential Revision: https://reviews.llvm.org/D59596 Patch by Luís Marques. llvm-svn: 356740
*	[llvm-objcopy] - Fix a st_name of the first symbol table entry.	George Rimar	2019-03-22	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The spec says of the first symbol table entry that index 0 both designates the first entry in the table and serves as the undefined symbol index. It should have zero value. Hence the first symbol table entry has no name, and so it must have st_name == 0. (http://refspecs.linuxbase.org/elf/gabi4+/ch4.symtab.html) Currently, we do not emit a zero value for the first symbol table entry. That happens because we add empty strings to the string builder, which for each such case adds a zero byte: (https://github.com/llvm-mirror/llvm/blob/master/lib/MC/StringTableBuilder.cpp#L185) After the string optimization is performed, it might return nonzero indexes for the empty string requested. The patch fixes this issue for the case above and for other sections with no names. Differential revision: https://reviews.llvm.org/D59496 llvm-svn: 356739
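The st_name issue described in the commit above can be illustrated with a small Python sketch. The helpers `build_strtab` and `st_name` are hypothetical names invented for this example, and the naive tail-merging is only a stand-in for a real string table optimizer; none of this is LLVM's StringTableBuilder API.

```python
def build_strtab(names):
    """Build a naive tail-merged, ELF-style string table.

    Returns (table_bytes, offsets), where offsets maps each
    non-empty name to its byte offset in the table.
    """
    table = bytearray(b"\0")  # offset 0 always holds the empty string
    offsets = {}
    # Longest names first, so shorter suffixes can reuse existing bytes.
    for name in sorted(set(names), key=len, reverse=True):
        if name == "":
            continue  # the empty name is never added to the table
        needle = name.encode() + b"\0"
        pos = table.find(needle)
        if pos == -1:
            pos = len(table)
            table += needle
        offsets[name] = pos
    return bytes(table), offsets


def st_name(name, offsets):
    # Entries without a name get st_name == 0 without consulting the table.
    return 0 if name == "" else offsets[name]
```

In this sketch the empty name bypasses the table lookup entirely, so a nameless entry can never end up with a nonzero offset, regardless of how other strings were merged.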
*	[AMDGPU] Added v5i32 and v5f32 register classes	Tim Renouf	2019-03-22	10	-4/+144
\| \| \| \| \| \| \| \| \| \|	They are not used by anything yet, but a subsequent commit will start using them for image ops that return 5 dwords. Differential Revision: https://reviews.llvm.org/D58903 Change-Id: I63e1904081e39a6d66e4eb96d51df25ad399d271 llvm-svn: 356735
*	[DWARF] Refactor RelocVisitor and fix computation of SHT_RELA-typed ↵	Fangrui Song	2019-03-22	4	-12/+517
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	relocation entries Summary: getRelocatedValue may compute incorrect value for SHT_RELA-typed relocation entries. // DWARFDataExtractor.cpp uint64_t DWARFDataExtractor::getRelocatedValue(uint32_t Size, uint32_t Off, ... // This formula is correct for REL, but may be incorrect for RELA if the value // stored in the location (getUnsigned(Off, Size)) is not zero. return getUnsigned(Off, Size) + Rel->Value; In this patch, we refactor these visit* functions to include a new parameter `uint64_t A`. Since these visit* functions are no longer used as visitors, rename them to resolve. + REL: A is used as the addend. A is the value stored in the location where the relocation applies: getUnsigned(Off, Size) + RELA: The addend encoded in RelocationRef is used, e.g. getELFAddend(R) and add another set of supports* functions to check if a given relocation type is handled. DWARFObjInMemory uses them to fail early. Reviewers: echristo, dblaikie Reviewed By: echristo Subscribers: mgorny, aprantl, aheejin, fedor.sergeev, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57939 llvm-svn: 356729
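The difference between REL and RELA addends described in the commit above can be sketched in Python. The function names and the plain S + A formula (modelled on an absolute 64-bit relocation) are assumptions for illustration only; they are not the resolve* functions added by the patch.

```python
MASK64 = (1 << 64) - 1


def resolve_rel(symbol_value, stored_value):
    # REL: the addend A is implicit. It is the value already stored
    # at the location the relocation applies to.
    return (symbol_value + stored_value) & MASK64


def resolve_rela(symbol_value, explicit_addend):
    # RELA: the addend A is carried by the relocation entry itself,
    # so whatever is stored at the location is ignored.
    return (symbol_value + explicit_addend) & MASK64


def old_formula(stored_value, rel_value):
    # The formula quoted in the commit message:
    # getUnsigned(Off, Size) + Rel->Value
    return (stored_value + rel_value) & MASK64
```

With a nonzero value stored at a RELA location, `old_formula` adds that stored value on top of the already complete result, which is the kind of discrepancy the commit message describes.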
*	[BPF] handle derived type properly for computing type id	Yonghong Song	2019-03-22	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, the type id for a derived type is computed incorrectly. For example, type #1: int type #2: ptr to #1 For a global variable "int *a", type #1 will be attributed to variable "a". This is due to a bug which assigns the type id of the basetype of that derived type as the derived type's type id. This happens to "const", "volatile", "restrict", "typedef" and "pointer" types. This patch fixed this bug, fixed existing test cases and added a new one focusing on pointers plus other derived types. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 356727
*	[AArch64] Split the neon.addp intrinsic into integer and fp variants.	Amara Emerson	2019-03-21	4	-28/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the result of discussions on the list about how to deal with intrinsics which require codegen to disambiguate them via only the integer/fp overloads. It causes problems for GlobalISel as some of that information is lost during translation, while with other operations like IR instructions the information is encoded into the instruction opcode. This patch changes clang to emit the new faddp intrinsic if the vector operands to the builtin have FP element types. LLVM IR AutoUpgrade has been taught to upgrade existing calls to aarch64.neon.addp with fp vector arguments, and we remove the workarounds introduced for GlobalISel in r355865. This is a more permanent solution to PR40968. Differential Revision: https://reviews.llvm.org/D59655 llvm-svn: 356722
*	[X86] Use LoadInst->getType() instead of ↵	Craig Topper	2019-03-21	1	-3/+2
\| \| \| \| \| \| \| \|	LoadInst->getPointerOperandType()->getElementType(). NFCI For the future day when pointers don't have element types, we should just use the type of the load result instead. llvm-svn: 356721
*	[Object] Fix reading objects created with -fembed-bitcode-marker	Steven Wu	2019-03-21	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, this fails with many tools, e.g. $ clang -fembed-bitcode-marker -c -o test.o test.c $ nm test.o nm: test.o The file was not recognized as a valid object file -fembed-bitcode-marker creates a __LLVM,__bitcode section consisting of a single byte. When reading the object file, IRObjectFile::findBitcodeInObject succeeds, causing SymbolicFile::createSymbolicFile to try to read the "bitcode" rather than using the outer Mach-O data, which then fails. Fix this by making findBitcodeInObject return an error if the section size <= 1. Patched by: Nicholas Allegra Differential Revision: https://reviews.llvm.org/D44373 llvm-svn: 356718
*	Mips: Fix typo in assert message	Matt Arsenault	2019-03-21	1	-1/+1
\| \| \| \|	llvm-svn: 356717
*	Mips: Don't create copy of nothing	Matt Arsenault	2019-03-21	1	-2/+0
\| \| \| \| \| \| \| \| \|	This was creating a copy of the register the pseudo itself was def'ing, leaving a copy of an undefined register. I'm not sure how the verifier is not catching this, but this avoids asserting in a future change to RegAllocFast llvm-svn: 356716
*	GlobalISel: Fix RegBankSelect for REG_SEQUENCE	Matt Arsenault	2019-03-21	1	-4/+16
\| \| \| \| \| \| \| \| \| \| \| \| \|	The AArch64 test was broken since the result register already had a set register class, so this test was a no-op. The mapping verify call would fail because the result size is not the same as the inputs like in a copy or phi. The AMDGPU testcases are half broken and introduce illegal VGPR->SGPR copies which need much more work to handle correctly (same for phis), but add them as a baseline. llvm-svn: 356713
*	Don't add a tail keyword to calls to ObjC runtime functions if the calls	Akira Hatanaka	2019-03-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	are annotated with notail. r356705 annotated calls to objc_retainAutoreleasedReturnValue with notail on x86-64. This commit teaches ARC optimizer to check the notail marker on the call before turning it into a tail call. rdar://problem/38675807 llvm-svn: 356707
*	[AArch64] Update for Exynos	Evandro Menezes	2019-03-21	1	-1/+1
\| \| \| \| \| \|	Fix the feature set for Exynos M4 by removing support for `+fp16fml` and fix test case. llvm-svn: 356698
*	[X86] canonicalizeBitSelect - don't attempt to canonicalize mask registers	Simon Pilgrim	2019-03-21	1	-1/+1
\| \| \| \| \| \| \| \|	We don't use X86ISD::ANDNP for mask registers. Test case from @craig.topper (Craig Topper) llvm-svn: 356696
*	[llvm-pdbutil] Add -type-ref-stats to help find unused type info	Reid Kleckner	2019-03-21	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This considers module symbol streams and the global symbol stream to be roots. Most types that this considers "unreferenced" are referenced by LF_UDT_MOD_SRC_LINE id records, which VC seems to always include. Essentially, they are types that the user can only find in the debugger if they call them by name, they cannot be found by traversing a symbol. In practice, around 80% of type information in a PDB is referenced by a symbol. That seems like a reasonable number. I don't really plan to do anything with this tool. It mostly just exists for informational purposes, and to confirm that we probably don't need to implement type reference tracking in LLD. We can continue to merge all types as we do today without wasting space. Reviewers: zturner, aganea Subscribers: mgorny, hiraditya, arphaman, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59620 llvm-svn: 356692
*	[InstCombine] Don't transform ((C1 OP zext(X)) & C2) -> zext((C1 OP X) & C2) ↵	Craig Topper	2019-03-21	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	if either zext or OP has another use. If they have other users we'll just end up increasing the instruction count. We might be able to weaken this to only one of them having a single use if we can prove that the and will be removed. Fixes PR41164. Differential Revision: https://reviews.llvm.org/D59630 llvm-svn: 356690
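As background for the transform named in the commit above, the following Python sketch checks by brute force that the wide and narrow forms compute the same value. It assumes an 8-bit X zero-extended to 32 bits and a mask C2 that fits in the narrow type; those widths, the chosen operators, and all helper names are assumptions made for this example, not code from InstCombine.

```python
NARROW, WIDE = 8, 32
NARROW_MASK = (1 << NARROW) - 1
WIDE_MASK = (1 << WIDE) - 1

OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "xor": lambda a, b: a ^ b,
}


def wide_form(op, c1, x, c2):
    # (C1 OP zext(X)) & C2, evaluated at the wide width.
    return (OPS[op](c1, x) & WIDE_MASK) & c2


def narrow_form(op, c1, x, c2):
    # zext((trunc(C1) OP X) & trunc(C2)), evaluated at the narrow width.
    narrow_result = OPS[op](c1 & NARROW_MASK, x) & NARROW_MASK
    return narrow_result & (c2 & NARROW_MASK)


def forms_agree(op, c1, c2):
    # Exhaustively compare both forms over every narrow value of X.
    return all(
        wide_form(op, c1, x, c2) == narrow_form(op, c1, x, c2)
        for x in range(1 << NARROW)
    )
```

The sketch only shows value equivalence. The commit itself is about profitability: if the zext or OP has other users, the original instructions stay alive and the rewrite just adds more.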
*	[X86] Don't avoid folding multiple use sign extended 8-bit immediate into ↵	Craig Topper	2019-03-21	3	-18/+5
\| \| \| \| \| \| \| \| \| \| \| \|	instructions under optsize. Under optsize we try to avoid folding immediates into instructions. But if the immediate is 16 bits or 32 bits, but can be encoded as an 8-bit immediate, we don't save enough from disabling the folding unless the immediate has enough uses to make up for the size of the move, which is either 3 bytes or 5 bytes since there are no sign extended 8-bit moves. We would also save something if the immediate was a live out of the basic block and thus a move was unavoidable, but that would require a more advanced heuristic than just counting uses. Note we only avoid folding multiple use immediates into the patterns that use X86ISD::ADD/SUB/XOR/OR/AND/CMP/ADC/SBB nodes and not the more common ISD::ADD/SUB/XOR/OR/AND nodes. Differential Revision: https://reviews.llvm.org/D59522 llvm-svn: 356688
*	[ScalarizeMaskedMemIntrin] Add support for scalarizing expandload and ↵	Craig Topper	2019-03-21	4	-0/+196
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	compressstore intrinsics. This adds support for scalarizing these intrinsics as well as the X86TargetTransformInfo support to avoid scalarizing them in the cases X86 can handle. I've omitted handling special cases for constant masks for this first pass, though CodeGenPrepare can constant fold the branch conditions and remove some of the control flow anyway. Fixes PR40994 and covers most of PR3666. Might want to implement constant masks to close that. Differential Revision: https://reviews.llvm.org/D59180 llvm-svn: 356687
*	[ValueTracking] Use ConstantRange based overflow check for signed sub	Nikita Popov	2019-03-21	1	-10/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is D59450, but for signed sub. This case is not NFC, because the overflow logic in ConstantRange is more powerful than the existing check. This resolves the TODO in the function. I've added two tests to show that this indeed catches more cases than the previous logic, but the main correctness test coverage here is in the existing ConstantRange unit tests. Differential Revision: https://reviews.llvm.org/D59617 llvm-svn: 356685
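The idea of a range-based overflow check for signed subtraction, as mentioned in the commit above, can be sketched in Python. This is a simplified model that assumes plain non-wrapping intervals given as (min, max) pairs; the function name and the string results are hypothetical and do not reproduce the ConstantRange API.

```python
def signed_sub_overflow(lhs, rhs, bits=32):
    """Classify overflow of lhs - rhs for signed `bits`-wide integers.

    lhs and rhs are inclusive (min, max) intervals of possible values.
    Returns "never", "always", or "may".
    """
    int_min = -(1 << (bits - 1))
    int_max = (1 << (bits - 1)) - 1
    (a_min, a_max), (b_min, b_max) = lhs, rhs

    # Extremes of the exact (unbounded) difference.
    diff_min = a_min - b_max
    diff_max = a_max - b_min

    if int_min <= diff_min and diff_max <= int_max:
        return "never"   # every possible difference is representable
    if diff_max < int_min or diff_min > int_max:
        return "always"  # no possible difference is representable
    return "may"
```

For example, with 8-bit values, subtracting a value in [0, 10] from another value in [0, 10] can never overflow, while subtracting a value in [-128, -100] from a value in [100, 127] always does.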
*	[DAGCombiner] Use getTokenFactor in a few more cases.	Florian Hahn	2019-03-21	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SDNodes can only have 64k operands and for some inputs (e.g. large number of stores), we can reach this limit when creating TokenFactor nodes. This patch is a follow up to D56740 and updates a few more places that potentially can create TokenFactors with too many operands. Reviewers: efriedma, craig.topper, aemerson, RKSimon Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D59156 llvm-svn: 356668
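One way to stay under an operand limit when merging many chains, in the spirit of the commit above, is to fold the excess operands into nested nodes. The Python sketch below models nodes as tuples; the function names, the tuple representation, and the default limit value are assumptions for illustration, not the SelectionDAG implementation.

```python
def get_token_factor(chains, limit=65535):
    """Combine chains into a TokenFactor tree with at most `limit`
    operands per node. Requires limit >= 2."""
    vals = list(chains)
    while len(vals) > limit:
        # Fold the last `limit` operands into one nested node.
        start = len(vals) - limit
        nested = ("TokenFactor", tuple(vals[start:]))
        del vals[start:]
        vals.append(nested)
    return ("TokenFactor", tuple(vals))


def max_operands(node):
    """Largest operand count of any TokenFactor node in the tree."""
    if not (isinstance(node, tuple) and node and node[0] == "TokenFactor"):
        return 0
    operands = node[1]
    return max([len(operands)] + [max_operands(op) for op in operands])
```

Each loop iteration replaces `limit` operands with a single nested node, so the list shrinks by `limit - 1` entries per step until it fits.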
*	[DAGCombine] SimplifySelectCC - call FoldSetCC with the setcc result type	Simon Pilgrim	2019-03-21	1	-2/+3
\| \| \| \| \| \| \| \|	We were calling FoldSetCC with the compare operand type instead of the result type. Found by OSS-Fuzz #13838 (https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13838) llvm-svn: 356667
*	[CodeGenPrepare] limit formation of overflow intrinsics (PR41129)	Sanjay Patel	2019-03-21	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is probably a bigger limitation than necessary, but since we don't have any evidence yet that this transform led to real-world perf improvements rather than regressions, I'm making a quick, blunt fix. In the motivating x86 example from: https://bugs.llvm.org/show_bug.cgi?id=41129 ...and shown in the regression test, we want to avoid an extra instruction in the dominating block because that could be costly. The x86 LSR test diff is reversing the changes from D57789. There's no evidence that 1 version is any better than the other yet. Differential Revision: https://reviews.llvm.org/D59602 llvm-svn: 356665
*	[Thumb] Fix infinite loop in ABS expansion (PR41160)	Simon Pilgrim	2019-03-21	1	-1/+4
\| \| \| \| \| \|	Don't expand ISD::ABS node if it's legal. llvm-svn: 356661
*	[AMDGPU] Support for v3i32/v3f32	Tim Renouf	2019-03-21	12	-52/+279
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added support for dwordx3 for most load/store types, but not DS, and not intrinsics yet. SI (gfx6) does not have dwordx3 instructions, so they are not enabled there. Some of this patch is from Matt Arsenault, also of AMD. Differential Revision: https://reviews.llvm.org/D58902 Change-Id: I913ef54f1433a7149da8d72f4af54dbb13436bd9 llvm-svn: 356659
*	Fix -Wmisleading-indentation gcc7 warning. NFCI.	Simon Pilgrim	2019-03-21	1	-6/+6
\| \| \| \|	llvm-svn: 356658
*	[AArch64] Allow -mattr=tpidr-el[1|2|3]	Oliver Stannard	2019-03-21	3	-0/+21
\| \| \| \| \| \| \| \| \| \| \|	Added subtarget features for AArch64 to use TPIDR_EL[1|2|3] as the TLS base register, rather than the default TPIDR_EL0. Patch by Philip Derrin! Differential revision: https://reviews.llvm.org/D54685 llvm-svn: 356657
*	[SelectionDAG] Add scalarization of ABS node (PR41149)	Simon Pilgrim	2019-03-21	1	-0/+1
\| \| \| \| \| \| \| \|	Patch by: @ikulagin (Ivan Kulagin) Differential Revision: https://reviews.llvm.org/D59577 llvm-svn: 356656
*	[Object] Add basic minidump support	Pavel Labath	2019-03-21	8	-1/+102
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds basic support for reading minidump files. It contains the definitions of various important minidump data structures (header, stream directory), and of one minidump stream (SystemInfo). The ability to read other streams will be added in follow-up patches. However, all streams can be read even now as raw data, which means lldb's minidump support (where this code is taken from) can be immediately rebased on top of this patch as soon as it lands. As we don't have any support for generating minidump files (yet), this tests the code via unit tests with some small handcrafted binaries in the form of c char arrays. Reviewers: Bigcheese, jhenderson, zturner Subscribers: srhines, dschuff, mgorny, fedor.sergeev, lemo, clayborg, JDevlieghere, aprantl, lldb-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59291 llvm-svn: 356652
*	[BasicAA] Use DenseMap::try_emplace after D59151. NFC	Fangrui Song	2019-03-21	1	-5/+5
\| \| \| \|	llvm-svn: 356651
*	Silence warning about unused variable in builds without asserts [NFC]	Mikael Holmen	2019-03-21	1	-0/+1
\| \| \| \|	llvm-svn: 356648
*	[ScalarizeMaskedMemIntrinsics] Reverse some if conditions to reduce ↵	Craig Topper	2019-03-21	1	-20/+16
\| \| \| \| \| \| \| \|	indentations to remove curly braces. Pre-commit for D59180 llvm-svn: 356646
*	[BasicAA] Reduce number of map searches [NFCI].	Alina Sbirlea	2019-03-21	1	-14/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is a refactoring patch. - Reduce the number of map searches by reusing the iterator. - Add asserts to check that the entry is in the cache, as this is something BasicAA relies on to avoid infinite recursion. Reviewers: chandlerc, aschwaighofer Subscribers: sanjoy, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59151 llvm-svn: 356644
*	[instcombine] Add some todos, and arrange code for readability	Philip Reames	2019-03-21	2	-32/+38
\| \| \| \|	llvm-svn: 356642
*	[MSSA] Delete move ctor; remove dynamic never-moved verification	George Burgess IV	2019-03-21	1	-14/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Code archaeology in D59315 revealed that MSSA should never be moved. Rather than trying to check dynamically that this hasn't happened in the verify() functions of Walkers, it's likely best to just delete its move constructor. Since all these verify() functions did is check that MSSA hasn't moved, this allows us to remove these verify functions. I can readd the verification checks if someone's super concerned about us trying to `memcpy` MemorySSA or something somewhere, but I imagine we have other problems if we're trying anything like that... llvm-svn: 356641
*	[X86] Add CMPXCHG8B feature flag. Set it for all CPUs except i386/i486 ↵	Craig Topper	2019-03-20	6	-48/+93
\| \| \| \| \| \| \| \| \| \| \| \|	including 'generic'. Disable use of CMPXCHG8B when this flag isn't set. CMPXCHG8B was introduced on the i586/pentium generation. If it's not enabled, limit the atomic width to 32 bits so the AtomicExpandPass will expand to lib calls. It is unclear if we should be using a different limit for other configs. The default is 1024 and experimentation shows that using an i256 atomic will cause a crash in SelectionDAG. Differential Revision: https://reviews.llvm.org/D59576 llvm-svn: 356631
*	Fix Mach-O bind and rebase validation errors in libObject	Michael Trent	2019-03-20	1	-116/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: llvm-objdump (via libObject) validates DYLD_INFO rebase and bind entries against the basic structure found in the Mach-O file before evaluating the contents of those entries. Certain malformed Mach-Os can defeat the validation check and force llvm-objdump (libObject) to crash. The previous logic verified that a rebase or bind started in a valid Mach-O section, but did not verify that the section wholly contained the fixup. It also generally allowed rebases or binds to start immediately after a valid section even if that range is not itself part of a valid section. Finally, bind and rebase opcodes that indicate more than one fixup (apply N times...) were not completely validated: only the first and final fixups were checked. The previous logic also rejected certain binaries as false positives. Some bind and rebase opcodes can modify the state machine such that the next bind or rebase will fail. libObject will reject these opcodes as invalid in order to be helpful and print an error message associated with the instruction that caused the problem, even though the binary is not actually illegal until it consumes the invalid state in the state machine. In other words, libObject may reject a Mach-O binary that Apple's dynamic linker may consider legal. The original version of macho-rebase-add-addr-uleb-too-big is an example of such a binary. I have replaced the existing checkSegAndOffset and checkCountAndSkip functions with a single function, checkSegAndOffsets, which validates all of the fixups realized by a DYLD_INFO opcode. checkSegAndOffsets verifies that a Mach-O section fully contains each fixup. Every fixup realized by an opcode is validated, and some (but not all!) inconsistencies in the state machine are allowed until a fixup is realized. 
This means that libObject may fail on an opcode that realizes a fixup, not on the opcode that introduced the arithmetic error. Existing test cases have been modified to reflect the changes in error messages returned by libObject. What's more, the test case for macho-rebase-add-addr-uleb-too-big has been modified so that it actually triggers the error condition; the new code in libObject considers the original test binary "legal". rdar://47797757 Reviewers: lhames, pete, ab Reviewed By: pete Subscribers: rupprecht, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59574 llvm-svn: 356629
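The containment rule described in the commit above (every fixup realized by an opcode must lie fully inside some section) can be sketched in Python. The function name `check_fixups`, its parameters, and the error string are hypothetical; sections are modelled as half-open (start, end) offset ranges, and the stride of pointer size plus skip is an assumption about how repeated fixups advance.

```python
def check_fixups(sections, start, pointer_size, count=1, skip=0):
    """Validate every fixup realized by one opcode.

    sections: list of half-open (begin, end) offset ranges.
    Returns None if all fixups are fully contained in a section,
    otherwise a message describing the first offending fixup.
    """
    stride = pointer_size + skip
    for i in range(count):
        begin = start + i * stride
        end = begin + pointer_size
        # The whole fixup, not just its first byte, must be in one section.
        if not any(s <= begin and end <= e for s, e in sections):
            return f"fixup {i} at offset {begin:#x} is not contained in a section"
    return None
```

Unlike a check of only the first and last fixup, this loop also catches a fixup in the middle of a run that lands in a gap between two sections.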
*	[WebAssembly][NFC] Fix formatting error from rL356610	Thomas Lively	2019-03-20	1	-2/+3
\| \| \| \|	llvm-svn: 356622
*	[AMDGPU] Do not generate spurious PAL metadata	Tim Renouf	2019-03-20	2	-6/+10
\| \| \| \| \| \| \| \| \| \| \| \|	My previous fix rL356591 "[AMDGPU] Added MsgPack format PAL metadata" accidentally caused a spurious PAL metadata .note record to be emitted for any AMDGPU output. That caused failures in the lld test amdgpu-relocs.s. Fixed. Differential Revision: https://reviews.llvm.org/D59613 Change-Id: Ie04a2aaae890dcd490f22c89edf9913a77ce070e llvm-svn: 356621
*	Allow machine dce to remove uses in the same instruction	Stanislav Mekhanoshin	2019-03-20	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Machine DCE cannot remove a dead definition if there are non-dbg uses. A use, however, can be in the same instruction: dead %0 = INST %0 Such instructions are sometimes created by the Detect dead lanes pass. Allow this instruction to be deleted despite the use if the only use belongs to the same instruction. Differential Revision: https://reviews.llvm.org/D59565 llvm-svn: 356619
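The relaxed deadness rule from the commit above can be modelled in a few lines of Python. The function name and the representation of uses as (instruction id, is_debug) pairs are assumptions made for this sketch, not the machine DCE data structures.

```python
def def_is_removable(def_instr, uses):
    """Decide whether a register definition can be deleted.

    def_instr: identifier of the defining instruction.
    uses: iterable of (user_instr, is_debug) pairs for the register.
    A use is ignored if it is a debug use or if it belongs to the
    defining instruction itself.
    """
    return all(is_debug or user == def_instr for user, is_debug in uses)
```

For the pattern `dead %0 = INST %0`, the only non-debug use comes from the defining instruction, so the definition counts as removable under this rule.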
*	[X86] Call lowerShuffleAsBitMask for 512-bit vectors in lowerShuffleAsBlend.	Craig Topper	2019-03-20	1	-11/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch enables the use of lowerShuffleAsBitMask for 512-bit blends before falling back to move immediate, GPR to k-register, and masked op. I had to make some changes to support v8i64 when i64 is not a legal type, and to support floating point types. This trades a load for the move immediate and GPR move, which is higher latency. But it's probably better for register pressure not having to hop through other register classes. The load+and should play better with LICM and rematerialization, I think. Differential Revision: https://reviews.llvm.org/D59479 llvm-svn: 356618
*	[AMDGPU] Fix dependency on `BinaryFormat`	Michael Liao	2019-03-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: - The linking is broken when this library is built as shared one. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59610 llvm-svn: 356617
*	AMDGPU: Don't look for constant in insert/extract_vector_elt regbankselect	Matt Arsenault	2019-03-20	1	-44/+19
\| \| \| \| \| \| \| \| \|	The constantness shouldn't change the register bank choice. We also don't need to restrict this to only indexing VGPRs, since it's possible to index SGPRs (but SelectionDAG made using this difficult). Allow directly indexing SGPRs when appropriate. llvm-svn: 356611
*	[WebAssembly] Target features section	Thomas Lively	2019-03-20	6	-5/+134
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Implements a new target features section in assembly and object files that records what features are used, required, and disallowed in WebAssembly objects. The linker uses this information to ensure that all objects participating in a link are feature-compatible and records the set of used features in the output binary for use by optimizers and other tools later in the toolchain. The "atomics" feature is always required or disallowed to prevent linking code with stripped atomics into multithreaded binaries. Other features are marked used if they are enabled globally or on any function in a module. Future CLs will add linker flags for ignoring feature compatibility checks and for specifying the set of allowed features, implement using the presence of the "atomics" feature to control the type of memory and segments in the linked binary, and add front-end flags for relaxing the linkage policy for atomics. Reviewers: aheejin, sbc100, dschuff Subscribers: jgravelle-google, hiraditya, sunfish, mgrang, jfb, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59173 llvm-svn: 356610
*	[AMDGPU] Fix clamp bit DAG operand	Michael Liao	2019-03-20	1	-5/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Should use `targetconstant` instead of `constant` operand for clamp bit, which is expected as an immediate operand. Under certain conditions, such as when a common `i1 false` constant is used in another place and selected before the instruction with the clamp bit, a register operand may be added instead of an immediate one. Use `targetconstant` to enforce that. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59608 llvm-svn: 356608
*	[ARC] Add ARCOptAddrMode pass to generate postincrement loads/stores.	Pete Couperus	2019-03-20	5	-2/+514
\| \| \| \| \| \| \| \| \| \|	Build on newly introduced ARC postincrement loads/stores from r356200. Patch By Denis Antrushin! <denis@synopsys.com> Differential Revision: https://reviews.llvm.org/D59409 llvm-svn: 356606
*	AMDHSA: Fix COMPUTE_PGM_RSRC2.USER_SGPR calculation when parsing ISA assembly	Konstantin Zhuravlyov	2019-03-20	1	-7/+7
\| \| \| \| \| \| \| \|	It must match https://llvm.org/docs/AMDGPUUsage.html#initial-kernel-execution-state Differential Revision: https://reviews.llvm.org/D59570 llvm-svn: 356603