bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	Revert "peel loops with runtime small trip counts"	Krzysztof Parzyszek	2018-03-30	1	-37/+0
\| \| \| \| \| \|	This reverts commit r328854, it breaks some Hexagon tests. llvm-svn: 328875
*	[AMDGPU] Fixed some instructions latencies	Stanislav Mekhanoshin	2018-03-30	4	-11/+11
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D45073 llvm-svn: 328874
*	[SelectionDAG] Removing FABS folding from DAGCombiner	Sanjay Patel	2018-03-30	2	-54/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The code has bugs dealing with -0.0. Since D44550 introduced FABS pattern folding in InstCombine, this patch removes the now-redundant code that causes https://bugs.llvm.org/show_bug.cgi?id=36600. Patch by Mikhail Dvoretckii! Differential Revision: https://reviews.llvm.org/D44683 llvm-svn: 328872
*	[Hexagon] Recognize and handle :endloop01	Krzysztof Parzyszek	2018-03-30	1	-3/+10
\| \| \| \|	llvm-svn: 328870
*	[Hexagon] Fix printing :mem_noshuf on compiler-generated packets	Krzysztof Parzyszek	2018-03-30	1	-0/+46
\| \| \| \|	llvm-svn: 328869
*	[X86][BtVer2] Add missing ReadAfterLd to RM variants of AVX horizontal adds and	Andrea Di Biagio	2018-03-30	3	-12/+10
\| \| \| \| \| \| \| \| \|	most vector logic instructions. Fixed a few InstRW that forgot to specify a ReadAfterLd for the register input operand. llvm-svn: 328867
*	[X86][BtVer2] Add tests that show how ReadAfterLd is missing for some	Andrea Di Biagio	2018-03-30	4	-0/+92
\| \| \| \| \| \| \| \| \| \| \| \| \|	instructions. In the Btver2 model, there are a few InstRW overrides that don't specify a ReadAfterLd for the register input operand. As a result, a few AVX variants of horizontal operations and most vector logic operations with a folded memory operand don't have a ReadAdvance info associated to their input register operands. llvm-svn: 328865
*	[X86] Add llvm-mca tests for r328834.	Andrea Di Biagio	2018-03-30	4	-0/+120
\| \| \| \| \| \| \| \| \|	Verify that the ReadAfterLd is correctly applied to FMA and 4-ops variable blend instructions. As Craig pointed out in D44726, some Intel models still have to be fixed. llvm-svn: 328861
*	[X86] Add tests to verify the presence of "ReadAfterLd" after r328823.	Andrea Di Biagio	2018-03-30	2	-0/+53
\| \| \| \| \| \| \|	This change adds a couple of tests to verify the change introduced by revision 328823 ([X86] Correct the placement of ReadAfterLd in BEXTR and BZHI). llvm-svn: 328859
*	Revert "[LLVM-C] Finish exception instruction bindings"	Vlad Tsyrklevich	2018-03-30	1	-35/+0
\| \| \| \| \| \|	This reverts commit r328759. It was causing LSan failures on sanitizer-x86_64-linux-bootstrap llvm-svn: 328858
*	[AMDGPU] Fix the SDWA Peephole phase to handle src for dst:UNUSED_PRESERVE.	Michael Bedy	2018-03-30	1	-0/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The phase attempts to transform operations that extract a portion of a value into an SDWA src operand in cases where that value is used only once. It was not prepared for this use to be the preserved portion of a value for dst:UNUSED_PRESERVE, resulting in a crash or assert. This change either rejects the illegal SDWA attempt, or in the case where dst:WORD_1 and the src_sel would be WORD_0, removes the unneeded extract instruction. Reviewers: arsenm, #amdgpu Reviewed By: arsenm, #amdgpu Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D44364 llvm-svn: 328856
*	[Hexagon] add missing lit config file	Ikhlas Ajbar	2018-03-30	1	-0/+3
\| \| \| \|	llvm-svn: 328855
*	peel loops with runtime small trip counts	Ikhlas Ajbar	2018-03-30	1	-0/+37
\| \| \| \| \| \| \| \| \| \|	For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 328854
*	[MachineCopyPropagation] Handle COPY with overlapping source/dest.	Eli Friedman	2018-03-30	1	-0/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MachineCopyPropagation::CopyPropagateBlock has a bunch of special handling for COPY instructions. This handling assumes that COPY instructions do not modify the source of the copy; this is wrong if the COPY destination overlaps the source. To fix the bug, check explicitly for this situation, and fall back to the generic instruction handling. This bug can't happen for most register classes because they don't have this sort of overlap, but there are a few register classes where this is possible. The testcase uses the AArch64 QQQQ register class. Differential Revision: https://reviews.llvm.org/D44911 llvm-svn: 328851
*	AMDGPU: Support realigning stack	Matt Arsenault	2018-03-29	1	-0/+125
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While the stack access instructions don't care about alignment > 4, some transformations on the pointer calculation do make assumptions based on knowing the low bits of a pointer are 0. If a stack object ends up being accessed through its absolute address (relative to the kernel scratch wave offset), the addressing expression may depend on the stack frame being properly aligned. This was breaking in a testcase due to the add->or combine. I think some of the SP/FP handling logic is still backwards, and overly simplistic to support all of the stack features. Code which tries to modify the SP with inline asm for example or variable sized objects will probably require redoing this. llvm-svn: 328831
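The add->or combine mentioned above is only valid when the low bits of the base address are known to be zero. A minimal sketch of that arithmetic, not taken from the AMDGPU backend; the function name and the sample addresses are assumptions chosen for illustration:

```python
def add_matches_or(base: int, offset: int) -> bool:
    """Return True when base + offset and base | offset give the same address."""
    return (base + offset) == (base | offset)


ALIGN = 16
aligned_base = 0x100     # low 4 bits are zero (16-byte aligned)
misaligned_base = 0x104  # only 4-byte aligned

# With a 16-byte aligned base, every offset below 16 can be OR'ed in safely.
for offset in range(ALIGN):
    assert add_matches_or(aligned_base, offset)

# With a base that is only 4-byte aligned, the rewrite gives a wrong address.
print(add_matches_or(misaligned_base, 4))  # False: 0x108 vs 0x104
```

If the frame is not actually aligned as assumed, an address computed with OR silently points at the wrong slot, which is why realigning the stack matters here.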
*	Add msan custom mapping options.	Evgeniy Stepanov	2018-03-29	1	-0/+43
\| \| \| \| \| \| \| \| \| \| \|	Similarly to https://reviews.llvm.org/D18865 this adds options to provide custom mapping for msan. As discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-February/121339.html Patch by vit9696(at)avp.su. Differential Revision: https://reviews.llvm.org/D44926 llvm-svn: 328830
*	[X86] Correct the placement of ReadAfterLd in BEXTR and BZHI. Add dedicated ↵	Craig Topper	2018-03-29	2	-8/+8
\| \| \| \| \| \| \| \| \| \|	SchedRW for BEXTR/BZHI. These instructions have the memory operand before the register operand. So we need to put ReadDefault for all the load ops first, then the ReadAfterLd. Differential Revision: https://reviews.llvm.org/D44838 llvm-svn: 328823
*	AMDGPU: Increase default stack alignment	Matt Arsenault	2018-03-29	8	-18/+18
\| \| \| \| \| \| \|	8 and 16-byte values are common, so increase the default alignment to avoid realigning the stack in most functions. llvm-svn: 328821
*	For llvm-nm and Mach-O files that are fully stripped, special case a ↵	Kevin Enderby	2018-03-29	2	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	redacted LC_MAIN As a further refinement on: r328274 - For llvm-nm and Mach-O files also use function starts info in some cases when printing symbols we want to special case a redacted LC_MAIN so it is easier to find. rdar://38978929 llvm-svn: 328820
*	AMDGPU: Fix selection error on constant loads with < 4 byte alignment	Matt Arsenault	2018-03-29	2	-0/+24
\| \| \| \|	llvm-svn: 328818
*	Try to fix a couple tests for Windows.	Paul Robinson	2018-03-29	2	-6/+6
\| \| \| \|	llvm-svn: 328814
*	[SLPVectorizer] Add tests related to PR30787, NFCI.	Dinar Temirbulatov	2018-03-29	4	-0/+390
\| \| \| \|	llvm-svn: 328813
*	[MSF] Default to FPM2, and always mark FPM pages allocated.	Zachary Turner	2018-03-29	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are two FPMs in an MSF file, the idea being that for incremental updates you can write to the alternate one and then atomically swap them on commit. LLVM defaulted to using FPM1 on the first commit, but this differs from Microsoft's behavior which is to default to using FPM2 on the first commit. To eliminate some byte-level file differences, this patch changes LLVM's default to also be FPM2. Additionally, LLVM was trying to be "smart" about marking FPM pages allocated. In addition to marking every page belonging to the alternate FPM as unallocated, LLVM also marked pages at the end of the main FPM which were not needed as unallocated. In order to match the behavior of Microsoft-generated PDBs, we now always mark every FPM block as allocated, regardless of whether it is in the main FPM or the alt FPM, and regardless of whether or not it describes blocks which are actually in the file. This has the side benefit of simplifying our code. llvm-svn: 328812
*	[PDB] Print some more details when explaining MSF fields.	Zachary Turner	2018-03-29	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \|	When we determine that a field belongs to an MSF super block or the free page map, we wouldn't print any additional information. With this patch, we now print the value of the field (for super block fields) or the allocation status of the specified byte (in the case of offsets in the FPM). llvm-svn: 328808
*	Reapply "[DWARFv5] Emit file 0 to the line table."	Paul Robinson	2018-03-29	9	-58/+93
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DWARF v5 specifies that the root file (also given in the DW_AT_name attribute of the compilation unit DIE) should be emitted explicitly to the line table's list of files. This makes the line table more independent of the .debug_info section. We emit the new syntax only for DWARF v5 and later. Fixes the bug found by asan. Also XFAIL the new test for Darwin, which is stuck on DWARF v2, and fix up other tests so they stop failing on Windows. Last but not least, don't break "clang -g" of an assembler file that has .file directives in it. Differential Revision: https://reviews.llvm.org/D44054 llvm-svn: 328805
*	[PDB] Fix a bug in the explain subcommand.	Zachary Turner	2018-03-29	1	-4/+4
\| \| \| \| \| \| \| \|	We were trying to dig into the super block fields and print a description of the field at the specified offset, but we were printing the wrong field due to an off-by-one-field-error. llvm-svn: 328804
*	[PDB] Add an explain subcommand.	Zachary Turner	2018-03-29	1	-0/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When investigating various things, we often have a file offset and want to know what's in the PDB at that address. For example we may be doing a binary comparison of two LLD-generated PDBs to look for sources of non-determinism, or we may wish to compare an LLD-generated PDB with a Microsoft-generated PDB for sources of byte-for-byte incompatibility. In these cases, we can do a binary diff of the two files, and once we find a mismatched byte we can use explain to figure out what that byte is, immediately honing in on the problem. This patch implements this by trying to narrow the meaning of a particular file offset down as much as possible. Differential Revision: https://reviews.llvm.org/D44959 llvm-svn: 328799
*	[JumpThreading] Don't select an edge that we know we can't thread	Haicheng Wu	2018-03-29	1	-0/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In r312664 (D36404), JumpThreading stopped threading edges into loop headers. Unfortunately, I observed a significant performance regression as a result of this change. Upon further investigation, the problematic pattern looked something like this (after many high-level optimizations): while (true) { bool cond = ...; if (!cond) { <body> } if (cond) break; } Now, naturally we want jump threading to essentially eliminate the second if check and hook up the edges appropriately. However, the above-mentioned change prevented it from doing this because it would have to thread an edge into the loop header. Upon further investigation, what is happening is that since both branches are threadable, JumpThreading picks one of them arbitrarily. In my case, because of the way that the IR ended up, it tended to pick the one to the loop header, bailing out immediately after. However, if it had picked the one to the exit block, everything would have worked out fine (because the only remaining branch would then be folded, not threaded, which is acceptable). Thus, to fix this problem, we can simply eliminate loop headers from consideration as possible threading targets earlier, to make sure that if there are multiple eligible branches, we can still thread one of the ones that don't target a loop header. Patch by Keno Fischer! Differential Revision: https://reviews.llvm.org/D42260 llvm-svn: 328798
*	.debug_names: Correctly align the AugmentationStringSize field	Pavel Labath	2018-03-29	1	-0/+101
\| \| \| \| \| \| \| \| \| \| \| \| \|	We should align the value of the field, not the overall section offset. This distinction matters if one of the debug_names contributions is not of size which is a multiple of four. The dwarf producers may choose to emit rounded contributions, but they are not required to do so. In the latter case, without this patch we would corrupt the parsing state, as we would adjust the offset even if subsequent contributions contained correctly rounded augmentation strings. llvm-svn: 328796
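The difference between aligning the field's value and aligning the section offset can be shown with a little arithmetic. This is a hedged sketch, not the LLVM parser; the function names and the sample offsets are hypothetical, and it assumes the augmentation string is padded to a multiple of four bytes as the message describes:

```python
def align_to(value: int, alignment: int = 4) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def next_offset_aligning_value(string_start: int, aug_size: int) -> int:
    """Skip the augmentation string by rounding up the size field's value."""
    return string_start + align_to(aug_size)


def next_offset_aligning_offset(string_start: int, aug_size: int) -> int:
    """Skip the string, then round the overall section offset (the old behavior)."""
    return align_to(string_start + aug_size)


# Contribution whose string starts at a multiple of four: both agree.
print(next_offset_aligning_value(36, 5), next_offset_aligning_offset(36, 5))

# A later contribution that starts at an unrounded offset: they diverge,
# even though its own augmentation string size (8) is already rounded.
print(next_offset_aligning_value(38, 8), next_offset_aligning_offset(38, 8))
```

In the second case the offset-based version skips two extra bytes, which is the kind of parsing-state corruption the commit message refers to.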
*	[llvm-mca] Correctly set the ReadAdvance information for register use operands.	Andrea Di Biagio	2018-03-29	1	-0/+46
\| \| \| \| \| \| \| \| \| \| \| \|	The tool was passing the wrong operand index to method MCSubtargetInfo::getReadAdvanceCycles(). That method requires a "UseIdx", and not the operand index. This was found when testing X86 code where instructions had a memory folded operand. This patch fixes the issue and adds test read-advance-1.s to ensure that the ReadAfterLd (a ReadAdvance of 3cy) information is correctly used. llvm-svn: 328790
*	[Hexagon] Add support to handle bit-reverse load intrinsics	Krzysztof Parzyszek	2018-03-29	2	-111/+63
\| \| \| \| \| \|	Patch by Sumanth Gundapaneni. llvm-svn: 328774
*	.debug_names: Parse DW_IDX_die_offset as a reference	Pavel Labath	2018-03-29	7	-16/+16
\| \| \| \| \| \| \| \| \| \| \|	Before this patch we were parsing the attributes as section offsets, as that is what apple_names is doing. However, this is not correct as DWARF v5 specifies that this attribute should use the Reference form class. This also updates all the testcases (except the ones that deliberately pass a different form) to use the correct form class. llvm-svn: 328773
*	[LLVM-C] Finish exception instruction bindings	Robert Widmann	2018-03-29	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add support for cleanupret, catchret, catchpad, cleanuppad and catchswitch and their associated accessors. Test is modified from SimplifyCFG because it contains many diverse usages of these instructions. Reviewers: whitequark, deadalnix, echristo Reviewed By: echristo Subscribers: llvm-commits, harlanhaskins Differential Revision: https://reviews.llvm.org/D44496 llvm-svn: 328759
*	[MemorySSA] Consider callsite args for hashing and equality.	George Burgess IV	2018-03-29	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We use a `DenseMap<MemoryLocOrCall, MemlocStackInfo>` to keep track of prior work when optimizing uses in MemorySSA. Because we weren't accounting for callsite arguments in either the hash code or equality tests for `MemoryLocOrCall`s, we optimized uses too aggressively in some rare cases. Fix by Daniel Berlin. Should fix PR36883. llvm-svn: 328748
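The effect of leaving callsite arguments out of a map key's hash and equality can be illustrated outside LLVM. The sketch below uses a Python dict in place of `DenseMap`; both key classes, the cached strings, and the `memcpy` operands are hypothetical and only model the shape of the bug:

```python
from dataclasses import dataclass


class CalleeOnlyKey:
    """Hypothetical key that ignores arguments (the buggy shape)."""

    def __init__(self, callee, args):
        self.callee = callee
        self.args = args

    def __hash__(self):
        return hash(self.callee)

    def __eq__(self, other):
        return isinstance(other, CalleeOnlyKey) and self.callee == other.callee


@dataclass(frozen=True)
class CallKey:
    """Hypothetical key whose hash and equality include the arguments."""

    callee: str
    args: tuple


def cached_info(key_type):
    """Cache info for one call, then look up a call with different arguments."""
    cache = {key_type("memcpy", ("%a", "%b")): "info for memcpy(%a, %b)"}
    return cache.get(key_type("memcpy", ("%c", "%d")))


print(cached_info(CalleeOnlyKey))  # reuses the entry cached for the other call
print(cached_info(CallKey))        # None: the two calls are kept distinct
```

With the callee-only key, prior work recorded for one call is treated as valid for a different call, which corresponds to optimizing uses too aggressively.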
*	[PostRAMachineSink] preserve CFG	Jun Bum Lim	2018-03-28	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Mark the CFG as preserved since this pass does not make any changes to the CFG. Reviewers: sebpop, mzolotukhin, mcrosier Reviewed By: mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44845 llvm-svn: 328727
*	[Hexagon] Add support for "new" circular buffer intrinsics	Krzysztof Parzyszek	2018-03-28	1	-0/+294
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These instructions have been around for a long time, but we haven't supported intrinsics for them. The "new" versions use the CSx register for the start of the buffer instead of the K field in the Mx register. We need to use pseudo instructions for these instructions until after register allocation. The problem is that these instructions allocate a M0/CS0 or M1/CS1 pair. But, we can't generate code for the CSx set-up until after register allocation when the Mx register has been fixed for the instruction. There is a related clang patch. Patch by Brendon Cahoon. llvm-svn: 328724
*	[MachineOutliner] Simplify call outlining + require valid callee save info ↵	Jessica Paquette	2018-03-28	2	-2/+2
\| \| \| \| \| \| \| \| \| \|	for call outlining This commit simplifies the call outlining logic by removing references to the Function associated with the callee. To do this, it requires that valid callee save info is available to the outliner. llvm-svn: 328719
*	[llvm-ar] Support multiple dashed options	Peter Collingbourne	2018-03-28	2	-1/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This allows syntax like: $ llvm-ar -c -r -u file.a file.o This is in addition to the other formats that are already supported: $ llvm-ar cru file.a file.o $ llvm-ar -cru file.a file.o Patch by Tom Anderson! Differential Revision: https://reviews.llvm.org/D44452 llvm-svn: 328716
*	[X86][AVX2] Add shuffle test case from PR36933	Simon Pilgrim	2018-03-28	1	-5/+43
\| \| \| \|	llvm-svn: 328714
*	[AMDGPU][MC] Added ds_add_src2_f32	Dmitry Preobrazhensky	2018-03-28	2	-0/+11
\| \| \| \| \| \| \| \| \|	See bug 36833: https://bugs.llvm.org/show_bug.cgi?id=36833 Differential Revision: https://reviews.llvm.org/D44779 Reviewers: arsenm, artem.tamazov, timcorringham llvm-svn: 328713
*	[AMDGPU][MC] Added PCK variants of image load/store instructions	Dmitry Preobrazhensky	2018-03-28	2	-0/+69
\| \| \| \| \| \| \| \| \|	See bug 36834: https://bugs.llvm.org/show_bug.cgi?id=36834 Differential Revision: https://reviews.llvm.org/D44795 Reviewers: artem.tamazov, arsenm, timcorringham, nhaehnle llvm-svn: 328710
*	[AMDGPU][MC][GFX9] Added buffer_*_format_d16_hi_x	Dmitry Preobrazhensky	2018-03-28	2	-0/+70
\| \| \| \| \| \| \| \| \|	See bug 36835: https://bugs.llvm.org/show_bug.cgi?id=36835 Differential Revision: https://reviews.llvm.org/D44825 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328707
*	[AMDGPU][MC][GFX9] Added s_scratch* instructions	Dmitry Preobrazhensky	2018-03-28	2	-24/+116
\| \| \| \| \| \| \| \| \|	See bug 36836: https://bugs.llvm.org/show_bug.cgi?id=36836 Differential Revision: https://reviews.llvm.org/D44832 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328704
*	Revert "Reapply "[DWARFv5] Emit file 0 to the line table.""	Alexander Potapenko	2018-03-28	9	-84/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit r328676. Commit r328676 broke the -no-integrated-as flag necessary to build Linux kernel with Clang: $ cat t.c void foo() {} $ clang -no-integrated-as -c t.c -g /tmp/t-dcdec5.s: Assembler messages: /tmp/t-dcdec5.s:8: Error: file number less than one clang-7.0: error: assembler command failed with exit code 1 (use -v to see invocation) llvm-svn: 328699
*	[X86][BtVer2] Fix the number of micro opcodes for AES[ENC\|DEC] and other YMM ↵	Andrea Di Biagio	2018-03-28	1	-22/+22
\| \| \| \| \| \| \| \| \| \| \|	instructions. Similar to r328694. The number of micro opcodes should be 2 for those instructions. This was found when testing AVX code for BtVer2 using llvm-mca. llvm-svn: 328698
*	Revert "[AMDGPU] For OS type AMDPAL, fixed scratch on compute shader"	Tim Renouf	2018-03-28	1	-29/+0
\| \| \| \| \| \| \| \| \| \| \|	This reverts commit 0daf86291d3aa04d3cc280cd0ef24abdb0174981. It was causing an assert in test/CodeGen/AMDGPU/amdpal.ll only on a release-with-asserts build. I will resubmit the change when I have fixed that. Change-Id: If270594eba27a7dc4076bdeab3fa8e6bfda3288a llvm-svn: 328695
*	[X86][BtVer2] Fix the number of micro opcodes for a bunch of YMM instructions.	Andrea Di Biagio	2018-03-28	2	-14/+709
\| \| \| \| \| \| \| \| \| \| \| \| \|	The Jaguar backend natively supports 128-bit data types. Operations on YMM registers are split into two COPs (complex operations). Each COP consumes a slot in the dispatch group, and in the reorder buffer. The scheduling model for Jaguar should mark those instructions as `let NumMicroOps = 2`. This was found when testing AVX code for BtVer2 using llvm-mca. llvm-svn: 328694
*	[ARM] Support float literals under XO	Christof Douma	2018-03-28	1	-83/+26
\| \| \| \| \| \| \| \| \| \|	Follow up patch of r328313 to support the UseVMOVSR constraint. Removed some unneeded instructions from the test and removed some stray comments. Differential Revision: https://reviews.llvm.org/D44941 llvm-svn: 328691
*	[RegisterCoalescing] Don't move COPY if it would interfere with another value	Mikael Holmen	2018-03-28	1	-0/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: RegisterCoalescer::removePartialRedundancy tries to hoist B = A from BB0/BB2 to BB1: BB1: ... BB0/BB2: ---- B = A; \| ... \| A = B; \| \|------- \| It does so if a number of conditions are fulfilled. However, it failed to check if B was used by any of the terminators in BB1. Since we must insert B = A before the terminators (since it's not a terminator itself), this means that we could erroneously insert a new definition of B before a use of it. Reviewers: wmi, qcolombet Reviewed By: wmi Subscribers: MatzeB, llvm-commits, sdardis Differential Revision: https://reviews.llvm.org/D44918 llvm-svn: 328689
*	[AArch64] add ftrunc tests; NFC	Sanjay Patel	2018-03-28	1	-0/+47
\| \| \| \| \| \|	As suggested in D44909. llvm-svn: 328683