bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[llvm-pdbutil] Add a function for formatting MSF data.	Zachary Turner	2017-06-23	1	-10/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The goal here is to make it possible to display absolute file offsets when dumping bytes from an MSF. The problem is that when dumping bytes from an MSF, often the bytes will cross a block boundary and encounter a discontinuity. We can't use the normal formatBinary() function for this because this would just treat the sequence as entirely ascending, and not account for out-of-order blocks. This patch adds a formatMsfData() function to our printer, and then uses this function to improve the output of the -stream-data command line option for dumping bytes from a particular stream. Test coverage is also expanded to make sure to include all possible scenarios of offsets, sizes, and crossing block boundaries. llvm-svn: 306141
*	[x86] fix value types for SBB transform (PR33560)	Sanjay Patel	2017-06-23	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \|	I'm not sure yet why this wouldn't fail in the simple case, but clearly I used the wrong value type with: https://reviews.llvm.org/rL306040 ...and the bug manifests with: https://bugs.llvm.org/show_bug.cgi?id=33560 llvm-svn: 306139
*	[X86][AVX] Regenerate i256 bitcasted store test	Simon Pilgrim	2017-06-23	1	-6/+20
\| \| \| \| \| \|	Check on slow/fast unaligned memory targets llvm-svn: 306138
*	Regenerate extract-store.ll tests	Simon Pilgrim	2017-06-23	1	-6/+123
\| \| \| \|	llvm-svn: 306131
*	[Hexagon] Handle decreasing of stack alignment in frame lowering	Krzysztof Parzyszek	2017-06-23	1	-0/+51
\| \| \| \|	llvm-svn: 306124
*	GlobalISel: remove G_SEQUENCE instruction.	Tim Northover	2017-06-23	2	-41/+13
\| \| \| \| \| \| \| \|	It was trying to do too many things. The basic lumping together of values for legalization purposes is now handled by G_MERGE_VALUES. More complex things involving gaps and odd sizes are handled by G_INSERT sequences. llvm-svn: 306120
*	GlobalISel: convert buildSequence to use non-deprecated instructions.	Tim Northover	2017-06-23	3	-9/+24
\| \| \| \| \| \| \| \|	G_SEQUENCE is going away soon so as a first step the MachineIRBuilder needs to be taught how to emulate it with alternatives. We use G_MERGE_VALUES where possible, and a sequence of G_INSERTs if not. llvm-svn: 306119
*	[InlineCost] Do not take INT_MAX when Cost is negative	Jun Bum Lim	2017-06-23	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: visitSwitchInst should not take INT_MAX when Cost is negative. Instead of INT_MAX, we also use a valid upper-bound cost when overflow occurs in Cost. Reviewers: hans, echristo, dmgreen Reviewed By: dmgreen Subscribers: mcrosier, javed.absar, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D34436 llvm-svn: 306118
*	[SystemZ] Remove unnecessary serialization before volatile loads	Ulrich Weigand	2017-06-23	1	-21/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts the use of TargetLowering::prepareVolatileOrAtomicLoad introduced by r196905. Nothing in the semantics of the "volatile" keyword or the definition of the z/Architecture actually requires that volatile loads are preceded by a serialization operation, and no other compiler on the platform actually implements this. Since we've now seen a use case where this additional serialization causes noticeable performance degradation, this patch removes it. The patch still leaves in the serialization before atomic loads, which is now implemented directly in lowerATOMIC_LOAD. (This also seems overkill, but that can be addressed separately.) llvm-svn: 306117
*	[x86] auto-generate complete checks; NFC	Sanjay Patel	2017-06-23	1	-41/+93
\| \| \| \|	llvm-svn: 306114
*	[x86] auto-generate complete checks; NFC	Sanjay Patel	2017-06-23	1	-85/+212
\| \| \| \|	llvm-svn: 306113
*	AMDGPU/GlobalISel: Mark 32-bit G_AND as legal	Tom Stellard	2017-06-23	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D34349 llvm-svn: 306112
*	[x86] remove overridden target settings in test; NFC	Sanjay Patel	2017-06-23	1	-2/+0
\| \| \| \| \| \|	r306109 was supposed to make this change, but I committed the wrong version. llvm-svn: 306110
*	[x86] rename test file and auto-generate complete checks; NFC	Sanjay Patel	2017-06-23	2	-23/+35
\| \| \| \| \| \| \|	The command-line params override the target setting in the file itself, so delete that. Also, remove the cpu and arch because those don't matter and neither does the OS specification in the triple. llvm-svn: 306109
*	[X86][AVX] Extended vector average tests	Simon Pilgrim	2017-06-23	1	-411/+917
\| \| \| \| \| \|	Added AVX1 tests and merged AVX1/AVX2/AVX512 checks where possible llvm-svn: 306107
*	[SystemZ] Fix trap issue and enable expensive checks.	Jonas Paulsson	2017-06-23	3	-10/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The isBarrier/isTerminator flags have been removed from the SystemZ trap instructions, so that tests do not fail with EXPENSIVE_CHECKS. This was just an issue at -O0 and did not affect code output on benchmarks. (Like Eli pointed out: "targets are split over whether they consider their "trap" a terminator; x86, AArch64, and NVPTX don't, but ARM, MIPS, PPC, and SystemZ do. We should probably try to be consistent here.". This is still the case, although SystemZ has switched sides). SystemZ now returns true in isMachineVerifierClean() :-) These Generic tests have been modified so that they can be run with or without EXPENSIVE_CHECKS: CodeGen/Generic/llc-start-stop.ll and CodeGen/Generic/print-machineinstrs.ll Review: Ulrich Weigand, Simon Pilgrim, Eli Friedman https://bugs.llvm.org/show_bug.cgi?id=33047 https://reviews.llvm.org/D34143 llvm-svn: 306106
*	[X86][SSE] Dropped -mcpu from vector average tests	Simon Pilgrim	2017-06-23	1	-645/+686
\| \| \| \| \| \|	Use triple and attribute only for consistency llvm-svn: 306104
*	[InstCombine] Recognize and simplify three way comparison idioms	Anna Thomas	2017-06-23	1	-0/+395
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Many languages have a three way comparison idiom where comparing two values produces not a boolean, but a tri-state value. Typical values (e.g. as used in the lcmp/fcmp bytecodes from Java) are -1 for less than, 0 for equality, and +1 for greater than. We actually do a great job already of converting three way comparisons into binary comparisons when the result produced has only a single use. Unfortunately, such values can have more than one use, and in that case, our existing optimizations break down. The patch adds a peephole which converts a three-way compare + test idiom into a binary comparison on the original inputs. It focuses on replacing the test on the result of the three way compare and does nothing about removing the three way compare itself. That's left to other optimizations (which do actually kick in commonly.) We currently recognize one idiom on signed integer compare. In the future, we plan to recognize and simplify other comparison idioms on other signed/unsigned datatypes such as floats, vectors etc. This is a resurrection of Philip Reames' original patch: https://reviews.llvm.org/D19452 Reviewers: majnemer, apilipenko, reames, sanjoy, mkazantsev Reviewed by: mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34278 llvm-svn: 306100
*	Revert r306095: [mips] Fix reg positions in the aui/daui instructions	Petar Jovanovic	2017-06-23	6	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ELF/mips-plt-r6.s in lld-test is failing. Reverting the change. Original commit message: [mips] Fix register positions in the aui/daui instructions Swapped the position of the rt and rs register in the aui/daui instructions for mips32r6 and mips64r6. With this change, the format of the generated instructions complies with specifications and GCC. Patch by Milos Stojanovic. llvm-svn: 306099
*	[X86][SSE] Dropped -mcpu from scalar math tests	Simon Pilgrim	2017-06-23	1	-6/+4
\| \| \| \| \| \|	Use triple and attribute only for consistency llvm-svn: 306097
*	[mips] Fix register positions in the aui/daui instructions	Petar Jovanovic	2017-06-23	6	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \|	Swapped the position of the rt and rs register in the aui/daui instructions for mips32r6 and mips64r6. With this change, the format of the generated instructions complies with specifications and GCC. Patch by Milos Stojanovic. Differential Revision: https://reviews.llvm.org/D33988 llvm-svn: 306095
*	[X86][SSE] Dropped -mcpu from insertps tests	Simon Pilgrim	2017-06-23	1	-3/+3
\| \| \| \| \| \|	Use triple and attribute only for consistency llvm-svn: 306092
*	[mips][msa] Splat.d endianness check	Stefan Maksimovic	2017-06-23	1	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \| \|	Before this change, it was always the first element of a vector that got splatted since the lower 6 bits of vshf.d $wd were always zero for little endian. Additionally, masking has been performed for vshf via which splat.d is created. Vshf has a property where if its first operand's elements have either bit 6 or 7 set, destination element is set to zero. Initially masked with 63 to avoid this property, which would result in generation of and.v + vshf.d in all cases. Masking with one results in generating a single splati.d instruction when possible. Differential Revision: https://reviews.llvm.org/D32216 llvm-svn: 306090
*	COFF: Produce an error on invalid pcrel relocs.	Rafael Espindola	2017-06-23	2	-25/+18
\| \| \| \| \| \| \| \| \| \|	X86_64 COFF only has support for 32 bit pcrel relocations. Produce an error on all others. Note that gnu as has extended the relocation values to support this. It is not clear if we should support the gnu extension. llvm-svn: 306082
*	[LoopSimplify] Factor the logic to form dedicated exits into a utility.	Chandler Carruth	2017-06-23	3	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I want to use the same logic as LoopSimplify to form dedicated exits in another pass (SimpleLoopUnswitch) so I wanted to factor it out here. I also noticed that there is a pretty significantly more efficient way to implement this than the way the code in LoopSimplify worked. We don't need to actually retain the set of unique exit blocks, we can just rewrite them as we find them and use only a set to deduplicate. This did require changing one part of LoopSimplify to not re-use the unique set of exits, but it only used it to check that there was a single unique exit. That part of the code is about to walk the exiting blocks anyways, so it seemed better to rewrite it to use those exiting blocks to compute this property on-demand. I also had to ditch a statistic, but it doesn't seem terribly valuable. Differential Revision: https://reviews.llvm.org/D34049 llvm-svn: 306081
*	Make the test a bit more strict. NFC.	Rafael Espindola	2017-06-23	1	-62/+64
\| \| \| \|	llvm-svn: 306080
*	COFF: handle "undef - ." expressions.	Rafael Espindola	2017-06-23	2	-6/+10
\| \| \| \| \| \| \|	This is another thing that the ELF implementation can do but is missing from COFF. llvm-svn: 306078
*	[LVI] Teach LVI to reason about ORs of icmps similar to how it reasons about ANDs of icmps	Craig Topper	2017-06-23	1	-0/+95
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: LVI can reason about an AND of icmps on the true dest of a branch. I believe we can do similar for the false dest of ORs. This allows us to get the same answer for the demorganed versions of some of the AND test cases as you can see. Reviewers: anna, reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34431 llvm-svn: 306076
*	[x86] add/sub (X==0) --> sbb(cmp X, 1)	Sanjay Patel	2017-06-22	1	-6/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is very similar to the transform in: https://reviews.llvm.org/rL306040 ...but in this case, we use cmp X, 1 to set the carry bit as needed. Again, we can show that all of these are logically equivalent (although InstCombine currently canonicalizes to a form not seen here), and if we believe IACA, then this is the smallest/fastest code. Eg, with SNB: \| Num Of \| Ports pressure in cycles \| \| \| Uops \| 0 - DV \| 1 \| 2 - D \| 3 - D \| 4 \| 5 \| \| --------------------------------------------------------------------- \| 1 \| 1.0 \| \| \| \| \| \| \| cmp edi, 0x1 \| 2 \| \| 1.0 \| \| \| \| 1.0 \| CP \| sbb eax, eax The larger motivation is to clean up all select-of-constants combining/lowering because we're missing some common cases. llvm-svn: 306072
*	Restrict the definition of loop preheader to avoid EH blocks	Andrew Kaylor	2017-06-22	1	-0/+41
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D34487 llvm-svn: 306070
*	Define behavior of "stack-probe-size" attribute when inlining.	whitequark	2017-06-22	1	-0/+29
\| \| \| \| \| \| \| \| \| \|	Also document the attribute, since "probe-stack" already is. Reviewed By: majnemer Differential Revision: https://reviews.llvm.org/D34528 llvm-svn: 306069
*	Supported lowerInterleavedStore() in X86InterleavedAccess.	Farhana Aleen	2017-06-22	2	-69/+96
\| \| \| \| \| \| \| \| \| \|	Reviewers: RKSimon, DavidKreitzer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32658 llvm-svn: 306068
*	Remove the LoadCombine pass. It was never enabled and is unsupported.	Eric Christopher	2017-06-22	5	-355/+0
\| \| \| \| \| \|	Based on discussions with the author on mailing lists. llvm-svn: 306067
*	[x86] add more tests for select --> sbb transform; NFC	Sanjay Patel	2017-06-22	1	-4/+61
\| \| \| \| \| \|	These are siblings of the tests added with r306032. llvm-svn: 306064
*	Change creation of relative relocations on COFF.	Rafael Espindola	2017-06-22	2	-11/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For whatever reason, when processing .globl foo foo: .data bar: .long foo-bar llvm-mc creates a relocation with the section: 0x0 IMAGE_REL_I386_REL32 .text This is different than when the relocation is relative from the beginning. For example, a file with call foo produces 0x0 IMAGE_REL_I386_REL32 foo I would like to refactor the logic for converting "foo - ." into a relative relocation so that it is shared with ELF. This is the first step and just changes the coff implementation to match what ELF (and COFF in the case of calls) does. llvm-svn: 306063
*	[WebAssembly] WebAssemblyFastISel getelementptr variable index support	Jacob Gravelle	2017-06-22	1	-0/+100
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Previously -fast-isel getelementptr would constant-fold non-constant i8 load/stores. Reviewers: sunfish Subscribers: jfb, dschuff, sbc100, llvm-commits Differential Revision: https://reviews.llvm.org/D34044 llvm-svn: 306060
*	[Hexagon] Properly update kill flags in HexagonNewValueJump	Krzysztof Parzyszek	2017-06-22	1	-0/+53
\| \| \| \| \| \| \|	The feeder instruction will be moved to right before the compare, so the updating code should not be looking for kills past the compare. llvm-svn: 306059
*	[MC] Allow assembling .secidx and .secrel32 for undefined symbols	Reid Kleckner	2017-06-22	2	-10/+30
\| \| \| \| \| \| \| \| \| \| \|	There's nothing incorrect about emitting such relocations against symbols defined in other objects. The code in EmitCOFFSec* was missing the visitUsedExpr part of MCStreamer::EmitValueImpl, so these symbols were not being registered with the object file assembler. This will be used to make reduced test cases for LLD. llvm-svn: 306057
*	[llvm-pdbutil] Create a "bytes" subcommand.	Zachary Turner	2017-06-22	3	-14/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This idea originally came about when I was doing some deep investigation of why certain bytes in a PDB that we round-tripped differed from their original bytes in the source PDB. I found myself having to hack up the code in many places to dump the bytes of this substream, or that record. It would be nice if we could just do this for every possible stream, substream, debug chunk type, etc. It doesn't make sense to put this under dump because there's just so many options that would detract from the more common use case of just dumping deserialized records. So making a new subcommand seems like the most logical course of action. In doing so, we already have two command line options that are suitable for this new subcommand, so start out by moving them there. llvm-svn: 306056
*	[llvm-pdbutil] Rename "raw" to "dump".	Zachary Turner	2017-06-22	6	-15/+15
\| \| \| \| \| \| \| \| \|	Now you run llvm-pdbutil dump <options>. This is a followup after having renamed the tool: before, "raw" was obviously just the style of dumping, whereas now "dump" is the action to perform with the "util". llvm-svn: 306055
*	[Hexagon] Use LivePhysRegs to fix up kills in HexagonGenMux	Krzysztof Parzyszek	2017-06-22	3	-2/+33
\| \| \| \| \| \|	Remove the previous, manual shuffling of the kill flags. llvm-svn: 306054
*	[LoopDeletion] Update exits correctly when multiple duplicate edges from an exiting block	Anna Thomas	2017-06-22	1	-0/+76
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently, we incorrectly update exit blocks of loops when there are multiple edges from a single exiting block to the exit block. This can happen when we have switches as the terminator of the exiting blocks. The fix here is to correctly update the phi nodes in the exit block, and remove all incoming values except for one which is from the preheader. Note: Currently, this error can manifest only while deleting non-executed loops. However, it is possible to trigger this error in invariant loops, once we enhance the logic around the exit conditions for the loop check. Reviewers: chandlerc, dberlin, sanjoy, efriedma Reviewed by: efriedma Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D34516 llvm-svn: 306048
*	[AVX-512] Remove and autoupgrade the masked integer compare intrinsics	Craig Topper	2017-06-22	8	-2704/+4230
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: These intrinsics aren't used by clang and haven't been for a while. There's some really terrible codegen in the 32-bit target for avx512bw due to i64 not being legal. But as I said these intrinsics aren't used by clang even before this patch so this codegen reflects our clang behavior today. Reviewers: spatel, RKSimon, zvi, igorb Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34389 llvm-svn: 306047
*	Updated llvm-objdump for arm64 Mach-O MH_KEXT_BUNDLE file types so	Kevin Enderby	2017-06-22	2	-0/+9
\| \| \| \| \| \| \| \| \|	it symbolically disassembles the __text section from the __TEXT_EXEC segment not the usual __TEXT segment by default. rdar://30590208 llvm-svn: 306046
*	[x86] add/sub (X==0) --> sbb(neg X)	Sanjay Patel	2017-06-22	1	-9/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Our handling of select-of-constants is lumpy in IR (https://reviews.llvm.org/D24480), lumpy in DAGCombiner, and lumpy in X86ISelLowering. That's why we only had the 'sbb' codegen in 1 out of the 4 tests. This is a step towards smoothing that out. First, show that all of these IR forms are equivalent: http://rise4fun.com/Alive/mx Second, show that the 'sbb' version is faster/smaller. IACA output for SandyBridge (later Intel and AMD chips are similar based on Agner's tables): This is the "obvious" x86 codegen (what gcc appears to produce currently): \| Num Of \| Ports pressure in cycles \| \| \| Uops \| 0 - DV \| 1 \| 2 - D \| 3 - D \| 4 \| 5 \| \| --------------------------------------------------------------------- \| 1* \| \| \| \| \| \| \| \| xor eax, eax \| 1 \| 1.0 \| \| \| \| \| \| CP \| test edi, edi \| 1 \| \| \| \| \| \| 1.0 \| CP \| setnz al \| 1 \| \| 1.0 \| \| \| \| \| CP \| neg eax This is the adc version: \| 1* \| \| \| \| \| \| \| \| xor eax, eax \| 1 \| 1.0 \| \| \| \| \| \| CP \| cmp edi, 0x1 \| 2 \| \| 1.0 \| \| \| \| 1.0 \| CP \| adc eax, 0xffffffff And this is sbb: \| 1 \| 1.0 \| \| \| \| \| \| \| neg edi \| 2 \| \| 1.0 \| \| \| \| 1.0 \| CP \| sbb eax, eax If IACA is trustworthy, then sbb became a single uop in Broadwell, so this will be clearly better than the alternatives going forward. llvm-svn: 306040
*	Updated llvm-objdump symbolic disassembly with x86_64 Mach-O MH_KEXT_BUNDLE	Kevin Enderby	2017-06-22	2	-0/+9
\| \| \| \| \| \| \| \| \|	file types so it symbolically disassembles operands using the external relocation entries. rdar://31521343 llvm-svn: 306037
*	Add a common error checking for some invalid expressions.	Rafael Espindola	2017-06-22	3	-4/+8
\| \| \| \| \| \| \|	This refactors a bit of duplicated code and fixes an assertion failure on ELF. llvm-svn: 306035
*	[x86] add tests for select --> sbb transform; NFC	Sanjay Patel	2017-06-22	1	-0/+62
\| \| \| \|	llvm-svn: 306032
*	[AMDGPU] Add intrinsics for tbuffer load and store	David Stuttard	2017-06-22	10	-25/+333
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Intrinsic already existed for llvm.SI.tbuffer.store. Needed tbuffer.load and also re-implementing the intrinsic as llvm.amdgcn.tbuffer.*. Added CodeGen tests for the 2 new variants added. Left the original llvm.SI.tbuffer.store implementation to avoid issues with existing code. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, tpr Differential Revision: https://reviews.llvm.org/D30687 llvm-svn: 306031
*	[Hexagon] Fix typo in a testcase	Krzysztof Parzyszek	2017-06-22	1	-1/+1
\| \| \| \|	llvm-svn: 306030