bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[X86ISelLowering] Consolidate duplicated code in a single place.	Davide Italiano	2016-02-22	2	-24/+16
	llvm-svn: 261573
*	Revert "CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC"	Duncan P. N. Exon Smith	2016-02-22	1	-2/+1
	This reverts commit r261504, since it's not obvious the new name is better:
	http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160222/334298.html
	I'll recommit if we get consensus that it's the right direction.
	llvm-svn: 261567
*	AVX512F: Add assembler Intel syntax tests for knl, fix minor bugs.	Igor Breger	2016-02-22	1	-5/+5
	Differential Revision: http://reviews.llvm.org/D17498
	llvm-svn: 261521
*	AVX512: Fix scalar mem operands.	Igor Breger	2016-02-22	1	-14/+17
	Differential Revision: http://reviews.llvm.org/D17500
	llvm-svn: 261520
*	[X86] Minor formatting fix. NFC	Craig Topper	2016-02-22	1	-9/+9
	llvm-svn: 261515
*	Document assumption in X86FrameLowering::inlineStackProbe()	Duncan P. N. Exon Smith	2016-02-22	1	-1/+2
	Resolve FIXME from r261504. Apparently bundled instructions are illegal here:
	http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160215/334146.html
	llvm-svn: 261507
*	CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC	Duncan P. N. Exon Smith	2016-02-21	1	-1/+3
	Delete MachineInstr::getIterator(), since the term "iterator" is overloaded when talking about MachineInstr.
	- Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so that ilist_node::getIterator() is still available.
	- Add it back as MachineInstr::getInstrIterator(). This matches the naming in MachineBasicBlock.
	- Add MachineInstr::getBundleIterator(). This is explicitly called "bundle" (not matching MachineBasicBlock) to distinguish it clearly from ilist_node::getIterator().
	- Update all calls. Some of these I switched to `auto` to remove boilerplate, since the new name is clear about the type.
	There was one call I updated that looked fishy, but it wasn't clear what the right answer was. This was in X86FrameLowering::inlineStackProbe(), added in r252578 in lib/Target/X86/X86FrameLowering.cpp. I opted to leave the behaviour unchanged, but I'll reply to the original commit on the list in a moment.
	llvm-svn: 261504
*	ADT: Remove == and != comparisons between ilist iterators and pointers	Duncan P. N. Exon Smith	2016-02-21	1	-1/+1
	I missed == and != when I removed implicit conversions between iterators and pointers in r252380 since they were defined outside ilist_iterator. Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
	This commit removes all uses of these operators. (I'll delete the operators themselves in a separate commit so that it can be easily reverted if necessary.)
	There should be NFC here.
	llvm-svn: 261498
*	[X86] Remove unused encoding types from disassembler. NFC	Craig Topper	2016-02-21	3	-22/+0
	llvm-svn: 261494
*	[X86][AVX] Add shuffle masking support for EltsFromConsecutiveLoads	Simon Pilgrim	2016-02-21	1	-3/+25
	Add support for the case where we have a consecutive load (which must include the first + last elements) with a mixture of undef/zero elements. We load the vector and then apply a shuffle to clear the zero'd elements.
	Differential Revision: http://reviews.llvm.org/D17297
	llvm-svn: 261490
*	Fix LLVM's handling and detection of skylake and cannonlake CPUs	Sanjoy Das	2016-02-21	1	-3/+2
	Summary:
	- Rename `"skylake"` == SkylakeServerProc to `"skylake-avx512"`
	- Change `"skylake"` to denote SkylakeClientProc
	- Fix the detection of cpu family 6 and model 94 to be SkylakeClientProc instead of SkylakeServerProc
	- Remove the `"cnl"` for CannonLake
	Reviewers: craig.topper, delena
	Subscribers: zansari, echristo, qcolombet, RKSimon, spatel, DavidKreitzer, mcrosier, llvm-commits
	Differential Revision: http://reviews.llvm.org/D17090
	llvm-svn: 261482
*	[X86] Use the correct alignment for COMDAT constant pool entries	David Majnemer	2016-02-21	3	-8/+23
	COFF doesn't have sections with mergeable contents. Instead, each constant pool entry ends up in a COMDAT section. The linker, when choosing between COMDAT sections, doesn't choose the max alignment of the two sections. You just get whatever alignment was on the section.
	If one constant needed a higher alignment in one object file than in another, then we will get into trouble if the linker chooses the lower-alignment one.
	Instead, let's promote the alignment of the constant pool entry to make sure we don't use an under-aligned constant with an instruction which assumed otherwise.
	This fixes PR26680.
	llvm-svn: 261462
*	[X86][SSE] Fixed issue with commutation of 'faux unary' target shuffles (PR26667)	Simon Pilgrim	2016-02-20	1	-5/+4
	Fixed a bug introduced by D16683 when a binary shuffle is simplified to a unary shuffle (with undef/zero sentinel mask indices). If this resulted in only the second input being used, combineX86ShuffleChain failed to take this into account and still referenced the first input.
	llvm-svn: 261434
*	[X86][SSE] Move all undef/zero cases before target shuffle combining.	Simon Pilgrim	2016-02-20	1	-20/+14
	First small step towards fixing PR26667 - we need to ensure that combineX86ShuffleChain only gets called with a valid shuffle input node (a similar issue was found in D17041).
	llvm-svn: 261433
*	[X86] Enable the LEA optimization pass by default.	Andrey Turetskiy	2016-02-20	1	-4/+5
	Differential Revision: http://reviews.llvm.org/D16877
	llvm-svn: 261429
*	[X86] PR26575: Fix LEA optimization pass (Part 2).	Andrey Turetskiy	2016-02-20	1	-36/+78
	Handle address displacement operands of a type other than Immediate or Global in LEAs and load/stores.
	Ref: https://llvm.org/bugs/show_bug.cgi?id=26575
	Differential Revision: http://reviews.llvm.org/D17374
	llvm-svn: 261428
*	Move some code from doInitialization to runOnFunction	David Majnemer	2016-02-20	1	-3/+4
	This has no observable behavior change; it just makes the state insertion pass look a little more like normal passes.
	llvm-svn: 261420
*	[X86] Add some missing reversed forms of XOP instructions.	Craig Topper	2016-02-20	1	-0/+29
	llvm-svn: 261417
*	[X86ISelLowering] Fix TLSADDR lowering when shrink-wrapping is enabled.	Davide Italiano	2016-02-20	3	-2/+39
	TLSADDR nodes are lowered into actual calls inside MC. In order to prevent shrink-wrapping from pushing prologue/epilogue past them (which would result in TLS variables being accessed before the stack frame is set up), we put markers, so that the stack gets adjusted properly.
	Thanks to Quentin Colombet for guidance/help on how to fix this problem!
	llvm-svn: 261387
*	[X86ISelLowering] Provide a more informative assert message.	Davide Italiano	2016-02-19	1	-1/+1
	I stumbled upon this while debugging a lowering bug.
	llvm-svn: 261371
*	[X86ISelLowering] Merge two conditions inside a single if.	Davide Italiano	2016-02-19	1	-3/+1
	llvm-svn: 261370
*	Revert r253557 "Alternative to long nops for X86 CPUs, by Andrey Turetsky"	Hans Wennborg	2016-02-19	1	-32/+14
	Turns out the new nop sequences aren't actually nops on x86_64 (PR26554).
	llvm-svn: 261365
*	Fix incorrect selection of AVX512 sqrt when OptForSize is on	Dimitry Andric	2016-02-19	1	-1/+1
	Summary:
	When optimizing for size, sqrt calls can be incorrectly selected as AVX512 VSQRT instructions. This is because X86InstrAVX512.td has a `Requires<[OptForSize]>` in its `avx512_sqrt_scalar` multiclass definition. Even if the target does not support AVX512, the class can apparently still be chosen, leading to an incorrect selection of `vsqrtss`.
	In PR26625, this led to an assertion:
	    Reg >= X86::FP0 && Reg <= X86::FP6 && "Expected FP register!"
	because the `vsqrtss` instruction requires an XMM register, which is not available on i686 CPUs.
	Reviewers: grosbach, resistor, joker.eph
	Subscribers: spatel, emaste, llvm-commits
	Differential Revision: http://reviews.llvm.org/D17414
	llvm-svn: 261360
*	[X86] Remove unused entries from the disassembler type enum.	Craig Topper	2016-02-19	3	-6/+0
	llvm-svn: 261311
*	[x86] fix initialization of PredictableSelectIsExpensive	Sanjay Patel	2016-02-18	1	-3/+3
	This is effectively NFC because Atom is the only in-order x86 subtarget currently, but the predicate would have become wrong if any other in-order CPU came along.
	See related discussion in: http://reviews.llvm.org/D16836
	llvm-svn: 261275
*	Remove uses of builtin comma operator.	Richard Trieu	2016-02-18	1	-1/+2
	Cleanup for upcoming Clang warning -Wcomma. No functionality change intended.
	llvm-svn: 261270
*	[WinEH] Hoist state stores from successors	David Majnemer	2016-02-18	1	-1/+54
	If we know that all of our successors want to be in the exact same state, it makes sense to hoist the state transition into their common predecessor.
	Differential Revision: http://reviews.llvm.org/D17391
	llvm-svn: 261262
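The hoisting condition described above can be sketched in a few lines. This is only an illustration, not the pass's code: `commonSuccessorState`, `NoCommonState`, and the plain-`int` state representation (with states assumed to be -1 or greater) are all hypothetical names and assumptions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sentinel: the successors disagree, so nothing can be hoisted.
const int NoCommonState = -2;

// Returns the one state that every successor expects on entry, or
// NoCommonState if there are no successors or they disagree.
int commonSuccessorState(const std::vector<int> &SuccEntryStates) {
  if (SuccEntryStates.empty())
    return NoCommonState;
  int Common = SuccEntryStates.front();
  for (int S : SuccEntryStates) {
    // Assumption of this sketch: real states are never below -1.
    assert(S >= -1 && "state collides with the sentinel");
    if (S != Common)
      return NoCommonState;
  }
  return Common;
}
```

When the function returns a real state, a single store in the predecessor could replace one store per successor.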
*	[X86ISelLowering] Use isPowerof2 instead of rewriting it. NFC.	Davide Italiano	2016-02-18	1	-1/+1
	llvm-svn: 261255
*	Revert to extend i8/i16 return values on Darwin (PR26665)	Hans Wennborg	2016-02-18	1	-1/+6
	In r260133, LLVM was changed to no longer extend i8/i16 return values, as it's not required by the ABI. However, code was found in the wild that relies on the old behaviour on Darwin, so this commit reverts back to that old behaviour for Darwin.
	On other platforms, it's less likely that code would be depending on the old behaviour, as GCC and MSVC haven't been extending such return values.
	llvm-svn: 261235
*	[X86][SSE] Improve PSHUFB shuffle mask decoding.	Simon Pilgrim	2016-02-18	1	-16/+36
	In cases where the PSHUFB shuffle mask is shared it might not be bitcasted to a vXi8 byte vector. This patch adds support for decoding these wider shuffle masks from the ConstantPool.
	The test case in question makes use of this to recognise that the shuffle mask is a unary UNPCKL pattern and simplifies accordingly.
	llvm-svn: 261201
*	[AVX512][PRORQ][PRORD] Change imm8 to int	Michael Zuckerman	2016-02-18	1	-6/+6
	Differential Revision: http://reviews.llvm.org/D17024
	llvm-svn: 261198
*	Remove superfluous semicolon.	Nico Weber	2016-02-17	1	-1/+1
	llvm-svn: 261128
*	[WinEH] Optimize WinEH state stores	David Majnemer	2016-02-17	1	-32/+175
	32-bit x86 Windows targets use a linked list of nodes allocated on the stack, referenced via thread-local storage. The personality routine interprets one of the fields in the node as a 'state number' which indicates where the personality routine should transfer control. State transitions are possible only before call-sites which may throw exceptions. Our previous scheme had us update the state number before all call-sites which may throw.
	Instead, we can try to minimize the number of times we need to store by reasoning about the nearest store which dominates the current call-site. If the last store agrees with the current call-site, then we know that the state-update is redundant and can be elided.
	This is largely straightforward: an RPO walk of the blocks allows us to correctly forward propagate the information when the function is a DAG. Currently, loops are not handled optimally and may trigger superfluous state stores.
	Differential Revision: http://reviews.llvm.org/D16763
	llvm-svn: 261122
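A rough model of the forward propagation described above, for the loop-free case only. The `Block` struct, `countStateStores`, the `UnknownState` sentinel, and the use of block indices in RPO order are assumptions made for this sketch; the real pass works on LLVM IR and is not reproduced here.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical sentinel: the state on entry to a block is not known.
const int UnknownState = INT_MIN;

struct Block {
  std::vector<int> Preds;      // indices of predecessors (earlier in RPO)
  std::vector<int> CallStates; // state needed by each may-throw call, in order
};

// Walks the blocks in reverse post-order and counts how many state stores
// are still needed once redundant ones are elided. Assumes a DAG.
int countStateStores(const std::vector<Block> &RPO, int EntryState) {
  std::vector<int> Out(RPO.size(), UnknownState);
  int Stores = 0;
  for (std::size_t I = 0; I < RPO.size(); ++I) {
    int Cur = EntryState;
    if (!RPO[I].Preds.empty()) {
      Cur = Out[RPO[I].Preds.front()];
      for (int P : RPO[I].Preds) {
        assert(P >= 0 && static_cast<std::size_t>(P) < I && "not a DAG in RPO");
        if (Out[P] != Cur) { // predecessors disagree: nothing is known
          Cur = UnknownState;
          break;
        }
      }
    }
    for (int S : RPO[I].CallStates) {
      if (Cur != S) { // last known store does not match: a store is needed
        ++Stores;
        Cur = S;
      }
    }
    Out[I] = Cur;
  }
  return Stores;
}
```

On a diamond-shaped graph where three of four call-sites need the same state, this model emits three stores instead of the four that a store-before-every-call scheme would need.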
*	AVX512: Fix LowerMSCATTER() return value.	Igor Breger	2016-02-17	1	-1/+1
	Bug description: The bug was discovered when a test was compiled with -O0. When the scatter result is the DAG root, VectorLegalizer failed (assert) because LowerMSCATTER() returned the kmask as its result. Change LowerMSCATTER() to return the chain, as the original node does.
	Differential Revision: http://reviews.llvm.org/D17331
	llvm-svn: 261090
*	[X86][AVX] Support bit-blend integer shuffles for 256-bit integer vectors	Simon Pilgrim	2016-02-17	1	-1/+3
	AVX1 doesn't support the shuffling of 256-bit integer vectors. For 32/64-bit elements we get around this by shuffling as float/double but for 8/16-bit elements (assuming they can't widen) we currently just split, shuffle as 128-bit vectors and concatenate the results back.
	This patch adds the ability to lower using the bit-blend patterns before defaulting to the splitting behaviour.
	Part 2 of 2
	Differential Revision: http://reviews.llvm.org/D17292
	llvm-svn: 261082
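As a scalar model of what a bit-blend amounts to: when every output lane takes its element from the same lane of one of the two inputs, the shuffle reduces to `(A & Sel) | (B & ~Sel)` with an all-ones or all-zero selector per lane. The names `lowerAsBitBlend`, `Vec`, and `N`, and the four-lane `uint16_t` layout, are illustrative assumptions; LLVM's lowering operates on SelectionDAG nodes rather than arrays.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int N = 4; // lane count chosen for the sketch only
using Vec = std::array<uint16_t, N>;

// Mask[I] in [0, N) picks A[Mask[I]]; Mask[I] in [N, 2N) picks B[Mask[I] - N].
// Succeeds only if every lane stays in place, i.e. Mask[I] is I or I + N.
bool lowerAsBitBlend(const Vec &A, const Vec &B,
                     const std::array<int, N> &Mask, Vec &Out) {
  Vec Sel{};
  for (int I = 0; I < N; ++I) {
    if (Mask[I] == I)
      Sel[I] = 0xFFFF; // take this lane from A
    else if (Mask[I] == I + N)
      Sel[I] = 0x0000; // take this lane from B
    else
      return false;    // lane crosses position: not a blend
  }
  for (int I = 0; I < N; ++I)
    Out[I] = static_cast<uint16_t>((A[I] & Sel[I]) | (B[I] & ~Sel[I]));
  return true;
}
```

A mask that moves an element to a different lane is rejected, which corresponds to falling back to the splitting behaviour mentioned in the message.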
*	[X86][AVX] Support bit-mask integer shuffles for 256-bit integer vectors	Simon Pilgrim	2016-02-17	1	-2/+6
	AVX1 doesn't support the shuffling of 256-bit integer vectors. For 32/64-bit elements we get around this by shuffling as float/double but for 8/16-bit elements (assuming they can't widen) we currently just split, shuffle as 128-bit vectors and concatenate the results back.
	This patch adds the ability to lower using the bit-mask patterns before defaulting to the splitting behaviour.
	In some cases this ends up matching what AVX2 would do anyhow or what AVX1 does on the split vectors.
	Part 1 of 2
	Differential Revision: http://reviews.llvm.org/D17292
	llvm-svn: 261081
*	[X86][SSE] Tidyup BUILD_VECTOR operand collection. NFCI.	Simon Pilgrim	2016-02-17	1	-23/+20
	Avoid reuse of operand variables, keep them local to a particular lowering - the operand collection is unique to each case anyhow.
	Renamed from V to Ops to more closely match their purpose.
	llvm-svn: 261078
*	Revert r260979 "[X86] Enable the LEA optimization pass by default."	Hans Wennborg	2016-02-17	1	-5/+4
	Asserts are still firing in Chromium builds. PR26575.
	llvm-svn: 261058
*	[X86] Fix a shrink-wrapping miscompile around __chkstk	Reid Kleckner	2016-02-17	1	-7/+6
	__chkstk clobbers EAX. If EAX is live across the prologue, then we have to take extra steps to save it. We already had code to do this if EAX was a register parameter. This change adapts it to work when shrink wrapping is used.
	llvm-svn: 261039
*	[X86] Remove the now-unused X86ISD::PSIGN. NFC.	Ahmed Bougacha	2016-02-16	6	-46/+30
	llvm-svn: 261025
*	[X86] Generalize logic blend of (x, -x) combine to match (-x, x).	Ahmed Bougacha	2016-02-16	1	-7/+17
	I suspect this is what let PR26110 lie dormant for so long.
	llvm-svn: 261024
*	[X86] Don't turn (c?-v:v) into (c?-v:0) by blindly using PSIGN.	Ahmed Bougacha	2016-02-16	1	-10/+23
	Currently, we sometimes miscompile this vector pattern:
	    (c ? -v : v)
	We lower it to (because "c" is <4 x i1>, lowered as a vector mask):
	    (~c & v) | (c & -v)
	When we have SSSE3, we incorrectly lower that to PSIGN, which does:
	    (c < 0 ? -v : c > 0 ? v : 0)
	in other words, when c is either all-ones or all-zero:
	    (c ? -v : 0)
	While this is an old bug, it rarely triggers because the PSIGN combine is too sensitive to operand order. This will be improved separately.
	Note that the PSIGN tests are also incorrect. Consider:
	    %b.lobit = ashr <4 x i32> %b, <i32 31, i32 31, i32 31, i32 31>
	    %sub = sub nsw <4 x i32> zeroinitializer, %a
	    %0 = xor <4 x i32> %b.lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
	    %1 = and <4 x i32> %a, %0
	    %2 = and <4 x i32> %b.lobit, %sub
	    %cond = or <4 x i32> %1, %2
	    ret <4 x i32> %cond
	if %b is zero:
	    %b.lobit = <4 x i32> zeroinitializer
	    %sub = sub nsw <4 x i32> zeroinitializer, %a
	    %0 = <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
	    %1 = <4 x i32> %a
	    %2 = <4 x i32> zeroinitializer
	    %cond = or <4 x i32> %a, zeroinitializer
	    ret <4 x i32> %a
	whereas we currently generate:
	    psignd %xmm1, %xmm0
	    retq
	which returns 0, as %xmm1 is 0.
	Instead, use a pure logic sequence, as described in:
	    https://graphics.stanford.edu/~seander/bithacks.html#ConditionalNegate
	Fixes PR26110.
	Differential Revision: http://reviews.llvm.org/D17181
	llvm-svn: 261023
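The mismatch is easy to reproduce on a single 32-bit lane. The sketch below models the three formulas from the message as scalar functions; the names `psignLane`, `blendLane`, and `condNegateLane` are made up for illustration, and the last one uses the `(v ^ c) - c` form of the conditional-negate trick, which is an assumption about the shape of the replacement sequence rather than a copy of what the patch emits.

```cpp
#include <cassert>
#include <cstdint>

// One lane of PSIGND: negate, keep, or zero v depending on the sign of c.
inline int32_t psignLane(int32_t v, int32_t c) {
  return c < 0 ? -v : (c > 0 ? v : 0);
}

// One lane of the logic blend (~c & v) | (c & -v); c must be 0 or -1.
inline int32_t blendLane(int32_t v, int32_t c) {
  assert((c == 0 || c == -1) && "mask lane must be all-zero or all-ones");
  return (~c & v) | (c & -v);
}

// Conditional negate with pure logic and arithmetic: (v ^ c) - c.
// With c == -1 this is ~v + 1 == -v; with c == 0 it is v.
inline int32_t condNegateLane(int32_t v, int32_t c) {
  assert((c == 0 || c == -1) && "mask lane must be all-zero or all-ones");
  return (v ^ c) - c;
}
```

For an all-zero mask lane, `psignLane(7, 0)` yields 0 while both `blendLane(7, 0)` and `condNegateLane(7, 0)` yield 7, which is the miscompile described above. The sketch ignores the `INT32_MIN` case, where negation overflows in C++.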
*	[X86] Extract PSIGN/BLENDVP combine. NFC.	Ahmed Bougacha	2016-02-16	1	-77/+95
	llvm-svn: 261021
*	[X86] Extract ANDNP combine. NFC.	Ahmed Bougacha	2016-02-16	1	-61/+57
	This makes it IMO more readable and reduces indentation.
	llvm-svn: 261020
*	[X86] Enable the LEA optimization pass by default.	Andrey Turetskiy	2016-02-16	1	-4/+5
	Differential Revision: http://reviews.llvm.org/D16877
	llvm-svn: 260979
*	[X86] PR26575: Fix LEA optimization pass.	Andrey Turetskiy	2016-02-16	1	-0/+5
	Add a missing check on the type of the address displacement operand of a load/store instruction that is a candidate for LEA substitution.
	Ref: https://llvm.org/bugs/show_bug.cgi?id=26575
	Differential Revision: http://reviews.llvm.org/D17261
	llvm-svn: 260959
*	[X86] Fix typos. NFC	Craig Topper	2016-02-16	1	-2/+2
	llvm-svn: 260943
*	[X86] Use range-based for loop. NFC	Craig Topper	2016-02-16	1	-3/+2
	llvm-svn: 260942
*	[X86] Fix typo in comment. NFC	Craig Topper	2016-02-16	1	-1/+1
	llvm-svn: 260940
*	Implemented stack symbol table ordering/packing optimization to improve data locality and code size from SP/FP offset encoding.	Zia Ansari	2016-02-15	2	-0/+149
	Differential Revision: http://reviews.llvm.org/D15393
	llvm-svn: 260917