bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Reapply r240291: Fix shl folding in DAG combiner.	Pawel Bylica	2015-07-02	1	-0/+9
	The code responsible for shl folding in the DAGCombiner incorrectly assumed that all constants are less than 64 bits wide. This patch simply changes the way values are compared. It was reverted previously because of some problems with comparing APInt with raw uint64_t. That has been fixed/changed with r241204. llvm-svn: 241254
*	[TwoAddressInstructionPass] Try 3 Addr Conversion After Commuting.	Quentin Colombet	2015-07-01	3	-9/+5
	TwoAddressInstructionPass stops after a successful commuting but 3 Addr conversion might be good for some cases. Consider: int foo(int a, int b) { return a + b; } Before this commit, we emit: addl %esi, %edi movl %edi, %eax ret After this commit, we try 3 Addr conversion: leal (%rsi,%rdi), %eax ret Patch by Volkan Keles <vkeles@apple.com>! Differential Revision: http://reviews.llvm.org/D10851 llvm-svn: 241206
*	add a cl::opt override for TargetLoweringBase's JumpIsExpensive	Sanjay Patel	2015-07-01	1	-11/+20
	This patch is not intended to change existing codegen behavior for any target. It just exposes the JumpIsExpensive setting on the command-line to allow for easier testing and emergency overrides. Also, change the existing regression test to use FileCheck, explicitly specify the jump-is-expensive option, and use more precise checks. Differential Revision: http://reviews.llvm.org/D10846 llvm-svn: 241179
*	[SEH] Don't assert if the parent function lacks a personality	Reid Kleckner	2015-07-01	1	-0/+33
	The EH code might have been deleted as unreachable and the personality pruned while the filter is still present. Currently I'm hitting this at -O0 due to the clang bug PR24009. llvm-svn: 241170
*	AVX-512: Implemented missing encoding for FMA scalar instructions	Igor Breger	2015-07-01	1	-1/+30
	Added tests for encoding. Differential Revision: http://reviews.llvm.org/D10865 llvm-svn: 241159
*	[SEH] Add new intrinsics for recovering and restoring parent frames	Reid Kleckner	2015-06-30	2	-31/+44
	The incoming EBP value established by the runtime is actually a pointer to the end of the EH registration object, and not the true parent function frame pointer. Clang doesn't need llvm.x86.seh.exceptioninfo anymore because we know that the exception info pointer is at a fixed offset from this incoming EBP. The llvm.x86.seh.recoverfp intrinsic takes an EBP value provided by the EH runtime and returns a pointer that is usable with llvm.framerecover. The llvm.x86.seh.restoreframe intrinsic is inserted by the 32-bit specific preparation pass in blocks targeted by the EH runtime. It re-establishes any physical registers used by the parent function to address the stack, such as the frame, base, and stack pointers. Neither of these intrinsics correctly handles stack realignment prologues yet, but it's possible to add that later. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D10848 llvm-svn: 241125
*	[FaultMaps] Let the frontend pre-select implicit null check candidates.	Sanjoy Das	2015-06-30	2	-5/+23
	Summary: This change introduces a !make.implicit metadata that allows the frontend to pre-select the set of explicit null checks that will be considered for transformation into implicit null checks. The reason for not using profiling data instead of !make.implicit is explained in the change to `FaultMaps.rst`. Reviewers: atrick, reames, pgavlin, JosephTremoulet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10824 llvm-svn: 241116
*	COFF: Do not assign linker-weak symbols to selectany comdat sections.	Peter Collingbourne	2015-06-30	1	-0/+9
	It is mandatory to specify a comdat in order to receive comdat semantics for a symbol. We were previously getting this wrong in -function-sections mode; linker-weak symbols were being emitted in a selectany comdat. This change causes such symbols to use a noduplicates comdat instead, fixing the inconsistency. Also correct an inaccuracy in the docs. Differential Revision: http://reviews.llvm.org/D10828 llvm-svn: 241103
*	[X86] Fix a bug in WIN_FTOL_32/64 handling.	Michael Kuperstein	2015-06-30	1	-0/+22
	Duplicating an FP register "as itself" is a bad idea, since it violates the invariant that every FP register is mapped to at most one FPU stack slot. Use the scratch FP register instead. This fixes PR23957. llvm-svn: 241069
*	[X86] Add FXSR intrinsics	Michael Kuperstein	2015-06-30	2	-0/+50
	Add intrinsics for the FXSR instructions (FXSAVE/FXSAVE64/FXRSTOR/FXRSTOR64). llvm-svn: 241049
*	Teach LTOModule to emit linker flags for dllexported symbols, plus interface ↵	Peter Collingbourne	2015-06-29	2	-65/+66
	cleanup. This change unifies how LTOModule and the backend obtain linker flags for globals: via a new TargetLoweringObjectFile member function named emitLinkerFlagsForGlobal. A new function LTOModule::getLinkerOpts() returns the list of linker flags as a single concatenated string. This change affects the C libLTO API: the function lto_module_get_deplibs now exposes an empty list, and lto_module_get_linkeropts exposes a single element which combines the contents of all observed flags. libLTO should never have tried to parse the linker flags; it is the linker's job to do so. Because linkers will need to be able to parse flags in regular object files, it makes little sense for libLTO to have a redundant mechanism for doing so. The new API is compatible with the old one. It is valid for a user to specify multiple linker flags in a single pragma directive like this: #pragma comment(linker, "/defaultlib:foo /defaultlib:bar") The previous implementation would not have exposed either flag via lto_module_get_deplibs (as the test in TargetLoweringObjectFileCOFF::getDepLibFromLinkerOpt was case sensitive) and would have exposed "/defaultlib:foo /defaultlib:bar" as a single flag via lto_module_get_linkeropts. This may have been a bug in the implementation, but it does give us a chance to fix the interface. Differential Revision: http://reviews.llvm.org/D10548 llvm-svn: 241010
*	X86: Rework inline asm integer register specification.	Matthias Braun	2015-06-29	2	-0/+143
	This is a new version of http://reviews.llvm.org/D10260. It turned out that when you specify an integer register in inline asm on x86 you get the register of the required type size back. That means that X86TargetLowering::getRegForInlineAsmConstraint() has to accept any of the integer registers and adapt its size to the given target size, which may be any 8/16/32/64 bit sized type. Surprisingly, that means given a constraint of "{ax}" and a type of MVT::F32 we need to return X86::EAX. This change makes this fact explicit; the previous code seemed to work by accident because it never returned an error once a register was found. On the other hand, this rewrite makes it possible to actually return errors for invalid situations like requesting an integer register for an i128 type. Related to rdar://21042280 Differential Revision: http://reviews.llvm.org/D10813 llvm-svn: 241002
*	[FaultMaps] Fix test case.	Sanjoy Das	2015-06-29	1	-16/+1
	implicit-null-check-negative.ll had a missing 2>&1. Fix this, and remove an incorrect test case that this exposes. llvm-svn: 240998
*	[DAGCombiner] Fix & simplify constant folding of sext/zext.	Pawel Bylica	2015-06-29	1	-0/+92
	Summary: This patch fixes the cases of sext/zext constant folding in DAG combiner where constants do not fit 64 bits. The fix simply removes un$ Test Plan: New regression test included. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: RKSimon, llvm-commits Differential Revision: http://reviews.llvm.org/D10607 llvm-svn: 240991
*	AVX-512: all forms of SCATTER instruction on SKX,	Elena Demikhovsky	2015-06-29	1	-0/+241
	encoding, intrinsics and tests. llvm-svn: 240936
*	AVX-512: Implemented missing encoding and intrinsics for FMA instructions	Igor Breger	2015-06-29	3	-303/+1308
	Added tests for DAG lowering, encoding and intrinsics. Differential Revision: http://reviews.llvm.org/D10796 llvm-svn: 240926
*	[x86][AVX512]	Asaf Badouh	2015-06-28	2	-0/+74
	Add vscalef support, including encoding and intrinsics. Review: http://reviews.llvm.org/D10730 llvm-svn: 240906
*	AVX-512: Added all SKX forms of GATHER instructions.	Elena Demikhovsky	2015-06-28	1	-96/+411
	Added intrinsics. Added encoding and tests. llvm-svn: 240905
*	[SDAG] Now that we have a way to communicate the exact bit on sdiv use it to ↵	Benjamin Kramer	2015-06-27	1	-1/+12
	simplify sdiv by a constant. We had a hack in SDAGBuilder in place to work around this but now we can avoid that. Call BuildExactSDIV from BuildSDIV so DAGCombiner can perform this trick automatically. The added check in DAGCombiner is necessary to prevent exact sdiv by pow2 from regressing as the target-specific pow2 lowering is not aware of exact bits yet. This is mostly covered by existing tests. One side effect is that we get the better lowering for exact vector sdivs now too :) llvm-svn: 240891
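The trick referred to above can be illustrated outside of LLVM. The following is a minimal sketch of the well-known multiplicative-inverse technique for exact division, not the code of BuildExactSDIV itself; the helper names (`asr`, `inverseMod2_32`, `exactSDiv`) are hypothetical and the sketch assumes 32-bit two's-complement integers. When the dividend is known to be an exact multiple of the divisor, the power-of-two part of the divisor can be removed with an arithmetic shift and the remaining odd part with a single multiplication modulo 2^32.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right on a 32-bit pattern, written with unsigned
// operations so the behaviour does not depend on the compiler.
uint32_t asr(uint32_t v, unsigned k) {
  uint32_t logical = v >> k;
  uint32_t signFill = (v & 0x80000000u) ? ~(0xFFFFFFFFu >> k) : 0u;
  return logical | signFill;
}

// Multiplicative inverse of an odd value modulo 2^32 (Newton's iteration).
uint32_t inverseMod2_32(uint32_t odd) {
  uint32_t inv = odd;           // 3 correct low bits: odd * odd == 1 (mod 8)
  for (int i = 0; i < 4; ++i)   // each step doubles the number of correct bits
    inv *= 2u - odd * inv;
  return inv;
}

// Computes x / d without a divide instruction.
// Assumptions: d != 0 and x is an exact multiple of d.
int32_t exactSDiv(int32_t x, int32_t d) {
  uint32_t ux = static_cast<uint32_t>(x);
  uint32_t ud = static_cast<uint32_t>(d);
  unsigned k = 0;
  while (((ud >> k) & 1u) == 0)  // count trailing zero bits of the divisor
    ++k;
  // Shift out the power-of-two factor, then multiply by the inverse of
  // the odd factor; all arithmetic is modulo 2^32.
  uint32_t q = asr(ux, k) * inverseMod2_32(asr(ud, k));
  return static_cast<int32_t>(q);
}
```

For example, `exactSDiv(84, 12)` shifts 84 right by 2 to get 21 and multiplies by the inverse of 3, giving 7. The result is only meaningful when the exactness assumption holds, which is what the exact bit communicates.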
*	llvm/test/CodeGen/X86/xor.ll: Appease Win32 targets since r240796.	NAKAMURA Takumi	2015-06-27	1	-1/+1
	%struct.ref_s = type { %union.v, i16, i16 } %union.v = type { i64 } It seems %struct.ref_s is incompatible in tail padding. llvm-svn: 240874
*	Revert "Revert r240762 "[X86] Cleanup ↵	David Majnemer	2015-06-26	1	-0/+13
	X86WindowsTargetObjectFile::getSectionForConstant"" This reverts commit r240793 while fixing how we handle array constant pool entries. This fixes PR23966. llvm-svn: 240811
*	[DAGCombine] Fix demanded bits computation for exact shifts.	Benjamin Kramer	2015-06-26	1	-0/+19
	Fixes a miscompilation of MultiSource/Benchmarks/MallocBench/gs. llvm-svn: 240796
*	[DAGCombiner] Preserve the exact bit when simplifying SRA to SRL.	Benjamin Kramer	2015-06-26	1	-0/+10
	Allows more aggressive folding of ashr/shl pairs. llvm-svn: 240788
*	[DAGCombine] fold (X >>?,exact C1) << C2 --> X << (C2-C1)	Benjamin Kramer	2015-06-26	1	-0/+49
	Instcombine also does this but many opportunities only become visible after GEPs are lowered. llvm-svn: 240787
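As a rough illustration of why the exact flag matters for this fold, the sketch below compares the two forms on plain 32-bit unsigned values. It is not taken from the DAGCombiner; the function names are hypothetical, and it assumes a logical shift with 0 <= C1 <= C2 < 32. The fold is only valid when the right shift discards no set bits.

```cpp
#include <cassert>
#include <cstdint>

// True when x >> c1 drops only zero bits, i.e. the shift is "exact".
bool isExactShift(uint32_t x, unsigned c1) {
  return (x & ((1u << c1) - 1u)) == 0;
}

// Original form: (X >> C1) << C2.
uint32_t unfolded(uint32_t x, unsigned c1, unsigned c2) {
  return (x >> c1) << c2;
}

// Folded form: X << (C2 - C1). Assumes c1 <= c2.
uint32_t folded(uint32_t x, unsigned c1, unsigned c2) {
  return x << (c2 - c1);
}
```

With `x = 0xF0`, `c1 = 4`, `c2 = 6` the shift is exact and both forms give `0x3C0`. With `x = 0xF3` the low bits survive in the folded form (`0x3CC`) but not in the original (`0x3C0`), so the fold would be wrong without the exact guarantee.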
*	Revert "X86: Reject register operands with obvious type mismatches."	Matthias Braun	2015-06-26	1	-10/+0
	Revert until http://llvm.org/PR23955 is investigated. This reverts commit r239309. llvm-svn: 240746
*	add/fix labels in test/CodeGen/X86/StackColoring.ll	Matthias Braun	2015-06-26	1	-15/+18
	llvm-svn: 240744
*	[X86] Accept hasAVX512() as well as hasFMA() when generating FMA.	Ahmed Bougacha	2015-06-25	3	-2/+3
	We don't always have FMA, for example when using 'clang -mavx512f' without an explicit CPU. Also check for an explicit +avx512f instead of CPUs in a couple related tests. llvm-svn: 240616
*	[X86] Cleanup fma tests a little bit. NFC.	Ahmed Bougacha	2015-06-25	6	-732/+758
	Reformat, isolate 213->231 xform, actually --check-prefix CHECK, and deduplicate the FMA intrinsic tests (FMA3 in AMD-land). llvm-svn: 240615
*	Enable StackMap Serialization for COFF	Swaroop Sridhar	2015-06-25	1	-2/+3
	Summary: This change turns on the emission of __LLVM_Stackmaps section when generating COFF binaries. Test Plan: Added a scenario to the test case: test\CodeGen\X86\statepoint-stackmap-format.ll. Code Review: http://reviews.llvm.org/D10680 llvm-svn: 240613
*	[X86][AVX] Added full set of 256-bit vector shift tests.	Simon Pilgrim	2015-06-24	3	-0/+1774
	llvm-svn: 240542
*	Fix instruction scheduling live register tracking	Pawel Bylica	2015-06-24	1	-0/+26
	Summary: This patch fixes PR23405 (https://llvm.org/bugs/show_bug.cgi?id=23405). While unscheduling a node, an entry in LiveRegGens can be replaced with a new value. That corrupts the live reg tracking, and the LiveReg* structure is not cleared as it should be during unscheduling. The problematic condition that enforces Gen replacement is `I->getSUnit()->getHeight() < LiveRegGens[I->getReg()]->getHeight()`. This condition should be checked only if LiveRegGen was set in the current node unscheduling. Test Plan: Regression test included. Reviewers: hfinkel, atrick Reviewed By: atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9993 llvm-svn: 240538
*	[X86] Don't generate vbroadcasti128 for v4i64 splats from memory.	Ahmed Bougacha	2015-06-24	1	-0/+54
	We used to erroneously match: (v4i64 shuffle (v2i64 load), <0,0,0,0>) Whereas vbroadcasti128 is more like: (v4i64 shuffle (v2i64 load), <0,1,0,1>) This problem doesn't exist for vbroadcastf128, which kept matching the intrinsic after r231182. We should perhaps re-introduce the intrinsic here as well, but that's a separate issue still being discussed. While there, add some proper vbroadcastf128 tests. We don't currently match those, like for loading vbroadcastsd/ss on AVX (the reg-reg broadcasts were added in AVX2). Fixes PR23886. llvm-svn: 240488
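The difference between the two shuffle masks can be modelled with plain arrays. This is only an illustrative sketch with hypothetical names (`shuffle`, `splatElement0`, `broadcast128`); it treats the loaded v2i64 as a two-element array and ignores how LLVM widens the operand before shuffling.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V2 = std::array<uint64_t, 2>;
using V4 = std::array<uint64_t, 4>;

// Build a 4-element result by picking source elements according to mask.
V4 shuffle(const V2 &src, const std::array<int, 4> &mask) {
  V4 out{};
  for (std::size_t i = 0; i < 4; ++i)
    out[i] = src[static_cast<std::size_t>(mask[i])];
  return out;
}

// Mask <0,0,0,0>: a splat of element 0 (the pattern that was wrongly matched).
V4 splatElement0(const V2 &v) { return shuffle(v, {{0, 0, 0, 0}}); }

// Mask <0,1,0,1>: the whole 128-bit pair repeated, as vbroadcasti128 does.
V4 broadcast128(const V2 &v) { return shuffle(v, {{0, 1, 0, 1}}); }
```

For an input of `{10, 20}`, `splatElement0` yields `{10, 10, 10, 10}` while `broadcast128` yields `{10, 20, 10, 20}`, so selecting vbroadcasti128 for the splat pattern produces the wrong upper elements whenever the two loaded values differ.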
*	[X86] update_llc_test_checks vector-shuffle-*. NFC.	Ahmed Bougacha	2015-06-24	4	-70/+50
	Some of them had gone stale. llvm-svn: 240485
*	[X86][SSE] Added full set of 128-bit vector shift tests.	Simon Pilgrim	2015-06-23	4	-527/+2458
	Removed some old duplicate tests. llvm-svn: 240465
*	AVX-512: Added all forms of VPABS instruction	Elena Demikhovsky	2015-06-23	4	-12/+155
	Added all intrinsics, tests for encoding, tests for intrinsics. llvm-svn: 240386
*	[x86] generalize reassociation optimization in machine combiner to 2 ↵	Sanjay Patel	2015-06-23	2	-78/+99
	instructions Currently ( D10321, http://reviews.llvm.org/rL239486 ), we can use the machine combiner pass to reassociate the following sequence to reduce the critical path: A = ? op ? B = A op X C = B op Y --> A = ? op ? B = X op Y C = A op B 'op' is currently limited to x86 AVX scalar FP adds (with fast-math on), but in theory, it could be any associative math/logic op (see TODO in code comment). This patch generalizes the pattern match to ignore the instruction that defines 'A'. So instead of a sequence of 3 adds, we now only need to find 2 dependent adds and decide if it's worth reassociating them. This generalization has a compile-time cost because we can now match more instruction sequences and we rely more heavily on the machine combiner to discard sequences where reassociation doesn't improve the critical path. For example, in the new test case: A = M div N B = A add X C = B add Y We'll match 2 reassociation patterns, but this transform doesn't reduce the critical path: A = M div N B = A add Y C = B add X We need the combiner to reject that pattern but select this: A = M div N B = X add Y C = B add A Differential Revision: http://reviews.llvm.org/D10460 llvm-svn: 240361
*	Revert r240291: causes problems in self-hosted builds.	Pawel Bylica	2015-06-22	1	-9/+0
	llvm-svn: 240343
*	Set missing x86 arch in a CodeGen regression test.	Pawel Bylica	2015-06-22	1	-1/+2
	Fixes the regression test added in r240291. llvm-svn: 240336
*	[X86][AVX2] Added missing stack folding tests for vpshufhw/vpshuflw	Simon Pilgrim	2015-06-22	1	-2/+14
	llvm-svn: 240332
*	[X86] Teach load folding to accept scalar _Int users of MOVSS/MOVSD.	Ahmed Bougacha	2015-06-22	1	-0/+142
	The _Int instructions are special, in that they operate on the full VR128 instead of FR32. The load folding then looks at MOVSS, at the user, and bails out when it sees a size mismatch. What we really know is that the rm_Int instructions don't load the higher lanes, so folding is fine. This happens for the straightforward intrinsic code, e.g.: _mm_add_ss(a, _mm_load_ss(p)); Fixes PR23349. Differential Revision: http://reviews.llvm.org/D10554 llvm-svn: 240326
*	[x86] set default reciprocal (division and square root) codegen to match GCC	Sanjay Patel	2015-06-22	2	-54/+54
	D8982 ( checked in at http://reviews.llvm.org/rL239001 ) added command-line options to allow reciprocal estimate instructions to be used in place of divisions and square roots. This patch changes the default settings for x86 targets to allow that recip codegen (except for scalar division because that breaks too much code) when using -ffast-math or its equivalent. This matches GCC behavior for this kind of codegen. Differential Revision: http://reviews.llvm.org/D10396 llvm-svn: 240310
*	[FaultMaps] Add a parser for the __llvm__faultmaps section.	Sanjoy Das	2015-06-22	1	-0/+20
	Summary: The parser is exercised by llvm-objdump using -print-fault-maps. As is probably obvious, the code itself was "heavily inspired" by http://reviews.llvm.org/D10434. Reviewers: reames, atrick, JosephTremoulet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10491 llvm-svn: 240304
*	Fix shl folding in DAG combiner.	Pawel Bylica	2015-06-22	1	-0/+8
	Summary: The code responsible for shl folding in the DAGCombiner incorrectly assumed that all constants are less than 64 bits wide. This patch simply changes the way values are compared. Test Plan: A regression test included. Reviewers: andreadb Reviewed By: andreadb Subscribers: andreadb, test, llvm-commits Differential Revision: http://reviews.llvm.org/D10602 llvm-svn: 240291
*	AVX-512: added VPSHUFB instruction - all SKX forms	Elena Demikhovsky	2015-06-22	2	-0/+39
	Added intrinsics and encoding tests. llvm-svn: 240277
*	Reverted AVX-512 vector shuffle	Elena Demikhovsky	2015-06-22	3	-609/+506
	llvm-svn: 240258
*	[X86] Allow more call sequences to use push instructions for argument passing	Michael Kuperstein	2015-06-22	1	-40/+53
	This allows more call sequences to use pushes instead of movs when optimizing for size. In particular, calling conventions that pass some parameters in registers (e.g. thiscall) are now supported. Differential Revision: http://reviews.llvm.org/D10500 llvm-svn: 240257
*	AVX-512: Added intrinsics for VPERMT2W/D/Q/PS/PD and	Elena Demikhovsky	2015-06-22	4	-1/+346
	VPERMI2W/D/Q/PS/PD instructions. Added tests. llvm-svn: 240256
*	Add the testcase from pr23900.	Rafael Espindola	2015-06-22	1	-0/+29
	llvm-svn: 240253
*	[X86][SSE] Added missing stack folding test for CVTSD2SS instruction.	Simon Pilgrim	2015-06-21	1	-1/+7
	llvm-svn: 240241
*	Switch lowering: add heuristic for filling leaf nodes in the weight-balanced ↵	Hans Wennborg	2015-06-20	1	-4/+77
	binary search tree Sparse switches with profile info are lowered as weight-balanced BSTs. For example, if the node weights are {1,1,1,1,1,1000}, the right-most node would end up in a tree by itself, bringing it closer to the top. However, a leaf in this BST can contain up to 3 cases, and having a single case in a leaf node as in the example means the tree might become unnecessarily high. This patch adds a heuristic to the pivot selection algorithm that moves more cases into leaf nodes unless that would lower their rank. It still doesn't yield the optimal tree in every case, but I believe it's conservatively correct. llvm-svn: 240224