bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[EABI] Add LLVM support for -meabi flag	Renato Golin	2015-11-09	2	-104/+214
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	"GCC requires the freestanding environment provide memcpy, memmove, memset and memcmp": https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Standards.html Hence in GNUEABI targets LLVM should not convert 'memops' to their equivalent '__aeabi_memops'. This conversion violates the GCC contract. The -meabi flag controls whether or not LLVM will modify 'memops' in GNUEABI targets. Without -meabi: use the triple default EABI. With -meabi=default: use the triple default EABI. With -meabi=gnu: use 'memops'. With -meabi=4 or -meabi=5: use '__aeabi_memops'. With -meabi set to an unknown value: same as -meabi=default. Patch by Vinicius Tinti. llvm-svn: 252462
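The flag-to-behaviour mapping listed in the message above can be sketched as a small lookup. This is only an illustration of the described rules, not LLVM's implementation; the function name `resolve_memop_style` and the labels "gnu" and "aeabi" are hypothetical names chosen for this sketch.

```python
from typing import Optional


def resolve_memop_style(meabi: Optional[str], triple_default: str) -> str:
    """Pick the memop naming style for a given -meabi value.

    Hypothetical helper: 'gnu' stands for plain 'memops',
    'aeabi' stands for the '__aeabi_memops' variants.
    triple_default is whichever of the two the target triple implies.
    """
    if meabi == "gnu":
        return "gnu"        # -meabi=gnu: keep 'memops'
    if meabi in ("4", "5"):
        return "aeabi"      # -meabi=4 or -meabi=5: use '__aeabi_memops'
    # No flag, -meabi=default, or an unknown value: use the triple default.
    return triple_default
```

Note that an unrecognised value deliberately falls through to the same branch as "default", matching the last rule in the message.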
*	Revert "[ARM] Combine CMOV into BFI where possible"	Renato Golin	2015-11-09	1	-23/+0
\| \| \| \| \| \| \|	This reverts commit r252057, as it broke ARM self-hosting buildbots, probably due to a code-gen fault. llvm-svn: 252460
*	[CodeGen] Always promote f16 if not legal	Oliver Stannard	2015-11-09	1	-164/+150
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't currently have any runtime library functions for operations on f16 values (other than conversions to and from f32 and f64), so we should always promote it to f32, even if that is not a legal type. In that case, the f32 values would be softened to f32 library calls. SoftenFloatRes_FP_EXTEND now needs to check the promoted operand's type, as it may be a no-op or require a different library call. getCopyFromParts and getCopyToParts now need to cope with a floating-point value stored in a larger integer part, as is the case for any target that needs to store an f16 value in a 32-bit integer register. Differential Revision: http://reviews.llvm.org/D12856 llvm-svn: 252459
*	[Hexagon] Removing XFAIL on Hexagon target.	Colin LeMahieu	2015-11-09	2	-2/+0
\| \| \| \|	llvm-svn: 252450
*	[Hexagon] Enabling ASM parsing on Hexagon backend and adding instruction ↵	Colin LeMahieu	2015-11-09	36	-194/+4220
\| \| \| \| \| \|	parsing tests. General updating of the code emission. llvm-svn: 252443
*	[RuntimeDyld] Add support for R_X86_64_PC8 relocation.	Maksim Panchenko	2015-11-08	1	-0/+26
\| \| \| \|	llvm-svn: 252423
*	[PowerPC] Fix LoopPreIncPrep not to depend on SCEV constant simplifications	Hal Finkel	2015-11-08	1	-0/+94
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Under most circumstances, if SCEV can simplify X-Y to a constant, then it can also simplify Y-X to a constant. However, there is no guarantee that this is always true, and consensus is not to consider that a correctness bug in SCEV (although it is undesirable). PPCLoopPreIncPrep gathers pointers used to access memory (via loads, stores and prefetches) into buckets, where in each bucket the relative pointer offsets are constant. We used to keep each bucket as a multimap, where SCEV's subtraction operation was used to define the ordering predicate. Instead, use a fixed SCEV base expression for each bucket, record the constant offsets from that base expression, and adjust it later, if desirable, once all pointers have been collected. Doing it this way should be more compile-time efficient than the previous scheme (in addition to making the implementation less sensitive to SCEV simplification quirks). Fixes PR25170. llvm-svn: 252417
*	[LoopStrengthReduce] Don't bother fixing up PHIs from EH Pad preds	David Majnemer	2015-11-08	1	-0/+53
\| \| \| \| \| \| \| \|	We cannot really insert fixup code into a PHI's predecessor. This fixes PR25445. llvm-svn: 252416
*	[WinEH] Update PHIs of CATCHRET successors	David Majnemer	2015-11-08	1	-0/+34
\| \| \| \| \| \| \| \| \| \| \|	The TailDuplication machine pass ran across a malformed CFG: a PHI node referred to its predecessor's predecessor instead of its predecessor. This occurred because we split the edge in X86ISelLowering when we processed the CATCHRET but forgot to do something about the PHI nodes. This fixes PR25444. llvm-svn: 252413
*	[FunctionAttrs] Add handling for operand bundles	Sanjoy Das	2015-11-07	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Teach the FunctionAttrs to do the right thing for IR with operand bundles. Reviewers: reames, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14408 llvm-svn: 252387
*	[FunctionAttrs] Fix an iterator wraparound bug	Sanjoy Das	2015-11-07	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change fixes an iterator wraparound bug in `determinePointerReadAttrs`. Ideally, ++'ing off the `end()` of an iplist should result in a failed assert, but currently iplist seems to silently wrap to the head of the list on `end()++`. This is why the bad behavior is difficult to demonstrate. Reviewers: chandlerc, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14350 llvm-svn: 252386
*	[WinEH] Update exception pointer registers	Joseph Tremoulet	2015-11-07	2	-5/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The CLR's personality routine passes these in rdx/edx, not rax/eax. Make getExceptionPointerRegister a virtual method parameterized by personality function to allow making this distinction. Similarly make getExceptionSelectorRegister a virtual method parameterized by personality function, for symmetry. Reviewers: pgavlin, majnemer, rnk Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14344 llvm-svn: 252383
*	[InstCombine] Teach FoldPHIArgZextsIntoPHI about EHPads	David Majnemer	2015-11-07	1	-1/+41
\| \| \| \| \| \| \| \|	FoldPHIArgZextsIntoPHI cannot insert an instruction after the PHI if there is an EHPad in the BB. Doing so would result in an instruction inserted after a terminator. llvm-svn: 252377
*	[InstCombine] Don't insert an instruction after a terminator	David Majnemer	2015-11-06	1	-2/+38
\| \| \| \| \| \| \| \|	We tried to insert a cast of a phi in a block whose terminator is an EHPad. This is invalid. Do not attempt the transform in these circumstances. llvm-svn: 252370
*	Add 'notail' marker for call instructions.	Akira Hatanaka	2015-11-06	2	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \|	This marker prevents optimization passes from adding 'tail' or 'musttail' markers to a call. It is used to prevent tail call optimization from being performed on the call. rdar://problem/22667622 Differential Revision: http://reviews.llvm.org/D12923 llvm-svn: 252368
*	[AArch64][FastISel] Don't even try to select vector icmps.	Ahmed Bougacha	2015-11-06	1	-0/+100
\| \| \| \| \| \| \| \| \| \| \| \|	We used to try to constant-fold them to i32 immediates. Given that fast-isel doesn't otherwise support vNi1, when selecting the result users, we'd fall back to SDAG anyway. However, if the users were in another block, we'd insert broken cross-class copies (GPR32 to FPR64). Give up, let SDAG agree with itself on a vNi1 legalization strategy. llvm-svn: 252364
*	[X86] Fold (trunc (i32 (zextload i16))) into vbroadcast.	Ahmed Bougacha	2015-11-06	2	-12/+4
\| \| \| \| \| \| \| \| \| \| \|	When matching non-LSB-extracting truncating broadcasts, we now insert the necessary SRL. If the scalar resulted from a load, the SRL will be folded into it, creating a narrower, offset, load. However, i16 loads aren't Desirable, so we get i16->i32 zextloads. We already catch i16 aextloads; catch these as well. llvm-svn: 252363
*	[X86] SRL non-LSB extracts when folding to truncating broadcasts.	Ahmed Bougacha	2015-11-06	4	-58/+110
\| \| \| \| \| \| \| \| \| \| \| \|	Now that we recognize this, we can support it instead of bailing out. That is, we can fold: (v8i16 (shufflevector (v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))), <1,1,...,1>)) into: (v8i16 (vbroadcast (i16 (trunc (srl Y, 16))))) llvm-svn: 252362
*	[X86] Don't fold non-LSB extracts into truncating broadcasts.	Ahmed Bougacha	2015-11-06	4	-0/+396
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We used to incorrectly assume that the offset we're extracting from was a multiple of the element size. So, we'd fold: (v8i16 (shufflevector (v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))), <1,1,...,1>)) into: (v8i16 (vbroadcast (i16 (trunc Y)))) whereas we should have extracted the higher bits from X. Instead, bail out if the assumption doesn't hold. llvm-svn: 252361
*	DAGCombiner: Check shouldReduceLoadWidth before combining (and (load), x) -> ↵	Tom Stellard	2015-11-06	3	-10/+32
\| \| \| \| \| \| \| \| \| \| \| \|	extload Reviewers: resistor, arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13805 llvm-svn: 252349
*	[WebAssembly] Use more explicit types in testcases.	Dan Gohman	2015-11-06	10	-114/+114
\| \| \| \|	llvm-svn: 252345
*	[WebAssembly] Add more explicit pushes to the tests.	Dan Gohman	2015-11-06	19	-169/+169
\| \| \| \|	llvm-svn: 252344
*	[InstCombine] Don't RAUW tokens with undef	David Majnemer	2015-11-06	1	-0/+21
\| \| \| \| \| \|	Let SimplifyCFG remove unreachable BBs which define token instructions. llvm-svn: 252343
*	[ShrinkWrapping] Teach shrink-wrapping how to analyze RegMask.	Quentin Colombet	2015-11-06	1	-0/+59
\| \| \| \| \| \| \|	Previously we were conservatively assuming that RegMask operands clobber callee saved registers. llvm-svn: 252341
*	Fix SLPVectorizer commutativity reordering	Mehdi Amini	2015-11-06	1	-0/+78
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The SLPVectorizer had a very crude way of trying to benefit from associativity: it tried to optimize for splat/broadcast or in order to have the same operator on the same side. This is beneficial to the cost model and allows more vectorization to occur. This patch improves the logic and makes the detection optimal (locally, we don't look at the full tree but only at the immediate children). Should fix https://llvm.org/bugs/show_bug.cgi?id=25247 Reviewers: mzolotukhin Differential Revision: http://reviews.llvm.org/D13996 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 252337
*	Improved the operands commute transformation for X86-FMA3 instructions.	Andrew Kaylor	2015-11-06	2	-12/+515
\| \| \| \| \| \| \| \| \| \| \| \|	All 3 operands of FMA3 instructions are commutable now. Patch by Slava Klochkov Reviewers: Quentin Colombet(qcolombet), Ahmed Bougacha(ab). Differential Revision: http://reviews.llvm.org/D13269 llvm-svn: 252335
*	[WebAssembly] Make expression-stack pushing explicit	Dan Gohman	2015-11-06	15	-191/+191
\| \| \| \| \| \| \| \| \|	Modelling of the expression stack is evolving. This patch takes another step by making pushes explicit. Differential Revision: http://reviews.llvm.org/D14338 llvm-svn: 252334
*	[ValueTracking] De-pessimize isImpliedCondition around unsigned compares	Sanjoy Das	2015-11-06	1	-1/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently `isImpliedCondition` will optimize "I +_nuw C < L ==> I < L" only if C is positive. This is an unnecessary restriction -- the implication holds even if `C` is negative. Reviewers: reames, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14369 llvm-svn: 252332
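The claim in the message above, that "I +_nuw C < L ==> I < L" holds under unsigned comparison regardless of the sign of C, can be checked exhaustively at a small bit width. The sketch below is a brute-force illustration only; the function name and the choice of a 4-bit default are assumptions for this example and are not part of ValueTracking.

```python
def nuw_implication_holds(bits: int = 4) -> bool:
    """Exhaustively check I +nuw C <u L  ==>  I <u L at the given width.

    Values are unsigned bit patterns in [0, 2**bits). A 'negative' C is
    simply a pattern with the top bit set. Additions that would wrap are
    skipped, because the nuw flag asserts that no unsigned wrap occurs.
    """
    n = 1 << bits
    for i in range(n):
        for c in range(n):
            total = i + c
            if total >= n:
                continue  # unsigned wrap: the nuw premise does not apply
            for limit in range(n):
                if total < limit and not (i < limit):
                    return False  # counterexample found
    return True
```

The check passes because, without unsigned wrap, I <= I + C always holds, so I + C < L forces I < L.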
*	[ValueTracking] Add a framework for encoding implication rules	Sanjoy Das	2015-11-06	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change adds a framework for adding more smarts to `isImpliedCondition` around inequalities. Informally, `isImpliedCondition` will now try to prove "A < B ==> C < D" by proving "C <= A && B <= D", since then it follows "C <= A < B <= D". While this change is in principle NFC, I could not think of a way to not handle cases like "i +_nsw 1 < L ==> i < L +_nsw 1" (that ValueTracking did not handle before) while keeping the change understandable. I've added tests for these cases. Reviewers: reames, majnemer, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14368 llvm-svn: 252331
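The proof strategy described above (show "A < B ==> C < D" by establishing "C <= A && B <= D") can be illustrated with a brute-force soundness check over a small range of plain integers. This is a sketch under the assumption of non-wrapping mathematical integers; the helper names are hypothetical and it does not model how `isImpliedCondition` inspects IR.

```python
from itertools import product


def rule_applies(a: int, b: int, c: int, d: int) -> bool:
    """Sufficient condition used by the rule: C <= A and B <= D."""
    return c <= a and b <= d


def rule_is_sound(lo: int = -4, hi: int = 4) -> bool:
    """Check that whenever the rule applies and A < B, then C < D."""
    for a, b, c, d in product(range(lo, hi + 1), repeat=4):
        if rule_applies(a, b, c, d) and a < b and not (c < d):
            return False  # the rule would have proved a false implication
    return True
```

The rule is sufficient but not necessary: for A=0, B=1, C=5, D=6 the implication is true, yet `rule_applies` returns False.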
*	AMDGPU: Create emergency stack slots during frame lowering	Matt Arsenault	2015-11-06	2	-1/+487
\| \| \| \| \| \|	Test has a bogus verifier error which will be fixed by later commits. llvm-svn: 252327
*	AMDGPU: Add pass to detect used kernel features	Matt Arsenault	2015-11-06	1	-0/+193
\| \| \| \| \| \| \| \| \| \| \|	Mark kernels that use certain features that require user SGPRs to support with kernel attributes. We need to know before instruction selection begins because it impacts the kernel calling convention lowering. For now this only detects the workitem intrinsics. llvm-svn: 252323
*	AMDGPU: Hack for VS_32 register pressure	Matt Arsenault	2015-11-06	2	-11/+11
\| \| \| \| \| \| \| \| \| \| \| \| \|	For some reason VS_32 ends up factoring into the pressure heuristics even though we should never see a virtual register with this class. When SGPRs are reserved for register spilling, this for some reason triggers reg-crit scheduling. Setting isAllocatable = 0 may help with this since that seems to remove it from the default implementation's generated table. llvm-svn: 252321
*	Restore "Move metadata linking after lazy global materialization/linking."	Teresa Johnson	2015-11-06	3	-5/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This reverts commit r251965. Restore "Move metadata linking after lazy global materialization/linking." This restores commit r251926, with fixes for the LTO bootstrapping bot failure. The bot failure was caused by references from debug metadata to otherwise unreferenced globals. Previously, this caused the lazy linking to link in their defs, which is unnecessary. With this patch, because lazy linking is complete when we encounter the metadata reference, the materializer created a declaration. For definitions such as aliases and comdats, it is illegal to have a declaration. Furthermore, metadata linking should not change code generation. Therefore, when linking of global value bodies is complete, the materializer will simply return nullptr as the new reference for the linked metadata. This change required fixing a different test to ensure there was a real reference to a linkonce global that was only being referenced from metadata. Note that the new changes to the only-needed-named-metadata.ll test illustrate an issue with llvm-link -only-needed handling of comdat groups, whereby it may result in an incomplete comdat group. I note this in the test comments, but the issue is orthogonal to this patch (it can be reproduced without any metadata at head). Reviewers: dexonsmith, rafael, tra Subscribers: tobiasvk, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D14447 llvm-svn: 252320
*	Restore "Move metadata linking after lazy global materialization/linking."	Teresa Johnson	2015-11-06	2	-0/+25
\| \| \| \| \| \|	This reverts commit r251965. llvm-svn: 252319
*	[WinEH] Mark funclet entries and exits as clobbering all registers	Reid Kleckner	2015-11-06	2	-0/+177
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In this implementation, LiveIntervalAnalysis invents a few register masks on basic block boundaries that preserve no registers. The nice thing about this is that it prevents the prologue inserter from thinking it needs to spill all XMM CSRs, because it doesn't see any explicit physreg defs in the MI. Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer Subscribers: MatzeB, llvm-commits Differential Revision: http://reviews.llvm.org/D14407 llvm-svn: 252318
*	[AArch64]Enable the narrow ld promotion only on profitable microarchitectures	Jun Bum Lim	2015-11-06	2	-48/+47
\| \| \| \| \| \| \| \| \|	The benefit from converting narrow loads into a wider load (r251438) could be micro-architecturally dependent, as it assumes that a single load with two bitfield extracts is cheaper than two narrow loads. Currently, this conversion is enabled only in cortex-a57 on which performance benefits were verified. llvm-svn: 252316
*	Bring r252305 back with a test fix.	Rafael Espindola	2015-11-06	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	We now create the .eh_frame section early, just like every other special section. This means that the special flags are visible in code that explicitly asks for ".eh_frame". llvm-svn: 252313
*	Use SHT_X86_64_UNWIND on every OS.	Rafael Espindola	2015-11-06	19	-19/+19
\| \| \| \| \| \| \|	That is the ABI required type. Linkers still check the section name, so everything should still work. llvm-svn: 252300
*	[mips][ias] Range check uimm4 operands and fixed a bug this revealed.	Daniel Sanders	2015-11-06	3	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The bug was that the sldi instructions have immediate widths dependent on their element size. So sldi.d has a 1-bit immediate and sldi.b has a 4-bit immediate. All of these were using 4-bit immediates previously. Reviewers: vkalintiris Subscribers: llvm-commits, atanasyan, dsanders Differential Revision: http://reviews.llvm.org/D14018 llvm-svn: 252297
*	[mips][ias] Range check uimm3 operands.	Daniel Sanders	2015-11-06	4	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Reviewers: vkalintiris Subscribers: atanasyan, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14016 llvm-svn: 252296
*	[mips][ias] Range check uimm2 operands and fix a bug this revealed.	Daniel Sanders	2015-11-06	14	-12/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The bug was that the MIPS32R6/MIPS64R6/microMIPS32R6 versions of LSA and DLSA (unlike the MSA version) failed to account for the off-by-one encoding of the immediate. The range is actually 1..4 rather than 0..3. Reviewers: vkalintiris Subscribers: atanasyan, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14015 llvm-svn: 252295
*	[mips][ias] Range check uimmz operands.	Daniel Sanders	2015-11-06	2	-0/+22
\| \| \| \| \| \| \| \| \| \|	Reviewers: vkalintiris Subscribers: dsanders, atanasyan, llvm-commits Differential Revision: http://reviews.llvm.org/D14013 llvm-svn: 252294
*	[mips] Define patterns for the atomic_{load,store}_{8,16,32,64} nodes.	Vasileios Kalintiris	2015-11-06	4	-32/+110
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Without these patterns we would generate a complete LL/SC sequence. This would be problematic for memory regions marked as WRITE-only or READ-only, as the instructions LL/SC would read/write to the protected memory regions correspondingly. Reviewers: dsanders Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D14397 llvm-svn: 252293
*	AMDGPU/SI: Emit HSA kernels with symbol type STT_AMDGPU_HSA_KERNEL	Tom Stellard	2015-11-06	2	-4/+24
\| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13804 llvm-svn: 252291
*	Add a new attribute: norecurse	James Molloy	2015-11-06	8	-12/+21
\| \| \| \| \| \|	This attribute allows the compiler to assume that the function never recurses into itself, either directly or indirectly (transitively). This can be used among other things to demote global variables to locals. llvm-svn: 252282
*	Revert r252249 (and r252255, r252258), "[WinEH] Clone funclets with multiple ↵	NAKAMURA Takumi	2015-11-06	4	-1690/+10
\| \| \| \| \| \| \| \|	parents" It behaved flaky due to iterating pointer key values on std::set and std::map. llvm-svn: 252279
*	Temporarily disable flaky checks in wineh-multi-parent-cloning.	Andrew Kaylor	2015-11-06	1	-4/+8
\| \| \| \|	llvm-svn: 252258
*	[WinEH] Clone funclets with multiple parents	Andrew Kaylor	2015-11-06	4	-10/+1686
\| \| \| \| \| \| \| \| \| \|	Windows EH funclets need to always return to a single parent funclet. However, it is possible for earlier optimizations to combine funclets (probably based on one funclet having an unreachable terminator) in such a way that this condition is violated. These changes add code to the WinEHPrepare pass to detect situations where a funclet has multiple parents and clone such funclets, fixing up the unwind and catch return edges so that each copy of the funclet returns to the correct parent funclet. Differential Revision: http://reviews.llvm.org/D13274?id=39098 llvm-svn: 252249
*	[bugpoint] Add a named metadata (+their operands) reducer	Keno Fischer	2015-11-06	1	-0/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We frequently run bugpoint on a linked module that consists of all modules we create while jitting the julia standard library. This module has a very large number of compile units (10000+) in `llvm.dbg.cu`, which didn't get reduced at all, requiring manual post processing. This is an attempt to have bugpoint go through and attempt to reduce the number of global named metadata nodes as well as their operands, to cut down the number of roots for such metadata. Reviewers: dexonsmith, reames, pete Subscribers: pete, dexonsmith, reames, llvm-commits Differential Revision: http://reviews.llvm.org/D14043 llvm-svn: 252247
*	Re-apply r251050 with a fix for PR25421	Sanjoy Das	2015-11-05	2	-0/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The bug: I missed adding break statements in the switch / case. Original commit message: [SCEV] Teach SCEV some axioms about non-wrapping arithmetic Summary: - A s< (A + C)<nsw> if C > 0 - A s<= (A + C)<nsw> if C >= 0 - (A + C)<nsw> s< A if C < 0 - (A + C)<nsw> s<= A if C <= 0 Right now `C` needs to be a constant, but we can later generalize it to be a non-constant if needed. Reviewers: atrick, hfinkel, reames, nlewycky Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13686 llvm-svn: 252236
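The four axioms listed in the original commit message can be sanity-checked by brute force at a small bit width. The following is an illustrative sketch only: `nsw_add` and `axioms_hold` are hypothetical helpers, and the model treats an add that would overflow the signed range as "no result", which is how this sketch stands in for the `<nsw>` flag.

```python
from typing import Optional


def nsw_add(a: int, c: int, bits: int = 8) -> Optional[int]:
    """Return a + c if it fits in a signed `bits`-wide integer, else None."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    total = a + c
    return total if lo <= total <= hi else None


def axioms_hold(bits: int = 6) -> bool:
    """Check all four signed-comparison axioms for every (A, C) pair."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    for a in range(lo, hi + 1):
        for c in range(lo, hi + 1):
            s = nsw_add(a, c, bits)
            if s is None:
                continue  # signed wrap: the <nsw> premise does not apply
            if c > 0 and not (a < s):
                return False   # A s< (A + C)<nsw> if C > 0
            if c >= 0 and not (a <= s):
                return False   # A s<= (A + C)<nsw> if C >= 0
            if c < 0 and not (s < a):
                return False   # (A + C)<nsw> s< A if C < 0
            if c <= 0 and not (s <= a):
                return False   # (A + C)<nsw> s<= A if C <= 0
    return True
```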