bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	Revert r267210, it makes clang assert (PR27490).	Nico Weber	2016-04-22	1	-39/+0
\| \| \| \|	llvm-svn: 267232
*	Re-commit optimization bisect support (r267022) without new pass manager support.	Andrew Kaylor	2016-04-22	2	-0/+187
\| \| \| \| \| \| \| \| \| \|	The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling). Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267231
*	Differential Revision: http://reviews.llvm.org/D19040	Sriraman Tallam	2016-04-22	2	-4/+123
\| \| \| \|	llvm-svn: 267229
*	llvm-symbolizer: prefer .dwo contents over fission-gmlt-like-data when .dwo file is present	David Blaikie	2016-04-22	5	-14/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rather than relying on the gmlt-like data emitted into the .o/executable which only contains the simple name of any inlined functions, use the .dwo file if present. Test symbolication with/without a .dwo, and the old test that was testing behavior when no gmlt-like data was present. (I haven't included a test of non-gmlt-like data + no .dwo (that would be akin to symbolication with no debug info) but we could add one for completeness) The test was simplified a bit to be a little clearer (unoptimized, force inline, using a function call as the inlined entity) and regenerated with ToT clang. For the no-gmlt-like-data case, I modified Clang back to its old behavior temporarily & the .dwo file is identical so it is shared between the two executables. llvm-svn: 267227
*	Introduce llvm.load.relative intrinsic.	Peter Collingbourne	2016-04-22	4	-1/+121
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that value and returns it. The constant folder specifically recognizes the form of this intrinsic and the constant initializers it may load from; if a loaded constant initializer is known to have the form ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``. LLVM provides that the calculation of such a constant initializer will not overflow at link time under the medium code model if ``x`` is an ``unnamed_addr`` function. However, it does not provide this guarantee for a constant initializer folded into a function body. This intrinsic can be used to avoid the possibility of overflows when loading from such a constant. Differential Revision: http://reviews.llvm.org/D18367 llvm-svn: 267223
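As a rough illustration of the semantics described in the commit message above, the following Python sketch models memory as a byte buffer. The function name `load_relative`, the little-endian byte order, and the signed interpretation of the loaded 32-bit value are assumptions made for this example; the commit message itself only says that a 32-bit value is loaded and ``%ptr`` is added to it.

```python
import struct

def load_relative(memory: bytes, ptr: int, offset: int) -> int:
    """Toy model: load a 32-bit value at ptr + offset, then add ptr to it."""
    # Assumption: the stored value is a signed, little-endian 32-bit integer.
    (relative,) = struct.unpack_from("<i", memory, ptr + offset)
    return ptr + relative

# Build a small table at address 16 whose entries hold (target - table_base),
# mirroring the ``x - %ptr`` form of initializer mentioned in the commit.
table_base = 16
targets = [100, 4, 64]
memory = bytearray(64)
for index, target in enumerate(targets):
    struct.pack_into("<i", memory, table_base + 4 * index, target - table_base)

# Resolving each entry should give back the original absolute targets.
resolved = [load_relative(bytes(memory), table_base, 4 * i) for i in range(len(targets))]
```

Storing offsets relative to the table base is what lets such a table stay position-independent: only the base address needs to be known at run time.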
*	DAGCombiner: Relax alignment restriction when changing store type	Matt Arsenault	2016-04-22	2	-1/+54
\| \| \| \| \| \|	If the target allows the alignment, this should be OK. llvm-svn: 267217
*	[unordered] sink unordered stores at end of blocks	Philip Reames	2016-04-22	1	-0/+34
\| \| \| \| \| \|	The existing code turned out to be completely correct when audited. Thus, only minor code changes and adding a couple of tests. llvm-svn: 267215
*	Fold compares for distinct allocations	Sanjoy Das	2016-04-22	1	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We can fold compares to false when two distinct allocations within a function are compared for equality. Patch by Anna Thomas! Reviewers: majnemer, reames, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19390 llvm-svn: 267214
*	CodeGen: Use PLT relocations for relative references to unnamed_addr functions.	Peter Collingbourne	2016-04-22	5	-5/+59
\| \| \| \| \| \| \| \| \| \| \| \| \|	The relative vtable ABI (PR26723) needs PLT relocations to refer to virtual functions defined in other DSOs. The unnamed_addr attribute means that the function's address is not significant, so we're allowed to substitute it with the address of a PLT entry. Also includes a bonus feature: addends for COFF image-relative references. Differential Revision: http://reviews.llvm.org/D17938 llvm-svn: 267211
*	[unordered] Extend load/store type canonicalization to handle unordered operations	Philip Reames	2016-04-22	1	-0/+39
\| \| \| \| \| \| \| \|	Extend the type canonicalization logic to work for unordered atomic loads and stores. Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before. Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered. If you see problems, feel free to revert this change, but please make sure you collect a test case. llvm-svn: 267210
*	DAGCombiner: Relax alignment restriction when changing load type	Matt Arsenault	2016-04-22	4	-5/+43
\| \| \| \| \| \|	If the target allows the alignment, this should still be OK. llvm-svn: 267209
*	[AArch64] Fix optimizeCondBranch logic.	Quentin Colombet	2016-04-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	The opcode for the optimized branch does not depend on the size of the activate bits in the AND masks, but the AND opcode itself. Indeed, we need to use a X or W variant based on the AND variant not based on whether the mask fits into the related variant. Otherwise, we may end up using the W variant of the optimized branch for 64-bit register inputs! This fixes the last make check verifier issues for AArch64: PR27479. llvm-svn: 267206
*	PM: Port SinkingPass to the new pass manager	Justin Bogner	2016-04-22	2	-1/+1
\| \| \| \|	llvm-svn: 267199
*	[DeadStoreElimination] Shorten beginning of memset overwritten by later stores	Jun Bum Lim	2016-04-22	1	-0/+90
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change will shorten memset if the beginning of memset is overwritten by later stores. Reviewers: hfinkel, eeckstein, dberlin, mcrosier Subscribers: mgrang, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18906 llvm-svn: 267197
*	PM: Port DCE to the new pass manager	Justin Bogner	2016-04-22	1	-0/+11
\| \| \| \| \| \| \|	Also add a very basic test, since apparently there aren't any tests for DCE whatsoever to add the new pass version to. llvm-svn: 267196
*	MachineScheduler: Limit the size of the ready list.	Matthias Braun	2016-04-22	1	-0/+1
\| \| \| \| \| \| \| \| \|	Avoid quadratic complexity in unusually large basic blocks by limiting the size of the ready lists. Differential Revision: http://reviews.llvm.org/D19349 llvm-svn: 267189
*	[AArch64] When creating MRS instruction, make sure the destination register is declared as a definition.	Quentin Colombet	2016-04-22	1	-1/+1
\| \| \| \| \| \| \| \|	This fixes the machine verifier error for CodeGen/AArch64/nzcv-save.ll. llvm-svn: 267185
*	[LoopVersioningLICM] Add test coverage for llvm.loop.licm_versioning.disable	Adam Nemet	2016-04-22	1	-0/+104
\| \| \| \| \| \| \| \|	In the next change, I am generalizing the function findStringMetadataForLoop and I want to make sure I don't break this. Looks like there was no coverage for this so far. llvm-svn: 267182
*	[AArch64][AdvSIMDScalar] Update the kill flags correctly.	Quentin Colombet	2016-04-22	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We used to simply set the kill flags to true when transforming a scalar instruction to a vector one. SrcScalar1 = copy SrcVector1 ... = opScalar SrcScalar1 => SrcScalar1 = copy SrcVector1 ... = opVector SrcVector1<kill> This is obviously wrong. The proper update consists in: 1. Propagate the kill status from the copy to the new opVector 2. Reset the kill status on the copy, since the live-range of SrcVector1 got extended. This fixes some of the machine verifier errors for AArch64 with make check. llvm-svn: 267180
*	test: split test into two runs	Saleem Abdulrasool	2016-04-22	1	-8/+9
\| \| \| \| \| \| \| \| \| \|	Rather than checking both stdout and stderr simultaneously, split it into two tests. This apparently breaks on Windows where MSVCRT does not buffer output correctly. NFC. Thanks to chapuni for bringing the issue to my attention! llvm-svn: 267179
*	[SimplifyCFG] Add final missing implications to isImpliedTrueByMatchingCmp.	Chad Rosier	2016-04-22	1	-8/+8
\| \| \| \| \| \| \| \| \|	Summary: eq implies that [u|s]ge and [u|s]le are true. Remove redundant logic by implementing isImpliedFalseByMatchingCmp(Pred1, Pred2) as isImpliedTrueByMatchingCmp(Pred1, getInversePredicate(Pred2)). llvm-svn: 267177
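The relationship between the two helpers named in the commit message above can be sketched in Python as follows. This is not the LLVM implementation: the table covers only the implications named in r267171 and r267177 plus the trivial case of identical predicates, and every Python name here is hypothetical. The brute-force check over 3-bit operands is just a way to confirm the table is sound.

```python
from itertools import product

WIDTH = 3  # tiny bit width so every operand pair can be checked exhaustively

def signed(v):
    """Reinterpret a WIDTH-bit unsigned pattern as two's complement."""
    return v - (1 << WIDTH) if v >= (1 << (WIDTH - 1)) else v

EVAL = {
    "eq": lambda a, b: a == b,
    "ne": lambda a, b: a != b,
    "ugt": lambda a, b: a > b,
    "uge": lambda a, b: a >= b,
    "ult": lambda a, b: a < b,
    "ule": lambda a, b: a <= b,
    "sgt": lambda a, b: signed(a) > signed(b),
    "sge": lambda a, b: signed(a) >= signed(b),
    "slt": lambda a, b: signed(a) < signed(b),
    "sle": lambda a, b: signed(a) <= signed(b),
}

INVERSE = {
    "eq": "ne", "ne": "eq",
    "ugt": "ule", "ule": "ugt", "uge": "ult", "ult": "uge",
    "sgt": "sle", "sle": "sgt", "sge": "slt", "slt": "sge",
}

# Only the implications spelled out in the two commit messages.
IMPLIES = {
    "eq": {"uge", "ule", "sge", "sle"},
    "ugt": {"uge"}, "ult": {"ule"},
    "sgt": {"sge"}, "slt": {"sle"},
}

def is_implied_true(pred1, pred2):
    """True if 'a pred1 b' holding guarantees that 'a pred2 b' holds."""
    return pred1 == pred2 or pred2 in IMPLIES.get(pred1, ())

def is_implied_false(pred1, pred2):
    """Defined via the inverse predicate, as the commit message describes."""
    return is_implied_true(pred1, INVERSE[pred2])

def table_is_sound():
    """Check every claimed implication against all WIDTH-bit operand pairs."""
    values = range(1 << WIDTH)
    for pred1, pred2 in product(EVAL, repeat=2):
        for a, b in product(values, repeat=2):
            if not EVAL[pred1](a, b):
                continue
            if is_implied_true(pred1, pred2) and not EVAL[pred2](a, b):
                return False
            if is_implied_false(pred1, pred2) and EVAL[pred2](a, b):
                return False
    return True
```

Expressing the "implied false" query through the inverse predicate means only one table has to be maintained, which is the redundancy the commit removes.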
*	Have isKnownNotFullPoison be smarter around control flow	Sanjoy Das	2016-04-22	1	-0/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: (... while still not using a PostDomTree) The way we use isKnownNotFullPoison from SCEV today, the new CFG walking logic will not trigger for any realistic cases -- it will kick in only for situations where we could have merged the contiguous basic blocks anyway[0], since the poison generating instruction dominates all of its non-PHI uses (which are the only uses we consider right now). However, having this change in place will allow a later bugfix to break fewer llvm-lit tests. [0]: i.e. cases where block A branches to block B and B is A's only successor and A is B's only predecessor. Reviewers: broune, bjarke.roune Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19212 llvm-svn: 267175
*	[Hexagon] Properly close live range in HexagonBlockRanges ---add testcase	Krzysztof Parzyszek	2016-04-22	1	-0/+55
\| \| \| \|	llvm-svn: 267174
*	[SimplifyCFG] Add missing implications to isImpliedTrueByMatchingCmp.	Chad Rosier	2016-04-22	1	-0/+1005
\| \| \| \| \| \| \| \| \|	Summary: [u|s]gt and [u|s]lt imply [u|s]ge and [u|s]le are true, respectively. I've simplified the existing tests and added additional tests to cover the new cases mentioned above. I've also added tests for all the cases where the first compare doesn't imply anything about the second compare. llvm-svn: 267171
*	[SimplifyCFG] Simplify code review by temporarily removing this test file.	Chad Rosier	2016-04-22	1	-478/+0
\| \| \| \| \| \| \|	A followup commit will replace these tests with simplified and more inclusive tests. The diff is unreadable if this were to be done in a single commit. llvm-svn: 267170
*	[AMDGPU] Insert nop pass: take care of outstanding feedback	Konstantin Zhuravlyov	2016-04-22	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \|	- Switch few loops to range-based for loops - Fix nop insertion at the end of BB - Fix formatting - Check for endpgm Differential Revision: http://reviews.llvm.org/D19380 llvm-svn: 267167
*	[mips][microMIPS] Revert commit r266861.	Zoran Jovanovic	2016-04-22	6	-26/+7
\| \| \| \| \| \|	Commit r266861 was the reason for failing tests in LLVM test suite. llvm-svn: 267166
*	[Hexagon] Teach mux expansion how to deal with undef predicates	Krzysztof Parzyszek	2016-04-22	1	-0/+22
\| \| \| \|	llvm-svn: 267165
*	[Hexagon] Add definitions for trap/pause instructions	Krzysztof Parzyszek	2016-04-22	1	-0/+36
\| \| \| \| \| \|	Also add tests for other instructions from HexagonSystemInst.td. llvm-svn: 267162
*	[EarlyCSE] Don't add the overflow flags to the hash	David Majnemer	2016-04-22	1	-3/+2
\| \| \| \| \| \| \| \|	We take the intersection of overflow flags while CSE'ing. This permits us to consider two instructions with different overflow behavior to be replaceable. llvm-svn: 267153
*	Emit code16 in assembly in 16-bit mode	Nirav Dave	2016-04-22	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When generating assembly using -m16 we must explicitly mark it as 16-bit. Emit .code16 at beginning of file. Fixes wrong results when using -fno-integrated-as. Reviewers: dwmw2 Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19392 llvm-svn: 267152
*	[mips] Fix select patterns for MIPS64	Simon Dardis	2016-04-22	1	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When targeting MIPS64R6 some of the patterns for select were guarded by a broken predicate. The predicate was supposed to test if a constant value could fit in a 16 bit zero-extended field. Instead the value was tested to fit in a 16 bit sign-extended field. For negative constants of native word width this resulted in wrong code generation. Reviewers: vkalintiris, dsanders Differential Review: http://reviews.llvm.org/D19378 llvm-svn: 267151
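The difference between the two range checks described above can be illustrated with a small Python sketch. The function names are hypothetical and the code does not reproduce the actual TableGen predicate; it only shows why a negative constant passes a sign-extended check yet cannot be represented in a zero-extended 16-bit field.

```python
def fits_uimm16(value: int) -> bool:
    """Value is representable in a 16-bit zero-extended immediate field."""
    return 0 <= value <= 0xFFFF

def fits_simm16(value: int) -> bool:
    """Value is representable in a 16-bit sign-extended immediate field."""
    return -0x8000 <= value <= 0x7FFF

def zero_extend16(value: int) -> int:
    """What a zero-extending instruction would see after encoding the low 16 bits."""
    return value & 0xFFFF

# -1 passes the signed check, but once its low 16 bits are zero-extended
# the instruction would operate on 65535 instead of -1.
negative = -1
passes_wrong_check = fits_simm16(negative)
passes_right_check = fits_uimm16(negative)
decoded = zero_extend16(negative)
```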
*	Revert r267049, r26706[16789], r267071 - Refactor raw pdb dumper into library	Daniel Sanders	2016-04-22	1	-1/+1
\| \| \| \| \| \|	r267049 broke multiple buildbots (e.g. clang-cmake-mips, and clang-x86_64-linux-selfhost-modules) which the follow-ups have not yet resolved and this is preventing subsequent committers from being notified about additional failures on the affected buildbots. llvm-svn: 267148
*	AMDGPU/SI: Add test missed in rL266865	Nikolay Haustov	2016-04-22	1	-0/+55
\| \| \| \|	llvm-svn: 267144
*	[InstCombine] Preserve fast math flags when combining PHIs	Silviu Baranga	2016-04-22	1	-0/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When optimizing PHIs which have inputs floating point binary operators, we preserve all IR flags except the fast math flags. This change removes the logic which tracked some of the IR flags (no wrap, exact) and replaces it by doing an and on the IR flags of all inputs to the PHI - which will also handle the fast math flags. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19370 llvm-svn: 267139
*	[mips][microMIPS] Implement SLT, SLTI, SLTIU, SLTU microMIPS32r6 instructions	Hrvoje Varga	2016-04-22	6	-2/+18
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D19354 llvm-svn: 267137
*	[mips][microMIPS] Add R_MICROMIPS_PC18_S3 relocation	Zoran Jovanovic	2016-04-22	2	-0/+9
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D15026 llvm-svn: 267130
*	Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64	Daniel Sanders	2016-04-22	2	-264/+0
\| \| \| \| \| \|	It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others. llvm-svn: 267127
*	[X86]: Changing cost for “TRUNCATE v16i32 to v16i8” in SSE4.1 mode.	Ashutosh Nema	2016-04-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: rL256194 transforms truncations between vectors of integers into PACKUS/PACKSS operations during DAG combine. This generates better code for truncate, so the cost of truncate needs to be changed, but it looks like it was changed only in the SSE2 table. This change is also applicable for SSE4.1, so the cost of truncate needs to be changed for that as well. The cost of “TRUNCATE v16i32 to v16i8” & “TRUNCATE v16i16 to v16i8” should be the same in the SSE4.1 & SSE2 tables. Removing their cost from SSE4.1, so it will fall back to SSE2. Reviewers: Simon Pilgrim llvm-svn: 267123
*	Revert "Initial implementation of optimization bisect support."	Vedant Kumar	2016-04-22	3	-296/+0
\| \| \| \| \| \| \| \|	This reverts commit r267022, due to an ASan failure: http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549 llvm-svn: 267115
*	[mips][microMIPS] Implement DVP, EVP and JALRC.HB instructions	Zlatko Buljan	2016-04-22	6	-0/+36
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D18687 llvm-svn: 267114
*	[GVN] Respect fast-math-flags on fcmps	David Majnemer	2016-04-22	1	-0/+18
\| \| \| \| \| \| \|	We assumed that flags were only present on binary operators. This is not true, they may also be present on calls and fcmps. llvm-svn: 267113
*	[EarlyCSE] Take the intersection of flags on instructions	David Majnemer	2016-04-22	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \| \| \|	EarlyCSE had inconsistent behavior with regards to flag'd instructions: - In some cases, it would pessimize if the available instruction had different flags by not performing CSE. - In other cases, it would miscompile if it replaced an instruction which had no flags with an instruction which has flags. Fix this by being more consistent with our flag handling by utilizing andIRFlags. llvm-svn: 267111
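The flag handling described above can be pictured with a minimal Python sketch. It models instruction flags as plain sets and is not the LLVM `andIRFlags` implementation; the function name `intersect_flags` and the sample flag sets are assumptions for illustration.

```python
def intersect_flags(available: set, redundant: set) -> frozenset:
    """Flags that remain valid on the surviving instruction after CSE.

    The surviving instruction stands in for both originals, so it may only
    keep a flag that both of them carried.
    """
    return frozenset(available) & frozenset(redundant)

# 'add nsw nuw' stands in for a later 'add nsw': only nsw can be kept.
kept_partial = intersect_flags({"nsw", "nuw"}, {"nsw"})

# If the replaced instruction had no flags, the survivor must drop all of its
# flags; keeping them is the miscompile case the commit mentions.
kept_none = intersect_flags({"nsw", "nuw"}, set())
```

Taking the intersection also removes the pessimizing case: two instructions that differ only in flags can still be merged, since the result is valid for both.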
*	AMDGPU/SI: add llvm.amdgcn.ps.live intrinsic	Nicolai Haehnle	2016-04-22	1	-0/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This intrinsic returns true if the current thread belongs to a live pixel and false if it belongs to a pixel that we are executing only for derivative computation. It will be used by Mesa to implement gl_HelperInvocation. Note that for pixels that are killed during the shader, this implementation also returns true, but it doesn't matter because those pixels are always disabled in the EXEC mask. This unearthed a corner case in the instruction verifier, which complained about a v_cndmask 0, 1, exec, exec<imp-use> instruction. That's stupid but correct code, so make the verifier accept it as such. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19191 llvm-svn: 267102
*	[AVX512] Teach lowering to use vplzcntd/q to implement 128/256-bit CTTZ_ZERO_UNDEF even without VLX support.	Craig Topper	2016-04-22	3	-2/+317
\| \| \| \| \| \|	We can just extend to 512-bits and extract like we do for CTLZ. llvm-svn: 267100
*	[MachineCombiner] Support for floating-point FMA on ARM64	Gerolf Hoflehner	2016-04-22	2	-0/+264
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are: - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64. - It gives preference to throughput over latency: the heuristic used is to combine always in loops. The target decides whether the machine combiner should optimize for throughput or latency. - Support for fmadd, f(n)msub, fmla, fmls patterns - On by default at O3 ffast-math llvm-svn: 267098
*	Try to fix UNRESOLVED: LLVM :: CodeGen/AArch64/arm64-regress-opt-cmp.s on bots.	Nico Weber	2016-04-22	1	-0/+1
\| \| \| \| \| \| \| \|	This test used to write a .s file until r266971 fixed that. But on most bots, the .s file still exists. Add an rm statement to clean up the bots. In a few days, this statement can go away again. llvm-svn: 267095
*	ARM: fix test for Windows division	Saleem Abdulrasool	2016-04-22	1	-4/+4
\| \| \| \| \| \| \|	This was meant to be part of SVN r267080. cbz cannot use a high register, which would be silently truncated. This has now been fixed. llvm-svn: 267092
*	[WebAssembly] Limit alignment hints to natural alignment.	Dan Gohman	2016-04-21	3	-17/+21
\| \| \| \| \| \|	This follows the current binary format rules. llvm-svn: 267082
*	ARM: restrict register class for WIN__DBZCHK	Saleem Abdulrasool	2016-04-21	1	-0/+47
\| \| \| \| \| \| \| \| \| \| \|	WIN__DBZCHK will insert a CBZ instruction into the stream. This instruction reserves 3 bits for the condition register (rn). As such, we must ensure that we restrict the register to a low register. Use the tGPR class instead of GPR to ensure that this is properly constrained. In debug builds, we would attempt to use lr as a condition register which would silently get truncated with no hint that the register selection was incorrect. llvm-svn: 267080
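The silent truncation mentioned above follows directly from the 3-bit register field. The Python sketch below assumes the usual ARM numbering (r0 to r15, with lr as r14); the helper names are hypothetical and the code models only the register field, not the full CBZ encoding.

```python
LR = 14  # link register, r14 in the usual ARM numbering

def encode_rn_field(register: int) -> int:
    """Keep only the 3 bits that the CBZ encoding reserves for the register."""
    return register & 0b111

def is_low_register(register: int) -> bool:
    """Low registers r0-r7 are the only ones a 3-bit field can name."""
    return 0 <= register <= 7

# lr does not fit: its number is silently truncated to that of r6.
truncated = encode_rn_field(LR)
lr_is_allowed = is_low_register(LR)
```

Restricting the operand to a low-register class, as the commit does with tGPR, makes the register allocator rule out such operands instead of letting the encoder drop bits.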