bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	R600: Use optimized 24bit path in udivrem	Jan Vesely	2014-08-12	1	-0/+244
\| \| \| \| \| \| \| \| \|	v2: drop enum keyword; use correct extension mode; don't bother computing the sign in the unsigned case. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 215462
*	R600: Use i24 optimized path for SREM	Jan Vesely	2014-08-12	1	-0/+118
\| \| \| \| \| \| \| \| \|	v2: add tests; rename LowerSDIV24 to LowerSDIVREM24; handle the rem part in this function. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 215460
*	Don't upgrade global constructors when reading bitcode	Duncan P. N. Exon Smith	2014-08-12	3	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	An optional third field was added to `llvm.global_ctors` (and `llvm.global_dtors`) in r209015. Most of the code has been changed to deal with both versions of the variables. Users of the C API might create either version, the helper functions in LLVM create the two-field version, and clang now creates the three-field version. However, the BitcodeReader was changed to always upgrade to the three-field version. This created an unnecessary inconsistency in the IR before/after serializing to bitcode. This commit resolves the inconsistency by making the third field truly optional (and not upgrading in the bitcode reader). Since `llvm-link` was relying on this upgrade code, rather than deleting it I've moved it into `ModuleLinker`, where it upgrades these arrays as necessary to resolve inconsistencies between modules. The ideal resolution would be to remove the 2-field version and make the third field required. I filed PR20506 to track that. I changed `test/Bitcode/upgrade-global-ctors.ll` to a negative test and duplicated the `llvm-link` check in `test/Linker/global_ctors.ll` to check both upgrade directions. Since I came across this as part of PR5680 (serializing use-list order), I've also added the missing `verify-uselistorder` RUN line to `test/Bitcode/metadata-2.ll`. llvm-svn: 215457
*	Make the test a bit more strict.	Rafael Espindola	2014-08-12	1	-2/+2
\| \| \| \| \| \| \|	Before it would pass even if @b or @c ended up pointing to a variable named @a123. llvm-svn: 215450
*	Add a plugin testcase for merging weak variables.	Rafael Espindola	2014-08-12	2	-0/+18
\| \| \| \| \| \| \| \| \|	I initially thought I could implement COMDATs with aliases by just internalizing GVs instead of dropping them. This is a counter example: Internalizing one of the @a would make @b and @c point to different variables. llvm-svn: 215447
*	llvm/test/TableGen/Foreach.td: Remove XFAIL:vg_leak. They have not been ↵	NAKAMURA Takumi	2014-08-12	4	-4/+0
\| \| \| \| \| \|	failing since r215176. llvm-svn: 215445
*	llvm-objdump: print contents of MachO __unwind_info sections	Tim Northover	2014-08-12	4	-0/+57
\| \| \| \|	llvm-svn: 215437
*	[MachineCombiner] Fix for ICE bug 20598	Gerolf Hoflehner	2014-08-12	1	-0/+106
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The combiner ignored DBG nodes when checking the uses of a virtual register. It combined a sequence like %vreg1 = madd %vreg2, %vreg3,... DBG_VALUE (%vreg1 ...) %vreg4 = add %vreg1,... to %vreg4 = madd %vreg2, %vreg3 leaving behind a dangling DBG_VALUE with a definition. This triggered an assertion in the MachineTraceMetrics.cpp module. llvm-svn: 215431
*	DebugLocEntry: Restore the comparison predicate from before the	Adrian Prantl	2014-08-12	1	-0/+1
\| \| \| \| \| \| \| \| \|	refactoring in 215384. This way it can unique multiple entries describing the same piece even if they don't have the exact same location. (The same piece may get merged in and be added from OpenRanges). There ought to be a more elegant solution for this, though. llvm-svn: 215418
*	msan: Handle musttail calls	Reid Kleckner	2014-08-12	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	First, avoid calling setTailCall(false) on musttail calls. The function prototypes should be "congruent", so the shadow layout should be exactly the same. Second, avoid inserting instrumentation after a musttail call to propagate the return value shadow. We don't need to propagate the result of a tail call, it should already be in the right place. Reviewed By: eugenis Differential Revision: http://reviews.llvm.org/D4331 llvm-svn: 215415
*	[x86] Fold extract_vector_elt of a load into the Load's address computation.	Michael J. Spencer	2014-08-11	1	-1/+19
\| \| \| \|	llvm-svn: 215409
*	InstCombine: Combine (add (and %a, %b) (or %a, %b)) to (add %a, %b)	David Majnemer	2014-08-11	1	-0/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	What follows below is a correctness proof of the transform using CVC3. $ < t.cvc A, B : BITVECTOR(32); QUERY BVPLUS(32, A & B, A \| B) = BVPLUS(32, A, B); $ cvc3 < t.cvc Valid. llvm-svn: 215400
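The identity behind this transform can also be checked by brute force. The sketch below is not part of the commit; the function names are hypothetical, and it only models fixed-width wraparound arithmetic in Python. It relies on the observation that, at every bit position, `a & b` and `a | b` together contain exactly as many set bits as `a` and `b` do, so the two sums agree.

```python
import random

MASK32 = 0xFFFFFFFF  # model 32-bit wraparound, matching BITVECTOR(32) in the proof


def add_and_or(a, b, mask=MASK32):
    """Compute (a & b) + (a | b), truncated to the width implied by mask."""
    return ((a & b) + (a | b)) & mask


def plain_add(a, b, mask=MASK32):
    """Compute a + b, truncated to the width implied by mask."""
    return (a + b) & mask


def check_exhaustive(bits=8):
    """Compare both forms for every pair of values at a small bit width."""
    mask = (1 << bits) - 1
    return all(
        add_and_or(a, b, mask) == plain_add(a, b, mask)
        for a in range(1 << bits)
        for b in range(1 << bits)
    )


def check_random(samples=10000, seed=0):
    """Compare both forms on random 32-bit inputs (a spot check, not a proof)."""
    rng = random.Random(seed)
    for _ in range(samples):
        a = rng.getrandbits(32)
        b = rng.getrandbits(32)
        if add_and_or(a, b) != plain_add(a, b):
            return False
    return True
```

The exhaustive check covers all 8-bit pairs; the random check is only a sanity test at 32 bits and does not replace the solver result quoted in the commit message.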
*	R600/SI: Add a ComplexPattern for selecting MUBUF _OFFSET variant	Tom Stellard	2014-08-11	7	-22/+37
\| \| \| \| \| \| \|	This saves us from having to copy a 64-bit 0 value into VGPRs for BUFFER_* instructions, which only have a 12-bit immediate offset. llvm-svn: 215399
*	R600/SI: Add check for low 32 bits of encoding to mubuf tests	Tom Stellard	2014-08-11	1	-7/+7
\| \| \| \| \| \| \| \|	There are no variable values like registers encoded in the low 32 bits of MUBUF instructions, so it is relatively easy to check these bits, and it will help prevent us from introducing encoding bugs. llvm-svn: 215397
*	R600/SI: Clear lds bit on MUBUF instructions used for private stores	Tom Stellard	2014-08-11	1	-10/+9
\| \| \| \| \| \| \| \|	This bit was left uninitialized, which was causing some random failures of piglit tests. NOTE: This is a candidate for the 3.5 branch. llvm-svn: 215396
*	R600/SI: Fix broken test	Tom Stellard	2014-08-11	1	-3/+5
\| \| \| \|	llvm-svn: 215395
*	[AArch64] Fix registerAllocator assigns same register for base and wback in	Quentin Colombet	2014-08-11	1	-0/+12
\| \| \| \| \| \| \| \|	pre/post-index load and store. Patch by Steven Wu <stevenwu@apple.com> llvm-svn: 215390
*	ARM: try harder to detect non-IT eligible instructions	Saleem Abdulrasool	2014-08-11	1	-8/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For many Thumb-1 register-register instructions, setting the CPSR is not permitted inside an IT block. We would not correctly flag those instructions. The previous change to identify this scenario was insufficient as it did not actually catch all the instances. The current list is formed by manual inspection of the ARMv6M ARM. The change to the Thumb2 IT block test is due to the fact that the new more stringent checking of the MIs results in the If Conversion pass being prevented from executing (since not all the instructions in the BB are predicable). This results in code gen changes. Thanks to Tim Northover for pointing out that the previous patch was insufficient and hinting that the v6M ARM would be much easier to use than the v7 or v8! llvm-svn: 215382
*	Fix using -plugin-opt=apiflie when also using -plugin-opt=emit-llvm.	Rafael Espindola	2014-08-11	1	-0/+8
\| \| \| \|	llvm-svn: 215378
*	Correct a missing RUN line in the ARM codegen test for fneg ops. We should ↵	Sanjay Patel	2014-08-11	1	-1/+4
\| \| \| \| \| \| \| \| \| \|	also explicitly specify +/-neonfp. The bug was introduced at r99570 when use of "-arm-use-neon-fp" was removed. Differential Revision: http://reviews.llvm.org/D4846 llvm-svn: 215377
*	Add missing test for r215031	Reid Kleckner	2014-08-11	1	-0/+13
\| \| \| \|	llvm-svn: 215374
*	MC: Diagnose an unexpected token in COFF .section instead of asserting	Reid Kleckner	2014-08-11	1	-0/+3
\| \| \| \| \| \| \|	This can easily arise when trying to assemble an ELF-style .section directive for a COFF object file. llvm-svn: 215373
*	Fix use of uninitialized variable.	Rafael Espindola	2014-08-11	2	-0/+20
\| \| \| \| \| \| \|	Fixes linking bitcode files that use the new style comdats for constructors with ones that don't. llvm-svn: 215364
*	Revert r215359 - [mips] Implement .ent, .end, .frame, .mask and .fmask ↵	Daniel Sanders	2014-08-11	2	-106/+0
\| \| \| \| \| \| \| \|	assembler directives It seems to cause an lld test (elf/Mips/hilo16-3.test) to fail. Reverted while we investigate. llvm-svn: 215361
*	[mips] Implement .ent, .end, .frame, .mask and .fmask assembler directives	Daniel Sanders	2014-08-11	2	-0/+106
\| \| \| \| \| \| \| \|	Patch by Matheus Almeida and Toma Tabacu Differential Revision: http://reviews.llvm.org/D4179 llvm-svn: 215359
*	AArch64: add support for dynamic-loader relocations	Tim Northover	2014-08-11	3	-0/+22
\| \| \| \| \| \| \| \| \|	LLD needs them, and it's good to be able to print them properly when our object dumpers encounter them. Patch by Daniel Stewart. llvm-svn: 215352
*	llvm-readobj: zero out timestamp in COFF auto-generated test files.	Tim Northover	2014-08-11	3	-0/+4
\| \| \| \| \| \| \| \|	The timestamp meant these files changed with each invocation of relocs.py, confusing matters when we add relocations and need to update the tests. llvm-svn: 215350
*	ARM: __gnu_h2f_ieee and __gnu_f2h_ieee always use the soft-float calling ↵	Oliver Stannard	2014-08-11	1	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \|	convention By default, LLVM uses the "C" calling convention for all runtime library functions. The half-precision FP conversion functions use the soft-float calling convention, and are needed for some targets which use the hard-float convention by default, so must have their calling convention explicitly set. llvm-svn: 215348
*	In Machine CSE pass, the source register of a COPY machine instruction can	Jiangning Liu	2014-08-11	1	-0/+45
\| \| \| \| \| \| \| \|	be propagated to all its users, and this propagation could increase the probability of finding common subexpressions. If the COPY has only one user, the COPY itself can be removed. llvm-svn: 215344
*	In LVI (Lazy Value Info), originally the value on a BB can only be calculated once,	Jiangning Liu	2014-08-11	1	-0/+33
\| \| \| \| \| \| \| \| \|	and the lattice will be updated to be a state other than "undefined". This limitation could miss some opportunities of lowering "overdefined" to an even more accurate value. So this patch asks the algorithm to try to lower the lattice value again even if the value has been lowered to be "overdefined". llvm-svn: 215343
*	Add support for scalarizing cttz_zero_undef	Petar Jovanovic	2014-08-10	1	-0/+37
\| \| \| \| \| \| \| \| \|	Follow up to r214266. Add missing case in ScalarizeVectorResult() for cttz_zero_undef. Differential Revision: http://reviews.llvm.org/D4813 llvm-svn: 215330
*	ARM: correct isPredicable for MULS in Thumb mode	Saleem Abdulrasool	2014-08-10	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ARM ARM states that CPSR may not be updated by a MUL in thumb mode. Due to an ordering of Thumb 2 Size Reduction and If Conversion, we would end up generating a THUMB MULS inside an IT block. The If Conversion pass uses the TTI isPredicable method to ensure that it can transform a Basic Block. However, because we only check for IT handling on Thumb2 functions, we may miss some cases. Even then, it only validates that the CPSR is not live rather than it is not accessed. This corrects the handling for that particular case since the same restriction does not hold on the vast majority of the instructions. This does prevent the IfConversion optimization from kicking in in certain cases, but generating correct code is more valuable. Addresses PR20555. llvm-svn: 215328
*	@l and friends adjust their value depending on the context they are used in.	Joerg Sonnenberger	2014-08-10	2	-5/+21
\| \| \| \| \| \| \|	For ori, they are unsigned, for addi, signed. Create a new target expression type to handle this and evaluate Fixups accordingly. llvm-svn: 215315
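As a rough illustration of the signed versus unsigned distinction described above, the following sketch shows how the same low 16 bits of a value read differently depending on whether the consuming instruction zero-extends its immediate (as `ori` does) or sign-extends it (as `addi` does). The helper names are hypothetical and this is not LLVM's fixup evaluation code.

```python
def lo16_unsigned(value):
    """Low 16 bits read as an unsigned immediate (the ori-style case)."""
    return value & 0xFFFF


def lo16_signed(value):
    """Low 16 bits read as a two's-complement immediate (the addi-style case)."""
    low = value & 0xFFFF
    # Bit 15 set means the 16-bit field is negative when sign-extended.
    return low - 0x10000 if low & 0x8000 else low


# Example: the low half of 0x12348765 has bit 15 set.
example = 0x12348765
unsigned_view = lo16_unsigned(example)  # 0x8765
signed_view = lo16_signed(example)      # 0x8765 - 0x10000
```

When bit 15 of the low half is clear, both readings give the same number; they diverge only when it is set, which is the case a context-dependent expression type has to handle.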
*	Allow the third argument for the subi family to be an expression.	Joerg Sonnenberger	2014-08-09	1	-3/+3
\| \| \| \|	llvm-svn: 215286
*	Update disassembler test to check the full dccci/iccci form.	Joerg Sonnenberger	2014-08-09	1	-4/+4
\| \| \| \|	llvm-svn: 215283
*	Use the full form of dccci and iccci from the early PPC 405 documents,	Joerg Sonnenberger	2014-08-09	1	-6/+12
\| \| \| \| \| \| \|	since the operands are actually used on those cores. Provide aliases for the only documented case in the newer Power ISA spec. llvm-svn: 215282
*	R600/SI: Custom lower CONCAT_VECTORS	Tom Stellard	2014-08-09	1	-2/+1
\| \| \| \| \| \| \|	This will lower them using register copies rather than loads and stores to the stack. llvm-svn: 215270
*	R600/SI: Update concat_vectors.ll to check for scratch usage	Tom Stellard	2014-08-09	1	-0/+36
\| \| \| \| \| \| \| \| \| \| \| \|	These tests were using SI-NOT: MOVREL to make sure concat vectors weren't being lowered to stack loads and stores, but we are using scratch buffers for the stack now instead of registers, so we need to add an additional SI-NOT check for scratch buffers. With this change I was able to uncover one broken test which will be fixed in a future commit. llvm-svn: 215269
*	Allow large immediates for branch instructions in 32bit mode.	Joerg Sonnenberger	2014-08-08	1	-0/+6
\| \| \| \|	llvm-svn: 215240
*	Provide an implementation of getNoopForMachoTarget for PPC, otherwise	Joerg Sonnenberger	2014-08-08	1	-2/+3
\| \| \| \| \| \|	empty functions will assert in the MC object writer. llvm-svn: 215238
*	[FastISel][X86] Fix INC/DEC optimization (r215230)	Juergen Ributzka	2014-08-08	1	-0/+47
\| \| \| \| \| \| \| \|	I accidentally also used INC/DEC for unsigned arithmetic which doesn't work, because INC/DEC don't set the required flag which is used for the overflow check. llvm-svn: 215237
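The flag behaviour behind this fix can be modelled in a few lines. On x86, ADD updates both the carry flag (CF, used for unsigned overflow) and the overflow flag (OF, used for signed overflow), while INC updates OF but leaves CF unchanged. The sketch below is a simplified 8-bit model with hypothetical function names, not the FastISel code.

```python
def add8(a, b):
    """Model an 8-bit ADD: return (result, carry_flag, overflow_flag)."""
    total = a + b
    result = total & 0xFF
    carry = total > 0xFF  # unsigned overflow
    # Signed overflow: both operands differ in sign from the result.
    overflow = ((a ^ result) & (b ^ result) & 0x80) != 0
    return result, carry, overflow


def inc8(a, carry_in):
    """Model an 8-bit INC: OF is updated, CF keeps its previous value."""
    result, _, overflow = add8(a, 1)
    return result, carry_in, overflow


# Unsigned wraparound 0xFF + 1: ADD reports it through CF, INC does not.
add_flags = add8(0xFF, 1)
inc_flags = inc8(0xFF, carry_in=False)

# Signed wraparound 0x7F + 1: both report it through OF.
signed_add_flags = add8(0x7F, 1)
signed_inc_flags = inc8(0x7F, carry_in=False)
```

In this model an unsigned overflow check that reads CF after `inc8` sees a stale value, whereas a signed check that reads OF works with either instruction.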
*	[FastISel][X86] Use INC/DEC when possible for {sadd\|ssub}.with.overflow ↵	Juergen Ributzka	2014-08-08	1	-26/+59
\| \| \| \| \| \| \| \| \| \|	intrinsics. This is a small peephole optimization to emit INC/DEC when possible. Fixes <rdar://problem/17952308>. llvm-svn: 215230
*	Add support for SPE load/store from memory.	Joerg Sonnenberger	2014-08-08	1	-0/+157
\| \| \| \|	llvm-svn: 215220
*	pr20589: Fix duplicated arch flag.	Rafael Espindola	2014-08-08	1	-0/+10
\| \| \| \|	llvm-svn: 215216
*	[mips] Invert the abicalls feature bit to be noabicalls so that it's ↵	Daniel Sanders	2014-08-08	1	-13/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	possible for -mno-abicalls to take effect. Also added the testcase that should have been in r215194. This behaviour has surprised me a few times now. The problem is that the generated MipsSubtarget::ParseSubtargetFeatures() contains code like this: if ((Bits & Mips::FeatureABICalls) != 0) IsABICalls = true; so '-abicalls' means 'leave it at the default' and '+abicalls' means 'set it to true'. In this case, (and the similar -modd-spreg case) I'd like the code to be IsABICalls = (Bits & Mips::FeatureABICalls) != 0; or possibly: if ((Bits & Mips::FeatureABICalls) != 0) IsABICalls = true; else IsABICalls = false; and preferably arrange for 'Bits & Mips::FeatureABICalls' to be true by default (on some triples). llvm-svn: 215211
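The surprise described above can be reproduced with a small model. The sketch below is an assumption-laden illustration in Python, not the generated `MipsSubtarget::ParseSubtargetFeatures()`; the bit values, function names, and the default of `True` are hypothetical. It contrasts a "set only" feature bit, which can never switch a true default off, with an inverted bit that can.

```python
# Hypothetical feature bit positions, for illustration only.
FEATURE_ABICALLS = 1 << 0
FEATURE_NOABICALLS = 1 << 1


def parse_set_only(bits, default):
    """Mimics `if (bit set) IsABICalls = true;`: a clear bit keeps the default."""
    is_abicalls = default
    if bits & FEATURE_ABICALLS:
        is_abicalls = True
    return is_abicalls


def parse_assign(bits):
    """Mimics `IsABICalls = (bit set);`: the bit alone decides the value."""
    return (bits & FEATURE_ABICALLS) != 0


def parse_inverted(bits, default=True):
    """Inverted bit: a set 'noabicalls' bit switches the default off."""
    is_abicalls = default
    if bits & FEATURE_NOABICALLS:
        is_abicalls = False
    return is_abicalls


# With a default of True, clearing the set-only bit has no effect.
stuck_on = parse_set_only(0, default=True)
# With the inverted bit, requesting 'noabicalls' takes effect.
turned_off = parse_inverted(FEATURE_NOABICALLS)
```

With the set-only form and a default of true, no bit pattern yields false; inverting the meaning of the bit is one way to make the negative option expressible without changing how the parser is generated.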
*	Add missing Interpreter intrinsic lowering for sin, cos and ceil	Josh Klontz	2014-08-08	3	-0/+23
\| \| \| \|	llvm-svn: 215209
*	[AArch64] Fix a type conversion bug for analyzing compare.	Jiangning Liu	2014-08-08	1	-0/+32
\| \| \| \| \| \| \| \|	The bug can cause spec2006/483.xalancbmk failure. Patched by David Xu. llvm-svn: 215206
*	[mips] Remove reason for XFAIL from a test that isn't actually XFAILed.	Daniel Sanders	2014-08-08	1	-3/+0
\| \| \| \|	llvm-svn: 215201
*	[LoopVectorizer] Enable support for floating-point subtraction reductions	James Molloy	2014-08-08	1	-0/+22
\| \| \| \|	llvm-svn: 215200
*	[AArch64] Add an FP load balancing pass for Cortex-A57	James Molloy	2014-08-08	1	-0/+323
\| \| \| \| \| \| \| \| \| \| \| \|	For best-case performance on Cortex-A57, we should try to use a balanced mix of odd and even D-registers when performing a critical sequence of independent, non-quadword FP/ASIMD floating-point multiply or multiply-accumulate operations. This pass attempts to detect situations where the register allocation may adversely affect this load balancing and to change the registers used so as to better utilize the CPU. Ideally we'd just take each multiply or multiply-accumulate in turn and allocate it alternating even or odd registers. However, multiply-accumulates are most efficiently performed in the same functional unit as their accumulation operand. Therefore this pass tries to find maximal sequences ("Chains") of multiply-accumulates linked via their accumulation operand, and assign them all the same "color" (oddness/evenness). This optimization affects S-register and D-register floating point multiplies and FMADD/FMAs, as well as vector (floating point only) muls and FMADD/FMA. Q register instructions (and 128-bit vector instructions) are not affected. llvm-svn: 215199