bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	ARM: add pseudo-instructions for lit-pool global materialisation	Tim Northover	2013-12-02	6	-69/+109
\| \| \| \| \| \| \| \| \| \| \| \|	These are used by MachO only at the moment, and (much like the existing MOVW/MOVT set) work around the fact that the labels used in the actual instructions often contain PC-dependent components, which means that repeatedly materialising the same global can't be CSEed. With small modifications, it could be adapted to how ELF finds the address of _GLOBAL_OFFSET_TABLE_, which would give similar benefits in PIC mode there. llvm-svn: 196090
*	Change the default of AsmWriterClassName and isMCAsmWriter.	Rafael Espindola	2013-12-02	1	-13/+0
\| \| \| \|	llvm-svn: 196065
*	ARM: fix bug in -Oz stack adjustment folding	Tim Northover	2013-12-01	4	-21/+26
\| \| \| \| \| \| \| \| \| \| \|	Previously, we clobbered callee-saved registers when folding an "add sp, #N" into a "pop {rD, ...}" instruction. This change checks whether a register we're going to add to the "pop" could actually be live outside the function before doing so and should fix the issue. This should fix PR18081. llvm-svn: 196046
*	[CMake] Let add_public_tablegen_target() provide intrinsics_gen, too.	NAKAMURA Takumi	2013-11-28	1	-2/+0
\| \| \| \| \| \| \| \| \| \|	I think, in principle, intrinsics_gen may be added explicitly. That said, it can be added incidentally, since each target already has dependencies on llvm-tblgen. Almost all source files depend on both CommonTableGen and intrinsics_gen. Explicit add_dependencies() calls have been pruned under lib/Target. llvm-svn: 195929
*	[CMake] Let add_public_tablegen_target responsible to provide dependency to ↵	NAKAMURA Takumi	2013-11-28	6	-9/+1
\| \| \| \| \| \| \| \| \|	CommonTableGen. add_public_tablegen_target adds *CommonTableGen to LLVM_COMMON_DEPENDS. LLVM_COMMON_DEPENDS affects add_llvm_library (and other add_target stuff) within its scope. llvm-svn: 195927
*	[CMake] Prune include_directories() in llvm/lib/Target. add_llvm_target() ↵	NAKAMURA Takumi	2013-11-28	5	-11/+0
\| \| \| \| \| \|	sets them. llvm-svn: 195921
*	Darwin-ARM: use movw/movt for static relocations	Tim Northover	2013-11-26	2	-8/+4
\| \| \| \|	llvm-svn: 195759
*	Fix indentation typo	Tim Northover	2013-11-25	1	-1/+1
\| \| \| \|	llvm-svn: 195660
*	ARM: remove special cases for Darwin dynamic-no-pic mode.	Tim Northover	2013-11-25	11	-104/+73
\| \| \| \| \| \| \| \| \|	These are handled almost identically to static mode (and ELF's global address materialisation), except that a symbol may have "$non_lazy_ptr" appended. This can be handled by passing appropriate flags along with the instruction instead of using entirely separate pseudo-instructions. llvm-svn: 195655
*	ARM: remove unused patterns.	Tim Northover	2013-11-25	3	-6/+1
\| \| \| \| \| \| \| \|	There is no sane way for an LEApcrel (= single ADR) instruction to generate a global address on any ARM target I know of. Fortunately, no-one was trying to any more, but there were vestigial patterns. llvm-svn: 195644
*	[ARM] Enable FeatureMP for Cortex-A5 by default.	Amara Emerson	2013-11-25	1	-1/+1
\| \| \| \| \| \|	Patch by Oliver Stannard. llvm-svn: 195640
*	Add support for Cortex-A12.	Richard Barton	2013-11-22	2	-2/+19
\| \| \| \| \| \|	Patch by Oliver Stannard! llvm-svn: 195448
*	Fix a typo where we were creating <def,kill> operands instead of	Lang Hames	2013-11-22	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	<def,dead> ones. Add an assertion to make sure we catch this in the future. Fixes <rdar://problem/15464559>. llvm-svn: 195401
*	[ARM] add basic Cortex-A7 support to LLVM backend	Artyom Skrobov	2013-11-21	2	-1/+13
\| \| \| \|	llvm-svn: 195358
*	[weak vtables] Remove a bunch of weak vtables	Juergen Ributzka	2013-11-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. The memory leaks in this version have been fixed. Thanks Alexey for pointing them out. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 195064
*	Revert r194865 and r194874.	Alexey Samsonov	2013-11-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This change is incorrect. If you delete the virtual destructor of both a base class and a subclass, then the following code: Base *foo = new Child(); delete foo; will not run the destructors for members of the Child class. As a result, I observe plenty of memory leaks. Notable examples I investigated are: ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl. llvm-svn: 194997
*	[weak vtables] Remove a bunch of weak vtables	Juergen Ributzka	2013-11-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 194865
*	Avoid illegal integer promotion in fastisel	Bob Wilson	2013-11-15	1	-7/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Stop folding constant adds into GEP when the type size doesn't match. Otherwise, the adds' operands are effectively being promoted, changing the conditions of an overflow. Results are different when: sext(a) + sext(b) != sext(a + b) Problem originally found on x86-64, but also fixed issues with ARM and PPC, which used similar code. <rdar://problem/15292280> Patch by Duncan Exon Smith! llvm-svn: 194840
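The inequality sext(a) + sext(b) != sext(a + b) in the message above can be checked with a small model of fixed-width arithmetic. This is only an illustrative sketch: the helper names `sext` and `add_wrapped` and the 8-bit/32-bit widths are assumptions chosen for the example, not anything taken from the fast-isel code.

```python
def sext(value, bits):
    """Interpret the low `bits` bits of `value` as a signed two's-complement integer."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def add_wrapped(a, b, bits):
    """Add two integers with wrap-around at the given bit width."""
    return sext(a + b, bits)


a, b = 100, 100

# Add in the narrow (8-bit) type first, then sign-extend the result to 32 bits.
narrow_then_extend = sext(add_wrapped(a, b, 8), 32)

# Sign-extend each operand to 32 bits first, then add in the wide type.
extend_then_add = add_wrapped(sext(a, 8), sext(b, 8), 32)

print(narrow_then_extend, extend_then_add)
```

With these inputs the narrow add overflows and wraps, while the widened add does not, so the two orders disagree. That is the situation the commit avoids by not folding the add when the type sizes differ.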
*	ARM: produce friendly error for invalid inline asm	Tim Northover	2013-11-14	1	-0/+4
\| \| \| \| \| \| \| \| \|	We used to perform an invalid operation on an MVT and crash, which wasn't much fun. Patch by Oliver Stannard. llvm-svn: 194714
*	Enable generating legacy IT block for AArch32	Weiming Zhao	2013-11-13	5	-6/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	By default, the behavior of IT block generation is determined dynamically based on the arch (armv8 vs armv7). This patch adds backend options: -arm-restrict-it and -arm-no-restrict-it. The former restricts the generation of IT blocks (the same behavior as thumbv8) for both arches. The latter allows the generation of legacy IT blocks (the same behavior as ARMv7 Thumb2) for both arches. Clang will support -mrestrict-it and -mno-restrict-it, which is compatible with GCC. llvm-svn: 194592
*	ARM: diagnose invalid system LDM/STM	Tim Northover	2013-11-12	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \|	The system LDM and STM instructions can't usually writeback to the base register. The one exception is when an LDM is actually an exception-return (i.e. contains PC in the register list). (There's already a test that "ldm sp!, {r0-r3, pc}^" works, which is why there is no positive test). rdar://problem/15223374 llvm-svn: 194512
*	[ARM] Add support for FP_HP_extension build attribute	Bradley Smith	2013-11-12	2	-1/+7
\| \| \| \|	llvm-svn: 194470
*	[ARM] Add support for MVFR2 which is new in ARMv8	Artyom Skrobov	2013-11-11	2	-0/+3
\| \| \| \|	llvm-svn: 194416
*	Remove some unnecessary temporary strings.	Benjamin Kramer	2013-11-09	1	-1/+1
\| \| \| \|	llvm-svn: 194335
*	[arm] Refine ARMBuildAttrs.h.	Logan Chien	2013-11-09	1	-6/+8
\| \| \| \| \| \| \| \|	This commit cleans up some comments in ARMBuildAttrs.h. Besides, this commit fixes an error related to AllowWMMXv1 and AllowWMMXv2 (although they are not used currently.) llvm-svn: 194327
*	ARM: fold prologue/epilogue sp updates into push/pop for code size	Tim Northover	2013-11-08	4	-32/+166
	ARM prologues usually look like:

	    push {r7, lr}
	    sub sp, sp, #4

	If code size is extremely important, this can be optimised to the single instruction:

	    push {r6, r7, lr}

	where we don't actually care about the contents of r6, but pushing it subtracts 4 from sp as a side effect.

	This should implement such a conversion, predicated on the "minsize" function attribute (-Oz) since I've yet to find any code it actually makes faster. llvm-svn: 194264
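The prologue transformation described above can be sketched as a small register-list computation. This is a deliberately simplified model and not the backend's implementation: the function name `fold_sp_update_into_push`, the use of plain register numbers (r0 = 0 ... lr = 14), and the rule of always taking the registers immediately below the lowest pushed one are all assumptions made for illustration.

```python
def fold_sp_update_into_push(pushed, num_bytes):
    """Return an enlarged push list that also subtracts `num_bytes` from sp,
    or None if this simplified model cannot fold the adjustment.

    pushed:    non-empty list of register numbers already in the push
    num_bytes: size of the separate "sub sp, sp, #N" to absorb
    """
    if not pushed or num_bytes <= 0 or num_bytes % 4 != 0:
        return None

    needed = num_bytes // 4          # each extra register moves sp by 4 bytes
    lowest = min(pushed)
    first_extra = lowest - needed

    # The extra registers must be numbered below the existing ones so that
    # they land at the lowest addresses, where the scratch space belongs.
    if first_extra < 0:
        return None

    return list(range(first_extra, lowest)) + sorted(pushed)


# push {r7, lr}; sub sp, sp, #4   ->   push {r6, r7, lr}
print(fold_sp_update_into_push([7, 14], 4))
```

The epilogue direction needs more care than this model shows, because a register added to a pop is overwritten; a later commit in this log (llvm-svn: 196046) fixes exactly that kind of clobbering.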
*	[ARM] Handling for coprocessor instructions that are undefined starting from ↵	Artyom Skrobov	2013-11-08	1	-8/+21
\| \| \| \| \| \|	ARMv8 (Thumb encodings) llvm-svn: 194263
*	[ARM] Handling for coprocessor instructions that are undefined starting from ↵	Artyom Skrobov	2013-11-08	2	-9/+24
\| \| \| \| \| \|	ARMv8 (ARM encodings) llvm-svn: 194261
*	[ARM] In ARMAsmParser, MatchCoprocessorOperandName() permitted p10 and p11 ↵	Artyom Skrobov	2013-11-08	1	-2/+3
\| \| \| \| \| \|	as operands for coprocessor instructions, resulting in encodings that clash with FP/NEON instruction encodings llvm-svn: 194253
*	ARM: permit bare dmb/dsb/isb aliases on Cortex-M0	Tim Northover	2013-11-05	1	-3/+3
\| \| \| \| \| \| \| \|	Cortex-M0 supports these 32-bit instructions despite being Thumb1 only (mostly). We knew about that but not that the aliases without the default "sy" operand were also permitted. llvm-svn: 194094
*	ARM: remove unnecessary state-tracking during frame lowering.	Tim Northover	2013-11-04	6	-115/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ResolveFrameIndex had what appeared to be a very nasty hack for when the frame-index referred to a callee-saved register. In this case it "adjusted" the offset so that the address was correct if (and only if) the MachineInstr immediately followed the respective push. This "worked" for all forms of GPR & DPR but was only ever used to set the frame pointer itself, and once this was put in a more sensible location the entire state-tracking machinery it relied on became redundant. So I stripped it. The only wrinkle is that "add r7, sp, #0" might theoretically be slower (need an actual ALU slot) compared to "mov r7, sp" so I added a micro-optimisation that makes emitARMRegUpdate and emitT2RegUpdate also work when NumBytes == 0. No test changes since there shouldn't be any functionality change. llvm-svn: 194025
*	Enable optimization of sin / cos pair into call to __sincos_stret for iOS7+.	Bob Wilson	2013-11-03	4	-0/+87
\| \| \| \| \| \| \|	rdar://12856873 Patch by Evan Cheng, with a fix for rdar://13209539 by Tilmann Scheller llvm-svn: 193942
*	[ARM] Add Virtualization subtarget feature and more build attributes in this ↵	Bradley Smith	2013-11-01	5	-5/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	area Add a Virtualization ARM subtarget feature along with adding proper build attribute emission for Tag_Virtualization_use (encodes Virtualization and TrustZone) and Tag_MPextension_use. Also rework test/CodeGen/ARM/2010-10-19-mc-elf-objheader.ll testcase to something that is more maintainable. This changes the focus of this testcase away from testing CPU defaults (which is tested elsewhere), onto specifically testing that attributes are encoded correctly. llvm-svn: 193859
*	[ARM] Fix Tag_ABI_HardFP_use build attribute	Bradley Smith	2013-11-01	2	-5/+13
\| \| \| \| \| \| \| \|	Fix Tag_ABI_HardFP_use build attribute to handle single precision FP, replace deprecated Tag_ABI_HardFP_use value of 3 with 0 and also add some tests for Tag_ABI_VFP_args. llvm-svn: 193856
*	Legalize: Improve legalization of long vector extends.	Jim Grosbach	2013-10-31	1	-55/+0
	When an extend more than doubles the size of the elements (e.g., a zext from v16i8 to v16i32), the normal legalization method of splitting the vectors will run into problems as by the time the destination vector is legal, the source vector is illegal. The end result is the operation often becoming scalarized, with the typical horrible performance. For example, on x86_64, the simple input of:

	    define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind {
	      %tmp = zext <16 x i8> %a to <16 x i32>
	      store <16 x i32> %tmp, <16 x i32>*%p
	      ret void
	    }

	Generates:

	      .section __TEXT,__text,regular,pure_instructions
	      .section __TEXT,__const
	      .align 5
	    LCPI0_0:
	      .long 255 ## 0xff
	      .long 255 ## 0xff
	      .long 255 ## 0xff
	      .long 255 ## 0xff
	      .long 255 ## 0xff
	      .long 255 ## 0xff
	      .long 255 ## 0xff
	      .long 255 ## 0xff
	      .section __TEXT,__text,regular,pure_instructions
	      .globl _bar
	      .align 4, 0x90
	    _bar:
	      vpunpckhbw %xmm0, %xmm0, %xmm1
	      vpunpckhwd %xmm0, %xmm1, %xmm2
	      vpmovzxwd %xmm1, %xmm1
	      vinsertf128 $1, %xmm2, %ymm1, %ymm1
	      vmovaps LCPI0_0(%rip), %ymm2
	      vandps %ymm2, %ymm1, %ymm1
	      vpmovzxbw %xmm0, %xmm3
	      vpunpckhwd %xmm0, %xmm3, %xmm3
	      vpmovzxbd %xmm0, %xmm0
	      vinsertf128 $1, %xmm3, %ymm0, %ymm0
	      vandps %ymm2, %ymm0, %ymm0
	      vmovaps %ymm0, (%rdi)
	      vmovaps %ymm1, 32(%rdi)
	      vzeroupper
	      ret

	So instead we can check if there are legal types that enable us to split more cleverly when the input vector is already legal such that we don't turn it into an illegal type.

	If the extend is such that it's more than doubling the size of the input we check if
	- the number of vector elements is even,
	- the source type is legal,
	- the type of a split source is illegal,
	- the type of an extended (by doubling element size) source is legal, and
	- the type of that extended source when split is legal.

	If the conditions are met, instead of just splitting both the destination and the source types, we create an extend that only goes up one "step" (doubling the element width), and then continue legalizing the rest of the operation normally. The result is that this operates as a new, more efficient, termination condition for the loop of "split the operation until the destination type is legal."

	With this change, the above example now compiles to:

	    _bar:
	      vpxor %xmm1, %xmm1, %xmm1
	      vpunpcklbw %xmm1, %xmm0, %xmm2
	      vpunpckhwd %xmm1, %xmm2, %xmm3
	      vpunpcklwd %xmm1, %xmm2, %xmm2
	      vinsertf128 $1, %xmm3, %ymm2, %ymm2
	      vpunpckhbw %xmm1, %xmm0, %xmm0
	      vpunpckhwd %xmm1, %xmm0, %xmm3
	      vpunpcklwd %xmm1, %xmm0, %xmm0
	      vinsertf128 $1, %xmm3, %ymm0, %ymm0
	      vmovaps %ymm0, 32(%rdi)
	      vmovaps %ymm2, (%rdi)
	      vzeroupper
	      ret

	This generalizes a custom lowering that was added a while back to the ARM backend. That lowering is no longer necessary, and is removed. The testcases for it, however, provide excellent ARM tests for this change and so remain. rdar://14735100 llvm-svn: 193727
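The five conditions listed in the message above can be written as a single predicate. The sketch below is a toy model, not the legalizer's code: the function name `can_extend_one_step`, the `(element count, element bits)` tuples, and the contents of `LEGAL` (a guess at 128-bit and 256-bit integer vector types on an AVX-like target) are all assumptions for illustration.

```python
# Hypothetical set of legal integer vector types, as (num_elements, element_bits).
LEGAL = {
    (16, 8), (8, 16), (4, 32), (2, 64),    # 128-bit vectors
    (32, 8), (16, 16), (8, 32), (4, 64),   # 256-bit vectors
}


def can_extend_one_step(num_elts, src_bits, dst_bits, legal=LEGAL):
    """True if the extend should first go up one step (doubling the element
    width) rather than splitting both source and destination."""
    # Only applies when the extend more than doubles the element size.
    if dst_bits <= 2 * src_bits:
        return False

    wide_bits = 2 * src_bits
    return (
        num_elts % 2 == 0                             # element count is even
        and (num_elts, src_bits) in legal             # source type is legal
        and (num_elts // 2, src_bits) not in legal    # split source is illegal
        and (num_elts, wide_bits) in legal            # one-step extended source is legal
        and (num_elts // 2, wide_bits) in legal       # ...and stays legal when split
    )


# zext <16 x i8> to <16 x i32>, the example from the commit message
print(can_extend_one_step(16, 8, 32))
```

Under this toy legality table the v16i8 to v16i32 case qualifies, since v8i8 is not legal but v16i16 and v8i16 are, whereas a plain doubling such as v8i16 to v8i32 falls through to the normal splitting path.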
*	[ARM] NEON instructions were erroneously decoded from certain invalid encodings	Artyom Skrobov	2013-10-30	1	-20/+20
\| \| \| \|	llvm-svn: 193705
*	Struct byval cleanup: add helper functions to reduce code duplication.	Manman Ren	2013-10-29	1	-180/+117
\| \| \| \| \| \| \| \| \| \|	Helper functions are added: emitPostLd: emit a post-increment load operation with given size. emitPostSt: emit a post-increment store operation with given size. No functionality change. llvm-svn: 193656
*	Move getSymbol to TargetLoweringObjectFile.	Rafael Espindola	2013-10-29	1	-1/+1
\| \| \| \| \| \|	This allows constructing a Mangler with just a TargetMachine. llvm-svn: 193630
*	Add a helper getSymbol to AsmPrinter.	Rafael Espindola	2013-10-29	2	-6/+6
\| \| \| \|	llvm-svn: 193627
*	[ARM] Make sure HasCRC is initialized to false in Subtarget.	Amara Emerson	2013-10-29	1	-0/+1
\| \| \| \|	llvm-svn: 193624
*	ARM: Add subtarget feature for CRC	Bernard Ogden	2013-10-29	5	-6/+14
\| \| \| \| \| \| \| \|	Adds a subtarget feature for the CRC instructions (optional in v8-A) to the ARM (32-bit) backend. Differential Revision: http://llvm-reviews.chandlerc.com/D2036 llvm-svn: 193599
*	ARM cost model: Unaligned vectorized double stores are expensive	Arnold Schwaighofer	2013-10-29	1	-0/+15
\| \| \| \| \| \| \| \| \|	Updated a test case that assumed that <2 x double> would vectorize to use <4 x float>. radar://15338229 llvm-svn: 193574
*	ARM cost model: Account for zero cost scalar SROA instructions	Arnold Schwaighofer	2013-10-29	1	-3/+15
\| \| \| \| \| \| \| \| \|	By vectorizing a series of srl, or, ... instructions we have obfuscated the intention so much that the backend does not know how to fold this code away. radar://15336950 llvm-svn: 193573
*	Return early from getUnconditionalBranchTargetOpValue if the branch target is	Lang Hames	2013-10-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	an MCExpr, in order to avoid writing an encoded zero value in the immediate field. When getUnconditionalBranchTargetOpValue is called with an MCExpr target, we don't know what the final immediate field value should be. We shouldn't explicitly set the immediate field to an encoded zero value as zero is encoded with a non-zero bit pattern. This leads to bits being set that pollute the final immediate value. The nature of the encoding is such that the polluted bits only affect very large immediate values, explaining why this hasn't caused problems earlier. Fixes <rdar://problem/15155975>. llvm-svn: 193535
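The remark that zero "is encoded with a non-zero bit pattern" can be illustrated with the J1/J2 bits of a wide Thumb-2 branch, where J1 = NOT(I1 XOR S) and J2 = NOT(I2 XOR S). The sketch below assumes that this is the encoding the commit refers to, which the message does not state explicitly; the function name `encode_s_j_bits` is hypothetical and only the three affected bits are modelled.

```python
def encode_s_j_bits(offset):
    """Return (S, J1, J2) for a signed, halfword-aligned branch offset,
    assuming the offset is laid out as S:I1:I2:imm10:imm11:'0'."""
    imm = (offset >> 1) & 0xFFFFFF   # drop the implicit low zero bit, keep 24 bits
    s = (imm >> 23) & 1
    i1 = (imm >> 22) & 1
    i2 = (imm >> 21) & 1
    j1 = (~(i1 ^ s)) & 1             # J1 = NOT(I1 XOR S)
    j2 = (~(i2 ^ s)) & 1             # J2 = NOT(I2 XOR S)
    return s, j1, j2


# A zero offset still sets J1 and J2 in the instruction word.
print(encode_s_j_bits(0))

# A large offset needs J2 clear; OR-ing it onto an encoded zero would leave J2 set.
print(encode_s_j_bits(1 << 22))
```

Because J1 and J2 only differ from their zero-offset values when I1 or I2 differ from S, stale bits from an encoded zero would corrupt only very large offsets, which matches the explanation in the message of why the problem went unnoticed.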
*	[arm] Implement eabi_attribute, cpu, and fpu directives.	Logan Chien	2013-10-28	7	-265/+514
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit allows the ARM integrated assembler to parse and assemble the code with .eabi_attribute, .cpu, and .fpu directives. To implement the feature, this commit moves the code from AttrEmitter to ARMTargetStreamers, and several new test cases related to cortex-m4, cortex-r5, and cortex-a15 are added. This commit also changes Subtarget->isFPOnlySP() to Subtarget->hasD16() to match the usage of the .fpu directive. This commit changes the test cases: * Several .eabi_attribute directives in 2010-09-29-mc-asm-header-test.ll are removed because the .fpu directive already covers the functionality. * In the Cortex-A15 test case, the value for Tag_Advanced_SIMD_arch has been changed from 1 to 2, which is more precise. llvm-svn: 193524
*	ARM: allow .thumb_func to be separated from symbol definition	Tim Northover	2013-10-25	1	-17/+18
\| \| \| \| \| \| \| \| \| \|	When assembling, a .thumb_func directive is supposed to be applicable to the next symbol definition, even if there are intervening directives. We were racing ahead to try and find it, and this commit should fix the issue. Patch by Gabor Ballabas llvm-svn: 193403
*	ARM: don't expand atomicrmw inline on Cortex-M0	Tim Northover	2013-10-25	2	-9/+13
\| \| \| \| \| \| \| \| \| \|	There's a barrier instruction so that should still be used, but most actual atomic operations are going to need a platform decision on the correct behaviour (either nop if single-threaded or OS-support otherwise). rdar://problem/15287210 llvm-svn: 193399
*	ARM: Tweak usage of '*vfp' compiler_rt functions.	Jim Grosbach	2013-10-24	1	-1/+2
\| \| \| \| \| \| \| \| \|	Only use them if the subtarget has ARM mode, as these routines are implemented as ARM code. rdar://15302004 llvm-svn: 193381
*	Remove class abstraction from ARM struct byval lowering	David Peixotto	2013-10-24	1	-553/+262
\| \| \| \| \| \| \| \| \| \| \|	This commit changes the struct byval lowering for arm to use inline checks for the subtarget instead of a class abstraction to represent the differences. The class abstraction was judged to be too much code for this task. No intended functionality change. llvm-svn: 193357
*	ARM: Mark double-precision instructions as such	Tim Northover	2013-10-24	3	-45/+66
\| \| \| \| \| \| \| \| \| \| \| \|	This prevents us from silently accepting invalid instructions on (for example) Cortex-M4 with just single-precision VFP support. No tests for the extra Pat Requires because they're essentially assertions: the affected code should have been lowered to libcalls before ISel. rdar://problem/15302004 llvm-svn: 193354