bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[ARM] Re-re-apply VLD1/VST1 base-update combine.	Ahmed Bougacha	2015-02-19	4	-19/+524
	This re-applies r223862, r224198, r224203, and r224754, which were reverted in r228129 because they exposed Clang misalignment problems when self-hosting.
	The combine caused the crashes because we turned ISD::LOAD/STORE nodes to ARMISD::VLD1/VST1_UPD nodes. When selecting addressing modes, we were very lax for the former, and only emitted the alignment operand (as in "[r1:128]") when it was larger than the standard alignment of the memory type. However, for ARMISD nodes, we just used the MMO alignment, no matter what. In our case, we turned ISD nodes to ARMISD nodes, and this caused the alignment operands to start being emitted. And that's how we exposed alignment problems that were ignored before (but I believe would have been caught with SCTLR.A==1?).
	To fix this, we can just mirror the hack done for ISD nodes: only take into account the MMO alignment when the access is overaligned.
	Original commit message: We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD when the base pointer is incremented after the load/store. We can do the same thing for generic load/stores. Note that we can only combine the first load/store+adds pair in a sequence (as might be generated for a v16f32 load for instance), because other combines turn the base pointer addition chain (each computing the address of the next load, from the address of the last load) into independent additions (common base pointer + this load's offset).
	rdar://19717869, rdar://14062261.
	llvm-svn: 229932
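The rule in this message (emit the alignment operand only for overaligned accesses) can be sketched as below. This is an illustrative Python model, not the LLVM implementation: `alignment_operand_bits`, `format_address`, and passing the type's standard alignment in as a parameter are all assumptions made for the example.

```python
def alignment_operand_bits(mmo_align_bytes: int, standard_align_bytes: int) -> int:
    """Return the alignment to print, in bits, or 0 when no operand is emitted.

    Hypothetical helper: the caller supplies the standard alignment of the
    memory type; the operand is kept only when the access is overaligned.
    """
    if mmo_align_bytes > standard_align_bytes:
        return mmo_align_bytes * 8
    return 0


def format_address(base_reg: str, mmo_align_bytes: int, standard_align_bytes: int) -> str:
    """Render an address such as "[r1:128]", or plain "[r1]" without an operand."""
    bits = alignment_operand_bits(mmo_align_bytes, standard_align_bytes)
    return f"[{base_reg}:{bits}]" if bits else f"[{base_reg}]"


if __name__ == "__main__":
    print(format_address("r1", 16, 8))  # overaligned access keeps the operand
    print(format_address("r1", 1, 8))   # underaligned access drops it
```

Under this model an underaligned access never gains an alignment operand, which is the behaviour the message says ISD nodes already had.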
*	llvm-mc: Use Target::createNullStreamer to fix crashes on target-specific asm directives.	Peter Collingbourne	2015-02-19	1	-0/+2
	llvm-svn: 229798
*	[ARM] Add missing M/R class CPUs	Bradley Smith	2015-02-18	1	-0/+170
	Add some of the missing M and R class Cortex CPUs, namely: Cortex-M0+ (called Cortex-M0plus for GCC compatibility), Cortex-M1, SC000, SC300, and Cortex-R5. llvm-svn: 229660
*	Canonicalize splats as build_vectors (PR22283)	Sanjay Patel	2015-02-17	1	-3/+3
	This is a follow-on patch to: http://reviews.llvm.org/D7093 That patch canonicalized constant splats as build_vectors, and this patch removes the constant check so we can canonicalize all splats as build_vectors. This fixes the 2nd test case in PR22283: http://llvm.org/bugs/show_bug.cgi?id=22283 The unfortunate code duplication between SelectionDAG and DAGCombiner is discussed in the earlier patch review. At least this patch is just removing code... This improves an existing x86 AVX test and changes codegen in an ARM test. Differential Revision: http://reviews.llvm.org/D7389 llvm-svn: 229511
*	[SimplifyCFG] Swap to using TargetTransformInfo for cost	James Molloy	2015-02-11	2	-4/+1
	analysis. We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness" heuristic and use TTI directly. Generally NFC intended, but we're using a slightly different heuristic now so there is a slight test churn.
	Test changes:
	* combine-comparisons-by-cse.ll: Removed unneeded branch check.
	* 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq.
	* coalesce-subregs.ll: Superfluous block check.
	* 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv.
	* PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present.
	* select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI.
	llvm-svn: 228826
*	[ARM] Add armv6s[-]m as an alias to armv6[-]m	Bradley Smith	2015-02-10	1	-0/+4
	llvm-svn: 228696
*	Fix a bug in DemoteRegToStack where a reload instruction was inserted into the	Akira Hatanaka	2015-02-09	1	-1/+127
	wrong basic block. This would happen when the result of an invoke was used by a phi instruction in the invoke's normal destination block. An instruction to reload the invoke's value would get inserted before the critical edge was split and a new basic block (which is the correct insertion point for the reload) was created. This commit fixes the bug by splitting the critical edge before all the reload instructions are inserted. Also, hoist up the code which computes the insertion point to the only place that needs that computation. rdar://problem/15978721 llvm-svn: 228566
*	ARM & AArch64: teach LowerVSETCC that output type size may differ from input.	Tim Northover	2015-02-08	1	-0/+11
	While various DAG combines try to guarantee that a vector SETCC operation will have the same output size as input, there's nothing intrinsic to either creation or LegalizeTypes that actually guarantees it, so the function needs to be ready to handle a mismatch. Fortunately this is easy enough: just extend or truncate the naturally compared result. I couldn't reproduce the failure in other backends that I know have SIMD, so it's probably only an issue for these two due to shared heritage. Should fix PR21645. llvm-svn: 228518
*	MC: Emit COFF section flags in the "proper" order	David Majnemer	2015-02-07	2	-2/+2
	COFF section flags are not idempotent:
	'rd' will make a read-write section because 'd' implies write
	'dr' will make a read-only section because 'r' disables write
	llvm-svn: 228490
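The order sensitivity described here can be illustrated with a toy left-to-right flag parser. This is a minimal sketch covering only the two flags named in the message; `parse_section_flags` and the permission names are hypothetical, and a real assembler handles many more flag characters.

```python
def parse_section_flags(flags: str) -> set:
    """Apply COFF-style section flag characters from left to right.

    Toy model of the two rules quoted in the commit message:
      'd' marks the section as data and implies write,
      'r' marks the section readable and disables write.
    """
    perms = set()
    for ch in flags:
        if ch == "d":
            perms |= {"data", "write"}
        elif ch == "r":
            perms.add("read")
            perms.discard("write")
        else:
            raise ValueError(f"flag {ch!r} is not modelled in this sketch")
    return perms


if __name__ == "__main__":
    print(sorted(parse_section_flags("rd")))  # 'd' comes last, so write survives
    print(sorted(parse_section_flags("dr")))  # 'r' comes last, so write is cleared
```

Because each flag mutates the state left by the previous one, emitting the same characters in a different order yields a different section, which is why the emission order matters.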
*	[ARM] Use patterns instead of hardcoded regs in test. NFC.	Ahmed Bougacha	2015-02-05	1	-5/+5
	llvm-svn: 228259
*	[ARM] Make testcase more explicit. NFC.	Ahmed Bougacha	2015-02-05	1	-23/+58
	The q8/d16 thing is silly; I'd be happy to hear about a better way to write those tests where simple substitution isn't enough. llvm-svn: 228258
*	Adding support to LLVM for targeting Cortex-A72	Renato Golin	2015-02-04	1	-0/+36
	Currently, Cortex-A72 is modelled as a Cortex-A57, except that the fp load balancing pass isn't enabled for Cortex-A72 as it's not profitable to have it enabled for this core. Patch by Ranjeet Singh. llvm-svn: 228140
*	Reverting VLD1/VST1 base-updating/post-incrementing combining	Renato Golin	2015-02-04	4	-524/+19
	This reverts patches 223862, 224198, 224203, and 224754, which were all related to the vector load/store combining and were reverted/reapplied a few times due to the same alignment problems we're seeing now. Further tests, mainly self-hosting Clang, will be needed to reapply this patch in the future. llvm-svn: 228129
*	Fix ARM peephole optimizeCompare to avoid optimizing unsigned cmp to 0.	Jan Wen Voung	2015-02-02	1	-0/+60
	Summary: Previously it only avoided optimizing signed comparisons to 0. Sometimes the DAGCombiner will optimize the unsigned comparisons to 0 before it gets to the peephole pass, but sometimes it doesn't. Fix for PR22373.
	Test Plan: test/CodeGen/ARM/sub-cmp-peephole.ll
	Reviewers: jfb, manmanren
	Subscribers: aemerson, llvm-commits
	Differential Revision: http://reviews.llvm.org/D7274
	llvm-svn: 227809
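One way to see why unsigned comparisons against 0 are unsafe to fold is to model the flags directly. The sketch below assumes the peephole replaces a `sub` followed by `cmp ..., #0` with a single flag-setting subtract; that framing and all helper names are assumptions for illustration, not taken from the patch. It relies on the ARM convention that the carry flag after a subtract means "no borrow".

```python
MASK32 = 0xFFFFFFFF


def sub_flags(a: int, b: int):
    """Model a flag-setting 32-bit subtract; returns (result, carry, zero).

    On ARM the carry flag after a subtract is set when no borrow occurred,
    i.e. when a >= b as unsigned values.
    """
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return result, a >= b, result == 0


def cond_hi(carry: bool, zero: bool) -> bool:
    """The unsigned 'higher' condition: C set and Z clear."""
    return carry and not zero


def hi_after_sub_then_cmp_zero(r1: int, r2: int) -> bool:
    """Flags come from comparing the difference against 0."""
    result, _, _ = sub_flags(r1, r2)
    _, carry, zero = sub_flags(result, 0)
    return cond_hi(carry, zero)


def hi_after_subs_only(r1: int, r2: int) -> bool:
    """Flags come straight from the subtract, as if the compare were folded away."""
    _, carry, zero = sub_flags(r1, r2)
    return cond_hi(carry, zero)


if __name__ == "__main__":
    # 1 - 2 wraps to a large unsigned value, so the two flag sources disagree.
    print(hi_after_sub_then_cmp_zero(1, 2), hi_after_subs_only(1, 2))
```

The zero flag agrees in both versions, so equality tests remain foldable in this model; only conditions that read the carry flag diverge.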
*	ARM: support stack probe size on Windows on ARM	Saleem Abdulrasool	2015-01-31	1	-0/+27
	Now that -mstack-probe-size is piped through to the backend via the function attribute as on Windows x86, honour the value to permit handling of non-default values for stack probes. This is needed for /Gs with the clang-cl driver or -mstack-probe-size with the clang driver when targeting Windows on ARM. llvm-svn: 227667
*	Add a missing Tag_DIV_use test for Cortex-M7.	Charlie Turner	2015-01-29	1	-0/+1
	llvm-svn: 227429
*	This patch fixes issue with lowering below mentioned pattern :-	Jyoti Allur	2015-01-23	1	-0/+41
	_foo:
	  smull r0, r1, r1, r0
	  smull r2, r3, r3, r2
	  adds r0, r2, r0
	  adc r1, r3, r1
	  bx lr
	to
	_foo:
	  smull r0, r1, r1, r0
	  smlal r0, r1, r3, r2
	  bx lr
	llvm-svn: 226904
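The two sequences compute the same 64-bit value, which can be checked with a small register-level simulation. This is a sketch in Python with hypothetical helper names; it models only the arithmetic of `smull`, `smlal`, `adds` and `adc` on 32-bit registers, not flags or encodings.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def signed32(x: int) -> int:
    """Interpret the low 32 bits of x as a signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def smull(rn: int, rm: int):
    """Signed 32x32 -> 64 multiply; returns (lo, hi)."""
    product = (signed32(rn) * signed32(rm)) & MASK64
    return product & MASK32, product >> 32


def smlal(rd_lo: int, rd_hi: int, rn: int, rm: int):
    """Signed multiply-accumulate into the 64-bit pair hi:lo; returns (lo, hi)."""
    acc = ((rd_hi & MASK32) << 32) | (rd_lo & MASK32)
    acc = (acc + signed32(rn) * signed32(rm)) & MASK64
    return acc & MASK32, acc >> 32


def before(r0: int, r1: int, r2: int, r3: int):
    """smull, smull, adds, adc: the original four-instruction sequence."""
    r0, r1 = smull(r1, r0)
    r2, r3 = smull(r3, r2)
    total = r2 + r0                      # adds r0, r2, r0
    r0, carry = total & MASK32, total >> 32
    r1 = (r3 + r1 + carry) & MASK32      # adc r1, r3, r1
    return r0, r1


def after(r0: int, r1: int, r2: int, r3: int):
    """smull followed by smlal: the combined sequence."""
    r0, r1 = smull(r1, r0)
    return smlal(r0, r1, r3, r2)


if __name__ == "__main__":
    args = (0xFFFFFFFF, 2, 0x7FFFFFFF, 0x7FFFFFFF)
    print(before(*args) == after(*args))
```

Both versions reduce to r1*r0 + r3*r2 modulo 2^64, so the explicit add-with-carry pair is redundant once the accumulate form is used.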
*	Fix load-store optimizer on thumbv4t	Jonathan Roelofs	2015-01-21	1	-0/+55
	Thumbv4t does not have lo->lo copies other than MOVS, and that can't be predicated. So emit MOVS when needed and bail if there's a predicate. http://reviews.llvm.org/D6592 llvm-svn: 226711
*	Bring r226038 back.	Rafael Espindola	2015-01-19	2	-17/+1
	No change in this commit, but clang was changed to also produce trivial comdats when needed. Original message: Don't create new comdats in CodeGen. This patch stops the implicit creation of comdats during codegen. Clang now sets the comdat explicitly when it is required. With this patch clang and gcc now produce the same result in pr19848. llvm-svn: 226467
*	Revert r226242 - Revert Revert Don't create new comdats in CodeGen	Timur Iskhodzhanov	2015-01-16	2	-1/+17
	This breaks AddressSanitizer (ninja check-asan) on Windows. llvm-svn: 226251
*	Revert "Revert Don't create new comdats in CodeGen"	Rafael Espindola	2015-01-16	2	-17/+1
	This reverts commit r226173, adding r226038 back. No change in this commit, but clang was changed to also produce trivial comdats for constructors, destructors and vtables when needed. Original message: Don't create new comdats in CodeGen. This patch stops the implicit creation of comdats during codegen. Clang now sets the comdat explicitly when it is required. With this patch clang and gcc now produce the same result in pr19848. llvm-svn: 226242
*	Revert "r226086 - Revert "r226071 - [RegisterCoalescer] Remove copies to reserved registers""	Hal Finkel	2015-01-15	1	-1/+13
	Reapply r226071 with fixes. Two fixes:
	1. We need to manually remove the old and create the new 'dead defs' associated with physical register definitions when we move the definition of the physical register from the copy point to the point of the original vreg def. This problem was picked up by the machineinstr verifier, and could trigger a verification failure on test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll, so I've turned on the verifier in the tests.
	2. When moving the def point of the phys reg up, we need to make sure that it is neither defined nor read in between the two instructions. We don't, however, extend the live ranges of phys reg defs to cover uses, so just checking for live-range overlap between the pair interval and the phys reg aliases won't pick up reads. As a result, we manually iterate over the range and check for reads.
	A test soon to be committed to the PowerPC backend will test this change.
	Original commit message: [RegisterCoalescer] Remove copies to reserved registers
	This allows the RegisterCoalescer to join "non-flipped" range pairs with a physical destination register -- which allows the RegisterCoalescer to remove copies like this:
	  <vreg> = something (maybe a load, for example)
	  ... (things that don't use PHYSREG)
	  PHYSREG = COPY <vreg>
	(with all of the restrictions normally applied by the RegisterCoalescer: having compatible register classes, etc.)
	Previously, the RegisterCoalescer handled only the opposite case (copying from a physical register). I don't handle the problem fully here, but try to get the common case where there is only one use of <vreg> (the COPY). An upcoming commit to the PowerPC backend will make this pattern much more common on PPC64/ELF systems.
	llvm-svn: 226200
*	Revert Don't create new comdats in CodeGen	Timur Iskhodzhanov	2015-01-15	2	-1/+17
	It breaks AddressSanitizer on Windows. llvm-svn: 226173
*	Revert "r226071 - [RegisterCoalescer] Remove copies to reserved registers"	Hal Finkel	2015-01-15	1	-13/+1
	Reverting this while I investigate some bad behavior this is causing. As a possibly-related issue, adding -verify-machineinstrs to one of the test cases now fails because of this change:
	llc test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll -march=x86-64 -o - -verify-machineinstrs
	* Bad machine code: No instruction at def index *
	- function: foo
	- basic block: BB#0 return (0x10007e21f10) [0B;736B)
	- liverange: [128r,128d:9)[160r,160d:8)[176r,176d:7)[336r,336d:6)[464r,464d:5)[480r,480d:4)[624r,624d:3)[752r,752d:2)[768r,768d:1)[784r,784d:0) 0@784r 1@768r 2@752r 3@624r 4@480r 5@464r 6@336r 7@176r 8@160r 9@128r
	- register: %DS
	Valno #3 is defined at 624r
	* Bad machine code: Live segment doesn't end at a valid instruction *
	- function: foo
	- basic block: BB#0 return (0x10007e21f10) [0B;736B)
	- liverange: [128r,128d:9)[160r,160d:8)[176r,176d:7)[336r,336d:6)[464r,464d:5)[480r,480d:4)[624r,624d:3)[752r,752d:2)[768r,768d:1)[784r,784d:0) 0@784r 1@768r 2@752r 3@624r 4@480r 5@464r 6@336r 7@176r 8@160r 9@128r
	- register: %DS
	[624r,624d:3)
	LLVM ERROR: Found 2 machine code errors.
	where 624r corresponds exactly to the interval combining change:
	624B %RSP<def> = COPY %vreg16; GR64:%vreg16
	Considering merging %vreg16 with %RSP
	RHS = %vreg16 [608r,624r:0) 0@608r
	updated: 608B %RSP<def> = MOV64rm <fi#3>, 1, %noreg, 0, %noreg; mem:LD8[%saved_stack.1]
	Success: %vreg16 -> %RSP
	Result = %RSP
	llvm-svn: 226086
*	[RegisterCoalescer] Remove copies to reserved registers	Hal Finkel	2015-01-15	1	-1/+13
	This allows the RegisterCoalescer to join "non-flipped" range pairs with a physical destination register -- which allows the RegisterCoalescer to remove copies like this:
	  <vreg> = something (maybe a load, for example)
	  ... (things that don't use PHYSREG)
	  PHYSREG = COPY <vreg>
	(with all of the restrictions normally applied by the RegisterCoalescer: having compatible register classes, etc.)
	Previously, the RegisterCoalescer handled only the opposite case (copying from a physical register). I don't handle the problem fully here, but try to get the common case where there is only one use of <vreg> (the COPY). An upcoming commit to the PowerPC backend will make this pattern much more common on PPC64/ELF systems.
	llvm-svn: 226071
*	IR: Move MDLocation into place	Duncan P. N. Exon Smith	2015-01-14	17	-147/+147
	This commit moves `MDLocation`, finishing off PR21433. There's an accompanying clang commit for frontend testcases. I'll attach the testcase upgrade script I used to PR21433 to help out-of-tree frontends/backends.
	This changes the schema for `DebugLoc` and `DILocation` from:
	  !{i32 3, i32 7, !7, !8}
	to:
	  !MDLocation(line: 3, column: 7, scope: !7, inlinedAt: !8)
	Note that empty fields (line/column: 0 and inlinedAt: null) don't get printed by the assembly writer.
	llvm-svn: 226048
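The schema change can be illustrated with a small text rewrite. This is not the upgrade script the message refers to; it is a hypothetical sketch that assumes every four-field node of exactly this shape is a location, which a real tool would have to verify from context. The function name `upgrade_locations` is invented for the example.

```python
import re

# Old form: !{i32 LINE, i32 COLUMN, !SCOPE, !INLINED_AT or null}
_OLD_LOCATION = re.compile(r"!\{i32 (\d+), i32 (\d+), (!\d+), (!\d+|null)\}")


def upgrade_locations(text: str) -> str:
    """Rewrite old-style location nodes into !MDLocation(...) syntax.

    Empty fields (line or column equal to 0, inlinedAt equal to null) are
    omitted, following the note in the commit message.
    """
    def replace(match):
        line, column, scope, inlined_at = match.groups()
        fields = []
        if line != "0":
            fields.append(f"line: {line}")
        if column != "0":
            fields.append(f"column: {column}")
        fields.append(f"scope: {scope}")
        if inlined_at != "null":
            fields.append(f"inlinedAt: {inlined_at}")
        return "!MDLocation(" + ", ".join(fields) + ")"

    return _OLD_LOCATION.sub(replace, text)


if __name__ == "__main__":
    print(upgrade_locations("!9 = !{i32 3, i32 7, !7, !8}"))
    print(upgrade_locations("!10 = !{i32 0, i32 0, !7, null}"))
```

Nodes that do not match the four-field pattern are left untouched, so the rewrite is safe to run over text that mixes locations with other metadata of different shapes.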
*	Don't create new comdats in CodeGen.	Rafael Espindola	2015-01-14	2	-17/+1
	This patch stops the implicit creation of comdats during codegen. Clang now sets the comdat explicitly when it is required. With this patch clang and gcc now produce the same result in pr19848. llvm-svn: 226038
*	ARM: add test for crc32 instructions in CodeGen.	Tim Northover	2015-01-14	1	-0/+58
	Somehow we seem to have ended up without any actual tests of the CodeGen side. Easy enough to fix. llvm-svn: 225930
*	Debug info: Factor out the creation of DWARF expressions from AsmPrinter	Adrian Prantl	2015-01-12	2	-6/+3
	into a new class DwarfExpression that can be shared between AsmPrinter and DwarfUnit. This is the first step towards unifying the two entirely redundant implementations of dwarf expression emission in DwarfUnit and AsmPrinter. Almost no functional change: testcases were updated because asm comments that used to be on two lines now appear on the same line, which is actually preferable. llvm-svn: 225706
*	Fix large stack alignment codegen for ARM and Thumb2 targets	Kristof Beyls	2015-01-08	5	-10/+174
	This partially fixes PR13007 (ARM CodeGen fails with large stack alignment): for ARM and Thumb2 targets, but not for Thumb1, as it seems stack alignment for Thumb1 targets hasn't been supported at all.
	Producing an aligned stack pointer is done by zeroing out the lower bits of the stack pointer. The BIC instruction was used for this. However, the immediate field of the BIC instruction can only encode an immediate that zeroes out at most the 8 lower bits. When a larger alignment is requested, a BIC instruction cannot be used; llvm was silently producing incorrect code in this case.
	This commit fixes code generation for large stack alignments by using the BFC instruction instead, when the BFC instruction is available. When not, it uses 2 instructions: a right shift, followed by a left shift to zero out the lower bits.
	The lowering of ARM::Int_eh_sjlj_dispatchsetup still has code that unconditionally uses BIC to realign the stack pointer, so it very likely has the same problem. However, I wasn't able to produce a test case for that. This commit adds an assert so that the compiler will fail the assert instead of silently generating wrong code if this is ever reached.
	llvm-svn: 225446
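The realignment strategies mentioned in this message can be compared with a short arithmetic sketch. The helper names are hypothetical, and `bic_can_encode` simply restates the message's description (at most the 8 lower bits can be cleared) rather than the full ARM immediate-encoding rules.

```python
MASK32 = 0xFFFFFFFF


def log2_exact(align: int) -> int:
    """Return log2(align), requiring a power-of-two alignment."""
    if align <= 0 or align & (align - 1):
        raise ValueError("alignment must be a power of two")
    return align.bit_length() - 1


def align_with_mask(sp: int, align: int) -> int:
    """Clear the low bits with a mask, as a BIC or BFC would."""
    return sp & ~(align - 1) & MASK32


def align_with_shifts(sp: int, align: int) -> int:
    """Clear the low bits with a right shift followed by a left shift."""
    n = log2_exact(align)
    return ((sp & MASK32) >> n << n) & MASK32


def bic_can_encode(align: int) -> bool:
    """Per the commit message: BIC can clear at most the 8 lower bits."""
    return log2_exact(align) <= 8


if __name__ == "__main__":
    sp = 0x00012345
    print(hex(align_with_mask(sp, 4096)), hex(align_with_shifts(sp, 4096)))
    print(bic_can_encode(256), bic_can_encode(512))
```

The shift pair costs one more instruction than a single mask, but it works for any power-of-two alignment, which matches its role as the fallback when BFC is unavailable.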
*	[ARM] Add missing Tag_DIV_use tests.	Charlie Turner	2015-01-07	1	-0/+8
	llvm-svn: 225348
*	Emit the build attribute Tag_conformance.	Charlie Turner	2015-01-05	1	-0/+1
	Claim conformance to version 2.09 of the ARM ABI. This build attribute must be emitted first amongst the build attributes when written to an object file. This is to simplify conformance detection by consumers. Change-Id: If9eddcfc416bc9ad6e5cc8cdcb05d0031af7657e llvm-svn: 225166
*	ARM: permit tail calls to weak externals on COFF	Saleem Abdulrasool	2015-01-03	1	-0/+19
	Weak externals are resolved statically, so we can actually generate the tail call on PE/COFF targets without breaking the requirements. It is questionable whether we want to propagate the current behaviour for MachO as the requirements are part of the ARM ELF specifications, and it seems that prior to the SVN r215890, we would have tail'ed the call. For now, be conservative and only permit it on PE/COFF where the call will always be fully resolved. llvm-svn: 225119
*	[ARM] Don't break alignment when combining base updates into load/stores.	Ahmed Bougacha	2014-12-23	3	-24/+105
	r223862/r224203 tried to also combine base-updating load/stores. There was a mistake there: the alignment was added as is as an operand to the ARMISD::VLD/VST node. However, the VLD/VST selection logic doesn't care about less-than-standard alignment attributes. For example, no matter the alignment of a v2i64 load (say 1), SelectVLD picks VLD1q64 (because of the memory type). But VLD1q64 ("vld1.64 {dXX, dYY}") is 8-aligned, per ARMARMv7a 3.2.1. For the 1-aligned load, what we really want is VLD1q8. This commit introduces bitcasts if necessary, and changes the vld/vst type to one whose standard alignment matches the original load/store alignment. Differential Revision: http://reviews.llvm.org/D6759 llvm-svn: 224754
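The type choice described here can be sketched as picking the widest element whose standard alignment does not exceed the access alignment. This is a simplified illustration for 128-bit VLD1 only; `vld1q_for_alignment` is a hypothetical name and the mapping is an assumption drawn from the two examples in the message (alignment 1 gives VLD1q8, alignment 8 gives VLD1q64).

```python
def vld1q_for_alignment(align_bytes: int, natural_element_bits: int = 64) -> str:
    """Pick a 128-bit VLD1 variant whose element alignment fits the access.

    Assumption for this sketch: the element size is capped both by the
    natural element size of the vector type and by the access alignment.
    """
    if align_bytes <= 0 or align_bytes & (align_bytes - 1):
        raise ValueError("alignment must be a power of two")
    element_bits = min(natural_element_bits, align_bytes * 8)
    return f"VLD1q{element_bits}"


if __name__ == "__main__":
    for align in (1, 2, 4, 8, 16):
        print(align, vld1q_for_alignment(align))
```

A load re-typed this way moves the same 16 bytes, so under this model a bitcast back to the original vector type is all that is needed afterwards.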
*	Convert a few tests to FileCheck. NFC.	Rafael Espindola	2014-12-22	4	-14/+38
	llvm-svn: 224705
*	Add a new string member to the TargetOptions struct for the name	Eric Christopher	2014-12-18	5	-9/+9
	of the abi we should be using. For targets that don't use the option there's no change, otherwise this allows external users to set the ABI via string and avoid some of the -backend-option pain in clang. Use this option to move the ABI for the ARM port from the Subtarget to the TargetMachine and update the testcases accordingly since it's no longer valid to set via -mattr. llvm-svn: 224492
*	Model ARM backend ABI selection after the front end code doing the	Eric Christopher	2014-12-18	5	-7/+8
	same. This will change the "bare metal" ABI from APCS to AAPCS. The only difference between the front and back end code is that the code for Triple::GNU was added for environment. That will migrate to the front end shortly. Tests updated with the ABI they were originally testing in the case of bare metal (e.g. -mtriple armv7) or with a -gnu for arm-linux triples. llvm-svn: 224489
*	[ARM] Prevent PerformVCVTCombine from combining a vmul/vcvt with 8 lanes	Bradley Smith	2014-12-16	1	-0/+26
	This would result in a crash since the vcvt used does not support v8i32 types. llvm-svn: 224332
*	IR: Make metadata typeless in assembly	Duncan P. N. Exon Smith	2014-12-15	37	-894/+894
	Now that `Metadata` is typeless, reflect that in the assembly. These are the matching assembly changes for the metadata/value split in r223802.
	- Only use the `metadata` type when referencing metadata from a call intrinsic -- i.e., only when it's used as a `Value`.
	- Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode` when referencing it from call intrinsics.
	So, assembly like this:
	  define @foo(i32 %v) {
	    call void @llvm.foo(metadata !{i32 %v}, metadata !0)
	    call void @llvm.foo(metadata !{i32 7}, metadata !0)
	    call void @llvm.foo(metadata !1, metadata !0)
	    call void @llvm.foo(metadata !3, metadata !0)
	    call void @llvm.foo(metadata !{metadata !3}, metadata !0)
	    ret void, !bar !2
	  }
	  !0 = metadata !{metadata !2}
	  !1 = metadata !{i32* @global}
	  !2 = metadata !{metadata !3}
	  !3 = metadata !{}
	turns into this:
	  define @foo(i32 %v) {
	    call void @llvm.foo(metadata i32 %v, metadata !0)
	    call void @llvm.foo(metadata i32 7, metadata !0)
	    call void @llvm.foo(metadata i32* @global, metadata !0)
	    call void @llvm.foo(metadata !3, metadata !0)
	    call void @llvm.foo(metadata !{!3}, metadata !0)
	    ret void, !bar !2
	  }
	  !0 = !{!2}
	  !1 = !{i32* @global}
	  !2 = !{!3}
	  !3 = !{}
	I wrote an upgrade script that handled almost all of the tests in llvm and many of the tests in cfe (even handling many `CHECK` lines). I've attached it (or will attach it in a moment if you're speedy) to PR21532 to help everyone update their out-of-tree testcases. This is part of PR21532.
	llvm-svn: 224257
*	Reapply "[ARM] Combine base-updating/post-incrementing vector load/stores."	Ahmed Bougacha	2014-12-13	4	-19/+443
	r223862 tried to also combine base-updating load/stores. r224198 reverted it, as "it created a regression on the test-suite on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order in which the words are shown." Reapply, with a fix to ignore non-normal load/stores. Truncstores are handled elsewhere (you can actually write a pattern for those, whereas for postinc loads you can't, since they return two values), but it should be possible to also combine extloads base updates, by checking that the memory (rather than result) type is of the same size as the addend. Original commit message: We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD when the base pointer is incremented after the load/store. We can do the same thing for generic load/stores. Note that we can only combine the first load/store+adds pair in a sequence (as might be generated for a v16f32 load for instance), because other combines turn the base pointer addition chain (each computing the address of the next load, from the address of the last load) into independent additions (common base pointer + this load's offset). Differential Revision: http://reviews.llvm.org/D6585 llvm-svn: 224203
*	Revert "[ARM] Combine base-updating/post-incrementing vector load/stores."	Renato Golin	2014-12-13	4	-381/+19
	This reverts commit r223862, as it created a regression on the test-suite on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order in which the words are shown. We'll investigate the issue and re-apply when safe. llvm-svn: 224198
*	Emit Tag_ABI_FP_16bit_format build attribute.	Charlie Turner	2014-12-12	1	-0/+21
	The __fp16 type is unconditionally exposed. Since -mfp16-format is not yet supported, there is not a user switch to change this behaviour. This build attribute should capture the default behaviour of the compiler, which is to expose the IEEE 754 version of __fp16. When -mfp16-format is emitted, that will be the way to control the value of this build attribute. Change-Id: I8a46641ff0fd2ef8ad0af5f482a6d1af2ac3f6b0 llvm-svn: 224115
*	ARM: correctly expand LDR-lit based globals.	Tim Northover	2014-12-10	3	-3/+4
	Quite a major error here: the expansions for the Pseudos with and without folded load were mixed up. Fortunately it only affects ARM-mode, when not using movw/movt, on Darwin. I'm guessing no-one actually uses that combination. llvm-svn: 223986
*	[ARM] Combine base-updating/post-incrementing vector load/stores.	Ahmed Bougacha	2014-12-10	4	-19/+381
	We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD when the base pointer is incremented after the load/store. We can do the same thing for generic load/stores. Note that we can only combine the first load/store+adds pair in a sequence (as might be generated for a v16f32 load for instance), because other combines turn the base pointer addition chain (each computing the address of the next load, from the address of the last load) into independent additions (common base pointer + this load's offset). Differential Revision: http://reviews.llvm.org/D6585 llvm-svn: 223862
*	[ARM] Make testcase more explicit. NFC.	Ahmed Bougacha	2014-12-09	1	-30/+49
	llvm-svn: 223841
*	[ARM] Also support v2f64 vld1/vst1.	Ahmed Bougacha	2014-12-09	2	-0/+19
	It was missing from the VLD1/VST1 handling logic, even though the corresponding instructions exist (same form as v2i64). In preparation for a future patch. llvm-svn: 223832
*	Add missing FP build attribute tests.	Charlie Turner	2014-12-05	1	-28/+148
	The test file test/CodeGen/ARM/build-attributes.ll was missing several floating-point build attribute tests. The intention of this commit is that for each CPU / architecture currently tested, there are now tests that make sure the following attributes are sufficiently checked,
	* Tag_ABI_FP_rounding
	* Tag_ABI_FP_denormal
	* Tag_ABI_FP_exceptions
	* Tag_ABI_FP_user_exceptions
	* Tag_ABI_FP_number_model
	Also in this commit, the -unsafe-fp-math flag has been augmented with the full suite of flags Clang sends to LLVM when you pass -ffast-math to Clang. That is, `-unsafe-fp-math' has been changed to `-enable-unsafe-fp-math -disable-fp-elim -enable-no-infs-fp-math -enable-no-nans-fp-math -fp-contract=fast'
	Change-Id: I35d766076bcbbf09021021c0a534bf8bf9a32dfc
	llvm-svn: 223454
*	Fix thumbv4t indirect calls	Jonathan Roelofs	2014-12-04	2	-2/+46
	So there are a couple of issues with indirect calls on thumbv4t. First, the most 'obvious' instruction, 'blx' isn't available until v5t. And secondly, the next-most-obvious sequence: 'mov lr, pc; bx rN' doesn't DTRT in thumb code because the saved off pc has its thumb bit cleared, so when the callee returns we end up in ARM mode.... yuck. The solution is to 'bl' to a nearby landing pad with a 'bx rN' in it. We could cut down on code size by sharing the landing pads between call sites that are close enough, but for the moment let's do correctness first and look at performance later. Patch by: Iain Sandoe http://reviews.llvm.org/D6519 llvm-svn: 223380
*	Emit ABI_FP_rounding attribute.	Charlie Turner	2014-12-03	1	-0/+22
	LLVM understands a -enable-sign-dependent-rounding-fp-math codegen option. When the user has specified this option, the Tag_ABI_FP_rounding attribute should be emitted with value 1. This option currently does not appear to disable transformations and optimizations that assume default floating point rounding behavior, AFAICT, but the intention should be recorded in the build attributes, regardless of what the compiler actually does with the intention. Change-Id: If838578df3dc652b6f2796b8d152545674bcb30e llvm-svn: 223218
*	Add tests for default value of Tag_ABI_FP_rounding.	Charlie Turner	2014-12-03	1	-0/+48
	Change-Id: I051866d073fc6ce87ce3e693a3762da6d81f4393 llvm-svn: 223217