bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Fix the ARM IIC_iCMPsi itinerary and add an important assert.	Andrew Trick	2011-01-04	1	-0/+1
	llvm-svn: 122794
*	Various bits of framework needed for precise machine-level selection	Andrew Trick	2010-12-24	1	-0/+18
	DAG scheduling during isel. Most new functionality is currently guarded by -enable-sched-cycles and -enable-sched-hazard.
	Added InstrItineraryData::IssueWidth field, currently derived from ARM itineraries, but could be initialized differently on other targets.
	Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is active, and if so how many cycles of state it holds.
	Added SchedulingPriorityQueue::HasReadyFilter to allow gating entry into the scheduler's available queue.
	ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to get information about its SUnits, provides RecedeCycle for bottom-up scheduling, correctly computes scoreboard depth, tracks IssueCount, and considers potential stall cycles when checking for hazards.
	ScheduleDAGRRList now models machine cycles and hazards (under flags). It tracks MinAvailableCycle, drives the hazard recognizer and priority queue's ready filter, manages a new PendingQueue, properly accounts for stall cycles, etc.
	llvm-svn: 122541
*	whitespace	Andrew Trick	2010-12-24	1	-2/+2
	llvm-svn: 122539
*	Making use of VFP / NEON floating point multiply-accumulate / subtraction is	Evan Cheng	2010-12-05	1	-1/+1
	difficult on current ARM implementations for a few reasons.
	1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) cycles can cause additional pipeline stall. So it's frequently better to simply codegen vmul + vadd.
	2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart.
	3. A vmla followed by a vmla is a special case. Obviously, issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either:
		vmul
		vadd
		vmla
	Instead, we want to expand the second vmla:
		vmla
		vmul
		vadd
	Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster.
	Up to now, isel simply avoided codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable.
	A. Add missing isel predicates which cause vmla to be codegen'ed.
	B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla.
	C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case).
	D. Add ARM hazard recognizer to model the vmla / vmls hazards.
	E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards.
	Work in progress, only A+B are enabled.
	llvm-svn: 120960
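	The stall rule in reason 2 of the message above can be sketched as a small lookup. This is an illustrative model only, not the ARM hazard recognizer added by the patch: the names FpOp and stallAfter are hypothetical, vmls is assumed to behave like vmla, "vmadd" is read here as vadd, and the vmla + vmla special case from reason 3 is deliberately left out.

```cpp
// Hypothetical opcode classes for the sketch; not LLVM's opcode enum.
enum class FpOp { VMLA, VMLS, VMUL, VADD, VSUB, Other };

// Assumption: vmls triggers the same hazard as vmla.
constexpr bool isMulAcc(FpOp Op) {
  return Op == FpOp::VMLA || Op == FpOp::VMLS;
}

// Cycles the second fp instruction stalls when issued right after Prev,
// following the "stall for 4 cycles" rule quoted in the commit message.
// The vmla-after-vmla case is NOT modeled and simply reports 0 here.
constexpr unsigned stallAfter(FpOp Prev, FpOp Next) {
  return (isMulAcc(Prev) &&
          (Next == FpOp::VMUL || Next == FpOp::VADD || Next == FpOp::VSUB))
             ? 4u
             : 0u;
}
```

	A scheduler using such a table would try to place unrelated instructions between the pair whenever stallAfter reports a nonzero value.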
*	Define the subtarget feature for the architecture version,	Bob Wilson	2010-11-09	1	-15/+40
	as derived from the target triple. This is important for enabling features that are implied based on the architecture version. llvm-svn: 118643
*	Fix preload instruction isel. Only v7 supports pli, and only v7 with mp ↵	Evan Cheng	2010-11-03	1	-0/+1
	extension supports pldw. Add subtarget attribute to denote mp extension support and legalize illegal ones to nothing. llvm-svn: 118160
*	PR8359: The ARM backend may end up allocating registers D16 to D31 when	Bob Wilson	2010-10-12	1	-0/+1
	"-mattr=+vfp3" is specified. However, this will not work for hardware that only supports 16 registers. Add a new flag to support "-mattr=+vfp3,+d16". Patch by Jan Voung! llvm-svn: 116310
*	Add a subtarget hook for reporting the misprediction penalty. Use this to ↵	Owen Anderson	2010-09-28	1	-0/+12
	provide more precise cost modeling for if-conversion. Now if only we had a way to estimate the misprediction probability. Adjust CodeGen/ARM/ifcvt10.ll. The pipeline on Cortex-A8 is long enough that it is still profitable to predicate an ldm, but the shorter pipeline on Cortex-A9 makes it unprofitable. llvm-svn: 114995
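	The trade-off described above can be sketched as a toy cost comparison. This is a hypothetical model, not the if-converter's actual heuristic: the function name, its parameters, and the assumed misprediction probability are all illustrative, and the commit itself notes that no probability estimate was available.

```cpp
#include <cassert>

// Hypothetical cost model: predicating is considered profitable when the
// cycles spent on the predicated instructions do not exceed the expected
// cost of a mispredicted branch (probability times penalty).
bool isProfitableToPredicate(unsigned PredicatedCycles,
                             unsigned MispredictPenalty,
                             double AssumedMispredictProb) {
  assert(AssumedMispredictProb >= 0.0 && AssumedMispredictProb <= 1.0);
  const double ExpectedBranchCost = AssumedMispredictProb * MispredictPenalty;
  return PredicatedCycles <= ExpectedBranchCost;
}
```

	With made-up inputs (a 5-cycle instruction, a 50% misprediction probability), a penalty of 13 cycles makes predication profitable while a penalty of 8 does not, which mirrors the direction of the Cortex-A8 versus Cortex-A9 remark without claiming those are the real penalties.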
*	Add a command line option "-arm-strict-align" to disallow unaligned memory	Bob Wilson	2010-09-28	1	-0/+10
	accesses for ARM targets that would otherwise allow it. Radar 8465431. llvm-svn: 114941
*	Teach if-converter to be more careful with predicating instructions that would	Evan Cheng	2010-09-10	1	-1/+2
	take multiple cycles to decode. For the current if-converter clients (actually only ARM), the instructions that are predicated on false are not nops. They would still take machine cycles to decode. Micro-coded instructions such as LDM / STM can potentially take multiple cycles to decode. If-converter should not treat them as non-micro-coded simple instructions. llvm-svn: 113570
*	cortex m4 has floating point support, but only single precision.	Jim Grosbach	2010-08-11	1	-0/+1
	llvm-svn: 110810
*	Report error if codegen tries to instantiate a ARM target when the cpu does ↵	Evan Cheng	2010-08-11	1	-0/+1
	not support it, e.g. cortex-m* processors. llvm-svn: 110798
*	- Add subtarget feature -mattr=+db which determine whether an ARM cpu has the	Evan Cheng	2010-08-11	1	-0/+1
	memory and synchronization barrier dmb and dsb instructions.
	- Change instruction names to something more sensible (matching name of actual instructions).
	- Added tests for memory barrier codegen.
	llvm-svn: 110785
*	Explicitly initialize SlowFPBrcc and Pref32BitThumb to false.	Evan Cheng	2010-08-09	1	-0/+2
	llvm-svn: 110587
*	Cleanup of ARMv7M support. Move hardware divide and Thumb2 extract/pack	Jim Grosbach	2010-05-05	1	-0/+2
	instructions to subtarget features and update tests to reflect. PR5717. llvm-svn: 103136
*	Add initial support for ARMv7M subtarget and cortex-m3 cpu. Patch by	Jim Grosbach	2010-05-05	1	-0/+2
	Jordy <snhjordy@gmail.com>. Followup patches will add some tests and adjust to use Subtarget features for the instructions. llvm-svn: 103119
*	Add const qualifiers to CodeGen's use of LLVM IR constructs.	Dan Gohman	2010-04-15	1	-1/+2
	llvm-svn: 101334
*	switch the flag for using NEON for SP floating point to a subtarget 'feature'.	Jim Grosbach	2010-03-25	1	-13/+1
	Re-commit. This time complete with testsuite updates. llvm-svn: 99570
*	need to fix 'make check' tests first. revert for a moment.	Jim Grosbach	2010-03-25	1	-1/+13
	llvm-svn: 99569
*	switch the flag for using NEON for SP floating point to a subtarget 'feature'	Jim Grosbach	2010-03-25	1	-13/+1
	llvm-svn: 99568
*	switch the use-vml[as] instructions flag to a subtarget 'feature'	Jim Grosbach	2010-03-25	1	-11/+1
	llvm-svn: 99565
*	ARM cortex-a8 doesn't do vmla/vmls well. disable them by default for that cpu	Jim Grosbach	2010-03-25	1	-0/+6
	llvm-svn: 99549
*	Make the use of the vmla and vmls VFP instructions controllable via cmd line.	Jim Grosbach	2010-03-24	1	-0/+5
	Preliminary testing shows significant performance wins by not using these instructions. llvm-svn: 99436
*	Add subtarget feature for FP16	Anton Korobeynikov	2010-03-14	1	-0/+1
	llvm-svn: 98503
*	Initial bits of ARMv4-only support.	Anton Korobeynikov	2010-03-06	1	-16/+24
	Patch by John Tytgat! llvm-svn: 97886
*	Kill ModuleProvider and ghost linkage by inverting the relationship between	Jeffrey Yasskin	2010-01-27	1	-3/+3
	Modules and ModuleProviders. Because the "ModuleProvider" simply materializes GlobalValues now, and doesn't provide modules, it's renamed to "GVMaterializer". Code that used to need a ModuleProvider to materialize Functions can now materialize the Functions directly. Functions no longer use a magic linkage to record that they're materializable; they simply ask the GVMaterializer.
	Because the C ABI must never change, we can't remove LLVMModuleProviderRef or the functions that refer to it. Instead, because Module now exposes the same functionality ModuleProvider used to, we store a Module* in any LLVMModuleProviderRef and translate in the wrapper methods. The bindings to other languages still use the ModuleProvider concept. It would probably be worth some time to update them to follow the C++ more closely, but I don't intend to do it.
	Fixes http://llvm.org/PR5737 and http://llvm.org/PR5735.
	llvm-svn: 94686
*	Remove isProfitableToDuplicateIndirectBranch target hook. It is profitable	Bob Wilson	2009-11-30	1	-2/+0
	for all the processors where I have tried it, and even when it might not help performance, the cost is quite low. The opportunities for duplicating indirect branches are limited by other factors so code size does not change much due to tail duplicating indirect branches aggressively. llvm-svn: 90144
*	Materialize global addresses via movt/movw pair, this is always better	Anton Korobeynikov	2009-11-24	1	-0/+5
	than doing the same via constpool:
	1. Load from constpool costs 3 cycles on A9, movt/movw pair - just 2.
	2. Load from constpool might stall up to 300 cycles due to cache miss.
	3. Movt/movw does not use load/store unit.
	4. Fewer constpool entries => better compiler performance.
	This is only enabled on ELF systems, since darwin does not have needed relocations (yet).
	llvm-svn: 89720
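	For readers unfamiliar with the movw/movt pair mentioned above, the following sketch shows how a 32-bit address is split into two 16-bit immediates and rebuilt. The struct and function names are hypothetical and only emulate the register effect; they are not part of the ARM backend.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the two immediates; not an LLVM type.
struct MovPair {
  uint16_t Lo16; // immediate for movw: sets bits 0-15, clears bits 16-31
  uint16_t Hi16; // immediate for movt: sets bits 16-31, keeps bits 0-15
};

// Split a 32-bit address into the halves each instruction would carry.
MovPair splitAddress(uint32_t Addr) {
  return {static_cast<uint16_t>(Addr & 0xFFFFu),
          static_cast<uint16_t>(Addr >> 16)};
}

// Emulate the register contents after "movw rD, #Lo16; movt rD, #Hi16".
uint32_t emulateMovwMovt(MovPair P) {
  uint32_t Reg = P.Lo16;       // movw leaves the upper half zero
  assert((Reg >> 16) == 0);
  Reg = (Reg & 0xFFFFu) | (static_cast<uint32_t>(P.Hi16) << 16); // movt
  return Reg;
}
```

	Both immediates live in the instruction stream, so no data load is involved, which is the basis for points 1 to 3 in the message.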
*	Add a target hook to allow changing the tail duplication limit based on the	Bob Wilson	2009-11-18	1	-0/+2
	contents of the block to be duplicated. Use this for ARM Cortex A8/9 to be more aggressive tail duplicating indirect branches, since it makes it much more likely that they will be predicted in the branch target buffer. Testcase coming soon. llvm-svn: 89187
*	Allow target to specify regclass for which antideps will only be broken ↵	David Goodwin	2009-11-13	1	-3/+3
	along the critical path. llvm-svn: 88682
*	Fixed to address code review. No functional changes.	David Goodwin	2009-11-10	1	-0/+11
	llvm-svn: 86634
*	I am no spelling bee.	Evan Cheng	2009-10-16	1	-1/+1
	llvm-svn: 84250
*	Enable post-alloc scheduling for all ARM variants except for Thumb1.	Evan Cheng	2009-10-16	1	-3/+5
	llvm-svn: 84249
*	Add comment.	Evan Cheng	2009-10-16	1	-0/+2
	llvm-svn: 84246
*	Remove neonfp attribute and instead set default based on CPU string. Add ↵	David Goodwin	2009-10-01	1	-1/+7
	-arm-use-neon-fp to override the default. llvm-svn: 83218
*	Restore the -post-RA-scheduler flag as an override for the target ↵	David Goodwin	2009-10-01	1	-0/+5
	specification. Remove -mattr for setting PostRAScheduler enable and instead use CPU string. llvm-svn: 83215
*	Remove -post-RA-schedule flag and add a TargetSubtarget method to enable ↵	David Goodwin	2009-09-30	1	-0/+1
	post-register-allocation scheduling. By default it is off. For ARM, enable/disable with -mattr=+/-postrasched. Enable by default for cortex-a8. llvm-svn: 83122
*	Reference to hidden symbols do not have to go through non-lazy pointer in ↵	Evan Cheng	2009-09-03	1	-6/+46
	non-pic mode. rdar://7187172. llvm-svn: 80904
*	Let Darwin linker auto-synthesize stubs and lazy-pointers. This deletes a ↵	Evan Cheng	2009-08-28	1	-0/+11
	bunch of nasty code in ARM asm printer. llvm-svn: 80404
*	Remove some dead code.	Daniel Dunbar	2009-08-05	1	-4/+0
	llvm-svn: 78219
*	Initial support for single-precision FP using NEON. Added "neonfp" attribute ↵	David Goodwin	2009-08-04	1	-0/+1
	to enable. Added patterns for some binary FP operations. llvm-svn: 78081
*	Normalize Subtarget constructors to take a target triple string instead of	Daniel Dunbar	2009-08-02	1	-4/+1
	Module*. Also, dropped uses of TargetMachine where unnecessary. The only target which still takes a TargetMachine& is Mips; I would appreciate it if someone would normalize this to match other targets. llvm-svn: 77918
*	Fix Thumb2 function call isel. Thumb1 and Thumb2 should share the same	Evan Cheng	2009-08-01	1	-0/+4
	instructions for calls since BL and BLX are always 32-bit long and BX is always 16-bit long. Also, we should be using BLX to call external function stubs. llvm-svn: 77756
*	Use thumb2 for ARM architectures V6T2 and later. Fix a bug in checking	Bob Wilson	2009-06-22	1	-4/+8
	for "thumb" and add a check for V6T2. llvm-svn: 73905
*	For Darwin on ARMv6 and newer, make register r9 available for use as a	Bob Wilson	2009-06-22	1	-2/+7
	caller-saved register. llvm-svn: 73901
*	Remove UseThumbBacktraces. Just check if subtarget is darwin.	Evan Cheng	2009-06-18	1	-4/+1
	llvm-svn: 73734
*	The attached patches implement most of the ARM AAPCS-VFP hard float	Anton Korobeynikov	2009-06-08	1	-0/+6
	ABI. The missing piece is support for putting "homogeneous aggregates" into registers. Patch by Sandeep Patel! llvm-svn: 73095
*	Implement review feedback. Make thumb2 'normal' subtarget feature	Anton Korobeynikov	2009-06-01	1	-9/+6
	llvm-svn: 72698
*	Add placeholder for thumb2 stuff	Anton Korobeynikov	2009-05-29	1	-5/+13
	llvm-svn: 72593
*	Add ARMv7 architecture, Cortex processors and different FPU modes handling.	Anton Korobeynikov	2009-05-23	1	-1/+1
	llvm-svn: 72337