path: root/llvm/lib/Target
Commit message, Author, Age, Files, Lines
...
* Extend the IL for selecting TLS models (PR9788)Hans Wennborg2012-06-232-8/+57
| | | | | | | | | | | | | | | This allows the user/front-end to specify a model that is better than what LLVM would choose by default. For example, a variable might be declared as @x = thread_local(initialexec) global i32 42 if it will not be used in a shared library that is dlopen'ed. If the specified model isn't supported by the target, or if LLVM can make a better choice, a different model may be used. llvm-svn: 159077
* Use correct memory types for (V)CVTDQ2PD instructions.Craig Topper2012-06-231-3/+3
| | | | llvm-svn: 159075
* Silence an unused variable warning on release builds.Craig Topper2012-06-231-2/+2
| | | | llvm-svn: 159074
* Compress flags in X86 op folding to reduce space in static tables.Craig Topper2012-06-231-16/+16
| | | | llvm-svn: 159073
* Make helper method static since it doesn't use anything in the class.Craig Topper2012-06-231-3/+3
| | | | llvm-svn: 159071
* Remove intrinsic specific instructions for 128-bit (V)CVTDQ2PD. Replace with ↵Craig Topper2012-06-232-26/+9
| | | | | | intrinsic patterns. Mem forms omitted because the load size is only 64-bits. llvm-svn: 159070
* Handle aliases to tls variables in all architectures, not just x86.Rafael Espindola2012-06-232-8/+11
| | | | llvm-svn: 159058
* (sub X, imm) gets canonicalized to (add X, -imm)Evan Cheng2012-06-233-7/+21
| | | | | | | | | | | | | | | There are patterns to handle immediates when they fit in the immediate field. e.g. %sub = add i32 %x, -123 => sub r0, r0, #123 Add patterns to catch immediates that do not fit but should be materialized with a single movw instruction rather than movw + movt pair. e.g. %sub = add i32 %x, -65535 => movw r1, #65535 sub r0, r0, r1 rdar://11726136 llvm-svn: 159057
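The selection described in this commit can be sketched as follows. This is a hedged illustration rather than the TableGen patterns from the patch: the function names are hypothetical, the registers are fixed to r0/r1 for readability, and the encodability check models only the ARM-mode modified immediate (an 8-bit value rotated right by an even amount).

```python
def is_arm_modified_imm(value):
    """True if value is an 8-bit constant rotated right by an even amount."""
    value &= 0xFFFFFFFF
    for rot in range(0, 32, 2):
        # Rotating left by `rot` undoes a rotate-right by `rot`.
        rotated = ((value << rot) | (value >> (32 - rot))) & 0xFFFFFFFF
        if rotated < 0x100:
            return True
    return False


def select_add_negative(neg_imm):
    """Pick a sequence for `add x, neg_imm` (neg_imm < 0). Hypothetical helper."""
    imm = -neg_imm
    if is_arm_modified_imm(imm):
        # The negated constant fits the immediate field: one sub.
        return [f"sub r0, r0, #{imm}"]
    if imm <= 0xFFFF:
        # The negated constant fits in 16 bits: a single movw suffices.
        return [f"movw r1, #{imm}", "sub r0, r0, r1"]
    # Otherwise materialize the original negative constant with movw + movt.
    bits = neg_imm & 0xFFFFFFFF
    return [f"movw r1, #{bits & 0xFFFF}", f"movt r1, #{bits >> 16}",
            "add r0, r0, r1"]
```

With these assumptions, `select_add_negative(-123)` yields a single `sub`, and `select_add_negative(-65535)` yields the `movw` + `sub` pair quoted in the commit message.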
* ARM: Add a better diagnostic for some out of range immediates.Jim Grosbach2012-06-222-2/+13
| | | | | | | | | | | As an example of how the custom DiagnosticType can be used to provide better operand-mismatch diagnostics, add a custom diagnostic for the imm0_15 operand class used for several system instructions. Update the tests to expect the improved diagnostic. rdar://8987109 llvm-svn: 159051
* Add support for the PPC isel instruction.Hal Finkel2012-06-228-14/+84
| | | | | | | The isel (integer select) instruction is supported on the 440 and A2 embedded cores and on the POWER7. llvm-svn: 159045
* Whitespace.Chad Rosier2012-06-221-8/+8
| | | | llvm-svn: 159035
* Revert r158679 - use case is unclear (and it increases the memory footprint).Hal Finkel2012-06-222-2/+2
| | | | | | | | | | Original commit message: Allow up to 64 functional units per processor itinerary. This patch changes the type used to hold the FU bitset from unsigned to uint64_t. This will be needed for some upcoming PowerPC itineraries. llvm-svn: 159027
* Use "NoItineraries" for processors with no itineraries.Andrew Trick2012-06-224-9/+2
| | | | | | | | This makes it explicit when ScoreboardHazardRecognizer will be used. "GenericItineraries" would only make sense if it contained real itinerary values and still required ScoreboardHazardRecognizer. llvm-svn: 158963
* Functions calling __builtin_eh_return must have a frame pointer.Jakob Stoklund Olesen2012-06-221-1/+1
| | | | | | | | | | | | | | The code in X86TargetLowering::LowerEH_RETURN() assumes that a frame pointer exists, but the frame pointer was forced by the presence of llvm.eh.unwind.init which isn't guaranteed. If llvm.eh.unwind.init is actually required in functions calling eh.return (is it?), we should diagnose that instead of emitting bad machine code. This should fix the dragonegg-x86_64-linux-gcc-4.6-test bot. llvm-svn: 158961
* ARM scheduling fix: don't guess at implicit operand latency.Andrew Trick2012-06-221-5/+9
| | | | | | | | | | This is a minor drive-by fix with no robust way to unit test. As an example see neon-div.ll: SU(16): %Q8<def> = VMOVLsv4i32 %D17, pred:14, pred:%noreg, %Q8<imp-use,kill> val SU(1): Latency=2 Reg=%Q8 ...should be latency=1 llvm-svn: 158960
* ARM scheduling fix: compute predicated implicit use properly.Andrew Trick2012-06-221-3/+1
| | | | | | | | Minor drive by fix to cleanup latency computation. Calling getOperandLatency with a deliberately incorrect operand index does not give you the latency you want. llvm-svn: 158959
* Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from aLang Hames2012-06-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | boolean flag to an enum: { Fast, Standard, Strict } (default = Standard). This option controls the creation by optimizations of fused FP ops that store intermediate results in higher precision than IEEE allows (e.g. FMAs). The behavior of this option is intended to match the behavior specified by a soon-to-be-introduced frontend flag: '-ffuse-fp-ops'. Fast mode - allows formation of fused FP ops whenever they're profitable. Standard mode - allows fusion only for 'blessed' FP ops. At present the only blessed op is the fmuladd intrinsic. In the future more blessed ops may be added. Strict mode - allows fusion only if/when it can be proven that the excess precision won't affect the result. Note: This option only controls formation of fused ops by the optimizers. Fused operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic) will always be honored, regardless of the value of this option. Internally TargetOptions::AllowExcessFPPrecision has been replaced by TargetOptions::AllowFPOpFusion. llvm-svn: 158956
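The three modes can be read as a small decision table. The sketch below is only a model of the commit message's wording: `FPOpFusionMode`, `may_fuse`, and the boolean parameters are hypothetical names, and profitability in Fast mode is not modeled.

```python
from enum import Enum


class FPOpFusionMode(Enum):
    FAST = "fast"
    STANDARD = "standard"   # the default, per the commit message
    STRICT = "strict"


def may_fuse(mode, is_blessed_op=False, provably_exact=False,
             explicitly_requested=False):
    """Decide whether a fused FP op may be used (hypothetical helper)."""
    if explicitly_requested:
        # Explicit requests (e.g. llvm.fma.*) are honored in every mode.
        return True
    if mode is FPOpFusionMode.FAST:
        return True              # assumes the fusion is profitable
    if mode is FPOpFusionMode.STANDARD:
        return is_blessed_op     # currently only fmuladd is blessed
    return provably_exact        # STRICT: the result must not change
```

For example, `may_fuse(FPOpFusionMode.STANDARD)` is false for an ordinary fmul + fadd pair, while the same call with `is_blessed_op=True` is true.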
* Convert the PPC backend to use the new FMA infrastructure.Hal Finkel2012-06-224-42/+48
| | | | | | | The existing contraction patterns are replaced with fma/fneg. Overall functionality should be the same. llvm-svn: 158955
* 1. fix null program output after some other changesAkira Hatanaka2012-06-213-9/+30
| | | | | | | | | 2. re-enable null.ll test 3. fix some minor style violations Patch by Reed Kotler. llvm-svn: 158935
* Treat TargetGlobalAddress as a constant for the purpose of matching pre-inc ↵Hal Finkel2012-06-211-1/+6
| | | | | | | | stores on PPC. Thanks to Tobias von Koch for pointing out this problem. llvm-svn: 158932
* The inline asm operand modifier 'c' is supposed Jack Carter2012-06-211-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to be generic across architectures. It has the following description in the GNU sources: Substitute immediate value without immediate syntax. Several architectures such as x86 have local implementations of operand modifier 'c' which go slightly beyond the above description. To make use of the generic modifiers without overriding the local implementation, one can call the base class method AsmPrinter::PrintAsmOperand() in the locally derived method's "default" case in the switch statement. That way, if the modifier is already defined locally, the generic version will never get called. This change was needed because test/CodeGen/generic/asm-large-immediate.ll failed on a native Mips board. The test was assuming a generic implementation was in place. Affected files: lib/Target/Mips/MipsAsmPrinter.cpp: Changed the default case to call the base method. lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp: Added 'c' to the switch cases. test/CodeGen/Mips/asm-large-immediate.ll: Mips compiled version of the generic one. Contributor: Jack Carter llvm-svn: 158925
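The fallback pattern described here, where a target handles its own modifiers and defers everything else to the generic printer, can be sketched as below. The class names and the `x` modifier are hypothetical stand-ins, not the real AsmPrinter interface; only the handling of `c` follows the description quoted in the commit.

```python
class GenericAsmPrinter:
    """Stand-in for the base class holding the generic modifiers."""

    def print_asm_operand(self, operand, modifier):
        if modifier == "c":
            # Immediate value without immediate syntax (no '#' or '$').
            return str(operand)
        raise ValueError(f"unknown operand modifier: {modifier!r}")


class TargetAsmPrinter(GenericAsmPrinter):
    """Stand-in for a target printer with one local modifier."""

    def print_asm_operand(self, operand, modifier):
        if modifier == "x":          # hypothetical target-specific case
            return hex(operand)
        # "default" case: fall back to the generic implementation.
        return super().print_asm_operand(operand, modifier)
```

A target that defines `c` itself would simply add its own branch before the fallback, so the generic version would never be reached for that modifier.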
* Add a missing llvm.fma -> VFNMS pattern to the ARM backend.Lang Hames2012-06-211-0/+8
| | | | llvm-svn: 158902
* Revert r158846.Akira Hatanaka2012-06-202-128/+166
| | | | llvm-svn: 158855
* In MipsDisassembler.cpp, instead of defining register class tables, use the onesAkira Hatanaka2012-06-202-166/+128
| | | | | | | | | | | that are generated by TableGen and are already available in MipsGenRegisterInfo.inc. Suggested by Jakob Stoklund Olesen. Also, fix bug in function DecodeAFGR64RegisterClass. Patch by Vladimir Medic. llvm-svn: 158846
* Add support for generating reg+reg (indexed) pre-inc loads on PPC.Hal Finkel2012-06-204-10/+106
| | | | llvm-svn: 158823
* Remove 'static' from inline functions defined in header files.Chandler Carruth2012-06-201-7/+7
| | | | | | | | | | | | | | | | | There is a pretty staggering amount of this in LLVM's header files, this is not all of the instances I'm afraid. These include all of the functions that (in my build) are used by a non-static inline (or external) function. Specifically, these issues were caught by the new '-Winternal-linkage-in-inline' warning. I'll try to just clean up the remainder of the clearly redundant "static inline" cases on functions (not methods!) defined within headers if I can do so in a reliable way. There were even several cases of a missing 'inline' altogether, or my personal favorite "static bool inline". Go figure. ;] llvm-svn: 158800
* Add predicate check around some patterns.Craig Topper2012-06-201-35/+37
| | | | llvm-svn: 158797
* Add predicate check around some patterns.Craig Topper2012-06-201-18/+22
| | | | llvm-svn: 158795
* Don't insert 128-bit UNDEF into 256-bit vectors. Just keep the 256-bit ↵Craig Topper2012-06-201-3/+6
| | | | | | vector. Original patch by Elena Demikhovsky. Tweaked by me to allow possibility of covering more cases. llvm-svn: 158792
* Add DAG-combines for aggressive FMA formation.Lang Hames2012-06-192-2/+2
| | | | | | | | | | | | | | | | | | | | This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or FSUB + FMUL. The combines are performed when: (a) Either AllowExcessFPPrecision option (-enable-excess-fp-precision for llc) OR UnsafeFPMath option (-enable-unsafe-fp-math) are set, and (b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of the FADD/FSUB, and (c) The FMUL only has one user (the FADD/FSUB). If your target has fast FMA instructions you can make use of these combines by overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for types supported by your FMA instruction, and adding patterns to match ISD::FMA to your FMA instructions. llvm-svn: 158757
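Conditions (a) to (c) amount to a simple predicate, and the reason the combine is gated at all is that a fused multiply-add rounds once where fmul + fadd rounds twice. The sketch below illustrates both points under stated assumptions: `should_form_fma` and its parameters are hypothetical names, and the fused result is emulated with exact rationals rather than a hardware FMA.

```python
from fractions import Fraction


def should_form_fma(allow_excess_precision, unsafe_fp_math,
                    fma_faster_for_type, fmul_num_users):
    """Conditions (a), (b) and (c) from the commit message."""
    return ((allow_excess_precision or unsafe_fp_math)
            and fma_faster_for_type
            and fmul_num_users == 1)


def fused_multiply_add(a, b, c):
    """a*b + c with a single rounding, emulated with exact arithmetic."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
unfused = a * b - 1.0                   # a*b is rounded to 1.0 first
fused = fused_multiply_add(a, b, -1.0)  # keeps the exact product
```

Here the exact product is 1 - 2^-54, which rounds to 1.0 in double precision, so `unfused` is 0.0 while `fused` is -2^-54. This difference is the excess precision that the option refers to.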
* Implement PPCInstrInfo::isCoalescableExtInstr().Jakob Stoklund Olesen2012-06-192-0/+19
| | | | | | | | | | | | | The PPC::EXTSW instruction preserves the low 32 bits of its input, just like some of the x86 instructions. Use it to reduce register pressure when the low 32 bits have multiple uses. This requires a small change to PeepholeOptimizer since EXTSW takes a 64-bit input register. This is related to PR5997. llvm-svn: 158743
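The property this commit relies on, that sign extension from 32 to 64 bits leaves the low 32 bits unchanged, can be checked with a small model. The function below is an illustrative stand-in for the instruction's semantics, not code from the patch.

```python
MASK32 = 0xFFFFFFFF


def extsw(reg64):
    """Sign-extend the low 32 bits of a 64-bit register value."""
    low = reg64 & MASK32
    if low & 0x80000000:
        return low | 0xFFFFFFFF00000000   # replicate the sign bit upward
    return low


# The low word of the result always equals the low word of the input,
# so users of the low 32 bits can read either register.
for value in (0x5, 0x0000123480000001, 0xDEADBEEF7FFFFFFF):
    assert (extsw(value) & MASK32) == (value & MASK32)
```

Because the low word survives, uses of the low 32 bits can share the extended register instead of keeping the source register live.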
* Have ARM ELF use correct reloc for "b" instr.Jan Wen Voung2012-06-191-5/+3
| | | | | | | The condition code didn't actually matter for arm "b" instructions, unlike "bl". It should just use the R_ARM_JUMP24 reloc. llvm-svn: 158722
* Mark most PPC register classes to avoid write-after-write.Hal Finkel2012-06-192-0/+16
| | | | | | | | | | | | | | | | | | | | | | | For processors with the G5-like instruction-grouping scheme, this helps avoid early group termination due to a write-after-write dependency within the group. It should also help on pipelined embedded cores. On POWER7, over the test suite, this gives an average 0.5% speedup. The largest speedups are: SingleSource/Benchmarks/Stanford/Quicksort - 33% MultiSource/Applications/d/make_dparser - 21% MultiSource/Benchmarks/FreeBench/analyzer/analyzer - 12% MultiSource/Benchmarks/MiBench/telecomm-FFT/telecomm-fft - 12% Largest slowdowns: SingleSource/Benchmarks/Stanford/Bubblesort - 23% MultiSource/Benchmarks/Prolangs-C++/city/city - 21% MultiSource/Benchmarks/BitBench/uuencode/uuencode - 16% MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode - 13% llvm-svn: 158719
* Make MipsLongBranch::runOnMachineFunction return true.Akira Hatanaka2012-06-191-4/+4
| | | | llvm-svn: 158702
* Use MachineBasicBlock::instr_iterator instead of MachineBasicBlock::iterator inAkira Hatanaka2012-06-191-2/+2
| | | | | | MipsCodeEmitter.cpp. llvm-svn: 158701
* Add support for generating reg+reg preinc stores on PPC.Hal Finkel2012-06-196-25/+114
| | | | | | PPC will now generate STWUX and friends. llvm-svn: 158698
* Move the support for using .init_array from ARM to the genericRafael Espindola2012-06-195-45/+19
| | | | | | | | | | TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM, on X86 it is not easy to find out if .init_array should be used or not, so the decision is made via TargetOptions and defaults to off. Add a command line option to llc that enables it. llvm-svn: 158692
* ARM: use NEON loads and stores if possible when handling struct byval.Manman Ren2012-06-181-8/+42
| | | | | | | | This change is to be enabled in clang. rdar://9877866 llvm-svn: 158684
* Allow up to 64 functional units per processor itinerary.Hal Finkel2012-06-182-2/+2
| | | | | | | This patch changes the type used to hold the FU bitset from unsigned to uint64_t. This will be needed for some upcoming PowerPC itineraries. llvm-svn: 158679
* ARM: Define generic HINT instruction.Jim Grosbach2012-06-184-50/+46
| | | | | | | | | The NOP, WFE, WFI, SEV and YIELD instructions are all hints w/ a different immediate value in bits [7:0]. Define a generic HINT instruction and refactor NOP, WFE, WFI, SEV and YIELD to be assembly aliases of that. rdar://11600518 llvm-svn: 158674
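Since the five hints differ only in the low byte, one encoder parameterized by the immediate covers all of them. The sketch below uses the ARM-mode (A1) hint encoding; the table and function names are illustrative and do not come from the patch.

```python
# Immediate values in bits [7:0] for each hint alias.
HINT_IMM8 = {"nop": 0, "yield": 1, "wfe": 2, "wfi": 3, "sev": 4}


def encode_arm_hint(mnemonic, cond=0xE):
    """Encode a hint alias as a 32-bit ARM-mode instruction word.

    cond defaults to 0xE (always); bits [27:8] are the same for every hint.
    """
    imm8 = HINT_IMM8[mnemonic]
    return (cond << 28) | 0x0320F000 | imm8
```

For example, `hex(encode_arm_hint("nop"))` gives `0xe320f000` and `hex(encode_arm_hint("wfi"))` gives `0xe320f003`.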
* This change handles another case for generating the bic instruction Joel Jones2012-06-181-0/+31
| | | | | | | | | | | when a compile time constant is known. This occurs when implicitly zero extending function arguments from 16 bits to 32 bits. The 8 bit case doesn't need to be handled, as the 8 bit constants are encoded directly, thereby not needing a separate load instruction to form the constant into a register. <rdar://problem/11481151> llvm-svn: 158659
* Temporarily revert r158087.Chandler Carruth2012-06-183-93/+14
| | | | | | | | | | | | | This patch causes problems when both dynamic stack realignment and dynamic allocas combine in the same function. With this patch, we no longer build the epilog correctly, and silently restore registers from the wrong position in the stack. Thanks to Matt for tracking this down, and getting at least an initial test case to Chad. I'm going to try to check a variation of that test case in so we can easily track the fixes required. llvm-svn: 158654
* Cleanup trip-count finding for PPC CTR loops (and some bug fixes).Hal Finkel2012-06-161-86/+127
| | | | | | | | | | | | | | This cleans up the method used to find trip counts in order to form CTR loops on PPC. This refactoring allows the pass to find loops which have a constant trip count but also happen to end with a comparison to zero. This also adds explicit FIXMEs to mark two different classes of loops that are currently ignored. In addition, we now search through all potential induction operations instead of just the first. Also, we check the predicate code on the conditional branch and abort the transformation if the code is not EQ or NE, and we then make sure that the branch to be transformed matches the condition register defined by the comparison (multiple possible comparisons will be considered). llvm-svn: 158607
* *no need to pollute Intel syntax with bonus mnemonics; operand size is ↵Kay Tiong Khoo2012-06-161-6/+6
| | | | | | explicitly specified llvm-svn: 158603
* Mips/AsmParser/CMakeLists.txt: Fix dependency.NAKAMURA Takumi2012-06-161-2/+1
| | | | llvm-svn: 158602
* Fix the encoding of the armv7m (MClass) for MSR registers other than apsr,Kevin Enderby2012-06-152-21/+31
| | | | | | iapsr, eapsr and xpsr which also needed to have 0b10 in their mask encoding bits. llvm-svn: 158560
* ARM: optimization for sub+abs.Manman Ren2012-06-151-11/+6
| | | | | | | | | | | | | | This patch will optimize abs(x-y) FROM sub, movs, rsbmi TO subs, rsbmi For abs, we will use cmp instead of movs. This is necessary because we already have an existing peephole pass which optimizes away cmp following sub. rdar: 11633193 llvm-svn: 158551
* *fixed to separate mnemonic from operands with tabKay Tiong Khoo2012-06-151-4/+4
| | | | llvm-svn: 158543
* Preserve <undef> flags in ARMExpandPseudo.Jakob Stoklund Olesen2012-06-151-5/+6
| | | | | | This probably mostly shows up in bugpoint-generated code. llvm-svn: 158527
* Move AVX version of convert instructions that write to GPRs to the Op1 table.Craig Topper2012-06-151-9/+13
| | | | llvm-svn: 158497