path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* ARM/AArch64: Attach the FrameSetup MIFlag to CFI instructions. | Adrian Prantl | 2014-12-16 | 2 | -12/+25
  Debug info marks the first instruction without the FrameSetup flag as being the end of the function prologue. Any CFI instructions in the middle of the function prologue would cause debug info to end the prologue too early and, worse, attach the line number of the CFI instruction, which incidentally is often 0.

  llvm-svn: 224294
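The rule described in this commit can be sketched with a small model. This is only an illustration, not LLVM code: the `Instr` record, the opcode names, and the line numbers are hypothetical, and the only behaviour taken from the commit message is that the prologue ends at the first instruction without the FrameSetup flag.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    opcode: str        # hypothetical opcode name, for illustration only
    frame_setup: bool  # models the FrameSetup MIFlag
    line: int          # source line attached to the instruction


def prologue_end(instrs):
    """Return the index of the first instruction without the FrameSetup flag."""
    for i, ins in enumerate(instrs):
        if not ins.frame_setup:
            return i
    return len(instrs)


def make_function(cfi_flagged):
    """Build a toy prologue with a CFI instruction in the middle of it."""
    return [
        Instr("push", True, 10),
        Instr("CFI_INSTRUCTION", cfi_flagged, 0),
        Instr("mov_fp_sp", True, 10),
        Instr("load_arg", False, 11),
    ]


before = make_function(cfi_flagged=False)  # CFI instruction lacks the flag
after = make_function(cfi_flagged=True)    # CFI instruction carries the flag
```

In this model, `prologue_end(before)` stops at the CFI instruction (which carries line 0), while `prologue_end(after)` reaches the first real body instruction.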
* [Hexagon] Adding doubleword multiplies with and without accumulation. | Colin LeMahieu | 2014-12-16 | 2 | -0/+136
  llvm-svn: 224293
* [Hexagon] Adding halfword to doubleword multiplies. | Colin LeMahieu | 2014-12-15 | 1 | -0/+59
  llvm-svn: 224289
* [Hexagon] Adding logical-logical accumulation instructions and tests. | Colin LeMahieu | 2014-12-15 | 1 | -19/+40
  llvm-svn: 224288
* x86: Emit LOCK prefix after DATA16 | JF Bastien | 2014-12-15 | 1 | -4/+6
  Summary:
  x86 allows either ordering for the LOCK and DATA16 prefixes, but using GCC+GAS leads to different code generation than using LLVM. This change matches the order that GAS emits the x86 prefixes when a semicolon isn't used in inline assembly (see tc-i386.c comment before define LOCK_PREFIX), and helps simplify tooling that operates on the instruction's byte sequence (such as NaCl's validator). This change shouldn't have any performance impact.

  Test Plan: ninja check

  Reviewers: craig.topper, jvoung

  Subscribers: jfb, llvm-commits

  Differential Revision: http://reviews.llvm.org/D6630

  llvm-svn: 224283
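As a rough illustration of the prefix order this commit settles on, the sketch below emits DATA16 first and LOCK second. The helper name `emit_prefixes` is hypothetical and is not the LLVM emitter; the byte values 0x66 (operand-size override) and 0xF0 (LOCK) are the standard x86 encodings.

```python
DATA16 = 0x66  # operand-size override prefix
LOCK = 0xF0    # LOCK prefix


def emit_prefixes(lock, data16):
    """Return the prefix bytes in the order DATA16, then LOCK."""
    out = bytearray()
    if data16:
        out.append(DATA16)
    if lock:
        out.append(LOCK)
    return bytes(out)
```

A fixed order means a tool that scans instruction bytes only has to recognise one sequence for a locked 16-bit operation.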
* [Hexagon] Adding a number of additional multiply forms with tests. | Colin LeMahieu | 2014-12-15 | 1 | -11/+126
  llvm-svn: 224282
* [Hexagon] Adding misc multiply encodings and tests. | Colin LeMahieu | 2014-12-15 | 1 | -0/+48
  llvm-svn: 224273
* [Hexagon] Adding doubleword accumulating multiplies of halfwords. | Colin LeMahieu | 2014-12-15 | 1 | -0/+74
  llvm-svn: 224267
* [Hexagon] Adding accumulating half word multiplies. | Colin LeMahieu | 2014-12-15 | 1 | -0/+105
  llvm-svn: 224266
* [Hexagon] Adding multiply with rnd/sat/rndsat | Colin LeMahieu | 2014-12-15 | 1 | -0/+46
  llvm-svn: 224265
* [Hexagon] Adding encoding bits for halfword multiplies. | Colin LeMahieu | 2014-12-15 | 1 | -0/+39
  llvm-svn: 224261
* [X86] Also pretty-print shuffle mask for INSERTPS rm variants. | Ahmed Bougacha | 2014-12-15 | 1 | -3/+7
  llvm-svn: 224260
* Silence more static analyzer warnings. | Michael Ilseman | 2014-12-15 | 3 | -2/+7
  Add in definedness checks for shift operators, null checks when pointers are assumed by the code to be non-null, and explicit unreachables.

  llvm-svn: 224255
* Add disassembler tests for mips3 platform. There are no functional changes. | Vladimir Medic | 2014-12-15 | 1 | -1/+2
  llvm-svn: 224253
* [X86] Break false dependencies before partial register updates when the source operand is in memory | Michael Kuperstein | 2014-12-15 | 1 | -0/+20
  Adds the various "rm" instruction variants into the list of instructions that have a partial register update. Also adds all variants of SQRTSD that were missing in the original list.

  Differential Revision: http://reviews.llvm.org/D6620

  llvm-svn: 224246
* AVX-512: Added EXPAND instructions and intrinsics. | Elena Demikhovsky | 2014-12-15 | 4 | -15/+150
  llvm-svn: 224241
* Loop Vectorizer minor changes in the code - | Elena Demikhovsky | 2014-12-14 | 1 | -5/+5
  some comments, function names, indentation.

  Reviewed here: http://reviews.llvm.org/D6527

  llvm-svn: 224218
* [PowerPC] Handle cmp op promotion for SELECT[_CC] nodes in PPCTL::DAGCombineExtBoolTrunc | Hal Finkel | 2014-12-14 | 1 | -18/+49
  PPCTargetLowering::DAGCombineExtBoolTrunc contains logic to remove unwanted truncations and extensions when dealing with nodes of the form:

    zext(binary-ops(binary-ops(trunc(x), trunc(y)), ...)

  There was a FIXME in the implementation (now removed) regarding the fact that the function would abort the transformations if any of the non-output operands of a SELECT or SELECT_CC node would need to be promoted (because they were also output operands, for example). As a result, we continued to generate unnecessary zero-extends for code such as this:

    unsigned foo(unsigned a, unsigned b) {
      return (a <= b) ? a : b;
    }

  which would produce:

    cmplw 0, 3, 4
    isel 3, 4, 3, 1
    rldicl 3, 3, 0, 32
    blr

  and now we produce:

    cmplw 0, 3, 4
    isel 3, 4, 3, 1
    blr

  which is better in the obvious way.

  llvm-svn: 224213
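Why the `rldicl 3, 3, 0, 32` is redundant in this example can be checked with a small model. This is a sketch under stated assumptions, not the DAG combine itself: registers are modelled as Python integers, the helper names are hypothetical, and the inputs are assumed to arrive already zero-extended to 64 bits.

```python
MASK32 = 0xFFFFFFFF


def isel_min(a, b):
    """Models cmplw + isel: compare the low 32 bits unsigned, then pick a or b."""
    return a if (a & MASK32) <= (b & MASK32) else b


def rldicl_0_32(x):
    """Models `rldicl r, r, 0, 32`: rotate by 0 and clear the upper 32 bits."""
    return x & MASK32


# Sample zero-extended 32-bit inputs (illustrative values only).
samples = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
```

Because the select only ever returns one of its already zero-extended inputs, applying the mask afterwards never changes the result for such inputs.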
* Reapply "[ARM] Combine base-updating/post-incrementing vector load/stores." | Ahmed Bougacha | 2014-12-13 | 1 | -6/+40
  r223862 tried to also combine base-updating load/stores. r224198 reverted it, as "it created a regression on the test-suite on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order in which the words are shown."

  Reapply, with a fix to ignore non-normal load/stores. Truncstores are handled elsewhere (you can actually write a pattern for those, whereas for postinc loads you can't, since they return two values), but it should be possible to also combine extloads base updates, by checking that the memory (rather than result) type is of the same size as the addend.

  Original commit message:
  We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD when the base pointer is incremented after the load/store.

  We can do the same thing for generic load/stores.

  Note that we can only combine the first load/store+adds pair in a sequence (as might be generated for a v16f32 load for instance), because other combines turn the base pointer addition chain (each computing the address of the next load, from the address of the last load) into independent additions (common base pointer + this load's offset).

  Differential Revision: http://reviews.llvm.org/D6585

  llvm-svn: 224203
* Revert "[ARM] Combine base-updating/post-incrementing vector load/stores." | Renato Golin | 2014-12-13 | 1 | -38/+6
  This reverts commit r223862, as it created a regression on the test-suite on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order in which the words are shown. We'll investigate the issue and re-apply when safe.

  llvm-svn: 224198
* [PowerPC] Add a DAGToDAG peephole to remove unnecessary zero-exts | Hal Finkel | 2014-12-12 | 3 | -5/+310
  On PPC64, we end up with lots of i32 -> i64 zero extensions, not only from all of the usual places, but also from the ABI, which specifies that values passed are zero extended. Almost all 32-bit PPC instructions in PPC64 mode are defined to do *something* to the higher-order bits, and for some instructions, that action clears those bits (thus providing a zero-extended result). This is especially common after rotate-and-mask instructions. Adding an additional instruction to zero-extend the results of these instructions is unnecessary.

  This PPCISelDAGToDAG peephole optimization examines these zero-extensions, and looks back through their operands to see if all instructions will implicitly zero extend their results. If so, we convert these instructions to their 64-bit variants (which is an internal change only, the actual encoding of these instructions is the same as the original 32-bit ones) and remove the unnecessary zero-extension (changing where the INSERT_SUBREG instructions are to make everything internally consistent).

  llvm-svn: 224169
* [ARMConstantIsland] Insert tbb/tbh optimization where previous jump table resided. | Chad Rosier | 2014-12-12 | 1 | -1/+3
  llvm-svn: 224165
* [Hexagon] Adding double word add/min/minu/max/maxu instructions and tests. | Colin LeMahieu | 2014-12-12 | 1 | -21/+63
  llvm-svn: 224153
* [Hexagon] Adding J class call instructions. | Colin LeMahieu | 2014-12-12 | 2 | -9/+48
  llvm-svn: 224150
* [AVX512] Enabling bit logic lowering | Robert Khasanov | 2014-12-12 | 2 | -0/+9
  Added lowering tests.

  llvm-svn: 224132
* [mips] Enable code generation for MIPS-III. | Vasileios Kalintiris | 2014-12-12 | 3 | -9/+17
  Summary:
  This commit enables the MIPS-III target and adds support for code generation of SELECT nodes. We have to use pseudo-instructions with custom inserters for these nodes as MIPS-III CPUs do not have conditional-move instructions.

  Depends on D6212

  Reviewers: dsanders

  Subscribers: llvm-commits

  Differential Revision: http://reviews.llvm.org/D6464

  llvm-svn: 224128
* [AVX512] Enabling MIN/MAX lowering. | Robert Khasanov | 2014-12-12 | 2 | -4/+19
  Added lowering tests.

  llvm-svn: 224127
* [mips] Support SELECT nodes for targets that don't have conditional-move instructions. | Vasileios Kalintiris | 2014-12-12 | 4 | -0/+129
  Summary:
  For Mips targets that do not have conditional-move instructions, i.e. targets before MIPS32 and MIPS-IV, we have to insert a diamond control-flow pattern in order to support SELECT nodes. In order to do that, we add pseudo-instructions with a custom inserter that emits the necessary control-flow that selects the correct value.

  With this patch we add complete support for code generation of Mips-II targets based on the LLVM test-suite.

  Reviewers: dsanders

  Reviewed By: dsanders

  Subscribers: llvm-commits

  Differential Revision: http://reviews.llvm.org/D6212

  llvm-svn: 224124
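The diamond control-flow pattern mentioned above can be sketched as a toy block structure plus a tiny interpreter. Everything here is hypothetical (block names, the `bnez`/`jump`/`phi` tuples, the register dictionary); it only illustrates how a branch, a fall-through block, and a merge point together compute `cond ? true_val : false_val` without a conditional move.

```python
def lower_select(dst, cond, true_val, false_val):
    """Return diamond-shaped blocks for dst = cond ? true_val : false_val."""
    return {
        # If cond is non-zero, branch straight to the merge block.
        "head": [("bnez", cond, "sink")],
        # Otherwise fall through this block on the way to the merge block.
        "false": [("jump", "sink")],
        # The merge block picks a value based on the predecessor taken.
        "sink": [("phi", dst, (true_val, "head"), (false_val, "false"))],
    }


def run(blocks, regs):
    """Execute the toy blocks and return the final register values."""
    regs = dict(regs)
    prev, cur = None, "head"
    while cur is not None:
        nxt = None
        for op in blocks[cur]:
            if op[0] == "bnez":
                nxt = op[2] if regs[op[1]] != 0 else "false"
            elif op[0] == "jump":
                nxt = op[1]
            elif op[0] == "phi":
                for src, pred in op[2:]:
                    if pred == prev:
                        regs[op[1]] = regs[src]
        prev, cur = cur, nxt
    return regs
```

For example, `run(lower_select("d", "c", "t", "f"), {"c": 1, "t": 7, "f": 9})` leaves the true value in `d`, and setting `c` to 0 leaves the false value there.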
* [AVX512] Minor fix in lowering pattern for broadcast instructions. | Robert Khasanov | 2014-12-12 | 1 | -6/+5
  No functional change.

  llvm-svn: 224122
* Emit Tag_ABI_FP_16bit_format build attribute. | Charlie Turner | 2014-12-12 | 1 | -0/+7
  The __fp16 type is unconditionally exposed. Since -mfp16-format is not yet supported, there is not a user switch to change this behaviour. This build attribute should capture the default behaviour of the compiler, which is to expose the IEEE 754 version of __fp16.

  When -mfp16-format is emitted, that will be the way to control the value of this build attribute.

  Change-Id: I8a46641ff0fd2ef8ad0af5f482a6d1af2ac3f6b0

  llvm-svn: 224115
* R600: Fix min/max matching problems with unordered compares | Matt Arsenault | 2014-12-12 | 4 | -50/+60
  The returned operand needs to be permuted for the unordered compares. Also fix incorrectly producing fmin_legacy / fmax_legacy for f64, which don't exist.

  llvm-svn: 224094
* R600/SI: fmin/fmax_legacy are not associative | Matt Arsenault | 2014-12-12 | 1 | -2/+2
  llvm-svn: 224093
* R600/SI: Don't promote f32 select to i32 | Matt Arsenault | 2014-12-12 | 2 | -2/+5
  This is nice for the instruction patterns, but it complicates min / max matching. The select doesn't have the correct type and would require looking through the bitcasts for the real float operands.

  llvm-svn: 224092
* Add target hook for whether it is profitable to reduce load widths | Matt Arsenault | 2014-12-12 | 2 | -0/+26
  Add an option to disable optimization to shrink truncated larger type loads to smaller type loads. On SI this prevents using scalar load instructions in some cases, since there are no scalar extloads.

  llvm-svn: 224084
* remove function names from comments; NFC | Sanjay Patel | 2014-12-11 | 1 | -29/+23
  llvm-svn: 224080
* R600/SI: Handle physical registers in getOpRegClass | Matt Arsenault | 2014-12-11 | 1 | -2/+7
  llvm-svn: 224079
* R600/SI: Don't verify constant bus usage of flag ops | Matt Arsenault | 2014-12-11 | 1 | -2/+10
  This was checking if pseudo-operands like the source modifiers were using the constant bus, which happens to work because all the values these can take happen to be valid inline immediates.

  This fixes a later commit which starts checking the register class of the operands.

  llvm-svn: 224078
* return without temporary; NFC | Sanjay Patel | 2014-12-11 | 1 | -4/+1
  llvm-svn: 224076
* Enable MachineVerifier in debug mode for X86, ARM, AArch64, Mips. | Matthias Braun | 2014-12-11 | 4 | -20/+20
  llvm-svn: 224075
* [X86] Add a temporary testcase for PR21876/r223996. | Ahmed Bougacha | 2014-12-11 | 1 | -0/+1
  llvm-svn: 224074
* [PowerPC] Better lowering for add/or of a FrameIndex | Hal Finkel | 2014-12-11 | 2 | -30/+38
  If we have an add (or an or that is really an add), where one operand is a FrameIndex and the other operand is a small constant, we can combine the lowering of the FrameIndex (which is lowered as an add of the FI and a zero offset) with the constant operand.

  Amusingly, this is an old potential improvement entry from lib/Target/PowerPC/README.txt which had never been resolved. In short, we used to lower:

    %X = alloca { i32, i32 }
    %Y = getelementptr {i32,i32}* %X, i32 0, i32 1
    ret i32* %Y

  as:

    addi 3, 1, -8
    ori 3, 3, 4
    blr

  and now we produce:

    addi 3, 1, -4
    blr

  which is much more sensible.

  llvm-svn: 224071
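The arithmetic behind this folding can be sketched as follows. The helper names are hypothetical, and treating "small constant" as "the combined offset still fits the signed 16-bit immediate of `addi`" is an assumption made for illustration, not something the commit message states.

```python
def fits_simm16(value):
    """True if value fits a signed 16-bit immediate, as used by addi."""
    return -32768 <= value <= 32767


def or_is_add(base, const):
    """An `or` behaves like an `add` when the operands share no set bits."""
    return (base & const) == 0


def fold_frame_offset(fi_offset, const):
    """Return the combined addi immediate, or None if it does not fit."""
    combined = fi_offset + const
    return combined if fits_simm16(combined) else None


# Illustrative stack pointer value; any suitably aligned address works.
sp = 0x1000
```

With the frame offset of -8 and the constant 4 from the example above, `fold_frame_offset(-8, 4)` gives -4, and `(sp - 8) | 4` equals `sp - 4` because the aligned address has bit 2 clear.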
* R600/SI: Use unordered equal instructions | Matt Arsenault | 2014-12-11 | 2 | -6/+2
  llvm-svn: 224067
* R600/SI: Make more unordered comparisons legal | Matt Arsenault | 2014-12-11 | 3 | -18/+9
  This saves a second compare and an and / or by using the unordered comparison instructions.

  llvm-svn: 224066
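To see what the saved second compare looks like, here is a small sketch of an unordered less-than written two ways. It is a model in plain Python floats rather than the R600/SI lowering; the function names are hypothetical, and the single-compare form stands in for a hardware unordered comparison instruction.

```python
import math


def unordered(a, b):
    """True if either operand is NaN."""
    return math.isnan(a) or math.isnan(b)


def ult_two_compares(a, b):
    """Unordered-or-less-than built from an ordered compare plus a NaN check."""
    return unordered(a, b) or a < b


def ult_single(a, b):
    """The same predicate as one compare: the negation of ordered >=."""
    return not (a >= b)


values = [float("nan"), -1.0, 0.0, 1.0, float("inf")]
```

Both forms agree on every pair drawn from `values`, including the pairs where one operand is NaN.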
* R600/SI: Use unordered not equal instructions | Matt Arsenault | 2014-12-11 | 4 | -10/+19
  llvm-svn: 224065
* [CodeGen] Add print and verify pass after each MachineFunctionPass by default | Matthias Braun | 2014-12-11 | 12 | -164/+109
  Previously print+verify passes were added in a very unsystematic way, which is annoying when debugging as you miss intermediate steps and allows bugs to stay unnoticed when no verification is performed.

  To make this change practical I added the possibility to explicitly disable verification. I used this option in all places where no verification was performed previously (because a lot of places actually don't pass the MachineVerifier). In the long term these problems should be fixed properly and verification enabled after each pass. I'll enable some more verification in subsequent commits.

  This is the 2nd attempt at this after realizing that PassManager::add() may actually delete the pass.

  llvm-svn: 224059
* This reverts commit r224043 and r224042. | Rafael Espindola | 2014-12-11 | 12 | -89/+144
  check-llvm was failing.

  llvm-svn: 224045
* Enable machineverifier in debug mode for X86, ARM, AArch64, Mips | Matthias Braun | 2014-12-11 | 4 | -20/+20
  llvm-svn: 224043
* [CodeGen] Add print and verify pass after each MachineFunctionPass by default | Matthias Braun | 2014-12-11 | 12 | -164/+109
  Previously print+verify passes were added in a very unsystematic way, which is annoying when debugging as you miss intermediate steps and allows bugs to stay unnoticed when no verification is performed.

  To make this change practical I added the possibility to explicitly disable verification. I used this option in all places where no verification was performed previously (because a lot of places actually don't pass the MachineVerifier). In the long term these problems should be fixed properly and verification enabled after each pass. I'll enable some more verification in subsequent commits.

  llvm-svn: 224042
* [Hexagon] Renaming classes in preparation for replacement. | Colin LeMahieu | 2014-12-11 | 1 | -13/+13
  llvm-svn: 224036
* ARM: convert isTargetIOS checks to isTargetDarwin. | Tim Northover | 2014-12-11 | 4 | -12/+8
  The distinction is mostly useful in the front-end. By the time we get here, there are very few situations where we actually want different behaviour for Darwin and IOS (in fact Darwin mostly just exists in a few tests). So this should reduce any surprising weirdness for anyone using it.

  No functional change on anything anyone actually cares about.

  llvm-svn: 224035