summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* [mips][mips64r6] Correct the encoding of dmuh, dmuhu, dmul, and dmulu.Daniel Sanders2014-07-041-4/+4
| | | | | | | | We have detected a documentation bug in the encoding tables of the released MIPS64r6 specification that has resulted in the wrong encodings being used for these instructions in LLVM. This commit corrects them. llvm-svn: 212330
* [x86] Generalize BuildVectorSDNode::getConstantSplatValue to work forChandler Carruth2014-07-041-75/+76
| | | | | | | | | | | | | | | | | | | | | | | | any constant, constant FP, or undef splat and to tolerate any undef lanes in a splat, then replace all uses of isSplatVector in X86's lowering with it. This fixes issues where undef lanes in an otherwise splat vector would prevent the splat logic from firing. It is a touch more awkward to use this interface, but it is much more accurate. Suggestions for better interface structuring welcome. With this fix, the code generated with the widening legalization strategy for widen_cast-4.ll is *dramatically* improved as the special lowering strategies for a v16i8 SRA kick in even though the high lanes are undef. We also get a slightly different choice for broadcasting an aligned memory location, and use vpshufd instead of vbroadcastss. This looks like a minor win for pipelining and domain crossing, but a minor loss for the number of micro-ops. I suspect its a wash, but folks can easily tweak the lowering if they want. llvm-svn: 212324
* [X86] Limit maximum nop length on SilvermontAlexey Volkov2014-07-041-3/+4
| | | | | | | | | | | Silvermont can only decode one instruction per cycle if the instruction exceeds 8 bytes. Also in Silvermont instructions with more than 3 prefixes will cause 3 cycle penalty. Maximum nop length is limited to 7 bytes when used for padding on Silvermont. For other x86 processors max nop length remains unchanged 15 bytes. Differential Revision: http://reviews.llvm.org/D4374 llvm-svn: 212321
* XCore target: remove incorrect DebugLoc entries from prologueRobert Lytton2014-07-042-5/+10
| | | | | | | | Summary: This was causing the prologue_end to be incorrectly positioned. Differential Revision: http://reviews.llvm.org/D4122 llvm-svn: 212318
* Move function dependent resetting of a subtarget variable out of theEric Christopher2014-07-048-26/+26
| | | | | | | | | | subtarget. This involved having the movt predicate take the current function - since we care about size in instruction selection for whether or not to use movw/movt take the function so we can check the attributes. This required adding the current MachineFunction to FastISel and propagating through. llvm-svn: 212309
* [x86] Clarify that this lowering only applies to vectors and is onlyChandler Carruth2014-07-031-3/+2
| | | | | | used when we have SSE2. llvm-svn: 212300
* Remove caching of the target machine and initialization of theEric Christopher2014-07-031-10/+5
| | | | | | | subtarget from ARMISelDAGtoDAG. The former is unnecessary and the latter is initialized on each runOnMachineFunction. llvm-svn: 212297
* [CostModel][x86] Improved cost model for alternate shuffles.Andrea Di Biagio2014-07-031-17/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch: 1) Improves the cost model for x86 alternate shuffles (originally added at revision 211339); 2) Teaches the Cost Model Analysis pass how to analyze alternate shuffles. Alternate shuffles are a special kind of blend; on x86, we can often easily lowered alternate shuffled into single blend instruction (depending on the subtarget features). The existing cost model didn't take into account subtarget features. Also, it had a couple of "dead" entries for vector types that are never legal (example: on x86 types v2i32 and v2f32 are not legal; those are always either promoted or widened to 128-bit vector types). The new x86 cost model takes into account what target features we have before returning the shuffle cost (i.e. the number of instructions after the blend is lowered/expanded). This patch also teaches the Cost Model Analysis how to identify and analyze alternate shuffles (i.e. 'SK_Alternate' shufflevector instructions): - added function 'isAlternateVectorMask'; - added some logic to check if an instruction is a alternate shuffle and, in case, call the target specific TTI to get the corresponding shuffle cost; - added a test to verify the cost model analysis on alternate shuffles. llvm-svn: 212296
* [X86] Add ISel patterns to select 'f32_to_f16' and 'f16_to_f32' dag nodes.Andrea Di Biagio2014-07-032-0/+23
| | | | | | | | | | This patch adds tablegen patterns to select F16C float-to-half-float conversion instructions from 'f32_to_f16' and 'f16_to_f32' dag nodes. If the target doesn't have F16C, then 'f32_to_f16' and 'f16_to_f32' are expanded into library calls. llvm-svn: 212293
* [ARM] Implement ISB memory barrier intrinsicYi Kong2014-07-032-7/+8
| | | | | | | Adds support for __builtin_arm_isb. Also corrects DSB and ISB instructions modelling by adding has-side-effects property. llvm-svn: 212276
* [x86] Fix crashes in lowering bitcast instructions with the wideningChandler Carruth2014-07-031-0/+7
| | | | | | | | | | mode. This also runs the test in that mode which would reproduce the crash. What I love is that *every single FIXME* in the test is addressed by switching to widening. llvm-svn: 212254
* [x86] Based on a long conversation between myself, Jim Grosbach, HalChandler Carruth2014-07-032-0/+19
| | | | | | | | | | | | | | | | | Finkel, Eric Christopher, and a bunch of other people I'm probably forgetting (sorry), add an option to the x86 backend to widen vectors during type legalization rather than promote them. This still would promote vNi1 vectors to get the masks right, but would widen other vectors. A lot of experiments are piling up right now showing that widening should probably be the default legalization strategy outside of vNi1 cases, but it is very hard to test the rammifications of that and fix bugs in widening-based legalization without an option that enables it. I'll be checking in tests shortly that use this option to exercise cases where widening doesn't work well and hopefully we'll be able to switch fully to this soon. llvm-svn: 212249
* Make these preprocessor directives match all of the others in the port.Eric Christopher2014-07-032-4/+4
| | | | llvm-svn: 212245
* Remove dead code.Eric Christopher2014-07-031-7/+0
| | | | llvm-svn: 212244
* [codegen,aarch64] Add a target hook to the code generator to controlChandler Carruth2014-07-036-6/+32
| | | | | | | | | | | | | | | | | | | | | vector type legalization strategies in a more fine grained manner, and change the legalization of several v1iN types and v1f32 to be widening rather than scalarization on AArch64. This fixes an assertion failure caused by scalarizing nodes like "v1i32 trunc v1i64". As v1i64 is legal it will fail to scalarize v1i32. This also provides a foundation for other targets to have more granular control over how vector types are legalized. Patch by Hao Liu, reviewed by Tim Northover. I'm committing it to allow some work to start taking place on top of this patch as it adds some really important hooks to the backend that I'd like to immediately start using. =] http://reviews.llvm.org/D4322 llvm-svn: 212242
* Move subtarget dependent features into the subtarget from the targetEric Christopher2014-07-034-96/+97
| | | | | | | machine. Includes a fix for a subtarget initialization for hard floating point on mips16. llvm-svn: 212240
* So that we can include frame lowering in the subtarget, remove includeEric Christopher2014-07-025-5/+10
| | | | | | | circular dependency with the subtarget by inlining accessor methods and outlining a routine. llvm-svn: 212236
* So that we can include target lowering in the subtarget, remove includeEric Christopher2014-07-024-64/+80
| | | | | | | circular dependency with the subtarget by inlining accessor methods and outlining a routine. llvm-svn: 212234
* Fix typos.Eric Christopher2014-07-021-1/+1
| | | | llvm-svn: 212228
* Move the data layout and selection dag info from the mips target machineEric Christopher2014-07-024-42/+46
| | | | | | down to the subtarget. llvm-svn: 212224
* [X86] AVX512: Allow writemask argument in vpermt* intrinsicsAdam Nemet2014-07-021-5/+15
| | | | llvm-svn: 212223
* [X86] AVX512: Generate Pat<>'s for the vpermt2* intrinsics via multiclassAdam Nemet2014-07-021-19/+14
| | | | | | | | | | | | | This new multiclass, avx512_perm_table_3src derives from the current one and provides the Pat<>. The next patch will add another Pat<> that uses the writemask. Note that I dropped the type annotation from the intrinsic call, i.e.: (v16f32 VR512:$src1) -> R512:$src1. I think that this should be fine (at least many intrinsic calls don't provide them) and it greatly reduces the number of template arguments. llvm-svn: 212222
* [X86] AVX512: Add writemask variants for vperm*2*Adam Nemet2014-07-021-14/+68
| | | | | | | | | This includes assembler and codegen support (see the new tests in avx512-encodings.s and avx512-shuffle.ll). <rdar://problem/17492620> llvm-svn: 212221
* R600: Add a comment that llvm.AMDGPU.trunc is a legacy intrinsicTom Stellard2014-07-021-1/+1
| | | | llvm-svn: 212218
* R600/SI: Use a ComplexPattern for ADDR64 addressing of MUBUF loadsTom Stellard2014-07-022-37/+35
| | | | llvm-svn: 212217
* R600: Promote i64 loads to v2i32Tom Stellard2014-07-023-7/+12
| | | | llvm-svn: 212216
* R600/SI: Adjsut SGPR live ranges before register allocationTom Stellard2014-07-024-0/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SGPRs are written by instructions that sometimes will ignore control flow, which means if you have code like: if (VGPR0) { SGPR0 = S_MOV_B32 0 } else { SGPR0 = S_MOV_B32 1 } The value of SGPR0 will 1 no matter what the condition is. In order to deal with this situation correctly, we need to view the program as if it were a single basic block when we calculate the live ranges for the SGPRs. They way we actually update the live range is by iterating over all of the segments in each LiveRange object and setting the end of each segment equal to the start of the next segment. So a live range like: [3888r,9312r:0)[10032B,10384B:0) 0@3888r will become: [3888r,10032B:0)[10032B,10384B:0) 0@3888r This change will allow us to use SALU instructions within branches. llvm-svn: 212215
* R600/SI: Add verifier check for immediates in register operands.Tom Stellard2014-07-024-2/+33
| | | | llvm-svn: 212214
* [RegAllocGreedy] Provide a subtarget hook to disable the local reassignmentQuentin Colombet2014-07-021-0/+5
| | | | | | | | | | | | heuristic. By default, no functionality change. This is a follow-up of r212099. This hook provides a finer grain to control the optimization. <rdar://problem/17444599> llvm-svn: 212204
* AArch64: Re-enable AArch64AddressTypePromotionDuncan P. N. Exon Smith2014-07-022-1/+3
| | | | | | | | | | | This reverts commits r212189 and r212190. While this pass was accidentally disabled (until r212073), r205437 slipped in a use of `auto` that should have been `auto&`. This fixes PR20188. llvm-svn: 212201
* AArch64: Remove unnecessary parensDuncan P. N. Exon Smith2014-07-021-1/+1
| | | | llvm-svn: 212199
* R600: Fix crashes when an illegal type load or store is not handled.Matt Arsenault2014-07-021-2/+6
| | | | | | | I don't think anything hits this now, but will be exposed in future patches. llvm-svn: 212197
* AArch64: Merge isa with dyn_castDuncan P. N. Exon Smith2014-07-021-2/+1
| | | | llvm-svn: 212194
* AArch64: Temporarily disable AArch64AddressTypePromotionDuncan P. N. Exon Smith2014-07-021-2/+0
| | | | | | | Temporarily disable AArch64AddressTypePromotion, which was effectively re-enabled in r212073 and r212075, while I look into PR20188. llvm-svn: 212189
* X86: When combining shuffles just remove shuffles that are completely redundant.Benjamin Kramer2014-07-021-0/+7
| | | | | | | CombineTo doesn't allow replacing a node with itself so this would crash if the combined shuffle is the same as the input shuffle. llvm-svn: 212181
* AVX-512: dec/inc instructions are slow on KNLElena Demikhovsky2014-07-021-1/+2
| | | | | | | After Alexey Volkov, I'm adding the same property for KNL, that prefers ADD/SUB instead of INC/DEC. Added a test. llvm-svn: 212178
* aarch64: support target-specific .req assembler directiveSaleem Abdulrasool2014-07-021-3/+96
| | | | | | | | | | Based on the support for .req on ARM. The aarch64 variant has to keep track if the alias register was a vector register (v0-31) or a general purpose or VFP/Advanced SIMD ([bhsdq]0-31) register. Patch by Janne Grunau! llvm-svn: 212161
* Break out subtarget initialization that dependent variables need intoEric Christopher2014-07-022-11/+17
| | | | | | a separate function and clean up calling convention for helper function. llvm-svn: 212153
* Unify these two lines.Eric Christopher2014-07-021-2/+1
| | | | llvm-svn: 212152
* Move MipsJITInfo to the subtarget rather than the target machine.Eric Christopher2014-07-024-5/+10
| | | | llvm-svn: 212151
* Remove unnecessary include.Eric Christopher2014-07-021-1/+0
| | | | llvm-svn: 212150
* Remove the cached InstrItineraryData on the TargetMachine, it's unnecessary.Eric Christopher2014-07-022-15/+13
| | | | llvm-svn: 212149
* Move the subtarget dependent features from XCoreTargetMachineEric Christopher2014-07-026-37/+42
| | | | | | down to the subtarget. llvm-svn: 212147
* Make XCoreSelectionDAGInfo take a DataLayout since it only needsEric Christopher2014-07-023-4/+4
| | | | | | that information. llvm-svn: 212146
* X86: remove atomic instructions *after* we've iterated through them.Tim Northover2014-07-011-3/+6
| | | | | | | | | Otherwise they get freed and the implicit "isa<XYZ>" tests following turn out badly (at least under sanitizers). Also corrects the ordering of unordered atomic stores. llvm-svn: 212136
* [DAG] Pass the argument list to the CallLoweringInfo via move semantics. NFCI.Juergen Ributzka2014-07-0111-16/+19
| | | | | | | | The argument list vector is never used after it has been passed to the CallLoweringInfo and moving it to the CallLoweringInfo is cleaner and pretty much as cheap as keeping a pointer to it. llvm-svn: 212135
* X86: delegate expanding atomic libcalls to generic code.Tim Northover2014-07-011-0/+14
| | | | | | | | | | | | | | | On targets without cmpxchg16b or cmpxchg8b, the borderline atomic operations were slipping through the gaps. X86AtomicExpand.cpp was delegating to ISelLowering. Generic ISelLowering was delegating to X86ISelLowering and X86ISelLowering was asserting. The correct behaviour is to expand to a libcall, preferably in generic ISelLowering. This can be achieved by X86ISelLowering deciding it doesn't want the faff after all. llvm-svn: 212134
* Move the subtarget dependent features from SystemZTargetMachineEric Christopher2014-07-016-38/+55
| | | | | | down to the subtarget. Add an initialization routine to assist. llvm-svn: 212124
* Remove the use and initialization of the target machine and subtargetEric Christopher2014-07-013-29/+19
| | | | | | from SystemZFrameLowering. llvm-svn: 212123
* AArch64: fix comment typoTim Northover2014-07-011-1/+1
| | | | llvm-svn: 212120
OpenPOWER on IntegriCloud