path: root/llvm/lib/Target
Commit log (message, author, date, files changed, lines changed):
* [ARM] Enable changing instprinter's behavior based on the per-function subtarget. (Akira Hatanaka, 2015-03-27; 3 files, -141/+276)
  llvm-svn: 233451
* clang-format ARMInstPrinter.{h,cpp} before I make changes to these files. (Akira Hatanaka, 2015-03-27; 2 files, -267/+282)
  llvm-svn: 233448
* [AArch64InstPrinter] Use the feature bits of the subtarget passed to the print method. (Akira Hatanaka, 2015-03-27; 1 file, -6/+3)
  This enables the instprinter to print a different system register name based on the feature bits of the per-function subtarget.
  Differential Revision: http://reviews.llvm.org/D8668
  llvm-svn: 233412
* [MCInstPrinter] Enable MCInstPrinter to change its behavior based on the per-function subtarget. (Akira Hatanaka, 2015-03-27; 30 files, -89/+180)
  Currently, code-gen passes the default or generic subtarget to the constructors of MCInstPrinter subclasses (see LLVMTargetMachine::addPassesToEmitFile), which enables some targets (AArch64, ARM, and X86) to change their instprinter's behavior based on the subtarget feature bits. Since the backend can now use different subtargets for each function, instprinter has to be changed to use the per-function subtarget rather than the default subtarget.
  This patch takes the first step towards enabling instprinter to change its behavior based on the per-function subtarget. It adds a bit "PassSubtarget" to AsmWriter which tells table-gen to pass a reference to MCSubtargetInfo to the various print methods table-gen auto-generates. I will follow up with changes to instprinters of AArch64, ARM, and X86.
  llvm-svn: 233411
* R600/SI: Fix VOP2 VI encoding (Marek Olsak, 2015-03-27; 1 file, -1/+1)
  Broken by "R600/SI: Refactor VOP2 instruction defs".
  llvm-svn: 233399
* [bpf] add support for bpf pseudo instruction (Alexei Starovoitov, 2015-03-27; 2 files, -1/+24)
  Expose bpf pseudo load instruction via intrinsic. It is used by front-ends that can encode file descriptors directly into IR instead of relying on relocations.
  llvm-svn: 233396
* Remove superfluous .str() and replace std::string concatenation with Twine. (Yaron Keren, 2015-03-27; 5 files, -6/+6)
  llvm-svn: 233392
* [AArch64] Don't store available subtarget features in AArch64SysReg::SysRegMapper (Vladimir Sukharev, 2015-03-27; 4 files, -20/+20)
  Subtarget features must not be a part of the target machine. So, they are now not being stored in SysRegMapper, but provided each time fromString()/toString() are called.
  Reviewers: jmolloy
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D8655
  llvm-svn: 233386
* Close unique sections when switching away from them. (Rafael Espindola, 2015-03-27; 1 file, -1/+2)
  It is not possible to switch back to unique sections, so close them automatically when switching away.
  llvm-svn: 233380
* Use movw/movt instead of constant pool loads to lower byval parameter copies (Derek Schuff, 2015-03-26; 1 file, -5/+9)
  Summary: The ARM backend can use a loop to implement copying byval parameters before a call. In non-thumb2 mode it uses a constant pool load to materialize the trip count. For targets that need movt instead (e.g. Native Client), use the same code as in thumb2 mode to materialize the trip count.
  Reviewers: jfb, t.p.northover
  Differential Revision: http://reviews.llvm.org/D8442
  llvm-svn: 233324
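As background for the commit above: a movw/movt pair materializes a 32-bit constant in two 16-bit halves (movw writes the low half and zeroes the rest; movt writes the high half, preserving the low). A minimal illustrative sketch of that split, not the backend code itself:

```python
def movw_movt(value):
    """Model how a movw/movt pair materializes a 32-bit constant:
    movw loads the low 16 bits (zero-extending the upper half),
    movt then sets the upper 16 bits, preserving the lower half."""
    assert 0 <= value <= 0xFFFFFFFF
    low = value & 0xFFFF            # immediate for: movw rd, #low
    high = (value >> 16) & 0xFFFF   # immediate for: movt rd, #high
    reg = low                       # register state after movw
    reg = (high << 16) | (reg & 0xFFFF)  # register state after movt
    return reg

assert movw_movt(0xDEADBEEF) == 0xDEADBEEF
```

This is why movw/movt can replace a constant pool load: any 32-bit trip count is reachable in exactly two instructions, with no data in a literal pool.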
* Adds an option to disable ARM ld/st optim pass (Renato Golin, 2015-03-26; 1 file, -7/+16)
  Enabled by default, but it's useful when debugging with llc.
  Patch by Ranjeet Singh.
  llvm-svn: 233303
* [ARM] Add v8.1a "Rounding Double Multiply Add/Subtract" extension (Vladimir Sukharev, 2015-03-26; 2 files, -14/+168)
  Reviewers: t.p.northover
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D8503
  llvm-svn: 233301
* [AArch64] Rename Pairs to Mappings in AArch64NamedImmMapper (Vladimir Sukharev, 2015-03-26; 2 files, -66/+66)
  A third element is to be added soon to "struct AArch64NamedImmMapper::Mapping", so its instances are renamed from ...Pairs to ...Mappings.
  Reviewers: jmolloy
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D8582
  llvm-svn: 233300
* [AArch64] Move initializations of AArch64NamedImmMapper out of void AArch64Operand::print(...) (Vladimir Sukharev, 2015-03-26; 1 file, -18/+48)
  class AArch64NamedImmMapper is to become dependent on SubTargetFeatures, while class AArch64Operand doesn't have access to the latter. So, AArch64NamedImmMapper constructor invocations are refactored away from methods of AArch64Operand.
  Reviewers: jmolloy
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D8579
  llvm-svn: 233297
* comment cleanup; NFC (Sanjay Patel, 2015-03-26; 1 file, -5/+5)
  llvm-svn: 233293
* Remove outdated README-SSE.txt entries. (Benjamin Kramer, 2015-03-26; 1 file, -78/+0)
  llvm-svn: 233292
* InstCombine: fold (A << C) == (B << C) --> ((A^B) & (~0U >> C)) == 0 (Benjamin Kramer, 2015-03-26; 1 file, -38/+0)
  ANDing and comparing with zero can be done in a single instruction on most archs, so this is a bit cheaper.
  llvm-svn: 233291
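The fold above rests on a simple bit identity: shifting left by C discards the top C bits, so two shifted values compare equal exactly when their remaining low bits agree, which is what masking the XOR with `~0U >> C` tests. A quick exhaustive sanity check at a small bit width (illustrative Python, not the InstCombine code):

```python
def fold_holds(a, b, c, bits=4):
    """Check that (A << C) == (B << C) is equivalent to
    ((A ^ B) & (~0 >> C)) == 0 at a fixed bit width."""
    mask = (1 << bits) - 1          # models ~0U at this width
    lhs = ((a << c) & mask) == ((b << c) & mask)
    rhs = ((a ^ b) & (mask >> c)) == 0
    return lhs == rhs

# Exhaustive over all 4-bit values and shift amounts:
assert all(fold_holds(a, b, c)
           for a in range(16) for b in range(16) for c in range(4))
```

The same argument applies at any width, which is why the transform is unconditionally valid (for shifts less than the bit width).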
* [AArch64, ARM] Add v8.1a architecture and generic cpu (Vladimir Sukharev, 2015-03-26; 12 files, -6/+50)
  New architecture and cpu added, following http://community.arm.com/groups/processors/blog/2014/12/02/the-armv8-a-architecture-and-its-ongoing-development
  Reviewers: t.p.northover
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D8505
  llvm-svn: 233290
* Use SDValue bool checks; NFC intended (Sanjay Patel, 2015-03-26; 1 file, -20/+13)
  llvm-svn: 233289
* [mips] Move the setATReg definition inside the MipsAssemblerOptions class. NFC. (Toma Tabacu, 2015-03-26; 1 file, -9/+7)
  Summary: This groups all of the MipsAssemblerOptions functionality together, making it more reader-friendly.
  Reviewers: dsanders
  Reviewed By: dsanders
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D8445
  llvm-svn: 233271
* [X86][FastIsel] Teach how to select vector load instructions. (Andrea Di Biagio, 2015-03-26; 1 file, -3/+34)
  This patch teaches fast-isel how to select 128-bit vector load instructions. Added test CodeGen/X86/fast-isel-vecload.ll
  Differential Revision: http://reviews.llvm.org/D8605
  llvm-svn: 233270
* Add computeFSAdditions to the function based subtarget creation for PPC, due to some unfortunate default setting via TargetMachine creation. (Eric Christopher, 2015-03-26; 1 file, -1/+9)
  I've added a FIXME on how this can be unraveled in the backend and a test to make sure we successfully legalize 64-bit things if we say we're 64-bits.
  llvm-svn: 233239
* Fix typo in comment. (Nico Weber, 2015-03-25; 1 file, -1/+1)
  llvm-svn: 233226
* Fix remaining MSVC warning (Andrew Kaylor, 2015-03-25; 1 file, -2/+2)
  llvm-svn: 233220
* Revert r233206 (Krzysztof Parzyszek, 2015-03-25; 1 file, -3/+0)
  llvm-svn: 233213
* [Hexagon] Keep the bare getSubtargetImpl for now (Krzysztof Parzyszek, 2015-03-25; 1 file, -0/+3)
  llvm-svn: 233206
* Add Hardware Transactional Memory (HTM) Support (Kit Barton, 2015-03-25; 16 files, -32/+360)
  This patch adds Hardware Transactional Memory (HTM) support as specified by ISA 2.07 (POWER8). The intrinsic support is based on the GCC one [1], but currently only the 'PowerPC HTM Low Level Built-in Functions' are implemented.
  The HTM instructions follow the RC ones, and the transaction initiation result is set on RC0 (with the exception of tcheck). The current approach is to create a register copy from CR0 to a GPR and compare it. Although this is suboptimal, since the branch could be taken directly by comparing the CR0 value, it generates correct code for both test-and-branch and plain return-value uses. A possible future optimization could be to eliminate the MFCR instruction and branch directly.
  HTM usage requires a recent kernel with PPC HTM enabled. Tested on powerpc64 and powerpc64le.
  This is sent along with a clang patch to enable the builtins and option switch.
  [1] https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Hardware-Transactional-Memory-Built-in-Functions.html
  Phabricator Review: http://reviews.llvm.org/D8247
  llvm-svn: 233204
* [X86, AVX] improve insertion into zero element of 256-bit vector (Sanjay Patel, 2015-03-25; 1 file, -0/+14)
  This patch allows AVX blend instructions to handle insertion into the low element of a 256-bit vector for the appropriate data types.
  For f32, instead of:
    vblendps $1, %xmm1, %xmm0, %xmm1  ## xmm1 = xmm1[0],xmm0[1,2,3]
    vblendps $15, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1,2,3],ymm0[4,5,6,7]
  we get:
    vblendps $1, %ymm1, %ymm0, %ymm0  ## ymm0 = ymm1[0],ymm0[1,2,3,4,5,6,7]
  For f64, instead of:
    vmovsd %xmm1, %xmm0, %xmm1        ## xmm1 = xmm1[0],xmm0[1]
    vblendpd $3, %ymm1, %ymm0, %ymm0  ## ymm0 = ymm1[0,1],ymm0[2,3]
  we get:
    vblendpd $1, %ymm1, %ymm0, %ymm0  ## ymm0 = ymm1[0],ymm0[1,2,3]
  For the hardware-neglected integer data types, I left a TODO comment in the code and added regression tests for a follow-on patch.
  Differential Revision: http://reviews.llvm.org/D8609
  llvm-svn: 233199
* [APInt] Add an isSplat helper and use it in some places. (Benjamin Kramer, 2015-03-25; 1 file, -11/+4)
  To complement getSplat. This is more general than the binary decomposition method as it also handles non-pow2 splat sizes.
  llvm-svn: 233195
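Conceptually, a width-w value is a splat of an s-bit chunk when s divides w and every s-bit chunk equals the lowest one; unlike repeated halving (binary decomposition), this direct check works when s is not a power of two. An illustrative sketch of the idea (not the APInt implementation):

```python
def is_splat(value, width, splat_size):
    """Return True if `value` (interpreted as `width` bits) is the same
    splat_size-bit chunk repeated. splat_size need not be a power of two."""
    if width % splat_size != 0:
        return False
    chunk_mask = (1 << splat_size) - 1
    first = value & chunk_mask
    for shift in range(0, width, splat_size):
        if (value >> shift) & chunk_mask != first:
            return False
    return True

assert is_splat(0xAAAA, 16, 2)        # "10" repeated eight times
assert is_splat(0b110110110, 9, 3)    # non-power-of-two chunk size (3 bits)
assert not is_splat(0xAB, 8, 4)       # 0xA != 0xB
```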
* [Hexagon] Pattern match a CTZ loop into a call to countTrailingZeros. (Benjamin Kramer, 2015-03-25; 1 file, -4/+1)
  No functional change intended.
  llvm-svn: 233192
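The idiom being matched is the classic count-trailing-zeros loop: shift right until the lowest set bit falls out, counting iterations. A sketch of the loop form that a single countTrailingZeros call replaces (illustrative, not the Hexagon source):

```python
def ctz_loop(x, bits=32):
    """The loop idiom: shift right until bit 0 is set, counting steps.
    Matches the convention ctz(0) == bit width."""
    if x == 0:
        return bits
    count = 0
    while x & 1 == 0:
        x >>= 1
        count += 1
    return count

assert ctz_loop(0b10100) == 2
assert ctz_loop(8) == 3
assert ctz_loop(0) == 32
```

Recognizing this loop and lowering it to one instruction is profitable because most targets (Hexagon included) expose trailing-zero counting in hardware.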
* [ARM] Rewrite .save/.vsave emission with bit math (Benjamin Kramer, 2015-03-25; 1 file, -51/+21)
  Hopefully makes it a bit easier to understand what's going on. No functional change intended.
  llvm-svn: 233191
* [X86] Remove GetCpuIDAndInfo, GetCpuIDAndInfoEx and DetectFamilyModel functions from X86 MC layer. (Craig Topper, 2015-03-25; 2 files, -149/+0)
  They haven't been used since CPU autodetection was removed from X86Subtarget.cpp.
  llvm-svn: 233170
* X86: Fix frameescape when not using an FP (Reid Kleckner, 2015-03-24; 1 file, -5/+5)
  We can't use TargetFrameLowering::getFrameIndexOffset directly, because Win64 really wants the offset from the stack pointer at the end of the prologue. Instead, use X86FrameLowering::getFrameIndexOffsetFromSP(), which is a pretty close approximation of that. It fails to handle cases with interestingly large stack alignments, which is pretty uncommon on Win64 and is a TODO.
  llvm-svn: 233137
* Disabling warnings for MSVC build to enable /W4 use. (Andrew Kaylor, 2015-03-24; 2 files, -4/+2)
  Differential Revision: http://reviews.llvm.org/D8572
  llvm-svn: 233133
* Opaque Pointer Types: GEP API migrations to specify the gep type explicitly (David Blaikie, 2015-03-24; 2 files, -3/+4)
  The changes to InstCombine (& SCEV) do seem a bit silly - it doesn't make anything obviously better to have the caller access the pointer's element type (the thing I'm trying to remove) than the GEP itself, but it's a helpful migration step. This will allow me to more obviously lock down GEP (& Load, etc) API usage, then fix all the code that accesses pointer element types except the places that need to be removed (most of the InstCombines) anyway - at which point I'll need to just remove all that code because it won't be meaningful anymore (there will be no pointer types, so no bitcasts to combine).
  SCEV looks like it'll need some restructuring - we'll have to do a bit more work for GEP canonicalization, since it'll depend on how it's used if we can even manage to canonicalize it to a non-ugly GEP. I guess we can do some fun stuff like voting (do 2 out of 3 loads from the GEP with a certain type that gives a pretty GEP? Does every typed use of the GEP use either a specific type or a generic type (i8*, etc)?)
  llvm-svn: 233131
* AArch64: use a different means to determine whether to byte swap relocations. (Peter Collingbourne, 2015-03-24; 1 file, -3/+18)
  This code depended on a bug in the FindAssociatedSection function that would cause it to return the wrong result for certain absolute expressions. Instead, use EvaluateAsRelocatable.
  llvm-svn: 233119
* [X86, AVX] recognize shufflevector with zero input as a vperm2 (PR22984) (Sanjay Patel, 2015-03-24; 1 file, -20/+56)
  vperm2x128 instructions have the special ability (aka free hardware capability) to shuffle zero values into a vector. This patch recognizes that type of shuffle and generates the appropriate control byte.
  https://llvm.org/bugs/show_bug.cgi?id=22984
  Differential Revision: http://reviews.llvm.org/D8563
  llvm-svn: 233100
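For context on the "free" zeroing: in the vperm2f128/vperm2i128 control byte, bits [1:0] and [5:4] select which 128-bit lane of the two sources feeds each result lane, and bits 3 and 7 force the corresponding result lane to zero instead. A hedged Python model of that semantics (based on the published instruction description, not the LLVM code):

```python
def vperm2x128(src1, src2, imm):
    """Model VPERM2F128/VPERM2I128: src1/src2 are 2-element lists, one
    element per 128-bit lane. imm bits [1:0] pick the low result lane and
    bits [5:4] the high one from {src1[0], src1[1], src2[0], src2[1]};
    imm bits 3 and 7 zero the low/high result lane respectively."""
    lanes = [src1[0], src1[1], src2[0], src2[1]]
    lo = 0 if imm & 0x08 else lanes[imm & 0x3]
    hi = 0 if imm & 0x80 else lanes[(imm >> 4) & 0x3]
    return [lo, hi]

# 0x20: the classic "concatenate low lanes" control byte.
assert vperm2x128(['a0', 'a1'], ['b0', 'b1'], 0x20) == ['a0', 'b0']
# Bit 7 set: high result lane is zeroed for free.
assert vperm2x128(['a0', 'a1'], ['b0', 'b1'], 0x81) == ['a1', 0]
```

This is the hardware capability the patch exploits: when one shuffle input is an all-zeros vector, the right control byte expresses the zeroing without any extra instruction.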
* Refactor: Simplify boolean expressions in AArch64 target (David Blaikie, 2015-03-24; 2 files, -3/+3)
  Simplify boolean expressions using `true` and `false` with `clang-tidy`.
  Patch by Richard Thomson.
  Reviewed By: rengolin
  Differential Revision: http://reviews.llvm.org/D8525
  llvm-svn: 233089
* [mips] Support 16-bit offsets for 'm' inline assembly memory constraint. (Daniel Sanders, 2015-03-24; 1 file, -1/+9)
  Reviewers: vkalintiris
  Reviewed By: vkalintiris
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D8435
  llvm-svn: 233086
* R600/SI: Insert more NOPs after READLANE on VI, don't use NOPs on CI (Marek Olsak, 2015-03-24; 1 file, -1/+16)
  This is a candidate for stable.
  llvm-svn: 233080
* R600/SI: Select V_BFE_U32 for and+shift with a non-literal offset (Marek Olsak, 2015-03-24; 3 files, -15/+14)
  llvm-svn: 233079
* R600/SI: Custom-select 32-bit S_BFE from bitwise opcodes (Marek Olsak, 2015-03-24; 1 file, -12/+104)
  llvm-svn: 233078
* R600/SI: Improve BFM support (Marek Olsak, 2015-03-24; 2 files, -3/+20)
  llvm-svn: 233077
* R600/SI: Use V_FRACT_F64 for faster 64-bit floor on SI (Marek Olsak, 2015-03-24; 4 files, -1/+73)
  Other f64 opcodes not supported on SI can be lowered in a similar way.
  v2: use complex VOP3 patterns
  llvm-svn: 233076
* R600/SI: Expand fract to floor, then only select V_FRACT on CI (Marek Olsak, 2015-03-24; 4 files, -3/+32)
  V_FRACT is buggy on SI. R600-specific code is left intact.
  v2: drop the multiclass, use complex VOP3 patterns
  llvm-svn: 233075
* Revert "Use std::bitset for SubtargetFeatures" (Michael Kuperstein, 2015-03-24; 23 files, -249/+233)
  This reverts commit r233055. It still causes buildbot failures (gcc running out of memory on several platforms, and a self-host failure on arm), although fewer than the previous time.
  llvm-svn: 233068
* [mips] Simplify boolean expressions in Mips target with `clang-tidy` (Simon Atanasyan, 2015-03-24; 2 files, -11/+7)
  No functional changes. Patch by Richard Thomson.
  Differential Revision: http://reviews.llvm.org/D8522
  llvm-svn: 233065
* [mips] Distinguish 'R', 'ZC', and 'm' inline assembly memory constraint. (Daniel Sanders, 2015-03-24; 6 files, -12/+111)
  Summary: Previous behaviour of 'R' and 'm' has been preserved for now. They will be improved in subsequent commits. The offset permitted by ZC varies according to the subtarget since it is intended to match the restrictions of the pref, ll, and sc instructions. The restrictions on these instructions are:
    * For microMIPS: 12-bit signed offset.
    * For Mips32r6/Mips64r6: 9-bit signed offset.
    * Otherwise: 16-bit signed offset.
  Reviewers: vkalintiris
  Reviewed By: vkalintiris
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D8414
  llvm-svn: 233063
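The per-subtarget restriction listed above boils down to "does the offset fit in an N-bit signed immediate". A small illustrative sketch (the subtarget labels here are just names for the three cases quoted in the commit message, not LLVM identifiers):

```python
def fits_signed(offset, bits):
    """True if `offset` fits in a `bits`-bit two's-complement immediate,
    i.e. -(2**(bits-1)) <= offset <= 2**(bits-1) - 1."""
    return -(1 << (bits - 1)) <= offset < (1 << (bits - 1))

# Offset widths for the ZC constraint per the commit message:
ZC_OFFSET_BITS = {"micromips": 12, "mips32r6_mips64r6": 9, "otherwise": 16}

assert fits_signed(255, 9) and not fits_signed(256, 9)      # r6 boundary
assert fits_signed(-2048, 12) and not fits_signed(2048, 12) # microMIPS boundary
```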
* Use std::bitset for SubtargetFeatures (Michael Kuperstein, 2015-03-24; 23 files, -233/+249)
  Previously, subtarget features were a bitfield with the underlying type being uint64_t. Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, this switches the features to use a bitset. No functional change.
  The first time this was committed (r229831), it caused several buildbot failures. At least some of the ARM ones were due to gcc/binutils issues, and should now be fixed.
  Differential Revision: http://reviews.llvm.org/D8542
  llvm-svn: 233055
* [AArch64, ARM] Enable GlobalMerge with -O3 rather than -O1. (Ahmed Bougacha, 2015-03-23; 2 files, -2/+2)
  The pass used to be enabled by default with CodeGenOpt::Less (-O1). This is too aggressive, considering the pass indiscriminately merges all globals together. Currently, performance doesn't always improve, and, on code that uses few globals (e.g., the odd file- or function-static), more often than not is degraded by the optimization. Lengthy discussion can be found on llvmdev (AArch64-focused; ARM has similar problems): http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-February/082800.html
  Also, it makes tooling and debuggers less useful when dealing with globals and data sections. GlobalMerge needs to better identify those cases that benefit, and this will be done separately. In the meantime, move the pass to run with -O3 rather than -O1, on both ARM and AArch64.
  llvm-svn: 233024