summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* [A57FPLoadBalancing] Ignore <def>s when checking if a chain may be killed.James Molloy2014-09-121-0/+4
| | | | | | | | Defs are seen before uses, so a def without the kill flag doesn't necessarily mean that the register is not killed on that instruction. It may be killed in a later use operand. llvm-svn: 217689
* [A57LoadBalancing] unique_ptr-ify.James Molloy2014-09-121-25/+20
| | | | | | Thanks to David Blakie for the in-depth review! llvm-svn: 217682
* [mips][microMIPS] Implement JRADDIUSP instructionZoran Jovanovic2014-09-124-0/+52
| | | | | | Differential Revision: http://reviews.llvm.org/D5046 llvm-svn: 217681
* Address comments on r217622Bill Schmidt2014-09-121-4/+6
| | | | llvm-svn: 217680
* [mips][microMIPS] Implement BGEZALS and BLTZALS instructionsZoran Jovanovic2014-09-122-0/+13
| | | | | | Differential Revision: http://reviews.llvm.org/D5004 llvm-svn: 217678
* [mips][microMIPS] Implement JALS and JALRS instructions.Zoran Jovanovic2014-09-122-4/+37
| | | | | | Differential Revision: http://reviews.llvm.org/D5003 llvm-svn: 217676
* [mips][microMIPS] Implement TLBP, TLBR, TLBWI and TLBWR instructionsZoran Jovanovic2014-09-123-5/+19
| | | | | | Differential Revision: http://reviews.llvm.org/D5211 llvm-svn: 217675
* [ARM] Teach the cost model that cross-class copies are costly.James Molloy2014-09-121-0/+7
| | | | | | Cross-class copies being expensive is actually a trait of the microarchitecture, but as I haven't yet seen an example of a microarchitecture where they're cheap it seems best to just enable this by default, covering the non-mcpu build case. llvm-svn: 217674
* Fix gcc -Wpedantic.Patrik Hagglund2014-09-121-1/+1
| | | | llvm-svn: 217669
* Remove a temporary variable and just construct a unique_ptr directly using ↵Craig Topper2014-09-121-9/+6
| | | | | | make_unique. llvm-svn: 217655
* R600/SI: Fix off by 1 error in used register countMatt Arsenault2014-09-111-2/+4
| | | | | | | The register numbers start at 0, so if only 1 register was used, this was reported as 0. llvm-svn: 217636
* [PATCH, PowerPC] Accept 'U' and 'X' constraints in inline asmBill Schmidt2014-09-111-0/+10
| | | | | | | | | | | | | | | | | | Inline asm may specify 'U' and 'X' constraints to print a 'u' for an update-form memory reference, or an 'x' for an indexed-form memory reference. However, these are really only useful in GCC internal code generation. In inline asm the operand of the memory constraint is typically just a register containing the address, so 'U' and 'X' make no sense. This patch quietly accepts 'U' and 'X' in inline asm patterns, but otherwise does nothing. If we ever unexpectedly see a non-register, we'll assert and sort it out afterwards. I've added a new test for these constraints; the test case should be used for other asm-constraints changes down the road. llvm-svn: 217622
* Provide an implementation of getNoopForMachoTarget for SPARC.Brad Smith2014-09-112-0/+7
| | | | llvm-svn: 217611
* [AVX512] Fix miscompile for unpackAdam Nemet2014-09-111-56/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | r189189 implemented AVX512 unpack by essentially performing a 256-bit unpack between the low and the high 256 bits of src1 into the low part of the destination and another unpack of the low and high 256 bits of src2 into the high part of the destination. I don't think that's how unpack works. AVX512 unpack simply has more 128-bit lanes but other than it works the same way as AVX. So in each 128-bit lane, we're always interleaving certain parts of both operands rather different parts of one of the operands. E.g. for this: __v16sf a = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 }; __v16sf b = { 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 }; __v16sf c = __builtin_shufflevector(a, b, 0, 8, 1, 9, 4, 12, 5, 13, 16, 24, 17, 25, 20, 28, 21, 29); we generated punpcklps (notice how the elements of a and b are not interleaved in the shuffle). In turn, c was set to this: 0 16 1 17 4 20 5 21 8 24 9 25 12 28 13 29 Obviously this should have just returned the mask vector of the shuffle vector. I mostly reverted this change and made sure the original AVX code worked for 512-bit vectors as well. Also updated the tests because they matched the logic from the code. llvm-svn: 217602
* Move constant-sized bitvector to the stack.Benjamin Kramer2014-09-111-2/+2
| | | | llvm-svn: 217600
* R600: Add cmpxchg instruction for evergreenAaron Watry2014-09-112-5/+29
| | | | | | | | | | | | | | | | | Refactored the R600_LDS_1A2D class a bit to get it to actually work. It seemed to be previously unused and broken. We also have to disable the conversion to the noret variant for now in R600ISelLowering because the getLDSNoRetOp method only handles 1A1D LDS ops. Someone can feel free to modify the AMDGPU::getLDSNoRetOp method to work for more than 1A1D variants of LDS operations. It's being left as a future TODO for now. Signed-off-by: Aaron Watry <awatry at gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> llvm-svn: 217596
* R600: Add LDS_WRXCHG[_RET] instructions for Evergreen.Aaron Watry2014-09-111-0/+4
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> llvm-svn: 217594
* R600: Add LDS_MIN_[U]INT[_RET] instructions for EvergreenAaron Watry2014-09-111-0/+8
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> llvm-svn: 217593
* R600: Add LDS_XOR[_RET] instructions for EvergreenAaron Watry2014-09-111-0/+4
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> llvm-svn: 217592
* R600: Add LDS_OR[_RET] instructions for EvergreenAaron Watry2014-09-111-0/+4
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> llvm-svn: 217591
* R600: Add LDS_AND[_RET] instructions for EvergreenAaron Watry2014-09-111-0/+4
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> llvm-svn: 217590
* R600: Add LDS_MAX_[U]INT[_RET] instructions for EvergreenAaron Watry2014-09-111-0/+8
| | | | | | | | | | | | | This was only present for SI before. Cayman may still be missing, but I am unable to test that currently. v2: Don't create atomicrmw max tests in separate file Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> CC: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217589
* R600/SI: Fix losing chain when fixing reg class of loads.Matt Arsenault2014-09-101-6/+14
| | | | | | | The lost chain resulting in earlier side effecting nodes being deleted. llvm-svn: 217561
* R600/SI: Report offset in correct units for st64 DS instructionsMatt Arsenault2014-09-101-0/+15
| | | | | | | | | | | Need to convert the 64 element offset into bytes, not just the element size like the normal case instructions. Noticed by inspection. This can't be hit now because st64 instructions aren't emitted during instruction selection, and the post-RA scheduler isn't enabled. llvm-svn: 217560
* R600: Custom lower fremMatt Arsenault2014-09-102-0/+20
| | | | llvm-svn: 217553
* Add doInitialization/doFinalization to DataLayoutPass.Rafael Espindola2014-09-102-2/+2
| | | | | | | | | | | | | With this a DataLayoutPass can be reused for multiple modules. Once we have doInitialization/doFinalization, it doesn't seem necessary to pass a Module to the constructor. Overall this change seems in line with the idea of making DataLayout a required part of Module. With it the only way of having a DataLayout used is to add it to the Module. llvm-svn: 217548
* [AArch64] Revert r216141 for cycloneGerolf Hoflehner2014-09-101-1/+1
| | | | | | | | | | The increase of the interleave factor to 4 has side-effects like performance losses eg. due to reminder loops being executed more frequently and may increase code size. It requires more analysis and careful heuristic tuning. Expect double digit gains in small benchmarks like lowercase.c and losses in puzzle.c. llvm-svn: 217540
* Rename getMaximumUnrollFactor -> getMaxInterleaveFactor; also rename option ↵Sanjay Patel2014-09-105-9/+9
| | | | | | | | | | | names controlling this variable. "Unroll" is not the appropriate name for this variable. Clang already uses the term "interleave" in pragmas and metadata for this. Differential Revision: http://reviews.llvm.org/D5066 llvm-svn: 217528
* [AArch64] Address Chad's post commit review comments for r217504 (PBQP ↵Arnaud A. de Grandmaison2014-09-101-11/+10
| | | | | | experimental support) llvm-svn: 217518
* [AArch64] Pacify lld buildbot complaining about an unused static function in ↵Arnaud A. de Grandmaison2014-09-101-0/+2
| | | | | | release build. llvm-svn: 217505
* [AArch64] Add experimental PBQP supportArnaud A. de Grandmaison2014-09-105-2/+435
| | | | | | | | | | This adds target specific support for using the PBQP register allocator on the AArch64, for the A57 cpu. By default, the PBQP allocator is not used, unless explicitely required on the command line with "-aarch64-pbqp". llvm-svn: 217504
* [AArch 64] Use a constant pool load for weak symbol references whenAsiri Rathnayake2014-09-104-6/+39
| | | | | | | | | | | | | | using static relocation model and small code model. Summary: currently we generate GOT based relocations for weak symbol references regardless of the underlying relocation model. This should be change so that in static relocation model we use a constant pool load instead. Patch from: Keith Walker Reviewers: Renato Golin, Tim Northover llvm-svn: 217503
* Add missing HWEncoding to base register class.Sid Manning2014-09-101-8/+10
| | | | | | | This change gives tblgen the information needed to fill in the HexagonRegEncodingTable. llvm-svn: 217500
* ARM: don't size-reduce STMs using the LR register.Tim Northover2014-09-101-1/+1
| | | | | | | | | The only Thumb-1 multi-store capable of using LR is the PUSH instruction, which translates to STMDB, so we shouldn't convert STMIAs. Patch by Sergey Dmitrouk. llvm-svn: 217498
* [mips] Remove inverted predicates from MipsSubtarget that were only used by ↵Daniel Sanders2014-09-102-14/+15
| | | | | | | | | | | | | | | | MipsCallingConv.td Summary: No functional change Reviewers: echristo, vmedic Reviewed By: echristo, vmedic Subscribers: echristo, llvm-commits Differential Revision: http://reviews.llvm.org/D5266 llvm-svn: 217494
* [mips] Return an ArrayRef from MipsCC::intArgRegs() and remove ↵Daniel Sanders2014-09-102-24/+19
| | | | | | | | | | | | | | | | MipsCC::numIntArgRegs() Summary: No functional change. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5265 llvm-svn: 217485
* [asan-assembly-instrumentation] Added CFI directives to the generated ↵Yuri Gorshenin2014-09-103-1/+62
| | | | | | | | | | | | | | instrumentation code. Summary: [asan-assembly-instrumentation] Added CFI directives to the generated instrumentation code. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5189 llvm-svn: 217482
* Drop the W postfix on the 16-bit registers.Job Noorman2014-09-107-196/+196
| | | | | | | This ensures the inline assembly register constraints are properly recognised in TargetLowering::getRegForInlineAsmConstraint. llvm-svn: 217479
* [MIPS] Add aliases for sync instruction used by Octeon CPUKai Nacke2014-09-101-0/+6
| | | | | | | | | This commit adds aliases for the sync instruction (synciobdma, syncs, syncw, syncws) which are used by the Octeon CPU. Reviewed by D. Sanders llvm-svn: 217477
* Use cast to MVT instead of EVT on a couple calls to getSizeInBits.Craig Topper2014-09-101-2/+2
| | | | llvm-svn: 217473
* Add a scheduling model for AMD 16H Jaguar (btver2).Sanjay Patel2014-09-093-4/+350
| | | | | | | | | | | | | This is a first pass at a scheduling model for Jaguar. It's structured largely on the existing SandyBridge and SLM sched models. Using this model, in addition to turning on the PostRA scheduler, results in some perf wins on internal and 3rd party benchmarks. There's not much difference in LLVM's test-suite benchmarking subset of tests. Differential Revision: http://reviews.llvm.org/D5229 llvm-svn: 217457
* [mips] Add assembler support for .set mips0 directive.Toma Tabacu2014-09-093-0/+21
| | | | | | | | | | | | | | | | | | Summary: This directive is used to reset the assembler options to their initial values. Assembly programmers use it in conjunction with the ".set mipsX" directives. This patch depends on the .set push/pop directive (http://reviews.llvm.org/D4821). Contains work done by Matheus Almeida. Reviewers: dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D4957 llvm-svn: 217438
* [mips] Move MipsTargetLowering::MipsCC::regSize() to ↵Daniel Sanders2014-09-093-32/+32
| | | | | | | | | | | | | | | | | | | | MipsSubtarget::getGPRSizeInBytes() Summary: The GPR size is more a property of the subtarget than that of the ABI so move this information to the MipsSubtarget. No functional change. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5009 llvm-svn: 217436
* [x32] Emit callq for CALLpcrel32Pavel Chupin2014-09-093-4/+19
| | | | | | | | | | | | | | | | | Summary: In AT&T annotation for both x86_64 and x32 calls should be printed as callq in assembly. It's only a matter of correct mnemonic, object output is ok. Test Plan: trivial test added Reviewers: nadav, dschuff, craig.topper Subscribers: llvm-commits, zinovy.nis Differential Revision: http://reviews.llvm.org/D5213 llvm-svn: 217435
* [mips] Don't cache IsO32 and IsFP64 in MipsTargetLowering::MipsCCDaniel Sanders2014-09-092-23/+28
| | | | | | | | | | | | | | | | | Summary: Use a MipsSubtarget reference instead. No functional change. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5008 llvm-svn: 217434
* [mips] Add assembler support for .set push/pop directive.Toma Tabacu2014-09-093-17/+95
| | | | | | | | | | | | | | | Summary: These directives are used to save the current assembler options (in the case of ".set push") and restore the previously saved options (in the case of ".set pop"). Contains work done by Matheus Almeida. Reviewers: dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D4821 llvm-svn: 217432
* ARM: Negative offset support problemRenato Golin2014-09-091-2/+2
| | | | | | | | This patch is to permit a negative offset usage for a non frame access. Patch by Igor Oblakov. llvm-svn: 217431
* Set trunc store action to Expand for all X86 targets.Bob Wilson2014-09-091-2/+2
| | | | | | | | When compiling without SSE2, isTruncStoreLegal(F64, F32) would return Legal, whereas with SSE2 it would return Expand. And since the Target doesn't seem to actually handle a truncstore for double -> float, it would just output a store of a full double in the space for a float hence overwriting other bits on the stack. Patch by Luqman Aden! llvm-svn: 217410
* [AArch64] Enabled AA support for Cortex-A57.Chad Rosier2014-09-081-1/+1
| | | | llvm-svn: 217381
* R600/SI: Fix assertion from copying a TargetGlobalAddressMatt Arsenault2014-09-081-1/+2
| | | | | | | | | | | | | Assert in scheduler from an inserted copy_to_regclass from a constant. This only seems to break sometimes when a constant initializer address is forced into VGPRs in a non-entry block. No test since the only case I've managed to hit only happens with a future patch, and that case will also not be a problem once scalar instructions are used in non-entry blocks. llvm-svn: 217380
OpenPOWER on IntegriCloud