summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* [Hexagon] Add aliases for vector loads/stores with no explicit offsetKrzysztof Parzyszek2016-05-051-0/+79
| | | | | | The mem(r0) instructions are treated as mem(r0+#0). llvm-svn: 268661
* AMDGPU: Uniform branch conditions can originate with intrinsicsNicolai Haehnle2016-05-051-2/+1
| | | | | | | | | | | | | | | | | Summary: Discovered by Dave Airlie, fixes an assertion in Khronos OpenGL CTS GL43-CTS.shader_storage_buffer_object.advanced-matrix. In this particular case, the buffer load intrinsic fed into a uniform conditional branch, and led the brcond lowering down the wrong path. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19931 llvm-svn: 268650
* AMDGPU/SI: Add support for AMD code object version 2.Tom Stellard2016-05-058-133/+6
| | | | | | | | | | | | | | Summary: Version 2 is now the default. If you want to emit version 1, use the amdgcn--amdhsa-amdcov1 triple. Reviewers: arsenm, kzhuravl Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19283 llvm-svn: 268647
* X86CallFrameOptimization: make adjustCallSequence's return type voidHans Wennborg2016-05-051-7/+8
| | | | | | It always returned the same value (true). No functionality change. llvm-svn: 268645
* [Hexagon] Merge HexagonAlias.td into HexagonInstrAlias.td, NFCKrzysztof Parzyszek2016-05-053-95/+81
| | | | llvm-svn: 268641
* [Hexagon] Handle operand type differences for A2_tfrpiKrzysztof Parzyszek2016-05-051-1/+16
| | | | | | | | | | The instruction A2_tfrpi has a 64-bit operand, while the corresponding intrinsic takes a 32-bit value. The actual value has only 8 significant bits, so the difference is only in the type used to represent it. In order to map the intrinsic to the instruction, the operand needs to be extended to the correct type. llvm-svn: 268635
* Remove bit-rotten CppBackend.James Y Knight2016-05-058-2281/+0
| | | | | | | | | | | | | | | | | This backend was supposed to generate C++ code which will re-construct the LLVM IR passed as input. This seems to me to have very marginal usefulness in the first place. However, the code has never been updated to use IRBuilder, which makes its current value negative -- people who look at the output may be steered to use the *wrong* C++ APIs to construct IR. Furthermore, it's generated code that doesn't compile since at least 2013. Differential Revision: http://reviews.llvm.org/D19942 llvm-svn: 268631
* Fix Mips Parser error reportingNirav Dave2016-05-051-27/+49
| | | | | | | | | | | | | [mips] On error, ParseDirective should always return false to signify that the directive was understood. Reviewers: dsanders, vkalintiris, sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: http://reviews.llvm.org/D19929 llvm-svn: 268630
* [X86] Extend some Linux special cases to cover kFreeBSD.Marcin Koscielnicki2016-05-054-5/+9
| | | | | | | | | | | | Both Linux and kFreeBSD use glibc, so follow similiar code paths. Add isTargetGlibc to check for this, and use it instead of isTargetLinux in a few places. Fixes PR22248 for kFreeBSD. Differential Revision: http://reviews.llvm.org/D19104 llvm-svn: 268624
* [X86] Use the right type when folding xor (truncate (shift)) -> setccDavid Majnemer2016-05-051-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The result type of setcc is dependent on whether or not AVX512 is present. We had an X86-specific DAG-combine which assumed that the result type should be i8 when it could be i1. This meant that we would generate illegal setccs which LowerSETCC did not like. Instead, use an appropriate type and zero extend to i8. Also, there were some scenarios where the fold should have fired but didn't because we were overly cautious about the types. This meant that we generated: shrl $31, %edi andl $1, %edi kmovw %edi, %k0 kxnorw %k0, %k0, %k1 kshiftrw $15, %k1, %k1 kxorw %k1, %k0, %k0 kmovw %k0, %eax instead of: testl %edi, %edi setns %al This fixes PR27638. llvm-svn: 268609
* ARM: Use a Handle to track SDNodes in case they're CSE'd. NFCJustin Bogner2016-05-051-4/+2
| | | | | | | | | | | | The code here is recursively Select-ing a new Node to avoid issues where N is CSE'd during replaceDAGValue and stops being valid. We can accomplish the same goal in a more principled way by using a HandleSDNode. This is essentially a less dodgy fix for PR25733 than the original attempt back in r255120. llvm-svn: 268590
* [SystemZ] Implement backchain attribute (recommit with fix).Marcin Koscielnicki2016-05-052-4/+46
| | | | | | | | | | | | | This introduces a SystemZ-specific "backchain" attribute on function, which enables writing the frame backchain link as specified by the ABI. This will be used to implement -mbackchain option in clang. Differential Revision: http://reviews.llvm.org/D19889 Fixed in this version: added RegState::Define and RegState::Kill on R1D in prologue. llvm-svn: 268581
* Revert "[SystemZ] Implement backchain attribute."Marcin Koscielnicki2016-05-042-46/+4
| | | | | | | | This reverts commit rL268571. It caused failures in register scavenger. llvm-svn: 268576
* [SystemZ] Implement llvm.get.dynamic.area.offsetMarcin Koscielnicki2016-05-043-0/+15
| | | | | | | | To be used for AddressSanitizer. Differential Revision: http://reviews.llvm.org/D19817 llvm-svn: 268572
* [SystemZ] Implement backchain attribute.Marcin Koscielnicki2016-05-042-4/+46
| | | | | | | | | | This introduces a SystemZ-specific "backchain" attribute on function, which enables writing the frame backchain link as specified by the ABI. This will be used to implement -mbackchain option in clang. Differential Revision: http://reviews.llvm.org/D19889 llvm-svn: 268571
* [X86] Add a few register classes for x32 address accesses.Quentin Colombet2016-05-042-2/+19
| | | | | | | | | | | | The new register classes allow to tell the machine verifier that it is fine to use RIP for address accesses in x32 mode. Prior to that patch, we would complain that we are using a GR64 in place of GR32, whereas it is actually fine to use GR64 for x32 as long as the 32 high bits are 0s. RIP has this property and is used for RIP-relative addressing. This partially fixes http://llvm.org/PR27481. llvm-svn: 268567
* [AArch64] Add cheap as move instructions for Exynos M1Evandro Menezes2016-05-041-2/+33
| | | | llvm-svn: 268549
* [AArch64] Use the reciprocal estimation machineryEvandro Menezes2016-05-045-3/+102
| | | | | | | This patch adds support for estimating the square root, its reciprocal and division or reciprocal using the combiner generic reciprocal machinery. llvm-svn: 268539
* Revert r268529 because it caused use-of-uninitialized-valueVitaly Buka2016-05-041-19/+6
| | | | | | | | | | | | | | | | | | | | | | Summary: This reverts commit d88cc0862bf7da64850b89e9bb5ea9f95e7f1184. #0 0xfed467 in llvm::ARMFrameLowering::determineCalleeSaves(llvm::MachineFunction&, llvm::BitVector&, llvm::RegScavenger*) const /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/Target/ARM/ARMFrameLowering.cpp:1625:52 #1 0x330d4cc in (anonymous namespace)::PEI::runOnMachineFunction(llvm::MachineFunction&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/CodeGen/PrologEpilogInserter.cpp:186:3 #2 0x3193e12 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/CodeGen/MachineFunctionPass.cpp:60:13 #3 0x396237d in llvm::FPPassManager::runOnFunction(llvm::Function&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1526:23 #4 0x3962a23 in llvm::FPPassManager::runOnModule(llvm::Module&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1547:16 #5 0x3963d52 in runOnModule /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1603:23 #6 0x3963d52 in llvm::legacy::PassManagerImpl::run(llvm::Module&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1706 #7 0x6bb910 in compileModule(char**, llvm::LLVMContext&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/tools/llc/llc.cpp:412:5 #8 0x6b3c25 in main /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/tools/llc/llc.cpp:218:22 #9 0x7fd4a7d37ec4 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4) #10 0x625c93 in _start (/mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm_build_msan/bin/llc+0x625c93) Reviewers: Subscribers: llvm-svn: 268536
* [ARM] Fix Scavenger assert due to underestimated stack sizeWeiming Zhao2016-05-041-6/+19
| | | | | | | | | | | | | Summary: Currently, when checking if a stack is "BigStack" or not, it doesn't count into spills and arguments. Therefore, LLVM won't reserve spill slot for this actually "BigStack". This may cause scavenger failure. Reviewers: rengolin Subscribers: aemerson, rengolin, tberghammer, danalbert, srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D19896 llvm-svn: 268529
* [PowerPC] Generate VSX version of splat wordNemanja Ivanovic2016-05-045-7/+37
| | | | | | | | | | This patch corresponds to review: http://reviews.llvm.org/D18592 It allows the PPC back end to generate the xxspltw instruction where we previously only emitted vspltw. llvm-svn: 268516
* AMDGPU/R600: Minor cleanup in InstrInfoJan Vesely2016-05-041-17/+16
| | | | | | | | | | | | | | Use std::make_pair instead of constructor Use C++11 loop Reuse helper var Reviewers: tstellardAMD Subsribers: arsenm Differential Revision: http://reviews.llvm.org/D19787 llvm-svn: 268503
* [mips][ias] Only round section sizes when explicitly requested.Daniel Sanders2016-05-041-12/+24
| | | | | | | | As requested by Rafael Espindola in his post-commit comments on r268036. This makes the previous behaviour the default while still allowing verification of IAS. llvm-svn: 268496
* [Sparc] Allow taking of function address into a register.Chris Dewhurst2016-05-041-5/+5
| | | | | | | | Modification of previously existing code (variable rename only), with unit test added. Differential Revision: http://reviews.llvm.org/D19368 llvm-svn: 268493
* [mips][microMIPS] Add CodeGen support for microMIPSr6 ROTR and ROTRV and add ↵Zlatko Buljan2016-05-044-11/+34
| | | | | | | | tests for LL, SC, SYSCALL, ROTR, ROTRV, LWM32, SWM32 and MOVEP instructions Differential Revision: http://reviews.llvm.org/D19857 llvm-svn: 268491
* [Sparc] Implement __builtin_setjmp, __builtin_longjmp back-end.Chris Dewhurst2016-05-043-21/+292
| | | | | | | | | | | | | | This code implements builtin_setjmp and builtin_longjmp exception handling intrinsics for 32-bit Sparc back-ends. The code started as a mash-up of the PowerPC and X86 versions, although there are sufficient differences to both that had to be made for Sparc handling. Note: I have manual tests running. I'll work on a unit test and add that to the rest of this diff in the next day. Also, this implementation is only for 32-bit Sparc. I haven't focussed on a 64-bit version, although I have left the code in a prepared state for implementing this, including detecting pointer size and comments indicating where I suspect there may be differences. Differential Revision: http://reviews.llvm.org/D19798 llvm-svn: 268483
* [X86] Lower zext i1 argumentsDavid Majnemer2016-05-041-0/+15
| | | | | | | | | | | | | i1 is now a legal type for X86 with AVX512. There were some paths in X86FastISel which were not quite ready to see an i1 value: they were not quite sure how to deal with sign/zero extends for call arguments. DTRT by extending to i8 for zeroext and bailing out of FastISel for signext. This fixes PR27591. llvm-svn: 268470
* [X86] Tidied up SDValue's SDNode referencing. NFCI.Simon Pilgrim2016-05-031-5/+5
| | | | llvm-svn: 268445
* X86-Darwin: start emitting data-region directives for jump-tables.Tim Northover2016-05-031-1/+1
| | | | | | The surrounding tools can cope these days, and they were invented for a reason. llvm-svn: 268437
* Add an address space for the X86 SS segment.David L Kreitzer2016-05-031-2/+8
| | | | | | | | Patch by Michael LeMay (michael.lemay@intel.com) Differential Revision: http://reviews.llvm.org/D17093 llvm-svn: 268431
* AMDGPU/SI: Use range loops to simplify some code in the SI SchedulerTom Stellard2016-05-031-18/+18
| | | | | | | | | | Reviewers: arsenm, axeldavy Subscribers: MatzeB, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19822 llvm-svn: 268396
* Silence unused variable warning; NFC.Aaron Ballman2016-05-031-2/+1
| | | | llvm-svn: 268392
* [X86][SSE] Added target shuffle combine to MOVQ Simon Pilgrim2016-05-031-0/+16
| | | | llvm-svn: 268391
* [Sparc] Constification of TargetMachine argumentsJames Y Knight2016-05-034-4/+4
| | | | | | | | | | | This patch changes the TargetMachine arguments to be const. This is required for {D19265}, and was requested to be done in a separate patch. Patch by Jacob Hansen! Differential Revision: http://reviews.llvm.org/D19797 llvm-svn: 268389
* [mips][fastisel] ADJCALLSTACKUP has a second immediate operand.Daniel Sanders2016-05-031-1/+1
| | | | | | | | | | | | | | Summary: It's always zero for SelectionDAG and is never read by the MIPS backend so do the same for FastISel. Reviewers: sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: http://reviews.llvm.org/D19863 llvm-svn: 268386
* [mips] Fix unused variable warning for release builds introduced by r268379.Daniel Sanders2016-05-031-3/+1
| | | | llvm-svn: 268383
* [mips] Use MipsMCExpr instead of MCSymbolRefExpr for all relocations.Daniel Sanders2016-05-0313-404/+571
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is much closer to the way MIPS relocation expressions work (%hi(foo + 2) rather than %hi(foo) + 2) and removes the need for the various bodges in MipsAsmParser::evaluateRelocExpr(). Removing those bodges ensures that the constant stored in MCValue is the full 32 or 64-bit (depending on ABI) offset from the symbol. This will be used to correct the %hi/%lo matching needed to sort the relocation table correctly. As part of this: * Gave MCExpr::print() the ability to omit parenthesis when emitting a symbol reference inside a MipsMCExpr operator like %hi(X). Without this we print things like %lo(($L1)). * %hi(%neg(%gprel(X))) is now three MipsMCExpr's instead of one. Most of the related special cases have been removed or moved to MipsMCExpr. We can remove the rest as we gain support for the less common relocations when they are not part of this specific combination. * Renamed MipsMCExpr::VariantKind and the enum prefix ('VK_') to avoid confusion with MCSymbolRefExpr::VariantKind and its prefix (also 'VK_'). * fixup_Mips_GOT_Local and fixup_Mips_GOT_Global were found to be identical and merged into fixup_Mips_GOT. * MO_GOT16 and MO_GOT turned out to be identical and have been merged into MO_GOT. * VK_Mips_GOT and VK_Mips_GOT16 turned out to be the same thing so they have been merged into MEK_GOT Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D19716 llvm-svn: 268379
* [AVX512] Add support for commutative MAX/MIN . In general VMAX{PS,PD} and ↵Igor Breger2016-05-031-2/+6
| | | | | | | | VMIN{PS,PD} instruction are not commutative . In combine pass only if UnsafeFPMath are used VMAX/VMAX are converted to commutative nodes VMAXC/VMAXC. Differential Revision: http://reviews.llvm.org/D19860 llvm-svn: 268375
* [AVX512] Fix lowerV4X128VectorShuffle to select correctly input operands .Igor Breger2016-05-031-4/+18
| | | | | | Differential Revision: http://reviews.llvm.org/D19803 llvm-svn: 268368
* AArch64/optimizeCondBranch: Remove earlier kill flag when forming TBZMatthias Braun2016-05-031-0/+2
| | | | | | | This fixes -verify-machineinstrs complaints when compiling test-suite/SingleSource/Benchmarks/Shootout-C++/wordfreq.cpp llvm-svn: 268360
* livePhysRegs: Pass MBB by reference in addLive{Ins|Outs}(); NFCMatthias Braun2016-05-038-10/+10
| | | | | | | The block must no be nullptr for the addLiveIns()/addLiveOuts() function. llvm-svn: 268340
* LivePhysRegs: Automatically determine presence of pristine regs.Matthias Braun2016-05-036-8/+8
| | | | | | | | | | | | | | | | | | | | | | Remove the AddPristinesAndCSRs parameters from addLiveIns()/addLiveOuts(). We need to respect pristine registers after prologue epilogue insertion, Seeing that we got this wrong in at least two commits already, we should rather pay the small price to query MachineFrameInfo for it. There are three cases that did not set AddPristineAndCSRs to true even after register allocation: - ExecutionDepsFix: live-out registers are used as a hint that the register is used soon. This is not true for pristine registers so use the new addLiveOutsNoPristines() to maintain this behaviour. - SystemZShortenInst: Not setting AddPristineAndCSRs to true looks like a bug, should do the right thing automatically now. - StackMapLivenessAnalysis: Not adding pristine registers looks like a bug to me. Added a FIXME comment but maintain the current behaviour as a change may need to get coordinated with GC runtimes. llvm-svn: 268336
* [X86] Model FAULTING_LOAD_OP as a terminator and branch.Quentin Colombet2016-05-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | This operation may branch to the handler block and we do not want it to happen anywhere within the basic block. Moreover, by marking it "terminator and branch" the machine verifier does not wrongly assume (because of AnalyzeBranch not knowing better) the branch is analyzable. Indeed, the target was seeing only the unconditional branch and not the faulting load op and thought it was a simple unconditional block. The machine verifier was complaining because of that and moreover, other optimizations could have done wrong transformation! In the process, simplify the representation of the handler block in the faulting load op. Now, we directly reference the handler block instead of using a label. This has the benefits of: 1. MC knows how to issue a label for a BB, so leave that to it. 2. Accessing the target BB from its label is painful, whereas it is direct from a MBB operand. Note: The 2 bytes offset in implicit-null-check.ll comes from the fact the unconditional jumps are not removed anymore, as the whole terminator sequence is not analyzable anymore. Will fix it in a subsequence commit. llvm-svn: 268327
* [X86][SSE] Added placeholder for 128/256-bit wide shuffle combinesSimon Pilgrim2016-05-021-6/+14
| | | | | | Begun adding placeholder for future support for vperm2f128/vshuff64x2 style 128/256-bit wide shuffles llvm-svn: 268306
* AMDGPU: Custom lower v2i32 loads and storesMatt Arsenault2016-05-021-7/+39
| | | | | | | This will allow us to split up 64-bit private accesses when necessary. llvm-svn: 268296
* AMDGPU/SI: Use v_readfirstlane_b32 when restoring SGPRs spilled to scratchTom Stellard2016-05-021-2/+1
| | | | | | | | | We were using v_readlane_b32 with the lane set to zero, but this won't work if thread 0 is not active. Differential Revision: http://reviews.llvm.org/D19745 llvm-svn: 268295
* AMDGPU: Make i64 loads/stores promote to v2i32Matt Arsenault2016-05-022-55/+12
| | | | | | | | | | | | Now that unaligned access expansion should not attempt to produce i64 accesses, we can remove the hack in PreprocessISelDAG where this is done. This allows splitting i64 private accesses while allowing the new add nodes indexing the vector components can be folded with the base pointer arithmetic. llvm-svn: 268293
* Fix instance of -Winconsistent-missing-override in AMDGPU codeReid Kleckner2016-05-021-1/+1
| | | | llvm-svn: 268289
* AMDGPU/SI: Set the kill flag on temp VGPRs used to restore SGPRs from scratchTom Stellard2016-05-021-1/+1
| | | | | | | | | | | | | | | | | | | | | Summary: When we restore an SGPR value from scratch, we first load it into a temporary VGPR and then use v_readlane_b32 to copy the value from the VGPR back into an SGPR. We weren't setting the kill flag on the VGPR in the v_readlane_b32 instruction, so the register scavenger wasn't able to re-use this temp value later. I wasn't able to create a lit test for this. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19744 llvm-svn: 268287
* ARM: fix handling of SUB immediates in peephole opt.Tim Northover2016-05-021-12/+30
| | | | | | | | | | | We were negating an immediate that was going to be used in a SUBri form unnecessarily. Since ADD/SUB are very similar we *can* do that, but we have to change the SUB to an ADD at the same time. This also applies to ADD, and allows us to handle a slightly larger range of immediates for those two operations. rdar://25992245 llvm-svn: 268276
OpenPOWER on IntegriCloud