path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* [Sparc] Implement __builtin_setjmp, __builtin_longjmp back-end.Chris Dewhurst2016-05-043-21/+292
| | | | | | | | | | | | | | This code implements builtin_setjmp and builtin_longjmp exception handling intrinsics for 32-bit Sparc back-ends. The code started as a mash-up of the PowerPC and X86 versions, although there are sufficient differences to both that had to be made for Sparc handling. Note: I have manual tests running. I'll work on a unit test and add that to the rest of this diff in the next day. Also, this implementation is only for 32-bit Sparc. I haven't focussed on a 64-bit version, although I have left the code in a prepared state for implementing this, including detecting pointer size and comments indicating where I suspect there may be differences. Differential Revision: http://reviews.llvm.org/D19798 llvm-svn: 268483
* [X86] Lower zext i1 argumentsDavid Majnemer2016-05-041-0/+15
| | | | | | | | | | | | | i1 is now a legal type for X86 with AVX512. There were some paths in X86FastISel which were not quite ready to see an i1 value: they were not quite sure how to deal with sign/zero extends for call arguments. DTRT by extending to i8 for zeroext and bailing out of FastISel for signext. This fixes PR27591. llvm-svn: 268470
* [X86] Tidied up SDValue's SDNode referencing. NFCI.Simon Pilgrim2016-05-031-5/+5
| | | | llvm-svn: 268445
* X86-Darwin: start emitting data-region directives for jump-tables.Tim Northover2016-05-031-1/+1
| | | | | | The surrounding tools can cope these days, and they were invented for a reason. llvm-svn: 268437
* Add an address space for the X86 SS segment.David L Kreitzer2016-05-031-2/+8
| | | | | | | | Patch by Michael LeMay (michael.lemay@intel.com) Differential Revision: http://reviews.llvm.org/D17093 llvm-svn: 268431
* AMDGPU/SI: Use range loops to simplify some code in the SI SchedulerTom Stellard2016-05-031-18/+18
| | | | | | | | | | Reviewers: arsenm, axeldavy Subscribers: MatzeB, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19822 llvm-svn: 268396
* Silence unused variable warning; NFC.Aaron Ballman2016-05-031-2/+1
| | | | llvm-svn: 268392
* [X86][SSE] Added target shuffle combine to MOVQ Simon Pilgrim2016-05-031-0/+16
| | | | llvm-svn: 268391
* [Sparc] Constification of TargetMachine argumentsJames Y Knight2016-05-034-4/+4
| | | | | | | | | | | This patch changes the TargetMachine arguments to be const. This is required for {D19265}, and was requested to be done in a separate patch. Patch by Jacob Hansen! Differential Revision: http://reviews.llvm.org/D19797 llvm-svn: 268389
* [mips][fastisel] ADJCALLSTACKUP has a second immediate operand.Daniel Sanders2016-05-031-1/+1
| | | | | | | | | | | | | | Summary: It's always zero for SelectionDAG and is never read by the MIPS backend so do the same for FastISel. Reviewers: sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: http://reviews.llvm.org/D19863 llvm-svn: 268386
* [mips] Fix unused variable warning for release builds introduced by r268379.Daniel Sanders2016-05-031-3/+1
| | | | llvm-svn: 268383
* [mips] Use MipsMCExpr instead of MCSymbolRefExpr for all relocations.Daniel Sanders2016-05-0313-404/+571
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is much closer to the way MIPS relocation expressions work (%hi(foo + 2) rather than %hi(foo) + 2) and removes the need for the various bodges in MipsAsmParser::evaluateRelocExpr(). Removing those bodges ensures that the constant stored in MCValue is the full 32 or 64-bit (depending on ABI) offset from the symbol. This will be used to correct the %hi/%lo matching needed to sort the relocation table correctly. As part of this: * Gave MCExpr::print() the ability to omit parentheses when emitting a symbol reference inside a MipsMCExpr operator like %hi(X). Without this we print things like %lo(($L1)). * %hi(%neg(%gprel(X))) is now three MipsMCExpr's instead of one. Most of the related special cases have been removed or moved to MipsMCExpr. We can remove the rest as we gain support for the less common relocations when they are not part of this specific combination. * Renamed MipsMCExpr::VariantKind and the enum prefix ('VK_') to avoid confusion with MCSymbolRefExpr::VariantKind and its prefix (also 'VK_'). * fixup_Mips_GOT_Local and fixup_Mips_GOT_Global were found to be identical and merged into fixup_Mips_GOT. * MO_GOT16 and MO_GOT turned out to be identical and have been merged into MO_GOT. * VK_Mips_GOT and VK_Mips_GOT16 turned out to be the same thing so they have been merged into MEK_GOT. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D19716 llvm-svn: 268379
* [AVX512] Add support for commutative MAX/MIN. In general VMAX{PS,PD} and ↵Igor Breger2016-05-031-2/+6
| | | | | | VMIN{PS,PD} instructions are not commutative. In the combine pass, VMAX/VMIN are converted to the commutative nodes VMAXC/VMINC only if UnsafeFPMath is used. Differential Revision: http://reviews.llvm.org/D19860 llvm-svn: 268375
* [AVX512] Fix lowerV4X128VectorShuffle to correctly select input operands.Igor Breger2016-05-031-4/+18
| | | | | | Differential Revision: http://reviews.llvm.org/D19803 llvm-svn: 268368
* AArch64/optimizeCondBranch: Remove earlier kill flag when forming TBZMatthias Braun2016-05-031-0/+2
| | | | | | | This fixes -verify-machineinstrs complaints when compiling test-suite/SingleSource/Benchmarks/Shootout-C++/wordfreq.cpp llvm-svn: 268360
* livePhysRegs: Pass MBB by reference in addLive{Ins|Outs}(); NFCMatthias Braun2016-05-038-10/+10
| | | | | | | The block must not be nullptr for the addLiveIns()/addLiveOuts() functions. llvm-svn: 268340
* LivePhysRegs: Automatically determine presence of pristine regs.Matthias Braun2016-05-036-8/+8
| | | | | | | | | | | | | | | | | | | | | | Remove the AddPristinesAndCSRs parameters from addLiveIns()/addLiveOuts(). We need to respect pristine registers after prologue epilogue insertion. Seeing that we got this wrong in at least two commits already, we should rather pay the small price to query MachineFrameInfo for it. There are three cases that did not set AddPristineAndCSRs to true even after register allocation: - ExecutionDepsFix: live-out registers are used as a hint that the register is used soon. This is not true for pristine registers so use the new addLiveOutsNoPristines() to maintain this behaviour. - SystemZShortenInst: Not setting AddPristineAndCSRs to true looks like a bug, should do the right thing automatically now. - StackMapLivenessAnalysis: Not adding pristine registers looks like a bug to me. Added a FIXME comment but maintain the current behaviour as a change may need to get coordinated with GC runtimes. llvm-svn: 268336
* [X86] Model FAULTING_LOAD_OP as a terminator and branch.Quentin Colombet2016-05-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | This operation may branch to the handler block and we do not want it to happen anywhere within the basic block. Moreover, by marking it "terminator and branch" the machine verifier does not wrongly assume (because of AnalyzeBranch not knowing better) the branch is analyzable. Indeed, the target was seeing only the unconditional branch and not the faulting load op and thought it was a simple unconditional block. The machine verifier was complaining because of that and moreover, other optimizations could have performed wrong transformations! In the process, simplify the representation of the handler block in the faulting load op. Now, we directly reference the handler block instead of using a label. This has the benefits of: 1. MC knows how to issue a label for a BB, so leave that to it. 2. Accessing the target BB from its label is painful, whereas it is direct from a MBB operand. Note: The 2 bytes offset in implicit-null-check.ll comes from the fact the unconditional jumps are not removed anymore, as the whole terminator sequence is not analyzable anymore. Will fix it in a subsequent commit. llvm-svn: 268327
* [X86][SSE] Added placeholder for 128/256-bit wide shuffle combinesSimon Pilgrim2016-05-021-6/+14
| | | | | | Began adding a placeholder for future support for vperm2f128/vshuff64x2 style 128/256-bit wide shuffles llvm-svn: 268306
* AMDGPU: Custom lower v2i32 loads and storesMatt Arsenault2016-05-021-7/+39
| | | | | | | This will allow us to split up 64-bit private accesses when necessary. llvm-svn: 268296
* AMDGPU/SI: Use v_readfirstlane_b32 when restoring SGPRs spilled to scratchTom Stellard2016-05-021-2/+1
| | | | | | | | | We were using v_readlane_b32 with the lane set to zero, but this won't work if thread 0 is not active. Differential Revision: http://reviews.llvm.org/D19745 llvm-svn: 268295
* AMDGPU: Make i64 loads/stores promote to v2i32Matt Arsenault2016-05-022-55/+12
| | | | | | | | | | | | Now that unaligned access expansion should not attempt to produce i64 accesses, we can remove the hack in PreprocessISelDAG where this is done. This allows splitting i64 private accesses while allowing the new add nodes indexing the vector components to be folded with the base pointer arithmetic. llvm-svn: 268293
* Fix instance of -Winconsistent-missing-override in AMDGPU codeReid Kleckner2016-05-021-1/+1
| | | | llvm-svn: 268289
* AMDGPU/SI: Set the kill flag on temp VGPRs used to restore SGPRs from scratchTom Stellard2016-05-021-1/+1
| | | | | | | | | | | | | | | | | | | | | Summary: When we restore an SGPR value from scratch, we first load it into a temporary VGPR and then use v_readlane_b32 to copy the value from the VGPR back into an SGPR. We weren't setting the kill flag on the VGPR in the v_readlane_b32 instruction, so the register scavenger wasn't able to re-use this temp value later. I wasn't able to create a lit test for this. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19744 llvm-svn: 268287
* ARM: fix handling of SUB immediates in peephole opt.Tim Northover2016-05-021-12/+30
| | | | | | | | | | | We were negating an immediate that was going to be used in a SUBri form unnecessarily. Since ADD/SUB are very similar we *can* do that, but we have to change the SUB to an ADD at the same time. This also applies to ADD, and allows us to handle a slightly larger range of immediates for those two operations. rdar://25992245 llvm-svn: 268276
* [NVPTX] Fix sign/zero-extending ldg/ldu instruction selectionJustin Holewinski2016-05-023-48/+77
| | | | | | | | | | | | | | | | | Summary: We don't have sign-/zero-extending ldg/ldu instructions defined, so we need to emulate them with explicit CVTs. We were originally handling the i8 case, but not any other cases. Fixes PR26185 Reviewers: jingyue, jlebar Subscribers: jholewinski Differential Revision: http://reviews.llvm.org/D19615 llvm-svn: 268272
* AMDGPU: Move R600 specific code out of AMDGPUISelLowering.cppTom Stellard2016-05-023-39/+51
| | | | | | | | | | Reviewers: arsenm Subscribers: jvesely, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19736 llvm-svn: 268267
* AMDGPU/SI: Fix bug in SIInstrInfo::insertWaitStates() uncovered by r268260Tom Stellard2016-05-021-1/+2
| | | | | | | We can't use MI->getDebugLoc() when MI is an iterator that could be MBB.end(). llvm-svn: 268265
* AMDGPU/SI: Use the hazard recognizer to break SMEM soft clausesTom Stellard2016-05-023-4/+72
| | | | | | | | | | | | | | | Summary: Add support for detecting hazards in SMEM soft clauses, so that we only break the clauses when necessary, either by adding s_nop or re-ordering other alu instructions. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18870 llvm-svn: 268260
* AMDGPU: llvm.SI.fs.constant is a source of divergenceNicolai Haehnle2016-05-021-0/+1
| | | | | | | | | | | | | | | | Summary: This intrinsic is used to get flat-shaded fragment shader inputs. Those are uniform across a primitive, but a fragment shader wave may process pixels from multiple primitives (as indicated by the prim_mask), and so that's where divergence can arise. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19747 llvm-svn: 268259
* [WebAssembly] Rename memory_size intrinsic to current_memoryDerek Schuff2016-05-021-9/+9
| | | | | | This follows the recent renaming in the wasm spec. llvm-svn: 268255
* AMDGPU/SI: Use hazard recognizer to detect DPP hazardsTom Stellard2016-05-023-55/+27
| | | | | | | | | | Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18603 llvm-svn: 268247
* [X86][SSE] Dropped X86ISD::FGETSIGNx86 and use MOVMSK instead for FGETSIGN ↵Simon Pilgrim2016-05-024-37/+12
| | | | | | | | lowering movmsk.ll tests are unchanged. llvm-svn: 268237
* Cleanup comments. NFC.Chad Rosier2016-05-022-3/+4
| | | | llvm-svn: 268236
* Cleanup comments. NFC.Chad Rosier2016-05-021-4/+3
| | | | llvm-svn: 268235
* Silence unused variable warnings; NFC.Aaron Ballman2016-05-021-9/+4
| | | | llvm-svn: 268234
* Enable the X86 call frame optimization for the 64-bit targets that allow it.David L Kreitzer2016-05-022-16/+36
| | | | | | | | Fixes PR27241. Differential Revision: http://reviews.llvm.org/D19688 llvm-svn: 268227
* [SystemZ] Fix in restoreCalleeSavedRegisters()Jonas Paulsson2016-05-021-1/+2
| | | | | | | | Only add operands for GRs to the LMG. Reviewed by Ulrich Weigand. llvm-svn: 268216
* [SystemZ] Mark CC defs as dead whenever possible.Jonas Paulsson2016-05-023-5/+25
| | | | | | | | | | | | | | Marking implicit CC defs as dead everywhere except when CC is actually defined and used explicitly is important, since the post-ra scheduler will otherwise insert edges between instructions unnecessarily. Also temporarily disable LA(Y) -> AGSI optimization in foldMemoryOperandImpl(), since this introduces a def of the CC reg, which is illegal unless it is known to be dead. Reviewed by Ulrich Weigand. llvm-svn: 268215
* [X86] Fix a bug in LOCK arithmetic operation pattern matching where the ↵Craig Topper2016-05-021-1/+1
| | | | | | | | wrong immediate predicate check was being used for 64-bit instructions with 8-bit immediates. This didn't cause a bug because the order of the patterns ensured that the 64-bit instructions with 32-bit immediates were selected first. llvm-svn: 268212
* [AVX512] VPACKUSWB/VPACKSSWB should not be encoded with EVEX.W=1. While ↵Craig Topper2016-05-011-4/+4
| | | | | | there fix the execution domain for VPACKSSDW/VPACKUSDW. llvm-svn: 268200
* Change AVX512 broadcastsd/ss patterns interaction with spilling. New ↵Igor Breger2016-05-013-110/+98
| | | | | | | | implementation takes a scalar register and generates a vector without COPY_TO_REGCLASS (turning it into a VR128 register). The issue is that during register allocation we may spill a scalar value using 128-bit loads and stores, wasting cache bandwidth. Differential Revision: http://reviews.llvm.org/D19579 llvm-svn: 268190
* [AVX512] Prefer AVX512 VPACK instructions over AVX/AVX2 instructions when ↵Craig Topper2016-05-011-3/+3
| | | | | | VLX and BWI are supported. llvm-svn: 268189
* [AVX512] Add HasVLX to the 128/256-bit versions of VPACKSSDW/USDW/SSWB/USWB ↵Craig Topper2016-05-011-13/+14
| | | | | | and VPMADDUBSW/VPMADDWD. llvm-svn: 268188
* [AVX512] Make sure 128/256-bit DQI versions of VAND/VANDN/VOR/VXOR are also ↵Craig Topper2016-05-011-16/+16
| | | | | | marked as requiring VLX. llvm-svn: 268186
* [X86] Add an AddedComplexity to another pattern to put it near similar in ↵Craig Topper2016-05-011-2/+1
| | | | | | the output file. llvm-svn: 268184
* [X86] Remove a seemingly unused pattern. The same pattern appears elsewhere ↵Craig Topper2016-05-011-2/+0
| | | | | | with an AddedComplexity that made this unreachable. llvm-svn: 268183
* [X86] Add AddedComplexity to keep some similar patterns near each other in ↵Craig Topper2016-05-011-0/+1
| | | | | | the output file. llvm-svn: 268181
* [X86] Remove some redundant selection patterns.Craig Topper2016-05-012-11/+0
| | | | llvm-svn: 268180
* [AVX512] Replace vector_extract with extractelt in some patterns. They mean ↵Craig Topper2016-05-011-5/+5
| | | | | | the same thing but vector_extract is deprecated. NFC llvm-svn: 268179