summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* GlobalISel: mark simple ops legal even on types < 32-bit.Tim Northover2016-08-251-4/+3
| | | | | | | | The 32-bit variants of these operations don't depend on the bits not being operated on, so they also naturally model operations narrower than the actual register width. llvm-svn: 279760
* GlobalISel: mark pointer constants as legal on AArch64.Tim Northover2016-08-251-0/+2
| | | | llvm-svn: 279759
* GlobalISel: perform multi-step legalizationTim Northover2016-08-251-0/+18
| | | | llvm-svn: 279758
* GlobalISel: mark small extends as legal on AArch64Tim Northover2016-08-251-0/+13
| | | | llvm-svn: 279757
* [X86] 512-bit VPAVG requires AVX512BWMichael Kuperstein2016-08-251-4/+4
| | | | | | | | | Fix VPAVG detection to require AVX512BW, not AVX512F for 512-bit widths, and change associated asserts to assert in the right direction... This fixes PR29111. llvm-svn: 279755
* [X86][SSE] INSERTPS is only combined on v4f32 types. NFCI.Simon Pilgrim2016-08-251-2/+1
| | | | llvm-svn: 279751
* [Hexagon] Remove extraneous debug output from HexagonCopyToCombine.cpp Ron Lieberman2016-08-251-1/+0
| | | | | | BB# ... llvm-svn: 279750
* Fix line endingsSimon Pilgrim2016-08-251-16/+16
| | | | llvm-svn: 279745
* [Hexagon] vector store print tracing.Ron Lieberman2016-08-251-4/+14
| | | | | | | | Add vector store print tracing option for hexagon vector instructions. https://reviews.llvm.org/D23870 llvm-svn: 279739
* [X86][AVX] Provide SubVectorBroadcast fallback if load fold fails (PR29133)Simon Pilgrim2016-08-253-2/+79
| | | | | | Fix for PR29133, matching the approach that was taken for AVX1 scalar broadcasts. llvm-svn: 279735
* [X86] Simplify getOperandBias as a bit. NFCCraig Topper2016-08-251-12/+11
| | | | | | There's no reason for it to return a signed type. Just return the operand bias in each if instead of starting from 0 and adding in the 'if'. llvm-svn: 279720
* [X86] Fix indentation per coding standards. NFCCraig Topper2016-08-251-9/+9
| | | | llvm-svn: 279719
* MachineFunctionProperties/MIRParser: Rename AllVRegsAllocated->NoVRegs, ↵Matthias Braun2016-08-2542-42/+42
| | | | | | | | | | | | | compute it Rename AllVRegsAllocated to NoVRegs. This avoids the connotation of running after register and simply describes that no vregs are used in a machine function. With that we can simply compute the property and do not need to dump/parse it in .mir files. Differential Revision: http://reviews.llvm.org/D23850 llvm-svn: 279698
* Make some LLVM_CONSTEXPR variables const. NFC.George Burgess IV2016-08-251-1/+1
| | | | | | | | | | This patch changes LLVM_CONSTEXPR variable declarations to const variable declarations, since LLVM_CONSTEXPR expands to nothing if the current compiler doesn't support constexpr. In all of the changed cases, it looks like the code intended the variable to be const instead of sometimes-constexpr sometimes-not. llvm-svn: 279696
* [WebAssembly] Change a comment lineHeejin Ahn2016-08-241-1/+2
| | | | | | Test for commit access. llvm-svn: 279683
* [Hexagon] Check for block end when skipping debug instructionsKrzysztof Parzyszek2016-08-241-4/+3
| | | | llvm-svn: 279681
* [Hexagon] Change insertion of expand-condsets pass to avoid memory leaksKrzysztof Parzyszek2016-08-242-5/+10
| | | | llvm-svn: 279678
* ARM: don't diagnose cbz/cbnz to Thumb functions.Tim Northover2016-08-241-1/+2
| | | | | | | | A branch-distance to a Thumb function shouldn't be forced to be odd for CBZ/CBNZ instructions because (assuming it's within range), it's going to be a valid, even offset. llvm-svn: 279665
* AMDGCN/SI: Implement readlane/readfirstlane intrinsicsChangpeng Fang2016-08-241-4/+5
| | | | | | | | | | | | | | | Summary: This patch implements readlane/readfirstlane intrinsics. TODO: need to define a new register class to consider the case that the source could be a vector register or M0. Reviewed by: arsenm and tstellarAMD Differential Revision: http://reviews.llvm.org/D22489 llvm-svn: 279660
* Use isTargetMachO instead of isTargetDarwin.Rafael Espindola2016-08-241-1/+1
| | | | llvm-svn: 279655
* [X86][SSE] Add MINSD/MAXSD/MINSS/MAXSS intrinsic scalar load folding supportSimon Pilgrim2016-08-241-0/+4
| | | | | | These are no different in load behaviour to the existing ADD/SUB/MUL/DIV scalar ops but were missing from isNonFoldablePartialRegisterLoad llvm-svn: 279652
* [AArch64] Adjust the feature set for Exynos M1.Evandro Menezes2016-08-241-1/+2
| | | | | | Enable zero cycle zeroing. llvm-svn: 279648
* [X86][SSE] Add support for combining VZEXT_MOVL target shufflesSimon Pilgrim2016-08-241-12/+31
| | | | | | | | | | Includes adding more general support for the pattern: VZEXT_MOVL(VZEXT_LOAD(ptr)) -> VZEXT_LOAD(ptr) This has unearthed a couple of latent poor codegen issues (MINSS/MAXSS scalar load folding and MOVDDUP/BROADCAST load folding patterns), which will be fixed shortly. Its also reduced a couple of tests so that they no longer reach the instruction threshold necessary to be combined to PSHUFB (see PR26183). llvm-svn: 279646
* [Hexagon] Enable subregister liveness trackingKrzysztof Parzyszek2016-08-241-1/+1
| | | | llvm-svn: 279642
* [Hexagon] Remove the utilization of IMPLICIT_DEFs from expand-condsetsKrzysztof Parzyszek2016-08-241-104/+1
| | | | | | | This is no longer necessary, because since r279625 the subregister liveness properly accounts for read-undefs. llvm-svn: 279637
* AMDGPU : Add V_SAD_U32 instruction pattern.Wei Ding2016-08-242-0/+27
| | | | | | Differential Revision: http://reviews.llvm.org/D23069 llvm-svn: 279629
* Create subranges for new intervals resulting from live interval splittingKrzysztof Parzyszek2016-08-241-1/+7
| | | | | | | | | | | | | | | | | | | The register allocator can split a live interval of a register into a set of smaller intervals. After the allocation of registers is complete, the rewriter will modify the IR to replace virtual registers with the corres- ponding physical registers. At this stage, if a register corresponding to a subregister of a virtual register is used, the rewriter will check if that subregister is undefined, and if so, it will add the <undef> flag to the machine operand. The function verifying liveness of the subregis- ter would assume that it is undefined, unless any of the subranges of the live interval proves otherwise. The problem is that the live intervals created during splitting do not have any subranges, even if the original parent interval did. This could result in the <undef> flag placed on a register that is actually defined. Differential Revision: http://reviews.llvm.org/D21189 llvm-svn: 279625
* [mips] Preparatory work for a generic schedulerSimon Dardis2016-08-248-360/+566
| | | | | | | | | | | | | Extend instruction definitions from nearly all ISAs to include appropriate instruction itineraries. Change MIPS16s gp prologue generation to use real instructions instead of using a pseudo instruction. Reviewers: dsanders, vkalintiris Differential Review: https://reviews.llvm.org/D23548 llvm-svn: 279623
* [X86][AVX2] Ensure on 32-bit targets that we broadcast f64 types not i64 ↵Simon Pilgrim2016-08-241-0/+7
| | | | | | (PR29101) llvm-svn: 279622
* [X86][SSE] Add support for 32-bit element vectors to X86ISD::VZEXT_LOADSimon Pilgrim2016-08-243-23/+33
| | | | | | | | | | | | | | Consecutive load matching (EltsFromConsecutiveLoads) currently uses VZEXT_LOAD (load scalar into lowest element and zero uppers) for vXi64 / vXf64 vectors only. For vXi32 / vXf32 vectors it instead creates a scalar load, SCALAR_TO_VECTOR and finally VZEXT_MOVL (zero upper vector elements), relying on tablegen patterns to match this into an equivalent of VZEXT_LOAD. This patch adds the VZEXT_LOAD patterns for vXi32 / vXf32 vectors directly and updates EltsFromConsecutiveLoads to use this. This has proven necessary to allow us to easily make VZEXT_MOVL a full member of the target shuffle set - without this change the call to combineShuffle (which is the main caller of EltsFromConsecutiveLoads) tended to recursively recreate VZEXT_MOVL nodes...... Differential Revision: https://reviews.llvm.org/D23673 llvm-svn: 279619
* CodeGen: Remove MachineFunctionAnalysis => Enable (Machine)ModulePassesMatthias Braun2016-08-2410-17/+1
| | | | | | | | | | | | | | | | | | | | | | Re-apply this patch, hopefully I will get away without any warnings in the constructor now. This patch removes the MachineFunctionAnalysis. Instead we keep a map from IR Function to MachineFunction in the MachineModuleInfo. This allows the insertion of ModulePasses into the codegen pipeline without breaking it because the MachineFunctionAnalysis gets dropped before a module pass. Peak memory should stay unchanged without a ModulePass in the codegen pipeline: Previously the MachineFunction was freed at the end of a codegen function pipeline because the MachineFunctionAnalysis was dropped; With this patch the MachineFunction is freed after the AsmPrinter has finished. Differential Revision: http://reviews.llvm.org/D23736 llvm-svn: 279602
* [stackmaps] More extraction of common code [NFCI]Philip Reames2016-08-234-11/+10
| | | | | | General cleanup before starting to work on the part I want to actually change. llvm-svn: 279586
* Revert r279564. It introduces undefined behavior (binding a reference to aRichard Smith2016-08-2310-1/+17
| | | | | | | dereferenced null pointer) in MachineModuleInfo::MachineModuleInfo that causes -Werror builds (including several buildbots) to fail. llvm-svn: 279580
* MachineFunction: Introduce NoPHIs propertyMatthias Braun2016-08-231-2/+5
| | | | | | | | | | | | | I want to compute the SSA property of .mir files automatically in upcoming patches. The problem with this is that some inputs will be reported as static single assignment with some passes claiming not to support SSA form. In reality though those passes do not support PHI instructions => Track the presence of PHI instructions separate from the SSA property. Differential Revision: https://reviews.llvm.org/D22719 llvm-svn: 279573
* GlobalISel: legalize integer comparisons on AArch64.Tim Northover2016-08-231-0/+13
| | | | | | | Next step is doing both legalizations at the same time! Marvel at GlobalISel's cunning. llvm-svn: 279566
* GlobalISel: legalize conditional branches on AArch64.Tim Northover2016-08-231-0/+6
| | | | llvm-svn: 279565
* CodeGen: Remove MachineFunctionAnalysis => Enable (Machine)ModulePassesMatthias Braun2016-08-2310-17/+1
| | | | | | | | | | | | | | | | | | | | | | | Re-apply this commit with the deletion of a MachineFunction delegated to a separate pass to avoid use after free when doing this directly in AsmPrinter. This patch removes the MachineFunctionAnalysis. Instead we keep a map from IR Function to MachineFunction in the MachineModuleInfo. This allows the insertion of ModulePasses into the codegen pipeline without breaking it because the MachineFunctionAnalysis gets dropped before a module pass. Peak memory should stay unchanged without a ModulePass in the codegen pipeline: Previously the MachineFunction was freed at the end of a codegen function pipeline because the MachineFunctionAnalysis was dropped; With this patch the MachineFunction is freed after the AsmPrinter has finished. Differential Revision: http://reviews.llvm.org/D23736 llvm-svn: 279564
* GlobalISel: extend legalizer interface to handle multiple types.Tim Northover2016-08-231-14/+21
| | | | | | | | Instructions like G_ICMP have multiple types that may need to be legalized (the boolean output and nearly arbitrary inputs in this case). So the legalizer must be capable of deciding what to do for each of them separately. llvm-svn: 279554
* GlobalISel: mark pointer casts legal on AArch64.Tim Northover2016-08-231-0/+3
| | | | llvm-svn: 279553
* GlobalISel: legalize 1-bit load/store and mark 8/16 bit variants legal on ↵Tim Northover2016-08-231-2/+5
| | | | | | AArch64. llvm-svn: 279548
* [Hexagon] Packetize return value setup with the return instructionKrzysztof Parzyszek2016-08-231-3/+4
| | | | | | Commit r279241 unintentionally reverted that ability. llvm-svn: 279526
* [lanai] Use const instead of constexprJacques Pienaar2016-08-231-2/+2
| | | | | | The windows build bot did not like constexpr. llvm-svn: 279517
* Fix SystemZ hang caused by r279105Elliot Colp2016-08-232-29/+55
| | | | | | | | | The change in r279105 causes an infinite loop in some cases, as it sets the upper bits of an AND mask constant, which DAGCombiner::SimplifyDemandedBits then unsets. This patch reverts that part of the behaviour, instead relying on .td peepholes to perform the transformation to NILL. I reapplied my original fix for the problem addressed by r279105 (unsetting the upper bits, which prevents a compiler abort for a different reason). Differential Revision: https://reviews.llvm.org/D23781 llvm-svn: 279515
* LLVMLanaDesc: Update libdesp.NAKAMURA Takumi2016-08-231-1/+1
| | | | llvm-svn: 279510
* Change the target's name, s/LanaiMCTargetDesc/LanaiDesc/g.NAKAMURA Takumi2016-08-235-5/+5
| | | | | | "AllTargetsDescs" in llvm-mc/CMakeLists.txt expects not ${target}MCTargetDesc, but ${target}Desc. llvm-svn: 279509
* [ARM] Generate consistent frame records for Thumb2Oliver Stannard2016-08-234-33/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is not an official documented ABI for frame pointers in Thumb2, but we should try to emit something which is useful. We use r7 as the frame pointer for Thumb code, which currently means that if a function needs to save a high register (r8-r11), it will get pushed to the stack between the frame pointer (r7) and link register (r14). This means that while a stack unwinder can follow the chain of frame pointers up the stack, it cannot know the offset to lr, so does not know which functions correspond to the stack frames. To fix this, we need to push the callee-saved registers in two batches, with the first push saving the low registers, fp and lr, and the second push saving the high registers. This is already implemented, but previously only used for iOS. This patch turns it on for all Thumb2 targets when frame pointers are required by the ABI, and the frame pointer is r7 (Windows uses r11, so this isn't a problem there). If frame pointer elimination is enabled we still emit a single push/pop even if we need a frame pointer for other reasons, to avoid increasing code size. We must also ensure that lr is pushed to the stack when using a frame pointer, so that we end up with a complete frame record. Situations that could cause this were rare, because we already push lr in most situations so that we can return using the pop instruction. Differential Revision: https://reviews.llvm.org/D23516 llvm-svn: 279506
* Revert "(HEAD -> master, origin/master, origin/HEAD) CodeGen: Remove ↵Matthias Braun2016-08-2310-1/+17
| | | | | | | | | | MachineFunctionAnalysis => Enable (Machine)ModulePasses" Reverting while tracking down a use after free. This reverts commit r279502. llvm-svn: 279503
* CodeGen: Remove MachineFunctionAnalysis => Enable (Machine)ModulePassesMatthias Braun2016-08-2310-17/+1
| | | | | | | | | | | | | | | | | | | This patch removes the MachineFunctionAnalysis. Instead we keep a map from IR Function to MachineFunction in the MachineModuleInfo. This allows the insertion of ModulePasses into the codegen pipeline without breaking it because the MachineFunctionAnalysis gets dropped before a module pass. Peak memory should stay unchanged without a ModulePass in the codegen pipeline: Previously the MachineFunction was freed at the end of a codegen function pipeline because the MachineFunctionAnalysis was dropped; With this patch the MachineFunction is freed after the AsmPrinter has finished. Differential Revision: http://reviews.llvm.org/D23736 llvm-svn: 279502
* BranchRelaxation: Fix handling of blocks with multiple conditionalMatt Arsenault2016-08-231-6/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | branches Looping over all terminators exposed AArch64 tests hitting an assert from analyzeBranch failing. I believe these cases were miscompiled before. e.g. fcmp s0, s1 b.ne LBB0_1 b.vc LBB0_2 b LBB0_2 LBB0_1: ; Large block LBB0_2: ; ... Both of the individual conditional branches need to be expanded, since neither can reach the final block. Split the original block into ones which analyzeBranch will be able to understand. llvm-svn: 279499
* [lanai] Exit early in Mem Alu combiner if sentinel reach.Jacques Pienaar2016-08-231-0/+3
| | | | | | LanaiMemAluCombiner could try to query the debug value of a list sentinel. Add check to exit early instead. llvm-svn: 279497
OpenPOWER on IntegriCloud