summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* [WebAssembly] Only emit stack pointer delcaration in BinFormatWasm assemblyDerek Schuff2017-12-061-2/+4
| | | | llvm-svn: 319870
* [X86] Update to getSetCCResultType to be more robust to EVT types.Craig Topper2017-12-061-28/+17
| | | | | | Attempt to determine what the type will be legalized to and then analyze that to see if we will be able to use a vXi1 compare. llvm-svn: 319861
* [AArch64] Do not abort if overflow check does not use EQ or NE.Joel Galenson2017-12-051-3/+2
| | | | | | | | | | As suggested by Eli Friedman, instead of aborting if an overflow check uses something other than SETEQ or SETNE, simply do not apply the optimization. Differential Revision: https://reviews.llvm.org/D39147 llvm-svn: 319837
* [X86][AVX512] Tag BLENDM instruction scheduler classesSimon Pilgrim2017-12-051-28/+45
| | | | llvm-svn: 319833
* [X86][AVX512] Tag GATHER/SCATTER instruction scheduler classesSimon Pilgrim2017-12-052-7/+11
| | | | | NOTE: At the moment these use the WriteLoad/WriteStore classes, which severely underestimates the costs. This needs to be reviewed. llvm-svn: 319829
* AMDGPU: Fix SDWA crash on inline asmMatt Arsenault2017-12-051-1/+2
| | | | | | | | This was only searching for explicit defs, and asserting for any implicit or variadic instruction defs, like inline asm. llvm-svn: 319826
* Re-commit r319490 "XOR the frame pointer with the stack cookie when ↵Hans Wennborg2017-12-054-0/+41
| | | | | | | | | | | | | | | | | | protecting the stack" The patch originally broke Chromium (crbug.com/791714) due to its failing to specify that the new pseudo instructions clobber EFLAGS. This commit fixes that. > Summary: This strengthens the guard and matches MSVC. > > Reviewers: hans, etienneb > > Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits > > Differential Revision: https://reviews.llvm.org/D40622 llvm-svn: 319824
* [X86][AVX512] Tag VPSLLDQ/VPSRLDQ instruction scheduler classesSimon Pilgrim2017-12-051-9/+20
| | | | llvm-svn: 319822
* Modify ModRefInfo values using static inline method abstractions [NFC].Alina Sbirlea2017-12-051-1/+2
| | | | | | | | | | | | | | | | | Summary: The aim is to make ModRefInfo checks and changes more intuitive and less error prone using inline methods that abstract the bit operations. Ideally ModRefInfo would become an enum class, but that change will require a wider set of changes into FunctionModRefBehavior. Reviewers: sanjoy, george.burgess.iv, dberlin, hfinkel Subscribers: nlopes, llvm-commits Differential Revision: https://reviews.llvm.org/D40749 llvm-svn: 319821
* [SystemZ] Validate shifted compare value in adjustForTestUnderMaskUlrich Weigand2017-12-051-0/+2
| | | | | | | | | | | When folding a shift into a test-under-mask comparison, make sure that there is no loss of precision when creating the shifted comparison value. This usually never happens, except for certain always-true comparisons in unoptimized code. Fixes PR35529. llvm-svn: 319818
* [X86][AVX512] Tag VPTRUNC/VPMOVSX/VPMOVZX instruction scheduler classesSimon Pilgrim2017-12-051-90/+106
| | | | llvm-svn: 319815
* AMDGPU: Fix infinite loop with dbg_valueMatt Arsenault2017-12-051-1/+4
| | | | | | | | | Surprisingly SIOptimizeExecMaskingPreRA can infinite loop in some case with DBG_VALUE. Most tests using dbg_value are run at -O0, so don't run this pass. This seems to only happen when the value argument is undef. llvm-svn: 319808
* [X86][X87] Tag FCMOV instruction scheduler classesSimon Pilgrim2017-12-054-16/+22
| | | | llvm-svn: 319804
* [WebAssembly] Implement WASM_STACK_POINTER.Dan Gohman2017-12-053-9/+14
| | | | | | | Use the .stack_pointer directive to implement WASM_STACK_POINTER for specifying a global variable to be the stack pointer. llvm-svn: 319797
* [WebAssembly] Don't emit .import_global for the wasm target.Dan Gohman2017-12-051-1/+2
| | | | | | | .import_global is used by the ELF-based target and not needed by the wasm target. llvm-svn: 319796
* [X86][AVX512] Tag VNNIW instruction scheduler classesSimon Pilgrim2017-12-051-15/+18
| | | | llvm-svn: 319784
* [X86][AVX512] Drop some default NoItinerary arguments that aren't needed any ↵Simon Pilgrim2017-12-051-9/+10
| | | | | | more llvm-svn: 319782
* [x86][AVX512] Lowering kunpack intrinsics to LLVM IRJina Nahias2017-12-052-3/+47
| | | | | | | | | This patch, together with a matching clang patch (https://reviews.llvm.org/D39719), implements the lowering of X86 kunpack intrinsics to IR. Differential Revision: https://reviews.llvm.org/D39720 Change-Id: I4088d9428478f9457f6afddc90bd3d66b3daf0a1 llvm-svn: 319778
* [X86][AVX512] Tag VPMADD52/VPSADBW instruction scheduler classesSimon Pilgrim2017-12-051-22/+25
| | | | llvm-svn: 319772
* [X86][AVX512] Add missing scalar CMPSS/CMPSD logic scheduler classesSimon Pilgrim2017-12-051-16/+21
| | | | llvm-svn: 319770
* [X86][AVX512] Cleanup bit logic scheduler classesSimon Pilgrim2017-12-051-21/+24
| | | | llvm-svn: 319767
* [X86][AVX512] Tag scalar CVT and CMP instruction scheduler classesSimon Pilgrim2017-12-052-130/+150
| | | | llvm-svn: 319765
* [X86][AVX512] Tag VPCMP/VPCMPU instruction scheduler classesSimon Pilgrim2017-12-051-42/+60
| | | | | | Move hardcoded itinerary out to the instruction declarations. Not sure that IIC_SSE_ALU_F32P is the best schedule for integer comparisons, but I'm not going to change it right now. llvm-svn: 319760
* [X86][AVX512] Cleanup VPCMP scheduler classesSimon Pilgrim2017-12-051-27/+30
| | | | | | Move hardcoded itinerary out to the instruction declarations. Not sure that IIC_SSE_ALU_F32P is the best schedule for integer comparisons, but I'm not going to change it right now. llvm-svn: 319758
* [X86][AVX512] Tag VFIXUPIMM instructions scheduler classesSimon Pilgrim2017-12-051-23/+30
| | | | llvm-svn: 319757
* [SystemZ] set 'guessInstructionProperties = 0' and set flags as needed.Jonas Paulsson2017-12-056-123/+131
| | | | | | | | | | | | | | | | | | | | This has proven a healthy exercise, as many cases of incorrect instruction flags were corrected in the process. As part of this, IntrWriteMem was added to several SystemZ instrinsics. Furthermore, a bug was exposed in TwoAddress with this change (as incorrect hasSideEffects flags were removed and instructions could now be sunk), and the test case for that bugfix (r319646) is included here as test/CodeGen/SystemZ/twoaddr-sink.ll. One temporary test regression (one extra copy) which will hopefully go away in upcoming patches for similar cases: test/CodeGen/SystemZ/vec-trunc-to-i1.ll Review: Ulrich Weigand. https://reviews.llvm.org/D40437 llvm-svn: 319756
* [Regalloc] Generate and store multiple regalloc hints.Jonas Paulsson2017-12-051-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MachineRegisterInfo used to allow just one regalloc hint per virtual register. This patch extends this to a vector of regalloc hints, which is filled in by common code with sorted copy hints. Such hints will make for more ID copies that can be removed. NB! This improvement is currently (and hopefully temporarily) *disabled* by default, except for SystemZ. The only reason for this is the big impact this has on tests, which has unfortunately proven unmanageable. It was a long while since all the tests were updated and just waiting for review (which didn't happen), but now targets have to enable this themselves instead. Several targets could get a head-start by downloading the tests updates from the Phabricator review. Thanks to those who helped, and sorry you now have to do this step yourselves. This should be an improvement generally for any target! The target may still create its own hint, in which case this has highest priority and is stored first in the vector. If it has target-type, it will not be recomputed, as per the previous behaviour. The temporary hook enableMultipleCopyHints() will be removed as soon as all targets return true. Review: Quentin Colombet, Ulrich Weigand. https://reviews.llvm.org/D38128 llvm-svn: 319754
* [X86] Fix a bug in handling GRXX subclasses in Domain Reassignment passGuy Blank2017-12-051-4/+4
| | | | | | | | | | | When trying to determine the correct Mask register class corresponding to a GPR register class, not all register classes were handled. This caused an assertion to be raised on some scenarios. Differential Revision: https://reviews.llvm.org/D40290 llvm-svn: 319745
* [X86] Use vector widening to support sign extend from i1 when the dest type ↵Craig Topper2017-12-051-20/+31
| | | | | | | | | | is not 512-bits and vlx is not enabled. Previously we used a wider element type and truncated. But its more efficient to keep the element type and drop unused elements. If BWI isn't supported and we have a i16 or i8 type, we'll extend it to be i32 and still use a truncate. llvm-svn: 319740
* Revert r319691: [globalisel][tablegen] Split atomic load/store into separate ↵Daniel Sanders2017-12-054-8/+17
| | | | | | | | opcode and enable for AArch64. Some concerns were raised with the direction. Revert while we discuss it and look into an alternative llvm-svn: 319739
* [X86] Fix a crash if avx512bw and xop are both enabled when the IR contrains ↵Craig Topper2017-12-051-2/+3
| | | | | | a v32i8 bitreverse. llvm-svn: 319737
* AMDGPU: Fix missing subtarget feature initializerMatt Arsenault2017-12-051-0/+1
| | | | llvm-svn: 319733
* AMDGPU: Fix crash when scheduling DBG_VALUEMatt Arsenault2017-12-051-1/+5
| | | | | | | | | | | | | | This calls handleMove with a DBG_VALUE instruction, which isn't tracked by LiveIntervals. I'm not sure this is the correct place to fix this. The generic scheduler seems to have more deliberate region selection that skips dbg_value. The test is also really hard to reduce. I haven't been able to figure out what exactly causes this particular case to try moving the dbg_value. llvm-svn: 319732
* [X86] Use vector widening to support zero extend from i1 when the dest type ↵Craig Topper2017-12-051-9/+31
| | | | | | | | | | is not 512-bits and vlx is not enabled. Previously we used a wider element type and truncated. But its more efficient to keep the element type and drop unused elements. If BWI isn't supported and we have a i16 or i8 type, we'll extend it to be i32 and still use a truncate. llvm-svn: 319728
* [X86] Don't use kunpck for vXi1 concat_vectors if the upper bits are undef.Craig Topper2017-12-051-3/+5
| | | | | | This can be efficiently selected by a COPY_TO_REGCLASS without the need for an extra instruction. llvm-svn: 319726
* [X86] Use getZeroVector and remove an unnecessary creation of an APInt ↵Craig Topper2017-12-051-4/+2
| | | | | | | | | | | | before calling getConstant. NFCI The getConstant function can take care of creating the APInt internally. getZeroVector will take care of using the correct type for the build vector to avoid re-lowering. The test change here is because execution domain constraints apparently pass through undef inputs of a zeroing xor. So the different ordering of register allocation here caused the dependency to change. llvm-svn: 319725
* [X86] Rearrange some of the code around AVX512 sign/zero extends. NFCICraig Topper2017-12-051-12/+12
| | | | | | | | Move the AVX512 code out of LowerAVXExtend. LowerAVXExtend has two callers but one of them pre-checks for AVX-512 so the code is only live from the other caller. So move the AVX-512 checks up to that caller for symmetry. Move all of the i1 input type code in Lower_AVX512ZeroExend together. llvm-svn: 319724
* AMDGPU/EG: Add a new FeatureFMA and use it to selectively enable FMA instructionJan Vesely2017-12-046-3/+23
| | | | | | | | | Only used by pre-GCN targets v2: fix predicate setting for FMA_Common Differential Revision: https://reviews.llvm.org/D40692 llvm-svn: 319712
* AMDGPU: Disable fp64 support on pre GCN asicsJan Vesely2017-12-043-14/+19
| | | | | | | | | | | It's not implemented. Passing +fp64-fp16-denormal feature enables fp64 even on asics that don't support it v2: fix hasFP64 query Differential Revision: https://reviews.llvm.org/D39931 llvm-svn: 319709
* Revert r319490 "XOR the frame pointer with the stack cookie when protecting ↵Hans Wennborg2017-12-044-41/+0
| | | | | | | | | | | | | | | | | | the stack" This broke the Chromium build (crbug.com/791714). Reverting while investigating. > Summary: This strengthens the guard and matches MSVC. > > Reviewers: hans, etienneb > > Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits > > Differential Revision: https://reviews.llvm.org/D40622 > > git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319490 91177308-0d34-0410-b5e6-96231b3b80d8 llvm-svn: 319706
* AMDGPU: Fix creating invalid copy when adjusting dmaskMatt Arsenault2017-12-046-61/+129
| | | | | | | | | Move the entire optimization to one place. Before it was possible to adjust dmask without changing the register class of the output instruction, since they were done in separate places. Fix all lane sizes and move all of the optimization into the DAG folding. llvm-svn: 319705
* AMDGPU: Use return value of MorphNodeToMatt Arsenault2017-12-041-3/+1
| | | | llvm-svn: 319704
* [globalisel][tablegen] Split atomic load/store into separate opcode and ↵Daniel Sanders2017-12-044-17/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enable for AArch64. This patch splits atomics out of the generic G_LOAD/G_STORE and into their own G_ATOMIC_LOAD/G_ATOMIC_STORE. This is a pragmatic decision rather than a necessary one. Atomic load/store has little in implementation in common with non-atomic load/store. They tend to be handled very differently throughout the backend. It also has the nice side-effect of slightly improving the common-case performance at ISel since there's no longer a need for an atomicity check in the matcher table. All targets have been updated to remove the atomic load/store check from the G_LOAD/G_STORE path. AArch64 has also been updated to mark G_ATOMIC_LOAD/G_ATOMIC_STORE legal. There is one issue with this patch though which also affects the extending loads and truncating stores. The rules only match when an appropriate G_ANYEXT is present in the MIR. For example, (G_ATOMIC_STORE (G_TRUNC:s16 (G_ANYEXT:s32 (G_ATOMIC_LOAD:s16 X)))) will match but: (G_ATOMIC_STORE (G_ATOMIC_LOAD:s16 X)) will not. This shouldn't be a problem at the moment, but as we get better at eliminating extends/truncates we'll likely start failing to match in some cases. The current plan is to fix this in a patch that changes the representation of extending-load/truncating-store to allow the MMO to describe a different type to the operation. llvm-svn: 319691
* [CodeGen] Unify MBB reference format in both MIR and debug outputFrancis Visoiu Mistrih2017-12-0443-274/+290
| | | | | | | | | | | | | | | | As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions. * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g' * find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g' * grep -nr 'BB#' and fix Differential Revision: https://reviews.llvm.org/D40422 llvm-svn: 319665
* Fix function pointer tail calls in armv8-M.basePablo Barrio2017-12-041-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: The compiler fails with the following error message: fatal error: error in backend: ran out of registers during register allocation Tail call optimization for Armv8-M.base fails to meet all the required constraints when handling calls to function pointers where the arguments take up r0-r3. This is because the pointer to the function to be called can only be stored in r0-r3, but these are all occupied by arguments. This patch makes sure that tail call optimization does not try to handle this type of calls. Reviewers: chill, MatzeB, olista01, rengolin, efriedma Reviewed By: olista01, efriedma Subscribers: efriedma, aemerson, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D40706 llvm-svn: 319664
* [AMDGPU] SDWA: add support for PRESERVE into SDWA peephole.Sam Kolton2017-12-042-285/+534
| | | | | | | | | | | | Summary: Reviewers: arsenm, vpykhtin, rampitec Subscribers: kzhuravl, wdng, nhaehnle, mgorny, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D37817 llvm-svn: 319662
* [NVPTX] Assign valid global namesJonas Hahnfeld2017-12-041-0/+6
| | | | | | | | | | | | | | | | | PTX requires that identifiers consist only of [a-zA-Z0-9_$]. The existing pass already ensured this for globals and this patch adds the cleanup for functions with local linkage. However, there was a different problem in the case of collisions of the adjusted name: The ValueSymbolTable then automatically appended ".N" with increasing Ns to get a unique name while helping the ABI demangling. Special case this behavior to omit the dots and append N directly. This will always give us legal names according to the PTX requirements. Differential Revision: https://reviews.llvm.org/D40573 llvm-svn: 319657
* Revert r319649 - [Asm, ARM] Add fallback diag for multiple invalid operandsOliver Stannard2017-12-041-17/+0
| | | | | | | | This is causing a failure in the llvm-clang-x86_64-expensive-checks-win buildbot, and I can't reproduce it locally, so reverting until I can work out what is wrong. llvm-svn: 319654
* AMDGPU: fix missing s_waitcntTim Corringham2017-12-041-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: The pass that inserts s_waitcnt instructions where needed propagated info used to track dependencies for each block by iterating over the predecessor blocks. The iteration was terminated when a predecessor that had not yet been processed was encountered. Any info in blocks later in the list was therefore not processed, leading to the possiblility of a required s_waitcnt not being inserted. The fix is simply to change the "break" to "continue" for the relevant loops, so that all visited blocks are processed. This is likely what was intended when the code was written. There is no test case provided for this fix because: 1) the only example that reproduces this is large and resistant to being reduced 2) the change is trivial Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D40544 llvm-svn: 319651
* [Asm, ARM] Add fallback diag for multiple invalid operandsOliver Stannard2017-12-041-0/+17
| | | | | | | | | | | | | | | | This adds a "invalid operands for instruction" diagnostic for instructions where there is an instruction encoding with the correct mnemonic and which is available for this target, but where multiple operands do not match those which were provided. This makes it clear that there is some combination of operands that is valid for the current target, which the default diagnostic of "invalid instruction" does not. Since this is a very general error, we only emit it if we don't have a more specific error. Differential revision: https://reviews.llvm.org/D36747 llvm-svn: 319649
OpenPOWER on IntegriCloud