summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* AArch64: diagnose unrecognized features in .cpu directive.Tim Northover2017-05-151-2/+17
| | | | | | | We were silently ignoring any features we couldn't match up, which led to errors in an inline asm block missing the conventional "\n\t". llvm-svn: 303108
* [AArch64][Falkor] Fix sched details for FMOVGeoff Berry2017-05-152-4/+6
| | | | llvm-svn: 303099
* Revert 303091.Jan Sjodin2017-05-157-3380/+12
| | | | llvm-svn: 303098
* Add AMDGPUMachineCFGStructurizer.Jan Sjodin2017-05-157-12/+3380
| | | | | | Differential Revision: https://reviews.llvm.org/D23209 llvm-svn: 303091
* [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ↵Simon Pilgrim2017-05-151-3/+3
| | | | | | | | | | | | | | ReadMem/WriteMem (PR32146) Follow up to D33147 NVPTXTargetLowering::LowerCall was trusting the default argument values. Fixes another 17 of the NVPTX '-verify-machineinstrs with EXPENSIVE_CHECKS' errors in PR32146. Differential Revision: https://reviews.llvm.org/D33189 llvm-svn: 303082
* [AArch64] Enable FeatureFuseAES on Cortex-A72.Florian Hahn2017-05-151-0/+1
| | | | | | | | This patch enables fusing dependent AESE/AESMC and AESD/AESIMC instruction pairs on Cortex-A72, as recommended in the Software Optimization Guide, section 4.10. llvm-svn: 303073
* [AMDGPU][MC] Corrected several VI opcodes to avoid printing _e64Dmitry Preobrazhensky2017-05-151-11/+22
| | | | | | | | | | See bug 32936: https://bugs.llvm.org//show_bug.cgi?id=32936 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D33123 llvm-svn: 303070
* [AMDGPU][MC] Removed V_MQSAD_U16_U8Dmitry Preobrazhensky2017-05-151-3/+0
| | | | | | | | | | | | This instruction does not really exist See Bug 33018: https://bugs.llvm.org//show_bug.cgi?id=33018 Reviewers: vpykhtin, artem.tamazov Differential Revision: https://reviews.llvm.org/D33126 llvm-svn: 303055
* [ARM] Mark LEApcrel instructions as isAsCheapAsAMoveJohn Brawn2017-05-153-3/+3
| | | | | | | | | | | Doing this means that if an LEApcrel is used in two places we will rematerialize instead of generating two MOVs. This is particularly useful for printfs using the same format string, where we want to generate an address into a register that's going to get corrupted by the call. Differential Revision: https://reviews.llvm.org/D32858 llvm-svn: 303054
* [ARM] Mark LEApcrel as not having side effectsJohn Brawn2017-05-151-2/+2
| | | | | | | | | | | | | | | Doing this lets us hoist it out of loops, and I've also marked it as rematerializable the same as the thumb1 and thumb2 counterparts. It looks like it being marked as such was just a mistake, as the commit that made that change only mentions LEApcrelJT and in thumb1 and thumb2 only the LEApcrelJT instructions were marked as having side-effects, so it looks like the intent was to only mark LEApcrelJT as having side-effects but LEApcrel was accidentally marked as such also. Differential Revision: https://reviews.llvm.org/D32857 llvm-svn: 303053
* [NVPTX] Don't rely on default arguments to ↵Simon Pilgrim2017-05-151-3/+8
| | | | | | | | SelectionDAG::getMemIntrinsicNode. NFC. NFC followup to D33147, this explicitly sets all the arguments (instead of relying on the defaults) to SelectionDAG::getMemIntrinsicNode to help identify -verify-machineinstrs issues. llvm-svn: 303047
* [X86] Utilize SelectionDAG::getSelect(). NFC.Zvi Rackover2017-05-141-34/+27
| | | | | | | | | | Replace SelectionDAG::getNode(ISD::SELECT, ...) and SelectionDAG::getNode(ISD::VSELECT, ...) with SelectionDAG::getSelect(...) Saves a few lines of code and in some cases saves the need to explicitly check the type of the desired node. llvm-svn: 303024
* [X86][AVX1] Account for cost of extract/insert of 256-bit shiftsSimon Pilgrim2017-05-141-49/+49
| | | | llvm-svn: 303023
* [X86][AVX2] Fix costs for v4i64 ashr by splatSimon Pilgrim2017-05-141-0/+5
| | | | llvm-svn: 303022
* [X86][AVX1] Account for cost of extract/insert of 256-bit shifts by splatSimon Pilgrim2017-05-141-12/+12
| | | | llvm-svn: 303021
* [X86] Remove unused value from IntrinsicType enum. NFCCraig Topper2017-05-142-7/+1
| | | | llvm-svn: 303018
* [X86][AVX1] Account for cost of extract/insert of 256-bit SDIV/UDIV by mul ↵Simon Pilgrim2017-05-141-17/+17
| | | | | | sequences llvm-svn: 303017
* [X86][XOP] XOP's general v16i8 shifts will be used instead of v8i16 shift + ↵Simon Pilgrim2017-05-141-3/+6
| | | | | | | | mask. Tweak cost model to match what lowering actually does. llvm-svn: 303013
* [X86][SSE] Account for cost of extract/insert of v32i8 vector shiftsSimon Pilgrim2017-05-141-3/+3
| | | | llvm-svn: 303012
* [X86][XOP] Account for cost of extract/insert of 256-bit vector shiftsSimon Pilgrim2017-05-141-12/+12
| | | | llvm-svn: 303010
* [X86][AVX] Allow 32-bit targets to peek through subvectors to extract ↵Simon Pilgrim2017-05-141-1/+10
| | | | | | constant splats for vXi64 shifts. llvm-svn: 303009
* [x86, SSE] AVX1 PR28129 (256-bit all-ones rematerialization)Simon Pilgrim2017-05-132-10/+22
| | | | | | | | | | | | | | | | | | | | Further perf tests on Jaguar indicate that: vxorps %ymm0, %ymm0, %ymm0 vcmpps $15, %ymm0, %ymm0, %ymm0 is consistently faster (by about 9%) than: vpcmpeqd %xmm0, %xmm0, %xmm0 vinsertf128 $1, %xmm0, %ymm0, %ymm0 Testing equivalent code on a SandyBridge (E5-2640) puts it slightly (~3%) faster as well. Committed on behalf of @dtemirbulatov Differential Revision: https://reviews.llvm.org/D32416 llvm-svn: 302989
* [AVR] When lowering Select8/Select16, put newly generated MBBs in the same spotDylan McKay2017-05-131-2/+3
| | | | | | | | | | Contributed by Dr. Gergő Érdi. Fixes a bug. Raised from (https://github.com/avr-rust/rust/issues/49). llvm-svn: 302973
* [AVR] Remove an unused variableDylan McKay2017-05-131-1/+0
| | | | llvm-svn: 302970
* AMDGPU/SI: Don't promote to vector if the load/store is volatile.Changpeng Fang2017-05-121-2/+5
| | | | | | | | | | | | | Summary: We should not change volatile loads/stores in promoting alloca to vector. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D33107 llvm-svn: 302943
* [NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146)Simon Pilgrim2017-05-121-1/+3
| | | | | | | | This fixes 47 of the 75 NVPTX '-verify-machineinstrs with EXPENSIVE_CHECKS' errors in PR32146. Differential Revision: https://reviews.llvm.org/D33147 llvm-svn: 302942
* [PPC] Move the combine "a << (b % (sizeof(a) * 8)) -> (PPCshl a, b)" to the ↵Tim Shen2017-05-123-17/+108
| | | | | | | | | | | | | | | | | | | | | | backend. NFC. Summary: Eli pointed out that it's unsafe to combine the shifts to ISD::SHL etc., because those are not defined for b > sizeof(a) * 8, even after some of the combiners run. However, PPCISD::SHL defines that behavior (as the instructions themselves). Move the combination to the backend. The tests in shift_mask.ll still pass. Reviewers: echristo, hfinkel, efriedma, iteratee Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D33076 llvm-svn: 302937
* [AArch64][Falkor] Refine modeling of multiply accumulate forwarding.Geoff Berry2017-05-122-44/+61
| | | | llvm-svn: 302933
* Strip trailing whitespace. NFCI.Simon Pilgrim2017-05-121-7/+7
| | | | llvm-svn: 302927
* [KnownBits] Add bit counting methods to KnownBits struct and use them where ↵Craig Topper2017-05-124-4/+4
| | | | | | | | | | | | possible This patch adds min/max population count, leading/trailing zero/one bit counting methods. The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give. Differential Revision: https://reviews.llvm.org/D32931 llvm-svn: 302925
* AMDGPU/GlobalISel: Mark 32-bit integer constants as legalTom Stellard2017-05-121-0/+1
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33115 llvm-svn: 302919
* [SPARC] Support 'f' and 'e' inline asm constraints.James Y Knight2017-05-122-3/+26
| | | | | | | | Based on patch by Patrick Boettcher and Chris Dewhurst. Differential Revision: https://reviews.llvm.org/D29116 llvm-svn: 302911
* Use SDValue::getOperand() helper. NFCI.Simon Pilgrim2017-05-121-22/+19
| | | | llvm-svn: 302894
* [AVR] Migrate to new StructType::get owing to Supress all uses of ↵Leslie Zhai2017-05-121-1/+1
| | | | | | | | | | | | LLVM_END_WITH_NULL Reviewers: dylanmckay, jroelofs, RKSimon, serge-sans-paille Reviewed By: serge-sans-paille Differential Revision: https://reviews.llvm.org/D33119 llvm-svn: 302885
* Issue diagnostics when returning FP values on x86_64 without SSE1/2Reid Kleckner2017-05-111-9/+24
| | | | | | | | | | | | | Avoid using report_fatal_error, because it will ask the user to file a bug. If the user attempts to disable SSE on x86_64 and them use floating point, that's a bug in their code, not a bug in the compiler. This is just a start. There are other ways to crash the backend in this configuration, but they should be updated to follow this pattern. Differential Revision: https://reviews.llvm.org/D27522 llvm-svn: 302835
* [PPC] Change the register constraint of the first source operand of ↵Guozhi Wei2017-05-112-1/+18
| | | | | | | | | | | | instruction mtvsrdd to g8rc_nox0 According to Power ISA V3.0 document, the first source operand of mtvsrdd is constant 0 if r0 is specified. So the corresponding register constraint should be g8rc_nox0. This bug caused wrong output generated by 401.bzip2 when -mcpu=power9 and fdo are specified. Differential Revision: https://reviews.llvm.org/D32880 llvm-svn: 302834
* [AArch64][MachineCombine] Fold FNMUL+FSUB -> FNMADD.Chad Rosier2017-05-111-0/+28
| | | | | | Differential Revision: http://reviews.llvm.org/D33101. llvm-svn: 302822
* [AMDGPU] Placate unused variable warning in release builds.Davide Italiano2017-05-111-0/+1
| | | | llvm-svn: 302821
* [MSP430] Generate EABI-compliant libcallsVadzim Dambrouski2017-05-113-38/+236
| | | | | | | | | | | | | Updates the MSP430 target to generate EABI-compatible libcall names. As a byproduct, adjusts the hardware multiplier options available in the MSP430 target, adds support for promotion of the ISD::MUL operation for 8-bit integers, and correctly marks R11 as used by call instructions. Patch by Andrew Wygle. Differential Revision: https://reviews.llvm.org/D32676 llvm-svn: 302820
* AMDGPU: Remove tfe bit from flat instruction definitionsMatt Arsenault2017-05-113-23/+22
| | | | | | | | | | We don't use it and it was removed in gfx9, and the encoding bit repurposed. Additionally actually using it requires changing the output register class, which wasn't done anyway. llvm-svn: 302814
* AMDGPU: Pull fneg out of extract_vector_eltMatt Arsenault2017-05-114-1/+31
| | | | | | | This allows folding source modifiers in more f16 cases. Makes it easier to select per-component packed neg modifiers. llvm-svn: 302813
* [AMDGPU] Fix incorrect register pressure calculationStanislav Mekhanoshin2017-05-111-2/+3
| | | | | | | | | Earlier fix D32572 introduced a bug where live-ins were calculated for basic block instead of scheduling region. This change fixes it. Differential Revision: https://reviews.llvm.org/D33086 llvm-svn: 302812
* [PowerPC] Eliminate integer compare instructions - vol. 1Nemanja Ivanovic2017-05-115-5/+284
| | | | | | | | | | | | | This patch is the first in a series of patches to provide code gen for doing compares in GPRs when the compare result is required in a GPR. It adds the infrastructure to select GPR sequences for i1->i32 and i1->i64 extensions. This first patch handles equality comparison on i32 operands with the result sign or zero extended. Differential Revision: https://reviews.llvm.org/D31847 llvm-svn: 302810
* [GlobalISel][X86] Remove hand-written G_FADD/F_SUB selection.Igor Breger2017-05-111-105/+0
| | | | | | Now it handle by TableGen. llvm-svn: 302793
* [x86] Fix a failure to select with AVX-512 when the type legalizerChandler Carruth2017-05-111-5/+29
| | | | | | | | | | | | | | | | | | | | | | | | | manages to form a VSELECT with a non-i1 element type condition. Those are technically allowed in SDAG (at least, the generic type legalization logic will form them and I wouldn't want to try to audit everything te preclude forming them) so we need to be able to lower them. This isn't too hard to implement. We mark VSELECT as custom so we get a chance in C++, add a fast path for i1 conditions to get directly handled by the patterns, and a fallback when we need to manually force the condition to be an i1 that uses the vptestm instruction to turn a non-mask into a mask. This, unsurprisingly, generates awful code. But it at least doesn't crash. This was actually impacting open source packages built with LLVM for AVX-512 in the wild, so quickly landing a patch that at least stops the immediate bleeding. I think I've found where to fix the codegen quality issue, but less confident of that change so separating it out from the thing that doesn't change the result of any existing test case but causes mine to not crash. llvm-svn: 302785
* Strip trailing whitespace. NFCI.Simon Pilgrim2017-05-111-1/+1
| | | | llvm-svn: 302784
* [ARM][GlobalISel] Legalize narrow scalar ops by wideningDiana Picus2017-05-111-3/+5
| | | | | | | | | | | | | | This is the same as r292827 for AArch64: we widen 8- and 16-bit ADD, SUB and MUL to 32 bits since we only have TableGen patterns for 32 bits. See the commit message for r292827 for more details. At this point we could just remove some of the tests for regbankselect and instruction-select, since we're not going to see any narrow operations at those levels anymore. Instead I decided to update them with G_ANYEXT/G_TRUNC operations, so we can validate the full sequences generated by the legalizer. llvm-svn: 302782
* Remove spurious cast of nullptr. NFC.Serge Guelton2017-05-111-1/+1
| | | | | | Conversion rules allow automatic casting of nullptr to any pointer type. llvm-svn: 302780
* Remove now useless trailing nullptr in StructType::getSerge Guelton2017-05-111-1/+1
| | | | llvm-svn: 302779
* [ARM][GlobalISel] Support for G_ANYEXTDiana Picus2017-05-112-10/+4
| | | | | | | | | | | | | | G_ANYEXT can be introduced by the legalizer when widening scalars. Add support for it in the register bank info (same mapping as everything else) and in the instruction selector. When selecting it, we treat it as a COPY, just like G_TRUNC. On this occasion we get rid of some assertions in selectCopy so we can reuse it. This shouldn't be a problem at the moment since we're not supporting any complicated cases (e.g. FPR, different register banks). We might want to separate the paths when we do. llvm-svn: 302778
OpenPOWER on IntegriCloud