path: root/llvm/lib/Target
Commit message (Author, Age, Files, Lines)
...
* Don't conditionalize Neon instructions, even in IT blocks. (Kristof Beyls, 2017-06-22, 1 file, -3/+5)
  This has been deprecated since ARMARM v7-AR, release C.b, published back in 2012.
  This also removes test/CodeGen/Thumb2/ifcvt-neon.ll that originally was introduced to check that conditionalization of Neon instructions did happen when generating Thumb2. However, the test had evolved and was no longer testing that. Rather than trying to adapt that test, this commit introduces test/CodeGen/Thumb2/ifcvt-neon-deprecated.mir, since we can now use the MIR framework to write nicer/more maintainable tests.
  llvm-svn: 305998
* [mips] Implement the ".rdata" MIPS assembly directive. (Simon Dardis, 2017-06-22, 1 file, -0/+22)
  Rather than creating a separate ".rdata" section distinct from the customary ".rodata" in ELF, ".rdata" switches to the ".rodata" section.
  This patch relands r305949 and r305950 with the correct commit message and addresses a nit raised during review.
  Patch By: John Baldwin!
  Differential Revision: https://reviews.llvm.org/D34452
  llvm-svn: 305995
* [ARM] Add .w aliases of MOV with shifted operand (John Brawn, 2017-06-22, 2 files, -2/+14)
  These appear to have been simply missing.
  Differential Revision: https://reviews.llvm.org/D34461
  llvm-svn: 305993
* [ARM] Clean up choice of narrow instructions in ARMAsmParser, NFC (John Brawn, 2017-06-22, 1 file, -33/+27)
  This patch makes a couple of changes to how we decide whether to use the narrow or wide encoding of thumb2 instructions:
  * Common out the detection of the .w qualifier
  * Check for the CPSR operand in a consistent way
  Differential Revision: https://reviews.llvm.org/D34460
  llvm-svn: 305992
* [GlobalISel][X86] Support vector type G_INSERT legalization/selection. (Igor Breger, 2017-06-22, 2 files, -3/+136)
  Summary: Support vector type G_INSERT legalization/selection. Split from https://reviews.llvm.org/D33665
  Reviewers: qcolombet, t.p.northover, zvi, guyblank
  Reviewed By: guyblank
  Subscribers: guyblank, rovka, llvm-commits, kristof.beyls
  Differential Revision: https://reviews.llvm.org/D33956
  llvm-svn: 305989
* [ARM] Add macro fusion for AES instructions. (Florian Hahn, 2017-06-22, 6 files, -1/+99)
  Summary: This patch adds a macro fusion using CodeGen/MacroFusion.cpp to pair AES instructions back to back and adds FeatureFuseAES to enable the feature.
  Reviewers: evandro, javed.absar, rengolin, t.p.northover
  Reviewed By: javed.absar
  Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits
  Differential Revision: https://reviews.llvm.org/D34142
  llvm-svn: 305988
* AVX-512: Lowering Masked Gather intrinsic - fixed a bug (Elena Demikhovsky, 2017-06-22, 5 files, -9/+103)
  Masked gather for vector length 2 is lowered incorrectly for element type i32. The type <2 x i32> was automatically extended to <2 x i64> and we generated VPGATHERQQ instead of VPGATHERQD. The type <2 x float> is extended to <4 x float>, so there is no bug for this type, but the sequence may be more optimal.
  In this patch I'm fixing the <2 x i32> bug and optimizing the <2 x float> sequence for GATHERs only. The same fix should be done for Scatters as well.
  Differential revision: https://reviews.llvm.org/D34343
  llvm-svn: 305987
* [AMDGPU] SDWA: add support for GFX9 in peephole pass (Sam Kolton, 2017-06-22, 6 files, -39/+127)
  Summary: Added support based on merged SDWA pseudo instructions. The peephole now allows one scalar operand, omod and clamp modifiers. Added several subtarget features for GFX9 SDWA. This diff also contains changes from D34026.
  Depends D34026
  Reviewers: vpykhtin, rampitec, arsenm
  Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
  Differential Revision: https://reviews.llvm.org/D34241
  llvm-svn: 305986
* [PowerPC] fix potential verification errors (Hiroshi Inoue, 2017-06-22, 2 files, -5/+12)
  This patch fixes trivial mishandling of 32-bit/64-bit instructions that may cause verification errors with -verify-machineinstrs.
  llvm-svn: 305984
* [wasm] Fix WebAssembly asm backend after r305968 (Reid Kleckner, 2017-06-22, 1 file, -10/+12)
  llvm-svn: 305978
* Revert "[Target] Implement the ".rdata" MIPS assembly directive." (Davide Italiano, 2017-06-22, 1 file, -22/+0)
  This reverts commit r305949 and r305950 as they didn't have the correct commit message.
  llvm-svn: 305973
* [AMDGPU] Add FP_CLASS to the add/setcc combine (Stanislav Mekhanoshin, 2017-06-21, 1 file, -1/+3)
  This is one of the nodes which also compile as v_cmp_*.
  Differential Revision: https://reviews.llvm.org/D34485
  llvm-svn: 305970
* Use a MutableArrayRef. NFC. (Rafael Espindola, 2017-06-21, 14 files, -40/+39)
  llvm-svn: 305968
* Fix build. (Rafael Espindola, 2017-06-21, 1 file, -1/+1)
  llvm-svn: 305967
* [AMDGPU] Combine add and adde, sub and sube (Stanislav Mekhanoshin, 2017-06-21, 2 files, -9/+81)
  If one of the arguments of adde/sube is zero we can fold another add/sub into it.
  Differential Revision: https://reviews.llvm.org/D34374
  llvm-svn: 305964
* Mark dump() methods as const. NFC (Sam Clegg, 2017-06-21, 1 file, -1/+1)
  Add const qualifier to any dump() method where adding one was trivial.
  Differential Revision: https://reviews.llvm.org/D34481
  llvm-svn: 305963
* [AMDGPU] simplify add x, *ext (setcc) => addc|subb x, 0, setcc (Stanislav Mekhanoshin, 2017-06-21, 4 files, -0/+59)
  This simplification avoids generating v_cndmask_b32 to serialize the condition code between compare and use.
  Differential Revision: https://reviews.llvm.org/D34300
  llvm-svn: 305962
* [Hexagon] Use MachineInstrBuilder instead of changing instruction in place (Krzysztof Parzyszek, 2017-06-21, 1 file, -45/+9)
  llvm-svn: 305953
* [Target] Implement the ".rdata" MIPS assembly directive. (Davide Italiano, 2017-06-21, 1 file, -0/+22)
  Patch by John Baldwin < jhb at freebsd dot org >!
  Differential Revision: https://reviews.llvm.org/D34452
  llvm-svn: 305949
* [Solaris] emit .init_array instead of .ctors on Solaris (Sparc/x86) (Davide Italiano, 2017-06-21, 5 files, -0/+21)
  Patch by Fedor Sergeev.
  Differential Revision: https://reviews.llvm.org/D33868
  llvm-svn: 305948
* [Hexagon] Handle more types of immediate operands in expand-condsets (Krzysztof Parzyszek, 2017-06-21, 1 file, -2/+13)
  llvm-svn: 305943
* [PowerPC] define target hook isReallyTriviallyReMaterializable() (Lei Huang, 2017-06-21, 3 files, -2/+29)
  Define target hook isReallyTriviallyReMaterializable() to explicitly specify PowerPC instructions that are trivially rematerializable. This will allow the MachineLICM pass to accurately identify PPC instructions that should always be hoisted.
  Differential Revision: https://reviews.llvm.org/D34255
  llvm-svn: 305932
* [AMDGPU][MC][GFX9] Corrected VOP3P relevant code to fix disassembler failures (Dmitry Preobrazhensky, 2017-06-21, 4 files, -11/+6)
  See Bug 33509: https://bugs.llvm.org//show_bug.cgi?id=33509
  Reviewers: Sam Kolton, Artem Tamazov, Valery Pykhtin
  Differential Revision: https://reviews.llvm.org/D34360
  llvm-svn: 305923
* [AMDGPU][MC] Corrected V_*QSAD* instructions to check that dest register is different than any of the src (Dmitry Preobrazhensky, 2017-06-21, 3 files, -5/+84)
  See Bug 33279: https://bugs.llvm.org//show_bug.cgi?id=33279
  Reviewers: artem.tamazov, vpykhtin
  Differential Revision: https://reviews.llvm.org/D34003
  llvm-svn: 305915
* [x86] fix formatting; NFC (Sanjay Patel, 2017-06-21, 1 file, -15/+13)
  llvm-svn: 305914
* [AARCH64][LSE] Preliminary support for ARMv8.1 LSE Atomics. (Christof Douma, 2017-06-21, 4 files, -5/+114)
  Implemented support in AArch64 codegen for ARMv8.1 Large System Extensions atomic instructions. Where supported, these instructions can provide atomic operations with higher performance.
  Currently supported operations include: fetch_add, fetch_or, fetch_xor, fetch_smin, fetch_min/max (signed and unsigned), swap, and compare_exchange.
  This implementation implies sequential-consistency ordering; more relaxed ordering is under development.
  Subtarget->hasLSE is currently supported for Cavium ThunderX2T99.
  Patch by Ananth Jasty.
  Differential Revision: https://reviews.llvm.org/D33586
  Change-Id: I82f6d3d64255622791ceb0715b7ab9f4dc4d4b2c
  llvm-svn: 305893
* [AArch64] Add early exit to promoteLoadFromStore. (Florian Hahn, 2017-06-21, 1 file, -1/+4)
  There should be at most a single kill flag for the promoted operand between the store/load pair.
  Discussed in https://reviews.llvm.org/D34402.
  llvm-svn: 305889
* [MIPS] Fix for selecting of DINS/INS instruction (Strahinja Petrovic, 2017-06-21, 1 file, -0/+5)
  This patch adds one more condition when selecting the DINS/INS instruction, which fixes MultiSource/Applications/JM/ldecod/ for mips32r2 (and mips64r2 n32 abi).
  Differential Revision: https://reviews.llvm.org/D33725
  llvm-svn: 305888
* [AMDGPU] SDWA: merge VI and GFX9 pseudo instructions (Sam Kolton, 2017-06-21, 15 files, -281/+323)
  Summary: Previously there were two separate pseudo instructions for SDWA on VI and on GFX9. Created one pseudo instruction that is the union of both of them. Added a verifier to check that operands conform to either VI or GFX9.
  Reviewers: dp, arsenm, vpykhtin
  Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, artem.tamazov
  Differential Revision: https://reviews.llvm.org/D34026
  llvm-svn: 305886
* [AArch64] Preserve register flags when promoting a load from store. (Florian Hahn, 2017-06-21, 1 file, -3/+4)
  Summary: This patch updates promoteLoadFromStore to use the store MachineOperand as the source operand of the new instruction instead of creating a new register MachineOperand. This way, the existing register flags are preserved.
  This fixes PR33468 (https://bugs.llvm.org/show_bug.cgi?id=33468).
  Reviewers: MatzeB, t.p.northover, junbuml
  Reviewed By: MatzeB
  Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits
  Differential Revision: https://reviews.llvm.org/D34402
  llvm-svn: 305885
* clang-format a region. (Rafael Espindola, 2017-06-20, 1 file, -20/+19)
  It will make a followup patch easier to read.
  llvm-svn: 305865
* AMDGPU: Allow vectorization of packed types (Matt Arsenault, 2017-06-20, 2 files, -8/+20)
  llvm-svn: 305844
* [AMDGPU] Fix illegal shrink of V_SUBB_U32 and V_ADDC_U32 (Stanislav Mekhanoshin, 2017-06-20, 1 file, -0/+2)
  If there is an immediate operand we shall not shrink V_SUBB_U32 and V_ADDC_U32, since it does not fit the e32 encoding.
  Differential Revision: https://reviews.llvm.org/D34291
  llvm-svn: 305840
* AMDGPU: Start adding global_* instructions (Matt Arsenault, 2017-06-20, 6 files, -6/+106)
  llvm-svn: 305838
* AMDGPU: Do operand folding in program order (Matt Arsenault, 2017-06-20, 1 file, -5/+3)
  Before it was possible to partially fold use instructions before the defs. After the xor is folded into a copy, the same mov can end up in the fold list twice, so on the second attempt it will fail expecting to see a register to fold.
  llvm-svn: 305821
* AMDGPU: Preserve undef when folding register operands (Matt Arsenault, 2017-06-20, 1 file, -0/+2)
  If the source was a copy of an undef register, this would produce a read of an undefined register, which is a verifier error.
  llvm-svn: 305816
* [AMDGPU] Eliminate SGPR to VGPR copy when possible (Stanislav Mekhanoshin, 2017-06-20, 1 file, -0/+30)
  SGPRs are generally cheaper, so try to use them over VGPRs.
  Differential Revision: https://reviews.llvm.org/D34130
  llvm-svn: 305815
* AMDGPU: Fix crash with undef vreg input operand (Matt Arsenault, 2017-06-20, 1 file, -1/+1)
  llvm-svn: 305814
* [PowerPC] fix trivial typos in comment, NFC (Hiroshi Inoue, 2017-06-20, 1 file, -1/+1)
  llvm-svn: 305813
* [x86] enable CGP memcmp() expansion for 2/4/8 byte sizes (Sanjay Patel, 2017-06-20, 3 files, -1/+13)
  There are a couple of potential improvements as seen in the IR and asm:
  1. We're unnecessarily extending to a larger type to compare values.
  2. The codegen for (select cond, 1, -1) could avoid a cmov. (or we could change the order of the compares, so we have a select with 0 operand)
  llvm-svn: 305802
* [X86][SSE] Relax 0/-1 vector element insertion to work for any vector with >=16bit elements (Simon Pilgrim, 2017-06-20, 1 file, -1/+2)
  Shuffle lowering/combining now does a good job for 256/512-bit vectors - we don't need to prevent this
  llvm-svn: 305801
* [X86][SSE] Dropped old INSERT_VECTOR_ELT lowering TODO (Simon Pilgrim, 2017-06-20, 1 file, -2/+0)
  Target shuffle combining now supports the matching of INSERT_VECTOR_ELT/PINSRW/PINSRB for merging multiple insertions into shuffles/bitmasks.
  llvm-svn: 305788
* [GlobalISel][X86] fix compilation error ( -Werror=unused-function ) (Igor Breger, 2017-06-20, 1 file, -2/+2)
  llvm-svn: 305786
* [GlobalISel][X86] Get correct RegClass for given RegBank. (Igor Breger, 2017-06-20, 1 file, -17/+26)
  Summary: In some cases RegClass depends on a target feature. High (16-31) vector registers exist only if AVX512f is available. Split from https://reviews.llvm.org/D33665
  Reviewers: qcolombet, t.p.northover, zvi, guyblank
  Reviewed By: t.p.northover, guyblank
  Subscribers: guyblank, rovka, llvm-commits, kristof.beyls
  Differential Revision: https://reviews.llvm.org/D33952
  Conflicts: test/CodeGen/X86/GlobalISel/select-memop-scalar.mir
  llvm-svn: 305784
* [ARM] Support constant pools in data when generating execute-only code. (Alexandros Lamprineas, 2017-06-20, 3 files, -15/+44)
  Resubmission of r305387, which was reverted at r305390. The Address Sanitizer caught a stack-use-after-scope of a Twine variable. This is now fixed by passing the Twine directly as a function parameter.
  The ARM backend asserts against constant pool lowering when it generates execute-only code in order to prevent the generation of constant pools in the text section. It appears that target independent optimizations might generate DAG nodes that represent constant pools. By lowering such nodes as global addresses we don't violate the semantics of execute-only code, and it is also guaranteed that execute-only behaves correctly with the position-independent addressing modes that support execute-only code.
  Differential Revision: https://reviews.llvm.org/D33773
  llvm-svn: 305776
* AMDGPU: Fix scratch wave offset relative FI expansion (Matt Arsenault, 2017-06-19, 1 file, -9/+20)
  The offset may not be an inline immediate, so this needs to be materialized into a register. The post-RA run of SIShrinkInstructions is able to fold it later if it can.
  llvm-svn: 305761
* [AMDGPU] Add infer address spaces pass before SROA (Stanislav Mekhanoshin, 2017-06-19, 1 file, -0/+8)
  It adds it for the target after inlining but before SROA where we can get most out of it.
  Differential Revision: https://reviews.llvm.org/D34366
  llvm-svn: 305759
* [Target] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). (Eugene Zelenko, 2017-06-19, 2 files, -17/+37)
  llvm-svn: 305757
* [AArch64][Falkor] Fix MOVZ sched predicate to not assert on non-imm operands (e.g. blockaddress). (Geoff Berry, 2017-06-19, 1 file, -1/+2)
  llvm-svn: 305752
* [AArch64][Kryo] Add missing write latency for LDAXP, LDXP second destination. (Geoff Berry, 2017-06-19, 1 file, -2/+4)
  Fixes PR33491 and PR33512.
  llvm-svn: 305751