path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* AMDGPU: Update isFPImmLegal for f16 | Matt Arsenault | 2016-12-22 | 1 | -1/+2
| | | | | | I don't think this matters because ConstantFP is legal. llvm-svn: 290299
* [AArch64] Correct the check of signed 9-bit imm in getIndexedAddressParts(). | Haicheng Wu | 2016-12-22 | 1 | -2/+4
| | | | | | | | -256 is a legal indexed address part. Differential Revision: https://reviews.llvm.org/D27537 llvm-svn: 290296
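A minimal sketch of the range in question, assuming the indexed addressing offset is a plain signed 9-bit two's-complement immediate. The helper name `fitsSigned9Bit` is hypothetical and is not the code in getIndexedAddressParts(); it only illustrates why -256 lies inside the range while 256 does not.

```cpp
#include <cstdint>

// Hypothetical helper (not the LLVM source): true when V is representable as
// a signed 9-bit two's-complement immediate, i.e. V is in [-256, 255].
constexpr bool fitsSigned9Bit(int64_t V) {
  return V >= -(INT64_C(1) << 8) && V < (INT64_C(1) << 8);
}

// The range is asymmetric: the lower bound is -256, the upper bound is 255.
static_assert(fitsSigned9Bit(-256), "-256 is the smallest legal offset");
static_assert(!fitsSigned9Bit(256), "256 does not fit in 9 signed bits");
```

A symmetric test such as "absolute value below 256" would wrongly reject -256, which is the kind of off-by-one this sketch is meant to make visible.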
* [NVVMIntrRange] Only set range metadata if none is already present | David Majnemer | 2016-12-22 | 1 | -0/+4
| | | | | | | The range metadata inserted by NVVMIntrRange is pessimistic, range metadata already present could be more precise. llvm-svn: 290294
* [GlobalISel] Add basic Selector-emitter tblgen backend. | Ahmed Bougacha | 2016-12-21 | 3 | -6/+13
| | | | | | | | | | | | | | | | | This adds a basic tablegen backend that analyzes the SelectionDAG patterns to find simple ones that are eligible for GlobalISel-emission. That's similar to FastISel, with one notable difference: we're not fed ISD opcodes, so we need to map the SDNode operators to generic opcodes. That's done using GINodeEquiv in TargetGlobalISel.td. Otherwise, this is mostly boilerplate, and lots of filtering of any kind of "complicated" pattern. On AArch64, this is sufficient to match G_ADD up to s64 (to ADDWrr/ADDXrr) and G_BR (to B). Differential Revision: https://reviews.llvm.org/D26878 llvm-svn: 290284
* [WebAssembly] Fix the opcode value for i64.rotr. | Dan Gohman | 2016-12-21 | 1 | -1/+1
| | | | llvm-svn: 290281
* [AArch64] Remove a redundant check. NFC. | Haicheng Wu | 2016-12-21 | 1 | -2/+1
| | | | | | | | The case AM.Scale == 0 is already handled by the code right above. Differential Revision: https://reviews.llvm.org/D28003 llvm-svn: 290275
* [X86][SSE] Improve lowering of vXi64 multiplies | Simon Pilgrim | 2016-12-21 | 2 | -26/+35
| | | | | | | | | | | | | | | | | | | | | | As mentioned on PR30845, we were performing our vXi64 multiplication as: AloBlo = pmuludq(a, b); AloBhi = pmuludq(a, psrlqi(b, 32)); AhiBlo = pmuludq(psrlqi(a, 32), b); return AloBlo + psllqi(AloBhi, 32) + psllqi(AhiBlo, 32); when we could avoid one of the upper shifts with: AloBlo = pmuludq(a, b); AloBhi = pmuludq(a, psrlqi(b, 32)); AhiBlo = pmuludq(psrlqi(a, 32), b); return AloBlo + psllqi(AloBhi + AhiBlo, 32); This matches the lowering on gcc/icc. Differential Revision: https://reviews.llvm.org/D27756 llvm-svn: 290267
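The equivalence of the two sequences can be checked with a scalar model of a single 64-bit lane. This is only a sketch under stated assumptions: `pmuludqLane`, `mulLaneOld` and `mulLaneNew` are hypothetical names, plain `>>` and `<<` stand in for psrlqi and psllqi, and nothing here is the actual X86 lowering code.

```cpp
#include <cstdint>

// Scalar stand-in for one 64-bit lane of pmuludq: multiplies the low 32 bits
// of each operand and returns the full 64-bit product.
inline uint64_t pmuludqLane(uint64_t A, uint64_t B) {
  return (A & 0xFFFFFFFFu) * (B & 0xFFFFFFFFu);
}

// Previous sequence: each cross product is shifted separately (two shifts).
inline uint64_t mulLaneOld(uint64_t A, uint64_t B) {
  uint64_t AloBlo = pmuludqLane(A, B);
  uint64_t AloBhi = pmuludqLane(A, B >> 32);
  uint64_t AhiBlo = pmuludqLane(A >> 32, B);
  return AloBlo + (AloBhi << 32) + (AhiBlo << 32);
}

// New sequence: the cross products are added first, then shifted once.
inline uint64_t mulLaneNew(uint64_t A, uint64_t B) {
  uint64_t AloBlo = pmuludqLane(A, B);
  uint64_t AloBhi = pmuludqLane(A, B >> 32);
  uint64_t AhiBlo = pmuludqLane(A >> 32, B);
  return AloBlo + ((AloBhi + AhiBlo) << 32);
}
```

Both forms agree because unsigned 64-bit arithmetic wraps modulo 2^64, so shifting distributes over the addition; the Ahi*Bhi term is omitted in both since it only affects bits above bit 63.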
* AMDGPU/SI: Fix file header | Tom Stellard | 2016-12-21 | 1 | -1/+1
| | | | llvm-svn: 290265
* Revert first commit. Removing empty line in X86.h | Michael Zuckerman | 2016-12-21 | 1 | -1/+0
| | | | llvm-svn: 290255
* First commit adding new line to X86.h | Michael Zuckerman | 2016-12-21 | 1 | -0/+1
| | | | llvm-svn: 290254
* Added a template for building target specific memory node in DAG. | Elena Demikhovsky | 2016-12-21 | 5 | -117/+359
| | | | | | | | | | I added an API for creating a target-specific memory node in the DAG. Today, all memory nodes are common to all targets and their constructors are located in SelectionDAG.cpp. There are some cases in X86 where we need to create a special node: truncation-with-saturation store, float-to-half store. In the current patch I added truncation-with-saturation nodes and I'm using them for intrinsics. In the future I plan to implement DAG lowering for the truncation-with-saturation pattern. Differential Revision: https://reviews.llvm.org/D27899 llvm-svn: 290250
* [AMDGPU] Garbage collect dead code. NFCI. | Davide Italiano | 2016-12-21 | 1 | -15/+0
| | | | llvm-svn: 290249
* [X86] Vectorcall Calling Convention - Adding CodeGen Complete Support | Oren Ben Simhon | 2016-12-21 | 1 | -4/+4
| | | | | | Fixing a warning. llvm-svn: 290248
* [X86] Vectorcall Calling Convention - Adding CodeGen Complete Support | Oren Ben Simhon | 2016-12-21 | 1 | -4/+4
| | | | | | Fixing build issues. llvm-svn: 290244
* [X86] Vectorcall Calling Convention - Adding CodeGen Complete Support | Oren Ben Simhon | 2016-12-21 | 4 | -64/+227
| | | | | | | | | | | | | The vectorcall calling convention specifies that arguments to functions are to be passed in registers, when possible. vectorcall uses more registers for arguments than fastcall or the default x64 calling convention use. The vectorcall calling convention is only supported in native code on x86 and x64 processors that include Streaming SIMD Extensions 2 (SSE2) and above. The current implementation does not handle Homogeneous Vector Aggregates (HVAs) correctly and this review attempts to fix it. This submit also includes additional lit tests to better cover HVA corner cases. Differential Revision: https://reviews.llvm.org/D27392 llvm-svn: 290240
* [ARM] Implement isExtractSubvectorCheap. | Eli Friedman | 2016-12-20 | 2 | -0/+12
| | | | | | | | | | | | | | See https://reviews.llvm.org/D6678 for the history of isExtractSubvectorCheap. Essentially the same considerations apply to ARM. This temporarily breaks the formation of vpadd/vpaddl in certain cases; AddCombineToVPADDL essentially assumes that we won't form VUZP shuffles. See https://reviews.llvm.org/D27779 for followup fix. Differential Revision: https://reviews.llvm.org/D27774 llvm-svn: 290198
* AMDGPU: Allow 16-bit types in inline asm constraints | Matt Arsenault | 2016-12-20 | 1 | -0/+2
| | | | llvm-svn: 290193
* AMDGPU: Don't add same instruction multiple times to worklist | Matt Arsenault | 2016-12-20 | 1 | -1/+7
| | | | | | | | | When the instruction is processed the first time, it may be deleted resulting in crashes. While the new test adds the same user to the worklist twice, this particular case doesn't crash but I'm not sure why. llvm-svn: 290191
* AMDGPU/SI: Make a function const | Tom Stellard | 2016-12-20 | 2 | -4/+3
| | | | llvm-svn: 290185
* AMDGPU/SI: Add a MachineMemOperand when lowering llvm.amdgcn.buffer.load.* | Tom Stellard | 2016-12-20 | 6 | -6/+77
| | | | | | | | | | Reviewers: arsenm, nhaehnle, mareko Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27834 llvm-svn: 290184
* [X86][SSE] Ensure we're only combining shuffles with legal mask types. | Simon Pilgrim | 2016-12-20 | 1 | -0/+4
| | | | | | I haven't managed to get this to fail yet but it's technically possible for the AND -> shuffle decomposition to result in illegal types. llvm-svn: 290183
* AMDGPU/SI: Add a MachineMemOperand to MIMG instructions | Tom Stellard | 2016-12-20 | 4 | -6/+57
| | | | | | | | | | | | | | | Summary: Without a MachineMemOperand, the scheduler was assuming MIMG instructions were ordered memory references, so no loads or stores could be reordered across them. Reviewers: arsenm Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27536 llvm-svn: 290179
* Fix build with expensive checks enabled | Serge Pavlov | 2016-12-20 | 1 | -0/+1
| | | | | | | | | The include of llvm/IR/Verifier.h was removed from HexagonCommonGEP.cpp in r289604 as unused. In fact it is required when expensive checks are enabled, because it declares the function `verifyFunction`, which is called in a conditionally compiled part of the file. llvm-svn: 290170
* [TargetInstrInfo] replace redundant expression in getMemOpBaseRegImmOfs | Michael LeMay | 2016-12-19 | 1 | -2/+1
| | | | | | | | | | | | | | | Summary: The expression for computing the return value of getMemOpBaseRegImmOfs has only one possible value. The other value would result in a return earlier in the function. This patch replaces the expression with its only possible value. Reviewers: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27437 llvm-svn: 290133
* [AMDGPU] When unifying metadata, add operands to named metadata individually | Konstantin Zhuravlyov | 2016-12-19 | 1 | -3/+5
| | | | | | Differential Revision: https://reviews.llvm.org/D27725 llvm-svn: 290114
* Silence unused warning. | Daniel Jasper | 2016-12-19 | 1 | -0/+1
| | | | llvm-svn: 290109
* [ARM] GlobalISel: Lower i8 and i16 register args | Diana Picus | 2016-12-19 | 1 | -5/+20
| | | | | | | | | | | This allows lowering i8 and i16 arguments if they can fit in the registers. Note that the lowering is incomplete - ABI extensions are handled in a subsequent patch. (Last part of) Differential Revision: https://reviews.llvm.org/D27704 llvm-svn: 290106
* [ARM] GlobalISel: Allow i8 and i16 adds | Diana Picus | 2016-12-19 | 1 | -1/+6
| | | | | | | | | Teach the instruction selector and legalizer that it's ok to have adds with 8 or 16-bit integers. This is the second part of https://reviews.llvm.org/D27704 llvm-svn: 290105
* [ARM] GlobalISel: Select i8 and i16 copies | Diana Picus | 2016-12-19 | 1 | -2/+9
| | | | | | | | | Teach the instruction selector that it's ok to copy small values from physical registers. First part of https://reviews.llvm.org/D27704 llvm-svn: 290104
* [Power9] Processor Model for Scheduling | Ehsan Amiri | 2016-12-19 | 4 | -3/+1145
| | | | | | | | PWR9 processor model for instruction scheduling. A subsequent patch will migrate PWR9 to Post RA MIScheduler. https://reviews.llvm.org/D24525 llvm-svn: 290102
* [Hexagon] Restore minimum profit check accidentally changed in r290024 | Malcolm Parsons | 2016-12-19 | 1 | -2/+2
| | | | llvm-svn: 290100
* [ARM] GlobalISel: Lower more than 4 arguments | Diana Picus | 2016-12-19 | 1 | -10/+22
| | | | | | | | | | This adds support for lowering more than 4 arguments (although still i32 only). It uses the handleAssignments / ValueHandler infrastructure extracted from the AArch64 backend in r288658. Differential Revision: https://reviews.llvm.org/D27195 llvm-svn: 290098
* AMDGPU: [AMDGPU] Assembler: add .hsa_code_object_metadata directive for functime metadata V2.0 | Sam Kolton | 2016-12-19 | 4 | -72/+143
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Added a pair of directives, .hsa_code_object_metadata/.end_hsa_code_object_metadata. Between them the user can put a YAML string that is put directly into the generated note. E.g.: ''' .hsa_code_object_metadata { amd.MDVersion: [ 2, 0 ] } .end_hsa_code_object_metadata ''' Based on D25046 Reviewers: vpykhtin, nhaustov, yaxunl, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, mgorny, tony-tye Differential Revision: https://reviews.llvm.org/D27619 llvm-svn: 290097
* [ARM] GlobalISel: Support loading from the stack | Diana Picus | 2016-12-19 | 3 | -10/+45
| | | | | | | | | | Add support for selecting simple G_LOAD and G_FRAME_INDEX instructions (32-bit scalars only). This will be useful for functions that need to pass arguments on the stack. First part of https://reviews.llvm.org/D27195. llvm-svn: 290096
* [X86] When recognizing vector loads or VZEXT_LOAD in selectScalarSSELoad make sure we pass the load's user rather than load itself to the second operand of IsLegalToFold. | Craig Topper | 2016-12-19 | 1 | -2/+2
| | | | | | llvm-svn: 290089
* [X86] Remove all of the patterns that use X86ISD:FAND/FXOR/FOR/FANDN except for the ones needed for SSE1. | Craig Topper | 2016-12-19 | 2 | -131/+42
| | | | | | | | Anything SSE2 or above uses the integer ISD opcode. This removes 11721 bytes from the DAG isel table or 2.2% llvm-svn: 290073
* Revert r289955 and r289962. This is causing lots of ASAN failures for us. | Daniel Jasper | 2016-12-18 | 1 | -22/+10
| | | | | | | | Not sure whether it causes an ASAN false positive or whether it actually leads to incorrect code or whether it even exposes bad code. Hans, I'll get you instructions to reproduce this. llvm-svn: 290066
* [X86] [AVX512] Minor fix in encoding of scalar EVEX instructions. NFC. | Michael Zuckerman | 2016-12-18 | 1 | -3/+2
| | | | | | | | | | | | Commit on behalf of Gadi Haber. Removed EVEX_V512 prefix from scalar EVEX instructions since HW ignores L'L bits anyway (LIG). 4 instructions are modified. The changed encodings are validated with XED. Reviewers: delena, igorb Differential revision: https://reviews.llvm.org/D27802 llvm-svn: 290065
* [X86][SSE] Add support for combining target shuffles to SHUFPS. | Simon Pilgrim | 2016-12-18 | 1 | -2/+108
| | | | | | As discussed on D27692, the next step will be to allow cross-domain shuffles once the combined shuffle depth passes a certain point. llvm-svn: 290064
* [X86][SSE][AVX-512] Convert FAND/FOR/FXOR/FANDN nodes to integer operations if they are available. | Craig Topper | 2016-12-18 | 1 | -13/+14
| | | | | | | | | | | | This will allow a bunch of patterns to be removed. These nodes are only emitted for lowering FABS/FNEG/FNABS/FCOPYSIGN. Ideally we just wouldn't create these nodes if SSE2 or higher is available, but it was simple to just convert them in DAG combine. For SSE2, AVX, and AVX512 with DQI this is no functional change as the execution domain fixing pass ensures the right domain is selected regardless of the ISD opcode. For AVX-512 without DQI we end up using integer instructions since the floating point versions aren't available. But we were already doing that for any logical operations in code that didn't come from FABS/FNEG/FNABS/FCOPYSIGN so this seems no worse. And we get the benefit of being able to fold broadcasts now. llvm-svn: 290060
* [AVX-512] Use EVEX encoded XOR instruction for zeroing scalar registers when DQI and VLX instructions are available. | Craig Topper | 2016-12-18 | 3 | -5/+22
| | | | | | | | This can give the register allocator more registers to use. llvm-svn: 290057
* [AVX-512] Make sure VLX is also enabled before using EVEX encoded logic ops for scalars. | Craig Topper | 2016-12-18 | 2 | -2/+2
| | | | | | I missed this in r290049. llvm-svn: 290055
* [AVX-512] Use EVEX encoded logic operations for scalar types when they are available. | Craig Topper | 2016-12-17 | 2 | -1/+38
| | | | | | This gives the register allocator more registers to work with. llvm-svn: 290049
* Revert "AArch64CollectLOH: Rewrite as block-local analysis." | Matthias Braun | 2016-12-17 | 1 | -279/+841
| | | | | | | | It is still breaking Chrome. http://llvm.org/PR31361 This reverts commit r290026. llvm-svn: 290047
* [Hexagon] Other attempt to fix build with enabled asserts broken in 290024 (NFC). | Eugene Zelenko | 2016-12-17 | 1 | -0/+1
| | | | | | llvm-svn: 290028
* [Hexagon] Fix build with enabled asserts broken in 290024 (NFC). | Eugene Zelenko | 2016-12-17 | 1 | -0/+1
| | | | llvm-svn: 290027
* AArch64CollectLOH: Rewrite as block-local analysis. | Matthias Braun | 2016-12-17 | 1 | -841/+279
| | | | | | | | | | | | | | | | | Re-apply r288561: Liveness tracking should be correct now after r290014. Previously this pass was using up to 5% compile time in some cases which is a bit much for what it is doing. The pass featured a full blown data-flow analysis which in the default configuration was restricted to a single block. This rewrites the pass under the assumption that we only ever work on a single block. This is done in a single pass maintaining a state machine per general purpose register to catch LOH patterns. Differential Revision: https://reviews.llvm.org/D27329 llvm-svn: 290026
* [Hexagon] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2016-12-17 | 11 | -163/+220
| | | | | | llvm-svn: 290024
* AArch64: Enable post-ra liveness updates | Matthias Braun | 2016-12-16 | 3 | -1/+13
| | | | | | Differential Revision: https://reviews.llvm.org/D27559 llvm-svn: 290014
* Implement LaneBitmask::any(), use it to replace !none(), NFCI | Krzysztof Parzyszek | 2016-12-16 | 5 | -11/+11
| | | | llvm-svn: 289974
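The shape of this change can be illustrated with a small stand-in type. This is a sketch only: `LaneMaskSketch` and its `Bits` member are hypothetical and are not the real LaneBitmask class; the assumption is simply that any() is defined as the logical negation of none().

```cpp
#include <cstdint>

// Hypothetical stand-in for a lane bitmask (not the LLVM class).
struct LaneMaskSketch {
  uint32_t Bits;

  constexpr explicit LaneMaskSketch(uint32_t B = 0) : Bits(B) {}

  // True when no lane bit is set.
  constexpr bool none() const { return Bits == 0; }

  // True when at least one lane bit is set; equivalent to !none().
  constexpr bool any() const { return Bits != 0; }
};

// Call sites written as `!Mask.none()` can then read `Mask.any()`.
static_assert(LaneMaskSketch(0x4).any() == !LaneMaskSketch(0x4).none(),
              "any() agrees with !none() for a non-empty mask");
static_assert(LaneMaskSketch().any() == !LaneMaskSketch().none(),
              "any() agrees with !none() for an empty mask");
```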