summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86
Commit message (Collapse)AuthorAgeFilesLines
* [X86] Add an additional isel pattern to CVTDQ2PDrm/VCVTDQ2PDrm to enable ↵Craig Topper2017-10-141-2/+6
| | | | | | | | load folding without the peephole pass. This pattern is already used in AVX512VL version of these instructions. Though AVX512VL version is missing other patterns. llvm-svn: 315794
* [X86] Use X86ISD::VBROADCAST in place of v2f64 X86ISD::MOVDDUP when AVX2 is ↵Craig Topper2017-10-133-17/+27
| | | | | | | | | | | | | | | | available This is particularly important for AVX512VL where we are better able to recognize the VBROADCAST loads to fold with other operations. For AVX512VL we now use X86ISD::VBROADCAST for all of the patterns and remove the 128-bit X86ISD::VMOVDDUP. We may be able to use this for AVX1 as well which would allow us to remove more isel patterns. I also had to add X86ISD::VBROADCAST as a node to call combineShuffle for so that we treat it similar to X86ISD::MOVDDUP. Differential Revision: https://reviews.llvm.org/D38836 llvm-svn: 315768
* [globalisel][tablegen] Add support for fpimm and import of APInt/APFloat ↵Daniel Sanders2017-10-131-8/+8
| | | | | | | | | | | | | | | | | | | | | | based ImmLeaf. Summary: There's only a tablegen testcase for IntImmLeaf and not a CodeGen one because the relevant rules are rejected for other reasons at the moment. On AArch64, it's because there's an SDNodeXForm attached to the operand. On X86, it's because the rule either emits multiple instructions or has another predicate using PatFrag which cannot easily be supported at the same time. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: qcolombet Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D36569 llvm-svn: 315761
* [X86] Add initial skeleton support for knm cpuCraig Topper2017-10-131-5/+16
| | | | | | | | This adds Intel's Knights Mill CPU to valid CPU names for the backend. For now its an alias of "knl", but ultimately we need to support AVX5124FMAPS and AVX5124VNNIW instruction sets for it. Differential Revision: https://reviews.llvm.org/D38811 llvm-svn: 315722
* [X86] Fix some inconsistent formatting in the processor feature lists.Craig Topper2017-10-131-4/+4
| | | | llvm-svn: 315696
* [X86] Add ProcIntelBDW to BroadwellProc class not BDWFeatures class.Craig Topper2017-10-131-4/+5
| | | | | | This isn't a property we want inherited. llvm-svn: 315695
* [X86] Stop creating CMOV nodes with a second MVT::Glue resultCraig Topper2017-10-131-24/+9
| | | | | | | | | | | | | | Summary: We seem to inconsistently create CMOV nodes some with a Glue result and some without. But I can't find any cases that use the Glue result. So I've tried to remove all the place that did this. Reviewers: RKSimon, spatel, zvi Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38664 llvm-svn: 315686
* [X86] Remove patterns that select unmasked vbroadcastf2x32/vbroadcasti2x32. ↵Craig Topper2017-10-131-8/+20
| | | | | | | | Prefer vbroadcastsd/vpbroadcastq instead. There's no advantage to using these instructions when they aren't masked. This enables some additional execution domain switching without needing to update the table. llvm-svn: 315674
* Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine"Matthias Braun2017-10-122-2/+2
| | | | | | | | | | Reverting to investigate layering effects of MCJIT not linking libCodeGen but using TargetMachine::getNameWithPrefix() breaking the lldb bots. This reverts commit r315633. llvm-svn: 315637
* TargetMachine: Merge TargetMachine and LLVMTargetMachineMatthias Braun2017-10-122-2/+2
| | | | | | | | | | | | | | | Merge LLVMTargetMachine into TargetMachine. - There is no in-tree target anymore that just implements TargetMachine but not LLVMTargetMachine. - It should still be possible to stub out all the various functions in case a target does not want to use lib/CodeGen - This simplifies the code and avoids methods ending up in the wrong interface. Differential Revision: https://reviews.llvm.org/D38489 llvm-svn: 315633
* [X86] Add CLWB intrinsic. llvm partCraig Topper2017-10-121-2/+2
| | | | llvm-svn: 315613
* [codeview] Don't emit FPO data in funclet prologuesReid Kleckner2017-10-122-6/+3
| | | | | | Attempt 3 to work around bugs in FPO data with funclets. llvm-svn: 315600
* [dump] Remove NDEBUG from test to enable dump methods [NFC]Don Hinton2017-10-122-2/+2
| | | | | | | | | | | | | | | Summary: Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP. Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods. Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so it'll be picked up by public headers. Differential Revision: https://reviews.llvm.org/D38406 llvm-svn: 315590
* [x86] replace isEqualTo with == for efficiencySanjay Patel2017-10-121-4/+4
| | | | | | | This is a follow-up suggested in D37534. Patch by Yulia Koval. llvm-svn: 315589
* [X86][SSE] Pull out repeated INSERT_VECTOR_ELT code from LowerBUILD_VECTOR ↵Simon Pilgrim2017-10-121-57/+51
| | | | | | v16i8/v8i16 insertion. NFCI. llvm-svn: 315587
* Speculative build fix 2Reid Kleckner2017-10-121-1/+1
| | | | llvm-svn: 315542
* Revert r307036 because of PR34919.Wei Mi2017-10-121-13/+0
| | | | llvm-svn: 315540
* Speculative build fix, apparently I built llc without my patch applied to ↵Reid Kleckner2017-10-121-1/+1
| | | | | | test it llvm-svn: 315539
* [codeview] Disable FPO in functions using EH funcletsReid Kleckner2017-10-122-0/+5
| | | | | | | Funclets are emitted by WinException which doesn't have access to X86TargetStreamer so it's hard to make a quick fix for this. llvm-svn: 315538
* [X86] Sink X86AsmPrinter ctor into .cpp file, NFCReid Kleckner2017-10-112-3/+5
| | | | | | I keep adding and removing code here, so let's sink it. llvm-svn: 315534
* [MC] Have MCObjectStreamer take its MCAsmBackend argument via unique_ptr.Lang Hames2017-10-112-5/+9
| | | | | | | | MCObjectStreamer owns its MCCodeEmitter -- this fixes the types to reflect that, and allows us to remove the last instance of MCObjectStreamer's weird "holding ownership via someone else's reference" trick. llvm-svn: 315531
* [codeview] Implement FPO data assembler directivesReid Kleckner2017-10-1112-45/+704
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This adds a set of new directives that describe 32-bit x86 prologues. The directives are limited and do not expose the full complexity of codeview FPO data. They are merely a convenience for the compiler to generate more readable assembly so we don't need to generate tons of labels in CodeGen. If our prologue emission changes in the future, we can change the set of available directives to suit our needs. These are modelled after the .seh_ directives, which use a different format that interacts with exception handling. The directives are: .cv_fpo_proc _foo .cv_fpo_pushreg ebp/ebx/etc .cv_fpo_setframe ebp/esi/etc .cv_fpo_stackalloc 200 .cv_fpo_endprologue .cv_fpo_endproc .cv_fpo_data _foo I tried to follow the implementation of ARM EHABI CFI directives by sinking most directives out of MCStreamer and into X86TargetStreamer. This helps avoid polluting non-X86 code with WinCOFF specific logic. I used cdb to confirm that this can show locals in parent CSRs in a few cases, most importantly the one where we use ESI as a frame pointer, i.e. the one in http://crbug.com/756153#c28 Once we have cdb integration in debuginfo-tests, we can add integration tests there. Reviewers: majnemer, hans Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D38776 llvm-svn: 315513
* [x86] avoid infinite loop from SoftenFloatOperand (PR34866)Sanjay Patel2017-10-111-0/+5
| | | | | | | | | Legalization of fp128 assumes things that we should have asserts for, so that's another potential improvement. Differential Revision: https://reviews.llvm.org/D38771 llvm-svn: 315485
* Spelling mistake in comment. NFCI.Simon Pilgrim2017-10-111-1/+1
| | | | llvm-svn: 315471
* [X86] Remove MVT::i1 handling code from LowerTRUNCATECraig Topper2017-10-111-8/+0
| | | | | | | | | | | | | | Summary: I don't think this is necessary with i1 being illegal now. Reviewers: RKSimon, zvi, guyblank Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38784 llvm-svn: 315469
* [Asm] Add debug tracing in table-generated assembly matcherOliver Stannard2017-10-111-2/+1
| | | | | | | | | | | | | This adds debug tracing to the table-generated assembly instruction matcher, enabled by the -debug-only=asm-matcher option. The changes in the target AsmParsers are to add an MCInstrInfo reference under a consistent name, so that we can use it from table-generated code. This was already being used this way for targets that use deprecation warnings, but 5 targets did not have it, and Hexagon had it under a different name to the other backends. llvm-svn: 315445
* [MC] Have MCObjectStreamer take its MCAsmBackend argument via unique_ptr.Lang Hames2017-10-112-7/+11
| | | | | | | | MCObjectStreamer owns its MCAsmBackend -- this fixes the types to reflect that, and allows us to remove another instance of MCObjectStreamer's weird "holding ownership via someone else's reference" trick. llvm-svn: 315410
* [X86] Remove temporary std::string creation from shuffle comment printing. ↵Craig Topper2017-10-111-12/+12
| | | | | | We can just write directly to the raw_ostream. llvm-svn: 315399
* [X86] Add 128-bit version of vbroadcasti32x2 to shuffle comment decoding.Craig Topper2017-10-111-0/+11
| | | | llvm-svn: 315395
* [X86] Add broadcast patterns that allow a scalar_to_vector between the ↵Craig Topper2017-10-101-0/+18
| | | | | | | | broadcast and the load. We already have these patterns for AVX512VL, but not AVX1 or 2. llvm-svn: 315382
* [X86] Fix some patterns that select VLX instructions, but were incorrectly ↵Craig Topper2017-10-102-2/+6
| | | | | | | | also checking presence of BWI instructions. The EVEX->VEX pass probably obscures this. llvm-svn: 315365
* [MC] Thread unique_ptr<MCObjectWriter> through the create.*ObjectWriterLang Hames2017-10-105-22/+34
| | | | | | | | | | functions. This makes the ownership of the resulting MCObjectWriter clear, and allows us to remove one instance of MCObjectStreamer's bizarre "holding ownership via someone else's reference" trick. llvm-svn: 315327
* after fixing the i386 caseUriel Korach2017-10-101-2/+2
| | | | | Change-Id: If6fe0b6ec01f111115fb734fe31c0e152dbc165f llvm-svn: 315311
* [AVX512] Add patterns to commute integer comparison instructions during isel.Craig Topper2017-10-101-0/+41
| | | | | | This enables broadcast loads to be commuted and allows normal loads to be folded without the peephole pass. llvm-svn: 315274
* [SEH] Use reportError instead of report_fatal_error for bad directivesReid Kleckner2017-10-101-3/+3
| | | | | | | | | | This makes the .seh_ directives slightly more usable from standalone assembly files. This removes a large number of report_fatal_errors and recovers from the error by ignoring the directive. llvm-svn: 315262
* [MC] Plumb unique_ptr<MCWinCOFFObjectTargetWriter> throughLang Hames2017-10-101-2/+2
| | | | | | | | | | | createWinCOFFObjectWriter to WinCOFFObjectWriter's constructor. Fixes the same ownership issue for COFF that r315245 did for MachO: WinCOFFObjectWriter takes ownership of its MCWinCOFFObjectTargetWriter, so we want to pass this through to the constructor via a unique_ptr, rather than a raw ptr. llvm-svn: 315257
* [MC] Plumb unique_ptr<MCELFObjectTargetWriter> through createELFObjectWriter toLang Hames2017-10-091-3/+2
| | | | | | | | | | ELFObjectWriter's constructor. Fixes the same ownership issue for ELF that r315245 did for MachO: ELFObjectWriter takes ownership of its MCELFObjectTargetWriter, so we want to pass this through to the constructor via a unique_ptr, rather than a raw ptr. llvm-svn: 315254
* [MC] Plumb unique_ptr<MCMachObjectTargetWriter> through createMachObjectWriterLang Hames2017-10-091-4/+3
| | | | | | | | | | | to MCObjectWriter's constructor. MCObjectWriter takes ownership of its MCMachObjectTargetWriter argument -- this patch plumbs that ownership relationship through the constructor (which previously took raw MCMachObjectTargetWriter*) and the createMachObjectWriter function. llvm-svn: 315245
* [GISel]: Fix generation of illegal COPYs during CallLoweringAditya Nandakumar2017-10-091-6/+24
| | | | | | | | | | | We end up creating COPY's that are either truncating/extending and this should be illegal. https://reviews.llvm.org/D37640 Patch for X86 and ARM by igorb, rovka llvm-svn: 315240
* [X86] Unsigned saturation subtraction canonicalization [the backend part]Zvi Rackover2017-10-091-0/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: On behalf of julia.koval@intel.com The patch transforms canonical version of unsigned saturation, which is sub(max(a,b),a) or sub(a,min(a,b)) to special psubus insturuction on targets, which support it(8bit and 16bit uints). umax(a,b) - b -> subus(a,b) a - umin(a,b) -> subus(a,b) There is also extra case handled, when right part of sub is 32 bit and can be truncated, using UMIN(this transformation was discussed in https://reviews.llvm.org/D25987). The example of special case code: ``` void foo(unsigned short *p, int max, int n) { int i; unsigned m; for (i = 0; i < n; i++) { m = *--p; *p = (unsigned short)(m >= max ? m-max : 0); } } ``` Max in this example is truncated to max_short value, if it is greater than m, or just truncated to 16 bit, if it is not. It is vaid transformation, because if max > max_short, result of the expression will be zero. Here is the table of types, I try to support, special case items are bold: | Size | 128 | 256 | 512 | ----- | ----- | ----- | ----- | i8 | v16i8 | v32i8 | v64i8 | i16 | v8i16 | v16i16 | v32i16 | i32 | | **v8i32** | **v16i32** | i64 | | | **v8i64** Reviewers: zvi, spatel, DavidKreitzer, RKSimon Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37534 llvm-svn: 315237
* [X86] Remove a setLoadExtAction from the AVX512 section that uses an ↵Craig Topper2017-10-091-1/+0
| | | | | | AVX512BW type and is alraedy present in the AVX512BW section. llvm-svn: 315202
* [X86] Enable extended comparison predicate support for SETUEQ/SETONE when ↵Craig Topper2017-10-092-18/+16
| | | | | | | | | | targeting AVX instructions. We believe that despite AMD's documentation, that they really do support all 32 comparision predicates under AVX. Differential Revision: https://reviews.llvm.org/D38609 llvm-svn: 315201
* [X86][SSE] Don't call combineTo inside combineX86ShufflesRecursively. NFCI.Simon Pilgrim2017-10-081-51/+60
| | | | | | | | Return the combined shuffle from combineX86ShufflesRecursively and perform the combineTo in the caller. Makes it easier for future patches to use this in functions that aren't actually shuffles themselves. llvm-svn: 315195
* Tidyup with clang-format. NFCI.Simon Pilgrim2017-10-081-8/+5
| | | | llvm-svn: 315187
* [X86] getTargetConstantBitsFromNode - add support for decoding scalar constantsSimon Pilgrim2017-10-081-0/+7
| | | | llvm-svn: 315182
* [X86] Prefer MOVSS/SD over BLENDI during legalization. Remove BLENDI ↵Craig Topper2017-10-083-92/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | versions of scalar arithmetic patterns Summary: We currently disable some converting of shuffles to MOVSS/MOVSD during legalization if SSE41 is enabled. But later during shuffle combining we go back to prefering MOVSS/MOVSD. Additionally we have patterns that look for BLENDIs to detect scalar arithmetic operations. I believe due to the combining using MOVSS/MOVSD these are unnecessary. Interestingly, we still codegen blend instructions even though lowering/isel emit movss/movsd instructions. Turns out machine CSE commutes them to blend, and then commuting those blends back into blends that are equivalent to the original movss/movsd. This patch fixes the inconsistency in legalization to prefer MOVSS/MOVSD. The one test change was caused by this change. The problem is that we have integer types and are mostly selecting integer instructions except for the shufps. This shufps forced the execution domain, but the vpblendw couldn't have its domain changed with a naive instruction swap. We could fix this by special casing VPBLENDW based on the immediate to widen the element type. The rest of the patch is removing all the excess scalar patterns. Long term we should probably add isel patterns to make MOVSS/MOVSD emit blends directly instead of relying on the double commute. We may also want to consider emitting movss/movsd for optsize. I also wonder if we should still use the VEX encoded blendi instructions even with AVX512. Blends have better throughput, and that may outweigh the register constraint. Reviewers: RKSimon, zvi Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38023 llvm-svn: 315181
* [X86][SKX] Adding the scheduling information for the SKX target.Gadi Haber2017-10-083-2/+6952
| | | | | | | | | | | | | | | | | | Adding the scheduling information for the SkylakeServer (SKX) target. This patch adds the instruction scheduling information for the SkylakeServer (SKX) architecture target by adding the file X86SchedSkylakeServer.td located under the X86 Target. We used the scheduling information retrieved from the Skylake architects in order to create the file. The scheduling information includes latency, number of micro-Ops and used ports by each SKL instruction. The patch continues the scheduling replacement and insertion effort started with the SNB target in r310792, the HSW target in r311879 and the SkylakeClient (SKL) target in rL313613. Please expect some performance fluctuations due to code alignment effects. Reviewers: zvi, RKSimon, craig.topper, chandlerc, aymanmu Differential Revision: https://reviews.llvm.org/D38443 Change-Id: I5c228fcc09e9e5a99b6116e62b356c4f9b971185 llvm-svn: 315175
* [X86] Add missing entries in 'MemoryFoldTable2Addr' to get complete form of ↵Ayman Musa2017-10-081-0/+50
| | | | | | | | | | the table. Get the folding table 'MemoryFoldTable2Addr' to a complete state as part of the process explained in https://reviews.llvm.org/D38028 Differential Revision: https://reviews.llvm.org/D38500 llvm-svn: 315174
* [X86][TableGen] Recommitting the X86 memory folding tables TableGen backend ↵Ayman Musa2017-10-081-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | while disabling it by default. After the original commit ([[ https://reviews.llvm.org/rL304088 | rL304088 ]]) was reverted, a discussion in llvm-dev was opened on 'how to accomplish this task'. In the discussion we concluded that the best way to achieve our goal (which is to automate the folding tables and remove the manually maintained tables) is: # Commit the tablegen backend disabled by default. # Proceed with an incremental updating of the manual tables - while checking the validity of each added entry. # Repeat previous step until we reach a state where the generated and the manual tables are identical. Then we can safely remove the manual tables and include the generated tables instead. # Schedule periodical (1 week/2 weeks/1 month) runs of the pass: - if changes appear (new entries): - make sure the entries are legal - If they are not, mark them as illegal to folding - Commit the changes (if there are any). CMake flag added for this purpose is "X86_GEN_FOLD_TABLES". Building with this flags will run the pass and emit the X86GenFoldTables.inc file under build/lib/Target/X86/ directory which is a good reference for any developer who wants to take part in the effort of completing the current folding tables. Differential Revision: https://reviews.llvm.org/D38028 llvm-svn: 315173
* [X86] Stop LowerSIGN_EXTEND_AVX512 from creating v8i16/v16i16/v16i8 vselects ↵Craig Topper2017-10-081-1/+6
| | | | | | | | with a v8i1/v16i1 condition when BWI is not available. Some of the tests in vector-shuffle-v1.ll would get into an infinite loop without this. llvm-svn: 315172
OpenPOWER on IntegriCloud