summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* [X86] Add initial skeleton support for knm cpuCraig Topper2017-10-131-5/+16
| | | | | | | | This adds Intel's Knights Mill CPU to valid CPU names for the backend. For now its an alias of "knl", but ultimately we need to support AVX5124FMAPS and AVX5124VNNIW instruction sets for it. Differential Revision: https://reviews.llvm.org/D38811 llvm-svn: 315722
* [X86] Fix some inconsistent formatting in the processor feature lists.Craig Topper2017-10-131-4/+4
| | | | llvm-svn: 315696
* [X86] Add ProcIntelBDW to BroadwellProc class not BDWFeatures class.Craig Topper2017-10-131-4/+5
| | | | | | This isn't a property we want inherited. llvm-svn: 315695
* [Hexagon] Add patterns for cmpb/cmph with immediate argumentsKrzysztof Parzyszek2017-10-131-0/+46
| | | | | | Patch by Sumanth Gundapaneni. llvm-svn: 315692
* [X86] Stop creating CMOV nodes with a second MVT::Glue resultCraig Topper2017-10-131-24/+9
| | | | | | | | | | | | | | Summary: We seem to inconsistently create CMOV nodes some with a Glue result and some without. But I can't find any cases that use the Glue result. So I've tried to remove all the place that did this. Reviewers: RKSimon, spatel, zvi Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38664 llvm-svn: 315686
* [X86] Remove patterns that select unmasked vbroadcastf2x32/vbroadcasti2x32. ↵Craig Topper2017-10-131-8/+20
| | | | | | | | Prefer vbroadcastsd/vpbroadcastq instead. There's no advantage to using these instructions when they aren't masked. This enables some additional execution domain switching without needing to update the table. llvm-svn: 315674
* Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine"Matthias Braun2017-10-1242-72/+546
| | | | | | | | | | Reverting to investigate layering effects of MCJIT not linking libCodeGen but using TargetMachine::getNameWithPrefix() breaking the lldb bots. This reverts commit r315633. llvm-svn: 315637
* TargetMachine: Merge TargetMachine and LLVMTargetMachineMatthias Braun2017-10-1242-546/+72
| | | | | | | | | | | | | | | Merge LLVMTargetMachine into TargetMachine. - There is no in-tree target anymore that just implements TargetMachine but not LLVMTargetMachine. - It should still be possible to stub out all the various functions in case a target does not want to use lib/CodeGen - This simplifies the code and avoids methods ending up in the wrong interface. Differential Revision: https://reviews.llvm.org/D38489 llvm-svn: 315633
* [X86] Add CLWB intrinsic. llvm partCraig Topper2017-10-121-2/+2
| | | | llvm-svn: 315613
* Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.Wei Ding2017-10-125-38/+78
| | | | | | Differential Revision: http://reviews.llvm.org/D37348 llvm-svn: 315610
* AMDGPU/NFC: Move AMDGPU specific note types to ELF.hKonstantin Zhuravlyov2017-10-123-10/+6
| | | | | | Differential Revision: https://reviews.llvm.org/D38747 llvm-svn: 315608
* [NVPTX] Implemented wmma intrinsics and instructions.Artem Belevich2017-10-124-0/+845
| | | | | | | | | | WMMA = "Warp Level Matrix Multiply-Accumulate". These are the new instructions introduced in PTX6.0 and available on sm_70 GPUs. Differential Revision: https://reviews.llvm.org/D38645 llvm-svn: 315601
* [codeview] Don't emit FPO data in funclet prologuesReid Kleckner2017-10-122-6/+3
| | | | | | Attempt 3 to work around bugs in FPO data with funclets. llvm-svn: 315600
* AMDGPU: Fix warnings introduced in r315526Konstantin Zhuravlyov2017-10-122-5/+5
| | | | llvm-svn: 315596
* [PowerPC] Add profitablilty check for conversion to mtctr loopsLei Huang2017-10-121-1/+32
| | | | | | | | | | | | | | | Add profitability checks for modifying counted loops to use the mtctr instruction. The latency of mtctr is only justified if there are more than 4 comparisons that will be removed as a result. Usually counted loops are formed relatively early and before unrolling, so most low trip count loops often don't survive. However we want to ensure that if they do, we do not mistakenly update them to mtctr loops. Use CodeMetrics to ensure we are only doing this for small loops with small trip counts. Differential Revision: https://reviews.llvm.org/D38212 llvm-svn: 315592
* [AMDGPU] For amdpal, widen interpolation mode workaroundTim Renouf2017-10-121-8/+25
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: The interpolation mode workaround ensures that at least one interpolation mode is enabled in PSInputAddr. It does not also check PSInputEna on the basis that the user might enable bits in that depending on run-time state. However, for amdpal os type, the user does not enable some bits after compilation based on run-time states; the register values being generated here are the final ones set in the hardware. Therefore, apply the workaround to PSInputAddr and PSInputEnable together. (The case where a bit is set in PSInputAddr but not in PSInputEnable is where the frontend set up an input arg for a particular interpolation mode, but nothing uses that input arg. Really we should have an earlier pass that removes such an arg.) Reviewers: arsenm, nhaehnle, dstuttard Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D37758 llvm-svn: 315591
* [dump] Remove NDEBUG from test to enable dump methods [NFC]Don Hinton2017-10-1211-13/+13
| | | | | | | | | | | | | | | Summary: Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP. Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods. Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so it'll be picked up by public headers. Differential Revision: https://reviews.llvm.org/D38406 llvm-svn: 315590
* [x86] replace isEqualTo with == for efficiencySanjay Patel2017-10-121-4/+4
| | | | | | | This is a follow-up suggested in D37534. Patch by Yulia Koval. llvm-svn: 315589
* [X86][SSE] Pull out repeated INSERT_VECTOR_ELT code from LowerBUILD_VECTOR ↵Simon Pilgrim2017-10-121-57/+51
| | | | | | v16i8/v8i16 insertion. NFCI. llvm-svn: 315587
* Speculative build fix 2Reid Kleckner2017-10-121-1/+1
| | | | llvm-svn: 315542
* Revert r307036 because of PR34919.Wei Mi2017-10-121-13/+0
| | | | llvm-svn: 315540
* Speculative build fix, apparently I built llc without my patch applied to ↵Reid Kleckner2017-10-121-1/+1
| | | | | | test it llvm-svn: 315539
* [codeview] Disable FPO in functions using EH funcletsReid Kleckner2017-10-122-0/+5
| | | | | | | Funclets are emitted by WinException which doesn't have access to X86TargetStreamer so it's hard to make a quick fix for this. llvm-svn: 315538
* Fix AMDGPU build issueReid Kleckner2017-10-111-1/+1
| | | | llvm-svn: 315535
* [X86] Sink X86AsmPrinter ctor into .cpp file, NFCReid Kleckner2017-10-112-3/+5
| | | | | | I keep adding and removing code here, so let's sink it. llvm-svn: 315534
* [MC] Have MCObjectStreamer take its MCAsmBackend argument via unique_ptr.Lang Hames2017-10-1124-110/+154
| | | | | | | | MCObjectStreamer owns its MCCodeEmitter -- this fixes the types to reflect that, and allows us to remove the last instance of MCObjectStreamer's weird "holding ownership via someone else's reference" trick. llvm-svn: 315531
* AMDGPU/NFC: Minor clean ups in HSA metadataKonstantin Zhuravlyov2017-10-117-125/+110
| | | | | | | | | - Use HSA metadata streamer directly from AMDGPUAsmPrinter - Make naming consistent with PAL metadata Differential Revision: https://reviews.llvm.org/D38746 llvm-svn: 315526
* AMDGPU/NFC: Minor clean ups in PAL metadataKonstantin Zhuravlyov2017-10-116-90/+79
| | | | | | | | | - Move PAL metadata definitions to AMDGPUMetadata - Make naming consistent with HSA metadata Differential Revision: https://reviews.llvm.org/D38745 llvm-svn: 315523
* AMDGPU/NFC: Rename code object metadata as HSA metadataKonstantin Zhuravlyov2017-10-117-77/+76
| | | | | | | | | - Rename AMDGPUCodeObjectMetadata to AMDGPUMetadata (PAL metadata will be included in this file in the follow up change) - Rename AMDGPUCodeObjectMetadataStreamer to AMDGPUHSAMetadataStreamer - Introduce HSAMD namespace - Other minor name changes in function and test names llvm-svn: 315522
* [codeview] Implement FPO data assembler directivesReid Kleckner2017-10-1112-45/+704
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This adds a set of new directives that describe 32-bit x86 prologues. The directives are limited and do not expose the full complexity of codeview FPO data. They are merely a convenience for the compiler to generate more readable assembly so we don't need to generate tons of labels in CodeGen. If our prologue emission changes in the future, we can change the set of available directives to suit our needs. These are modelled after the .seh_ directives, which use a different format that interacts with exception handling. The directives are: .cv_fpo_proc _foo .cv_fpo_pushreg ebp/ebx/etc .cv_fpo_setframe ebp/esi/etc .cv_fpo_stackalloc 200 .cv_fpo_endprologue .cv_fpo_endproc .cv_fpo_data _foo I tried to follow the implementation of ARM EHABI CFI directives by sinking most directives out of MCStreamer and into X86TargetStreamer. This helps avoid polluting non-X86 code with WinCOFF specific logic. I used cdb to confirm that this can show locals in parent CSRs in a few cases, most importantly the one where we use ESI as a frame pointer, i.e. the one in http://crbug.com/756153#c28 Once we have cdb integration in debuginfo-tests, we can add integration tests there. Reviewers: majnemer, hans Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D38776 llvm-svn: 315513
* [Hexagon] Make sure that new-value jump is packetized with producerKrzysztof Parzyszek2017-10-111-9/+15
| | | | llvm-svn: 315510
* [PowerPC] Utilize DQ-Form instructions for spill/restore and fix FrameIndex ↵Lei Huang2017-10-112-9/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | elimination to only use `lis/addi` if necessary. Currently we produce a bunch of unnecessary code when emitting the prologue/epilogue for spills/restores. Namely, if the load from stack slot/store to stack slot instruction is an X-Form instruction, we will always produce an LIS/ORI sequence for the stack offset. Furthermore, we have not exploited the P9 vector D-Form loads/stores for this purpose. This patch address both issues. Specifying the D-Form load as the instruction to use for stack spills/reloads should be safe because: 1. The stack should be aligned according to the ABI 2. If the stack isn't aligned, PPCRegisterInfo::eliminateFrameIndex() will check for the offset being a multiple of 16 and will convert it to an X-Form instruction if it isn't. Differential Revision : https://reviews.llvm.org/D38758 llvm-svn: 315500
* [x86] avoid infinite loop from SoftenFloatOperand (PR34866)Sanjay Patel2017-10-111-0/+5
| | | | | | | | | Legalization of fp128 assumes things that we should have asserts for, so that's another potential improvement. Differential Revision: https://reviews.llvm.org/D38771 llvm-svn: 315485
* [Hexagon] Handle non-immediate operands to A2_addi in getIncrementValueKrzysztof Parzyszek2017-10-111-4/+6
| | | | llvm-svn: 315472
* Spelling mistake in comment. NFCI.Simon Pilgrim2017-10-111-1/+1
| | | | llvm-svn: 315471
* [X86] Remove MVT::i1 handling code from LowerTRUNCATECraig Topper2017-10-111-8/+0
| | | | | | | | | | | | | | Summary: I don't think this is necessary with i1 being illegal now. Reviewers: RKSimon, zvi, guyblank Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38784 llvm-svn: 315469
* [Pipeliner] Fix offset value for instrs dependent on post-inc load/storesKrzysztof Parzyszek2017-10-111-7/+8
| | | | | | | | | | | | The software pipeliner and the packetizer try to break dependence between the post-increment instruction and the dependent memory instructions by changing the base register and the offset value. However, in some cases, the existing logic didn't work properly and created incorrect offset value. Patch by Jyotsna Verma. llvm-svn: 315468
* [Pipeliner] Improve serialization order for post-incrementsKrzysztof Parzyszek2017-10-113-1/+63
| | | | | | | | | | | | | | | | | | | The pipeliner is generating a serial sequence that causes poor register allocation when a post-increment instruction appears prior to the use of the post-increment register. This occurs when there is a circular set of dependences involved with a sequence of instructions in the same cycle. In this case, there is no serialization of the parallel semantics that will not cause an additional register to be allocated. This patch fixes the problem by changing the instructions so that the post-increment instruction is used by the subsequent instruction, which enables the register allocator to make a better decision and not require another register. Patch by Brendon Cahoon. llvm-svn: 315466
* [RISCV] Fix build after r315327Alex Bradbury2017-10-113-6/+10
| | | | | | | Differential Revision: https://reviews.llvm.org/D38779 Patch by Chih-Mao Chen. llvm-svn: 315455
* [mips] Add support for parsing target specific flags for MIRSimon Dardis2017-10-112-0/+42
| | | | | | | | Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D38620 llvm-svn: 315451
* [Asm] Add debug tracing in table-generated assembly matcherOliver Stannard2017-10-1113-25/+20
| | | | | | | | | | | | | This adds debug tracing to the table-generated assembly instruction matcher, enabled by the -debug-only=asm-matcher option. The changes in the target AsmParsers are to add an MCInstrInfo reference under a consistent name, so that we can use it from table-generated code. This was already being used this way for targets that use deprecation warnings, but 5 targets did not have it, and Hexagon had it under a different name to the other backends. llvm-svn: 315445
* [MC] Have MCObjectStreamer take its MCAsmBackend argument via unique_ptr.Lang Hames2017-10-1124-120/+170
| | | | | | | | MCObjectStreamer owns its MCAsmBackend -- this fixes the types to reflect that, and allows us to remove another instance of MCObjectStreamer's weird "holding ownership via someone else's reference" trick. llvm-svn: 315410
* [X86] Remove temporary std::string creation from shuffle comment printing. ↵Craig Topper2017-10-111-12/+12
| | | | | | We can just write directly to the raw_ostream. llvm-svn: 315399
* [X86] Add 128-bit version of vbroadcasti32x2 to shuffle comment decoding.Craig Topper2017-10-111-0/+11
| | | | llvm-svn: 315395
* [X86] Add broadcast patterns that allow a scalar_to_vector between the ↵Craig Topper2017-10-101-0/+18
| | | | | | | | broadcast and the load. We already have these patterns for AVX512VL, but not AVX1 or 2. llvm-svn: 315382
* [X86] Fix some patterns that select VLX instructions, but were incorrectly ↵Craig Topper2017-10-102-2/+6
| | | | | | | | also checking presence of BWI instructions. The EVEX->VEX pass probably obscures this. llvm-svn: 315365
* [mips] Correct the instruction predicates for microMIPSr3Simon Dardis2017-10-101-205/+224
| | | | | | | | | | | | | | | | Rather than using the AdditionalPredicates mechanism to guard the microMIPS instructions, use the existing predicates to properly guard those instructions. This also resolves a case where an instruction pattern was incorrectly available for microMIPS32R6, which caused a register allocation failure as the registers specified in the pattern were not available. Reviewers: nitesh.jain, atanasyan Differential Revision: https://reviews.llvm.org/D38451 llvm-svn: 315362
* AMDGPU: Fix missing skipFunction callsMatt Arsenault2017-10-102-1/+4
| | | | llvm-svn: 315361
* AMDGPU: Fix failure to select branch with optnoneMatt Arsenault2017-10-103-20/+6
| | | | | | | | opt-bisect/optnone disable the AMDGPUUniformAnnotateValues pass. The heuristic in the custom selector for brcond deferred the branch uniformity check to the pattern, which would fail. llvm-svn: 315360
* AMDGPU: Fix incorrect selection of pseudo-branchesMatt Arsenault2017-10-105-2/+13
| | | | | | These should only be used if the machine structurizer is enabled. llvm-svn: 315357
OpenPOWER on IntegriCloud