path: root/include/npu2.h
Commit message | Author | Age | Files | Lines
* hw/npu2: Dump (more) npu2 registers on link error and HMIs | Frederic Barrat | 2019-04-09 | 1 | -0/+1
  We were already logging some NPU registers during an HMI. This patch cleans up a bit how it is done and separates what is global from what is specific to nvlink or opencapi.

  Since we can now receive an error interrupt when an opencapi link goes down unexpectedly, we also dump the NPU state but we limit it to the registers of the brick which hit the error.

  The list of registers to dump was worked out with the hw team to allow for proper debugging. For each register, we print the name as found in the NPU workbook, the scom address and the register value.

  Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
  Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/npu2: Report errors to the OS if an OpenCAPI brick is fenced | Frederic Barrat | 2019-04-09 | 1 | -0/+1
  Now that the NPU may report interrupts due to the link going down unexpectedly, report those errors to the OS when queried by the 'next_error' PHB callback.

  The hardware doesn't support recovery of the link when it goes down unexpectedly. So we report the PHB as dead, so that the OS can log the proper message, notify the drivers and take the devices down.

  Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
  Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/npu2: Use NVLink irq setup for OpenCAPI | Frederic Barrat | 2019-04-09 | 1 | -1/+0
  Start using the irq setup code from NVLink for OpenCAPI, since the 2 versions are so close. There are only 2 differences:
  - the NPU may trigger more interrupts for OpenCAPI, 35 vs. 23, though none are configured to be triggered for now.
  - we need to enable the 4 translation fault interrupts for OpenCAPI.

  Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
  Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/npu2: Fix OpenCAPI PE assignment | Andrew Donnellan | 2019-04-09 | 1 | -3/+18
  When we support mixing NVLink and OpenCAPI devices on the same NPU, we're going to have to share the same range of 16 PE numbers between NVLink and OpenCAPI PHBs.

  For OpenCAPI devices, PE assignment is only significant for determining which System Interrupt Log register is used for a particular brick - unlike NVLink, it doesn't play any role in determining how links are fenced.

  Split the PE range into a lower half which is used for NVLink, and an upper half that is used for OpenCAPI, with a fixed PE number assigned per brick.

  As the PE assignment for OpenCAPI devices is fixed, set the PE once during device init and then ignore calls to the set_pe() operation.

  Suggested-by: Frederic Barrat <fbarrat@linux.ibm.com>
  Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
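The PE split described in this commit can be sketched roughly as below. This is only an illustration: the `sketch_` names and the "base plus brick index" mapping are assumptions made for the example, not the constants or helpers actually used in skiboot. Only the range of 16 PEs and the lower/upper split come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values; the real constants live in the skiboot headers. */
#define SKETCH_MAX_PE_NUM    16                       /* PE range shared by both link types */
#define SKETCH_OCAPI_PE_BASE (SKETCH_MAX_PE_NUM / 2)  /* start of the upper half */

/* NVLink PEs come from the lower half of the range. */
static bool sketch_pe_is_nvlink(unsigned int pe)
{
	return pe < SKETCH_OCAPI_PE_BASE;
}

/*
 * OpenCAPI gets one fixed PE per brick, taken from the upper half.
 * Assumption: the PE is simply the base plus the brick index.
 */
static unsigned int sketch_opencapi_pe(unsigned int brick_index)
{
	assert(brick_index < SKETCH_MAX_PE_NUM - SKETCH_OCAPI_PE_BASE);
	return SKETCH_OCAPI_PE_BASE + brick_index;
}
```

Because the OpenCAPI PE is a pure function of the brick, it can be computed once at device init, which is why later set_pe() calls can be ignored.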
* npu2/hw-procedures: Fix parallel zcal for opencapi | Frederic Barrat | 2019-03-20 | 1 | -1/+2
  For opencapi, we currently do impedance calibration when initializing the PHY for the device, which could run in parallel if we were rich and had multiple opencapi devices. But if 2 devices are on the same obus, the 2 calibration sequences could overlap, which likely yields bad results and is useless anyway since it only needs to be done once per obus.

  This patch splits the opencapi PHY reset in 2 parts:
  - an 'init' part called serially at boot. That's when zcal is done. If we have 2 devices on the same socket, the zcal won't be redone, since we're called serially and we'll see it has already been done for the obus
  - a 'reset' part called during fundamental reset as a prereq for link training. It does the PHY setup for a set of lanes and the dccal.

  The PHY team confirmed there's no dependency between zcal and the other reset steps and it can be moved earlier.

  Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
  Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
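The "calibrate once per obus" logic of the 'init' part might look something like the sketch below. All names are hypothetical, the obus count is made up, and the counter merely stands in for the real zcal sequence. Since the commit says the init part is called serially, the sketch uses no locking.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NUM_OBUS 4  /* hypothetical number of obus instances */

static bool sketch_zcal_done[SKETCH_NUM_OBUS];
static int sketch_zcal_runs;  /* counts how often calibration really ran */

/* 'init' part: called serially at boot, once per opencapi device. */
static void sketch_phy_init(unsigned int obus)
{
	assert(obus < SKETCH_NUM_OBUS);
	if (sketch_zcal_done[obus])
		return;                 /* a sibling device already calibrated this obus */
	sketch_zcal_runs++;             /* stand-in for the real zcal sequence */
	sketch_zcal_done[obus] = true;
}
```

Two devices on the same obus each call `sketch_phy_init()`, but the calibration stand-in only runs for the first of them.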
* npu2-opencapi: Fix adapter reset when using 2 adapters | Frederic Barrat | 2019-03-13 | 1 | -0/+4
  If two opencapi adapters are on the same obus, we may try to train the two links in parallel at boot time, when all the PCI links are being trained. Both links use the same i2c controller to handle the reset signal, so some care is needed to make sure resetting one doesn't interfere with the reset of the other. We need to keep track of the current state of the i2c controller (and use locking).

  This went mostly unnoticed as you need to have 2 opencapi cards on the same socket and links tended to train anyway because of the retries.

  Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
  Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Add XTS_BDF_MAP wildcard refcount | Alexey Kardashevskiy | 2019-02-25 | 1 | -0/+2
  Currently the PID wildcard is programmed into the NPU once and never cleared up. This works for bare metal as the MSR does not change while the host OS is running.

  However with device virtualization, we need to keep track of wildcard entry use and clear the entries before switching a GPU from a host to a guest or vice versa.

  This adds a refcount to the NPU2, one counter per wildcard entry. The index is a short lparid (4 bits long) which is allocated in opal_npu_map_lpar() and should be smaller than NPU2_XTS_BDF_MAP_SIZE (defined as 16).

  Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
  Acked-by: Reza Arbab <arbab@linux.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
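A per-entry refcount of this kind could be sketched as follows. The struct, field and function names are invented for the example; only the table size of 16 and the idea of indexing by the short lparid come from the commit message. The get/put return values are an assumption about how a caller might decide when to program or clear the hardware entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_XTS_BDF_MAP_SIZE 16  /* matches the size quoted in the commit */

struct sketch_npu {
	/* One counter per wildcard entry, indexed by the short lparid. */
	uint32_t wildcard_refcount[SKETCH_XTS_BDF_MAP_SIZE];
};

/* Take a reference; returns true if the caller must program the entry. */
static bool sketch_wildcard_get(struct sketch_npu *p, unsigned int lparshort)
{
	assert(lparshort < SKETCH_XTS_BDF_MAP_SIZE);
	return p->wildcard_refcount[lparshort]++ == 0;
}

/* Drop a reference; returns true if the caller must clear the entry. */
static bool sketch_wildcard_put(struct sketch_npu *p, unsigned int lparshort)
{
	assert(lparshort < SKETCH_XTS_BDF_MAP_SIZE);
	assert(p->wildcard_refcount[lparshort] > 0);
	return --p->wildcard_refcount[lparshort] == 0;
}
```

With this shape, the entry is only cleared when the last user goes away, which is the property needed before handing a GPU from host to guest or back.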
* opal: Deprecate reading the PHB status | Alexey Kardashevskiy | 2019-02-18 | 1 | -2/+1
  The OPAL_PCI_EEH_FREEZE_STATUS call takes a bunch of parameters, one of them is @phb_status. It is defined as __be64* and is always NULL in current upstream Linux, but if anyone ever decides to read that status, then the PHB3's handler will assume it is struct OpalIoPhb3ErrorData* (which is a lot bigger than 8 bytes) and zero it, causing stack corruption; p7ioc-phb has the same issue.

  This removes @phb_status from all eeh_freeze_status() hooks and moves the error message from PHB4 to the affected OPAL handlers.

  Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
  Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Reviewed-By: Oliver O'Halloran <oohall@gmail.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Remove unused npu2_dev_nvlink::vendor_cap | Reza Arbab | 2019-01-16 | 1 | -3/+0
  This variable is never used. Remove it.

  Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
  Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Remove unused npu2_dev::procedure_data | Reza Arbab | 2019-01-16 | 1 | -1/+0
  This variable is never used. Remove it.

  Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
  Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Remove unused npu2::lxive_cache | Reza Arbab | 2019-01-16 | 1 | -1/+0
  This variable is never used. Remove it.

  Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
  Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Remove unused npu2::bdf2pe_cache | Reza Arbab | 2019-01-16 | 1 | -1/+0
  This cache is written but never read. Wiring it up would gain us little (except added complexity), and it obviously hasn't been missed thus far, so remove it altogether.

  Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
  Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/npu2, platform: Add NPU2 platform device detection callback | Andrew Donnellan | 2018-09-17 | 1 | -2/+12
  There is no standardised way to determine the presence and type of devices connected to an NPU on POWER9. Currently, we hardcode device types based on platform type (as no platform currently supports both OpenCAPI and NVLink), and for OpenCAPI platforms we use I2C to detect presence.

  Witherspoon (and potentially other platforms later on) supports both NVLink and OpenCAPI, and additionally uses SXM2 connectors which can carry more than one link, rather than the SlimSAS connectors used for OpenCAPI on Zaius and ZZ. This necessitates some special handling.

  Add a platform callback for NPU device detection. In a later patch, we will use this to implement Witherspoon-specific device detection. For now, add a Witherspoon stub that sets all links to NVLink (i.e. current behaviour).

  Move the existing I2C-based presence detection for OpenCAPI devices on Zaius/ZZ into common code, which we use by default for platforms which do not define a callback.

  Clean up the use of the ibm,npu-link-type property, which will now be exposed solely for debugging and not consumed internally.

  Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/npu2: Common NPU2 init routine between NVLink and OpenCAPI | Andrew Donnellan | 2018-09-17 | 1 | -0/+5
  Replace probe_npu2() and probe_npu2_opencapi() with a new shared probe_npu2(). Refactor some of the common NPU setup code into shared code.

  No functional change. This patch does not implement support for using both types of devices simultaneously on the same NPU - we expect to add this sometime in the future.

  Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Acked-by: Reza Arbab <arbab@linux.ibm.com>
  Reviewed-by: Alistair Popple <alistair@popple.id.au>
  Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Split device index into brick and link index | Andrew Donnellan | 2018-09-17 | 1 | -3/+4
  On Witherspoon, OpenCAPI devices attached to link indexes 0 and 1 are handled by bricks 2 and 3. Rename index to brick_index, and add a new field, link_index, to refer to the link index. For now, we set those values identically.

  Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Acked-by: Reza Arbab <arbab@linux.ibm.com>
  Reviewed-by: Alistair Popple <alistair@popple.id.au>
  Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Use same compatible string for NVLink and OpenCAPI link nodes in device tree | Andrew Donnellan | 2018-07-03 | 1 | -0/+2
  Currently, we distinguish between NPU links for NVLink devices and OpenCAPI devices through the use of two different compatible strings - ibm,npu-link and ibm,npu-link-opencapi.

  As we move towards supporting configurations with both NVLink and OpenCAPI devices behind a single NPU, we need to detect the device type as part of presence detection, which can't happen until well after the point where the HDAT or platform code has created the NPU device tree nodes. Changing a node's compatible string after it's been created is a bit ugly, so instead we should move the device type to a new property which we can add to the node later on.

  Get rid of the ibm,npu-link-opencapi compatible string, add a new ibm,npu-link-type property, and a helper function to check the link type. Add an "unknown" device type in preparation for later patches to detect device type dynamically.

  These device tree bindings are entirely internal to skiboot and are not consumed directly by Linux, so this shouldn't break anything (other than internal BML lab environments).

  Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2-opencapi: Train links on fundamental reset | Frederic Barrat | 2018-06-01 | 1 | -0/+2
  Reorder our link training steps so that they are executed on fundamental reset instead of during the initial setup. Skiboot always calls a fundamental reset on all the PHBs during pci init. It is done through a state machine, similarly to what is done for 'real' PHBs.

  This is the first step for a longer term goal to be able to trigger an adapter reset from linux. We'll need the reset callbacks of the PHB to be defined. We have to handle the various delays differently, since a linux thread shouldn't stay stuck waiting in opal for too long.

  No functional changes.

  Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
  Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Remove DD1 support | Andrew Donnellan | 2018-03-22 | 1 | -1/+0
  Major changes in the NPU between DD1 and DD2 necessitated a fair bit of revision-specific code.

  Now that all our lab machines are DD2, we no longer test anything on DD1 and it's time to get rid of it.

  Remove DD1-specific code and abort probe if we're running on a DD1 machine.

  Cc: Alistair Popple <alistair@popple.id.au>
  Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
  Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Acked-By: Alistair Popple <alistair@popple.id.au>
  Acked-by: Reza Arbab <arbab@linux.vnet.ibm.com>
  Acked-by: Balbir Singh <bsingharora@gmail.com>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* npu2: Remove unused fields in struct npu2 | Andrew Donnellan | 2018-03-22 | 1 | -2/+0
  Trivial cleanup of two unused fields in struct npu2.

  Cc: Alistair Popple <alistair@popple.id.au>
  Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
  Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Acked-By: Alistair Popple <alistair@popple.id.au>
  Acked-by: Reza Arbab <arbab@linux.vnet.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* npu2-opencapi: Train OpenCAPI links and setup devices | Andrew Donnellan | 2018-03-01 | 1 | -1/+6
  Scan the OpenCAPI links under the NPU, and for each link, reset the card, set up a device, train the link and register a PHB.

  Implement the necessary operations for the OpenCAPI PHB type.

  For bringup, test and debug purposes, we allow an NVRAM setting, "opencapi-link-training", that can be set to either disable link training completely or to use the prbs31 test pattern.

  To disable link training:
    nvram -p ibm,skiboot --update-config opencapi-link-training=none
  To use prbs31:
    nvram -p ibm,skiboot --update-config opencapi-link-training=prbs31

  Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
  Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* npu2-hw-procedures: Add support for OpenCAPI PHY link training | Andrew Donnellan | 2018-03-01 | 1 | -0/+4
  Unlike NVLink, which uses the pci-virt framework to fake a PCI configuration space for NVLink devices, the OpenCAPI device model presents us with a real configuration space handled by the device over the OpenCAPI link.

  As a result, we have to train the OpenCAPI link in skiboot before we do PCI probing, so that config space can be accessed, rather than having link training being triggered by the Linux driver.

  Add some helper functions to wrap the existing NVLink PHY training sequence so we can easily run it within skiboot. Additionally, we add OpenCAPI-specific lane settings, and a function to "bump" lanes that haven't trained properly (this process isn't documented in the workbook, but the hardware experts assure us that this improves link training reliability...)

  We also support the PRBS31 pattern that's used for bringup and test purposes.

  Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
  Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
  Acked-by: Reza Arbab <arbab@linux.vnet.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* npu2-opencapi: Configure NPU for OpenCAPI | Andrew Donnellan | 2018-03-01 | 1 | -0/+2
  Scan the device tree for NPUs with OpenCAPI links and configure the NPU per the initialisation sequence in the NPU OpenCAPI workbook.

  Training of individual links and setup of per-AFU/link configuration will be in a later patch.

  Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
  Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* npu2: Rework NPU data structures for OpenCAPI | Andrew Donnellan | 2018-03-01 | 1 | -19/+58
  Unlike NVLink, OpenCAPI registers a separate PHB for each device, in order to allow us to force Linux to use the correct MMIO windows for each NPU link. This requires some reworking of NPU data structures to account for the fact that a PHB could correspond to either an NPU (NVLink) or a single link (OpenCAPI).

  At some later point, we may want to rework the NVLink code to present a separate PHB per device in order to simplify this. For now, we split NVLink-specific device data into a separate struct in order to make it clear which fields are NVLink-only.

  Additionally, add helper functions to correctly translate between OpenCAPI/NVLink PHBs and the underlying structures, and various fields for OpenCAPI data that we're going to need later on.

  Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Acked-by: Reza Arbab <arbab@linux.vnet.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* npu2: Split out common helper functions into separate file | Andrew Donnellan | 2018-03-01 | 1 | -0/+2
  Split out common helper functions for NPU register access into a separate file, as these will be used extensively by both NVLink and OpenCAPI code.

  Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Acked-by: Reza Arbab <arbab@linux.vnet.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* hw/npu2: support creset of npu2 devices | Balbir Singh | 2018-02-13 | 1 | -0/+1
  creset calls into the hw procedure that resets the PHY. We don't take the devices out of reset, we just put them in reset.

  Signed-off-by: Balbir Singh <bsingharora@gmail.com>
  Acked-by: Alistair Popple <alistair@popple.id.au>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* npu2: Print bdfn in NPU2DEV* logging macros | Reza Arbab | 2017-11-21 | 1 | -3/+8
  Revise the NPU2DEV{DBG,INF,ERR} logging macros to include the device's bdfn. It's useful to know exactly which link we're referring to.

  For instance, instead of

    [ 234.044921238,6] NPU6: Starting procedure reset_ntl
    [ 234.048578101,6] NPU6: Starting procedure reset_ntl
    [ 234.051049676,6] NPU6: Starting procedure reset_ntl
    [ 234.053503542,6] NPU6: Starting procedure reset_ntl
    [ 234.057182864,6] NPU6: Starting procedure reset_ntl
    [ 234.059666137,6] NPU6: Starting procedure reset_ntl

  we'll get

    [ 234.044921238,6] NPU6:0:0.0 Starting procedure reset_ntl
    [ 234.048578101,6] NPU6:0:0.1 Starting procedure reset_ntl
    [ 234.051049676,6] NPU6:0:0.2 Starting procedure reset_ntl
    [ 234.053503542,6] NPU6:0:1.0 Starting procedure reset_ntl
    [ 234.057182864,6] NPU6:0:1.1 Starting procedure reset_ntl
    [ 234.059666137,6] NPU6:0:1.2 Starting procedure reset_ntl

  Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
  Acked-by: Alistair Popple <alistair@popple.id.au>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
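As a rough illustration of how such a prefix can be derived, the sketch below splits a 16-bit bdfn using the standard PCI layout (8-bit bus, 5-bit device, 3-bit function). The function name and the exact format string are assumptions chosen to resemble the sample output above; the real NPU2DEV* macros may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format "NPU<phb>:<bus>:<dev>.<fn>" into buf.
 * Returns the number of characters snprintf() would have written.
 */
static int sketch_dev_prefix(char *buf, size_t len,
			     unsigned int phb_id, uint16_t bdfn)
{
	unsigned int bus = (bdfn >> 8) & 0xff;  /* bits 15..8 */
	unsigned int dev = (bdfn >> 3) & 0x1f;  /* bits 7..3 */
	unsigned int fn  = bdfn & 0x7;          /* bits 2..0 */

	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "NPU%u:%x:%x.%x", phb_id, bus, dev, fn);
}
```

For example, a bdfn of 0x0009 decodes to bus 0, device 1, function 1.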
* npu2: Remove unused npu2_dev struct members | Andrew Donnellan | 2017-11-19 | 1 | -3/+0
  There are a few members of struct npu2_dev that are completely unused. Remove them.

  Cc: Alistair Popple <alistair@popple.id.au>
  Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
  Cc: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
  Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Acked-by: Reza Arbab <arbab@linux.vnet.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* npu2: Move to new GPU memory map | Michael Neuling | 2017-11-15 | 1 | -0/+1
  There are three different ways we configure the MCD and memory map.
  1) Old way (current way)
     Skiboot configures the MCD and puts GPUs at 4TB and below
  2) New way with MCD
     Hostboot configures the MCD and skiboot puts GPU at 4TB and above
  3) New way without MCD
     No one configures the MCD and skiboot puts GPU at 4TB and below

  The patch keeps option 1 and adds options 2 and 3.

  The different configurations are detected using certain scoms (see patch).

  Option 1 will go away eventually as it's a configuration that can cause xstops or data integrity problems. We are keeping it around to support existing hostboot.

  Option 2 supports only 4 GPUs and 512GB of memory per socket.

  Option 3 supports 6 GPUs and 4TB of memory but may have some performance impact.

  Signed-off-by: Michael Neuling <mikey@neuling.org>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* npu2: Add npu2_write_mask_4b() | Reza Arbab | 2017-11-13 | 1 | -0/+1
  Add a 4-byte version of npu2_write_mask().

  Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
  Reviewed-by: Alistair Popple <alistair@popple.id.au>
  Reviewed-by: Balbir Singh <bsingharora@gmail.com>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* npu2: Implement FLR | Reza Arbab | 2017-09-12 | 1 | -0/+1
  Add basic handling of FLR (function level reset) by porting the changes from commit b74841db759d ("npu: Implement FLR") to npu2.

  The only difference for npu2 is that we track the reset state explicitly with a link flag instead of inferring it from dev->procedure_{status,number,step,data}.

  Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
  Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
  Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* npu2: Add npu2_clear_link_flag() | Reza Arbab | 2017-09-12 | 1 | -0/+1
  Add a complement to npu2_set_link_flag().

  Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* npu2: Use phys-map to get MMIO BARs | Andrew Donnellan | 2017-06-30 | 1 | -0/+4
  Commit bdea201a4c4b ("hw/npu2.c: Use phys-map to get GPU memory BARs") added use of phys-map for setting GPU memory BARs. Move the MMIO BARs over to using phys-map as well.

  Acked-by: Alistair Popple <alistair@popple.id.au>
  Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* NPU2: Add flag to nvlink config space indicating DL reset state | Alistair Popple | 2017-06-20 | 1 | -1/+6
  Device drivers need to be able to determine if the DL is out of reset or not so they can safely probe to see if links have already been trained. This patch adds a flag to the vendor specific config space indicating if the DL is out of reset.

  Signed-off-by: Alistair Popple <alistair@popple.id.au>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* hw/npu2-hw-procedures.c: Add nvram option to override zcal calculations | Alistair Popple | 2017-06-20 | 1 | -1/+1
  In some rare cases the zcal state machine may fail and flag an error. According to hardware designers it is sometimes ok to ignore this failure and use nominal values for the calculations. In this case we add an nvram variable (nv_zcal_override) which will cause skiboot to ignore the failure and use the nominal value specified in nvram.

  Signed-off-by: Alistair Popple <alistair@popple.id.au>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* npu2: Fix npu2_{read,write}_4b() | Reza Arbab | 2017-06-06 | 1 | -2/+2
  When writing or reading 4-byte values, we need to use the upper half of the 64-bit SCOM register.

  Fix npu2_{read,write}_4b() and their callers to use uint32_t, and appropriately shift the value being written or returned.

  Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
  Acked-by: Alistair Popple <alistair@popple.id.au>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
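The shifting involved can be illustrated with the sketch below, which models SCOM registers as a plain array. Everything here is a stand-in: the `sketch_` functions are not the skiboot implementations, and the masked variant only shows the usual read-modify-write pattern one would expect from a 4-byte write_mask helper.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NUM_REGS 4
static uint64_t sketch_scom[SKETCH_NUM_REGS];  /* stand-in for real SCOM registers */

/* A 4-byte value is stored in the upper half of the 64-bit register. */
static void sketch_write_4b(unsigned int reg, uint32_t val)
{
	assert(reg < SKETCH_NUM_REGS);
	sketch_scom[reg] = (uint64_t)val << 32;
}

/* Reading shifts the upper half back down to a uint32_t. */
static uint32_t sketch_read_4b(unsigned int reg)
{
	assert(reg < SKETCH_NUM_REGS);
	return (uint32_t)(sketch_scom[reg] >> 32);
}

/* Read-modify-write: only the bits selected by mask are replaced. */
static void sketch_write_mask_4b(unsigned int reg, uint32_t val, uint32_t mask)
{
	uint32_t cur = sketch_read_4b(reg);

	sketch_write_4b(reg, (cur & ~mask) | (val & mask));
}
```

Using uint32_t in the signatures keeps callers from accidentally passing a value that already occupies the upper half and having it shifted out of the register.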
* npu2: Fix BAR mapping for multiple chips | Alistair Popple | 2017-05-10 | 1 | -12/+2
  NPU2 BARs were being assigned and tracked with a global static array. This worked fine when there was only a single chip/NPU2 in the system. With multiple chips, however, the BAR management data structure ends up shared, which results in multiple chips getting assigned the same BAR addresses and other incorrect sharing of BAR properties.

  This patch splits the static and dynamic BAR configuration and stores the dynamic configuration in the per-NPU2 data structure.

  Signed-off-by: Alistair Popple <alistair@popple.id.au>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* npu2: Add hardware link training procedures | Alistair Popple | 2017-03-30 | 1 | -1/+8
  Unlike other system buses, the NVLink2 links need to be trained at runtime as training requires interaction from the GPU device drivers. This patch implements the required training procedures for NVLink2, which are different than the NVLink1 equivalents.

  Signed-off-by: Alistair Popple <alistair@popple.id.au>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* npu2: Allocate GPU memory and describe it in the dt | Reza Arbab | 2017-03-30 | 1 | -0/+2
  Allocate memory for the GPU vidmem aperture and create "memory@" dt nodes to describe GPU memory with a phandle in each pointing to the emulated PCI device.

  Also provide the compressed 47-bit device address in "ibm,device-tgt-addr".

  Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
  Signed-off-by: Alistair Popple <alistair@popple.id.au>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* Introduce NPU2 support | Alistair Popple | 2017-03-30 | 1 | -0/+152
  NVLink2 is a new feature introduced on POWER9 systems. It is an evolution of the NVLink1 feature included in POWER8+ systems but adds several new features including support for GPU address translation using the Nest MMU and cache coherence.

  Similar to NVLink1, the functionality is exposed to the OS as a series of virtual PCIe devices. However, the actual hardware interfaces are significantly different, which limits the amount of common code that can be shared between implementations in the firmware.

  This patch adds basic hardware initialisation and exposure of the virtual NVLink2 PCIe devices to the running OS.

  Signed-off-by: Alistair Popple <alistair@popple.id.au>
  Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>