path: root/include
Commit message | Author | Age | Files | Lines
...
* sparse: Make tree 'constant is so big' warning clean | Stewart Smith | 2019-01-18 | 4 | -27/+27
| | | | Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Remove unused npu2_dev_nvlink::vendor_cap | Reza Arbab | 2019-01-16 | 1 | -3/+0
| | | | | | | | This variable is never used. Remove it. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Remove unused npu2_dev::procedure_data | Reza Arbab | 2019-01-16 | 1 | -1/+0
| | | | | | | | This variable is never used. Remove it. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Remove unused npu2::lxive_cache | Reza Arbab | 2019-01-16 | 1 | -1/+0
| | | | | | | | This variable is never used. Remove it. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Remove unused npu2::bdf2pe_cache | Reza Arbab | 2019-01-16 | 1 | -1/+0
| | | | | | | | | | This cache is written but never read. Wiring it up would gain us little (except added complexity), and it obviously hasn't been missed thus far, so remove it altogether. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* capp/phb4: Introduce PHB4 flag, PHB4_CAPP_DISABLE to disable CAPP | Vaibhav Jain | 2019-01-16 | 1 | -0/+1
| This patch introduces a PHB4 flag, PHB4_CAPP_DISABLE, and the scaffolding necessary to handle it during the CRESET flow. The flag is set when CAPP is requested to switch to PCIe mode via a call to phb4_set_capi_mode() with mode OPAL_PHB_CAPI_MODE_PCIE. This starts the sequence below, which ultimately ends in the newly introduced phb4_slot_sm_run_completed():
|
| 1. Set PHB4_CAPP_DISABLE in phb4->flags.
| 2. Start a CRESET on the phb slot. This also starts the opal pci reset state machine.
| 3. Wait for slot state to be PHB4_SLOT_CRESET_WAIT_CQ.
| 4. Perform CAPP recovery, as the PHB is still fenced, by calling do_capp_recovery_scoms().
| 5. Call the newly introduced 'disable_capi_mode()' to disable CAPP.
| 6. Wait for slot reset to complete while it transitions to PHB4_SLOT_FRESET and optionally to PHB4_SLOT_LINK_START.
| 7. Once slot reset is complete, the opal pci-core state machine will call slot->ops.completed_sm_run().
| 8. For PHB4 this branches to the newly introduced 'phb4_slot_sm_run_completed()'.
| 9. Inside this function we mark the CAPP as disabled and un-register the opal syncer phb4_host_sync_reset().
| 10. Optionally, if the slot reset was unsuccessful, disable fast-reboot.
|
| Notes:
|
| a. Function 'disable_capi_mode()' performs various sanity tests on CAPP to determine if it's OK to disable it and performs the xscoms necessary to disable it. However, the current implementation proposed in this patch is a skeleton one that just does sanity tests. A followup patch will be proposed that implements the xscoms necessary to disable CAPP.
| b. The sequence expects that the Opal PCI reset state machine makes forward progress, hence needs someone to call slot->ops.run_sm(). This can be either from phb4_host_sync_reset() or opal_pci_poll().
|
| Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
| Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
| Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
| Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* capp/phb: Introduce 'struct capp' to hold capp related info in 'struct phb' | Vaibhav Jain | 2019-01-16 | 3 | -1/+15
| Previously the struct proc_chip member 'capp_phb3_attached_mask' was used on Power-8 to keep track of the PHBs attached to the single CAPP on the chip. The CAPP on that chip supported a flexible PHB assignment scheme. Newer chips, however, only support a static assignment, i.e. a CAPP can only be attached to a specific PEC.
|
| Hence, instead of using 'proc_chip.capp_phb4_attached_mask' to manage CAPP <-> PEC assignments, which needs a global lock (capi_lock) to be updated, we introduce a new struct named 'capp', a pointer to which resides inside struct 'phb4'. Since updates to struct 'phb4' already happen in the context of phb_lock, this eliminates the need to use the mutex 'capi_lock' while updating 'capp_phb4_attached_mask'.
|
| This struct is also used to hold CAPP-specific variables such as a pointer to the 'struct phb' to which the CAPP is attached, 'capp_xscom_offset', which is the xscom offset to be added to CAPP registers in case there is more than one CAPP on the chip, 'capp_index', which is the index of the CAPP on the chip, and 'attached_pe', which is the process endpoint index to which the CAPP is attached. Finally, member 'chip_id' holds the chip-id that's used for performing xscom reads/writes.
|
| Also, new helpers named capp_xscom_read()/write() are introduced to make access to CAPP xscom registers easier.
|
| Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
| Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
| Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
| Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
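To make the 'struct capp' description more concrete, here is a minimal sketch of what such a structure and its xscom helpers might look like. Only the member names come from the commit message; the field types, the helper signatures, and the small in-memory array standing in for the xscom bus are assumptions for illustration and are not taken from the skiboot sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct phb;	/* opaque here; the real definition lives in skiboot */

/* Hypothetical layout: member names follow the commit message, types are guesses. */
struct capp {
	struct phb *phb;		/* PHB to which this CAPP is attached */
	unsigned int capp_index;	/* index of the CAPP on the chip */
	uint64_t capp_xscom_offset;	/* added to every CAPP register address */
	uint64_t attached_pe;		/* process endpoint index */
	uint32_t chip_id;		/* chip used for xscom reads/writes */
};

/* Stand-in for the xscom bus: a tiny register file indexed by address. */
#define FAKE_XSCOM_REGS 64
static uint64_t fake_xscom[FAKE_XSCOM_REGS];

/* Write a CAPP register, applying the per-CAPP offset in one place. */
static int capp_xscom_write(const struct capp *capp, uint64_t addr, uint64_t val)
{
	assert(capp != NULL);
	uint64_t real = addr + capp->capp_xscom_offset;

	if (real >= FAKE_XSCOM_REGS)
		return -1;
	fake_xscom[real] = val;
	return 0;
}

/* Read a CAPP register through the same offset calculation. */
static int capp_xscom_read(const struct capp *capp, uint64_t addr, uint64_t *val)
{
	assert(capp != NULL && val != NULL);
	uint64_t real = addr + capp->capp_xscom_offset;

	if (real >= FAKE_XSCOM_REGS)
		return -1;
	*val = fake_xscom[real];
	return 0;
}
```

Keeping the offset inside the helpers means callers can use the same register address for every CAPP on a chip and let the per-CAPP state select the right instance.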
* core/pci: Introduce a new pci_slot_op named completed_sm_run() | Vaibhav Jain | 2019-01-16 | 1 | -0/+1
| At times we need to perform some cleanup activities when the Opal PCI state machine that performs creset/freset/hreset of a slot (driven by pci_slot_ops->run_sm()) completes. One example is marking the CAPP attached to a PHB as deactivated when the creset/freset of a CAPI card slot is completed.
|
| However, the calls to pci_slot_ops->run_sm() are scattered throughout the code, and patching each call site to check the return value and perform custom cleanup tasks is difficult.
|
| Hence this patch introduces a new pci_slot_ops member named completed_sm_run(), which should be called when pci_slot_ops->run_sm() determines that the reset state machine is complete. This provides a more centralized way to handle slot related cleanup activities.
|
| Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
| Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
| Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
| Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
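The idea of a completion hook can be sketched as follows. This is an illustration only: the wrapper name pci_slot_run_sm_sketch(), the callback signatures, the return-value convention (positive means "still in progress") and the demo callbacks are all assumptions, not the actual skiboot definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pci_slot;

/* Hypothetical subset of the slot operations table. */
struct pci_slot_ops {
	int64_t (*run_sm)(struct pci_slot *slot);	/* advance the reset state machine */
	void (*completed_sm_run)(struct pci_slot *slot, int64_t rc);	/* optional cleanup hook */
};

struct pci_slot {
	struct pci_slot_ops ops;
	int steps_left;		/* demo state: remaining state machine steps */
	int cleanups;		/* demo state: how often the hook ran */
};

/*
 * Single wrapper that every caller uses instead of invoking run_sm directly.
 * Assumed convention: rc > 0 means "call again later", rc <= 0 means done
 * (success or error), at which point the hook fires exactly once.
 */
static int64_t pci_slot_run_sm_sketch(struct pci_slot *slot)
{
	assert(slot != NULL && slot->ops.run_sm != NULL);
	int64_t rc = slot->ops.run_sm(slot);

	if (rc <= 0 && slot->ops.completed_sm_run)
		slot->ops.completed_sm_run(slot, rc);
	return rc;
}

/* Demo callbacks standing in for a PHB-specific implementation. */
static int64_t demo_run_sm(struct pci_slot *slot)
{
	if (--slot->steps_left > 0)
		return 1;	/* still resetting */
	return 0;		/* reset finished */
}

static void demo_completed(struct pci_slot *slot, int64_t rc)
{
	(void)rc;
	slot->cleanups++;	/* e.g. mark an attached CAPP as disabled */
}
```

With the hook in the ops table, call sites do not need to know which cleanup a particular PHB type requires.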
* opal: Update opal_del_host_sync_notifier() to accept 'void *data' | Vaibhav Jain | 2019-01-16 | 1 | -1/+1
| The current implementation of opal_del_host_sync_notifier() will only delete the first entry with a matching 'notify' callback found in the opal_syncers list, irrespective of the 'data' of the list node. This is problematic when multiple notifiers with the same callback function but different 'data' are registered. In this case, when the cleanup code calls opal_del_host_sync_notifier(), it cannot be sure that the correct opal_syncer is removed.
|
| Hence this patch updates the function to accept a new argument named 'void *data', which is then used while iterating over the opal_syncers list to remove only the first node whose 'notify' callback and 'data' both match.
|
| Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
| Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
| Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
| Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
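The matching rule described here (remove the first node where both the callback and its data match) can be illustrated with a small singly linked list. The names add_sync_notifier(), del_sync_notifier() and struct sync_entry are hypothetical; the real skiboot code uses its own list helpers and types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef bool (*sync_fn)(void *data);

/* Hypothetical list node: one registered notifier. */
struct sync_entry {
	struct sync_entry *next;
	sync_fn notify;
	void *data;
};

static struct sync_entry *syncers;	/* head of the notifier list */

/* Register a notifier at the head of the list. */
static void add_sync_notifier(sync_fn notify, void *data)
{
	struct sync_entry *ent = calloc(1, sizeof(*ent));

	assert(ent != NULL);
	ent->notify = notify;
	ent->data = data;
	ent->next = syncers;
	syncers = ent;
}

/* Remove the first node whose callback AND data both match. */
static bool del_sync_notifier(sync_fn notify, void *data)
{
	struct sync_entry **pp;

	for (pp = &syncers; *pp; pp = &(*pp)->next) {
		struct sync_entry *ent = *pp;

		if (ent->notify == notify && ent->data == data) {
			*pp = ent->next;	/* unlink before freeing */
			free(ent);
			return true;
		}
	}
	return false;	/* nothing matched */
}

/* Demo callback shared by several registrations. */
static bool demo_sync(void *data)
{
	(void)data;
	return true;
}
```

Matching on the callback alone would remove whichever registration happens to come first, which is the ambiguity the commit message describes.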
* Revert "npu2: Allow ATSD for LPAR other than 0" | Stewart Smith | 2018-12-12 | 1 | -2/+0
| | | | | | | This reverts commit d8b161f4b361f70a7bb43be47d4a32b8f937287a. As discussed on list, a bit premature to merge, removing for now. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* Add purging CPU L2 and L3 caches into NPU hreset. | Rashmica Gupta | 2018-12-10 | 1 | -0/+11
| If a GPU is passed through to a guest and the guest unexpectedly terminates, there can be cache lines in CPUs that belong to the GPU. So purge the caches as part of the reset sequence. L1 is write through, so doesn't need to be purged.
|
| The sequence to purge the L2 and L3 caches from the hw team:
|
| "L2 purge:
| (1) initiate purge
|     putspy pu.ex EXP.L2.L2MISC.L2CERRS.PRD_PURGE_CMD_TYPE L2CAC_FLUSH -all
|     putspy pu.ex EXP.L2.L2MISC.L2CERRS.PRD_PURGE_CMD_TRIGGER ON -all
| (2) check this is off in all caches to know purge completed
|     getspy pu.ex EXP.L2.L2MISC.L2CERRS.PRD_PURGE_CMD_REG_BUSY -all
| (3) putspy pu.ex EXP.L2.L2MISC.L2CERRS.PRD_PURGE_CMD_TRIGGER OFF -all
|
| L3 purge:
| 1) Start the purge:
|     putspy pu.ex EXP.L3.L3_MISC.L3CERRS.L3_PRD_PURGE_TTYPE FULL_PURGE -all
|     putspy pu.ex EXP.L3.L3_MISC.L3CERRS.L3_PRD_PURGE_REQ ON -all
| 2) Ensure that the purge has completed by checking the status bit:
|     getspy pu.ex EXP.L3.L3_MISC.L3CERRS.L3_PRD_PURGE_REQ -all
|    You should see it say OFF if it's done:
|     p9n.ex k0:n0:s0:p00:c0 EXP.L3.L3_MISC.L3CERRS.L3_PRD_PURGE_REQ OFF"
|
| Suggested-by: Alistair Popple <alistair@popple.id.au>
| Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
| Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
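The L2 part of the sequence above follows a trigger, poll, clear pattern. The sketch below models it against a simulated cache so it can run anywhere; struct fake_l2, the poll limit, and the choice to clear the trigger even on timeout are assumptions for illustration, and none of the real register accesses are shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simulated L2 purge engine: stays busy for a fixed number of polls. */
struct fake_l2 {
	bool trigger;		/* models PRD_PURGE_CMD_TRIGGER */
	int busy_polls;		/* polls remaining before REG_BUSY drops */
};

/* Models reading PRD_PURGE_CMD_REG_BUSY. */
static bool l2_purge_busy(struct fake_l2 *c)
{
	if (c->busy_polls > 0) {
		c->busy_polls--;
		return true;
	}
	return false;
}

/* Returns 0 when the purge completed, -1 if it was still busy after max_polls. */
static int purge_l2_sketch(struct fake_l2 *c, int max_polls)
{
	assert(c != NULL);
	int rc = -1;

	c->trigger = true;			/* (1) initiate the purge */
	for (int i = 0; i < max_polls; i++) {	/* (2) wait for busy to clear */
		if (!l2_purge_busy(c)) {
			rc = 0;
			break;
		}
	}
	c->trigger = false;			/* (3) turn the trigger off again */
	return rc;
}
```

Bounding the poll loop keeps a stuck purge from hanging the reset path; what the caller should do on timeout is left open here.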
* npu2: Allow ATSD for LPAR other than 0 | Alexey Kardashevskiy | 2018-12-10 | 1 | -0/+2
| Each XTS MMIO ATSD# register is accompanied by another register - XTS MMIO ATSD0 LPARID# - which controls LPID filtering for ATSD transactions.
|
| When a host system passes a GPU through to a guest, we need to enable some ATSD for an LPAR. At the moment the host assigns one ATSD to an NVLink bridge, and this maps it to an LPAR when the GPU is assigned to the LPAR. The link number is used as the ATSD index.
|
| ATSD6&7 stay mapped to the host (LPAR=0) all the time, which seems to be an acceptable price for the simplicity.
|
| Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2-opencapi: Log ODL endpoint information register | Frederic Barrat | 2018-11-28 | 1 | -0/+5
| | | | | | | | | | | If the link trains in degraded mode, log the ODL endpoint information register for debug. Its content is specific to the DLx and TLx implementation, so this is really information useful for the hardware team. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2-opencapi: Detect if link trained in degraded mode | Frederic Barrat | 2018-11-28 | 1 | -0/+2
| | | | | | | | | | | There's no status readily available to tell the effective link width. Instead, we have to look at the individual status of each lane, on the transmit and receive direction. All relevant information is in the ODL status register. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* Warn on long OPAL calls | Stewart Smith | 2018-11-21 | 1 | -0/+1
| | | | | | | | Measure entry/exit time for OPAL calls and warn appropriately if the calls take too long (>100ms gets us a DEBUG log, > 1000ms gets us a warning). Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
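The two thresholds in this message translate into a simple classification of the measured duration. The sketch below works in milliseconds directly; the enum and function names are hypothetical, and the real code would measure entry/exit in timebase ticks and call the firmware's logging helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical log levels for a completed OPAL call. */
enum call_log_level {
	CALL_LOG_NONE,		/* fast enough, say nothing */
	CALL_LOG_DEBUG,		/* took more than 100ms */
	CALL_LOG_WARNING	/* took more than 1000ms */
};

/* Classify a call from its entry and exit timestamps (both in ms). */
static enum call_log_level classify_opal_call(uint64_t entry_ms, uint64_t exit_ms)
{
	assert(exit_ms >= entry_ms);
	uint64_t elapsed = exit_ms - entry_ms;

	if (elapsed > 1000)
		return CALL_LOG_WARNING;
	if (elapsed > 100)
		return CALL_LOG_DEBUG;
	return CALL_LOG_NONE;
}
```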
* Add the other 7 ATSD registers to the device tree. | Rashmica Gupta | 2018-11-18 | 1 | -0/+2
| | | | | | | Suggested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* phb4: Update & cleanup register definitions | Benjamin Herrenschmidt | 2018-11-08 | 1 | -22/+9
| | | | | | | | | | We had a bunch of remaining definitions for registers that don't actually exist in PHB4 anymore (copied from PHB3). This removes them along with a handful of minor style cleanups Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* phb4/capp: Only reset FIR bits that cause capp machine check | Vaibhav Jain | 2018-11-01 | 1 | -0/+1
| | | | | | | | | | | | | | | | | | During CAPP recovery do_capp_recovery_scoms() will reset the CAPP Fir register just after CAPP recovery is completed. This has an unintentional side effect of preventing PRD from analyzing and reporting this error. If PRD tries to read the CAPP FIR after opal has already reset it, then it logs a critical error complaining "No active error bits found". To prevent this from happening we update do_capp_recovery_scoms() to only reset fir bits that cause CAPP machine check (local xstop). This is done by reading the CAPP Fir Action0/1 & Mask registers and generating a mask which is then written on CAPP_FIR_CLEAR register. Cc: stable Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* phb4: Check for RX errors after link training | Oliver O'Halloran | 2018-11-01 | 2 | -0/+4
| Some PHB4 PHYs can get stuck in a bad state where they are constantly retraining the link. This happens transparently to skiboot and Linux but will cause PCIe to be slow. Resetting the PHB4 clears the problem.
|
| We can detect this case by looking at the RX errors count where we check for link stability. This patch does this by modifying the link optimal code to check for RX errors. If errors are occurring we retrain the link irrespective of the chip rev or card.
|
| Normally when this problem occurs, the RX error count is maxed out at 255. When there is no problem, the count is 0. We chose 8 as the max rx errors value to give us some margin for a few errors. There is also a knob that can be used to set the error threshold for when we should retrain the link, i.e.:
|
|     nvram -p ibm,skiboot --update-config phb-rx-err-max=8
|
| Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
| Signed-off-by: Michael Neuling <mikey@neuling.org>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
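The retrain decision described here reduces to comparing the RX error count against a configurable limit. The following sketch uses hypothetical names, and whether a count exactly equal to the limit triggers a retrain is an assumption; only the default of 8 and the 0 / 255 typical values come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Default threshold from the commit message; overridable via phb-rx-err-max. */
#define RX_ERR_MAX_DEFAULT 8

/*
 * Decide whether the link should be retrained.
 * Assumption: a count at or above the limit counts as "too many".
 */
static bool rx_errors_need_retrain(uint8_t rx_err_count, unsigned int rx_err_max)
{
	assert(rx_err_max <= 255);	/* the hardware counter saturates at 255 */
	return rx_err_count >= rx_err_max;
}
```

A healthy link reports 0 errors and a stuck PHY reports 255, so any limit in between separates the two cases with margin on both sides.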
* npu2-opencapi: Enable presence detection on ZZ | Frederic Barrat | 2018-10-25 | 1 | -1/+0
| Presence detection for opencapi adapters was broken for ZZ planars v3 and below. All ZZ systems currently used in the lab have had their planar upgraded, so we can now remove the override we had to force presence and activate presence detection, which should improve boot time.
|
| Considering the state of opal support on ZZ, this is really only for lab usage on BML. The opencapi enablement team has okay'd the change.
|
| In the unlikely case somebody tries opencapi on an old ZZ, the presence detection through i2c will show that no adapter is present and skiboot won't try to access or train the link.
|
| Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
| Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* phb4/capp: Update the expected Eye-catcher for CAPP ucode lid | Vaibhav Jain | 2018-10-16 | 1 | -1/+1
| Currently on an FSP based P9 system, load_capp_ucode() expects the CAPP ucode lid header to have an eye-catcher magic of 'CAPPPSLL'. However, skiboot currently supports CAPP ucode-only lids, which have an eye-catcher magic of 'CAPPLIDH'. This prevents skiboot from loading the ucode, with this error message:
|
|     CAPP: ucode header invalid
|
| We fix this issue by updating load_capp_ucode() to use the eye-catcher value of 'CAPPLIDH' instead of 'CAPPPSLL'.
|
| Cc: stable
| Fixes: e50764d4f2b1("capi: Load capp microcode")
| Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
| Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
| Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* platform: Restructure bmc_platform type | Andrew Jeffery | 2018-10-11 | 2 | -2/+15
| Segregate the BMC platform configuration into hardware and software components. This allows population of platform default values for hardware configuration that may no longer be accessible by the host.
|
| Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
| [stewart: fixup pci-quirk unit test]
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* lpc: Introduce generic probe capability | Andrew Jeffery | 2018-10-11 | 1 | -4/+10
| | | | | | | | | | | | | | Introduce generic read and write probe functions that allow detection of valid addresses by way of synchronous testing for the SYNC no-response state. If the no-response state is detected the probe functions will return an error to the caller, who can do with it what they wish. In the process, rip out the naive mechanism for muting the equivalent asynchronous error logging (regretfully introduced recently by yours truly). Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* astbmc: Remove coordinated isolation support | Andrew Jeffery | 2018-10-11 | 1 | -2/+0
| | | | | Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* astbmc: Prefer ipmi-hiomap for PNOR access | Andrew Jeffery | 2018-10-11 | 1 | -2/+2
| | | | | | | | | If the IPMI command is not available, fall back to the mailbox interface. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> [stewart: fix up mbox test] Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* libflash: Add ipmi-hiomap | Andrew Jeffery | 2018-10-11 | 3 | -17/+87
| ipmi-hiomap implements the PNOR access control protocol formerly known as "the mbox protocol" but uses IPMI instead of the AST LPC mailbox as a transport. As there is no longer any mailbox involved in this alternate implementation the old protocol name is quite misleading, and so it has been renamed to "the hiomap protocol" (Host I/O Mapping protocol). The same commands and events are used, though this client-side implementation assumes v2 of the protocol is supported by the BMC.
|
| The code is a heavily-reworked copy of the mbox-flash source and is introduced this way to allow for the mbox implementation's eventual removal.
|
| mbox-flash should in theory be renamed to mbox-hiomap for consistency, but as it is on life-support effective immediately we may as well just remove it entirely when the time is right.
|
| Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
| [stewart: prlog debug over prerror for mbox fallback, fix indent]
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* ipmi: Introduce registration for SEL command handlers | Andrew Jeffery | 2018-10-10 | 1 | -0/+5
| | | | | Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/p8-i2c: Fix i2c request timeout | Frederic Barrat | 2018-09-27 | 1 | -1/+1
| | | | | | | | | | | | | | | | Commit eb146fac9685 ("core/i2c: Move the timeout field into i2c_request") simplified a bit how a request timeout is handled. However there's now some confusion between milliseconds and timebase increments when defining or using the timeout values, which breaks i2c requests made for opencapi, and probably others too. This patch declares all the timeout in milliseconds and just converts to timebase at the end of the chain, as needed. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
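The fix keeps every stored timeout in milliseconds and converts to timebase ticks only where a deadline is computed. A minimal sketch of that convention follows; the 512 MHz timebase frequency, the struct, and the function names are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: the timebase ticks at 512 MHz. */
#define TB_HZ_SKETCH 512000000ULL

/* Convert milliseconds to timebase ticks. */
static uint64_t msecs_to_tb_sketch(uint64_t ms)
{
	return ms * (TB_HZ_SKETCH / 1000);
}

/* Hypothetical request: the timeout is always stored in milliseconds. */
struct i2c_request_sketch {
	uint64_t timeout_ms;
};

/* Convert once, at the end of the chain, when the deadline is armed. */
static uint64_t i2c_request_deadline(const struct i2c_request_sketch *req, uint64_t now_tb)
{
	assert(req != NULL);
	return now_tb + msecs_to_tb_sketch(req->timeout_ms);
}
```

Because only one function performs the unit conversion, a value can no longer be interpreted as milliseconds in one place and as ticks in another.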
* fast-reboot: verify firmware "romem" checksum | Nicholas Piggin | 2018-09-20 | 1 | -0/+3
| This takes a checksum of skiboot memory after boot that should be unchanged during OS operation, and verifies it before allowing a fast reboot.
|
| This is not read-only memory from skiboot's point of view, because it includes things like the opal branch table that gets populated during boot.
|
| This helps to improve the integrity of firmware against host and runtime firmware memory scribble bugs.
|
| Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
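The record-then-verify approach can be sketched as below. The rotate-and-xor checksum, the function names, and the explicit region arguments are all assumptions for illustration; the commit message does not say which checksum skiboot uses or how the region is delimited.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simple rotate-and-xor checksum; a stand-in, not skiboot's algorithm. */
static uint64_t region_csum(const uint8_t *start, size_t len)
{
	uint64_t csum = 0;

	assert(start != NULL || len == 0);
	for (size_t i = 0; i < len; i++)
		csum = ((csum << 1) | (csum >> 63)) ^ start[i];
	return csum;
}

static uint64_t romem_csum_at_boot;

/* Called once boot has finished populating the region. */
static void record_romem_csum(const uint8_t *start, size_t len)
{
	romem_csum_at_boot = region_csum(start, len);
}

/* Called before a fast reboot: refuse if the region was scribbled on. */
static bool fast_reboot_allowed(const uint8_t *start, size_t len)
{
	return region_csum(start, len) == romem_csum_at_boot;
}
```

Taking the reference checksum after boot rather than at build time is what lets the region include data that is written during initialisation and constant afterwards.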
* skiboot.lds.S: move read-write data after the end of symbol map | Nicholas Piggin | 2018-09-20 | 1 | -0/+7
| | | | | | | | This also tidies up linker script symbol declarations and adds _rodata_mem symbol for the next change to use. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* Add fast-reboot property to /ibm,opal DT node | Stewart Smith | 2018-09-18 | 1 | -0/+1
| This means that if fast reboot is permanently disabled on boot, the test suite can pick that up and not try a fast reboot test.
|
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* errorlog: Rename PHB3 to just PHB | Russell Currey | 2018-09-17 | 1 | -2/+2
| | | | | | | | | | | | | I don't see a reason why there would need to be a PHB3 *specific* subsystem in the error logs, so rename it to PHB so that PHB4 and later can use it too without continually redefining it. This shouldn't change any existing assumptions because it's unused. Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* core/i2c: Remove bus specific alloc and free callbacks | Oliver O'Halloran | 2018-09-17 | 1 | -12/+0
| | | | | | | These are now pointless and they can be replaced with zalloc() and free(). Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* core/i2c: Move the timeout field into i2c_request | Oliver O'Halloran | 2018-09-17 | 1 | -9/+1
| Currently to set a per-request timeout you need to use i2c_req_set_timeout(), which is a wrapper for a per-bus method that sets the actual timeout. This design doesn't make a whole lot of sense, so move the timeout field into the generic i2c_request structure and have callers set the timeout through that field instead.
|
| Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/npu2, platform: Restructure OpenCAPI i2c reset/presence pins | Andrew Donnellan | 2018-09-17 | 1 | -4/+8
| | | | | | | | | | | | | | | | | | | | | | In platform_ocapi, we define i2c_{reset,presence}_odl{0,1} to specify the appropriate reset/presence GPIO pins for devices connected to ODL0 and ODL1 respectively. This is obviously wrong, because a device connected to brick 2 and a device connected to brick 4 are going to be different devices connected to different I2C pins, but rather conveniently we haven't had to deal with systems that can use the full 4 bricks as yet. Now that we're adding OpenCAPI support for Witherspoon, we should change this to specify pins separately for all 4 bricks. Replace i2c_{reset,presence}_odl{0,1} with i2c_{reset,presence}_brick{2,3,4,5} and update the presence detection code, device reset code, and existing platforms accordingly. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/npu2, platform: Add NPU2 platform device detection callback | Andrew Donnellan | 2018-09-17 | 2 | -2/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no standardised way to determine the presence and type of devices connected to an NPU on POWER9. Currently, we hardcode device types based on platform type (as no platform currently supports both OpenCAPI and NVLink), and for OpenCAPI platforms we use I2C to detect presence. Witherspoon (and potentially other platforms later on) supports both NVLink and OpenCAPI, and additionally uses SXM2 connectors which can carry more than one link, rather than the SlimSAS connectors used for OpenCAPI on Zaius and ZZ. This necessitates some special handling. Add a platform callback for NPU device detection. In a later patch, we will use this to implement Witherspoon-specific device detection. For now, add a Witherspoon stub that sets all links to NVLink (i.e. current behaviour). Move the existing I2C-based presence detection for OpenCAPI devices on Zaius/ZZ into common code, which we use by default for platforms which do not define a callback. Clean up the use of the ibm,npu-link-type property, which will now be exposed solely for debugging and not consumed internally. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/npu2: Common NPU2 init routine between NVLink and OpenCAPI | Andrew Donnellan | 2018-09-17 | 2 | -1/+5
| | | | | | | | | | | | | | | Replace probe_npu2() and probe_npu2_opencapi() with a new shared probe_npu2(). Refactor some of the common NPU setup code into shared code. No functional change. This patch does not implement support for using both types of devices simultaneously on the same NPU - we expect to add this sometime in the future. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Alistair Popple <alistair@popple.id.au> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Split device index into brick and link index | Andrew Donnellan | 2018-09-17 | 2 | -10/+11
| | | | | | | | | | | | | | On Witherspoon, OpenCAPI devices attached to link indexes 0 and 1 are handled by bricks 2 and 3. Rename index to brick_index, and add a new field, link_index, to refer to the link index. For now, we set those values identically. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Alistair Popple <alistair@popple.id.au> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* lock: Fix interactions between lock dependency checker and stack checker | Benjamin Herrenschmidt | 2018-08-16 | 2 | -0/+6
| The lock dependency checker does a few nasty things that can cause re-entrancy deadlocks in conjunction with the stack checker or in fact other debug tests.
|
| A lot of it revolves around taking a new lock (dl_lock) as part of the locking process.
|
| This tries to fix it by making sure we do not hit the stack checker while holding dl_lock.
|
| We achieve that in part by directly using the low-level __try_lock and manually unlocking on the dl_lock, and making some functions "nomcount".
|
| In addition, we mark the dl_lock as being in the console path to avoid deadlocks with the UART driver.
|
| We move the enabling of the deadlock checker to a separate config option from DEBUG_LOCKS as well, in case we choose to disable it by default later on.
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Add support for relaxed-ordering mode | Reza Arbab | 2018-08-06 | 4 | -4/+32
| | | | | | | | | | | | | | | | | | Some device drivers support out of order access to GPU memory. This does not affect the CPU view of memory but it does affect the GPU view of memory. It should only be enabled if the GPU driver has requested it. Add OPAL APIs allowing the driver to query relaxed ordering state or request it to be set for a device. Current hardware only allows relaxed ordering to be enabled per PCIe root port. So the code here doesn't enable relaxed ordering until it has been explicitly requested for every device on the port. Signed-off-by: Alistair Popple <alistair@popple.id.au> [arbab@linux.ibm.com: Rebase/refactor original changes] Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-By: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
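Because the hardware setting applies to a whole PCIe root port, the enable decision is an "all devices asked for it" check. The sketch below uses hypothetical types and names, and treating an empty port as "not enabled" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state under one PCIe root port. */
struct port_dev_sketch {
	bool wants_relaxed_ordering;	/* set when the driver requests it */
};

/* Enable relaxed ordering on the port only if every device has requested it. */
static bool port_relaxed_ordering_enabled(const struct port_dev_sketch *devs, size_t n)
{
	assert(devs != NULL || n == 0);
	if (n == 0)
		return false;	/* assumption: nothing to enable for an empty port */
	for (size_t i = 0; i < n; i++) {
		if (!devs[i].wants_relaxed_ordering)
			return false;
	}
	return true;
}
```

A single device that has not opted in keeps the whole port in strict ordering, which matches the "explicitly requested for every device" rule in the commit message.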
* npu2: Don't open code NPU2_RELAXED_ORDERING_CFG2 | Reza Arbab | 2018-08-06 | 1 | -0/+2
| | | | | | | | | Make the code that initializes these registers more descriptive by using macros instead of open coded literals. No functional change. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-By: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Add NPU2_SM_REG_OFFSET() | Reza Arbab | 2018-08-06 | 1 | -0/+4
| | | | | | | | | | | Add a register offset calculation macro using SM block index, similar to the other NPU2_*_REG_OFFSET() macros. Signed-off-by: Alistair Popple <alistair@popple.id.au> [arbab@linux.ibm.com: Rebase/refactor original changes] Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-By: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* phb4: Track PEC index in dt and phb4 struct | Reza Arbab | 2018-08-06 | 1 | -0/+1
| | | | | | | | | | | Knowing the PEC index is going to be important when we enable relaxed ordering, so store this value for later use. Signed-off-by: Alistair Popple <alistair@popple.id.au> [arbab@linux.ibm.com: Rebase/refactor original changes] Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-By: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* pci: Move logging macros to pci.h | Reza Arbab | 2018-08-06 | 1 | -0/+21
| | | | | | | | | Move the PCI{TRACE,DBG,NOTICE,ERR} logging macros from pci.c to pci.h so they can be used in other files. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* phb4: Reallocate PEC2 DMA-Read engines to improve GPU-Direct bandwidth | Vaibhav Jain | 2018-07-19 | 1 | -0/+2
| We allocate an additional 16/8 DMA-Read engines to stack0/1 on PEC2, respectively. This is needed to improve the bandwidth available to the Mellanox CX5 adapter when trying to read GPU memory (GPU-Direct).
|
| If the kernel cxl driver indicates a request to allocate the maximum possible DMA read engines when calling enable_capi_mode(), and the card is attached to the PEC2/stack0 slot, then we assume it's a Mellanox CX5 adapter. We then allocate 16/8 extra DMA read engines to stack0 and stack1 respectively on PEC2. This is done by populating the XPEC_PCI_PRDSTKOVR and XPEC_NEST_READ_STACK_OVERRIDE registers as suggested by the h/w team.
|
| Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
| Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
| Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* phb4: Disable nodal scoped DMA accesses when PB pump mode is enabled | Alistair Popple | 2018-07-17 | 1 | -0/+2
| | | | | | | | | | | | | | | | | | | By default when a PCIe device issues a read request via the PHB it is first issued with nodal scope. When accessing GPU memory the NPU does not know at the time of response if the requested memory page is off node or not. Therefore every read of GPU memory by a PHB is retried with larger scope which introduces bandwidth and latency issues. On smaller boxes which have pump mode enabled nodal and group scoped reads are treated the same and both types of request are broadcast to one chip. Therefore we can avoid the retry by disabling nodal scope on the PHB for these boxes. On larger boxes nodal (single chip) and group (multiple chip) scoped reads are treated differently. Therefore we avoid disabling nodal scope on large boxes which have pump mode disabled to avoid all PHB requests being broadcast to multiple chips. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* Move pb_cen_hp_mode_curr register definition to xscom-p9-reg.h | Alistair Popple | 2018-07-17 | 2 | -2/+4
| | | | | | | | Currently it is defined in npu2-regs.h but needs to be used by other files as well so move it somewhere generic. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2/hw-procedures: Enable parity and credit overflow checks | Reza Arbab | 2018-07-17 | 1 | -0/+4
| | | | | | | | | | | | Enable these error checking features by setting the appropriate bits in our one-off initialization of each "NTL Misc Config 2" register. The exception is NDL RX parity checking, which should be disabled during the link training procedures. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2/hw-procedures: Don't open code NPU2_NTL_MISC_CFG2_BRICK_ENABLE | Reza Arbab | 2018-07-17 | 1 | -0/+1
| | | | | | | | | Name this bit properly. There's a lot more cleanup like this to be done, but I'm catching this one now as part of some related changes. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* lpc: Silence LPC SYNC no-response error when necessary | Andrew Jeffery | 2018-07-17 | 2 | -0/+4
| Add the ability to silence particular errors from the LPC bus when they can be expected, particularly:
|
|     LPC[000]: Got SYNC no-response error. Error address reg: 0xd001002f
|
| This is necessary on platform exit on some astbmc machines to avoid unnecessary noise in the msglog.
|
| Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>