| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
| |
This variable is never used. Remove it.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
| |
This variable is never used. Remove it.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
| |
This variable is never used. Remove it.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
| |
This cache is written but never read. Wiring it up would gain us little
(except added complexity), and it obviously hasn't been missed thus far,
so remove it altogether.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces a PHB4 flag PHB4_CAPP_DISABLE and scaffolding
necessary to handle it during CRESET flow. The flag is set when CAPP
is request to switch to PCIe mode via call to phb4_set_capi_mode()
with mode OPAL_PHB_CAPI_MODE_PCIE. This starts the below sequence that
ultimately ends in newly introduced phb4_slot_sm_run_completed()
1. Set PHB4_CAPP_DISABLE to phb4->flags.
2. Start a CRESET on the phb slot. This also starts the opal pci reset
state machine.
3. Wait for slot state to be PHB4_SLOT_CRESET_WAIT_CQ.
4. Perform CAPP recovery as PHB is still fenced, by calling
do_capp_recovery_scoms().
5. Call newly introduced 'disable_capi_mode()' to disable CAPP.
6. Wait for slot reset to complete while it transitions to
PHB4_SLOT_FRESET and optionally to PHB4_SLOT_LINK_START.
7. Once slot reset is complete opal pci-core state machine will call
slot->ops.completed_sm_run().
8. For PHB4 this branches newly introduced 'phb4_slot_sm_run_completed()'.
9. Inside this function we mark the CAPP as disabled and un-register
the opal syncer phb4_host_sync_reset().
10. Optionally if the slot reset was unsuccessful disable
fast-reboot.
****************************
Notes:
****************************
a. Function 'disable_capi_mode()' performs various sanity tests on CAPP to
to determine if its ok to disable it and perform necessary xscoms
to disable it. However the current implementation proposed in this
patch is a skeleton one that just does sanity tests. A followup patch
will be proposed that implements the xscoms necessary to disable CAPP.
b. The sequence expects that Opal PCI reset state machine makes
forward progress hence needs someone to call slot->ops.run_sm(). This
can be either from phb4_host_sync_reset() or opal_pci_poll().
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously struct proc_chip member 'capp_phb3_attached_mask' was used
for Power-8 to keep track of PHB attached to the single CAPP on the
chip. CAPP on that chip supported a flexible PHB assignment
scheme. However since then new chips only support a static assignment
i.e a CAPP can only be attached to a specific PEC.
Hence instead of using 'proc_chip.capp_phb4_attached_mask' to manage
CAPP <-> PEC assignments which needs a global lock (capi_lock) to be
updated, we introduce a new struct named 'capp' a pointer to which
resides inside struct 'phb4'. Since updates to struct 'phb4' already
happen in context of phb_lock; this eliminates the
need to use mutex 'capi_lock' while updating
'capp_phb4_attached_mask'.
This struct is also used to hold CAPP specific variables such as
pointer to the 'struct phb' to which the CAPP is attached,
'capp_xscom_offset' which is the xscom offset to be added to CAPP
registers in case there are more than 1 on the chip, 'capp_index'
which is the index of the CAPP on the chip, and attached_pe' which is
the process endpoint index to which CAPP is attached. Finally member
'chip_id' holds the chip-id thats used for performing xscom
read/writes.
Also new helpers named capp_xscom_read()/write() are introduced to
make access to CAPP xscom registers easier.
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At times we need to perform some cleanup activities when the Opal PCI
state machine that perform creset/freset/hreset (driven by
pci_slot_ops->run_sm which) of a slot completes. One example can be to
mark CAPP attached to a PHB, as deactivated when creset/freset of a
CAPI card slot is completed.
However the calls to pci_slot_ops->run_sm() is scattered through out
the code and patching each call site to check for the return value and
perform custom cleanup tacks is difficult.
Hence this patch introduces a new pci_slot_ops named
completed_sm_run() which should be called when pci_slot_ops->run_sm()
determines that the reset state machine is complete. This provides a
more centralized way to handle slot related cleanup activities.
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current implementation of opal_del_host_sync_notifier() will only
delete the first entry of the 'notify' callback found from opal_syncers
list irrespective of the 'data' of list-node. This is problematic when
multiple notifiers with same callback function but different 'data'
are registered. In this case when the cleanup code will call
opal_del_host_sync_notifier() it cannot be sure if correct opal_syncer
is removed.
Hence this patch updates the function to accept a new argument named
'void *data' which is then used to iterates over the opal_syncers list
and only remove the first node node having the matching value for
'notify' callback as 'data'.
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
| |
This reverts commit d8b161f4b361f70a7bb43be47d4a32b8f937287a.
As discussed on list, a bit premature to merge, removing for now.
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a GPU is passed through to a guest and the guest unexpectedly terminates,
there can be cache lines in CPUs that belong to the GPU. So purge the caches
as part of the reset sequence. L1 is write through, so doesn't need to be purged.
The sequence to purge the L2 and L3 caches from the hw team:
"L2 purge:
(1) initiate purge
putspy pu.ex EXP.L2.L2MISC.L2CERRS.PRD_PURGE_CMD_TYPE L2CAC_FLUSH -all
putspy pu.ex EXP.L2.L2MISC.L2CERRS.PRD_PURGE_CMD_TRIGGER ON -all
(2) check this is off in all caches to know purge completed
getspy pu.ex EXP.L2.L2MISC.L2CERRS.PRD_PURGE_CMD_REG_BUSY -all
(3) putspy pu.ex EXP.L2.L2MISC.L2CERRS.PRD_PURGE_CMD_TRIGGER OFF -all
L3 purge:
1) Start the purge:
putspy pu.ex EXP.L3.L3_MISC.L3CERRS.L3_PRD_PURGE_TTYPE FULL_PURGE -all
putspy pu.ex EXP.L3.L3_MISC.L3CERRS.L3_PRD_PURGE_REQ ON -all
2) Ensure that the purge has completed by checking the status bit:
getspy pu.ex EXP.L3.L3_MISC.L3CERRS.L3_PRD_PURGE_REQ -all
You should see it say OFF if it's done:
p9n.ex k0:n0:s0:p00:c0
EXP.L3.L3_MISC.L3CERRS.L3_PRD_PURGE_REQ
OFF"
Suggested-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each XTS MMIO ATSD# register is accompanied by another register -
XTS MMIO ATSD0 LPARID# - which controls LPID filtering for ATSD
transactions.
When a host system passes a GPU through to a guest, we need to enable
some ATSD for an LPAR. At the moment the host assigns one ATSD to
a NVLink bridge and this maps it to an LPAR when GPU is assigned to
the LPAR. The link number is used for an ATSD index.
ATSD6&7 stay mapped to the host (LPAR=0) all the time which seems to be
acceptable price for the simplicity.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
| |
If the link trains in degraded mode, log the ODL endpoint information
register for debug. Its content is specific to the DLx and TLx
implementation, so this is really information useful for the hardware
team.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
| |
There's no status readily available to tell the effective link
width. Instead, we have to look at the individual status of each lane,
on the transmit and receive direction. All relevant information is in
the ODL status register.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
| |
Measure entry/exit time for OPAL calls and warn appropriately if the
calls take too long (>100ms gets us a DEBUG log, > 1000ms gets us a
warning).
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
| |
Suggested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
| |
We had a bunch of remaining definitions for registers that
don't actually exist in PHB4 anymore (copied from PHB3).
This removes them along with a handful of minor style cleanups
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During CAPP recovery do_capp_recovery_scoms() will reset the CAPP Fir
register just after CAPP recovery is completed. This has an
unintentional side effect of preventing PRD from analyzing and
reporting this error. If PRD tries to read the CAPP FIR after opal has
already reset it, then it logs a critical error complaining "No active
error bits found".
To prevent this from happening we update do_capp_recovery_scoms() to
only reset fir bits that cause CAPP machine check (local xstop). This
is done by reading the CAPP Fir Action0/1 & Mask registers and
generating a mask which is then written on CAPP_FIR_CLEAR register.
Cc: stable
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some PHB4 PHYs can get stuck in a bad state where they are constantly
retraining the link. This happens transparently to skiboot and Linux
but will causes PCIe to be slow. Resetting the PHB4 clears the
problem.
We can detect this case by looking at the RX errors count where we
check for link stability. This patch does this by modifying the link
optimal code to check for RX errors. If errors are occurring we
retrain the link irrespective of the chip rev or card.
Normally when this problem occurs, the RX error count is maxed out at
255. When there is no problem, the count is 0. We chose 8 as the max
rx errors value to give us some margin for a few errors. There is also
a knob that can be used to set the error threshold for when we should
retrain the link. ie
nvram -p ibm,skiboot --update-config phb-rx-err-max=8
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Presence detection for opencapi adapters was broken for ZZ planars v3
and below. All ZZ systems currently used in the lab have had their
planar upgraded, so we can now remove the override we had to force
presence and activate presence detection. Which should improve boot
time.
Considering the state of opal support on ZZ, this is really only for
lab usage on BML. The opencapi enablement team has okay'd the
change. In the unlikely case somebody tries opencapi on an old ZZ, the
presence detection through i2c will show that no adapter is present
and skiboot won't try to access or train the link.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently on a FSP based P9 system load_capp_code() expects CAPP ucode
lid header to have eye-catcher magic of 'CAPPPSLL'. However skiboot
currently supports CAPP ucode only lids that have a eye-catcher magic
of 'CAPPLIDH'. This prevents skiboot from loading the ucode with this
error message:
CAPP: ucode header invalid
We fix this issue by updating load_capp_ucode() to use the eye-catcher
value of 'CAPPLIDH' instead of 'CAPPPSLL'.
Cc: stable
Fixes: e50764d4f2b1("capi: Load capp microcode")
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
| |
Segregate the BMC platform configuration into hardware and software
components. This allows population of platform default values for
hardware configuration that may no-longer be accessible by the host.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
[stewart: fixup pci-quirk unit test]
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce generic read and write probe functions that allow detection of
valid addresses by way of synchronous testing for the SYNC no-response
state. If the no-response state is detected the probe functions will
return an error to the caller, who can do with it what they wish.
In the process, rip out the naive mechanism for muting the equivalent
asynchronous error logging (regretfully introduced recently by yours
truly).
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
| |
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
| |
If the IPMI command is not available, fall back to the mailbox
interface.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
[stewart: fix up mbox test]
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ipmi-hiomap implements the PNOR access control protocol formerly known
as "the mbox protocol" but uses IPMI instead of the AST LPC mailbox as a
transport. As there is no-longer any mailbox involved in this alternate
implementation the old protocol name is quite misleading, and so it has
been renamed to "the hiomap protoocol" (Host I/O Mapping protocol). The
same commands and events are used though this client-side implementation
assumes v2 of the protocol is supported by the BMC.
The code is a heavily-reworked copy of the mbox-flash source and is
introduced this way to allow for the mbox implementation's eventual
removal.
mbox-flash should in theory be renamed to mbox-hiomap for consistency,
but as it is on life-support effective immediately we may as well just
remove it entirely when the time is right.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
[stewart: prlog debug over prerror for mbox fallback, fix indent]
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
| |
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit eb146fac9685 ("core/i2c: Move the timeout field into
i2c_request") simplified a bit how a request timeout is
handled. However there's now some confusion between milliseconds and
timebase increments when defining or using the timeout values, which
breaks i2c requests made for opencapi, and probably others too.
This patch declares all the timeout in milliseconds and just converts
to timebase at the end of the chain, as needed.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This takes a checksum of skiboot memory after boot that should be
unchanged during OS operation, and verifies it before allowing a
fast reboot.
This is not read-only memory from skiboot's point of view, beause
it includes things like the opal branch table that gets populated
during boot.
This helps to improve the integrity of firmware against host and
runtime firmware memory scribble bugs.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
| |
This also tidies up linker script symbol declarations and adds
_rodata_mem symbol for the next change to use.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
| |
this means that if it's permanently disabled on boot, the test suite can
pick that up and not try a fast reboot test.
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I don't see a reason why there would need to be a PHB3 *specific*
subsystem in the error logs, so rename it to PHB so that PHB4 and
later can use it too without continually redefining it.
This shouldn't change any existing assumptions because it's unused.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
| |
These are now pointless and they can be replaced with zalloc() and free().
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Currently to set a per-request timeout you need to use
i2c_req_set_timeout() which is a wrapper for a per-bus method that sets the
actual timeout. This design doesn't make a whole lot of sense, so move
the timeout field into the generic i2c_request structure and set the
timeout to be set using that.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In platform_ocapi, we define i2c_{reset,presence}_odl{0,1} to specify the
appropriate reset/presence GPIO pins for devices connected to ODL0 and ODL1
respectively.
This is obviously wrong, because a device connected to brick 2 and a device
connected to brick 4 are going to be different devices connected to
different I2C pins, but rather conveniently we haven't had to deal with
systems that can use the full 4 bricks as yet. Now that we're adding
OpenCAPI support for Witherspoon, we should change this to specify pins
separately for all 4 bricks.
Replace i2c_{reset,presence}_odl{0,1} with
i2c_{reset,presence}_brick{2,3,4,5} and update the presence detection code,
device reset code, and existing platforms accordingly.
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is no standardised way to determine the presence and type of devices
connected to an NPU on POWER9.
Currently, we hardcode device types based on platform type (as no platform
currently supports both OpenCAPI and NVLink), and for OpenCAPI platforms
we use I2C to detect presence.
Witherspoon (and potentially other platforms later on) supports both
NVLink and OpenCAPI, and additionally uses SXM2 connectors which can carry
more than one link, rather than the SlimSAS connectors used for OpenCAPI on
Zaius and ZZ. This necessitates some special handling.
Add a platform callback for NPU device detection. In a later patch, we
will use this to implement Witherspoon-specific device detection. For now,
add a Witherspoon stub that sets all links to NVLink (i.e. current
behaviour).
Move the existing I2C-based presence detection for OpenCAPI devices on
Zaius/ZZ into common code, which we use by default for platforms which do
not define a callback. Clean up the use of the ibm,npu-link-type property,
which will now be exposed solely for debugging and not consumed internally.
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace probe_npu2() and probe_npu2_opencapi() with a new shared
probe_npu2(). Refactor some of the common NPU setup code into shared code.
No functional change. This patch does not implement support for using both
types of devices simultaneously on the same NPU - we expect to add this
sometime in the future.
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Witherspoon, OpenCAPI devices attached to link indexes 0 and 1 are
handled by bricks 2 and 3.
Rename index to brick_index, and add a new field, link_index, to
refer to the link index. For now, we set those values identically.
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The lock dependency checker does a few nasty things that can cause
re-entrancy deadlocks in conjunction with the stack checker or
in fact other debug tests.
A lot of it revolves around taking a new lock (dl_lock) as part
of the locking process.
This tries to fix it by making sure we do not hit the stack
checker while holding dl_lock.
We achieve that in part by directly using the low-level __try_lock
and manually unlocking on the dl_lock, and making some functions
"nomcount".
In addition, we mark the dl_lock as being in the console path to
avoid deadlocks with the UART driver.
We move the enabling of the deadlock checker to a separate config
option from DEBUG_LOCKS as well, in case we chose to disable it
by default later on.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some device drivers support out of order access to GPU memory. This does
not affect the CPU view of memory but it does affect the GPU view of
memory. It should only be enabled if the GPU driver has requested it.
Add OPAL APIs allowing the driver to query relaxed ordering state or
request it to be set for a device. Current hardware only allows relaxed
ordering to be enabled per PCIe root port. So the code here doesn't
enable relaxed ordering until it has been explicitly requested for every
device on the port.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
[arbab@linux.ibm.com: Rebase/refactor original changes]
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-By: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
| |
Make the code that initializes these registers more descriptive by
using macros instead of open coded literals. No functional change.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-By: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Add a register offset calculation macro using SM block index, similar to
the other NPU2_*_REG_OFFSET() macros.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
[arbab@linux.ibm.com: Rebase/refactor original changes]
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-By: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Knowing the PEC index is going to be important when we enable relaxed
ordering, so store this value for later use.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
[arbab@linux.ibm.com: Rebase/refactor original changes]
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-By: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
| |
Move the PCI{TRACE,DBG,NOTICE,ERR} logging macros from pci.c to pci.h so
they can be used in other files.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We reallocate additional 16/8 DMA-Read engines allocated to stack0/1
on PEC2 respectively. This is needed to improve bandwidth available to
the Mellanox CX5 adapter when trying to read GPU memory (GPU-Direct).
If kernel cxl driver indicates a request to allocate maximum possible
DMA read engines when calling enable_capi_mode() and card is attached
to PEC2/stack0 slot then we assume its a Mellanox CX5 adapter. We then
allocate additional 16/8 extra DMA read engines to stack0 and stack1
respectively on PEC2. This is done by populating the
XPEC_PCI_PRDSTKOVR and XPEC_NEST_READ_STACK_OVERRIDE as suggested by
the h/w team.
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By default when a PCIe device issues a read request via the PHB it is first
issued with nodal scope. When accessing GPU memory the NPU does not know at the
time of response if the requested memory page is off node or not. Therefore
every read of GPU memory by a PHB is retried with larger scope which introduces
bandwidth and latency issues.
On smaller boxes which have pump mode enabled nodal and group scoped reads are
treated the same and both types of request are broadcast to one chip. Therefore
we can avoid the retry by disabling nodal scope on the PHB for these boxes. On
larger boxes nodal (single chip) and group (multiple chip) scoped reads are
treated differently. Therefore we avoid disabling nodal scope on large boxes
which have pump mode disabled to avoid all PHB requests being broadcast to
multiple chips.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
| |
Currently it is defined in npu2-regs.h but needs to be used by other files
as well so move it somewhere generic.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enable these error checking features by setting the appropriate bits in
our one-off initialization of each "NTL Misc Config 2" register.
The exception is NDL RX parity checking, which should be disabled during
the link training procedures.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
| |
Name this bit properly. There's a lot more cleanup like this to be done,
but I'm catching this one now as part of some related changes.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add the ability to silence particular errors from the LPC bus when they
can be expected, particularly:
LPC[000]: Got SYNC no-response error. Error address reg: 0xd001002f
This is necessary on platform exit on some astbmc machines to avoid
unnecessary noise in the msglog.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|