| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
There's a thought to write more extensive boot progress codes to LPC
ports 81 and 82 to supplement/replace any reliance on port 80.
We want to still emit port 80 for platforms like Zaius and Barreleye
that have the physical display. Ports 81 and 82 can be monitored by a
BMC though.
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is an adaptation of what we currently do for op_display() on FSP
machines, inventing an encoding for what we can write into the single
byte at LPC port 80h.
Port 80h is often used on x86 systems to indicate boot progress/status
and dates back a decent amount of time. Since a byte isn't exactly very
expressive for everything that can go on (and wrong) during boot, it's
all about compromise.
Some systems (such as Zaius/Barreleye G2) have a physical dual 7 segment
display that displays these codes. So far, this has only been driven by
hostboot (see hostboot commit 90ec2e65314c).
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
| |
Fixes: 2c8f96534a978bb4cac3e4b7dd393a9cc4926555
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
| |
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The PCI-PCI bridge spec says that bridges that implement an IO window
should hardcode the IO base and limit registers to zero.
Unfortunately, these registers only define the upper bits of the IO
window and the low bits are assumed to be 0 for the base and 1 for the
limit address. As a result, setting both to zero can be mis-interpreted
as a 4K IO window.
This patch fixes the problem the same way PHB3 does. It sets the IO base
and limit values to 0xf000 and 0x1000 respectively which most software
interprets as a disabled window.
lspci before patch:
0000:00:00.0 PCI bridge: IBM Device 04c1 (prog-if 00 [Normal decode])
I/O behind bridge: 00000000-00000fff
lspci after patch:
0000:00:00.0 PCI bridge: IBM Device 04c1 (prog-if 00 [Normal decode])
I/O behind bridge: None
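The decoding that leads to the misreading can be sketched as follows. This is a minimal illustration, assuming the standard bridge header layout in which the upper nibble of the 8-bit I/O Base and I/O Limit registers holds address bits 15:12. The function names are hypothetical and are not skiboot code. The register values 0xf0 and 0x10 correspond to the 0xf000 base and 0x1000 limit mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lowest address of the window: the low 12 bits are assumed to be 0. */
uint32_t io_window_base(uint8_t io_base_reg)
{
	return (uint32_t)(io_base_reg & 0xf0) << 8;
}

/* Highest address of the window: the low 12 bits are assumed to be 1. */
uint32_t io_window_limit(uint8_t io_limit_reg)
{
	return ((uint32_t)(io_limit_reg & 0xf0) << 8) | 0xfff;
}

/* Software treats the window as disabled when base is above limit. */
bool io_window_enabled(uint8_t io_base_reg, uint8_t io_limit_reg)
{
	return io_window_base(io_base_reg) <= io_window_limit(io_limit_reg);
}
```

With both registers at zero the window decodes to 0x0000-0x0fff, which matches the "before" lspci output. With 0xf0 and 0x10 the base lies above the limit, so the window reads as disabled.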
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was disabled at some point during bringup to make life easier for
the lab folks trying to debug NVLink issues. This hack really should
have never made it out into the wild though, so we now have the
following situation occurring in the field:
1) A bad happens
2) The host kernel receives an unrecoverable HMI and calls into OPAL to
request a platform reboot.
3) OPAL rejects the reboot attempt and returns to the kernel with
OPAL_PARAMETER.
4) Kernel panics and attempts to kexec into a kdump kernel.
A side effect of the HMI seems to be CPUs becoming stuck which results
in the initialisation of the kdump kernel taking an extremely long time
(6+ hours). It's also been observed that after performing a dump the
kdump kernel then crashes itself because OPAL has ended up in a bad
state as a side effect of the HMI.
All up, it's not very good so re-enable the software checkstop by
default. If people still want to turn it off they can using the nvram
override.
Cc: skiboot-stable@lists.ozlabs.org
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Acked-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were already logging some NPU registers during an HMI. This patch
cleans up a bit how it is done and separates what is global from what
is specific to nvlink or opencapi.
Since we can now receive an error interrupt when an opencapi link goes
down unexpectedly, we also dump the NPU state but we limit it to the
registers of the brick which hit the error.
The list of registers to dump was worked out with the hw team to
allow for proper debugging. For each register, we print the name as
found in the NPU workbook, the scom address and the register value.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that the NPU may report interrupts due to the link going down
unexpectedly, report those errors to the OS when queried by the
'next_error' PHB callback.
The hardware doesn't support recovery of the link when it goes down
unexpectedly. So we report the PHB as dead, so that the OS can log the
proper message, notify the drivers and take the devices down.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Many errors reported in the NPU FIR2 register, mostly catching
unexpected errors on the opencapi link, are defined as 'brick fatal' in
the workbook, yet the default action is set to system checkstop. It's
possible to see those errors during AFU development, where the AFU may
send unexpected packets on the link, therefore triggering those
errors. Checkstopping the system in this case is clearly extreme, as
the error could be contained to the brick and proper analysis of a
checkstop is not trivial outside of a bringup environment.
This patch changes the default action of those errors so that the NPU
will raise an interrupt instead. Follow-up patches will log
proper information so that the error can be debugged and linux can
catch the event.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Start using the irq setup code from NVLink for OpenCAPI, since the 2
versions are so close. There are only 2 differences:
- the NPU may trigger more interrupts for OpenCAPI, 35 vs. 23, though
none are configured to be triggered for now.
- we need to enable the 4 translation faults interrupts for OpenCAPI.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
| |
The NPU IRQ setup code is currently duplicated between NVLink and
OpenCAPI, yet it's almost identical. This patch moves the NVLink
version of the code to the common file. A later patch will make use of
it for OpenCAPI.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we support mixing NVLink and OpenCAPI devices on the same NPU, we're
going to have to share the same range of 16 PE numbers between NVLink and
OpenCAPI PHBs.
For OpenCAPI devices, PE assignment is only significant for determining
which System Interrupt Log register is used for a particular brick - unlike
NVLink, it doesn't play any role in determining how links are fenced.
Split the PE range into a lower half which is used for NVLink, and an upper
half that is used for OpenCAPI, with a fixed PE number assigned per brick.
As the PE assignment for OpenCAPI devices is fixed, set the PE once
during device init and then ignore calls to the set_pe() operation.
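The split can be sketched as below. This is only an illustration under stated assumptions: the 16-entry range and the lower/upper split come from the text above, but the exact brick-to-PE mapping (brick n takes the n-th PE of the upper half) and all names are hypothetical.

```c
#include <stdbool.h>

#define NPU_NUM_PE	16			/* shared PE range */
#define NVLINK_PE_COUNT	(NPU_NUM_PE / 2)	/* lower half: NVLink */

bool pe_is_nvlink(unsigned int pe)
{
	return pe < NVLINK_PE_COUNT;
}

bool pe_is_opencapi(unsigned int pe)
{
	return pe >= NVLINK_PE_COUNT && pe < NPU_NUM_PE;
}

/*
 * Hypothetical fixed mapping: brick n gets the n-th PE of the upper half.
 * Returns -1 if the brick index does not fit in the OpenCAPI half.
 */
int opencapi_pe_for_brick(unsigned int brick)
{
	if (brick >= NPU_NUM_PE - NVLINK_PE_COUNT)
		return -1;
	return (int)(NVLINK_PE_COUNT + brick);
}
```

Because the PE is a pure function of the brick index in this sketch, it can be programmed once at init and later set_pe() requests can be ignored safely.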
Suggested-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
| |
Allow the submitter to track the state of an I2C request by adding
a state field to the request. This avoids the need to use a stub
completion callback in some cases.
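A minimal sketch of the idea, assuming a request structure with an optional completion callback. The type, field and function names here are hypothetical and do not claim to match the skiboot I2C API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request states; the real names may differ. */
enum req_state { REQ_NEW, REQ_QUEUED, REQ_DONE };

struct request {
	enum req_state state;
	int result;
	void (*completion)(struct request *req);	/* may be NULL */
};

void request_queue(struct request *req)
{
	req->state = REQ_QUEUED;
}

/* Called by the bus driver when the transfer has finished. */
void request_complete(struct request *req, int result)
{
	req->result = result;
	req->state = REQ_DONE;
	if (req->completion)
		req->completion(req);
}

/* The submitter can poll this instead of installing a stub callback. */
bool request_is_done(const struct request *req)
{
	return req->state == REQ_DONE;
}
```

A synchronous caller can then queue the request, run the poller until request_is_done() returns true, and read the result field directly.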
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
| |
The delay between the ASSERT_DELAY and DEASSERT_DELAY states is set to
one timebase tick. This state seems to have been a holdover from PHB3
where it was used to add a 1s delay between de-asserting PERST and
polling the link for the CAPI FPGA. There's no requirement for that here
since the link polling on PHB4 is a bit smarter so we should be fine.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some time ago Mikey added some code to work around a bug we found where a
certain RAID card wouldn't come back again after a fast-reboot. The
workaround is to set the Link Disable bit before asserting PERST and
clear it after de-asserting PERST.
Currently we do this in the FRESET path, but not in the CRESET path.
This patch moves the PERST control into its own function to reduce
duplication and to ensure the workaround is applied in all circumstances.
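The ordering that the shared helper enforces can be modelled as below. This is a sketch only: the structure stands in for the real hardware registers and the names are hypothetical.

```c
#include <stdbool.h>

/* Toy model of the two pieces of slot state involved. */
struct slot_model {
	bool link_disable;	/* Link Disable bit */
	bool perst_asserted;	/* PERST signal */
};

/* Single entry point for PERST control, so every reset path gets the workaround. */
void slot_set_perst(struct slot_model *slot, bool assert_perst)
{
	if (assert_perst) {
		slot->link_disable = true;	/* disable the link first */
		slot->perst_asserted = true;	/* then assert PERST */
	} else {
		slot->perst_asserted = false;	/* de-assert PERST first */
		slot->link_disable = false;	/* then re-enable the link */
	}
}
```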
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we do an freset the first step is to check if a card is present in
the slot. However, this only occurs when we enter phb4_freset() with the
slot state set to SLOT_NORMAL. This occurs in:
a) The creset path, and
b) When the OS manually requests an FRESET via an OPAL call.
a) is problematic because in the boot path the generic code will put the
slot into FRESET_START manually before calling into phb4_freset(). This
can result in a situation where a device is detected on boot, but not
after a CRESET.
I've noticed this occurring on systems where the PHB's slot presence
detect signal is not wired to an adapter. In this situation we can rely
on the in-band presence mechanism, but the presence check will make
us exit before that has a chance to work.
Additionally, if we enter from the CRESET path this early exit leaves
the slot's PERST signal asserted. This isn't currently an issue,
but if we want to support hotplug of devices into the root port it will
be.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PERST is asserted at the beginning of the CRESET process to prevent
the downstream device from interacting with the host while the PHB logic
is being reset and re-initialised. There is at least a 100ms wait during
the CRESET processing so it's not necessary to wait this time again
in the FRESET handler.
This patch extends the delay after resetting the PHB logic to
the 250ms PERST wait period that we typically use and sets the
skip_perst flag so that we don't wait this time again in the FRESET
handler.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch to enhance the imc opal call to support and handle trace_imc mode.
To initialize the trace-mode, TRACE_IMC_SCOM value is written to
TRACE_IMC_ADDR of the respective core.
TRACE_IMC_SCOM is a 64-bit value with the following bit fields:
0:1 : SAMPSEL
2:33 : CPMC_LOAD
34:40 : CPMC1SEL
41:47 : CPMC2SEL
48:50 : BUFFERSIZE
51:63 : RESERVED
Currently the value for TRACE_IMC_SCOM is hard coded.
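The layout above can be expressed as a small packing helper. This sketch assumes IBM bit numbering (bit 0 is the most significant bit), which is the usual convention for POWER SCOM registers. The function names are hypothetical and the sketch does not reproduce the hard-coded value used by the patch.

```c
#include <stdint.h>

/* Place val into IBM-numbered bits first..last (bit 0 = MSB) of a 64-bit word. */
static uint64_t ibm_field(uint64_t val, unsigned int first, unsigned int last)
{
	unsigned int width = last - first + 1;
	uint64_t mask = (width >= 64) ? ~0ULL : ((1ULL << width) - 1);

	return (val & mask) << (63 - last);
}

uint64_t trace_imc_scom_pack(uint64_t sampsel, uint64_t cpmc_load,
			     uint64_t cpmc1sel, uint64_t cpmc2sel,
			     uint64_t buffersize)
{
	return ibm_field(sampsel, 0, 1) |
	       ibm_field(cpmc_load, 2, 33) |
	       ibm_field(cpmc1sel, 34, 40) |
	       ibm_field(cpmc2sel, 41, 47) |
	       ibm_field(buffersize, 48, 50);	/* bits 51:63 stay reserved (0) */
}
```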
During initialization htm_mode is disabled, and enabled only at start.
The opal calls to start/stop the counters, will write CORE_IMC_HTM_MODE_ENABLE/
CORE_IMC_HTM_MODE_DISABLE respectively to the htm_scom_index of the desired
cores.
Additional switch cases are added to the current opal calls to start/stop
the counters for trace-mode.
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Factor out core-imc stop api code from opal_imc_counters_init() for
better readability.
Also fix the error message printed when wake_up_engine_state is not
"WAKEUP_ENGINE_PRESENT".
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Cc: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
| |
Rename ___backtrace() to backtrace_create() and ___print_backtrace() to
backtrace_print(). Get rid of __backtrace() and __print_backtrace()
wrappers.
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
| |
We're about to get rid of __backtrace() and __print_backtrace(), convert
the FSP/IPMI attn code to not use them.
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To be able to support migration of guests using the XIVE native
exploitation mode, (where the queue is effectively owned by the
guest), KVM needs to be able to save and restore the HW-modified
fields of the queue, such as the current queue producer pointer and
generation bit, and to retrieve the modified thread context registers
of the VP from the NVT structure : the VP interrupt pending bits.
However, there is no need to set back the NVT structure on P9. P10
should be the same.
Based on previous work from BenH.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
| |
The hub-id is stored in the PBCQ node rather than the stack node so we
never add it to the PHB node. This breaks the lxvpd slot lookup code
since the hub-id is encoded in the VPD record that we need to find the
slot information.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've been getting this warning/error from recent GCC:
In file included from hw/ipmi/test/run-fru.c:22:
hw/ipmi/test/../ipmi-fru.c: In function ‘fru_add’:
hw/ipmi/test/../ipmi-fru.c:162:3: warning: ‘strncpy’ output truncated copying 32 bytes from a string of length 38 [-Wstringop-truncation]
strncpy(info.version, version, MAX_STR_LEN + 1);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This patch does two things:
1) Re-arrange some code to shut GCC up.
2) Add extra fu to tests to ensure we're producing correct bytes.
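For context, one common way to copy a possibly over-long string into a fixed buffer without triggering -Wstringop-truncation is to bound the length explicitly and terminate by hand. The sketch below illustrates that general technique only; it is not a claim about how this patch re-arranged the code. MAX_STR_LEN is assumed to be 31, inferred from the 32-byte copy in the warning, and the structure name is hypothetical.

```c
#include <string.h>

#define MAX_STR_LEN 31	/* assumption: inferred from the 32-byte copy above */

struct fru_info_sketch {
	char version[MAX_STR_LEN + 1];
};

/* Copy at most MAX_STR_LEN characters and always NUL-terminate. */
void fru_set_version(struct fru_info_sketch *info, const char *version)
{
	size_t len = strlen(version);

	if (len > MAX_STR_LEN)
		len = MAX_STR_LEN;
	memcpy(info->version, version, len);
	info->version[len] = '\0';
}
```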
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
Tested-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For opencapi, we currently do impedance calibration when initializing
the PHY for the device, which could run in parallel if we were rich
and had multiple opencapi devices. But if 2 devices are on the same
obus, the 2 calibration sequences could overlap, which likely yields
bad results and is useless anyway since it only needs to be done once
per obus.
This patch splits the opencapi PHY reset in 2 parts:
- a 'init' part called serially at boot. That's when zcal is done. If
we have 2 devices on the same socket, the zcal won't be redone,
since we're called serially and we'll see it has already been done for
the obus
- a 'reset' part called during fundamental reset as a prereq for link
training. It does the PHY setup for a set of lanes and the dccal.
The PHY team confirmed there's no dependency between zcal and the
other reset steps and it can be moved earlier.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The zcal procedure needs to be run once per obus. We keep track of
which obus is already calibrated in an array indexed by the obus
number. However, the obus number is inferred from the brick index,
which works well for nvlink but not for opencapi.
Create an obus_index() function, which, from a device, returns the
correct obus index, irrespective of the device type.
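The once-per-obus bookkeeping can be sketched as below, assuming callers are serialised so no locking is needed. The array size and function name are hypothetical, and the sketch deliberately leaves out how the obus index is derived from a device, since that mapping is not given here.

```c
#include <stdbool.h>

#define MAX_OBUS 4	/* hypothetical number of obus units per chip */

static bool zcal_done[MAX_OBUS];

/*
 * Returns true if the caller should run zcal for this obus now, and
 * records that it has been claimed. Returns false if calibration was
 * already done or the index is out of range.
 */
bool zcal_claim(unsigned int obus_idx)
{
	if (obus_idx >= MAX_OBUS || zcal_done[obus_idx])
		return false;
	zcal_done[obus_idx] = true;
	return true;
}
```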
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
set_iovalid() is called on the PHY reset path. The hw logic it touches
is meaningless for opencapi. It's not hurting as long as all the links
under the NPU are in opencapi mode, but in case of mixing opencapi and
nvlink, we'll be in trouble: the code finds which bit to modify based
on the brick index, which varies depending on the mode. So calling
that function on an opencapi device may modify a nvlink brick! For
example, for brick index 3.
So we simply avoid doing anything when calling set_iovalid() for an
opencapi device.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
OCC can change the pstate table at runtime to modify pstate limits or
for characterization purpose. These changes are reflected by re-parsing
the pstate table during fast-reboot to update the device-tree. Only
relevant pstate DT properties are deleted and newly added during
fast-reboot. The device-tree properties like 'freq-domain-mask' and
'domain-runs-at' are currently hard-coded and need not be updated
during fast-reboot. So this patch removes them from the fast-reboot path.
This patch fixes the below crash:
[ 270.313998453,5] OCC: All Chip Rdy after 0 ms
[ 270.314148918,3] Duplicate property "freq-domain-mask" in node /ibm,opal/power-mgt
[ 270.314208553,0] Aborting!
CPU 083c Backtrace:
S: 0000000035de3a20 R: 000000003001b480 ._abort+0x4c
S: 0000000035de3aa0 R: 0000000030028704 .new_property+0xd8
S: 0000000035de3b30 R: 0000000030028964 .__dt_add_property_cells+0x30
S: 0000000035de3bd0 R: 0000000030042980 .occ_pstates_init+0x7c8
S: 0000000035de3d90 R: 00000000300145f4 .load_and_boot_kernel+0x980
S: 0000000035de3e70 R: 00000000300276b4 .fast_reboot_entry+0x37c
S: 0000000035de3f00 R: 0000000030002ac4 reset_fast_reboot_wakeup+0x40
Fixes: b821f8c2a8e3("power-mgmt : occ : Add 'freq-domain-mask' DT property")
Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Tested-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If two opencapi adapters are on the same obus, we may try to train the
two links in parallel at boot time, when all the PCI links are being
trained. Both links use the same i2c controller to handle the reset
signal, so some care is needed to make sure resetting one doesn't
interfere with the reset of the other. We need to keep track of the
current state of the i2c controller (and use locking).
This went mostly unnoticed as you need to have 2 opencapi cards on the
same socket and links tended to train anyway because of the retries.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Give more time to the FPGA to process the reset signal. The previous
delay, 5ms, is too short for newer adapters with bigger FPGAs. Extend
it to 250ms.
Ultimately, that delay will likely end up being added to the opencapi
specification, but we are not there yet.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We haven't hit any problem so far, but according to the ODL designer, the ODL
should be in reset when it is enabled.
The ODL remains in reset until we start a fundamental reset to
initiate link training. We still assert and deassert the ODL reset
signal as part of the normal procedure just before training the
link. Asserting is therefore useless at boot, since the ODL is already
in reset, but we keep it as it's only a scom write and it's needed
when we reset/retrain from the OS.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Split the function to assert and deassert the reset signal on the ODL,
so that we can keep the ODL in reset while we reset the adapter,
therefore having a window where both sides are in reset.
It is actually not required with our current DLx at boot time, but I
need to split the ODL reset function for the following patch and it
will become useful/required later when we introduce resetting an
opencapi link from the OS.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
| |
This is really to avoid confusion with a later patch and clarify
whether we're resetting the ODL or the adapter.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's possible to set up performance counters for the PLL to detect
various conditions for the links in nvlink or opencapi mode. Since
those counters are currently unused, let's configure them when an obus
is in opencapi mode to detect CRC errors on the link. Each link has
two counters:
- CRC error detected by the host
- CRC error detected by the DLx (NAK received by the host)
We also dump the counters shortly after the link trains, but they can
be read multiple times through cronus, pdbg or linux. The counters are
configured to be reset after each read.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ODL registers used to control the opencapi link state have an address
built on a base address and an offset for each brick which can be
computed instead of hard-coded individually for each brick.
Rework how we access the ODL registers, to avoid repeating switch
statements all over the place.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On TOD failure, all cores/threads receive an HMI. The very first thread
that gets the interrupt fixes the TOD, whereas the others just reset the
respective HMER error bit and return. But when the TOD is unrecoverable,
all the threads try to do TOD recovery one by one, causing threads to
spend more time inside opal. Set a global flag when the TOD is
unrecoverable so that the rest of the threads go back to linux
immediately, avoiding lockups in the system reboot/panic path.
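The early-exit logic can be sketched as follows. This is a simplified, single-threaded model: all names are hypothetical, the recovery routine is a stub that always fails, and real firmware would need proper locking or atomic access around the flag.

```c
#include <stdbool.h>

bool tod_unrecoverable;		/* global flag, set once recovery has failed */
int tod_recovery_attempts;	/* only here so the sketch can be observed */

/* Stub standing in for the real recovery procedure; always fails. */
bool failing_tod_recovery(void)
{
	tod_recovery_attempts++;
	return false;
}

/* Returns true if the TOD is usable after handling the error. */
bool handle_tod_error(bool (*try_recover)(void))
{
	if (tod_unrecoverable)
		return false;	/* another thread already gave up: return at once */
	if (try_recover())
		return true;
	tod_unrecoverable = true;
	return false;
}
```

After the first failure, later callers return without invoking the recovery routine again, which is the time saving described above.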
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BT send logic always sends the top of the bt message list to the BMC. Once
the BMC reads the message, it clears the interrupt and bt_idle() becomes true.
bt_add_ipmi_msg_head() adds a message to the top of the list. If the bt
message list is not empty then:
- if bt_idle() is true then we will end up sending a message to the BMC before
getting the response from the BMC for the inflight message. It looks like on
some BMC implementations this results in a message timeout.
- else we end up starting the message timer without actually sending the
message to the BMC, which is not correct.
This patch introduces a separate list to track synchronous messages.
bt_add_ipmi_msg_head() will add messages to the tail of this new list. We
will always process this queue before processing the normal queue.
Finally this patch introduces a new variable (inflight_bt_msg) to track
the inflight message. It points to the current inflight message.
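The resulting send order can be modelled with two FIFO queues and an inflight pointer. The sketch below is a stand-alone model with hypothetical names; it does not use the skiboot list helpers or the real BT structures.

```c
#include <stddef.h>

struct msg {
	int id;
	struct msg *next;
};

struct queue {
	struct msg *head, *tail;
};

struct bt_model {
	struct queue sync_q;	/* synchronous messages, served first */
	struct queue normal_q;	/* everything else */
	struct msg *inflight;	/* message sent but not yet answered */
};

static void enqueue_tail(struct queue *q, struct msg *m)
{
	m->next = NULL;
	if (q->tail)
		q->tail->next = m;
	else
		q->head = m;
	q->tail = m;
}

static struct msg *dequeue_head(struct queue *q)
{
	struct msg *m = q->head;

	if (m) {
		q->head = m->next;
		if (!q->head)
			q->tail = NULL;
		m->next = NULL;
	}
	return m;
}

void bt_model_add(struct bt_model *bt, struct msg *m)
{
	enqueue_tail(&bt->normal_q, m);
}

void bt_model_add_sync(struct bt_model *bt, struct msg *m)
{
	enqueue_tail(&bt->sync_q, m);
}

/* Pick the next message to send; NULL while one is inflight or nothing is queued. */
struct msg *bt_model_send_next(struct bt_model *bt)
{
	struct msg *m;

	if (bt->inflight)
		return NULL;
	m = dequeue_head(&bt->sync_q);
	if (!m)
		m = dequeue_head(&bt->normal_q);
	bt->inflight = m;
	return m;
}

/* Called when the response for the inflight message has arrived. */
void bt_model_complete(struct bt_model *bt)
{
	bt->inflight = NULL;
}
```

In this model nothing new is sent while a message is inflight, and a synchronous message queued later still goes out ahead of older normal messages.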
Suggested-by: Oliver O'Halloran <oohall@gmail.com>
Suggested-by: Stewart Smith <stewart@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
| |
In __xive_set_irq_config() change the no_sync parameter to sync and
fix all the call sites.
Just a cleanup. No functional change.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
| |
Come on bridge control register. You're letting the team down.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each XTS MMIO ATSD# register is accompanied by another register -
XTS MMIO ATSD0 LPARID# - which controls LPID filtering for ATSD
transactions.
When a host system passes a GPU through to a guest, we need to enable
some ATSD for an LPAR. At the moment the host assigns one ATSD to
an NVLink bridge and this maps it to an LPAR when a GPU is assigned to
the LPAR. The link number is used for an ATSD index.
ATSD6&7 stay mapped to the host (LPAR=0) all the time, which seems to be
an acceptable price for the simplicity.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently PID wildcard is programmed into the NPU once and never cleared
up. This works for the bare metal as MSR does not change while the host
OS is running.
However, with device virtualization, we need to keep track of wildcard
entry usage and clear the entries before switching a GPU from a host to
a guest or vice versa.
This adds refcount to a NPU2, one counter per wildcard entry. The index
is a short lparid (4 bits long) which is allocated in opal_npu_map_lpar()
and should be smaller than NPU2_XTS_BDF_MAP_SIZE (defined as 16).
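A minimal sketch of per-entry reference counting, assuming one counter per short lparid. NPU2_XTS_BDF_MAP_SIZE is taken from the text above; the structure and function names are hypothetical.

```c
#include <stdint.h>

#define NPU2_XTS_BDF_MAP_SIZE 16

struct wildcard_refs {
	uint32_t count[NPU2_XTS_BDF_MAP_SIZE];	/* indexed by short lparid */
};

/* Take a reference; returns the new count, or -1 if the index is out of range. */
int wildcard_get(struct wildcard_refs *refs, unsigned int short_lparid)
{
	if (short_lparid >= NPU2_XTS_BDF_MAP_SIZE)
		return -1;
	return (int)++refs->count[short_lparid];
}

/*
 * Drop a reference; returns the remaining count (0 means the wildcard
 * entry can be cleared), or -1 on a bad index or underflow.
 */
int wildcard_put(struct wildcard_refs *refs, unsigned int short_lparid)
{
	if (short_lparid >= NPU2_XTS_BDF_MAP_SIZE ||
	    refs->count[short_lparid] == 0)
		return -1;
	return (int)--refs->count[short_lparid];
}
```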
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Reza Arbab <arbab@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new device-tree property freq-domain-indicator to define the group of
CPUs which share the same frequency. This property has been added under
the power-mgmt node. It is a bitmask.
A bitwise AND is taken between this bitmask value and the PIR of a cpu. All
the CPUs lying in the same frequency domain will have the same result for
the AND.
For example, for POWER9, 0xFFF0 indicates a quad-wide frequency domain.
Taking the AND with the PIR of the CPUs yields a frequency domain with a
quad-wise distribution, as the last 4 bits, which represent the cores, have
been masked.
Similarly, 0xFFF8 will represent a core-wide frequency domain for P8.
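The mask test can be sketched as below. The function names are hypothetical, and the PIR values in the note that follows are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Identifier of the frequency domain a CPU belongs to. */
uint32_t freq_domain_id(uint32_t pir, uint32_t domain_mask)
{
	return pir & domain_mask;
}

/* Two CPUs share a frequency domain when their masked PIRs are equal. */
bool same_freq_domain(uint32_t pir_a, uint32_t pir_b, uint32_t domain_mask)
{
	return freq_domain_id(pir_a, domain_mask) ==
	       freq_domain_id(pir_b, domain_mask);
}
```

With the 0xFFF0 mask, PIRs 0x10 and 0x1f fall in the same domain while 0x20 does not. With 0xFFF8, PIRs 0x08 and 0x0f match while 0x07 does not.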
Also, add a new device-tree property domain-runs-at which denotes the
strategy the OCC uses to change the frequency of a frequency domain. There
are two strategies: FREQ_MOST_RECENTLY_SET and FREQ_MAX_IN_DOMAIN.
FREQ_MOST_RECENTLY_SET : the OCC sets the frequency of the quad to the most
recent frequency value requested by the CPUs in the quad.
FREQ_MAX_IN_DOMAIN : the OCC sets the frequency of the CPUs in
the quad to the maximum of the latest frequency requested by each of
the component cores.
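The difference between the two strategies can be illustrated with a small model. This is a sketch under assumptions: the enum values are arbitrary rather than the values stored in the device tree, and the function and parameter names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Strategy names from the text; the numeric values here are arbitrary. */
enum domain_runs_at {
	FREQ_MOST_RECENTLY_SET,
	FREQ_MAX_IN_DOMAIN
};

/*
 * latest_req[i] is the latest frequency requested by core i of the domain;
 * last_writer is the index of the core that made the most recent request.
 */
uint32_t domain_frequency(enum domain_runs_at strategy,
			  const uint32_t *latest_req, size_t ncores,
			  size_t last_writer)
{
	uint32_t max = 0;
	size_t i;

	if (ncores == 0 || last_writer >= ncores)
		return 0;
	if (strategy == FREQ_MOST_RECENTLY_SET)
		return latest_req[last_writer];
	for (i = 0; i < ncores; i++)
		if (latest_req[i] > max)
			max = latest_req[i];
	return max;
}
```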
Signed-off-by: Abhishek Goel <huntbag@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
repeatedly failed
Certain older PCIe 1.0 devices will not train unless the training process starts at GEN1 speeds.
As a last resort when a device will not train, fall back to GEN1 speed for the last training attempt.
This is verified to fix devices based on the Conexant CX23888 on the Talos II platform.
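The fallback policy amounts to the following. This is a sketch with hypothetical names and parameters, not the PHB4 code: attempts are counted from 1 and speeds are expressed as PCIe generation numbers.

```c
/*
 * Pick the link speed (PCIe generation) to use for a training attempt.
 * The last attempt falls back to GEN1; earlier attempts use max_speed.
 */
unsigned int link_speed_for_attempt(unsigned int attempt,
				    unsigned int max_attempts,
				    unsigned int max_speed)
{
	if (attempt >= max_attempts)
		return 1;
	return max_speed;
}
```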
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
[stewart: cut P9NDD1.0 support, fixup dt_max_link_speed]
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The In-Memory Collection (IMC) counters catalog is a compressed blob which is
loaded from the flash; decompression starts once the data is loaded from
nvram by the main thread. This can be optimized by using the libxz API
function which creates a job to do the decompression without blocking the
main thread.
Refactor decompress() to use the libxz asynchronous wrapper
functions. This also cleans up the error handling path in imc_init().
CC: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
OCC provides two limits for minimum powercap. One is the hard powercap
minimum, which is guaranteed by OCC, and the other is a soft
powercap minimum, which is lower than hard-min and may or may not be
asserted due to various power-thermal reasons. So to allow the users
to access the entire powercap range, this patch exports the soft powercap
minimum as the "powercap-min" DT property. It also adds a new
DT property called "powercap-hard-min" to export the hard-min powercap
limit.
Fixes: c6aabe3f2eb5("powercap: occ: Add a generic powercap framework")
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
| |
Yes, a bunch of HOMER symbols should be.
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
| |
Yes, a bunch of these should be.
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
| |
Yes they should. Do so by adding static to the macro.
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
| |
Yes they should. Also, some are unused so we comment them out to at
least keep the code as documentation complete.
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
| |
be static?
Yes they should.
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|