path: root/hw
Commit message (Collapse)AuthorAgeFilesLines
* Write boot progress to LPC ports 81 and 82Stewart Smith2019-04-242-2/+102
| | | | | | | | | | | There's a thought to write more extensive boot progress codes to LPC ports 81 and 82 to supplement/replace any reliance on port 80. We want to still emit port 80 for platforms like Zaius and Barreleye that have the physical display. Ports 81 and 82 can be monitored by a BMC though. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* Write boot progress to LPC port 80hStewart Smith2019-04-245-2/+194
| | | | | | | | | | | | | | | | | This is an adaptation of what we currently do for op_display() on FSP machines, inventing an encoding for what we can write into the single byte at LPC port 80h. Port 80h is often used on x86 systems to indicate boot progress/status and has been used that way for a long time. Since a single byte isn't very expressive for everything that can go on (and wrong) during boot, it's all about compromise. Some systems (such as Zaius/Barreleye G2) have a physical dual 7-segment display that displays these codes. So far, this has only been driven by hostboot (see hostboot commit 90ec2e65314c). Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/xscom: P9P rather than P9Stewart Smith2019-04-171-1/+1
| | | | | Fixes: 2c8f96534a978bb4cac3e4b7dd393a9cc4926555 Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/xscom: add missing P9P chip nameNicholas Piggin2019-04-171-1/+1
| | | | | Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/phb4: Squash the IO bridge windowOliver O'Halloran2019-04-171-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | The PCI-PCI bridge spec says that bridges that implement an IO window should hardcode the IO base and limit registers to zero. Unfortunately, these registers only define the upper bits of the IO window and the low bits are assumed to be 0 for the base and 1 for the limit address. As a result, setting both to zero can be mis-interpreted as a 4K IO window. This patch fixes the problem the same way PHB3 does. It sets the IO base and limit values to 0xf000 and 0x1000 respectively which most software interprets as a disabled window. lspci before patch: 0000:00:00.0 PCI bridge: IBM Device 04c1 (prog-if 00 [Normal decode]) I/O behind bridge: 00000000-00000fff lspci after patch: 0000:00:00.0 PCI bridge: IBM Device 04c1 (prog-if 00 [Normal decode]) I/O behind bridge: None Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
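The arithmetic behind the mis-interpretation can be sketched in C. This is an illustration only, not the PHB4 code: `decode_io_window` and `struct io_window` are hypothetical names, and the register bytes 0xf0 and 0x10 are assumed to be the 8-bit encodings of the 0xf000 base and 0x1000 limit values mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of a PCI-PCI bridge I/O window (hypothetical helper type). */
struct io_window {
	bool enabled;
	uint32_t start;
	uint32_t end;
};

/*
 * Decode the 8-bit I/O Base and I/O Limit registers of a bridge header.
 * Bits 7:4 of each register hold address bits 15:12. The low 12 address
 * bits are implied: 0x000 for the base and 0xfff for the limit.
 */
struct io_window decode_io_window(uint8_t io_base, uint8_t io_limit)
{
	struct io_window w;

	w.start = (uint32_t)(io_base & 0xf0) << 8;
	w.end = ((uint32_t)(io_limit & 0xf0) << 8) | 0xfff;
	/* Software conventionally treats base > limit as "window disabled". */
	w.enabled = w.start <= w.end;
	return w;
}
```

With both registers at zero the decoded window is 0x0000-0x0fff, matching the "before" lspci output. With 0xf0 and 0x10 the base lies above the limit, so the window decodes as disabled.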
* hw/xscom: Enable sw xstop by default on p9Oliver O'Halloran2019-04-171-24/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was disabled at some point during bringup to make life easier for the lab folks trying to debug NVLink issues. This hack really should have never made it out into the wild though, so we now have the following situation occurring in the field: 1) A bad happens 2) The host kernel receives an unrecoverable HMI and calls into OPAL to request a platform reboot. 3) OPAL rejects the reboot attempt and returns to the kernel with OPAL_PARAMETER. 4) Kernel panics and attempts to kexec into a kdump kernel. A side effect of the HMI seems to be CPUs becoming stuck, which results in the initialisation of the kdump kernel taking an extremely long time (6+ hours). It's also been observed that after performing a dump the kdump kernel then crashes itself because OPAL has ended up in a bad state as a side effect of the HMI. All up, it's not very good so re-enable the software checkstop by default. If people still want to turn it off they can using the nvram override. Cc: skiboot-stable@lists.ozlabs.org Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Acked-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/npu2: Dump (more) npu2 registers on link error and HMIsFrederic Barrat2019-04-091-0/+234
| | | | | | | | | | | | | | | | | | We were already logging some NPU registers during an HMI. This patch cleans up a bit how it is done and separates what is global from what is specific to nvlink or opencapi. Since we can now receive an error interrupt when an opencapi link goes down unexpectedly, we also dump the NPU state but we limit it to the registers of the brick which hit the error. The list of registers to dump was worked out with the hw team to allow for proper debugging. For each register, we print the name as found in the NPU workbook, the scom address and the register value. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/npu2: Report errors to the OS if an OpenCAPI brick is fencedFrederic Barrat2019-04-091-4/+51
| | | | | | | | | | | | | | Now that the NPU may report interrupts due to the link going down unexpectedly, report those errors to the OS when queried by the 'next_error' PHB callback. The hardware doesn't support recovery of the link when it goes down unexpectedly. So we report the PHB as dead, so that the OS can log the proper message, notify the drivers and take the devices down. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/npu2: Setup an error interrupt on some opencapi FIRsFrederic Barrat2019-04-092-13/+53
| | | | | | | | | | | | | | | | | | | | Many errors reported in the NPU FIR2 register, mostly catching unexpected errors on the opencapi link are defined as 'brick fatal' in the workbook, yet the default action is set to system checkstop. It's possible to see those errors during AFU development, where the AFU may send unexpected packets on the link, therefore triggering those errors. Checkstopping the system in this case is clearly extreme, as the error could be contained to the brick and proper analysis of a checkstop is not trivial outside of a bringup environment. This patch changes the default action of those errors so that the NPU will raise an interrupt instead. Follow-up patches will log proper information so that the error can be debugged and linux can catch the event. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/npu2: Use NVLink irq setup for OpenCAPIFrederic Barrat2019-04-092-50/+19
| | | | | | | | | | | | | | Start using the irq setup code from NVLink for OpenCAPI, since the 2 versions are so close. There are only 2 differences: - the NPU may trigger more interrupts for OpenCAPI, 35 vs. 23, though none are configured to be triggered for now. - we need to enable the 4 translation faults interrupts for OpenCAPI. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/npu2: Move npu2 irq setup code to common areaFrederic Barrat2019-04-092-100/+102
| | | | | | | | | | | The NPU IRQ setup code is currently duplicated between NVLink and OpenCAPI, yet it's almost identical. This patch moves the NVLink version of the code to the common file. A later patch will make use of it for OpenCAPI. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/npu2: Fix OpenCAPI PE assignmentAndrew Donnellan2019-04-091-40/+34
| | | | | | | | | | | | | | | | | | | | When we support mixing NVLink and OpenCAPI devices on the same NPU, we're going to have to share the same range of 16 PE numbers between NVLink and OpenCAPI PHBs. For OpenCAPI devices, PE assignment is only significant for determining which System Interrupt Log register is used for a particular brick - unlike NVLink, it doesn't play any role in determining how links are fenced. Split the PE range into a lower half which is used for NVLink, and an upper half that is used for OpenCAPI, with a fixed PE number assigned per brick. As the PE assignment for OpenCAPI devices is fixed, set the PE once during device init and then ignore calls to the set_pe() operation. Suggested-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
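A minimal sketch of the split described above, assuming the upper half starts at PE 8 and that the fixed OpenCAPI PE is simply the base plus the brick index. The names `NPU_PE_COUNT`, `OCAPI_PE_BASE`, `pe_is_nvlink` and `ocapi_pe_for_brick` are hypothetical and the exact per-brick mapping in skiboot may differ.

```c
#include <assert.h>
#include <stdbool.h>

#define NPU_PE_COUNT   16                  /* PE numbers shared per NPU */
#define OCAPI_PE_BASE  (NPU_PE_COUNT / 2)  /* assumed start of the OpenCAPI half */

/* True if the PE number falls in the lower (NVLink) half of the range. */
bool pe_is_nvlink(unsigned int pe)
{
	assert(pe < NPU_PE_COUNT);
	return pe < OCAPI_PE_BASE;
}

/* Fixed PE for an OpenCAPI brick: assigned once at init, never changed. */
unsigned int ocapi_pe_for_brick(unsigned int brick)
{
	assert(brick < NPU_PE_COUNT - OCAPI_PE_BASE);
	return OCAPI_PE_BASE + brick;
}
```

Because the mapping is a pure function of the brick index, a later set_pe() request for an OpenCAPI device has nothing to change, which is why the commit can ignore it.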
* core/i2c: Add request state trackingOliver O'Halloran2019-03-281-0/+1
| | | | | | | | | Allow the submitter to track the state of an I2C request by adding a state field to the request. This avoids the need to use a stub completion callback in some cases. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/phb4: Drop FRESET_DEASSERT_DELAY stateOliver O'Halloran2019-03-281-5/+0
| | | | | | | | | | | The delay between the ASSERT_DELAY and DEASSERT_DELAY states is set to one timebase tick. This state seems to have been a hold over from PHB3 where it was used to add a 1s delay between de-asserting PERST and polling the link for the CAPI FPGA. There's no requirement for that here since the link polling on PHB4 is a bit smarter so we should be fine. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/phb4: Factor out PERST controlOliver O'Halloran2019-03-281-28/+36
| | | | | | | | | | | | | | Some time ago Mikey added some code to work around a bug we found where a certain RAID card wouldn't come back again after a fast-reboot. The workaround is setting the Link Disable bit before asserting PERST and clearing it after de-asserting PERST. Currently we do this in the FRESET path, but not in the CRESET path. This patch moves the PERST control into its own function to reduce duplication and to ensure the workaround is applied in all circumstances. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/phb4: Remove FRESET presence checkOliver O'Halloran2019-03-281-12/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | When we do an freset the first step is to check if a card is present in the slot. However, this only occurs when we enter phb4_freset() with the slot state set to SLOT_NORMAL. This occurs in: a) The creset path, and b) When the OS manually requests an FRESET via an OPAL call. a) is problematic because in the boot path the generic code will put the slot into FRESET_START manually before calling into phb4_freset(). This can result in a situation where a device is detected on boot, but not after a CRESET. I've noticed this occurring on systems where the PHB's slot presence detect signal is not wired to an adapter. In this situation we can rely on the in-band presence mechanism, but the presence check will make us exit before that has a chance to work. Additionally, if we enter from the CRESET path this early exit leaves the slot's PERST signal asserted. This isn't currently an issue, but if we want to support hotplug of devices into the root port it will be. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/phb4: Skip FRESET PERST when coming from CRESETOliver O'Halloran2019-03-281-1/+23
| | | | | | | | | | | | | | | | PERST is asserted at the beginning of the CRESET process to prevent the downstream device from interacting with the host while the PHB logic is being reset and re-initialised. There is at least a 100ms wait during the CRESET processing so it's not necessary to wait this time again in the FRESET handler. This patch extends the delay after re-setting the PHB logic to extend to the 250ms PERST wait period that we typically use and sets the skip_perst flag so that we don't wait this time again in the FRESET handler. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/imc: Enable opal calls to init/start/stop IMC Trace modeAnju T Sudhakar2019-03-281-1/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | Patch to enhance the imc opal call to support and handle trace_imc mode. To initialize the trace-mode, the TRACE_IMC_SCOM value is written to TRACE_IMC_ADDR of the respective core. TRACE_IMC_SCOM is a 64bit value, and its bit fields are laid out as follows: 0:1 : SAMPSEL 2:33 : CPMC_LOAD 34:40 : CPMC1SEL 41:47 : CPMC2SEL 48:50 : BUFFERSIZE 51:63 : RESERVED Currently the value for TRACE_IMC_SCOM is hard coded. During initialization htm_mode is disabled, and enabled only at start. The opal calls to start/stop the counters will write CORE_IMC_HTM_MODE_ENABLE/ CORE_IMC_HTM_MODE_DISABLE respectively to the htm_scom_index of the desired cores. Additional switch cases are added to the current opal calls to start/stop the counters for trace-mode. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
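The field layout listed above can be turned into a small packing helper. This is a sketch under one stated assumption: the bit ranges use MSB-0 numbering (bit 0 is the most significant bit), as is usual for POWER register documentation. `set_field` and `trace_imc_scom` are hypothetical names; the commit itself hard-codes the value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Place `val` into the field spanning bits first..last of a 64-bit word,
 * using MSB-0 numbering (assumed): bit 0 is the most significant bit.
 */
uint64_t set_field(uint64_t word, unsigned int first, unsigned int last,
		   uint64_t val)
{
	unsigned int width, shift;
	uint64_t mask;

	assert(first <= last && last <= 63);
	width = last - first + 1;
	shift = 63 - last;
	mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
	return (word & ~(mask << shift)) | ((val & mask) << shift);
}

/* Assemble a TRACE_IMC_SCOM value from the fields in the commit message. */
uint64_t trace_imc_scom(uint64_t sampsel, uint64_t cpmc_load,
			uint64_t cpmc1sel, uint64_t cpmc2sel,
			uint64_t buffersize)
{
	uint64_t v = 0;

	v = set_field(v, 0, 1, sampsel);      /* 0:1   SAMPSEL */
	v = set_field(v, 2, 33, cpmc_load);   /* 2:33  CPMC_LOAD */
	v = set_field(v, 34, 40, cpmc1sel);   /* 34:40 CPMC1SEL */
	v = set_field(v, 41, 47, cpmc2sel);   /* 41:47 CPMC2SEL */
	v = set_field(v, 48, 50, buffersize); /* 48:50 BUFFERSIZE */
	return v;                             /* 51:63 RESERVED, left as zero */
}
```

Building the value from named fields keeps the layout visible in one place, which may help if the hard-coded value ever needs to change.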
* hw/imc: Refactor opal init call for core-imcAnju T Sudhakar2019-03-281-27/+43
| | | | | | | | | | | | Factor out core-imc stop api code from opal_imc_counters_init() for better readability. Also fix the error message if wake_up_engine_state is not "WAKEUP_ENGINE_PRESENT". Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Cc: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* core/stack: Rename backtrace functions, get rid of wrappersAndrew Donnellan2019-03-282-5/+4
| | | | | | | | | Rename ___backtrace() to backtrace_create() and ___print_backtrace() to backtrace_print(). Get rid of __backtrace() and __print_backtrace() wrappers. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/fsp, hw/ipmi: Convert attn code to not use backtrace wrappersAndrew Donnellan2019-03-282-9/+10
| | | | | | | | We're about to get rid of __backtrace() and __print_backtrace(), convert the FSP/IPMI attn code to not use them. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* xive: Add calls to save/restore the queues and VPs HW stateCédric Le Goater2019-03-281-0/+130
| | | | | | | | | | | | | | | | | To be able to support migration of guests using the XIVE native exploitation mode, (where the queue is effectively owned by the guest), KVM needs to be able to save and restore the HW-modified fields of the queue, such as the current queue producer pointer and generation bit, and to retrieve the modified thread context registers of the VP from the NVT structure : the VP interrupt pending bits. However, there is no need to set back the NVT structure on P9. P10 should be the same. Based on previous work from BenH. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/phb4: Look for the hub-id from in the PBCQ nodeOliver O'Halloran2019-03-281-3/+9
| | | | | | | | | | The hub-id is stored in the PBCQ node rather than the stack node so we never add it to the PHB node. This breaks the lxvpd slot lookup code since the hub-id is encoded in the VPD record that we need to find the slot information. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/ipmi/test/run-fru: Fix string truncation warning, enhance testStewart Smith2019-03-202-8/+19
| | | | | | | | | | | | | | | | | | We've been getting this warning/error from recent GCC: In file included from hw/ipmi/test/run-fru.c:22: hw/ipmi/test/../ipmi-fru.c: In function ‘fru_add’: hw/ipmi/test/../ipmi-fru.c:162:3: warning: ‘strncpy’ output truncated copying 32 bytes from a string of length 38 [-Wstringop-truncation] strncpy(info.version, version, MAX_STR_LEN + 1); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This patch does two things: 1) Re-arrange some code to shut GCC up. 2) Add extra fu to tests to ensure we're producing correct bytes. Signed-off-by: Stewart Smith <stewart@linux.ibm.com> Tested-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
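One common way to avoid -Wstringop-truncation is to make the truncation explicit and always write the terminator. The sketch below shows that general pattern. It is not necessarily how this patch re-arranged the code, and `copy_truncated` is a hypothetical helper.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst (capacity dst_size bytes), truncating if needed and
 * always NUL-terminating. Returns the number of bytes copied, excluding
 * the terminator.
 */
size_t copy_truncated(char *dst, size_t dst_size, const char *src)
{
	size_t n;

	if (dst_size == 0)
		return 0;

	n = strlen(src);
	if (n > dst_size - 1)
		n = dst_size - 1;	/* leave room for the terminator */

	memcpy(dst, src, n);
	dst[n] = '\0';
	return n;
}
```

Computing the length first makes the intent obvious to both the reader and the compiler, and the return value lets a caller detect that truncation happened.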
* npu2/hw-procedures: Fix parallel zcal for opencapiFrederic Barrat2019-03-202-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | For opencapi, we currently do impedance calibration when initializing the PHY for the device, which could run in parallel if we were rich and had multiple opencapi devices. But if 2 devices are on the same obus, the 2 calibration sequences could overlap, which likely yields bad results and is useless anyway since it only needs to be done once per obus. This patch splits the opencapi PHY reset in 2 parts: - an 'init' part called serially at boot. That's when zcal is done. If we have 2 devices on the same socket, the zcal won't be redone, since we're called serially and we'll see it has already been done for the obus - a 'reset' part called during fundamental reset as a prereq for link training. It does the PHY setup for a set of lanes and the dccal. The PHY team confirmed there's no dependency between zcal and the other reset steps and it can be moved earlier. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2-hw-procedures: Fix zcal in mixed opencapi and nvlink modeFrederic Barrat2019-03-201-3/+21
| | | | | | | | | | | | | | The zcal procedure needs to be run once per obus. We keep track of which obus is already calibrated in an array indexed by the obus number. However, the obus number is inferred from the brick index, which works well for nvlink but not for opencapi. Create an obus_index() function, which, from a device, returns the correct obus index, irrespective of the device type. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2-hw-procedures: Don't set iovalid for opencapi devicesFrederic Barrat2019-03-201-0/+3
| | | | | | | | | | | | | | | | | set_iovalid() is called on the PHY reset path. The hw logic it touches is meaningless for opencapi. It's not hurting as long as all the links under the NPU are in opencapi mode, but in case of mixing opencapi and nvlink, we'll be in trouble: the code finds which bit to modify based on the brick index, which varies depending on the mode. So calling that function on an opencapi device may modify an nvlink brick! For example, for brick index 3. So we simply avoid doing anything when calling set_iovalid() for an opencapi device. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* fast-reboot: occ: Remove 'freq-domain-mask' from fast-reboot pathShilpasri G Bhat2019-03-151-43/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | OCC can change the pstate table at runtime to modify pstate limits or for characterization purpose. These changes are reflected by re-parsing the pstate table during fast-reboot to update the device-tree. Only relevant pstate DT properties are deleted and newly added during fast-reboot. The device-tree properties like 'freq-domain-mask' and 'domain-runs-at' are currently hard-coded and need not be updated during fast-reboot. So this patch removes them from the fast-reboot path. This patch fixes the below crash: [ 270.313998453,5] OCC: All Chip Rdy after 0 ms [ 270.314148918,3] Duplicate property "freq-domain-mask" in node /ibm,opal/power-mgt [ 270.314208553,0] Aborting! CPU 083c Backtrace: S: 0000000035de3a20 R: 000000003001b480 ._abort+0x4c S: 0000000035de3aa0 R: 0000000030028704 .new_property+0xd8 S: 0000000035de3b30 R: 0000000030028964 .__dt_add_property_cells+0x30 S: 0000000035de3bd0 R: 0000000030042980 .occ_pstates_init+0x7c8 S: 0000000035de3d90 R: 00000000300145f4 .load_and_boot_kernel+0x980 S: 0000000035de3e70 R: 00000000300276b4 .fast_reboot_entry+0x37c S: 0000000035de3f00 R: 0000000030002ac4 reset_fast_reboot_wakeup+0x40 Fixes: b821f8c2a8e3("power-mgmt : occ : Add 'freq-domain-mask' DT property") Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Tested-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2-opencapi: Fix adapter reset when using 2 adaptersFrederic Barrat2019-03-132-7/+30
| | | | | | | | | | | | | | | | If two opencapi adapters are on the same obus, we may try to train the two links in parallel at boot time, when all the PCI links are being trained. Both links use the same i2c controller to handle the reset signal, so some care is needed to make sure resetting one doesn't interfere with the reset of the other. We need to keep track of the current state of the i2c controller (and use locking). This went mostly unnoticed as you need to have 2 opencapi cards on the same socket and links tended to train anyway because of the retries. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2-opencapi: Extend delay after releasing reset on adapterFrederic Barrat2019-03-131-2/+2
| | | | | | | | | | | | Give more time to the FPGA to process the reset signal. The previous delay, 5ms, is too short for newer adapters with bigger FPGAs. Extend it to 250ms. Ultimately, that delay will likely end up being added to the opencapi specification, but we are not there yet. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2-opencapi: ODL should be in reset when enabledFrederic Barrat2019-03-131-0/+6
| | | | | | | | | | | | | | | | We haven't hit any problem so far, but from the ODL designer, the ODL should be in reset when it is enabled. The ODL remains in reset until we start a fundamental reset to initiate link training. We still assert and deassert the ODL reset signal as part of the normal procedure just before training the link. Asserting is therefore useless at boot, since the ODL is already in reset, but we keep it as it's only a scom write and it's needed when we reset/retrain from the OS. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2-opencapi: Keep ODL and adapter in reset at the same timeFrederic Barrat2019-03-131-25/+43
| | | | | | | | | | | | | | | Split the function to assert and deassert the reset signal on the ODL, so that we can keep the ODL in reset while we reset the adapter, therefore having a window where both sides are in reset. It is actually not required with our current DLx at boot time, but I need to split the ODL reset function for the following patch and it will become useful/required later when we introduce resetting an opencapi link from the OS. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2-opencapi: Rename functions used to reset an adapterFrederic Barrat2019-03-131-4/+4
| | | | | | | | | | This is really to avoid confusion with a later patch and clarify whether we're resetting the ODL or the adapter. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2-opencapi: Setup perf counters to detect CRC errorsFrederic Barrat2019-03-131-0/+62
| | | | | | | | | | | | | | | | | | | It's possible to set up performance counters for the PLL to detect various conditions for the links in nvlink or opencapi mode. Since those counters are currently unused, let's configure them when an obus is in opencapi mode to detect CRC errors on the link. Each link has two counters: - CRC error detected by the host - CRC error detected by the DLx (NAK received by the host) We also dump the counters shortly after the link trains, but they can be read multiple times through cronus, pdbg or linux. The counters are configured to be reset after each read. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2-opencapi: Rework ODL register accessFrederic Barrat2019-03-132-116/+10
| | | | | | | | | | | | | ODL registers used to control the opencapi link state have an address built on a base address and an offset for each brick which can be computed instead of hard-coded individually for each brick. Rework how we access the ODL registers, to avoid repeating switch statements all over the place. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* opal/hmi: Don't retry TOD recovery if it is already in failed state.Mahesh Salgaonkar2019-03-051-9/+22
| | | | | | | | | | | | | On TOD failure, all cores/threads receive an HMI and the very first thread that gets the interrupt fixes the TOD, whereas the others just reset the respective HMER error bit and return. But when the TOD is unrecoverable, all the threads try to do TOD recovery one by one, causing threads to spend more time inside opal. Set a global flag when the TOD is unrecoverable so that the rest of the threads go back to linux immediately, avoiding lock ups in the system reboot/panic path. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/bt: Introduce separate list for synchronous messagesVasant Hegde2019-03-011-45/+63
| | | | | | | | | | | | | | | | | | | | | | | | | BT send logic always sends the top of the bt message list to the BMC. Once the BMC reads the message, it clears the interrupt and bt_idle() becomes true. bt_add_ipmi_msg_head() adds a message to the top of the list. If the bt message list is not empty then: - if bt_idle() is true then we will end up sending a message to the BMC before getting the response from the BMC for the inflight message. It looks like on some BMC implementations this results in a message timeout. - else we end up starting the message timer without actually sending the message to the BMC, which is not correct. This patch introduces a separate list to track synchronous messages. bt_add_ipmi_msg_head() will add messages to the tail of this new list. We will always process this queue before processing the normal queue. Finally this patch introduces a new variable (inflight_bt_msg) to track the inflight message. This will point to the current inflight message. Suggested-by: Oliver O'Halloran <oohall@gmail.com> Suggested-by: Stewart Smith <stewart@linux.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
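The queueing policy described above can be sketched as follows. This is a simplified model, not the hw/bt code: the types and the functions `queue_add_tail`, `queue_pop`, `bt_send_next` and `bt_complete` are hypothetical, and locking, timers and the actual BMC I/O are left out.

```c
#include <stddef.h>

struct msg {
	int id;
	struct msg *next;
};

struct queue {
	struct msg *head;
	struct msg *tail;
};

struct bt_state {
	struct queue sync_q;	/* synchronous messages, served first */
	struct queue normal_q;	/* everything else */
	struct msg *inflight;	/* message awaiting a BMC response, if any */
};

void queue_add_tail(struct queue *q, struct msg *m)
{
	m->next = NULL;
	if (q->tail)
		q->tail->next = m;
	else
		q->head = m;
	q->tail = m;
}

struct msg *queue_pop(struct queue *q)
{
	struct msg *m = q->head;

	if (m) {
		q->head = m->next;
		if (!q->head)
			q->tail = NULL;
		m->next = NULL;
	}
	return m;
}

/* Pick the next message to send, or NULL if one is already in flight. */
struct msg *bt_send_next(struct bt_state *bt)
{
	struct msg *m;

	if (bt->inflight)
		return NULL;	/* never overlap with the inflight message */

	m = queue_pop(&bt->sync_q);
	if (!m)
		m = queue_pop(&bt->normal_q);
	bt->inflight = m;
	return m;
}

/* Called once the response for the inflight message has arrived. */
void bt_complete(struct bt_state *bt)
{
	bt->inflight = NULL;
}
```

Tracking the inflight message separately from the queues means a newly added synchronous message can never displace a message that has already been handed to the BMC.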
* xive: Make no_sync parameter affermative in __xive_set_irq_config()Michael Neuling2019-02-281-6/+6
| | | | | | | | | | In __xive_set_irq_config() change the no_sync parameter to sync and fix all the call sites. Just a cleanup. No functional change. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/phb4: Fix indentation of brdgCtlOliver O'Halloran2019-02-251-2/+1
| | | | | | | Come on bridge control register. You're letting the team down. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Allow ATSD for LPAR other than 0Alexey Kardashevskiy2019-02-251-1/+21
| | | | | | | | | | | | | | | | | Each XTS MMIO ATSD# register is accompanied by another register - XTS MMIO ATSD0 LPARID# - which controls LPID filtering for ATSD transactions. When a host system passes a GPU through to a guest, we need to enable some ATSD for an LPAR. At the moment the host assigns one ATSD to a NVLink bridge and this maps it to an LPAR when GPU is assigned to the LPAR. The link number is used for an ATSD index. ATSD6&7 stay mapped to the host (LPAR=0) all the time which seems to be acceptable price for the simplicity. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Add XTS_BDF_MAP wildcard refcountAlexey Kardashevskiy2019-02-251-16/+30
| | | | | | | | | | | | | | | | | | Currently PID wildcard is programmed into the NPU once and never cleared up. This works for the bare metal as MSR does not change while the host OS is running. However with the device virtualization, we need to keep track of wildcard entries use and clear them up before switching a GPU from a host to a guest or vice versa. This adds refcount to a NPU2, one counter per wildcard entry. The index is a short lparid (4 bits long) which is allocated in opal_npu_map_lpar() and should be smaller than NPU2_XTS_BDF_MAP_SIZE (defined as 16). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
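A minimal sketch of per-entry reference counting, assuming one counter per short lparid as described above. `struct wildcard_refs`, `wildcard_get` and `wildcard_put` are hypothetical names, and the hardware programming itself is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>

#define XTS_BDF_MAP_SIZE 16	/* mirrors NPU2_XTS_BDF_MAP_SIZE in the text */

struct wildcard_refs {
	unsigned int count[XTS_BDF_MAP_SIZE];
};

/* Take a reference. Returns true if the caller must program the entry. */
bool wildcard_get(struct wildcard_refs *r, unsigned int lparid)
{
	assert(lparid < XTS_BDF_MAP_SIZE);
	return r->count[lparid]++ == 0;
}

/* Drop a reference. Returns true if the caller must clear the entry. */
bool wildcard_put(struct wildcard_refs *r, unsigned int lparid)
{
	assert(lparid < XTS_BDF_MAP_SIZE);
	assert(r->count[lparid] > 0);
	return --r->count[lparid] == 0;
}
```

Only the first get and the last put touch the hardware entry, so a GPU can be moved between host and guest without leaving a stale wildcard behind.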
* power-mgmt : occ : Add 'freq-domain-mask' DT propertyAbhishek Goel2019-02-251-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new device-tree property freq-domain-mask to define the group of CPUs which share the same frequency. This property has been added under the power-mgt node. It is a bitmask. A bitwise AND is taken between this bitmask value and the PIR of a CPU. All the CPUs lying in the same frequency domain will have the same result for the AND. For example, for POWER9, 0xFFF0 indicates a quad-wide frequency domain. Taking the AND with the PIR of the CPUs yields a quad-wise distribution, as the last 4 bits, which represent the cores, have been masked. Similarly, 0xFFF8 will represent a core-wide frequency domain for P8. Also, add a new device-tree property domain-runs-at which denotes the strategy the OCC is using to change the frequency of a frequency domain. There can be two strategies: FREQ_MOST_RECENTLY_SET and FREQ_MAX_IN_DOMAIN. FREQ_MOST_RECENTLY_SET : the OCC sets the frequency of the quad to the most recent frequency value requested by the CPUs in the quad. FREQ_MAX_IN_DOMAIN : the OCC sets the frequency of the CPUs in the Quad to the maximum of the latest frequency requested by each of the component cores. Signed-off-by: Abhishek Goel <huntbag@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
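The mask comparison described above is short enough to show directly. In this sketch `same_freq_domain` is a hypothetical helper, and the PIR values in the note below are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Two CPUs share a frequency domain when their PIRs agree on every bit
 * that the domain mask keeps.
 */
bool same_freq_domain(uint32_t pir_a, uint32_t pir_b, uint32_t mask)
{
	return (pir_a & mask) == (pir_b & mask);
}
```

With the POWER9 mask 0xFFF0, PIRs 0x083c and 0x0830 both reduce to 0x0830 and so fall in the same domain, while 0x0840 does not. With the P8 mask 0xFFF8, PIRs 0x0838 and 0x083c match but 0x0834 falls in a different domain.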
* Retry link training at PCIe GEN1 if presence detected but training repeatedly failedTimothy Pearson2019-02-261-13/+46
| | | | | | | | | | | Certain older PCIe 1.0 devices will not train unless the training process starts at GEN1 speeds. As a last resort when a device will not train, fall back to GEN1 speed for the last training attempt. This is verified to fix devices based on the Conexant CX23888 on the Talos II platform. Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> [stewart: cut P9NDD1.0 support, fixup dt_max_link_speed] Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
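The fallback policy can be sketched as a pure function of the attempt number. This is an illustration, not the PHB4 code: `link_speed_for_attempt` and its parameters are hypothetical, and the real retry count and speed plumbing are not given in the message.

```c
#include <assert.h>

/*
 * PCIe generation to request for a given training attempt: every attempt
 * uses the normal maximum speed except the last one, which drops to GEN1.
 * Attempts are numbered from 0.
 */
unsigned int link_speed_for_attempt(unsigned int attempt,
				    unsigned int max_attempts,
				    unsigned int max_speed)
{
	assert(max_attempts > 0 && attempt < max_attempts);
	return (attempt == max_attempts - 1) ? 1 : max_speed;
}
```

Keeping the normal speed for all but the final attempt means devices that train correctly are unaffected and only persistent failures pay the cost of a GEN1 retry.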
* imc/catalog: Decompress catalog asynchronouslySantosh Sivaraj2019-02-251-84/+57
| | | | | | | | | | | | | | | The In-Memory Collection (IMC) counters catalog is a compressed blob which is loaded from the flash; decompression starts once the data is loaded from nvram by the main thread. This can be optimized by using the libxz API function which creates a job to do the decompression without blocking the main thread. Refactor decompress() to use the libxz asynchronous wrapper functions. This also cleans up the error handling path in imc_init(). CC: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Santosh Sivaraj <santosh@fossix.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* powercap: occ: Fix the powercapping range allowed for userShilpasri G Bhat2019-02-251-8/+22
| | | | | | | | | | | | | | | | OCC provides two limits for minimum powercap. One being hard powercap minimum which is guaranteed by OCC and the other one is a soft powercap minimum which is lesser than hard-min and may or may not be asserted due to various power-thermal reasons. So to allow the users to access the entire powercap range, this patch exports soft powercap minimum as the "powercap-min" DT property. And it also adds a new DT property called "powercap-hard-min" to export the hard-min powercap limit. Fixes: c6aabe3f2eb5("powercap: occ: Add a generic powercap framework") Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* sparse: symbol '*bar*' was not declared. Should it be static?Stewart Smith2019-02-251-3/+3
| | | | | | Yes, a bunch of HOMER symbols should be. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* sparse: symbols in imc.c weren't declared, Should they be static?Stewart Smith2019-02-251-6/+6
| | | | | | Yes, a bunch of these should be. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* sparse: symbol 'procedure_*' was not declared. Should it be static?Stewart Smith2019-02-251-2/+2
| | | | | | Yes they should. Do so by adding static to the macro. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* sparse: symbol 'NPU2_PHY_*' was not declared. Should it be static?Stewart Smith2019-02-251-66/+71
| | | | | | | Yes they should. Also, some are unused so we comment them out to at least keep the code as documentation complete. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* sparse: symbol 'xive_buddy_lock/xive_vp_buddy' was not declared. Should it be static?Stewart Smith2019-02-251-3/+3
| | | | | | | Yes they should. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>