talos-skiboot - Talos™ II skiboot sources

	Commit message	Author	Age	Files	Lines
...
*	npu2-opencapi: Enable presence detection on ZZ	Frederic Barrat	2018-10-25	1	-2/+1
	Presence detection for opencapi adapters was broken for ZZ planars v3 and below. All ZZ systems currently used in the lab have had their planar upgraded, so we can now remove the override we had to force presence and activate presence detection, which should improve boot time.
	Considering the state of opal support on ZZ, this is really only for lab usage on BML. The opencapi enablement team has okay'd the change.
	In the unlikely case somebody tries opencapi on an old ZZ, the presence detection through i2c will show that no adapter is present and skiboot won't try to access or train the link.
	Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
	Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	lpc: Clear sync no-response field prior to device probe	Andrew Jeffery	2018-10-23	1	-1/+6
	Artem Senichev reported[1] his P8 platform was failing to boot from a43e9a66aae9 ("astbmc: Fail SFC init if SIO is unavailable") with the following error:
	[ 110.097168975,3] PLAT: Failed to open PNOR flash controller
	I reproduced this behaviour on a Palmetto; we need to ensure the state of the no-response error bit is clear before proceeding with the presence test.
	The fix appears to resolve the failure to open the PNOR flash controller on Palmetto and doesn't change the expected behaviour on Witherspoon.
	[1] https://github.com/open-power/skiboot/issues/197
	Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
	Tested-by: Artem Senichev <a.senichev@yadro.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	phb4: Enable PHB MMIO-0/1 Bars only when mmio window exists	Vaibhav Jain	2018-10-16	1	-2/+1
	Presently phb4_probe_stack() will always enable PHB MMIO0/1 windows even if they don't exist in phy_map. Hence we do some minor shuffling in phb4_probe_stack() so that MMIO-0/1 Bars are only enabled if their corresponding MMIO window exists in the phy_map. In case phy_map for an mmio window is '0' we set the corresponding BAR register to '0'.
	Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
	Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
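The rule described in this message can be sketched as below. This is a minimal illustration and not the skiboot code: the function name and the two enable bits are hypothetical, and a window base of 0 stands in for "no entry in phy_map".

```c
#include <stdint.h>

/* Hypothetical enable bits for the two MMIO BARs (illustration only). */
#define DEMO_BAR_EN_MMIO0 (1ULL << 0)
#define DEMO_BAR_EN_MMIO1 (1ULL << 1)

/*
 * Build the BAR enable mask: a BAR is enabled only when the physical
 * map provides a non-zero window base for it.
 */
static uint64_t demo_bar_enable(uint64_t mmio0_base, uint64_t mmio1_base)
{
	uint64_t en = 0;

	if (mmio0_base)
		en |= DEMO_BAR_EN_MMIO0;
	if (mmio1_base)
		en |= DEMO_BAR_EN_MMIO1;
	return en;
}
```

With both bases at 0 the mask stays 0, so neither BAR is switched on.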
*	phb4/capp: Update the expected Eye-catcher for CAPP ucode lid	Vaibhav Jain	2018-10-16	1	-2/+2
	Currently on a FSP based P9 system load_capp_code() expects CAPP ucode lid header to have eye-catcher magic of 'CAPPPSLL'. However skiboot currently supports CAPP ucode only lids that have an eye-catcher magic of 'CAPPLIDH'. This prevents skiboot from loading the ucode with this error message:
	CAPP: ucode header invalid
	We fix this issue by updating load_capp_ucode() to use the eye-catcher value of 'CAPPLIDH' instead of 'CAPPPSLL'.
	Cc: stable
	Fixes: e50764d4f2b1("capi: Load capp microcode")
	Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
	Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
	Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	phb4/capp: Use link width to allocate STQ engines to CAPP	Vaibhav Jain	2018-10-16	1	-17/+29
	Update phb4_init_capp_regs() to allocate STQ Engines to CAPP/PEC2 based on link width instead of always assuming it to be x8. Also re-factor the function slightly to evaluate the link-width only once and cache it so that it can also be used to allocate DMA read engines.
	Cc: stable
	Fixes: 47c09cdfe7a3("phb4/capp: Calculate STQ/DMA read engines based on link-width for PEC")
	Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
	Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	astbmc: Fail SFC init if SIO is unavailable	Andrew Jeffery	2018-10-11	1	-0/+3
	If SuperIO is unavailable then the driver cannot perform accesses on which it currently depends. Test for SuperIO availability during initialisation and bail out immediately if it is absent.
	Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	platform: Restructure bmc_platform type	Andrew Jeffery	2018-10-11	2	-5/+5
	Segregate the BMC platform configuration into hardware and software components. This allows population of platform default values for hardware configuration that may no longer be accessible by the host.
	Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
	[stewart: fixup pci-quirk unit test]
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	astbmc: Use LPC probe calls to determine SIO presence	Andrew Jeffery	2018-10-11	1	-20/+10
	Avoid the probabilistic approach and use a deterministic one instead. The probe calls use a slow, synchronous method to capture the state of the target device, so they are used sparingly (only on first access).
	Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	lpc: Introduce generic probe capability	Andrew Jeffery	2018-10-11	1	-54/+146
	Introduce generic read and write probe functions that allow detection of valid addresses by way of synchronous testing for the SYNC no-response state. If the no-response state is detected the probe functions will return an error to the caller, who can do with it what they wish.
	In the process, rip out the naive mechanism for muting the equivalent asynchronous error logging (regretfully introduced recently by yours truly).
	Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	astbmc: Remove coordinated isolation support	Andrew Jeffery	2018-10-11	1	-56/+0
	Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	astbmc: Prefer ipmi-hiomap for PNOR access	Andrew Jeffery	2018-10-11	1	-3/+7
	If the IPMI command is not available, fall back to the mailbox interface.
	Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
	[stewart: fix up mbox test]
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	ipmi: Introduce registration for SEL command handlers	Andrew Jeffery	2018-10-10	1	-29/+89
	Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	phb4: Generate checkstop on AIB ECC corr/uncorr for DD2.0 parts	Michael Neuling	2018-09-27	1	-9/+34
	On DD2.0 parts, PCIe ECC protection is not warranted in the response data path. Thus, for these parts, we need to flag any ECC errors detected from the adjacent AIB RX Data path so the part can be replaced.
	This patch configures the FIRs so that we escalate these AIB ECC errors to a checkstop so the parts can be replaced.
	Signed-off-by: Michael Neuling <mikey@neuling.org>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	hw/p8-i2c: Fix i2c request timeout	Frederic Barrat	2018-09-27	1	-6/+4
	Commit eb146fac9685 ("core/i2c: Move the timeout field into i2c_request") simplified a bit how a request timeout is handled. However there's now some confusion between milliseconds and timebase increments when defining or using the timeout values, which breaks i2c requests made for opencapi, and probably others too.
	This patch declares all the timeouts in milliseconds and just converts to timebase at the end of the chain, as needed.
	Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
	Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
	Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
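The unit handling described in this message can be sketched as follows. It is an illustration under stated assumptions rather than the actual patch: `demo_i2c_request`, `demo_msecs_to_tb()` and `demo_deadline()` are hypothetical names, and the 512 MHz timebase frequency is assumed for the example.

```c
#include <stdint.h>

/* Assumed timebase frequency in ticks per second (illustration only). */
#define DEMO_TB_HZ 512000000ULL

/* Hypothetical request: the timeout is always stored in milliseconds. */
struct demo_i2c_request {
	uint64_t timeout_ms;
};

/* Convert milliseconds to timebase ticks. */
static uint64_t demo_msecs_to_tb(uint64_t msecs)
{
	return msecs * (DEMO_TB_HZ / 1000);
}

/*
 * Convert only at the end of the chain, when the absolute deadline
 * is computed from the current timebase value.
 */
static uint64_t demo_deadline(const struct demo_i2c_request *req,
			      uint64_t now_tb)
{
	return now_tb + demo_msecs_to_tb(req->timeout_ms);
}
```

Keeping a single unit in the request structure means a caller can never pass a tick count where milliseconds are expected; the conversion happens in exactly one place.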
*	FSP: Improve Reset/Reload log message	Vasant Hegde	2018-09-20	1	-2/+2
	The message below is confusing. Let's make it clear.
	FSP sends "R/R complete notification" whenever there is a dump. We use `flag` to identify whether it's an R/R completion -OR- just a new dump notification.
	[ 483.406351956,6] FSP: SP says Reset/Reload complete
	[ 483.406354278,5] DUMP: FipS dump available. ID = 0x1a00001f [size: 6367640 bytes]
	[ 483.406355968,7] A Reset/Reload was NOT done
	Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	phb4: Re-factor phb4_fenced() and introduce phb4_dump_pec_err_regs()	Vaibhav Jain	2018-09-20	1	-30/+38
	There are a couple of places in 'phb4.c' where we may want to dump the PEC's error registers. Hence we introduce phb4_dump_pec_err_regs(), which dumps all the PEC error registers and also updates phb4->nfir_cache & phb4->pfir_cache for later use.
	Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
	Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	phb4: Reset pfir and nfir if new errors reported during ETU reset	Vaibhav Jain	2018-09-20	1	-0/+19
	During fast-reboot new PEC errors can be latched even after ETU-Reset is asserted. This will result in the values of variables nfir_cache and pfir_cache being out of sync.
	During step-2 of CRESET nfir_cache and pfir_cache values are used to bring the PHB out of reset state. However, if these variables are out of date as noted above, the nfir/pfir registers are never reset completely and ETU still remains frozen.
	Hence this patch updates step-2 of phb4_creset to re-read the values of nfir/pfir registers to check if any new errors were reported after ETU-reset was asserted, report these new errors and reset the nfir/pfir registers. This should bring the ETU out of reset successfully.
	Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
	Tested-By: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
	Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	SBE-p8: Do all sbe timer update with xscom lock held	Stewart Smith	2018-09-17	1	-3/+4
	Without this, on some P8 platforms, we could (falsely) think the SBE timer had stalled getting the dreaded "timer stuck" message.
	The code was doing the mftb() to set the start of the timeout period while not holding the lock, so the 1ms timeout started sometime when somebody else had the xscom lock.
	The simple solution is to just do the whole routine holding the xscom lock, so do it that way.
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	phb4: Fix typo in disable lane eq code	Michael Neuling	2018-09-17	1	-1/+1
	In this commit:
	commit 737c0ba3d72b8aab05a765a9fc111a48faac0f75
	Author: Michael Neuling <mikey@neuling.org>
	Date: Thu Feb 22 10:52:18 2018 +1100
	phb4: Disable lane eq when retrying some nvidia GEN3 devices
	We made a typo and set PH2 twice. This fixes it.
	It worked previously because, if only phase 2 (PH2) is set, it skips both phase 2 and phase 3 (PH3).
	Reported-by: Meng Li <shlimeng@cn.ibm.com>
	Signed-off-by: Michael Neuling <mikey@neuling.org>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	core/i2c: Remove bus specific alloc and free callbacks	Oliver O'Halloran	2018-09-17	1	-10/+0
	These are now pointless and they can be replaced with zalloc() and free().
	Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	hw/p8-i2c: Remove p8_i2c_request structure	Oliver O'Halloran	2018-09-17	1	-31/+3
	The p8_i2c_request structure is barely used and the only useful data it contains (port_num) can be derived from the bus pointer. Remove it in preparation for removing the per-bus allocation and free methods.
	Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	core/i2c: Move the timeout field into i2c_request	Oliver O'Halloran	2018-09-17	1	-24/+4
	Currently to set a per-request timeout you need to use i2c_req_set_timeout() which is a wrapper for a per-bus method that sets the actual timeout. This design doesn't make a whole lot of sense, so move the timeout field into the generic i2c_request structure and set the timeout using that.
	Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	hw/npu2, platform: Restructure OpenCAPI i2c reset/presence pins	Andrew Donnellan	2018-09-17	2	-5/+15
	In platform_ocapi, we define i2c_{reset,presence}_odl{0,1} to specify the appropriate reset/presence GPIO pins for devices connected to ODL0 and ODL1 respectively.
	This is obviously wrong, because a device connected to brick 2 and a device connected to brick 4 are going to be different devices connected to different I2C pins, but rather conveniently we haven't had to deal with systems that can use the full 4 bricks as yet.
	Now that we're adding OpenCAPI support for Witherspoon, we should change this to specify pins separately for all 4 bricks.
	Replace i2c_{reset,presence}_odl{0,1} with i2c_{reset,presence}_brick{2,3,4,5} and update the presence detection code, device reset code, and existing platforms accordingly.
	Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
	Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
	Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	hw/npu2, platform: Add NPU2 platform device detection callback	Andrew Donnellan	2018-09-17	2	-93/+112
	There is no standardised way to determine the presence and type of devices connected to an NPU on POWER9. Currently, we hardcode device types based on platform type (as no platform currently supports both OpenCAPI and NVLink), and for OpenCAPI platforms we use I2C to detect presence.
	Witherspoon (and potentially other platforms later on) supports both NVLink and OpenCAPI, and additionally uses SXM2 connectors which can carry more than one link, rather than the SlimSAS connectors used for OpenCAPI on Zaius and ZZ. This necessitates some special handling.
	Add a platform callback for NPU device detection. In a later patch, we will use this to implement Witherspoon-specific device detection. For now, add a Witherspoon stub that sets all links to NVLink (i.e. current behaviour).
	Move the existing I2C-based presence detection for OpenCAPI devices on Zaius/ZZ into common code, which we use by default for platforms which do not define a callback.
	Clean up the use of the ibm,npu-link-type property, which will now be exposed solely for debugging and not consumed internally.
	Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
	Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	hw/npu2: Common NPU2 init routine between NVLink and OpenCAPI	Andrew Donnellan	2018-09-17	3	-268/+245
	Replace probe_npu2() and probe_npu2_opencapi() with a new shared probe_npu2(). Refactor some of the common NPU setup code into shared code.
	No functional change. This patch does not implement support for using both types of devices simultaneously on the same NPU - we expect to add this sometime in the future.
	Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
	Acked-by: Reza Arbab <arbab@linux.ibm.com>
	Reviewed-by: Alistair Popple <alistair@popple.id.au>
	Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	npu2: Split device index into brick and link index	Andrew Donnellan	2018-09-17	3	-50/+56
	On Witherspoon, OpenCAPI devices attached to link indexes 0 and 1 are handled by bricks 2 and 3.
	Rename index to brick_index, and add a new field, link_index, to refer to the link index. For now, we set those values identically.
	Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
	Acked-by: Reza Arbab <arbab@linux.ibm.com>
	Reviewed-by: Alistair Popple <alistair@popple.id.au>
	Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	occ: Wait if OCC GPU presence status not immediately available	Andrew Donnellan	2018-09-17	1	-3/+13
	It takes a few seconds for the OCC to set everything up in order to read GPU presence. At present, we try to kick off OCC initialisation as early as possible to maximise the time it has to read GPU presence.
	Unfortunately sometimes that's not enough, so add a loop in occ_get_gpu_presence() so that on the first time we try to get GPU presence we keep trying for up to 2 seconds. Experimentally this seems to be adequate.
	Fixes: 9b394a32c8ea ("occ: Add support for GPU presence detection")
	Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
	Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	npu2: Use correct kill type for TCE invalidation	Alexey Kardashevskiy	2018-09-17	1	-1/+1
	kill_type is an enum of OPAL_PCI_TCE_KILL_PAGES, OPAL_PCI_TCE_KILL_PE and OPAL_PCI_TCE_KILL_ALL. phb4_tce_kill() gets it right, but npu2_tce_kill() uses OPAL_PCI_TCE_KILL, which is an OPAL API token.
	This fixes an obvious mistype.
	Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	hw/npu2-hw-procedures: Enable RX auto recal on OpenCAPI links	Andrew Donnellan	2018-09-17	1	-0/+8
	The RX_RC_ENABLE_AUTO_RECAL flag is required on OpenCAPI but not NVLink.
	Traditionally, Hostboot sets this value according to the machine type. However, now that Witherspoon supports both NVLink and OpenCAPI, it can't tell whether or not a link is OpenCAPI.
	So instead, set it in skiboot, where it will only be triggered after we've done device detection and found an OpenCAPI device.
	Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
	Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
	Acked-by: Reza Arbab <arbab@linux.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	hw/npu2-opencapi: Fix setting of supported OpenCAPI templates	Andrew Donnellan	2018-09-17	1	-2/+2
	In opal_npu_tl_set(), we made a typo that means the OPAL_NPU_TL_SET call may not clear the enable bits for templates that were previously enabled but are now disabled.
	Fix the typo so we clear NPU2_OTL_CONFIG1_TX_TEMP2_EN as well as TEMP{1,3}_EN.
	Reported-by: Tyler Seredynski <tseredynski@gmail.com>
	Fixes: cd8b82a8e83ed ("npu2-opencapi: Add OpenCAPI OPAL API calls")
	Cc: stable
	Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
	Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
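The clear-then-set pattern this fix restores can be sketched as below. The bit positions and the `demo_` names are hypothetical and chosen only for illustration; the real register layout is not given in the message.

```c
#include <stdint.h>

/* Hypothetical bit positions for the three template enable bits. */
#define DEMO_TX_TEMP1_EN (1ULL << 0)
#define DEMO_TX_TEMP2_EN (1ULL << 1)
#define DEMO_TX_TEMP3_EN (1ULL << 2)
#define DEMO_TX_TEMP_ALL \
	(DEMO_TX_TEMP1_EN | DEMO_TX_TEMP2_EN | DEMO_TX_TEMP3_EN)

/*
 * Clear every template enable bit first, then set only the requested
 * ones, so a template that was enabled earlier does not stay enabled.
 * Bits outside the template field are left untouched.
 */
static uint64_t demo_set_templates(uint64_t reg, uint64_t wanted)
{
	reg &= ~DEMO_TX_TEMP_ALL;
	reg |= wanted & DEMO_TX_TEMP_ALL;
	return reg;
}
```

If one of the three bits is left out of the clear mask, a stale enable for that template survives the call, which is the class of bug the message describes.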
*	fsp/surv: Improve log message	Vasant Hegde	2018-09-13	1	-2/+4
	Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	phb4: Don't probe a PHB if its garded	Vaibhav Jain	2018-09-13	1	-2/+11
	Presently phb4_probe_stack() causes an exception while trying to probe a PHB if it's garded. This causes skiboot to go into a reboot loop with the following exception log:
	***********************************************
	Fatal MCE at 000000003006ecd4 .probe_phb4+0x570
	CFAR : 00000000300b98a0
	<snip>
	Aborting!
	CPU 0018 Backtrace:
	S: 0000000031cc37e0 R: 000000003001a51c ._abort+0x4c
	S: 0000000031cc3860 R: 0000000030028170 .exception_entry+0x180
	S: 0000000031cc3a40 R: 0000000000001f10 *
	S: 0000000031cc3c20 R: 000000003006ecb0 .probe_phb4+0x54c
	S: 0000000031cc3e30 R: 0000000030014ca4 .main_cpu_entry+0x5b0
	S: 0000000031cc3f00 R: 0000000030002700 boot_entry+0x1b8
	This is caused as phb4_probe_stack() will ignore all xscom read/write errors to enable PHB Bars and then tries to perform an mmio to read PHB Version registers that cause the fatal MCE.
	We fix this by ignoring the PHB probe if the first xscom_write() to populate the PHB Bar register fails, which indicates that there is something wrong with the PHB.
	Cc: stable
	Fixes: dc21b4db3a2e('hw/phb4: Add initial support')
	Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
	Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	phb4: Workaround PHB errata with CFG write UR/CA errors	Benjamin Herrenschmidt	2018-09-13	1	-1/+5
	If the PHB encounters a UR or CA status on a CFG write, it will incorrectly freeze the wrong PE. Instead of using the PE# specified in the CONFIG_ADDRESS register, it will use the PE# of whatever MMIO occurred last.
	Work around this by disabling freeze on such errors.
	Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
	Tested-By: Oliver O'Halloran <oohall@gmail.com>
	Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	phb4: Handle allocation errors in phb4_eeh_dump_regs()	Benjamin Herrenschmidt	2018-09-13	1	-0/+4
	If the zalloc fails (and it can be a rather large allocation), we will overwrite memory at 0 instead of failing.
	Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
	Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	phb4: Don't try to access non-existent PEST entries	Benjamin Herrenschmidt	2018-09-13	1	-3/+3
	In a POWER9 chip, some PHB4s have 256 PEs, some have 512.
	Currently, the diagnostics code retrieves 512 unconditionally, which is wrong and causes us to report bogus values for the "high" PEs on the small PHBs.
	Use the actual number of implemented PEs instead.
	Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
	Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	phb4: Disable 32-bit MSI in capi mode	Frederic Barrat	2018-08-15	1	-0/+9
	If a capi device does a DMA write targeting an address lower than 4GB, it does so through a 32-bit operation, per the PCI spec. In capi mode, the first TVE entry is configured in bypass mode, so the address is valid. But with any (bad) luck, the address could be 0xFFFFxxxx, thus looking like a 32-bit MSI.
	We currently enable both 32-bit and 64-bit MSIs, so the PHB will interpret the DMA write as a MSI, which very likely results in an EEH (MSI with a bad payload size).
	We can fix it by disabling 32-bit MSI when switching the PHB to capi mode. Capi devices are 64-bit.
	Cc: stable
	Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
	Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
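The address collision described in this message can be illustrated with a small sketch. The actual decode happens in PHB hardware; the two helpers below are hypothetical and only model the 0xFFFFxxxx pattern named in the text and the effect of turning 32-bit MSIs off.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True when a DMA address falls in the 0xFFFFxxxx range, i.e. below
 * 4GB with the upper 16 bits of the low word all set.
 */
static bool demo_looks_like_msi32(uint64_t dma_addr)
{
	return (dma_addr >> 16) == 0xFFFFULL;
}

/*
 * Model of the decision: a write can only be mistaken for a 32-bit MSI
 * while 32-bit MSIs are enabled.
 */
static bool demo_taken_as_msi(uint64_t dma_addr, bool msi32_enabled)
{
	return msi32_enabled && demo_looks_like_msi32(dma_addr);
}
```

With 32-bit MSIs disabled, the same address is treated as ordinary DMA in this model, which mirrors the fix applied when switching to capi mode.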
*	capp: Fix the capp recovery timeout comparison	Vaibhav Jain	2018-08-15	1	-1/+1
	The current capp recovery timeout control loop in do_capp_recovery_scoms() uses a wrong comparison for the return value of tb_compare(). This may cause do_capp_recovery_scoms() to report a timeout earlier than the 168ms stipulated time.
	The patch fixes this by updating the loop timeout control branch in do_capp_recovery_scoms() to use the correct enum tb_cmpval.
	Cc: Stable #6.0+
	Fixes: 09b853cae0aa0("capi: Poll Err/Status register during CAPP recovery")
	Reported-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
	Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
	Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
	Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
	Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
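A three-way timebase comparison of the kind this message refers to might look like the sketch below. The enum mirrors the `tb_cmpval` name from the message, but its member names and values, along with the `demo_` helpers, are assumptions made for illustration and are not copied from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed shape of the three-way result (illustration only). */
enum tb_cmpval {
	TB_ABEFOREB = -1,
	TB_AEQUALB  = 0,
	TB_AAFTERB  = 1
};

/* Compare two timebase values using their signed difference. */
static enum tb_cmpval demo_tb_compare(uint64_t a, uint64_t b)
{
	if (a == b)
		return TB_AEQUALB;
	return ((int64_t)(b - a)) > 0 ? TB_ABEFOREB : TB_AAFTERB;
}

/*
 * The loop has timed out only once "now" is strictly after the
 * deadline. Testing the result for plain truthiness would also fire
 * for TB_ABEFOREB and report a timeout too early.
 */
static bool demo_timed_out(uint64_t now, uint64_t deadline)
{
	return demo_tb_compare(now, deadline) == TB_AAFTERB;
}
```

Comparing against a named enum value makes the intended direction of the test explicit at the call site.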
*	phb4/capp: Update DMA read engines set in APC_FSM_READ_MASK based on link-width	Vaibhav Jain	2018-08-13	1	-4/+18
	Commit 47c09cdfe7a3("phb4/capp: Calculate STQ/DMA read engines based on link-width for PEC") updated the CAPP init sequence by calculating the needed STQ/DMA-read engines based on link width and populating it in the XPEC_NEST_CAPP_CNTL register. This however needs to be synchronized with the value set in the CAPP APC FSM Read Machine Mask Register.
	Hence this patch updates phb4_init_capp_regs() to calculate the link width of the stack on PEC2 and populate the same values as previously populated in the PEC CAPP_CNTL register.
	Cc: stable # v5.7+
	Fixes: 47c09cdfe7a3("phb4/capp: Calculate STQ/DMA read engines based on link-width for PEC")
	Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
	Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
	Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	hw/chiptod: test QUIRK_NO_CHIPTOD in opal_resync_timebase	Nicholas Piggin	2018-08-13	1	-0/+4
	This allows some test coverage of deep stop states in Linux with Mambo.
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	ast-io: Use bmc_sio_{get, put}() where required	Andrew Jeffery	2018-08-13	1	-1/+7
	Booting the host with a particular BMC configuration could lead to the following error appearing in the OPAL msglog:
	[ 71.470748378,3] PLAT: AST IO initialisation failed!
	Wrap access to BMC_SIO_PLAT_FLAGS in bmc_sio_get()/bmc_sio_put() in order to unlock and relock the SuperIO controller as required and avoid the failure.
	Fixes: ebc8524a3a45 ("ast-io: Rework setup/tear-down of communication with the BMC")
	Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	phb4: Use the return value of phb4_fenced() in phb4_get_diag_data()	Cyril Bur	2018-08-06	1	-2/+3
	phb4_get_diag_data() checks the flags for the PHB4_AIB_FENCED after having called phb4_fenced(). This information is returned by phb4_fenced().
	This patch was prompted by an unused return value warning in Coverity.
	Fixes: CID 163734
	Signed-off-by: Cyril Bur <cyril.bur@au1.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	npu2: Add support for relaxed-ordering mode	Reza Arbab	2018-08-06	1	-2/+273
	Some device drivers support out of order access to GPU memory. This does not affect the CPU view of memory but it does affect the GPU view of memory. It should only be enabled if the GPU driver has requested it.
	Add OPAL APIs allowing the driver to query relaxed ordering state or request it to be set for a device. Current hardware only allows relaxed ordering to be enabled per PCIe root port. So the code here doesn't enable relaxed ordering until it has been explicitly requested for every device on the port.
	Signed-off-by: Alistair Popple <alistair@popple.id.au>
	[arbab@linux.ibm.com: Rebase/refactor original changes]
	Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
	Reviewed-By: Alistair Popple <alistair@popple.id.au>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
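The "every device on the port" rule described in this message can be sketched as follows. The structure and function are hypothetical and exist only to show the all-or-nothing decision; they are not the data structures used by the patch.

```c
#include <stdbool.h>

#define DEMO_MAX_DEVS 8

/* Hypothetical per-root-port bookkeeping of driver requests. */
struct demo_port {
	unsigned int ndevs;
	bool requested[DEMO_MAX_DEVS];
};

/*
 * Relaxed ordering is a per-port setting, so it may only be switched
 * on once every device behind the port has asked for it.
 */
static bool demo_port_relaxed_ordering(const struct demo_port *p)
{
	unsigned int i;

	if (p->ndevs == 0 || p->ndevs > DEMO_MAX_DEVS)
		return false;
	for (i = 0; i < p->ndevs; i++) {
		if (!p->requested[i])
			return false;
	}
	return true;
}
```

A single device that has not opted in keeps the whole port in strict ordering, which is the conservative choice when the hardware cannot distinguish between devices.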
*	npu2: Don't open code NPU2_RELAXED_ORDERING_CFG2	Reza Arbab	2018-08-06	1	-18/+13
	Make the code that initializes these registers more descriptive by using macros instead of open coded literals. No functional change.
	Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
	Reviewed-By: Alistair Popple <alistair@popple.id.au>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	phb4: Track PEC index in dt and phb4 struct	Reza Arbab	2018-08-06	1	-0/+2
	Knowing the PEC index is going to be important when we enable relaxed ordering, so store this value for later use.
	Signed-off-by: Alistair Popple <alistair@popple.id.au>
	[arbab@linux.ibm.com: Rebase/refactor original changes]
	Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
	Reviewed-By: Alistair Popple <alistair@popple.id.au>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	hw/npu2: Don't assert if we hit a mixed OpenCAPI/NVLink setup	Andrew Donnellan	2018-08-06	1	-1/+1
	If our device tree contains a mix of OpenCAPI and NVLink links, that's a problem, but it's not fatal and we should simply abort NPU init rather than kill the machine - this is helpful for doing further debugging.
	Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	xive: Disable block tracker	Benjamin Herrenschmidt	2018-08-02	1	-2/+4
	Due to some HW errata, the block tracking facility (performance optimisation for large systems) should be disabled on Nimbus chips. Disable it unconditionally for now.
	Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
	Reviewed-by: Cédric Le Goater <clg@kaod.org>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	hw/p8-i2c: Print the set error bits	Oliver O'Halloran	2018-08-01	1	-0/+10
	This is purely to save me from having to look it up every time someone gets an I2C error.
	Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	hw/phb4: Use local_alloc for phb4 structures	Oliver O'Halloran	2018-08-01	1	-2/+6
	Struct phb4 is fairly heavyweight at 283664 bytes. On systems with 6x PHBs per socket this results in using 3.2MB of heap space for the PHB structures alone. This is a fairly large chunk of our 12MB heap and on systems with particularly large PCIe topologies, or additional PHBs, we can fail to boot because we cannot allocate space for the FDT blob.
	This patch switches to using local_alloc() for the PHB structures so they don't consume too large a portion of our 12MB heap space.
	Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
	Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
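The heap figure quoted in this message can be reproduced with a short calculation. The structure size and the PHB count per socket come from the message; the two-socket system is an assumption made here, since the message does not state the socket count, and the `DEMO_`/`demo_` names are hypothetical.

```c
#include <stddef.h>

/* Size of struct phb4 as quoted in the commit message. */
#define DEMO_PHB4_BYTES      283664u
/* PHBs per socket, as quoted in the commit message. */
#define DEMO_PHBS_PER_SOCKET 6u

/* Total heap bytes taken by the PHB structures on a given system. */
static size_t demo_phb_heap_bytes(unsigned int sockets)
{
	return (size_t)DEMO_PHB4_BYTES * DEMO_PHBS_PER_SOCKET * sockets;
}
```

Under the two-socket assumption this gives 283664 × 6 × 2 = 3,403,968 bytes, about 3.2MB when counted in units of 1,048,576 bytes, or roughly a quarter of the 12MB heap.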
*	hw/phb4: Fix unused value/parameter warnings	Andrew Donnellan	2018-07-26	2	-23/+27
	Remove the phb4.c-specific CFLAGS that disable the unused value and unused parameter warnings, and clean up the ensuing warnings.
	Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	npu2-opencapi: Don't send commands to NPU when link is down	Frederic Barrat	2018-07-26	1	-1/+2
	Even if an opencapi link is down, we currently always try to issue a config read operation when probing for PCI devices, because of the default scan map used for an opencapi PHB. The config operation fails, as expected, but it can also raise a FIR bit and trigger an HMI.
	For opencapi, there's no root device like for a "normal" PCI PHB, so there's no reason to do the config operation. To fix it, we keep the scan map blank by default, and only add a device once the link is trained.
	CC: stable # v6.1+
	Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
	Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>