Commit message | Author | Age | Files | Lines
* hdat_to_dt: hash_prop the same on all platforms | Stewart Smith | 2018-04-30 | 1 | -1/+1
| | | | Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* uart: fix uart_opal_flush to take console lock over uart_con_flush | Nicholas Piggin | 2018-04-29 | 1 | -2/+8
| | | | | | | Cc: Russell Currey <ruscur@russell.cc> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* xive: fix missing unlock in error path | Stewart Smith | 2018-04-29 | 1 | -0/+1
| | | | | | | | Found with sparse and some added lock annotations. CC: stable # 5.10+ Fixes: de82c2e0e Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* OPAL_PCI_SET_POWER_STATE: fix locking in error paths | Stewart Smith | 2018-04-29 | 1 | -4/+12
| | | | | | | | | Otherwise we could exit OPAL holding locks, potentially leading to all sorts of problems later on. Cc: stable # 5.3+ Fixes: 7a3e2c4ee3aa0 Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
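The error-path pattern this fix describes can be sketched in C as follows. This is a minimal illustration, not the skiboot code: `toy_lock`, `set_power_state` and the return codes are hypothetical names invented for the example. The point is only that every exit taken after the lock is acquired goes through one unlock site.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a firmware lock; it only tracks whether it is held. */
struct toy_lock {
    bool held;
};

void toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);   /* no recursive locking in this sketch */
    l->held = true;
}

void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);    /* releasing an unheld lock is a bug */
    l->held = false;
}

/*
 * Hypothetical call handler. Every return path after the acquire goes
 * through the "out" label, so the function never exits holding the lock.
 */
int set_power_state(struct toy_lock *l, int requested_state)
{
    int rc = 0;

    toy_lock_acquire(l);

    if (requested_state < 0) {
        rc = -1;        /* invalid parameter */
        goto out;
    }
    if (requested_state > 1) {
        rc = -2;        /* unsupported state */
        goto out;
    }
    /* ... the real work would happen here, under the lock ... */

out:
    toy_lock_release(l);
    return rc;
}
```

A single exit label keeps the unlock in one place, which is usually easier to audit than pairing an unlock with every early return.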
* hw/slw: Don't assert on an unknown chip | Oliver O'Halloran | 2018-04-29 | 1 | -2/+10
| | | | | | | | | | | | | | | | | For some reason skiboot populates nodes in /cpus/ for the cores on chips that are deconfigured. As a result Linux includes the threads of those cores in its set of possible CPUs in the system and attempts to set the SPR values that should be used when waking a thread from a deep sleep state. However, in the case where we have a deconfigured chip we don't create an xscom node for that chip and as a result we don't have a proc_chip structure for that chip either. In turn, this results in an assertion failure when calling opal_slw_set_reg() since it expects the chip structure to exist. Fix this up and print an error instead. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* core/pci-dt-slots: Fix devfn lookup | Oliver O'Halloran | 2018-04-29 | 1 | -1/+1
| | | | | | | | | | | | | | | We only want to use the device part of the bdfn when looking up the switch down port. The required bit twiddling happens inside find_devfn() and the masking here is broken since it: a) keeps the fn part of the bdfn, and b) masks off part of the device number. This breaks looking up the slot information in some cases. Fixes: 6878b806682f ("pci-dt-slot: Big ol' cleanup") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
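For readers unfamiliar with the encoding, the split of a bdfn into its parts can be sketched as below. This assumes the conventional PCI layout (8-bit bus, 5-bit device, 3-bit function); the helper names are hypothetical and are not the ones used in skiboot's find_devfn().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Conventional PCI bus/device/function layout in a 16-bit value:
 *   bits 15..8  bus number
 *   bits  7..3  device number
 *   bits  2..0  function number
 */
uint8_t bdfn_bus(uint16_t bdfn)
{
    return (uint8_t)(bdfn >> 8);
}

uint8_t bdfn_dev(uint16_t bdfn)
{
    /* Shift the function bits out first, then keep all five device bits. */
    return (uint8_t)((bdfn >> 3) & 0x1f);
}

uint8_t bdfn_fn(uint16_t bdfn)
{
    return (uint8_t)(bdfn & 0x7);
}

/* Inverse operation, with range checks on the narrower fields. */
uint16_t make_bdfn(uint8_t bus, uint8_t dev, uint8_t fn)
{
    assert(dev < 32 && fn < 8);
    return (uint16_t)((bus << 8) | (dev << 3) | fn);
}
```

A mask applied without the shift keeps function bits and drops high device bits, which matches the two failure modes the commit message lists.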
* mambo: Add persistent memory disk support | Michael Neuling | 2018-04-29 | 1 | -0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for mapping disk images using persistent memory. Disks can be added by setting this ENV variable: PMEM_DISK="/mydisks/disk1.img,/mydisks/disk2.img" These will show up in Linux as /dev/pmem0 and /dev/pmem1. This uses a new feature in mambo "mysim memory mmap .." which is only available since mambo commit 0131f0fc08 (from 24/4/2018). This also needs the of_pmem.c driver in Linux which is only available since v4.17. It works with powernv_defconfig + CONFIG_OF_PMEM. ie --- a/arch/powerpc/configs/powernv_defconfig +++ b/arch/powerpc/configs/powernv_defconfig @@ -238,6 +238,8 @@ CONFIG_RTC_CLASS=y CONFIG_RTC_DRV_GENERIC=y CONFIG_VIRTIO_PCI=m CONFIG_VIRTIO_BALLOON=m +CONFIG_LIBNVDIMM=y +# CONFIG_ND_BLK is not set CONFIG_EXT2_FS=y CONFIG_EXT2_FS_XATTR=y CONFIG_EXT2_FS_POSIX_ACL=y Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* asm/head: Fix comparison in opal_entry for switching to emergency | Vaibhav Jain | 2018-04-29 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 3fdd2629516d ("core/opal: Emergency stack for re-entry") introduced an emergency stack for re-entrant OPAL calls. A branch was added in opal_entry() that switches to the emergency stack if the current thread is already in an active OPAL call. However, the conditional branch that checks the value of cpu_thread->in_opal_call is reversed, forcing the use of the emergency stack even in non re-entrant cases. This causes the OPAL stack guard routine __mcount_stack_check() to falsely assume that the stack has overflowed, as the stack pointer of EMERGENCY_STACK is compared against the bounds of NORMAL_STACK, forcing the function to call abort() with an error message of this form: INIT: Starting kernel at 0x20010000, fdt at 0x3073f708 53664 bytes) CPU 0004 Stack overflow detected ! pc=3001d8ec sp=31c27c90 (gap=30488) token=70 Aborting! CPU 0004 Backtrace: S: 0000000031c27b20 R: 000000003001ca2c E ._abort+0x60 S: 0000000031c27bb0 R: 0000000030013e10 E .__mcount_stack_check+0x168 S: 0000000031c27c90 R: 000000003001d8ec E .opal_entry_check+0x1c S: 0000000031c27d20 R: 00000000300051a4 E opal_entry+0xf4 --- OPAL call token: 0x46 caller R1: 0xc0000000011e3e50 --- So this patch updates the 'bne' branch in opal_entry() to a 'bgt' branch so that the switch to the emergency stack only happens when the current cpu_thread->in_opal_call is greater than 1, indicating a re-entrant OPAL call. Fixes: 3fdd2629516d ("core/opal: Emergency stack for re-entry") Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Signed-off-by: Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
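The intended decision can be modelled in C roughly as follows. The actual change is a one-instruction fix in assembly; this sketch only mirrors the "greater than 1" condition from the message and assumes the counter is incremented before it is compared. `toy_cpu_thread` and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-thread state; only the nesting counter matters here. */
struct toy_cpu_thread {
    uint32_t in_opal_call;  /* nesting depth of OPAL calls on this thread */
};

/*
 * Model of call entry: bump the nesting depth, then pick the stack.
 * Depth 1 is an ordinary call and stays on the normal stack; only a
 * depth above 1 means OPAL was re-entered.
 */
bool enter_opal_needs_emergency_stack(struct toy_cpu_thread *t)
{
    t->in_opal_call++;
    return t->in_opal_call > 1;
}

/* Model of call exit: drop one nesting level. */
void exit_opal(struct toy_cpu_thread *t)
{
    assert(t->in_opal_call > 0);
    t->in_opal_call--;
}
```

Under this model, a "not equal to zero" style test would be true for every call, which is consistent with the spurious stack overflow reports quoted above.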
* p9dsu: detect p9dsu variant even when hostboot doesn't tell us | Stewart Smith | 2018-04-24 | 1 | -68/+80
| | | | | | | | | | | | | | | | | | | The SuperMicro BMC can tell us what riser type we have, which dictates the PCI slot tables. Usually, in an environment that a customer would experience, Hostboot will do the query with an SMC specific patch (not upstream as there's no platform specific code in hostboot) and skiboot knows what variant it is based on the compatible string. However, if you're using upstream hostboot, you only get the bare 'p9dsu' compatible type. We can work around this by asking the BMC ourselves and setting the slot table appropriately. We do this synchronously in platform init so that we don't start probing PCI before we set up the slot table. This adds a bit of funky logic in the p9dsu platform file, but on the whole makes things simpler. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* opal/hmi: Generate one event per core for processor recovery. | Mahesh Salgaonkar | 2018-04-24 | 1 | -3/+3
| | | | | | | | | Processor recovery is a per-core error. All threads on that core receive the HMI, but not every thread needs to generate an HMI event for the same error. Let only thread 0 generate the event. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* opal:hmi: Add missing processor recovery reason string. | Mahesh Salgaonkar | 2018-04-24 | 1 | -0/+1
| | | | | | | | | | | With this patch now we see reason string printed for CORE_WOF[43] bit. [ 477.352234986,7] HMI: [Loc: U78D3.001.WZS004A-P1-C48]: P:8 C:22 T:3: Processor recovery occurred. [ 477.352240742,7] HMI: Core WOF = 0x0000000000100000 recovered error: [ 477.352242181,7] HMI: PC - Thread hang recovery Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* external/mambo: Add di command to decode instructions | Michael Neuling | 2018-04-24 | 1 | -0/+11
| | | | | | | | | | | | | | | | By default you get 16 instructions but you can specify the number you want. ie systemsim % di 0x100 4 0x0000000000000100: Enc:0xA64BB17D : mtspr HSPRG1,r13 0x0000000000000104: Enc:0xA64AB07D : mfspr r13,HSPRG0 0x0000000000000108: Enc:0xF0092DF9 : std r9,0x9F0(r13) 0x000000000000010C: Enc:0xA6E2207D : mfspr r9,PPR Using di since it's what xmon uses. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* p9dsu: add slot power limit. | Jim Yuan | 2018-04-24 | 1 | -0/+27
| | | | | Signed-off-by: Jim Yuan <jim.yuan@supermicro.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* p9dsu: add pci slot table for Boston LC 1U/2U and Boston LA/ESS. | Jim Yuan | 2018-04-24 | 1 | -7/+584
| | | | | | Signed-off-by: Jim Yuan <jim.yuan@supermicro.com> [stewart: remove trailing whitespace, incorrect BMC comment] Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* p9dsu HACK: fix system-vpd eeprom | Oliver O'Halloran | 2018-04-24 | 1 | -0/+20
| | | | | Signed-off-by: Jim Yuan <jim.yuan@supermicro.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* p9dsu: change esel command from AMI to IBM 0x3a. | Jim Yuan | 2018-04-24 | 1 | -1/+6
| | | | | Signed-off-by: Jim Yuan <jim.yuan@supermicro.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hdata/i2c: Fix up pci hotplug labels | Oliver O'Halloran | 2018-04-24 | 1 | -2/+2
| | | | | | | | | These labels are used on the devices used to do PCIe slot power control for implementing PCIe hotplug. I'm not sure how they ended up as "eeprom-pgood" and "eeprom-controller" since that doesn't make any sense. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hdata/i2c: Ignore multi-port I2C devices | Oliver O'Halloran | 2018-04-24 | 1 | -4/+13
| | | | | | | | | | | | | | | | | Recent FSP firmware builds add support for multi-port I2C devices such as the GPIO expanders used for the presence detect of OpenCAPI devices and the PCIe hotplug controllers used to power cycle PCIe slots on ZZ. The OpenCAPI driver inside of skiboot currently uses a platform-specific method to talk to the relevant I2C device rather than relying on HDAT since not all platforms correctly report the I2C devices (hello Zaius). Additionally, the nature of multi-port devices requires that we have a device-specific handler so that we generate the correct DT bindings. Currently we don't and there is no immediate need for this support so just ignore the multi-port devices for now. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hdata/i2c: Replace i2c_ prefix with dev_ | Oliver O'Halloran | 2018-04-24 | 1 | -8/+8
| | | | | | | | | | | | The current naming scheme makes it easy to conflate "i2cm_port" and "i2c_port." The latter is used to describe multi-port I2C devices such as GPIO expanders and multi-channel PCIe hotplug controllers. Rename i2c_port to dev_port to make the two a bit more distinct. Also rename i2c_addr to dev_addr for consistency. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hdata/i2c: Ignore CFAM I2C master | Oliver O'Halloran | 2018-04-24 | 1 | -0/+10
| | | | | | | | | | | | | | | Recent FSP firmware builds put in information about the CFAM I2C master in addition to the host I2C masters accessible via XSCOM. Odds are this information should not be there since there's no handshaking between the FSP/BMC and the host over who controls that I2C master, but it is so we need to deal with it. This patch adds filtering to the HDAT parser so it ignores the CFAM I2C master. Without this it will create a bogus i2cm@<addr> which might cause issues. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* skiboot 5.10.5 release notes | Stewart Smith | 2018-04-24 | 1 | -0/+61
| | | | | | Signed-off-by: Stewart Smith <stewart@linux.ibm.com> (cherry picked from commit b2e7d224fd7677de1dd653c45deee94ef6886ffd) Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Use ibm,loc-code rather than ibm,slot-label | Oliver O'Halloran | 2018-04-23 | 1 | -13/+7
| | | | | | | | | | | | The ibm,slot-label property is to name the slot that appears under a PCIe bridge. In the past we (ab)used the slot tables to attach names to GPU devices and their corresponding NVLinks which resulted in npu2.c using slot-label as a location code rather than as a way to name slots. Fix this up since it's confusing. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hdata/slots: Apply slot label to the parent slot | Oliver O'Halloran | 2018-04-23 | 2 | -2/+22
| | | | | | | | | | Slot names only really make sense when applied to an actual slot rather than a device. On witherspoon the GPU devices have a name associated with the device rather than the slot for the GPUs. Add a hack that moves the slot label to the parent slot rather than on the device itself. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* pci-dt-slot: Big ol' cleanup | Oliver O'Halloran | 2018-04-23 | 1 | -80/+74
| | | | | | | | | | | | | | | The underlying data that we get from HDAT can only really describe a PCIe system. As such we can simplify the devicetree slot lookup code by only caring about the important cases, namely, root ports and switch downstream ports. This also fixes a bug where root ports didn't get a slot label applied, which results in devices under that port not having ibm,loc-code set. This results in the EEH core being unable to report the location of EEHed devices under that port. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2/hw-procedures: fence bricks on GPU reset | Balbir Singh | 2018-04-23 | 1 | -7/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The NPU workbook defines a way of fencing a brick and getting the brick out of fence state. We do have an implementation of bringing the brick out of fenced/quiesced state. We do the latter in our procedures, but to support run time reset we need to do the former. The fencing ensures that access to memory behind the links will not lead to HMIs, but instead SUEs will be populated in cache (in the case of speculation). The expectation is then that prior to and after reset, the operating system components will flush the cache for the region of memory behind the GPU. This patch does the following: 1. Implements a npu2_dev_fence_brick() function to set/clear fence state 2. Clears FIR bits prior to clearing the fence status 3. Clears the fence status 4. Takes the powerbus out of CQ fence much later, in credits_check(), which is the last hardware procedure called after link training. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Reviewed-By: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hdata/tpmrel: detect tpm not present by looking up the stinfo->status | Claudio Carvalho | 2018-04-23 | 1 | -0/+8
| | | | | | | | | | | | | | | | Skiboot detects if tpm is present by checking if a secureboot_tpm_info entry exists. However, if a tpm is not present, hostboot also creates a secureboot_tpm_info entry. In this case, hostboot creates an empty entry, but setting the field tpm_status to TPM_NOT_PRESENT. This detects if tpm is not present by looking up the stinfo->status. This fixes the "TPMREL: TPM node not found for chip_id=0 (HB bug)" issue, reproduced when skiboot is running on a system that has no tpm. Signed-off-by: Claudio Carvalho <cclaudio@linux.vnet.ibm.com> Tested-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hdata: Add DIMM actual speed to device tree | Vasant Hegde | 2018-04-23 | 1 | -1/+8
| | | | | | | | Recent HDAT provides the actual DIMM speed. Let's add this to the device tree. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [stewart: use Hz rather than MHz, consistent with other properties] Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hdata: Fix DIMM size property | Vasant Hegde | 2018-04-23 | 3 | -36/+17
| | | | | | | | | | | Today we parse the VPD blob to get DIMM size information. This is limited to FSP based systems. HDAT provides the DIMM size value, so let's use that to populate the device tree. That way we can get size information on BMC based systems as well. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> CC: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* mambo/mambo_utils.tcl: Inject an MCE at a specified address | Balbir Singh | 2018-04-19 | 1 | -1/+15
| | | | | | | | | | | | | | | | | | | | | | Currently we don't support injecting an MCE on a specific address. This is useful for testing functionality like memcpy_mcsafe() (see https://patchwork.ozlabs.org/cover/893339/) The core of the functionality is a routine called inject_mce_ue_on_addr, which takes an addr argument and injects an MCE (load/store with UE) when the specified address is accessed by code. This functionality can easily be enhanced to cover instruction UE's as well. A sample use case to create an MCE on stack access would be set addr [mysim display gpr 1] inject_mce_ue_on_addr $addr This would cause an mce on any r1 or r1 based access Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/npu2.c: Remove static configuration of NPU2 register | Alistair Popple | 2018-04-19 | 1 | -12/+12
| | | | | | | | | | | | | | | | | | The NPU_SM_CONFIG0 register currently needs to be configured in Skiboot to select NVLink mode, however Hostboot should configure other bits in this register. For some reason Skiboot was explicitly clearing bit-6 (CONFIG_DISABLE_VG_NOT_SYS). It is unclear why this bit was getting cleared as recent Hostboot versions explicitly set it to the correct value based on the specific system configuration. Therefore Skiboot should not alter it. Bit-58 (CONFIG_NVLINK_MODE) selects if NVLink mode should be enabled or not. Hostboot does not configure this bit so Skiboot should continue to configure it. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
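The resulting register handling amounts to a read-modify-write that touches only the NVLink mode bit. The sketch below assumes the bit numbers in the message use IBM numbering (bit 0 is the most significant bit of the 64-bit register); `IBM_BIT` and the function name are hypothetical, and the actual register access is left out.

```c
#include <assert.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit value. */
#define IBM_BIT(n) (UINT64_C(1) << (63 - (n)))

/* Bit names taken from the commit message; positions assume IBM numbering. */
#define CONFIG_DISABLE_VG_NOT_SYS IBM_BIT(6)
#define CONFIG_NVLINK_MODE        IBM_BIT(58)

/*
 * Given the value read from the register, return the value to write back:
 * NVLink mode is enabled and every other bit keeps whatever an earlier
 * boot stage programmed.
 */
uint64_t npu_sm_config0_enable_nvlink(uint64_t current)
{
    uint64_t updated = current | CONFIG_NVLINK_MODE;

    /* Bit 6 must come out exactly as it went in. */
    assert(((updated ^ current) & CONFIG_DISABLE_VG_NOT_SYS) == 0);
    return updated;
}
```

Using an OR on the value read back, rather than writing a fixed constant, is what keeps the Hostboot-owned bits intact.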
* external/mambo: improve helper for machine checks | Nicholas Piggin | 2018-04-19 | 1 | -9/+53
| | | | | | | | | | | | Improve workarounds for stop injection, because mambo often will trigger on 0x104/204 when injecting sreset/mces. This also adds a workaround to skip injecting on reservations to avoid infinite loops when doing inject_mce_step. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* travis: Enable ppc64le builds | Stewart Smith | 2018-04-19 | 12 | -60/+63
| | | | | | | | | | | | | At least on the IBM Travis Enterprise instance, we can now do ppc64le builds! We can only build a subset of our matrix due to availability of ppc64le distros. The Dockerfiles need some tweaking to only attempt to install (x86_64 only) Mambo binaries, as well as the build scripts. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* external: Add "lpc" tool | Benjamin Herrenschmidt | 2018-04-19 | 2 | -0/+193
| | | | | | | | This is a little front-end to the lpc debugfs files to access the LPC bus from userspace on the host. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* npu2: Improve log output of GPU-to-link mapping | Reza Arbab | 2018-04-19 | 1 | -4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | Debugging issues related to unconnected NVLinks can be a little less irritating if we use the NPU2DEV{DBG,INF}() macros instead of prlog(). In short, change this: NPU2: comparing GPU 'GPU2' and NPU2 'GPU1' NPU2: comparing GPU 'GPU3' and NPU2 'GPU1' NPU2: comparing GPU 'GPU4' and NPU2 'GPU1' NPU2: comparing GPU 'GPU5' and NPU2 'GPU1' : npu2_dev_bind_pci_dev: No PCI device for NPU2 device 0006:00:01.0 to bind to. If you expect a GPU to be there, this is a problem. to this: NPU6:0:1.0 Comparing GPU 'GPU2' and NPU2 'GPU1' NPU6:0:1.0 Comparing GPU 'GPU3' and NPU2 'GPU1' NPU6:0:1.0 Comparing GPU 'GPU4' and NPU2 'GPU1' NPU6:0:1.0 Comparing GPU 'GPU5' and NPU2 'GPU1' : NPU6:0:1.0 No PCI device found for slot 'GPU1' Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* sensors: Don't add DTS sensors when OCC inband sensors are available | Shilpasri G Bhat | 2018-04-19 | 4 | -9/+12
| | | | | | | | | | | | | | | | | | There are two sets of core temperature sensors today. One is DTS scom based core temperature sensors and the second group is the sensors provided by OCC. DTS is the highest temperature among the different temperature zones in the core while OCC core temperature sensors are the average temperature of the core. DTS sensors are read directly by the host by SCOMing the DTS sensors while OCC sensors are read and updated by OCC to main memory. Reading DTS sensors by SCOMing is a heavier and slower operation compared to reading OCC sensors, which is as good as reading memory. So don't add DTS sensors when OCC sensors are available. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* travis-ci: pull Mambo over http rather than ftp | Stewart Smith | 2018-04-19 | 4 | -4/+4
| | | | Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* external/trace: fix makefile | Stewart Smith | 2018-04-18 | 1 | -1/+1
| | | | Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* core/test/run-trace: fix on ppc64el | Stewart Smith | 2018-04-18 | 1 | -1/+2
| | | | | | | Hackish fix from benh Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* core/fast-reboot: Increase timeout for dctl sreset to 1sec | Vaidyanathan Srinivasan | 2018-04-18 | 1 | -1/+1
| | | | | | | | | | | | Direct control xscom can take more time to complete. We seem to wait too little on Boston failing fast-reboot for no good reason. Increase timeout to 1 sec as a reasonable value for sreset to be delivered and core to start executing instructions. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* core: Fix iteration condition to skip garded cpu | Vaidyanathan Srinivasan | 2018-04-18 | 1 | -1/+1
| | | | | | | | | Fix the logic error in the loop that iterated incorrectly over garded cpu. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* core/opal: Allow poller re-entry if OPAL was re-entered | Nicholas Piggin | 2018-04-18 | 1 | -4/+8
| | | | | | | | | | | | | | | If an NMI interrupts the middle of running pollers and the OS invokes pollers again (e.g., for console output), the poller re-entrancy check will prevent it from running and spam the console. That check was designed to catch a poller calling opal_run_pollers; OPAL re-entrancy is something different and is detected elsewhere. Avoid the poller recursion check if OPAL has been re-entered. This is a best-effort attempt to cope with errors. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* core/opal: Emergency stack for re-entry | Nicholas Piggin | 2018-04-18 | 6 | -16/+52
| | | | | | | | | | | | | | | | | | | | This detects OPAL being re-entered by the OS, and switches to an emergency stack if it was. This protects the firmware's main stack from re-entrancy and allows the OS to use NMI facilities for crash / debug functionality. Further nested re-entry will destroy the previous emergency stack and prevent returning, but those should be rare cases. This stack is sized at 16kB, which doubles the size of CPU stacks, so as not to introduce a regression in primary stack size. The 16kB stack originally had a 4kB machine check stack at the top, which was removed by 80eee1946 ("opal: Remove machine check interrupt patching in OPAL."). So it is possible the size could be tightened again, but that would require further analysis. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* asm/head: implement quiescing without stack or clobbering regs | Nicholas Piggin | 2018-04-18 | 4 | -34/+83
| | | | | | | | | | | | | | | | | | | | | | | | Quiescing currently is implemented in C in opal_entry before the opal call handler is called. This works well enough for simple cases like fast reset when one CPU wants all others out of the way. Linux would like to use it to prevent an sreset IPI from interrupting firmware, which could lead to deadlocks when crash dumping or entering the debugger. Linux interrupts do not recover well when returning back to general OPAL code, due to r13 not being restored. OPAL also can't be re-entered, which may happen e.g., from the debugger. So move the quiesce hold/reject to entry code, before the stack or r1 or r13 registers are switched. OPAL can be interrupted and returned to or re-entered during this period. This does not completely solve all such problems. OPAL will be interrupted with sreset if the quiesce times out, and it can be interrupted by MCEs as well. These still have the issues above. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* core/stack: backtrace unwind basic OPAL call details | Nicholas Piggin | 2018-04-18 | 3 | -11/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Put OPAL callers' r1 into the stack back chain, and then use that to unwind back to the OPAL entry frame (as opposed to boot entry, which has a 0 back chain). From there, dump the OPAL call token and the caller's r1. A backtrace looks like this: CPU 0000 Backtrace: S: 0000000031c03ba0 R: 000000003001a548 ._abort+0x4c S: 0000000031c03c20 R: 000000003001baac .opal_run_pollers+0x3c S: 0000000031c03ca0 R: 000000003001bcbc .opal_poll_events+0xc4 S: 0000000031c03d20 R: 00000000300051dc opal_entry+0x12c --- OPAL call entry token: 0xa caller R1: 0xc0000000006d3b90 --- This is pretty basic for the moment, but it does give you the bottom of the Linux stack. It will allow some interesting improvements in future. First, with the eframe, all the call's parameters can be printed out as well. The ___backtrace / ___print_backtrace API needs to be reworked in order to support this, but it's otherwise very simple (see opal_trace_entry()). Second, it will allow Linux's stack to be passed back to Linux via a debugging opal call. This will allow Linux's BUG() or xmon to also print the Linux back trace in case of a NMI or MCE or watchdog lockup that hits in OPAL. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
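Back-chain unwinding in general can be illustrated with a toy model in C. This is not the skiboot ___backtrace implementation: `toy_frame` is a simplified, hypothetical frame in which the return address sits next to the back chain, whereas the real ABI stores the saved link register elsewhere in the frame. The loop stops at a zero back chain, which is how the boot entry frame is described above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stack frame: a back chain pointer plus one saved return address. */
struct toy_frame {
    const struct toy_frame *back_chain; /* NULL (0) ends the chain */
    unsigned long return_addr;
};

/*
 * Walk the chain starting at sp, storing up to max return addresses
 * in out. Returns the number of entries written.
 */
size_t toy_backtrace(const struct toy_frame *sp, unsigned long *out, size_t max)
{
    size_t n = 0;

    assert(out != NULL || max == 0);
    while (sp != NULL && n < max) {
        out[n++] = sp->return_addr;
        sp = sp->back_chain;
    }
    return n;
}
```

The `max` bound matters in practice: a corrupted back chain can loop, so an unwinder should not rely on the terminator alone.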
* opal/hmi: Add documentation for opal_handle_hmi2 call | Mahesh Salgaonkar | 2018-04-17 | 1 | -0/+126
| | | | | Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* opal/hmi: Generate hmi event for recovered HDEC parity error. | Mahesh Salgaonkar | 2018-04-17 | 3 | -8/+10
| | | | | Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* opal/hmi: check thread 0 tfmr to validate latched tfmr errors. | Mahesh Salgaonkar | 2018-04-17 | 2 | -19/+50
| | | | | | | | | | | Due to P9 errata, HDEC parity and TB residue errors are latched for non-zero threads 1-3 even if they are cleared. But these are not latched on thread 0. Hence, use xscom SCOMC/SCOMD to read thread 0 tfmr value and ignore them on non-zero threads if they are not present on thread 0. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
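The filtering rule described here can be sketched as a pure function. The mask of affected bits is passed in as a parameter because the real TFMR bit positions are not given in the message; the function name is hypothetical, and reading thread 0's TFMR over SCOM is outside the scope of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Drop error bits that may be stale on non-zero threads.
 *
 *   tfmr          TFMR value read on the current thread
 *   tfmr_t0       TFMR value read for thread 0 of the same core
 *   latched_mask  bits subject to the erratum (HDEC parity, TB residue)
 *   thread_index  0..3 within the core
 *
 * On threads 1-3 an affected bit is kept only if thread 0 reports it too.
 */
uint64_t filter_latched_tfmr(uint64_t tfmr, uint64_t tfmr_t0,
                             uint64_t latched_mask, unsigned int thread_index)
{
    assert(thread_index < 4);

    if (thread_index == 0)
        return tfmr;    /* thread 0 is the reference, nothing to filter */

    return tfmr & ~(latched_mask & ~tfmr_t0);
}
```

Bits outside `latched_mask` pass through untouched, so unrelated errors on a non-zero thread are still seen.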
* opal/hmi: Print additional debug information in rendezvous. | Mahesh Salgaonkar | 2018-04-17 | 1 | -2/+4
| | | | | | | Helps in debugging... Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* opal/hmi: Fix handling of TFMR parity/corrupt error. | Mahesh Salgaonkar | 2018-04-17 | 1 | -5/+4
| | | | | | | | | | | | | | | | | | | While testing TFMR parity/corrupt error it has been observed that HMIs are delivered twice for this error - First time HMI is delivered with HMER[4,5]=1 and TFMR[60]=1. - Second time HMI is delivered with HMER[4,5]=1 and TFMR[60]=0 with valid TB. On second HMI we end up throwing below error message even though TB is in valid state. "HMI: TB invalid without core error reported" This patch fixes this issue by ignoring HMER[5] and checking only for TFMR[60] before setting this_cpu()->tb_invalid to true. Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* opal/hmi: Stop flooding HMI event for TOD errors. | Mahesh Salgaonkar | 2018-04-17 | 1 | -2/+5
| | | | | | | | | | | | | | Fix the issue where every thread on the chip sends an HMI event to the host for TOD errors. TOD errors are reported to all the cores/threads on the chip. Any one thread can fix the error and send the event. The rest of the threads don't need to send an HMI event unnecessarily. This patch fixes this by modifying the __chiptod_recover_tod_errors() function to return -1 if no errors are found. Without this change every thread that sees TFMR[51]=1 sends an HMI event to the host kernel. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>