path: root/hw/fsp
Commit message (Author, Age, Files, Lines)
* Write boot progress to LPC port 80h (Stewart Smith, 2019-04-24, 1 file, -1/+3)
| This is an adaptation of what we currently do for op_display() on FSP
| machines, inventing an encoding for what we can write into the single
| byte at LPC port 80h.
|
| Port 80h is often used on x86 systems to indicate boot progress/status
| and dates back a decent amount of time. Since a byte isn't exactly very
| expressive for everything that can go on (and wrong) during boot, it's
| all about compromise.
|
| Some systems (such as Zaius/Barreleye G2) have a physical dual 7 segment
| display that displays these codes. So far, this has only been driven by
| hostboot (see hostboot commit 90ec2e65314c).
|
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* core/stack: Rename backtrace functions, get rid of wrappers (Andrew Donnellan, 2019-03-28, 1 file, -2/+2)
| Rename ___backtrace() to backtrace_create() and ___print_backtrace() to
| backtrace_print(). Get rid of __backtrace() and __print_backtrace()
| wrappers.
|
| Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* hw/fsp, hw/ipmi: Convert attn code to not use backtrace wrappers (Andrew Donnellan, 2019-03-28, 1 file, -5/+5)
| We're about to get rid of __backtrace() and __print_backtrace(), convert
| the FSP/IPMI attn code to not use them.
|
| Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* FSP: Improve Reset/Reload log message (Vasant Hegde, 2018-09-20, 1 file, -2/+2)
| The message below is confusing. Let's make it clear.
|
| FSP sends "R/R complete notification" whenever there is a dump. We use
| `flag` to identify whether it's an R/R completion -OR- just a new dump
| notification.
|
| [ 483.406351956,6] FSP: SP says Reset/Reload complete
| [ 483.406354278,5] DUMP: FipS dump available. ID = 0x1a00001f [size: 6367640 bytes]
| [ 483.406355968,7] A Reset/Reload was NOT done
|
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* fsp/surv: Improve log message (Vasant Hegde, 2018-09-13, 1 file, -2/+4)
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* Move include lock.h to fsp-console.h from console.h (Stewart Smith, 2018-06-18, 1 file, -0/+1)
| It's only used there, let's minimise our needed includes.
|
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* fast-reboot: Disable on FSP IPL side change (Vasant Hegde, 2018-06-18, 1 file, -0/+26)
| If FSP changes next IPL side, then disable fast reboot.
|
| sample output:
| [ 620.196442259,5] FSP: Got sysparam update, param ID 0xf0000007
| [ 620.196444501,5] CUPD: FW IPL side changed. Disable fast reboot
| [ 620.196445389,5] CUPD: Next IPL side : perm
|
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* check for NULL input string in is_sai_loc_code (Balbir singh, 2018-05-24, 1 file, -2/+5)
| Caught by scan-build, also constant-ify the input parameter.
|
| Signed-off-by: Balbir singh <bsingharora@gmail.com>
| Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* fsp/console: Always establish OPAL console API backend (Benjamin Herrenschmidt, 2018-05-24, 1 file, -2/+3)
| Currently we only call set_opal_console() to establish the backend used
| by the OPAL console API if we find at least one FSP serial port in HDAT.
|
| On systems where there is none (IPMI only), we fail to set it, causing
| the console code to try to use the dummy console, which triggers an
| assertion failure during boot due to clashing device-tree node names.
|
| So always set it if an FSP is present.
|
| Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* fsp: Fix msg vaargs usage (Joel Stanley, 2018-05-04, 1 file, -2/+2)
| hw/fsp/fsp.c:1011:17: warning: passing an object that undergoes default
| argument promotion to 'va_start' has undefined behavior [-Wvarargs]
|         va_start(list, add_words);
|                        ^
| hw/fsp/fsp.c:1007:59: note: parameter of type 'u8' (aka 'unsigned char')
| is declared here
| void fsp_fillmsg(struct fsp_msg *msg, u32 cmd_sub_mod, u8 add_words, ...)
|                                                           ^
| [CC] platforms/ibm-fsp/apollo-pci.o
| hw/fsp/fsp.c:1026:17: warning: passing an object that undergoes default
| argument promotion to 'va_start' has undefined behavior [-Wvarargs]
|         va_start(list, add_words);
|                        ^
| hw/fsp/fsp.c:1016:47: note: parameter of type 'u8' (aka 'unsigned char')
| is declared here
| struct fsp_msg *fsp_mkmsg(u32 cmd_sub_mod, u8 add_words, ...)
|
| Signed-off-by: Joel Stanley <joel@jms.id.au>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
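The warning above comes from C's default argument promotions: a `u8` named parameter is promoted to `int` when passed through `...`, so using it as the last named parameter for `va_start()` is undefined behavior. A minimal sketch of one common fix, widening that parameter to `unsigned int`, is shown below. The helper `fill_words()` is hypothetical and is not skiboot's `fsp_fillmsg()`; the commit itself does not say which type it switched to.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not skiboot code): copies `add_words` variadic
 * words into `out` and returns how many were copied. */
static unsigned int fill_words(uint32_t *out, unsigned int add_words, ...)
{
	va_list list;
	unsigned int i;

	assert(out != NULL);

	/* `add_words` is an unsigned int, which default argument promotion
	 * leaves unchanged, so it is a valid anchor for va_start().
	 * Declaring it as uint8_t would trigger -Wvarargs. */
	va_start(list, add_words);
	for (i = 0; i < add_words; i++)
		out[i] = (uint32_t)va_arg(list, unsigned int);
	va_end(list);

	return add_words;
}
```

Callers pass the count exactly as before; only the declared parameter type changes.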
* hw/imc: Add support to load imc catalog lid file (Madhavan Srinivasan, 2018-04-10, 1 file, -0/+3)
| Add support to load the imc catalog from a lid file packaged as part of
| the system firmware. Lid number allocated is 0x80f00103.lid.
|
| Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
* Revert "console(lpc/fsp-console): Use only stdout-path property on P9 and above" (Stewart Smith, 2018-03-06, 1 file, -11/+3)
| This reverts commit 20f685a3627a2a522c465716377561a8fbcc608f.
|
| We've hit problems on Zaius machines and the needed petitboot changes
| haven't made it upstream yet. Let's revert for the time being while we
| sort everything out.
|
| We probably have to keep both around for a few years.
|
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* capp: Add lid definition for P9 DD-2.2 (Christophe Lombard, 2018-03-06, 1 file, -0/+2)
| Update fsp_lid_map to include CAPP ucode lid for phb4-chipid == 0x202d1
| that corresponds to P9 DD-2.2 chip.
|
| Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
| Reviewed-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* console(lpc/fsp-console): Use only stdout-path property on P9 and above (Pridhiviraj Paidipeddi, 2018-03-01, 1 file, -3/+11)
| The dtc tool emits the warning below because the linux,stdout-path
| property under the /chosen node is deprecated.
|
| dts: Warning (chosen_node_stdout_path): Use 'stdout-path' instead of
| 'linux,stdout-path'
|
| This patch fixes it by using the stdout-path property on all systems and
| keeping linux,stdout-path only on P8 and before. This property refers to
| a node which represents the device to be used for boot console output.
|
| Verified boot on both P8 and P9 systems with new and older kernels, and
| also verified that the dtc warnings are fixed on both P8 and P9.
|
| Signed-off-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
| [stewart: simplify logic]
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* build: use thin archives rather than incremental linking (Nicholas Piggin, 2018-02-28, 1 file, -1/+1)
| This changes the build system to use thin archives rather than
| incremental linking for built-in.o, similar to a recent change to Linux.
|
| built-in.o is renamed to built-in.a, and is created as a thin archive
| with no index, for speed and size. All built-in.a are aggregated into a
| skiboot.tmp.a which is a thin archive built with an index, making it
| suitable for linking. This is input into the final link.
|
| The advantages of build size and linker code placement flexibility are
| not as great with skiboot as a bigger project like Linux, but it's a
| conceptually better way to build, and is more compatible with link time
| optimisation in toolchains which might be interesting for skiboot
| particularly for size reductions.
|
| Size of build tree before this patch is 34.4MB, afterwards 23.1MB.
|
| Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* sensors: Support reading u64 sensor values (Shilpasri G Bhat, 2018-02-21, 1 file, -3/+4)
| This patch adds support to read u64 sensor values. This also adds
| changes to the core and the backend implementation code to make this API
| the base call. Host can use this new API to read sensors up to 64 bits.
|
| This adds a list to store the pointer to the kernel u32 buffer, for
| older kernels making async sensor u32 reads.
|
| Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* fsp: Bail out of HIR if FSP is resetting voluntarily (Ananth N Mavinakayanahalli, 2017-12-06, 1 file, -4/+16)
| a. Surveillance response times out and OPAL triggers a HIR
| b. Before the HIR process kicks in, OPAL gets a PSI interrupt indicating
|    link down
| c. HIR process continues and OPAL tries to write to DRCR; PSI link
|    inactive => xstop
|
| OPAL should confirm that the FSP is not already in reset in the HIR path.
|
| [V2] Handle the case where a second reset is triggered due to the two
| resets happening in succession.
|
| Signed-off-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
| Tested-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
| Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* nvram: Fix 'missing' nvram on FSP systems. (Cyril Bur, 2017-11-30, 1 file, -24/+4)
| commit ba4d46fdd9eb ("console: Set log level from nvram") wants to read
| from NVRAM rather early. This works fine on BMC based systems as
| nvram_init() is actually synchronous. This is not true for FSP systems
| and it turns out that the query for the console log level simply queries
| blank nvram.
|
| The simple fix is to wait for the NVRAM read to complete before
| performing any query. Unfortunately it turns out that the fsp-nvram code
| does not inform the generic NVRAM layer when the read is complete,
| rather, it must be prompted to do so.
|
| This patch addresses both these problems. This patch adds a check before
| the first read of the NVRAM (for the console log level) that the read
| has completed. The fsp-nvram code has been updated to inform the generic
| layer as soon as the read completes.
|
| The old prompt to the fsp-nvram code has been removed but a check to
| ensure that the NVRAM has been loaded remains. It is conservative but if
| the NVRAM is not done loading before the host is booted it will not have
| an nvram device-tree node which means it won't be able to access the
| NVRAM at all, ever, even after the NVRAM has loaded.
|
| Signed-off-by: Cyril Bur <cyril.bur@au1.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* fsp-elog: Reduce verbosity of elog messages (Michael Neuling, 2017-11-28, 1 file, -2/+2)
| These messages just fill up the opal console log with useless messages
| resulting in us losing useful information.
|
| They have been like this since the first commit in skiboot. Make them
| trace.
|
| Signed-off-by: Michael Neuling <mikey@neuling.org>
| Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP/CONSOLE: remove redundant flush_all_input() call in fsp_console_reset() (Vasant Hegde, 2017-10-30, 1 file, -2/+0)
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP/CONSOLE: Disable notification on unresponsive consoles (Vasant Hegde, 2017-10-30, 1 file, -3/+5)
| Commit fd6b71fc fixed the situation where the ipmi console was open
| (hvc0) but data arrived on a different console (hvc1).
|
| During FSP R/R OPAL closes all consoles. After R/R completes, FSP
| requests to open hvc1 and sends data on it. If hvc1 registration failed
| or it is not opened in the host kernel, then the kernel will not read
| the data, which results in RCU stalls.
|
| Note that this is a workaround for older kernels where we don't have a
| separate irq for each console. Latest kernel works fine without this
| patch.
|
| CC: stable
| CC: Sam Mendoza-Jonas <sam@mendozajonas.com>
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP/CONSOLE: Limit number of error logging (Vasant Hegde, 2017-10-11, 1 file, -8/+13)
| Commit c8a7535f (FSP/CONSOLE: Workaround for unresponsive ipmi daemon)
| added error logging when the buffer is full. In some corner cases the
| kernel may call this function multiple times and we may end up logging
| the error again and again.
|
| This patch fixes it by generating the error log only once. I think this
| is enough to indicate something went wrong.
|
| Also with the previous patch, once the console buffer is full, OPAL
| returns an error to the payload from fsp_console_write_buffer_space().
| So the payload will never call fsp_console_write(). Hence move the error
| logging logic to the right place.
|
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP/CONSOLE: Fix fsp_console_write_buffer_space() call (Vasant Hegde, 2017-10-11, 1 file, -1/+35)
| Kernel calls fsp_console_write_buffer_space() to check console buffer
| space availability. If there is enough buffer space to write data, then
| the kernel will call fsp_console_write() to write the actual data.
|
| In some extreme corner cases (like the one explained in commit c8a7535f)
| the console becomes full and this function returns 0 to the kernel (or
| the space available in the console buffer is smaller than the next
| incoming data). The kernel will continue retrying until it gets enough
| space, so we will start seeing RCU stalls.
|
| This patch keeps track of the previously available space. If the
| previous space is the same as the current space, there is not enough
| space in the console buffer to write the incoming data. It may be due to
| very high console write activity and slow response from FSP -OR- FSP has
| stopped processing data (ex: because the ipmi daemon died). At this
| point we start a timer with a timeout of SER_BUFFER_OUT_TIMEOUT (10
| secs). If the situation has not improved within 10 seconds, something
| went bad. Let's return OPAL_RESOURCE so that the kernel can drop the
| console write and continue.
|
| CC: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
| CC: Stewart Smith <stewart@linux.vnet.ibm.com>
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| [stewart: reset timeout in fsp_console_write() path]
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
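The stall detection described in this commit can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in fsp-console.c: `struct con_state`, `check_buffer_space()`, the millisecond clock, and the placeholder return codes are all hypothetical, and the model treats any unchanged space reading as a possible stall, whereas the real code also resets the timeout in the write path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stands in for SER_BUFFER_OUT_TIMEOUT (10 secs), here in milliseconds. */
#define STALL_TIMEOUT_MS 10000ULL

/* Placeholder return codes; the real OPAL values are not reproduced. */
enum rc { RC_SUCCESS = 0, RC_RESOURCE = -1 };

struct con_state {
	size_t   prev_space;   /* space reported on the previous query */
	uint64_t stall_start;  /* timestamp (ms) when the stall began */
	bool     stalled;      /* true while space has not changed */
};

static enum rc check_buffer_space(struct con_state *cs, size_t space_now,
				  uint64_t now_ms, size_t *space_out)
{
	assert(cs != NULL && space_out != NULL);
	*space_out = space_now;

	if (space_now != cs->prev_space) {
		/* Space changed since the last query: progress was made. */
		cs->prev_space = space_now;
		cs->stalled = false;
		return RC_SUCCESS;
	}

	if (!cs->stalled) {
		/* First query with unchanged space: start the timer. */
		cs->stalled = true;
		cs->stall_start = now_ms;
		return RC_SUCCESS;
	}

	/* Still unchanged: give up once the timeout has elapsed so the
	 * caller can drop the write instead of retrying forever. */
	if (now_ms - cs->stall_start >= STALL_TIMEOUT_MS)
		return RC_RESOURCE;

	return RC_SUCCESS;
}
```

Passing the clock in as a parameter keeps the sketch testable without a real timer.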
* FSP/CONSOLE: Close SOL session during R/R (Vasant Hegde, 2017-10-11, 1 file, -3/+0)
| Presently we are not closing SOL and FW console sessions during R/R.
| Host will continue to write to the SOL buffer during FSP R/R. If there
| is a heavy console write operation happening during FSP R/R (like
| running `top` command inside console), then at some point the console
| buffer becomes full. fsp_console_write_buffer_space() returns 0 (or less
| than the required space to write data) to the host. While one thread is
| busy writing to the console, if some other thread tries to write data to
| the console we may see RCU stalls (like below) in the kernel.
|
| kernel call trace:
| ------------------
| [ 2082.828363] INFO: rcu_sched detected stalls on CPUs/tasks: { 32} (detected by 16, t=6002 jiffies, g=23154, c=23153, q=254769)
| [ 2082.828365] Task dump for CPU 32:
| [ 2082.828368] kworker/32:3 R running task 0 4637 2 0x00000884
| [ 2082.828375] Workqueue: events dump_work_fn
| [ 2082.828376] Call Trace:
| [ 2082.828382] [c000000f1633fa00] [c00000000013b6b0] console_unlock+0x570/0x600 (unreliable)
| [ 2082.828384] [c000000f1633fae0] [c00000000013ba34] vprintk_emit+0x2f4/0x5c0
| [ 2082.828389] [c000000f1633fb60] [c00000000099e644] printk+0x84/0x98
| [ 2082.828391] [c000000f1633fb90] [c0000000000851a8] dump_work_fn+0x238/0x250
| [ 2082.828394] [c000000f1633fc60] [c0000000000ecb98] process_one_work+0x198/0x4b0
| [ 2082.828396] [c000000f1633fcf0] [c0000000000ed3dc] worker_thread+0x18c/0x5a0
| [ 2082.828399] [c000000f1633fd80] [c0000000000f4650] kthread+0x110/0x130
| [ 2082.828403] [c000000f1633fe30] [c000000000009674] ret_from_kernel_thread+0x5c/0x68
|
| Hence let's close SOL (and FW console) during FSP R/R.
|
| CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP/CONSOLE: Do not associate unavailable console (Vasant Hegde, 2017-10-11, 1 file, -0/+11)
| Presently OPAL sends the associate/unassociate MBOX command for all FSP
| serial consoles (like the OPAL messages below). We have to check whether
| the console is available before sending this message.
|
| OPAL log:
| -------
| [ 5013.227994012,7] FSP: Reassociating HVSI console 1
| [ 5013.227997540,7] FSP: Reassociating HVSI console 2
|
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP: Disable PSI link whenever FSP tells OPAL about impending R/R (Vasant Hegde, 2017-10-11, 1 file, -17/+8)
| Commit 42d5d047 fixed the scenario where DPO has been initiated, but FSP
| went into reset before the CEC power down came in. But this is a generic
| issue that can happen in the normal shutdown path as well.
|
| Hence disable the PSI link as soon as we detect an impending FSP R/R.
|
| CC: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
| CC: Stewart Smith <stewart@linux.vnet.ibm.com>
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP/NVRAM: Handle "get vNVRAM statistics" command (Vasant Hegde, 2017-10-10, 1 file, -0/+41)
| FSP sends MBOX command (cmd : 0xEB, subcmd : 0x05, mod : 0x00) to get
| vNVRAM statistics. OPAL doesn't maintain any such statistics. Hence
| return FSP_STATUS_INVALID_SUBCMD.
|
| Sample OPAL log:
| [16944.384670488,3] FSP: Unhandled message eb0500
| [16944.474110465,3] FSP: Unhandled message eb0500
| [16945.111280784,3] FSP: Unhandled message eb0500
| [16945.293393485,3] FSP: Unhandled message eb0500
|
| With this patch, I don't think FSP will ever call the "free vNVRAM" MBOX
| command. But to be on the safer side, let's return
| FSP_STATUS_INVALID_SUBCMD for this MBOX command as well.
|
| Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Tested-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* capp: Add lid definitions for P9 DD-2.0 & DD-2.1 (Vaibhav Jain, 2017-10-06, 1 file, -0/+4)
| Update fsp_lid_map to include CAPP ucode lids for phb4-chipid == 0x200d1
| and phb4-chipid == 0x201d1 that corresponds to P9 DD-2.0 & DD-2.1 chips
| respectively.
|
| Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
| Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* fsp: Move common prints to trace (Michael Neuling, 2017-09-20, 1 file, -2/+2)
| These two prints just end up filling the skiboot logs on any machine
| that's been booted for more than a few hours.
|
| They have never been useful, so make them trace level.
|
| Signed-off-by: Michael Neuling <mikey@neuling.org>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP/CONSOLE: Do not enable input irq in write path (Vasant Hegde, 2017-07-21, 1 file, -3/+0)
| We use irq for reading input from console, but not in output path.
| Hence do not enable input irq in write path.
|
| Fixes : 583c8203 (fsp/console: Allocate irq for each hvc console)
| CC: Sam Mendoza-Jonas <sam@mendozajonas.com>
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Acked-By: Samuel Mendoza-Jonas <sam@mendozajonas.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP: Add check to detect FSP R/R inside fsp_sync_msg() (Vasant Hegde, 2017-06-21, 1 file, -2/+11)
| OPAL sends MBOX message to FSP and updates message state from
| fsp_msg_queued -> fsp_msg_sent. fsp_sync_msg() queues message and waits
| until we get response from FSP. During FSP R/R we move outstanding MBOX
| messages from msgq to rr_queue including inflight message
| (fsp_reset_cmdclass()). But we are not resetting inflight message state.
|
| In an extreme corner case where we sent a message to FSP via the
| fsp_sync_msg() path and FSP R/R happens before getting a response from
| FSP, we will end up waiting in fsp_sync_msg() until everything becomes
| normal.
|
| This patch adds fsp_in_rr() check to fsp_sync_msg() and returns an error
| to the caller if FSP is in R/R.
|
| CC: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* capi: Load capp microcode (Christophe Lombard, 2017-06-19, 1 file, -0/+2)
| CAPP microcode flash download and CAPP upload for PHB4.
|
| A new file 'capp.c' is created to receive common capp code for PHB3 and
| PHB4.
|
| Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
| Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP/CONSOLE: Fix possible NULL dereference (Vasant Hegde, 2017-06-19, 1 file, -2/+7)
| Fix coverity warning message.
|
| Null pointer dereferences (NULL_RETURNS)
| /hw/fsp/fsp-console.c: 295 in fsp_open_vserial()
| 289
| 290             fs->open = true;
| 291
| 292             fs->poke_msg = fsp_mkmsg(FSP_CMD_VSERIAL_OUT, 2,
| 293                                      msg->data.words[0],
| 294                                      msg->data.words[1] & 0xffff);
| >>> CID 145796: Null pointer dereferences (NULL_RETURNS)
| >>> Dereferencing a null pointer "fs->poke_msg".
| 295             fs->poke_msg->user_data = fs;
| 296
| 297             fs->in_buf->partition_id = fs->out_buf->partition_id = part_id;
| 298             fs->in_buf->session_id = fs->out_buf->session_id = sess_id;
| 299             fs->in_buf->hmc_id = fs->out_buf->hmc_id = hmc_indx;
| 300             fs->in_buf->data_offset = fs->out_buf->data_offset =
|
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP/CONSOLE: Workaround for unresponsive ipmi daemon (Vasant Hegde, 2017-06-14, 1 file, -1/+17)
| We use a TCE mapped area to write data to the console. The console
| header (fsp_serbuf_hdr) is modified by both FSP and OPAL (OPAL updates
| the next_in pointer in fsp_serbuf_hdr and FSP updates the next_out
| pointer).
|
| Kernel makes the opal_console_write() OPAL call to write data to the
| console. OPAL writes data to the TCE mapped area and sends an MBOX
| command to FSP. If our console becomes full and we have data to write to
| the console, we keep on waiting until FSP reads data.
|
| In some corner cases, where FSP is active but not responding to console
| MBOX messages (due to buggy IPMI) and we have heavy console writes
| happening from the kernel, eventually our console buffer becomes full.
| At this point OPAL starts sending OPAL_BUSY_EVENT to the kernel. The
| kernel will keep on retrying. This creates kernel soft lockups. In some
| extreme cases when every CPU is trying to write to the console, the user
| will not be able to ssh in and will think the system has hung.
|
| If we reset FSP or restart the IPMI daemon on FSP, the system recovers
| and everything becomes normal.
|
| This patch adds a workaround for the above issue by returning
| OPAL_HARDWARE when the console is full. A side effect of this patch is
| that we may end up dropping the latest console data. But it is better to
| drop console data than to hang the system.
|
| An alternative approach is to drop old data from the console buffer to
| make space for new data. But in normal conditions only FSP can update
| the 'next_out' pointer, and if we touch that pointer it may introduce
| other race conditions. Hence we decided to just drop the new console
| write request.
|
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP: Set status field in response message for timed out message (Vasant Hegde, 2017-06-14, 1 file, -1/+4)
| For timed out FSP messages, we set message status as "fsp_msg_timeout".
| But most FSP driver users (like surveillance) are ignoring this field.
| They always look for the FSP returned status value in the callback
| function (second byte in word1). So we end up treating a timed out
| message as a success response from FSP.
|
| Sample output:
| [69902.432509048,7] SURV: Sending the heartbeat command to FSP
| [70023.226860117,4] FSP: Response from FSP timed out, word0 = d66a00d7, word1 = 0 state: 3
| ....
| [70023.226901445,7] SURV: Received heartbeat acknowledge from FSP
| [70023.226903251,3] FSP: fsp_trigger_reset() entry
|
| Here the SURV code thought it got a valid response from FSP. But
| actually we didn't receive a response from FSP.
|
| This patch fixes the above issue by updating the status field in the
| response structure.
|
| CC: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
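The failure mode in this commit is that callbacks read a status byte from the response's word1, which stays 0 (success) when no response ever arrived. A minimal sketch of the idea follows, assuming the status occupies bits 8-15 of word1 and using a made-up non-zero timeout code; `struct resp_model`, `resp_status()`, `mark_timed_out()`, and `STATUS_TIMEOUT` are hypothetical names, not skiboot's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STATUS_SHIFT   8      /* assumption: status is bits 8-15 of word1 */
#define STATUS_MASK    0xffu
#define STATUS_TIMEOUT 0xfeu  /* hypothetical non-zero "timed out" code */

struct resp_model {
	uint32_t word0;
	uint32_t word1;
};

/* What a callback would read to decide success (0) or failure. */
static uint8_t resp_status(const struct resp_model *r)
{
	assert(r != NULL);
	return (uint8_t)((r->word1 >> STATUS_SHIFT) & STATUS_MASK);
}

/* On timeout, overwrite only the status byte so a callback that checks
 * resp_status() no longer mistakes the empty response for success. */
static void mark_timed_out(struct resp_model *r)
{
	assert(r != NULL);
	r->word1 &= ~(STATUS_MASK << STATUS_SHIFT);
	r->word1 |= (uint32_t)STATUS_TIMEOUT << STATUS_SHIFT;
}
```

With the sample log's values (word1 = 0), the status reads as 0 before the call and non-zero after it.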
* FSP: Improve timeout message (Vasant Hegde, 2017-06-14, 1 file, -4/+5)
| Presently we print word0 and word1 in the error log. word0 contains the
| sequence number and command class. One has to understand the word0
| format to identify the command class.
|
| Let's explicitly print command class, sub command etc.
|
| CC: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP/RTC: Remove local fsp_in_reset variable (Vasant Hegde, 2017-06-14, 1 file, -10/+0)
| Now that we are using fsp_in_rr() to detect FSP reset/reload,
| fsp_in_reset has become redundant. Let's remove this local variable.
|
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP/RTC: Fix possible FSP R/R issue in rtc write path (Vasant Hegde, 2017-06-14, 1 file, -9/+11)
| fsp_opal_rtc_write() checks FSP status before queueing a message to FSP.
| But if FSP R/R starts before we get a response to the queued message,
| then we will continue to return OPAL_BUSY_EVENT to the host. In some
| extreme conditions the host may experience a hang. Once FSP is back we
| will repost the message, get a response from FSP and return OPAL_SUCCESS
| to the host.
|
| This patch caches new values and returns OPAL_SUCCESS if FSP R/R is
| happening. And once FSP is back we will send the cached value to FSP.
|
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
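The cache-and-replay behaviour this commit describes can be modelled in a few lines. The sketch below is an illustration only: `struct rtc_model`, `rtc_write()`, and `rtc_rr_complete()` are hypothetical names, the `fsp_in_rr` flag stands in for the real `fsp_in_rr()` check, and "sending to the FSP" is reduced to storing the value in a field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model state, not skiboot's fsp-rtc.c structures. */
struct rtc_model {
	bool     fsp_in_rr;      /* stands in for fsp_in_rr() */
	bool     write_pending;  /* a cached write still has to reach the FSP */
	uint32_t cached_ymd;
	uint64_t cached_hmsm;
	uint32_t fsp_ymd;        /* what the modelled FSP last received */
	uint64_t fsp_hmsm;
};

/* Returns 0 for success. During R/R the value is cached and success is
 * reported at once, rather than making the host retry indefinitely. */
static int rtc_write(struct rtc_model *m, uint32_t ymd, uint64_t hmsm)
{
	assert(m != NULL);

	m->cached_ymd = ymd;
	m->cached_hmsm = hmsm;

	if (m->fsp_in_rr) {
		m->write_pending = true;
		return 0;
	}

	m->fsp_ymd = ymd;
	m->fsp_hmsm = hmsm;
	return 0;
}

/* Called when the FSP is back: replay the cached value, if any. */
static void rtc_rr_complete(struct rtc_model *m)
{
	assert(m != NULL);

	m->fsp_in_rr = false;
	if (m->write_pending) {
		m->fsp_ymd = m->cached_ymd;
		m->fsp_hmsm = m->cached_hmsm;
		m->write_pending = false;
	}
}
```

The trade-off is that the host sees success before the FSP has actually stored the time; the commit accepts that in exchange for not hanging the host.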
* hw/fsp/rtc: read/write cached rtc tod on fsp hir. (ppaidipe@linux.vnet.ibm.com, 2017-06-14, 1 file, -2/+2)
| Currently fsp-rtc reads/writes the cached RTC TOD on an fsp reset. Use
| latest fsp_in_rr() function to properly read the cached rtc value when
| fsp reset initiated by the hir.
|
| Below is the kernel trace when we set hw clock, when hir process starts.
|
| [ 1727.775824] NMI watchdog: BUG: soft lockup - CPU#57 stuck for 23s! [hwclock:7688]
| [ 1727.775856] Modules linked in: vmx_crypto ibmpowernv ipmi_powernv uio_pdrv_genirq ipmi_devintf powernv_op_panel uio ipmi_msghandler powernv_rng leds_powernv ip_tables x_tables autofs4 ses enclosure scsi_transport_sas crc32c_vpmsum lpfc ipr tg3 scsi_transport_fc
| [ 1727.775883] CPU: 57 PID: 7688 Comm: hwclock Not tainted 4.10.0-14-generic #16-Ubuntu
| [ 1727.775883] task: c000000fdfdc8400 task.stack: c000000fdfef4000
| [ 1727.775884] NIP: c00000000090540c LR: c0000000000846f4 CTR: 000000003006dd70
| [ 1727.775885] REGS: c000000fdfef79a0 TRAP: 0901 Not tainted (4.10.0-14-generic)
| [ 1727.775886] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>
| [ 1727.775889] CR: 28024442 XER: 20000000
| [ 1727.775890] CFAR: c00000000008472c SOFTE: 1
| GPR00: 0000000030005128 c000000fdfef7c20 c00000000144c900 fffffffffffffff4
| GPR04: 0000000028024442 c00000000090540c 9000000000009033 0000000000000000
| GPR08: 0000000000000000 0000000031fc4000 c000000000084710 9000000000001003
| GPR12: c0000000000846e8 c00000000fba0100
| [ 1727.775897] NIP [c00000000090540c] opal_set_rtc_time+0x4c/0xb0
| [ 1727.775899] LR [c0000000000846f4] opal_return+0xc/0x48
| [ 1727.775899] Call Trace:
| [ 1727.775900] [c000000fdfef7c20] [c00000000090540c] opal_set_rtc_time+0x4c/0xb0 (unreliable)
| [ 1727.775901] [c000000fdfef7c60] [c000000000900828] rtc_set_time+0xb8/0x1b0
| [ 1727.775903] [c000000fdfef7ca0] [c000000000902364] rtc_dev_ioctl+0x454/0x630
| [ 1727.775904] [c000000fdfef7d40] [c00000000035b1f4] do_vfs_ioctl+0xd4/0x8c0
| [ 1727.775906] [c000000fdfef7de0] [c00000000035bab4] SyS_ioctl+0xd4/0xf0
| [ 1727.775907] [c000000fdfef7e30] [c00000000000b184] system_call+0x38/0xe0
| [ 1727.775908] Instruction dump:
| [ 1727.775909] f821ffc1 39200000 7c832378 91210028 38a10020 39200000 38810028 f9210020
| [ 1727.775911] 4bfffe6d e8810020 80610028 4b77f61d <60000000> 7c7f1b78 3860000a 2fbffff4
|
| This is found when executing the testcase
| https://github.com/open-power/op-test-framework/blob/master/testcases/fspresetReload.py
|
| With this fix ran fsp hir torture testcase in the above test which is
| working fine.
|
| Signed-off-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
| Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
| Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP/CHIPTOD: Return false in error path (Vasant Hegde, 2017-06-14, 1 file, -0/+1)
| CC: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP/CONSOLE: Do not free fsp_msg in error path (Vasant Hegde, 2017-06-08, 1 file, -1/+0)
| .. as we reuse same msg to send next output message.
|
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP/CONSOLE: Remove __unused attribute from fsp_console_read() (Vasant Hegde, 2017-06-08, 1 file, -1/+1)
| ..as we use buffer to copy data.
|
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP/RTC: Improve error log (Vasant Hegde, 2017-06-08, 1 file, -1/+1)
| .. it makes easy to differentiate errors.
|
| Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* fsp/tpo: Provide support for disabling TPO alarm (Vaibhav Jain, 2017-05-26, 1 file, -2/+8)
| This patch adds support for disabling a preconfigured Timed-Power-On
| (TPO) alarm on FSP based systems. Presently once a TPO alarm is
| configured from the kernel it will be triggered even if it's
| subsequently disabled.
|
| With this patch a TPO alarm can be disabled by passing
| y_m_d==hr_min==0 to fsp_opal_tpo_write(). A branch is added to the
| function to handle this case by sending FSP_CMD_TPO_DISABLE message to
| the FSP instead of the usual FSP_CMD_TPO_WRITE message. The kernel is
| expected to call opal_tpo_write() with y_m_d==hr_min==0 to request opal
| to disable the TPO alarm.
|
| Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
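The branch this commit adds amounts to choosing between two FSP commands based on whether both time arguments are zero. A minimal sketch of that selection follows; `enum tpo_cmd`, `struct tpo_request`, and `tpo_build_request()` are hypothetical stand-ins, and the enum values merely label the FSP_CMD_TPO_WRITE and FSP_CMD_TPO_DISABLE cases without reproducing their real encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Labels for the two cases; not the real FSP command encodings. */
enum tpo_cmd {
	TPO_CMD_WRITE,    /* stands in for FSP_CMD_TPO_WRITE */
	TPO_CMD_DISABLE   /* stands in for FSP_CMD_TPO_DISABLE */
};

struct tpo_request {
	enum tpo_cmd cmd;
	uint32_t     y_m_d;
	uint32_t     hr_min;
};

/* Fill `req` for a TPO write: both fields zero means "disable the
 * alarm", anything else programs a new alarm time. */
static void tpo_build_request(struct tpo_request *req,
			      uint32_t y_m_d, uint32_t hr_min)
{
	assert(req != NULL);

	req->y_m_d = y_m_d;
	req->hr_min = hr_min;
	req->cmd = (y_m_d == 0 && hr_min == 0) ? TPO_CMD_DISABLE
					       : TPO_CMD_WRITE;
}
```

Reusing the all-zero value as the disable request avoids adding a new OPAL call, at the cost of making that one value unusable as a real alarm time.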
* Remove multiple logging for un-handled fsp sub commands. (ppaidipe@linux.vnet.ibm.com, 2017-05-12, 1 file, -1/+0)
| If any new or unknown command needs to be handled, just log the
| unhandled message from fsp only; it is not required from fsp-dpo.
|
| cat /sys/firmware/opal/msglog | grep -i ,3
| [ 110.232114723,3] FSP: fsp_trigger_reset() entry
| [ 188.431793837,3] FSP #0: Link down, starting R&R
| [ 464.109239162,3] FSP #0: Got XUP with no pending message !
| [ 466.340598554,3] FSP-DPO: Unknown command 0xce0900
| [ 466.340600126,3] FSP: Unhandled message ce0900
|
| Signed-off-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* GCC7: fixes for -Wimplicit-fallthrough expected regexes (Stewart Smith, 2017-05-12, 1 file, -1/+1)
| It turns out GCC7 adds a useful warning and does fancy things like
| parsing your comments to work out that you intended to do the
| fallthrough.
|
| There are a few places where we don't match the regex. Fix them, as it's
| harmless to do so.
|
| Found by building on Fedora Rawhide in Travis.
|
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* FSP: Notify FSP of Platform Log ID after Host Initiated Reset Reload (Stewart Smith, 2017-05-10, 2 files, -22/+50)
| Triggering a Host Initiated Reset (when the host detects the FSP has
| gone out to lunch and should be rebooted), would cause "Unknown Command"
| messages to appear in the OPAL log.
|
| This patch implements those messages.
|
| How to trigger FSP RR(HIR):
|
| $ putmemproc 300000f8 0x00000000deadbeef
| s1 k0:n0:s0:p00
| ecmd_ppc putmemproc 300000f8 0x00000000deadbeef
|
| Log showing unknown command:
| / # cat /sys/firmware/opal/msglog | grep -i ,3
| [ 110.232114723,3] FSP: fsp_trigger_reset() entry
| [ 188.431793837,3] FSP #0: Link down, starting R&R
| [ 464.109239162,3] FSP #0: Got XUP with no pending message !
| [ 466.340598554,3] FSP-DPO: Unknown command 0xce0900
| [ 466.340600126,3] FSP: Unhandled message ce0900
|
| The message we need to handle is "Get PLID after host initiated FipS
| reset/reload". When the FSP comes back from HIR, it asks "hey, so, which
| error log explains why you rebooted me?". So, we tell it.
|
| Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* hw/fsp: Do not queue SP and SPCN class messages during reset/reload (Ananth N Mavinakayanahalli, 2017-03-16, 3 files, -0/+31)
| During FSP R/R, the FSP is inaccessible and will lose state. Messages to
| the FSP are generally queued for sending later. It does seem like the
| FSP fails to process any subsequent messages of certain classes (SP info
| -- ipmi) if it receives queued mbox messages it isn't expecting.
|
| In certain other cases (sensors), the FSP driver returns a default code
| (async completion) even though there is no known bound from the time of
| this error return to the actual data being available. The kernel driver
| keeps waiting, leading to soft-lockup on the host side.
|
| Mitigate both these (known) cases by returning OPAL_BUSY so the host
| driver knows to retry later.
|
| With this change, the sensors command works fine when the FSP comes
| back.
|
| This version also resolves the remaining IPMI issues.
|
| Signed-off-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
| Tested-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* fsp-leds: add missing \n in duplicate loc code error msg (Stewart Smith, 2017-03-10, 1 file, -1/+1)
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
| Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* console: use opal_con_ops API (Oliver O'Halloran, 2017-01-04, 1 file, -6/+3)
| Adds a new structure that contains the implementations of the various
| OPAL console handlers. This is intended to replace the existing ad-hoc
| mechanism where the OPAL call handlers are overwritten in the OPAL
| console driver's init function.
|
| Currently this just moves the site where the OPAL call handlers are
| overwritten to inside of console.c, but it is intended to give us a
| mechanism for implementing features such as pointer validation for the
| OPAL console calls without having to manually update each driver.
|
| This also helps to clarify differences between the internal (skiboot)
| console and the external (OPAL) console.
|
| Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
| Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>