| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Prior to this change, a switch statement mapped each RC found in the
PSU FFDC to a recovery action. That is hard to maintain because every
new error needs its own action. Instead, we now just check whether
any GARD records were created as part of the error found in the FFDC.
If a GARD record was found, Hostboot will stop trying to recover the
SBE and instead enter a reconfig loop to try to IPL with the target
garded out. This only applies to OP systems; in the FSP world we
will commit the error logs with the GARD records and then TI, telling
HWSV that it needs to look at the SBE.
Change-Id: I04e03feebf2bbd1eae2d725bee31993062fe7c94
Reviewed-on: http://rchgit01.rchland.ibm.com/gerrit1/66374
Reviewed-by: Matt Derksen <mderkse1@us.ibm.com>
Tested-by: Jenkins Server <pfd-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP Build CI <op-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP HW <op-hw-jenkins+hostboot@us.ibm.com>
Tested-by: FSP CI Jenkins <fsp-CI-jenkins+hostboot@us.ibm.com>
Reviewed-by: Roland Veloz <rveloz@us.ibm.com>
Reviewed-by: Daniel M. Crowell <dcrowell@us.ibm.com>
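
The decision described in this commit can be sketched roughly as follows. This is an illustrative model only, not the Hostboot code: `Platform`, `NextStep`, `ErrorLog` and `decideNextStep` are hypothetical names, and the FSP branch reflects one reading of the message (commit the logs and TI once a GARD record exists).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration; these are not Hostboot types.
enum class Platform { OpenPower, Fsp };
enum class NextStep { ContinueRecovery, ReconfigLoop, CommitAndTerminate };

struct ErrorLog {
    uint32_t rc;            // return code parsed from the SBE FFDC
    bool     hasGardRecord; // true if handling this error created a GARD record
};

// Decide what to do once the FFDC has been turned into error logs.
// No per-RC switch is needed: only the presence of a GARD record matters.
NextStep decideNextStep(const std::vector<ErrorLog>& logs, Platform platform)
{
    bool gardFound = false;
    for (const ErrorLog& log : logs) {
        if (log.hasGardRecord) {
            gardFound = true;
            break;
        }
    }

    if (!gardFound) {
        return NextStep::ContinueRecovery; // keep trying to recover the SBE
    }
    // Assumption: OP systems re-IPL with the target garded out, while
    // FSP systems commit the logs and terminate so HWSV can take over.
    return platform == Platform::OpenPower ? NextStep::ReconfigLoop
                                           : NextStep::CommitAndTerminate;
}
```

A new RC then needs no code change here, as long as its error handling creates the appropriate GARD record.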
| |
In some cases the SBE will return a proc target type as
part of the SBE FFDC buffer, but the current code only considers
non-proc types when converting the FFDC buffer contents to a target.
This commit correctly converts all valid target types
passed back in the FFDC buffer.
Change-Id: If9f3542f18b72652d3353b6f167a264fcba21352
CQ:SW444855
Reviewed-on: http://rchgit01.rchland.ibm.com/gerrit1/65832
Tested-by: Jenkins Server <pfd-jenkins+hostboot@us.ibm.com>
Reviewed-by: Christian R. Geddes <crgeddes@us.ibm.com>
Tested-by: HWSV CI <hwsv-ci+hostboot@us.ibm.com>
Reviewed-by: Dean Sanner <dsanner@us.ibm.com>
Tested-by: PPE CI <ppe-ci+hostboot@us.ibm.com>
Tested-by: Hostboot CI <hostboot-ci+hostboot@us.ibm.com>
Reviewed-by: Jennifer A. Stofer <stofer@us.ibm.com>
Reviewed-on: http://rchgit01.rchland.ibm.com/gerrit1/65835
Tested-by: Jenkins OP Build CI <op-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP HW <op-hw-jenkins+hostboot@us.ibm.com>
Tested-by: FSP CI Jenkins <fsp-CI-jenkins+hostboot@us.ibm.com>
| |
In the event that the SBE fails, Hostboot will attempt to recover it.
During runtime Hostboot will attempt an HRESET if the SBE is in a
failed state. When the SBE performs the HRESET it saves some
important information that persists through the reset. If one
side fails to recover, the retry code will attempt to switch sides
and do the HRESET. If the SBE SEEPROMs hold different versions of the
SBE code, the data that was supposed to persist through the HRESET will
be in the wrong places because of the version mismatch. Because of this
we cannot switch SEEPROM sides and perform an HRESET if the SEEPROMs hold
different levels of the SBE code.
CQ: SW438029
Change-Id: Ic7078a886088cc4d5355cc076e72d0fc36f85027
Reviewed-on: http://rchgit01.rchland.ibm.com/gerrit1/61605
Reviewed-by: Matt Derksen <mderkse1@us.ibm.com>
Tested-by: Jenkins Server <pfd-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP Build CI <op-jenkins+hostboot@us.ibm.com>
Tested-by: FSP CI Jenkins <fsp-CI-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP HW <op-hw-jenkins+hostboot@us.ibm.com>
Reviewed-by: Daniel M. Crowell <dcrowell@us.ibm.com>
Reviewed-by: William G. Hoffa <wghoffa@us.ibm.com>
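
The guard this commit describes might look something like the sketch below. The four-byte version layout and the names `SbeVersion`, `SeepromVersions` and `canSwitchSidesForHreset` are assumptions made for illustration; the message does not say how versions are stored or compared.

```cpp
#include <array>
#include <cstdint>

// Hypothetical version representation; the real layout is not given
// in the commit message.
using SbeVersion = std::array<uint8_t, 4>;

struct SeepromVersions {
    SbeVersion side0; // SBE code level on SEEPROM side 0
    SbeVersion side1; // SBE code level on SEEPROM side 1
};

// Switching sides before an HRESET is only safe when both sides hold the
// same SBE code level; otherwise the data preserved across the reset
// would be read from the wrong locations.
bool canSwitchSidesForHreset(const SeepromVersions& versions)
{
    return versions.side0 == versions.side1;
}
```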
| |
This commit introduces a new error log that is displayed
when a slave SBE fails to reach runtime after the initial start_cbs.
Previously nothing notified the user that Hostboot had failed
to boot the slave SBE and had given control to the FSP so that
HWSV can figure out what to do next. In addition to
this new error log, this commit updates the other errlCommit
calls in the sbe_retry_handler to ensure we use SBEIO as the
component ID for these error logs.
Change-Id: I73854f753a6186958d55909e8e37a605c1ad57c9
Reviewed-on: http://ralgit01.raleigh.ibm.com/gerrit1/58049
Tested-by: Jenkins Server <pfd-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP Build CI <op-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP HW <op-hw-jenkins+hostboot@us.ibm.com>
Tested-by: FSP CI Jenkins <fsp-CI-jenkins+hostboot@us.ibm.com>
Reviewed-by: Martin Gloff <mgloff@us.ibm.com>
Reviewed-by: Brian E. Bakke <bbakke@us.ibm.com>
Reviewed-by: Roland Veloz <rveloz@us.ibm.com>
Reviewed-by: Daniel M. Crowell <dcrowell@us.ibm.com>
| |
There are instances where we call the sbe_retry_handler
in the middle of a HWP. This happens when we attempt to use the FIFO
path to SCOM during a HWP and the SBE is dead. The sbe_retry_handler
is then called, and it will itself try to run HWPs.
If the sbe_retry_handler calls a HWP with FAPI_INVOKE while
handling a failure inside a HWP, the INVOKE call gets stuck waiting
for the FAPI mutex to be unlocked, which never happens. To get around
this we are changing all FAPI_INVOKE calls into FAPI_EXEC calls and
handling the returnCode from FAPI_EXEC locally.
Change-Id: I87e54be4ca738c3c327e6519093fb4b6848a542b
Reviewed-on: http://ralgit01.raleigh.ibm.com/gerrit1/57401
Reviewed-by: Martin Gloff <mgloff@us.ibm.com>
Tested-by: Jenkins Server <pfd-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP Build CI <op-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP HW <op-hw-jenkins+hostboot@us.ibm.com>
Tested-by: FSP CI Jenkins <fsp-CI-jenkins+hostboot@us.ibm.com>
Reviewed-by: Roland Veloz <rveloz@us.ibm.com>
Reviewed-by: Daniel M. Crowell <dcrowell@us.ibm.com>
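
The deadlock and the workaround can be illustrated with a small model. The functions `invokeLocked` and `execUnlocked` below are hypothetical stand-ins for the locking and non-locking call styles; they do not reproduce the real FAPI macros or their signatures.

```cpp
#include <mutex>

// Hypothetical stand-in for the FAPI mutex (non-recursive).
std::mutex g_hwpMutex;

// Models the locking call style: take the mutex, then run the procedure.
template <typename Hwp>
int invokeLocked(Hwp hwp)
{
    std::lock_guard<std::mutex> lock(g_hwpMutex);
    return hwp();
}

// Models the non-locking call style: run the procedure and hand the
// return code back to the caller, which must handle it locally.
template <typename Hwp>
int execUnlocked(Hwp hwp)
{
    return hwp();
}

// A recovery path that may run while an outer procedure already holds the
// mutex. Calling invokeLocked() here would never return in that case, so
// the nested procedure uses the non-locking form instead.
int recoverSbe()
{
    int rc = execUnlocked([] { return 0; /* pretend the nested HWP passed */ });
    return rc; // handled locally by the caller
}
```

For example, `invokeLocked([] { return recoverSbe(); })` completes, whereas the same nesting built from two `invokeLocked` calls would block on the second lock.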
| |
If we get an error (likely a SCOM or CFAM error) during the hreset and
start_cbs HWPs, we want to call out the processor. Previously we
were GARDing out the processor when this occurred. It was pointed out that
this might be a little harsh, so instead we are going to deconfigure the
processor.
Change-Id: Id5bfe0af392a4863ef2d225777bc17f3c308340e
Reviewed-on: http://ralgit01.raleigh.ibm.com/gerrit1/56960
Reviewed-by: Daniel M. Crowell <dcrowell@us.ibm.com>
Reviewed-by: Martin Gloff <mgloff@us.ibm.com>
Tested-by: Jenkins Server <pfd-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP Build CI <op-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP HW <op-hw-jenkins+hostboot@us.ibm.com>
Tested-by: FSP CI Jenkins <fsp-CI-jenkins+hostboot@us.ibm.com>
Reviewed-by: William G. Hoffa <wghoffa@us.ibm.com>
| |
Previously, if a PLID was passed to the ctor of the sbe_retry_handler,
we would link all error logs created while recovering
the SBE with this PLID, but if no PLID was passed we would not
link the logs. With this commit, if no PLID is passed to the
ctor, the first log created in the recovery process supplies
the PLID that all later logs are set to.
Change-Id: I93ef3a48b4cc1d7df3237d7ba3dfefba21d5fb6b
Reviewed-on: http://ralgit01.raleigh.ibm.com/gerrit1/56885
Reviewed-by: Martin Gloff <mgloff@us.ibm.com>
Tested-by: Jenkins Server <pfd-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP Build CI <op-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP HW <op-hw-jenkins+hostboot@us.ibm.com>
Tested-by: FSP CI Jenkins <fsp-CI-jenkins+hostboot@us.ibm.com>
Reviewed-by: Daniel M. Crowell <dcrowell@us.ibm.com>
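
A minimal sketch of the linking rule described above. `ErrorLog` and `PlidLinker` are hypothetical types, and two details are assumptions: that a PLID of 0 means "none passed", and that the first log contributes its own ID as the shared PLID.

```cpp
#include <cstdint>

// Hypothetical error log: its own ID plus the PLID that groups related logs.
struct ErrorLog {
    uint32_t eid;
    uint32_t plid;
};

class PlidLinker {
public:
    // Assumption: 0 means the caller did not pass a PLID.
    explicit PlidLinker(uint32_t initialPlid = 0) : iv_plid(initialPlid) {}

    // Link a newly created log into the recovery attempt.
    void link(ErrorLog& log)
    {
        if (iv_plid == 0) {
            iv_plid = log.eid; // first log supplies the PLID for the rest
        }
        log.plid = iv_plid;
    }

    uint32_t plid() const { return iv_plid; }

private:
    uint32_t iv_plid;
};
```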
| |
This reverts commit 4f32915aa1240d07bb2671010f95695ba5f306c3.
Change-Id: Ie51fd274d018df63aef6f725bf57c7b1f7f59265
Reviewed-on: http://ralgit01.raleigh.ibm.com/gerrit1/57026
CI-Ready: Christian R. Geddes <crgeddes@us.ibm.com>
Reviewed-by: Christian R. Geddes <crgeddes@us.ibm.com>
Tested-by: Christian R. Geddes <crgeddes@us.ibm.com>
| |
Before we initiate an HRESET we need to make sure that the perv
scratch registers that the SBE looks at during boot are all zeros.
When these registers are zero the SBE will ignore them and pull
values from its own image. We need to do this because Hostboot
repurposes these registers after the SBE boots the first time, so the
data in the registers is no longer valid. It is okay for the SBE to
pull the values from its own image because during the IPL Hostboot
customized the SBE image with the correct values.
Change-Id: I8b434d04cde3c384e35a3089a349a1d121b6b1dc
RTC: 180242
Reviewed-on: http://ralgit01.raleigh.ibm.com/gerrit1/56959
Reviewed-by: Martin Gloff <mgloff@us.ibm.com>
Tested-by: Jenkins Server <pfd-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP Build CI <op-jenkins+hostboot@us.ibm.com>
CI-Ready: Christian R. Geddes <crgeddes@us.ibm.com>
Tested-by: Jenkins OP HW <op-hw-jenkins+hostboot@us.ibm.com>
Reviewed-by: Roland Veloz <rveloz@us.ibm.com>
Reviewed-by: Daniel M. Crowell <dcrowell@us.ibm.com>
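
The pre-HRESET step can be modelled as below. The register count and the names `ScratchRegs`, `clearScratchRegs` and `sbeWillUseImageDefaults` are assumptions for illustration; real code would write the registers through SCOM or CFAM accesses rather than an in-memory array.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed number of scratch registers, chosen only for this sketch.
constexpr std::size_t kScratchRegCount = 8;
using ScratchRegs = std::array<uint32_t, kScratchRegCount>;

// Zero every scratch register so stale, repurposed data is not consumed
// by the SBE when it boots again after the HRESET.
void clearScratchRegs(ScratchRegs& regs)
{
    regs.fill(0);
}

// Models the SBE-side rule from the message: all-zero registers are
// ignored and the values customized into the SBE image are used instead.
bool sbeWillUseImageDefaults(const ScratchRegs& regs)
{
    return std::all_of(regs.begin(), regs.end(),
                       [](uint32_t value) { return value == 0; });
}
```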
| |
If PRD notices that an SBE vital attention is set (TP_LFIR 26), it
will call Hostboot code to attempt to recover the SBE. If this
occurs during IPL time, Hostboot will not be able to recover
the SBE and we will deconfigure the processor. If this occurs
during runtime, HBRT will attempt to run the retry_handler, which
calls hreset on the SBE that failed. If we are able
to recover the SBE, no error is returned. If we are unable
to recover the SBE, we return an error with a deconfig
record.
Change-Id: I3da6ec932ef8e59f7b2a184621a47e88d465e0c5
RTC: 167191
Reviewed-on: http://ralgit01.raleigh.ibm.com/gerrit1/56821
CI-Ready: Daniel M. Crowell <dcrowell@us.ibm.com>
Reviewed-by: Martin Gloff <mgloff@us.ibm.com>
Tested-by: Jenkins Server <pfd-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP Build CI <op-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP HW <op-hw-jenkins+hostboot@us.ibm.com>
Tested-by: FSP CI Jenkins <fsp-CI-jenkins+hostboot@us.ibm.com>
Reviewed-by: Roland Veloz <rveloz@us.ibm.com>
Reviewed-by: Daniel M. Crowell <dcrowell@us.ibm.com>
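
The flow in this commit message can be summarized in a few lines. This is a sketch under assumptions: `Phase`, `Outcome` and `handleVitalAttention` are hypothetical names, and the HRESET attempt is passed in as a callback so the example stays self-contained.

```cpp
#include <functional>

// Hypothetical names for illustration only.
enum class Phase { Ipl, Runtime };
enum class Outcome { Recovered, ProcDeconfigured };

// tryHreset stands in for running the retry handler's HRESET path and
// returns true when the SBE came back.
Outcome handleVitalAttention(Phase phase, const std::function<bool()>& tryHreset)
{
    if (phase == Phase::Ipl) {
        // No recovery is possible at IPL time; the processor is deconfigured.
        return Outcome::ProcDeconfigured;
    }
    // Runtime: attempt the HRESET and report a deconfig only on failure.
    return tryHreset() ? Outcome::Recovered : Outcome::ProcDeconfigured;
}
```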
| |
In the sbe_retry_handler code we have two methods we can use to restart
the SBE: restarting the CFAM boot sequence (start_cbs HWP) and
performing a hardware reset on the PPE (hreset HWP). We use start_cbs if there
are issues with the initial power-on of the slave proc's SBE because we will
not lose any state info (the fabric isn't up yet). During runtime we want
to use the hreset HWP to recover the SBE. Hreset is handy because it does
not force a reboot of the entire proc chip, so the fabric can stay up while
we reset the PPE in the SBE. This commit implements the code path for the
hreset HWP in the sbe_retry_handler. In addition, this commit enables calls to
the sbe_retry_handler in rt_fwnotify's sbeAttemptRecovery function, which
handles PHYP requests to recover the SBE.
(Also fixes some small typos in related code.)
Change-Id: I8f85c38a09e8d5ab80b2809e5665c77a54e35bc4
CQ: SW415675
RTC: 180242
Reviewed-on: http://ralgit01.raleigh.ibm.com/gerrit1/56276
Reviewed-by: Martin Gloff <mgloff@us.ibm.com>
Tested-by: Jenkins Server <pfd-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP Build CI <op-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP HW <op-hw-jenkins+hostboot@us.ibm.com>
Reviewed-by: Roland Veloz <rveloz@us.ibm.com>
Tested-by: FSP CI Jenkins <fsp-CI-jenkins+hostboot@us.ibm.com>
Reviewed-by: Daniel M. Crowell <dcrowell@us.ibm.com>
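
The choice between the two restart methods reduces to a small decision, sketched here with hypothetical names (`RestartMethod`, `chooseRestartMethod`); the real handler carries more state than a single flag.

```cpp
// Hypothetical names for illustration only.
enum class RestartMethod { StartCbs, Hreset };

// atRuntime is false during the initial power-on of the slave proc's SBE,
// when the fabric is not up yet and restarting the CFAM boot sequence
// loses no state. At runtime the fabric must stay up, so only the PPE is
// reset.
RestartMethod chooseRestartMethod(bool atRuntime)
{
    return atRuntime ? RestartMethod::Hreset : RestartMethod::StartCbs;
}
```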
| |
Previously the sbe_retry_handler had logic and wording that
assumed it was being used to tell whether the slave SBE booted or not.
However, this code has many more use cases than that. There was also some
indirect recursion that made the code hard to follow. With this refactor
the code should be easier to follow and the vocabulary used should be more
generic.
Change-Id: If6520197b3dd561857e336ed89d9356c1f2601d6
CQ: SW416106
RTC: 167191
Reviewed-on: http://ralgit01.raleigh.ibm.com/gerrit1/55896
Tested-by: Jenkins Server <pfd-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP Build CI <op-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP HW <op-hw-jenkins+hostboot@us.ibm.com>
Reviewed-by: Daniel M. Crowell <dcrowell@us.ibm.com>
Tested-by: Daniel M. Crowell <dcrowell@us.ibm.com>
| |
There is a path in the sbe_retry_handler that is missing a
visible error (once we realize that there is an issue with the
SBE and we are attempting a boot). This commit adds that error.
Change-Id: Ia613d0b5210aec7bfa923f565d3f57192e585361
Reviewed-on: http://ralgit01.raleigh.ibm.com/gerrit1/53774
Tested-by: Jenkins Server <pfd-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP Build CI <op-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP HW <op-hw-jenkins+hostboot@us.ibm.com>
Tested-by: FSP CI Jenkins <fsp-CI-jenkins+hostboot@us.ibm.com>
Reviewed-by: Roland Veloz <rveloz@us.ibm.com>
CI-Ready: Christian R. Geddes <crgeddes@us.ibm.com>
Reviewed-by: Christian R. Geddes <crgeddes@us.ibm.com>
Reviewed-by: Martin Gloff <mgloff@us.ibm.com>
Reviewed-by: William G. Hoffa <wghoffa@us.ibm.com>
| |
Updated sendProgressCode() so that it would only output trace or
console messages once per step/substep. Replaced most calls to
IPMIWATCHDOG::resetWatchDogTimer() with calls to sendProgressCode()
which calls resetWatchDogTimer() if communicating with a BMC, or
updates the FSP with its progress on FSP-based systems.
Change-Id: I29beb7ce5cdae467d26a0b2c5fee9e3cc4629161
RTC: 169682
Reviewed-on: http://ralgit01.raleigh.ibm.com/gerrit1/53995
Reviewed-by: William G. Hoffa <wghoffa@us.ibm.com>
Reviewed-by: Prachi Gupta <pragupta@us.ibm.com>
Tested-by: Jenkins Server <pfd-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP Build CI <op-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP HW <op-hw-jenkins+hostboot@us.ibm.com>
Tested-by: FSP CI Jenkins <fsp-CI-jenkins+hostboot@us.ibm.com>
Reviewed-by: Daniel M. Crowell <dcrowell@us.ibm.com>
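
One way to picture the updated behaviour is the model below. `ProgressReporter` and its counters are hypothetical, standing in for the real trace, watchdog and FSP side effects. It also assumes that the keep-alive action happens on every call while only the trace output is deduplicated.

```cpp
#include <cstdint>

// Hypothetical model; the counters replace real side effects.
class ProgressReporter {
public:
    explicit ProgressReporter(bool bmcBased) : iv_bmcBased(bmcBased) {}

    void sendProgressCode(uint8_t step, uint8_t substep)
    {
        // Trace/console output only the first time a step/substep is seen
        // in a row.
        if (!iv_seen || step != iv_step || substep != iv_substep) {
            ++traceCount;
            iv_seen = true;
            iv_step = step;
            iv_substep = substep;
        }
        // Assumption: the keep-alive runs on every call. BMC systems reset
        // the watchdog; FSP systems get a progress update.
        if (iv_bmcBased) {
            ++watchdogResets;
        } else {
            ++fspUpdates;
        }
    }

    unsigned traceCount = 0;
    unsigned watchdogResets = 0;
    unsigned fspUpdates = 0;

private:
    bool iv_bmcBased;
    bool iv_seen = false;
    uint8_t iv_step = 0;
    uint8_t iv_substep = 0;
};
```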
|
We want to move the sbe_retry_handler.C and other files
associated with it to a common directory and makefile.
Change-Id: Ifc725709d23d9eec75d2f91b2be73728c91a8d86
RTC:180241
Reviewed-on: http://ralgit01.raleigh.ibm.com/gerrit1/53591
Tested-by: Jenkins Server <pfd-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP Build CI <op-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP HW <op-hw-jenkins+hostboot@us.ibm.com>
Tested-by: FSP CI Jenkins <fsp-CI-jenkins+hostboot@us.ibm.com>
Reviewed-by: Martin Gloff <mgloff@us.ibm.com>
Reviewed-by: Roland Veloz <rveloz@us.ibm.com>
Reviewed-by: Daniel M. Crowell <dcrowell@us.ibm.com>
|