| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
When running in virtual memory mode, the radix MMU hid bit should not
be changed, so set this in the initial boot SPR setup.
As a side effect, fast reboot also has HID0:RADIX bit set by the
shared spr init, so no need for an explicit call.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ISA specifies that MCE interrupts in power saving modes will enter
at 0x200 with powersave bits in SRR1 set. This is not currently
supported properly, the MCE will just happen like a normal interrupt,
but GPRs could be lost, which would lead to crashes (e.g., r1, r2, r13
etc).
So check the power save bits similarly to the sreset vector, and
handle this properly.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
| |
This requires implementing the MSR[RI] bit. Then just allow all
non-fatal sreset exceptions to recover.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
| |
Detect non-powersave sresets and send them to the normal exception
handler which prints registers and stack.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
| |
The added nops now push it up in size, and -Os uninlines it for every
compilation unit that calls it more than once, so it's much better to
just uninline.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the per-core HID register is updated concurrently by multiple
threads, updates can get lost. This has been observed during fast
reboot where the HILE bit does not get cleared on all cores, which
can cause machine check exception interrupts to crash.
Fix this by only updating HID on thread0.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
| |
Enable a new PVR to get us running on another p9 variant.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
| |
This is not a shipping product and is no longer supported by Linux
or other firmware components.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
A certain finicky static analysis tool did point out that we were
operating on a value that could be null (and since first_cpu() calls
next_cpu(NULL) to get the first one, it also gets to be complained about
as next_cpu() could act on that NULL pointer).
So, rework things to shut the static analysis tool up, when in fact this
was never a problem.
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Users see these when loading an OS from Petitboot:
[ 119.486794100,5] OPAL: Switch to big-endian OS
[ 120.022302604,5] OPAL: Switch to little-endian OS
Which is expected and doesn't provide any information the user can act
on. Switch them to PR_INFO so they still appear in the log, but not on
the serial console.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs
This bug would result in boot-hang on some configurations due to
cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer.
Reported-by: Stephanie Swanson <swanman@us.ibm.com>
Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
|
| |
Instead of printing at the end if the job took more than 1s,
print in the loop every 30s along with a backtrace. This will
give us some output if the job is deadlocked.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[stewart: bump to 30s rather than 5s, preserve PR_DEBUG for >1s]
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The lock dependency checker does a few nasty things that can cause
re-entrancy deadlocks in conjunction with the stack checker or
in fact other debug tests.
A lot of it revolves around taking a new lock (dl_lock) as part
of the locking process.
This tries to fix it by making sure we do not hit the stack
checker while holding dl_lock.
We achieve that in part by directly using the low-level __try_lock
and manually unlocking on the dl_lock, and making some functions
"nomcount".
In addition, we mark the dl_lock as being in the console path to
avoid deadlocks with the UART driver.
We move the enabling of the deadlock checker to a separate config
option from DEBUG_LOCKS as well, in case we chose to disable it
by default later on.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"cpu_thread *t + value" vs "(void *)t + val"
Fixes: cfe9d441 (core/cpu: Prevent clobbering of stack guard for boot-cpu)
CC: stable <skiboot@lists.ozlabs.org> # v6.0+
CC: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
CC: Nicholas Piggin <npiggin@gmail.com>
CC: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Reviewed-by: Vaibhav Jain<vaibhav@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
| |
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
| |
Add a job scheduling API which will run the job on the requested
chip_id (or return failure).
Includes test harness fixes from Stewart.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Presently in case a cpu_thread queues a non returning job on itself,
the variable cpu_thread.job_has_no_return is never updated and other
cpu_threads can still queue a job on it without triggering any
warnings.
So this patch updates __cpu_queue_job() to ensure that
job_has_no_return is updated on the current cpu_thread before it
branches to the job->func(). So if the current job is non-returning
then other cpu_threads queuing a job on this cpu will trigger a
warning. This should aid in debugging some skiboot deadlocks.
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On a 176 thread system, before:
[ 122.319923233,5] OPAL: Switch to big-endian OS
[ 126.317897467,5] OPAL: Switch to little-endian OS
after:
[ 212.439299889,5] OPAL: Switch to big-endian OS
[ 212.469323643,5] OPAL: Switch to little-endian OS
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently do a rather pointless msgclr prior to setting
in_sleep/in_idle (with no ordering guarantee which isn't great).
We also do the final msgsync/msgclr after setting in_sleep/in_idle
back to false which while probably ok, isn't that great, we should
do msgsync first thing when waking up.
Finally, do p9_dbell_receive() before skip_sleep.
So take out the first msgclr, swap the final p9_dbell_receive() and
add a sync() for good measure and match what p8 does.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current code requests STOP3, which means it gets STOP2 in practice.
STOP2 has proven to occasionally be unreliable depending on FW
version and chip revision, it also requires a functional CME,
so instead, let's use STOP1. The difference is rather minimum
for something that is only used a few seconds during boot.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
| |
This is required by the architecture and the implementations, I've
observed failures to wake up on big cores without this.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
| |
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
| |
Currently if Linux boots with a non-zero PCR, things can go bad where
some early userspace programs can take illegal instructions. This is
being fixed in Linux, but in the mean time, we should cleanup in
skiboot also.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
| |
Fix the logic error in the loop that iterated incorrectly
over garded cpu.
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This detects OPAL being re-entered by the OS, and switches to an
emergency stack if it was. This protects the firmware's main stack
from re-entrancy and allows the OS to use NMI facilities for crash
/ debug functionality.
Further nested re-entry will destroy the previous emergency stack
and prevent returning, but those should be rare cases.
This stack is sized at 16kB, which doubles the size of CPU stacks,
so as not to introduce a regression in primary stack size. The 16kB
stack originally had a 4kB machine check stack at the top, which was
removed by 80eee1946 ("opal: Remove machine check interrupt patching
in OPAL."). So it is possible the size could be tightened again, but
that would require further analysis.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch reworks the HMI handling for TFAC errors by introducing
4 rendez-vous points improve the thread synchronization while handling
timebase errors that requires all thread to clear dirty data from TB/HDEC
register before clearing the errors.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 90d53934c2da ("core/cpu: discover stack region size before
initialising memory regions") introduced memzero for struct cpu_thread
in init_cpu_thread(). This has an unintended side effect of clobbering
the stack-guard cannery of the boot_cpu stack. This results in opal
failing to init with this failure message:
CPU: P9 generation processor (max 4 threads/core)
CPU: Boot CPU PIR is 0x0004 PVR is 0x004e1200
Guard skip = 0
Stack corruption detected !
Aborting!
CPU 0004 Backtrace:
S: 0000000031c13ab0 R: 0000000030013b0c .backtrace+0x5c
S: 0000000031c13b50 R: 000000003001bd18 ._abort+0x60
S: 0000000031c13be0 R: 0000000030013bbc .__stack_chk_fail+0x54
S: 0000000031c13c60 R: 00000000300c5b70 .memset+0x12c
S: 0000000031c13d00 R: 0000000030019aa8 .init_cpu_thread+0x40
S: 0000000031c13d90 R: 000000003001b520 .init_boot_cpu+0x188
S: 0000000031c13e30 R: 0000000030015050 .main_cpu_entry+0xd0
S: 0000000031c13f00 R: 0000000030002700 boot_entry+0x1c0
So the patch provides a fix by tweaking the memset() call in
init_cpu_thread() to skip over the stack-guard cannery.
Fixes:90d53934c2da("core/cpu: discover stack region size before initialising memory regions")
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Stack allocation first allocates a memory region sized to hold stacks
for all possible CPUs up to the maximum PIR of the architecture, zeros
the region, then initialises all stacks. Max PIR is 32768 on POWER9,
which is 512MB for stacks.
The stack region is then shrunk after CPUs are discovered, but this is
a bit of a hack, and it leaves a hole in the memory allocation regions
as it's done after mem regions are initialised.
0x000000000000..00002fffffff : ibm,os-reserve - OS
0x000030000000..0000303fffff : ibm,firmware-code - OPAL
0x000030400000..000030ffffff : ibm,firmware-heap - OPAL
0x000031000000..000031bfffff : ibm,firmware-data - OPAL
0x000031c00000..000031c0ffff : ibm,firmware-stacks - OPAL
*** gap ***
0x000051c00000..000051d01fff : ibm,firmware-allocs-memory@0 - OPAL
0x000051d02000..00007fffffff : ibm,firmware-allocs-memory@0 - OS
0x000080000000..000080b3cdff : initramfs - OPAL
0x000080b3ce00..000080b7cdff : ibm,fake-nvram - OPAL
0x000080b7ce00..0000ffffffff : ibm,firmware-allocs-memory@0 - OS
This change moves zeroing into the per-cpu stack setup. The boot CPU
stack is set up based on the current PIR. Then the size of the stack
region is set, by discovering the maximum PIR of the system from the
device tree, before mem regions are intialised.
This results in all memory being accounted within memory regions,
and less memory fragmentation of OPAL allocations.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
| |
This *dramatically* improves kernel boot time with GCOV builds
from ~3minutes between loading kernel and switching the HILE
bit down to around 10 seconds.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds simple deadlock detection. The detection looks for circular
dependencies in the lock requests. It will abort and display a stack trace
when a deadlock occurs.
The detection is enabled by DEBUG_LOCKS (enabled by default).
While the detection may have a slight performance overhead, as there are
not a huge number of locks in skiboot this overhead isn't significant.
Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com>
[stewart: fix build with DEBUG_LOCKS off]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently opal_reinit_cpus(OPAL_REINIT_CPUS_TM_SUSPEND_DISABLED)
always returns OPAL_UNSUPPORTED.
This ties the tm suspend fw-feature to the
opal_reinit_cpus(OPAL_REINIT_CPUS_TM_SUSPEND_DISABLED) so that when tm
suspend is disabled, we correctly report it to the kernel. For
backwards compatibility, it's assumed tm suspend is available if the
fw-feature is not present.
Currently hostboot will clear fw-feature(TM_SUSPEND_ENABLED) on P9N
DD2.1. P9N DD2.2 will set fw-feature(TM_SUSPEND_ENABLED). DD2.0 and
below has TM disabled completely (not just suspend).
We are using opal_reinit_cpus() to determine this setting (rather than
the device tree/HDAT) as some future firmware may let us change this
dynamically after boot. That is not the case currently though.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Cyril Bur <cyril.bur@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
| |
Way back when, we got confused between timebase and ms,
so let's just use ms and be done with it.
Fixes: 514406fa44279996bfc9c85c1e4e53689d375e64
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Keep track of lock owner name and replace lock_depth counter
with a per-cpu list of locks held by the cpu.
This allows us to print the actual locks held in case we hit
the (in)famous message about opal_pollers being run with a
lock held.
It also allows us to warn (and drop them) if locks are still
held when returning to the OS or completing a scheduled job.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[stewart: fix unit tests]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
| |
This gives us per-cpu guard values as well. For now I just
xor a magic constant with the CPU PIR value.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Quiescing is ensuring all host controlled CPUs (except the current
one) are out of OPAL and prevented from entering. This can be use in
debug and shutdown paths, particularly with system reset sequences.
This patch adds per-CPU entry and exit tracking for OPAL calls, and
adds logic to "hold" or "reject" at entry time, if OPAL is quiesced.
An OPAL call is added, to expose the functionality to Linux, where it
can be used for shutdown, kexec, and before generating sreset IPIs for
debugging (so the debug code does not recurse into OPAL).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
| |
This is a bit of paranoia, but when a CPU changes state to signal it
has reached a particular point, all previous stores should be visible.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
| |
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new CPU reinit flag, "TM Suspend Disabled", which requests that
CPUs be configured so that TM (Transactional Memory) suspend mode is
disabled.
Currently this always fails, because skiboot has no way to query the
state. A future hostboot change will add a mechanism for skiboot to
determine the status and return an appropriate error code.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If any of the core fails to sync its TB during chipTOD initialization,
all the threads of that core are disabled. But this does not make
linux kernel to ignore the core/cpus. It crashes while bringing them up
with below backtrace:
[ 38.883898] kexec_core: Starting new kernel
cpu 0x0: Vector: 300 (Data Access) at [c0000003f277b730]
pc: c0000000001b9890: internal_create_group+0x30/0x304
lr: c0000000001b9880: internal_create_group+0x20/0x304
sp: c0000003f277b9b0
msr: 900000000280b033
dar: 40
dsisr: 40000000
current = 0xc0000003f9f41000
paca = 0xc00000000fe00000 softe: 0 irq_happened: 0x01
pid = 2572, comm = kexec
Linux version 4.13.2-openpower1 (jenkins@p89) (gcc version 6.4.0 (Buildroot 2017.08-00006-g319c6e1)) #1 SMP Wed Sep 20 05:42:11 UTC 2017
enter ? for help
[c0000003f277b9b0] c0000000008a8780 (unreliable)
[c0000003f277ba50] c00000000041c3ac topology_add_dev+0x2c/0x40
[c0000003f277ba70] c00000000006b078 cpuhp_invoke_callback+0x88/0x170
[c0000003f277bac0] c00000000006b22c cpuhp_up_callbacks+0x54/0xb8
[c0000003f277bb10] c00000000006bc68 cpu_up+0x11c/0x168
[c0000003f277bbc0] c00000000002f0e0 default_machine_kexec+0x1fc/0x274
[c0000003f277bc50] c00000000002e2d8 machine_kexec+0x50/0x58
[c0000003f277bc70] c0000000000de4e8 kernel_kexec+0x98/0xb4
[c0000003f277bce0] c00000000008b0f0 SyS_reboot+0x1c8/0x1f4
[c0000003f277be30] c00000000000b118 system_call+0x58/0x6c
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add pm idle support to POWER9. IPIs are implemented with doorbells.
POWER9 can use the EC=ESL=0 (lite) stop when sreset is not available.
EC=ESL=1 state with RL=3 is enabled when we have a sreset wakeup.
Deep idle states are not implemented.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
| |
pm idle requires the system reset vector and IPI facilities before
it can be enabled. Split these out and manage them individually.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
| |
The idle code checks pm_enabled once at entry, then not again
until the idle exit condition is met. Change this to check
each opportunity and change idle type if necessary.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
| |
The caller isn't in a position to know about PM heuristics, so
move the minimum timeout before power managmeent into the cpu idle
call. There is no functional change.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
| |
Rather than setting decrementer to max in the case we want to
ignore it, just don't set it as a wakeup reason the in LPCR.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implements OPAL_SIGNAL_SYSTEM_RESET, using scom registers to
quiesce the target thread and raise a system reset exception on it.
It has been tested on DD2 with stop0 ESL=0 and ESL=1 shallow power
saving modes.
DD1 is not implemented because it is sufficiently different as to
make support difficult.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[stewart@linux.vnet.ibm.com: fixup hdat_to_dt test]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At the moment, if we get passed flags we don't know about, we
return OPAL_UNSUPPORTED but we still perform whatever actions
was requied by the flags we do support. Additionally, on P8,
we attempt a SLW re-init which hasn't been supported since
Murano DD2.0 and will crash your system.
It's too late to fix on existing systems so Linux will have to
be careful at least on P8, but to avoid future issues let's clean
that up, make sure we only use slw_reinit() when HILE isn't
supported.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[stewart@linux.vnet.ibm.com: retain OPAL_UNSUPPORTED]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
| |
This can work around problems where Linux fails to properly
cleanup part or all of the TLB on kexec.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's a bug in current Linux kernels leaving crap in those registers
accross kexec and not sanitizing them on boot. This breaks kexec under
some circumstances (such as booting a hash kernel from a radix one
on P9 DD2.0).
The long term fix is in Linux, but this workaround is a reasonable
way of "sanitizing" those SPRs when Linux calls opal_reinit_cpus()
and shouldn't have adverse effects.
We could also use that same mechanism to cleanup other things as
well such as restoring some other SPRs to their default value in
the future.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds new opal_reinit_cpus() flags to setup radix or hash
mode in HID[8] on POWER9.
By default HID[8] will be set. On P9 DD1.0, Linux will change
it as needed. On P9 DD2.0 hash works in radix mode (radix is
really "dual" mode) so KVM won't break and existing kernels
will work.
Newer kernels built for hash will call this to clear the HID bit
and thus get the full size of the TLB as an optimization.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
| |
Create a more generic helper for changing HID0 bits on all
processors.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|