talos-skiboot - Talos™ II skiboot sources

	Commit message (Collapse)	Author	Age	Files	Lines
*	asm/head.S: set POWER9 radix HID bit at entry	Nicholas Piggin	2019-04-17	1	-16/+2
	When running in virtual memory mode, the radix MMU hid bit should not be changed, so set this in the initial boot SPR setup. As a side effect, fast reboot also has HID0:RADIX bit set by the shared spr init, so no need for an explicit call. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	core/exceptions: implement support for MCE interrupts in powersave	Nicholas Piggin	2019-02-13	1	-4/+7
	The ISA specifies that MCE interrupts in power saving modes will enter at 0x200 with powersave bits in SRR1 set. This is not currently supported properly: the MCE will just happen like a normal interrupt, but GPRs (e.g., r1, r2, r13) could be lost, which would lead to crashes. So check the power save bits similarly to the sreset vector, and handle this properly. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	core/exceptions: allow recoverable sreset exceptions	Nicholas Piggin	2019-02-13	1	-0/+1
	This requires implementing the MSR[RI] bit. Then just allow all non-fatal sreset exceptions to recover. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	core/exceptions: implement an exception handler for non-powersave sresets	Nicholas Piggin	2019-02-13	1	-6/+29
	Detect non-powersave sresets and send them to the normal exception handler which prints registers and stack. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	core/cpu: do not inline cpu_relax	Nicholas Piggin	2019-02-12	1	-0/+12
	The added nops now push it up in size, and -Os uninlines it for every compilation unit that calls it more than once, so it's much better to just uninline. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	core/cpu: HID update race	Nicholas Piggin	2019-02-12	1	-2/+2
	If the per-core HID register is updated concurrently by multiple threads, updates can get lost. This has been observed during fast reboot where the HILE bit does not get cleared on all cores, which can cause machine check exception interrupts to crash. Fix this by only updating HID on thread0. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	Add PVR_TYPE_P9P	Reza Arbab	2019-02-10	1	-0/+1
	Enable a new PVR to get us running on another p9 variant. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	Remove POWER9N DD1 support	Nicholas Piggin	2019-01-25	1	-10/+5
	This is not a shipping product and is no longer supported by Linux or other firmware components. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	core/cpu.c: avoid container_of(NULL) in next_cpu()	Stewart Smith	2018-12-10	1	-5/+5
	A certain finicky static analysis tool did point out that we were operating on a value that could be null (and since first_cpu() calls next_cpu(NULL) to get the first one, it also gets to be complained about as next_cpu() could act on that NULL pointer). So, rework things to shut the static analysis tool up, when in fact this was never a problem. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	cpu: Quieten OS endian switch messages	Joel Stanley	2018-10-25	1	-2/+2
	Users see these when loading an OS from Petitboot:
		[ 119.486794100,5] OPAL: Switch to big-endian OS
		[ 120.022302604,5] OPAL: Switch to little-endian OS
	Which is expected and doesn't provide any information the user can act on. Switch them to PR_INFO so they still appear in the log, but not on the serial console. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	core/cpu: Fix memory allocation for job array	Vaidyanathan Srinivasan	2018-09-13	1	-2/+2
	fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs This bug would result in boot-hang on some configurations due to cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer. Reported-by: Stephanie Swanson <swanman@us.ibm.com> Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	cpu: Better output when waiting for a very long job	Benjamin Herrenschmidt	2018-08-16	1	-0/+5
	Instead of printing at the end if the job took more than 1s, print in the loop every 30s along with a backtrace. This will give us some output if the job is deadlocked. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [stewart: bump to 30s rather than 5s, preserve PR_DEBUG for >1s] Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	lock: Fix interactions between lock dependency checker and stack checker	Benjamin Herrenschmidt	2018-08-16	1	-0/+7
	The lock dependency checker does a few nasty things that can cause re-entrancy deadlocks in conjunction with the stack checker or in fact other debug tests. A lot of it revolves around taking a new lock (dl_lock) as part of the locking process. This tries to fix it by making sure we do not hit the stack checker while holding dl_lock. We achieve that in part by directly using the low-level __try_lock and manually unlocking on the dl_lock, and making some functions "nomcount". In addition, we mark the dl_lock as being in the console path to avoid deadlocks with the UART driver. We move the enabling of the deadlock checker to a separate config option from DEBUG_LOCKS as well, in case we choose to disable it by default later on. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	core/cpu: Call memset with proper cpu_thread offset	Vasant Hegde	2018-08-13	1	-1/+1
	"cpu_thread *t + value" vs "(void *)t + val" Fixes: cfe9d441 (core/cpu: Prevent clobbering of stack guard for boot-cpu) CC: stable <skiboot@lists.ozlabs.org> # v6.0+ CC: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> CC: Nicholas Piggin <npiggin@gmail.com> CC: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Acked-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
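The distinction in the message above is plain C pointer arithmetic: adding an offset to a typed struct pointer advances by whole structs, while adding it to a byte pointer advances by bytes. A minimal sketch, assuming a made-up `struct demo_thread` (the real `struct cpu_thread` layout is not shown in this log) and hypothetical helper names:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for skiboot's struct cpu_thread; the real layout differs. */
struct demo_thread {
	uint64_t stack_guard;	/* must survive the clear */
	uint32_t pir;
	uint32_t state;
	uint64_t scratch[4];
};

/* Zero everything after the first `skip` bytes of *t, leaving the prefix intact. */
static void demo_clear_after(struct demo_thread *t, size_t skip)
{
	/*
	 * Cast to a byte pointer first so that `+ skip` advances by bytes.
	 * Writing `t + skip` instead would advance by skip * sizeof(*t)
	 * and point far outside the object.
	 */
	memset((unsigned char *)t + skip, 0, sizeof(*t) - skip);
}

/* Byte distance covered by the typed expression `t + n`. */
static size_t demo_typed_step(size_t n)
{
	return n * sizeof(struct demo_thread);
}
```

Calling `demo_clear_after(&t, offsetof(struct demo_thread, pir))` clears every field except the guard. The sketch casts to `unsigned char *` rather than `void *` because arithmetic on `void *` is a GCC extension, not standard C.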
*	core/cpu.c: assert pir is sane before using	Stewart Smith	2018-07-20	1	-0/+1
	Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	cpu: add cpu_queue_job_on_node()	Nicholas Piggin	2018-07-15	1	-16/+68
	Add a job scheduling API which will run the job on the requested chip_id (or return failure). Includes test harness fixes from Stewart. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	cpu: Ensure no-return flag is updated for current cpu_thread	Vaibhav Jain	2018-07-10	1	-0/+2
	Presently in case a cpu_thread queues a non returning job on itself, the variable cpu_thread.job_has_no_return is never updated and other cpu_threads can still queue a job on it without triggering any warnings. So this patch updates __cpu_queue_job() to ensure that job_has_no_return is updated on the current cpu_thread before it branches to the job->func(). So if the current job is non-returning then other cpu_threads queuing a job on this cpu will trigger a warning. This should aid in debugging some skiboot deadlocks. Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	core/cpu: parallelise global CPU register setting jobs	Nicholas Piggin	2018-07-04	1	-10/+37
	On a 176 thread system, before:
		[ 122.319923233,5] OPAL: Switch to big-endian OS
		[ 126.317897467,5] OPAL: Switch to little-endian OS
	after:
		[ 212.439299889,5] OPAL: Switch to big-endian OS
		[ 212.469323643,5] OPAL: Switch to little-endian OS
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	cpu: Cleanup clearing of doorbells on P9	Benjamin Herrenschmidt	2018-05-24	1	-4/+5
	We currently do a rather pointless msgclr prior to setting in_sleep/in_idle (with no ordering guarantee which isn't great). We also do the final msgsync/msgclr after setting in_sleep/in_idle back to false which while probably ok, isn't that great, we should do msgsync first thing when waking up. Finally, do p9_dbell_receive() before skip_sleep. So take out the first msgclr, swap the final p9_dbell_receive() and add a sync() for good measure and match what p8 does. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	cpu: Use STOP1 on POWER9 for idle/sleep inside OPAL	Benjamin Herrenschmidt	2018-05-24	1	-4/+4
	The current code requests STOP3, which means it gets STOP2 in practice. STOP2 has proven to occasionally be unreliable depending on FW version and chip revision, it also requires a functional CME, so instead, let's use STOP1. The difference is rather minimal for something that is only used a few seconds during boot. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	cpu: Do an isync after setting LPCR	Benjamin Herrenschmidt	2018-05-24	1	-0/+3
	This is required by the architecture and the implementations, I've observed failures to wake up on big cores without this. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	cpu: Remove duplicate setting of LPCR	Benjamin Herrenschmidt	2018-05-24	1	-1/+0
	Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	cpu: Clear PCR SPR in opal_reinit_cpus()	Michael Neuling	2018-05-18	1	-0/+1
	Currently if Linux boots with a non-zero PCR, things can go bad where some early userspace programs can take illegal instructions. This is being fixed in Linux, but in the meantime, we should clean up in skiboot also. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	core: Fix iteration condition to skip garded cpu	Vaidyanathan Srinivasan	2018-04-18	1	-1/+1
	Fix the logic error in the loop that iterated incorrectly over garded cpu. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	core/opal: Emergency stack for re-entry	Nicholas Piggin	2018-04-18	1	-5/+8
	This detects OPAL being re-entered by the OS, and switches to an emergency stack if it was. This protects the firmware's main stack from re-entrancy and allows the OS to use NMI facilities for crash / debug functionality. Further nested re-entry will destroy the previous emergency stack and prevent returning, but those should be rare cases. This stack is sized at 16kB, which doubles the size of CPU stacks, so as not to introduce a regression in primary stack size. The 16kB stack originally had a 4kB machine check stack at the top, which was removed by 80eee1946 ("opal: Remove machine check interrupt patching in OPAL."). So it is possible the size could be tightened again, but that would require further analysis. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	opal/hmi: Rework HMI handling of TFAC errors	Benjamin Herrenschmidt	2018-04-17	1	-2/+0
	This patch reworks the HMI handling for TFAC errors by introducing 4 rendez-vous points to improve the thread synchronization while handling timebase errors that require all threads to clear dirty data from the TB/HDEC registers before clearing the errors. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
*	core/cpu: Prevent clobbering of stack guard for boot-cpu	Vaibhav Jain	2018-04-04	1	-1/+5
	Commit 90d53934c2da ("core/cpu: discover stack region size before initialising memory regions") introduced memzero for struct cpu_thread in init_cpu_thread(). This has an unintended side effect of clobbering the stack-guard canary of the boot_cpu stack. This results in opal failing to init with this failure message:
		CPU: P9 generation processor (max 4 threads/core)
		CPU: Boot CPU PIR is 0x0004 PVR is 0x004e1200
		Guard skip = 0
		Stack corruption detected !
		Aborting!
		CPU 0004 Backtrace:
		 S: 0000000031c13ab0 R: 0000000030013b0c .backtrace+0x5c
		 S: 0000000031c13b50 R: 000000003001bd18 ._abort+0x60
		 S: 0000000031c13be0 R: 0000000030013bbc .__stack_chk_fail+0x54
		 S: 0000000031c13c60 R: 00000000300c5b70 .memset+0x12c
		 S: 0000000031c13d00 R: 0000000030019aa8 .init_cpu_thread+0x40
		 S: 0000000031c13d90 R: 000000003001b520 .init_boot_cpu+0x188
		 S: 0000000031c13e30 R: 0000000030015050 .main_cpu_entry+0xd0
		 S: 0000000031c13f00 R: 0000000030002700 boot_entry+0x1c0
	So the patch provides a fix by tweaking the memset() call in init_cpu_thread() to skip over the stack-guard canary. Fixes: 90d53934c2da ("core/cpu: discover stack region size before initialising memory regions") Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	core/cpu: discover stack region size before initialising memory regions	Nicholas Piggin	2018-03-27	1	-29/+37
	Stack allocation first allocates a memory region sized to hold stacks for all possible CPUs up to the maximum PIR of the architecture, zeros the region, then initialises all stacks. Max PIR is 32768 on POWER9, which is 512MB for stacks. The stack region is then shrunk after CPUs are discovered, but this is a bit of a hack, and it leaves a hole in the memory allocation regions as it's done after mem regions are initialised.
		0x000000000000..00002fffffff : ibm,os-reserve - OS
		0x000030000000..0000303fffff : ibm,firmware-code - OPAL
		0x000030400000..000030ffffff : ibm,firmware-heap - OPAL
		0x000031000000..000031bfffff : ibm,firmware-data - OPAL
		0x000031c00000..000031c0ffff : ibm,firmware-stacks - OPAL
		* gap *
		0x000051c00000..000051d01fff : ibm,firmware-allocs-memory@0 - OPAL
		0x000051d02000..00007fffffff : ibm,firmware-allocs-memory@0 - OS
		0x000080000000..000080b3cdff : initramfs - OPAL
		0x000080b3ce00..000080b7cdff : ibm,fake-nvram - OPAL
		0x000080b7ce00..0000ffffffff : ibm,firmware-allocs-memory@0 - OS
	This change moves zeroing into the per-cpu stack setup. The boot CPU stack is set up based on the current PIR. Then the size of the stack region is set, by discovering the maximum PIR of the system from the device tree, before mem regions are initialised. This results in all memory being accounted within memory regions, and less memory fragmentation of OPAL allocations. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	cpu_idle_job: relax a bit	Stewart Smith	2018-03-08	1	-0/+1
	This dramatically improves kernel boot time with GCOV builds from ~3minutes between loading kernel and switching the HILE bit down to around 10 seconds. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	core/lock: Add deadlock detection	Matt Brown	2018-03-07	1	-0/+3
	This adds simple deadlock detection. The detection looks for circular dependencies in the lock requests. It will abort and display a stack trace when a deadlock occurs. The detection is enabled by DEBUG_LOCKS (enabled by default). While the detection may have a slight performance overhead, as there are not a huge number of locks in skiboot this overhead isn't significant. Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> [stewart: fix build with DEBUG_LOCKS off] Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
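Looking for a circular dependency in lock requests can be sketched as a walk along the chain "lock wanted, its owner, the lock that owner is waiting on". The following is only an illustration of that idea under assumed data structures; `demo_lock`, `demo_cpu`, `demo_would_deadlock` and `DEMO_MAX_CPUS` are hypothetical names, not skiboot's actual implementation:

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_CPUS 8		/* hypothetical upper bound on CPUs */

struct demo_lock {
	int owner;		/* index of the owning CPU, or -1 when free */
};

struct demo_cpu {
	struct demo_lock *requested;	/* lock this CPU is waiting on, or NULL */
};

/*
 * Return true if CPU `self` asking for `want` would close a cycle:
 * follow owner -> requested lock -> owner ... and see if we reach `self`.
 */
static bool demo_would_deadlock(const struct demo_cpu cpus[], int self,
				const struct demo_lock *want)
{
	const struct demo_lock *l = want;

	/* Bound the walk so a cycle that does not include `self` cannot loop forever. */
	for (int hops = 0; l != NULL && hops < DEMO_MAX_CPUS; hops++) {
		int owner = l->owner;

		if (owner < 0 || owner >= DEMO_MAX_CPUS)
			return false;	/* lock is free (or owner unknown): no cycle */
		if (owner == self)
			return true;	/* the chain leads back to us */
		l = cpus[owner].requested;
	}
	return false;
}
```

With CPU 0 holding lock A and CPU 1 holding lock B while waiting on A, a request by CPU 0 for B is reported as a deadlock; a request for a free lock is not.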
*	Tie tm-suspend fw-feature and opal_reinit_cpus() together	Michael Neuling	2018-03-04	1	-5/+22
	Currently opal_reinit_cpus(OPAL_REINIT_CPUS_TM_SUSPEND_DISABLED) always returns OPAL_UNSUPPORTED. This ties the tm suspend fw-feature to the opal_reinit_cpus(OPAL_REINIT_CPUS_TM_SUSPEND_DISABLED) so that when tm suspend is disabled, we correctly report it to the kernel. For backwards compatibility, it's assumed tm suspend is available if the fw-feature is not present. Currently hostboot will clear fw-feature(TM_SUSPEND_ENABLED) on P9N DD2.1. P9N DD2.2 will set fw-feature(TM_SUSPEND_ENABLED). DD2.0 and below has TM disabled completely (not just suspend). We are using opal_reinit_cpus() to determine this setting (rather than the device tree/HDAT) as some future firmware may let us change this dynamically after boot. That is not the case currently though. Signed-off-by: Michael Neuling <mikey@neuling.org> Reviewed-by: Cyril Bur <cyril.bur@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	cpu_wait_job: Correctly report time spent waiting for job	Stewart Smith	2018-02-19	1	-3/+3
	Way back when, we got confused between timebase and ms, so let's just use ms and be done with it. Fixes: 514406fa44279996bfc9c85c1e4e53689d375e64 Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
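The timebase/ms confusion mentioned above is a unit mismatch: timebase ticks must be divided by the tick rate before being reported as milliseconds. A small sketch, assuming a 512 MHz timebase and hypothetical helper names (`demo_tb_to_msecs`, `demo_elapsed_msecs`); it does not reproduce the code touched by the commit:

```c
#include <stdint.h>

/* Assumption for illustration: the timebase ticks at 512 MHz. */
#define DEMO_TB_HZ 512000000ULL

/* Convert a timebase tick count to whole milliseconds (rounds down). */
static uint64_t demo_tb_to_msecs(uint64_t tb)
{
	return tb * 1000ULL / DEMO_TB_HZ;
}

/* Milliseconds elapsed between two timebase readings. */
static uint64_t demo_elapsed_msecs(uint64_t start_tb, uint64_t end_tb)
{
	return demo_tb_to_msecs(end_tb - start_tb);
}
```

Printing the raw tick difference as if it were milliseconds would overstate the wait by a factor of 512,000 under this assumption, which is the kind of error the fix describes.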
*	lock: Add additional lock auditing code	Benjamin Herrenschmidt	2017-12-20	1	-0/+6
	Keep track of lock owner name and replace lock_depth counter with a per-cpu list of locks held by the cpu. This allows us to print the actual locks held in case we hit the (in)famous message about opal_pollers being run with a lock held. It also allows us to warn (and drop them) if locks are still held when returning to the OS or completing a scheduled job. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [stewart: fix unit tests] Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	Add support for new gcc 7 parametrized stack protector	Benjamin Herrenschmidt	2017-12-20	1	-2/+10
	This gives us per-cpu guard values as well. For now I just xor a magic constant with the CPU PIR value. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
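Deriving a per-CPU guard by xoring a constant with the PIR can be sketched in a few lines. The constant below is an arbitrary placeholder and `demo_stack_guard` is a hypothetical name; the value and code skiboot actually uses are not given in this log:

```c
#include <stdint.h>

/* Arbitrary placeholder constant, chosen only for this illustration. */
#define DEMO_GUARD_MAGIC 0x0123456789abcdefULL

/* Per-CPU stack guard value: the magic constant xored with the CPU's PIR. */
static uint64_t demo_stack_guard(uint32_t pir)
{
	return DEMO_GUARD_MAGIC ^ (uint64_t)pir;
}
```

Because xor with a fixed constant is a bijection, distinct PIRs always yield distinct guard values, and the PIR can be recovered from a guard by xoring with the constant again.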
*	core: Add support for quiescing OPAL	Nicholas Piggin	2017-12-03	1	-0/+5
	Quiescing is ensuring all host controlled CPUs (except the current one) are out of OPAL and prevented from entering. This can be used in debug and shutdown paths, particularly with system reset sequences. This patch adds per-CPU entry and exit tracking for OPAL calls, and adds logic to "hold" or "reject" at entry time, if OPAL is quiesced. An OPAL call is added, to expose the functionality to Linux, where it can be used for shutdown, kexec, and before generating sreset IPIs for debugging (so the debug code does not recurse into OPAL). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	fast-reboot: add more barriers around cpu state changes	Nicholas Piggin	2017-12-03	1	-0/+3
	This is a bit of paranoia, but when a CPU changes state to signal it has reached a particular point, all previous stores should be visible. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	fast-reboot: clean up some common cpu iteration processes with macros	Nicholas Piggin	2017-12-03	1	-0/+28
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	cpu: Add OPAL_REINIT_CPUS_TM_SUSPEND_DISABLED	Michael Ellerman	2017-10-16	1	-0/+10
	Add a new CPU reinit flag, "TM Suspend Disabled", which requests that CPUs be configured so that TM (Transactional Memory) suspend mode is disabled. Currently this always fails, because skiboot has no way to query the state. A future hostboot change will add a mechanism for skiboot to determine the status and return an appropriate error code. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	opal/cpu: Mark the core as bad while disabling threads of the core.	Mahesh Salgaonkar	2017-10-15	1	-0/+10
	If any core fails to sync its TB during chipTOD initialization, all the threads of that core are disabled. But this does not make the Linux kernel ignore the core/cpus. It crashes while bringing them up with the backtrace below:
		[ 38.883898] kexec_core: Starting new kernel
		cpu 0x0: Vector: 300 (Data Access) at [c0000003f277b730]
		pc: c0000000001b9890: internal_create_group+0x30/0x304
		lr: c0000000001b9880: internal_create_group+0x20/0x304
		sp: c0000003f277b9b0
		msr: 900000000280b033
		dar: 40
		dsisr: 40000000
		current = 0xc0000003f9f41000
		paca = 0xc00000000fe00000 softe: 0 irq_happened: 0x01
		pid = 2572, comm = kexec
		Linux version 4.13.2-openpower1 (jenkins@p89) (gcc version 6.4.0 (Buildroot 2017.08-00006-g319c6e1)) #1 SMP Wed Sep 20 05:42:11 UTC 2017
		enter ? for help
		[c0000003f277b9b0] c0000000008a8780 (unreliable)
		[c0000003f277ba50] c00000000041c3ac topology_add_dev+0x2c/0x40
		[c0000003f277ba70] c00000000006b078 cpuhp_invoke_callback+0x88/0x170
		[c0000003f277bac0] c00000000006b22c cpuhp_up_callbacks+0x54/0xb8
		[c0000003f277bb10] c00000000006bc68 cpu_up+0x11c/0x168
		[c0000003f277bbc0] c00000000002f0e0 default_machine_kexec+0x1fc/0x274
		[c0000003f277bc50] c00000000002e2d8 machine_kexec+0x50/0x58
		[c0000003f277bc70] c0000000000de4e8 kernel_kexec+0x98/0xb4
		[c0000003f277bce0] c00000000008b0f0 SyS_reboot+0x1c8/0x1f4
		[c0000003f277be30] c00000000000b118 system_call+0x58/0x6c
	Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	cpu: idle POWER9 power management implementation	Nicholas Piggin	2017-09-28	1	-4/+119
	Add pm idle support to POWER9. IPIs are implemented with doorbells. POWER9 can use the EC=ESL=0 (lite) stop when sreset is not available. EC=ESL=1 state with RL=3 is enabled when we have a sreset wakeup. Deep idle states are not implemented. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	cpu: idle split pm enable into sreset and ipi components	Nicholas Piggin	2017-09-28	1	-32/+59
	pm idle requires the system reset vector and IPI facilities before it can be enabled. Split these out and manage them individually. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	cpu: idle notice if pm state changes	Nicholas Piggin	2017-09-28	1	-4/+18
	The idle code checks pm_enabled once at entry, then not again until the idle exit condition is met. Change this to check each opportunity and change idle type if necessary. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	cpu: idle move the minimum PM latency into the idle code	Nicholas Piggin	2017-09-28	1	-1/+2
	The caller isn't in a position to know about PM heuristics, so move the minimum timeout before power management into the cpu idle call. There is no functional change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	cpu: avoid decrementer wakeups in case of cpu_wake_on_job idle	Nicholas Piggin	2017-09-28	1	-8/+7
	Rather than setting decrementer to max in the case we want to ignore it, just don't set it as a wakeup reason in the LPCR. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	core: POWER9 implement OPAL_SIGNAL_SYSTEM_RESET	Nicholas Piggin	2017-09-20	1	-0/+1
	This implements OPAL_SIGNAL_SYSTEM_RESET, using scom registers to quiesce the target thread and raise a system reset exception on it. It has been tested on DD2 with stop0 ESL=0 and ESL=1 shallow power saving modes. DD1 is not implemented because it is sufficiently different as to make support difficult. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [stewart@linux.vnet.ibm.com: fixup hdat_to_dt test] Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	cpu: Better handle unknown flags in opal_reinit_cpus()	Benjamin Herrenschmidt	2017-07-12	1	-3/+3
	At the moment, if we get passed flags we don't know about, we return OPAL_UNSUPPORTED but we still perform whatever actions were required by the flags we do support. Additionally, on P8, we attempt a SLW re-init which hasn't been supported since Murano DD2.0 and will crash your system. It's too late to fix on existing systems so Linux will have to be careful at least on P8, but to avoid future issues let's clean that up, make sure we only use slw_reinit() when HILE isn't supported. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [stewart@linux.vnet.ibm.com: retain OPAL_UNSUPPORTED] Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	cpu: Unconditionally cleanup TLBs on P9 in opal_reinit_cpus()	Benjamin Herrenschmidt	2017-07-11	1	-2/+11
	This can work around problems where Linux fails to properly cleanup part or all of the TLB on kexec. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	cpu: Cleanup AMR and IAMR when re-initializing CPUs	Benjamin Herrenschmidt	2017-06-30	1	-0/+29
	There's a bug in current Linux kernels leaving crap in those registers across kexec and not sanitizing them on boot. This breaks kexec under some circumstances (such as booting a hash kernel from a radix one on P9 DD2.0). The long term fix is in Linux, but this workaround is a reasonable way of "sanitizing" those SPRs when Linux calls opal_reinit_cpus() and shouldn't have adverse effects. We could also use that same mechanism to cleanup other things as well such as restoring some other SPRs to their default value in the future. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	cpu: Support setting HID[RADIX] and set it by default on P9	Benjamin Herrenschmidt	2017-06-26	1	-0/+38
	This adds new opal_reinit_cpus() flags to setup radix or hash mode in HID[8] on POWER9. By default HID[8] will be set. On P9 DD1.0, Linux will change it as needed. On P9 DD2.0 hash works in radix mode (radix is really "dual" mode) so KVM won't break and existing kernels will work. Newer kernels built for hash will call this to clear the HID bit and thus get the full size of the TLB as an optimization. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
*	cpu: Rework HILE change	Benjamin Herrenschmidt	2017-06-26	1	-28/+43
	Create a more generic helper for changing HID0 bits on all processors. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>