path: root/arch/powerpc/mm
Commit message  Author  Age  Files  Lines
...
| * | powerpc/mm: Extend pte_fragment functionality to PPC32  Christophe Leroy  2018-12-04  4  -22/+16
    In order to allow the 8xx to handle pte_fragments, this patch extends
    the use of pte_fragments to PPC32 platforms.

    Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/mm: add helpers to get/set mm.context->pte_frag  Christophe Leroy  2018-12-04  1  -4/+4
    In order to handle pte_fragment functions with single fragment
    without adding pte_frag in all mm_context_t, this patch creates
    two helpers which do nothing on platforms using a single fragment.

    Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/mm: Avoid useless lock with single page fragments  Christophe Leroy  2018-12-04  2  -0/+6
    There is no point in taking the page table lock as pte_frag or
    pmd_frag are always NULL when we have only one fragment.

    Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
    Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/mm: Move pte_fragment_alloc() to a common location  Christophe Leroy  2018-12-04  4  -101/+119
    In preparation for the next patch, which generalises the use of
    pte_fragment_alloc() for all, this patch moves the related functions
    to a place that is common to all subarches.

    The 8xx will need that for supporting 16k pages, as in that mode
    page tables still have a size of 4k.

    Since pte_fragment with only one fragment is not different from
    what is done in the general case, we can easily migrate all
    subarches to pte fragments.

    Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
    Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
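The 16k-page case mentioned in this commit comes down to simple arithmetic: a 16k page holds four 4k page tables, while a 4k page holds exactly one. The sketch below only illustrates that count; `pte_frag_nr` is a hypothetical helper name, not a kernel symbol, and the sizes are passed in rather than taken from any kernel configuration.

```c
#include <stddef.h>

/* Hypothetical helper (not a kernel function): how many page-table
 * fragments fit in one page. Returns 0 when the page cannot hold a
 * whole number of fragments. */
static size_t pte_frag_nr(size_t page_size, size_t pte_table_size)
{
    if (pte_table_size == 0 || page_size % pte_table_size != 0)
        return 0;
    return page_size / pte_table_size;
}
```

With a single fragment per page the result is 1, which is why that configuration can be treated as a special case of the general fragment scheme.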
| * | powerpc: change CONFIG_PPC_STD_MMU to CONFIG_PPC_BOOK3S  Christophe Leroy  2018-11-26  3  -5/+5
    Today we have:

        config PPC_BOOK3S
                def_bool y
                depends on PPC_BOOK3S_32 || PPC_BOOK3S_64

        config PPC_STD_MMU
                def_bool y
                depends on PPC_BOOK3S

    PPC_STD_MMU is therefore redundant with PPC_BOOK3S.

    Let's remove it.

    Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc: change CONFIG_PPC_STD_MMU_32 to CONFIG_PPC_BOOK3S_32  Christophe Leroy  2018-11-26  2  -2/+2
    Today we have:

        config PPC_BOOK3S_32
                bool "512x/52xx/6xx/7xx/74xx/82xx/83xx/86xx"
                [depends on PPC32 within a choice]

        config PPC_BOOK3S
                def_bool y
                depends on PPC_BOOK3S_32 || PPC_BOOK3S_64

        config PPC_STD_MMU
                def_bool y
                depends on PPC_BOOK3S

        config PPC_STD_MMU_32
                def_bool y
                depends on PPC_STD_MMU && PPC32

    PPC_STD_MMU_32 is therefore redundant with PPC_BOOK3S_32.

    In order to make the code clearer, let's prefer PPC_BOOK3S_32.
    This will allow CONFIG_PPC_STD_MMU_32 to be removed in a later patch.

    Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc: change CONFIG_6xx to CONFIG_PPC_BOOK3S_32  Christophe Leroy  2018-11-26  1  -1/+1
    Today we have:

        config PPC_BOOK3S_32
                bool "512x/52xx/6xx/7xx/74xx/82xx/83xx/86xx"
                [depends on PPC32 within a choice]

        config PPC_BOOK3S
                def_bool y
                depends on PPC_BOOK3S_32 || PPC_BOOK3S_64

        config 6xx
                def_bool y
                depends on PPC32 && PPC_BOOK3S

    6xx is therefore redundant with PPC_BOOK3S_32.

    In order to make the code clearer, let's prefer PPC_BOOK3S_32.
    This will allow CONFIG_6xx to be removed in a later patch.

    Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc: Use device_type helpers to access the node type  Rob Herring  2018-11-26  1  -1/+1
    Stop accessing the device_node.type pointer directly and use the
    accessors instead. This will eventually allow removing the type
    pointer.

    Replace the open-coded iteration over child nodes with
    for_each_child_of_node() while we're here.

    Signed-off-by: Rob Herring <robh@kernel.org>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/mm: Remove extern from function definition  Breno Leitao  2018-11-25  1  -3/+3
    Function huge_ptep_set_access_flags() has the 'extern' keyword in the
    function definition and also in the function declaration. This causes
    a warning in 'sparse' since the 'extern' storage class should not be
    used in the function definition.

        arch/powerpc/mm/pgtable.c:232:12: warning: function 'huge_ptep_set_access_flags' with external linkage has definition

    This patch removes the keyword from the definition part. It also
    removes the extern keyword from the declaration part, since
    checkpatch --strict complains about it.

    Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr>
    Signed-off-by: Breno Leitao <leitao@debian.org>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/pkey: Define functions as static  Breno Leitao  2018-11-25  1  -7/+8
    The sparse tool shows some warnings on the pkeys.c file, mainly
    related to storage class identifiers. There are static variables and
    functions not declared as such. The same thing happens with an extern
    function, which misses the header inclusion.

        arch/powerpc/mm/pkeys.c:14:6: warning: symbol 'pkey_execute_disable_supported' was not declared. Should it be static?
        arch/powerpc/mm/pkeys.c:16:6: warning: symbol 'pkeys_devtree_defined' was not declared. Should it be static?
        arch/powerpc/mm/pkeys.c:19:6: warning: symbol 'pkey_amr_mask' was not declared. Should it be static?
        arch/powerpc/mm/pkeys.c:20:6: warning: symbol 'pkey_iamr_mask' was not declared. Should it be static?
        arch/powerpc/mm/pkeys.c:21:6: warning: symbol 'pkey_uamor_mask' was not declared. Should it be static?
        arch/powerpc/mm/pkeys.c:22:6: warning: symbol 'execute_only_key' was not declared. Should it be static?
        arch/powerpc/mm/pkeys.c:60:5: warning: symbol 'pkey_initialize' was not declared. Should it be static?
        arch/powerpc/mm/pkeys.c:404:6: warning: symbol 'arch_vma_access_permitted' was not declared. Should it be static?

    This patch fixes all the warnings, basically turning all global
    variables that are not declared as extern in asm/pkeys.h into
    static. It also includes the asm/mmu_context.h header, which contains
    the definition of arch_vma_access_permitted.

    Signed-off-by: Breno Leitao <leitao@debian.org>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  Linus Torvalds  2018-12-26  1  -1/+1
    Pull RCU updates from Ingo Molnar:
     "The biggest RCU changes in this cycle were:

       - Convert RCU's BUG_ON() and similar calls to WARN_ON() and
         similar.

       - Replace calls of RCU-bh and RCU-sched update-side functions to
         their vanilla RCU counterparts. This series is a step towards
         complete removal of the RCU-bh and RCU-sched update-side
         functions.

         ( Note that some of these conversions are going upstream via
           their respective maintainers. )

       - Documentation updates, including a number of
         flavor-consolidation updates from Joel Fernandes.

       - Miscellaneous fixes.

       - Automate generation of the initrd filesystem used for
         rcutorture testing.

       - Convert spin_is_locked() assertions to instead use lockdep.

         ( Note that some of these conversions are going upstream via
           their respective maintainers. )

       - SRCU updates, especially including a fix from Dennis Krein for
         a bag-on-head-class bug.

       - RCU torture-test updates"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (112 commits)
      rcutorture: Don't do busted forward-progress testing
      rcutorture: Use 100ms buckets for forward-progress callback histograms
      rcutorture: Recover from OOM during forward-progress tests
      rcutorture: Print forward-progress test age upon failure
      rcutorture: Print time since GP end upon forward-progress failure
      rcutorture: Print histogram of CB invocation at OOM time
      rcutorture: Print GP age upon forward-progress failure
      rcu: Print per-CPU callback counts for forward-progress failures
      rcu: Account for nocb-CPU callback counts in RCU CPU stall warnings
      rcutorture: Dump grace-period diagnostics upon forward-progress OOM
      rcutorture: Prepare for asynchronous access to rcu_fwd_startat
      torture: Remove unnecessary "ret" variables
      rcutorture: Affinity forward-progress test to avoid housekeeping CPUs
      rcutorture: Break up too-long rcu_torture_fwd_prog() function
      rcutorture: Remove cbflood facility
      torture: Bring any extra CPUs online during kernel startup
      rcutorture: Add call_rcu() flooding forward-progress tests
      rcutorture/formal: Replace synchronize_sched() with synchronize_rcu()
      tools/kernel.h: Replace synchronize_sched() with synchronize_rcu()
      net/decnet: Replace rcu_barrier_bh() with rcu_barrier()
      ...
| * \ \ Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu  Ingo Molnar  2018-12-04  1  -1/+1
    Pull RCU changes from Paul E. McKenney:

    - Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar.

    - Replace calls of RCU-bh and RCU-sched update-side functions to
      their vanilla RCU counterparts. This series is a step towards
      complete removal of the RCU-bh and RCU-sched update-side functions.

      ( Note that some of these conversions are going upstream via their
        respective maintainers. )

    - Documentation updates, including a number of flavor-consolidation
      updates from Joel Fernandes.

    - Miscellaneous fixes.

    - Automate generation of the initrd filesystem used for rcutorture
      testing.

    - Convert spin_is_locked() assertions to instead use lockdep.

      ( Note that some of these conversions are going upstream via their
        respective maintainers. )

    - SRCU updates, especially including a fix from Dennis Krein for a
      bag-on-head-class bug.

    - RCU torture-test updates.

    Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | powerpc: Convert hugepd_free() to use call_rcu()  Paul E. McKenney  2018-11-08  1  -1/+1
    Now that call_rcu()'s callback is not invoked until after all
    preempt-disable regions of code have completed (in addition to
    explicitly marked RCU read-side critical sections), call_rcu() can be
    used in place of call_rcu_sched(). This commit therefore makes that
    change.

    Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: <linuxppc-dev@lists.ozlabs.org>
* | | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm  Linus Torvalds  2018-12-26  1  -0/+1
    Pull KVM updates from Paolo Bonzini:
     "ARM:
       - selftests improvements
       - large PUD support for HugeTLB
       - single-stepping fixes
       - improved tracing
       - various timer and vGIC fixes

      x86:
       - Processor Tracing virtualization
       - STIBP support
       - some correctness fixes
       - refactorings and splitting of vmx.c
       - use the Hyper-V range TLB flush hypercall
       - reduce order of vcpu struct
       - WBNOINVD support
       - do not use -ftracer for __noclone functions
       - nested guest support for PAUSE filtering on AMD
       - more Hyper-V enlightenments (direct mode for synthetic timers)

      PPC:
       - nested VFIO

      s390:
       - bugfixes only this time"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits)
      KVM: x86: Add CPUID support for new instruction WBNOINVD
      kvm: selftests: ucall: fix exit mmio address guessing
      Revert "compiler-gcc: disable -ftracer for __noclone functions"
      KVM: VMX: Move VM-Enter + VM-Exit handling to non-inline sub-routines
      KVM: VMX: Explicitly reference RCX as the vmx_vcpu pointer in asm blobs
      KVM: x86: Use jmp to invoke kvm_spurious_fault() from .fixup
      MAINTAINERS: Add arch/x86/kvm sub-directories to existing KVM/x86 entry
      KVM/x86: Use SVM assembly instruction mnemonics instead of .byte streams
      KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()
      KVM/MMU: Flush tlb directly in kvm_set_pte_rmapp()
      KVM/MMU: Move tlb flush in kvm_set_pte_rmapp() to kvm_mmu_notifier_change_pte()
      KVM: Make kvm_set_spte_hva() return int
      KVM: Replace old tlb flush function with new one to flush a specified range.
      KVM/MMU: Add tlb flush with range helper function
      KVM/VMX: Add hv tlb range flush support
      x86/hyper-v: Add HvFlushGuestAddressList hypercall support
      KVM: Add tlb_remote_flush_with_range callback in kvm_x86_ops
      KVM: x86: Disable Intel PT when VMXON in L1 guest
      KVM: x86: Set intercept for Intel PT MSRs read/write
      KVM: x86: Implement Intel PT MSRs read/write emulation
      ...
| * | | KVM: PPC: Book3S HV: Implement functions to access quadrants 1 & 2  Suraj Jitindar Singh  2018-12-17  1  -0/+1
    The POWER9 radix mmu has the concept of quadrants. The quadrant
    number is the two high bits of the effective address and determines
    the fully qualified address to be used for the translation. The fully
    qualified address consists of the effective lpid, the effective pid
    and the effective address. This gives 4 possible quadrants: 0, 1, 2,
    and 3.

    When accessing these quadrants the fully qualified address is
    obtained as follows:

    Quadrant | Hypervisor          | Guest
    ---------------------------------------------------------
             | EA[0:1] = 0b00      | EA[0:1] = 0b00
    0        | effLPID = 0         | effLPID = LPIDR
             | effPID  = PIDR      | effPID  = PIDR
    ---------------------------------------------------------
             | EA[0:1] = 0b01      |
    1        | effLPID = LPIDR     | Invalid Access
             | effPID  = PIDR      |
    ---------------------------------------------------------
             | EA[0:1] = 0b10      |
    2        | effLPID = LPIDR     | Invalid Access
             | effPID  = 0         |
    ---------------------------------------------------------
             | EA[0:1] = 0b11      | EA[0:1] = 0b11
    3        | effLPID = 0         | effLPID = LPIDR
             | effPID  = 0         | effPID  = 0
    ---------------------------------------------------------

    In the guest:
    Quadrant 3 is normally used to address the operating system since
    this uses effPID=0 and effLPID=LPIDR, meaning the PID register
    doesn't need to be switched.
    Quadrant 0 is normally used to address user space since the effLPID
    and effPID are taken from the corresponding registers.

    In the host:
    Quadrants 0 and 3 are used as above, however the effLPID is always 0
    to address the host.

    Quadrants 1 and 2 can be used by the host to address guest memory
    using a guest effective address. Since the effLPID comes from the
    LPID register, the host loads the LPID of the guest it would like to
    access (and the PID of the process) and can perform accesses to a
    guest effective address.

    This means quadrant 1 can be used to address the guest user space
    and quadrant 2 can be used to address the guest operating system from
    the hypervisor, using a guest effective address.

    Access to the quadrants can cause a Hypervisor Data Storage Interrupt
    (HDSI) due to being unable to perform partition scoped translation.
    Previously this could only be generated from a guest and so the code
    path expects us to take the KVM trampoline in the interrupt handler.
    This is no longer the case so we modify the handler to call
    bad_page_fault() to check if we were expecting this fault so we can
    handle it gracefully and just return with an error code. In the hash
    mmu case we still raise an unknown exception since quadrants aren't
    defined for the hash mmu.

    Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
    Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
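The quadrant table in this commit message can be read as a small lookup function. The following is a user-space sketch of that table only, not the KVM implementation: `struct fq_addr`, `ea_quadrant` and `resolve_quadrant` are hypothetical names, and `lpidr`/`pidr` are plain parameters standing in for the LPID and PID registers. It relies on EA[0:1] being the two most significant bits of a 64-bit effective address (IBM bit numbering).

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative result type: the effective LPID/PID chosen for an access. */
struct fq_addr {
    bool     valid;  /* false where the table says "Invalid Access" */
    uint32_t lpid;
    uint32_t pid;
};

/* EA[0:1] in IBM bit numbering are the two most significant bits. */
static unsigned int ea_quadrant(uint64_t ea)
{
    return (unsigned int)(ea >> 62);
}

/* Model of the table: lpidr and pidr stand in for the LPID/PID registers. */
static struct fq_addr resolve_quadrant(uint64_t ea, bool hypervisor,
                                       uint32_t lpidr, uint32_t pidr)
{
    struct fq_addr r = { .valid = true, .lpid = 0, .pid = 0 };

    switch (ea_quadrant(ea)) {
    case 0:
        r.lpid = hypervisor ? 0 : lpidr;
        r.pid = pidr;
        break;
    case 1: /* hypervisor only: guest user space */
        r.valid = hypervisor;
        r.lpid = lpidr;
        r.pid = pidr;
        break;
    case 2: /* hypervisor only: guest operating system */
        r.valid = hypervisor;
        r.lpid = lpidr;
        r.pid = 0;
        break;
    default: /* quadrant 3 */
        r.lpid = hypervisor ? 0 : lpidr;
        r.pid = 0;
        break;
    }
    return r;
}
```

For example, a hypervisor access to an address whose top bits are 0b10 resolves to the guest's LPID with PID 0, which matches the "guest operating system" use of quadrant 2 described above.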
* | | powerpc/mm: Fallback to RAM if the altmap is unusable  Oliver O'Halloran  2018-12-09  1  -3/+16
    The "altmap" is used to provide a pool of memory that is reserved for
    the vmemmap backing of hot-plugged memory. This is useful when adding
    large amounts of ZONE_DEVICE memory to a system with a limited amount
    of normal memory.

    On ppc64 we use huge pages to map the vmemmap, which requires the
    backing storage to be contiguous and aligned to the hugepage size.
    The altmap implementation allows the altmap provider to reserve a
    few PFNs at the start of the range for its own use, and when this
    occurs the first chunk of the altmap is not usable for hugepage
    mappings. On hash there is no sane way to fall back to a normal sized
    page mapping so we fail the allocation. This results in memory
    hotplug failing with ENOMEM when the new range doesn't fall into an
    existing vmemmap block.

    This patch handles this case by falling back to using system memory
    rather than failing if we cannot allocate from the altmap. This
    fallback should only ever be used for the first vmemmap block so it
    should not cause excess memory consumption.

    Fixes: 7b73d978a5d0 ("mm: pass the vmem_altmap to vmemmap_populate")
    Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
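The fallback described in this commit can be pictured with a simplified model. This is a sketch under stated assumptions, not the patch's actual check: `struct altmap_model`, `altmap_usable` and `pick_vmemmap_source` are hypothetical names, and the usability test here only looks at whether the first non-reserved address is aligned to a power-of-two huge page size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum vmemmap_source { FROM_ALTMAP, FROM_RAM };

/* Hypothetical model of an altmap: a device-provided range whose first
 * `reserved` bytes are kept back by the provider. */
struct altmap_model {
    uint64_t base;      /* start of the altmap range */
    uint64_t reserved;  /* bytes reserved at the start of the range */
};

/* In this model the altmap can back a huge-page mapping only if its
 * first usable address is aligned to the huge page size, which is
 * assumed to be a power of two. */
static bool altmap_usable(const struct altmap_model *am, uint64_t hpage_size)
{
    return ((am->base + am->reserved) & (hpage_size - 1)) == 0;
}

/* Prefer the altmap; otherwise fall back to system RAM instead of
 * failing the allocation. */
static enum vmemmap_source pick_vmemmap_source(const struct altmap_model *am,
                                               uint64_t hpage_size)
{
    if (am != NULL && altmap_usable(am, hpage_size))
        return FROM_ALTMAP;
    return FROM_RAM;
}
```

The point of the design is that an unusable altmap degrades to ordinary memory for the affected block rather than turning into an ENOMEM for the whole hotplug operation.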
* | | powerpc/mm: Fix linux page tables build with some configs  Michael Ellerman  2018-11-27  1  -0/+1
    For some configs the build fails with:

        arch/powerpc/mm/dump_linuxpagetables.c: In function 'populate_markers':
        arch/powerpc/mm/dump_linuxpagetables.c:306:39: error: 'PKMAP_BASE' undeclared (first use in this function)
        arch/powerpc/mm/dump_linuxpagetables.c:314:50: error: 'LAST_PKMAP' undeclared (first use in this function)

    These come from highmem.h, including that fixes the build.

    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc/numa: Suppress "VPHN is not supported" messages  Satheesh Rajendran  2018-11-14  1  -1/+1
    When the VPHN function is not supported, the kernel prints the
    message 'VPHN function not supported. Disabling polling...' during
    cpu hotplug events. Currently it prints on every hotplug event, which
    floods dmesg when a KVM guest tries to hotplug a huge number of
    vcpus. Let's print it just once and suppress further kernel prints.

    Signed-off-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
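The "print once" behaviour this commit asks for can be illustrated in user space with a static flag. This is only an analogue: the commit message does not say which kernel helper the patch uses, `report_vphn_unsupported_once` is a hypothetical name, and the sketch is not thread-safe.

```c
#include <stdbool.h>
#include <stdio.h>

/* Prints the message on the first call only.
 * Returns true if the message was printed, false if it was suppressed. */
static bool report_vphn_unsupported_once(FILE *out)
{
    static bool reported;  /* persists across calls, starts out false */

    if (reported)
        return false;
    reported = true;
    fputs("VPHN function not supported. Disabling polling...\n", out);
    return true;
}
```

However many hotplug events call the function, the log receives one line, which is the effect the commit describes for dmesg.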
* | powerpc/mm/64s: Fix preempt warning in slb_allocate_kernel()  Michael Ellerman  2018-11-12  1  -1/+1
    With preempt enabled we see warnings in do_slb_fault():

        BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u33:0/98
        futex hash table entries: 4096 (order: 3, 524288 bytes)
        caller is do_slb_fault+0x204/0x230
        CPU: 5 PID: 98 Comm: kworker/u33:0 Not tainted 4.19.0-rc3-gcc-7.3.1-00022-g1936f094e164 #138
        Call Trace:
          dump_stack+0xb4/0x104 (unreliable)
          check_preemption_disabled+0x148/0x150
          do_slb_fault+0x204/0x230
          data_access_slb_common+0x138/0x180

    This is caused by the get_paca() in slb_allocate_kernel(), which
    includes a call to debug_smp_processor_id().

    slb_allocate_kernel() can only be called from do_slb_fault(), and in
    that path interrupts are hard disabled and so we can't be preempted,
    but we can't update the preempt flags (in thread_info) because that
    could cause an SLB fault.

    So just use local_paca which is safe and doesn't cause the warning.

    Fixes: 48e7b7695745 ("powerpc/64s/hash: Convert SLB miss handlers to C")
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc/mm/64s: Only use slbfee on CPUs that support it  Michael Ellerman  2018-11-06  1  -0/+3
    The slbfee instruction was only added in ISA 2.05 (Power6), it's not
    supported on older CPUs. We don't have a CPU feature for that ISA
    version though, so just use the ISA 2.06 feature flag.

    Fixes: e15a4fea4dee ("powerpc/64s/hash: Add some SLB debugging tests")
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc/mm/64s: Use PPC_SLBFEE macro  Michael Ellerman  2018-11-06  1  -1/+2
    Old toolchains don't know about slbfee and break the build, eg:

        {standard input}:37: Error: Unrecognized opcode: `slbfee.'

    Fix it by using the macro version. We need to add an underscore
    version that takes raw register numbers from the inline asm, rather
    than our Rx macros.

    Fixes: e15a4fea4dee ("powerpc/64s/hash: Add some SLB debugging tests")
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc/mm/64s: Consolidate SLB assertions  Michael Ellerman  2018-11-06  1  -20/+9
    The code for assert_slb_exists() and assert_slb_notexists() is almost
    identical, except for the polarity of the WARN_ON(). In a future
    patch we'll need to modify this code, so consolidate it now into a
    single function.

    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
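The consolidation this commit describes, two near-identical assertions collapsed into one function that takes the expected polarity, can be sketched as follows. Everything here is hypothetical: `slb_presence_ok`, `slb_lookup_fn` and `demo_lookup` are illustrative names, the lookup is a stand-in for querying the SLB, and the function returns a boolean instead of issuing a WARN_ON().

```c
#include <stdbool.h>

/* Stand-in for asking the hardware whether an entry exists for `ea`. */
typedef bool (*slb_lookup_fn)(unsigned long ea);

/* One check parameterized on the expected state, replacing a separate
 * "exists" and "not exists" variant. Returns true when the check passes,
 * i.e. when no warning would be raised. */
static bool slb_presence_ok(slb_lookup_fn lookup, bool expect_present,
                            unsigned long ea)
{
    bool found = lookup(ea);

    return found == expect_present;
}

/* Demo lookup for illustration: addresses below 0x1000 have no entry. */
static bool demo_lookup(unsigned long ea)
{
    return ea >= 0x1000UL;
}
```

Any later change to how the lookup is performed then needs to be made in one place rather than two.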
* Merge tag 'powerpc-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux  Linus Torvalds  2018-11-02  1  -17/+9
    Pull powerpc fixes from Michael Ellerman:
     "Some things that I missed due to travel, or that came in late.

      Two fixes also going to stable:

       - A revert of a buggy change to the 8xx TLB miss handlers.

       - Our flushing of SPE (Signal Processing Engine) registers on
         fork was broken.

      Other changes:

       - A change to the KVM decrementer emulation to use proper APIs.

       - Some cleanups to the way we do code patching in the 8xx code.

       - Expose the maximum possible memory for the system in
         /proc/powerpc/lparcfg.

       - Merge some updates from Scott: "a couple device tree updates,
         and a fix for a missing prototype warning"

      A few other minor fixes and a handful of fixes for our selftests.

      Thanks to: Aravinda Prasad, Breno Leitao, Camelia Groza, Christophe
      Leroy, Felipe Rechia, Joel Stanley, Naveen N. Rao, Paul Mackerras,
      Scott Wood, Tyrel Datwyler"

    * tag 'powerpc-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (21 commits)
      selftests/powerpc: Fix compilation issue due to asm label
      selftests/powerpc/cache_shape: Fix out-of-tree build
      selftests/powerpc/switch_endian: Fix out-of-tree build
      selftests/powerpc/pmu: Link ebb tests with -no-pie
      selftests/powerpc/signal: Fix out-of-tree build
      selftests/powerpc/ptrace: Fix out-of-tree build
      powerpc/xmon: Relax frame size for clang
      selftests: powerpc: Fix warning for security subdir
      selftests/powerpc: Relax L1d miss targets for rfi_flush test
      powerpc/process: Fix flush_all_to_thread for SPE
      powerpc/pseries: add missing cpumask.h include file
      selftests/powerpc: Fix ptrace tm failure
      KVM: PPC: Use exported tb_to_ns() function in decrementer emulation
      powerpc/pseries: Export maximum memory value
      powerpc/8xx: Use patch_site for perf counters setup
      powerpc/8xx: Use patch_site for memory setup patching
      powerpc/code-patching: Add a helper to get the address of a patch_site
      Revert "powerpc/8xx: Use L1 entry APG to handle _PAGE_ACCESSED for CONFIG_SWAP"
      powerpc/8xx: add missing header in 8xx_mmu.c
      powerpc/8xx: Add DT node for using the SEC engine of the MPC885
      ...
| * Merge branch 'next' of https://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next  Michael Ellerman  2018-10-29  1  -0/+1
    Updates from Scott:

    "This contains a couple device tree updates, and a fix for a missing
    prototype warning."
| | * powerpc/8xx: add missing header in 8xx_mmu.c  Christophe Leroy  2018-10-22  1  -0/+1
        arch/powerpc/mm/8xx_mmu.c:174:6: error: no previous prototype for ‘set_context’ [-Werror=missing-prototypes]
         void set_context(unsigned long id, pgd_t *pgd)

    Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
    Signed-off-by: Scott Wood <oss@buserror.net>
| * | powerpc/8xx: Use patch_site for memory setup patching  Christophe Leroy  2018-10-26  1  -16/+7
    The 8xx TLB miss routines are patched at startup in several places.

    This patch uses the new patch_site functionality to improve code
    readability and to avoid a mess of labels when dumping the code with
    'objdump -d'.

    Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | Revert "powerpc/8xx: Use L1 entry APG to handle _PAGE_ACCESSED for CONFIG_SWAP"  Christophe Leroy  2018-10-26  1  -1/+1
    This reverts commit 4f94b2c7462d9720b2afa7e8e8d4c19446bb31ce.

    That commit was buggy, as it used rlwinm instead of rlwimi. Instead
    of fixing that bug, we revert the previous commit in order to reduce
    the dependency between L1 entries and L2 entries.

    Fixes: 4f94b2c7462d9 ("powerpc/8xx: Use L1 entry APG to handle _PAGE_ACCESSED for CONFIG_SWAP")
    Cc: stable@vger.kernel.org
    Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
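The difference between the two instructions named in this revert is easy to miss: rlwinm (rotate left word immediate then AND with mask) overwrites the whole destination, while rlwimi (rotate left word immediate then mask insert) replaces only the bits under the mask and keeps the rest of the destination. The C model below illustrates those semantics for 32-bit values; it is a sketch of the instruction behaviour, not of the reverted kernel code, and the helper names are chosen here for illustration. Bits are numbered 0 (most significant) to 31, as in the PowerPC manuals.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by sh bits (sh taken modulo 32). */
static uint32_t rotl32(uint32_t x, unsigned int sh)
{
    sh &= 31;
    return sh ? (x << sh) | (x >> (32 - sh)) : x;
}

/* Mask with ones from bit mb through bit me, IBM numbering (bit 0 is
 * the most significant). Wraps around when mb > me. */
static uint32_t ppc_mask(unsigned int mb, unsigned int me)
{
    uint32_t from_mb = 0xFFFFFFFFu >> mb;         /* ones in bits mb..31 */
    uint32_t upto_me = 0xFFFFFFFFu << (31 - me);  /* ones in bits 0..me  */

    return (mb <= me) ? (from_mb & upto_me) : (from_mb | upto_me);
}

/* rlwinm: the result is only the rotated, masked source. */
static uint32_t rlwinm(uint32_t rs, unsigned int sh,
                       unsigned int mb, unsigned int me)
{
    return rotl32(rs, sh) & ppc_mask(mb, me);
}

/* rlwimi: masked bits come from the rotated source, the rest of the
 * previous destination value ra is preserved. */
static uint32_t rlwimi(uint32_t ra, uint32_t rs, unsigned int sh,
                       unsigned int mb, unsigned int me)
{
    uint32_t m = ppc_mask(mb, me);

    return (rotl32(rs, sh) & m) | (ra & ~m);
}
```

Using rlwinm where rlwimi was intended therefore discards whatever the destination register held outside the mask, which is the kind of bug the commit message refers to.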
* | | memblock: stop using implicit alignment to SMP_CACHE_BYTES  Mike Rapoport  2018-10-31  1  -3/+4
    When memblock allocation APIs are called with align = 0, the
    alignment is implicitly set to SMP_CACHE_BYTES.

    Implicit alignment is done deep in the memblock allocator and it can
    come as a surprise. Not that such an alignment would be wrong even
    when used incorrectly but it is better to be explicit for the sake of
    clarity and the principle of least surprise.

    Replace all such uses of memblock APIs with the 'align' parameter
    explicitly set to SMP_CACHE_BYTES and stop implicit alignment
    assignment in the memblock internal allocation functions.

    For the case when memblock APIs are used via helper functions, e.g.
    like iommu_arena_new_node() in Alpha, the helper functions were
    detected with Coccinelle's help and then manually examined and
    updated where appropriate.

    The direct memblock APIs users were updated using the semantic patch
    below:

        @@
        expression size, min_addr, max_addr, nid;
        @@
        (
        |
        - memblock_alloc_try_nid_raw(size, 0, min_addr, max_addr, nid)
        + memblock_alloc_try_nid_raw(size, SMP_CACHE_BYTES, min_addr, max_addr, nid)
        |
        - memblock_alloc_try_nid_nopanic(size, 0, min_addr, max_addr, nid)
        + memblock_alloc_try_nid_nopanic(size, SMP_CACHE_BYTES, min_addr, max_addr, nid)
        |
        - memblock_alloc_try_nid(size, 0, min_addr, max_addr, nid)
        + memblock_alloc_try_nid(size, SMP_CACHE_BYTES, min_addr, max_addr, nid)
        |
        - memblock_alloc(size, 0)
        + memblock_alloc(size, SMP_CACHE_BYTES)
        |
        - memblock_alloc_raw(size, 0)
        + memblock_alloc_raw(size, SMP_CACHE_BYTES)
        |
        - memblock_alloc_from(size, 0, min_addr)
        + memblock_alloc_from(size, SMP_CACHE_BYTES, min_addr)
        |
        - memblock_alloc_nopanic(size, 0)
        + memblock_alloc_nopanic(size, SMP_CACHE_BYTES)
        |
        - memblock_alloc_low(size, 0)
        + memblock_alloc_low(size, SMP_CACHE_BYTES)
        |
        - memblock_alloc_low_nopanic(size, 0)
        + memblock_alloc_low_nopanic(size, SMP_CACHE_BYTES)
        |
        - memblock_alloc_from_nopanic(size, 0, min_addr)
        + memblock_alloc_from_nopanic(size, SMP_CACHE_BYTES, min_addr)
        |
        - memblock_alloc_node(size, 0, nid)
        + memblock_alloc_node(size, SMP_CACHE_BYTES, nid)
        )

    [mhocko@suse.com: changelog update]
    [akpm@linux-foundation.org: coding-style fixes]
    [rppt@linux.ibm.com: fix missed uses of implicit alignment]
      Link: http://lkml.kernel.org/r/20181016133656.GA10925@rapoport-lnx
    Link: http://lkml.kernel.org/r/1538687224-17535-1-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
    Suggested-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Paul Burton <paul.burton@mips.com> [MIPS]
    Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc]
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Chris Zankel <chris@zankel.net>
    Cc: Geert Uytterhoeven <geert@linux-m68k.org>
    Cc: Guan Xuetao <gxt@pku.edu.cn>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Matt Turner <mattst88@gmail.com>
    Cc: Michal Simek <monstr@monstr.eu>
    Cc: Richard Weinberger <richard@nod.at>
    Cc: Russell King <linux@armlinux.org.uk>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Tony Luck <tony.luck@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
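To make the "implicit alignment" in this commit concrete, the sketch below models the old rule (align == 0 silently meaning cache-line alignment) next to an ordinary round-up helper. It is a user-space illustration, not memblock code: `old_effective_align` and `align_up` are hypothetical names, and `DEMO_CACHE_BYTES` is an assumed stand-in for SMP_CACHE_BYTES, whose real value depends on the architecture.

```c
#include <stddef.h>

/* Stand-in for SMP_CACHE_BYTES; the value 64 is only an assumption. */
#define DEMO_CACHE_BYTES 64

/* Old behaviour described in the commit: a caller passing 0 got
 * cache-line alignment without asking for it. */
static size_t old_effective_align(size_t align)
{
    return align ? align : DEMO_CACHE_BYTES;
}

/* Round addr up to a power-of-two alignment. */
static size_t align_up(size_t addr, size_t align)
{
    return (addr + align - 1) & ~(align - 1);
}
```

After the change, a caller that wants cache-line alignment says so at the call site, for example `align_up(addr, DEMO_CACHE_BYTES)` in this model, so the alignment is visible where the allocation is requested.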
* | | mm: remove include/linux/bootmem.h  Mike Rapoport  2018-10-31  4  -6/+3
    Move remaining definitions and declarations from
    include/linux/bootmem.h into include/linux/memblock.h and remove the
    redundant header.

    The includes were replaced with the semantic patch below, followed by
    semi-automated removal of duplicated '#include <linux/memblock.h>'
    lines:

        @@
        @@
        - #include <linux/bootmem.h>
        + #include <linux/memblock.h>

    [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
      Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
    [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
      Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
    [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
      Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
    Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
    Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Chris Zankel <chris@zankel.net>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Geert Uytterhoeven <geert@linux-m68k.org>
    Cc: Greentime Hu <green.hu@gmail.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Guan Xuetao <gxt@pku.edu.cn>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
    Cc: Jonas Bonn <jonas@southpole.se>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Ley Foon Tan <lftan@altera.com>
    Cc: Mark Salter <msalter@redhat.com>
    Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
    Cc: Matt Turner <mattst88@gmail.com>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Michal Simek <monstr@monstr.eu>
    Cc: Palmer Dabbelt <palmer@sifive.com>
    Cc: Paul Burton <paul.burton@mips.com>
    Cc: Richard Kuo <rkuo@codeaurora.org>
    Cc: Richard Weinberger <richard@nod.at>
    Cc: Rich Felker <dalias@libc.org>
    Cc: Russell King <linux@armlinux.org.uk>
    Cc: Serge Semin <fancer.lancer@gmail.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Tony Luck <tony.luck@intel.com>
    Cc: Vineet Gupta <vgupta@synopsys.com>
    Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | memblock: rename free_all_bootmem to memblock_free_all (Mike Rapoport, 2018-10-31, 1 file, -1/+1)
      The conversion is done using

          sed -i 's@free_all_bootmem@memblock_free_all@' \
              $(git grep -l free_all_bootmem)

      Link: http://lkml.kernel.org/r/1536927045-23536-26-git-send-email-rppt@linux.vnet.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Serge Semin <fancer.lancer@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
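One detail of the sed expression used for this rename is that it has no `g` flag, so only the first match on each line is replaced. A minimal sketch of that behaviour follows; the helper name `rename_first_per_line` is hypothetical and is not part of the commit or of any kernel tooling.

```python
def rename_first_per_line(text, old="free_all_bootmem", new="memblock_free_all"):
    """Mimic sed 's@old@new@' (no 'g' flag) for a literal pattern.

    Hypothetical helper for illustration: only the first occurrence
    on each line is rewritten, later ones on the same line are kept.
    """
    lines = text.splitlines(keepends=True)
    return "".join(line.replace(old, new, 1) for line in lines)
```

For a rename like this one, where a call normally appears once per line, the missing `g` flag makes no practical difference; the `memblock_virt_alloc` conversion in the following commit does use `g`.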
* | | memblock: remove _virt from APIs returning virtual address (Mike Rapoport, 2018-10-31, 1 file, -3/+3)
      The conversion is done using

          sed -i 's@memblock_virt_alloc@memblock_alloc@g' \
              $(git grep -l memblock_virt_alloc)

      Link: http://lkml.kernel.org/r/1536927045-23536-8-git-send-email-rppt@linux.vnet.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Serge Semin <fancer.lancer@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | memblock: rename memblock_alloc{_nid,_try_nid} to memblock_phys_alloc* (Mike Rapoport, 2018-10-31, 3 files, -3/+3)
      Make it explicit that the caller gets a physical address rather than
      a virtual one.

      This will also allow using the memblock_alloc prefix for memblock
      allocations returning a virtual address, which is done in the
      following patches.

      The conversion is done using the following semantic patch:

          @@
          expression e1, e2, e3;
          @@
          (
          - memblock_alloc(e1, e2)
          + memblock_phys_alloc(e1, e2)
          |
          - memblock_alloc_nid(e1, e2, e3)
          + memblock_phys_alloc_nid(e1, e2, e3)
          |
          - memblock_alloc_try_nid(e1, e2, e3)
          + memblock_phys_alloc_try_nid(e1, e2, e3)
          )

      Link: http://lkml.kernel.org/r/1536927045-23536-7-git-send-email-rppt@linux.vnet.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Serge Semin <fancer.lancer@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge tag 'powerpc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux (Linus Torvalds, 2018-10-26, 30 files, -881/+1098)
      Pull powerpc updates from Michael Ellerman:
       "Notable changes:

         - A large series to rewrite our SLB miss handling, replacing a lot
           of fairly complicated asm with much fewer lines of C.

         - Following on from that, we now maintain a cache of SLB entries
           for each process and preload them on context switch, leading to
           a 27% speedup for our context switch benchmark on Power9.

         - Improvements to our handling of SLB multi-hit errors. We now
           print more debug information when they occur, and try to
           continue running by flushing the SLB and reloading, rather than
           treating them as fatal.

         - Enable THP migration on 64-bit Book3S machines (eg. Power7/8/9).

         - Add support for physical memory up to 2PB in the linear mapping
           on 64-bit Book3S. We only support up to 512TB as regular system
           memory, otherwise the percpu allocator runs out of vmalloc
           space.

         - Add stack protector support for 32 and 64-bit, with a per-task
           canary.

         - Add support for PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP.

         - Support recognising "big cores" on Power9, where two SMT4 cores
           are presented to us as a single SMT8 core.

         - A large series to cleanup some of our ioremap handling and PTE
           flags.

         - Add a driver for the PAPR SCM (storage class memory) interface,
           allowing guests to operate on SCM devices (acked by Dan).

         - Changes to our ftrace code to handle very large kernels, where
           we need to use a trampoline to get to ftrace_caller().

        And many other smaller enhancements and cleanups.

        Thanks to: Alan Modra, Alistair Popple, Aneesh Kumar K.V, Anton
        Blanchard, Aravinda Prasad, Bartlomiej Zolnierkiewicz, Benjamin
        Herrenschmidt, Breno Leitao, Cédric Le Goater, Christophe Leroy,
        Christophe Lombard, Dan Carpenter, Daniel Axtens, Finn Thain,
        Gautham R. Shenoy, Gustavo Romero, Haren Myneni, Hari Bathini, Jia
        Hongtao, Joel Stanley, John Allen, Laurent Dufour, Madhavan
        Srinivasan, Mahesh Salgaonkar, Mark Hairgrove, Masahiro Yamada,
        Michael Bringmann, Michael Neuling, Michal Suchanek, Murilo
        Opsfelder Araujo, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin,
        Nick Desaulniers, Oliver O'Halloran, Paul Mackerras, Petr Vorel,
        Rashmica Gupta, Reza Arbab, Rob Herring, Sam Bobroff, Samuel
        Mendoza-Jonas, Scott Wood, Stan Johnson, Stephen Rothwell, Stewart
        Smith, Suraj Jitindar Singh, Tyrel Datwyler, Vaibhav Jain, Vasant
        Hegde, YueHaibing, zhong jiang"

      * tag 'powerpc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (221 commits)
        Revert "selftests/powerpc: Fix out-of-tree build errors"
        powerpc/msi: Fix compile error on mpc83xx
        powerpc: Fix stack protector crashes on CPU hotplug
        powerpc/traps: restore recoverability of machine_check interrupts
        powerpc/64/module: REL32 relocation range check
        powerpc/64s/radix: Fix radix__flush_tlb_collapsed_pmd double flushing pmd
        selftests/powerpc: Add a test of wild bctr
        powerpc/mm: Fix page table dump to work on Radix
        powerpc/mm/radix: Display if mappings are exec or not
        powerpc/mm/radix: Simplify split mapping logic
        powerpc/mm/radix: Remove the retry in the split mapping logic
        powerpc/mm/radix: Fix small page at boundary when splitting
        powerpc/mm/radix: Fix overuse of small pages in splitting logic
        powerpc/mm/radix: Fix off-by-one in split mapping logic
        powerpc/ftrace: Handle large kernel configs
        powerpc/mm: Fix WARN_ON with THP NUMA migration
        selftests/powerpc: Fix out-of-tree build errors
        powerpc/time: no steal_time when CONFIG_PPC_SPLPAR is not selected
        powerpc/time: Only set CONFIG_ARCH_HAS_SCALED_CPUTIME on PPC64
        powerpc/time: isolate scaled cputime accounting in dedicated functions.
        ...
| * | powerpc/64s/radix: Fix radix__flush_tlb_collapsed_pmd double flushing pmd (Nicholas Piggin, 2018-10-20, 1 file, -1/+0)
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/mm: Fix page table dump to work on Radix (Michael Ellerman, 2018-10-20, 1 file, -3/+9)
      When we're running on Book3S with the Radix MMU enabled the page
      table dump currently prints the wrong addresses because it uses the
      wrong start address.

      Fix it to use PAGE_OFFSET rather than KERN_VIRT_START.

      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/mm/radix: Display if mappings are exec or not (Michael Ellerman, 2018-10-20, 1 file, -12/+17)
      At boot we print the ranges we've mapped for the linear mapping and
      what page size we've used. Also track whether the range is mapped
      executable or not and display that as well.

      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/mm/radix: Simplify split mapping logic (Michael Ellerman, 2018-10-20, 1 file, -22/+10)
      If we look closely at the logic in create_physical_mapping(), when
      we're doing STRICT_KERNEL_RWX, we do the following steps:
        - determine the gap from where we are to the end of the range
        - choose an appropriate mapping_size based on the gap
        - check if that mapping_size would overlap the __init_begin
          boundary, and if not choose an appropriate mapping_size

      We can simplify the logic by taking the __init_begin boundary into
      account when we calculate the initial gap.

      So add a next_boundary() function which tells us what the next
      boundary is, either the __init_begin boundary or end. In future we
      can add more boundaries.

      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
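The simplification described in this commit can be illustrated with a small model. This is a sketch under stated assumptions, not the kernel code: the names `plan_mappings`, `PAGE_SIZE`, `PMD_SIZE` and `PUD_SIZE` are chosen here for illustration (64K, 2M and 1G, the sizes that appear in the boot logs quoted in this series), all addresses are assumed to be 64K aligned, the `mmu_psize_defs` availability checks are left out, and clamping the boundary to `end` is an addition of this sketch.

```python
KIB, MIB, GIB = 1 << 10, 1 << 20, 1 << 30
PAGE_SIZE = 64 * KIB   # assumed base page size
PMD_SIZE = 2 * MIB     # assumed PMD-level huge page size
PUD_SIZE = 1 * GIB     # assumed PUD-level huge page size


def next_boundary(addr, end, init_begin):
    """Return the next address that a single mapping must not cross."""
    if addr < init_begin:
        # Clamping to `end` is an assumption of this sketch.
        return min(init_begin, end)
    return end


def plan_mappings(start, end, init_begin):
    """Return merged (start, end, page_size) ranges covering [start, end)."""
    ranges = []
    addr = start
    while addr < end:
        # The gap already accounts for the __init_begin boundary.
        gap = next_boundary(addr, end, init_begin) - addr
        if addr % PUD_SIZE == 0 and gap >= PUD_SIZE:
            size = PUD_SIZE
        elif addr % PMD_SIZE == 0 and gap >= PMD_SIZE:
            size = PMD_SIZE
        else:
            size = PAGE_SIZE
        # Merge with the previous range when size and position line up.
        if ranges and ranges[-1][1] == addr and ranges[-1][2] == size:
            ranges[-1] = (ranges[-1][0], addr + size, size)
        else:
            ranges.append((addr, addr + size, size))
        addr += size
    return ranges
```

Because the boundary is folded into `gap`, the size selection needs no separate overlap check afterwards, which is the point the commit message makes.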
| * | powerpc/mm/radix: Remove the retry in the split mapping logic (Michael Ellerman, 2018-10-20, 1 file, -7/+5)
      When we have CONFIG_STRICT_KERNEL_RWX enabled, we want to split the
      linear mapping at the text/data boundary so we can map the kernel
      text read only.

      The current logic uses a goto inside the for loop, which works, but
      is hard to reason about.

      When we hit the goto retry case we set max_mapping_size to PMD_SIZE
      and go back to the start.

      Setting max_mapping_size means we skip the PUD case and go to the
      PMD case.

      We know we will pass the alignment and gap checks because the only
      reason we are there is we hit the goto retry, and that is guarded by
      mapping_size == PUD_SIZE, which means addr is PUD aligned and gap is
      greater or equal to PUD_SIZE.

      So the only part of the check that can fail is the mmu_psize_defs
      check for the 2M page size.

      If we just duplicate that check we can avoid the goto, and we get
      the same result.

      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/mm/radix: Fix small page at boundary when splitting (Michael Ellerman, 2018-10-20, 1 file, -2/+2)
      When we have CONFIG_STRICT_KERNEL_RWX enabled, we want to split the
      linear mapping at the text/data boundary so we can map the kernel
      text read only.

      Currently we always use a small page at the text/data boundary, even
      when that's not necessary:

        Mapped 0x0000000000000000-0x0000000000e00000 with 2.00 MiB pages
        Mapped 0x0000000000e00000-0x0000000001000000 with 64.0 KiB pages
        Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages

      This is because the check that the mapping crosses the __init_begin
      boundary is too strict, it also returns true when we map exactly up
      to the boundary.

      So fix it to check that the mapping would actually map past
      __init_begin, and with that we see:

        Mapped 0x0000000000000000-0x0000000040000000 with 2.00 MiB pages
        Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages

      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
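The difference between the too-strict check and the fixed one comes down to a single comparison. The sketch below is one reading of the commit message, not the kernel source: both function names are hypothetical, and the exact form of the old condition is an assumption inferred from "it also returns true when we map exactly up to the boundary".

```python
def crosses_boundary_too_strict(addr, mapping_size, boundary):
    # Assumed shape of the old check: also true when the mapping
    # ends exactly at the boundary, which forces a smaller page.
    return addr < boundary and addr + mapping_size >= boundary


def crosses_boundary(addr, mapping_size, boundary):
    # Fixed check: true only when the mapping would extend past the boundary.
    return addr < boundary and addr + mapping_size > boundary
```

With the boundary at 16M (0x1000000), a 2M mapping starting at 0xe00000 ends exactly on the boundary: the strict variant reports a crossing, the fixed one does not, which matches the disappearance of the 64K range in the second log above.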
| * | powerpc/mm/radix: Fix overuse of small pages in splitting logic (Michael Ellerman, 2018-10-20, 1 file, -2/+2)
      When we have CONFIG_STRICT_KERNEL_RWX enabled, we want to split the
      linear mapping at the text/data boundary so we can map the kernel
      text read only.

      But the current logic uses small pages for the entire text section,
      regardless of whether a larger page size would fit. eg. with the
      boundary at 16M we could use 2M pages, but instead we use 64K pages
      up to the 16M boundary:

        Mapped 0x0000000000000000-0x0000000001000000 with 64.0 KiB pages
        Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages
        Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages

      This is because the test is checking if addr is < __init_begin
      and addr + mapping_size is >= _stext. But that is true for all pages
      between _stext and __init_begin.

      Instead what we want to check is if we are crossing the text/data
      boundary, which is at __init_begin. With that fixed we see:

        Mapped 0x0000000000000000-0x0000000000e00000 with 2.00 MiB pages
        Mapped 0x0000000000e00000-0x0000000001000000 with 64.0 KiB pages
        Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages
        Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages

      ie. we're correctly using 2MB pages below __init_begin, but we still
      drop down to 64K pages unnecessarily at the boundary.

      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/mm/radix: Fix off-by-one in split mapping logic (Michael Ellerman, 2018-10-20, 1 file, -2/+2)
      When we have CONFIG_STRICT_KERNEL_RWX enabled, we try to split the
      kernel linear (1:1) mapping so that the kernel text is in a separate
      page to kernel data, so we can mark the former read-only.

      We could achieve that just by always using 64K pages for the linear
      mapping, but we try to be smarter. Instead we use huge pages when
      possible, and only switch to smaller pages when necessary.

      However we have an off-by-one bug in that logic, which causes us to
      calculate the wrong boundary between text and data.

      For example with the end of the kernel text at 16M we see:

        radix-mmu: Mapped 0x0000000000000000-0x0000000001200000 with 64.0 KiB pages
        radix-mmu: Mapped 0x0000000001200000-0x0000000040000000 with 2.00 MiB pages
        radix-mmu: Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages

      ie. we mapped from 0 to 18M with 64K pages, even though the boundary
      between text and data is at 16M.

      With the fix we see we're correctly hitting the 16M boundary:

        radix-mmu: Mapped 0x0000000000000000-0x0000000001000000 with 64.0 KiB pages
        radix-mmu: Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages
        radix-mmu: Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages

      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/mm: Fix WARN_ON with THP NUMA migration (Aneesh Kumar K.V, 2018-10-20, 2 files, -2/+3)
      WARNING: CPU: 12 PID: 4322 at /arch/powerpc/mm/pgtable-book3s64.c:76 set_pmd_at+0x4c/0x2b0
       Modules linked in:
       CPU: 12 PID: 4322 Comm: qemu-system-ppc Tainted: G W 4.19.0-rc3-00758-g8f0c636b0542 #36
       NIP: c0000000000872fc LR: c000000000484eec CTR: 0000000000000000
       REGS: c000003fba876fe0 TRAP: 0700 Tainted: G W (4.19.0-rc3-00758-g8f0c636b0542)
       MSR: 900000010282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 24282884 XER: 00000000
       CFAR: c000000000484ee8 IRQMASK: 0
       GPR00: c000000000484eec c000003fba877268 c000000001f0ec00 c000003fbd229f80
       GPR04: 00007c8fe8e00000 c000003f864c5a38 860300853e0000c0 0000000000000080
       GPR08: 0000000080000000 0000000000000001 0401000000000080 0000000000000001
       GPR12: 0000000000002000 c000003fffff5400 c000003fce292000 00007c9024570000
       GPR16: 0000000000000000 0000000000ffffff 0000000000000001 c000000001885950
       GPR20: 0000000000000000 001ffffc0004807c 0000000000000008 c000000001f49d05
       GPR24: 00007c8fe8e00000 c0000000020f2468 ffffffffffffffff c000003fcd33b090
       GPR28: 00007c8fe8e00000 c000003fbd229f80 c000003f864c5a38 860300853e0000c0
       NIP [c0000000000872fc] set_pmd_at+0x4c/0x2b0
       LR [c000000000484eec] do_huge_pmd_numa_page+0xb1c/0xc20
       Call Trace:
       [c000003fba877268] [c00000000045931c] mpol_misplaced+0x1bc/0x230 (unreliable)
       [c000003fba8772c8] [c000000000484eec] do_huge_pmd_numa_page+0xb1c/0xc20
       [c000003fba877398] [c00000000040d344] __handle_mm_fault+0x5e4/0x2300
       [c000003fba8774d8] [c00000000040f400] handle_mm_fault+0x3a0/0x420
       [c000003fba877528] [c0000000003ff6f4] __get_user_pages+0x2e4/0x560
       [c000003fba877628] [c000000000400314] get_user_pages_unlocked+0x104/0x2a0
       [c000003fba8776c8] [c000000000118f44] __gfn_to_pfn_memslot+0x284/0x6a0
       [c000003fba877748] [c0000000001463a0] kvmppc_book3s_radix_page_fault+0x360/0x12d0
       [c000003fba877838] [c000000000142228] kvmppc_book3s_hv_page_fault+0x48/0x1300
       [c000003fba877988] [c00000000013dc08] kvmppc_vcpu_run_hv+0x1808/0x1b50
       [c000003fba877af8] [c000000000126b44] kvmppc_vcpu_run+0x34/0x50
       [c000003fba877b18] [c000000000123268] kvm_arch_vcpu_ioctl_run+0x288/0x2d0
       [c000003fba877b98] [c00000000011253c] kvm_vcpu_ioctl+0x1fc/0x8c0
       [c000003fba877d08] [c0000000004e9b24] do_vfs_ioctl+0xa44/0xae0
       [c000003fba877db8] [c0000000004e9c44] ksys_ioctl+0x84/0xf0
       [c000003fba877e08] [c0000000004e9cd8] sys_ioctl+0x28/0x80

      We removed the pte_protnone check earlier with the understanding that
      we mark the pte invalid before the set_pte/set_pmd usage. But the
      huge pmd autonuma code still uses set_pmd_at directly. This is ok
      because a protnone pte won't have a translation cached in the TLB.

      Fixes: da7ad366b497 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit")
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/mm: fix always true/false warning in slice.c (Christophe Leroy, 2018-10-20, 1 file, -7/+14)
      This patch fixes the following warnings (obtained with make W=1).

      arch/powerpc/mm/slice.c: In function 'slice_range_to_mask':
      arch/powerpc/mm/slice.c:73:12: error: comparison is always true due to limited range of data type [-Werror=type-limits]
        if (start < SLICE_LOW_TOP) {
                  ^
      arch/powerpc/mm/slice.c:81:20: error: comparison is always false due to limited range of data type [-Werror=type-limits]
        if ((start + len) > SLICE_LOW_TOP) {
                          ^
      arch/powerpc/mm/slice.c: In function 'slice_mask_for_free':
      arch/powerpc/mm/slice.c:136:17: error: comparison is always true due to limited range of data type [-Werror=type-limits]
        if (high_limit <= SLICE_LOW_TOP)
                       ^
      arch/powerpc/mm/slice.c: In function 'slice_check_range_fits':
      arch/powerpc/mm/slice.c:185:12: error: comparison is always true due to limited range of data type [-Werror=type-limits]
        if (start < SLICE_LOW_TOP) {
                  ^
      arch/powerpc/mm/slice.c:195:39: error: comparison is always false due to limited range of data type [-Werror=type-limits]
        if (SLICE_NUM_HIGH && ((start + len) > SLICE_LOW_TOP)) {
                                             ^
      arch/powerpc/mm/slice.c: In function 'slice_scan_available':
      arch/powerpc/mm/slice.c:306:11: error: comparison is always true due to limited range of data type [-Werror=type-limits]
        if (addr < SLICE_LOW_TOP) {
                 ^
      arch/powerpc/mm/slice.c: In function 'get_slice_psize':
      arch/powerpc/mm/slice.c:709:11: error: comparison is always true due to limited range of data type [-Werror=type-limits]
        if (addr < SLICE_LOW_TOP) {
                 ^

      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/mm: fix missing prototypes in slice.c (Christophe Leroy, 2018-10-20, 1 file, -0/+1)
      This patch fixes the following warnings (obtained with make W=1).

      arch/powerpc/mm/slice.c: At top level:
      arch/powerpc/mm/slice.c:682:15: error: no previous prototype for 'arch_get_unmapped_area' [-Werror=missing-prototypes]
       unsigned long arch_get_unmapped_area(struct file *filp,
                     ^
      arch/powerpc/mm/slice.c:692:15: error: no previous prototype for 'arch_get_unmapped_area_topdown' [-Werror=missing-prototypes]
       unsigned long arch_get_unmapped_area_topdown(struct file *filp,
                     ^

      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/mm: Trace tlbia instruction (Christophe Leroy, 2018-10-20, 1 file, -0/+2)
      Add a trace point for tlbia (Translation Lookaside Buffer Invalidate
      All) instruction.

      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/mm: Add missing tracepoint for tlbie (Christophe Leroy, 2018-10-20, 1 file, -0/+2)
      Commit 0428491cba927 ("powerpc/mm: Trace tlbie(l) instructions")
      added tracepoints for tlbie calls, but _tlbil_va() was forgotten.

      Fixes: 0428491cba927 ("powerpc/mm: Trace tlbie(l) instructions")
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/book3s64: fix dump_linuxpagetables "present" flag (Christophe Leroy, 2018-10-20, 1 file, -2/+7)
      Since commit bd0dbb73e013 ("powerpc/mm/books3s: Add new pte bit to
      mark pte temporarily invalid."), _PAGE_PRESENT doesn't mean exactly
      that a page is present. A page is also considered present when
      _PAGE_INVALID is set.

      This patch changes the meaning of "present" and adds a status "valid"
      associated to the _PAGE_PRESENT flag.

      Fixes: bd0dbb73e013 ("powerpc/mm/books3s: Add new pte bit to mark pte temporarily invalid.")
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc: Add -Werror at arch/powerpc level (Michael Ellerman, 2018-10-19, 1 file, -2/+0)
      Back when I added -Werror in commit ba55bd74360e ("powerpc: Add
      configurable -Werror for arch/powerpc") I did it by adding it to most
      of the arch Makefiles.

      At the time we excluded math-emu, because apparently it didn't build
      cleanly. But that seems to have been fixed somewhere in the interim.

      So move the -Werror addition to the top-level of the arch, this saves
      us from repeating it in every Makefile and means we won't forget to
      add it to any new sub-dirs.

      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/mm: Increase the max addressable memory to 2PB (Aneesh Kumar K.V, 2018-10-14, 1 file, -5/+15)
      Currently we limit the max addressable memory to 128TB. This patch
      increases the limit to 2PB. We can have devices like nvdimm which add
      memory above the 512TB limit.

      We still don't support regular system RAM above 512TB. One of the
      challenges with that is the percpu allocator, which allocates per-node
      memory and uses the max distance between them as the percpu offsets.
      This means that with a large gap in the address space (system RAM
      above 1PB) we will run out of vmalloc space to map the percpu
      allocation.

      In order to support addressable memory above 512TB, the kernel should
      be able to linear map this range. To do that with hash translation we
      now add 4 contexts to the kernel linear map region. Our per-context
      addressable range is 512TB. We still keep the VMALLOC and VMEMMAP
      regions at the old size. The SLB miss handlers are updated to validate
      these limits.

      We also limit this update to SPARSEMEM_VMEMMAP and SPARSEMEM_EXTREME.

      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
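The arithmetic behind the 2PB figure follows from the two numbers given in this commit message: 4 contexts for the kernel linear map region, each covering 512TB. The sketch below only works through that arithmetic; the constant names and the helper `linear_context_index` are hypothetical and do not correspond to kernel symbols.

```python
TB = 1 << 40
PB = 1 << 50

CONTEXT_RANGE = 512 * TB      # per-context addressable range (from the commit message)
LINEAR_MAP_CONTEXTS = 4       # contexts added for the kernel linear map region
MAX_ADDRESSABLE = LINEAR_MAP_CONTEXTS * CONTEXT_RANGE  # 4 * 512TB = 2PB


def linear_context_index(offset):
    """Return which of the linear-map contexts covers `offset`.

    Hypothetical helper: `offset` is a byte offset into the linear
    mapping, and offsets at or beyond 2PB are rejected.
    """
    if not 0 <= offset < MAX_ADDRESSABLE:
        raise ValueError("offset outside the 2PB linear mapping")
    return offset // CONTEXT_RANGE
```

Under these assumptions, an nvdimm region placed just above 512TB would fall into the second context (index 1), while regular system RAM stays within the first.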
| * powerpc/mm/hash: Rename get_ea_context to get_user_context (Aneesh Kumar K.V, 2018-10-14, 1 file, -1/+1)
      We will be adding get_kernel_context later. Update the function name
      to indicate that it handles context allocation for user space
      addresses.

      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>