summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* ARM: 8025/1: Get rid of meminfoLaura Abbott2014-06-0136-333/+179
| | | | | | | | | | | | | | | | memblock is now fully integrated into the kernel and is the prefered method for tracking memory. Rather than reinvent the wheel with meminfo, migrate to using memblock directly instead of meminfo as an intermediate. Acked-by: Jason Cooper <jason@lakedaemon.net> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Acked-by: Kukjin Kim <kgene.kim@samsung.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Leif Lindholm <leif.lindholm@linaro.org> Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8060/1: mm: allow sub-architectures to override PCI I/O memory typeThomas Petazzoni2014-06-012-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | Due to a design incompatibility between the PCIe Marvell controller and the Cortex-A9, stressing PCIe devices with a lot of traffic quickly causes a deadlock. One part of the workaround for this is to have all PCIe regions mapped as strongly-ordered (MT_UNCACHED) instead of the default MT_DEVICE. While the arch_ioremap_caller() mechanism allows sub-architecture code to override ioremap(), used to map PCIe memory regions, there isn't such a mechanism to override the behavior of pci_ioremap_io(). This commit adds the arch_pci_ioremap_mem_type variable, initialized to MT_DEVICE by default, and that sub-architecture code can override. We have chosen to expose a single variable rather than offering the possibility of overriding the entire pci_ioremap_io(), because implementing pci_ioremap_io() requires calling functions (get_mem_type()) that are private to the arch/arm/mm/ code. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8066/1: correction for ARM patch 8031/2Nicolas Pitre2014-06-011-1/+1
| | | | | | | A small mistake slipped through in memory.txt. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8049/1: ftrace/add save_stack_trace_regs() implementationLin Yongting2014-05-301-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When configure kprobe events of ftrace with "stacktrace" option enabled in arm, there is no stacktrace was recorded after the kprobe event was triggered. The root cause is no save_stack_trace_regs() function implemented. Implement the save_stack_trace_regs() function in arm, then ftrace will call this architecture-related function to record the stacktrace into ring buffer. After this fix, stacktrace can be recorded, for example: # mount -t debugfs nodev /sys/kernel/debug # echo "p:netrx net_rx_action" >> /sys/kernel/debug/tracing/kprobe_events # echo 1 > /sys/kernel/debug/tracing/events/kprobes/netrx/enable # echo 1 > /sys/kernel/debug/tracing/options/stacktrace # echo 1 > /sys/kernel/debug/tracing/tracing_on # ping 127.0.0.1 -c 1 # echo 0 > /sys/kernel/debug/tracing/tracing_on # cat /sys/kernel/debug/tracing/trace # tracer: nop # # entries-in-buffer/entries-written: 12/12 #P:1 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | <------ missing some entries ----------------> ping-1200 [000] dNs1 667.603250: netrx: (net_rx_action+0x0/0x1f8) ping-1200 [000] dNs1 667.604738: <stack trace> => net_rx_action => do_softirq => local_bh_enable => ip_finish_output => ip_output => ip_local_out => ip_send_skb => ip_push_pending_frames => raw_sendmsg => inet_sendmsg => sock_sendmsg => SyS_sendto => ret_fast_syscall Signed-off-by: Lin Yongting <linyongting@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8065/1: remove last use of CONFIG_CPU_ARM710Paul Bolle2014-05-301-8/+0
| | | | | | | | | Support for ARM710 CPUs was removed in v3.5. Now remove the last code depending on its Kconfig macro. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8062/1: Modify ldrt fixup handler to re-execute the userspace instructionArun K S2014-05-301-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We will reach fixup handler when one thread(say cpu0) caused an undefined exception, while another thread(say cpu1) is unmmaping the page. Fixup handler returns to the next userspace instruction which has caused the undef execption, rather than going to the same instruction. ARM ARM says that after undefined exception, the PC will be pointing to the next instruction. ie +4 offset in case of ARM and +2 in case of Thumb And there is no correction offset passed to vector_stub in case of undef exception. File: arch/arm/kernel/entry-armv.S +1085 vector_stub und, UND_MODE During an undefined exception, in normal scenario(ie when ldrt instruction does not cause an abort) after resorting the context in VFP hardware, the PC is modified as show below before jumping to ret_from_exception which is in r9. File: arch/arm/vfp/vfphw.S +169 @ The context stored in the VFP hardware is up to date with this thread vfp_hw_state_valid: tst r1, #FPEXC_EX bne process_exception @ might as well handle the pending @ exception before retrying branch @ out before setting an FPEXC that @ stops us reading stuff VFPFMXR FPEXC, r1 @ Restore FPEXC last sub r2, r2, #4 @ Retry current instruction - if Thumb str r2, [sp, #S_PC] @ mode it's two 16-bit instructions, @ else it's one 32-bit instruction, so @ always subtract 4 from the following @ instruction address. But if ldrt results in an abort, we reach the fixup handler and return to ret_from_execption without correcting the pc. This patch modifes the fixup handler to re-execute the same instruction which caused undefined execption. Signed-off-by: Vinayak Menon <vinayakm.list@gmail.com> Signed-off-by: Arun KS <getarunks@gmail.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8047/1: rwsem: use asm-generic rwsem implementationWill Deacon2014-05-302-4/+2
| | | | | | | | | | | asm-generic offers an atomic-add based rwsem implementation, which can avoid the need for heavier, spinlock-based synchronisation on the fast path. This patch makes use of the optimised implementation for ARM CPUs. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8054/1: perf: add support for the Cortex-A17 PMUWill Deacon2014-05-253-0/+14
| | | | | | | | The Cortex-A17 PMU is identical to that of the A12, so wire up a new compatible string to the existing event structures. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8053/1: kernel: sleep: restore HYP mode configuration in cpu_resumeLorenzo Pieralisi2014-05-252-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | On CPUs with virtualization extensions the kernel installs HYP mode configuration on both primary and secondary cpus upon cold boot. On platforms where CPUs are shutdown in idle paths (ie CPU core gating), when a CPU resumes from low-power states it currently does not execute code that reinstalls the HYP configuration, which means that the kernel cannot run eg KVM properly on such machines. This patch, mirroring cold-boot behaviour, executes position independent code that reinstalls HYP configuration and drops to SVC mode safely on warmboot, so that deep idle states can be enabled in kernel running as hosts on platforms with power management HW. Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Dave Martin <dave.martin@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Nicolas Pitre <nico@linaro.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8043/1: uprobes need icache flush after xol writeVictor Kamensky2014-05-255-13/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After instruction write into xol area, on ARM V7 architecture code need to flush dcache and icache to sync them up for given set of addresses. Having just 'flush_dcache_page(page)' call is not enough - it is possible to have stale instruction sitting in icache for given xol area slot address. Introduce arch_uprobe_ixol_copy weak function that by default calls uprobes copy_to_page function and than flush_dcache_page function and on ARM define new one that handles xol slot copy in ARM specific way flush_uprobe_xol_access function shares/reuses implementation with/of flush_ptrace_access function and takes care of writing instruction to user land address space on given variety of different cache types on ARM CPUs. Because flush_uprobe_xol_access does not have vma around flush_ptrace_access was split into two parts. First that retrieves set of condition from vma and common that receives those conditions as flags. Note ARM cache flush function need kernel address through which instruction write happened, so instead of using uprobes copy_to_page function changed code to explicitly map page and do memcpy. Note arch_uprobe_copy_ixol function, in similar way as copy_to_user_page function, has preempt_disable/preempt_enable. Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: David A. Long <dave.long@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8029/1: mcpm: Rename the power_down_finish() functions to be less confusingDave Martin2014-05-254-10/+10
| | | | | | | | | | | | | | | The name "power_down_finish" seems to be causing some confusion, because it suggests that this function is responsible for taking some action to cause the specified CPU to complete its power down. This patch renames the affected functions to "wait_for_powerdown" and similar, since this function's intended purpose is just to wait for the hardware to finish a powerdown initiated by a previous cpu_power_down. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8055/1: cacheflush: use -st dsb option for ensuring completionWill Deacon2014-05-253-8/+8
| | | | | | | | dsb st can be used to ensure completion of pending cache maintenance operations, so use it for the v7 cache maintenance operations. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8046/1: proc: add support for the Cortex-A17 processorWill Deacon2014-05-252-0/+12
| | | | | | | | Cortex-A17 has identical initialisation requirements to Cortex-A12, so hook it up in proc-v7.S in the same way. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8028/1: move __fixup_smp out of init sectionRob Herring2014-05-251-1/+1
| | | | | | | | | | | With large kernel builds such as allyesconfig exceeding maximum relative branch offsets, the init section will be too far away to branch to directly. This causes veneers to be added by the linker, but veneers don't work before the MMU is enabled. Fix this by moving __fixup_smp to the .head.text section as it is not very big. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: stacktrace: include exception PC value in stacktrace outputRussell King2014-05-221-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | When we unwind through an exception stack, include the saved PC value into the stack trace: this fills in an otherwise missed functions from the trace (as indicated below): [<c03f4424>] fec_enet_interrupt+0xa0/0xe8 [<c0066c0c>] handle_irq_event_percpu+0x68/0x228 [<c0066e18>] handle_irq_event+0x4c/0x6c [<c006a024>] handle_fasteoi_irq+0xac/0x198 [<c00664b0>] generic_handle_irq+0x4c/0x60 [<c000f014>] handle_IRQ+0x40/0x98 [<c0008554>] gic_handle_irq+0x30/0x64 [<c0012900>] __irq_svc+0x40/0x50 [<c0029030>] __do_softirq+0xe0/0x2fc <==== [<c0029500>] irq_exit+0xb0/0x100 [<c000f018>] handle_IRQ+0x44/0x98 [<c0008554>] gic_handle_irq+0x30/0x64 [<c0012900>] __irq_svc+0x40/0x50 [<c000f34c>] arch_cpu_idle+0x30/0x38 <==== [<c005e1e4>] cpu_startup_entry+0xac/0x214 [<c066297c>] rest_init+0x68/0x80 [<c08ccb10>] start_kernel+0x2fc/0x358 Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: stacktrace: avoid listing stacktrace functions in stacktraceRussell King2014-05-221-5/+13
| | | | | | | | | | | | | | | | | | | | While debugging the FEC ethernet driver using stacktrace, it was noticed that the stacktraces always begin as follows: [<c00117b4>] save_stack_trace_tsk+0x0/0x98 [<c0011870>] save_stack_trace+0x24/0x28 ... This is because the stack trace code includes the stack frames for itself. This is incorrect behaviour, and also leads to "skip" doing the wrong thing (which is the number of stack frames to avoid recording.) Perversely, it does the right thing when passed a non-current thread. Fix this by ensuring that we have a known constant number of frames above the main stack trace function, and always skip these. Cc: <stable@vger.kernel.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: dma-mapping: avoid calling dma_cache_maint_page() on dev=>cpuRussell King2014-05-221-3/+4
| | | | | | | | | | | | | Avoid calling dma_cache_maint_page() when unmapping a DMA_TO_DEVICE buffer. The L1 cache ops never do anything in this circumstance, nor do they ever need to - all that matters for this case is that the data written is visible to the device before DMA starts. What happens during the transfer (provided the buffer is not written to) is of no real consequence. We already do this optimisation for the L2 cache. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8037/1: mm: support big-endian page tablesJianguo Wu2014-04-251-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When enable LPAE and big-endian in a hisilicon board, while specify mem=384M mem=512M@7680M, will get bad page state: Freeing unused kernel memory: 180K (c0466000 - c0493000) BUG: Bad page state in process init pfn:fa442 page:c7749840 count:0 mapcount:-1 mapping: (null) index:0x0 page flags: 0x40000400(reserved) Modules linked in: CPU: 0 PID: 1 Comm: init Not tainted 3.10.27+ #66 [<c000f5f0>] (unwind_backtrace+0x0/0x11c) from [<c000cbc4>] (show_stack+0x10/0x14) [<c000cbc4>] (show_stack+0x10/0x14) from [<c009e448>] (bad_page+0xd4/0x104) [<c009e448>] (bad_page+0xd4/0x104) from [<c009e520>] (free_pages_prepare+0xa8/0x14c) [<c009e520>] (free_pages_prepare+0xa8/0x14c) from [<c009f8ec>] (free_hot_cold_page+0x18/0xf0) [<c009f8ec>] (free_hot_cold_page+0x18/0xf0) from [<c00b5444>] (handle_pte_fault+0xcf4/0xdc8) [<c00b5444>] (handle_pte_fault+0xcf4/0xdc8) from [<c00b6458>] (handle_mm_fault+0xf4/0x120) [<c00b6458>] (handle_mm_fault+0xf4/0x120) from [<c0013754>] (do_page_fault+0xfc/0x354) [<c0013754>] (do_page_fault+0xfc/0x354) from [<c0008400>] (do_DataAbort+0x2c/0x90) [<c0008400>] (do_DataAbort+0x2c/0x90) from [<c0008fb4>] (__dabt_usr+0x34/0x40) The bad pfn:fa442 is not system memory(mem=384M mem=512M@7680M), after debugging, I find in page fault handler, will get wrong pfn from pte just after set pte, as follow: do_anonymous_page() { ... set_pte_at(mm, address, page_table, entry); //debug code pfn = pte_pfn(entry); pr_info("pfn:0x%lx, pte:0x%llxn", pfn, pte_val(entry)); //read out the pte just set new_pte = pte_offset_map(pmd, address); new_pfn = pte_pfn(*new_pte); pr_info("new pfn:0x%lx, new pte:0x%llxn", pfn, pte_val(entry)); ... } pfn: 0x1fa4f5, pte:0xc00001fa4f575f new_pfn:0xfa4f5, new_pte:0xc00000fa4f5f5f //new pfn/pte is wrong. The bug is happened in cpu_v7_set_pte_ext(ptep, pte): An LPAE PTE is a 64bit quantity, passed to cpu_v7_set_pte_ext in the r2 and r3 registers. On an LE kernel, r2 contains the LSB of the PTE, and r3 the MSB. On a BE kernel, the assignment is reversed. Unfortunately, the current code always assumes the LE case, leading to corruption of the PTE when clearing/setting bits. This patch fixes this issue much like it has been done already in the cpu_v7_switch_mm case. CC stable <stable@vger.kernel.org> Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8036/1: Enable IRQs before attempting to read user space in __und_usrCatalin Marinas2014-04-254-8/+10
| | | | | | | | | | | | | | | | | | The Undef abort handler in the kernel reads the undefined instruction from user space. If the page table was modified from another CPU, the user access could fail and do_page_fault() will be executed with interrupts disabled. This can potentially deadlock on ARM11MPCore or on Cortex-A15 with erratum 798181 workaround enabled (both implying IPI for TLB maintenance with page table lock held). This patch enables the IRQs in __und_usr before attempting to read the instruction from user space. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Arun KS <getarunks@gmail.com> Cc: Hartley Sweeten <hsweeten@visionengravers.com> Cc: Ryan Mallon <rmallon@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8035/1: Disable preemption in crunch_task_enable()Catalin Marinas2014-04-251-2/+10
| | | | | | | | | | This patch is in preparation for calling the crunch_task_enable() function with interrupts enabled. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Hartley Sweeten <hsweeten@visionengravers.com> Cc: Ryan Mallon <rmallon@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8034/1: Disable preemption in iwmmxt_task_enable()Catalin Marinas2014-04-251-3/+11
| | | | | | | | This patch is in preparation for calling the iwmmxt_task_enable() function with interrupts enabled. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8031/2: change fixmap mapping region to support 32 CPUsLiu Hua2014-04-235-21/+29
| | | | | | | | | | | | | In 32-bit ARM systems, the fixmap mapping region can support no more than 14 CPUs(total: 896k; one CPU: 64K). And we can configure NR_CPUS up to 32. So there is a mismatch. This patch moves fixmapping region downwards to region 0xffc00000- 0xffe00000. Then the fixmap mapping region can support up to 32 CPUs. Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Liu Hua <sdu.liu@huawei.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8031/1: fixmap: remove FIX_KMAP_BEGIN and FIX_KMAP_ENDLiu Hua2014-04-232-6/+5
| | | | | | | | | | | | | It seems that these two macros are not used by non architecture specific code. And on ARM FIX_KMAP_BEGIN equals zero. This patch removes these two macros. Instead, using FIX_KMAP_NR_PTES to tell the pte number belonged to fixmap mapping region. The code will become clearer when I introduce a bugfix on fixmap mapping region. Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Liu Hua <sdu.liu@huawei.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8013/1: PJ4B: Add cpu_suspend/cpu_resume hooks for PJ4BGregory CLEMENT2014-04-231-3/+25
| | | | | | | | | PJ4B needs extra instructions for suspend and resume, so instead of using the armv7 version, this commit introduces specific versions for PJ4B. Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8023/1: remove remnants of the static DMA mappingNicolas Pitre2014-04-232-9/+0
| | | | | | | | | | | | It looks like the static mapping area for DMA was replaced by dynamic allocation into the vmalloc area by commit e9da6e9905e6 but the information in Documentation/arm/memory.txt was not removed accordingly. CONSISTENT_END in arch/arm/include/asm/memory.h has no more users and can be removed as well. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8022/1: ftrace: work with CONFIG_DEBUG_SET_MODULE_RONXRabin Vincent2014-04-231-0/+13
| | | | | | | | | | | Make ftrace work with CONFIG_DEBUG_SET_MODULE_RONX by making module text writable around the place where ftrace does its work, like it is done on x86 in the patch which introduced CONFIG_DEBUG_SET_MODULE_RONX, 84e1c6bb38eb ("x86: Add RO/NX protection for loadable kernel modules"). Tested-by: Mitchel Humpherys <mitchelh@codeaurora.org> Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8011/1: ARM hibernation / suspend-to-diskSebastian Capella2014-04-234-0/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enable hibernation for ARM architectures and provide ARM architecture specific calls used during hibernation. The swsusp hibernation framework depends on the platform first having functional suspend/resume. Then, in order to enable hibernation on a given platform, a platform_hibernation_ops structure may need to be registered with the system in order to save/restore any SoC-specific / cpu specific state needing (re)init over a suspend-to-disk/resume-from-disk cycle. For example: - "secure" SoCs that have different sets of control registers and/or different CR reg access patterns. - SoCs with L2 caches as the activation sequence there is SoC-dependent; a full off-on cycle for L2 is not done by the hibernation support code. - SoCs requiring steps on wakeup _before_ the "generic" parts done by cpu_suspend / cpu_resume can work correctly. - SoCs having persistent state which is maintained during suspend and resume, but will be lost during the power off cycle after suspend-to-disk. This is a rebase/rework of Frank Hofmann's v5 hibernation patchset. Acked-by: Russ Dill <Russ.Dill@ti.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Sebastian Capella <sebastian.capella@linaro.org> Acked-by: Pavel Machek <pavel@ucw.cz> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> [fixed duplicate virt_to_pfn() definition --rmk] Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8008/1: topology: Coding style fixesMark Brown2014-04-141-4/+4
| | | | | | | Use kcalloc() and ULONG_MAX rather than open coding them. Signed-off-by: Mark Brown <broonie@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* Linux 3.15-rc1v3.15-rc1Linus Torvalds2014-04-131-2/+2
|
* mm: Initialize error in shmem_file_aio_read()Geert Uytterhoeven2014-04-131-1/+1
| | | | | | | | | | | | | | | | | Some versions of gcc even warn about it: mm/shmem.c: In function ‘shmem_file_aio_read’: mm/shmem.c:1414: warning: ‘error’ may be used uninitialized in this function If the loop is aborted during the first iteration by one of the two first break statements, error will be uninitialized. Introduced by commit 6e58e79db8a1 ("introduce copy_page_to_iter, kill loop over iovec in generic_file_aio_read()"). Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* cifs: Use min_t() when comparing "size_t" and "unsigned long"Geert Uytterhoeven2014-04-131-1/+1
| | | | | | | | | | | | | | | | On 32 bit, size_t is "unsigned int", not "unsigned long", causing the following warning when comparing with PAGE_SIZE, which is always "unsigned long": fs/cifs/file.c: In function ‘cifs_readdata_to_iov’: fs/cifs/file.c:2757: warning: comparison of distinct pointer types lacks a cast Introduced by commit 7f25bba819a3 ("cifs_iovec_read: keep iov_iter between the calls of cifs_readdata_to_iov()"), which changed the signedness of "remaining" and the code from min_t() to min(). Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'slab/next' of ↵Linus Torvalds2014-04-135-84/+128
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux Pull slab changes from Pekka Enberg: "The biggest change is byte-sized freelist indices which reduces slab freelist memory usage: https://lkml.org/lkml/2013/12/2/64" * 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux: mm: slab/slub: use page->list consistently instead of page->lru mm/slab.c: cleanup outdated comments and unify variables naming slab: fix wrongly used macro slub: fix high order page allocation problem with __GFP_NOFAIL slab: Make allocations with GFP_ZERO slightly more efficient slab: make more slab management structure off the slab slab: introduce byte sized index for the freelist of a slab slab: restrict the number of objects in a slab slab: introduce helper functions to get/set free object slab: factor out calculate nr objects in cache_estimate
| * mm: slab/slub: use page->list consistently instead of page->lruDave Hansen2014-04-113-8/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'struct page' has two list_head fields: 'lru' and 'list'. Conveniently, they are unioned together. This means that code can use them interchangably, which gets horribly confusing like with this nugget from slab.c: > list_del(&page->lru); > if (page->active == cachep->num) > list_add(&page->list, &n->slabs_full); This patch makes the slab and slub code use page->lru universally instead of mixing ->list and ->lru. So, the new rule is: page->lru is what the you use if you want to keep your page on a list. Don't like the fact that it's not called ->list? Too bad. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * mm/slab.c: cleanup outdated comments and unify variables namingJianyu Zhan2014-04-011-34/+32
| | | | | | | | | | | | | | | | | | | | | | | | As time goes, the code changes a lot, and this leads to that some old-days comments scatter around , which instead of faciliating understanding, but make more confusion. So this patch cleans up them. Also, this patch unifies some variables naming. Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Jianyu Zhan <nasa4836@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slab: fix wrongly used macroJoonsoo Kim2014-04-011-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 'slab: restrict the number of objects in a slab' uses __builtin_constant_p() on #if macro. It is wrong usage of builtin function, but it is compiled on x86 without any problem, so I can't find it before 0 day build system find it. This commit fixes the situation by using KMALLOC_MIN_SIZE, instead of KMALLOC_SHIFT_LOW. KMALLOC_SHIFT_LOW is parsed to ilog2() on some architecture and this ilog2() uses __builtin_constant_p() and results in the problem. This problem would disappear by using KMALLOC_MIN_SIZE, since it is just constant. Tested-by: David Rientjes <rientjes@google.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slub: fix high order page allocation problem with __GFP_NOFAILJoonsoo Kim2014-03-271-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SLUB already try to allocate high order page with clearing __GFP_NOFAIL. But, when allocating shadow page for kmemcheck, it missed clearing the flag. This trigger WARN_ON_ONCE() reported by Christian Casteyde. https://bugzilla.kernel.org/show_bug.cgi?id=65991 https://lkml.org/lkml/2013/12/3/764 This patch fix this situation by using same allocation flag as original allocation. Reported-by: Christian Casteyde <casteyde.christian@free.fr> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slab: Make allocations with GFP_ZERO slightly more efficientJoe Perches2014-02-081-8/+8
| | | | | | | | | | | | | | | | | | | | Use the likely mechanism already around valid pointer tests to better choose when to memset to 0 allocations with __GFP_ZERO Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slab: make more slab management structure off the slabJoonsoo Kim2014-02-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now, the size of the freelist for the slab management diminish, so that the on-slab management structure can waste large space if the object of the slab is large. Consider a 128 byte sized slab. If on-slab is used, 31 objects can be in the slab. The size of the freelist for this case would be 31 bytes so that 97 bytes, that is, more than 75% of object size, are wasted. In a 64 byte sized slab case, no space is wasted if we use on-slab. So set off-slab determining constraint to 128 bytes. Acked-by: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slab: introduce byte sized index for the freelist of a slabJoonsoo Kim2014-02-081-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the freelist of a slab consist of unsigned int sized indexes. Since most of slabs have less number of objects than 256, large sized indexes is needless. For example, consider the minimum kmalloc slab. It's object size is 32 byte and it would consist of one page, so 256 indexes through byte sized index are enough to contain all possible indexes. There can be some slabs whose object size is 8 byte. We cannot handle this case with byte sized index, so we need to restrict minimum object size. Since these slabs are not major, wasted memory from these slabs would be negligible. Some architectures' page size isn't 4096 bytes and rather larger than 4096 bytes (One example is 64KB page size on PPC or IA64) so that byte sized index doesn't fit to them. In this case, we will use two bytes sized index. Below is some number for this patch. * Before * kmalloc-512 525 640 512 8 1 : tunables 54 27 0 : slabdata 80 80 0 kmalloc-256 210 210 256 15 1 : tunables 120 60 0 : slabdata 14 14 0 kmalloc-192 1016 1040 192 20 1 : tunables 120 60 0 : slabdata 52 52 0 kmalloc-96 560 620 128 31 1 : tunables 120 60 0 : slabdata 20 20 0 kmalloc-64 2148 2280 64 60 1 : tunables 120 60 0 : slabdata 38 38 0 kmalloc-128 647 682 128 31 1 : tunables 120 60 0 : slabdata 22 22 0 kmalloc-32 11360 11413 32 113 1 : tunables 120 60 0 : slabdata 101 101 0 kmem_cache 197 200 192 20 1 : tunables 120 60 0 : slabdata 10 10 0 * After * kmalloc-512 521 648 512 8 1 : tunables 54 27 0 : slabdata 81 81 0 kmalloc-256 208 208 256 16 1 : tunables 120 60 0 : slabdata 13 13 0 kmalloc-192 1029 1029 192 21 1 : tunables 120 60 0 : slabdata 49 49 0 kmalloc-96 529 589 128 31 1 : tunables 120 60 0 : slabdata 19 19 0 kmalloc-64 2142 2142 64 63 1 : tunables 120 60 0 : slabdata 34 34 0 kmalloc-128 660 682 128 31 1 : tunables 120 60 0 : slabdata 22 22 0 kmalloc-32 11716 11780 32 124 1 : tunables 120 60 0 : slabdata 95 95 0 kmem_cache 197 210 192 21 1 : tunables 120 60 0 : slabdata 10 10 0 kmem_caches consisting of objects less than or equal to 256 byte have one or more objects than before. In the case of kmalloc-32, we have 11 more objects, so 352 bytes (11 * 32) are saved and this is roughly 9% saving of memory. Of couse, this percentage decreases as the number of objects in a slab decreases. Here are the performance results on my 4 cpus machine. * Before * Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs): 229,945,138 cache-misses ( +- 0.23% ) 11.627897174 seconds time elapsed ( +- 0.14% ) * After * Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs): 218,640,472 cache-misses ( +- 0.42% ) 11.504999837 seconds time elapsed ( +- 0.21% ) cache-misses are reduced by this patchset, roughly 5%. And elapsed times are improved by 1%. Acked-by: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slab: restrict the number of objects in a slabJoonsoo Kim2014-02-082-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To prepare to implement byte sized index for managing the freelist of a slab, we should restrict the number of objects in a slab to be less or equal to 256, since byte only represent 256 different values. Setting the size of object to value equal or more than newly introduced SLAB_OBJ_MIN_SIZE ensures that the number of objects in a slab is less or equal to 256 for a slab with 1 page. If page size is rather larger than 4096, above assumption would be wrong. In this case, we would fall back on 2 bytes sized index. If minimum size of kmalloc is less than 16, we use it as minimum object size and give up this optimization. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slab: introduce helper functions to get/set free objectJoonsoo Kim2014-02-081-7/+13
| | | | | | | | | | | | | | | | | | | | In the following patches, to get/set free objects from the freelist is changed so that simple casting doesn't work for it. Therefore, introduce helper functions. Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slab: factor out calculate nr objects in cache_estimateJoonsoo Kim2014-02-081-21/+27
| | | | | | | | | | | | | | | | | | | | | | This logic is not simple to understand so that making separate function helping readability. Additionally, we can use this change in the following patch which implement for freelist to have another sized index in according to nr objects. Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* | Merge branch 'misc' of ↵Linus Torvalds2014-04-126-120/+196
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull misc kbuild changes from Michal Marek: "Here is the non-critical part of kbuild: - One bogus coccinelle check removed, one check fixed not to suggest the obsolete PTR_RET macro - scripts/tags.sh does not index the generated *.mod.c files - new objdiff tool to list differences between two versions of an object file - A fix for scripts/bootgraph.pl" * 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: scripts/coccinelle: Use PTR_ERR_OR_ZERO scripts/bootgraph.pl: Add graphic header scripts: objdiff: detect object code changes between two commits Coccicheck: Remove memcpy to struct assignment test scripts/tags.sh: Ignore *.mod.c
| * | scripts/coccinelle: Use PTR_ERR_OR_ZEROSachin Kamat2014-04-081-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | PTR_RET is deprecated. Do not recommend its usage anymore. Use PTR_ERR_OR_ZERO instead. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Michal Marek <mmarek@suse.cz>
| * | scripts/bootgraph.pl: Add graphic headerFabian Frederick2014-04-081-2/+40
| | | | | | | | | | | | | | | | | | | | | Adding -header + help function like other .pl in /scripts. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Michal Marek <mmarek@suse.cz>
| * | scripts: objdiff: detect object code changes between two commitsJason Cooper2014-04-082-1/+142
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | objdiff is useful when doing large code cleanups. For example, when removing checkpatch warnings and errors from new drivers in the staging tree. objdiff can be used in conjunction with a git rebase to confirm that each commit made no changes to the resulting object code. It has the same return values as diff(1). This was written specifically to support adding the skein and threefish cryto drivers to the staging tree. I needed a programmatic way to confirm that commits changing >90% of the lines didn't inadvertently change the code. Temporary files (objdump output) are stored in /path/to/linux/.tmp_objdiff 'make mrproper' will remove this directory. Signed-off-by: Jason Cooper <jason@lakedaemon.net> Signed-off-by: Michal Marek <mmarek@suse.cz>
| * | Coccicheck: Remove memcpy to struct assignment testPeter Senna Tschudin2014-03-291-103/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Coccinelle script scripts/coccinelle/misc/memcpy-assign.cocci look for opportunities to replace a call to memcpy by a struct assignment. This patch removes memcpy-assign.cocci as it is not clear that this convention has an impact on the generated code. Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
| * | scripts/tags.sh: Ignore *.mod.cPrarit Bhargava2014-02-062-7/+7
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CONFIG_MODVERSIONS=y results in a .mod.c for every compiled file in the kernel. Issuing a 'make cscope' on a compiled kernel tree results in the cscope files containing *.mod.c files. [prarit@prarit linux]# make cscope [prarit@prarit linux]# cat cscope.files | grep mod.c | wc -l 4807 These files are not useful for cscope and should be ignored. For example, # line filename / context / line 1 105 arch/x86/kvm/kvm-intel.mod.c <<GLOBAL>> { 0x618911fc, __VMLINUX_SYMBOL_STR(numa_node) }, 2 508 drivers/block/mtip32xx/mtip32xx.h <<GLOBAL>> int numa_node; 3 55 drivers/block/mtip32xx/mtip32xx.mod.c <<GLOBAL>> { 0x618911fc, __VMLINUX_SYMBOL_STR(numa_node) }, 4 37 drivers/cpufreq/acpi-cpufreq.mod.c <<GLOBAL>> { 0x618911fc, __VMLINUX_SYMBOL_STR(numa_node) }, <snip> Add an export to RCS_FIND_IGNORE so it can be used in scripts/tags.sh and add explicitly ignore *.mod.c files. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kirill Tkhai <tkhai@yandex.ru> Cc: Michael Opdenacker <michael.opdenacker@free-electrons.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Michal Marek <mmarek@suse.cz>
* | sym53c8xx_2: Set DID_REQUEUE return code when aborting squeueMikulas Patocka2014-04-121-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes I/O errors with the sym53c8xx_2 driver when the disk returns QUEUE FULL status. When the controller encounters an error (including QUEUE FULL or BUSY status), it aborts all not yet submitted requests in the function sym_dequeue_from_squeue. This function aborts them with DID_SOFT_ERROR. If the disk has full tag queue, the request that caused the overflow is aborted with QUEUE FULL status (and the scsi midlayer properly retries it until it is accepted by the disk), but the sym53c8xx_2 driver aborts the following requests with DID_SOFT_ERROR --- for them, the midlayer does just a few retries and then signals the error up to sd. The result is that disk returning QUEUE FULL causes request failures. The error was reproduced on 53c895 with COMPAQ BD03685A24 disk (rebranded ST336607LC) with command queue 48 or 64 tags. The disk has 64 tags, but under some access patterns it return QUEUE FULL when there are less than 64 pending tags. The SCSI specification allows returning QUEUE FULL anytime and it is up to the host to retry. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | powerpc: Don't try to set LPCR unless we're in hypervisor modePaul Mackerras2014-04-121-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 8f619b5429d9 ("powerpc/ppc64: Do not turn AIL (reloc-on interrupts) too early") added code to set the AIL bit in the LPCR without checking whether the kernel is running in hypervisor mode. The result is that when the kernel is running as a guest (i.e., under PowerKVM or PowerVM), the processor takes a privileged instruction interrupt at that point, causing a panic. The visible result is that the kernel hangs after printing "returning from prom_init". This fixes it by checking for hypervisor mode being available before setting LPCR. If we are not in hypervisor mode, we enable relocation-on interrupts later in pSeries_setup_arch using the H_SET_MODE hcall. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OpenPOWER on IntegriCloud