summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* generic irqs: handle failure of irqchip->set_type in setup_irqUwe Kleine-König2008-07-241-22/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | set_type returns an int indicating success or failure, but up to now setup_irq ignores that. In my case this resulted in a machine hang: gpio-keys requested IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING, but arm/ns9xxx can only trigger on one direction so set_type didn't touch the configuration which happens do default to a level sensitiveness and returned -EINVAL. setup_irq ignored that and unmasked the irq. This resulted in an endless triggering of the gpio-key interrupt service routine which effectively killed the machine. With this patch applied setup_irq propagates the error to the caller. Note that before in the case chip && !chip->set_type && !chip->name a NULL pointer was feed to printk. This is fixed, too. Signed-off-by: Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* remove the v850 portAdrian Bunk2008-07-24204-18406/+5
| | | | | | | | | | | | | | | | | | | Trying to compile the v850 port brings many compile errors, one of them exists since at least kernel 2.6.19. There also seems to be noone willing to bring this port back into a usable state. This patch therefore removes the v850 port. If anyone ever decides to revive the v850 port the code will still be available from older kernels, and it wouldn't be impossible for the port to reenter the kernel if it would become actively maintained again. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* UML: make several more things staticWANG Cong2008-07-2414-26/+12
| | | | | | | | | | | | | | - Make some variables and functions static, since they don't need to be global. - Remove an unused function - arch/um/kernel/time.c::sched_clock(). - Clean the style a bit as complained by checkpatch.pl. Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: WANG Cong <wangcong@zeuux.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* arch/um/kernel/mem.c: remove arch_validate()WANG Cong2008-07-243-36/+1
| | | | | | | | | | | | | - Remove arch_validate(), because no one uses it. - Remove useless macro HAVE_ARCH_VALIDATE. - Make the variable 'empty_bad_page' static. Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: WANG Cong <wangcong@zeuux.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* arch/um/kernel/irq.c: clean up some functionsWANG Cong2008-07-243-37/+2
| | | | | | | | | | Make activate_fd() and free_irq_by_irq_and_dev() static. Remove init_aio_irq() since it has no users. Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: WANG Cong <wangcong@zeuux.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* cris: use simple_read_from_buffer()Akinobu Mita2008-07-241-10/+7
| | | | | | | | Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* cris: remove unused global_flush_tlbFernando Luis Vazquez Cao2008-07-241-1/+0
| | | | | | | | | global_flush_tlb is declared but never used. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Cc: Mikael Starvik <starvik@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mn10300: move sg_dma_{address,len}() to asm/scatterlist.hAdrian Bunk2008-07-242-9/+9
| | | | | | | | | | | | | | mn10300 was the only architecture where sg_dma_{address,len}() were not in asm/scatterlist.h, and it's not a big surprise that this caused a compile error somewhere: /home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/media/video/videobuf-dma-sg.c: In function `videobuf_dma_map': /home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/media/video/videobuf-dma-sg.c:238: error: implicit declaration of function 'sg_dma_address' Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm: fix try_to_freeze_tasks()'s use of do_div()David Howells2008-07-241-1/+1
| | | | | | | | | | | | | | | | | Fix try_to_freeze_tasks()'s use of do_div() on an s64 by making elapsed_csecs64 a u64 instead and dividing that. Possibly this should be guarded lest the interval calculation turn up negative, but the possible negativity of the result of the division is cast away anyway. This was introduced by patch 438e2ce68dfd4af4cfcec2f873564fb921db4bb5. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm: acpi pm: add DMI quirk list for ACPI 1.0 suspend orderingCarlos Corbacho2008-07-241-0/+20
| | | | | | | | | | | | | | | | | There are a few BIOSes that we know of already that need to use the ACPI 1.0 suspend order. This appears to be only be a small minority of mostly nVidia based systems. Based on observation of Windows behaviour, it's clear that Windows is also doing maintaining its own list of broken hardware that needs this workaround. Signed-off-by: Carlos Corbacho <carlos@strangeworlds.co.uk> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Andi Kleen <andi@firstfloor.org> Cc: Len Brown <lenb@kernel.org> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm: acpi hibernation: utilize hardware signatureShaohua Li2008-07-244-1/+30
| | | | | | | | | | | | | | | | | | | | | | ACPI defines a hardware signature. BIOS calculates the signature according to hardware configure and if hardware changes while hibernated, the signature will change. In that case, S4 resume should fail. Still, there may be systems on which this mechanism does not work correctly, so it is better to provide a workaround for them. For this reason, add a new switch to the acpi_sleep= command line argument allowing one to disable hardware signature checking. [shaohua.li@intel.com: build fix] Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Andi Kleen <andi@firstfloor.org> Cc: Len Brown <lenb@kernel.org> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: <Valdis.Kletnieks@vt.edu> Cc: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm: schedule sysrq poweroff on boot cpuZhang Rui2008-07-241-1/+3
| | | | | | | | | | | | | | | | schedule sysrq poweroff on boot cpu. sysrq poweroff needs to disable nonboot cpus, and we need to run this on boot cpu to avoid any recursion. http://bugzilla.kernel.org/show_bug.cgi?id=10897 [kosaki.motohiro@jp.fujitsu.com: build fix] Signed-off-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Rus <harbour@sfinx.od.ua> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm: introduce new interfaces schedule_work_on() and queue_work_on()Zhang Rui2008-07-242-1/+41
| | | | | | | | | | | | | | | | | | This interface allows adding a job on a specific cpu. Although a work struct on a cpu will be scheduled to other cpu if the cpu dies, there is a recursion if a work task tries to offline the cpu it's running on. we need to schedule the task to a specific cpu in this case. http://bugzilla.kernel.org/show_bug.cgi?id=10897 [oleg@tv-sign.ru: cleanups] Signed-off-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Rus <harbour@sfinx.od.ua> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm: hibernation: simplify memory bitmapAkinobu Mita2008-07-241-67/+21
| | | | | | | | | | | | | | | | | | | This patch simplifies the memory bitmap manipulations. - remove the member size in struct bm_block It is not necessary for struct bm_block to have the number of bit chunks that can be calculated by using end_pfn and start_pfn. - use find_next_bit() for memory_bm_next_pfn No need to invent the bitmap library only for the memory bitmap. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm: add new PM_EVENT codes for runtime power transitionsAlan Stern2008-07-241-2/+35
| | | | | | | | | | | | This patch (as1112) adds some new PM_EVENT_* codes for use by kernel subsystems. They describe runtime power-state transitions of the sort already implemented by the USB subsystem. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm: drop unnecessary includes from pm.hRafael J. Wysocki2008-07-241-2/+0
| | | | | | | | | Drop unnecessary includes from include/linux/pm.h . Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm: remove obsolete piece of PM documentationRafael J. Wysocki2008-07-243-259/+34
| | | | | | | | | | | | | | | Remove some obsolete PM documentation. The majority of contents of Documentation/power/pm.txt are outdated. Remove the outdated parts of this file and move the rest to Documentation/power/apm-acpi.txt . Update the index in Documentation/power/ as appropriate. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm: remove remaining obsolete definitions from pm.hRafael J. Wysocki2008-07-242-46/+5
| | | | | | | | | | | Remove the remaining obsolete definitions from include/linux/pm.h and move the definitions of PM_SUSPEND and PM_RESUME to the header of h3600 which is the only user of them. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm: remove definition of struct pm_devRafael J. Wysocki2008-07-241-24/+0
| | | | | | | | | | Remove the definition of 'struct pm_dev', which is not used any more, along with some related stuff from include/linux/pm.h . Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* remove include/linux/pm_legacy.hAdrian Bunk2008-07-244-38/+0
| | | | | | | | | | | Remove the obsolete and no longer used include/linux/pm_legacy.h Reviewed-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Pavel Machek <pavel@suse.cz> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm: boot time suspend selftestDavid Brownell2008-07-243-2/+212
| | | | | | | | | | | | | | | | | | | | | | | | | | | Boot-time test for system suspend states (STR or standby). The generic RTC framework triggers wakeup alarms, which are used to exit those states. - Measures some aspects of suspend time ... this uses "jiffies" until someone converts it to use a timebase that works properly even while timer IRQs are disabled. - Triggered by a command line parameter. By default nothing even vaguely troublesome will happen, but "test_suspend=mem" will give you a brief STR test during system boot. (Or you may need to use "test_suspend=standby" instead, if your hardware needs that.) This isn't without problems. It fires early enough during boot that for example both PCMCIA and MMC stacks have misbehaved. The workaround in those cases was to boot without such media cards inserted. [matthltc@us.ibm.com: fix compile failure in boot time suspend selftest] Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Pavel Machek <pavel@suse.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* swsusp: provide users with a hint about the no_console_suspend optionPavel Machek2008-07-241-1/+1
| | | | | | | | | | | Tell the user about the no_console_suspend option, so that we don't have to tell each bug reporter personally. [akpm@linux-foundation.org: clarify the text a little] Signed-off-by: Pavel Machek <pavel@suse.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* alpha: remove the unused ALPHA_CORE_AGP optionAdrian Bunk2008-07-241-5/+0
| | | | | | | | | | | The real option is named AGP_ALPHA_CORE. Reviewed-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* remove include/asm-h8300/keyboard.hAdrian Bunk2008-07-241-24/+0
| | | | | | | | | This patch removes the unused include/asm-h8300/keyboard.h Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* gigaset: gigaset_isowbuf_getbytes() may return signed unnoticedTilman Schmidt2008-07-241-6/+6
| | | | | | | | | | ifd->offset is unsigned. gigaset_isowbuf_getbytes() may return signed unnoticed. Revised version of patch originally submitted by Roel Kluin <12o3l@tiscali.nl>. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* gigaset: use dev_ macros for messagesTilman Schmidt2008-07-246-43/+53
| | | | | | | | | | | The info() / warn() / err() macros from usb.h for generating kernel messages are considered inferior to dev_info() / dev_warn() / dev_err() from device.h. Replace them where possible. Also correct the severity level and improve the text of one message. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* security: remove unused forwardsHugh Dickins2008-07-241-2/+0
| | | | | | | | | | Why would linux/security.h need forward declarations for nfsctl_arg and swap_info_struct? It's hard to imagine: remove them. Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* security: filesystem capabilities no longer experimentalAndrew G. Morgan2008-07-241-2/+1
| | | | | | | | | | Filesystem capabilities have come of age. Remove the experimental tag for configuring filesystem capabilities. Signed-off-by: Andrew G. Morgan <morgan@kernel.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* security: filesystem capabilities refactor kernel codeAndrew G. Morgan2008-07-241-117/+221
| | | | | | | | | | | | | | | | | | To date, we've tried hard to confine filesystem support for capabilities to the security modules. This has left a lot of the code in kernel/capability.c in a state where it looks like it supports something that filesystem support for capabilities actually suppresses when the LSM security/commmoncap.c code runs. What is left is a lot of code that uses sub-optimal locking in the main kernel With this change we refactor the main kernel code and make it explicit which locks are needed and that the only remaining kernel races in this area are associated with non-filesystem capability code. Signed-off-by: Andrew G. Morgan <morgan@kernel.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* security: protect legacy applications from executing with insufficient privilegeAndrew G. Morgan2008-07-242-50/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | When cap_bset suppresses some of the forced (fP) capabilities of a file, it is generally only safe to execute the program if it understands how to recognize it doesn't have enough privilege to work correctly. For legacy applications (fE!=0), which have no non-destructive way to determine that they are missing privilege, we fail to execute (EPERM) any executable that requires fP capabilities, but would otherwise get pP' < fP. This is a fail-safe permission check. For some discussion of why it is problematic for (legacy) privileged applications to run with less than the set of capabilities requested for them, see: http://userweb.kernel.org/~morgan/sendmail-capabilities-war-story.html With this iteration of this support, we do not include setuid-0 based privilege protection from the bounding set. That is, the admin can still (ab)use the bounding set to suppress the privileges of a setuid-0 program. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: cleanup] Signed-off-by: Andrew G. Morgan <morgan@kernel.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: fix ever-decreasing swap priorityHugh Dickins2008-07-241-24/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | Vegard Nossum has noticed the ever-decreasing negative priority in a swapon /swapoff loop, which eventually would misprioritize when int wraps positive. Not worth spending much code on, but probably better fixed. It's easy to handle the swapping on and off of just one area, but there's not much point if a pair or more still misbehave. To handle the general case, swapoff should compact negative priorities, keeping them always from -1 to -MAX_SWAPFILES. That's a change, but should cause no regression, since these negative (unspecified) priorities are disjoint from the the positive specified priorities 0 to 32767. One small functional difference, which seems appropriate: when swapoff fails to free all swap from a negative priority area, that area is now reinserted at lowest priority, rather than at its original priority. In moving down swapon's setting of priority, I notice that an area is visible to /proc/swaps when it has swap_map set, yet that was being set before all the visible fields were properly filled in: corrected. Signed-off-by: Hugh Dickins <hugh@veritas.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reported-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: make CONFIG_MIGRATION available w/o CONFIG_NUMAGerald Schaefer2008-07-244-23/+21
| | | | | | | | | | | | | | | | | | | | | | | We'd like to support CONFIG_MEMORY_HOTREMOVE on s390, which depends on CONFIG_MIGRATION. So far, CONFIG_MIGRATION is only available with NUMA support. This patch makes CONFIG_MIGRATION selectable for architectures that define ARCH_ENABLE_MEMORY_HOTREMOVE. When MIGRATION is enabled w/o NUMA, the kernel won't compile because migrate_vmas() does not know about vm_ops->migrate() and vma_migratable() does not know about policy_zone. To fix this, those two functions can be restricted to '#ifdef CONFIG_NUMA' because they are not being used w/o NUMA. vma_migratable() is moved over from migrate.h to mempolicy.h. [kosaki.motohiro@jp.fujitsu.com: build fix] Acked-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: KOSAKI Motorhiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* kcalloc: remove runtime divisionMilton Miller2008-07-241-1/+1
| | | | | | | | | | | | While in all cases in the kernel we know the size of the elements to be created, we don't always know the count of elements. By commuting the size and count in the overflow check, the compiler can reduce the runtime division of size_t with a compare to a (unique) constant in these cases. Signed-off-by: Milton Miller <miltonm@bga.com> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* memory-hotplug: add sysfs removable attribute for hotplug memory removeBadari Pulavarty2008-07-244-0/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Memory may be hot-removed on a per-memory-block basis, particularly on POWER where the SPARSEMEM section size often matches the memory-block size. A user-level agent must be able to identify which sections of memory are likely to be removable before attempting the potentially expensive operation. This patch adds a file called "removable" to the memory directory in sysfs to help such an agent. In this patch, a memory block is considered removable if; o It contains only MOVABLE pageblocks o It contains only pageblocks with free pages regardless of pageblock type On the other hand, a memory block starting with a PageReserved() page will never be considered removable. Without this patch, the user-agent is forced to choose a memory block to remove randomly. Sample output of the sysfs files: ./memory/memory0/removable: 0 ./memory/memory1/removable: 0 ./memory/memory2/removable: 0 ./memory/memory3/removable: 0 ./memory/memory4/removable: 0 ./memory/memory5/removable: 0 ./memory/memory6/removable: 0 ./memory/memory7/removable: 1 ./memory/memory8/removable: 0 ./memory/memory9/removable: 0 ./memory/memory10/removable: 0 ./memory/memory11/removable: 0 ./memory/memory12/removable: 0 ./memory/memory13/removable: 0 ./memory/memory14/removable: 0 ./memory/memory15/removable: 0 ./memory/memory16/removable: 0 ./memory/memory17/removable: 1 ./memory/memory18/removable: 1 ./memory/memory19/removable: 1 ./memory/memory20/removable: 1 ./memory/memory21/removable: 1 ./memory/memory22/removable: 1 Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* memory-hotplug: don't calculate vm_total_pages twice when rebuilding ↵Kent Liu2008-07-241-1/+3
| | | | | | | | | | | | | | | zonelists in online_pages() If zonelist is required to be rebuilt in online_pages(), there is no need to recalculate vm_total_pages in that function, as it has been updated in the call build_all_zonelists(). Signed-off-by: Kent Liu <kent.liu@linux.intel.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* memory hotplug: small fixes to bootmem freeing for memory hotremoveYasunori Goto2008-07-243-11/+11
| | | | | | | | | | | | | | | | | - Change some naming * Magic -> types * MIX_INFO -> MIX_SECTION_INFO * Change definition of bootmem type from direct hex value - __free_pages_bootmem() becomes __meminit. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* memory hotplug: allocate usemap on the section with pgdatYasunori Goto2008-07-241-1/+77
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Usemaps are allocated on the section which has pgdat by this. Because usemap size is very small, many other sections usemaps are allocated on only one page. If a section has usemap, it can't be removed until removing other sections. This dependency is not desirable for memory removing. Pgdat has similar feature. When a section has pgdat area, it must be the last section for removing on the node. So, if section A has pgdat and section B has usemap for section A, Both sections can't be removed due to dependency each other. To solve this issue, this patch collects usemap on same section with pgdat as much as possible. If other sections doesn't have any dependency, this section will be able to be removed finally. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andy Whitcroft <apw@shadowen.org> Cc: David Miller <davem@davemloft.net> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hiroyuki KAMEZAWA <kamezawa.hiroyu@jp.fujitsu.com> Cc: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: remove initialization of static per-cpu variablesVegard Nossum2008-07-241-4/+4
| | | | | | | | This was required by some old, no-longer-used gcc on sparc. Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* PAGE_ALIGN(): correctly handle 64-bit values on 32-bit architecturesAndrea Righi2008-07-2457-74/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit boundary. For example: u64 val = PAGE_ALIGN(size); always returns a value < 4GB even if size is greater than 4GB. The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for example): #define PAGE_SHIFT 12 #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT) #define PAGE_MASK (~(PAGE_SIZE-1)) ... #define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK) The "~" is performed on a 32-bit value, so everything in "and" with PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary. Using the ALIGN() macro seems to be the right way, because it uses typeof(addr) for the mask. Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h in include/linux/mm.h. See also lkml discussion: http://lkml.org/lkml/2008/6/11/237 [akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c] [akpm@linux-foundation.org: fix v850] [akpm@linux-foundation.org: fix powerpc] [akpm@linux-foundation.org: fix arm] [akpm@linux-foundation.org: fix mips] [akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c] [akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c] [akpm@linux-foundation.org: fix powerpc] Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: make register_page_bootmem_info_section() staticAdrian Bunk2008-07-241-1/+1
| | | | | | | | | Make the needlessly global register_page_bootmem_info_section() static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm/page_alloc.c: cleanupsAdrian Bunk2008-07-241-12/+13
| | | | | | | | | | | | | | | | | | | | This patch contains the following cleanups: - make the following needlessly global variables static: - required_kernelcore - zone_movable_pfn[] - make the following needlessly global functions static: - move_freepages() - move_freepages_block() - setup_pageset() - find_usable_zone_for_movable() - adjust_zone_range_for_zone_movable() - __absent_pages_in_range() - find_min_pfn_for_node() - find_zone_movable_pfns_for_nodes() Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: add alloc_pages_exact() and free_pages_exact()Timur Tabi2008-07-242-0/+56
| | | | | | | | | | | | | | | | | alloc_pages_exact() is similar to alloc_pages(), except that it allocates the minimum number of pages to fulfill the request. This is useful if you want to allocate a very large buffer that is slightly larger than an even power-of-two number of pages. In that case, alloc_pages() will waste a lot of memory. I have a video driver that wants to allocate a 5MB buffer. alloc_pages() wiill waste 3MB of physically-contiguous memory. Signed-off-by: Timur Tabi <timur@freescale.com> Cc: Andi Kleen <andi@firstfloor.org> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* bootmem: replace node_boot_start in struct bootmem_dataJohannes Weiner2008-07-2410-44/+41
| | | | | | | | | | | | Almost all users of this field need a PFN instead of a physical address, so replace node_boot_start with node_min_pfn. [Lee.Schermerhorn@hp.com: fix spurious BUG_ON() in mark_bootmem()] Signed-off-by: Johannes Weiner <hannes@saeureba.de> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* bootmem: revisit alloc_bootmem_sectionJohannes Weiner2008-07-241-21/+6
| | | | | | | | | | | | | | Since alloc_bootmem_core does no goal-fallback anymore and just returns NULL if the allocation fails, we might now use it in alloc_bootmem_section without all the fixup code for a misplaced allocation. Also, the limit can be the first PFN of the next section as the semantics is that the limit is _above_ the allocated region, not within. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* bootmem: Make __alloc_bootmem_low_node fall back to other nodesJohannes Weiner2008-07-241-9/+16
| | | | | | | | | | | __alloc_bootmem_node already does this, make the interface consistent. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* bootmem: respect goal more likelyJohannes Weiner2008-07-241-38/+54
| | | | | | | | | | | | | | | | | | | | | | The old node-agnostic code tried allocating on all nodes starting from the one with the lowest range. alloc_bootmem_core retried without the goal if it could not satisfy it and so the goal was only respected at all when it happened to be on the first (lowest page numbers) node (or theoretically if allocations failed on all nodes before to the one holding the goal). Introduce a non-panicking helper that starts allocating from the node holding the goal and falls back only after all thes tries failed, thus moving the goal fallback code out of alloc_bootmem_core. Make all other allocation functions benefit from this new helper. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* bootmem: factor out the marking of a PFN rangeJohannes Weiner2008-07-241-119/+69
| | | | | | | | | | | | | | | | | | | | | | | Introduce new helpers that mark a range that resides completely on a node or node-agnostic ranges that might also span node boundaries. The free/reserve API functions will then directly use these helpers. Note that the free/reserve semantics become more strict: while the prior code took basically arbitrary range arguments and marked the PFNs that happen to fall into that range, the new code requires node-specific ranges to be completely on the node. The node-agnostic requests might span node boundaries as long as the nodes are contiguous. Passing ranges that do not satisfy these criteria is a bug. [akpm@linux-foundation.org: fix printk warnings] Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* bootmem: free/reserve helpersJohannes Weiner2008-07-241-22/+43
| | | | | | | | | | | | Factor out the common operation of marking a range on the bitmap. [akpm@linux-foundation.org: fix various warnings] Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* bootmem: clean up alloc_bootmem_coreJohannes Weiner2008-07-242-140/+78
| | | | | | | | | | | | | | | | | | | | | | alloc_bootmem_core has become quite nasty to read over time. This is a clean rewrite that keeps the semantics. bdata->last_pos has been dropped. bdata->last_success has been renamed to hint_idx and it is now an index relative to the node's range. Since further block searching might start at this index, it is now set to the end of a succeeded allocation rather than its beginning. bdata->last_offset has been renamed to last_end_off to be more clear that it represents the ending address of the last allocation relative to the node. [y-goto@jp.fujitsu.com: fix new alloc_bootmem_core()] Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* bootmem: clean up free_all_bootmem_coreJohannes Weiner2008-07-241-45/+38
| | | | | | | | | | | | Rewrite the code in a more concise way using less variables. [akpm@linux-foundation.org: fix printk warnings] Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OpenPOWER on IntegriCloud