From 9617d95e6e9ffd883cf90a89724fe60d7ab22f9a Mon Sep 17 00:00:00 2001 From: Nick Piggin Date: Fri, 6 Jan 2006 00:11:12 -0800 Subject: [PATCH] mm: rmap optimisation Optimise rmap functions by minimising atomic operations when we know there will be no concurrent modifications. Signed-off-by: Nick Piggin Cc: Hugh Dickins Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- fs/exec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'fs/exec.c') diff --git a/fs/exec.c b/fs/exec.c index 22533cce0611..e75a9548da8e 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -324,7 +324,7 @@ void install_arg_page(struct vm_area_struct *vma, lru_cache_add_active(page); set_pte_at(mm, address, pte, pte_mkdirty(pte_mkwrite(mk_pte( page, vma->vm_page_prot)))); - page_add_anon_rmap(page, vma, address); + page_add_new_anon_rmap(page, vma, address); pte_unmap_unlock(pte, ptl); /* no need for flush_tlb */ -- cgit v1.2.1 From e56d090310d7625ecb43a1eeebd479f04affb48b Mon Sep 17 00:00:00 2001 From: Ingo Molnar Date: Sun, 8 Jan 2006 01:01:37 -0800 Subject: [PATCH] RCU signal handling RCU tasklist_lock and RCU signal handling: send signals RCU-read-locked instead of tasklist_lock read-locked. This is a scalability improvement on SMP and a preemption-latency improvement under PREEMPT_RCU. Signed-off-by: Paul E. McKenney Signed-off-by: Ingo Molnar Acked-by: William Irwin Cc: Roland McGrath Cc: Oleg Nesterov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- fs/exec.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'fs/exec.c') diff --git a/fs/exec.c b/fs/exec.c index e75a9548da8e..e9650cd22a3b 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -760,7 +760,7 @@ no_thread_group: spin_lock(&oldsighand->siglock); spin_lock(&newsighand->siglock); - current->sighand = newsighand; + rcu_assign_pointer(current->sighand, newsighand); recalc_sigpending(); spin_unlock(&newsighand->siglock); @@ -768,7 +768,7 @@ no_thread_group: write_unlock_irq(&tasklist_lock); if (atomic_dec_and_test(&oldsighand->count)) - kmem_cache_free(sighand_cachep, oldsighand); + sighand_free(oldsighand); } BUG_ON(!thread_group_leader(current)); -- cgit v1.2.1 From 4a30131e7dbb17e5fec6958bfac9da9aff1fa29b Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Sun, 8 Jan 2006 01:02:39 -0800 Subject: [PATCH] Fix some problems with truncate and mtime semantics. SUS requires that when truncating a file to the size that it currently is: truncate and ftruncate should NOT modify ctime or mtime O_TRUNC SHOULD modify ctime and mtime. Currently mtime and ctime are always modified on most local filesystems (side effect of ->truncate) or never modified (on NFS). With this patch: ATTR_CTIME|ATTR_MTIME are sent with ATTR_SIZE precisely when an update of these times is required whether size changes or not (via a new argument to do_truncate). This allows NFS to do the right thing for O_TRUNC. inode_setattr nolonger forces ATTR_MTIME|ATTR_CTIME when the ATTR_SIZE sets the size to it's current value. This allows local filesystems to do the right thing for f?truncate. Also, the logic in inode_setattr is changed a bit so there are two return points. One returns the error from vmtruncate if it failed, the other returns 0 (there can be no other failure). Finally, if vmtruncate succeeds, and ATTR_SIZE is the only change requested, we now fall-through and mark_inode_dirty. If a filesystem did not have a ->truncate function, then vmtruncate will have changed i_size, without marking the inode as 'dirty', and I think this is wrong. Signed-off-by: Neil Brown Cc: Christoph Hellwig Cc: Trond Myklebust Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- fs/exec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'fs/exec.c') diff --git a/fs/exec.c b/fs/exec.c index e9650cd22a3b..2075b674d85e 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1505,7 +1505,7 @@ int do_coredump(long signr, int exit_code, struct pt_regs * regs) goto close_fail; if (!file->f_op->write) goto close_fail; - if (do_truncate(file->f_dentry, 0, file) != 0) + if (do_truncate(file->f_dentry, 0, 0, file) != 0) goto close_fail; retval = binfmt->core_dump(signr, regs, file); -- cgit v1.2.1 From bb6f6dbaa48c53525a7a4f9d4df719c3b0b582af Mon Sep 17 00:00:00 2001 From: Oleg Nesterov Date: Sun, 8 Jan 2006 01:03:13 -0800 Subject: [PATCH] do_coredump() should reset group_stop_count earlier __group_complete_signal() sets ->group_stop_count in sig_kernel_coredump() path and marks the target thread as ->group_exit_task. So any thread except group_exit_task will go to handle_group_stop()->finish_stop(). However, when group_exit_task actually starts do_coredump(), it sets SIGNAL_GROUP_EXIT, but does not reset ->group_stop_count while killing other threads. If we have not yet stopped threads in the same thread group, they all will spin in kernel mode until group_exit_task sends them SIGKILL, because ->group_stop_count > 0 means: recalc_sigpending_tsk() never clears TIF_SIGPENDING get_signal_to_deliver() goes to handle_group_stop() handle_group_stop() returns when SIGNAL_GROUP_EXIT set syscall_exit/resume_userspace notice TIF_SIGPENDING, call get_signal_to_deliver() again. So we are wasting cpu cycles, and if one of these threads is rt_task() this may be a serious problem. NOTE: do_coredump() holds ->mmap_sem, so not stopped threads can't escape coredumping after clearing ->group_stop_count. See also this thread: http://marc.theaimsgroup.com/?t=112739139900002 Signed-off-by: Oleg Nesterov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- fs/exec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'fs/exec.c') diff --git a/fs/exec.c b/fs/exec.c index 2075b674d85e..fd02ea4a81e9 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1462,6 +1462,7 @@ int do_coredump(long signr, int exit_code, struct pt_regs * regs) if (!(current->signal->flags & SIGNAL_GROUP_EXIT)) { current->signal->flags = SIGNAL_GROUP_EXIT; current->signal->group_exit_code = exit_code; + current->signal->group_stop_count = 0; retval = 0; } spin_unlock_irq(¤t->sighand->siglock); @@ -1477,7 +1478,6 @@ int do_coredump(long signr, int exit_code, struct pt_regs * regs) * Clear any false indication of pending signals that might * be seen by the filesystem code called to write the core file. */ - current->signal->group_stop_count = 0; clear_thread_flag(TIF_SIGPENDING); if (current->signal->rlim[RLIMIT_CORE].rlim_cur < binfmt->min_coredump) -- cgit v1.2.1 From 2ff678b8da6478d861c1b0ecb3ac14575760e906 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Mon, 9 Jan 2006 20:52:34 -0800 Subject: [PATCH] hrtimer: switch itimers to hrtimer switch itimers to a hrtimers-based implementation Signed-off-by: Thomas Gleixner Signed-off-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- fs/exec.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'fs/exec.c') diff --git a/fs/exec.c b/fs/exec.c index fd02ea4a81e9..b5bcf1aae0ab 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -632,10 +632,10 @@ static inline int de_thread(struct task_struct *tsk) * synchronize with any firing (by calling del_timer_sync) * before we can safely let the old group leader die. */ - sig->real_timer.data = (unsigned long)current; + sig->real_timer.data = current; spin_unlock_irq(lock); - if (del_timer_sync(&sig->real_timer)) - add_timer(&sig->real_timer); + if (hrtimer_cancel(&sig->real_timer)) + hrtimer_restart(&sig->real_timer); spin_lock_irq(lock); } while (atomic_read(&sig->count) > count) { -- cgit v1.2.1 From 858119e159384308a5dde67776691a2ebf70df0f Mon Sep 17 00:00:00 2001 From: Arjan van de Ven Date: Sat, 14 Jan 2006 13:20:43 -0800 Subject: [PATCH] Unlinline a bunch of other functions Remove the "inline" keyword from a bunch of big functions in the kernel with the goal of shrinking it by 30kb to 40kb Signed-off-by: Arjan van de Ven Signed-off-by: Ingo Molnar Acked-by: Jeff Garzik Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- fs/exec.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'fs/exec.c') diff --git a/fs/exec.c b/fs/exec.c index b5bcf1aae0ab..62b40af68cc4 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -575,7 +575,7 @@ static int exec_mmap(struct mm_struct *mm) * disturbing other processes. (Other processes might share the signal * table via the CLONE_SIGHAND option to clone().) */ -static inline int de_thread(struct task_struct *tsk) +static int de_thread(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal; struct sighand_struct *newsighand, *oldsighand = tsk->sighand; @@ -780,7 +780,7 @@ no_thread_group: * so that a new one can be started */ -static inline void flush_old_files(struct files_struct * files) +static void flush_old_files(struct files_struct * files) { long j = -1; struct fdtable *fdt; @@ -964,7 +964,7 @@ int prepare_binprm(struct linux_binprm *bprm) EXPORT_SYMBOL(prepare_binprm); -static inline int unsafe_exec(struct task_struct *p) +static int unsafe_exec(struct task_struct *p) { int unsafe = 0; if (p->ptrace & PT_PTRACED) { -- cgit v1.2.1 From 5590ff0d5528b60153c0b4e7b771472b5a95e297 Mon Sep 17 00:00:00 2001 From: Ulrich Drepper Date: Wed, 18 Jan 2006 17:43:53 -0800 Subject: [PATCH] vfs: *at functions: core Here is a series of patches which introduce in total 13 new system calls which take a file descriptor/filename pair instead of a single file name. These functions, openat etc, have been discussed on numerous occasions. They are needed to implement race-free filesystem traversal, they are necessary to implement a virtual per-thread current working directory (think multi-threaded backup software), etc. We have in glibc today implementations of the interfaces which use the /proc/self/fd magic. But this code is rather expensive. Here are some results (similar to what Jim Meyering posted before). The test creates a deep directory hierarchy on a tmpfs filesystem. Then rm -fr is used to remove all directories. Without syscall support I get this: real 0m31.921s user 0m0.688s sys 0m31.234s With syscall support the results are much better: real 0m20.699s user 0m0.536s sys 0m20.149s The interfaces are for obvious reasons currently not much used. But they'll be used. coreutils (and Jeff's posixutils) are already using them. Furthermore, code like ftw/fts in libc (maybe even glob) will also start using them. I expect a patch to make follow soon. Every program which is walking the filesystem tree will benefit. Signed-off-by: Ulrich Drepper Signed-off-by: Alexey Dobriyan Cc: Christoph Hellwig Cc: Al Viro Acked-by: Ingo Molnar Cc: Michael Kerrisk Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- fs/exec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'fs/exec.c') diff --git a/fs/exec.c b/fs/exec.c index 62b40af68cc4..055378d2513e 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -477,7 +477,7 @@ struct file *open_exec(const char *name) int err; struct file *file; - err = path_lookup_open(name, LOOKUP_FOLLOW, &nd, FMODE_READ); + err = path_lookup_open(AT_FDCWD, name, LOOKUP_FOLLOW, &nd, FMODE_READ); file = ERR_PTR(err); if (!err) { -- cgit v1.2.1