summaryrefslogtreecommitdiffstats
path: root/arch/powerpc/oprofile
diff options
context:
space:
mode:
authorNick Piggin <npiggin@suse.de>2008-12-01 09:33:43 +0100
committerAl Viro <viro@zeniv.linux.org.uk>2008-12-31 18:07:38 -0500
commitc2452f32786159ed85f0e4b21fec09258f822fc8 (patch)
tree50d93df47f4547a5699c87a608e85596e4c6165f /arch/powerpc/oprofile
parente2b689d82c0394e5239a3557a217f19e2f47f1be (diff)
downloadblackbird-op-linux-c2452f32786159ed85f0e4b21fec09258f822fc8.tar.gz
blackbird-op-linux-c2452f32786159ed85f0e4b21fec09258f822fc8.zip
shrink struct dentry
struct dentry is one of the most critical structures in the kernel. So it's sad to see it going neglected. With CONFIG_PROFILING turned on (which is probably the common case at least for distros and kernel developers), sizeof(struct dcache) == 208 here (64-bit). This gives 19 objects per slab. I packed d_mounted into a hole, and took another 4 bytes off the inline name length to take the padding out from the end of the structure. This shinks it to 200 bytes. I could have gone the other way and increased the length to 40, but I'm aiming for a magic number, read on... I then got rid of the d_cookie pointer. This shrinks it to 192 bytes. Rant: why was this ever a good idea? The cookie system should increase its hash size or use a tree or something if lookups are a problem. Also the "fast dcookie lookups" in oprofile should be moved into the dcookie code -- how can oprofile possibly care about the dcookie_mutex? It gets dropped after get_dcookie() returns so it can't be providing any sort of protection. At 192 bytes, 21 objects fit into a 4K page, saving about 3MB on my system with ~140 000 entries allocated. 192 is also a multiple of 64, so we get nice cacheline alignment on 64 and 32 byte line systems -- any given dentry will now require 3 cachelines to touch all fields wheras previously it would require 4. I know the inline name size was chosen quite carefully, however with the reduction in cacheline footprint, it should actually be just about as fast to do a name lookup for a 36 character name as it was before the patch (and faster for other sizes). The memory footprint savings for names which are <= 32 or > 36 bytes long should more than make up for the memory cost for 33-36 byte names. Performance is a feature... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Diffstat (limited to 'arch/powerpc/oprofile')
-rw-r--r--arch/powerpc/oprofile/cell/spu_task_sync.c2
1 files changed, 1 insertions, 1 deletions
diff --git a/arch/powerpc/oprofile/cell/spu_task_sync.c b/arch/powerpc/oprofile/cell/spu_task_sync.c
index 2949126d28d1..6b793aeda72e 100644
--- a/arch/powerpc/oprofile/cell/spu_task_sync.c
+++ b/arch/powerpc/oprofile/cell/spu_task_sync.c
@@ -297,7 +297,7 @@ static inline unsigned long fast_get_dcookie(struct path *path)
{
unsigned long cookie;
- if (path->dentry->d_cookie)
+ if (path->dentry->d_flags & DCACHE_COOKIE)
return (unsigned long)path->dentry;
get_dcookie(path, &cookie);
return cookie;
OpenPOWER on IntegriCloud